Elasticsearch index snapshot - technical deep dive

Article ID: 100055299

Description

This document is intended for readers who have a basic knowledge of the Elasticsearch indexing engine and want to understand how Elasticsearch manages snapshots internally. It describes Elasticsearch snapshots in more technical detail.

An Elasticsearch index is divided into shards. Each shard is a Lucene index, which is built from segments that reside either in memory or on disk. A segment is the lowest-level storage object in Elasticsearch. Segments are created on a rolling basis while new data is indexed or ingested, each time the index ‘refreshes’ (usually every 1–60 seconds). Elasticsearch transparently merges these segments in the background to keep their number manageable. Segments are immutable, that is, they never change after they are written. This is critical to how Elasticsearch manages incremental snapshots: once a segment has been written to the backup, it never needs to be updated or written again.
Snapshot and restore work at the segment level. They are incremental in that a segment is snapshotted only once, even though multiple snapshots may use that segment. As segments merge, the new merged segments are also backed up. Because multiple segments can hold the same records, snapshotting is not incremental at the record level, and the backup may contain some redundant data as a result of Lucene segment merging. If documents are constantly being indexed into the Elasticsearch cluster, Lucene segments are continuously merged in the background. As a result, the same item, document, or record ends up in multiple segments over time, and the repository can become considerably larger than the index.
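
The effect of merging on repository size can be illustrated with a toy model. The sketch below is a simplified simulation and not Elasticsearch code: the segment names, the merge step, and the document names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """An immutable segment: a name plus the documents it holds."""
    name: str
    docs: frozenset


def snapshot(live_segments, repository):
    """Copy only the segments whose name is not yet in the repository."""
    copied = [seg for seg in live_segments if seg.name not in repository]
    for seg in copied:
        repository[seg.name] = seg
    return [seg.name for seg in copied]


def merge(segments, new_name):
    """Merge several segments into one new immutable segment."""
    docs = frozenset().union(*(seg.docs for seg in segments))
    return Segment(new_name, docs)


repository = {}
live = [Segment("_0", frozenset({"doc1", "doc2"})),
        Segment("_1", frozenset({"doc3"}))]
first_copy = snapshot(live, repository)    # both segments are copied

live = [merge(live, "_2")]                 # a background merge replaces _0 and _1
second_copy = snapshot(live, repository)   # the merged segment is copied as new data

docs_in_index = sum(len(seg.docs) for seg in live)
docs_in_repo = sum(len(seg.docs) for seg in repository.values())
```

In this model the index holds three documents, while the repository holds six document copies, because every document exists both in its original segment and in the merged segment.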

Snapshot

A snapshot is a copy of all the cluster data and may contain both indexes and cluster settings. In Enterprise Vault, a snapshot can be taken at the site level or index server level using the PowerShell command. The command takes a snapshot of all the index data of all the Elasticsearch indexes residing on that index server.

Snapshot repository

A snapshot repository is a location where Elasticsearch stores snapshots. One snapshot repository can store multiple snapshots. Inside a snapshot repository, snapshots are incremental. That is, a new snapshot copies only the data that is not already part of an earlier snapshot, which avoids wasting time and storage space. There are many types of snapshot repositories, such as local or remote filesystem repositories, and repositories hosted by cloud providers, such as AWS S3, Google Cloud Storage, Azure Blob Storage, Aliyun OSS, and so on.

Setting up the snapshot repository

A snapshot resides within a repository. Several repositories may be defined for a cluster, and each repository has a type. The two repository types available in core Elasticsearch are fs (filesystem) and url. Enterprise Vault supports only the fs repository type. A repository is created by registering it, and it must be registered before snapshots can be taken or earlier snapshots can be restored.
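
For reference, Elasticsearch registers a repository through a PUT request to `_snapshot/<repository name>`. The sketch below only builds such a request and does not send it. The base URL, repository name, and location are placeholder assumptions, and Enterprise Vault performs the registration itself, so this is an illustration of the underlying API rather than a required step.

```python
import json
import urllib.request


def build_register_request(base_url, repo_name, location, compress=True):
    """Build (but do not send) the request that registers an fs repository."""
    body = {
        "type": "fs",
        "settings": {"location": location, "compress": compress},
    }
    return urllib.request.Request(
        url=f"{base_url}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values, assumed for illustration only.
request = build_register_request(
    "http://localhost:9200",
    "example_repo",
    r"\\Server\IndexSnapshotLocation\7.x\example_repo",
)
```

Elasticsearch accepts an fs location only if it lies under a path listed in the `path.repo` setting of elasticsearch.yml.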

The Enterprise Vault administrator must set the index snapshot location using the PowerShell command at the site or index server level. The location can be local to the index server or a CIFS share that is accessible from all index servers. After the Enterprise Vault Indexing service is restarted on all index servers, the index snapshot repository is created in the Elasticsearch indexing engine running on each server, which in turn creates a separate folder inside the index snapshot location.

Consider a single-site, single-server deployment of Enterprise Vault. If \\Server\IndexSnapshotLocation is set as the index snapshot location on the index server, restarting the Enterprise Vault Indexing service creates the folder \\Server\IndexSnapshotLocation\7.x\c1e9b269-920a-4cef-bed1-c93488b3424e. Here, 7.x is the version of Elasticsearch used by Enterprise Vault, and c1e9b269-920a-4cef-bed1-c93488b3424e is a unique GUID, which is the name of the index snapshot repository created in Elasticsearch.

In a deployment with multiple servers, if the same location is configured as the index snapshot location, a separate folder is created for each index server. Each folder is named with a unique GUID, which is the name of the index snapshot repository created in the Elasticsearch instance running on that Enterprise Vault index server.
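
The folder layout described above can be sketched as a simple path composition. The helper name is hypothetical, and the second GUID is made up to stand for a second index server that shares the same location.

```python
from pathlib import PureWindowsPath


def repository_folder(snapshot_location, es_version, repository_guid):
    """Return the folder location/version/GUID as a Windows path."""
    return PureWindowsPath(snapshot_location) / es_version / repository_guid


location = r"\\Server\IndexSnapshotLocation"
# The first GUID is taken from the example above; the second is made up.
server_a = repository_folder(location, "7.x", "c1e9b269-920a-4cef-bed1-c93488b3424e")
server_b = repository_folder(location, "7.x", "00000000-0000-0000-0000-000000000001")
```

Both servers write below the same 7.x folder, but each into its own GUID-named repository folder, so their snapshots do not mix.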

Snapshot process

A snapshot copies the segments of each index's shards to the remote storage repository and keeps track of which index, shard, and segment is part of which set of snapshots. Snapshots can include the whole cluster (that is, all indexes and cluster metadata) or just some indexes. Enterprise Vault currently does not support taking a snapshot of a particular index, but it does support snapshots at the index server level. If the snapshot is taken at the index server level, the shard segments of all indexes on that server are copied to the remote storage repository.

A snapshot captures the state of the data at the moment the snapshot starts. Any data indexed after the snapshot has started is not included and is picked up by the next snapshot. This is because the first step on each node is to build the segment list, and by definition any new data indexed after that point goes into new segments that are not in the list.

  1. When the snapshot process starts, the Master node builds a list of the primary shards (for each index) and the corresponding nodes. The list therefore details which node is responsible for snapshotting which shards of which indexes. Only the primary shards are snapshotted, so if a primary shard is missing, the snapshot records that. Enterprise Vault uses Elasticsearch as a single-node cluster. Each index server is the Master node and is responsible for building the list of primary shards of each index present on that index server.
  2. Once the Master has built the list of indexes, primary shards, and nodes, each node determines which index shard segments it has to write to the snapshot repository.
  3. As segments are immutable, a segment that already exists in the repository is not written again. The node builds a list of each shard's segments on the node and also reads the list of this shard's segments already in the snapshot repository. By comparing these two lists, it knows which segments must be written to the snapshot repository and which segments only need their reference counts incremented.
  4. Each node starts copying its segment files to the snapshot repository and updates the Master with the status as it completes each shard. This is done by the SnapshotShardsService, and each shard ends in one of the states Success, Failed, or Missing. Once all the shards on all the nodes have completed (or failed), the Master writes the final metadata for the snapshot to the repository, including which indexes and segments are included in which snapshots.
  5. All Elasticsearch snapshots are incremental. The cluster (each node) compares the segments it has to snapshot with the segments that are already in the repository and writes the missing ones, along with references, states, and other metadata.
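
The steps above can be sketched as a small model. This is an illustrative simplification and not the Elasticsearch implementation: the function names and the in-memory repository structure are assumptions, and copying a segment is reduced to recording its name.

```python
def plan_shard_snapshot(local_segments, repository_segments):
    """Split a shard's segments into those to upload and those to reference."""
    local = set(local_segments)
    existing = set(repository_segments)
    to_upload = sorted(local - existing)
    to_reference = sorted(local & existing)
    return to_upload, to_reference


def take_snapshot(snapshot_name, shards, repository):
    """Snapshot every primary shard.

    `shards` maps shard id -> list of segment names on the node.
    `repository` maps shard id -> {segment name: set of snapshot names}.
    """
    # Steps 1-2: freeze the segment list at snapshot start; segments created
    # afterwards are not part of this snapshot.
    frozen = {shard: tuple(segments) for shard, segments in shards.items()}
    status = {}
    for shard, segments in frozen.items():
        stored = repository.setdefault(shard, {})
        # Step 3: compare local segments with those already in the repository.
        to_upload, to_reference = plan_shard_snapshot(segments, stored)
        # Step 4: "copy" the new segments and record a reference for all of them.
        for name in to_upload + to_reference:
            stored.setdefault(name, set()).add(snapshot_name)
        status[shard] = {"state": "Success", "uploaded": to_upload}
    return status


repository = {}
shards = {0: ["_0", "_1"], 1: ["_0"]}
first = take_snapshot("snap-1", shards, repository)
shards[0].append("_2")                      # new data indexed after snap-1
second = take_snapshot("snap-2", shards, repository)
```

The second snapshot uploads only segment _2 of shard 0. The segments that were already in the repository are referenced by both snapshots and are not copied again.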

Snapshot data

The JSON of index-0 file in the above example is:

| File path | Explanation |
|---|---|
| index.latest | A pointer to the latest index file. It holds the generation number N of the most recent index-N file. N is written in hexadecimal. For example, generation 100 (decimal) is written as 64 in hexadecimal, because 6*16 + 4 = 100. |
| index-N | Repository data serialized in JSON format, including all snapshot IDs and their corresponding indexes. N represents the generation of this file. |
| incompatible-snapshots | A list of all snapshot IDs that are no longer compatible with the current cluster version. Not shown in the above image because there are no incompatible snapshots. |
| meta-HDDskFlfSHi3nAKdtxoc5w.dat (meta-*.dat) | Metadata serialized in SMILE format for the snapshot HDDskFlfSHi3nAKdtxoc5w (includes only global metadata). HDDskFlfSHi3nAKdtxoc5w is the UUID of the snapshot. |
| snap-HDDskFlfSHi3nAKdtxoc5w.dat (snap-*.dat) | Snapshot information serialized in SMILE format for the snapshot HDDskFlfSHi3nAKdtxoc5w. HDDskFlfSHi3nAKdtxoc5w is the UUID of the snapshot. |
| indices/ | Data for all indices. |
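
The hexadecimal notation used for the generation number in the index.latest explanation can be checked with a short conversion. The function names are hypothetical, and the sketch covers only the number conversion, not the on-disk file format.

```python
def generation_to_hex(generation):
    """Format a decimal generation number as lowercase hexadecimal."""
    return format(generation, "x")


def hex_to_generation(text):
    """Parse a hexadecimal generation string back into a decimal integer."""
    return int(text, 16)


hundredth = generation_to_hex(100)
```

Generation 100 becomes 64 because 6*16 + 4 = 100, and converting 64 back from hexadecimal yields 100 again.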

The indices folder contains one folder for each index contained in the snapshot. The folder name is the unique identifier (UUID) of the index in the repository, which is different from the UUID of the index in the actual index location. The repository UUID can be found in the index-N file in JSON format (the index-0 file in the above example). In that example, b1__UAx-TLulzX5eBHIrRg is the repository UUID of the index regev3.earth.local_sd_1.
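
The lookup of the repository UUID can be sketched as follows. The JSON below is a hand-written sample and its layout is an assumption about the shape of an index-N file. Only the identifiers quoted in this article are real; the snapshot name and the shard generations for shards 1 to 4 are placeholders.

```python
import json

# Hand-written sample in the assumed shape of an index-N file.
INDEX_N_SAMPLE = """
{
  "snapshots": [
    {"name": "example-snapshot", "uuid": "HDDskFlfSHi3nAKdtxoc5w"}
  ],
  "indices": {
    "regev3.earth.local_sd_1": {
      "id": "b1__UAx-TLulzX5eBHIrRg",
      "snapshots": ["HDDskFlfSHi3nAKdtxoc5w"],
      "shard_generations": [
        "cM7WJX9yRLiK46kBzbZkVA",
        "placeholder-gen-1",
        "placeholder-gen-2",
        "placeholder-gen-3",
        "placeholder-gen-4"
      ]
    }
  }
}
"""


def repository_index_id(repository_data, index_name):
    """Return the UUID that names the index folder inside indices/."""
    return repository_data["indices"][index_name]["id"]


def shard_count(repository_data, index_name):
    """Return the number of shards recorded for the index."""
    return len(repository_data["indices"][index_name]["shard_generations"])


repository_data = json.loads(INDEX_N_SAMPLE)
folder_name = repository_index_id(repository_data, "regev3.earth.local_sd_1")
shards = shard_count(repository_data, "regev3.earth.local_sd_1")
```

With this sample, the index regev3.earth.local_sd_1 maps to the folder indices/b1__UAx-TLulzX5eBHIrRg and has five shards.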

Contents of indices folders are shown below:

| File path | Explanation |
|---|---|
| indices/ | Data for all indices. |
| indices/b1__UAx-TLulzX5eBHIrRg/ | Index data for regev3.earth.local_sd_1. b1__UAx-TLulzX5eBHIrRg is the UUID of the index in the repository. Do not confuse it with the UUID of the index in the actual index location. |
| indices/b1__UAx-TLulzX5eBHIrRg/meta-OeG4ZoUBgXNiQzc99ea6.dat (meta-*.dat) | Index metadata of the regev3.earth.local_sd_1 index, serialized in JSON format. The metadata identifier for the index is looked up in the index-0 file, as shown in the above example. OeG4ZoUBgXNiQzc99ea6 is the metadata identifier for b1__UAx-TLulzX5eBHIrRg, which is the UUID in the repository of the index regev3.earth.local_sd_1. |
| indices/b1__UAx-TLulzX5eBHIrRg/0 through indices/b1__UAx-TLulzX5eBHIrRg/4 | Data of the index regev3.earth.local_sd_1 for shard 0 to shard 4. The “shard_generations” JSON node in the index-0 file specifies the number of shards for the index present in the snapshot. |

Each shard folder has files inside it.

| File path | Explanation |
|---|---|
| indices/b1__UAx-TLulzX5eBHIrRg | Index data for regev3.earth.local_sd_1. b1__UAx-TLulzX5eBHIrRg is the UUID of the index in the repository. Do not confuse it with the UUID of the index in the actual index location. |
| indices/b1__UAx-TLulzX5eBHIrRg/0 | Data of the index regev3.earth.local_sd_1 for shard 0. |
| indices/b1__UAx-TLulzX5eBHIrRg/0/__1JLTOwvTQryKoU0V8bssZA, indices/b1__UAx-TLulzX5eBHIrRg/0/__4WTXofHjQoi9h2I9FTiNAg, ... | Segment files. The mapping of these files to the actual segments is recorded in the snap-* file. |
| indices/b1__UAx-TLulzX5eBHIrRg/0/snap-HDDskFlfSHi3nAKdtxoc5w.dat | BlobStoreIndexShardSnapshot for the snapshot HDDskFlfSHi3nAKdtxoc5w, serialized in SMILE format. |
| indices/b1__UAx-TLulzX5eBHIrRg/0/index-cM7WJX9yRLiK46kBzbZkVA | BlobStoreIndexShardSnapshots for shard 0, serialized in SMILE format. The file name ends with the UUID of the shard generation. See the JSON of index-0 above: “shard_generations” in that file shows the UUID cM7WJX9yRLiK46kBzbZkVA for shard 0. |
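
The naming rules in the tables above can be summarized in a short sketch that composes the paths of a shard folder and classifies the files found in it. The function names are hypothetical, and the classification relies only on the file name prefixes described in this article.

```python
from pathlib import PurePosixPath


def shard_blob_paths(index_repo_id, shard, snapshot_uuid, shard_generation):
    """Compose the repository-relative paths of a shard's metadata files."""
    shard_dir = PurePosixPath("indices") / index_repo_id / str(shard)
    return {
        "shard_dir": shard_dir,
        "shard_snapshot": shard_dir / f"snap-{snapshot_uuid}.dat",
        "shard_index": shard_dir / f"index-{shard_generation}",
    }


def classify_blob(file_name):
    """Classify a file in a shard folder by its name prefix."""
    if file_name.startswith("__"):
        return "segment data"
    if file_name.startswith("snap-") and file_name.endswith(".dat"):
        return "shard snapshot"
    if file_name.startswith("index-"):
        return "shard index"
    return "unknown"


paths = shard_blob_paths(
    "b1__UAx-TLulzX5eBHIrRg", 0,
    "HDDskFlfSHi3nAKdtxoc5w", "cM7WJX9yRLiK46kBzbZkVA",
)
kinds = [classify_blob(name) for name in (
    "__1JLTOwvTQryKoU0V8bssZA",
    "snap-HDDskFlfSHi3nAKdtxoc5w.dat",
    "index-cM7WJX9yRLiK46kBzbZkVA",
)]
```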

Size of snapshot

The size of a snapshot is the size of the index data present in the index at that time plus the size of the snapshot metadata files. An uncompressed snapshot is therefore larger than the index data. Elasticsearch supports compression of snapshots, which must be specified when the index snapshot repository is created in Elasticsearch. Enterprise Vault registers the index snapshot repository with Compress=True. Therefore, the size of the snapshot is the same as or less than the size of the index data present in the index at that time.

Size of the snapshot repository

The size of the snapshot repository varies. It depends on the sizes of the snapshots included in that repository and on the size of the index data present in the index when the snapshots are taken. The repository can sometimes be two to three times the size of the index data. As mentioned earlier, snapshot and restore work at the segment level and are incremental in that a segment is snapshotted only once, even if it is used in multiple snapshots. Whenever new items are indexed, segments are merged automatically in the background, and the new merged segments are also backed up. The same record therefore ends up in multiple segments in the repository. Over time, the snapshot repository becomes considerably larger than the index. Snapshotting is not incremental at the record level.

The size of the snapshot repository depends on indexing patterns. If many segments are merged between snapshots, deleting older snapshots can free a lot of space.

Deletion of snapshot

Any snapshot that is no longer needed can be deleted. Although snapshots are incremental, they have no significant ordering or dependency on one another, unlike most incremental backup solutions. Hence, the first snapshot can be deleted without affecting later snapshots, as long as it is deleted using the Elasticsearch API and not by deleting files directly from the filesystem.

Deleting a snapshot does not remove a segment as long as at least one other snapshot uses it, even if the segment was copied as part of the snapshot that is being deleted. Therefore, any older snapshot can be deleted without compromising the integrity of newer ones.

If snapshot S0 copies a segment that does not change, snapshot S1 does not copy it again but references it instead. If you delete snapshot S0, this segment stays in the repository because it is still used by snapshot S1. Snapshot S1 therefore contains all segments present in the cluster at the time it was taken, no matter when they were copied.

This means that a restore can be done from the latest snapshot alone, and older snapshots can be removed safely.

Deleting old snapshots removes only the segments that are not referenced by any remaining snapshot. Because many segments are still referenced, a deletion may free little space. How much space is freed again depends on indexing patterns. If many segments are merged between snapshots, you could save quite a lot of space.

Snapshot deletes are run entirely by the Master node. The Master first reads the repository to build a list of the indices, shards, and segments in the snapshot to be deleted. It then re-reads the metadata of all other snapshots and, importantly, the indexes and segments they contain. Next, it compares the list from the snapshot to be deleted with the list of items used by the other snapshots. Segments that are not referenced by any remaining snapshot are deleted, while all segments that are still needed are kept. The Master also deletes any indexes in the repository that are not referenced by the remaining snapshots.
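
The reference check behind a snapshot delete can be sketched with plain sets. This is a simplified model and not the Elasticsearch implementation: the function name and the segment names are made up, and each snapshot is reduced to the set of segments it references.

```python
def delete_snapshot(snapshots, name):
    """Remove snapshot `name` and return the segments that can be deleted.

    `snapshots` maps snapshot name -> set of segment names it references.
    A segment can be deleted only if no remaining snapshot references it.
    """
    doomed = snapshots.pop(name)
    still_used = set().union(*snapshots.values())
    return doomed - still_used


# S0 copied seg_a and seg_b; S1 references seg_a and copied seg_c.
snapshots = {
    "S0": {"seg_a", "seg_b"},
    "S1": {"seg_a", "seg_c"},
}
removed_with_s0 = delete_snapshot(snapshots, "S0")
removed_with_s1 = delete_snapshot(snapshots, "S1")
```

Deleting S0 removes only seg_b, because seg_a is still referenced by S1. Deleting S1 afterwards removes both remaining segments, since no snapshot is left to reference them.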

Snapshot recoveries

Restoring a snapshot is straightforward. First, the repository must be registered. Then, the Master reads the snapshot metadata from the repository, builds a list of the indexes, shards, and segments to restore, and restores them. The Master also restores the cluster metadata if needed.
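
At the Elasticsearch level, a restore is requested with a POST to `_snapshot/<repository name>/<snapshot name>/_restore`. The sketch below only composes the method, path, and body of such a request. The helper name, repository name, and snapshot name are placeholder assumptions, and it illustrates the underlying API rather than the Enterprise Vault restore procedure.

```python
import json


def restore_request(repo_name, snapshot_name, indices=None,
                    include_global_state=False):
    """Compose method, path, and JSON body of a snapshot restore request."""
    body = {"include_global_state": include_global_state}
    if indices:
        # Restrict the restore to the named indexes; omit to restore all.
        body["indices"] = ",".join(indices)
    path = f"/_snapshot/{repo_name}/{snapshot_name}/_restore"
    return "POST", path, json.dumps(body)


# Placeholder repository and snapshot names, assumed for illustration only.
method, path, body = restore_request(
    "example_repo", "example-snapshot",
    indices=["regev3.earth.local_sd_1"],
)
```

Setting include_global_state to true corresponds to the case where the cluster metadata is restored as well.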

Additional Information

JIRA: EV-5585