The mechanism used to retrieve large content from ECS was not working as expected and had to be updated.
When users trigger the retrieval of archived content by opening or copying a placeholder hosted on Celerra / VNX / VMAX3 eNAS* platforms, the OE for File processes the user I/O request and redirects it, via HTTP instructions, to the Enterprise Vault (EV) server for retrieval from the Vault Store Partition. These HTTP instructions are fragmented into multiple byte-range requests, each for one chunk of the content. EV then has to access the content from storage and return each byte-range chunk in response to its respective HTTP request. Large files may result in hundreds, thousands or even millions of requests.
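To give a feel for how quickly the number of requests grows, the fragmentation described above can be sketched as follows. This is only an illustration: the byte-range size actually used by the platform is not stated here, so chunk_size is an assumed input, and the function names byte_ranges and request_count are hypothetical.

```python
def byte_ranges(file_size, chunk_size):
    """Yield HTTP Range header values that cover file_size bytes in order.

    chunk_size is an assumption made for illustration; the real byte-range
    size is chosen by the requesting platform.
    """
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    for start in range(0, file_size, chunk_size):
        # The end position of an HTTP byte range is inclusive.
        end = min(start + chunk_size, file_size) - 1
        yield f"bytes={start}-{end}"


def request_count(file_size, chunk_size):
    """Number of byte-range requests needed (ceiling division)."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return -(-file_size // chunk_size)
```

For example, under the assumption of 64 KB ranges, request_count(4 * 1024**3, 64 * 1024) shows that a 4 GB file would need 65536 requests.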
When processing retrieval requests from Celerra / VNX / VMAX3 eNAS* platforms, EV creates byte-range temporary files in the EV Cache folder, which it uses to process the specific byte-range chunks.
If the Vault Store Partition hosting the archived content is configured on an ECS platform, all read operations are performed via the Streamer software, which reads the content in 64 KB chunks. These chunks are written to a temp file under the Vault Service Account (VSA) user profile before being written to the corresponding byte-range temp file in the EV Cache location.
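The two-stage flow described above (64 KB reads into a staging temp file, then a copy into the byte-range temp file) can be pictured with the minimal sketch below. It is not the Streamer or EV implementation: the function names are hypothetical, and any file-like objects (here, in-memory buffers) stand in for the ECS content and the two temp files.

```python
import io

CHUNK_SIZE = 64 * 1024  # 64 KB, the read size stated in the text above


def read_in_chunks(source, chunk_size=CHUNK_SIZE):
    """Yield successive chunks of at most chunk_size bytes from source."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stage_then_copy(source, staging, byte_range_file):
    """Copy source to staging in chunks, then staging to byte_range_file.

    Returns the number of chunked reads performed against source.
    """
    reads = 0
    for chunk in read_in_chunks(source):
        staging.write(chunk)
        reads += 1
    staging.seek(0)  # rewind the staging file before the second copy
    for chunk in read_in_chunks(staging):
        byte_range_file.write(chunk)
    return reads


# Usage with in-memory buffers standing in for the real files
content = io.BytesIO(b"x" * (2 * CHUNK_SIZE + 5))
staging_file, range_file = io.BytesIO(), io.BytesIO()
read_operations = stage_then_copy(content, staging_file, range_file)
```

In this usage example, content slightly larger than two chunks takes three reads, the last one being a short read.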
Depending on the size of the file being retrieved, EV creates one or more byte-range temp files to answer all the byte-range HTTP requests.
Ensure that the read method used with the file system hosting the FSA Placeholder shortcuts is configured properly. On the Celerra / VNX / VMAX3 eNAS* platforms, the file systems and their respective connections can read the data using one of three different methods.
The method is selected with the parameter read_policy_override, which can be set to one of the following:
- Full recall ('full') involves reading the entire file from secondary storage back onto the VNX, replacing the stub file with the file. After a full recall the file is no longer considered to be stored on secondary storage.
  Operation: On the first read request from the client, the entire file is recalled from secondary storage and stored in the primary file system, and the stub file reverts to a normal file. No data is returned to the client until all of it has been recalled to primary storage.
- Passthrough ('passthrough') involves reading the file from secondary storage and providing it to the client, without storing it on primary storage and leaving the stub file intact. The data is 'passed through' the VNX to the client. This is generally used for backups, as a backup may access every file on the file system and those files should not end up stored on primary storage.
  Operation: As the client reads the file, the VNX reads the requested data from secondary storage and returns it to the client. The recalled data does not land in the primary file system.
- Partial recall ('partial') involves recalling only the part of the file that the client is attempting to access. For example, if the client is only attempting to access a small part of the file, only that part will be recalled.
  Operation: As the client reads the file, the VNX reads the requested data from secondary storage and returns it to the client. The recalled data lands in the primary file system as it passes through. When all data has been recalled, the stub file reverts to a normal file.
The main consideration here is file size. If the file being retrieved is very large, e.g. several GB, the method 'full' means that the client requesting the content only starts receiving data after the entire file has been restored to the file system. By then, the client's operating system may already have reported a timeout error.
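The effect on the client can be estimated with a back-of-the-envelope calculation. The sketch below is a simplified model only: the function name, the recall throughput and the size of the first chunk are illustrative assumptions, not measured or documented values, and real timings also depend on network and storage latency.

```python
def seconds_until_first_byte(policy, file_size_bytes,
                             recall_bytes_per_sec, first_chunk_bytes):
    """Rough estimate of how long the client waits before receiving data.

    Simplified model (an assumption, not vendor behaviour in detail):
    - 'full' returns nothing until the whole file has been recalled;
    - 'passthrough' and 'partial' return data once the first requested
      chunk has been read from secondary storage.
    """
    if recall_bytes_per_sec <= 0:
        raise ValueError("recall_bytes_per_sec must be > 0")
    if policy == "full":
        pending = file_size_bytes
    elif policy in ("passthrough", "partial"):
        pending = min(first_chunk_bytes, file_size_bytes)
    else:
        raise ValueError(f"unknown read policy: {policy!r}")
    return pending / recall_bytes_per_sec


# Illustrative numbers only: a 4 GB file recalled at an assumed 50 MB/s
GB = 1024 ** 3
MB = 1024 ** 2
wait_full = seconds_until_first_byte("full", 4 * GB, 50 * MB, 64 * 1024)
wait_partial = seconds_until_first_byte("partial", 4 * GB, 50 * MB, 64 * 1024)
```

With these assumed figures the 'full' method keeps the client waiting for roughly 82 seconds, while 'partial' or 'passthrough' would deliver the first data after about a millisecond of transfer time.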
The following requirements need to be met in order to successfully process requests for very large files within the above implementation:
- Streamer 1.0.16 as a minimum
- EV 12.2.3 as a minimum
- The VNX file system / connection parameter read_policy_override should be set to 'passthrough' or 'partial' (recommended)
The Streamer software is provided by the platform vendor.
EV 12.2.3 was released with a fix that ensures correct processing of all the fragmented reads from storage and their subsequent redirection to the requesting platform.
With the read_policy_override methods 'passthrough' and 'partial', data starts being transferred to the client straight away, which avoids the timeout error at the client.
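The requirements listed above can be captured in a small pre-flight check, sketched below. The helper names are hypothetical, and the version strings and read policy must be supplied by the administrator: the sketch deliberately does not attempt to query Streamer, EV or the VNX, since no such interface is described here.

```python
MIN_STREAMER = (1, 0, 16)   # minimum Streamer version from the list above
MIN_EV = (12, 2, 3)         # minimum EV version from the list above
RECOMMENDED_POLICIES = {"passthrough", "partial"}


def parse_version(text):
    """Turn a dotted version string such as '12.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def check_requirements(streamer_version, ev_version, read_policy):
    """Return a list of human-readable problems (empty if all checks pass)."""
    problems = []
    if parse_version(streamer_version) < MIN_STREAMER:
        problems.append(f"Streamer {streamer_version} is older than 1.0.16")
    if parse_version(ev_version) < MIN_EV:
        problems.append(f"EV {ev_version} is older than 12.2.3")
    if read_policy.strip().lower() not in RECOMMENDED_POLICIES:
        problems.append(
            f"read_policy_override '{read_policy}' is not 'passthrough' or 'partial'"
        )
    return problems
```

Comparing versions as integer tuples rather than as strings matters here: as text, '1.0.9' would sort after '1.0.16', whereas the tuple comparison correctly treats 1.0.16 as the newer release.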
NOTE: As of March 2018, the VMAX3 eNAS platform is undergoing certification testing and is currently not fully supported. Once the platform has been fully certified by functional testing, the Enterprise Vault Compatibility Guide will be updated accordingly.