How to implement eDiscovery Platform (Clearwell) on VMware - best practice guide

Article ID: 100046951

Description

This document provides design and deployment considerations for implementing eDiscovery Platform on VMware.
Author: Kevin Graves

Scope of Document:

This document aims to provide guidance on designing and deploying eDiscovery Platform on the VMware vSphere platform.
This document should be used in conjunction with other performance and best practice guides as outlined in the “Related Documents” section of this document.

Intended Audience:

This document is aimed at system administrators, solutions architects, and consultants.
It is assumed that the reader has a thorough understanding of the architecture and operational aspects of eDiscovery Platform.
It is also assumed that the reader has experience and understanding of VMware vSphere.

 

Choosing the Right Platform for Your Environment

Virtualization technology has helped many customers introduce cost savings, both in terms of lowered data center power consumption and cooling requirements. Virtualization typically also simplifies the data center landscape through server consolidation, requiring less hardware to provide the same service to end users, with the added benefit of application-independent high availability.

Application architectures are rapidly evolving towards highly distributed, loosely coupled applications.  The conventional x86 computing model, in which applications are tightly coupled to physical servers, is too static and restrictive to efficiently support most modern applications.  With a virtual deployment, the architecture can be as modular as is appropriate, without expanding the hardware footprint.  The dynamic nature of virtual machines means that the design can grow and adapt as required, without the need for an initial “perfect” design.

Virtual deployments typically take minutes, can share currently deployed hardware, and can be adjusted “on the fly” when more resources are required.  Certain server applications, however, are less suitable for virtualization, especially those requiring heavy use of physical server resources such as CPU and memory.

Traditionally, customers have been reluctant to place applications with high service level agreements, such as Microsoft Exchange Server and SQL Server, on a virtual platform. This was not only because the application’s demand on resources meant that only one or two virtual machines could co-exist on a single server, but also because the application could not deliver the same performance it would have on a physical server.

A number of factors should be considered before deploying eDiscovery Platform in a VMware environment:

  • eDiscovery Platform is heavily dependent on CPU and memory resources. In a typical physical server configuration, it is not unusual for the CPU to run at 90% or higher utilization while ingesting data, running an OCR job or exporting.
    • Generally, the more powerful the processor, the better the ingestion and retrieval rates.
  • The minimum recommendation for CPU and memory configuration for Stand-Alone eDiscovery Platform is 32 CPU cores and 128GB RAM for an eDiscovery Platform server running Collections, Legal Holds and Cluster Master (with no cases).
    • If the eDiscovery Platform server will be used as a Worker Node for Pre-processing, Processing, Analysis and Review, the minimum recommended configuration is 24 CPU cores and 96GB RAM.
  • It is recommended that CPU and memory resources are dedicated (Reserved) and Locked to the eDiscovery Platform server, and not shared with other virtual machines on the host.
  • Other system components such as network and storage need to be sized accordingly to prevent them from becoming a bottleneck.

If the above considerations are acceptable and supported by the customer environment, then it is likely that virtualizing the eDiscovery Platform environment will be a good fit for the organization.

 

Sizing eDiscovery Platform for VMware

One of the most important considerations when sizing eDiscovery Platform is a thorough understanding of the expected workload on each of the eDiscovery Platform servers. The main consideration is the customer's requirements for collecting, processing, reviewing and exporting.
It is outside the scope of this document to provide a design and sizing introduction to eDiscovery Platform. In general terms, once the customer requirements are understood, a close look at the function of each eDiscovery Platform server will help determine what minimal server resources will be required.
The most common mistake when designing eDiscovery Platform is to size for capacity as opposed to sizing for performance.  The following sections in this guide provide detail on how to design the various components for optimal configuration.


** MINIMUM REQUIREMENTS **

FUNCTION                                             CPU CORES   RAM
Legal Hold Confirmation Server                       16          32GB
Legal Hold Server                                    16          32GB
Collections Server                                   32          64GB
Collection, Legal Hold and Confirmation Server       32          64GB
Pre-Processing, Analysis, Review and Export Server   24          96GB*
All features combined eDP Server                     32          128GB*

 

FUNCTION                                             CPU CORES   RAM
Cluster Master with MySQL on a separate server       32          128GB*
Cluster Master with MySQL                            48          128GB*
Worker Nodes                                         24          96GB*
Utility Nodes                                        4           8GB

 

* Indicates RAM is Reserved and Locked                                   
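
As an illustration, the minimum requirements can be written down as a lookup table and used to check a planned virtual machine before it is built. This is only a sketch: the values are copied from the tables above, while the names `Minimum`, `MINIMUMS` and `check_vm` are hypothetical and not part of eDiscovery Platform or VMware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Minimum:
    cpu_cores: int
    ram_gb: int
    reserved_and_locked: bool  # True where the table marks RAM with "*"


# Values copied from the minimum requirements tables in this guide.
MINIMUMS = {
    "Legal Hold Confirmation Server": Minimum(16, 32, False),
    "Legal Hold Server": Minimum(16, 32, False),
    "Collections Server": Minimum(32, 64, False),
    "Collection, Legal Hold and Confirmation Server": Minimum(32, 64, False),
    "Pre-Processing, Analysis, Review and Export Server": Minimum(24, 96, True),
    "All features combined eDP Server": Minimum(32, 128, True),
    "Cluster Master with MySQL on a separate server": Minimum(32, 128, True),
    "Cluster Master with MySQL": Minimum(48, 128, True),
    "Worker Nodes": Minimum(24, 96, True),
    "Utility Nodes": Minimum(4, 8, False),
}


def check_vm(function, cpu_cores, ram_gb, ram_reserved_and_locked):
    """Return a list of shortfalls; an empty list means the minimum is met."""
    minimum = MINIMUMS[function]
    problems = []
    if cpu_cores < minimum.cpu_cores:
        problems.append(f"CPU: {cpu_cores} cores, minimum is {minimum.cpu_cores}")
    if ram_gb < minimum.ram_gb:
        problems.append(f"RAM: {ram_gb}GB, minimum is {minimum.ram_gb}GB")
    if minimum.reserved_and_locked and not ram_reserved_and_locked:
        problems.append("RAM must be Reserved and Locked")
    return problems
```

For example, `check_vm("Worker Nodes", 16, 64, False)` reports three shortfalls, while `check_vm("Worker Nodes", 24, 96, True)` returns an empty list.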


Special Considerations:

  • Do not combine servers that have Reserved and Locked memory with servers that have non-Reserved/Locked resources.
  • Ensure that the total number of vCPUs assigned to the virtual machines is equal to or less than the total number of cores on the ESX host.
  • Do not enable Hyperthreading. In most cases this provides little or no benefit to multi-CPU virtual machines, and internal testing has shown that Hyperthreading provides no performance benefit.
  • For all other hardware and software recommendations, follow the Veritas Installation Guide for the correct version of the product.
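
The first two considerations lend themselves to a quick arithmetic check per host. The sketch below is hypothetical (`check_host` is not a VMware or Veritas tool) and assumes that "combine" means placing the servers on the same ESX host.

```python
def check_host(host_cores, vms):
    """Check one ESX host against the placement considerations.

    vms is a list of (name, vcpus, reserved_and_locked) tuples
    describing the virtual machines planned for the host.
    """
    problems = []

    # Total vCPUs must be equal to or less than the physical cores.
    total_vcpus = sum(vcpus for _, vcpus, _ in vms)
    if total_vcpus > host_cores:
        problems.append(
            f"{total_vcpus} vCPUs assigned, host has {host_cores} cores"
        )

    # Reserved/Locked servers must not share a host with servers that
    # are not Reserved/Locked (assumption: "combine" means same host).
    if len({reserved for _, _, reserved in vms}) > 1:
        problems.append("host mixes Reserved/Locked and non-Reserved/Locked servers")

    return problems
```

A host with 48 cores that carries a 32 vCPU Cluster Master, a 24 vCPU Worker Node and a 4 vCPU Utility Node fails both checks: 60 vCPUs exceed 48 cores, and the Utility Node is not Reserved and Locked.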

Throttling: for limited use (smaller or non-production environments)

When resources are limited, throttling (lowering the number of active threads) may also be required to obtain the desired numbers.  The technical articles below can be used as a guide to adjust the outcome of the Performance Monitor counter results.

How to adjust ASM Memory Components
How to Throttle an eDP Process

 

 

Using Performance Monitor Counters to assist in sizing

Considerations:

  • The information provided by the Performance Reports is an average, so the resource consumption peaks are higher than the average.
  • To provide a useful report, the Performance Monitor data collection must run only during the action of concern (for example, during ingestion of data or the running of a Collection Task).
  • The Performance Monitor data collection task must be stopped immediately after the action of concern has completed.

Create and execute the Data Collector Set:

  1. perfmon.msc > Data Collector Sets > User Defined > New > Data Collector Set
  2. Name: Bottle Necks > Create manually > Next
  3. Create data logs > Performance counter > Next
  4. Add (select each of the counters listed below)
  5. Sample interval: 15 seconds > Next
  6. Save the report to a folder that is easy to access.
  7. Start the 'Bottle Necks' Counter
  8. Start the new data ingestion
  9. Stop the 'Bottle Necks' Counter immediately after the completion of data ingestion.
  10. Open the *.blg file(s) for analysis.


COUNTERS:

  • Memory:  Available Bytes
  • Memory:  Cache Faults/sec
  • Memory:  Page Faults/sec
  • Memory:  Page Reads/sec
  • Memory:  Page Writes/sec
  • Memory:  Pages/sec
  • Physical Disk: Avg. Disk Queue Length
  • Physical Disk: Avg. Disk Read Queue Length
  • Physical Disk: Avg. Disk Write Queue Length
  • Physical Disk: Avg. Disk sec/Read
  • Physical Disk: Avg. Disk sec/Write
    • (for the Physical Disk counters, select each hard drive - do not select _Total or <All instances>)
  • Logical Disk: Avg. Disk Queue Length
  • Logical Disk: Avg. Disk Read Queue Length
  • Logical Disk: Avg. Disk Write Queue Length
  • Logical Disk: Avg. Disk sec/Read
  • Logical Disk: Avg. Disk sec/Write
    • (for the Logical Disk counters, select each hard drive - do not select _Total or <All instances>)
  • Paging File: % Usage
  • Processor:  % Processor Time
    • (select each individual processor - do not select _Total or <All instances>)
  • Process: Page File Bytes
    • (select each cwjava process and OCR if available)
  • System:  Processor Queue Length
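
The Data Collector Set can also be created from the command line with the Windows `logman` utility instead of the perfmon.msc wizard. The sketch below only builds the argument list; it does not run anything. The helper names, the instance names passed in (disks, processors, processes) and the use of the `_Total` instance for the paging file are assumptions for illustration, and the counter paths should be verified on the target server before use.

```python
MEMORY = ["Available Bytes", "Cache Faults/sec", "Page Faults/sec",
          "Page Reads/sec", "Page Writes/sec", "Pages/sec"]
DISK = ["Avg. Disk Queue Length", "Avg. Disk Read Queue Length",
        "Avg. Disk Write Queue Length", "Avg. Disk sec/Read",
        "Avg. Disk sec/Write"]


def counter_paths(physical_disks, logical_disks, processors, processes):
    """Build counter paths for explicit instances (no _Total for disks/CPUs)."""
    paths = [f"\\Memory\\{c}" for c in MEMORY]
    paths += [f"\\PhysicalDisk({d})\\{c}" for d in physical_disks for c in DISK]
    paths += [f"\\LogicalDisk({d})\\{c}" for d in logical_disks for c in DISK]
    # Assumption: a single paging file total is sufficient here.
    paths.append("\\Paging File(_Total)\\% Usage")
    paths += [f"\\Processor({p})\\% Processor Time" for p in processors]
    paths += [f"\\Process({p})\\Page File Bytes" for p in processes]
    paths.append("\\System\\Processor Queue Length")
    return paths


def logman_create_args(name, paths, output, interval_seconds=15):
    """Arguments for 'logman create counter' writing a binary (.blg) log."""
    return ["logman", "create", "counter", name, "-c", *paths,
            "-si", str(interval_seconds), "-f", "bin", "-o", output]


# Hypothetical instance names and output folder; adjust to the server.
paths = counter_paths(["0 C:"], ["C:"], ["0", "1"], ["cwjava"])
args = logman_create_args("Bottle Necks", paths, r"D:\PerfLogs\BottleNecks")
```

The resulting list can be passed to `subprocess.run(args, check=True)` on the eDiscovery Platform server. `logman start "Bottle Necks"` and `logman stop "Bottle Necks"` then correspond to steps 7 and 9.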

Analyzing the counters

Note:  This analysis is only for eDiscovery Platform performance.

MEMORY:

  • Available Bytes (Amount of memory available to the server)
  • Cache Faults/sec (This will always be HIGH, >2K)
  • Page Faults/sec (This will always be HIGH, >15K)
  • Page Reads/sec (Should not exceed 15)
  • Page Writes/sec (Should not exceed 80)
  • Pages/sec (An average of 20 pages or less per second is normal)


LOGICAL and PHYSICAL DISK:

  • Avg. Disk Queue Length (Should not be higher than the number of spindles plus 2)
  • Avg. Disk Read Queue Length (Should be less than 2)
  • Avg. Disk Write Queue Length (Should be less than 2)
  • Avg. Disk sec/Read (Should be under 20ms; over 50ms indicates a serious bottleneck)
  • Avg. Disk sec/Write (Manufacturer dependent)


PAGING FILE:

  • % Usage (Should be below 1.0)


PROCESSOR:

  • % Processor Time (Average between 15% - 20%)


PROCESS:

  • Page File Bytes (The first instance is Clearwell; for all other processes, a higher value is more efficient, >9GB)


SYSTEM:

  • Processor Queue Length (Should not exceed 2 per CPU. For example, if the server contains 16 CPUs, the count should not exceed 32)
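
The numeric limits above can be applied mechanically to the averages read from the report. The following is a minimal sketch: the function name and dictionary keys are hypothetical, and it assumes that Avg. Disk sec/Read is reported in seconds (so 20ms is 0.020). It covers only the counters for which this guide gives a fixed limit.

```python
def check_averages(avg, cpus, spindles):
    """Return the names of counters whose average is outside the guide's limits.

    avg maps a counter name to its average over the collection run.
    """
    checks = [
        ("Page Reads/sec", avg["Page Reads/sec"] <= 15),
        ("Page Writes/sec", avg["Page Writes/sec"] <= 80),
        ("Pages/sec", avg["Pages/sec"] <= 20),
        # Queue length may equal, but not exceed, spindles plus 2.
        ("Avg. Disk Queue Length", avg["Avg. Disk Queue Length"] <= spindles + 2),
        ("Avg. Disk Read Queue Length", avg["Avg. Disk Read Queue Length"] < 2),
        ("Avg. Disk Write Queue Length", avg["Avg. Disk Write Queue Length"] < 2),
        # Assumption: value is in seconds, so 20ms is 0.020.
        ("Avg. Disk sec/Read", avg["Avg. Disk sec/Read"] < 0.020),
        ("Paging File % Usage", avg["Paging File % Usage"] < 1.0),
        # 2 per CPU: 16 CPUs allow a queue length of up to 32.
        ("Processor Queue Length", avg["Processor Queue Length"] <= 2 * cpus),
    ]
    return [name for name, ok in checks if not ok]


sample = {
    "Page Reads/sec": 10, "Page Writes/sec": 40, "Pages/sec": 12,
    "Avg. Disk Queue Length": 5, "Avg. Disk Read Queue Length": 1.2,
    "Avg. Disk Write Queue Length": 0.8, "Avg. Disk sec/Read": 0.012,
    "Paging File % Usage": 0.4, "Processor Queue Length": 32,
}
flagged = check_averages(sample, cpus=16, spindles=4)
```

Because the report values are averages, a counter that passes here can still peak above its limit during the run.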

Memory bottlenecks:

  • Page Reads/sec is HIGH: A memory page needed by the program is not located in physical RAM and must be read from disk.
    Recommendation:  Increase the allotted RAM.  Reserve and Lock the RAM allocated to the VM.
  • Pages/sec is HIGH: This counter tracks the hard page faults (should not exceed 80).  Microsoft states: If you have a high rate of page faults combined with a high rate of page reads then you may have an issue where you have insufficient RAM given the high rate of hard faults.
    Recommendation: When both Pages/sec and Page Reads/sec are very high, treat the server as short of RAM and follow the recommendation above.

 

HDD bottlenecks: (Verify the issue is not a memory bottleneck first)

  • Avg. Disk Read Queue Length and Avg. Disk Write Queue Length are both above 2: The HDD array is inadequate to handle the workload.
    Recommendation: Add more or improved HDDs to the array.
  • Avg. Disk Read Queue Length is low, but Avg. Disk Write Queue Length is high (>50): The antivirus exclusions are not in place.
    Recommendation:  Apply the appropriate AV exclusions as outlined in the technical article:  https://www.veritas.com/support/en_US/article.100013987

 

CPU bottlenecks:

  • % Processor Time and Processor Queue Length are HIGH: The amount of CPU resources is not adequate for the job requested.
    Recommendation: Add more CPUs to the environment.
  • Processor Queue Length is HIGH with % Processor Time LOW.
    Recommendation:  Move the MySQL database to a remote server.
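
The bottleneck rules can be read as a small decision procedure: memory is checked first, disk is considered only once memory is ruled out, and CPU is assessed from the queue length together with % Processor Time. The sketch below is one possible reading, not a Veritas tool. The function and constant names are hypothetical, "HIGH" for Page Reads/sec and Processor Queue Length reuses the limits from the analysis section, and because this guide gives no numeric threshold for a HIGH % Processor Time, the caller supplies that judgment as `cpu_time_high`.

```python
MEMORY = "memory: increase RAM; Reserve and Lock the RAM allocated to the VM"
DISK_ARRAY = "disk: add more or improved HDDs to the array"
DISK_AV = "disk: apply the antivirus exclusions"
CPU_ADD = "cpu: add more CPUs to the environment"
CPU_MYSQL = "cpu: move the MySQL database to a remote server"


def diagnose(page_reads, read_queue, write_queue,
             processor_queue, cpus, cpu_time_high):
    """Map averaged counter values to the recommendations in this guide."""
    findings = []

    if page_reads > 15:
        # Memory bottleneck: do not assess disk until this is resolved.
        findings.append(MEMORY)
    elif read_queue > 2 and write_queue > 2:
        findings.append(DISK_ARRAY)
    elif read_queue < 2 and write_queue > 50:
        findings.append(DISK_AV)

    if processor_queue > 2 * cpus:
        findings.append(CPU_ADD if cpu_time_high else CPU_MYSQL)

    return findings
```

For instance, `diagnose(5, 1, 60, 10, 16, False)` returns only the antivirus recommendation, whereas the same disk values with `page_reads=40` return only the memory recommendation.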

 
