Statistical Sampling stops adding items into Review

Article ID: 100053975

Description

Error Message

No error message is present.

Cause

The first component of Random Sampling is Department Tagging, which is performed by the EV Storage service. The CA Departments and the Monitored Employees' memberships are sent to EV Storage. Storage Tagging adds CA Department Tags to a message or item (the two terms are used interchangeably here) based on the mapping of Monitored Employee SMTP addresses to Departments that the Storage service holds at the time the message is processed. Information about the Tagged messages is sent to CA, where the messages wait for Random Sampling. Random Sampling then processes each message according to the Sampling Mode.

A message must contain the following to be eligible for Tagging:

- A sender and/or recipient that is a Monitored Employee (an Employee that is a member of one or more Departments).
- A valid message type, e.g.: Exchange, Domino, Bloomberg, Fax. This is the EV Index metadata attribute Vault.MsgType.
- A valid message direction, e.g.: internal, external inbound, external outbound. This is the EV Index metadata attribute Vault.MsgDirection.
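The three eligibility criteria above can be expressed as a single check. The following is only an illustrative sketch, not the product's implementation: the `Message` class, the `is_eligible_for_tagging` function, and the `monitored_addresses` set are hypothetical names, and the valid types and directions are limited to the examples listed in this article.

```python
from dataclasses import dataclass

# Example values taken from this article; the real lists may be longer.
VALID_TYPES = {"Exchange", "Domino", "Bloomberg", "Fax"}
VALID_DIRECTIONS = {"internal", "external inbound", "external outbound"}


@dataclass
class Message:
    sender: str
    recipients: list
    msg_type: str       # stands in for Vault.MsgType
    msg_direction: str  # stands in for Vault.MsgDirection


def is_eligible_for_tagging(msg, monitored_addresses):
    """Return True when all three eligibility criteria are met.

    monitored_addresses is a set of lower-case SMTP addresses of
    Monitored Employees (hypothetical input for this sketch).
    """
    participants = [msg.sender, *msg.recipients]
    has_monitored = any(a.lower() in monitored_addresses for a in participants)
    return (
        has_monitored
        and msg.msg_type in VALID_TYPES
        and msg.msg_direction in VALID_DIRECTIONS
    )
```

A message that fails any one of the three checks is not Tagged and therefore never reaches Random Sampling.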

Storage Tagging processes each eligible message based on the above criteria and adds the following metadata tags:

- KVSCA.Department: the DepartmentID(s) of a Monitored Employee that is a sender and/or recipient.
- KVSCA.DeptRecips: the DepartmentID(s) associated with the recipients.
- KVSCA.DeptAuthor: the DepartmentID(s) associated with the author.
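To illustrate how the three tags relate to each other, here is a minimal sketch that derives them from a lookup of SMTP address to DepartmentIDs. The `department_tags` function and the `memberships` dictionary are hypothetical, and representing each tag value as a sorted list is an assumption made for readability, not the stored format.

```python
def department_tags(sender, recipients, memberships):
    """Build the three KVSCA tags for one message.

    memberships maps a lower-case SMTP address to the DepartmentIDs the
    Monitored Employee belongs to (hypothetical structure for this sketch).
    """
    author_depts = set(memberships.get(sender.lower(), ()))
    recipient_depts = set()
    for address in recipients:
        recipient_depts |= set(memberships.get(address.lower(), ()))
    return {
        # Departments of any Monitored Employee on the message.
        "KVSCA.Department": sorted(author_depts | recipient_depts),
        # Departments reached through the recipients only.
        "KVSCA.DeptRecips": sorted(recipient_depts),
        # Departments reached through the author only.
        "KVSCA.DeptAuthor": sorted(author_depts),
    }


memberships = {"alice@example.com": [1], "bob@example.com": [2, 3]}
tags = department_tags(
    "alice@example.com", ["bob@example.com", "x@other.com"], memberships
)
print(tags)
```

In this example the author belongs to Department 1 and one recipient belongs to Departments 2 and 3, so KVSCA.Department holds all three IDs while the other two tags hold the author and recipient subsets.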

All messages tagged by an inclusion policy will be included in the Random Sampling result set, which could cause the total number of messages captured to exceed the Policy Percentage calculations. An include tag supersedes an exclude tag, meaning a message having both an include tag and an exclude tag will be treated as an inclusion message by CA. Inclusion tags also supersede any Capping values in Statistical Sampling. If the number of inclusion-tagged messages exceeds the Random Sampling percentage calculations, the inclusion-tagged messages will be included in the Review Set and no more messages will be added by the Random Sampling process.
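The precedence rules in the paragraph above can be sketched as follows. This is an illustration only: the labels `"include"` and `"exclude"` and both function names are hypothetical, and the sketch assumes the percentage target has already been calculated elsewhere.

```python
def effective_policy(tag_kinds):
    """Resolve the policy outcome for one message.

    tag_kinds is a set that may contain "include" and/or "exclude"
    (hypothetical labels). An include tag wins over an exclude tag.
    """
    if "include" in tag_kinds:
        return "include"
    if "exclude" in tag_kinds:
        return "exclude"
    return "random"  # left to the normal Random Sampling selection


def random_slots_remaining(percentage_target, inclusion_count):
    """How many messages Random Sampling may still add.

    Inclusion-tagged messages always go to the Review Set; once they
    meet or exceed the percentage target, nothing more is added.
    """
    return max(0, percentage_target - inclusion_count)
```

For example, with a percentage target of 5 messages and 8 inclusion-tagged messages, all 8 go to the Review Set and `random_slots_remaining(5, 8)` is 0.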

Guaranteed Sampling is the default Sampling method. Using this Sampling method, Storage Tagging does not determine which messages need to be Sampled. All messages that were Department Tagged are sent to CA for Sampling. CA runs Random Sampling once a day at 1 AM by default. At the Sampling time, the Random Sampling process starts by checking for any enabled Guaranteed Sample Searches. If Guaranteed Sample Searches are configured, they will be run with Random Sampling; if Guaranteed Sample Searches are not configured, Random Sampling will run by itself.

Guaranteed Sample Searches work in combination with Random Sampling. Simply put, a Guaranteed Sample Search is a Search that runs during Random Sampling. Think of Guaranteed Sample Searches as a more focused Random Sample that uses key Search terms instead of just randomly selecting messages. Random Sampling will first run any Guaranteed Sample Searches and will add their results into Review. Then the Random Sampling process looks at the Guaranteed Sample Search results and asks a question: "Do the Guaranteed Sample Search results meet or exceed the Monitoring Policy Percentages per Monitored Employee in this Department?" If the answer is yes, then the Random Sampling process does not need to randomly add any more messages, as the Monitoring Policy Percentages have been fulfilled by the Guaranteed Sample Searches, and it moves on to the next Department. If the answer is no, then the Random Sampling process goes to work and randomly selects as many messages as needed to fulfil the Monitoring Policy Percentage per Monitored Employee in the Department, before moving on to the next Department.
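The yes/no decision described above can be sketched for a single Monitored Employee as follows. This is a simplified illustration under stated assumptions: the function name and parameters are hypothetical, the percentage is an integer, and the target is rounded up, which this article does not specify.

```python
import random


def top_up_after_searches(available_ids, search_hit_ids, percent, rng=None):
    """Return the message IDs Random Sampling would add on top of the
    Guaranteed Sample Search hits (hypothetical helper).

    available_ids: IDs available for Sampling for one Monitored Employee.
    search_hit_ids: IDs already added by Guaranteed Sample Searches.
    percent: integer Monitoring Policy Percentage.
    """
    rng = rng or random.Random()
    # Assumption: the target is rounded up to a whole message.
    target = (len(available_ids) * percent + 99) // 100
    hits = set(search_hit_ids) & set(available_ids)
    shortfall = target - len(hits)
    if shortfall <= 0:
        return []  # the searches already met the percentage
    remaining = [m for m in available_ids if m not in hits]
    return rng.sample(remaining, min(shortfall, len(remaining)))
```

With 20 available messages at 10 percent, two search hits already satisfy the target of 2, so nothing is added; with one hit, one further message is chosen at random.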

After all Guaranteed Sample Searches have completed, Random Sampling using Guaranteed Sampling looks at a Monitored Employee as the entity being Sampled. This means the Sampling Percentage is applied per Monitored Employee per message type per message direction per Department. For each Monitored Employee in the Department, Random Sampling will review each Policy Percentage for each message type and direction, review the messages available for Sampling, randomly select messages to meet the Policy Percentages for each message type and message direction, then add those Randomly Sampled messages to the Review Set. This process is repeated for each Monitored Employee in each Department using all messages available for Random Sampling. Once Random Sampling is complete, all messages that were used for that day's Sampling are removed so they will not be processed in the next Random Sampling run.
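As a rough sketch of what "per Monitored Employee per message type per message direction per Department" means in practice, the following groups one Department's messages by that key and samples each group separately. The function name, the tuple layout, and the `policy_percent` dictionary are hypothetical, and rounding each target up is an assumption.

```python
import random
from collections import defaultdict


def guaranteed_sample(messages, policy_percent, rng=None):
    """Select messages for one Department, one employee group at a time.

    messages: iterable of (msg_id, employee, msg_type, direction) tuples.
    policy_percent: dict mapping (msg_type, direction) to an integer
    percentage between 0 and 100 (hypothetical structure).
    """
    rng = rng or random.Random()
    groups = defaultdict(list)
    for msg_id, employee, msg_type, direction in messages:
        groups[(employee, msg_type, direction)].append(msg_id)

    selected = []
    for (_employee, msg_type, direction), ids in groups.items():
        percent = policy_percent.get((msg_type, direction), 0)
        # Assumption: each group's target is rounded up.
        target = (len(ids) * percent + 99) // 100
        selected.extend(rng.sample(ids, target))
    return selected
```

Because every Monitored Employee forms their own group, an employee with only a few messages can still contribute a message to the Review Set under the round-up assumption.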

In Statistical Sampling, the process is different. Messages are still Tagged by the EV Storage service. However, it is the Storage Tagging process, not CA, that runs Random Sampling throughout the day, either when a certain number of messages is waiting to be Randomly Sampled (100 by default) or when a certain timeout has elapsed since the last time it ran Random Sampling (10 minutes by default). Statistical Sampling looks at the Department as the entity being Sampled. This means the Sampling Percentage is applied per message type per message direction per Department, not per Monitored Employee per message type per message direction per Department. This is a key difference between Guaranteed Sampling and Statistical Sampling, and it can result in Statistical Sampling returning fewer results than Guaranteed Sampling.
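The two conditions that start a Storage Tagging Random Sampling run can be sketched as a simple check. The function and parameter names are hypothetical; only the default values of 100 messages and 10 minutes come from this article.

```python
def should_run_sampling(pending_count, seconds_since_last_run,
                        batch_size=100, timeout_seconds=600):
    """Return True when either trigger described in the article is met:
    enough messages are waiting, or the timeout has elapsed."""
    return (
        pending_count >= batch_size
        or seconds_since_last_run >= timeout_seconds
    )
```

For example, 40 pending messages after 2 minutes would not start a run, while the same 40 messages after 10 minutes would.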

For each Department, Random Sampling (performed by EV Storage Tagging) will review each Policy Percentage for each message type and direction, review the messages available for Sampling, randomly select messages to meet the Policy Percentages for each message type and message direction, then send the Randomly Sampled messages to CA. This process is repeated for each Department using all messages available for Random Sampling. Once the Randomly Sampled messages are sent to CA for all Departments, all messages that were not Randomly Sampled are removed so they will not be processed in the next EV Storage Tagging Random Sampling run. CA will still run Random Sampling every day at 1 AM by default, but now it simply adds all messages it received from Storage Tagging to the Review Set, as they have already been Randomly Sampled by the Storage Tagging process.
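A small worked example may help show why applying the percentage per Department can return fewer messages than applying it per Monitored Employee. The helper names below are hypothetical, and the sketch assumes each target is rounded up to a whole message, which this article does not state.

```python
def round_up_percent(count, percent):
    """Integer percentage of count, rounded up (an assumption)."""
    return (count * percent + 99) // 100


def per_employee_total(counts_by_employee, percent):
    """Guaranteed Sampling style: one target per Monitored Employee."""
    return sum(round_up_percent(n, percent) for n in counts_by_employee.values())


def per_department_total(counts_by_employee, percent):
    """Statistical Sampling style: one target for the whole Department."""
    return round_up_percent(sum(counts_by_employee.values()), percent)


# Ten employees with five messages each, one type and direction, 10 percent.
counts = {f"employee{i}": 5 for i in range(10)}
print(per_employee_total(counts, 10))    # 10
print(per_department_total(counts, 10))  # 5
```

Under these assumptions each employee contributes at least one message in the per-employee calculation, giving 10 in total, whereas the Department as a whole has 50 messages and 10 percent of that is only 5.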

When using Statistical Sampling, if Sampling suddenly stops adding items into the Review Set and there are no indications of any issues in the EV Event Logs on the CA and EV servers, this could indicate that the Sampling Mode is not being correctly registered with CA.

Resolution

To resolve this issue, toggle the Sampling Mode as follows:

1. Connect to the CA Customer via the Client using an account having rights to edit the Configuration Settings, such as the Vault Service Account (VSA).
2. Click on the Configuration tab, click on the Settings sub-tab, then expand the Random Capture section to locate the number under the Value column for the Sampling mode Setting. This should be set to 0 for Statistical Sampling.
3. Click on the number 0, edit it from 0 to 1, then click the Save button and acknowledge any prompts to restart remoting, Customer's Background Tasks or services.
4. Restart the Enterprise Vault Accelerator Manager Service in the Services console on the CA server.
5. Repeat the above steps, including the restart of the Enterprise Vault Accelerator Manager Service, to change the Sampling mode Setting back from 1 to 0.

Issue/Introduction

Enterprise Vault (EV) Compliance Accelerator (CA) Random Sampling stops adding items into the Review Set when using Statistical Sampling. There are no Warning or Error Events in the EV Event Logs on the CA server, nor any indication of errors in the Review Set. There also are no known issues with EV Archiving and/or Indexing, and no Warning or Error Events in the EV Event Logs on the EV servers.