How to reprocess an NSF or PST email file.

Article ID: 100038378

Description

After processing an email file such as an NSF or PST file, the Manage Sources page might still show items left to be processed (Figure 1).

Figure 1.

One situation in which this can happen is if there were issues with the Crawler during processing. In such situations, reprocessing the same email file can often process the remaining items. However, by default, if an NSF or PST email file has already been processed, future requests to reprocess the same file are ignored. To reprocess the file, the crawl status associated with it must first be reset. This HOWTO explains how to reset it so that the email file can be reprocessed.

 

The remainder of this HOWTO details the steps to reprocess an example NSF file called nsf_file_2.nsf.

  1. Make a note of the current date range filter applied to the email file.

    Note: We will need this information later, when we reset the date range filter.

    Determine the current date range filter as follows:

    • Go to Processing > Processing Status > Processing Statistics
    • Select the most recent processing batch where the email file was processed.
    • Click the Export button in the bottom-left corner of the screen (Figure 2).

      Figure 2.

      This will generate a CSV report.

      The following example report shows that none of the sub-sources under the case folder source DatasetA have an explicit date range filter specified. They are therefore inheriting the date range of the case folder source, which is 'everything on or after 1st October 2007' (Figure 3):

      Figure 3.

       
  2. Make a note of which sub-sources are currently disabled and also the "Last Indexed" value of the email file.

    When we reprocess the NSF file, we will first disable the other sub-sources under the same case folder source. After the NSF file has been reprocessed, we will re-enable any sub-sources that we disabled. In order to do this, we need to record the current state of the other sub-sources.

    It is also worth recording the Last Indexed value, because its meaning depends on the file type. For PST files, it is the last crawl date, as specified in the processing options. If no dates were specified for a PST file, Clearwell crawls to the date 30826 and processes. For NSF files, if no dates were specified in the processing options, the Last Indexed time is the machine time when indexing was last run on this source. This is also explained in the Case Administration Guide.

    Capture the current contents of the Manage Sources page as follows:

    • Go to Processing > Sources & Pre-Processing > Manage Sources
    • Use the Export Table option (Figure 4):

      Figure 4.

      This will generate a CSV report. There is no need to select any particular source; the report will always include all of them. 

      The Mailboxenabled field in the following example report shows that all of the sub-sources are currently enabled (Figure 5):

      Figure 5.

       

  3. Reset the date range filter of the email file.

     

    • Go to Processing > Sources & Pre-Processing > Pre-Processing Options
    • Select only the single email file. A quick way to do this is to use the "only" link to the right of the file (Figure 6):

      Figure 6.

       

    • Set any date range for the email file.
      For example, change the date range to All Dates and then click the Apply button.
       
    • Change the date range back to what it was before.
      Examine the report that was generated earlier from the Processing > Processing Status > Processing Statistics page. This report states that the date range was 'everything on or after 1st October 2007'. Set this as the date range for the email file and then click the Apply button again (Figure 7):

      Figure 7.

      Note: There is now a green exclamation mark to the left of the sub-source. It is there because we have now explicitly defined a date range for this sub-source and are therefore overriding the date range previously inherited from the parent case folder source. If you hover your mouse over the green exclamation mark, you will see the following message: Processing options are set on this sub-source which override those of the source.

      The Manage Sources page also shows the green exclamation mark, and the Last Indexed value has been reset to Never Indexed for the email file (Figure 8):

      Figure 8.


      We are now almost ready to reprocess the email file.
  4. Disable the other sub-sources.

    We only want to reprocess one email file. We should therefore disable the other sub-sources under the same source.
    On the Processing > Sources & Pre-Processing > Manage Sources page, select the other sub-sources, choose the Disable processing option and then click the Go button (Figure 9):

    Figure 9.

     

  5. Select the email file and reprocess it.

    On the Manage Sources page, with the other sub-sources still disabled, select the email file, choose Start processing source without discovery and then click the Go button (Figure 10):

    Figure 10.

     

  6. Re-enable sub-sources that were disabled.

    After the email file has been reprocessed, re-enable the other sub-sources that were disabled earlier.

     

  7. [OPTIONAL] Remove the green exclamation mark associated with the email file.
     
    • Go to Processing > Sources & Pre-Processing > Pre-Processing Options
    • Use the "only" link to select just the email file.
    • Click the Remove Overrides from Selected Items button (Figure 11). There is no need to click the Apply button afterwards.

      Figure 11.

      This will remove the explicit date range filter associated with the email file sub-source. The email file will therefore once again inherit the date range filter from the parent case folder source, and the green exclamation mark will no longer be displayed.
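
The bookkeeping in steps 2, 4 and 6 can also be done with a small script instead of by eye. The following is a minimal sketch, not part of the product: it reads the CSV exported from the Manage Sources page in step 2 and lists the sub-sources that were enabled at that time, which are the ones to disable in step 4 and re-enable in step 6. The Mailboxenabled column is the field shown in the example report; the name column (Mailboxname), the true/false values, and the sample rows are assumptions made for illustration, so adjust them to match the actual export.

```python
import csv
import io

# Hypothetical sample of a Manage Sources export. Only the "Mailboxenabled"
# column name comes from the article; "Mailboxname", the true/false values,
# and these rows are assumptions for illustration.
SAMPLE_EXPORT = """Mailboxname,Mailboxenabled
nsf_file_1.nsf,true
nsf_file_2.nsf,true
nsf_file_3.nsf,false
"""


def read_enabled_state(csv_text, name_col="Mailboxname", enabled_col="Mailboxenabled"):
    """Return {sub-source name: True/False} from the exported CSV text."""
    truthy = {"true", "yes", "1", "enabled"}  # assumed spellings of "enabled"
    state = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        state[row[name_col].strip()] = row[enabled_col].strip().lower() in truthy
    return state


def sub_sources_to_toggle(state, target):
    """List the sub-sources to disable in step 4 and re-enable in step 6.

    Sub-sources that were already disabled before step 4 are left out,
    so they stay disabled after the email file has been reprocessed.
    """
    if target not in state:
        raise KeyError(f"{target!r} not found in the exported report")
    return sorted(name for name, enabled in state.items()
                  if enabled and name != target)


if __name__ == "__main__":
    recorded = read_enabled_state(SAMPLE_EXPORT)
    for name in sub_sources_to_toggle(recorded, "nsf_file_2.nsf"):
        print("Disable before reprocessing, re-enable afterwards:", name)
```

To use it with a real export, read the saved CSV file into a string and pass it to read_enabled_state, supplying the actual column names through name_col and enabled_col if they differ from the assumed ones.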

 

Related Article: Sources show 'Never Indexed' in eDiscovery Platform
