OCR error with Load File source

Article ID: 100011429

Description

Error Message

The Clearwell UI reports an OCR error during processing (Figure 1).

Figure 1.


 

Cause

This workflow is currently unsupported. If OCR is enabled in the case processing settings, the Priority option in the load file Processing settings must be set to Native file(s).

 

Resolution

There are currently no plans to address this issue with a hotfix or cumulative hotfix in the current or previous versions of the software. The issue may be resolved in a future major revision, but it is not currently scheduled for any release. If this issue has a direct business impact on your continued use of the product, please contact your Veritas Sales representative or the Veritas Sales group to discuss your concerns. For information on how to contact Veritas Sales, see https://www.veritas.com


Workaround

  1. Restore the case to a state before the affected documents were processed (indexed) into Clearwell.
  2. Set the case-level property esa.processing.ocr.max_file_size to 0. This prevents any OCR from occurring.
  3. Process the documents.
  4. Set the case-level property esa.processing.ocr.max_file_size back to the original value.

This workaround prevents OCR from running, so the contents of the external text file are indexed, as the user originally expected. Any native files that do need OCR can be processed later using the "OCR" Action on the "Analysis & Review" page.
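The set-and-restore pattern in steps 2 to 4 can be sketched in Python as follows. This is only an illustration of the sequence, not a Clearwell API: the dictionary stands in for the case-level properties, which in the product are changed through its own administration interface, and the helper name ocr_disabled and the sample value 1024 are hypothetical.

```python
from contextlib import contextmanager

# Property name taken from the workaround steps.
OCR_MAX_SIZE = "esa.processing.ocr.max_file_size"


@contextmanager
def ocr_disabled(case_properties: dict):
    """Temporarily set the OCR size limit to 0, then restore the saved value.

    `case_properties` is a hypothetical stand-in for the case-level
    properties; it is not a real Clearwell object.
    """
    original = case_properties[OCR_MAX_SIZE]      # remember the value for step 4
    case_properties[OCR_MAX_SIZE] = 0             # step 2: disable OCR
    try:
        yield case_properties                     # step 3: process documents here
    finally:
        case_properties[OCR_MAX_SIZE] = original  # step 4: restore the value


# Usage: the original value is restored even if processing raises an error.
props = {OCR_MAX_SIZE: 1024}  # hypothetical original value
with ocr_disabled(props):
    pass  # process the documents here
```

Recording the original value before step 2 is the point of the sketch: step 4 can only restore the setting if the value was noted beforehand.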

 

 

Issue/Introduction

If OCR is enabled in the case processing settings, a Load File Import can fail to index any text for documents that include both native and extracted text files. This happens when all of the following conditions are true:
  • OCR is enabled in the case processing settings on the Processing > Settings page.
  • The load file settings include the option Load file contains a link to an external text file and a text file value is supplied.
  • The load file settings include the option Link to natives in load file and a native file value is supplied.
  • The Priority option in the load file settings is set to Extracted text.
  • The size of the external text file is within the OCR-able size limits specified in the case processing settings on the Processing > Settings page.
Note: This issue occurs even if you have selected the Native Files option Use metadata from load file... and have mapped a 'size' field that has a value outside of the OCR-able size limits specified in the case processing settings. The size used for the OCR-able size check is always based on the size of the external text file.
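
The conditions above can be read as a single predicate. The following Python sketch is one way to express that check, under stated assumptions: the class LoadFileImport, its field names, and the function is_affected are hypothetical and do not correspond to any Clearwell API, and the OCR-able size limits are modeled as an assumed lower and upper bound in the same unit as the text file size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadFileImport:
    """Hypothetical model of the settings named in the conditions above."""
    ocr_enabled: bool              # OCR enabled on Processing > Settings
    has_external_text_link: bool   # load file links to an external text file
    has_native_link: bool          # load file links to native files
    priority: str                  # "Extracted text" or "Native file(s)"
    text_file_size: int            # size of the external text file
    ocr_min_size: int              # assumed lower OCR-able size limit
    ocr_max_size: int              # assumed upper OCR-able size limit
    mapped_size: Optional[int] = None  # mapped 'size' field; ignored by the check


def is_affected(cfg: LoadFileImport) -> bool:
    """Return True only when every listed condition holds."""
    return (
        cfg.ocr_enabled
        and cfg.has_external_text_link
        and cfg.has_native_link
        and cfg.priority == "Extracted text"
        # Per the note, the size check uses the external text file size,
        # never the mapped 'size' metadata value.
        and cfg.ocr_min_size <= cfg.text_file_size <= cfg.ocr_max_size
    )
```

In this sketch mapped_size is deliberately unused, mirroring the note: a mapped 'size' value outside the limits does not prevent the issue. Setting priority to "Native file(s)" makes the predicate false, which matches the supported configuration.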

Additional Information

JIRA: ESA-30468