Where possible, the Storage service converts items to HTML or text, and the converted content is then used to index the item. The Enterprise Vault Storage service uses Outside In® Technology content converters from Oracle® Corporation to convert most file types. To provide Optical Character Recognition (OCR) conversion for image file types, Enterprise Vault uses the Windows TIFF IFilter. The Windows TIFF IFilter is an optional Windows feature that the Enterprise Vault installer enables automatically if it is not already enabled.

To enable OCR conversion for images in PDF documents, configure the following settings:
Set “OCR conversion of embedded images” to “ON”
Set “OCR conversion of scanned pages” to “ON”
Note: To enable OCR conversion for MS Word documents, only the “OCR conversion of embedded images” setting needs to be set to “ON”.
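
The relationship between document type and required settings can be summarized in a small lookup. The following Python sketch is illustrative only: the table and the function name missing_settings are hypothetical and are not part of any Enterprise Vault API. It simply encodes the rules stated above so that a planned configuration can be checked against them.

```python
# Illustrative sketch only: these names are hypothetical and do not
# correspond to an Enterprise Vault API. The setting labels are copied
# from the text above.
EMBEDDED = "OCR conversion of embedded images"
SCANNED = "OCR conversion of scanned pages"

# Settings that must be ON for OCR conversion, per document type.
REQUIRED_SETTINGS = {
    "pdf": {EMBEDDED, SCANNED},   # PDF documents need both settings
    "word": {EMBEDDED},           # MS Word documents need only this one
}


def missing_settings(doc_type, enabled):
    """Return, in sorted order, the settings still to be turned ON."""
    required = REQUIRED_SETTINGS[doc_type.lower()]
    return sorted(required - set(enabled))


if __name__ == "__main__":
    # A site that has enabled only the embedded-images setting.
    print(missing_settings("pdf", [EMBEDDED]))
    print(missing_settings("word", [EMBEDDED]))
```

An empty list means that every setting required for that document type is already ON.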