Occasionally, discovery of PST-based source data fails. The most frequent cause is a PST file that is in an inconsistent or corrupt state. The following is an example of the error as it appears in the logs:
05/06/2024 15:13:03 Scan Result: "\\sourcedata\projects\football\Export.pst","0","The mail file might be corrupt for crawling because of unknown reason. Error code: 15015 HRes: -2147024891 Error str: ",,,,,,,,
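When many sources are discovered at once, it can help to pull the failing PST paths and error codes out of the log programmatically. The following is a minimal Python sketch, not part of the product. It assumes that failed entries always follow the "Scan Result:" layout shown above (quoted, comma-separated fields with the message in the third field). The function name parse_scan_result is hypothetical. The sketch also renders the signed HRes value as unsigned hexadecimal, the form in which Windows result codes are usually documented.

```python
import csv
import io
import re

MARKER = "Scan Result: "

def parse_scan_result(line):
    """Return (path, error_code, hres_hex) for a failed scan entry, else None.

    Assumes the layout seen in the example log line: the text after
    'Scan Result: ' is a CSV record whose first field is the PST path
    and whose third field is the error message.
    """
    idx = line.find(MARKER)
    if idx == -1:
        return None
    # The csv module handles the quoted fields; backslashes are kept literally.
    fields = next(csv.reader(io.StringIO(line[idx + len(MARKER):])))
    if len(fields) < 3:
        return None
    match = re.search(r"Error code: (\d+) HRes: (-?\d+)", fields[2])
    if match is None:
        return None
    error_code = int(match.group(1))
    # Convert the signed 32-bit HRes value to unsigned hexadecimal.
    hres_hex = f"0x{int(match.group(2)) & 0xFFFFFFFF:08X}"
    return fields[0], error_code, hres_hex

# The example entry from the log above, as a raw string.
SAMPLE = r'05/06/2024 15:13:03 Scan Result: "\\sourcedata\projects\football\Export.pst","0","The mail file might be corrupt for crawling because of unknown reason. Error code: 15015 HRes: -2147024891 Error str: ",,,,,,,,'

result = parse_scan_result(SAMPLE)
print(result)
```

Lines that do not contain a "Scan Result:" entry with an error code return None, so the function can be applied to every line of a log file and the None values filtered out.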
Another symptom of a likely corrupt or inconsistent PST file is that discovery takes much longer than expected because the scan has become stuck on the file. When this happens, the job log (statuslog.txt) shows periods of inactivity lasting almost exactly 1 hour, which is the default timeout for scanning a PST file. For example:
05/06/2024 15:14:36 All files marked for Scan processed
05/06/2024 15:15:31 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:21:33 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:24:34 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:30:37 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:33:38 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:39:41 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:45:43 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:48:44 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:54:47 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 15:57:48 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 16:03:50 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 16:06:52 Waiting for custodian discoveries to complete - Pending discoveries 3
05/06/2024 16:13:30 Launched Scan Process
05/06/2024 16:13:30 Processing file: "\\sourcedata\projects\football\Export.pst",1,0,0,bbabd02c8059592faafe70bb92cc26b3,4631525228284305
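Spotting these stalls by eye is tedious in a long statuslog.txt. The following Python sketch shows one way to flag them. It assumes that every entry begins with a 19-character timestamp in month/day/year order, and it treats any stretch of 55 minutes or more in which only "Waiting for custodian discoveries" entries are logged as a likely timeout. The 55-minute threshold and the function name find_stalls are illustrative choices, not product settings.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumption: month-first timestamps
WAITING = "Waiting for custodian discoveries to complete"

def find_stalls(lines, min_gap=timedelta(minutes=55)):
    """Return (start, end, next_entry) for each long stretch of pure waiting.

    start is the timestamp of the last entry that was not a 'Waiting' line,
    end is the timestamp of the next such entry, and next_entry is its text.
    """
    stalls = []
    last_active = None  # timestamp of the most recent non-waiting entry
    for line in lines:
        line = line.strip()
        try:
            ts = datetime.strptime(line[:19], TS_FORMAT)
        except ValueError:
            continue  # skip lines without a leading timestamp
        text = line[19:].strip()
        if text.startswith(WAITING):
            continue  # waiting entries do not count as activity
        if last_active is not None and ts - last_active >= min_gap:
            stalls.append((last_active, ts, text))
        last_active = ts
    return stalls

# Abbreviated copy of the log excerpt above.
sample_log = [
    "05/06/2024 15:14:36 All files marked for Scan processed",
    "05/06/2024 15:15:31 Waiting for custodian discoveries to complete - Pending discoveries 3",
    "05/06/2024 16:06:52 Waiting for custodian discoveries to complete - Pending discoveries 3",
    "05/06/2024 16:13:30 Launched Scan Process",
]

stalls = find_stalls(sample_log)
for start, end, next_entry in stalls:
    print(f"Stall of {end - start} ending at {end}: {next_entry}")
```

To check a real job, pass the lines of statuslog.txt instead of sample_log. If the PST scan timeout has been changed from its default, min_gap would need adjusting to match.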
Note: Any PST file that fails discovery is marked as disabled on the Sources & Pre-Processing > Manage Sources page.
ScanPST can often repair the PST file in these situations. ScanPST, also known as the Inbox Repair tool (scanpst.exe), is a free utility from Microsoft designed to repair corrupt or damaged Outlook personal folder (.pst) files.
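The folder that contains scanpst.exe varies with the installed version of Outlook, so the first step is often just finding it. The following Python sketch searches a list of folders for the executable. It assumes that Office is installed under one of the standard Program Files folders. The function name find_scanpst is hypothetical, and the search roots may need adjusting for a given machine.

```python
import os

def find_scanpst(roots):
    """Return the full path of every scanpst.exe found under the given folders."""
    matches = []
    for root in roots:
        # os.walk silently yields nothing for folders that do not exist.
        for folder, _dirs, files in os.walk(root):
            for name in files:
                if name.lower() == "scanpst.exe":
                    matches.append(os.path.join(folder, name))
    return matches

if __name__ == "__main__":
    # Assumption: Office lives under a standard Program Files folder.
    candidates = [
        os.environ.get("ProgramFiles", r"C:\Program Files"),
        os.environ.get("ProgramFiles(x86)", r"C:\Program Files (x86)"),
    ]
    for path in find_scanpst(candidates):
        print(path)
```

The file name comparison is case-insensitive because the executable may be installed with an upper-case name.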
Using ScanPST.exe
For further information on using the ScanPST utility, see the following Microsoft KB articles:
https://support.microsoft.com/kb/272227
https://learn.microsoft.com/en-us/troubleshoot/outlook/data-files/how-to-repair-personal-folder-file