How to verify the integrity of indexes with the IndexCheck utility when using Enterprise Vault (EV).
Problem
How to verify the integrity of indexes with the IndexCheck utility when using Enterprise Vault (EV).
Solution
Since Enterprise Vault (EV) 6.0 Service Pack 1 (SP1), EV has included a utility that helps verify the integrity of the 32-bit (AltaVista) indexes it uses.
The tool is called INDEXCHECK.EXE and is located in the EV installation folder. On a 32-bit OS the default folder is C:\Program Files\Enterprise Vault; on a 64-bit OS it is C:\Program Files (x86)\Enterprise Vault.
This utility can verify a single specified index or all the indexes within a specified index location. It has a number of checking modes, the most useful being the 'exist' check. This check verifies index integrity at the file level, i.e. it confirms that all the files that should exist are present. This should be enough to detect most types of index corruption.
The utility must be run from a Command Prompt as follows:
INDEXCHECK -c exist -f <index location> -ignorewarnings > d:\evlogs\indexcheck.log
Run INDEXCHECK ? for more information on the available parameters.
If an index location (e.g., E:\Indexes) is specified rather than a particular index, all the indexes within that index location are checked. This check typically takes about 2 to 3 seconds per index because it is a file-level check only: it does not open the index or search it in any way.
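The per-index figure above can be turned into a rough planning estimate for a whole index location. The following is a minimal Python sketch, not part of IndexCheck; the function name and the count of 5000 indexes are hypothetical, and the 2 to 3 second range is simply the typical value quoted above.

```python
def estimate_exist_check_seconds(index_count, low_per_index=2.0, high_per_index=3.0):
    """Return a (low, high) run-time estimate in seconds for an 'exist' check.

    The per-index defaults are the typical 2 to 3 seconds quoted in this
    article; real timings depend on the storage and the server load.
    """
    if index_count < 0:
        raise ValueError("index_count must not be negative")
    return index_count * low_per_index, index_count * high_per_index


# Hypothetical index location holding 5000 indexes.
low, high = estimate_exist_check_seconds(5000)
print(f"Estimated duration: {low / 3600:.1f} to {high / 3600:.1f} hours")
# prints "Estimated duration: 2.8 to 4.2 hours"
```

An estimate like this helps decide whether a full check fits into a nightly maintenance window or whether the run should target selected index volumes.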
If any check other than 'exist' is used, each index volume that is specified, or that resides in the specified index location, is opened while the utility is running. Ensure that no other processes attempt to access an index volume while it is open. Alternatively, run the tool against copies of the indexes rather than the 'live' versions.
By default, the output of IndexCheck is displayed on the screen. To keep the output for external review, redirect it to a file with the > operator, as the example above does. When redirecting the output to a file, include the '-ignorewarnings' parameter so that no additional prompts are displayed.
Below is an example output from one index.
C:\Program Files\Enterprise Vault>indexcheck -c exist -f D:\EVMBX1\MBX_INDEX\LOCATION2\12BB36C7C2F1A164087C7167e7638d4011110000evlab -ignorewarnings
Start time: 13/12/2005 06:10:28
Running with ignorewarnings set
Processing index folder D:\EVMBX1\MBX_INDEX\LOCATION2\12BB36C7C2F1A164087C7167e7638d4011110000evlab
Checking file existence...
Completed OK. 95 lines
Finished. Checked 1 index(es), 0 with errors, 0 with warnings
End time: 13/12/2005 06:10:28
Duration: 0 minute(s) 0 second(s)
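When the output is redirected to a log file, the summary line can be extracted for monitoring. The following Python sketch assumes the summary line always has the form shown in the example output above; the function name is hypothetical, and the sample text is copied from that example rather than read from a real log.

```python
import re

# Pattern for the summary line as it appears in the example output.
SUMMARY_RE = re.compile(
    r"Finished\. Checked (\d+) index\(es\), (\d+) with errors, (\d+) with warnings"
)


def parse_summary(log_text):
    """Return (checked, errors, warnings) from IndexCheck output, or None."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        return None  # no summary line found, e.g. the run was interrupted
    checked, errors, warnings = (int(group) for group in match.groups())
    return checked, errors, warnings


sample = """Checking file existence...
Completed OK. 95 lines
Finished. Checked 1 index(es), 0 with errors, 0 with warnings
"""
print(parse_summary(sample))  # prints (1, 0, 0)
```

In practice the text would be read from the redirected log (d:\evlogs\indexcheck.log in the command shown earlier); the encoding of that file is not covered here and may need to be specified when opening it.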
Here is an example batch file to run IndexCheck:
@ECHO OFF
REM Run IndexCheck against an index location and write the output to a log file.
IndexCheck.exe -f d:\evdata\indexlocation1\ -ignorewarnings > d:\evlogs\indexcheck.log
REM IF ERRORLEVEL n is true when the exit code is n or higher,
REM so the highest value must be tested first.
IF ERRORLEVEL 3 GOTO label3
IF ERRORLEVEL 2 GOTO label2
IF ERRORLEVEL 1 GOTO label1
IF ERRORLEVEL 0 GOTO label4
GOTO End
:label1
ECHO indexcheck has returned warnings
GOTO End
:label2
ECHO indexcheck has returned errors
GOTO End
:label3
ECHO indexcheck has returned errors and warnings
GOTO End
:label4
ECHO indexcheck has not found any problems
GOTO End
:End
Running a batch file like this nightly, against all index volumes or against targeted index volumes, can serve as a "Best Practice" for monitoring index volume health.
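The same exit-code handling can be expressed in a monitoring script. The Python sketch below mirrors the batch file above, including the "n or higher" behaviour of IF ERRORLEVEL; the meaning of each code is taken from the ECHO messages in that batch file, and the function name is hypothetical.

```python
# Thresholds and messages copied from the batch file, highest first.
EXIT_CODE_MESSAGES = (
    (3, "indexcheck has returned errors and warnings"),
    (2, "indexcheck has returned errors"),
    (1, "indexcheck has returned warnings"),
    (0, "indexcheck has not found any problems"),
)


def describe_exit_code(code):
    """Return the message the batch file would echo for this exit code.

    Like IF ERRORLEVEL, each threshold matches when the code is equal to
    or greater than it. A negative code matches nothing and gives None,
    just as the batch file falls through to :End without a message.
    """
    for threshold, message in EXIT_CODE_MESSAGES:
        if code >= threshold:
            return message
    return None


for code in (0, 1, 2, 3):
    print(code, describe_exit_code(code))
```

On a server where IndexCheck is installed, the code passed to this function would typically be the returncode attribute of the result of subprocess.run.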
Since EV 7.0 SP1, the utility has gained several major enhancements, which are described in detail below:
1. Generating and validating index file checksums
The Indexing service can now generate a checksum for an index volume, and use the checksum to perform index validation before opening the index volume. The checksum is stored in a file named Checksum.dat in the index volume folder.
The following new options can be used by the IndexCheck utility to validate index volumes:
- -c ChecksumCreate - This option generates or updates checksums for index volumes.
- -c ChecksumValidate - This option validates index volumes against existing checksums.
For each of these options, specify the path to the target index folder with the -f parameter. If you specify the index location that is displayed in the Enterprise Vault Administration Console, on the Index Locations page of the Indexing service properties, IndexCheck attempts to create or validate the checksum for each index volume in that index location. If you specify the path of a particular index volume, IndexCheck attempts to create or validate the checksum for that index volume only.
IndexCheck.exe -f <index_folder_path> -c ChecksumCreate
IndexCheck.exe -f <index_folder_path> -c ChecksumValidate
The following options can be used to report items that are missing from index volumes:
- -c MissingDocs - Write the list of items (savesets) missing from the index to IndexMissing.log.
- -c MissingContent - Write the list of items with missing content to IndexMissing.log.
- -c MissingItemsLogFile - Report on the contents of IndexMissing.log, if it exists.
IndexCheck -c MissingDocs -f <index_folder_path> -db <Directory_database_server>
IndexCheck -c MissingContent -f <index_folder_path> -db <Directory_database_server>
IndexCheck -c MissingItemsLogFile -f <index_folder_path> -db <Directory_database_server>
- -f <index_folder_path> is the path to the index folder. If you specify the index location that is displayed in the Administration Console, on the Index Locations page of the Indexing service properties, IndexCheck processes each index volume in that index location. If you specify the path of a particular index volume, IndexCheck processes that index volume only. If the path contains spaces, enclose it in quotes.
- -db <Directory_database_server> is the name of the SQL Server that manages the EnterpriseVaultDirectory database.
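The quoting rule for paths with spaces is easy to overlook when commands are generated by a script. The following Python sketch assembles a command line with subprocess.list2cmdline, which adds double quotes only around arguments that need them. The function name and the folder D:\EV Data\Index Location 1 are hypothetical; the option names are the ones listed above.

```python
import subprocess


def build_indexcheck_command(check, index_folder, db_server=None):
    """Return an IndexCheck command line with arguments quoted where needed."""
    args = ["IndexCheck.exe", "-c", check, "-f", index_folder]
    if db_server is not None:
        args += ["-db", db_server]
    # list2cmdline wraps any argument that contains spaces in double quotes.
    return subprocess.list2cmdline(args)


# Hypothetical index folder whose path contains spaces.
print(build_indexcheck_command("MissingDocs", r"D:\EV Data\Index Location 1", "SQLserver2"))
# prints: IndexCheck.exe -c MissingDocs -f "D:\EV Data\Index Location 1" -db SQLserver2
```

A path without spaces, such as E:\Indexes, is left unquoted by the same function.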
- The default schema (SchemaType 0)
- The Item Granularity schema (SchemaType 1)
\SOFTWARE\KVS\Enterprise Vault\Indexing
The stats check makes the following comparisons for each index volume:
- Compare the highest Index Sequence Number in the vault store database with the highest Item Sequence Number in the index volume.
- Compare the number of top-level items reported in the vault store database with the top-level Item Count in the index volume.
- -c stats - For the specified index volumes, compare the information in the index volume with the information in the vault store database.
- -f <index_folder_path> - The index volumes to validate. If you specify the index location that is displayed in the Administration Console, on the Index Locations page of the Indexing service properties, IndexCheck validates all index volumes in that location. If you specify the path of a particular index volume, IndexCheck validates only that index volume.
- -db <Directory_database_server> - The name of the SQL Server that manages the EnterpriseVaultDirectory database. This is needed to identify the relevant vault store databases.
- -diff <integer> - The permitted tolerance. A warning is reported for an index volume if the difference between the index volume information and the database information is greater than <integer>. The default value is 1. The appropriate tolerance depends on individual company requirements. Note that the number of items in an open index can fluctuate greatly, depending on the amount of activity on the index, and that on a very busy system there may be a delay before the vault store database information is updated.
- -csv <file_name> - Write the results of the validation to a CSV file. Specify a file name and path, or just a file name. If no path is given, the file is created in the folder from which IndexCheck is run. This option can only be used with the -c stats argument.
A warning is reported for an index volume if either of the following is true:
- The difference between the highest Index Sequence Number in the vault store database and the highest Item Sequence Number in the index volume is greater than the number specified in the -diff parameter.
- The difference between the number of top-level items reported in the vault store database and the top-level Item Count in the index volume is greater than the number specified in the -diff parameter.
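The tolerance rule can be illustrated with a short worked example. The Python sketch below applies the two conditions as they are described here; it is not IndexCheck's implementation. The function name and dictionary keys are hypothetical, the use of an absolute difference is an assumption, and the figures 1002 and 970 are the values that appear in this article's sample stats output.

```python
def stats_warnings(index_stats, db_stats, diff=1):
    """Return a warning string for each statistic outside the tolerance.

    Assumption: the difference is taken as an absolute value, so it does
    not matter whether the index or the database holds the larger number.
    """
    warnings = []
    for label in ("highest_sequence_number", "top_level_count"):
        delta = abs(index_stats[label] - db_stats[label])
        if delta > diff:  # only a difference greater than -diff is reported
            warnings.append(f"{label}: difference {delta} exceeds tolerance {diff}")
    return warnings


index_stats = {"highest_sequence_number": 1002, "top_level_count": 1002}
db_stats = {"highest_sequence_number": 970, "top_level_count": 970}

for warning in stats_warnings(index_stats, db_stats, diff=20):
    print(warning)
```

Both differences are 32 here, so a tolerance of 20 flags both statistics, while a tolerance of 32 or more would flag neither because the rule requires the difference to be greater than the tolerance.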
IndexCheck.exe -c stats -f "C:\Program Files\Enterprise Vault\Indexing\1773A46CFC34..." -db SQLserver2 -diff 20 -csv C:\IndexCheck\ValidationFeb2007.csv
Processing index folder C:\Program Files\Enterprise Vault\Indexing\1773A46CFC34...
Counting the number of top level documents...
Stats from the index
Highest Index Sequence Number : 1002
Top Level Document Count : 1002
Stats from the database [Index Volume Identity = 14]
Highest Index Sequence Number : 970
Top Level Document Count : 970
WARNING! Mismatch on top level document count
Finished. Checked 1 index(es), 0 with errors, 0 with warnings
End time: 12/02/2007 13:01:18
Duration: 0 minute(s) 3 second(s)