Arctera Insight Information Governance Administrator's Guide

Last Published: 2025-11-24

Product(s): Data Insight (7.2)

Platform: Windows

Troubleshooting installation of Tesseract software

The Tesseract software extracts text from images, which makes it possible to classify image files. The software is installed on the Management server, Collector nodes, and Classification server during installation of Information Governance. If the Tesseract installation failed and you need to classify images, you must install Tesseract manually. Use the following steps to install Tesseract and update the ocr-config.properties file.

To install Tesseract

1. On the classification node, run the installer located at <installdir>/deps/tesseract-ocr-w64-setup-v4.0.0.20181030.exe.
2. In the installation wizard, select the installation location: {drive}\{folder}.
3. After the installation is complete, open the ocr-config.properties file located at <installdir>/vic/vic-service.
4. Edit the following lines in the ocr-config.properties file so that they point to the location you selected:
   - tesseractPath={drive}\\{folder}\\Tesseract-OCR
   - tessdataPath={drive}\\{folder}\\Tesseract-OCR\\tessdata
   - pageSegMode=3
5. Save the updated file and restart the DataInsightVICClient and DataInsightVICServer services.
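
The property edits in the procedure above can also be scripted. The following Python sketch is illustrative only and is not part of the product: the function names are hypothetical, and it assumes that ocr-config.properties uses plain key=value lines in which each backslash in a Windows path is doubled, as the values shown above suggest.

```python
from pathlib import PureWindowsPath


def escape_for_properties(path: str) -> str:
    # Double every backslash, matching the doubled separators shown in the procedure.
    return path.replace("\\", "\\\\")


def build_ocr_settings(install_parent: str) -> dict:
    # install_parent is the {drive}\{folder} location chosen in the installation wizard.
    base = PureWindowsPath(install_parent) / "Tesseract-OCR"
    return {
        "tesseractPath": escape_for_properties(str(base)),
        "tessdataPath": escape_for_properties(str(base / "tessdata")),
        "pageSegMode": "3",
    }


def update_properties_text(text: str, settings: dict) -> str:
    # Replace the value of each known key and leave every other line untouched.
    remaining = dict(settings)
    updated_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        key = None
        if "=" in stripped and not stripped.startswith(("#", "!")):
            key = stripped.split("=", 1)[0].strip()
        if key in remaining:
            updated_lines.append(f"{key}={remaining.pop(key)}")
        else:
            updated_lines.append(line)
    # Append any key that the file did not already contain (an assumption of this sketch).
    updated_lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(updated_lines) + "\n"
```

To use the sketch, read the text of ocr-config.properties, pass it to update_properties_text together with the result of build_ocr_settings for your own install location, and write the result back. Keep a backup copy of the original file, and restart the DataInsightVICClient and DataInsightVICServer services afterwards.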

Note:

Tesseract supports the English language by default. To classify image files that contain text in another language, download the appropriate training data, unzip it, and copy the .traineddata file into the tessdata directory ({drive}\{folder}\Tesseract-OCR\tessdata).

You also need to update the language in the ocr-config.properties file.
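
Copying the training data can be sketched in Python as follows. This is a minimal, hypothetical helper and is not part of the product: the function names are illustrative, and it assumes that the tessdata directory is the tessdata folder inside the Tesseract-OCR installation directory, as named by tessdataPath. It does not update the language setting in ocr-config.properties.

```python
import shutil
from pathlib import Path
from typing import List


def install_traineddata(traineddata_file: Path, tesseract_dir: Path) -> Path:
    # Copy one unzipped .traineddata file into <tesseract_dir>/tessdata.
    if traineddata_file.suffix != ".traineddata":
        raise ValueError(f"Not a .traineddata file: {traineddata_file}")
    tessdata = tesseract_dir / "tessdata"
    if not tessdata.is_dir():
        raise FileNotFoundError(f"tessdata directory not found: {tessdata}")
    target = tessdata / traineddata_file.name
    shutil.copy2(traineddata_file, target)
    return target


def installed_languages(tesseract_dir: Path) -> List[str]:
    # List language names by reading the .traineddata file names in tessdata.
    tessdata = tesseract_dir / "tessdata"
    return sorted(p.stem for p in tessdata.glob("*.traineddata"))
```

After copying a file, installed_languages gives a quick way to confirm that the new language file is in place before you update the language in ocr-config.properties.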