Arctera Insight Information Governance Administrator's Guide
Troubleshooting installation of Tesseract software
The Tesseract software extracts text from images, which makes it possible to classify image files. It is installed on the Management server, Collector nodes, and Classification server during the installation of Information Governance. If the Tesseract installation failed and you need to classify images, you must install Tesseract manually. Use the following steps to install Tesseract and update the ocr-config.properties file.
To install Tesseract
- Open tesseract-ocr-w64-setup-4.0.0.20181030.exe, located at <installdir>/deps/tesseract-ocr-w64-setup-v4.0.0.20181030.exe, and run it on the classification node.
- In the installation wizard, select the location: {drive}\{folder}.
- After the installation is complete, open the file ocr-config.properties, located at <installdir>/vic/vic-service.
- Edit the following lines in the ocr-config.properties file:
  tesseractPath={drive}\\{folder}\\Tesseract-OCR
  tessdataPath={drive}\\{folder}\\Tesseract-OCR\\tessdata
  pageSegMode=3
- Save the updated file and restart the DataInsightVICClient and DataInsightVICServer services.
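The property edits in the steps above can also be scripted. The following Python sketch is an illustration only and is not part of the product. The helper names `to_properties_value` and `update_ocr_properties` are hypothetical, the install location `D:\Tools` is a made-up example, and the sketch assumes that ocr-config.properties consists of plain `key=value` lines with doubled backslashes, as in the sample values shown in the steps.

```python
def to_properties_value(windows_path):
    # Double every backslash, mirroring the sample values in the steps above.
    return windows_path.replace("\\", "\\\\")


def update_ocr_properties(text, install_root):
    # install_root is the {drive}\{folder} location chosen in the installer.
    base = install_root.rstrip("\\") + "\\Tesseract-OCR"
    wanted = {
        "tesseractPath": to_properties_value(base),
        "tessdataPath": to_properties_value(base + "\\tessdata"),
        "pageSegMode": "3",
    }
    updated_lines = []
    seen = set()
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in wanted:
            # Replace the existing value of a known setting.
            updated_lines.append(f"{key}={wanted[key]}")
            seen.add(key)
        else:
            # Keep every other line (comments, unrelated settings) unchanged.
            updated_lines.append(line)
    # Append any of the three settings that the file did not contain.
    for key, value in wanted.items():
        if key not in seen:
            updated_lines.append(f"{key}={value}")
    return "\n".join(updated_lines) + "\n"


if __name__ == "__main__":
    original = "tesseractPath=old-value\nsomeOtherSetting=1\n"
    print(update_ocr_properties(original, "D:\\Tools"))
```

The function works on text rather than on the file itself, so you can review the result before you write it back to <installdir>/vic/vic-service/ocr-config.properties. You still need to restart the two services afterwards.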
Note:
Tesseract supports the English language by default. To classify image files that contain text in other languages, download the appropriate training data, unzip it, and copy the .traineddata file into the tessdata directory under {drive}\{folder}\Tesseract-OCR. You must also update the language in the ocr-config.properties file.
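Copying the training data can be sketched in Python as follows. This is a hedged illustration, not a product utility. The function name `install_traineddata` is hypothetical, and the sketch assumes that the training data has already been downloaded and unzipped and that the tessdata directory sits directly under the Tesseract-OCR installation folder, as the tessdataPath setting indicates.

```python
import shutil
from pathlib import Path


def install_traineddata(traineddata_file, tesseract_dir):
    # traineddata_file: path to an unzipped .traineddata file.
    # tesseract_dir: the Tesseract-OCR folder under the install location.
    source = Path(traineddata_file)
    if source.suffix != ".traineddata":
        raise ValueError(f"Not a .traineddata file: {source}")
    tessdata = Path(tesseract_dir) / "tessdata"
    if not tessdata.is_dir():
        raise FileNotFoundError(f"tessdata directory not found: {tessdata}")
    # Copy the file and keep its original name, which identifies the language.
    target = tessdata / source.name
    shutil.copy2(source, target)
    return target
```

The function returns the path of the copied file so that you can confirm where it landed. It does not change the language setting in ocr-config.properties; that update remains a separate manual step.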