Arctera Insight Information Governance Installation Guide
About the Collector
The Collector (Audit Pre-processor) is an Information Governance process that collects and parses access events from various storage repositories. It examines the access events available on these storage systems and extracts those that report read, write, create, delete, and rename activity on files or folders. Events are processed in batches of several thousand. Each batch collected in a cycle is stored in a separate file whose timestamp indicates the ending time of the last entry in that batch. The data is then pruned: events that match exclude rules, and events that do not come from the configured shares, site collections, or equivalent data sources, are discarded. The remaining events are segregated on a per-share basis, and the resulting files are periodically shipped to the appropriate Indexer node.
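The batch handling described above can be pictured with a short Python sketch. It is an illustration only, not the product's implementation: the `AccessEvent` record, the `process_batch` function, the `audit_` file-name prefix, and the modeling of exclude rules as path prefixes are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical event record; the Collector's real on-disk format is not shown here.
@dataclass(frozen=True)
class AccessEvent:
    timestamp: datetime  # assumed to be in UTC
    share: str
    path: str
    operation: str
    user: str

# The activity types the Collector reports on, per the description above.
TRACKED_OPERATIONS = {"read", "write", "create", "delete", "rename"}

def process_batch(events, configured_shares, excluded_prefixes):
    """Prune one batch of events and segregate the survivors per share.

    Returns (batch_name, events_by_share). batch_name embeds the timestamp
    of the last entry in the batch; it is None for an empty batch.
    """
    events = list(events)
    if not events:
        return None, {}

    # Name the batch after the ending time of its last entry (illustrative format).
    last_ts = max(e.timestamp for e in events)
    batch_name = f"audit_{last_ts.strftime('%Y%m%dT%H%M%SZ')}"

    by_share = defaultdict(list)
    for e in events:
        if e.operation not in TRACKED_OPERATIONS:
            continue  # not one of the tracked activity types
        if e.share not in configured_shares:
            continue  # event is not from a configured data source
        if any(e.path.startswith(p) for p in excluded_prefixes):
            continue  # matches an exclude rule (assumed to be a path prefix)
        by_share[e.share].append(e)
    return batch_name, dict(by_share)
```

In this sketch, each entry in the returned dictionary stands for the per-share data that would later be shipped to an Indexer node.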
In Information Governance, a Collector Pool is a logical grouping of Collector nodes that scan data sources and collect access events. Managing Collectors as a pool provides better scalability, balanced load distribution, and high availability: if one Collector node in the pool goes down, the others continue processing to avoid disruption. Collector Pools also simplify administration, because configuration can be applied at the pool level. This keeps the Collectors consistent and supports reliable handling of large-scale data collection.
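The idea of spreading work across a pool and continuing when a node fails can be sketched as follows. This is a conceptual illustration only: the `assign_sources` function and the round-robin policy are assumptions for this example, and the product's actual scheduling logic is not described in this guide.

```python
def assign_sources(sources, collectors, healthy):
    """Distribute data sources round-robin across the healthy nodes of a pool.

    sources    -- iterable of data source names (hypothetical identifiers)
    collectors -- ordered list of all Collector nodes in the pool
    healthy    -- set of nodes currently reachable
    """
    active = [c for c in collectors if c in healthy]
    if not active:
        raise RuntimeError("no healthy Collector node in the pool")

    assignment = {c: [] for c in active}
    # Sorting makes the assignment deterministic for a given input.
    for i, src in enumerate(sorted(sources)):
        assignment[active[i % len(active)]].append(src)
    return assignment


pool = ["c1", "c2", "c3"]
shares = ["s1", "s2", "s3", "s4"]

# All nodes up: work is spread over three nodes.
all_up = assign_sources(shares, pool, {"c1", "c2", "c3"})

# c2 goes down: the remaining nodes take over its share of the work.
c2_down = assign_sources(shares, pool, {"c1", "c3"})
```

Recomputing the assignment when a node drops out is one simple way to model the pool behavior; a real deployment may rebalance differently.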
Information Governance collects information about access events from various storage repositories through exposed vendor APIs.
For detailed information about which audit service is appropriate for your data source, see the Arctera Insight Information Governance Administrator's Guide.