Veritas Data Insight Administrator's Guide
Configuring advanced settings
You can edit various settings of the Data Insight servers from the Advanced settings page of each server. The advanced settings are divided into the following categories:
Filesystem Scanner settings - Configures how the server scans file systems. Data Insight performs two types of scans on the configured shares:
Full scans
During a full scan, Data Insight scans the complete share. These scans can run for several hours, if the share is very big. Typically, a full scan should be run once for a newly added share. After the first full scan, you can perform full scans less frequently based on your preference. Ordinarily, you need to run a full scan only to scan those paths which might have been modified while event monitoring was not running for any reason. In all other cases, the incremental scan is sufficient to keep information about the file system metadata up-to-date.
See Table: File system scanner settings - Full scan settings.
Incremental scans
During an incremental scan, Data Insight re-scans only those paths that have been modified since the last full scan. It does so by monitoring incoming access events to see which paths had a create event or write event on them since the last scan.
See Table: File system scanner settings - Incremental scan settings.
Indexer settings - Configures how the indexes are updated with new information. This setting is applicable only for Indexers.
Audit events preprocessor settings - Configures how often raw access events coming from file servers must be processed before they are sent to the Indexer.
High availability settings - Configures how this server is monitored.
Each server periodically monitors its CPU, memory, state of essential services, number of files in its inbox, outbox, and err folders. Events are published if these numbers cross the configured thresholds. Also, each worker node periodically heartbeats with the Management Server. The Management Server publishes events if it does not receive a heartbeat from a node in the configured interval.
FPolicy safeguard settings - Configures the safeguards related to FPolicy communication. You can either choose to use the global settings or customize the settings for a specific Collector node. You can configure settings for FPolicy Cluster-Mode and 7-mode.
Report settings - Configures settings for reports.
Classification settings - Configures how the Classification Server scans file system contents.
See Table: Classification settings - Content scans for file system.
See Table: Classification settings - Content scans for SharePoint.
Windows File Server Agent settings - Configures the behavior of the Windows File Server filter driver. This setting is applicable only for the Windows File Server Agent server.
Veritas File System server (VxFS) settings - Configures how Data Insight scans the VxFS filer.
NFS settings - Configures how Data Insight scans NFS shares.
See Table: NFS settings.
SharePoint settings - Configures the duration for which old audit logs are kept on the SharePoint server. Audit logs that are fetched from the SharePoint server are automatically deleted from the Data Insight database. You can disable this feature at the web application level.
Troubleshooting settings - Configures settings that aid troubleshooting.
Set custom properties - Configures certain advanced properties of a Data Insight worker node. Using this facility, you can customize certain properties that are not accessible by the normal settings.
Note:
Veritas recommends using the custom properties settings under the guidance of Veritas Support.
You can configure the advanced settings per node or save commonly used settings as a template. See About node templates.
To configure advanced settings
- In the Console, click Settings > Data Insight Servers.
- Click the server for which you want to configure the advanced settings.
- Click Advanced settings.
- Click Edit.
- Make necessary configuration changes, and click Save.
Each of the categories for the advanced settings is described in detail below.
Table: File system scanner settings - Full scan settings
Setting | Description |
---|---|
Maximum scans to run in parallel on this server | The Collector can perform multiple full scans in parallel. This setting puts a limit on the total number of full scans that can run in parallel on a Collector. The default value is two threads. Configure more threads, if you want scans to finish faster. The setting is disabled by default. |
Maximum shares per filer to scan in parallel | If multiple shares of a filer can be scanned in parallel, this setting puts a limit on the total number of shares of a filer that you can scan in parallel. |
Default scan schedule | Specifies how often full scans must be performed. By default, full scans are scheduled to repeat at 7:00 P.M. on the last Friday of each month. Select the check box to override this setting at a filer or at a share level. |
Pause scanner for specific times | You can configure the hours of the day when scanning should not be allowed. This setting ensures that Data Insight does not scan during peak loads on the filer. The setting is enabled by default. Scans resume from the point they were at before they were paused. |
Pause scanner schedule | Configures when scanning should not be allowed to run. By default, scanning is paused from 7:00 A.M. to 7:00 P.M., Monday to Friday. You can specify multiple scanner pause schedules for different days of the week. For example, you can choose to pause scanning from 7:00 A.M. to 7:00 P.M. on weekdays and from 7:00 A.M. to 9:00 P.M. on Saturdays and Sundays. To add a scanning schedule:
You can also edit or delete existing scanning schedules. |
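The default full scan schedule (7:00 P.M. on the last Friday of each month) can be illustrated with a short Python sketch. This is only a worked example of the date arithmetic, not Data Insight code; the function names `last_friday` and `default_full_scan_start` are hypothetical.

```python
import calendar
from datetime import date, datetime, time


def last_friday(year: int, month: int) -> date:
    """Return the date of the last Friday in the given month."""
    # monthrange returns (weekday of the first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    end_of_month = date(year, month, last_day)
    # date.weekday(): Monday is 0, Friday is 4.
    days_back = (end_of_month.weekday() - 4) % 7
    return end_of_month.replace(day=last_day - days_back)


def default_full_scan_start(year: int, month: int) -> datetime:
    """Combine the last Friday with the documented default time of 7:00 P.M."""
    return datetime.combine(last_friday(year, month), time(19, 0))


if __name__ == "__main__":
    print(default_full_scan_start(2024, 2))
```

For example, February 2024 ends on a Thursday, so the sketch steps back six days to Friday, February 23.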
Table: File system scanner settings - Incremental scan settings
Setting | Description |
---|---|
Maximum scans to run in parallel on this server | The Collector can perform multiple incremental scans in parallel. This setting puts a limit on the total number of incremental scans that can run in parallel on a Collector. The default value is two threads. Configure more threads, if you want scans to finish faster. The setting is disabled by default. |
Maximum shares per filer to scan in parallel | If multiple shares of a filer can be scanned in parallel, this setting puts a limit on the total number of shares of a filer that can be scanned in parallel. The default value is 2. |
Default scan schedule | Specifies how often incremental scans must be performed. By default, incremental scans are scheduled to run at 7:00 P.M. each night. Schedule incremental scans more or less frequently based on how up-to-date you need information in Data Insight to be. |
Pause scanner for specific times | You can configure hours of the day when scanning should not be allowed. This setting ensures that Data Insight does not scan during peak loads on the filer. This setting is enabled by default. Scans resume from the point they were at before they were paused. |
Pause scanner schedule | Configures when scanning should not be allowed to run. By default, scanning is paused from 7:00 A.M. to 7:00 P.M., Monday to Friday. |
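The pause schedule can be thought of as a list of day-and-time windows. The following Python sketch shows one way to check whether a given moment falls inside a pause window. The data layout, the `PAUSE_WINDOWS` name, and the choice to treat the end time as exclusive are all assumptions made for illustration; they do not describe how Data Insight stores or evaluates schedules.

```python
from datetime import datetime, time

# Hypothetical representation: (set of weekdays, start time, end time).
# Weekdays follow datetime.weekday(): Monday is 0, Sunday is 6.
PAUSE_WINDOWS = [
    ({0, 1, 2, 3, 4}, time(7, 0), time(19, 0)),  # documented default: Mon-Fri, 7 A.M. to 7 P.M.
]


def is_scanning_paused(now: datetime, windows=PAUSE_WINDOWS) -> bool:
    """Return True if 'now' falls inside any pause window."""
    for days, start, end in windows:
        # Assumption: the window includes its start and excludes its end.
        if now.weekday() in days and start <= now.time() < end:
            return True
    return False


if __name__ == "__main__":
    print(is_scanning_paused(datetime(2024, 5, 29, 10, 0)))  # a Wednesday morning
```

Adding a second tuple to the list models the case where weekends use a different pause window than weekdays.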
Table: File system scanner settings - Throttling for NetApp 7-mode and NetApp cluster-mode filer
Setting | Description |
---|---|
Use global settings | This option is selected by default. When this option is selected, Data Insight uses the throttling thresholds defined in the global settings. |
Use custom settings | Select to disable scanning or override the global throttling thresholds and define custom values. |
Throttle scanning based on latency of the filer | Clear the check box to disable scanning for the filers that the collector is monitoring. Select to enable throttling of Data Insight scans for NetApp 7-mode and Cluster-Mode file servers. This option is not selected by default. Data Insight collects latency information from NetApp file servers. It can use this information to throttle scanning, if latency of the file server increases above a certain level. This ensures scanner does not put additional load on the file server during peak load conditions. You can configure the following parameters to enable throttling for NetApp file servers:
|
Table: File system scanner settings - Common settings
Setting | Description |
---|---|
Scanner snapshot interval | Scanning a big share can take several hours. The scanner periodically saves information to a disk so that information is visible sooner without waiting for the entire scan to finish. You can configure how often information is saved to the disk by the scanner. By default, the scanner creates a snapshot of new information every 300 seconds (5 minutes). The minimum value you can set for this parameter is 300. |
Table: Indexer settings
Setting | Description |
---|---|
Total indexer threads | The indexer processes incoming scan and access event information for various shares and updates the per-share database. This setting configures how many databases can be updated in parallel. By default, 2 threads are configured. Specify a larger value for bigger setups where the indexer is not able to keep up with the incoming rate of information. This is indicated when you observe too many files in the inbox of the Indexer worker node. However, you must ensure that the Indexer has adequate CPU and memory when configuring a higher number of indexer threads. You need approximately 1 GB of RAM per indexer thread. |
Limit maximum events processed in memory | By default, the indexer processes all new incoming events in memory before saving the events to the disk. If you are running short of RAM on your Indexer, you can limit the maximum number of events that the indexer processes in memory before it saves them to the disk. Note: Specifying a small number makes the indexing very slow. |
Reconfirm deleted paths when reconciling full scan information | After Data Insight indexes full scan data, it computes the paths that no longer seem to be present on the file system. Similarly, if Data Insight discovers folders or subsites missing in a SharePoint site, then it computes those subsites or folders as deleted without performing a reconfirmation scan. Select this check box to have Data Insight re-confirm if those paths are indeed deleted using an incremental scan before removing them from the index. When the check box is clear, Data Insight readily removes the missing paths from the indexes without carrying out a re-confirmation. Note: Re-confirm scan is not supported for site collections. |
Indexer schedule | Specify how often an index should be updated with new information. By default, all new data is consumed once every four hours. Indexer gets better throughput if more information is given to it when indexing. However, if you configure a very high value, new information will not be visible in the Console for a much longer period. |
Indexer integrity checking schedule | Data Insight checks the integrity of its databases once a week. If any errors are found in the database, an event is published. You can configure a different schedule if required. |
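The guideline of approximately 1 GB of RAM per indexer thread lends itself to a quick sizing estimate. The sketch below is plain arithmetic based on that guideline; the function names and the `reserve_gb` parameter (memory set aside for the operating system and other services) are assumptions for illustration only.

```python
# Approximate figure taken from the Total indexer threads description.
GB_PER_INDEXER_THREAD = 1.0


def ram_needed_gb(threads: int) -> float:
    """Estimate the RAM that the given number of indexer threads needs."""
    return threads * GB_PER_INDEXER_THREAD


def max_indexer_threads(total_ram_gb: float, reserve_gb: float = 0.0) -> int:
    """Estimate how many indexer threads fit in RAM after a reserve is set aside."""
    usable = max(0.0, total_ram_gb - reserve_gb)
    return int(usable // GB_PER_INDEXER_THREAD)


if __name__ == "__main__":
    # Hypothetical Indexer with 8 GB of RAM, keeping 2 GB for everything else.
    print(max_indexer_threads(8.0, reserve_gb=2.0))
```

CPU capacity also limits the useful thread count, so treat the result as an upper bound rather than a recommendation.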
Table: Audit events preprocessor settings
Setting | Description |
---|---|
Audit events preprocessor schedule | Incoming raw audit events from file servers must be pre-processed before sending them to the Indexer. By default, raw events are processed every 2 hours. |
Batch size (MB) | The maximum size of the raw audit event files that a single Collector thread can process. The default batch size is 2 GB. |
Total Collector threads | The Collector can run multiple pre-processors in parallel. This setting configures how many instances can run in parallel. |
Table: FPolicy safeguard settings
Setting | Description |
---|---|
Use global settings | This option is selected by default. Use the FPolicy safeguard settings for NetApp 7-mode and Cluster-mode filers as defined in the global settings. See Configuring scanning and event monitoring. Data Insight collects latency information from NetApp file servers. It can use this information to initiate safeguard mode, if latency of the file server increases above or falls below a certain level. When the safeguard is in effect, Data Insight drops its FPolicy connection to the filer. This ensures that event collection does not put additional load on the file server in peak load conditions. If the latency on the physical file server increases above the configured threshold, Data Insight disconnects from the associated virtual file server. This information is also displayed on the Data Insight System Overview dashboard. |
Use custom settings | Select to disable the safeguard settings or to override the global safeguard thresholds and define custom values. |
Enable FPolicy safeguard settings | Select one of the following, as appropriate:
These FPolicy safeguard settings are not selected by default. When these check boxes are cleared, the safeguard settings are not in effect. Configure the following values:
|
Table: High availability settings
Setting | Description |
---|---|
Ping timeout (in minutes) | If a worker node does not heartbeat in the specified interval, the Management Server publishes an event to that effect. This setting is only applicable for the Management Server. |
Notify when CPU continuously over (percentage) | If CPU used on this server is consistently over the specified percentage, an event is published. (Default value: 90%) |
Notify when memory continuously over (percentage) | If Memory used on this server is consistently over the specified percentage, an event is published. (Default value: 80%) |
Notify when disk usage over (percentage) | If disk usage, either for the system drive or data drive, is over the specified threshold, an event is published. (Default value: 80%) |
Notify when disk free size under (MB) | If the free disk space for the system drive or data drive falls below the specified threshold in megabytes, an event is published. (Default value: 500 MB) |
Notify when number of files in | If Data Insight is not able to process an incoming file for some reason, that file is moved to an |
Notify when number of files in | If Data Insight is not able to process incoming data fast enough, the number of files in the transient folders, |
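The percentage and free-space thresholds in this table can be summarized as a simple rule check. The following Python sketch uses the documented default values, but everything else is an assumption for illustration: the `Thresholds` class, the event names, and the interpretation of "continuously over" as "every sample in the observation window exceeds the limit". It does not describe the product's actual monitoring logic.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Default values taken from the High availability settings table.
    cpu_percent: float = 90.0
    memory_percent: float = 80.0
    disk_usage_percent: float = 80.0
    disk_free_mb: float = 500.0


def continuously_over(samples, limit) -> bool:
    """True only if there is at least one sample and every sample exceeds the limit."""
    return bool(samples) and all(s > limit for s in samples)


def health_events(cpu_samples, memory_samples, disk_usage_percent,
                  disk_free_mb, t=Thresholds()):
    """Return the names (hypothetical) of the events that would be raised."""
    events = []
    if continuously_over(cpu_samples, t.cpu_percent):
        events.append("cpu")
    if continuously_over(memory_samples, t.memory_percent):
        events.append("memory")
    if disk_usage_percent > t.disk_usage_percent:
        events.append("disk_usage")
    if disk_free_mb < t.disk_free_mb:
        events.append("disk_free")
    return events


if __name__ == "__main__":
    print(health_events([95, 97, 92], [60, 85, 70], 50.0, 400.0))
```

Note that the free-space rule fires when the value drops below its threshold, while the other rules fire when the value rises above theirs.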
Table: Classification settings - Content scans for file system
Setting | Description |
---|---|
Maximum content scans to run in parallel on this server | The Classification Server can perform multiple content scans in parallel. This setting puts a limit on the total number of scans that can run in parallel on a Classification Server. The default value is two threads. Configure more threads, if you want scans to finish faster. Ensure that the server has enough resources to run the configured parallel threads. |
Maximum classification threads to run in parallel on this server | When content is being classified in parallel, this setting puts a limit on the total number of classification threads that you can run in parallel. |
Maximum shares per filer to content scan in parallel | When multiple shares of a filer are scanned in parallel, this setting puts a limit on the total number of shares of a filer that you can scan in parallel. |
Pause content scanner for specific times | You can configure the hours of the day when content scanning should not be allowed. This setting ensures that Data Insight does not scan during peak loads on the filer. The setting is enabled by default. Scans resume from the point they were at before they were paused. Content scanning is paused by default from Monday to Friday from 7:00 A.M. to 7:00 P.M. To configure the pause schedule, click the scheduler to configure the days and time on which content scanning should be paused. You can configure more than one pause schedule. Click to add a new pause schedule. |
Custom Property | The following custom properties are available. Veritas Classification Server master ID: This property represents the master Classification Server ID that is assigned to a slave Classification Server. Veritas Classification lb. disabled: This property is applicable to the master Classification Server only if at least one server is in the pool. It is used to turn the load balancing feature on or off. By default, it is set to false. To disable load balancing, set its value to true. Matrix.classify.fetch.max_batches: This global property sets the maximum number of batches per priority to keep in the content fetch queue on every slave. It is used only if load balancing is enabled. The default value is 10. |
Table: Classification settings - Content scans for SharePoint
Setting | Description |
---|---|
Maximum content scans to run in parallel on this server | The Classification Server can perform multiple content scans in parallel. This setting puts a limit on the total number of scans that can run in parallel on a Classification Server. The default value is two threads. Configure more threads, if you want scans to finish faster. Ensure that the server has enough resources to run the configured parallel threads. |
Maximum classification threads to run in parallel on this server | When content is being classified in parallel, this setting puts a limit on the total number of classification threads that you can run in parallel. |
Maximum site collections per web application to content scan in parallel | If multiple site collections can be scanned in parallel, this setting puts a limit on the total number of site collections per web application that you can scan in parallel. |
Pause content scanner for specific times | You can configure the hours of the day when content scanning should not be allowed. This setting ensures that Data Insight does not scan during peak loads on the SharePoint server. The setting is enabled by default. Scans resume from the point they were at before they were paused. Content scanning is paused by default from Monday to Friday from 7:00 A.M. to 7:00 P.M. To configure a different pause schedule, click the scheduler to configure the days and time on which content scanning should be paused. You can configure more than one pause schedule. Click to add a new pause schedule. |
Table: Reports settings
Setting | Description |
---|---|
Maximum memory when generating report output | Specifies the maximum memory that can be used for generating a report output. By default, it is 1024 MB on a 32-bit machine and 2048 MB on a 64-bit machine. |
Total threads for generating report output | Configure the number of threads for generating report output (PDF/HTML/CSV) in parallel. Default value is 2. This setting applies to the Management Server. |
Number of threads for a single report run | Configure the number of threads responsible for generating the report output database for a single report. This configuration applies to the Indexer node. This setting helps you speed up the process of report generation. For a particular Data Insight server, the thread count applies to all types of reports. |
Maximum reports that can run simultaneously | By default, Data Insight executes two reports in parallel. However, you can configure a higher value to run multiple reports in parallel. |
Table: Windows File Server agent settings
Setting | Description |
---|---|
Maximum kernel ring buffer size | The Windows File Server filter driver puts events in an in-memory buffer before the DataInsightWinnas service consumes them. By default, it uses a 10 MB buffer. You can use a bigger buffer. Data Insight publishes an event that indicates that events are being dropped due to a high incoming rate. Note: The buffer is in the kernel and is limited on a 32-bit operating system. |
Ignore accesses made by Local System account | The Windows File Server filter driver ignores accesses made by processes running with Local System account. This setting ensures that Data Insight can ignore most events originating from the operating system processes or other services like antivirus and backup. Clear this check box to enable monitoring accesses made by LOCAL SYSTEM account. This is not recommended on a production file server. |
Table: Veritas File System server settings
Setting | Description |
---|---|
Flush events on VxFS filer before audit | Set this option to true, if you want to force VxFS to flush its events to disk each time Data Insight requests information. This option is useful in Proof-of-Concept (POC) setups and enables you to see events faster. |
Maximum number of audit threads | This option determines how many filers to fetch audit information from in parallel. |
Maximum kernel ring buffer size (Number of records) | The access event records are saved in a log file on the VxFS filer before Data Insight consumes them. By default, 50,000 records can be saved in the log file. You can also specify a larger number. Data Insight publishes an event that indicates that events are being dropped due to a high incoming rate. |
Table: NFS settings
Setting | Description |
---|---|
Set default credentials for NFS scanner | Set this option to true if you want to allow Data Insight to use the specified User and Group ID to log in to scan NFS shares. |
User ID | The ID of the NFS user that Data Insight uses to scan the filer. You can set the value to 0 to allow root access from the Data Insight scan hosts. |
Group ID | The ID of the group that Data Insight uses to scan the filer. You can set the value to 0 to allow root access from the Data Insight scan hosts. |
Table: SharePoint settings
Setting | Description |
---|---|
Automatically delete audit events from SharePoint server that are older than (days) | When configuring a SharePoint web application, you can choose to let Data Insight delete audit logs that have already been fetched from SharePoint. By default, Data Insight deletes audit logs older than two days. Deletion of audit logs takes place every 12 hours. |
Schedule to fetch audit events from SharePoint server | Data Insight fetches new audit events from SharePoint periodically. By default, it does so every 2 hours. You can configure a different schedule. |
Scan multiple site collections of a web application in parallel | The Collector can perform multiple full scans in parallel. This setting configures how many full scans can run in parallel. The default value is 2 parallel threads. Configure more threads if you want scans to finish faster. The setting is disabled by default. |
Maximum site collections per web application to scan in parallel | If multiple site collections of a web application can be scanned in parallel, this setting puts a limit on the total number of site collections of a web application that you can scan in parallel. |
Default scan schedule | Specifies how often scans need to be performed. You can override this setting at a web application or site collection level. By default, scans are scheduled to repeat at 11:00 P.M. each night. |
Pause scanner for specific times | You can configure the hours of the day when scanning should not be allowed. This ensures that Data Insight does not scan during peak loads on the SharePoint servers. The setting is enabled by default. Scans resume from the point they were at before they were paused. |
Pause scanner schedule | Specify when scanning should not be allowed to run. By default, scanning is paused from 7:00 A.M. to 7:00 P.M., Monday to Friday. |
Pause auto-delete for specific times | You can configure the hours of the day when auto-delete of audit events from SharePoint server should not be allowed. This feature can help you to avoid overloading the SharePoint servers during the peak hours. By default, deletion of audit logs takes place every 12 hours. |
Pause schedule for auto-delete | Specify when auto-delete of the audit logs should not be allowed to run. |
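The auto-delete rule in this table (remove audit logs that have already been fetched and that are older than the configured number of days) can be sketched as a simple filter. The event dictionaries, their `timestamp` and `fetched` keys, and the function name are hypothetical and exist only for this illustration; the sketch does not call any SharePoint or Data Insight API.

```python
from datetime import datetime, timedelta


def deletable_audit_events(events, now: datetime, retention_days: int = 2):
    """Return the events that are already fetched and older than the retention period."""
    cutoff = now - timedelta(days=retention_days)
    return [e for e in events if e["fetched"] and e["timestamp"] < cutoff]


if __name__ == "__main__":
    now = datetime(2024, 6, 10, 12, 0)
    events = [
        {"id": 1, "timestamp": datetime(2024, 6, 7, 9, 0), "fetched": True},
        {"id": 2, "timestamp": datetime(2024, 6, 7, 9, 0), "fetched": False},
        {"id": 3, "timestamp": datetime(2024, 6, 9, 18, 0), "fetched": True},
    ]
    print([e["id"] for e in deletable_audit_events(events, now)])
```

In this example only the first event qualifies: the second has not been fetched yet, and the third is newer than the two-day default.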
Table: Troubleshooting settings
Setting | Description |
---|---|
Preserve intermediate files | As new data comes into a Data Insight system, it moves between various modules. In this process the original files are deleted and a new processed file is generated for the next stage of processing. To aid troubleshooting, select this check box to retain the intermediate data files. These files get stored in |
Preserve raw audit event files | Events processed by the Audit Pre-processor stage are deleted once consumed. If this setting is enabled, raw audit event files will be preserved in the |