What is the health checker?
The PureDisk Health Checker performs a series of tests on all components, workflows and configuration
settings in the storage pool. These are validated and matched against recommended values.
What is it not?
The Health Checker does not:
- Check the consistency of metadata (recoverMB)
- Check the consistency of the data store (recoverCR)
- Collect and analyze logs (PDgetlogs/PDX_AnaLog)
The PureDisk Health Checker is designed to run on all versions of PureDisk Remote Office
or NetBackup 50x0 Deduplication Appliances, but has only been tested extensively on the
following versions and above:
Deduplication Appliance 184.108.40.206
If a failure occurs on a lower version, note that such a version is not recommended
and the customer should upgrade as soon as possible. The Health Checker tool will not
be fixed for any versions lower than the ones mentioned here.
How to deploy?
Extract the tarball on the SPA node, with the following command:
tar xvzf health_checker_*.tgz -C /opt/pdconfigure/scripts/support/
The tarball only needs to be extracted on the SPA node. The "multi-node" option (see below)
automatically copies the required files to the other nodes in the storage pool.
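The deployment step above can be wrapped in a small shell function that also confirms the script ended up where the later commands expect it. This is only a minimal sketch: the function name deploy_health_checker and its optional destination argument are illustrative and not part of the tool. The default destination and the health_checker.php file name are taken from the commands in this article.

```shell
#!/bin/sh
# Sketch: extract the Health Checker tarball and confirm the script is present.
# deploy_health_checker is a hypothetical helper, not shipped with PureDisk.
deploy_health_checker() {
    tarball=$1
    # Default destination as used in this article; can be overridden (assumption).
    dest=${2:-/opt/pdconfigure/scripts/support/}

    [ -f "$tarball" ] || { echo "tarball not found: $tarball" >&2; return 1; }
    [ -d "$dest" ] || { echo "destination not found: $dest" >&2; return 1; }

    # Same extraction as the manual command, without the verbose file listing.
    tar xzf "$tarball" -C "$dest" || return 1

    # The article invokes health_checker.php directly from the destination
    # directory, so that is the file checked for here.
    if [ -f "$dest/health_checker.php" ]; then
        echo "health checker deployed to $dest"
    else
        echo "health_checker.php not found in $dest after extraction" >&2
        return 1
    fi
}
```

For example, on the SPA node: deploy_health_checker health_checker_*.tgz (assuming exactly one matching tarball in the current directory).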
How to run?
The Health Checker comes with the following options:
pd_spa:~ # /opt/pdag/bin/php /opt/pdconfigure/scripts/support/health_checker.php --help
Usage: /opt/pdag/bin/php health_checker.php [-v] [-o <file>] [-nc] [-m] [-q]
-v : verbose
-q : quiet (only show test output, not test itself
-m : run multi-node (start on SPA node)
-o <file> : output to file
-r : show recommended actions
-nc : no colors in output
--version : show current version
--help : What you are looking at
You can run the health checker on any node in the storage pool, or run it on the SPA node
with the multi-node ("-m") option to collect the report from each node automatically.
The "verbose" and "recommended actions" flags give more info about the test and what to do
in case of issues.
The run time of the Health Checker is kept to a minimum (a few minutes), so it can easily be
run on a live environment and return results quickly.
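As an illustration of combining the options listed above, the following sketch runs a multi-node check from the SPA node with recommended actions, no colors, and the report written to a file. The function name run_health_check and the HC_PHP / HC_SCRIPT override variables are hypothetical conveniences. The default paths and the flags are the ones shown in the help output. How the tool behaves when these flags are combined has not been verified here.

```shell
#!/bin/sh
# Sketch: run the Health Checker in multi-node mode and save the report.
# run_health_check, HC_PHP and HC_SCRIPT are illustrative names (assumptions).
run_health_check() {
    report=$1
    php_bin=${HC_PHP:-/opt/pdag/bin/php}
    script=${HC_SCRIPT:-/opt/pdconfigure/scripts/support/health_checker.php}

    [ -n "$report" ] || { echo "usage: run_health_check <report-file>" >&2; return 2; }

    # -m  : multi-node (start on the SPA node)
    # -r  : show recommended actions
    # -nc : no colors, so the saved file contains no color codes
    # -o  : write the output to the given file
    "$php_bin" "$script" -m -r -nc -o "$report"
}
```

For example: run_health_check /tmp/health_report.txt (the report path is an arbitrary choice). The function returns whatever exit status the tool itself returns.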
How to read the output?
The output contains either nothing (no issues or recommendations), or findings at one or more
of the following severity levels:
NOTICE: an item for the user to note, but no impact on the system
WARNING: an item that requires attention, but will not cause issues or errors
ERROR: requires immediate attention and can be the cause of various errors and failures
CRITICAL: high-severity issue that can lead to data loss or corruption. Must be addressed
immediately by the user and/or a support engineer.
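When a report has been saved to a file (the "-o" option), a quick tally per severity level can help decide where to start. The sketch below assumes that each finding appears on its own line and contains the severity label followed by a colon, as in the list above. The actual report layout is not specified in this article, so the pattern may need adjusting. The function name summarize_report is illustrative.

```shell
#!/bin/sh
# Sketch: count findings per severity level in a saved Health Checker report.
# Assumes each finding line contains "<LEVEL>:" (layout not confirmed).
summarize_report() {
    report=$1
    [ -r "$report" ] || { echo "cannot read report: $report" >&2; return 2; }

    for level in NOTICE WARNING ERROR CRITICAL; do
        # grep -c prints the number of matching lines (0 when none match);
        # "|| true" hides grep's non-zero exit status on zero matches.
        count=$(grep -c "${level}:" "$report" || true)
        echo "$level $count"
    done

    # Return 1 when at least one ERROR or CRITICAL finding is present.
    if grep -Eq '(ERROR|CRITICAL):' "$report"; then
        return 1
    fi
    return 0
}
```

The non-zero return value on ERROR or CRITICAL findings is a design choice of this sketch, made so the function can be used in a scripted check.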
Important: a single issue (for example, a service that is not running) might result in multiple
errors from the Health Checker. Read through the entire report first to identify common root
causes, then address those, rather than fixing each reported item individually.
A full listing of tests, descriptions, example output, recommended actions and manual
verification steps is available in the Health Checker documentation. Match the test results
you find on the environment to the documentation.
PDgetlogs (the log file collection tool) also collects a health_checker.log for each node,
starting at versions PureDisk 220.127.116.11 EEB20 and Deduplication Appliance 1.4.
Attached to this tech article is more information about the recommended actions for each
test and the steps to verify each issue manually. This is a constant work in progress and
will be updated frequently.