The NetBackup Appliance may become unresponsive or show memory allocation errors in the system messages log.
Problem
NetBackup 52x0 Appliances may become unresponsive or show memory allocation errors in the system messages log. In some cases, this is caused by an excessive number of appliance monitoring processes. The parent monitoring process appears in the process list as callhome.pl. Typically, only 1-2 callhome.pl processes should be active. If the output of the ps -ef command shows many (10-20+) active callhome.pl processes, the Appliance is likely affected. Left unchecked, callhome.pl processes may continue to accumulate until all memory is used and the Appliance becomes unresponsive.
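The check described above can be scripted. The following is a minimal sketch, not a Veritas-supplied tool: it assumes the standard eight-column ps -ef layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD), and the function names and the threshold constant are illustrative, with the threshold taken from the low end of the "10-20+" figure quoted above.

```python
import subprocess

PROCESS_NAME = "callhome.pl"
# Assumption: 10 is the low end of the "10-20+" range given in this article.
SUSPECT_THRESHOLD = 10


def read_ps_output() -> str:
    """Return the text printed by `ps -ef` on the local system."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return result.stdout


def count_callhome(ps_output: str) -> int:
    """Count lines of `ps -ef` output whose command column mentions callhome.pl."""
    count = 0
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the command (with its arguments)
        # stays together in the last field.
        fields = line.split(None, 7)
        if len(fields) == 8 and PROCESS_NAME in fields[7]:
            count += 1
    return count


def is_likely_affected(count: int) -> bool:
    """Return True when the count reaches the illustrative threshold."""
    return count >= SUSPECT_THRESHOLD
```

To use it on a live system, pass the result of read_ps_output() to count_callhome() and then to is_likely_affected(). A count of 1-2 is normal according to this article; counts between 3 and 9 are not classified here and would need a closer look.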
Error Message
When this issue is occurring (processes are starting but not exiting), the system messages log may report memory allocation errors as RAM is exhausted. Eventually, the Appliance may become unresponsive to backup requests, logins, etc.
Cause
The callhome.pl script forks several child processes. If any of these child processes hangs, callhome.pl waits for it indefinitely and never exits.
Solution
Veritas Corporation has acknowledged that the above-mentioned issue (ETrack 3152119) is present in the current version(s) of the product(s) mentioned in this article. Veritas Corporation is committed to product quality and satisfied customers. This issue was scheduled to be addressed in the following release:
- NetBackup 52x0 Appliances 2.5.4
When NetBackup Appliances 2.5.4 is released, please visit this link for download and README information:
https://go.Veritas.com/nba
Please note that Veritas Corporation reserves the right to remove any fix from the targeted release if it does not pass quality assurance tests or introduces new risks to overall code stability. Veritas's plans are subject to change and any action taken by you based on the above information or your reliance upon the above information is made at your own risk.
Workaround:
If running version 2.5.3, please download the RPM attached below and access the Related Article linked below for instructions on applying the hotfix on an Appliance.
Additional code checks have been added to callhome.pl to ensure a timeout is placed on forked processes. This ensures callhome.pl exits within a reasonable time period.
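The general technique of placing a timeout on a child process can be illustrated as follows. This is a minimal Python sketch of the idea only; it is not the code from the hotfix, callhome.pl itself is a Perl script, and the function name and timeout values are hypothetical.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout_seconds: float) -> Optional[int]:
    """Run a child process, but never wait longer than timeout_seconds.

    Returns the child's exit code, or None if the child had to be killed
    because it did not finish in time.
    """
    try:
        # subprocess.run kills and reaps the child when the timeout expires,
        # then raises TimeoutExpired, so the parent cannot block forever.
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


def demo() -> None:
    """Show one child that finishes and one that hangs past the timeout."""
    quick = [sys.executable, "-c", "pass"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print("quick child:", run_with_timeout(quick, 10))
    print("slow child:", run_with_timeout(slow, 1))
```

Without the timeout, a parent waiting on the slow child would block for as long as the child runs, which is the behavior described under Cause. With it, the parent regains control after a bounded interval and can exit.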
Note: This hotfix is no longer publicly available, as all of its fixes are included in the NetBackup Appliances 2.5.4 Release Update. If this issue is experienced in a NetBackup Appliances 2.5.x environment, the supported resolution is to apply the latest maintenance release OR upgrade to NetBackup Appliances 2.6 or above.
Applies To
This issue affects NetBackup 52x0 Appliances through version 2.5.3.