Minimum O/S ulimit settings on primary and media server Linux/UNIX platforms

Article: 100022164
Last Published: 2023-03-20
Product(s): NetBackup & Alta Data Protection

Problem

A NetBackup install or upgrade detects that the O/S ulimit settings are below the recommended minimum.

The ulimit settings are most crucial on primary and media servers that execute many hundreds of simultaneous backup or restore jobs.

Settings that are too low will cause application faults and job failures once the concurrent job load exceeds the resources allowed to be consumed by the processes.  A host may run without issue for months or years, but then begin to fail once the load increases sufficiently.

The warning below can be ignored on most NetBackup client hosts unless the ulimit value is less than 1024 or the concurrent job/stream count is greater than 10.
 

Error Message

The following check may fail during either NetBackup install or NetBackup upgrade:

not ok ulimit_nofiles: nofiles ulimit <value> is too low.
  NetBackup Master and Media Server processes may run slower if they are
  limited to fewer than 8000 open file descriptors.  This test runs
  'ulimit -n' and checks that the result is at least 8000 on NetBackup
  servers.  See
    https://www.veritas.com/support/en_US/article.TECH75332
  for more information.

The following error message may be displayed or logged at other times:

Resource temporarily unavailable

Solution

The operating system (O/S) must be configured to allow NetBackup programs to utilize sufficient O/S resources to run the configured job load.  The error above is specific to:

  • Open files per process (nofile); recommended to be at least 8192 (soft) and 65536 (hard), and never set to unlimited.

Related file and process resources can also be reviewed and adjusted at the same time.

  • Number of processes and threads per user (nproc); recommended to be at least 65536, for nbwmc across Flex containers.
  • Maximum file size (fsize); recommended to be unlimited, never less than 500 GB.

Note: These resource limits are relevant to the root user, to the WEBSVC_USER (NetBackup 8.1+), and to the SERVICE_USER (NetBackup 9.1+).

When are the ulimit settings for a process determined?

At process startup, soft and hard resource limits are granted by the operating system.  Thus, changing the limits involves changing the runtime environment before a process is started.  Accordingly, be very aware of both ‘when’ and ‘how’ NetBackup processes are started on the host.

  1. From the system boot or reboot environment.
  2. From a clustering technology using a cluster, node, or resource on-line script, such as Veritas Cluster Server.
  3. Via an independent service group (Linux systemctl) or in association with a resource group (Solaris project).
  4. From a user login shell environment, by manual execution of a startup script:
    /usr/openv/netbackup/bin/goodies/netbackup
    /usr/openv/netbackup/bin/bp.start_all
  5. From a user login shell environment, by manual execution of an individual program: initbprd, initbpdbm, bpcd -standalone, etc.

Note: After startup, processes can change their current allocation up to the hard limit imposed by the operating system but may not use resources beyond that limit.  Processes can also change their current allocation to a lower value, but thereafter cannot raise it.  NetBackup services that require significant resources will, upon startup, request to set nofile to 8192 or 65536 or another value as appropriate for their needs.  The hard limit must be high enough to allow the request to succeed; otherwise the process may subsequently fail when the resource limit is reached.

Note: Linux (prlimit) and Solaris (plimit) allow some limits to be increased after a process is started, but many processes check the limits at startup to pre-configure and optimize operations.  Later changes to the limits will not have any effect because the code has already executed.  Do not adjust the ulimit settings for NetBackup processes on-the-fly.
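
The inheritance described in this section can be observed directly in a shell.  The following is a minimal sketch and not part of NetBackup: the helper name show_inherited_nofile and the value 256 are assumptions chosen for illustration.  It changes the soft nofile limit inside a subshell, starts a child process, and shows that the child sees the changed value while the calling shell keeps its own.

```shell
#!/bin/sh
# Illustrative sketch: a child process inherits the limits in effect when it starts.
# show_inherited_nofile is a hypothetical helper name, not a NetBackup command.
show_inherited_nofile() {
    # The parentheses create a subshell, so the caller's own limits do not change.
    (
        ulimit -S -n "$1" || exit 1
        # The child shell reports the soft nofile limit it was started with.
        sh -c 'ulimit -S -n'
    )
}

echo "calling shell soft nofile:  $(ulimit -S -n)"
echo "child started with 256 set: $(show_inherited_nofile 256)"
echo "calling shell afterwards:   $(ulimit -S -n)"
```

The same reasoning applies to NetBackup daemons: they take on whatever limits are in force in the environment that launches them, which is why the limits must be corrected before startup.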

How are ulimit settings for a process configured?

All operating systems provide methods to temporarily set or constrain the limits for the various environments that they host, and for persisting changes through a host reboot.  The exact methods and default values depend on the operating system version and the features it provides.  

Review the operating system vendor documentation for complete details; sysctl, security/limits.d, systemctl/system.d/system, ulimit, PAM, etc.

Examples of some of the more common configuration methods for default environments are shown below.  If an environment has already been customized, there may be overriding values configured in other places such as /etc/profile, /etc/bashrc, /opt/VRTSvcs/bin/vcsenv, etc.  Seek assistance from the operating system administrator or vendor as needed. 

For Linux, do sections A, B, C, and D.
For Solaris, do sections A, B, and E.
For AIX, do sections B and F.
For HP-UX, do sections B and G.
 

A) [Linux/Solaris] Verifying the current ulimit settings of running processes.

Obtain a short list of PIDs for NetBackup processes that simultaneously use thousands of file descriptors/handles and many process threads.

$ /usr/openv/netbackup/bin/bpps | egrep 'nbwmc|beam.smp|vnetd.*inbound|bpjobd' | cut -c1-100
nbsvcusr  5089     1  0 20:42 ?      00:00:00 /usr/openv/netbackup/bin/vnetd -proxy inbound_proxy
nbwebsvc  5487     1 46 20:43 ?      00:01:24 /usr/openv/java/jre/bin/java -Dnop -Djava.util.loggi
nbwebsvc  5977     1  3 20:43 ?      00:00:04 /opt/openv/mqbroker/erlang/erts-11.1/bin/beam.smp -W
nbsvcusr  6830  6819  0 20:43 pts/0  00:00:00 /usr/openv/netbackup/bin/bpjobd

Note: Starting with NetBackup 9.1, bpjobd and vnetd -proxy inbound are owned by the SERVICE_USER which should be a non-root login.

Check the ulimit settings for each of the displayed PIDs, for example:

Linux$ prlimit --pid=5089 | egrep -i '^RESOURCE|nofile|nproc|fsize'
RESOURCE   DESCRIPTION                      SOFT       HARD UNITS
FSIZE      max file size              2147483648 2147483648 blocks
NOFILE     max number of open files         8192       8192 
NPROC      max number of processes         27147      27147

Solaris$ plimit 5089 | egrep -i 'nofile|file.*blocks'
  file(blocks)          unlimited       unlimited
  nofiles(descriptors)  8192            8192
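
Where prlimit is not installed, the same information can usually be read on Linux from /proc/<PID>/limits.  The sketch below assumes that file layout (a "Max open files" row with soft and hard columns); the helper name nofile_of_pid is hypothetical, and reading the limits of processes owned by another user may require root.

```shell
#!/bin/sh
# Print "soft hard" for the "Max open files" limit of one process (Linux /proc only).
# nofile_of_pid is an illustrative helper name, not a NetBackup command.
nofile_of_pid() {
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Report the nofile limits for each PID passed on the command line.
for pid in "$@"; do
    echo "PID $pid nofile (soft hard): $(nofile_of_pid "$pid")"
done
```

Saved under a name of your choosing, the script would be run with the PIDs taken from the bpps output above, for example: sh check_nofile.sh 5089 5487 5977 6830.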

Note: On Linux, the total system-wide number of concurrently open files actively in use across all processes should also be verified. This limit is set by the fs.file-max setting. Veritas recommends that this number is at least 65536 for all NetBackup versions prior to 10.2, or at least 131072 for versions 10.2 and above. It is likely that the current value is already higher than these minimum values, and if that is the case it should not be changed.

This value must accommodate all applications on the host, and Veritas cannot make a recommendation for the number required by other applications.

Previous configuration changes to reduce fs.file-max or applications holding exceptionally large numbers of open files may cause the system to hit this file limit. If the limit is reached, active primary servers will encounter job failures with status 800, and the syslog will show that the “file-max limit” has been reached.

Check/verify both the number of open files (8032) and the maximum open files (688307). The latter value should not be suspiciously small; it is typically one million or higher.

Linux$ sysctl fs.file-nr
fs.file-nr = 8032       0       688307

Linux$ sysctl fs.file-max
fs.file-max = 688307

To increase the value, first edit 'fs.file-max' in the /etc/sysctl.conf file, then run 'sysctl -p' to apply the changes.
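
As a quick way to see how close a host is to the system-wide limit described above, the three fs.file-nr numbers can be turned into a usage percentage.  This is a sketch only; the helper name file_nr_usage and the wording of its output are assumptions.

```shell
#!/bin/sh
# Summarize an fs.file-nr triple (allocated, unused, maximum) read from standard input.
# file_nr_usage is an illustrative helper name.
file_nr_usage() {
    awk '{ printf "%d of %d file handles allocated (%.1f%%)\n", $1, $3, 100 * $1 / $3 }'
}

# On Linux the same three numbers are exposed in /proc/sys/fs/file-nr.
if [ -r /proc/sys/fs/file-nr ]; then
    file_nr_usage < /proc/sys/fs/file-nr
fi
```

A persistently high percentage suggests reviewing fs.file-max before job failures with status 800 start to appear.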

Note: Solaris does not have a system-wide limit to the number of open files by all processes.  It is constrained only by available memory.
 

B) Checking/Changing ulimit settings within the login shell environment before manual command-line startup of processes

Confirm the current user ID, and then review the current soft (-S) and hard (-H) limits for the maximum number of open files per process (nofile), the maximum file size (fsize), and the maximum number of process threads per user (nproc).

$ id -a
uid=0(root) gid=0(root) groups=0(root)…

$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size            (blocks, -f) 1097152  <== Lower than recommended value
open files                   (-n) 1024     <== Lower than recommended value
max user processes           (-u) 2048     <== Lower than recommended value

$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size            (blocks, -f) unlimited
open files                   (-n) 65535
max user processes           (-u) 65536

Note: HP-UX does not support ulimit -u (nproc).
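
The comparison against the recommended values can be scripted.  The sketch below checks only nofile, because the option letter for nproc differs between shells; the helper names limit_ok and report are hypothetical, and the thresholds are the 8192 (soft) and 65536 (hard) values recommended in this article.

```shell
#!/bin/sh
# Return success when a limit value meets a minimum; "unlimited" always passes.
# limit_ok and report are illustrative helper names.
limit_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Arguments: label, current value, recommended minimum.
report() {
    if limit_ok "$2" "$3"; then
        echo "$1 = $2: ok"
    else
        echo "$1 = $2: below recommended $3"
    fi
}

report "nofile (soft)" "$(ulimit -S -n)" 8192
report "nofile (hard)" "$(ulimit -H -n)" 65536
```

Note that this check treats an unlimited nofile value as passing the minimum, although the article recommends never setting nofile to unlimited.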

If any specific soft limit is less than the recommended value, increase it to the recommended value. If the hard limit is also too low, the limits used to configure the terminal shell will need to first be adjusted by the system administrator using one of the sections below or another technique.

$ ulimit -f unlimited
$ ulimit -n 8192
$ ulimit -u 65536

Note: Do not decrease limits that are already set higher than the recommended values.

Always verify the expected changes were successful. 

$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size            (blocks, -f) unlimited   <== Raised by ulimit -f unlimited
open files                   (-n) 8192        <== Raised by ulimit -n 8192
max user processes           (-u) 65536       <== Raised by ulimit -u 65536

$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
file size            (blocks, -f) unlimited
open files                   (-n) 8192        <== Lowered by ulimit -n 8192
max user processes           (-u) 65536

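Note: Do not change limits needlessly.  In most shells, running ulimit without -S or -H sets the hard limit as well as the soft limit, so the change immediately constrains the hard limit.  Notice that the hard limit for nofile was lowered from 65535 to 8192.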
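
To make the difference concrete, the sketch below sets nofile once with -S and once without any flag, each in its own subshell so that the calling shell is not affected.  The helper names soft_only and soft_and_hard and the value 256 are assumptions for illustration; the behavior shown is that of bash, ksh, and dash, and other shells may differ.

```shell
#!/bin/sh
# Set only the soft nofile limit, then print "soft hard" as seen afterwards.
# soft_only and soft_and_hard are illustrative helper names.
soft_only() {
    ( ulimit -S -n "$1" && echo "$(ulimit -S -n) $(ulimit -H -n)" )
}

# Set nofile with neither -S nor -H; the shell then changes both limits.
soft_and_hard() {
    ( ulimit -n "$1" && echo "$(ulimit -S -n) $(ulimit -H -n)" )
}

echo "before:           $(ulimit -S -n) $(ulimit -H -n)"
echo "ulimit -S -n 256: $(soft_only 256)"
echo "ulimit -n 256:    $(soft_and_hard 256)"
```

Using 'ulimit -S -n 8192' in place of 'ulimit -n 8192' is therefore one way to raise the soft limit while leaving a higher hard limit untouched.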

Then stop NetBackup processes, confirm they are down, and restart.

$ /usr/openv/netbackup/bin/goodies/netbackup stop
$ /usr/openv/netbackup/bin/bpps -a
$ /usr/openv/netbackup/bin/goodies/netbackup start
or
$ /usr/openv/netbackup/bin/bp.start_all

For Linux and Solaris, verify the expected ulimit values are in use after process restart.  See section A.
 

C) [Linux] Checking/Changing ulimit settings for future login shell environments.

The exact method for persisting changes for future login shells varies by O/S distribution and version but is generally controlled by PAM entries in the /etc/security/limits.conf file.  

Note: Do not perform this step on a Flex-based instance.  Appropriate ulimit settings are set for Flex instances. Please contact Support for assistance if adjustments are necessary.

Any current non-default settings will generally be located in these files.  

$ egrep -i 'no*file|no*proc|fi*l*e*size' /etc/security/limits* /etc/security/limits.d/*.conf 2>/dev/null
… snip …
/etc/security/limits.d/20-nproc.conf:*          soft    nproc     4096
/etc/security/limits.d/20-nproc.conf:root       soft    nproc     unlimited
/etc/security/limits.conf:#        - fsize - maximum filesize (KB)
/etc/security/limits.conf:#        - nofile - max number of open file descriptors
/etc/security/limits.conf:#        - nproc - max number of processes

Also review these next outputs to confirm existing or default soft and hard limits while running as the root user.  

$ id -a
uid=0(root) gid=0(root) groups=0(root)
$ ulimit -a -S | egrep "\-n|\-f|\-u"
$ ulimit -a -H | egrep "\-n|\-f|\-u"

Then repeat for the other non-root users that own NetBackup processes.

Replace ‘nbwebsvc’ with the configured login name for the WEBSVC_USER used in NetBackup 8.1+.

$ su nbwebsvc
nbwebsvc$ ulimit -a -S | egrep "\-n|\-f|\-u"
nbwebsvc$ ulimit -a -H | egrep "\-n|\-f|\-u"
nbwebsvc$ exit

Replace ‘nbsvcusr’ with the configured login name for the SERVICE_USER used in NetBackup 9.1+.

$ su nbsvcusr
nbsvcusr$ ulimit -a -S | egrep "\-n|\-f|\-u"
nbsvcusr$ ulimit -a -H | egrep "\-n|\-f|\-u"
nbsvcusr$ exit 

If any outputs above were less than the recommended values, add or change appropriate entries to either a /etc/security/limits.d/*.conf file if it exists, or to the /etc/security/limits.conf file.  On some O/S versions, the asterisk (*) in the user column matches only non-root users, and it is necessary to add entries specifically for the root user.  If the site does not want these settings applied to all (*) users, then add rows specific to the NetBackup login names for the SERVICE_USER and WEBSVC_USER. 

*    soft    nofile         8192
*    hard    nofile        65536
*    soft    nproc         65536
*    hard    nproc         65536
*    soft    fsize     unlimited
*    hard    fsize     unlimited

Note: Do not decrease limits that are already set higher than the recommended values.
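
For sites that prefer rows for specific users over the asterisk (*), the entries can be generated rather than typed by hand.  This is a sketch under stated assumptions: limits_rows is a hypothetical helper, the login names nbwebsvc and nbsvcusr are the example names used in this article, and the output is only printed so that it can be reviewed before being added to a limits file.

```shell
#!/bin/sh
# Print the recommended limits.conf rows for each login name given as an argument.
# limits_rows is an illustrative helper name; nothing is written to /etc.
limits_rows() {
    for user in "$@"; do
        printf '%-10s soft    nofile    8192\n'      "$user"
        printf '%-10s hard    nofile    65536\n'     "$user"
        printf '%-10s soft    nproc     65536\n'     "$user"
        printf '%-10s hard    nproc     65536\n'     "$user"
        printf '%-10s soft    fsize     unlimited\n' "$user"
        printf '%-10s hard    fsize     unlimited\n' "$user"
    done
}

# Replace these names with the configured SERVICE_USER and WEBSVC_USER logins.
limits_rows root nbwebsvc nbsvcusr
```

Remove any generated row that would lower a limit already set higher on the host.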

Confirm PAM is enabled appropriately.  One of the ‘/etc/pam.d/*’ files should contain a line similar to ‘session required pam_limits.so’.  The specific pathname for the shared object library will vary by O/S release and version.
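
One way to perform this check is to list the PAM files that carry an uncommented session entry for pam_limits.so.  The sketch below assumes the conventional /etc/pam.d directory; the helper name pam_limits_files is hypothetical.

```shell
#!/bin/sh
# List files in a PAM directory (default /etc/pam.d) that contain an
# uncommented "session ... pam_limits.so" line.
# pam_limits_files is an illustrative helper name.
pam_limits_files() {
    grep -l -E '^[[:space:]]*session[[:space:]].*pam_limits\.so' "${1:-/etc/pam.d}"/* 2>/dev/null
}

pam_limits_files || echo "no session entry for pam_limits.so found"
```

An empty result does not prove that limits are ignored, since some distributions pull the entry in through included files, but it is a reason to look more closely.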

The new limits will take effect for new login sessions.  To verify, see section B after starting a new terminal shell.
 

D) [Linux] Checking/Changing ulimit settings for systemd controlled services, both NetBackup and Veritas Cluster Server.

NetBackup versions through at least 10.1 are not formal systemctl/systemd service units.  However, some versions of systemctl will still allow the system administrator to start, check the status of, and stop NetBackup, and that may cause a different configuration of ulimit values to be used.

Because NetBackup could be started by systemd, check for pre-existing configuration that constrains nofile, fsize, or nproc.  

$ systemctl show netbackup | egrep -i 'nofile|nproc|fsize'
LimitFSIZE=2097152
LimitNOFILE=1024
LimitNPROC=2048
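
The output above can also be filtered so that only the values needing attention are printed.  This is a sketch: low_systemd_limits is a hypothetical helper that reads key=value lines on standard input and applies the thresholds recommended in this article (nofile 8192, nproc 65536, fsize unlimited).

```shell
#!/bin/sh
# Print only the systemd limit properties that fall below the recommended values.
# low_systemd_limits is an illustrative helper name.
low_systemd_limits() {
    awk -F= '
        $1 == "LimitNOFILE" && $2 != "infinity" && $2 < 8192  { print $1 "=" $2 }
        $1 == "LimitNPROC"  && $2 != "infinity" && $2 < 65536 { print $1 "=" $2 }
        $1 == "LimitFSIZE"  && $2 != "infinity"               { print $1 "=" $2 }
    '
}

# Demonstration using the sample values shown in this section.
printf 'LimitFSIZE=2097152\nLimitNOFILE=1024\nLimitNPROC=2048\n' | low_systemd_limits
```

On a live host the input would come from the command used above, for example: systemctl show netbackup | low_systemd_limits.  No output means that none of the three properties is below the recommendation.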

If NetBackup is constrained to values less than the recommended values, then configure the recommended values to avoid problems should systemd be used to start NetBackup.  Use ‘infinity’ to specify ‘unlimited’ when relevant.  On some platforms it may be necessary to use ‘systemctl edit --force netbackup’.

Note: Edit this next command to remove key=value pairs that should not be lowered from higher values that already exist, including the preceding newline (\n), as necessary.

$ echo -e "[Service]\nLimitNOFILE=8192\nLimitFSIZE=infinity\nLimitNPROC=65536" | SYSTEMD_EDITOR="tee" systemctl edit netbackup

Confirm the settings were saved to the service unit specific file and are available to be used upon subsequent restart of the service.

$ systemctl show netbackup | egrep -i 'nofile|nproc|fsize'
LimitFSIZE=infinity
LimitNOFILE=8192
LimitNPROC=65536

$ egrep -I -r -i 'nofile|nproc|fsize' /etc/sys* /usr/lib/sys* /run/sys* | grep netbackup
/etc/systemd/system/netbackup.service.d/override.conf:LimitNOFILE=8192
/etc/systemd/system/netbackup.service.d/override.conf:LimitFSIZE=infinity
/etc/systemd/system/netbackup.service.d/override.conf:LimitNPROC=65536

Use systemctl to stop the service, confirm all processes are down, ensure systemctl has loaded the configuration changes, and then restart the service. This should pick up the config changes performed above.

$ systemctl stop netbackup
$ /usr/openv/netbackup/bin/bpps -a
$ systemctl daemon-reload
$ systemctl start netbackup

Verify the expected ulimit values are in use after the restart; see section A.

Note: If NetBackup is under the control of a clustering technology, also check if the cluster service unit is under systemctl control.  This example is for Veritas Cluster Server; use the appropriate service unit name for other clustering software.

$ systemctl status vcs

$ systemctl show vcs | egrep -i 'nofile|nproc|fsize'

Note: Edit this next command to remove key=value pairs that should not be lowered from higher values that already exist, including the preceding newline (\n), as necessary.

$ echo -e "[Service]\nLimitNOFILE=8192\nLimitFSIZE=infinity\nLimitNPROC=65536" | SYSTEMD_EDITOR="tee" systemctl edit vcs

$ systemctl show vcs | egrep -i 'nofile|nproc|fsize'

$ egrep -I -r -i 'nofile|nproc|fsize' /etc/sys* /usr/lib/sys* /run/sys* | grep vcs
 

E) [Solaris] Checking/Changing persistent ulimit settings for future login shell environments or project programs. 

Oracle recommends implementing ulimit changes via projects, but the deprecated technique of updating the /etc/system and/or /etc/system.d/* file(s) is still permitted.  

Review the current project and non-default system settings.

$ projects -l -v

$ egrep -v '^\*|^$' /etc/system /etc/system.d/* 2>/dev/null

The following commands either adjust the resources for processes running under a project named ‘netbackup’ or append entries to the system file(s).  If values already exist in system file(s), edit the existing entries and change the values instead of appending additional conflicting entries.  Updates to the system file(s) require a reboot to take effect.

Note: Do not decrease limits that are already set higher than the recommended values.

Increasing the hard and soft limits for nofile:

$ projmod -s -K "process.max-file-descriptor=(priv,65536,deny)" netbackup
$ projmod -s -K "process.max-file-descriptor=(basic,8192,deny)" netbackup
or
$ echo 'set rlim_fd_max=65536' >> /etc/system
$ echo 'set rlim_fd_cur=8192' >> /etc/system

Increasing the hard and soft limits for fsize:

$ projmod -s -K "process.max-file-size=(priv,107374182400,deny)" netbackup
$ projmod -s -K "process.max-file-size=(basic,107374182400,deny)" netbackup
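
The fsize figure used above, 107374182400, corresponds to 100 GiB, assuming the resource control is expressed in bytes as the Solaris documentation describes.  The sketch below shows the arithmetic so that a different size can be substituted; gib_to_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a size in GiB to bytes for use in a process.max-file-size value.
# gib_to_bytes is an illustrative helper name.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 100
```

Substitute the size that matches the site's requirement before running projmod.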

Increasing the hard and soft limits for nproc:

$ projmod -s -K "project.max-processes=(priv,65536,deny)" netbackup
$ projmod -s -K "project.max-processes=(basic,65536,deny)" netbackup
or
$ echo 'set maxuprc=65536' >> /etc/system

Note: Be sure to apply the project to the scripts or processes that start the NetBackup application.

Verify the expected ulimit values are in use after restarting NetBackup under control of the project or rebooting to pick up the system file(s) changes.  See section A.
 

F) [AIX] Checking/Changing persistent ulimit settings for future login shell environments.

The exact method for persisting changes for future login shells varies by O/S version but is generally configured by entries in the /etc/security/limits file.  

$ egrep -i 'no*file|no*proc|fi*l*e*size' /etc/security/limits* /etc/security/limits.d/*.conf 2>/dev/null
/etc/security/limits:* fsize      - soft file size in blocks
/etc/security/limits:* nofiles    - soft file descriptor limit
/etc/security/limits:* fsize_hard - hard file size in blocks
/etc/security/limits:* nofiles_hard - hard file descriptor limit
/etc/security/limits:*   fsize_hard    set to fsize
/etc/security/limits:*   nofiles_hard      -1
/etc/security/limits:    fsize = -1
/etc/security/limits:    nofiles = 2000 

Review these outputs to confirm existing or default soft and hard limits while running as the root user.  Then repeat for other non-root users that own NetBackup processes.

$ id -a
uid=0(root) gid=0(root) groups=0(root)
$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'

Replace ‘nbwebsvc’ with the configured login name for WEBSVC_USER used on NetBackup 8.1+ primary servers.

$ su nbwebsvc
nbwebsvc$ ulimit -a -S | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
nbwebsvc$ ulimit -a -H | egrep '\-n|^nofile|\-f|^file|\-u|^processes'
nbwebsvc$ exit

If any outputs above were less than the recommended values, add or change appropriate entries in the /etc/security/limits file.  In this example, an unlimited fsize is already the default.

Note: Do not decrease limits that are already set higher than the recommended values.

default:
    fsize = -1
    ...snip...
    nofiles = 2000

nbwebsvc:
    nofiles = 8192
    nofiles_hard = 65536
    nproc = 65536

Note: Some versions of AIX use ‘-1’ in place of ‘unlimited’.

The new limits will take effect for new login sessions.  To verify, see section B after starting a new terminal shell.
 

G) [HP-UX] Checking/Changing persistent ulimit settings for future login shell environments.

Please review available options with the system administrator and/or the O/S vendor.

To verify, see section B after starting a new terminal shell or rebooting the host as needed.
 

 
