Problem
The initial cluster configuration fails because the nodes cannot synchronize with the chosen NTP server:
Error Message
In /opt/VRTS/install/logs/ this error is seen:

ntpd: no servers found
Time syncronized with NTP server.
2023-03-30T08:18:08.451404-0700 2 NTP configuration return : 1
2023-03-30T08:18:08.452149-0700 2 CPI ERROR V-9-40-2702 NTP configuration failed.
Running the same command that the configuration runs, the output is:
access-appliance:~ # /opt/VRTSnas/pysnas/system/ntp.py install_ntp xx.xx.xx.xx 2>&1
Stopping ntpd service.
ntpd service stopped.
Disabling ntpd service.
ntpd service disabled.
Setting defaults in ntp.conf.
Querying NTP server.
server xx.xx.xx.xx, stratum 4, offset 10.049736, delay 0.04684
31 Mar 02:15:14 ntpdate[115412]: step time server xx.xx.xx.xx offset 10.049736 sec
Adding NTP server to ntp.conf.
Synchronizing time.
ntpd: no servers found
Time syncronized with NTP server.
Failed to synchronize time.
Cause
The problem is high network latency between the Access nodes and the NTP server.
Troubleshooting steps:
Use ntpq to check synchronization status:
access-appliance:~ # ntpq -c assoc
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1  3898  961a   yes  yes   none  sys.peer   sys_peer   1
Note: If NTP is working correctly, the result shows reach=yes and condition=sys.peer.
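As a quick sanity check, that healthy reach/condition pair can be grepped out of the `ntpq -c assoc` output. `check_assoc` below is a hypothetical helper, shown here against a sample healthy line:

```shell
# Hypothetical helper: succeeds when the assoc listing contains a peer with
# reach=yes and condition=sys.peer (the healthy state described above).
check_assoc() {
  grep -q 'yes[[:space:]]*yes.*sys\.peer'
}

# Sample healthy line from `ntpq -c assoc`:
sample='  1  3898  961a   yes  yes   none  sys.peer   sys_peer   1'
printf '%s\n' "$sample" | check_assoc && echo "NTP peer healthy"
```

On the appliance, pipe the live output instead: `ntpq -c assoc | check_assoc`.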
ntpq> rv 3898
associd=3898 status=961a conf, reach, sel_sys.peer, 1 event, sys_peer,
srcadr=10.XX.1XX.1X0, srcport=123, dstadr=10.XX.1XX.1X1, dstport=123,
leap=00, stratum=12, precision=-6, rootdelay=31.250, rootdisp=64.575,
refid=10.62.68.236,
reftime=e0d00ab8.2af01902 Wed, Jul 10 2019 6:56:56.167,
rec=e0d00c5e.d78d706e Wed, Jul 10 2019 7:03:58.842, reach=377,
If "reach" is NOT "yes" and "condition" is NOT "sys.peer" (meaning time synchronization has a problem), compare the local time with the NTP server time. If the local time is more than 1000 seconds ahead of or behind the server, ntpd will not set the clock; the time must be set manually.
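The threshold check above can be sketched as follows, assuming the offset (in seconds) has already been read from `ntpdate -q` or `ntpq`; `needs_manual_step` is a hypothetical helper, not part of the product:

```shell
# Hypothetical helper: ntpd refuses to step the clock when the offset exceeds
# its panic threshold (1000 s by default); beyond that, set the time manually
# before starting ntpd.
needs_manual_step() {
  awk -v off="$1" 'BEGIN { exit (off > 1000 || off < -1000) ? 0 : 1 }'
}

needs_manual_step 10.049736 && echo "set the time manually" || echo "ntpd can step the clock"
# If a manual step is needed (sketch; xx.xx.xx.xx is the NTP server):
#   systemctl stop ntpd && ntpdate -b xx.xx.xx.xx && systemctl start ntpd
```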
The following output shows an abnormal synchronization status:
access-appliance:~ # ntpq -c assoc
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 58280  8011   yes  no    none  reject     mobilize   1
Here "reach=no" means the NTP server does not respond to requests or the network is unavailable. Troubleshoot the network and the NTP server.
Scenario 1: Network issue:
Use ping to check whether the NTP server is reachable, and follow standard network troubleshooting. Once a network issue is confirmed, ask the user to engage their network team and verify the issue is fixed.
Scenario 2: Wrong NTP IP or Service issue:
If the NTP server is pingable, the user may have entered the wrong NTP IP, or the NTP service itself may have a problem. Confirm with the user that the NTP IP address is correct, use another NTP server if one is available, and ask the user to engage their admin team to check the service. Sometimes rebooting the NTP server fixes the issue, so that can be tried if acceptable to the user.
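The checks for Scenarios 1 and 2 can be sketched as below; xx.xx.xx.xx stands for the customer's NTP server:

```shell
# Scenario 1: is the NTP server reachable at all on the network?
ping -c 3 xx.xx.xx.xx

# Scenario 2: is the NTP service actually answering?
# -q queries the server without setting the local clock.
ntpdate -q xx.xx.xx.xx
```

If ping succeeds but `ntpdate -q` times out, the problem is the NTP service (or the configured IP), not the network path.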
Scenario 3: Windows NTP server:
The Windows Time service does not implement full-featured NTP. If the user uses a Windows Server as the NTP server, rootdisp may be higher than 1000. In that case, configure the Windows NTP server to synchronize with a reliable external NTP server.
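On the Windows side, pointing the Windows Time service at an external source can be outlined with w32tm; this is a sketch for the Windows admin, and pool.ntp.org is a placeholder for whatever reliable upstream they choose:

```shell
# Run on the Windows NTP server (cmd/PowerShell), not on the Access nodes.
w32tm /config /manualpeerlist:"pool.ntp.org" /syncfromflags:manual /reliable:yes /update
w32tm /resync
w32tm /query /status
```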
If reach=yes but condition=reject, use ntpq with assoc and rv to check the flash code, dispersion, and rootdisp.
vrm:~ # ntpq -c assoc
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1  3898  9014   yes  yes   none  reject     reachable  1
Note: The assoc option can show the assid which is needed for rv later.
Use the rv command to get the flash code, dispersion, and rootdisp.
Run the ntpq command to enter the ntpq shell, then use rv assid to get the detailed information.
access-appliance:~ # ntpq
ntpq> rv 3898
associd=3898 status=9014 conf, reach, sel_reject, 1 event, reachable,
srcadr=10.XX.1XX.1X0, srcport=123, dstadr=10.XX.1XX.1X1, dstport=123,
leap=00, stratum=12, precision=-6, rootdelay=31.250, rootdisp=1814.209,
refid=10.XX.XX.2X6,
reftime=e0cff348.12fb407d Wed, Jul 10 2019 5:16:56.074,
rec=e0cff42b.60680b73 Wed, Jul 10 2019 5:20:43.376, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=6, ppoll=6, headway=50,
flash=400 peer_dist, keyid=0, offset=-2536.264, delay=0.354,
dispersion=16.515, jitter=4.414, xleave=0.038,
filtdelay= 0.35 0.29 0.32 0.26 0.28 3.22 0.28 0.35,
filtoffset= -2536.2 -2538.2 -2529.4 -2536.2 -2541.6 -2530.0 -2532.5 -2538.1,
filtdisp= 15.63 16.63 17.59 18.55 19.53 20.53 21.52 22.50
flash=400 peer_dist >>>>> reject reason
dispersion=16.515 >>>>> the error/variance between that NTP server and the client
rootdisp=1814.209 >>>>> the total error/variance from the root NTP server down to the client
flash=400 peer_dist
indicates that the distance to the root NTP server is too long, so the server is unfit to synchronize with. The "flash" status codes are:
Code Message Description
0001 pkt_dup duplicate packet
0002 pkt_bogus bogus packet
0004 pkt_unsync server not synchronized
0008 pkt_denied access denied
0010 pkt_auth authentication failure
0020 pkt_stratum invalid leap or stratum
0040 pkt_header header distance exceeded
0080 pkt_autokey Autokey sequence error
0100 pkt_crypto Autokey protocol error
0200 peer_stratum invalid header or stratum
0400 peer_dist distance threshold exceeded
0800 peer_loop synchronization loop
1000 peer_unreach unreachable or nonselect
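The flash value is a hexadecimal bit mask, so several conditions can be set at once. A small sketch (a hypothetical helper, not a product tool) that decodes a flash value against the table above:

```shell
# Sketch: decode an ntpq "flash" value (a hex bit mask) into the named
# conditions from the table above. Several bits can be set at once.
decode_flash() {
  v=$((0x${1#0x}))                              # parse the hex value
  [ $((v & 0x0001)) -ne 0 ] && echo pkt_dup
  [ $((v & 0x0002)) -ne 0 ] && echo pkt_bogus
  [ $((v & 0x0004)) -ne 0 ] && echo pkt_unsync
  [ $((v & 0x0008)) -ne 0 ] && echo pkt_denied
  [ $((v & 0x0010)) -ne 0 ] && echo pkt_auth
  [ $((v & 0x0020)) -ne 0 ] && echo pkt_stratum
  [ $((v & 0x0040)) -ne 0 ] && echo pkt_header
  [ $((v & 0x0080)) -ne 0 ] && echo pkt_autokey
  [ $((v & 0x0100)) -ne 0 ] && echo pkt_crypto
  [ $((v & 0x0200)) -ne 0 ] && echo peer_stratum
  [ $((v & 0x0400)) -ne 0 ] && echo peer_dist
  [ $((v & 0x0800)) -ne 0 ] && echo peer_loop
  [ $((v & 0x1000)) -ne 0 ] && echo peer_unreach
  return 0
}

decode_flash 400    # the value from the rv output above -> peer_dist
```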
Solution
The workaround to allow the Access nodes to synchronize properly with the NTP server is to add:
tos maxdist 20
to /etc/ntp.conf on both nodes and restart the ntpd OS service.
The default threshold is 1.5 seconds; this setting increases it to 20 seconds.
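A minimal sketch of the workaround, run here against a scratch copy of the file so it can be tried safely; on the appliance the real target is /etc/ntp.conf, edited on both nodes:

```shell
# Work on a scratch copy here; on each node the real target is /etc/ntp.conf.
conf=./ntp.conf.demo
printf 'server xx.xx.xx.xx iburst\n' > "$conf"   # stand-in for the existing config

# Append the workaround only if it is not already present (idempotent).
grep -q '^tos maxdist' "$conf" || echo 'tos maxdist 20' >> "$conf"

cat "$conf"
# Then restart the service on both nodes:
#   systemctl restart ntpd
```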
The customer should still engage their network team to find the root cause of the delays to the NTP server and fix the underlying issue.