NTP problems after upgrade to 2.2-RC (Jan 08)



  • I have finally upgraded two boxes from 2.1.5 to 2.2-RC and everything works as expected except NTP.
    I don't know what has changed in relation to NTP, but I can't get it working as before.
    The ntp daemon is running and remotely reachable, but the clients don't accept the time.

    
    $ ntpq -c peers 10.2.1.1
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +time.ostseehaie 131.188.3.221    2 u   10   64    1   25.972   98.203   2.218
    *stratum2-3.ntp. 129.70.130.70    2 u   11   64    1   17.144   96.815   2.613
    -minion.webershe 192.53.103.104   2 u   14   64    1   24.525   98.300   1.728
    +foxtrot.zq1.de  122.227.206.195  3 u   14   64    1   24.565   94.514   1.454
    
    
    
    $ ntpdate -q 10.2.1.1
    server 10.2.1.1, stratum 16, offset -0.144872, delay 0.02753
     9 Jan 12:33:01 ntpdate[12523]: no server suitable for synchronization found
    
    

    I am not sure why it responds with stratum 16. I am not really an ntp expert, but it used to work before the upgrade.
    Does anyone have a hint on how to investigate this?

    thanks, Till



  • Your machine cannot get the time from the peers: 'reach' is shown as 1, and for a properly running system reach should eventually move up to 377.  Clients don't accept the time because the server really is at stratum 16.

    http://www.ntp.org/ntpfaq/NTP-s-trouble.htm#Q-MON-REACH

    How have you configured NTP on the pfSense web gui?  Can you post contents of /var/etc/ntpd.conf?  Anything interesting in the NTP syslogs?  You can attach the output of "clog /var/log/ntpd.log | tail -100" if you aren't sure whether it's interesting or not :)

    ntpd also has to run a while after starting or re-starting.  Maybe your connection and setup are OK, and you are just looking at the server in the first minute or so after startup?
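
The 'reach' column discussed above is printed by ntpq in octal and represents an 8-bit history of the most recent polls. A minimal sketch of how to read it follows; the function name `decode_reach` is made up for illustration and is not part of any NTP tool.

```python
def decode_reach(reach_octal):
    """Return the last eight poll results, most recent first."""
    value = int(reach_octal, 8)  # ntpq prints the register in octal
    if not 0 <= value <= 0o377:
        raise ValueError("reach is an 8-bit register (0 to 377 octal)")
    # Bit 0 is the most recent poll; bit 7 is the oldest of the eight.
    return [bool((value >> bit) & 1) for bit in range(8)]


for reach in ("1", "177", "377"):
    history = decode_reach(reach)
    print(reach, "->", sum(history), "of the last 8 polls answered")
```

Read this way, a reach of 1 means only the latest poll was answered, 177 means the seven most recent polls were answered, and 377 means all eight were.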



  • Hi,

    thanks for your offer to help. I can't see any reason why it shouldn't be able to reach its peers.
    I was expecting pfSense to retrieve the time from the pool servers, and all clients to then retrieve the time from pfSense.

    regards, Till

    /var/etc/ntpd.conf:

    
    [2.2-RC][admin@pfsense6.middle.earth]/root: less /var/etc/ntpd.conf
    # 
    # pfSense ntp configuration file 
    # 
    
    tinker panic 0 
    # Orphan mode stratum
    tos orphan 12
    
    # Upstream Servers
    server 0.de.pool.ntp.org iburst maxpoll 9
    server 1.de.pool.ntp.org iburst maxpoll 9
    server 2.de.pool.ntp.org iburst maxpoll 9
    server 3.de.pool.ntp.org iburst maxpoll 9
    
    disable monitor
    statsdir /var/log/ntp
    logconfig =syncall +clockall
    driftfile /var/db/ntpd.drift
    restrict default kod limited nomodify nopeer notrap
    restrict -6 default kod limited nomodify nopeer notrap
    
    interface ignore all
    interface listen vr0
    
    

    Logs since last daemon restart:

    
    Jan  9 12:32:28 pfsense6 ntpd[53450]: ntpd 4.2.8@1.3265-o Mon Dec 22 14:36:36 UTC 2014 (1): Starting
    Jan  9 12:32:28 pfsense6 ntpd[53450]: Command line: /usr/local/sbin/ntpd -g -c /var/etc/ntpd.conf -p /var/run/ntpd.pid
    Jan  9 12:32:28 pfsense6 ntpd[53587]: proto: precision = 2.323 usec (-19)
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listen and drop on 0 v6wildcard [::]:123
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listen and drop on 1 v4wildcard 0.0.0.0:123
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listen normally on 2 vr0 10.2.1.1:123
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listen normally on 3 vr0 [fe80::20d:b9ff:fe20:f8d0%1]:123
    Jan  9 12:32:28 pfsense6 ntpd[53587]: setsockopt IPV6_MULTICAST_IF 0 for fe80::20d:b9ff:fe20:f8d0%1 fails: Can't assign requested address
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listen normally on 4 lo0 127.0.0.1:123
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listen normally on 5 lo0 [::1]:123
    Jan  9 12:32:28 pfsense6 ntpd[53587]: Listening on routing socket on fd #26 for interface updates
    
    


  • b.t.w. reach is not 1 anymore, but stratum is still 16:

    
    [2.2-RC][admin@pfsense6.middle.earth]/root: ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +time.ostseehaie 131.188.3.221    2 u    3   64  377   20.961  325.163 113.461
    *stratum2-3.NTP. 129.70.130.71    2 u   17   64  177   15.939  228.837  75.943
    +minion.webershe 192.53.103.104   2 u   15   64  177   20.602  318.756 111.715
    -foxtrot.zq1.de  122.227.206.195  3 u   12   64  377   21.298  165.432 111.714
    
    


  • @skywalker:

    b.t.w. reach is not 1 anymore, but stratum is still 16:

    
    [2.2-RC][admin@pfsense6.middle.earth]/root: ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +time.ostseehaie 131.188.3.221    2 u    3   64  377   20.961  325.163 113.461
    *stratum2-3.NTP. 129.70.130.71    2 u   17   64  177   15.939  228.837  75.943
    +minion.webershe 192.53.103.104   2 u   15   64  177   20.602  318.756 111.715
    -foxtrot.zq1.de  122.227.206.195  3 u   12   64  377   21.298  165.432 111.714
    
    

    Your peer output looks OK to me: the second host is selected as system peer (*), the first and third hosts are candidates (+), and the last host is discarded as an outlier (-).  This should match what you see in the GUI under Status -> NTP.

    What's the output of the sysinfo command in ntpq?  The rv command?
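
The first character of each peer line in the `ntpq -p` output is a tally code. As a rough illustration of the reading given above, the sketch below maps those codes to short descriptions and applies them to the output quoted in this thread. The helper name `classify_peers` is hypothetical, and the code assumes raw ntpq output in which the tally code sits in column 0 (so the sample is not indented).

```python
# Tally codes as described in the ntpq documentation.
TALLY = {
    " ": "discarded as not valid",
    "x": "falseticker (discarded by the intersection algorithm)",
    ".": "excess (discarded by table overflow)",
    "-": "outlier (discarded by the cluster algorithm)",
    "+": "candidate",
    "#": "backup",
    "*": "system peer",
    "o": "PPS peer",
}

SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+time.ostseehaie 131.188.3.221    2 u    3   64  377   20.961  325.163 113.461
*stratum2-3.NTP. 129.70.130.71    2 u   17   64  177   15.939  228.837  75.943
+minion.webershe 192.53.103.104   2 u   15   64  177   20.602  318.756 111.715
-foxtrot.zq1.de  122.227.206.195  3 u   12   64  377   21.298  165.432 111.714
"""


def classify_peers(ntpq_output):
    """Map each remote name to the meaning of its tally code."""
    peers = {}
    for line in ntpq_output.splitlines():
        fields = line.split()
        # Skip blank lines, the ruler, and the column header.
        if not fields or line.startswith("=") or fields[:2] == ["remote", "refid"]:
            continue
        tally, remote = line[0], line[1:].split()[0]
        peers[remote] = TALLY.get(tally, "unknown tally code %r" % tally)
    return peers


for remote, meaning in classify_peers(SAMPLE).items():
    print(remote, "->", meaning)
```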



  • see below for the output.
    What really struck me here is that the reach goes up for a while and then falls back to 1.
    Surprisingly the same configuration did work before the upgrade (well, at least none of my clients complained before).

    thanks again for looking into this.

    -Till

    
    ntpq> sysinfo
    associd=0 status=0613 leap_none, sync_ntp, 1 event, spike_detect,
    system peer:        stratum2-3.NTP.TechFak.NET:123
    system peer mode:   client
    leap indicator:     00
    stratum:            3
    log2 precision:     -19
    root delay:         18.136
    root dispersion:    252.075
    reference ID:       129.70.132.36
    reference time:     d85b7f77.d360dde9  Sat, Jan 10 2015 11:37:43.825
    system jitter:      42.708488
    clock jitter:       26.742
    clock wander:       1.427
    broadcast delay:    0.000
    symm. auth. delay:  0.000
    
    
    
    ntpq> rv
    associd=0 status=0613 leap_none, sync_ntp, 1 event, spike_detect,
    version="ntpd 4.2.8@1.3265-o Mon Dec 22 14:36:36 UTC 2014 (1)",
    processor="i386", system="FreeBSD/10.1-RELEASE-p3", leap=00, stratum=3,
    precision=-19, rootdelay=18.136, rootdisp=252.630, refid=129.70.132.36,
    reftime=d85b7f77.d360dde9  Sat, Jan 10 2015 11:37:43.825,
    clock=d85b8011.78a8ff76  Sat, Jan 10 2015 11:40:17.471, peer=38903, tc=6,
    mintc=3, offset=109.039377, frequency=-198.172, sys_jitter=42.814557,
    clk_jitter=26.742, clk_wander=1.427
    ntpq> ntpq> rv
    associd=0 status=0613 leap_none, sync_ntp, 1 event, spike_detect,
    version="ntpd 4.2.8@1.3265-o Mon Dec 22 14:36:36 UTC 2014 (1)",
    processor="i386", system="FreeBSD/10.1-RELEASE-p3", leap=00, stratum=3,
    precision=-19, rootdelay=18.136, rootdisp=252.780, refid=129.70.132.36,
    reftime=d85b7f77.d360dde9  Sat, Jan 10 2015 11:37:43.825,
    clock=d85b801a.dc1e50e8  Sat, Jan 10 2015 11:40:26.859, peer=38903, tc=6,
    mintc=3, offset=109.039377, frequency=-198.172, sys_jitter=42.814557,
    clk_jitter=26.742, clk_wander=1.427
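
The `status=0613` word in the output above packs several fields into 16 bits, which ntpq then spells out as `leap_none, sync_ntp, 1 event, spike_detect`. The sketch below shows how that decoding could work, assuming the system status word layout from the ntpq documentation (2-bit leap indicator, 6-bit sync source, 4-bit event count, 4-bit last event code). The function name is hypothetical and the lookup tables are deliberately partial; unknown codes are returned as numbers.

```python
LEAP = {0: "leap_none", 1: "leap_add_sec", 2: "leap_del_sec", 3: "leap_alarm"}
# Partial tables: only a few commonly seen codes are listed here.
SOURCE = {0: "sync_unspec", 1: "sync_pps", 6: "sync_ntp"}
EVENT = {1: "freq_not_set", 2: "freq_set", 3: "spike_detect",
         4: "freq_mode", 5: "clock_sync", 6: "restart"}


def decode_system_status(status_hex):
    """Split a system status word (hex string) into its four fields."""
    word = int(status_hex, 16)
    leap = (word >> 14) & 0x3     # top 2 bits
    source = (word >> 8) & 0x3F   # next 6 bits
    count = (word >> 4) & 0xF     # next 4 bits
    event = word & 0xF            # low 4 bits
    return {
        "leap": LEAP[leap],
        "source": SOURCE.get(source, source),
        "event_count": count,
        "last_event": EVENT.get(event, event),
    }


print(decode_system_status("0613"))
```

For `0613` this yields leap_none, sync_ntp, one event, and spike_detect, which agrees with the text ntpq printed next to the status word.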
    
    


  • @skywalker:

    see below for the output.

    The output looks OK to me: that machine is synced and operating as a stratum 3 server.

    What really struck me here is that the reach goes up for a while and then falls back to 1.
    Surprisingly the same configuration did work before the upgrade (well, at least none of my clients complained before).

    'reach' is a living number, updated every time a response is expected from your system peer(s).  (Please re-visit the link posted above).  If it decreases from 377, then you are not getting valid replies from your system peer(s).  Something may be blocking port 123.
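
To make the "living number" behaviour concrete, here is a simplified model of the reach register: on every poll it shifts left by one bit, and the low bit is set only if a valid reply arrived. This is an illustrative sketch, not ntpd's actual code, and `simulate_reach` is a made-up name.

```python
def simulate_reach(poll_results, start=0):
    """Return the reach value, as ntpq would print it, after each poll."""
    reach = start
    history = []
    for answered in poll_results:
        # Shift in the newest result and keep only the last eight polls.
        reach = ((reach << 1) | int(answered)) & 0xFF
        history.append(format(reach, "o"))  # octal, like the ntpq column
    return history


# Eight answered polls, three missed polls, then one answered poll.
polls = [True] * 8 + [False] * 3 + [True]
print(simulate_reach(polls))
```

In this model the value climbs 1, 3, 7, 17, 37, 77, 177, 377 while replies arrive, and each missed poll then clears the low bit (376, 374, 370). A displayed value of 1 means that only the most recent of the last eight polls was answered.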

