    Ntpd silently exiting if time is substantially off

    2.1 Snapshot Feedback and Problems - RETIRED
    • D
      dhatz last edited by

      I'm running pfSense 2.1-BETA0 with ntpd 4.2.6p5 in a VBox VM. If I "suspend" the host system overnight, the clock of the FreeBSD/pfSense VM will be off by several hours when the host system wakes up again the next morning.

      In this configuration I've often noticed that pfSense's ntpd process has exited silently (no message in /var/log/ntpd.log), leaving the time wrong. I believe this is due to ntpd's documented behavior: "once the clock has been set, an error greater than 1000 s will cause ntpd to exit anyway."

      I wonder if it's possible to configure ntpd to not exit, but to sync time regardless when the system time is substantially off?

      • D
        dhatz last edited by

        This is an interesting read on the subject:

        http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

        It seems the way to solve this is to add "tinker panic 0" at the top of ntpd.conf

        Using NTP in Linux and Other Guests

        The Network Time Protocol is usable in a virtual machine with proper configuration of the NTP daemon.

        The following points are important:
        • Do not configure the virtual machine to synchronize to its own (virtual) hardware clock, not even as a fallback with a high stratum number. Some sample ntpd.conf files contain a section specifying the local clock as a potential time server, often marked with the comment “undisciplined local clock.” Delete any such server specification from your ntpd.conf file.
        • Include the option tinker panic 0 at the top of your ntp.conf file. By default, the NTP daemon sometimes panics and exits if the underlying clock appears to be behaving erratically. This option causes the daemon to keep running instead of panicking.
        • Follow standard best practices for NTP: Choose a set of servers to synchronize to that have accurate time and adequate redundancy.

        If you have many virtual or physical client machines to synchronize, set up some internal servers for them to use, so that all your clients are not directly accessing an external low-stratum NTP server and overloading it with requests.

        The following sample ntp.conf file is suitable if you have few enough clients that it makes sense for them to access an external NTP server directly. If you have many clients, adapt this file by changing the server names to reference your internal NTP servers.

        NOTE: Any tinker commands used must appear first.

        ntpd.conf

        tinker panic 0
        restrict 127.0.0.1
        restrict default kod nomodify notrap
        server 0.vmware.pool.ntp.org
        server 1.vmware.pool.ntp.org
        server 2.vmware.pool.ntp.org
        server 3.vmware.pool.ntp.org

        Here is a sample /etc/ntp/step-tickers corresponding to the sample ntp.conf file above.

        step-tickers

        0.vmware.pool.ntp.org
        1.vmware.pool.ntp.org

        Make sure that ntpd is configured to start at boot time. On some distributions this can be accomplished with the command chkconfig ntpd on, but consult your distribution’s documentation for details. On most distributions, you can start ntpd manually with the command /etc/init.d/ntpd start.
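The configuration rules quoted above (tinker commands first, "tinker panic 0" present, no local clock as a time source) can be checked mechanically. What follows is only a minimal Python sketch under stated assumptions: `check_ntp_conf` is a hypothetical helper, not part of ntpd or pfSense, and it treats any `server` or `fudge` line pointing at a `127.127.1.x` address as the "undisciplined local clock".

```python
def check_ntp_conf(text):
    """Return warnings for an ntp.conf-style text (hypothetical helper)."""
    warnings = []
    seen_other = False      # True once a non-tinker directive has appeared
    panic_disabled = False  # True once "tinker ... panic 0" is found
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        words = line.split()
        if words[0] == "tinker":
            if seen_other:
                warnings.append("tinker must come first: " + line)
            # tinker takes option/value pairs, e.g. "tinker panic 0"
            options = dict(zip(words[1::2], words[2::2]))
            if options.get("panic") == "0":
                panic_disabled = True
        else:
            seen_other = True
        # Assumption: 127.127.1.x is the local clock pseudo-address
        if words[0] in ("server", "fudge") and words[1:2] and words[1].startswith("127.127.1."):
            warnings.append("local clock configured: " + line)
    if not panic_disabled:
        warnings.append("tinker panic 0 is missing")
    return warnings


SAMPLE = """\
tinker panic 0
restrict 127.0.0.1
restrict default kod nomodify notrap
server 0.vmware.pool.ntp.org
"""

print(check_ntp_conf(SAMPLE))  # the sample from the quoted document yields no warnings
print(check_ntp_conf("server 127.127.1.0  # local clock\ntinker panic 0\n"))
```

The check is deliberately line-based and ignores everything it does not recognise, so it will not catch syntax errors that ntpd itself would reject.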

        PS: I've changed my local copy of the code that creates ntpd.conf (editing /var/etc/system.inc) and I'll let you know if it fixes it.

        • D
          dhatz last edited by

          @dhatz:

          PS: I've changed my local copy of the code that creates ntpd.conf (editing /var/etc/system.inc) and I'll let you know if it fixes it.

          When using "tinker panic 0" ntpd won't exit if time is substantially off, but it doesn't seem to re-sync it either … I'll have a look at ntpd's config to see if there's anything about it.

          Btw, is there some "watchdog" cron job that monitors running services (racoon, openvpn, ntpd etc.) and restarts them if needed?

          • D
            dhatz last edited by

            Apparently the "tinker panic 0" change does fix the issue after all:

            1. ntpd doesn't exit anymore if time diff > 1000s, and
            2. eventually it re-syncs time (didn't monitor it closely enough to see how fast it does it)
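The effect of the setting can be pictured as a small decision function. This is a simplified Python model, not ntpd's actual code: it assumes the default 128 ms step threshold and 1000 s panic threshold, treats a panic value of 0 as "never exit", and ignores the delay ntpd applies before it actually steps. The name `ntpd_reaction` is made up for illustration.

```python
STEP_THRESHOLD_S = 0.128   # default step threshold (128 ms)
DEFAULT_PANIC_S = 1000.0   # default panic threshold


def ntpd_reaction(offset_s, panic_s=DEFAULT_PANIC_S):
    """Simplified model of how ntpd treats a measured clock offset."""
    magnitude = abs(offset_s)
    # "tinker panic 0" is modelled as panic_s == 0: the exit check is skipped
    if panic_s > 0 and magnitude > panic_s:
        return "exit"
    if magnitude > STEP_THRESHOLD_S:
        return "step"
    return "slew"


for panic in (DEFAULT_PANIC_S, 0):
    print(panic, ntpd_reaction(3600.0, panic_s=panic))
```

With the default threshold an offset of one hour falls in the "exit" branch; with the threshold set to 0 the same offset is handled as a step instead.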
            • jimp
              jimp Rebel Alliance Developer Netgate last edited by

              I committed the fix, should be in the next snapshot.

              • G
                gogol last edited by

                @jimp:

                I committed the fix, should be in the next snapshot.

                My opinion is that NTP was not intended for use on a virtual machine and this setting should be an option as a workaround.
                I know NTP is still under development but it is not very secure with missing "restrict" lines.

                • johnpoz
                  johnpoz LAYER 8 Global Moderator last edited by

                  "My opinion is that NTP was not intended for use on a virtual machine"

                  Funny how this is not VMware's opinion ;)

                  http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427
                  Timekeeping best practices for Linux guests

                  Note: VMware recommends you to use NTP instead of VMware Tools periodic time synchronization. NTP is an industry standard and ensures accurate time keeping in your guest. You may have to open the firewall (UDP 123) to allow NTP traffic.

                  http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1318
                  Timekeeping best practices for Windows, including NTP

                  Windows Version Recommended Time Sync Utility
                  Windows 2008 w32time or NTP
                  Windows Vista w32time or NTP
                  Windows 2003 w32time or NTP
                  Windows XP NTP
                  Windows 2000 NTP

                  As to security concerns: the configuration tab allows you to restrict which interfaces it will listen on. I don't see it as much of a concern if you let your time server serve time to your local LAN ;)

                  I am sure that as this addition matures, more detailed configuration such as specific restrict lines will come. Worst case, you can always modify the ntpd.conf in /var/etc if you're really paranoid.

                  • G
                    gogol last edited by

                    @johnpoz:

                    Funny how this is not vmwares opinion ;)

                    That is understandable from their point of view.
                    I'm not saying I have made a study of the NTP protocol, but I read the newsgroup and am a Pool member (so I want to serve time on WAN). As far as I understand the NTP protocol, it is made for 24/7 operation, even under VMware  ;)
                    It is very good that pfSense switched to the latest version of NTP because it is actively developed.

                    I switched to pfSense a week ago and am now studying the behavior of NTP on my box. It certainly behaves well!
                    I now know how to change the settings in /etc/inc/system.inc (and not in /var/etc/ntpd.conf, which is overwritten at ntpd restart!), but these changes are not permanent either. That is how I have done it for now.

                    • G
                      gerdesj last edited by

                      My personal experience of time sync on vmware and physical over several years has yielded the following:

                      Set the ESXis to sync via ntp to five sources
                      Windows DCs - use vm guest tools to sync to the host they are on
                      Windows non DCs - leave at defaults, ie sync to the PDC emulator
                      Unix style systems (*BSD, Linux et al) - sync via ntpd to the hosts

                      I watch timesync on around 500 odd systems around the country (UK) via Nagios and they all agree on time to within the last one or two milli-seconds depending on OS (Unix is best, Windows worst, if you count a milli-second drift on a VM as "bad").

                      I have not had to restart either ntpd or "windows time" in a very long … time using these rules.

                      With a manually configured ntpd I use tinker panic 0 to avoid a 30 second drift being considered "insane".  I also use iburst on the server lines to get a much quicker initial sync, and I see PF does as well.

                      I used to use three pool systems but found that after a few weeks/months time would start to drift.  Since using five ({0,1,2,3}.pool.ntp.org and 0.uk.pool.ntp.org) I have not seen that behaviour on any system I manage in at least the last four years.

                      Cheers
                      Jon

                      • D
                        dhatz last edited by

                        I haven't yet found the time to monitor ntpd closely during a VM suspend cycle, but empirically it can take quite a long time for ntpd to sync. For example, it's been over 1 hr since I resumed the suspended VM, yet ntpd still hasn't corrected the system time:

                        ntpq -p
                             remote           refid      st t when poll reach   delay   offset  jitter
                         cache.asda.gr   131.188.3.221    2 u  392  512  377   16.808  3499206 1870404
                         stitch.fr.zerol 192.93.2.20      2 u  399  512  377   72.440  3499206 1870404
                         noc.be.it2go.eu 193.190.230.65   2 u  160  512  377   89.826  3499206 1322575
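To put those numbers in perspective: ntpq reports offsets in milliseconds, so 3499206 is roughly 58 minutes. Below is a rough Python sketch that parses peer lines like the ones above and compares each offset with the 1000 s panic threshold. `parse_ntpq_peers` is a hypothetical helper; it assumes the ten-column layout shown here and keeps the remote name exactly as printed (including any tally character ntpq may attach to it).

```python
PANIC_THRESHOLD_S = 1000.0  # ntpd's default panic threshold


def parse_ntpq_peers(output):
    """Return (remote, offset_in_seconds) pairs from `ntpq -p` style text."""
    peers = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # blank lines, separators, anything unexpected
        try:
            offset_ms = float(fields[8])  # ninth column is the offset in ms
        except ValueError:
            continue  # header row ("offset" is not a number)
        peers.append((fields[0], offset_ms / 1000.0))
    return peers


SAMPLE = """\
remote          refid      st t when poll reach  delay  offset  jitter
cache.asda.gr  131.188.3.221    2 u  392  512  377  16.808  3499206 1870404
stitch.fr.zerol 192.93.2.20      2 u  399  512  377  72.440  3499206 1870404
noc.be.it2go.eu 193.190.230.65  2 u  160  512  377  89.826  3499206 1322575
"""

for remote, offset_s in parse_ntpq_peers(SAMPLE):
    flag = "over panic threshold" if abs(offset_s) > PANIC_THRESHOLD_S else "ok"
    print(f"{remote}: {offset_s:.3f} s ({flag})")
```

All three peers agree on the offset, which suggests the local clock is what is wrong rather than any one server.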

                        • jimp
                          jimp Rebel Alliance Developer Netgate last edited by

                          We usually rely on ntpdate to make the big changes, and then let ntpd handle keeping the clock in line over time. If you restart ntpd (or save the settings, iirc) it will stop ntpd, run ntpdate, then restart ntpd.

                          It can take a long time for ntpd to recover from a large skew, since it will only step the clock by 0.128 second increments. This can be adjusted with the step parameter to the tinker config option, but I recall setting that larger had negative effects.

                          Though -x to ntpd might help…

                          -x      Normally, the time is slewed if the offset is less than the step
                                      threshold, which is 128 ms by default, and stepped if above the
                                      threshold.  This option sets the threshold to 600 s, which is
                                      well within the accuracy window to set the clock manually.  Note:
                                      Since the slew rate of typical Unix kernels is limited to 0.5
                                      ms/s, each second of adjustment requires an amortization interval
                                      of 2000 s.  Thus, an adjustment as much as 600 s will take almost
                                      14 days to complete.  This option can be used with the -g and -q
                                      options.  See the tinker command for other options.  Note: The
                                      kernel time discipline is disabled with this option.
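The "almost 14 days" figure in the man page excerpt follows directly from the 0.5 ms/s slew limit. A small worked example in Python, where the function name is made up and the rate is taken from the excerpt above:

```python
MAX_SLEW_RATE = 0.0005  # 0.5 ms of correction per second of real time


def slew_duration_days(offset_seconds, slew_rate=MAX_SLEW_RATE):
    """Days needed to slew away an offset at a constant maximum rate."""
    seconds_needed = abs(offset_seconds) / slew_rate  # 2000 s per second of offset
    return seconds_needed / 86400.0


print(round(slew_duration_days(600), 1))       # the 600 s case from the man page
print(round(slew_duration_days(3499.206), 1))  # the offset seen earlier in this thread
```

At that rate a 600 s offset needs about 13.9 days, and an offset near one hour would need roughly 81 days, which is why slewing alone is not a practical way to recover from a VM suspend.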

                          • D
                            dhatz last edited by

                            Btw ntpd finally sync'ed time, apparently in one big step, but it took ~1.5hr after the VM was resumed from yesterday's suspend. No message whatsoever in  /var/log/ntpd.log

                            • D
                              dhatz last edited by

                              As a test, I'm currently running ntpd with the following two lines added:

                              server  127.127.1.0    # local clock
                              fudge  127.127.1.0 stratum 10

                              The server line says that the local system clock is a timeserver. The fudge line says that this server is stratum 10. If you are connected to the Internet, you are likely using timeservers with a lower stratum than 10, and those servers are used because a lower stratum means a higher priority.

                              However, if you are disconnected from the Internet, they are unavailable and you're left with the local clock. Using fudge to mark the local clock as stratum 10 makes ntpd fall back to the local clock when no timeservers are available. This is good because it means you can disconnect your box from the Internet without getting your clock screwed up.

                              • G
                                gogol last edited by

                                @dhatz:

                                As a test, I'm currently running ntpd with the following two lines added:

                                If you are trying to run ntpd isolated you should use "orphan mode" in this version of ntpd.

                                http://www.eecis.udel.edu/~mills/ntp/html/orphan.html
