NTP time sync issue
-
It looks as if you are using relatively local NTP servers to your physical location, which is good. You are using a cable modem, which can lead to high jitter, but c. 2.5ms system jitter isn't the end of the world.
Precision -22 suggests you might be using the on-processor TSC as your timing source.
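For context, ntpd reports precision as a power-of-two exponent in seconds, so -22 means 2^-22 s, or roughly a quarter of a microsecond. A minimal sketch of that arithmetic follows; precision_to_us is a hypothetical helper name, not an ntpd tool:

```shell
# Convert an ntpd precision exponent (log2 of seconds) to microseconds.
# precision_to_us is a hypothetical helper for illustration only.
precision_to_us() {
    awk -v p="$1" 'BEGIN { printf "%.3f\n", (2 ^ p) * 1000000 }'
}

precision_to_us -22   # prints 0.238
```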
Post the results of:
sysctl kern.timecounter.choice kern.timecounter.hardware
which will show the timecounter choices and weights, also the timecounter your system is currently using.
My experience of running NTP servers on bare metal FreeBSD installations is that the HPET is often a more stable timing source. If you want to give this a go, add a line to /boot/loader.conf.local that reads:
kern.timecounter.tc.HPET.quality=5000
You'll need to create /boot/loader.conf.local if it doesn't already exist. Once you've made the change, delete the ntpd.drift file (its contents are invalidated by the change of timing source) using:
pkill ntpd ; rm /var/db/ntpd.drift
then reboot. After the system has been running for at least 12 hours, re-run the command I gave in the previous post. Hopefully clk_jitter is significantly lower than the 2.9ms in your earlier output.
-
Post the results of: sysctl kern.timecounter.choice kern.timecounter.hardware
sysctl kern.timecounter.choice kern.timecounter.hardware
kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) i8254(0) HPET(950) dummy(-1000000)
kern.timecounter.hardware: TSC-low
-
delete the ntp.drift file
Did you mean ntpd.drift? That is the file I deleted.
That is the file I meant. I've edited the earlier post accordingly.
As I thought, your system had chosen TSC (well TSC-low, though the difference is not material here) as its timecounter. If you repeat that command having made the change I suggested to /boot/loader.conf.local, you should find the quality figure after HPET is now 5000 and that kern.timecounter.hardware is now HPET. It will be interesting to see whether that proves to have lower jitter (clk_jitter) and at least as good short-term stability (clk_wander) as TSC.
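As an illustration only, the choice line can be parsed to see which counter carries the highest quality figure. This is a minimal sketch that assumes the single-line "name(quality)" format shown earlier in the thread; best_timecounter is a hypothetical helper name:

```shell
# Print the name of the counter with the highest quality figure.
# $1 is the value part of kern.timecounter.choice, for example the
# output of "sysctl -n kern.timecounter.choice" on the FreeBSD box.
# best_timecounter is a hypothetical helper for illustration only.
best_timecounter() {
    printf '%s\n' "$1" | tr ' ' '\n' | awk -F'[()]' '
        NF >= 2 {
            if (best == "" || $2 + 0 > q) { q = $2 + 0; best = $1 }
        }
        END { print best }'
}

best_timecounter "TSC-low(1000) ACPI-safe(850) i8254(0) HPET(950) dummy(-1000000)"
# prints TSC-low
```

If HPET showed 5000 in that list, the same helper would print HPET.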
It may take 24 hours for things to settle down as ntpd had no drift file value to start from.
It is worth turning on pfSense's ntp RRD graphs in Services -> NTP, though I would strongly recommend you apply the patch in https://redmine.pfsense.org/issues/4423 first.
-
The offset spikes are correlated with changing system load.
time2.google.com seems to be going to crap.
$ ntpq -c 'rl' -wp
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.8p4@1.3265-o Mon Oct 26 14:28:17 UTC 2015 (1)",
processor="amd64", system="FreeBSD/10.1-RELEASE-p24", leap=00, stratum=3,
precision=-22, rootdelay=31.673, rootdisp=42.253, refid=216.239.38.15,
reftime=da2c4aef.ab39b7b0 Mon, Dec 28 2015 17:57:35.668,
clock=da2c5259.63466f54 Mon, Dec 28 2015 18:29:13.387, peer=10249, tc=9,
mintc=3, offset=0.429836, frequency=-25.102, sys_jitter=0.149028,
clk_jitter=0.241, clk_wander=0.005
 remote               refid            st t when poll reach    delay   offset  jitter
==============================================================================
+ra.steadfastdns.net  216.86.146.46     2 u  140  512   377   14.474    0.305   0.356
+rb.steadfastdns.net  216.86.146.46     2 u  254  512   377   14.848    0.267   0.442
-dns1.steadfast.net   216.86.146.46     2 u  299  512   377   15.048    0.428   0.346
+time1.google.com     120.249.107.194   2 u  290  512   377   24.103    0.586   0.206
-time2.google.com     217.167.3.118     2 u  527  512   377   34.978   -2.121   2.197
+time3.google.com     46.254.142.6      2 u  234  512   377   37.441    0.580   0.462
*time4.google.com     112.106.149.195   2 u  327  512   377   24.257    0.455   0.290
-
If you repeat that command having made the change I suggested to /boot/loader.conf.local, you should find the quality figure after HPET is now 5000 and that kern.timecounter.hardware is now HPET.
There seems to be no change. The output is the same as before, even after the reboot:
[2.2.6-RELEASE][root@router.home]/root: sysctl kern.timecounter.choice kern.timecounter.hardware
kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) i8254(0) HPET(950) dummy(-1000000)
kern.timecounter.hardware: TSC-low
and my /boot/loader.conf.local:
ahci_load="YES"
kern.timecounter.tc.HPET.quality=5000
EDIT: I just checked, and ntpd.drift was automatically created again. Here is the output of ntpq -c 'rl' -wp:
[2.2.6-RELEASE][root@router.home]/root: ntpq -c 'rl' -wp
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.8p4@1.3265-o Mon Oct 26 14:28:17 UTC 2015 (1)",
processor="amd64", system="FreeBSD/10.1-RELEASE-p25", leap=00, stratum=2,
precision=-22, rootdelay=13.651, rootdisp=539.738, refid=206.108.0.131,
reftime=da2c62f2.f7edec27 Mon, Dec 28 2015 20:40:02.968,
clock=da2c667c.d9dcdc22 Mon, Dec 28 2015 20:55:08.851, peer=26673, tc=8,
mintc=3, offset=18.733259, frequency=14.160, sys_jitter=2.707991,
clk_jitter=4.381, clk_wander=0.158
 remote                       refid         st t when poll reach    delay   offset  jitter
==============================================================================
*ntp1.torix.ca                .PPS.          1 u  108  256   377   13.651   18.733   3.819
+ns509831.ip-167-114-101.net  192.95.25.79   3 u  247  256   377   38.462   21.008   2.539
+zero.gotroot.ca              30.114.5.31    2 u   45  256   377   63.321   18.919   5.488
-ntp3.torix.ca                .PPS.          1 u  250  256   377   12.037   17.734   3.188
-
After 12 hours:
[2.2.6-RELEASE][root@router.home]/root: sysctl kern.timecounter.choice kern.timecounter.hardware
kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) i8254(0) HPET(950) dummy(-1000000)
kern.timecounter.hardware: TSC-low
So the setting kern.timecounter.tc.HPET.quality=5000 didn't work?
[2.2.6-RELEASE][root@router.home]/root: ntpq -c 'rl' -wp
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.8p4@1.3265-o Mon Oct 26 14:28:17 UTC 2015 (1)",
processor="amd64", system="FreeBSD/10.1-RELEASE-p25", leap=00, stratum=2,
precision=-22, rootdelay=13.518, rootdisp=534.552, refid=206.108.0.131,
reftime=da2d185a.f89c6a19 Tue, Dec 29 2015 9:34:02.971,
clock=da2d1b1c.a4e5378f Tue, Dec 29 2015 9:45:48.644, peer=26673, tc=9,
mintc=3, offset=-4.883652, frequency=22.213, sys_jitter=3.014387,
clk_jitter=1.394, clk_wander=0.350
 remote                       refid         st t when poll reach    delay   offset  jitter
==============================================================================
*ntp1.torix.ca                .PPS.          1 u  176  512   377   13.518   -4.884   3.014
+ns509831.ip-167-114-101.net  192.95.25.79   3 u  174  512   377   37.712   -3.224   3.179
+zero.gotroot.ca              30.114.5.31    2 u  494  512   377   65.952   -2.459   2.313
+ntp3.torix.ca                .PPS.          1 u   40  512   377   15.540   -3.908   2.592
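To compare runs without reading the whole dump, individual variables can be pulled out of the rl output. This is a minimal sketch that assumes the comma-separated name=value layout shown above; rl_value is a hypothetical helper name, not part of ntpq:

```shell
# Print the value of variable $1 from "ntpq -c rl" text on stdin.
# rl_value is a hypothetical helper for illustration only.
rl_value() {
    # Put every name=value pair on its own line, then strip the name.
    tr ', ' '\n\n' | sed -n "s/^$1=//p" | head -n 1
}

# On the router itself this would be used as:
#   ntpq -c rl | rl_value clk_jitter
printf '%s\n' 'sys_jitter=3.014387, clk_jitter=1.394, clk_wander=0.350' |
    rl_value clk_jitter
# prints 1.394
```

Applied to the dump from just after the reboot and the one above, that gives a clk_jitter of 4.381 and 1.394 respectively.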
-
Maybe you need to add "kern.timecounter.hardware=HPET" to loader.conf? That's what "man timecounters" seems to be saying.
-
@mer:
Maybe you need to add "kern.timecounter.hardware=HPET" to loader.conf? That's what "man timecounters" seems to be saying.
Did you mean loader.config.local?
-
The correct file name is 'loader.conf.local'.
-
Yep, "loader.conf.local". Thanks for catching it.
-
I tried all of the suggestions, but kern.timecounter.hardware is always TSC-low unless I run the command:
sysctl kern.timecounter.hardware=HPET
Any hints?
-
Solved by adding an entry (kern.timecounter.hardware, value: HPET) under System -> Advanced -> System Tunables.
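For anyone who wants to check from a shell before adding the tunable, here is a minimal sketch. It assumes the choice list has the "name(quality)" format shown earlier in the thread; has_timecounter is a hypothetical helper name, and the sysctl lines are left as comments because they only make sense on the FreeBSD/pfSense box itself:

```shell
# Succeed if counter $1 appears in the choice list $2.
# has_timecounter is a hypothetical helper for illustration only.
has_timecounter() {
    case " $2 " in
        *" $1("*) return 0 ;;
        *)        return 1 ;;
    esac
}

# On the router (not executed here):
#   choices=$(sysctl -n kern.timecounter.choice)
#   has_timecounter HPET "$choices" && sysctl kern.timecounter.hardware=HPET
#   sysctl kern.timecounter.hardware    # should now report HPET
```

A value set with sysctl on the command line does not persist across a reboot by itself, so the System Tunables entry is still what makes the change stick.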
-
@RonpfS I know it's an ancient thread, but I googled and couldn't find an existing solution to this problem.
In my case, time sync issues in Windows (all those 0x800705B4 errors) were fixed by unchecking the "Enable KOD packets" option on the NTP server ACL page.
Hope this helps someone.