Ntpd / gps need some love part II
-
I think you can either delete all external NTP servers from your config, or add only one pointing to localhost. Or just tick "noselect" for them, and NTP service will not use them for timing.
Thank you Robi. Yup, I have another GPS in a different part of the house, connected to a Wintel server that I can point pfSense to.
-
Just a heads-up here for serial GPS users.
I'm running 2.2 alpha with an Adafruit GPS, and noticed a number of spikes in the logs, with corresponding spikes in the ntp rrd graphs:
May 20 22:19:50 pfsense ntpd[16682]: 0.0.0.0 c615 05 clock_sync
May 20 22:19:51 pfsense ntpd[16682]: 0.0.0.0 0413 03 spike_detect -0.175410 s
May 20 22:19:57 pfsense ntpd[16682]: 0.0.0.0 041b 0b leap_event
May 20 22:20:07 pfsense ntpd[16682]: 0.0.0.0 0415 05 clock_sync
May 20 22:24:55 pfsense ntpd[16682]: 0.0.0.0 041d 0d kern PPS enabled
May 23 05:32:39 pfsense ntpd[16682]: 0.0.0.0 0413 03 spike_detect -0.249206 s
May 23 05:32:55 pfsense ntpd[16682]: 0.0.0.0 0415 05 clock_sync
May 23 21:49:59 pfsense ntpd[16682]: 0.0.0.0 0413 03 spike_detect -0.142089 s
May 23 21:50:15 pfsense ntpd[16682]: 0.0.0.0 0415 05 clock_sync
May 24 05:20:07 pfsense ntpd[16682]: 0.0.0.0 0413 03 spike_detect -0.138790 s
May 24 05:20:23 pfsense ntpd[16682]: 0.0.0.0 0415 05 clock_sync
May 25 20:47:35 pfsense ntpd[16682]: 0.0.0.0 0413 03 spike_detect -0.397899 s
May 25 20:48:07 pfsense ntpd[16682]: 0.0.0.0 0415 05 clock_sync
Then I noticed a significant number of replies from the GPS couldn't be parsed properly by ntpd (badformat). Here, 166 of 34901 responses have had errors:
[2.2-ALPHA][root@pfsense.localdomain]/var/log(11): ntpq
ntpq> cv
associd=0 status=00f2 15 events, clk_bad_format,
device="NMEA GPS Clock",
timecode="$GPGGA,132646.000,3666.4384,N,08999.4250,W,2,8,0.98,224.9,M,-32.5,M,0000,0000*61",
poll=34901, noreply=0, badformat=166, baddata=0, fudgetime2=400.000,
stratum=0, refid=pps, flags=5
ntpq> q
Looking further, I saw that my gps.init strings did not properly turn off the sentences as I had intended. Turns out I had the wrong checksum in the command I entered :) Once I got the GPS sending only the $GPGGA sentence, I haven't had any spikes (so far). If I was ambitious, I'd go back in my clockstats file, and see what the strings looked like around the times the spikes were detected. Hopefully it's fixed though.
So, moral of the story:
- Verify your checksums if you enter init commands by hand, and
- Turn off extra NMEA sentences!
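For anyone entering init strings by hand, here is a minimal Python sketch of the standard NMEA checksum: the XOR of every character between the "$" and the "*", written as two hex digits. The function names are made up for illustration; this is not code from pfSense or from the patch.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR all characters of the sentence body (the text between '$' and '*')."""
    value = reduce(lambda acc, ch: acc ^ ord(ch), body, 0)
    return f"{value:02X}"


def build_sentence(body: str) -> str:
    """Wrap a command body such as 'PMTK301,2' into a full '$...*HH' sentence."""
    return f"${body}*{nmea_checksum(body)}"


def verify_sentence(sentence: str) -> bool:
    """Return True if the checksum after '*' matches the sentence body."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].rpartition("*")
    return given.upper() == nmea_checksum(body)


if __name__ == "__main__":
    print(build_sentence("PMTK301,2"))
    print(verify_sentence("$GPZDA,120846.000,28,05,2014,,*57"))
```

Running each line of a gps.init block through verify_sentence before saving should catch the kind of typo described above.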
-
If you have problems like that, I suggest using the ZDA or ZDG option instead of GGA when choosing sentences: $GPZDA is less than half the size of $GPGGA and contains only date and time. That further reduces the work ntpd has to do to process the string every second.
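To make the size difference concrete, here is a rough Python sketch that compares the two sample timecodes posted in this thread and pulls the date and time out of the ZDA one. The parse_zda helper is hypothetical and only handles the fields shown here (UTC time, day, month, year); it is not how ntpd itself parses the sentence.

```python
from datetime import datetime, timezone

# Sample timecodes copied from the ntpq output in this thread.
GGA = "$GPGGA,132646.000,3666.4384,N,08999.4250,W,2,8,0.98,224.9,M,-32.5,M,0000,0000*61"
ZDA = "$GPZDA,120846.000,28,05,2014,,*57"


def parse_zda(sentence: str) -> datetime:
    """Return the UTC date and time carried by a ZDA sentence."""
    body = sentence.strip().lstrip("$").split("*", 1)[0]
    fields = body.split(",")
    if len(fields) < 5 or not fields[0].endswith("ZDA"):
        raise ValueError("not a ZDA sentence")
    hhmmss, day, month, year = fields[1:5]
    whole, _, frac = hhmmss.partition(".")
    micro = int((frac + "000000")[:6])  # pad the fraction to microseconds
    return datetime(
        int(year), int(month), int(day),
        int(whole[0:2]), int(whole[2:4]), int(whole[4:6]),
        micro, tzinfo=timezone.utc,
    )


if __name__ == "__main__":
    print(len(GGA), len(ZDA))
    print(parse_zda(ZDA).isoformat())
```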
-
Thank-you Robi!
Did a quickie check this morning and noticed that it wasn't running. Enabled ZDA or ZDG option and all is well this morning.
I wasn't paying attention. Google location stuff was eye candy but I liked the number of satellites it was getting.
ntpq> cv
localhost: timed out, nothing received
***Request timed out
ntpq>
(changed and saved settings here)
ntpq> cv
assID=0 status=0000 clk_okay, last_clk_okay,
device="NMEA GPS Clock", timecode="$GPZDA,120846.000,28,05,2014,,*57",
poll=2, noreply=0, badformat=0, baddata=0, fudgetime2=400.000,
stratum=0, refid=GPS, flags=5
ntpq>
So I calculated the checksum and edited the command text, saving the correct checksum for each of the command lines.
-
I just found a couple of small issues, like typos and HTML display stuff, fine-tuned the SureGPS defaults a bit, and noticed that RRD graphs were not plotting on 2.1.3; fixed that.
If you applied the patch using the System Patches package, revert the old patch and apply this new one, or simply overwrite the files on the system, and press "Save" on the Serial GPS page.
Pushed the fixes to GitHub also, for 2.2. (Edit: attachments removed, read further for updates)
-
Thanks Rob. For the update posted do I need to recalculate the checksum stuff?
I did notice while using FF after the save on the serial GPS page; the page looked a bit weird.
Here are the adjusted checksums for the SureGPS.
$PMTK225,0*25
$PMTK314,1,1,0,1,0,5,0,0,0,0,0,0,0,0,0,0,0,1,0*23
$PMTK301,2*20
$PMTK397,0*2D
$PMTK102*3F
$PMTK313,1*20
$PMTK513,1*26
$PMTK319,0*2B
$PMTK527,0.00*0E
$PMTK251,9600*19
-
Love the annotation. ;D
Steve
-
What adjustments do these commands do? Why did you have to change them? There's a mistake for sure for example on the second line, checksum for that is not 23 but 2D. Didn't check all of them, but PMTK301,2 has the correct checksum of 2E, not 20.
Actually I use default settings (the ones which come when you select the model in the GPS pulldown), they all have already correct checksums precalculated. I'd say you should delete the whole thing, select "Generic" from the pulldown, then select "SureGPS" again, to re-load the defaults and press Save.
Don't know about the php error you see on that page, have you done any other modifications to your pfSense system? I use this patch on several machines in production, none of them shows this.
-
> Once I got the GPS sending only the $GPGGA sentence, I haven't had any spikes (so far). If I was ambitious, I'd go back in my clockstats file, and see what the strings looked like around the times the spikes were detected. Hopefully it's fixed though.
I spoke (wrote?) too soon! My spikes are back, so I've got to do more digging; will update if I find anything.
-
> I just found a couple of small issues, like typos and html display stuff, fine-tuned a bit SureGPS defaults and noticed that RRD graphs were not plotting on 2.1.3, fixed that.
Thanks for keeping this up! I'm testing 2.2, and I've noticed that the ntp RRD data does not survive an update, unlike the other RRD groups like system, traffic, packets, etc. Note that I mean update from one 2.2 snapshot to the next, not the update from 2.1.3 to 2.2 (which I didn't try).
Sorry I haven't looked into it any deeper; I hate posting a problem without a solution … but your post reminded me.
-
I also noticed that a restart of the service sometimes deletes the entire RRD data, and a reconfiguration of NTP settings does too; I have no clue why.
-
Found it, I think … in /etc/inc/rrd.inc:
/* NTP, set up the ntpd rrd file */
if (isset($config['ntpd']['statsgraph'])) {
    /* set up the ntpd rrd file */
    if (!file_exists("$rrddbpath$ifname$ntpd")) {
        $rrdcreate = "$rrdtool create $rrddbpath$ntpd --step $rrdntpdinterval ";
        $rrdcreate .= "DS:offset:GAUGE:$ntpdvalid:0:1000 ";
        ...
There shouldn't be the '$ifname' in the check for the existing file, since ntpd stats are not per interface. From the previous captive portal stanza, $ifname is set to 'captiveportal' at this point anyway. So '/var/db/rrd/captiveportalntpd.rrd' is of course not found, and a new ntpd.rrd file is created every time through this code. Seems to be the case for both 2.2 and your 2.1.3 patch.
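A small Python simulation of that check, to show why the file gets recreated on every pass. The path values follow the post above ("/var/db/rrd/" and "captiveportal"); "ntpd.rrd" as the value of $ntpd is inferred from the file name mentioned there, and the run_setup helper is purely illustrative, with a set standing in for the filesystem.

```python
RRDDBPATH = "/var/db/rrd/"
IFNAME = "captiveportal"  # left over from the preceding captive portal stanza
NTPD = "ntpd.rrd"         # assumed value of $ntpd


def run_setup(existing: set, check_path: str, create_path: str) -> bool:
    """Return True if this pass would (re)create the RRD file."""
    if check_path not in existing:
        existing.add(create_path)  # stands in for "rrdtool create"
        return True
    return False


buggy_check = RRDDBPATH + IFNAME + NTPD   # /var/db/rrd/captiveportalntpd.rrd
fixed_check = RRDDBPATH + NTPD            # /var/db/rrd/ntpd.rrd
create_path = RRDDBPATH + NTPD

files = set()
buggy_runs = [run_setup(files, buggy_check, create_path) for _ in range(3)]

files = set()
fixed_runs = [run_setup(files, fixed_check, create_path) for _ in range(3)]

if __name__ == "__main__":
    print("buggy check recreates:", buggy_runs)
    print("fixed check recreates:", fixed_runs)
```

With the extra $ifname the checked path never exists, so every pass recreates the file; with it removed, only the first pass does.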
-
Also, I believe that UNKNOWN values need to be written in more data set fields during boot:
/* set up the ntpd rrd file */
if (!file_exists("$rrddbpath$ntpd")) {
    $rrdcreate = "$rrdtool create $rrddbpath$ntpd --step $rrdntpdinterval ";
    $rrdcreate .= "DS:offset:GAUGE:$ntpdvalid:0:1000 ";
    $rrdcreate .= "DS:sjit:GAUGE:$ntpdvalid:0:1000 ";
    $rrdcreate .= "DS:cjit:GAUGE:$ntpdvalid:0:1000 ";
    $rrdcreate .= "DS:wander:GAUGE:$ntpdvalid:0:1000 ";
    $rrdcreate .= "DS:freq:GAUGE:$ntpdvalid:0:1000 ";
    $rrdcreate .= "DS:disp:GAUGE:$ntpdvalid:0:1000 ";
    $rrdcreate .= "RRA:MIN:0.5:1:1200 ";
    $rrdcreate .= "RRA:MIN:0.5:5:720 ";
    $rrdcreate .= "RRA:MIN:0.5:60:1860 ";
    $rrdcreate .= "RRA:MIN:0.5:1440:2284 ";
    $rrdcreate .= "RRA:AVERAGE:0.5:1:1200 ";
    $rrdcreate .= "RRA:AVERAGE:0.5:5:720 ";
    $rrdcreate .= "RRA:AVERAGE:0.5:60:1860 ";
    $rrdcreate .= "RRA:AVERAGE:0.5:1440:2284 ";
    $rrdcreate .= "RRA:MAX:0.5:1:1200 ";
    $rrdcreate .= "RRA:MAX:0.5:5:720 ";
    $rrdcreate .= "RRA:MAX:0.5:60:1860 ";
    $rrdcreate .= "RRA:MAX:0.5:1440:2284 ";
    create_new_rrd($rrdcreate);
    unset($rrdcreate);
}
/* enter UNKNOWN values in the RRD so it knows we rebooted. */
if($g['booting']) {
    mwexec("$rrdtool update $rrddbpath$ntpd N:U");
}
The update line should be:
mwexec("$rrdtool update $rrddbpath$ntpd N:U:U:U:U:U:U");
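The reason for six U values is that rrdtool update wants one value per data source, in the order the DS entries were created. A quick Python sketch that derives the update string from the DS definitions quoted above (the PHP variables are left in the string as-is, since only the DS names matter here; the helper name is made up):

```python
import re

# DS part of the create command from rrd.inc, PHP variables left unexpanded.
RRDCREATE_DS = (
    "DS:offset:GAUGE:$ntpdvalid:0:1000 "
    "DS:sjit:GAUGE:$ntpdvalid:0:1000 "
    "DS:cjit:GAUGE:$ntpdvalid:0:1000 "
    "DS:wander:GAUGE:$ntpdvalid:0:1000 "
    "DS:freq:GAUGE:$ntpdvalid:0:1000 "
    "DS:disp:GAUGE:$ntpdvalid:0:1000 "
)


def unknown_update_values(rrdcreate: str) -> str:
    """Build an 'N:U:...' value string with one U per DS definition."""
    ds_names = re.findall(r"DS:(\w+):", rrdcreate)
    return ":".join(["N"] + ["U"] * len(ds_names))


if __name__ == "__main__":
    print(unknown_update_values(RRDCREATE_DS))
```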
-
Can you please attach your modified /etc/inc/rrd.inc?
-
How about a patch?
[2.2-ALPHA][root@pfsense.localdomain]/etc/inc(47): diff -ub rrd.inc rrd_new.inc
--- rrd.inc 2014-04-29 11:57:57.000000000 -0400
+++ rrd_new.inc 2014-05-28 12:50:48.000000000 -0400
@@ -849,7 +849,7 @@
 /* NTP, set up the ntpd rrd file */
 if (isset($config['ntpd']['statsgraph'])) {
     /* set up the ntpd rrd file */
-    if (!file_exists("$rrddbpath$ifname$ntpd")) {
+    if (!file_exists("$rrddbpath$ntpd")) {
         $rrdcreate = "$rrdtool create $rrddbpath$ntpd --step $rrdntpdinterval ";
         $rrdcreate .= "DS:offset:GAUGE:$ntpdvalid:0:1000 ";
         $rrdcreate .= "DS:sjit:GAUGE:$ntpdvalid:0:1000 ";
@@ -876,7 +876,7 @@
 /* enter UNKNOWN values in the RRD so it knows we rebooted. */
 if($g['booting']) {
-    mwexec("$rrdtool update $rrddbpath$ntpd N:U");
+    mwexec("$rrdtool update $rrddbpath$ntpd N:U:U:U:U:U:U");
 }
 /* the ntp stats gathering function. */
-
Nice catch 8) Seems to work.
-
Here's the entire patch updated and the zipped pack too.
I'll submit it to GitHub too.
-
Thank-you Rob.
> What adjustments do these commands do? Why did you have to change them? There's a mistake for sure for example on the second line, checksum for that is not 23 but 2D. Didn't check all of them, but PMTK301,2 has the correct checksum of 2E, not 20.
I thought that when you tested the checksum in the box below, the correct value would show. Each of the above commands had a different checksum, so I changed them all from the defaults.
Apologies; I just noticed that I had fat-fingered it and typed an extra character on one side. The checksum numbers match fine.
> Actually I use default settings (the ones which come when you select the model in the GPS pulldown), they all have already correct checksums precalculated. I'd say you should delete the whole thing, select "Generic" from the pulldown, then select "SureGPS" again, to re-load the defaults and press Save.
Will do.
> Don't know about the php error you see on that page, have you done any other modifications to your pfSense system? I use this patch on several machines in production, none of them shows this.
I have not done any modifications. I did rebuild it from scratch and instead of uploading old configuration I added my stuff manually (rules and all).
Yeah I do not see it all of the time and thought it might be a Firefox thing.
Tried it on a couple of other PCs running Firefox and GUI looks fine on them.
-
Saw the error in the GUI come up again after I saved configuration.
Warning: substr_count(): Empty substring in /etc/inc/system.inc on line 1457
Warning: substr_count(): Empty substring in /etc/inc/system.inc on line 1458
-
In the NTP configuration, on the NTP page, do you have any NTP servers configured? Also, in General Setup, what do you see in the "NTP time server" textbox?