Ntpd / gps need some love part II
-
Thank-you for the pointers Robi.
-
Once I got the GPS sending only the $GPGGA sentence, I haven't had any spikes (so far). If I was ambitious, I'd go back in my clockstats file, and see what the strings looked like around the times the spikes were detected. Hopefully it's fixed though.
I spoke (wrote?) too soon! My spikes are back, so I've got to do more digging; will update if I find anything.
Just an update: It seems the spikes are an artifact.
I've modified /var/db/rrd/updaterrd.sh to keep the output of every 'ntpq -c rv', as well as the values parsed for rrd. I see spikes suddenly, with no corresponding upsets in the ntp clockstats or loopstats files. So, it's an ntp issue, not a pfSense issue.
For reference, if anyone else faces a similar problem, my debugging mods to updaterrd.sh are like so:
LOGPATH=/tmp
LOGFILE=$LOGPATH/`date +'%y.%m.%d_%H:%M:%S'`
PERM_LOG=$LOGPATH/ntp_stuff.log
/usr/local/sbin/ntpq -c rv | /usr/bin/awk 'BEGIN{ RS=","}{ print }' >> $LOGFILE
NOFFSET=`grep offset $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NFREQ=`grep frequency $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NSJIT=`grep sys_jitter $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NCJIT=`grep clk_jitter $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NWANDER=`grep clk_wander $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NDISPER=`grep rootdisp $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
/usr/bin/nice -n20 /usr/local/bin/rrdtool update /var/db/rrd/ntpd.rrd \
N:${NOFFSET}:${NSJIT}:${NCJIT}:${NWANDER}:${NFREQ}:${NDISPER}
echo $NOFFSET : $NSJIT : $NCJIT : $NWANDER : $NFREQ : $NDISPER >> $PERM_LOG
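To pick the spikes out of the permanent log afterwards, something like the following could be used. It is only a sketch: the function name find_spikes and the threshold are hypothetical, and it assumes each line of ntp_stuff.log has the "offset : sjit : cjit : wander : freq : disp" layout written by the echo in the script, with the offset in milliseconds as ntpq reports it.

```shell
#!/bin/sh
# find_spikes LOGFILE THRESHOLD   (hypothetical helper, not part of pfSense)
# Prints "line_number: offset" for every log entry whose absolute
# offset exceeds THRESHOLD (same unit as the logged offset).
find_spikes() {
    awk -F' : ' -v limit="$2" '
        {
            off = $1 + 0              # first field is the offset
            if (off < 0) off = -off   # compare on the absolute value
        }
        off > limit { print NR ": " $1 }
    ' "$1"
}

# Example (commented out so the file can be sourced safely):
# find_spikes /tmp/ntp_stuff.log 1
```

The reported line numbers can then be matched against the timestamped per-run files in /tmp to see what ntpq actually returned at that moment.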
-
Thanks charliem!
So, it's an ntp issue, not a pfSense issue.
So is it an NTP issue with BSD, an NTP issue specific to using a GPS with PPS for time sync, or just a general NTP issue that shows up whether you use NTP servers on the internet or a GPS/PPS?
Decided to start from scratch (hardware) on the other pfSense build.
-
The patch file ntpd_love_patch_213d.txt from here applies cleanly to pfSense 2.1.4 also.
Tested on x64 NanoBSD, should work on full install too.
-
Thank-you robi!
Updated pfSense today to 2.1.4 along with adding the NTP stuff.
Works great. Still building box #2.
-
It's for a GPS with a uBlox chipset; I mentioned it in my first post about GPS bugs here:
https://forum.pfsense.org/index.php?topic=67189.msg367460#msg367460
I believe JimP later posted that it was added because it was funded by a customer. IMHO it should not be there by default.
Yes, apparently a uBlox. And a poor command set at that, probably found on some webpage somewhere. But somebody did pay some cash to have it added to pfSense, which then became the basis to further develop it into something that became far more useful to many others. That is the reason it is the "default", with the caveat that it is not recommended. Far from ideal, but unless the person who originally paid for the feature says that the newer uBlox config is just as good or better, the original should remain.
BTW charliem, nice find on the rrd graphs, I never had enough time to figure out why the DB didn't survive a reboot. :(
@pete: I run off a 16GB SLC CF card, using the standard version but with all the options to run as much as possible in RAM set. It's plenty big enough, but I don't run anything like Squid.
-
To get the patch to work in 2.1.5, I had to edit one services_ntpd.php hunk in Robi's 2.1.3 patch with change according to the commit:
https://github.com/pfsense/pfsense/commit/88c24958a9625d2daa55adb2bb685c70ec9d6eba
-
Don't know if this has been mentioned, but the NTP RRD graphs should scale from negative to positive as offset can have a negative value. Now it's clipped off as the graph is shown from 0 up.
-
To get the patch to work in 2.1.5, I had to edit one services_ntpd.php hunk in Robi's 2.1.3 patch with change according to the commit:
https://github.com/pfsense/pfsense/commit/88c24958a9625d2daa55adb2bb685c70ec9d6eba
Yes indeed. Please find attached the patch.
-
Now that 2.2-RELEASE is out, which includes all this, a quick note with good news for the people who had to tweak serial ports on their motherboard.
-
To make ntp rrd graphs look nicer I did some changes on my system to allow full scaling from negative to positive values and I made the graph scale a bit nicer (in my opinion).
Modify rrd.inc to allow negative values. I'm not sure if all of these can actually go to negative, but this was a lazy initial edit. I'm sure none of the values should ever get anywhere close to 1000 though.
$rrdcreate .= "DS:offset:GAUGE:$ntpdvalid:-1000:1000 ";
$rrdcreate .= "DS:sjit:GAUGE:$ntpdvalid:-1000:1000 ";
$rrdcreate .= "DS:cjit:GAUGE:$ntpdvalid:-1000:1000 ";
$rrdcreate .= "DS:wander:GAUGE:$ntpdvalid:-1000:1000 ";
$rrdcreate .= "DS:freq:GAUGE:$ntpdvalid:-1000:1000 ";
$rrdcreate .= "DS:disp:GAUGE:$ntpdvalid:-1000:1000 ";
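The DS lines above are part of the create command, so presumably they only take effect when ntpd.rrd is created from scratch; that is an assumption about how rrd.inc is used, not something verified here. For an existing database, rrdtool's tune command can change the minimum of a data source in place. A minimal sketch, where the helper name tune_cmds is hypothetical and the paths are the ones used in updaterrd.sh; it only prints the commands so they can be reviewed first:

```shell
#!/bin/sh
RRD=/var/db/rrd/ntpd.rrd
RRDTOOL=/usr/local/bin/rrdtool

# tune_cmds DS...   (hypothetical helper)
# Prints one "rrdtool tune" command per data source name given,
# lowering its minimum to -1000 to match the edited rrd.inc.
tune_cmds() {
    for ds in "$@"; do
        echo "$RRDTOOL tune $RRD --minimum $ds:-1000"
    done
}

# Example (commented out; prints the commands, append "| sh" to apply):
# tune_cmds offset freq
```

Back up ntpd.rrd before applying anything, since tune modifies the file directly.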
Modify the status_rrd_graph_img.php file to scale things better for the actual graph. Another part to touch in this file would be the COMMENT/GPRINT part for ntp graph to tweak the number of decimals etc.
$graphcmd .= "--alt-autoscale ";
$graphcmd .= "--alt-y-grid ";
$graphcmd .= "--units-exponent 0 ";
$graphcmd .= "--rigid ";
I've never touched any of the pfSense code before and I don't have a github account nor have I signed the CLA. I would very much appreciate it if someone could take a look at this, make the actual changes needed and post a pull request.
-
To make ntp rrd graphs look nicer I did some changes on my system to allow full scaling from negative to positive values and I made the graph scale a bit nicer (in my opinion).
Can you post a diff? Would be easier for others to test.
Modify rrd.inc to allow negative values. I'm not sure if all of these can actually go to negative, but this was a lazy initial edit. I'm sure none of the values should ever get anywhere close to 1000 though.
As far as I know, only offset and frequency could have possible negative values. Jitter and wander are calculated as RMS averages and so should always be positive. Dispersion is related to delay and latency measurements, both of which should be positive (unless you live in a Tardis …)
-
Plotting negative offset seems like a good idea. What does it show now if it can't go negative? Worst case it shows zero offset unrealistically.
Steve
-
Negative values are clipped off completely resulting in gaps in the graph.
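This matches how rrdtool treats a value outside a data source's min/max range: it is stored as unknown, which the graph draws as a gap. One way to confirm it is to count the unknown rows in the output of rrdtool fetch. A rough sketch, assuming offset is the first data column (as in the update order in updaterrd.sh); the function name count_unknown is hypothetical:

```shell
#!/bin/sh
# count_unknown   (hypothetical helper)
# Reads "rrdtool fetch" output on stdin and prints how many rows have
# an unknown (nan) value in the first data column.
count_unknown() {
    awk '
        # data rows look like "<timestamp>: v1 v2 ..."; skip header and blanks
        $1 ~ /^[0-9]+:$/ && tolower($2) ~ /nan/ { n++ }
        END { print n + 0 }
    '
}

# Example (commented out so the file can be sourced safely):
# /usr/local/bin/rrdtool fetch /var/db/rrd/ntpd.rrd AVERAGE | count_unknown
```

Unknown rows also appear when no update arrived at all, so a non-zero count is only a hint, not proof of clipping.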
-
As far as I know, only offset and frequency could have possible negative values. Jitter and wander are calculated as RMS averages and so should always be positive. Dispersion is related to delay and latency measurements, both of which should be positive (unless you live in a Tardis …)
I'd say it's still good if the graphs themselves could plot negative values, because that could potentially show if there are some problems with ntpd.
I plan to test next week and post a diff too.