Ntpd / gps need some love part II
-
An NTP internal-only switch that worked would also be a nice thing to have.
+1 on that. If said check box is only on the GPS settings page it's only ever likely to be set by people who understand what they're doing (or think they do ;)).
I thought the clockstats thing had been taken care of? Hmm, clearly I'm not paying enough attention. There was a manual tweak to disable it though right? That didn't make it into the package?
Steve
-
BTW, I still got some errors using GGA, so I went back to using ZDA/ZDG.
That is, I was seeing bad data using GGA here.
It seems to me that the badformat error is due to a bug in ntp: the SureGPS sends the sentences with an extra empty line between them, and the nmea driver seems to get confused by this. It should be fixed either in the ntp nmea driver or in the SureGPS firmware.
However, the number of badformats is rather small (got about 26 during a day), and since the PPS signal is the actual source for exact timing, I wouldn't worry about it. The time is kept very exact anyway.
-
Thanks Robi.
Will do.
Thinking back, a while ago I did play with the firmware on the SureGPS and had a dialog going with the company about the device.
Personally, just the integration of a GPS/PPS into pfSense is a neat thing for me, as initially I had just an autonomous NTP server with a GPS running on the network here. It's my home, so it's not a big deal.
-
It's for a GPS with a uBlox chipset; I mentioned it in my first post about GPS bugs here:
https://forum.pfsense.org/index.php?topic=67189.msg367460#msg367460
I believe JimP later posted that it was added because it was funded by a customer. IMHO it should not be there by default.
Well, if it works for others too, why remove it? I noticed it's a uBlox, but its init commands differ from the standard uBlox.
"Default":
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GLL,0,0,0,0*5C
$PUBX,40,ZDA,0,0,0,0*44
$PUBX,40,VTG,0,0,0,0*5E
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GSA,0,0,0,0*4E
$PUBX,40,GGA,0,0,0,0
$PUBX,40,TXT,0,0,0,0
$PUBX,40,RMC,0,0,0,0*46
$PUBX,41,1,0007,0003,4800,0
$PUBX,40,ZDA,1,1,1,1
"U-Blox":
$PUBX,40,GGA,1,1,1,1,0,0*5A
$PUBX,40,GLL,1,1,1,1,0,0*5C
$PUBX,40,GSA,0,0,0,0,0,0*4E
$PUBX,40,GSV,0,0,0,0,0,0*59
$PUBX,40,RMC,1,1,1,1,0,0*47
$PUBX,40,VTG,0,0,0,0,0,0*5E
$PUBX,40,GRS,0,0,0,0,0,0*5D
$PUBX,40,GST,0,0,0,0,0,0*5B
$PUBX,40,ZDA,1,1,1,1,0,0*44
$PUBX,40,GBS,0,0,0,0,0,0*4D
$PUBX,40,DTM,0,0,0,0,0,0*46
$PUBX,40,GPQ,0,0,0,0,0,0*5D
$PUBX,40,TXT,0,0,0,0,0,0*43
$PUBX,40,THS,0,0,0,0,0,0*54
$PUBX,41,1,0007,0003,4800,0*13
Which one is the real uBlox then? Maybe we should just rename "Default" to its real name, since although it may be a uBlox, it may implement some other commands too. Could be useful for somebody…
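For anyone comparing init strings like these: the two hex digits after the `*` in a sentence such as `$PUBX,41,1,0007,0003,4800,0*13` are the standard NMEA checksum, the XOR of every character between the `$` and the `*`. Below is a minimal sketch in plain sh for recomputing it; the function name `nmea_checksum` is made up for illustration and is not part of pfSense or ntpd.

```shell
#!/bin/sh
# nmea_checksum SENTENCE
# Print the NMEA checksum (two uppercase hex digits) of SENTENCE.
# A leading '$' and a trailing '*XX' part are ignored if present.
nmea_checksum() {
    body=${1#\$}         # drop a leading '$'
    body=${body%%\**}    # drop '*' and everything after it
    sum=0
    while [ -n "$body" ]; do
        rest=${body#?}               # everything but the first character
        ch=${body%"$rest"}           # the first character
        code=$(printf '%d' "'$ch")   # its character code
        sum=$(( sum ^ code ))
        body=$rest
    done
    printf '%02X\n' "$sum"
}
```

Calling `nmea_checksum '$PUBX,41,1,0007,0003,4800,0'` prints the digits that should follow the `*`, which makes it easy to spot a sentence whose checksum was mistyped or lost.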
Another outstanding 'design' question is whether we should be saving clockstats by default. They are not used by any installed program or available package AFAIK, and grow without bounds, as there's no cron job to compress/remove/consolidate them. I think anyone who wants them can certainly turn them on. I first brought it up here:
https://forum.pfsense.org/index.php?topic=67189.msg373669#msg373669
and more here:
https://forum.pfsense.org/index.php?topic=67189.msg373841#msg373841
clockstats are not saved by default. I just checked the interface: on the NTP page at the bottom (Statistics logging section) there's a checkbox named "Enable logging of reference clock statistics (default: disabled).". The PHP code behind it acts as it should; I just tested. If it's not checked, /var/log/ntp stays empty. If it's checked, it creates the files.
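For anyone who does tick that box, the unbounded growth mentioned above could be contained with a small periodic cleanup. This is only a sketch: the function name `prune_clockstats` and the 7-day default are made up, and it assumes the files in the stats directory have names starting with `clockstats`.

```shell
#!/bin/sh
# prune_clockstats DIR [DAYS]
# Remove regular files in DIR whose names start with "clockstats" and
# that were last modified more than DAYS days ago.
prune_clockstats() {
    dir=$1
    days=${2:-7}    # hypothetical default retention
    [ -d "$dir" ] || return 1
    find "$dir" -type f -name 'clockstats*' -mtime +"$days" -exec rm -f {} +
}
```

It could be run from cron as, for example, `prune_clockstats /var/log/ntp 7`; files with other names in the directory are left alone.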
-
It seems to me that the badformat error is due to a bug in ntp: the SureGPS sends the sentences with an extra empty line between them, and the nmea driver seems to get confused by this. It should be fixed either in the ntp nmea driver or in the SureGPS firmware.
It was fixed by the ntp devs a few days after reporting: http://bugs.ntp.org/show_bug.cgi?id=2140
So, the fix should be in pfSense 2.2, which uses ntpd 4.2.7p440, but I don't think it's in 2.1.3, which uses an earlier version of ntpd (4.2.6 something, IIRC). I do not see the issue on my Sure Electronics boards, but I do see it on another GPS.
But you are right, it's a small problem, unless I find it is related to the spikes I'm observing. But that's an ntp issue, not for a pfSense forum.
-
I think that was a different issue: millions of badformats under the Windows build. We have only a few tens, so I tend to think it's either a different thing, or the bug is still not fixed properly.
It's ntpd 4.2.7p411 in my 2.1.3 boxes here.
-
Well, if it works for others too, why remove it? I noticed it's a uBlox, but its init commands differ from the standard uBlox.
$PUBX is proprietary to uBlox. Hopefully your Garmin or Trimble would ignore these … IMHO it's better to send nothing by default: just open the serial port and look for one of the NMEA sentences. Only send something if the user has chosen a unit.
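A rough sketch of that "listen first, send nothing" idea: read whatever the receiver is already emitting and list the sentence types seen. The function name `list_nmea_types` is made up for illustration; it only reads text on stdin, writes nothing to the device, and skips the empty lines some receivers put between sentences.

```shell
#!/bin/sh
# list_nmea_types
# Read NMEA text on stdin and print each sentence type seen with its count.
list_nmea_types() {
    awk '
        { sub(/\r$/, "") }                 # drop the CR of a CRLF line ending
        /^[ \t]*$/ { next }                # ignore empty lines between sentences
        /^\$/ {
            split($0, field, ",")
            seen[substr(field[1], 2)]++    # "$GPGGA" becomes "GPGGA"
        }
        END { for (t in seen) print t, seen[t] }
    ' | sort
}
```

Fed a few seconds of data from the serial port (the device name and port speed depend on the system), it shows whether GGA, ZDA, RMC and so on are present before any init string is chosen.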
"Default":
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GLL,0,0,0,0*5C
$PUBX,40,ZDA,0,0,0,0*44
$PUBX,40,VTG,0,0,0,0*5E
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GSA,0,0,0,0*4E
$PUBX,40,GGA,0,0,0,0
$PUBX,40,TXT,0,0,0,0
$PUBX,40,RMC,0,0,0,0*46
$PUBX,41,1,0007,0003,4800,0
$PUBX,40,ZDA,1,1,1,1
"U-Blox":
$PUBX,40,GGA,1,1,1,1,0,0*5A
$PUBX,40,GLL,1,1,1,1,0,0*5C
$PUBX,40,GSA,0,0,0,0,0,0*4E
$PUBX,40,GSV,0,0,0,0,0,0*59
$PUBX,40,RMC,1,1,1,1,0,0*47
$PUBX,40,VTG,0,0,0,0,0,0*5E
$PUBX,40,GRS,0,0,0,0,0,0*5D
$PUBX,40,GST,0,0,0,0,0,0*5B
$PUBX,40,ZDA,1,1,1,1,0,0*44
$PUBX,40,GBS,0,0,0,0,0,0*4D
$PUBX,40,DTM,0,0,0,0,0,0*46
$PUBX,40,GPQ,0,0,0,0,0,0*5D
$PUBX,40,TXT,0,0,0,0,0,0*43
$PUBX,40,THS,0,0,0,0,0,0*54
$PUBX,41,1,0007,0003,4800,0*13
Which one is the real uBlox then? Maybe we should just rename "Default" to the real name of it, since although it may be an uBlox, it may implement some other commands too. Could be useful for somebody…
I documented many common commands in this message:
https://forum.pfsense.org/index.php?topic=67189.msg373885#msg373885
clockstats are not saved by default. I just checked the interface: on the NTP page at the bottom (Statistics logging section) there's a checkbox named "Enable logging of reference clock statistics (default: disabled).". The PHP code behind it acts as it should; I just tested. If it's not checked, /var/log/ntp stays empty. If it's checked, it creates the files.
Sorry, my bad! Good job!
[edit: add link to gps commands]
-
OK, thanks, maybe it's a different uBlox model.
-
This results in keeping the original idea in pfSense, that you can't run without at least one NTP server configured. If one still wants to run without any external NTP servers, I guess he/she can use 'localhost' instead, or some dummy IP address or hostname which cannot be resolved. Not too nice though.
I don't remember seeing this documented as a requirement! While it's certainly a good recommendation, should it actually be required?
Well, if you take a pfSense out of the box, it starts with a wizard which forces you to use an NTP server. You can't go further to the next step until you enter a string there (the message is "NTP Time Server names may only contain the characters a-z, 0-9, '-' and '.'. Entries may be separated by spaces. Please press back in your browser window and correct."). A similar message appears when you press Save in General Setup, if you empty the "NTP time server" text box. You can't save your settings unless you enter a string there.
If you manually delete the 'timeservers' value from the config XML file, you'll quickly notice that in General Setup the "NTP time server" text box is pre-filled with pool.ntp.org.
All of the above makes me think that pfSense was designed to run with a configured NTP server, and that the user is not able to run the system without at least one.
But this could be reconsidered nowadays, when we have local time sources based on GPS.
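To make the rule in that wizard message concrete, here is a sketch of the check it describes, taken literally from the message text. It is not pfSense's actual PHP code, and the function name `valid_timeservers` is made up.

```shell
#!/bin/sh
# valid_timeservers STRING
# Succeed only if STRING (a single line) is non-empty and consists solely of
# a-z, 0-9, '-', '.' and spaces, the space being the separator between entries.
valid_timeservers() {
    [ -n "$1" ] || return 1    # an empty field is rejected
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9. -]+$'
}
```

Note that a check of this kind accepts 'localhost' or an unresolvable dummy name just as readily as a real server, which is why the workaround above gets past the wizard.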
-
Just for fun, here's the difference between a pfSense system synced to public NTP servers and one with a local GPS/PPS sync.
Timing is 100 times more exact with a local PPS source than with public servers. Same hardware, same pfSense version.
-
Nice. :D
You've gotta love graphs!
Steve
-
Some odd stuff is happening this morning. The drive checks out fine when rebooting, so I do not understand why I am seeing this. I just backed up the configuration and will probably reinstall pfSense on another drive.
Looks like it's a hardware issue?
May 30 07:29:42 kernel: pid 55919 (rrdtool), uid 0: exited on signal 11
May 30 07:29:42 kernel: vm_fault: pager read error, pid 55919 (rrdtool)
May 30 07:29:42 kernel: vnode_pager_getpages: I/O read error
May 30 07:29:42 kernel: g_vfs_done():ad0s1a[READ(offset=36628164608, length=4096)]error = 5
May 30 07:29:42 kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=71539463
May 30 07:29:38 kernel: vm_fault: pager read error, pid 55919 (rrdtool)
May 30 07:29:38 kernel: vnode_pager_getpages: I/O read error
May 30 07:29:38 kernel: g_vfs_done():ad0s1a[READ(offset=36628164608, length=4096)]error = 5
May 30 07:29:38 kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=71539463
May 30 07:29:36 kernel: pid 53340 (rrdtool), uid 0: exited on signal 11
-
I just backed up the configuration and will probably reinstall pfSense on another drive.
Looks like it's a hardware issue?
Yes, it's hardware; backing up now and re-installing somewhere else is the right move. It looks like something that rrdtool was reading from disk failed, at the same block each time. Is this an SSD?
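One quick way to check the "same block each time" observation is to tally the LBA values in the kernel messages. A small sketch, assuming the log text is available on stdin; the function name `failing_lbas` is made up.

```shell
#!/bin/sh
# failing_lbas
# Read kernel log text on stdin and print "LBA count" for every distinct
# LBA=<number> value found, one per line.
failing_lbas() {
    sed -n 's/.*LBA=\([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

A single LBA with a growing count points at one bad sector being hit repeatedly, while many different LBAs would suggest a more general drive, cable or controller problem.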
-
Thank you, charliem.
It's just an old IDE drive. It actually failed a few minutes ago. No boot at all.
Geez, it looked new, as it was just a TiVo backup drive.
I did back up right before. I switched to another temporary IDE drive.
I do have a configured CF card slot and am wondering whether to go to this or just an SSD.
I did do a new build just now of 2.1.3 and restored from backup. It looks like it's working.
Rebooted and restoring packages.
I need to make the change on the serial port, and then I think I will be good to go until I change drives.
Something else is going on. I just used my new drive for the 2nd test machine to do the above rebuild. It worked for some 10 minutes and then started again with similar errors.
I've been testing with 2.2 just fine with this drive.
Swapped cables; same error. Might fall back to the Smoothwall box… odd stuff... my pfSense box has now spilled its guts, with parts hanging out of it, and it's still on the rack... ugly picture.
-
Just shut off the RRD service and the above mentioned errors went away. Very odd.
I was watching the console and while navigating the GUI the same error would come up.
After shutting off the RRD service the errors have gone away.
Looks to be running fine now.
Still running fine after about 30 minutes. I put it all back together again, as I didn't like seeing its guts all over the rack.
-
Try NanoBSD from a CF card. I've been using nano for several years now. Most reliable IMHO; the whole system runs from RAM.
-
Thanks Robi, will do. What is the recommended size for the CF card? I read somewhere that 2 GB should work.
Can I install with the pfSense ISO?
-
-
Thank-you for the pointers Robi.
-
Once I got the GPS sending only the $GPGGA sentence, I haven't had any spikes (so far). If I was ambitious, I'd go back in my clockstats file, and see what the strings looked like around the times the spikes were detected. Hopefully it's fixed though.
I spoke (wrote?) too soon! My spikes are back, so I've got to do more digging; will update if I find anything.
Just an update: It seems the spikes are an artifact.
I've modified /var/db/rrd/updaterrd.sh to keep the output of every 'ntpq -c rv', as well as the values parsed for rrd. I see spikes suddenly, with no corresponding upsets in the ntp clockstats or loopstats files. So, it's an ntp issue, not a pfSense issue.
For reference, if anyone else faces a similar problem, my debugging mods to updaterrd.sh are like so:
# Debug additions: keep each raw 'ntpq -c rv' dump plus the values parsed for rrd
LOGPATH=/tmp
LOGFILE=$LOGPATH/`date +'%y.%m.%d_%H:%M:%S'`
PERM_LOG=$LOGPATH/ntp_stuff.log
# Dump the system variables, split at each comma
/usr/local/sbin/ntpq -c rv | /usr/bin/awk 'BEGIN{ RS=","}{ print }' >> $LOGFILE
# Pull the individual values back out of the dump
NOFFSET=`grep offset $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NFREQ=`grep frequency $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NSJIT=`grep sys_jitter $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NCJIT=`grep clk_jitter $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NWANDER=`grep clk_wander $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
NDISPER=`grep rootdisp $LOGFILE | awk 'BEGIN{FS="="}{print $2}'`
/usr/bin/nice -n20 /usr/local/bin/rrdtool update /var/db/rrd/ntpd.rrd \
N:${NOFFSET}:${NSJIT}:${NCJIT}:${NWANDER}:${NFREQ}:${NDISPER}
# Append the parsed values to the permanent log
echo $NOFFSET : $NSJIT : $NCJIT : $NWANDER : $NFREQ : $NDISPER >> $PERM_LOG
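To pick the spikes out of the resulting ntp_stuff.log afterwards, something along these lines could be used. It is only a sketch: the function name `find_spikes` and the 1.0 default threshold are made up, and it assumes each log line has the " : "-separated layout written by the echo above, with the offset as the first value.

```shell
#!/bin/sh
# find_spikes LOGFILE [LIMIT]
# Print the line number and content of every log line whose first field
# (the offset) is larger than LIMIT in absolute value.
find_spikes() {
    awk -F' : ' -v limit="${2:-1.0}" '
        {
            off = $1 + 0
            if (off < 0) off = -off
            if (off > limit) printf "line %d: %s\n", NR, $0
        }
    ' "$1"
}
```

For example, `find_spikes /tmp/ntp_stuff.log 0.5` lists the samples to compare against the saved raw 'ntpq -c rv' dumps.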