Ntpd / gps need some love part II
-
Thanks, I found the cause of the weirdness you saw and fixed it; attachments below.
It's hardcoded in pfSense's system.php that if, for some reason, no NTP server is present in the config file, pool.ntp.org is used by default. That's why you were seeing it in General Setup, and when you press Save there, pool.ntp.org goes into the config file.
However, in NTP settings you could delete all the NTP servers from the list, while the code that parses the config file expects at least one to exist. That's what generated your error.
I fixed it by adding the same pool.ntp.org server as a last resort if someone deletes all the servers. If you delete all the servers and press Save, the system will automatically add pool.ntp.org, and you will see it in the first row.
This keeps the original idea in pfSense that you can't run without at least one NTP server configured. If one still wants to run without any external NTP servers, I guess he/she can use 'localhost' instead, or some dummy IP address or hostname which cannot be resolved. Not too nice though.
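The fallback described above can be sketched roughly as follows. This is a minimal Python illustration of the idea, not the actual pfSense PHP; the helper name `normalize_timeservers` and the constant `DEFAULT_NTP_SERVER` are made up for the example, and `time.example.org` is a placeholder hostname.

```python
DEFAULT_NTP_SERVER = "pool.ntp.org"  # the last-resort server named in the post


def normalize_timeservers(servers):
    """Return the server list with blank entries dropped, never empty.

    Hypothetical helper: if every entry was deleted, fall back to
    pool.ntp.org so the config parser always finds at least one server.
    """
    cleaned = [s.strip() for s in servers if s and s.strip()]
    return cleaned if cleaned else [DEFAULT_NTP_SERVER]


if __name__ == "__main__":
    print(normalize_timeservers([]))
    print(normalize_timeservers(["time.example.org", " "]))
```

Doing the fallback at save time, rather than at parse time, is what makes the substituted server visible in the first row of the list afterwards.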
-
It's worth a discussion and a decision with pfSense's main devs whether to allow having no external NTP servers configured IF there's a local GPS time source. That would involve some simple logic to check whether a GPS time source is already configured, and to allow an empty list of servers in both NTP settings and General Setup.
Moreover, it could even be decided to completely remove the "NTP time server" option from General Setup, since the NTP page already contains advanced configuration for NTP stuff.
Also, the "Default" type of GPS receiver is still present in the list, copied from the original pfSense 2.1 code. Dagorlad has put this text on the page:
Note: Default is the configuration of pfSense 2.1 and earlier (not recommended). Select Generic if your GPS is not listed.)
I think we should clear this up: what was the type/model of the GPS receiver used when Serial GPS was originally added to pfSense 2.1? I'd add the correct name to reduce confusion and remove the warning text.
Jimp? Can you please consider the above?
-
This keeps the original idea in pfSense that you can't run without at least one NTP server configured. If one still wants to run without any external NTP servers, I guess he/she can use 'localhost' instead, or some dummy IP address or hostname which cannot be resolved. Not too nice though.
I don't remember seeing this documented as a requirement! While it's certainly a good recommendation, should it actually be required?
-
Thank you, Robi!
I am fine using one external NTP server as a failover. An NTP internal-only switch that worked would also be a nice thing to have.
It's been a few years here. I started with an old "surplus" Trimble GPS (it was made to go inside a tank) and used that one for many years, then went to the SureGPS until the USB power port fell off the board. I baked the board last year and was able to get it to work.
BTW, I still got some errors using GGA, so I went back to using ZDA/ZDG.
I.e., I was seeing bad data using GGA here.
ntpq> cv
assID=0 status=0012 clk_okay, last_clk_18,
device="NMEA GPS Clock", timecode="$GPZDA,123914.000,29,05,2014,,*53",
poll=7, noreply=0, badformat=1, baddata=0, fudgetime2=400.000,
stratum=0, refid=GPS, flags=5
-
Also, the "Default" type of GPS receiver is still present in the list, copied from the original pfSense 2.1 code. Dagorlad has put this text on the page:
Note: Default is the configuration of pfSense 2.1 and earlier (not recommended). Select Generic if your GPS is not listed.)
I think we should clear this up: what was the type/model of the GPS receiver used when Serial GPS was originally added to pfSense 2.1? I'd add the correct name to reduce confusion and remove the warning text.
Jimp? Can you please consider the above?
It's for a GPS with a uBlox chipset; I mentioned it in my first post about GPS bugs here:
https://forum.pfsense.org/index.php?topic=67189.msg367460#msg367460
I believe JimP later posted that it was added because it was funded by a customer. IMHO it should not be there by default.
Another outstanding 'design' question is whether we should be saving clockstats by default. They are not used by any installed program or available package AFAIK, and grow without bounds, as there's no cron job to compress/remove/consolidate them. I think anyone who wants them can certainly turn them on. I first brought it up here:
https://forum.pfsense.org/index.php?topic=67189.msg373669#msg373669
and more here:
https://forum.pfsense.org/index.php?topic=67189.msg373841#msg373841
-
An NTP internal-only switch that worked would also be a nice thing to have.
+1 on that. If said check box is only on the GPS settings page it's only ever likely to be set by people who understand what they're doing (or think they do ;)).
I thought the clockstats thing had been taken care of? Hmm, clearly I'm not paying enough attention. There was a manual tweak to disable it though right? That didn't make it into the package?
Steve
-
BTW, I still got some errors using GGA, so I went back to using ZDA/ZDG.
I.e., I was seeing bad data using GGA here.
It seems to me that the badformat error is due to a bug in ntp: the SureGPS sends the sentences with an extra empty line between them, and the nmea driver seems to get confused by this. It should be fixed either in the ntp nmea driver or in the SureGPS firmware.
However, the number of badformats is rather small (got about 26 during a day), and since the PPS signal is the actual source for exact timing, I wouldn't worry about it. The time is kept very exact anyway.
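To keep an eye on how fast these counters grow, the clockvar output can be parsed into name/value pairs. The snippet below is a small sketch that uses the `ntpq> cv` text posted earlier in the thread as sample input; `parse_clockvars` is a hypothetical helper name, and the regular expression only assumes the `name=value` layout visible in that sample.

```python
import re

SAMPLE = '''assID=0 status=0012 clk_okay, last_clk_18,
device="NMEA GPS Clock", timecode="$GPZDA,123914.000,29,05,2014,,*53",
poll=7, noreply=0, badformat=1, baddata=0, fudgetime2=400.000,
stratum=0, refid=GPS, flags=5'''

# name=value pairs; a value is either a quoted string or a run of
# characters that contains no comma and no whitespace
_PAIR = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def parse_clockvars(text):
    """Turn ntpq clockvar output into a dict of strings (quotes stripped)."""
    return {name: value.strip('"') for name, value in _PAIR.findall(text)}


if __name__ == "__main__":
    cv = parse_clockvars(SAMPLE)
    for key in ("noreply", "badformat", "baddata"):
        print(key, int(cv[key]))
```

Sampling the counters at two points in time and subtracting gives a rate that can be compared with the roughly 26 badformats per day mentioned above.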
-
Thanks, Robi.
Will do.
Thinking back, a while ago I did play with the firmware on the SureGPS and had a dialog going with the company about the device.
Personally, just the integration of a GPS/PPS into pfSense is a neat thing for me, as I initially had just an autonomous NTP server with a GPS running on the network here. It's my home, so it's not a big deal.
-
It's for a GPS with a uBlox chipset; I mentioned it in my first post about GPS bugs here:
https://forum.pfsense.org/index.php?topic=67189.msg367460#msg367460
I believe JimP later posted that it was added because it was funded by a customer. IMHO it should not be there by default.
Well, if it works for others too, why remove it? I noticed it's a uBlox, but its init commands differ from the standard uBlox.
"Default":
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GLL,0,0,0,0*5C
$PUBX,40,ZDA,0,0,0,0*44
$PUBX,40,VTG,0,0,0,0*5E
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GSA,0,0,0,0*4E
$PUBX,40,GGA,0,0,0,0
$PUBX,40,TXT,0,0,0,0
$PUBX,40,RMC,0,0,0,0*46
$PUBX,41,1,0007,0003,4800,0
$PUBX,40,ZDA,1,1,1,1
"U-Blox":
$PUBX,40,GGA,1,1,1,1,0,0*5A
$PUBX,40,GLL,1,1,1,1,0,0*5C
$PUBX,40,GSA,0,0,0,0,0,0*4E
$PUBX,40,GSV,0,0,0,0,0,0*59
$PUBX,40,RMC,1,1,1,1,0,0*47
$PUBX,40,VTG,0,0,0,0,0,0*5E
$PUBX,40,GRS,0,0,0,0,0,0*5D
$PUBX,40,GST,0,0,0,0,0,0*5B
$PUBX,40,ZDA,1,1,1,1,0,0*44
$PUBX,40,GBS,0,0,0,0,0,0*4D
$PUBX,40,DTM,0,0,0,0,0,0*46
$PUBX,40,GPQ,0,0,0,0,0,0*5D
$PUBX,40,TXT,0,0,0,0,0,0*43
$PUBX,40,THS,0,0,0,0,0,0*54
$PUBX,41,1,0007,0003,4800,0*13
Which one is the real uBlox then? Maybe we should just rename "Default" to the real name of it, since although it may be a uBlox, it may implement some other commands too. Could be useful for somebody…
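When comparing init strings like these, it can help to recompute the checksums. The two hex digits at the end of a sentence are the standard NMEA checksum: the XOR of every character between `$` and `*`. The sketch below shows that calculation in Python; the function names `nmea_checksum` and `check_sentence` are only illustrative.

```python
from functools import reduce


def nmea_checksum(payload):
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return format(reduce(lambda acc, ch: acc ^ ord(ch), payload, 0), "02X")


def check_sentence(sentence):
    """Return (stated, computed) checksums; stated is None if there is no '*'."""
    body = sentence.strip().lstrip("$")
    payload, star, stated = body.partition("*")
    return (stated.upper() if star else None, nmea_checksum(payload))


if __name__ == "__main__":
    for line in ("$PUBX,40,GSV,0,0,0,0*59",
                 "$PUBX,41,1,0007,0003,4800,0*13",
                 "$PUBX,40,GGA,0,0,0,0"):
        print(line, check_sentence(line))
```

By this calculation `$PUBX,40,RMC,0,0,0,0` comes out as 47, so the 46 on the RMC line of the "Default" list may be worth a second look.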
Another outstanding 'design' question is whether we should be saving clockstats by default. They are not used by any installed program or available package AFAIK, and grow without bounds, as there's no cron job to compress/remove/consolidate them. I think anyone who wants them can certainly turn them on. I first brought it up here:
https://forum.pfsense.org/index.php?topic=67189.msg373669#msg373669
and more here:
https://forum.pfsense.org/index.php?topic=67189.msg373841#msg373841
Clockstats are not saved by default. I just checked the interface: on the NTP page, at the bottom (Statistics logging section), there's a checkbox named "Enable logging of reference clock statistics (default: disabled).". The PHP code behind it acts as it should; I just tested. If it's not checked, /var/log/ntp stays empty. If it's checked, it creates the files.
-
It seems to me that the badformat error is due to a bug in ntp: the SureGPS sends the sentences with an extra empty line between them, and the nmea driver seems to get confused by this. It should be fixed either in the ntp nmea driver or in the SureGPS firmware.
It was fixed by the ntp devs a few days after reporting: http://bugs.ntp.org/show_bug.cgi?id=2140
So, the fix should be in pfSense 2.2, which uses ntpd 4.2.7p440, but I don't think it's in 2.1.3, which uses an earlier version of ntpd (4.2.6 something, IIRC). I do not see the issue on my Sure Electronics boards, but I do see it on another GPS.
But you are right, it's a small problem, unless I find it is related to the spikes I'm observing. But that's an ntp issue, not for a pfSense forum.
-
I think that was a different issue, with millions of badformats under the Windows build. We have only a couple of dozen, so I tend to think it's either a different thing, or the bug is still not fixed properly.
It's ntpd 4.2.7p411 in my 2.1.3 boxes here.
-
Well, if it works for others too, why remove it? I noticed it's a uBlox, but its init commands differ from the standard uBlox.
$PUBX is proprietary to uBlox. Hopefully your Garmin or Trimble would ignore these … IMHO it's better to send nothing by default: just open the serial port and look for one of the NMEA sentences. Only send something if the user has chosen a unit.
"Default":
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GLL,0,0,0,0*5C
$PUBX,40,ZDA,0,0,0,0*44
$PUBX,40,VTG,0,0,0,0*5E
$PUBX,40,GSV,0,0,0,0*59
$PUBX,40,GSA,0,0,0,0*4E
$PUBX,40,GGA,0,0,0,0
$PUBX,40,TXT,0,0,0,0
$PUBX,40,RMC,0,0,0,0*46
$PUBX,41,1,0007,0003,4800,0
$PUBX,40,ZDA,1,1,1,1
"U-Blox":
$PUBX,40,GGA,1,1,1,1,0,0*5A
$PUBX,40,GLL,1,1,1,1,0,0*5C
$PUBX,40,GSA,0,0,0,0,0,0*4E
$PUBX,40,GSV,0,0,0,0,0,0*59
$PUBX,40,RMC,1,1,1,1,0,0*47
$PUBX,40,VTG,0,0,0,0,0,0*5E
$PUBX,40,GRS,0,0,0,0,0,0*5D
$PUBX,40,GST,0,0,0,0,0,0*5B
$PUBX,40,ZDA,1,1,1,1,0,0*44
$PUBX,40,GBS,0,0,0,0,0,0*4D
$PUBX,40,DTM,0,0,0,0,0,0*46
$PUBX,40,GPQ,0,0,0,0,0,0*5D
$PUBX,40,TXT,0,0,0,0,0,0*43
$PUBX,40,THS,0,0,0,0,0,0*54
$PUBX,41,1,0007,0003,4800,0*13
Which one is the real uBlox then? Maybe we should just rename "Default" to the real name of it, since although it may be a uBlox, it may implement some other commands too. Could be useful for somebody…
I documented many common commands in this message:
https://forum.pfsense.org/index.php?topic=67189.msg373885#msg373885
Clockstats are not saved by default. I just checked the interface: on the NTP page, at the bottom (Statistics logging section), there's a checkbox named "Enable logging of reference clock statistics (default: disabled).". The PHP code behind it acts as it should; I just tested. If it's not checked, /var/log/ntp stays empty. If it's checked, it creates the files.
Sorry, my bad! Good job!
[edit: add link to gps commands]
-
OK, thanks, maybe it's a different uBlox model.
-
This keeps the original idea in pfSense that you can't run without at least one NTP server configured. If one still wants to run without any external NTP servers, I guess he/she can use 'localhost' instead, or some dummy IP address or hostname which cannot be resolved. Not too nice though.
I don't remember seeing this documented as a requirement! While it's certainly a good recommendation, should it actually be required?
Well, if you take a pfSense out of the box, it starts with a wizard which forces you to use an NTP server. You can't go further to the next step until you enter a string there (the message is "NTP Time Server names may only contain the characters a-z, 0-9, '-' and '.'. Entries may be separated by spaces. Please press back in your browser window and correct."). A similar message appears when you press Save in General Setup, if you empty the "NTP time server" text box. You can't save your settings unless you enter a string there.
If you manually delete the 'timeservers' value from the config XML file, you'll quickly notice that in General Setup the "NTP time server" text box is pre-filled with pool.ntp.org.
All the above makes me think that pfSense was designed to run with a configured NTP server, and that the user is not able to run the system without at least one.
But this could be reconsidered nowadays, when we have local time sources based on GPS.
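As a rough illustration of the check described in this post, and of how it might be relaxed, here is a Python sketch. It is not pfSense's actual code: the function name `validate_timeservers` and the `gps_configured` flag are hypothetical, and the character set is taken literally from the wizard's error message quoted above.

```python
import re

# Characters allowed by the wizard's error message: a-z, 0-9, '-' and '.'
_NAME = re.compile(r"[a-z0-9.-]+")


def validate_timeservers(field, gps_configured=False):
    """Return (ok, servers) for the space-separated "NTP time server" field.

    gps_configured is a hypothetical flag: when True, an empty list is
    accepted, which is the relaxation proposed in the thread. The behaviour
    described in the post corresponds to gps_configured=False.
    """
    servers = field.split()
    if not servers:
        return (gps_configured, [])
    ok = all(_NAME.fullmatch(name) is not None for name in servers)
    return (ok, servers if ok else [])


if __name__ == "__main__":
    print(validate_timeservers("pool.ntp.org"))
    print(validate_timeservers(""))
    print(validate_timeservers("", gps_configured=True))
```

Keeping the GPS test as a single flag means the same function could serve both the General Setup page and the NTP settings page, which is where the thread suggests the rule would need to change.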
-
Here's just for fun, the difference between a pfSense system synced to public NTP servers, and one with a local GPS/PPS sync.
Timing is 100 times more exact with a local PPS source, than with public servers. Same hardware, same pfSense version.
-
Nice. :D
You've gotta love graphs!
Steve
-
Some odd stuff is happening this morning. The drive checks out fine when rebooting, so I do not understand why I am seeing this. I just backed up the configuration and will probably reinstall pfSense on another drive.
Looks like it's a hardware issue?
May 30 07:29:42 kernel: pid 55919 (rrdtool), uid 0: exited on signal 11
May 30 07:29:42 kernel: vm_fault: pager read error, pid 55919 (rrdtool)
May 30 07:29:42 kernel: vnode_pager_getpages: I/O read error
May 30 07:29:42 kernel: g_vfs_done():ad0s1a[READ(offset=36628164608, length=4096)]error = 5
May 30 07:29:42 kernel: ad0: FAILURE - READ_DMA status=51 <ready,dsc,error>error=40 <uncorrectable>LBA=71539463
May 30 07:29:38 kernel: vm_fault: pager read error, pid 55919 (rrdtool)
May 30 07:29:38 kernel: vnode_pager_getpages: I/O read error
May 30 07:29:38 kernel: g_vfs_done():ad0s1a[READ(offset=36628164608, length=4096)]error = 5
May 30 07:29:38 kernel: ad0: FAILURE - READ_DMA status=51 <ready,dsc,error>error=40 <uncorrectable>LBA=71539463
May 30 07:29:36 kernel: pid 53340 (rrdtool), uid 0: exited on signal 11
-
I just backed up the configuration and will probably reinstall pfSense on another drive.
Looks like it's a hardware issue?
Yes, it's hardware; backing up now and re-installing somewhere else is the right move. It looks like something rrdtool was reading from disk failed, at the same block each time. Is this an SSD?
-
Thank you, charliem.
It's just an old IDE drive. It actually failed a few minutes ago. No boot at all.
Geez, it looked new, as it was just a TiVo backup drive.
I did back up right before. Switched to another temporary IDE drive.
I do have a configured CF card slot and am wondering whether to go to this or just an SSD.
I did a new build of 2.1.3 just now and restored from backup. It looks like it's working.
Rebooted and restoring packages.
Need to do the change on the serial port and I think I will be good to go until I change drives.
Something else is going on. I just used my new drive for the 2nd test machine to do the above rebuild. It worked for some 10 minutes and then started again with similar errors.
I've been testing with 2.2 just fine with this drive.
Swapped cables; same error. Might fall back to the Smoothwall box… odd stuff… my pfSense has now spilled its guts, with parts out of it, and it's still on the rack… ugly picture…
-
Just shut off the RRD service and the above mentioned errors went away. Very odd.
I was watching the console and while navigating the GUI the same error would come up.
After shutting off the RRD service the errors have gone away.
Looks to be running fine now.
Still running fine after about 30 minutes. Put it all back together again, as I didn't like seeing its guts all over the rack.