Ntp_intres.request: permission denied

phil.davis

2.1-BETA0 (i386)
built on Tue Dec 4 21:53:10 EST 2012

Somehow some of my systems have got 2 ntpd processes going:

root   20112  0.0  2.6  6132  6156  ??  SNs  10:17AM   0:01.07 /usr/local/bin/ntpd -g -c /var/etc/ntpd.conf
root   29080  0.0  2.6  6132  6156  ??  IN   10:17AM   0:00.07 /usr/local/bin/ntpd -g -c /var/etc/ntpd.conf

and they complain in the system and NTP log tabs at the same second of every minute:
In the system log;

Dec 6 12:30:03 	ntpd_intres[29080]: ntp_intres.request: permission denied
Dec 6 12:31:03 	ntpd_intres[29080]: ntp_intres.request: permission denied
Dec 6 12:32:03 	ntpd_intres[29080]: ntp_intres.request: permission denied
Dec 6 12:33:03 	ntpd_intres[29080]: ntp_intres.request: permission denied
Dec 6 12:34:03 	ntpd_intres[29080]: ntp_intres.request: permission denied
Dec 6 12:35:03 	ntpd_intres[29080]: ntp_intres.request: permission denied

In the NTP log tab;

Dec 6 12:30:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:31:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:32:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:33:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:34:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:35:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:36:03 	ntpd[20112]: process_private: failed auth mod_okay 0
Dec 6 12:37:03 	ntpd[20112]: process_private: failed auth mod_okay 0

Restarting ntpd from Status Services fixes the issue (blows both of them away and starts a new one).
I have no idea how often this happens, or when it started (sorry about that - will monitor it now). It was on 2 out of 8 systems I looked at today. The systems have no packages, 1 OpenVPN client, 1 OpenVPN server, Alix 2D13. Maybe some timing interaction when multiple things are starting, interfaces coming up…?
Has anyone else seen this happen?

msi

Also seen this once, but not yet on a Snap from Dec. 5th

I don't know if this is related to your findings but have found that ntpd isn't really working reliably as of now and sometimes plainly fails to start without a more helpful log message other than the version string.
This happens mostly when using DNS, less when directly aiming with IP addresses.
It both happens when using local DCs as NTP peers as well as using ch.pool.ntp.org or 1.ch.pool.ntp.org & 2.ch.pool.ntp.org

msi

bump ;-)

I can now confirm that at every reboot the ntpd fails to start, I did also see permission denied errors - when I relaunch, at least the daemon keeps running for a time - sometimes it definitely crashed.
Will check a bit closer - this applies to all snapshots that I have upgraded a box since Dec. 4th so far and the issue is reproducible at every reboot of the machine after a normal reboot or one after firmware update.

jimp

hmm, my ntpd is starting and working fine on my VMs on current snapshots.

I changed the ntp startup script to be a little more verbose but that doesn't change messages from the daemon itself.

Otherwise there hasn't been any change to the NTP code since October.

msi

Thanks will get a new snap tomorrow and poke more closely at the logs
to get a better idea.

The NTP servers locally are Windows Servers, but the same behaviour happened when using ntp.org's pool.

msi

Weird, ok, this is BETA1 now, I did a reboot and send you the output of /var/log/ntpd.log:

Just after the complete reboot:
Dec 14 08:50:07 wifigate-ng ntpdate[65530]: can't find host ntp1. <ourdomain>Dec 14 08:50:07 wifigate-ng ntpdate[65530]: can't find host ntp2. <ourdomain>Dec 14 08:50:07 wifigate-ng ntpdate[65530]: no servers can be used, exiting
Dec 14 08:50:13 wifigate-ng ntpdate[10277]: can't find host ntp1. <ourdomain>Dec 14 08:50:13 wifigate-ng ntpdate[10277]: can't find host ntp2. <ourdomain>Dec 14 08:50:13 wifigate-ng ntpdate[10277]: no servers can be used, exiting
Dec 14 08:50:19 wifigate-ng ntpdate[11192]: can't find host ntp1. <ourdomain>Dec 14 08:50:19 wifigate-ng ntpdate[11192]: can't find host ntp2. <ourdomain>Dec 14 08:50:19 wifigate-ng ntpdate[11192]: no servers can be used, exiting
–-

The interesting thing is that later on both ntp servers are resolved properly when doing a 'host ntp1.<ourdomain>'
I think the resolvers comes up after ntpdate thant that's perhaps one part of the problem.

Things continue then:

Dec 14 08:50:24 wifigate-ng ntp: Giving up on time sync after 3 attempts.
Dec 14 08:50:24 wifigate-ng ntp: Starting NTP Daemon.
Dec 14 08:50:24 wifigate-ng ntpd[13009]: ntpd 4.2.6p5@1.2349-o Wed Dec 12 15:46:12 UTC 2012 (1)
Dec 14 08:50:24 wifigate-ng ntpd[13268]: proto: precision = 0.838 usec
Dec 14 08:50:24 wifigate-ng ntpd[13268]: ntp_io: estimated max descriptors: 11095, initial socket boundary: 20
Dec 14 08:50:24 wifigate-ng ntpd[13268]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Dec 14 08:50:24 wifigate-ng ntpd[13268]: Listen and drop on 1 v6wildcard :: UDP 123
Dec 14 08:50:24 wifigate-ng ntpd[13268]: Listen normally on 2 ix0 fe80::92e2:baff:fe0b:cdc UDP 123
<snip! repeats="" for="" every="" interface="" enabled="">Dec 14 08:50:24 wifigate-ng ntpd[13268]: peers refreshed
Dec 14 08:50:24 wifigate-ng ntpd[13268]: Listening on routing socket on fd #43 for interface updates
Dec 14 08:50:24 wifigate-ng ntpd[13268]: Deferring DNS for ntp1. <ourdomain>1
Dec 14 08:50:24 wifigate-ng ntpd[13268]: Deferring DNS for ntp2. <ourdomain>1
Dec 14 08:51:28 wifigate-ng ntpd[13268]: process_private: failed auth mod_okay 0
Dec 14 08:51:28 wifigate-ng ntpd[13268]: process_private: failed auth mod_okay 0
<snip this="" message="" thereafter="" gets="" repeated="" twice="" perminute="">–-

ps aux | grep does not reveal a running ntp or ntpd process.
'ntpq -p' does not show peers
The GUI shows green in the services panel
The service status panel confirms what ntpq -p said: "No peers found, is the ntp service running?. "

Now when I restart the ntp server via GUI (takes longer than on the 2.0.1 box with OpenNTPD, the GUI (now correctly) tell: The Service is not started

Dec 14 09:03:59 wifigate-ng ntpd[13268]: ntpd exiting on signal 15
Dec 14 09:03:59 wifigate-ng ntpd[25394]: ntpd 4.2.6p5@1.2349-o Wed Dec 12 15:46:12 UTC 2012 (1)
–-

Issuing ntpdate ntp1 syncs the clock then - interestig, but I keep being unable to start ntpd...</snip></ourdomain></ourdomain></snip!></ourdomain></ourdomain></ourdomain></ourdomain></ourdomain></ourdomain></ourdomain>

msi

Further things:

Starting 'ntpd' uses /etc/ntp.conf default config and launches.

The pfsense php code reveals quite some references to OpenNTPD, maybe some leftovers keep ntp.org's implementation from running?

jimp

How is "ntp1.<yourdomain>" resolved? Is it in /etc/hosts? Local DNS server? Public DNS host? DHCP lease registered hostname?

I use public hostnames in mine and it starts fine every time (0.pfsense.pool.ntp.org 1.pfsense.pool.ntp.org 2.pfsense.pool.ntp.org 3.pfsense.pool.ntp.org)

Also there aren't any more references to "openntpd" that are connected to anything that is significant. A few things could need cleaned up but there isn't anything that would conflict.

You could try moving the dnsmasq configure lines in /etc/rc.bootup up above the ntp config, see if that helps.</yourdomain>

msi

G'day Jim

Thanks for checking the OpenNTPD remainders were not disturbing the ntp.org server.
I'll try to see if moving will optimize things. Yet do think it's realistic that dnsmasq isn't as quick to answer as ntpdate is firing up.

ntp1.ourdomain is resolved via (pfSense default as I remember) local dnsmasq resolver which uses our Domain Controllers as reference.
By Microsofts' definition of DCs they are both DNS and NTP servers, whereas one of them serves are reference clock whole Windows domain.
(and since I have attached admin authentication to the LDAP of the DCs - which happily work with LDAPS - I'd like to use them as reference clock)

jimp

That should work OK, sure there isn't some kind of chicken-and-egg scenario where those DCs need external connectivity or some resource that isn't up yet in order to fully resolve those hostnames?

I recall hitting something like that before with AD for DNS but it's been so long I can't remember the specifics.

I'll probably move the ntp start down under dnsmasq (or perhaps move dnsmasq up – either way) regardless since even if it doesn't fix your situation, if someone had a custom host override in use as the NTP server it wouldn't work otherwise.

phil.davis

When I googled around for "ntp_intres.request: permission denied" I found a few references that indicated that it could get stuck looping with this error message if the DNS was not available when it first kicked off. That doesn't explain how we can get 2 ntpd processes started, but it does indicated that starting dnsmasq before ntpd is a good idea.

jimp

https://github.com/bsdperimeter/pfsense/commit/26ee0570379bd5fb936328786a4f5f6b9ba9f39f

phil.davis

So far it is very happy starting after DNS:

Dec 15 07:03:18 	ntpdate[86433]: adjust time server 120.88.47.10 offset 0.019573 sec
Dec 15 07:03:18 	ntp: Successfully synced time after 1 attempts.
Dec 15 07:03:18 	ntp: Starting NTP Daemon.
Dec 15 07:03:18 	ntpd[170]: ntpd 4.2.6p5@1.2349-o Wed Dec 12 15:48:18 UTC 2012 (1)
Dec 15 07:03:18 	ntpd[279]: proto: precision = 2.720 usec
Dec 15 07:03:18 	ntpd[279]: ntp_io: estimated max descriptors: 3405, initial socket boundary: 20
Dec 15 07:03:18 	ntpd[279]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen and drop on 1 v6wildcard :: UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 2 vr0 10.49.112.250 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 3 vr0 fe80::20d:b9ff:fe29:de60 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 4 vr1 fe80::20d:b9ff:fe29:de61 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 5 vr1 192.168.2.100 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 6 lo0 127.0.0.1 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 7 lo0 ::1 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 8 lo0 fe80::1 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 9 ath0_wlan0 fe80::92a4:deff:fec4:10c5 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 10 ath0_wlan0 10.49.113.0 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 11 ovpnc1 fe80::20d:b9ff:fe29:de60 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 12 ovpnc1 10.49.243.30 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 13 ovpnc2 fe80::20d:b9ff:fe29:de60 UDP 123
Dec 15 07:03:18 	ntpd[279]: Listen normally on 14 ovpnc2 10.49.240.30 UDP 123
Dec 15 07:03:19 	ntpd[279]: peers refreshed
Dec 15 07:03:19 	ntpd[279]: Listening on routing socket on fd #37 for interface updates

Now we wait and see if the "2 ntpd at the same time" issue recurs somehow.

msi

I have also updated to a snapshot which has the lines moved - unfortunately not a real improvement.
I have tried switching to direct IPv4 addresses which seems to work, so yet the problem somewhere lies with DNS.

As I used to add a CNAME to our DCs (ntp1 and ntp2) I don't know if it's related to the usage of that yet.
I'lld try to add a host override too locally to see if it's taking too long for the local resolver to get an answer.