Check_reload_status continually (update: was caused by apinger but why)?

binaryjay


Dec 19 10:09:57	check_reload_status: Syncing firewall
Dec 19 10:09:57	check_reload_status: Syncing firewall
Dec 19 10:09:35	php-fpm[62243]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Dec 19 10:09:34	php-fpm[62243]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Dec 19 10:09:33	check_reload_status: Reloading filter
Dec 19 10:09:33	check_reload_status: Restarting OpenVPN tunnels/interfaces
Dec 19 10:09:33	check_reload_status: Restarting ipsec tunnels
Dec 19 10:09:33	check_reload_status: updating dyndns WAN_DHCP
Dec 19 10:09:33	check_reload_status: Reloading filter
Dec 19 10:09:33	check_reload_status: Restarting OpenVPN tunnels/interfaces
Dec 19 10:09:33	check_reload_status: Restarting ipsec tunnels
Dec 19 10:09:33	check_reload_status: updating dyndns WAN_DHCP

This group of events are getting logged every 4-5 minutes like clockwork now, I can't pinpoint the cause, help?

binaryjay

I had the bright idea to check cron…

*/5 * * * * root /usr/bin/nice -n20 /usr/local/bin/php -f /usr/local/pkg/snort/snort_check_cron_misc.inc
*/5 * * * * root /usr/bin/nice -n20 /sbin/pfctl -q -t snort2c -T expire 3600

I think Snort is at fault for this? I'm guessing it is snort2c... but why does that need to be a cron job?

cmb

It's probably not the snort2c, that just removes old entries from that table. Try disabling the other cron job and see if that stops.

binaryjay

I modified both entries to run every 60 minutes instead of 5 using the Cron package but it had no effect at all.

I disabled gateway monitoring and the excess logging stopped… nothing has gone wrong with the wan gateway during all of this, pings have been exceptional if anything whenever I tested it and I haven't noticed any kind of connectivity problems.

I even restarted the machine earlier to no effect... has something gone wrong with apinger lately causing it to run amok?

cmb

What's in the gateways log?

binaryjay

@cmb:

What's in the gateways log?

Small example, was ongoing for ages and during this whole time I could ping the gateway IP just fine and I did try ping from pfsense itself to weed out any obscure problems but ping was fine from everywhere.

IP masked to protect the innocent…


Dec 19 15:04:07	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 15:04:07	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:54:37	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:54:37	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:49:27	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:49:27	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:43:42	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:43:42	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:30:17	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:30:17	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:17:35	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:17:35	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:07:27	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 14:07:27	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 13:56:56	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 13:56:56	apinger: ALARM: WAN_DHCP(xx.xxx.xx.x) *** down ***
Dec 19 13:47:47	apinger: alarm canceled: WAN_DHCP(xx.xxx.xx.x) *** down ***

Escorpiom

More or less the same issue, my gateway is static but apinger claims delay.
I'm pretty sure the gateway is OK, so there must be something fishy going on.

Dec 19 14:31:39	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:32:20	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:32:45	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:33:13	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:37:38	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:38:21	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:41:16	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:41:41	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:42:02	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:42:48	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:44:49	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:45:34	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:46:04	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:46:53	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:48:57	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:49:46	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:50:07	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:50:51	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:54:10	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:54:55	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:56:20	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:57:02	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 14:58:42	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 14:59:25	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 15:02:27	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 15:03:08	apinger: alarm canceled: WANGW(*****) *** delay ***
Dec 19 15:03:38	apinger: ALARM: WANGW(*****) *** delay ***
Dec 19 15:04:16	apinger: alarm canceled: WANGW(*****) *** delay ***

Cheers.

cmb

Go to System>Advanced, Misc and check "Enable gateway monitoring debug logging". That'll give more useful log output.

binaryjay

@cmb:

Go to System>Advanced, Misc and check "Enable gateway monitoring debug logging". That'll give more useful log output.

I've re-enabled gateway monitoring with debug logging enabled and will report back later. Thanks for your help.

binaryjay

Here is an example of the cause and effect that is happening… I really don't see why apinger would be starting a reload given the surrounding gateway conditions.


Dec 19 18:30:06	apinger: (avg. loss: 0.0%)
Dec 19 18:30:06	apinger: (avg: 15.525ms)
Dec 19 18:30:06	apinger: #326 from WAN_DHCP(xx.xxx.xx.x) delay: 13.995ms/16.707ms/155.252ms received = 10
Dec 19 18:30:06	apinger: Polling, timeout: 0.999s
Dec 19 18:30:06	apinger: Recently lost packets: 0
Dec 19 18:30:06	apinger: Sending ping #326 to WAN_DHCP (xx.xxx.xx.x)
Dec 19 18:30:06	apinger: Polling, timeout: 0.425s
Dec 19 18:30:06	apinger: Polling, timeout: 0.006s
[b]Dec 19 18:30:06	apinger: Starting: /usr/local/sbin/pfSctl -c 'service reload dyndns WAN_DHCP' -c 'service reload ipsecdns' -c 'service reload openvpn WAN_DHCP' -c 'filter reload'
Dec 19 18:30:06	apinger: Polling, timeout: 0.000s
Dec 19 18:30:06	apinger: Starting: /usr/local/sbin/pfSctl -c 'service reload dyndns WAN_DHCP' -c 'service reload ipsecdns' -c 'service reload openvpn WAN_DHCP' -c 'filter reload'[/b]
Dec 19 18:30:05	apinger: Polling, timeout: 0.503s
Dec 19 18:30:05	apinger: (avg. loss: 0.0%)
Dec 19 18:30:05	apinger: (avg: 15.796ms)
Dec 19 18:30:05	apinger: #325 from WAN_DHCP(99.234.56.1) delay: 19.851ms/14.994ms/157.964ms received = 9
Dec 19 18:30:05	apinger: Polling, timeout: 0.523s
Dec 19 18:30:05	apinger: Recently lost packets: 0
Dec 19 18:30:05	apinger: Sending ping #325 to WAN_DHCP (xx.xxx.xx.x)

Dec 19 18:30:06	check_reload_status: updating dyndns WAN_DHCP

Edit: It's also suspicious about why it is doing it twice in a row?

cmb

Seems that's probably effect, not cause. Check your DHCP logs for dhclient things. I'm guessing you have a 10 minute DHCP lease, which will renew every 5 minutes.

binaryjay

@cmb:

Seems that's probably effect, not cause. Check your DHCP logs for dhclient things. I'm guessing you have a 10 minute DHCP lease, which will renew every 5 minutes.

dhcp renewal time reported by dhclient.leases for the interface is 24 hours, this behavior only started recently so I'm still stumped.

I updated to the last snapshot, only one day newer, and power cycled my gateway for good measure last night and all of that excess logging stopped. Perhaps it was the gateway going crazy after all even though it seemed to be working fine…

Will keep an eye on it. By the way I couldn't find any logs for dhclient?

binaryjay

Noticed it started happening again… check_reload_status continually. Resetting the gateway made no difference, and there is no noticeable loss of connectivity ever nor any packet loss reported. Anybody else?