Apinger running amok?
-
On March 22nd, with the then-current prerelease version, I experienced a significant number of connection losses for HTTP (or maybe HTTPS) downloads.
The system log mentioned:
Mar 22 20:00:55 php: rc.filter_configure_sync: Adding TFTP nat rules
Mar 22 20:00:53 php: rc.dyndns.update: phpDynDNS (yyyyyyyy): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Mar 22 20:00:53 php: rc.dyndns.update: phpDynDNS (yyyyyyyy): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Mar 22 20:00:53 php: rc.filter_configure_sync: Adding TFTP nat rules
Mar 22 20:00:52 php: rc.dyndns.update: phpDynDNS (xxxxxxxx): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Mar 22 20:00:52 php: rc.dyndns.update: phpDynDNS (xxxxxxxx): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Mar 22 20:00:48 check_reload_status: Reloading filter
Mar 22 20:00:48 check_reload_status: Restarting OpenVPN tunnels/interfaces
Mar 22 20:00:48 check_reload_status: Restarting ipsec tunnels
Mar 22 20:00:48 check_reload_status: updating dyndns WAN
Mar 22 20:00:48 check_reload_status: Reloading filter
Mar 22 20:00:48 check_reload_status: Restarting OpenVPN tunnels/interfaces
Mar 22 20:00:48 check_reload_status: Restarting ipsec tunnels
Mar 22 20:00:48 check_reload_status: updating dyndns WAN
(DNS names replaced by xes and ys to protect the innocent)
The gateway log shows:
Mar 22 20:00:38 apinger: alarm canceled: WAN(xx.xx.xx.xx) *** WANdown ***
Mar 22 20:00:38 apinger: ALARM: WAN(xx.xx.xx.xx) *** WANdown ***
Both sections repeat many times, correlating with the connection losses of the HTTP/HTTPS downloads. The modem's log showed nothing suspicious.
On the machine doing the HTTP/HTTPS downloads, a BitTorrent client was active. After shutting down this BT client, the log messages did not reappear any more, and the HTTP/HTTPS downloads ran stable again.
On pfSense, I have the traffic shaper (HFSC) configured to give BT traffic low priority (outbound traffic shaping only, the cable speed is 100/5, LAN is GBit). The LAN default rule has qACK/qDefault assigned. TBR size has been set to 65535.
To me it looks like apinger tries to ping the default gateway, the ping does not get through, and apinger then kills the firewall states.
What makes the situation a bit "diffuse" is that I have never noticed this behavior before. However, I must admit that I don't use BT very often, only every few weeks or months or so. So I am not sure whether this is a new issue that has appeared in the 2.1.1 branch or whether it has always been present.
-
So, you basically killed your line with BT traffic… how's this an apinger issue?
-
If you really saturate your link, then the ping time from pfSense to the monitor IP eventually goes higher than the default parameters. That will cause apinger to declare the link down.
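To illustrate the idea, here is a minimal Python sketch of threshold-based gateway monitoring. It is not apinger's actual code: the names `Thresholds` and `gateway_alarms`, the alarm labels, and all numbers are made up for the example. It only shows why a saturated link can trip an alarm while the line itself is still up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thresholds:
    # Hypothetical limits, loosely modelled on "latency" and "packet loss" settings.
    latency_high_ms: float
    loss_high_pct: float


def gateway_alarms(rtts_ms: List[Optional[float]], limits: Thresholds) -> List[str]:
    """Return the alarms raised for one window of ping probes.

    rtts_ms holds one round-trip time per probe in milliseconds,
    or None for a probe that got no reply.
    """
    if not rtts_ms:
        raise ValueError("need at least one probe")
    answered = [r for r in rtts_ms if r is not None]
    if not answered:
        # No reply at all: the monitor can only assume the gateway is down.
        return ["down"]
    alarms = []
    avg_ms = sum(answered) / len(answered)
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    if avg_ms > limits.latency_high_ms:
        alarms.append("delay")
    if loss_pct > limits.loss_high_pct:
        alarms.append("loss")
    return alarms


idle = [12.0, 14.0, 11.0, 13.0]
saturated = [480.0, 650.0, None, 720.0, 590.0]

strict = Thresholds(latency_high_ms=500.0, loss_high_pct=20.0)
tolerant = Thresholds(latency_high_ms=2000.0, loss_high_pct=50.0)

print(gateway_alarms(idle, strict))         # no alarm on an idle link
print(gateway_alarms(saturated, strict))    # latency alarm under load
print(gateway_alarms(saturated, tolerant))  # same samples, higher limits
```

With the same saturated samples, the strict limits raise an alarm and the tolerant ones do not, which is the whole point of raising the gateway parameters.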
System->Gateways - edit WAN gateway/s, click Advanced and make the ping and packet loss parameters higher. Then do a few parallel downloads again and have a ping going on a client also - see how the ping time goes up, make sure the gateway status on the dashboard shows similar times, adjust those gateway parameters.
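One rough way to turn those observations into numbers, again only as a sketch: note the ping times you see while the parallel downloads are running and add some headroom on top of the worst case. The function name `suggest_thresholds`, the 1.5 headroom factor, and the sample values below are assumptions for illustration, not pfSense recommendations.

```python
import math
from typing import Dict, List, Optional


def suggest_thresholds(rtts_ms: List[Optional[float]], headroom: float = 1.5) -> Dict[str, int]:
    """Suggest latency and loss limits from pings measured under load.

    rtts_ms holds round-trip times in milliseconds, with None for lost pings.
    The worst observed values are multiplied by an arbitrary headroom factor.
    """
    answered = [r for r in rtts_ms if r is not None]
    if not answered:
        raise ValueError("no ping was answered, cannot suggest thresholds")
    worst_ms = max(answered)
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    return {
        "latency_high_ms": math.ceil(worst_ms * headroom),
        # Packet loss cannot exceed 100 percent.
        "loss_high_pct": min(100, math.ceil(loss_pct * headroom)),
    }


# Hypothetical ping times collected on a client during parallel downloads.
under_load = [35.0, 120.0, 480.0, None, 650.0]
suggestion = suggest_thresholds(under_load)
print(suggestion)
```

Treat the result as a starting point and compare it with what the gateway status on the dashboard shows before settling on values.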
I always make mine a lot higher than any reasonable person would like - my organisation has offices in remote places with internet you would only dream of in a nightmare - because I don't want pfSense to failover until it really is desperation.
-
Sadly, I can confirm misconfiguration. My fault.
I don't know why or when, and I don't remember having done this, but my apinger configuration was set up way too sensitive. Definitely not the default options.
Must definitely be my fault. Sorry for that!