dpinger shows 100% loss after gateway recovers
-
Hi.
I ran into a problem with gateway monitoring using pfSense 2.5.0 on an x86 machine.
I have multiple WAN interfaces, one of which is a hardware router with NAT (there are reasons for that). Let's call this gateway GW1.
In pfSense, I have a static IP interface for this link. Using this router's IP for monitoring doesn't make much sense because it's always up, while the internet link behind it can go down. Because of that, I set the GW1 monitoring IP to 8.8.8.8 (I also tried 1.1.1.1 and some other well-known addresses; they give the same result).
After a pfSense reboot, everything works fine: the GW1 status is Online with 0% packet loss. If I disconnect the internet link behind the GW1 router, the packet loss climbs and eventually the GW1 status becomes "Offline". So far so good.
The problem is, after I restore the GW1 router internet link, the gateway status does not change back to Online and the packet loss number remains 100% (I waited at least 10 minutes, no change observed), although ping 8.8.8.8 from this interface works just fine.
Rebooting the GW1 router doesn't change anything, so it's definitely not a problem with the router.
If I restart dpinger, the GW1 status immediately changes to Online with 0% loss, and it stays that way until the next link outage.
I took a look at the dpinger source code and found that it reuses the same socket for all requests. I'm not that familiar with BSD networking internals, but it seems to me that after the GW1 router's WAN link goes down, the socket dpinger used to ping 8.8.8.8 is no longer valid and, after the link comes back up, a new socket should be used.
The standard ping tool shows the same behavior: if I start pinging 8.8.8.8 with the GW1 link address as the source and then disconnect the GW1 router's WAN link, the ping responses stop and don't resume after I restore the link.
But if I close this ping instance and start a new one right away, it works just fine. I made a packet capture on the GW1 interface, and it shows that ICMP requests from the GW1 interface IP to 8.8.8.8 are indeed being sent, but no response packets are received until I restart dpinger or ping.
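To make the "use a new socket after the link comes back" idea concrete, here is a minimal Python sketch. It is not dpinger's code and not a confirmed diagnosis; it only illustrates the hypothesis that a prober could replace its socket after several missed replies. The class name `ReopeningProber`, the `max_misses` threshold and the `make_icmp_socket` helper are all hypothetical names introduced for this example.

```python
import socket
from typing import Callable, Optional, Tuple


def make_icmp_socket(source_ip: str, timeout: float = 1.0) -> socket.socket:
    """Hypothetical factory: a raw ICMP socket bound to the GW1 interface IP.

    Needs root. Building and validating the ICMP echo packet is left to the caller.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    sock.bind((source_ip, 0))
    sock.settimeout(timeout)
    return sock


class ReopeningProber:
    """Sends probes on one socket and replaces it after repeated misses."""

    def __init__(self, make_socket: Callable[[], socket.socket], max_misses: int = 5):
        self._make_socket = make_socket
        self._max_misses = max_misses  # assumption: 5 misses in a row means "stuck"
        self._sock = make_socket()
        self._misses = 0
        self.reopen_count = 0

    def probe(self, payload: bytes, dest: Tuple[str, int]) -> Optional[bytes]:
        """Send one probe; return the reply bytes, or None on timeout or error."""
        try:
            self._sock.sendto(payload, dest)
            reply, _addr = self._sock.recvfrom(2048)
        except OSError:  # socket.timeout is a subclass of OSError
            self._misses += 1
            if self._misses >= self._max_misses:
                self._reopen()
            return None
        self._misses = 0
        return reply

    def _reopen(self) -> None:
        """Close the current socket and start over with a fresh one."""
        self._sock.close()
        self._sock = self._make_socket()
        self._misses = 0
        self.reopen_count += 1
```

A caller would construct it as `ReopeningProber(lambda: make_icmp_socket("192.0.2.10"))`, where 192.0.2.10 is a placeholder for the GW1 interface address. Whether a fresh socket alone is what makes a restarted ping work here is still only my guess.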
I can set a cron job with a script that will check GW1 interface and restart dpinger if its status changes to Offline, but it looks pretty clumsy to me.
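For reference, the workaround I have in mind would look roughly like the Python sketch below. It rests on several assumptions I have not verified on every version: that `pfSsh.php playback gatewaystatus` prints one whitespace-separated row per gateway with the gateway name first and a status word such as "down" somewhere in the row, and that `pfSsh.php playback svc restart dpinger` restarts the service. The source address 192.0.2.10 is a placeholder for the GW1 interface IP. To avoid restarting dpinger on every run during a real outage, it restarts only when a fresh ping from the GW1 address succeeds.

```python
import subprocess

GATEWAY = "GW1"
SOURCE_IP = "192.0.2.10"  # placeholder: the GW1 interface address
MONITOR_IP = "8.8.8.8"

# ASSUMPTION: these pfSsh.php playback scripts exist and behave as described above.
STATUS_CMD = ["/usr/local/sbin/pfSsh.php", "playback", "gatewaystatus"]
RESTART_CMD = ["/usr/local/sbin/pfSsh.php", "playback", "svc", "restart", "dpinger"]
# FreeBSD ping: -S source address, -c packet count, -t overall timeout in seconds.
PING_CMD = ["/sbin/ping", "-S", SOURCE_IP, "-c", "1", "-t", "2", MONITOR_IP]

# ASSUMPTION: tokens that mark a gateway row as down in the status output.
DOWN_TOKENS = {"down", "offline", "100%"}


def gateway_is_down(status_text: str, gateway: str) -> bool:
    """Return True if the row for `gateway` contains one of the down tokens."""
    for line in status_text.splitlines():
        fields = line.split()
        if fields and fields[0] == gateway:
            return any(field.lower() in DOWN_TOKENS for field in fields[1:])
    return False  # gateway not listed: do nothing


def main() -> None:
    status = subprocess.run(STATUS_CMD, capture_output=True, text=True).stdout
    if not gateway_is_down(status, GATEWAY):
        return
    # A new ping process works even while dpinger is stuck, so use it as the test.
    fresh_ping = subprocess.run(
        PING_CMD, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    if fresh_ping.returncode == 0:
        subprocess.run(RESTART_CMD, check=False)
```

The script would call `main()` at the end and be scheduled from cron every minute or so. The parsing is deliberately loose, so it should be checked against the actual status output before relying on it.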
I'd like to know whether anyone else has this problem and whether there is a better way to work around it.
-
@dbykov If you view the gateways page, does it recover? I had that issue on prior versions and haven't tried upgrading that client to 2.5 yet. However, per the last post I made in the earlier thread, it looked like it was fixed in 2.5.
In that thread I ended up installing the cron package to check the gateway status every so often so it would self-recover.
-
@steveits said in dpinger shows 100% loss after gateway recovers:
If you view the gateways page does it recover?
No, the Status -> Gateways page shows 100% loss.
As I said, if I run dpinger manually in a shell, it shows the same behavior: the output still shows 100% loss even 10 minutes after the physical link recovers, but if I restart dpinger, it shows 0% loss as it should.