dpinger stops (crashes?) after update to 2.6.0
-
@stephenw10 The route works just fine even when dpinger stops operating. No problems there. The gateway works as well. Strange. Never happened before in 200+ non-stop up days. It started after the upgrade to 2.6.0 and now happens many times every day.
But it has zero impact on my network. I only notice it when I log in to pfSense once a day and see that dpinger is "off" in the status widget.
I have now added dpinger to the Service Watchdog to get notifications when it's force-started again. I have also changed the default gateway from Auto to the WAN interface; maybe 2.6.0 wants this to be set manually now. I only have one WAN interface so it doesn't really matter.
I also found this https://forum.netgate.com/topic/169990/failover-on-pfsense-2-6
Might be connected?
-
Not enough info on that other ticket, yet, to know if it's related.
I assume you are also using 9.9.9.9 as your monitoring ping target?
Does it change if you allow it to monitor the gateway directly?
If you only have one gateway it should always be the default. Nothing has changed there in 2.6.
Setting it to WAN specifically does not hurt though.
Are there really no log entries when it stops other than the watchdog?
Steve
-
@stephenw10 The default gateway can be set to auto or a specific one in the drop down menu. For the last 2 years it was set to auto and worked fine. I have now changed it from auto to the WAN interface and dpinger has been fine.
And yes it was pinging 9.9.9.9 for the last 2 years.
Also there has never been an interruption of internet access when dpinger stopped.
Not sure what happened but setting the default gateway manually seems to have fixed it. Even without the watchdog.
Oh, and no, there were no other log entries than the ones mentioned.
Thanks for chiming in!
Klaus.
-
Sorry, celebrated too early. dpinger still stops working and I see these gateway logs. I don't quite understand why the pinger stops working when the gateway can't be pinged. Shouldn't it keep on trying? (Regardless of the state of the gateway, I mean?)
I'll activate Service Watchdog now.
-
Hmm, those logs are both dpinger starting. Are there no errors in the system log in between those showing why it stopped?
-
@stephenw10 only this
(this was just when it happened the next time)
Needless to say during that event Internet didn't go down for me.
-
Hmm, none of that is an error. It just stops silently....
Is that with 'Gateway Monitoring Action' disabled? I wonder if it's trying to do something and failing.
-
Just to see what happens I have now set the monitor address to the Fibre modem (the actual gateway) instead of Quad9.
-
Ok, that's a reasonable test.
If you only have one WAN disabling the monitoring action is also a good test.
-
Ok, so after a couple of days I can see that dpinger stops (and is restarted by watchdog) right after pfblocker has updated at 3AM which also leads to a restart of unbound. So somehow dpinger doesn't seem to like that and decides to stop.
All the same cron settings since before the update to 2.6.0 though.
Interestingly it doesn't happen at all on my second installation (21.6) SG1100 also running pfblocker. For some reason on that system dpinger keeps on trucking.
-
Do you have DNSBL running on both? That's what would restart Unbound.
No idea why restarting Unbound would cause an issue for dpinger though.
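One way to check whether the stops really line up with the Unbound restarts is to run a copy of the system log through a small script. This is only a rough sketch: it assumes BSD-syslog style lines (`Mon DD HH:MM:SS host process[pid]: message`), the function names are made up for illustration, and the sample lines are placeholders rather than real log output.

```python
import re
from datetime import datetime, timedelta

# ASSUMPTION: BSD-syslog style lines, e.g. "Mar  1 03:00:14 host proc[123]: text".
LINE_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"(?P<proc>[^\s\[:]+)(?:\[\d+\])?:\s*(?P<msg>.*)$"
)


def parse_events(lines, year):
    """Turn log lines into (timestamp, process, message) tuples; skip other lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        # Collapse the space-padded day ("Mar  1") before parsing.
        stamp_text = " ".join(m.group("stamp").split())
        stamp = datetime.strptime(f"{year} {stamp_text}", "%Y %b %d %H:%M:%S")
        events.append((stamp, m.group("proc"), m.group("msg")))
    return events


def followers(events, trigger_proc, target_proc, window_s=120):
    """Return (trigger, target) pairs where a target_proc line is logged
    within window_s seconds after a trigger_proc line."""
    window = timedelta(seconds=window_s)
    pairs = []
    for trig in events:
        if trig[1] != trigger_proc:
            continue
        for cand in events:
            if cand[1] == target_proc and timedelta(0) <= cand[0] - trig[0] <= window:
                pairs.append((trig, cand))
    return pairs


# Made-up sample lines; the message text is a placeholder, not real output.
SAMPLE = [
    "Mar  1 03:00:14 fw unbound[111]: (placeholder message)",
    "Mar  1 03:00:20 fw dpinger[222]: (placeholder message)",
    "Mar  1 09:30:00 fw dpinger[333]: (placeholder message)",
]

for trig, cand in followers(parse_events(SAMPLE, 2022), "unbound", "dpinger"):
    print(trig[0], "->", cand[0], cand[2])
```

If every dpinger restart shows up within a minute or two of an unbound line, that would at least confirm the timing. It would not say anything about the cause.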
-
@stephenw10 Yes more or less same setup on both appliances. No crashes on the SG1100.
Another thing. It seems like the notification page does not store the email password so I am not getting any notifications anymore. When I paste the password again and trigger a test email it will go through. However after hitting save and coming back to that page the PW has been lost.
This also has worked just fine before the update. And: The latter also only happens on the 2.6.0 installation, not on the SG1100.
After sending the test email I can see the PW field on that settings page fall back to some old PW it has stored, judging by the number of dots in that field. When I hit send test email again it fails because of the wrong PW. So it just doesn't seem to store that setting.
-
Ok sorry, the email thing was user error. It's a bit confusing. The email pw gets used AFTER saving. So that works now. And 5 minutes later I get a message that dpinger has been relaunched by watchdog.
-
Hmm, well anything logged to show why Unbound stops? In the system or resolver logs?
-
I get the exact same behavior. dpinger mostly stops at night, at 16 minutes past midnight, sometimes at 5 minutes past midnight, and on rare occasions during the day. Most of the time now the error is sendto error 50; sometimes there are latency alarm and clear latency messages. I added this to the System Tunables: kern.ipc.maxsockbuf (Maximum socket buffer size) = 1000000
That seems to take care of the sendto error 65.
-
@stephenw10 So I tried to reproduce the dpinger thing and simply pulled the cable going to the WAN port of pfSense (instead of waiting for the next time the connection goes down) and put it back in. A second later I received a notification email that dpinger was relaunched by the watchdog. I did it 3-4 times again. dpinger stops every single time.
This was with Gateway Monitoring Action turned off, btw.
On the same ticket I found that the patch cable going into pfsense has a bad connection when moved so the ping dropping out might have been caused by the cable being just on the edge of failing and then reconnecting.
However, this doesn't explain why dpinger decides to stop doing its thing when that happens. And this definitely didn't happen before the upgrade.
Hm...
-
Mmm, I would expect dpinger to stop if the WAN loses link and that's the only gateway. But it should start again when you reconnect it. Without needing the watchdog package.
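To see whether it comes back on its own, and how long that takes, you could poll for the process while pulling the cable. A minimal sketch, assuming Python is available on the firewall and that `pgrep` is in the path; `is_running` and `watch` are made-up helper names. The watchdog would need to leave dpinger alone during the test, otherwise its restart masks the result.

```python
import subprocess
import time


def is_running(name, run=subprocess.run):
    """True if a process with exactly this name exists.
    pgrep exits 0 when at least one process matches; -x requires an exact match."""
    result = run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(name, checks, interval_s=1.0, probe=is_running, sleep=time.sleep):
    """Poll `checks` times and return the state changes as (check_index, running)."""
    changes = []
    last = None
    for i in range(checks):
        state = probe(name)
        if state != last:
            changes.append((i, state))
            last = state
        if i < checks - 1:
            sleep(interval_s)
    return changes


# Example: poll once a second for two minutes while unplugging and replugging WAN.
# for index, running in watch("dpinger", checks=120):
#     print(index, "running" if running else "stopped")
```

A result that ends in a "stopped" entry with no later "running" entry would mean dpinger did not return by itself within the polling period.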
Steve
-
@stephenw10 Right, that makes sense. So maybe that's a new bug then? The fact that dpinger doesn't come back to life after a disconnect? (unlike the previous version of pfsense I mean)
-
Mmm, it could be. It feels like a timing issue. Not sure what might have changed there though.
I expect to see something logged with it trying to start and failing....