pfSense GUI unresponsive when WAN drops
-
If my WAN connection drops, the pfSense GUI stops responding. Always. It doesn't happen immediately. Over the course of a few minutes (maybe 10-15 not 100% sure) the GUI becomes increasingly sluggish until finally it just stops working with 504 Gateway Time-out. The router itself appears to work fine, only the GUI fails.
If I log in to the console I can reboot the box. Then the GUI works for another 10-15 minutes until I get 504 Gateway Time-out again.
In the log corresponding to this problem, I always find a message like this:
Jun 25 10:44:05 nginx 2021/06/25 10:44:05 [error] 32510#100113: *399 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 192.168.0.194, server: , request: "GET /status_services.php HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "192.168.0.1", referrer: "https://192.168.0.1/diag_system_activity.php"
If I "Restart PHP-FPM" from the console menu, the GUI comes back to life for another 10-15 minutes. It seems something in PHP is doing something that requires a WAN connection. My first thought was DNS but fiddling with those settings (and disabling DNS entirely) didn't change anything.
This is a 100% reproducible problem. If my WAN link goes down due to an ISP problem, or if I just unplug the modem, this problem happens.
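The nginx line quoted above says that nginx gave up waiting for a response header from PHP-FPM on its unix socket. To see whether the timeouts hit one page or many, the log lines can be picked apart with a short script. This is only a sketch: the regular expression is written against the single sample line in this post, and the function name is made up for illustration.

```python
import re

# Pattern written against the one sample line in this thread (an assumption:
# other nginx versions or log formats may lay the fields out differently).
UPSTREAM_TIMEOUT = re.compile(
    r'upstream timed out .*?'
    r'client: (?P<client>[^,]+), '
    r'server: (?P<server>[^,]*), '
    r'request: "(?P<method>\S+) (?P<path>\S+) [^"]*", '
    r'upstream: "(?P<upstream>[^"]+)"'
)


def parse_upstream_timeout(line):
    """Return the fields of an nginx upstream-timeout line as a dict, else None."""
    match = UPSTREAM_TIMEOUT.search(line)
    return match.groupdict() if match else None


# The log line from this post, split across string literals for readability.
sample = (
    'Jun 25 10:44:05 nginx 2021/06/25 10:44:05 [error] 32510#100113: *399 '
    'upstream timed out (60: Operation timed out) while reading response '
    'header from upstream, client: 192.168.0.194, server: , '
    'request: "GET /status_services.php HTTP/2.0", '
    'upstream: "fastcgi://unix:/var/run/php-fpm.socket", '
    'host: "192.168.0.1", '
    'referrer: "https://192.168.0.1/diag_system_activity.php"'
)

fields = parse_upstream_timeout(sample)
print(fields["path"], "via", fields["upstream"])
```

Running the same function over every line of the log and counting the `path` values would show whether a single page or the whole GUI is timing out.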
My pfSense config is fairly vanilla. Here's what I have tried:
- Switching from DNS resolver to forwarder
- Fiddling with various DHCP/DNS settings
- Disabling DNS altogether
- Disabling IPv6
I don't know what else to try. I don't really have anything else configured beyond defaults that I can identify.
Any ideas? I am always running the latest 2.x.y-RELEASE.
-
@flarednostril said in pfSense GUI unresponsive when WAN drops:
This is a 100% reproducible problem.
You're right!
And it's a well-known one; many forum messages talk about it.
Some of them even have replies that explain what happens.
In short:
The "pfSense dashboard" page shows a lot of info, and some of it has to be fetched from places that are not local, but somewhere on the Internet.
One example: the list of installed packages is known locally, but information about any available updates resides on the Netgate / pfSense servers.
So, when you visit a page like the pfSense dashboard with your browser, the web server built into pfSense can't collect that info. It stalls because the remote info isn't available.
It tries to get info from https://somewhere.netgate.com/..... so, first, deep down in the system, "netgate.com" has to be resolved to an IP address. Your local DNS gets queried, and then ... it stalls: no answer. It will time out after xxx seconds.
@flarednostril said in pfSense GUI unresponsive when WAN drops:
Switching from DNS resolver to forwarder
Why? But OK, in this case your private DNS queries aren't sent to some company this time; the WAN is down anyway.
The issue stays the same: no WAN, so some parts of the system won't work.
@flarednostril said in pfSense GUI unresponsive when WAN drops:
Fiddling with various DHCP/DNS settings
No need to do so.
@flarednostril said in pfSense GUI unresponsive when WAN drops:
Disabling DNS altogether
That creates a situation where pfSense itself can't collect info, upgrade itself, etc.
I strongly advise you not to do this.
@flarednostril said in pfSense GUI unresponsive when WAN drops:
Disabling IPv6
The hosts pfSense consults do not 'expose' IPv6 (yet!), so it's IPv4 all the way.
IMHO: this will change in the future ;)
The real solution would be: re-establish your WAN, and your "connected device" will be happy again.
If WAN fails, or isn't set up correctly, the GUI will (could) be somewhat (very) slow.
I advise you also to look for other threads discussing the same subject. You will most probably find some more advice and tricks.
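To illustrate the DNS stall described in this reply: a lookup that gets no answer holds up whoever called it until it times out, unless the caller sets its own deadline. The sketch below is not how pfSense's own code works; it only shows the effect with Python's standard library. The function name, the default deadline, and the replaceable `resolver` argument are all assumptions made for the example.

```python
import socket
import threading
import time


def resolve_with_deadline(hostname, deadline_s=3.0, resolver=socket.gethostbyname):
    """Resolve hostname in a helper thread.

    Returns (address_or_None, elapsed_seconds). The address is None when the
    lookup failed or had not finished by the deadline.
    """
    result = {}

    def worker():
        try:
            result["address"] = resolver(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            result["error"] = exc

    start = time.monotonic()
    # Daemon thread: a lookup that never returns will not keep the program alive.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(deadline_s)  # stop waiting once the deadline has passed
    elapsed = time.monotonic() - start
    return result.get("address"), elapsed
```

Called with a real host name while the WAN is unplugged, this returns `None` after at most `deadline_s` seconds. A plain blocking lookup would instead wait for the resolver's own timeout.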
-
Hi .. ok. Thanks for the reply. But I don't think you've understood the issue. If my WAN link goes down.. pfSense GUI becomes unresponsive. It shouldn't.
-
@flarednostril said in pfSense GUI unresponsive when WAN drops:
Hi .. ok. Thanks for the reply. But I don't think you've understood the issue. If my WAN link goes down.. pfSense GUI becomes unresponsive. It shouldn't.
What packages do you have installed? When the WAN is down, that triggers some internal scripts within pfSense that do things like "restart all packages" for one. That can be quite resource and CPU intensive for some packages with certain configurations. One example would be pfBlockerNG-devel using the DNSBL feature with very large lists of blacklisted domains.
And as explained earlier, if attempting to access the Dashboard page, that will be very slow due to the failure of DNS lookups in particular.
-
@bmeeks I have the "nut" package installed for my UPS. Nothing else.
It's not like what you are suggesting. It doesn't get busy for a while then come good. CPU usage is always close to 0. The GUI just gets slower progressively until it stops working altogether and never comes back.
-
That is indeed strange behavior then. I would expect it to be sporadically slow, especially when dpinger (if enabled) goes into its "alarm" mode and restarts things in an attempt to recover connectivity. But to eventually spin down to a complete hang is strange.
That particular URL requested in your posted log snippet is strange as well. Were you actively trying to view that page, or do you have some kind of automated logging set up? What device was using the 192.168.0.194 IP address? Was it your workstation?
-
Wasn't this setting a workaround for the "GUI issue" if WAN is down?
One of the things the GUI does (eagerly) is check whether there are any updates.
And you need a working WAN for that.
This one disables that check. I suppose you might lose the automatic update check if you disable it.
Edit: I have not seen a real timeout here on WAN down, just a super slow GUI response.
/Bingo
-
@bmeeks Yes, I was actively trying to view that page. I was just randomly clicking on things, watching the GUI get slower and slower until I got the 504 Gateway Time-out. It's nothing to do with that particular page; the whole GUI is affected.
192.168.0.194 was Firefox on my Mac.
-
If you haven't already done so, enable SSH access.
Or use the console.
Menu option 8.
Optional, as it looks better/nicer:
pkg install htop
'top' is installed by default.
Now use top or htop.
Have it sorted on CPU utilisation (in htop, hit F6 and select CPU percentage).
Now, make the WAN go down. Look at what processes are using the CPU - these are at the top.
Btw: when your uplink connection goes down, is it the pfSense WAN interface that goes down, or does the link upstream go bad while the pfSense WAN interface stays up?
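If watching top or htop by hand is awkward, the same "sort by CPU" view can be captured as text and kept for later. The sketch below assumes a Python 3 interpreter is available wherever it runs, which is not guaranteed on a stock pfSense install, and it assumes `ps -axo pid,pcpu,comm` prints a header row followed by one process per line. The function names are made up for illustration.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -axo pid,pcpu,comm` text into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3:
            pid, cpu, command = parts
            rows.append((int(pid), float(cpu), command.strip()))
    return rows


def top_by_cpu(rows, limit=10):
    """Return the `limit` rows with the highest CPU percentage, busiest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]


def snapshot(limit=10):
    """Run ps once and return the busiest processes at this moment."""
    output = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return top_by_cpu(parse_ps(output), limit)
```

Calling `snapshot()` once before unplugging the WAN and again when the GUI has gone slow gives two lists that can be compared side by side.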
-
@bingo600 Ok I'll test with auto-update check disabled. Thanks for the suggestion.
-
@gertjan thanks but as I said above it's not an issue of CPU utilisation and the box being busy. CPU utilisation is always close to 0%. There is no process causing any problem like that. The GUI simply stops responding and I have to "Restart PHP-FPM". Then it works again for a few minutes before it gives me a 504 Gateway Time-out again.
-
It's not the CPU utilisation, but more what processes are running at that moment. These will rise to the top of the list.
It's time to find out if it's related to the web server (nginx) or the PHP engine.
What does the System log mention when the issue happens?
And the DNS log?
Anything special in the DHCP log?
Btw: I'm also using the NUT package.
-
@gertjan Ok thanks I'll check what htop shows. I'm pretty sure there is nothing in dns/dhcp/system logs of interest apart from the "fastcgi://unix:/var/run/php-fpm.socket" error above.
-
What you could try is this:
Back up your config.
Go to the console or SSH access and use option 4: 'reset to default'.
Don't change any settings. If needed, make WAN work. WAN is set up to be a DHCP client, so pfSense could work out of the box.
Do not change anything else.
(OK to change the password.)
Now, test.
Same behaviour?
Then go back to the setup with issues: import your backed-up config again.
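To compare the default config with the restored one on more than gut feeling, the GUI's response time can be sampled from a workstation while the WAN is unplugged. This is a rough sketch with several assumptions: the GUI is at https://192.168.0.1/ as in the log earlier in the thread, it uses a self-signed certificate, and the page requested is served through PHP-FPM. The function names, the 90-second timeout and the polling interval are arbitrary choices for the example.

```python
import http.client
import ssl
import time
import urllib.error
import urllib.request


def timed_fetch(url, timeout_s=90.0):
    """Fetch url once; return (status_code_or_error_text, elapsed_seconds)."""
    # Certificate checks are switched off because the GUI is assumed to use a
    # self-signed certificate. Only do this on a LAN you trust.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s, context=context) as resp:
            resp.read()
            outcome = str(resp.status)
    except urllib.error.HTTPError as exc:
        outcome = str(exc.code)  # e.g. "504" once nginx gives up on PHP-FPM
    except (OSError, http.client.HTTPException) as exc:
        outcome = f"error: {exc}"  # refused connection, timeout, bad URL, ...
    return outcome, time.monotonic() - start


def poll(url, interval_s=30.0, count=40):
    """Print one timing sample per interval so a slow-down shows up as a trend."""
    for _ in range(count):
        outcome, elapsed = timed_fetch(url)
        print(f"{time.strftime('%H:%M:%S')}  {outcome}  {elapsed:.2f}s")
        time.sleep(interval_s)
```

Something like `poll("https://192.168.0.1/")` started right after pulling the WAN cable should show the response time creeping up and eventually a 504 if the problem is present, and a flat line if it is not.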
-
@gertjan Yep reset to default. Then try disconnecting WAN. I suppose I will have to try that. Just surprised I'm the only one apparently having this problem. I will try reset to default and see if I get the same behaviour.
-
@flarednostril said in pfSense GUI unresponsive when WAN drops:
@gertjan Yep reset to default. Then try disconnecting WAN. I suppose I will have to try that. Just surprised I'm the only one apparently having this problem. I will try reset to default and see if I get the same behaviour.
We all agree the GUI may get a bit sluggish with no WAN connection, especially so if the Dashboard "home page" is being viewed and "Check for Updates" is checked. But I personally have never seen the pfSense GUI just basically slowly die as you describe.