504 Gateway timeout and full network loss periodically
-
At some point since Thursday (we had a long weekend here in the UK for public holidays) the system has failed again. SSH is working, so I've managed to grab the output from top -aSH:
last pid: 18852; load averages: 0.00, 0.00, 0.00 up 11+18:22:31 08:32:33
54 processes: 1 running, 53 sleeping
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.9% idle
Mem: 24M Active, 376M Inact, 655M Wired, 6360M Free
ARC: 138M Total, 22M MFU, 111M MRU, 16K Anon, 779K Header, 4804K Other
102M Compressed, 253M Uncompressed, 2.47:1 Ratio
Swap: 1024M Total, 1024M Free
I couldn't immediately check the graphs, as the "504 Gateway Time-out" error on login prevented me from accessing them. I've since run /etc/rc.php-fpm_restart and /etc/rc.restart_webgui, and now cannot even get to the login page...
-
Interestingly, if I try to log in from a private window or another browser, I do see a syslog message in my SSH session saying the login was successful, but the 504 Gateway Time-out still occurs:
Message from syslogd ... <32>1 2024-04-02T08:53:24.974210+01:00 PFSENSEBOX php-fpm 16785 - - /index.php: Successful login for user 'euant' from: IP_ADDRESS (Local Database)
So php-fpm is at least kind of working up to that point.
-
@euantorano I've been running into the same issue for the past couple of months. In late 2023 I migrated from a virtualized setup in Hyper-V that had run without issue for a couple of years. I'm now on a Protectli FW4C on CE 2.7.2. I have a pretty small home setup with 1 WAN, 1 LAN, and another port I'm using with two VLANs. OpenVPN is running on UDP, plus a TCP instance behind HAProxy (for connections from work and other locations that block UDP).
I can't find anything meaningful in System Logs under General, Gateways, DHCP, DNS Resolver, etc. There are some alarms periodically through the day, but not around the outage:
send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr ***** bind_addr ***** identifier "WAN_DHCP "
Some things I've tried:
- Place a switch between modem and protectli/pfsense
- Swapped all cables
- TSO to disabled/0
- Disabled Gateway Monitoring Action
- System -> Advanced -> Networking - KEA DHCP
- System -> Advanced -> Miscellaneous - Memory Limit - 1024
- System -> Routing -> Gateways - Monitor IP - 1.1.1.1
- Default Gateways to WAN_DHCP (from automatic)
- DHCP Client Configuration to FreeBSD Default
- Reject Leases from 192.168.100.1 (modem)
- Lease Requirements and Requests : Options modifiers - supersede dhcp-server-identifier 255.255.255.255
- Interfaces -> * -> Speed and Duplex set explicitly 1000baseT full-duplex (and 2500 because of 2.5G intel ports)
I've done a backup restore, and I'm trying everything I can to avoid a full fresh install while I work on other projects, but the inability to restart pfSense or fix the issue while I'm away from home is breaking me. Have you had any luck?
-
@LaFlamaBlanca That sounds extremely familiar. I too have tried similar steps, including putting a switch between the modem and pfSense and swapping out cables.
Unfortunately I've not had any luck yet, but at the moment it looks like the frequency of it happening has reduced slightly since I turned on some of the hardware offloading settings.
-
Well everything had been fairly smooth sailing since my previous post in April 2024 until I applied the 25.07-RELEASE update and I'm now back to where I was previously, with sporadic lock-ups happening somewhere between every couple of days and once a week.
No other settings were changed, except to move to the new DHCP server (Kea).
Any ideas on how to troubleshoot what's happening? There are no obvious logs to report what the problem is.
-
The system has fallen over again overnight. The login screen loads fine, but as soon as you try to submit the login, the request takes quite a long time then you eventually get this screen:
Clicking on the link to the Crash Reporter is pretty useless, as it only contains the following:
Crash report begins. Anonymous machine information:
amd64
15.0-CURRENT
FreeBSD 15.0-CURRENT #0 plus-RELENG_25_07_1-n256513-49844af35a5d: Fri Aug 15 19:21:04 UTC 2025 root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-25_07_1-main/obj/amd64/DZizCvOj/var/jenkins/workspace/pfSense-Plus-snapshots-25_07_1-main/sources
Crash report details:
No PHP errors found.
No FreeBSD crash data found.
And I can't access any other pages within the system to check logs or diagnostics etc. until we perform a shutdown and start the system back up again.
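Since SSH still works when the GUI does not, one way to gather evidence is to record command output with a deadline, so that a hanging command leaves a trace instead of blocking the session. The following is only a sketch: the function name timed_snapshot, the log path and the deadline values are made up for illustration, and it assumes timeout(1) is available on the box.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense): run a command with a deadline and
# append a timestamped record to a log file.
# The default log path below is an assumption; override it by setting LOG.
LOG="${LOG:-/tmp/lockup-snapshots.log}"

timed_snapshot() {
    deadline="$1"
    shift
    start=$(date +%s)
    # timeout(1) exits with status 124 when the deadline is reached.
    output=$(timeout "$deadline" "$@" 2>&1)
    status=$?
    end=$(date +%s)
    {
        printf '=== %s cmd="%s" status=%s elapsed=%ss\n' \
            "$(date '+%Y-%m-%dT%H:%M:%S')" "$*" "$status" "$((end - start))"
        printf '%s\n' "$output"
    } >> "$LOG"
    return "$status"
}
```

For example, `timed_snapshot 10 ifconfig -a` or `timed_snapshot 10 netstat -m` could be run by hand over SSH or from cron; a status of 124 in the log would mean the command did not finish within 10 seconds.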
-
Interestingly, I've just SSHed into the system, and running the ifconfig command hangs in the sbwait state for a very long time:
96993 root 1 68 0 15M 3504K sbwait 10 0:00 0.00% ifconfig
91059 root 1 68 0 15M 3520K sbwait 6 0:00 0.00% ifconfig
38972 root 1 68 0 15M 3516K sbwait 8 0:00 0.00% ifconfig
I found a topic on the FreeBSD community discussing this, and I wonder if this may be related: https://forums.freebsd.org/threads/ifconfig-needs-16s-mainly-sbwait-on-freebsd-14-2-p1.97931/
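To pick out stuck processes from a longer listing, a small filter like the one below may help. It is a sketch that assumes the input uses the same column layout as the process lines quoted above (state in the 8th column, command in the 12th); the function name procs_in_state is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read top-style process lines on stdin and print the PID
# and command of every process whose STATE column equals the given wait state.
# Assumes the column order PID USER THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND.
procs_in_state() {
    awk -v want="$1" '$8 == want { print $1, $12 }'
}
```

Saving the process listing to a file and running `procs_in_state sbwait < saved-top-output.txt` (the file name is just an example) would list only the processes waiting in sbwait.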
-
uname -r
will tell you what you already knew:
So, beware: (your) pfSense doesn't use FreeBSD 14.2.
About ifconfig: I've never seen the sbwait issue, but it isn't unknown. Look here, read some of the posts, and check what matches what you see.
I tend to think it's an interface issue that somehow impacts your PHP daemon, which uses a socket for the communication between nginx (the web server) and the PHP interpreter, hence the 50x error.
-
@Gertjan Yes, I'm suspecting an interface or driver issue. It's interesting that I can SSH in (from a remote location, over a WireGuard connection), but devices connected directly to the LAN from the pfSense install cannot reach out and PHP/nginx seem to have issues.
This box has a 2.5GbE NIC, and a separate Intel PCI card with 4 network interfaces. I've included some of the output from pciconf -lv below:
igb0@pci0:1:0:0: class=0x020000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x1521 subvendor=0x8086 subdevice=0x0001
    vendor = 'Intel Corporation'
    device = 'I350 Gigabit Network Connection'
    class = network
    subclass = ethernet
igb1@pci0:1:0:1: class=0x020000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x1521 subvendor=0x8086 subdevice=0x0001
    vendor = 'Intel Corporation'
    device = 'I350 Gigabit Network Connection'
    class = network
    subclass = ethernet
igb2@pci0:1:0:2: class=0x020000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x1521 subvendor=0x8086 subdevice=0x0001
    vendor = 'Intel Corporation'
    device = 'I350 Gigabit Network Connection'
    class = network
    subclass = ethernet
igb3@pci0:1:0:3: class=0x020000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x1521 subvendor=0x8086 subdevice=0x0001
    vendor = 'Intel Corporation'
    device = 'I350 Gigabit Network Connection'
    class = network
    subclass = ethernet
none6@pci0:88:0:0: class=0x028000 rev=0x1a hdr=0x00 vendor=0x8086 device=0x2725 subvendor=0x8086 subdevice=0x0024
    vendor = 'Intel Corporation'
    device = 'Wi-Fi 6E(802.11ax) AX210/AX1675* 2x2 [Typhoon Peak]'
    class = network
igc0@pci0:89:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f2 subvendor=0x8086 subdevice=0x3019
    vendor = 'Intel Corporation'
    device = 'Ethernet Controller I225-LM'
    class = network
    subclass = ethernet
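In pciconf output, a device name starting with "none" (such as none6 for the Wi-Fi card here) means no driver has attached to that device. A quick way to list such devices is sketched below; the function name unattached_devices is made up for illustration, and it assumes the input has each device selector at the start of its own line, as pciconf -l prints it.

```shell
#!/bin/sh
# Hypothetical helper: read pciconf -l / pciconf -lv output on stdin and print
# the selector of every device that has no driver attached (name "noneN").
unattached_devices() {
    awk '/^none[0-9]+@/ { sub(/:$/, "", $1); print $1 }'
}
```

Running `pciconf -l | unattached_devices` on this box would be expected to print the Wi-Fi card's selector only, given the listing above.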
-
-
@Gertjan Yeah, the Wi-Fi isn't assigned at all. I'll try disabling it and see if it has any effect.