Netgate 4200 freeze and a possible fix
-
Hi,
Setup:
- Netgate 4200
- pfSense version 25.11.1
- additional packages: pfBlockerNG-devel net 3.2.14
Fairly simple setup: no VLANs, a few NAT rules, fewer than 100 clients with mostly static IP mappings on 2 LAN interfaces, and a WireGuard interface.
I experienced a freeze on the box; all clients lost connectivity to WAN, firewall GUI responded but was very slow to use, rebooting from GUI did not work.
Investigation of available logs did not show any problems in ISP service (checked with ISP as well). Rebooting using the hardware switch resolved the issue and service was restored. Nothing unusual in the logs after reboot either.
After searching for information I am leaning towards the issue being caused by the Intel NIC and a possible fix by disabling hardware offloading.
As far as I understand, disabling should not reduce performance in any meaningful way, so might as well do so - especially if this freezing occurs again.
Opinions on this? Should I just disable the hardware offloading or wait and see if this happens again? Uptime for the new version 25.11.1 was roughly 10 days before the freeze.
-
I have a 4200 using the same version and PF package. Small home office setup. I have not experienced any of the freezing you describe, but what does come to mind is cabling. I would try a different set of patch cables. Just recently, the PS5 in the house started having connectivity issues out of the blue. After checking all the basics I changed the patch cable, and the PS5 connectivity problem went away. I haven't had a problem since changing cables. Why it started happening I have no idea; the cable doesn't move. Makes no sense. Maybe dust. But I would start with cabling first.
-
@belajasmert said in Netgate 4200 freeze and a possible fix:
Investigation of available logs did not show any problems in ISP service
What was being logged at that time?
Only checksum offloading is enabled by default. That shouldn't cause any issues, but disabling it is unlikely to reduce performance, as you say.
-
@Uglybrian Cabling should be fine. I've checked it several times with a cable tester, and there is only a single cable from the ISP device to the firewall and one directly from the firewall interface to my workstation.
Also, I would expect something in the logs to indicate a cable fault, but everything is running smoothly. Plus I don't feel like replacing the cable from the firewall to the ISP fibre box just in case, 'cause it runs through a wall and a ceiling, so it is not trivial to do.
I've had a single outage on the old version 24, but that was clearly an ISP-side issue based on the logs available (and the fact that there have been issues with the ISP in general in my neighbourhood during the same timeframe).
-
@stephenw10 Well, there were some issues in troubleshooting 'cause logs were partially missing - which I understand is one of the symptoms of the freeze. And dpinger had stopped.
No packet loss in WAN, no errors in system log, no nothing. Simply a frozen box that was very sluggish in the GUI. The earlier ISP issues were quickly identified through an increase in packet loss.
-
Aaaaand I found something:
"Feb 9 08:09:40 dpinger 38845 PORT1WAN_DHCP xx.xxx.8.1: sendto error: 50
Feb 9 08:09:40 dpinger 38845 PORT1WAN_DHCP xx.xxx.8.1: sendto error: 50
Feb 9 08:09:39 dpinger 38845 PORT1WAN_DHCP xx.xxx.8.1: sendto error: 50
Feb 9 08:09:39 dpinger 38845 PORT1WAN_DHCP xx.xxx.8.1: sendto error: 50"
These would usually be linked to NIC issues, right?
-
Hmm, sendto error 50 is ENETDOWN which implies there is no connection. I would expect to see some link state changes logged though if that was happening.
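For anyone reading along, here is a rough Python sketch of how that error number can be decoded from the dpinger lines quoted above. The lookup table is a hand-copied excerpt of FreeBSD's errno numbering (pfSense runs on FreeBSD); Python's own `errno` module reflects the OS the script runs on, so it is deliberately not used here. The function name and the regex are only illustrative, not anything dpinger or pfSense provides.

```python
import re

# Excerpt of FreeBSD errno values; the numbering is different on Linux.
FREEBSD_ERRNO = {
    49: "EADDRNOTAVAIL",
    50: "ENETDOWN",
    51: "ENETUNREACH",
    55: "ENOBUFS",
    64: "EHOSTDOWN",
    65: "EHOSTUNREACH",
}

# Matches lines shaped like the ones quoted in this thread.
SENDTO_RE = re.compile(
    r"dpinger \d+ (?P<gateway>\S+) (?P<target>\S+): sendto error: (?P<code>\d+)"
)

def decode_dpinger_line(line):
    """Return (gateway name, errno number, errno name) or None if the line does not match."""
    m = SENDTO_RE.search(line)
    if m is None:
        return None
    code = int(m.group("code"))
    return m.group("gateway"), code, FREEBSD_ERRNO.get(code, "unknown")

line = "Feb 9 08:09:40 dpinger 38845 PORT1WAN_DHCP xx.xxx.8.1: sendto error: 50"
print(decode_dpinger_line(line))
```

The same lookup run on a Linux desktop with the system errno table would give a different name for 50, which is why the FreeBSD values are spelled out.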
-
@stephenw10 Unfortunately I didn't make a note of the time when I power cycled the ISP fibre box, but I did it before rebooting the firewall. Could these be caused by the ISP box shutting down?
-
Yes, if the ISP device is connected directly it will cause errors if it reboots and loses link. But it would also show specific link state change logs.
-
Looking at timestamps of events on client devices, I can see that at 00:08 and 04:16 on the 9th Feb a large number of IoT devices lost connectivity.
So, whatever happened, happened sometime after midnight but there is nothing on any logs near those times. All quiet and normal, and those sendto errors are almost 100% caused by me power cycling the ISP device.
So, I'll wait and see if this repeats and then maybe disable the only hardware offloading setting not yet disabled.
-
Hmm, hard to imagine how it could be anything on the firewall with nothing logged at all.

Do you have multiple gateways defined? Do you have the system default gateway still set to automatic?
It may be defaulting to something invalid. Though that would also be logged.
-
@stephenw10 There are only two gateways, IPv4 + v6. Default gateways are set to automatic. I think those were created automatically and I have not changed the configuration.
I'm also not using IPv6 at all, it has been disabled.
-
I guess I could do a bit of cleanup by setting the IPv6 gateway to "none" and also marking it disabled.
-
Yes, those are expected, and if those are the only ones it can't be the problem.
-
Ok, ended up doing nothing to the configuration and now experienced the freeze again - uptime roughly 17 days and 11 hours.
Before rebooting I did a lot of searching and noticed one specific issue - the DNS resolver status page would not load at all. Everything else opened up, albeit very slowly. After making sure I had screenshots and logs safely stored away, I tried to restart the DNS resolver service through the UI.
The restart never finished and other status pages stopped working as well. After roughly 10 minutes of waiting I power cycled the firewall (ACPI button shutdown first) and everything is now working beautifully again.
There are two items that might be related to the issue:
- I am using the firewall to respond to all DNS queries and there is one domain that is never resolved by the service:
"Feb 26 21:25:54 filterdns 62690 failed to resolve host xx.yyyyyy.zzzz will retry later again."
There is nothing else that I can see that is problematic in the DNS log.
- My PC is directly connected to the igc2 LAN port. This results in dpinger sig 15 restarts and other things like link changes. Again, this problem was noticed after my PC woke up from standby, so I am wondering if this might be related as well.
-
Did some digging and found these:
https://forum.netgate.com/topic/161400/unbound-stops-listening-on-interface
https://redmine.pfsense.org/issues/11547
I am starting to suspect that the LAN flapping because of the PC sleeping / power on / power off might be the origin of the problem.
These are the timestamps for LAN port flapping and correspond to the timing of the firewall freeze:
19:59:13 DOWN
19:59:19 UP
20:01:20 DOWN
20:01:22 UP
20:01:31 DOWN
20:01:34 UP
-
Neither a failing filterdns entry nor an interface changing state should really be a problem.
The NIC re-linking when your client wakes will trigger a bunch of processes, but that would only cause a temporary delay, if anything.
If either of those did cause a problem I'd expect it to log something. Do you see the Unbound service restarting in the logs when you lost access?
Do you see anything logged at the time it failed? Services failing spontaneously like that can be a sign of a failing drive. pfSense will keep running but anything that tries to read or write will fail so you end up with services slowly failing. However since logs cannot be written that is a pretty clear indication.
-
These are the unbound events:
19:59:14 — Unbound stopped
“Feb 26 19:59:14 unbound … info: service stopped (unbound 1.24.2).”
19:59:19 — Unbound started
“Feb 26 19:59:19 unbound … info: start of service (unbound 1.24.2).”
19:59:22 — Unbound stopped
“Feb 26 19:59:22 unbound … info: service stopped (unbound 1.24.2).”
19:59:22 — Unbound started
“Feb 26 19:59:22 unbound … info: start of service (unbound 1.24.2).”
These correspond with the time I started up the PC and the subsequent LAN port flapping. Other logs show only entries that are caused by the LAN flapping; nothing looks problematic, and those entries are always present when the PC starts up.
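To line the two lists up, a rough Python sketch follows. It takes the LAN link timestamps and the Unbound stop/start timestamps exactly as quoted in this thread and pairs each Unbound event with the closest link event. The 5-second window and the function names are arbitrary choices for illustration, not anything pfSense itself uses.

```python
from datetime import datetime, timedelta

# Timestamps copied from the posts above (all on Feb 26).
LINK_EVENTS = [
    ("19:59:13", "DOWN"), ("19:59:19", "UP"),
    ("20:01:20", "DOWN"), ("20:01:22", "UP"),
    ("20:01:31", "DOWN"), ("20:01:34", "UP"),
]
UNBOUND_EVENTS = [
    ("19:59:14", "stopped"), ("19:59:19", "started"),
    ("19:59:22", "stopped"), ("19:59:22", "started"),
]

def parse(hms):
    """Parse HH:MM:SS; the date part is irrelevant because only differences are used."""
    return datetime.strptime(hms, "%H:%M:%S")

def nearest_link_event(hms, window=timedelta(seconds=5)):
    """Return (time, state, delta) of the closest link event within the window, else None."""
    best = None
    for ev_time, state in LINK_EVENTS:
        delta = abs(parse(hms) - parse(ev_time))
        if delta <= window and (best is None or delta < best[2]):
            best = (ev_time, state, delta)
    return best

def unmatched_link_events(window=timedelta(seconds=5)):
    """Link events with no Unbound stop/start logged within the window."""
    return [
        (lt, state) for lt, state in LINK_EVENTS
        if not any(abs(parse(lt) - parse(ut)) <= window for ut, _ in UNBOUND_EVENTS)
    ]

for hms, action in UNBOUND_EVENTS:
    print(hms, action, "->", nearest_link_event(hms))
print("link events without an Unbound entry nearby:", unmatched_link_events())
```

With these timestamps, all four Unbound entries fall within a few seconds of the first DOWN/UP pair, while the 20:01 flaps have no Unbound entry nearby in the excerpts quoted here.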
The domain that was failing to resolve was listed in an alias. I removed it as it was unnecessary anyway, and that should clean up the DNS resolver log a bit.
The Netgate device itself is new, but I don't know if there is something wrong with this model. The first device I got was DoA. Can I use the SMART status or something else to verify that the drive is working as it should?
-
Yes on a 4200-max you can use the SMART data to see any drive errors. And I'd certainly expect to see errors there if it were failing.
-
@stephenw10 There are no errors on the log:
Logs
smartctl 7.5 2025-04-30 r5714 [FreeBSD 16.0-CURRENT amd64] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF SMART DATA SECTION ===
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
The tests (short / long) both give this when you try to run them:
"Test Results
smartctl 7.5 2025-04-30 r5714 [FreeBSD 16.0-CURRENT amd64] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org
Self-tests not supported"
And if you check the NVMe log you can see this:
"Logs
smartctl 7.5 2025-04-30 r5714 [FreeBSD 16.0-CURRENT amd64] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org
=======> INVALID ARGUMENT TO -l: nvmelog
=======> VALID ARGUMENTS ARE: error, selftest, selective, directory[,g|s], xerror[,N][,error], xselftest[,N][,selftest], background, sasphy[,reset], sataphy[,reset], scttemp[sts,hist], scttempint,N[,p], scterc[,N,M][,p|reset], devstat[,N], defects[,N], ssd, gplog,N[,RANGE], smartlog,N[,RANGE], nvmelog,N,SIZE, tapedevstat, zdevstat, envrep, farm <=======
Use smartctl -h to get a usage summary"
Looking at all the SMART data, there is only one item indicating any issue:
"Information
smartctl 7.5 2025-04-30 r5714 [FreeBSD 16.0-CURRENT amd64] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: TS128GMTE460T-SIL
Serial Number: J279400393
Firmware Version: V0804A3
PCI Vendor/Subsystem ID: 0x1d79
IEEE OUI Identifier: 0x7c3548
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 128,035,676,160 [128 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 7c3548 5264b333c9
Local Time is: Sat Feb 28 14:58:26 2026 EET
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0007): Security Format Frmw_DL
Optional NVM Commands (0x0015): Comp DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x03): S/H_per_NS Cmd_Eff_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 85 Celsius
Critical Comp. Temp. Threshold: 90 Celsius
Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.00W       -        -    0  0  0  0    15000       0
 1 +     3.00W       -        -    1  1  1  1    15000       0
 2 +     1.50W       -        -    2  2  2  2    15000       0
 3 -   0.0450W       -        -    3  3  3  3    15000   15000
 4 -   0.0040W       -        -    4  4  4  4    25000   25000
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning: 0x00
Temperature: 52 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 13,327 [6.82 GB]
Data Units Written: 842,330 [431 GB]
Host Read Commands: 115,375
Host Write Commands: 31,129,084
Controller Busy Time: 71
Power Cycles: 33
Power On Hours: 971
Unsafe Shutdowns: 12
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
Self-tests not supported"
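A quick way to sanity-check those health counters is sketched below in Python. It parses "Key: value" lines in the shape smartctl prints them and flags the fields that usually matter for a failing NVMe drive. The sample text is copied from the output above; the function names and the choice of which fields to flag are assumptions made for illustration, not an official smartctl or pfSense check.

```python
import re

# Values copied from the SMART/Health section quoted above.
SAMPLE = """\
Critical Warning: 0x00
Temperature: 52 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Power Cycles: 33
Unsafe Shutdowns: 12
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
"""

def parse_smart_health(text):
    """Collect 'Key: value' lines into a dict, keeping the leading number of each value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        m = re.match(r"\s*(0x[0-9a-fA-F]+|\d[\d,]*)", value)
        if m:
            raw = m.group(1).replace(",", "")
            fields[key.strip()] = int(raw, 16) if raw.startswith("0x") else int(raw)
    return fields

def health_flags(f):
    """Return a list of human-readable warnings; an empty list means nothing was flagged."""
    flags = []
    if f.get("Critical Warning", 0) != 0:
        flags.append("critical warning bits set")
    if f.get("Available Spare", 100) <= f.get("Available Spare Threshold", 0):
        flags.append("available spare at or below threshold")
    if f.get("Media and Data Integrity Errors", 0) > 0:
        flags.append("media/data integrity errors")
    if f.get("Error Information Log Entries", 0) > 0:
        flags.append("error log entries present")
    return flags

fields = parse_smart_health(SAMPLE)
print(health_flags(fields))
print(fields["Unsafe Shutdowns"], "unsafe shutdowns out of", fields["Power Cycles"], "power cycles")
```

On the values above nothing is flagged; the unsafe shutdown count is printed separately because it presumably includes the hard power cycles described earlier in the thread rather than pointing at a drive fault.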