pfSense 2.7.2 interface Tx underrun, restarting...
-
The log was taken after the reboot, and it is continuously filled with entries like this.
I attached the output of ps -awx -l. I don't yet know how to interpret it.
processes.txt
EDIT:
When my network crashes, the PuTTY session freezes for a while. I took advantage of this to see which processes are the most resource-intensive at that moment, using top -P.
My network is crashing right now, and this is the state I see in PuTTY:
So php-fpm seems to be a pretty heavy-duty process.
-
Yeah something is churning scripts there. The CPU time on check_reload_status is huge.
I'd say you probably have a link flapping. It's hard to see from that log because of all the check_reload_status messages, but there are a bunch of hotplug messages like:
<27>1 2024-03-27T13:48:18.118187+01:00 hidden.host.name php-fpm 5904 - - /rc.linkup: DEVD Ethernet attached event for opt5
<27>1 2024-03-27T13:48:18.118319+01:00 hidden.host.name php-fpm 5904 - - /rc.linkup: HOTPLUG: Triggering address refresh on opt5 (stge0.80)
<13>1 2024-03-27T13:48:18.118635+01:00 hidden.host.name check_reload_status 430 - - rc.newwanip starting stge0.80
<27>1 2024-03-27T13:48:18.130454+01:00 hidden.host.name php-fpm 5904 - - /rc.linkup: Hotplug event detected for VLAN_71_LAB_INTERNET(opt8) static IP address (4: 192.168.71.1)
<30>1 2024-03-27T13:48:18.142627+01:00 hidden.host.name lighttpd_pfb 21772 - - [pfBlockerNG] DNSBL Webserver started
<27>1 2024-03-27T13:48:18.147659+01:00 hidden.host.name php-fpm 5904 - - /rc.linkup: DEVD Ethernet attached event for opt8
<27>1 2024-03-27T13:48:18.147841+01:00 hidden.host.name php-fpm 5904 - - /rc.linkup: HOTPLUG: Triggering address refresh on opt8 (stge0.71)
<13>1 2024-03-27T13:48:18.148123+01:00 hidden.host.name check_reload_status 430 - - rc.newwanip starting stge0.71
Do you see any actual kernel level link state changes?
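To see how often each interface is hit without reading through all the check_reload_status noise, the rc.linkup lines can be tallied with a few lines of Python. This is only a rough sketch: the function name count_linkup_events is made up for illustration, the "attached" wording is taken from the lines quoted above, and the "detached" wording is an assumption about how the opposite event is logged.

```python
import re
from collections import Counter

# Matches the rc.linkup lines quoted above. "attached" comes from the quoted
# log; "detached" is an assumed wording for the opposite event.
LINKUP_RE = re.compile(
    r"/rc\.linkup: DEVD Ethernet (attached|detached) event for (\S+)"
)


def count_linkup_events(lines):
    """Count rc.linkup attach/detach events per pfSense interface name."""
    counts = Counter()
    for line in lines:
        match = LINKUP_RE.search(line)
        if match:
            kind, iface = match.group(1), match.group(2)
            counts[(iface, kind)] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    "<27>1 2024-03-27T13:48:18.118187+01:00 hidden.host.name php-fpm 5904 - - "
    "/rc.linkup: DEVD Ethernet attached event for opt5",
    "<27>1 2024-03-27T13:48:18.118319+01:00 hidden.host.name php-fpm 5904 - - "
    "/rc.linkup: HOTPLUG: Triggering address refresh on opt5 (stge0.80)",
    "<27>1 2024-03-27T13:48:18.147659+01:00 hidden.host.name php-fpm 5904 - - "
    "/rc.linkup: DEVD Ethernet attached event for opt8",
]

for (iface, kind), n in sorted(count_linkup_events(sample).items()):
    print(iface, kind, n)
```

If every VLAN on the same parent shows roughly the same count, that points at the parent NIC rather than at any single VLAN.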
-
@stephenw10 said in pfSense 2.7.2 interface Tx underrun, restarting...:
Do you see any actual kernel level link state changes?
Not sure how to check it. I attached the current dmesg output, which has a lot of link state changes.
dmesg.txt
-
Yeah that's a LOT of link state changes. All the php load and check_reload_status messages are probably coming from that. Each link state change triggers restarting services.
So is that NIC, stge0, actually losing link? What is it connected to?
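To put a number on "a LOT", the kernel messages can be counted per interface. The sketch below assumes the usual FreeBSD wording "<ifname>: link state changed to UP/DOWN"; the attached dmesg.txt is not reproduced here, so the sample lines are hypothetical and the helper name count_link_changes is made up.

```python
import re
from collections import Counter

# Assumed FreeBSD kernel message format: "stge0: link state changed to DOWN"
LINK_RE = re.compile(r"^(\S+): link state changed to (UP|DOWN)\b")


def count_link_changes(dmesg_text):
    """Count kernel link state changes per (interface, new state)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LINK_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Hypothetical sample, not taken from the attached dmesg.txt.
sample_dmesg = """\
stge0: link state changed to DOWN
stge0.80: link state changed to DOWN
stge0: link state changed to UP
stge0.80: link state changed to UP
stge0: link state changed to DOWN
"""

for (iface, state), n in sorted(count_link_changes(sample_dmesg).items()):
    print(iface, state, n)
```

On the firewall itself the input would be the saved dmesg output; the VLAN interfaces follow their parent, so the parent's count is the one to compare against the switch log.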
-
It is connected to one of my Mikrotik switches. Maybe I should reboot it too... :)
EDIT:
I checked the logs on the MikroTik. stge0 is connected to ether24. You can see that the link dropped there too, but it is not logged as often as on the pfSense side. The warnings on sfp are also puzzling; I didn't have them before.
-
Yes, try that. But also check any logs the switch has to see if it's seeing the link lost too.
If it's really flapping you might be able to set both sides to a fixed speed to prevent it.
That is not a common NIC though. Swapping it out for something Intel based would almost certainly solve this.
-
Also odd that it links at 100M then 10M then 1G.
-
I replaced the NIC with a Realtek one. Now my interface is re1 instead of stge0, so I replaced every stge0 with re1 in my backup config.xml and followed this article to restore the config from USB: https://linuxconfig.org/restore-pfsense-configuration-backup-from-console-using-usb-drive
But for some reason it is still trying to configure my VLANs as stge0.xx
Do you know what else I should delete so that it takes the new config into account?
EDIT:
I set a trap for myself. I had copied the previous config to the USB root directory as config.xml. I didn't know about ECL, and as long as that flash drive was plugged in, ECL reloaded the previous config from the USB on every reboot, despite my changes.
Ok, now I have a machine with a Realtek card instead of the previous one.
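For anyone doing the same NIC swap, the stge0 to re1 rename in a backup config.xml can be scripted so that the VLAN names (stge0.80, stge0.71 and so on) are changed along with the parent. This is a minimal sketch that treats the config as plain text; the function name rename_interface and the sample XML fragment are made up for illustration and are not a full pfSense config.

```python
import re


def rename_interface(config_text, old, new):
    """Replace whole-word occurrences of an interface name.

    Returns (new_text, number_of_replacements). Word boundaries keep a
    name such as "stge01" untouched while still matching "stge0.80".
    """
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    return pattern.subn(new, config_text)


# Hypothetical fragment, only to show which strings are rewritten.
sample_config = (
    "<vlan><if>stge0</if><tag>80</tag><vlanif>stge0.80</vlanif></vlan>"
    "<opt5><if>stge0.80</if></opt5>"
    "<other><if>stge01</if></other>"
)

new_config, replaced = rename_interface(sample_config, "stge0", "re1")
print(replaced)
print(new_config)
```

After editing, it is worth searching the result for any leftover old name before restoring, and, as the EDIT above shows, making sure no older config.xml is left on a USB stick that ECL can pick up at boot.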
-
Almost 1h after startup, the error counter is still at 0, the logs no longer contain these strange entries, and the network is running stably. It seems to me that the problem has been solved.
Thank you very much for your support! I appreciate it!
-
After 17h, however, I see that the error counter is no longer at zero. I will honestly admit that I have never checked it before. Is it normal for Tx errors to appear on an interface?
-
It's common to have a few errors especially if it was recently disconnected. That looks high though. Check the cable.
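Whether a count "looks high" is easier to judge as a ratio of output errors to output packets. The sketch below assumes the column layout that FreeBSD's netstat -i prints for link-level rows (Name, Mtu, Network, Address, Ipkts, Ierrs, Idrop, Opkts, Oerrs, Coll); the numbers in the sample are hypothetical, not the ones from the screenshot, and tx_error_rates is a made-up helper name.

```python
def tx_error_rates(netstat_text):
    """Return {interface: Oerrs / Opkts} for <Link#N> rows of netstat -i.

    Assumes ten whitespace-separated columns per row:
    Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
    Rows that do not fit this layout are skipped.
    """
    rates = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) != 10 or not fields[2].startswith("<Link#"):
            continue
        name = fields[0]
        opkts, oerrs = int(fields[7]), int(fields[8])
        rates[name] = oerrs / opkts if opkts else 0.0
    return rates


# Hypothetical numbers, for illustration only.
sample_netstat = """\
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
re1    1500 <Link#2>      00:11:22:33:44:55  1200000     0     0   900000    45     0
"""

for iface, rate in tx_error_rates(sample_netstat).items():
    print(f"{iface}: {rate:.6%} of transmitted packets had errors")
```

Comparing the ratio before and after reseating or replacing the cable gives a clearer picture than the raw counter, which only ever grows.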