Crash due to constantly increasing MBUF usage
-
Hello all.
I've been having trouble with my pfSense box since the v2.3 update.
My MBUF usage is increasing slowly but continuously, and after several days (~1 month) it reaches 100% and my router crashes.
Is this a known issue, and can it be resolved?
Thank you for any help and advice.
My config:
Jetway NC9T-1037 (Celeron 1037U)
with 2 Realtek RTL8111G Ethernet NICs
14 GB RAM DDR3 Samsung CAS11 ECO PC12800
SSD Samsung PM830 32 GB mSATA
NIC WiFi Atheros AR9380 AR5BHB112 802.11n 3*3 mini PCI-E
Case Morex M5X 60W
1 * Noctua NF-R8 80mm
pfSense v2.3
Services: dhcpd, dnsmasq, dpinger, ntpd, openvpn
Interfaces: WAN, LAN (bridging WiFi/Ethernet/VPN), WLAN_GUEST
-
I'm not aware of any issue like that remaining on 2.3 with the hardware you list. When those are observed, they tend to be a problem such as an mbuf leak in one of the active NIC drivers.
Is the increase linear, or does it go up in sharp spikes that coincide with high network traffic?
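One way to answer that is to log the mbuf cluster counters at regular intervals and look at how they change. The sketch below parses a single line in the `current/cache/total/max` form that FreeBSD's `netstat -m` prints for mbuf clusters. The sample line is made up for illustration, the function names are hypothetical, and treating total/max as the usage percentage is an assumption.

```python
import re

# Illustrative sample only (not taken from this system), in the format
# FreeBSD's `netstat -m` uses for its mbuf clusters line.
SAMPLE = "1024/1778/2802/26584 mbuf clusters in use (current/cache/total/max)"

def parse_mbuf_clusters(line):
    """Return (current, cache, total, max) from an 'mbuf clusters in use' line."""
    m = re.match(r"\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", line)
    if m is None:
        raise ValueError("not an mbuf clusters line: %r" % line)
    return tuple(int(g) for g in m.groups())

def usage_percent(total, maximum):
    """Usage as a percentage of the limit (assumed to be total/max)."""
    return 100.0 * total / maximum

current, cache, total, maximum = parse_mbuf_clusters(SAMPLE)
print("mbuf clusters: %d of %d (%.1f%%)" % (total, maximum, usage_percent(total, maximum)))
```

Running something like this from cron every few minutes and comparing the values against traffic graphs should show whether the growth is steady or comes in bursts.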
-
Hard to say as I have constant traffic. I will try to measure it more precisely and come back.
Thanks.
-
After 5 days, it seems to be quite linear:
16/05/2016 5%
17/05/2016 9%
18/05/2016 14%
20/05/2016 23%
Around 4.5% per day. So, after 23 days, it will probably crash.
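For anyone who wants to repeat the estimate, here is a minimal sketch that fits a straight line to the four readings above and extrapolates to 100%. It assumes the growth stays linear, which is only a guess from these few points; the function and variable names are just for illustration.

```python
from datetime import date

# Readings quoted in this post: (date, MBUF usage in percent).
readings = [
    (date(2016, 5, 16), 5.0),
    (date(2016, 5, 17), 9.0),
    (date(2016, 5, 18), 14.0),
    (date(2016, 5, 20), 23.0),
]

def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Use days since the first reading as the x axis.
first_day = readings[0][0]
points = [((d - first_day).days, pct) for d, pct in readings]
slope, intercept = fit_line(points)

# Days after the first reading at which the fitted line reaches 100%.
days_to_full = (100.0 - intercept) / slope
print("growth: %.2f%% per day" % slope)
print("100%% reached about %.0f days after %s" % (days_to_full, first_day))
```

The fitted slope comes out at about 4.5% per day, which matches the rough figure above.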
I will update to v2.3.1 and report back.
-
I'm not familiar with that board (and short on time to go digging for info) – is there a way you could add a different type of NIC (like a dual Intel card) to see if that fixes it?
I have no confidence in Realtek cards, especially under load, and even less so since it's a Jetway board.
-
I have a basic ADSL line with 1M up and 5M down, which can hardly be considered heavy load, and I have run this config with pfSense for 2 years (first v2.1.2, then 2.2.x) without trouble.
I don't have a dual Intel NIC for a PCI-E 1X port ready for testing, only single PCI NICs.
I will update to v2.3.1 and test again. If there is no improvement, I will reinstall v2.3.1 from scratch. If there is still no improvement, I will roll back to v2.2.
-
Only 1h after the update and reboot, I already have +1% on MBUF (1520->1776).
The only entries I have in System Logs/System/General are 8 lines of "ath0: stuck beacon; resetting (bmiss count X)" … do you think this is related?
-
I was getting stuck beacons in a test rig and I thought it was crashing my test rig. The stuck beacon is related to your Wi-Fi. Eventually I gave up on the Wi-Fi and used a switch and a WAP. Still having periodic crashes though, so maybe it wasn't just Wi-Fi that was giving me troubles.
-
Do you have the amd64 platform installed, or is it plain 32-bit?
-
It is a Celeron 1037U so AMD64. The pfSense version installed is the AMD64 one.
-
You should try to run without wifi.
-
@w0w:
You should try to run without wifi.
I will.
After several attempts, including a partial upgrade to v2.3.1_1 (due to the "pfSense-Status_Monitoring-1.4.1_1.txz: Not Found" error), which caused a big increase in MBUF usage, I decided to reinstall v2.3.1 from scratch and do a complete backup/restore. I notice a big increase in the MBUF buffer size (247804 instead of 26584 on a pfSense upgraded from 2.2 to 2.3).
I'm now at 12h uptime, and MBUF usage is still normal (the usual 1520->1776). I'll keep you informed.
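To put the two buffer sizes in perspective, here is a quick sketch of what the same counts look like as a percentage under each limit. It assumes the displayed percentage is simply the count divided by the limit, and that 1520 and 1776 are mbuf counts as quoted above; the names are only for illustration.

```python
# Limits quoted in this thread.
OLD_MAX = 26584    # box upgraded from 2.2 to 2.3
NEW_MAX = 247804   # fresh v2.3.1 install

def usage_percent(in_use, maximum):
    """Usage as a percentage of the limit (assumed to be in_use / maximum)."""
    return 100.0 * in_use / maximum

for in_use in (1520, 1776):
    print("%d in use: %.2f%% of %d, %.2f%% of %d" % (
        in_use,
        usage_percent(in_use, OLD_MAX), OLD_MAX,
        usage_percent(in_use, NEW_MAX), NEW_MAX,
    ))
```

Under the old limit, going from 1520 to 1776 is roughly one percentage point, while under the new limit both values stay below 1%, so the same leak rate would take far longer to fill the buffer.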