Previously working pfSense 2.4.4 setup stops "randomly" accepting LAN traffic
-
@kiokoman: Thanks for your quick reply. Here are my dmesg.txt and dmesg_crashed_ix1_no_longer_reachable.txt (as far as I can tell, they are identical).
Here is also the Status -> System_logs.png
I have not swapped the cables so far... I don't want to say anything wrong, but why would cables stop working when I quickly push a lot of data through, when they work just fine as long as I don't put load on them?
@stephenw10: how can I check for buffer exhaustion?
-
It should be logged.
That screenshot only shows like 2 minutes. What time did it stop responding? Before the boot?
Try getting the complete system log with:
clog /var/log/system.log > /tmp.systemlog.txt
Steve
-
What I did is:
1. I cleared dmesg with dmesg -c
2. started the vm new
3. exported the dmesg to the text file: dmesg > dmesg.txt
4. created the crash with a speedtest on speedtest.net
5. logged in thru openvpn
6. created the dmesg_crashed and the printscreens.
I really hope this log helps you more: tmp.systemlog.txt. I think it's quite strange that the system just started having these troubles...
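A quick way to confirm whether two saved dmesg captures really are identical, rather than eyeballing them, could look like the sketch below. The function name compare_captures is made up for illustration; it only relies on the standard cmp, diff and grep tools.

```sh
# Hypothetical helper: compare a "before" and an "after" dmesg capture.
# Prints "identical" if the files match byte for byte, otherwise prints
# only the lines that appear in the second (post-failure) capture.
compare_captures() {
    before="$1"
    after="$2"
    if cmp -s "$before" "$after"; then
        echo "identical"
    else
        # In diff's default format, lines starting with ">" come from the second file.
        diff "$before" "$after" | grep '^>' || true
    fi
}
```

With the file names from this thread that would be: compare_captures dmesg.txt dmesg_crashed_ix1_no_longer_reachable.txt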
-
Can we see those dmesg outputs and printscreens?
At what time approximately did it stop responding?
Just before you rebooted here?
Sep 3 14:32:33 fw01 php-fpm: /index.php: Successful login for user 'admin' from: 10.10.1.2 (Local Database)
Sep 3 17:44:40 fw01 php-fpm: /index.php: Successful login for user 'admin' from: 10.10.1.2 (Local Database)
Sep 3 17:46:28 fw01 sshd[44883]: user admin login class [preauth]
Sep 3 17:46:28 fw01 sshd[44883]: user admin login class [preauth]
Sep 3 17:46:28 fw01 sshd[44883]: user admin login class [preauth]
Sep 3 17:46:31 fw01 sshd[44883]: Accepted keyboard-interactive/pam for admin from 10.10.1.2 port 59170 ssh2
Sep 3 17:47:03 fw01 reboot: rebooted by admin
Sep 3 17:47:03 fw01 syslogd: exiting on signal 15
You can see there is nothing logged there at all.
Steve
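To narrow an exported log down to the minutes around a failure, something like this sketch could help. log_window is a hypothetical helper name, and it assumes the classic syslog line layout shown above (month, day, HH:MM:SS as the first three fields).

```sh
# Hypothetical helper: print syslog lines from one day whose time falls
# inside [FROM, TO]. Times are compared as HH:MM:SS strings.
# Usage: log_window FILE MONTH DAY FROM TO
log_window() {
    awk -v mon="$2" -v day="$3" -v from="$4" -v to="$5" \
        '$1 == mon && $2 == day && $3 >= from && $3 <= to' "$1"
}
```

For example: log_window /tmp.systemlog.txt Sep 3 17:40:00 17:48:00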
-
@stephenw10: So I am back home for more testing. The dmesg outputs and printscreens are in this post https://forum.netgate.com/topic/146231/previously-working-pfsense-2-4-4-setup-stops-randomly-accepting-lan-traffic/4
I can replicate the issue as many times as I want. I just need to reboot pfSense, connect my notebook and hit speedtest.net, and it crashes instantly... but there is no bluescreen or log as far as I can tell... nothing in dmesg, nothing in /var/log/system.log... Do I need to switch something to verbose to get more info?
-
Ok so what time in that log did it stop passing traffic on ix1?
You might also check
netstat -m
when it fails. An mbuf exhaustion like that would normally affect all NICs though. The output of
sysctl dev.ix.1
might show you something if it's just that interface.
Steve
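As a rough sketch of what to look for in the netstat -m output: the lines about denied requests are the ones that indicate buffer exhaustion. The helper below (mbuf_denied is a made-up name) assumes those lines start with slash-separated counters such as "0/0/0 requests for mbufs denied"; if your output is formatted differently, adjust the pattern.

```sh
# Hypothetical helper: read `netstat -m` output on stdin and flag any
# "denied" line whose leading counters are not all zero.
# Assumes lines like: 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
mbuf_denied() {
    awk '/denied/ {
        split($1, n, "/")                      # leading counters, e.g. 0/0/0
        if (n[1] + n[2] + n[3] > 0) {
            print "DENIED: " $0
            bad = 1
        }
    }
    END { exit (bad ? 1 : 0) }'
}
```

On the firewall it would be fed like this, right after the failure: netstat -m | mbuf_denied
No output and exit status 0 means no denied requests were counted.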
-
It should always be the last entry / time in the logs, since I always captured the logs after the crash (apart from dmesg.txt).
Here are the two requested outputs (both created within seconds after the crash).
-
It somehow got worse... To eliminate issues, the pfSense ix1 is currently attached directly to my desktop, which I also put in VLAN 50 (I rechecked against the switch, because I started to worry that I am being stupid, but the desktop is in VLAN50 and talks on VLAN50). But it seems ix1.50 does not accept ping or anything else from my desktop, even after ifconfig ix1 down;ifconfig ix1 up. Or am I doing something completely wrong right now?
I can also remove vlans and test again without them...
-
This post is deleted!
-
Now it's official.... I can't even ping pfSense anymore (10GbE cable directly from my workstation <-> pfSense ix1, without VLAN10 or VLAN50). I think I broke the internet. Joking aside, it is quite strange what's happening here...
-
@Yves_ said in Previously working pfSense 2.4.4 setup stops "randomly" accepting LAN traffic:
status: no carrier
Isn't going to work very well.
-
Mmm, bad NIC maybe? Try re-assigning ix0 and ix1, does it now fail WAN side?
Steve
-
@johnpoz said in Previously working pfSense 2.4.4 setup stops "randomly" accepting LAN traffic:
@Yves_ said in Previously working pfSense 2.4.4 setup stops "randomly" accepting LAN traffic:
status: no carrier
Isn't going to work very well.
No, that actually was my fault. That's why I deleted the post. I forgot to plug the cable back from the switch directly into ix1....
-
Not sure this looks great..
dev.ix.1.mac_stats.local_faults: 31
I see 0 faults on everything I'm checking here. What does dev.ix.0.mac_stats.local_faults show there?
You might also try disabling flow control:
sysctl dev.ix.1.fc=0
If that works you can add it as a system tunable in the gui.
Steve
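To compare the two ports side by side, a small filter like this sketch can pull out only the counters that are non-zero. nonzero_counters is a hypothetical helper; it assumes sysctl prints "name: value" lines, as in the local_faults line above, and by default it only matches names containing "fault".

```sh
# Hypothetical helper: read sysctl output on stdin and print counters whose
# name matches PATTERN (default: fault) and whose value is a non-zero integer.
# Usage: ... | nonzero_counters [PATTERN]
nonzero_counters() {
    pattern="${1:-fault}"
    awk -F': ' -v pat="$pattern" '$1 ~ pat && $2 ~ /^[0-9]+$/ && $2 + 0 > 0'
}
```

For example: sysctl dev.ix.0 dev.ix.1 | nonzero_counters
If only dev.ix.1 lines show up, that points at the one port rather than the whole card or driver.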
-
@stephenw10 Okay, so I set ix0 to LAN and ix1 to WAN, no more VLANs... and voila, ix0 is still working, now as LAN... and ix1, which is now WAN, is still dead... So either one port on my card is broken (which would be the first time I hear of something like that) or there is something else seriously wrong... Anyway, I am going to create a backup of pfSense now, kill the VM completely and install a new VM. If this does not work, I will swap the X540 tomorrow.
-
Mmm, yeah does seem like a hardware issue or maybe something in the way that port is passed through to pfSense.
Replacing the card will tell you that though.
Steve
-
I have some more feedback. After almost giving up, I thought why not just reboot the complete VMware ESXi server for once. Which I did, and which seems to have solved all the issues... even though the Intel X540 is completely passed through. Very, very strange. I will keep an eye on everything and keep you posted.
@stephenw10 THANK YOU SO MUCH FOR ALL YOUR EFFORT!
-
Hmm, I guess it retained some config then. We have seen NICs that require a complete power cycle to clear some issues.
Steve