[Solved] pfsense 2.3.3 - Gigabit Speeds - Lan/Wan interfaces drop

olorinpc

Hey all, I can certainly post more system logs but thought I would post this initial scenario to see if it rings a bell with anyone.

I am running the latest 2.3.3-p1 on an old 1U Portwell NAR-5500 (Intel Pentium 4 HT 3.2GHz, 2GB DDR RAM, 80GB SATA HDD, 6x Gigabit Ethernet ports) and have had zero issues for quite a while; everything has been rock solid.

However, I recently had my bandwidth upgraded from 100/3 to 1000/1000. Sounds awesome, right? Yet when I run a speedtest and actually push close to gigabit speed through the pfSense firewall, the interfaces start dropping and I have to reboot to get connectivity back.

Looking through the logs, I am not really seeing a cause, so I am having trouble determining whether this is a config or a hardware issue. (It is an old repurposed appliance, but I would hate to drop cash on a replacement if it is something in the config I am missing.)

I have included a solid test result below. pfSense comes back up, and from a computer on the LAN side I initiate the speedtest. Midway through (luckily the WAN interface died first), the WAN flips to yellow, showing packet loss, and has lost its external IP, at which point I kicked off a reboot before the LAN died as well.

Any ideas would be much appreciated!


4/27/2017 1:47	shutdown		reboot by root:
4/27/2017 1:47	php-fpm	60339	/diag_reboot.php: Stopping all packages.
4/27/2017 1:47	check_reload_status		Linkup starting msk3
4/27/2017 1:47	kernel		msk3: link state changed to UP
4/27/2017 1:47	check_reload_status		Reloading filter
4/27/2017 1:47	php-fpm	32460	/rc.linkup: Linkup detected on disabled interface...Ignoring
4/27/2017 1:47	check_reload_status		Reloading filter
4/27/2017 1:47	php-fpm	32460	/rc.linkup: Linkup detected on disabled interface...Ignoring
4/27/2017 1:47	check_reload_status		Linkup starting msk3
4/27/2017 1:47	kernel		msk3: link state changed to DOWN
4/27/2017 1:47	kernel		msk3: watchdog timeout
4/27/2017 1:47	check_reload_status		Linkup starting sk1
4/27/2017 1:47	kernel		sk1: link state changed to DOWN
4/27/2017 1:47	kernel		sk0: link state changed to DOWN
4/27/2017 1:47	check_reload_status		Linkup starting sk0
4/27/2017 1:47	check_reload_status		Reloading filter
4/27/2017 1:47	check_reload_status		Restarting OpenVPN tunnels/interfaces
4/27/2017 1:47	check_reload_status		Restarting ipsec tunnels
4/27/2017 1:47	check_reload_status		updating dyndns WAN_DHCP
4/27/2017 1:47	php-fpm	153	/rc.linkup: HOTPLUG: Configuring interface wan
4/27/2017 1:47	php-fpm	153	/rc.linkup: DEVD Ethernet attached event for wan
4/27/2017 1:47	xinetd	16537	Exiting...
4/27/2017 1:47	kernel		msk3: link state changed to UP
4/27/2017 1:47	check_reload_status		Linkup starting msk3
4/27/2017 1:47	check_reload_status		Reloading filter
4/27/2017 1:47	php-fpm	92890	/rc.linkup: Shutting down Router Advertisment daemon cleanly
4/27/2017 1:47	php-fpm	92890	/rc.linkup: DEVD Ethernet detached event for wan
4/27/2017 1:47	kernel		msk3: link state changed to DOWN
4/27/2017 1:47	kernel		msk3: watchdog timeout
4/27/2017 1:47	check_reload_status		Linkup starting msk3
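
For anyone wanting to tally the flaps in a log like the one above, here is a minimal Python sketch. It assumes the kernel messages look exactly like the lines pasted here ("<iface>: link state changed to UP/DOWN" and "<iface>: watchdog timeout"); the function name `summarize` and the `sample` list are only illustrative, not part of pfSense.

```python
import re
from collections import Counter

# Matches kernel lines such as "msk3: link state changed to DOWN"
LINK_RE = re.compile(r"\b(\w+): link state changed to (UP|DOWN)")
# Matches kernel lines such as "msk3: watchdog timeout"
WATCHDOG_RE = re.compile(r"\b(\w+): watchdog timeout")


def summarize(lines):
    """Count link-state changes and watchdog timeouts per interface.

    Hypothetical helper: assumes the message format shown in the
    log excerpt above; other drivers may word things differently.
    """
    events = Counter()
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            events[(m.group(1), m.group(2))] += 1
            continue
        m = WATCHDOG_RE.search(line)
        if m:
            events[(m.group(1), "WATCHDOG")] += 1
    return events


# A few lines copied from the log excerpt (tab-separated fields).
sample = [
    "4/27/2017 1:47\tkernel\t\tmsk3: link state changed to UP",
    "4/27/2017 1:47\tkernel\t\tmsk3: link state changed to DOWN",
    "4/27/2017 1:47\tkernel\t\tmsk3: watchdog timeout",
    "4/27/2017 1:47\tkernel\t\tsk1: link state changed to DOWN",
]

for (iface, event), count in sorted(summarize(sample).items()):
    print(iface, event, count)
```

A high count of DOWN events paired with watchdog timeouts on a single interface would point at that port or its driver rather than at the firewall rules, though that is only a rough heuristic.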

Stewart

With Brighthouse / Spectrum, we experience flapping on high-speed fiber off of the ONU that looks similar to this. Have you tried hardcoding the speed onto the interface?

olorinpc

Interesting. No, I haven't. I temporarily bypassed the issue by turning DHCP back on on the wireless AP that I had in bridge mode, but I think we can all agree a consumer D-Link firewall is a bit lacking compared to pfSense.

I will certainly give that a go and see if that changes any results with it.

Sadly, I am half expecting it to be hardware related, as one of the ports died a couple of years ago, but I would be happy if it was something as simple as hard setting the rate on the interface. Thanks for the suggestion! I will report back here after I am able to test.

jahonix

Is this an academic issue or something you experience in production use all the time?

Set a limiter just below the killing-speed and you should be safe, shouldn't you? How often do you actually need to saturate 1Gb/s?

olorinpc

Well, it's not an academic issue, as it's the primary firewall for my home office, and it seems silly to limit bandwidth below a certain rate when the equipment (not to mention what I am paying for) should be able to handle it.

That being said, within 4 hours of the initial post, the firewall died again without being artificially saturated by a speedtest. Additional testing with the rate hard set to 1000 didn't seem to have any effect. Something over 100 Mb/s and obviously less than 1000 Mb/s kills the thing.

At this point, I have been reviewing my old notes and logs from a while back… and actually forum posts here. The original WAN port was bad, so given the current symptoms and that it does not seem to be a pfSense issue, I am pretty sure the old IBM appliance is just giving up the ghost.

As such, I think it is time to replace it, so I am just going to mark this as solved so folks don't go down a rabbit hole.