"device timeout" on Intel Pro/100 Ethernet
-
Are you still running the P3 1GHz?
Nothing else changed except the firmware update?
Steve
-
Hi Steve
Yep - that's correct. No changes in between the upgrades.
-Lou
-
Strictly speaking, I guess no - not the exact same config. I think there are some ACLs that I did not apply in SquidGuard, and the DHCP scope starts at .75 instead of .50. I doubt that they would contribute to a NIC timeout, but I've seen crazy things in the past. Nothing changed with the hardware, cabling, switches, etc. I also didn't install the lightsquid or ntop packages this time; both were installed on the previous install.
-
Hmm, odd. I don't think the fxp driver has changed in a long time but I guess you'd have to look into it to be sure.
Dropping back to 2.0.1 or 2.0 would be a good test. I believe the config file should be importable, though I've never tried it. The format changed significantly between 1.2.3 and 2.0 which means it isn't compatible that far back.
Steve
-
Discovered something interesting today when it happened again. I ran tcpdump from the pfsense console on both the fxp0 and fxp3 interfaces. On fxp3 (LAN), I saw the usual expected broadcast activity, though I could not ping anything on the LAN. On fxp0/PPPoE, I saw it sending out nothing but DHCP requests over and over again. So on a hunch, I pulled the plug on my DSL modem (set up in bridged mode) and then plugged it back in. Once it (presumably) renegotiated all its bits with my ISP, both interfaces came back online, I could ping internal LAN IPs, and I had internet access again.
So now the questions are A) why is my DSL modem/pfsense dropping and not subsequently re-establishing connectivity (not a question for this forum potentially) and B) why does this event knock out the fxp3/LAN interface for pfsense? Nonetheless - at least I know from where the problem is coming.
-Lou
-
Hmm, interesting observation.
Some cable modems will start to hand out a local (192.168.x.x) IP address when they lose connectivity. This can cause all sorts of trouble, as you can imagine. I can see how it would stop you pinging anything if it happened to be the same subnet as your LAN. I expect you would have spotted that though.
Steve
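For anyone who wants to rule out the overlap described above, here is a minimal sketch using Python's standard `ipaddress` module. It only checks whether an address handed to the WAN side falls inside the LAN subnet. The addresses and the function name `wan_collides_with_lan` are hypothetical examples, not values from this thread.

```python
import ipaddress


def wan_collides_with_lan(wan_ip, lan_cidr):
    """Return True if the WAN address falls inside the LAN subnet."""
    wan = ipaddress.ip_address(wan_ip)
    # strict=False lets the caller pass a host address with a prefix length
    lan = ipaddress.ip_network(lan_cidr, strict=False)
    return wan in lan


# Hypothetical addresses, for illustration only
lan_subnet = "192.168.1.0/24"
print(wan_collides_with_lan("192.168.1.10", lan_subnet))    # True
print(wan_collides_with_lan("192.168.100.10", lan_subnet))  # False
print(wan_collides_with_lan("203.0.113.5", lan_subnet))     # False
```

If the first case ever came back True for the address the WAN interface actually picked up, that would fit the "can't ping anything on the LAN" symptom.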
-
After bouncing the DSL modem on Friday early afternoon, I've gone all day Saturday and Sunday without the interfaces shutting down. I still see the "device timeout" on the console and in system.log, but the interfaces are OK (i.e. passing traffic).
-
Spoke too soon: the timeout happened again today :-( I was at work and my wife needed the connection back, so she had to bounce the firewall. Looks like I'll have to wait for another chance to troubleshoot.
-
I recently upgraded from a ~ 1 year old pfsense 2.0.something install to 2.0.2. Ever since the upgrade, both my WAN and LAN interfaces will die at various times, with tons of these in "dmesg" output, the console, and system.log:
pfsense kernel: fxp0: device timeout
pfsense kernel: fxp3: device timeout
It seems to happen randomly, i.e. not under periods of heavy load or anything consistent like that. The only way to get them back is to bounce the whole firewall. I never had any such issue with my original install. It's all the same hardware, as well - nothing added or subtracted.
My NIC is a quad-port server class Intel, that reports itself as:
Intel 82559 Pro/100 Ethernet
I don't specifically see the "82559" listed on the BSD HCL (only generically Intel Pro/100), but I'm not sure a compatibility issue is my problem, as it worked fine in the older pfsense install... unless support was removed for this particular NIC.
Thoughts?
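To see whether one port times out more often than the other, the kernel messages quoted above can be tallied per interface. This is a rough sketch, not anything that ships with pfsense: the function name `count_device_timeouts` is made up, and it assumes the log text has been saved somewhere it can be read as plain lines, in the same format as the two messages above.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in this post, e.g.
# "pfsense kernel: fxp0: device timeout"
TIMEOUT_RE = re.compile(r"kernel: (\w+): device timeout")


def count_device_timeouts(lines):
    """Return a Counter mapping interface name -> number of timeout messages."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The two messages from this post, with one repeated for illustration
sample = [
    "pfsense kernel: fxp0: device timeout",
    "pfsense kernel: fxp3: device timeout",
    "pfsense kernel: fxp0: device timeout",
]

for iface, n in sorted(count_device_timeouts(sample).items()):
    print(f"{iface}: {n}")
# prints:
# fxp0: 2
# fxp3: 1
```

If the counts are heavily skewed toward one port, that points at that port or its cable; roughly even counts across ports on the same card would point more toward the card or driver.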
-
We have a similar issue - on 2.0.0 and 2.0.1 no issues at all, upgraded to 2.0.2 (and 2.0.3 pre) and random issues with the WAN side interface going into a zombie state - link up but won't pass traffic, with the "device timeouts" on the console. We are using Intel NICs with the em driver.
Bumped MBUF and tweaked the cards to use MSI instead of MSI-X and only a single queue, and it seems to help stabilize it - but under 2.0.1 and prior no tweaking was required.
If we install pfBlocker and enable any blocking lists, it seems to trigger this issue much sooner.
-
Very interesting. Question regarding:
"it seems to help stabilize it"
Stabilize as in "issue is resolved" or stabilize as in "doesn't happen nearly as frequently"?
How did you do your tuning/tweaking, i.e. at the BSD level or via the pfsense webGUI?
-
To be fair the em driver is very different to the fxp driver. There are no options for doing these things in the webgui. See:
http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards
Steve
-
That's the process we used for the em driver.
And it is fixed with that tuning in place unless we enable pfBlocker with a bunch of lists - then it does it again. Can't figure out how that's related.
-
I haven't had much luck with this, so I decided to upgrade to the latest 2.1 snapshot instead of downgrading to 2.0. I haven't seen any fxp0/3 device timeouts on the console, in any of the logs, or in dmesg output, and I haven't had connectivity drop at all so far. That's definitely a good sign! It's been about 13 hours and I would have definitely seen device timeout errors within that span. I'll continue to post.
On a side note, while catching up on some other posts on the forum, I read one that was written in very poor taste. The author called it a "rant" about how poor pfsense is, but I wouldn't even consider the post that. It was an ill-advised and poorly thought out dump of frustration directed at folks who volunteer their time to help others, which disappoints me. Obviously the individual doesn't understand, appreciate, or care about the effort it takes to get a project like pfsense off the ground and running. I'm sure most of the people monitoring the forums have seen stuff like that before, will see it again, and most likely have very thick skin by now. Nonetheless, whenever I see posts like that I try to make sure those involved know that the product is truly amazing and I love pfsense! The features it packs are incredible, it runs on almost anything of any age, and it's really simple to pick up and learn. And to those guys that hawk the forums - THANK YOU for the tremendous job you all do helping us newbies with anything from the "is it plugged in" type of questions to the "can I get a turbo-charger in the next release" type of questions.
Sincerely
-Lou
-
I jinxed myself. Just a few hours after my last post, I started getting device timeouts again. I replaced cables and switch ports, to no avail. Getting desperate, I added two 3com 905c NICs and changed both WAN/LAN interfaces to use them instead of the Pro/100 quad-port NIC. Later today will be 3 full days with no issues on the new cards. No clue what happened to the Intel quad-port, but this is looking more and more like a hardware problem now.
-
Just to close this thread up - replacing the NIC has resolved the issues.