LAN adapter watchdog timeout when running heavy load on pfSense
-
Found a post on this: http://forum.ev1servers.net/showthread.php?t=55633
It suggests the following:
a) turn ACPI support OFF
b) turn PNP OS support OFF [in BIOS] if possible
c) change the NIC to some other brand like 3Com or Intel.
-
work like expected besides the log entries?
The system does fail and has to be rebooted. If the WAN network card fails, it is still possible to access the firewall via the web GUI or SSH. The console works in all cases.
a) turn ACPI support OFF
b) turn PNP OS support OFF [in BIOS] if possible
c) change the NIC to some other brand like 3Com or Intel.
Thank you for the link. At the end of that thread I found:
Disabling ACPI did not help.
Dedicating IRQ for the NIC did not help.
Changing the hardware (nic/mb) did help, havent seen the error for a week now. So 'watchdog timeouts' is mostly hardware related problem.
I think I will give the last solution a try.
regards
Günther
-
Problem:
When running heavier traffic (e.g. a file transfer) across the firewall (LAN<->WAN at about 40 Mbps), we suddenly get
#kernel vr0: watchdog timeout
The problem can occur after 5 seconds or after 20 minutes.
-
I'm having problems similar to this. K6-500, 415MB RAM. Regardless of which NICs I try (Compaq dual-port server NIC, or generic cheapos), pfSense locks up when I get close to 10-15 Mbps, and occasionally at lower speeds with heavy use such as uTorrent. I suspect it may be a problem with the actual computer running pfSense (bad RAM, bad CPU?). I also ran pfSense under VMware on a 1.8 GHz "3000+" AMD computer. Throughput was no better than on the K6-500, but I haven't been able to reproduce any of the crashing I was experiencing.
gschoch, do you have any way of monitoring temperatures (CPU, memory etc.) in the routers? I've had issues before with other PCs that weren't getting good ventilation and were hanging/crashing when getting well over 60-70+ C.
-
…, do you have any way of monitoring temperatures (CPU, memory etc.) in the routers?
I've had issues before with other PCs that weren't getting good ventilation
and were hanging/crashing when getting well over 60-70+ C.
Thank you for the answer. Yes, we monitored them. We have about 6 pfSense boxes (but not of the same age, cooling, or memory). What they all have in common is that they are based on VIA Mini-ITX boards. All other components are different. But in all cases we end up with these timeout errors. We will now test another set of NICs, and if this does not help, the Mini-ITX boards will have to be replaced with another product.
regards
Günther
-
Hi
Just for others with the same problem:
we switched to a D-Link DFE-580TX network adapter with 4 ports, and since then we have not had any more watchdog timeouts.
regards
Günther
-
This is a FreeBSD 6.1 problem.
I'm using two 3Com 3C905CX-TX-M cards. I disabled ACPI, disabled the Sound Blaster, changed the network cards, disabled PNP OS, disabled COM and LPT, and changed the VGA card.
Nothing changed. I'm using pfSense 1.0.1:
(filtered…)
Dec 8 05:24:45 kernel: xl1: watchdog timeout
Dec 7 19:26:13 kernel: xl1: watchdog timeout
Dec 7 16:03:00 kernel: xl1: watchdog timeout
Dec 7 11:15:14 kernel: xl1: watchdog timeout
Dec 7 09:18:26 kernel: xl1: watchdog timeout
Dec 7 03:43:00 kernel: xl1: watchdog timeout
Dec 7 03:40:29 kernel: xl1: watchdog timeout
Dec 7 03:35:51 kernel: xl1: watchdog timeout
Dec 6 20:34:35 kernel: xl1: watchdog timeout
Dec 6 18:29:09 kernel: xl1: watchdog timeout
Dec 6 17:15:44 kernel: xl1: watchdog timeout
Dec 6 16:27:28 kernel: xl0: watchdog timeout
Dec 6 13:36:52 kernel: xl1: watchdog timeout
Dec 5 17:29:39 kernel: xl1: watchdog timeout
Dec 5 15:12:16 kernel: xl1: watchdog timeout
Dec 5 12:34:30 kernel: xl1: watchdog timeout
Dec 5 10:43:28 kernel: xl1: watchdog timeout
Dec 5 10:39:41 kernel: xl1: watchdog timeout
-
For reference, Broadcom-based integrated chips tend to hit watchdog timeouts under VERY heavy loads, but they reset and go on with life. I've seen this happen on AMD Opteron and Intel Xeon platforms with multiple Broadcom chips. I just use Intel NICs for everything critical and I don't have issues.
As for 3Com, the 3c905b's were perhaps the best 10/100 NICs ever built, in my opinion. But when they did the die shrink and built the 3c905c's, those were terrible. I even have problems with them in Windows and Linux boxes!
My recommendation would be to stay away from newer 3Com cards, period. There's a reason Intel has outsold them and blown them out of the water in the NIC market. Besides, Intel continuously contributes code for their NIC drivers. So they just 'work' T.M.
-
Upgrade to http://www.pfsense.com/~sullrich/1.0.1-SNAPSHOT-12-06-2006/ and see if the problems persist.
-
I had the same problem earlier today.
Dec 9 14:57:03 kernel: sk0: link state changed to UP
Dec 9 14:33:46 kernel: arplookup 231.57.128.57 failed: host is not on local network
Dec 9 14:08:17 php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
Dec 9 14:08:17 check_reload_status: rc.linkup starting
Dec 9 14:08:14 kernel: sk0: link state changed to DOWN
Dec 9 14:08:14 kernel: sk0: watchdog timeout
Dec 9 13:59:43 kernel: arp: 172.17.17.254 is on sk0 but got reply from 00:0b:db:65:8a:91 on em0
Dec 9 12:58:09 kernel: arp: 172.17.17.241 is on sk0 but got reply from 00:11:43:11:92:8b on em0
Dec 9 12:38:17 last message repeated 4 times
Dec 9 12:37:53 kernel: arp: 172.17.17.252 is on sk0 but got reply from 00:13:20:00:cb:cf on em0
Dec 9 12:36:57 last message repeated 9 times
Dec 9 12:36:37 kernel: arp: 172.17.17.241 is on sk0 but got reply from 00:11:43:11:92:8b on em0
Dec 9 12:21:55 kernel: em0: watchdog timeout -- resetting
Dec 9 12:21:33 last message repeated 9 times
Dec 9 12:19:44 kernel: arp: 172.17.17.241 is on sk0 but got reply from 00:11:43:11:92:8b on em0
Dec 9 12:02:09 last message repeated 4 times
Dec 9 12:02:01 kernel: arp: 172.17.17.241 is on sk0 but got reply from 00:11:43:11:92:8b on em0
Dec 9 11:46:48 last message repeated 5 times
Dec 9 11:46:23 kernel: arp: 172.17.17.252 is on sk0 but got reply from 00:13:20:00:cb:cf on em0
Dec 9 11:45:39 kernel: arp: 172.17.17.247 is on sk0 but got reply from 00:06:5b:f3:2a:4b on em0
Intel Pro/1000 MT and 3Com 3C2000-T NICs.
I have just updated my box with your snapshot link and also went into the BIOS and disabled all unnecessary hardware. I hope this fixes it.
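Side note for anyone comparing logs like the ones in this thread: a rough way to see which interface is affected most is to tally the "watchdog timeout" lines per interface. The following is only a minimal Python sketch. The function name `count_watchdog_timeouts` and the regular expression are my own assumptions based on the log format shown above, not part of pfSense or FreeBSD.

```python
import re
from collections import Counter

# Matches kernel lines such as "Dec 9 14:08:14 kernel: sk0: watchdog timeout".
# The pattern is an assumption derived from the log excerpts in this thread.
WATCHDOG_RE = re.compile(r"kernel: (?P<ifname>[a-z]+\d+): watchdog timeout")


def count_watchdog_timeouts(lines):
    """Return a Counter mapping interface name -> number of watchdog timeouts."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group("ifname")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Dec 9 14:08:14 kernel: sk0: link state changed to DOWN",
        "Dec 9 14:08:14 kernel: sk0: watchdog timeout",
        "Dec 9 12:21:55 kernel: em0: watchdog timeout -- resetting",
        "Dec 7 03:43:00 kernel: xl1: watchdog timeout",
        "Dec 7 03:40:29 kernel: xl1: watchdog timeout",
    ]
    # Print the interfaces ordered by number of timeouts, highest first.
    for ifname, hits in count_watchdog_timeouts(sample).most_common():
        print(f"{ifname}: {hits}")
```

To use it on a real box, copy the system log to a machine with Python and pass the lines of that file to the function instead of the sample list.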
-
I upgraded pfSense 1.0.1 to 1.0.1-SNAPSHOT-12-08-2006, and also did a new install of 1.0.1-SNAPSHOT-12-08-2006, but I have the same problem.
…filtered…
Dec 10 17:48:43 kernel: xl1: watchdog timeout
Dec 10 17:19:51 kernel: xl0: watchdog timeout
Dec 10 17:01:23 kernel: xl1: watchdog timeout
Dec 10 16:36:45 kernel: xl1: watchdog timeout
Dec 10 15:13:29 kernel: xl1: watchdog timeout
Dec 10 12:28:23 kernel: xl1: watchdog timeout
Dec 10 12:18:24 kernel: xl1: watchdog timeout
-
If you start receiving these messages regularly, read below:
FreeBSD network device drivers (dc, xl, sk etc.) use watchdog timers to keep some statistics about the device driver and to tackle deadlock situations that may arise because of hardware issues. The timer is set to a specific value by the device driver and decremented by one once a second. If the timer expires before the network adapter has finished its job, the watchdog routine for this adapter is run.
The watchdog timer routine simply prints a diagnostic message, which is visible in /var/log/messages, and restarts the network adapter.
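The countdown described above can be pictured with a small toy model. This is only a Python sketch of the idea, not FreeBSD driver code. The class `WatchdogTimer`, its method names, and the default timeout of 5 seconds are made up for illustration; real drivers choose their own values.

```python
class WatchdogTimer:
    """Toy model of a per-interface watchdog countdown (not real driver code)."""

    def __init__(self, ifname, timeout_seconds=5):
        self.ifname = ifname
        self.timeout_seconds = timeout_seconds  # assumed value for illustration
        self.remaining = 0  # 0 means the timer is disarmed
        self.messages = []  # stand-in for the system log

    def start_transmit(self):
        # The driver arms the timer when it hands work to the adapter.
        self.remaining = self.timeout_seconds

    def transmit_done(self):
        # The adapter finished in time, so the timer is disarmed.
        self.remaining = 0

    def tick(self):
        """Called once per simulated second; returns True if the watchdog fired."""
        if self.remaining == 0:
            return False
        self.remaining -= 1
        if self.remaining == 0:
            self.messages.append(f"{self.ifname}: watchdog timeout")
            self.reset_adapter()
            return True
        return False

    def reset_adapter(self):
        # Stand-in for the driver reinitializing the hardware.
        self.messages.append(f"{self.ifname}: adapter restarted")


if __name__ == "__main__":
    timer = WatchdogTimer("vr0", timeout_seconds=3)
    timer.start_transmit()
    # The adapter never reports completion, so the third tick fires the watchdog.
    print([timer.tick() for _ in range(3)])
    print(timer.messages)
```

In this model a healthy adapter calls `transmit_done()` before the countdown reaches zero, so nothing is logged. The log message only appears when the hardware stays silent for the whole timeout.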
Briefly, you can try the following to get rid of the problem:
Many PCI network adapters require a PCI slot that supports bus mastering. Some old motherboards have this feature only on their first PCI slot (pci0). So plug your network adapter into the first PCI slot on your mainboard and see if that helps.
-
Hi, does the D-Link DFE-580TX network card support the traffic shaper?
-
Don't hijack the topic.
Post it in the traffic shaper section of the forum.
If you had searched there, you would have found this post, which tells you which interfaces support ALTQ:
http://forum.pfsense.org/index.php/topic,1686.msg9789.html#msg9789
-
Hi, is there a patch for FreeBSD 6.1 for the watchdog timeouts (bus mastering PCI)?
-
Hi, I changed the 3Com 3C905X card to a Planet card with a Realtek chipset (ENW-9503A) and disabled ACPI in pfSense 1.0.1 (hint.acpi.0.disabled=1), and it works perfectly!