I keep losing connectivity on my WAN after a few days

Guest

If I reboot the box or just release/renew on the WAN interface, it comes back.

This is what I see in my logs:

xl1: no carrier - transceiver cable problem?
xl1: link state changed to DOWN
xl1: link state changed to UP
xl1: watchdog timeout

I'm running
RELENG_1_SNAPSHOT-07-03-2006
built on Fri Jun 16 01:04:23 UTC 2006

Any ideas on how to resolve this?

regards /Fredde

sullrich

Sounds like hardware issues.

Check your hardware IRQs, etc. If the problem persists, try a newer image:

http://www.pfsense.com/~sullrich/RELENG_1_SNAPSHOT-07-06-2006/

Guest

@sullrich:

Sounds like hardware issues.

Check your hardware IRQs, etc. If the problem persists, try a newer image:

http://www.pfsense.com/~sullrich/RELENG_1_SNAPSHOT-07-06-2006/

This is starting to bug me. I've been using the same box with astor for roughly 1.5 years without any problems; now it dies after a few hours. It feels like the DHCP lease runs out or something.

I'm running
RELENG_1_SNAPSHOT-07-09-2006
built on Fri Jun 16 01:04:23 UTC 2006

I've been changing NICs, trying different PCI slots, and altering BIOS settings, and it's still the same problem.

Any more ideas as to what could be the problem?

regards /Fredde

sullrich

Check the ram and motherboard. Run diagnostics and memory tests.

eric

I'm having the exact same problem, and it only started recently, after running this server for about a year.
If I put a router between my WAN interface and my WAN connection, DHCP works fine. But the DHCP lease from my provider runs out and won't renew unless I do it manually.

Right now, with the other router, it's working fine, but it's very weird that it only started happening with (I think) RC1. I upgraded to the latest nightly and still have the same problem.

cheers
-Eric

sullrich

Run this from a shell:

/etc/rc.conf_mount_rw
killall check_reload_status
fetch -o /usr/local/sbin/check_reload_status http://www.pfsense.com/~sullrich/check_reload_status
chmod a+rx /usr/local/sbin/check_reload_status
/etc/rc.conf_mount_ro
check_reload_status

And let me know if the problem persists.

eric

thanks

I'll give it a go and let you know what happens ;)

magikman

I was having this same problem after updating on the 6th of July. However, as of now, it seems as though the updated check_reload_status build has fixed it, as I have not lost my connection in about a week.

I will repost here if this changes…

mdepot

I am having the same problem as well. I'm using RC2 and getting

xl1: watchdog timeout

This same box has been running IPCop for three years, so I doubt there's anything wrong with the hardware…

pjaromin

I'm seeing the same issue after upgrading from beta to RC-2. I'm using different hardware, but have already swapped out the NIC. Scott's suggestion didn't fix the problem – now there's an extra message in the logs about check_reload_status.

I'm using fairly cheap Linksys EG1032 NICs in this one as I'm out of Intel PRO-S cards.

snip from logs:

kernel: re1: watchdog timeout
kernel: re1: link state changed to DOWN
check_reload_status: rc.linkup starting
php: Hotplug event detected for re1 but ignoring since interface is not set for DHCP
kernel: re1: link state changed to UP
check_reload_status: rc.linkup starting

I'm going to re-install the old beta2 firewall (which I kept around in case of emergency). Any other suggestions?
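
Since several posts in this thread report the same "watchdog timeout" message on different drivers (xl, re, sk), it can help to tally the message per interface and see whether one port is always the culprit. A minimal Python sketch, assuming log lines shaped like the ones pasted in this thread; the function name `watchdog_counts` and the `SAMPLE` list are illustrative only and not part of pfSense:

```python
import re
from collections import Counter

# Matches "<iface>: watchdog timeout" anywhere in a line, with or without
# a syslog timestamp and "kernel:" prefix in front of it.
WATCHDOG_RE = re.compile(r"\b(\w+): watchdog timeout\b")

# Sample lines copied from posts in this thread (assumed format).
SAMPLE = [
    "xl1: watchdog timeout",
    "kernel: re1: watchdog timeout",
    "kernel: re1: link state changed to DOWN",
    "Sep 25 00:09:32\tkernel: sk0: watchdog timeout",
    "Sep 24 19:13:51\tkernel: sk1: watchdog timeout",
]


def watchdog_counts(lines):
    """Return a Counter mapping interface name -> number of watchdog timeouts."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for iface, count in sorted(watchdog_counts(SAMPLE).items()):
        print(iface, count)
```

Feeding it the full system log (one line per list entry) instead of `SAMPLE` would show whether the timeouts cluster on a single interface or are spread across all of them.
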

cheech

I am also getting the same "xl0 watchdog timeout" message repeatedly with RC2 (not sure about previous versions). This is happening on 3 different boxes. I wonder if this has any effect beyond minor annoyance? Is there any risk of packet loss occurring or media disconnects? Thanks!

narf

I've been having the same problems, always on the same interface.


Sep 25 00:27:01	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
Sep 25 00:27:00	kernel: sk0: link state changed to UP
Sep 25 00:09:35	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
Sep 25 00:09:32	kernel: sk0: link state changed to DOWN
Sep 25 00:09:32	kernel: sk0: watchdog timeout


Sep 24 19:22:11	php: : FTP proxy disabled for interface opt1 - ignoring.
Sep 24 19:22:11	kernel: sk1: link state changed to UP
Sep 24 19:13:55	php: : Hotplug event detected for sk1 but ignoring since interface is not set for DHCP
Sep 24 19:13:51	kernel: sk1: link state changed to DOWN
Sep 24 19:13:51	kernel: sk1: watchdog timeout

It happens occasionally at random times, sometimes up to twice a day. It's usually not down for more than an hour or so.

Any clues as to what causes this?
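
To put numbers on outages like these, the DOWN/UP pairs in the log can be matched up per interface. A minimal Python sketch, assuming syslog-style lines in the format pasted above; the function name `outages`, the `LOG` sample, and the `year` default are illustrative assumptions (syslog timestamps carry no year, and month names are parsed in the default English locale):

```python
import re
from datetime import datetime

# Link-state lines copied from the excerpts above (newest first, as in the GUI).
LOG = """\
Sep 25 00:27:00\tkernel: sk0: link state changed to UP
Sep 25 00:09:32\tkernel: sk0: link state changed to DOWN
Sep 24 19:22:11\tkernel: sk1: link state changed to UP
Sep 24 19:13:51\tkernel: sk1: link state changed to DOWN
"""

LINK_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+kernel:\s+(\w+): "
    r"link state changed to (UP|DOWN)$"
)


def outages(log_text, year=2006):
    """Return a list of (interface, down_time, duration) sorted by time."""
    events = []
    for line in log_text.splitlines():
        match = LINK_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            events.append((stamp, match.group(2), match.group(3)))

    events.sort()  # oldest first, regardless of the order in the log
    down_since = {}
    result = []
    for stamp, iface, state in events:
        if state == "DOWN":
            down_since[iface] = stamp
        elif iface in down_since:
            start = down_since.pop(iface)
            result.append((iface, start, stamp - start))
    return result


if __name__ == "__main__":
    for iface, start, duration in outages(LOG):
        print(iface, start, duration)
```

For the two excerpts above, this gives 8 minutes 20 seconds for sk1 and 17 minutes 28 seconds for sk0.
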

Edit: I'm using 3Com gigabit adapters:


$ dmesg | grep -i 3com
skc0: <3Com 3C940 Gigabit Ethernet> port 0x1000-0x10ff mem 0x40000000-0x40003fff irq 16 at device 4.0 on pci2
skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
skc1: <3Com 3C940 Gigabit Ethernet> port 0x1400-0x14ff mem 0x40100000-0x40103fff irq 18 at device 9.0 on pci2
skc1: 3Com Gigabit NIC (3C2000) rev. (0x1)
skc2: <3Com 3C940 Gigabit Ethernet> port 0x1800-0x18ff mem 0x40200000-0x40203fff irq 21 at device 10.0 on pci2
skc2: 3Com Gigabit NIC (3C2000) rev. (0x1)

And I'm using 1.0-SNAPSHOT-09-14-06, built on Thu Sep 14 21:19:15 UTC 2006, after updating RC2, which I installed from scratch using the LiveCD.

Guest

I'm about to give pfSense another try, and I'm just wondering whether this problem still persists.

/F

narf

I'm still having these problems, but there have been 4 snapshots since the one I am running, so it may have been resolved.

I'm going to update tomorrow morning (in about 12 hours) and try it out again….

I'll keep posting here...

Gertjan

@narf:

I've been having the same problems, always on the same interface.
...
Sep 25 00:27:01	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
...

This is a physical disconnect/connect event on your NIC.
Isolate it (remove it, replace it, or swap it with the LAN card) to check this card, or, more likely, change the cable.
Also check 'the other side' of the cable.

narf

@Gertjan:

@narf:
I've been having the same problems, always on the same interface.
...
Sep 25 00:27:01	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
...
This is a physical disconnect/connect event on your NIC.
Isolate it (remove it, replace it, or swap it with the LAN card) to check this card, or, more likely, change the cable.
Also check 'the other side' of the cable.

Nope. I've changed the NIC (both are brand new), the cable, even the switch port that it's on.

I'm going to update to the latest snapshot and see what gives.

Gertjan

A long shot: does the NIC, device name sk0, have FreeBSD driver troubles?
Try another brand. NICs with another chipset aren't expensive ;)
IMHO, it can only be the cable or the NIC.

narf

@Gertjan:

A long shot: does the NIC, device name sk0, have FreeBSD driver troubles?
Try another brand. NICs with another chipset aren't expensive ;)
IMHO, it can only be the cable or the NIC.

Alright, I'll try replacing the NICs with different ones. What strikes me as weird is that all three NICs in the box are of the same make and model (bought at the same time). sk1 and sk2 have no problems.

Gertjan

Ok.
Some more questions remain.
Is it the third card recognized at boot (regardless of which one it is, sk0, sk1 or sk2, or which PCI slot it is in) that causes the problem? What does dmesg say?
(I'm trying to determine system resource limits here, like IRQ sharing at the BIOS level, card level, or OS level, because some PCI implementations have a fixed number of possible IRQs to choose from, and OS drivers can have limits…)
Make the system work with 2 cards and try all combinations....
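
The IRQ-sharing question can be checked quickly against dmesg output. A minimal Python sketch, assuming probe lines shaped like the `dmesg | grep -i 3com` output posted earlier in this thread; the function names `devices_by_irq` and `shared_irqs` and the `DMESG` sample are illustrative only:

```python
import re
from collections import defaultdict

# Probe lines copied from the dmesg output posted earlier in this thread.
DMESG = """\
skc0: <3Com 3C940 Gigabit Ethernet> port 0x1000-0x10ff mem 0x40000000-0x40003fff irq 16 at device 4.0 on pci2
skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
skc1: <3Com 3C940 Gigabit Ethernet> port 0x1400-0x14ff mem 0x40100000-0x40103fff irq 18 at device 9.0 on pci2
skc2: <3Com 3C940 Gigabit Ethernet> port 0x1800-0x18ff mem 0x40200000-0x40203fff irq 21 at device 10.0 on pci2
"""

# Device name at the start of the line, "irq N" somewhere after it.
IRQ_RE = re.compile(r"^(\w+): .*\birq (\d+)\b")


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the list of devices that report it."""
    by_irq = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = IRQ_RE.match(line)
        if match:
            by_irq[int(match.group(2))].append(match.group(1))
    return dict(by_irq)


def shared_irqs(dmesg_text):
    """Return only the IRQs that more than one device reports."""
    return {
        irq: devices
        for irq, devices in devices_by_irq(dmesg_text).items()
        if len(devices) > 1
    }


if __name__ == "__main__":
    print(devices_by_irq(DMESG))
    print(shared_irqs(DMESG))
```

On the three lines posted earlier, each controller sits on its own IRQ (16, 18 and 21), so `shared_irqs` comes back empty. That only covers the devices in the grep output; running it over the full dmesg would show whether another device shares one of those IRQs.
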

Still sure that it isn't the other side?
Also try looping sk1 to sk2: open the WAN firewall on sk0 (give it a static IP!), open the SSH port and log in, and check it that way as well.

You'll manage.

freax

Hi all!

Is this solved in 1.0.1?
I upgraded to this version, but with 1.0 I was reinstalling the machine often!
I can't even get LAN access to it …

I installed one pfSense box in Africa, I'm now out of that country, and I think it happened there.

[]'s