I keep losing connectivity on my WAN after a few days



  • If I reboot the box or just release/renew on the WAN interface, it comes back.

    This is what I see in my logs:

    xl1: no carrier - transceiver cable problem?
    xl1: link state changed to DOWN
    xl1: link state changed to UP
    xl1: watchdog timeout

    I'm running
    RELENG_1_SNAPSHOT-07-03-2006
    built on Fri Jun 16 01:04:23 UTC 2006

    Any ideas on how to resolve this?

    regards /Fredde
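
To see how often the driver reports this, and on which interface, the kernel messages quoted above can be tallied. A minimal sketch, assuming the log text arrives on stdin in the same shape as the lines in this post; `count_watchdog` is a hypothetical helper name, not a pfSense or FreeBSD tool:

```shell
# Hypothetical helper: count "watchdog timeout" messages per interface.
# Reads log text on stdin and prints "<interface> <count>", sorted by name.
count_watchdog() {
    awk '/watchdog timeout/ {
        # Take the first token that looks like a device name, e.g. "xl1:".
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[a-z]+[0-9]+:$/) {
                dev = $i
                sub(/:$/, "", dev)
                n[dev]++
                break
            }
    }
    END { for (d in n) print d, n[d] }' | sort
}
```

For example, `dmesg | count_watchdog` tallies whatever is still in the kernel message buffer. Piping the system log through it works the same way; how that log is read on a given pfSense build (for instance with `clog /var/log/system.log`) is an assumption to verify on your own install.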



  • Sounds like hardware issues.

    Check your hardware IRQs, etc. If the problem persists, try a newer image:

    http://www.pfsense.com/~sullrich/RELENG_1_SNAPSHOT-07-06-2006/



  • @sullrich:

    Sounds like hardware issues.

    Check your hardware IRQs, etc. If the problem persists, try a newer image:

    http://www.pfsense.com/~sullrich/RELENG_1_SNAPSHOT-07-06-2006/

    This is starting to bug me. I've been using the same box with astor for roughly 1.5 years without any problems; now it dies after a few hours. It feels like the DHCP lease runs out or something.

    I'm running
    RELENG_1_SNAPSHOT-07-09-2006
    built on Fri Jun 16 01:04:23 UTC 2006

    I've tried changing NICs, different PCI slots, and altering BIOS settings, and still have the same problem.

    Any more ideas as to what could be the problem?

    regards /Fredde



  • Check the RAM and motherboard. Run diagnostics and memory tests.



  • I'm having the exact same problem, and it only started recently, after running this server for about a year.
    If I put a router between my WAN interface and my WAN, DHCP works fine, but the DHCP lease from my provider runs out and won't renew unless I do it manually.

    Right now, with the other router, it's working fine, but it's very weird that it only started happening with (I think) RC1. I upgraded to the latest nightly and still have the same problem.

    cheers
    -Eric



  • Run this from a shell:

    /etc/rc.conf_mount_rw
    killall check_reload_status
    fetch -o /usr/local/sbin/check_reload_status http://www.pfsense.com/~sullrich/check_reload_status
    chmod a+rx /usr/local/sbin/check_reload_status
    /etc/rc.conf_mount_ro
    check_reload_status

    And let me know if the problem persists.



  • Thanks.

    I'll give it a go and let you know what happens  ;)



  • I was having this same problem after updating on the 6th of July. However, as of now, it seems as though the updated check_reload_status build has fixed this problem, as I have not lost my connection in about a week now.

    I will repost here if this changes…



  • I am having the same problem as well.  I'm using RC2 and getting

    xl1: watchdog timeout

    This same box has been running IPCop for three years, so I doubt there's anything wrong with the hardware…



  • I'm seeing the same issue after upgrading from beta to RC-2. I'm using different hardware, but have already swapped out the NIC. Scott's suggestion didn't fix the problem – now there's an extra message in the logs about check_reload_status.

    I'm using fairly cheap Linksys EG1032 NICs in this one as I'm out of Intel PRO-S cards.

    snip from logs:

    kernel: re1: watchdog timeout
    kernel: re1: link state changed to DOWN
    check_reload_status: rc.linkup starting
    php: Hotplug event detected for re1 but ignoring since interface is not set for DHCP
    kernel: re1: link state changed to UP
    check_reload_status: rc.linkup starting

    I'm going to re-install the old beta2 firewall (which I kept around in case of emergency). Any other suggestions?



  • I am also getting the same "xl0 watchdog timeout" messages repeatedly with RC2 (not sure about previous versions); this is happening on 3 different boxes. I wonder if this has any effect beyond minor annoyance? Is there any risk of packet loss occurring or media disconnects? Thanks!



  • I've been having the same problems, always on the same interface.

    
    Sep 25 00:27:01	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
    Sep 25 00:27:00	kernel: sk0: link state changed to UP
    Sep 25 00:09:35	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
    Sep 25 00:09:32	kernel: sk0: link state changed to DOWN
    Sep 25 00:09:32	kernel: sk0: watchdog timeout
    
    
    
    Sep 24 19:22:11	php: : FTP proxy disabled for interface opt1 - ignoring.
    Sep 24 19:22:11	kernel: sk1: link state changed to UP
    Sep 24 19:13:55	php: : Hotplug event detected for sk1 but ignoring since interface is not set for DHCP
    Sep 24 19:13:51	kernel: sk1: link state changed to DOWN
    Sep 24 19:13:51	kernel: sk1: watchdog timeout
    
    

    It happens occasionally at random times, sometimes up to twice a day. It's usually not down for more than an hour or so.

    Any clues as to what causes this?

    Edit: I'm using 3Com gigabit adapters:

    
    $ dmesg | grep -i 3com
    skc0: <3Com 3C940 Gigabit Ethernet> port 0x1000-0x10ff mem 0x40000000-0x40003fff irq 16 at device 4.0 on pci2
    skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
    skc1: <3Com 3C940 Gigabit Ethernet> port 0x1400-0x14ff mem 0x40100000-0x40103fff irq 18 at device 9.0 on pci2
    skc1: 3Com Gigabit NIC (3C2000) rev. (0x1)
    skc2: <3Com 3C940 Gigabit Ethernet> port 0x1800-0x18ff mem 0x40200000-0x40203fff irq 21 at device 10.0 on pci2
    skc2: 3Com Gigabit NIC (3C2000) rev. (0x1)
    
    

    I'm using 1.0-SNAPSHOT-09-14-06, built on Thu Sep 14 21:19:15 UTC 2006, after an update from RC2, which I installed from scratch using the LiveCD.
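
The length of each outage can be read off the paired DOWN/UP timestamps in the log excerpts above. A rough sketch, assuming syslog-style lines (`Mon DD HH:MM:SS`, then `kernel:`, then the interface name) in either newest-first or oldest-first order, and outages shorter than 24 hours; `link_downtime` is a hypothetical helper name:

```shell
# Hypothetical helper: report how long each interface stayed down.
# Expects lines like "Sep 25 00:09:32 kernel: sk0: link state changed to DOWN".
link_downtime() {
    awk '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
    function report(name, d, u,   gap) {
        gap = u - d
        if (gap < 0) gap += 86400      # outage crossed midnight
        printf "%s down for %d s\n", name, gap
    }
    /link state changed to (UP|DOWN)/ {
        ifc = $5
        sub(/:$/, "", ifc)
        t = secs($3)
        if ($NF == "DOWN") {
            # Newest-first logs show the UP line before its DOWN line.
            if (ifc in up) { report(ifc, t, up[ifc]); delete up[ifc] }
            else down[ifc] = t
        } else {
            # Oldest-first logs show the DOWN line before its UP line.
            if (ifc in down) { report(ifc, down[ifc], t); delete down[ifc] }
            else up[ifc] = t
        }
    }'
}
```

Fed the two excerpts from this post, it reports `sk0 down for 1048 s` and `sk1 down for 500 s`, i.e. about 17 and 8 minutes.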



  • I'm about to give pfSense another try; just wondering if this problem still persists or not.

    /F



  • I'm still having these problems, but there have been 4 snapshots since the one I am running, so it may have been resolved.

    I'm going to update tomorrow morning (in about 12 hours) and try it out again…

    I'll keep posting here...



  • @narf:

    I've been having the same problems, always on the same interface.

    ...
    Sep 25 00:27:01	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
    ...
    

    This is a physical disconnect/connect event on your NIC.
    Isolate it (remove it, replace it, or swap it with the LAN card) to check this card or, more likely, change the cable.
    Also check 'the other side' of the cable.



  • @Gertjan:

    @narf:

    I've been having the same problems, always on the same interface.

    ...
    Sep 25 00:27:01	php: : Hotplug event detected for sk0 but ignoring since interface is not set for DHCP
    ...
    

    This is a physical disconnect/connect event on your NIC.
    Isolate it (remove it, replace it, or swap it with the LAN card) to check this card or, more likely, change the cable.
    Also check 'the other side' of the cable.

    Nope. I've changed the NIC (both are brand new), the cable, even the switch port that it's on.

    I'm going to update to the latest snapshot and see what gives.



  • A long shot: maybe the FreeBSD driver for this NIC (device name sk0) has problems?
    Try another brand or another chipset; NICs aren't expensive  ;)
    IMHO, it can only be the cable or the NIC.



  • @Gertjan:

    A long shot: maybe the FreeBSD driver for this NIC (device name sk0) has problems?
    Try another brand or another chipset; NICs aren't expensive  ;)
    IMHO, it can only be the cable or the NIC.

    Alright, I'll try replacing the NICs with different ones. What strikes me as weird is that all three NICs in the box are the same make and model (bought at the same time), and sk1 and sk2 have no problems.



  • OK.
    Some more questions remain.
    Is it always the third card recognized at boot that causes the problem, regardless of which physical card it is (sk0, sk1, or sk2) or which PCI slot it is in? What does dmesg say?
    (I'm trying to determine whether you're hitting system resource limits such as IRQ sharing at the BIOS, card, or OS level, because some PCI implementations have a fixed number of IRQs to choose from, and OS drivers can have limits too…)
    Get the system working with 2 cards and try all combinations.

    Still sure that it isn't the other side?
    Also try looping sk1 to sk2, open the WAN firewall on sk0 (give it a static IP!), open the SSH port and log in, and check it that way too.

    You'll manage.
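
One way to check the IRQ-sharing theory is to pull the `irq N` field out of each dmesg probe line and list any IRQ claimed by more than one device. A sketch under the assumption that the lines look like the `skc0: <...> ... irq 16 at device ...` output posted earlier in the thread; `shared_irqs` is a hypothetical helper name:

```shell
# Hypothetical helper: list IRQs that more than one device reports in dmesg.
# Reads dmesg-style probe lines on stdin; prints "irq N: dev1 dev2 ...".
shared_irqs() {
    awk '{
        for (i = 2; i < NF; i++)
            if ($i == "irq" && $(i + 1) ~ /^[0-9]+$/) {
                dev = $1
                sub(/:$/, "", dev)
                irq = $(i + 1)
                if (irq in devs) devs[irq] = devs[irq] " " dev
                else devs[irq] = dev
                count[irq]++
                break
            }
    }
    END {
        for (irq in count)
            if (count[irq] > 1) print "irq " irq ": " devs[irq]
    }' | sort
}
```

`dmesg | shared_irqs` prints nothing when every device has its own IRQ, which is the case for the three skc lines quoted earlier (IRQs 16, 18 and 21). Shared IRQs are common and not necessarily a fault, so a hit is a lead rather than a diagnosis. On FreeBSD, `vmstat -i` shows per-device interrupt counts for comparison.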



  • Hi all!

    Is this solved in 1.0.1?
    I upgraded to this version, but with 1.0 I was reinstalling the machine often!
    I can't even get LAN access to it…

    I installed a pfSense box in Africa; I'm out of that country now, and I think it has happened there too.

    []'s



  • @freax:

    Hi all!

    Is this solved in 1.0.1?
    I upgraded to this version, but with 1.0 I was reinstalling the machine often!
    I can't even get LAN access to it…

    I installed a pfSense box in Africa; I'm out of that country now, and I think it has happened there too.

    []'s

    Who knows… You are the only person with this problem.



  • How can I debug this?
    Is there a log file I can read besides the web interface?



  • No idea, honestly.  It should "just work".



  • Anyway, I think there is a bug when the ISP gives you a connection without an IP.
    But ok, thanks.



  • Uhm, that would not be a bug?  If the ISP does not hand out an IP then there is nothing pfSense can do.



  • But then it won't give access to the firewall's web interface or the other services in the DMZ!
    This is the bug.

    This leads to a firewall reset. Nothing more.



  • Sounds like hardware issues to me. Why should not receiving a DHCP lease on request cause a reset?



  • No. You didn't understand:

    The reset at the console is what I have to do, because the reboot won't work, access to the web interface won't work, access to hosts behind the DMZ won't work, …

    The firewall goes dead when the WAN connection comes up but doesn't receive an IP from the ISP.

    []'s



  • I agree with Holger.  Sounds like hardware issues.



  • If you lose the WAN IP and use NAT reflection to access the DMZ, it won't work anymore, because you have lost the IP that gets reflected. You may even lose the DNS needed to resolve the WAN IP in the first place if it is a DynDNS account. That makes sense. I still think something is wrong with your WAN, or maybe even with your ISP.

