Intermittent issue since 2.3 upgrade (Watchdog timer, resetting)



  • Hi there

    Since the upgrade to 2.3, I'm having an intermittent issue whereby either the LAN or WAN card reports a "Watchdog timer, resetting" error.  There does not seem to be any pattern to this; all I can say is that it has started occurring since the 2.3 upgrade.  The server is a Dell PowerEdge 2970 with a dual port Broadcom NetXtreme II network card.

    I tried adding settings to /boot/loader.conf.local as per the "Card-specific issues" section on this page: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards, but that doesn't seem to have helped.  Shall I just get a new network card?  Any recommendations?  Any more info I can provide to help troubleshoot?

    Thanks in advance
    Peter



  • started occurring since the 2.3 upgrade.

    Have you considered a fresh, full install?

    The server is a Dell PowerEdge 2970 with a dual port Broadcom NetXtreme II network card.

    Have you thought about deactivating the

    issue whereby either the LAN or WAN card reports a "Watchdog timer, resetting" error.

    A new menu appears if watchdog timer hardware of any kind is detected on the machine where pfSense
    is installed, and this could be the problem here. It comes with a preset value such as "128", which
    you can tune or deactivate.

    I tried adding settings to /boot/loader.conf.local as per the "Card-specific issues" section on this page: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards, but that doesn't seem to have helped.

    If this card was working fine before the upgrade, the problem is probably not the card itself but
    rather the new watchdog timer feature. Again, this menu is only shown when watchdog timer hardware
    of some kind was detected during the install.

    Shall I just get a new network card?

    An Intel server-grade NIC with two or four ports would be better in my view, but that is for you to decide.

    Any recommendations?

    Intel i350 or i354 Quad Port NIC

    Any more info I can provide to help troubleshoot?

    Other users may be in the same boat as you, though not necessarily with the same underlying problem:
    Network Card Not being detected (igb(4))
    em0: watchdog timeout - resetting



  • Thanks for your advice - I think this is solved for us at least. 
    Looking at this page: https://www.freebsd.org/cgi/man.cgi?query=bce&sektion=4

    I set
    hw.bce.msi_enable=0

    Rather than
    hw.pci.enable_msix=0

    which is from https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards and potentially needs updating?
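
    For anyone comparing the two settings, here is a rough sketch of how they would sit in /boot/loader.conf.local. It assumes the usual loader.conf name="value" syntax and that the file is only read at boot, so a reboot is needed after editing. The comments reflect the bce(4) man page and the pfSense doc page linked above; this is not a confirmed fix.

    ```
    # /boot/loader.conf.local

    # Driver-specific tunable from bce(4): disables MSI for bce interfaces only.
    hw.bce.msi_enable="0"

    # System-wide alternative from the pfSense doc page: disables MSI-X for all
    # PCI devices. Left commented out here because the bce-specific setting is
    # the one being tried.
    # hw.pci.enable_msix="0"
    ```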

    Hasn't crashed in the last 24 hours with "Watchdog timer error, resetting!", so fingers crossed this works permanently.

    Thanks again
    Peter



  • Just had the same issue again - Watchdog timeout, resetting! - only way to fix this is a reboot  :(

    Back to the drawing board… unless anyone has any other ideas

    Thanks, hopefully
    Peter



  • I have the very same issue.

    Running 2.3 on an old but reliable IBM x336 with two Broadcom BCM5750 NICs. The watchdog timeout message appears and traffic stops on the interfaces; this occurs randomly every couple of days.

    Setting hw.bge.allow_asf=0 on kernel load (as suggested by The Internet) did not resolve the issue.
    Now trying dev.bge.0.msi=0 (will report in a couple of days)
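
    For reference, the two settings above would look roughly like this in /boot/loader.conf.local. This is a sketch, not a confirmed fix: it assumes the affected NIC attaches as bge0 (the unit number in dev.bge.0.msi), and the commented-out bge1 line is a hypothetical entry for the second card.

    ```
    # /boot/loader.conf.local

    # bge(4) tunable: do not use the ASF feature (already tried, did not help).
    hw.bge.allow_asf="0"

    # bge(4) per-device tunable: disable MSI on the first bge interface.
    dev.bge.0.msi="0"

    # Assumption: the second NIC attaches as bge1 and needs its own entry.
    # dev.bge.1.msi="0"
    ```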

    This must be some regression in the bge driver in FreeBSD 10.3, since I can also confirm it never happened before on pfSense 2.2.x.

    Any ideas? I've been looking at the recent commits to the bge driver, but obviously I understand almost none of it  :-[

    johnsonp, does your card use the bge or bce driver?



  • If you're using IPsec, it's probably https://redmine.pfsense.org/issues/6296



  • @cmb:

    If you're using IPsec, it's probably https://redmine.pfsense.org/issues/6296

    I am!! The problem description fits exactly. I also have 2 CPUs on this server. Furthermore, at the same time I also upgraded an Alix 2D3 box on a remote site with pretty much the same configuration, but it has been rock solid.

    I will temporarily disable one CPU, as you suggested in another topic.

    Thanks a lot for the heads up!!!



    Sorry for the delay replying here - I have bce cards and yes, I'm also using IPsec.
    We've managed to live with it for the time being (we can get crashes 2 or 3 times a day, or go a week with everything working fine). We're hopeful that 2.3.1 will resolve the problem.

    Fingers crossed it's released this week.
    Peter

