WAN NIC losing link



  • WAN link keeps going down here. I downgraded to 2.1.5 and will probably need to leave it at that :(

    No idea where this watchdog timeout is coming from, but after it the WAN interface keeps requesting a new DHCP address and never gets one. Pulling out the cable or resetting the modem doesn't help; I need to reboot the pfSense box.

    Maybe it would help to change something in the advanced settings, like "Disable hardware checksum offload"?

    Apr 23 10:42:43 pfsense kernel: em1: Watchdog timeout -- resetting
    Apr 23 10:42:43 pfsense kernel: em1: Queue(0) tdh = 817, hw tdt = 786
    Apr 23 10:42:43 pfsense kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 817
    Apr 23 10:42:43 pfsense kernel: em1: link state changed to DOWN
    Apr 23 10:42:43 pfsense check_reload_status: Linkup starting em1
    Apr 23 10:42:44 pfsense php-fpm[49573]: /rc.linkup: DEVD Ethernet detached event for wan
    Apr 23 10:42:45 pfsense kernel: arpresolve: can't allocate llinfo for 81.82.192.1 on em1
    Apr 23 10:42:45 pfsense kernel: arpresolve: can't allocate llinfo for 81.82.192.1 on em1
    ...
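
The log excerpt above can be scanned for watchdog resets automatically. The following is a minimal sketch, assuming the syslog line layout shown in this post; the function names and the sample data are illustrative only and not part of pfSense.

```python
import re
from collections import Counter

# Matches lines such as:
#   Apr 23 10:42:43 pfsense kernel: em1: Watchdog timeout -- resetting
# The layout (timestamp, host, "kernel:", interface) is assumed from the post.
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<iface>\w+):\s+Watchdog timeout"
)


def find_watchdog_timeouts(lines):
    """Return a list of (timestamp, interface) for each watchdog timeout line."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.match(line.strip())
        if match:
            events.append((match.group("stamp"), match.group("iface")))
    return events


def count_by_interface(events):
    """Count watchdog timeouts per interface name."""
    return Counter(iface for _, iface in events)


# Sample input copied from the log lines quoted in the post.
SAMPLE_LOG = """\
Apr 23 10:42:43 pfsense kernel: em1: Watchdog timeout -- resetting
Apr 23 10:42:43 pfsense kernel: em1: Queue(0) tdh = 817, hw tdt = 786
Apr 23 10:42:43 pfsense kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 817
Apr 23 10:42:43 pfsense kernel: em1: link state changed to DOWN
""".splitlines()

if __name__ == "__main__":
    events = find_watchdog_timeouts(SAMPLE_LOG)
    for stamp, iface in events:
        print(f"{stamp} {iface}")
    print(dict(count_by_interface(events)))
```

Counting events per interface over a few days of logs would show whether the resets are confined to the WAN NIC and how often they occur.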
    


  • Have you tried switching the LAN and WAN connections?
    Are you sure there isn't a cable or hardware error causing this?



  • @TieT:

    Have you tried switching the LAN and WAN connections?
    Are you sure there isn't a cable or hardware error causing this?

    Yes, they have all been replaced: directly to the modem, a different cable, and a switch between the modem and pfSense.

    It only happens on 2.2.x and always takes a few days to show up.



  • And the physical ports of the pfSense box?



  • In a way :) I tried a different box (same hardware config).



  • @xtofh:

    In a way :) I tried a different box (same hardware config).

    Hmmm, then I would say the Telenet modem :)

    I don't have the problem on 2.2 anymore.
    Strange though…



  • I have the same issue at one location, also with Telenet as the ISP. (I do have multiple other sites, also on >2.2, that don't have the issue, but they are VMs.)

    Odd to say the least … in my situation, a power cycle of the modem solves it.



  • @heper:

    I have the same issue at one location, also with Telenet as the ISP. (I do have multiple other sites, also on >2.2, that don't have the issue, but they are VMs.)

    Odd to say the least … in my situation, a power cycle of the modem solves it.

    Thanks for the info. A modem reboot does not help here; I have to reboot the pfSense box (not a VM).

    We have several connections (including my residential Telenet connection at home) that work on 2.2 without a problem. The one that has the issue is a Telenet Business connection (with a static IP that gets assigned via DHCP).

    I reverted to 2.1.5 a few weeks ago for my office connection because losing our connection became too annoying.



  • Could be a coincidence, but I had the same issue today, also Telenet Business with a fixed IP. I'm also running 2.2.



  • @sandern:

    Could be a coincidence, but I had the same issue today, also Telenet Business with a fixed IP. I'm also running 2.2.

    I never had the issue on other types of internet subscriptions. You could try upgrading to 2.2.3? Has the issue recurred since last time?

    I hope to find the time to try this (upgrade to 2.2.3) in the next few weeks. Feel free to keep us posted..



  • Split this to its own thread since the other thread had 5 pages of unrelated history and ended up being caused by a dead hard drive.

    @xtofh:

    Apr 23 10:42:43 pfsense kernel: em1: Watchdog timeout -- resetting
    Apr 23 10:42:43 pfsense kernel: em1: Queue(0) tdh = 817, hw tdt = 786
    Apr 23 10:42:43 pfsense kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 817
    ...
    

    This looks to match this:
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199174

    which should be worked around if you disable MSI and MSIX.
    https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#MSI.2FMSIX
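
As a rough sketch of the workaround described above, assuming the standard FreeBSD loader tunables for MSI and MSI-X apply to this box, the following lines could be added to /boot/loader.conf.local (create the file if it does not exist), followed by a reboot:

```
hw.pci.enable_msi=0
hw.pci.enable_msix=0
```

These tunables are system-wide, so they affect every PCI device and not only em1. Check the linked page for the settings recommended for your version before applying them.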



  • @cmb

    I've experienced the same or a very similar issue on a crappy D-Link NIC (sk0 driver) that worked fine before upgrading from 2.1.x -> 2.2.x, so in my case it's a very similar situation with a non-Intel NIC. (I haven't found the time to swap NICs.)
    Currently I'm rebooting the firewall every morning at 5am and haven't received any reports like "please fix my internets".



  • @heper:

    @cmb

    I've experienced the same or a very similar issue on a crappy D-Link NIC (sk0 driver) that worked fine before upgrading from 2.1.x -> 2.2.x, so in my case it's a very similar situation with a non-Intel NIC. (I haven't found the time to swap NICs.)
    Currently I'm rebooting the firewall every morning at 5am and haven't received any reports like "please fix my internets".

    Could be the same solution for that, try disabling MSI and MSIX.

    In the Intel case, it appears the new driver wants to enable things that don't work on some small minority of cards. Could be a similar cause, though completely different root problem given different hardware and driver.



  • @cmb:

    Split this to its own thread since the other thread had 5 pages of unrelated history and ended up being caused by a dead hard drive.

    New thread: https://forum.pfsense.org/index.php?topic=96325



  • @cmb:

    @heper:

    @cmb

    I've experienced the same or a very similar issue on a crappy D-Link NIC (sk0 driver) that worked fine before upgrading from 2.1.x -> 2.2.x, so in my case it's a very similar situation with a non-Intel NIC. (I haven't found the time to swap NICs.)
    Currently I'm rebooting the firewall every morning at 5am and haven't received any reports like "please fix my internets".

    Could be the same solution for that, try disabling MSI and MSIX.

    In the Intel case, it appears the new driver wants to enable things that don't work on some small minority of cards. Could be a similar cause, though completely different root problem given different hardware and driver.

    Some rebranded cards, like the HP i350s, may have MSIX disabled in the firmware, but still identify as an Intel i350, which can cause issues. Nothing like saving $10 on a $140 card just to have driver issues or reduced performance because they've disabled or otherwise customized the card.