Netgate Discussion Forum
    Em0 goes down, then I get Watchdog timeout with filter reset

    • shanis42

      I saw the post with someone having Watchdog timeouts with Realtek adapters. I am getting similar issues, but with Intel NICs. I have an onboard NIC that I am using for WAN, and a PCIe card for the LAN side.

      I am running 2.2-RELEASE, fully updated.

      This was a log from a half hour ago (newest on top):

      Mar 17 00:35:08 php-fpm[20878]: /index.php: Successful login for user 'admin' from: 192.168.0.12
      Mar 17 00:35:08 php-fpm[20878]: /index.php: Successful login for user 'admin' from: 192.168.0.12
      Mar 17 00:35:02 php-fpm[20878]: /index.php: webConfigurator authentication error for 'admin' from 192.168.0.12
      Mar 17 00:35:02 php-fpm[20878]: /index.php: webConfigurator authentication error for 'admin' from 192.168.0.12
      Mar 17 00:21:08 check_reload_status: Reloading filter
      Mar 17 00:21:08 php-fpm[74788]: /rc.newwanip: rc.newwanip: on (IP address: x.x.x.x) (interface: WAN[wan]) (real interface: em0).
      Mar 17 00:21:08 php-fpm[74788]: /rc.newwanip: rc.newwanip: Info: starting on em0.
      Mar 17 00:21:07 check_reload_status: rc.newwanip starting em0
      Mar 17 00:21:07 php-fpm[74788]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (x.x.x.x )
      Mar 17 00:21:06 kernel: em0: link state changed to UP
      Mar 17 00:21:06 check_reload_status: Linkup starting em0
      Mar 17 00:21:05 php-fpm[74788]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (x.x.x.x )
      Mar 17 00:21:04 check_reload_status: Linkup starting em0
      Mar 17 00:21:04 kernel: em0: link state changed to DOWN
      Mar 17 00:21:04 kernel: em0: TX(0) desc avail = 31,Next TX to Clean = 325
      Mar 17 00:21:04 kernel: em0: Queue(0) tdh = 325, hw tdt = 294
      Mar 17 00:21:04 kernel: em0: Watchdog timeout -- resetting
      Mar 17 00:20:47 bandwidthd: Previouse graphing run not complete... Skipping current run
      Mar 17 00:20:47 bandwidthd: Previouse graphing run not complete... Skipping current run
      Mar 17 00:19:09 bandwidthd: DNS timeout for 192.168.0.11: This problem reduces graphing performance
      Mar 17 00:19:08 bandwidthd: DNS timeout for 192.168.0.11: This problem reduces graphing performance
      Mar 17 00:17:37 bandwidthd: DNS timeout for 192.168.0.11: This problem reduces graphing performance
      Mar 17 00:17:36 bandwidthd: DNS timeout for 192.168.0.11: This problem reduces graphing performance
      Mar 17 00:16:44 php-fpm[24944]: /interfaces.php: Creating rrd update script
      Mar 17 00:16:44 check_reload_status: Reloading filter
      Mar 17 00:16:43 check_reload_status: Reloading filter
      Mar 17 00:16:43 php-fpm[74788]: /rc.newwanip: rc.newwanip: on (IP address: x.x.x.x) (interface: WAN[wan]) (real interface: em0).
      Mar 17 00:16:43 php-fpm[74788]: /rc.newwanip: rc.newwanip: Info: starting on em0.
      Mar 17 00:16:42 check_reload_status: rc.newwanip starting em0
      Mar 17 00:16:42 php-fpm[74788]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (x.x.x.x )
      Mar 17 00:16:41 kernel: em0: link state changed to UP
      Mar 17 00:16:41 check_reload_status: Linkup starting em0
      Mar 17 00:16:40 check_reload_status: updating dyndns wan
      Mar 17 00:16:39 php-fpm[74788]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (x.x.x.x )
      Mar 17 00:16:38 check_reload_status: Restarting ipsec tunnels
      Mar 17 00:16:38 php-fpm[24944]: /interfaces.php: ROUTING: setting default route to x.x.x.y
      Mar 17 00:16:38 kernel: em0: link state changed to DOWN
      Mar 17 00:16:38 check_reload_status: Linkup starting em0

      Then while trying to post this I got (again newest on top):

      Mar 17 00:50:56 check_reload_status: Reloading filter
      Mar 17 00:50:56 php-fpm[2495]: /rc.newwanip: rc.newwanip: on (IP address: 97.76.50.156) (interface: WAN[wan]) (real interface: em0).
      Mar 17 00:50:56 php-fpm[2495]: /rc.newwanip: rc.newwanip: Info: starting on em0.
      Mar 17 00:50:55 check_reload_status: rc.newwanip starting em0
      Mar 17 00:50:55 php-fpm[2495]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (97.76.50.156 )
      Mar 17 00:50:54 kernel: em0: link state changed to UP
      Mar 17 00:50:54 check_reload_status: Linkup starting em0
      Mar 17 00:50:53 php-fpm[2495]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (97.76.50.156 )
      Mar 17 00:50:52 check_reload_status: Linkup starting em0
      Mar 17 00:50:52 kernel: em0: link state changed to DOWN
      Mar 17 00:50:52 kernel: em0: TX(0) desc avail = 31,Next TX to Clean = 968
      Mar 17 00:50:52 kernel: em0: Queue(0) tdh = 968, hw tdt = 937
      Mar 17 00:50:52 kernel: em0: Watchdog timeout -- resetting

      If it is like the other post, is it possible there is a driver issue, or some tweaks that need to be done? Cables? The hardware is brand new, so I don't think I have bad hardware.
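For future searchers: one quick way to see how often the watchdog fires is to count the matching kernel lines per interface. This is only a sketch. It assumes the log text arrives on stdin in the same format as the excerpts above, and the function name `count_watchdog` is made up for illustration.

```shell
# count_watchdog: read syslog-style lines on stdin and print
# "<interface> <count>" for each interface that logged a watchdog timeout.
# The function name is hypothetical; the matched text comes from the log above.
# Output order is not guaranteed when several interfaces are present.
count_watchdog() {
    awk '/kernel: [a-z]+[0-9]+: Watchdog timeout/ {
             # The interface name is the field right after "kernel:", e.g. "em0:"
             for (i = 1; i < NF; i++) {
                 if ($i == "kernel:") {
                     ifname = $(i + 1)
                     sub(/:$/, "", ifname)
                     hits[ifname]++
                     break
                 }
             }
         }
         END { for (n in hits) print n, hits[n] }'
}
```

Feed it whatever produces your system log as text. On pfSense versions that keep circular logs, that is probably something like `clog /var/log/system.log | count_watchdog`, but treat that as an assumption and check it on your own install.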

      • stephenw10 (Netgate Administrator)

        What hardware are you running?

        Check the various em sysctl error counters. Try disabling MSI/MSI-X.

        Steve

        • shanis42

          Excuse my newbie-ness. I have never created a loader.conf.local file, but I am sure I can figure it out. Regarding the sysctl error counters, where are they?

          I have a small form factor desktop running it, 4GB RAM, Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz.

          I just did an update to 2.2.1 and now I am noticing it is running the i386 package. So perhaps a move over to the x64 version is in order.

          • shanis42

            I was able to add the two lines to my loader.conf file. Then restarted.

            Is there any way to check if MSI is disabled correctly?

            I just added:
            hw.pci.enable_msix=0
            hw.pci.enable_msi=0

            to the loader.conf file in Diagnostics->Edit File, then restarted.

            • shanis42

              Did a fresh install of the x64 version, upgraded to 2.2.1 and disabled MSI/MSI-X in loader.conf. So far so good, but it's only been a couple of hours.

              • stephenw10 (Netgate Administrator)

                Yes, run 64-bit if your system is capable of it.
                The file loader.conf is overwritten on a firmware update and is changed by various settings in the GUI, so you should use loader.conf.local instead. It doesn't exist by default, so you need to create it. You could, for example, do:

                echo 'hw.pci.enable_msix=0' > /boot/loader.conf.local
                

                Run that either at the command line or in the Diagnostics > Command Prompt box, then edit the file to add the other line(s).
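If you would rather not hand-edit the file, the same result can be sketched as a small helper that appends a tunable only when it is not already present, so running it twice does no harm. The helper name `set_tunable` and its argument order are made up here; nothing like it ships with pfSense.

```shell
# set_tunable FILE KEY VALUE
# Append KEY=VALUE to FILE unless a line for exactly KEY is already there.
# Helper name and argument order are illustrative assumptions.
set_tunable() {
    file=$1 key=$2 value=$3
    touch "$file"    # make sure the file exists before reading it
    # awk compares the text before "=" exactly, so hw.pci.enable_msi
    # is not confused with hw.pci.enable_msix.
    if awk -F= -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$file"; then
        echo "$key already set in $file; leaving it alone"
    else
        echo "${key}=${value}" >> "$file"
    fi
}
```

On the firewall it would be called as `set_tunable /boot/loader.conf.local hw.pci.enable_msix 0` and then again with `hw.pci.enable_msi`. A reboot is still needed before loader tunables take effect.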

                The em counters are accessed using sysctl from the command line. For example:

                [2.2-RELEASE][root@xtm5.localdomain]/root: sysctl dev.em.0
                dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.4.2
                dev.em.0.%driver: em
                dev.em.0.%location: slot=0 function=0
                dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x0000 class=0x020000
                dev.em.0.%parent: pci2
                dev.em.0.nvm: -1
                dev.em.0.debug: -1
                dev.em.0.fc: 3
                dev.em.0.rx_int_delay: 0
                dev.em.0.tx_int_delay: 66
                dev.em.0.rx_abs_int_delay: 66
                dev.em.0.tx_abs_int_delay: 66
                dev.em.0.itr: 488
                dev.em.0.rx_processing_limit: 100
                dev.em.0.eee_control: 1
                dev.em.0.link_irq: 0
                dev.em.0.mbuf_alloc_fail: 0
                dev.em.0.cluster_alloc_fail: 0
                dev.em.0.dropped: 0
                dev.em.0.tx_dma_fail: 0
                dev.em.0.rx_overruns: 0
                dev.em.0.watchdog_timeouts: 0
                dev.em.0.device_control: 1049160
                dev.em.0.rx_control: 0
                dev.em.0.fc_high_water: 18432
                dev.em.0.fc_low_water: 16932
                dev.em.0.queue0.txd_head: 0
                dev.em.0.queue0.txd_tail: 0
                dev.em.0.queue0.tx_irq: 0
                dev.em.0.queue0.no_desc_avail: 0
                dev.em.0.queue0.rxd_head: 0
                dev.em.0.queue0.rxd_tail: 0
                dev.em.0.queue0.rx_irq: 0
                dev.em.0.mac_stats.excess_coll: 0
                dev.em.0.mac_stats.single_coll: 0
                dev.em.0.mac_stats.multiple_coll: 0
                dev.em.0.mac_stats.late_coll: 0
                dev.em.0.mac_stats.collision_count: 0
                dev.em.0.mac_stats.symbol_errors: 0
                dev.em.0.mac_stats.sequence_errors: 0
                dev.em.0.mac_stats.defer_count: 0
                dev.em.0.mac_stats.missed_packets: 0
                dev.em.0.mac_stats.recv_no_buff: 0
                dev.em.0.mac_stats.recv_undersize: 0
                dev.em.0.mac_stats.recv_fragmented: 0
                dev.em.0.mac_stats.recv_oversize: 0
                dev.em.0.mac_stats.recv_jabber: 0
                dev.em.0.mac_stats.recv_errs: 0
                dev.em.0.mac_stats.crc_errs: 0
                dev.em.0.mac_stats.alignment_errs: 0
                dev.em.0.mac_stats.coll_ext_errs: 0
                dev.em.0.mac_stats.xon_recvd: 0
                dev.em.0.mac_stats.xon_txd: 0
                dev.em.0.mac_stats.xoff_recvd: 0
                dev.em.0.mac_stats.xoff_txd: 0
                dev.em.0.mac_stats.total_pkts_recvd: 0
                dev.em.0.mac_stats.good_pkts_recvd: 0
                dev.em.0.mac_stats.bcast_pkts_recvd: 0
                dev.em.0.mac_stats.mcast_pkts_recvd: 0
                dev.em.0.mac_stats.rx_frames_64: 0
                dev.em.0.mac_stats.rx_frames_65_127: 0
                dev.em.0.mac_stats.rx_frames_128_255: 0
                dev.em.0.mac_stats.rx_frames_256_511: 0
                dev.em.0.mac_stats.rx_frames_512_1023: 0
                dev.em.0.mac_stats.rx_frames_1024_1522: 0
                dev.em.0.mac_stats.good_octets_recvd: 0
                dev.em.0.mac_stats.good_octets_txd: 0
                dev.em.0.mac_stats.total_pkts_txd: 0
                dev.em.0.mac_stats.good_pkts_txd: 0
                dev.em.0.mac_stats.bcast_pkts_txd: 0
                dev.em.0.mac_stats.mcast_pkts_txd: 0
                dev.em.0.mac_stats.tx_frames_64: 0
                dev.em.0.mac_stats.tx_frames_65_127: 0
                dev.em.0.mac_stats.tx_frames_128_255: 0
                dev.em.0.mac_stats.tx_frames_256_511: 0
                dev.em.0.mac_stats.tx_frames_512_1023: 0
                dev.em.0.mac_stats.tx_frames_1024_1522: 0
                dev.em.0.mac_stats.tso_txd: 0
                dev.em.0.mac_stats.tso_ctx_fail: 0
                dev.em.0.interrupts.asserts: 0
                dev.em.0.interrupts.rx_pkt_timer: 0
                dev.em.0.interrupts.rx_abs_timer: 0
                dev.em.0.interrupts.tx_pkt_timer: 0
                dev.em.0.interrupts.tx_abs_timer: 0
                dev.em.0.interrupts.tx_queue_empty: 0
                dev.em.0.interrupts.tx_queue_min_thresh: 0
                dev.em.0.interrupts.rx_desc_min_thresh: 0
                dev.em.0.interrupts.rx_overrun: 0
                
                

                Steve
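That is a long list to read by eye. A rough sketch for narrowing it down is to print only the error-style counters that are not zero. The function name `nonzero_errors` is hypothetical, and the choice of which counter names count as "errors" is a guess based on the names in the listing above, not anything the driver documents.

```shell
# nonzero_errors: read "name: value" lines (as printed by sysctl) on stdin
# and print only the error-style counters whose value is not zero.
# The name pattern below is an assumption drawn from the listing above.
nonzero_errors() {
    awk -F': ' '
        $1 ~ /(fail|dropped|overrun|watchdog_timeouts|no_desc_avail|_errs|_errors|_coll|collision|missed_packets|recv_no_buff)/ && $2 + 0 != 0 {
            print $1 ": " $2
        }'
}
```

Typical use would be `sysctl dev.em.0 | nonzero_errors`. Empty output means none of the matched counters has moved.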

                • shanis42

                  Hello Stephen,

                  Those are good instructions for someone a little green like me.

                  I created the loader.conf.local file with the command you provided, then went into "Edit File" and added the second line. I removed the lines from loader.conf, then rebooted.

                  I also ran sysctl dev.em.0 from the command prompt. The packet information is different for somewhat obvious reasons. But other than that, the only section that is radically different is pasted below.

                  dev.em.0.watchdog_timeouts: 9
                  dev.em.0.device_control: 1477444160
                  dev.em.0.rx_control: 67141634
                  dev.em.0.fc_high_water: 8192
                  dev.em.0.fc_low_water: 6692
                  dev.em.0.queue0.txd_head: 466
                  dev.em.0.queue0.txd_tail: 466
                  dev.em.0.queue0.tx_irq: 0
                  dev.em.0.queue0.no_desc_avail: 0
                  dev.em.0.queue0.rxd_head: 601
                  dev.em.0.queue0.rxd_tail: 600
                  

                  You have all zeros there. I don't know if this is just different usage in our configurations. The counter must reset at some point, maybe when I clear my logs out, because I know I have had more than 9 watchdog timeouts.

                  • shanis42

                    I was able to get this from pciconf -lvbc:

                    em0@pci0:0:25:0:	class=0x020000 card=0xb049144d chip=0x10bd8086 rev=0x02 hdr=0x00
                        class      = network
                        subclass   = ethernet
                        bar   [10] = type Memory, range 32, base 0xfc480000, size 131072, enabled
                        bar   [14] = type Memory, range 32, base 0xfc4a5000, size 4096, enabled
                        bar   [18] = type I/O Port, range 32, base 0x1820, size 32, enabled
                        cap 01[c8] = powerspec 2  supports D0 D3  current D0
                        cap 05[d0] = MSI supports 1 message, 64 bit 
                        cap 09[e0] = vendor (length 6) Intel cap 2 version 0
                    

                    Looks like this card supports MSI but not MSI-X.

                    From my LAN side adapter:

                    em1@pci0:5:0:0:	class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
                        class      = network
                        subclass   = ethernet
                        bar   [10] = type Memory, range 32, base 0xfc120000, size 131072, enabled
                        bar   [14] = type Memory, range 32, base 0xfc180000, size 524288, enabled
                        bar   [18] = type I/O Port, range 32, base 0x2000, size 32, enabled
                        bar   [1c] = type Memory, range 32, base 0xfc100000, size 16384, enabled
                        cap 01[c8] = powerspec 2  supports D0 D3  current D0
                        cap 05[d0] = MSI supports 1 message, 64 bit 
                        cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
                                     speed 2.5(2.5) ASPM disabled(L0s/L1)
                        cap 11[a0] = MSI-X supports 5 messages
                                     Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
                        ecap 0001[100] = AER 1 0 fatal 1 non-fatal 3 corrected
                        ecap 0003[140] = Serial 1 6805caffff2cea4b
                    

                    This shows it supports both MSI and MSI-X.
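With more than a couple of NICs, reading the capability lists by hand gets tedious. As a rough sketch, the same comparison can be pulled out of `pciconf -lvbc` style output with awk. The function name `msi_caps` is invented for this example, and it assumes the capability lines look like the ones pasted above.

```shell
# msi_caps: read `pciconf -lvbc` style output on stdin and print, per device,
# whether an MSI and/or MSI-X capability line was seen.
# Function name is hypothetical; line formats are assumed from the paste above.
msi_caps() {
    awk '
        # A device header starts in column 1, e.g. "em0@pci0:0:25:0: ..."
        /^[a-z_]+[0-9]+@pci/ {
            if (dev != "") print dev, "MSI=" msi, "MSI-X=" msix
            split($1, parts, "@"); dev = parts[1]
            msi = "no"; msix = "no"
            next
        }
        /= MSI-X supports/ { msix = "yes"; next }
        /= MSI supports/   { msi = "yes" }
        END { if (dev != "") print dev, "MSI=" msi, "MSI-X=" msix }'
}
```

Run as `pciconf -lvbc | msi_caps`. For the two adapters above it should report MSI only for em0 and both MSI and MSI-X for em1.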

                    If the problem is MSI-X, it would explain why I only get my issues (watchdog, hotplug, flap) on the WAN side.

                    Perhaps I should enable MSI and disable MSI-X? Maybe I will try switching the cables and then reassigning the interfaces. That would put the card that supports MSI-X on the WAN side, and I could see whether the errors jump over to the LAN side.

                    I currently have both MSI and MSI-X disabled in loader.conf.local. I had them disabled in loader.conf, but moved them and an MBUF reassignment over to the local file and restarted. I did check by running sysctl hw.pci, and they are both disabled.
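For anyone wanting to script the check described above, here is a minimal sketch that succeeds only when both tunables read back as 0. It assumes sysctl prints them as `name: value` lines, and the function name `msi_disabled` is made up.

```shell
# msi_disabled: read "name: value" lines on stdin and exit 0 only if both
# hw.pci.enable_msi and hw.pci.enable_msix are present and set to 0.
# Function name is hypothetical.
msi_disabled() {
    awk -F': ' '
        $1 == "hw.pci.enable_msi"  { msi  = $2 }
        $1 == "hw.pci.enable_msix" { msix = $2 }
        END { exit !(msi == "0" && msix == "0") }'
}
```

For example: `sysctl hw.pci.enable_msi hw.pci.enable_msix | msi_disabled && echo "MSI and MSI-X are off"`.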

                    • stephenw10 (Netgate Administrator)

                      Try swapping the NIC assignments so the MSI-X capable card is WAN.

                      Steve

                      • shanis42

                        I have ordered a dual PRO/1000 PCIe card to eliminate the onboard controller, which must be older.

                        In the meantime, while I wait for it to arrive, I will switch the interfaces when I get to work in the morning. It is still throwing watchdog errors right now, with both MSI and MSI-X disabled. But it is only throwing them on the WAN side. We will see what happens when I put the newer adapter on the WAN side. I suspect something is wrong with either the onboard NIC or the driver for the NIC.

                        • shanis42

                          Swapping the interfaces didn't work; it actually got a little worse. I now know it is specifically that onboard adapter. When you get the timeouts and the interface is assigned to LAN, you get locked out of the web admin until you either restart the web admin via the firewall's command menu or restart the whole box.

                          I flipped them back and will have to wait for my dual NIC PCIe card to come in.

                          I think the onboard adapter is a little older than the PCIe card I put in. They both want to use the em driver.

                          Any other ideas for future searchers? I read somewhere perhaps connections to Gigabit devices can overwhelm certain adapters, I don't know if there is any truth to that though.

                          • stephenw10 (Netgate Administrator)

                            Not offhand. I would search the FreeBSD mailing lists and forum using the details of the specific adapter given by pciconf.

                            Steve
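The most searchable detail in the pciconf output is the `chip=` value, which packs the device ID in the upper four hex digits and the vendor ID in the lower four (compare `chip=0x10d38086` for em1 with the `vendor=0x8086 device=0x10d3` shown by sysctl earlier in the thread). A small sketch for splitting it, using a made-up function name `chip_ids`:

```shell
# chip_ids: split a pciconf "chip=0xDDDDVVVV" value into vendor and device IDs.
# Function name is hypothetical; pass the value without the "chip=" prefix.
chip_ids() {
    chip=${1#0x}                                  # drop the leading 0x
    device=$(printf '%s' "$chip" | cut -c1-4)     # upper 16 bits: device ID
    vendor=$(printf '%s' "$chip" | cut -c5-8)     # lower 16 bits: vendor ID
    echo "vendor=0x$vendor device=0x$device"
}
```

For the onboard adapter, `chip_ids 0x10bd8086` gives the vendor and device pair to search for alongside "em" and "watchdog timeout".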

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.