Netgate Discussion Forum

    1.2 on poweredge 860 - no more traffic thru wans after <1h

    Problems Installing or Upgrading pfSense Software
    5 Posts 3 Posters 3.1k Views
      bruno

      hello, and sorry for my english.

      since upgrading to 1.2 I have had trouble with my dual-WAN setup on a Dell PowerEdge 860 (Celeron D 3 GHz, 1 GB RAM) with 2x embedded Broadcom NICs and 2x PRO/1000. after a random interval, generally within an hour of uptime, both WAN interfaces die and stop routing traffic, even though the NIC links are still up.

      in the dashboard, the load balancer and both failover pools turn red, and from the CLI I can't reach any outside address.
      the only 'fix' is to reboot.

      I had 1.2 RC2 working without any issue for weeks. this problem started right after upgrading to the 1.2 release. I also tried reinstalling, but no joy.

      here's the log, starting at the time the trouble begins. the same config works flawlessly inside a VMware ESX VM. maybe some new or faulty drivers in the 1.2 release?

      thanks

      Mar 13 22:11:22 kernel: em1: watchdog timeout -- resetting
      Mar 13 22:11:22 kernel: em1: link state changed to DOWN
      Mar 13 22:11:24 kernel: em1: link state changed to UP
      Mar 13 22:11:24 check_reload_status: rc.linkup starting
      Mar 13 22:11:24 php: : Processing em1 - start
      Mar 13 22:11:24 php: : Hotplug event detected for em1 but ignoring since interface is not set for DHCP
      Mar 13 22:11:24 php: : Processing start -
      Mar 13 22:11:24 php: : Not a valid interface action ""
      Mar 13 22:11:24 php: : Processing -
      Mar 13 22:11:24 php: : Not a valid interface action ""
      Mar 13 22:11:26 slbd[426]: ICMP poll failed for WAN2_GW, marking service DOWN
      Mar 13 22:11:26 slbd[426]: ICMP poll failed for WAN_GW, marking service DOWN
      Mar 13 22:11:26 slbd[426]: Service tlc2bt changed status, reloading filter policy
      Mar 13 22:11:26 slbd[426]: ICMP poll failed for WAN_GW, marking service DOWN
      Mar 13 22:11:26 slbd[426]: ICMP poll failed for WAN_GW, marking service DOWN
      Mar 13 22:11:28 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:11:29 check_reload_status: reloading filter
      Mar 13 22:11:33 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:11:33 slbd[426]: ICMP poll failed for WAN2_GW, marking service DOWN
      Mar 13 22:11:33 slbd[426]: ICMP poll failed for WAN2_GW, marking service DOWN
      Mar 13 22:11:33 slbd[426]: Service bt2tlc changed status, reloading filter policy
      Mar 13 22:11:33 slbd[426]: Service LB01 changed status, reloading filter policy
      Mar 13 22:11:38 check_reload_status: reloading filter
      Mar 13 22:11:38 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:11:38 last message repeated 2 times
      Mar 13 22:11:42 kernel: em1: watchdog timeout -- resetting
      Mar 13 22:11:42 kernel: em1: link state changed to DOWN
      Mar 13 22:11:43 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:11:43 last message repeated 2 times
      Mar 13 22:11:44 kernel: em1: link state changed to UP
      Mar 13 22:11:46 check_reload_status: rc.linkup starting
      Mar 13 22:11:46 php: : Processing em1 - start
      Mar 13 22:11:46 php: : Hotplug event detected for em1 but ignoring since interface is not set for DHCP
      Mar 13 22:11:46 php: : Processing start -
      Mar 13 22:11:46 php: : Not a valid interface action ""
      Mar 13 22:11:46 php: : Processing -
      Mar 13 22:11:46 php: : Not a valid interface action ""
      Mar 13 22:11:48 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:12:23 last message repeated 21 times
      Mar 13 22:14:28 last message repeated 75 times
      Mar 13 22:14:28 last message repeated 2 times
      Mar 13 22:14:31 kernel: em1: watchdog timeout -- resetting
      Mar 13 22:14:31 kernel: em1: link state changed to DOWN
      Mar 13 22:14:33 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:14:33 last message repeated 2 times
      Mar 13 22:14:33 kernel: em1: link state changed to UP
      Mar 13 22:14:36 check_reload_status: rc.linkup starting
      Mar 13 22:14:36 php: : Processing em1 - start
      Mar 13 22:14:36 php: : Hotplug event detected for em1 but ignoring since interface is not set for DHCP
      Mar 13 22:14:36 php: : Processing start -
      Mar 13 22:14:36 php: : Not a valid interface action ""
      Mar 13 22:14:36 php: : Processing -
      Mar 13 22:14:36 php: : Not a valid interface action ""
      Mar 13 22:14:38 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:14:38 last message repeated 2 times
      Mar 13 22:14:39 kernel: em1: watchdog timeout -- resetting
      Mar 13 22:14:39 kernel: em1: link state changed to DOWN
      Mar 13 22:14:41 kernel: em1: link state changed to UP
      Mar 13 22:14:41 check_reload_status: rc.linkup starting
      Mar 13 22:14:42 php: : Processing em1 - start
      Mar 13 22:14:42 php: : Hotplug event detected for em1 but ignoring since interface is not set for DHCP
      Mar 13 22:14:42 php: : Processing start -
      Mar 13 22:14:42 php: : Not a valid interface action ""
      Mar 13 22:14:42 php: : Processing -
      Mar 13 22:14:42 php: : Not a valid interface action ""
      Mar 13 22:14:43 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:15:13 last message repeated 20 times
      Mar 13 22:15:16 kernel: bge1: watchdog timeout -- resetting
      Mar 13 22:15:16 kernel: bge1: link state changed to DOWN
      Mar 13 22:15:18 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:15:18 last message repeated 2 times
      Mar 13 22:15:18 kernel: bge1: link state changed to UP
      Mar 13 22:15:22 check_reload_status: rc.linkup starting
      Mar 13 22:15:22 php: : Processing bge1 - start
      Mar 13 22:15:22 php: : Hotplug event detected for bge1 but ignoring since interface is not set for DHCP
      Mar 13 22:15:22 php: : Processing start -
      Mar 13 22:15:22 php: : Not a valid interface action ""
      Mar 13 22:15:22 php: : Processing -
      Mar 13 22:15:22 php: : Not a valid interface action ""
      Mar 13 22:15:23 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:15:58 last message repeated 21 times
      Mar 13 22:16:03 last message repeated 5 times
      Mar 13 22:16:06 php: /index.php: XMLRPC communication error: No route to host
      Mar 13 22:16:08 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:16:43 last message repeated 21 times
      Mar 13 22:18:38 last message repeated 71 times
      Mar 13 22:18:43 last message repeated 3 times
      Mar 13 22:18:43 kernel: em1: watchdog timeout -- resetting
      Mar 13 22:18:43 kernel: em1: link state changed to DOWN
      Mar 13 22:18:45 kernel: em1: link state changed to UP
      Mar 13 22:18:47 check_reload_status: rc.linkup starting
      Mar 13 22:18:47 php: : Processing em1 - start
      Mar 13 22:18:47 php: : Hotplug event detected for em1 but ignoring since interface is not set for DHCP
      Mar 13 22:18:47 php: : Processing start -
      Mar 13 22:18:47 php: : Not a valid interface action ""
      Mar 13 22:18:47 php: : Processing -
      Mar 13 22:18:47 php: : Not a valid interface action ""
      Mar 13 22:18:48 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
      Mar 13 22:19:23 last message repeated 21 times
      Mar 13 22:19:33 last message repeated 8 times
      Mar 13 22:19:36 syslogd: exiting on signal 15
      Mar 13 22:19:36 syslogd: kernel boot file is /boot/kernel/kernel
      Mar 13 22:19:38 slbd[426]: Switching to sitedown for VIP 127.0.0.1:666
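
In a log this noisy it can help to pull out just the NIC watchdog events and see which interfaces are affected and when each one first failed. The following is a minimal sketch, not part of pfSense: the function name `summarize_watchdog_timeouts` and the regular expression are assumptions made for illustration, based only on the line format shown in the log above. It accepts either `--` or a dash character before "resetting", since pasted logs vary.

```python
import re
from collections import Counter

# Matches kernel lines of the form seen in the log above, e.g.
#   Mar 13 22:11:22 kernel: em1: watchdog timeout -- resetting
# Only the text up to "watchdog timeout" is matched, so the separator
# that follows (two hyphens or a pasted dash character) does not matter.
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel:\s+"
    r"(?P<nic>[a-z]+\d+):\s+watchdog timeout\b"
)


def summarize_watchdog_timeouts(lines):
    """Return (timeouts per NIC, first timestamp per NIC) for the given log lines."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = WATCHDOG_RE.match(line.strip())
        if match is None:
            continue  # not a watchdog line (link state change, slbd, php, ...)
        nic = match.group("nic")
        counts[nic] += 1
        # Keep only the earliest timestamp for each interface.
        first_seen.setdefault(nic, match.group("stamp"))
    return counts, first_seen


if __name__ == "__main__":
    # A few lines copied from the log above, used as sample input.
    sample = [
        "Mar 13 22:11:22 kernel: em1: watchdog timeout -- resetting",
        "Mar 13 22:11:22 kernel: em1: link state changed to DOWN",
        "Mar 13 22:15:16 kernel: bge1: watchdog timeout -- resetting",
    ]
    counts, first_seen = summarize_watchdog_timeouts(sample)
    for nic, n in counts.most_common():
        print(f"{nic}: {n} watchdog timeout(s), first at {first_seen[nic]}")
```

Fed the full system log (for example via `open(path)` on a saved copy), this shows at a glance whether only one driver is timing out or, as here, both `em` and `bge` interfaces are, which points away from a single faulty card.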

        hoba

        try disabling ACPI and see whether your NIC watchdog timeouts go away: http://devwiki.pfsense.org/BootOptions

          bruno

          hi, no luck: with ACPI disabled I get a kernel panic.

            bruno

            anyone?

            meanwhile, booting with the SMP kernel seems to prevent the NIC watchdog timeouts. is there anything wrong with running the SMP kernel on a single-processor machine?

            thanks.

              GruensFroeschli

              Nothing wrong with SMP on single-core machines :)

              We do what we must, because we can.

              Asking questions the smart way: http://www.catb.org/esr/faqs/smart-questions.html

              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.