    Netgate Discussion Forum

    WAN interface losing connectivity 5-10x daily - uber-thread!

    General pfSense Questions
    • R
      rblock last edited by

      Hey there, I'm an experienced pfSense user (6 years), and I'm pretty perplexed by an issue I've been having. Over the last week my WAN connection (or its routes) has been dropping many times a day, and I have strong reason to believe the internet service itself isn't at fault.

      I've combed through dozens of threads from people losing WAN for unknown or mysterious reasons. Since this seems to be a recurring theme, I'm hoping I can pull some of those threads together here and find some lasting solutions. (Thanks in advance for any help!)

      Alright, here's a quick rundown of my setup:

      • Supermicro A1SRi-2558F with 4x Intel igb gigabit connections (no external / USB ethernet being used), 4GB ECC RAM, Intel SSD

      • Production release (2.1.5 amd64)

      • Arris Surfboard SB6141

      • Astound cable service (dynamic IP), all new coax to the modem

      • 1x gigabit WAN connection (all default + autonegotiate)

      • 1x gigabit LAN connection (multiple internal VLANs with known-working config, NAT / fw rules)

      I previously ran this exact same stack at this location on a Supermicro X7SPA-H, where everything was stable and running perfectly for months. Eventually I determined we'd need to upgrade that motherboard, and did so with a reinstall about 8 days ago. (Reinstall included a backup restoration of NAT / fw rules and RRD graph data, nothing else – especially not network configurations.)

      I've read dozens of threads from people having similar WAN issues (linked below), and tried a number of fixes:

      • Ensuring WAN is not blocking private or bogon networks

      • Ensuring IPv6 DHCP is disabled

      • Ensuring no MAC address is entered

      • Increasing MBUFs (currently set to 262144)

      • Ensuring low hw.igb.num_queues (currently 2)

      • Not having any VPN services active (which I never did)

      • Allowing apinger to monitor the gateway

      • NOT allowing apinger to monitor the gateway

      • Rebooting the system, or modem, or all hardware

      Symptoms I'm noticing when connection flaps:

      • No correlation between usage and outages; WAN loss of connectivity occurs during off hours as well as during moments of normal usage.

      • Not much going on in logs (especially with apinger disabled). Traffic simply stops routing outbound. (No firewall / NAT rule changes or scheduling has been made lately.)

      • State table drops down pretty low, but not completely to zero; same with MBUF usage.

      • System is still perfectly responsive, and WAN connection still appears to be parsing firewall rules.
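
Since the logs stay quiet during most of these outages, one way to check the "no correlation with usage" observation is to log gateway probe results separately and collapse them into outage windows. The following is only a minimal sketch under assumptions: the `(timestamp, reachable)` sample format, the function name `outage_windows`, and the demo timestamps are all hypothetical, and the probing itself (for example a periodic ping from a LAN host) is left out.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Hypothetical sample format: (probe time, True if the gateway answered).
Sample = Tuple[datetime, bool]


def outage_windows(samples: Iterable[Sample]) -> List[Tuple[datetime, datetime]]:
    """Collapse consecutive failed probes into (start, end) outage windows.

    `end` is the time of the first successful probe after the failures, or
    the time of the last failed probe if the samples end mid-outage.
    """
    windows = []
    start = None
    last_failed = None
    for when, ok in sorted(samples):
        if not ok:
            if start is None:
                start = when          # first failure opens a window
            last_failed = when
        elif start is not None:
            windows.append((start, when))  # first success closes it
            start = None
    if start is not None:
        windows.append((start, last_failed))  # still down at end of data
    return windows


if __name__ == "__main__":
    # Made-up probe results, one per minute, purely for illustration.
    t0 = datetime(2000, 1, 1, 3, 0, 0)
    results = [True, False, False, True, False]
    samples = [(t0 + timedelta(minutes=i), ok) for i, ok in enumerate(results)]
    for begin, end in outage_windows(samples):
        print(f"down from {begin:%H:%M} to {end:%H:%M}")
```

Grouping the resulting window start times by hour of day would then show whether outages cluster around busy periods or are spread evenly.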

      Right now, the only thing that will reliably restore the connection is to completely reboot the box. Waiting a while (usually 10-20 minutes) sometimes also sees the connection restored on its own, but that isn't reliable. Enabling / disabling the WAN doesn't have any effect.

      Cable connection + modem is fine, and internet access resumes without any intervention on the modem once the pfSense box comes back up. I've totally ruled out cable issues – again, this same setup worked with precisely zero outages for as long as it operated at this location prior to this installation.

      Perhaps it's unrelated, but early on with this box I had the WAN interface on igb1 and LAN on igb0 (currently it's reversed) and was having DHCP issues then, too; they went away for a while after I swapped the two. That has me wondering whether this could be a driver issue, although at this point I've chased down so many leads that I'm pretty much stumped.

      Again, thanks in advance for any help!

      Here's a sample log from earlier today (newest entries first), although this has more going on than what usually pops up during an outage:

      Dec 31 12:22:47 php: rc.start_packages: Restarting/Starting all packages.
      Dec 31 12:22:45 check_reload_status: Reloading filter
      Dec 31 12:22:45 check_reload_status: Starting packages
      Dec 31 12:22:45 php: rc.newwanip: pfSense package system has detected an ip change x.x.x.x (my public IP) -> x.x.x.x (the same public IP lol) … Restarting packages.
      Dec 31 12:22:43 php: rc.newwanip: Creating rrd update script
      Dec 31 12:22:43 php: rc.newwanip: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:22:43 php: rc.newwanip: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:22:43 php: rc.newwanip: Resyncing OpenVPN instances for interface WAN.
      Dec 31 12:22:42 php: /interfaces.php: Creating rrd update script
      Dec 31 12:22:42 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:22:42 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:22:42 check_reload_status: Reloading filter
      Dec 31 12:22:38 check_reload_status: updating dyndns wan
      Dec 31 12:22:36 php: rc.newwanip: ROUTING: setting default route to x.x.x.1 (my ISP gateway)
      Dec 31 12:22:36 php: rc.newwanip: rc.newwanip: on (IP address: x.x.x.x) (interface: WAN[wan]) (real interface: igb0).
      Dec 31 12:22:36 php: rc.newwanip: rc.newwanip: Informational is starting igb0.
      Dec 31 12:22:33 php: /interfaces.php: ROUTING: setting default route to x.x.x.1 (my ISP gateway)
      Dec 31 12:22:33 check_reload_status: rc.newwanip starting igb0
      Dec 31 12:22:33 php: /interfaces.php: Clearing states to old gateway x.x.x.1 (my ISP gateway).
      Dec 31 12:22:30 check_reload_status: Syncing firewall
      Dec 31 12:22:09 php: /interfaces.php: Creating rrd update script
      Dec 31 12:22:09 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:22:09 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:22:09 check_reload_status: Reloading filter
      Dec 31 12:22:06 check_reload_status: updating dyndns wan
      Dec 31 12:22:05 php: rc.interfaces_wan_configure: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was ''
      Dec 31 12:22:05 check_reload_status: updating dyndns wan
      Dec 31 12:22:03 check_reload_status: Configuring interface wan
      Dec 31 12:22:03 php: rc.newwanip: rc.newwanip: Failed to update wan IP, restarting…
      Dec 31 12:22:03 php: rc.newwanip: rc.newwanip: on (IP address: ) (interface: WAN[wan]) (real interface: igb0).
      Dec 31 12:22:03 php: rc.newwanip: rc.newwanip: Informational is starting igb0.
      Dec 31 12:22:02 php: rc.linkup: The command '/sbin/route change -inet default 'x.x.x.1 (my ISP gateway)'' returned exit code '1', the output was 'route: writing to routing socket: No such process route: writing to routing socket: Network is unreachable change net default: gateway x.x.x.1 (my ISP gateway): Network is unreachable'
      Dec 31 12:22:02 php: rc.linkup: ROUTING: setting default route to x.x.x.1 (my ISP gateway)
      Dec 31 12:22:02 php: rc.linkup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was ''
      Dec 31 12:22:02 php: rc.linkup: HOTPLUG: Configuring interface wan
      Dec 31 12:22:02 php: rc.linkup: DEVD Ethernet attached event for wan
      Dec 31 12:22:00 php: /interfaces.php: ROUTING: setting default route to x.x.x.1 (my ISP gateway)
      Dec 31 12:22:00 check_reload_status: rc.newwanip starting igb0
      Dec 31 12:22:00 kernel: igb0: link state changed to UP
      Dec 31 12:22:00 check_reload_status: Linkup starting igb0
      Dec 31 12:21:59 php: rc.linkup: DEVD Ethernet detached event for wan
      Dec 31 12:21:56 kernel: igb0: link state changed to DOWN
      Dec 31 12:21:56 check_reload_status: Linkup starting igb0
      Dec 31 12:21:37 check_reload_status: Syncing firewall
      Dec 31 12:21:29 php: /interfaces.php: Creating rrd update script
      Dec 31 12:21:29 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:21:29 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:21:29 check_reload_status: Reloading filter
      Dec 31 12:21:24 php: /interfaces.php: Clearing states to old gateway x.x.x.1 (my ISP gateway).
      Dec 31 12:21:22 check_reload_status: Syncing firewall
      Dec 31 12:18:34 php: /interfaces.php: Creating rrd update script
      Dec 31 12:18:34 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:18:34 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:18:34 check_reload_status: Reloading filter
      Dec 31 12:18:32 check_reload_status: updating dyndns wan
      Dec 31 12:18:30 php: rc.interfaces_wan_configure: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was ''
      Dec 31 12:18:30 check_reload_status: updating dyndns wan
      Dec 31 12:18:28 check_reload_status: Configuring interface wan
      Dec 31 12:18:28 php: rc.newwanip: rc.newwanip: Failed to update wan IP, restarting…
      Dec 31 12:18:28 php: rc.newwanip: rc.newwanip: on (IP address: ) (interface: WAN[wan]) (real interface: igb0).
      Dec 31 12:18:28 php: rc.newwanip: rc.newwanip: Informational is starting igb0.
      Dec 31 12:18:28 php: rc.linkup: The command '/sbin/route change -inet default 'x.x.x.1 (my ISP gateway)'' returned exit code '1', the output was 'route: writing to routing socket: No such process route: writing to routing socket: Network is unreachable change net default: gateway x.x.x.1 (my ISP gateway): Network is unreachable'
      Dec 31 12:18:28 php: rc.linkup: ROUTING: setting default route to x.x.x.1 (my ISP gateway)
      Dec 31 12:18:28 php: rc.linkup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was ''
      Dec 31 12:18:27 php: rc.linkup: HOTPLUG: Configuring interface wan
      Dec 31 12:18:27 php: rc.linkup: DEVD Ethernet attached event for wan
      Dec 31 12:18:26 php: /interfaces.php: ROUTING: setting default route to x.x.x.1 (my ISP gateway)
      Dec 31 12:18:26 check_reload_status: rc.newwanip starting igb0
      Dec 31 12:18:25 kernel: igb0: link state changed to UP
      Dec 31 12:18:25 check_reload_status: Linkup starting igb0
      Dec 31 12:18:24 php: rc.linkup: DEVD Ethernet detached event for wan
      Dec 31 12:18:22 kernel: igb0: link state changed to DOWN
      Dec 31 12:18:22 check_reload_status: Linkup starting igb0
      Dec 31 12:18:19 check_reload_status: Syncing firewall
      Dec 31 12:17:58 php: /interfaces.php: Creating rrd update script
      Dec 31 12:17:58 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:17:58 php: /interfaces.php: RRD create failed exited with 1, the error is: ERROR: you must define at least one Data Source
      Dec 31 12:17:58 check_reload_status: Reloading filter
      Dec 31 12:17:54 php: /interfaces.php: Clearing states to old gateway x.x.x.1 (my ISP gateway).
      Dec 31 12:17:51 check_reload_status: Syncing firewall
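
To pull just the link flaps out of a log like the one above, a small parser can pair each `link state changed to DOWN` kernel line with the next `UP` on the same interface. This is a rough sketch, not a pfSense tool: the function name `link_flaps` is made up, the regular expression only covers the kernel line format shown in this post, and because these log lines carry no year it has to be passed in (the default is a placeholder).

```python
import re
from datetime import datetime
from typing import Iterable, List, Tuple

# Matches lines such as:
#   Dec 31 12:21:56 kernel: igb0: link state changed to DOWN
LINK_RE = re.compile(
    r"^\s*(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) kernel: "
    r"(?P<iface>\w+): link state changed to (?P<state>UP|DOWN)"
)


def link_flaps(lines: Iterable[str], year: int = 2000) -> List[Tuple[str, datetime, float]]:
    """Return (interface, time link went down, seconds until it came back up)."""
    events = []
    for line in lines:
        m = LINK_RE.match(line)
        if m:
            # The log has no year, so the caller supplies one (assumption).
            when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((when, m["iface"], m["state"]))
    events.sort()  # the forum log is newest-first; process oldest-first

    flaps = []
    down_at = {}
    for when, iface, state in events:
        if state == "DOWN":
            down_at[iface] = when
        elif iface in down_at:
            down = down_at.pop(iface)
            flaps.append((iface, down, (when - down).total_seconds()))
    return flaps


if __name__ == "__main__":
    sample = [
        "Dec 31 12:22:00 kernel: igb0: link state changed to UP",
        "Dec 31 12:21:56 kernel: igb0: link state changed to DOWN",
        "Dec 31 12:18:25 kernel: igb0: link state changed to UP",
        "Dec 31 12:18:22 kernel: igb0: link state changed to DOWN",
    ]
    for iface, down, secs in link_flaps(sample):
        print(f"{iface} down at {down:%H:%M:%S} for {secs:.0f}s")
    # igb0 down at 12:18:22 for 3s
    # igb0 down at 12:21:56 for 4s
```

Run against the four kernel lines in this log, it reports two short flaps on igb0 (3 and 4 seconds); everything else in the log is pfSense reacting to those link events.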


      Appendix of other WAN-related threads I've read in trying to solve this issue:
      WAN interface going down
      https://forum.pfsense.org/index.php?topic=84037.0

      Ethernet connection goes down every now & then
      https://forum.pfsense.org/index.php?topic=83702.0

      WAN unable to obtain IP address via DHCP
      https://forum.pfsense.org/index.php?topic=81987.0

      No internet connectivity on WAN with valid public IP
      https://forum.pfsense.org/index.php?topic=81943.0

      WAN port goes down and up but no internet connection - em card, 2.1.5 x64
      https://forum.pfsense.org/index.php?topic=81407.0

      check_reload_status at 100% + apinger messages
      https://forum.pfsense.org/index.php?topic=79812.0

      All packages restart after a DHCP renew
      https://forum.pfsense.org/index.php?topic=76597.0

      apinger exits when no useable targets but is not restarted
      https://forum.pfsense.org/index.php?topic=71908.0

      WAN-link "randomly" disconnects. pfSense 2.1
      https://forum.pfsense.org/index.php?topic=71624.0

      WAN down
      https://forum.pfsense.org/index.php?topic=70682.0

      WAN dropped connection on 2.1
      https://forum.pfsense.org/index.php?topic=70677.0

      pfSense 2.1 WAN interface DHCP problem
      https://forum.pfsense.org/index.php?topic=69904.0

      kernel: arpresolve: can't allocate llinfo for 192.168.100.1 (cable modem)
      https://forum.pfsense.org/index.php?topic=63474.0

      kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.xxx
      https://forum.pfsense.org/index.php/topic,62964.0

      WAN down
      https://forum.pfsense.org/index.php?topic=61785.0

      wan cable disconnect/reconnect causes interface drop no recover
      https://forum.pfsense.org/index.php?topic=61182

      pfSense loses default route after link flap
      https://forum.pfsense.org/index.php?topic=60886.0

      Since a few day's losing the wan (dhcp) ip address
      https://forum.pfsense.org/index.php?topic=58819.0

      Intel i350: recognized, but no traffic and no DHCP
      https://forum.pfsense.org/index.php?topic=55032.0

      Cannot obtain DHCP from WAN automatically, must be done manually.
      https://forum.pfsense.org/index.php?topic=53341.0

      Gateway status Offline
      https://forum.pfsense.org/index.php?topic=53187.0

      dhclient loosing WAN connection
      https://forum.pfsense.org/index.php?topic=48013.0

      WAN DHCP Does Not Work
      https://forum.pfsense.org/index.php?topic=31523.0

      Polling broken in current snapshot for igb interfaces
      https://forum.pfsense.org/index.php?topic=27126.0

      wan interface losing ip address
      http://lists.pfsense.org/pipermail/list/2012-July/002572.html

      • others, less relevant

      Bugs I've investigated:
      #3669
      #2704
      #2647
      #2919
      #1943

      • R
        reilos last edited by

        I know it's been a while, but were you able to figure this out?

        • D
          David_W last edited by

          All that the log extracts in the original post show is the link cycling on the igb0 interface, with the inevitable consequence that pfSense stops, starts and reloads various services.

          The original post is now 13 months old and refers to an obsolete and end of life version of pfSense that is based on an obsolete and end of life version of FreeBSD. If you have an issue with link cycling, it would be best if you describe your issue afresh, enclosing relevant log extracts.
