Netgate Discussion Forum

    WAN flapping since upgrading to 2.4.5

    General pfSense Questions
    15 Posts 3 Posters 626 Views
    • larold42

      So I decided to upgrade my old hardware, which was running 2.4.4_3, to something newer. On the old hardware [Solana Tech Mini ITX pfSense firewall router with 4x Intel NICs, a 2 GHz J1900 CPU, 4 GB RAM and a 32 GB SSD] everything was working great; I just needed to move to something that supported AES-NI, as I have been setting up more tunnels. I first tried one of the Protectli boxes from Amazon, installed 2.4.5, and immediately noticed my WAN going up and down every 15 minutes. I decided to return the hardware and try something else, and ended up finding a used Jetway JBC385. I installed 2.4.5 and again my WAN goes up and down. I put my old hardware back in place and there was not a single issue. I believe I've tried everything I can to troubleshoot this at this point. Looking for help.

      • larold42

        Oct 26 23:28:32	check_reload_status		Reloading filter
        Oct 26 23:28:32	check_reload_status		Linkup starting igb0
        Oct 26 23:28:32	kernel		igb0: link state changed to UP
        Oct 26 23:28:32	kernel		arpresolve: can't allocate llinfo for 174.109.176.1 on igb0
        Oct 26 23:28:32	kernel		arpresolve: can't allocate llinfo for 174.109.176.1 on igb0
        Oct 26 23:28:33	kernel		arpresolve: can't allocate llinfo for 174.109.176.1 on igb0
        Oct 26 23:28:33	php-fpm	713	/rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
        Oct 26 23:28:33	php-fpm	713	/rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
        Oct 26 23:28:33	php-fpm	713	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
        Oct 26 23:28:33	kernel		arpresolve: can't allocate llinfo for 174.109.176.1 on igb0
        Oct 26 23:28:33	kernel		arpresolve: can't allocate llinfo for 174.109.176.1 on igb0
        Oct 26 23:28:33	php-fpm	713	/rc.linkup: DEVD Ethernet attached event for wan
        Oct 26 23:28:33	php-fpm	713	/rc.linkup: HOTPLUG: Configuring interface wan
        Oct 26 23:28:33	check_reload_status		rc.newwanip starting igb0
        Oct 26 23:28:33	php-fpm	713	/rc.linkup: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
        Oct 26 23:28:33	php-fpm	713	/rc.linkup: Gateway, none 'available' for inet6, use the first one configured. ''
        Oct 26 23:28:33	check_reload_status		Restarting ipsec tunnels
        Oct 26 23:28:34	php-fpm	714	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 174.109.182.X -> 174.109.182.X - Restarting packages.
        Oct 26 23:28:34	check_reload_status		Starting packages
        Oct 26 23:28:34	php-fpm	98247	/rc.newwanip: rc.newwanip: Info: starting on igb0.
        Oct 26 23:28:34	php-fpm	98247	/rc.newwanip: rc.newwanip: on (IP address: 174.109.182.X) (interface: WAN[wan]) (real interface: igb0).
        Oct 26 23:28:35	php-fpm	98247	/rc.newwanip: Gateway, none 'available' for inet6, use the first one configured. ''
        Oct 26 23:28:35	php-fpm	714	/rc.start_packages: Restarting/Starting all packages.
        Oct 26 23:28:36	php-fpm	98247	/rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1603754916] unbound[19136:0] error: bind: address already in use [1603754916] unbound[19136:0] fatal error: could not open ports'
        Oct 26 23:28:37	kernel		igb0: link state changed to DOWN
        Oct 26 23:28:37	check_reload_status		Linkup starting igb0
        Oct 26 23:28:37	check_reload_status		updating dyndns wan
        Oct 26 23:28:37	check_reload_status		Reloading filter
        Oct 26 23:28:38	php-fpm	713	/rc.linkup: DEVD Ethernet detached event for wan
        Oct 26 23:28:40	kernel		arpresolve: can't allocate llinfo for 174.109.176.1 on igb0
        Oct 26 23:28:40	rc.gateway_alarm	64625	>>> Gateway alarm: WAN_DHCP (Addr:174.109.176.1 Alarm:1 RTT:16.676ms RTTsd:3.783ms Loss:37%)
        Oct 26 23:28:40	check_reload_status		updating dyndns WAN_DHCP
        Oct 26 23:28:40	check_reload_status		Restarting ipsec tunnels
        Oct 26 23:28:40	check_reload_status		Restarting OpenVPN tunnels/interfaces
        Oct 26 23:28:40	check_reload_status		Reloading filter
        
        Oct 26 23:28:57	dpinger		WAN_DHCP 174.109.176.1: sendto error: 65
        Oct 26 23:28:58	dpinger		WAN_DHCP 174.109.176.1: sendto error: 65
        Oct 26 23:28:58	dpinger		WAN_DHCP 174.109.176.1: sendto error: 65
        Oct 26 23:29:00	dpinger		send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 174.109.176.1 bind_addr 174.109.182.150 identifier "WAN_DHCP "
        

        I restored from a previous backup, but did not include my packages. When I started having these troubles, I also removed my VPN tunnels and went to a barebones install.

        • bmeeks

          Your NIC is detecting (or thinks it is detecting) a link state change. That is then triggering automatic processes (scripts, actually) to run within pfSense.

          So the root cause is the link state change. As to what is causing that, the first place to look usually is hardware. But in your case you say you have tried two different pieces of new hardware with the same result. When you swap in the old hardware for testing, are you using the exact same Cat5 cable and switch port that you use for the new hardware? Don't forget that cables and switch ports can get flaky.

          pfSense-2.4.5 is based on FreeBSD-11.3/STABLE while pfSense-2.4.4 was based on FreeBSD-11.2/RELEASE. So it's not outside the realm of possibility that some change to the NIC driver or something else in FreeBSD-11.3/STABLE is causing you grief with your new hardware. That said, the igb driver is very popular, so any underlying issues with it in FreeBSD-11.3/STABLE would show up quickly and result in lots of complaints, and that has not been the case. I have NICs using the igb driver in my Netgate SG-5100 appliance on pfSense-2.4.5_p1 with no issues.

          • larold42

            So the old hardware has the Intel 82583V.
            The new hardware has 1x Intel I219-LM, 1x Intel I211-AT and 4x Intel I350-AM4. I may try to figure out which port is which model and then move the WAN to a different port. Also, someone just suggested placing a dumb switch between the modem and the firewall to see if the issue goes away, so I'll be trying that.

            • bmeeks @larold42

              @larold42 said in WAN flapping since upgrading to 2.4.5:

              So the old hardware has the Intel 82583V.
              The new hardware has 1x Intel I219-LM, 1x Intel I211-AT and 4x Intel I350-AM4. I may try to figure out which port is which model and then move the WAN to a different port. Also, someone just suggested placing a dumb switch between the modem and the firewall to see if the issue goes away, so I'll be trying that.

              Swapping ports around is a good idea. As I said, the root cause of your problem is that the NIC driver within pfSense "thinks" the link state is changing (going down and then coming back up) -- same as if someone unplugged the Cat5 cable and then plugged it back in.

              • larold42 @bmeeks

                @bmeeks The only thing is, the Protectli box has the same NIC as my old hardware (Intel 82583V), and that had the same problem.

                • bmeeks @larold42

                  @larold42 said in WAN flapping since upgrading to 2.4.5:

                  @bmeeks The only thing is, the Protectli box has the same NIC as my old hardware (Intel 82583V), and that had the same problem.

                  You might be seeing an impact of this bug which was reportedly fixed: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235147. Do you perhaps have the older suggested workaround for this bug still in your configuration? If so, try removing it. Here is the pfSense Redmine bug report associated with the FreeBSD bug report I linked earlier: https://redmine.pfsense.org/issues/9414.

                  • larold42 @bmeeks

                    @bmeeks Huh, I didn't even know about this bug. Well, on the Jetway I don't have that Ethernet controller, and it is recognized and loads. But I'm wondering if I should put in a bug report now. The problem is I really don't have the time to troubleshoot this anymore. This is the only part that sucks about not having support.

                    • larold42 @bmeeks

                      @bmeeks I'm almost tempted to try the 2.5 dev version, but I feel like that will only muddy the problem with likely more issues.

                      • bmeeks @larold42

                        @larold42 said in WAN flapping since upgrading to 2.4.5:

                        @bmeeks Huh, I didn't even know about this bug. Well, on the Jetway I don't have that Ethernet controller, and it is recognized and loads. But I'm wondering if I should put in a bug report now. The problem is I really don't have the time to troubleshoot this anymore. This is the only part that sucks about not having support.

                        If you submit a bug report, it most likely should go to the FreeBSD bunch and not pfSense. The pfSense team does not do anything with regard to drivers. That is all taken "as-is" from upstream FreeBSD.

                        What is different with pfSense-2.4.5 is the newer version of FreeBSD. Things like drivers get various fixes and adjustments with new OS versions. Some of those are good and fix things, but others can "break" things through unintentional regressions of one kind or another.

                        • bmeeks @larold42

                          @larold42 said in WAN flapping since upgrading to 2.4.5:

                          @bmeeks I'm almost tempted to try the 2.5 dev version, but I feel like that will only muddy the problem with likely more issues.

                          2.5 is based on FreeBSD-12.2/STABLE, so it is newer still. But it does have all of the latest NIC driver "fixes". The really big change in terms of NIC drivers in FreeBSD-12 is the move to the iflib wrapper API. That is a big change to the way NIC manufacturers write their hardware drivers.

                          • Gertjan @larold42

                            This one - maybe not related - is also an issue that has to be checked :
                            @larold42 said in WAN flapping since upgrading to 2.4.5:

                            Oct 26 23:28:36 php-fpm 98247 /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1603754916] unbound[19136:0] error: bind: address already in use [1603754916] unbound[19136:0] fatal error: could not open ports'

                            "could not open ports" == another instance (probably unbound) was already running, or is still running (?). A new one can't be launched, as the ports it needs, like 53, are still occupied.
                            If I recall correctly (2.4.4-p3 is rather old already), there was a timing issue with unbound: the "stop" takes some time, and the restart came in too fast.

                            Check the unbound logs to see why it failed.
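
To see what "address already in use" looks like in isolation, here is a minimal Python sketch. It is not how unbound or pfSense do it; it only shows that a second bind to a port fails while a first socket still holds that port, and succeeds once the first one is gone. The helper `try_bind` is a made-up name, and the sketch uses a random free UDP port on loopback instead of port 53 so it runs without root.

```python
import errno
import socket

def try_bind(port, host="127.0.0.1"):
    """Try to bind a UDP socket; return (socket, None) or (None, errno)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((host, port))
        return s, None
    except OSError as e:
        s.close()
        return None, e.errno

# First "instance": port 0 lets the OS pick a free port (stand-in for 53).
first, err1 = try_bind(0)
port = first.getsockname()[1]

# Second "instance" asks for the same port while the first still holds it.
second, err2 = try_bind(port)
print("second bind refused:", err2 == errno.EADDRINUSE)

# After the first socket is closed, the same bind goes through.
first.close()
third, err3 = try_bind(port)
print("bind after close, errno:", err3)
third.close()
```

That is the same ordering problem as a restart that arrives before the old process has released its sockets.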

                            Also: take note that these settings:

                            [screenshot: gateway settings]

                            will take the interface (WAN) down and up (== "flapping"?!) if the monitoring loses contact with the automatic monitor IP or gateway IP (my 87.98.136.xx).
                            Practical joke: many use 8.8.8.8 here, and 8.8.8.8 is not being paid to serve (reply to) ICMP packets; its job is serving DNS requests. So when 8.8.8.8 stops replying to (useless) ICMP, many WAN interfaces will fall.
                            In other words, your WAN connection will only be as good as the "Monitor IP" is at replying to pings. Temporary solution: disable the Gateway action to exclude this as a possible cause.
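
As a rough illustration of why the gateway was flagged in the posted log, here is a small Python sketch. It is a simplified model and not dpinger's actual code: it assumes an alarm is raised whenever measured latency or loss goes above its threshold. The thresholds come from the dpinger line in the log (latency_alarm 500ms, loss_alarm 20%); the class and function names are made up for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thresholds:
    latency_alarm_ms: float = 500.0  # "latency_alarm 500ms" in the dpinger log line
    loss_alarm_pct: float = 20.0     # "loss_alarm 20%" in the dpinger log line

def gateway_alarm(rtt_ms, loss_pct, t=Thresholds()):
    """Simplified model: alarm when latency or loss exceeds its threshold."""
    return rtt_ms > t.latency_alarm_ms or loss_pct > t.loss_alarm_pct

# Values from the "Gateway alarm" log line: RTT 16.676ms, Loss 37%.
print(gateway_alarm(16.676, 37))  # loss is over 20%, so this is an alarm
print(gateway_alarm(16.676, 0))   # same latency with no loss: no alarm
```

In the posted log the latency is fine and it is the 37% loss that crosses the 20% threshold, which fits the link having just dropped.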

                            @larold42 said in WAN flapping since upgrading to 2.4.5:

                            and immediately noticed my WAN going up and down every 15 mins.

                            So it's down all the time (15 minutes) - then it goes UP :
                            Your log starts with :

                            Oct 26 23:28:32 kernel igb0: link state changed to UP

                            to go down 5 seconds later

                            Oct 26 23:28:37 kernel igb0: link state changed to DOWN

                            at the end of the log lines you showed - and it stays down for another 15 minutes ?
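
If the full system log is available, the same UP/DOWN timing can be pulled out mechanically. Below is a minimal Python sketch that does this for the two kernel lines quoted above. The function names are made up, and because syslog lines carry no year, the year is an assumed parameter.

```python
from datetime import datetime

# The two kernel lines quoted from the log in this thread.
LOG = """\
Oct 26 23:28:32 kernel igb0: link state changed to UP
Oct 26 23:28:37 kernel igb0: link state changed to DOWN
"""

def link_events(text, year=2020):
    """Return (timestamp, interface, state) for each link state change line.

    The year is an assumption: the log lines themselves do not include one.
    """
    events = []
    for line in text.splitlines():
        if "link state changed to" not in line:
            continue
        fields = line.split()
        # fields: month, day, time, "kernel", "igb0:", "link", "state", ...
        stamp = datetime.strptime(
            f"{year} {fields[0]} {fields[1]} {fields[2]}", "%Y %b %d %H:%M:%S"
        )
        events.append((stamp, fields[4].rstrip(":"), fields[-1]))
    return events

def intervals(events):
    """Return (from_state, to_state, seconds) for each consecutive pair."""
    return [
        (a[2], b[2], (b[0] - a[0]).total_seconds())
        for a, b in zip(events, events[1:])
    ]

print(intervals(link_events(LOG)))  # [('UP', 'DOWN', 5.0)]
```

Run over a longer stretch of the log, this would show directly whether the link is mostly up with short drops, or mostly down with short recoveries.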


                            • larold42

                              OK, I don't want to jinx myself, but I tried putting a dumb switch in front of my pfSense box and I haven't had an issue... minus a couple of "sendto error: 55" messages. I have no idea how this is a solution. @bmeeks @Gertjan
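
For reference, the numbers in those dpinger "sendto error" lines are errno values. Here is a small Python lookup sketch for the two codes seen in this thread (65 in the earlier log, 55 here). The numbers are the FreeBSD definitions and are hard-coded, since Python's `errno` module would report the numbering of whatever OS the script runs on; the `explain` helper is a made-up name.

```python
# errno numbers as defined on FreeBSD (sys/errno.h). They differ on Linux,
# so they are hard-coded here instead of being read from Python's errno module.
FREEBSD_ERRNO = {
    55: ("ENOBUFS", "No buffer space available"),
    65: ("EHOSTUNREACH", "No route to host"),
}

def explain(log_line):
    """Return a readable form of a dpinger 'sendto error: N' line."""
    code = int(log_line.rsplit(":", 1)[1])
    name, text = FREEBSD_ERRNO.get(code, ("unknown", "not in this table"))
    return f"{code} = {name} ({text})"

print(explain("WAN_DHCP 174.109.176.1: sendto error: 65"))
print(explain("sendto error: 55"))
```

So the earlier 65s say there was no route to the gateway at that moment, while 55 means a send buffer was full, which is a different symptom from the link dropping.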

                              • larold42 @bmeeks

                                @bmeeks So I checked the interface that was doing this:

                                igb0@pci0:1:0:0:        class=0x020000 card=0x0000ffff chip=0x15218086 rev=0x01 hdr=0x00
                                    vendor     = 'Intel Corporation'
                                    device     = 'I350 Gigabit Network Connection'
                                    class      = network
                                    subclass   = ethernet
                                    bar   [10] = type Memory, range 32, base 0xdf160000, size 131072, enabled
                                    bar   [18] = type I/O Port, range 32, base 0xe060, size 32, enabled
                                    bar   [1c] = type Memory, range 32, base 0xdf18c000, size 16384, enabled
                                    cap 01[40] = powerspec 3  supports D0 D3  current D0
                                    cap 05[50] = MSI supports 1 message, 64 bit, vector masks
                                    cap 11[70] = MSI-X supports 10 messages, enabled
                                                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
                                    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR RO NS
                                                 link x4(x4) speed 5.0(5.0) ASPM disabled(L0s/L1)
                                    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
                                    ecap 0003[140] = Serial 1 003018ffff0f0d21
                                    ecap 000e[150] = ARI 1
                                    ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
                                                     0 VFs configured out of 8 supported
                                                     First VF RID Offset 0x0180, VF RID Stride 0x0004
                                                     VF Device ID 0x1520
                                                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
                                    ecap 0017[1a0] = TPH Requester 1
                                    ecap 0018[1c0] = LTR 1
                                    ecap 000d[1d0] = ACS 1
                                

                                So... I'm wondering how many other folks are having issues running I350s; this has to be a driver issue.

                                EDIT:
                                Here is the driver info as well
                                dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k
                                dev.igb.%parent:
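
A side note on reading the pciconf output above: the `chip=` value packs two PCI IDs into one 32-bit number, with the device ID in the high 16 bits and the vendor ID in the low 16 bits. A minimal Python sketch of the split (the function name is made up for illustration):

```python
def split_chip(chip):
    """Split pciconf's 32-bit chip value into (vendor_id, device_id)."""
    value = int(chip, 16)
    vendor = value & 0xFFFF          # low 16 bits: vendor ID
    device = (value >> 16) & 0xFFFF  # high 16 bits: device ID
    return vendor, device

vendor, device = split_chip("0x15218086")
print(f"vendor=0x{vendor:04x} device=0x{device:04x}")
# vendor=0x8086 device=0x1521
```

0x8086 is Intel's PCI vendor ID, which matches the vendor line in the output, and 0x1521 is the device ID that pciconf labels 'I350 Gigabit Network Connection'. That pair is handy when checking which hardware a given driver version lists as supported.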

                                • bmeeks

                                  Here is a link to the source code for the latest version of Intel driver for what appears to be your card: https://downloadcenter.intel.com/download/15815/Intel-Network-Adapter-Driver-for-82575-6-and-82580-Based-Gigabit-Network-Connections-under-FreeBSD-?product=46827. This is only the C source code. To use this driver, you would need to create your own separate FreeBSD-11 virtual machine with the proper developer tools installed (compiler and linker) and then compile the source code into the binary driver module. Then copy that module over to your pfSense box and load it. That may be more effort than you wish to expend, though.

                                  The one thing I've noticed over the years with FreeBSD is that the support of newer hardware seems to lag behind Linux. The drivers within FreeBSD-11 and earlier are maintained by a team of Intel folks who then submit the updates to FreeBSD. For FreeBSD-12 and later, as I mentioned in a previous post, FreeBSD has moved to a new wrapper API called iflib. That move has muddied the waters a bit in terms of NIC driver development and support as now the FreeBSD team has the iflib API part while hardware manufacturers write the pieces that need to directly manipulate widgets on their particular NIC.

                                  It might be worth trying pfSense-2.5 DEVEL since it is based on FreeBSD-12.2/STABLE and will contain newer NIC driver versions.

                                  Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.