Netgate Discussion Forum

    Weekly Loss of WAN after 2.2.2 upgrade

    Problems Installing or Upgrading pfSense Software
      mattlach

      Hey all,

      Since the upgrade to 2.2.2, I lose my WAN IP roughly once a week.  Everything comes back up after a reboot, though.

      This is what I am faced with when I log in to troubleshoot:

      What log information might be useful to troubleshoot this?

      Appreciate any help!

      Thanks,
      Matt

        cmb

        What does the system log show? Looks like DHCP isn't able to obtain a lease. dhclient should be logging something relevant.

          mattlach

          @cmb:

          What does the system log show? Looks like DHCP isn't able to obtain a lease. dhclient should be logging something relevant.

          Thank you.

          At the point this has happened, it has had a negotiated IP on the WAN for - in this case - 6 days and 20 hours, so just about a week.  Do you think it is failing at renegotiating?

          The log page in the GUI only shows me the last 50 entries, which doesn't go back far enough.

          Is there a file in /var/log I can look at instead?  What can I grep for?  The interface name?

          I've tried a few different things (like grepping for dhcp, or for my WAN interface, em1, etc.), but I'm not sure if I'm finding anything relevant.

          This comes up repeatedly at about the same time as the problem (or just before?).  I've anonymized it to hide my IP but you get the point:

          /var/log/gateways.log:Apr 13 13:12:28 home-router apinger: Could not bind socket on address(xxx.xxx.xxx.28) for monitoring address xxx.xxx.xxx.1(WAN_DHCP) with error Can't assign requested address

          Does this sound like it might be it?

          There are pages upon pages of this same error in a row, one every second.

          Appreciate any thoughts!

          –Matt

            cmb

            'clog /var/log/system.log' will dump the full system.log. Or you can increase the number of lines shown on the Settings tab under Status > System Logs.

            The apinger log is a symptom of the problem: it has no IP to bind to. It sounds like DHCP renewals are failing for some reason.
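
Once the full log is dumped, a small filter helps cut it down to the lines that matter for a dropped WAN. This is only a minimal sketch, assuming a POSIX shell on the firewall. `filter_link_events` is a hypothetical helper name, and the patterns are guesses based on the kind of messages dhclient, the NIC driver, and rc.linkup write, not an official list.

```shell
# Hypothetical helper: read syslog text on stdin and keep only the lines
# that usually matter when a DHCP WAN drops (dhclient activity, NIC watchdog
# resets, link state changes, and rc.linkup events).
filter_link_events() {
    grep -E 'dhclient|Watchdog timeout|link state changed|rc\.linkup'
}
```

With the clog command mentioned above, usage would look like `clog /var/log/system.log | filter_link_events`.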

              mattlach

              @cmb:

              'clog /var/log/system.log' will dump the full system.log. Or you can increase the number of lines shown on the Settings tab under Status > System Logs.

              The apinger log is a symptom of the problem: it has no IP to bind to. It sounds like DHCP renewals are failing for some reason.

              Thank you.

              Looked at the system log.  This looks relevant:

              Jun  4 19:30:42 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1

              Again, repeated very often for several pages, as many as 16 times per second at about the time of the issue.
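
To put a number on how often a message repeats, the log lines can be grouped by timestamp. A minimal sketch, assuming the classic syslog layout shown in this thread (month, day, HH:MM:SS as the first three fields) and input that is already in chronological order. `count_per_second` is a hypothetical helper name.

```shell
# Hypothetical helper: count how many log lines containing a fixed string
# were written in each second. Relies on the syslog layout where the first
# three whitespace-separated fields are month, day and HH:MM:SS.
# Usage: count_per_second STRING < logfile
count_per_second() {
    grep -F -- "$1" | awk '{ print $1, $2, $3 }' | uniq -c
}
```

For example, `clog /var/log/system.log | count_per_second arpresolve` prints one line per second, with the count in the first column.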

              Like before, this is probably a symptom, not the problem itself.  Went up to see what the last message was before arpresolve started spamming the log and found this:

              
              Jun  4 19:27:45 home-router check_reload_status: updating dyndns WAN_DHCP
              Jun  4 19:27:45 home-router check_reload_status: Restarting ipsec tunnels
              Jun  4 19:27:45 home-router check_reload_status: Restarting OpenVPN tunnels/interfaces
              Jun  4 19:27:45 home-router check_reload_status: Reloading filter
              Jun  4 19:27:46 home-router php-fpm[98346]: /rc.dyndns.update: phpDynDNS (an.address.com): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
              Jun  4 19:27:47 home-router bandwidthd: DNS timeout for xxx.xxx.xxx.242: This problem reduces graphing performance
              Jun  4 19:27:47 home-router bandwidthd: DNS timeout for xxx.xxx.xxx.242: This problem reduces graphing performance
              Jun  4 19:28:24 home-router check_reload_status: Linkup starting em1
              Jun  4 19:28:24 home-router kernel: em1: Watchdog timeout -- resetting
              Jun  4 19:28:24 home-router kernel: em1: Queue(0) tdh = 135, hw tdt = 104
              Jun  4 19:28:24 home-router kernel: em1: TX(0) desc avail = 31,Next TX to Clean = 135
              Jun  4 19:28:24 home-router kernel: em1: link state changed to DOWN
              Jun  4 19:28:25 home-router php-fpm[95237]: /rc.linkup: DEVD Ethernet detached event for wan
              Jun  4 19:28:26 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              Jun  4 19:28:27 home-router kernel: arpresolve: can't allocate llinfo for xxx.xxx.xxx.1 on em1
              

              So, it looks like the dyndns check runs, followed by something for bandwidthd, and finally the watchdog loses touch with em1, which goes down and doesn't come up again, followed by the log spam from arpresolve.
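
Scrolling back by hand to find what came just before the spam can be replaced with grep's context options. A minimal sketch, assuming a grep that supports `-m` and `-B` (both GNU and FreeBSD grep do). `context_before_first` is a hypothetical helper name, and the default of 15 lines is arbitrary.

```shell
# Hypothetical helper: print the N lines that precede the first line
# containing a fixed string, followed by that line itself, then stop.
# Usage: context_before_first STRING [N] < logfile
context_before_first() {
    grep -F -m 1 -B "${2:-15}" -- "$1"
}
```

For example, `clog /var/log/system.log | context_before_first arpresolve 20` shows the 20 lines leading up to the first arpresolve message in the dump. If the log holds more than one outage, that is the earliest one.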

              It's odd that all these things happen right before my issue.  Are all these executed on a regular interval, one after another as part of a cron script?  If so, what else is executed as part of that script?  Is there something that could be taking down my em1 interface?

              I did a "crontab -e" to try to check, but found it empty…  ???
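
`crontab -e` opens the per-user crontab of whoever runs it, which can be empty even when jobs are scheduled system-wide. A minimal sketch for listing the active lines of a crontab-format file follows. It assumes (this is not confirmed anywhere in this thread) that pfSense keeps its scheduled jobs in the system crontab at /etc/crontab, as stock FreeBSD does. `active_cron_entries` is a hypothetical helper name.

```shell
# Hypothetical helper: print every line of a crontab-format file that is
# neither blank nor a comment. Environment assignments such as SHELL=...
# are printed too, since they are not comments.
# Usage: active_cron_entries FILE
active_cron_entries() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

Under that assumption, `active_cron_entries /etc/crontab` would show what actually runs on a schedule. Nothing in the log above proves these events come from cron, so an empty result here would not be surprising either.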

              Appreciate any help!

              Thanks,
              Matt

                cmb

                Yeah the arpresolve: can't allocate llinfo is a symptom as well. You're losing link for some reason.

                this thread:
                https://forum.pfsense.org/index.php?topic=81929.0

                suggests that setting the following in /boot/loader.conf.local will fix it.

                hw.pci.enable_msi=0
                hw.pci.enable_msix=0
                

                Reboot after creating that file with those lines (or adding them to the file, if you already have one), and see what that does.
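
For anyone who wants to script the step above, here is a minimal sketch that creates the file or appends to it without duplicating lines that are already there. `add_tunable` is a hypothetical helper name. It assumes one `key=value` entry per line and an existing file that ends with a newline.

```shell
# Hypothetical helper: append "key=value" to a loader-style config file
# unless a line whose key (the text before the first "=") matches exactly
# is already present. Creates the file if it does not exist yet.
# Usage: add_tunable FILE KEY VALUE
add_tunable() {
    file=$1 key=$2 value=$3
    if [ -f "$file" ] &&
       awk -F= -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$file"
    then
        return 0    # key already set; leave the file untouched
    fi
    printf '%s=%s\n' "$key" "$value" >> "$file"
}
```

Usage would be `add_tunable /boot/loader.conf.local hw.pci.enable_msi 0`, then the same for `hw.pci.enable_msix`. After the reboot, `sysctl hw.pci.enable_msi hw.pci.enable_msix` should report 0 for both if the settings took effect.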

                  mattlach

                  Thank you for this suggestion!

                  I will try it and report back within a couple of weeks.

                    mattlach

                    Follow up question:

                    Will these changes to loader.conf be persistent between web interface invoked automatic upgrades, or am I going to have to keep in mind that I need to make this change every time an upgrade comes out?

                    Thanks,
                    Matt

                      heper

                      Yes, that's what the .local is for ;)

                        Supermule

                        Disable your GW monitoring on DHCP connections.

                        I disabled mine and many things have been better since.

                          cmb

                          @Supermule:

                          Disable your GW monitoring on DHCP connections.

                          That has nothing at all to do with a NIC losing link. In a default config, gateway monitoring does nothing but log response times.

                          @mattlach:

                          Will these changes to loader.conf be persistent between web interface invoked automatic upgrades, or am I going to have to keep in mind that I need to make this change every time an upgrade comes out?

                          That's why you put it in loader.conf.local, not loader.conf. The .local file will never be overwritten by an upgrade.

                            mattlach

                            @heper:

                            Yes, that's what the .local is for ;)

                            @cmb:

                            That's why you put it in loader.conf.local, not loader.conf. The .local file will never be overwritten by an upgrade.

                            Thanks, guys.  Sometimes my Linux experience is a great help in the BSDs.  Sometimes it just helps me get in over my head faster.

                            So many things are similar, but so many are also different.

                            Appreciate it!

                            –Matt

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.