Netgate Discussion Forum

    2.5.1: missing route to localhost (no joke)

    Routing and Multi WAN
    • Gertjan @612brokeaf

      @612brokeaf

      Yeah, a couple of weeks ago, since 2.5.1, commands like dig, ping and others started showing an error message saying that localhost (127.0.0.1) was absent.

      To get it back: reboot.

      I guess there is already a redmine issue for it.

      No "help me" PM's please. Use the forum, the community will thank you.
      Edit : and where are the logs ??

      • viktor_g Netgate

        Unable to reproduce, but it may be related to https://redmine.pfsense.org/issues/11806

        You can try this patch: 221.diff

        • 612brokeaf @Gertjan

          @gertjan Rebooted n times, no change.

          • Gertjan @viktor_g

            @viktor_g said in 2.5.1: missing route to localhost (no joke):

            Unable to reproduce

            I was referring to these forum posts.

            The issue should pop up when you declare something like this:

            [image: 0eaa6557-4dae-4bb0-8b7c-5bcb548cb578-image.png]

            • 612brokeaf @viktor_g

              @viktor_g OK, that patch works, albeit not completely. I now have the route to 127.0.0.1 in the table, but another route for localhost (a secondary 172.16.x.x/32) has disappeared after rebooting, meaning my routing is now completely broken until I manually add that route pointing it to localhost, because I rely on this for BGP etc. Interestingly, there is another /32 on lo0 from the same range that I use as GRE source, and that was unaffected.

              • viktor_g Netgate @612brokeaf

                @612brokeaf said in 2.5.1: missing route to localhost (no joke):

                @viktor_g OK, that patch works, albeit not completely. I now have the route to 127.0.0.1 in the table, but another route for localhost (a secondary 172.16.x.x/32) has disappeared after rebooting, meaning my routing is now completely broken until I manually add that route pointing it to localhost, because I rely on this for BGP etc. Interestingly, there is another /32 on lo0 from the same range that I use as GRE source, and that was unaffected.

                Could you show your complete routing config?

                • 612brokeaf @viktor_g

                  @viktor_g Not unless you have a config sanitisation tool where I could securely paste the XML and hide config details.

                  I can describe what I have though.

                  I have a hub and spoke type setup with multiple pfSense hosts in different regions as the hub(s). Spokes are a mix of pfSense and traditional big name hardware vendors.

                  On each hub:

                  • 10+ pairs of IPSec tunnels over GRE (several spokes + full mesh between hubs), meaning 10+ GRE interfaces, times two - for each location there is v4 and v6 (IPSec tunnel + GRE for each)
                  • 4 x extra secondary IPs on lo0 (Firewall -> VIPs -> type: alias): two IPv4 (172.16.x.x/32) and two IPv6 (fd00:xx::xx/128). For both v4 and v6, one is used as GRE source and this -> remote is what the IPSec tunnels cover, and the other is a general loopback for services/router IDs/BGP peering.
                  • Running FRR with OSPF + OSPF3 to distribute v4 and v6 loopbacks, and BGP via those loopbacks, hubs run a route reflector.
                  • A single WAN interface with a primary static v4 IP and multiple secondary v4 IPs. Secondaries / extra v4 IPs are /32s, gateway is the primary gateway. Also a /56 public IPv6 on each, ND-RA/SLAAC with /64 PDs.

                  I think possibly the issue triggers when setting up aliases on lo0. After the upgrade from 2.5.0 to 2.5.1, the 127.0.0.1 route was gone from routing table, even though the IP was configured correctly. After the patch you suggested, the route for 127.0.0.1 was in, but the route for another v4 alias for lo0 was not, while the third one was in. This broke most of my VPN tunnels, because some spokes have dynamic IPs and DNS is used to resolve the IPSec tunnel endpoints. Dnsmasq listens on 127.0.0.1 and this is what indirectly broke things. Before the patch I changed the local DNS server to 172.16.x.x as a workaround, but with the patch, that IP didn't make it into the routing table, resulting in the same issue.

                  For now I added manual shellcmds to install the missing lo0 routes on boot.
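A minimal sketch of what such a boot-time script might look like, since the exact commands are not given in the post. The addresses are placeholders from the documentation range, not the real lo0 aliases, and the `route add -host ... -iface lo0` form is an assumption about how the routes were re-added (FreeBSD route(8) syntax). The helper names and the `DRY_RUN` switch are hypothetical; with the default `DRY_RUN=1` the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: re-add host routes for lo0 addresses that go missing after boot.
# Placeholder addresses (assumption); replace with the real lo0 aliases.
LO0_ADDRS="127.0.0.1 192.0.2.1 192.0.2.2"

# Print (DRY_RUN=1, the default) or execute one FreeBSD route(8) command.
add_lo0_route() {
    cmd="route add -host $1 -iface lo0"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Walk the address list and handle each entry in turn.
add_all_lo0_routes() {
    for addr in $LO0_ADDRS; do
        add_lo0_route "$addr"
    done
}

add_all_lo0_routes
```

Running it with `DRY_RUN=0` as root would actually install the routes; a shellcmd entry could then point at the script.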

                  For completeness: I have another manual modification in place, in /etc/inc/config.lib.inc, and that is changing alias_make_table(); to alias_make_table($config);, because otherwise I kept getting crash reports / PHP errors complaining about alias_make_table being called with zero arguments and expecting one. This was being triggered from the ACME cert renewal cron job. There is also another bug in ACME, complaining about the function getarraybyref() not found. Even though all PHP include chains look fine, I can't find another way to fix this than pasting that function into the same scope in ACME. This is for another topic though - this issue looked fixed in 2.5.0, but maybe I fixed it by hand and forgot about it until 2.5.1.

                  • 612brokeaf

                    Instead of shellcmds, I added manual static routes to 127/8 and the two other /32s I have on lo0 in the GUI. The node now survives reboot entirely intact.

                    Side note: when adding static routes, the gateway / interface selection lists lo0 as "null4" and "null6" respectively. This naming is a little confusing: to a network engineer it looks like a blackhole route, and it is probably meant to be exactly that, but it hints that there may be additional rules in place that actually drop the traffic rather than just push it to the CPU, much like the dedicated null interface found in other network OSes.

                    • 612brokeaf

                      Correction: setting those missing loopback routes as static routes apparently only fixed it on one node and only temporarily.

                      @viktor_g looks like the patch did not change much - I ran debug on that function and it didn't seem to be touching the v4 loopback, so this may be elsewhere - possibly IPSec scripts, since there were so many fixes in 2.5.1? Probably a good test would be to stop/start IPSec and see if this breaks the loopback again, at least it would narrow this down somewhat.
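One way to run that test could be to save `netstat -rn` output before and after restarting IPsec and list the destinations that vanished. The sketch below assumes both snapshots were saved to files; the function names are hypothetical, and how the service gets restarted (GUI or otherwise) is left open.

```shell
#!/bin/sh
# Extract the Destination column from saved `netstat -rn` output,
# skipping the header lines ("Routing tables", "Internet:", column titles).
route_dests() {
    awk 'NF >= 3 && ($1 == "default" || $1 ~ /^[0-9]/ || $1 ~ /:/) { print $1 }' "$1"
}

# Print destinations present in the "before" file but absent from "after".
lost_destinations() {
    after=$(route_dests "$2")
    route_dests "$1" | while read -r dest; do
        printf '%s\n' "$after" | grep -qxF -- "$dest" || echo "$dest"
    done
}
```

Usage would be along the lines of `netstat -rn > before.txt`, restart IPsec, `netstat -rn > after.txt`, then `lost_destinations before.txt after.txt`. An empty result means no destination disappeared.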

                      Anyhow, I added shellcmds (regular, not early) adding routes to the various lo0 addresses, and that seems to have worked so far, across 10+ reboots. It's an ugly fix but I'm not touching it until some proper resolution comes up. I've run out of downtime credits for now so can't test much for the next few days.

                      • viktor_g Netgate @612brokeaf

                        @612brokeaf said in 2.5.1: missing route to localhost (no joke):

                        @viktor_g OK, that patch works, albeit not completely. I now have the route to 127.0.0.1 in the table, but another route for localhost (a secondary 172.16.x.x/32) has disappeared after rebooting, meaning my routing is now completely broken until I manually add that route pointing it to localhost, because I rely on this for BGP etc. Interestingly, there is another /32 on lo0 from the same range that I use as GRE source, and that was unaffected.

                        Unable to reproduce on the latest dev snapshot:
                        Screenshot from 2021-05-09 16-04-01.png

                        all OK after rebooting:

                        # netstat -rn | grep 127
                        5.5.5.0/24         127.0.0.1          UGSB        lo0
                        6.6.6.6/32         127.0.0.1          UGSB        lo0
                        127.0.0.1          link#5             UH          lo0
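The same check can be scripted rather than eyeballed. The sketch below reads routing-table text on stdin and prints every expected destination that has no entry. The function name and the sample addresses are hypothetical, and the match is an exact comparison against the first column, so a destination has to be written the way netstat prints it (for example `6.6.6.6/32` versus `127.0.0.1` in the output above).

```shell
#!/bin/sh
# Report which expected destinations are missing from `netstat -rn` output.
# Routing table text comes in on stdin; expected destinations are arguments.
missing_routes() {
    table=$(cat)
    for dest in "$@"; do
        # Compare against the first column (Destination) exactly.
        if ! printf '%s\n' "$table" |
            awk -v d="$dest" '$1 == d { found = 1 } END { exit !found }'; then
            echo "$dest"
        fi
    done
}
```

For example, `netstat -rn | missing_routes 127.0.0.1 192.0.2.1` would print `192.0.2.1` if only the 127.0.0.1 route were present (192.0.2.1 is a placeholder address).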
                        
                        • viktor_g Netgate @612brokeaf

                          @612brokeaf said in 2.5.1: missing route to localhost (no joke):

                          For completeness: I have another manual modification in place, in /etc/inc/config.lib.inc, and that is changing alias_make_table(); to alias_make_table($config);, because otherwise I kept getting crash reports / PHP errors complaining about alias_make_table being called with zero arguments and expecting one. This was being triggered from the ACME cert renewal cron job. There is also another bug in ACME, complaining about the function getarraybyref() not found. Even though all PHP include chains look fine, I can't find another way to fix this than pasting that function into the same scope in ACME. This is for another topic though - this issue looked fixed in 2.5.0, but maybe I fixed it by hand and forgot about it until 2.5.1.

                          Please create a bugreport about this issue:
                          https://docs.netgate.com/pfsense/en/latest/development/bug-reports.html

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.