Netgate Discussion Forum

    Changes to IPsec tunnels leads to routing instability

    General pfSense Questions
    10 Posts 2 Posters 1.1k Views
    • M
      michmoor LAYER 8 Rebel Alliance

      Wondering if anyone has come across this. I have filed a Redmine for it: https://redmine.pfsense.org/issues/14483#change-68058
      This potential issue is enough to make me explore other options.

      I have a hub with 4x IPsec VTI tunnels, each running eBGP. The problem I discovered is that when I make any change to a single tunnel (it doesn't matter which one, or whether it is a description change or a Phase 1 parameter change), all 4x tunnels briefly lose routing once I apply it. BGP flaps. From my experience with other platforms, this should not happen at all.
      It's so bad that if I bring up another tunnel and apply changes, routing breaks for all tunnels briefly and then comes back up. This is really unreasonable.
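
A quick way to confirm this kind of flap in a lab is to record each peer's BGP session uptime before and after applying the change: a session that was not reset should show a larger uptime afterwards. The following is a minimal Python sketch under that assumption. The function name `reset_peers` and the uptime values are hypothetical, chosen only for illustration, and collecting the uptimes from the router is left out.

```python
def reset_peers(before, after):
    """Return neighbors whose BGP session uptime went down between two snapshots.

    `before` and `after` map a neighbor address to its session uptime in seconds.
    A neighbor missing from `after` (no established session) also counts as reset.
    """
    flapped = []
    for neighbor, up_before in before.items():
        up_after = after.get(neighbor)
        if up_after is None or up_after < up_before:
            flapped.append(neighbor)
    return sorted(flapped)


# Illustrative values only (not taken from a real system):
# snapshots taken shortly before and shortly after clicking Apply.
before_apply = {"10.6.106.2": 86400, "10.6.106.6": 7200, "172.28.0.5": 3600}
after_apply = {"10.6.106.2": 50, "10.6.106.6": 36, "172.28.0.5": 3660}

print(reset_peers(before_apply, after_apply))
```

If editing one tunnel resets only that tunnel's peer, the list contains a single neighbor. The behavior described above would list every peer on the hub.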

      At first I thought this was a gateway monitoring action, but those have been disabled.
      Then I thought it might be a package issue, so I disabled pfBlocker [I've had issues with this package impacting the reliability of the system in the past].

      Today, one of the things I noticed in the logs when making an interface change is the following:
      [screenshot: 714bc717-ea9b-40bb-8f79-ede4ab4f66b5-image.png]

      Firewall: NetGate,Palo Alto-VM,Juniper SRX
      Routing: Juniper, Arista, Cisco
      Switching: Juniper, Arista, Cisco
      Wireless: Unifi, Aruba IAP
      JNCIP,CCNP Enterprise

      • P
        pete35 @michmoor

        @michmoor
        I suppose you use the FRR package. I noticed this behavior some years ago. It was never solved, and it happens with OSPF too, which makes the dynamic routing packages on pfSense really worthless.
        It's not clear to me whether the problem is on the FRR routing software side (that is beta software) or just the way pfSense implements routing table changes. So we will see
        whether your Redmine ticket gets any attention, but I don't think it will.

        Read this and some other topics: https://forum.netgate.com/topic/145653/ffr-restart-on-configuration-changes?_=1687120051977

        • M
          michmoor LAYER 8 Rebel Alliance @pete35
          last edited by michmoor

          @pete35 This is VERY frustrating at this point. The Redmine was closed, and you can tell no thought was given beyond the following "fix":
          "This is part of the reason why the option Ignore IPsec Restart in FRR exists."

          That option is enabled for me... The ENTIRE POINT of opening the Redmine is that there is something broken in the way FRR or IPsec is handled within pfSense.

          I just did a description change on one of my tunnels. As you can see, all of my tunnels' routing peers flapped.
          As I also mentioned in the ticket, I informed Netgate that I am trying this on another FreeBSD-based system (*sense), which I will not name, and this issue does not exist there. This is specific to pfSense.
          To reject the ticket without even mentioning whether you were able to replicate it is really bad form. I've worked with Marcos (yes, calling him out by name) and he's professional, so I'm not sure why the ticket was dismissed in this way.
          @stephenw10 can you assist here? Can you see if you are able to replicate this issue in a lab? If possible, please re-open the Redmine.

           sh ip bgp summary
          
          IPv4 Unicast Summary:
          BGP router identifier 192.168.50.254, local AS number 65001 vrf-id 0
          BGP table version 981
          RIB entries 50, using 9600 bytes of memory
          Peers 4, using 57 KiB of memory
          Peer groups 1, using 64 bytes of memory
          
          Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt
          10.6.106.6      4      31898      3286      3426        0    0    0 00:00:36            2        6
          10.6.106.10     4      31898      3218      3334        0    0    0 00:00:37            2        6
          10.6.106.2      4      65520     14803     15170        0    0    0 00:00:50            2        6
          172.28.0.5      4      65002     14665     14676        0    0    0 00:00:48            1        2
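
In output like the above, a flap shows up in the Up/Down column: every session is less than a minute old even though the MsgRcvd/MsgSent counters are in the thousands. Below is a minimal Python sketch for picking such peers out of the text summary. The helper names and the 120-second threshold are illustrative assumptions, not part of FRR or pfSense, and the parser only understands the hh:mm:ss form seen here. Any other Up/Down value is simply not reported as recent.

```python
import re

# One neighbor row of the text summary: Neighbor, V, AS, five counters
# (MsgRcvd, MsgSent, TblVer, InQ, OutQ), then Up/Down and State/PfxRcd.
ROW = re.compile(
    r"^\s*(?P<neighbor>\d+\.\d+\.\d+\.\d+)\s+\d+\s+\d+"
    r"(?:\s+\d+){5}\s+(?P<updown>\S+)\s+(?P<state>\S+)"
)


def uptime_seconds(updown):
    """Convert an hh:mm:ss Up/Down value to seconds, or None for any other format."""
    match = re.fullmatch(r"(\d+):(\d{2}):(\d{2})", updown)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def recently_reset(summary, threshold=120):
    """Return (neighbor, uptime) pairs for sessions younger than `threshold` seconds."""
    peers = []
    for line in summary.splitlines():
        row = ROW.match(line)
        if row is None:
            continue  # header, blank, or non-neighbor line
        uptime = uptime_seconds(row.group("updown"))
        if uptime is not None and uptime < threshold:
            peers.append((row.group("neighbor"), uptime))
    return peers


# Neighbor rows copied from the summary above.
SUMMARY = """\
Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt
10.6.106.6      4      31898      3286      3426        0    0    0 00:00:36            2        6
10.6.106.10     4      31898      3218      3334        0    0    0 00:00:37            2        6
10.6.106.2      4      65520     14803     15170        0    0    0 00:00:50            2        6
172.28.0.5      4      65002     14665     14676        0    0    0 00:00:48            1        2
"""

for neighbor, uptime in recently_reset(SUMMARY):
    print(f"{neighbor}: session is only {uptime}s old")
```

Run against the pasted rows, all four neighbors fall under the threshold, which matches the observation that every tunnel's peer flapped after a single description change.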
          
          

          What happens when I make a change in IPsec:
          [screenshot: 87da1818-2646-4fa7-97a3-6667d6d4422f-image.png]

          Setting enabled:
          [screenshot: 15cb2deb-cf10-4fbc-967b-3862bc994d3a-image.png]

          • M
            michmoor LAYER 8 Rebel Alliance @pete35

            @pete35 said in Changes to IPsec tunnels leads to routing instability:

            or just the way Pfsense implements routing table changes

            I can confirm, working on a lab OPNsense machine, that this problem is not present there. There is something in the way pfSense is implementing FRR or IPsec.

            • P
              pete35 @michmoor

              @michmoor

              The situation with tickets for this issue was expected. There are more open issues with dynamic routing. If you can go with OPNsense, just go ahead.

              • M
                michmoor LAYER 8 Rebel Alliance @pete35

                @pete35 FRR has been stable for me throughout my deployments, both personal (home lab) and professional as an MSP. Using BGP on the edge is fine.
                I noticed certain configurations do not work well, such as IPsec and routing. This is a critical deployment where swapping out for another platform is a headache but not impossible.
                I'm honestly still shocked the Redmine got rejected, but it then got reopened with an acknowledgment that this could be replicated.

                • P
                  pete35 @michmoor

                  @michmoor
                  Yes, there is hope. But on the other side, if you look at the open bug tickets in Redmine, there are about 20 which are older than 3 years. You need some patience.
                  jimp mentioned not changing anything during work hours... a practical solution but nevertheless very unsatisfactory. This stops me from implementing dynamic routing with pfSense.

                  • M
                    michmoor LAYER 8 Rebel Alliance @pete35

                    @pete35 I can't believe the solution is to not make changes during business hours, which is so confusing.

                    The resolution on the Redmine I submitted makes it seem as though a simple checkbox would solve all my issues, but that checkbox doesn't do anything from what I can tell. As a test I reverted my test system back to 22.05 and saw the same issues. So I'm confused as to what the checkbox does in FRR.

                    This is about money for me at the end of the day. This is a hub-and-spoke topology where I am doing a rip and replace. I spec'd out this project for an 8200 at the hub. Spokes get migrated over the next several months. This was just a POC (proof of concept) I was building, but now I have to go back to the client and spec something completely different, which is annoying and embarrassing.
                    I wish I had known that FRR does not work in an enterprise setting. The blame falls completely on Netgate here. The package is broken with no solution, but it's included. Why? No bias here: if another vendor pulled this stunt I would have harsh words for them as well.
                    I can't even have a very basic IPsec/BGP tunnel running. I was looking at TNSR for one job, but I have a big confidence issue right now with this product. There is no telling what works and what doesn't without relying on good people like @pete35 to provide a forum link. A forum link....
                    A firewall needs to route. This is as basic as it gets. As an implementer, I need to be able to trust the knobs that are presented to me in the GUI. This is a reevaluation point for me, and it's really unfortunate.

                    • P
                      pete35 @michmoor
                      last edited by pete35

                      @michmoor
                      It should be possible to load and run OPNsense on a Netgate 8200; if that works for you, your POC might work too.
                      If it doesn't work, you can always reload pfSense onto the 8200 and sell the unit.
                      For routing, TNSR may be possible, but as far as I can see, with much higher yearly costs.

                      • M
                        michmoor LAYER 8 Rebel Alliance @pete35

                        @pete35 Update on this.

                        I secured this contract. We're using a pair of Juniper SRX 380s, as we got dual 10 Gbps DIA circuits.

                        I am posting lessons learned for posterity's sake and as a cautionary tale for others who search for something similar.

                        1. pfSense cannot perform advanced routing in a stable way. FRR needing to be reloaded for changes is a problem that I do not blame pfSense for. That's just the way the package currently functions, but it should still be taken into account. I have over 15 sites in a hub-and-spoke setup. If I update FRR, I'm breaking connectivity for all sites. When won't I have to make a route map change? Add a new BGP neighbor? There is no maintenance window in the world in which a company would approve a global outage. There are workarounds for this, I suppose, but they are not worth exploring.

                        2. As I outlined in my Redmine, there is an issue with IPsec that impacts FRR in a negative way. The problem isn't with FRR.
                          If there is a need to do routing over IPsec (obviously utilizing VTIs), then pick another firewall. Imagine you have a datacenter terminating over 50 IPsec tunnels. All you do is update the IPsec configuration, or even onboard another site, and click apply. You just broke routing within the enterprise. That's absolutely insane and scary. This is something that can be replicated by TAC per the Redmine. I can't in good conscience recommend deploying pfSense in that situation.
                          I got extremely lucky in that my client, who had paid thousands of dollars for the 6100s, was willing to make the sacrifice of getting the Juniper head-end SRXs to manage all of this.
                          I really do advise anyone reading this to consider something else if your solution requires dynamic routing with IPsec. Beware...

                        3. Lastly, there are lots of things that pfSense gets right. I will continue to deploy it in much less advanced scenarios but cannot use it going forward on topologies that require high availability with routing. The software just can't do it. This was indeed an eye-opener for me, but we all learn the hard way.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.