Netgate Discussion Forum

BGP convergence with BFD working smoothly with the settings below.

FRR
3 posters, 20 posts, 784 views
  • M
    mcury
    last edited by mcury Feb 27, 2025, 7:48 PM Feb 27, 2025, 7:04 PM

    I have been reading and testing in GNS3. How did I not see this before? Just disabling reply-to on the inbound rules fixes a lot of problems in multipath environments.

    https://docs.netgate.com/pfsense/en/latest/firewall/configure.html#disable-reply-to

    This combination of settings seems to work pretty well, with only one or two packets lost during convergence:

    FRR: "Ignore IPsec Restart" not ticked, which is the default (I found that this option causes problems if the parent interface where IPsec is running goes down).

    The setting below does not change the convergence behavior; it is listed only for information:
    IPsec Filter Mode: Filter IPsec VTI and Transport on assigned interfaces, block all tunnel mode traffic.

    Source:
    Firewall rule, LAN rule: simple allow rule from source to destination (no advanced settings changed).

    Destination:
    On the IPsec VTI firewall rule tab:
    Firewall rule allowing destination to reach the source, with reply-to disabled.

    dead on arrival, nowhere to be found.

    • M
      michmoor LAYER 8 Rebel Alliance @mcury
      last edited by Feb 28, 2025, 2:23 PM

      @mcury What problems are you seeing in multipath environments? I'm not clear on what you are trying to fix. I have tested BFD/ECMP with BGP and have it working right now without issue.

      Firewall: NetGate,Palo Alto-VM,Juniper SRX
      Routing: Juniper, Arista, Cisco
      Switching: Juniper, Arista, Cisco
      Wireless: Unifi, Aruba IAP
      JNCIP,CCNP Enterprise

      • M
        mcury @michmoor
        last edited by mcury Feb 28, 2025, 2:29 PM Feb 28, 2025, 2:29 PM

        @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

        What problems are you seeing in multipath environments?

        States get stuck when the route changes: TCP sessions time out, and even ICMP stops.
        When that happens with TCP, for example, we usually need a new three-way handshake, which means reopening the browser, logging in again, and so on.

        With reply-to disabled on the inbound rules at the destination side of the tunnel (on either the IPsec tab or the VTI tab, depending on the IPsec advanced settings), this no longer happens.
        Convergence takes seconds and everything just works.

        • M
          michmoor LAYER 8 Rebel Alliance @mcury
          last edited by Feb 28, 2025, 2:31 PM

          @mcury

          Related to this??

          https://redmine.pfsense.org/issues/14630

          • M
            mcury @michmoor
            last edited by Feb 28, 2025, 2:36 PM

            @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

            Related to this??

            https://redmine.pfsense.org/issues/14630

            Yeap, exactly that.

            Try disabling reply-to on the inbound rules.
            That is all that is required. There is no need to change the state settings; leave them at the default.
            That seems to fix all the problems.

            I'm currently testing this in a lab with BGP and also for a customer running OSPF; so far, no problems at all.

            Just disable reply-to in a lab environment, test it, and then report back here.
            I really want to know if I'm the only one seeing this work smoothly.
            The lab is running 2.7.2 with all patches applied.
            My customer is running 24.03 on an SG-3100 (IPsec Filter Mode: Filter IPsec VTI and Transport on assigned interfaces, block all tunnel mode traffic) connected to 2.7.2 with the default IPsec Filter Mode.

            • M
              michmoor LAYER 8 Rebel Alliance @mcury
              last edited by michmoor Feb 28, 2025, 2:47 PM Feb 28, 2025, 2:42 PM

              @mcury

              Bringing in @marcosm as he was glancing at this from another forum post not that long ago.

              This is an exciting find, and something that has been "broken" for some time now, but I am curious about two things:

              1. Why does disabling reply-to fix it? [edit] The VTI is a logical interface, so in theory traffic should return to the source. The underlying path may change, of course, but the VTI should be constant.
              2. What is the long-term fix for this? Just disable 'reply-to' on every rule created under an IPsec VTI tab that may be doing dynamic routing? What if I have an IPsec VTI that isn't using routing?

              Seriously though, this is really huge, because FRR has not worked as intended for quite some time.

              • M
                mcury @michmoor
                last edited by Feb 28, 2025, 2:48 PM

                @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

                This is an exciting find

                I have been struggling with this for a long, long time.
                When it worked, I started a new lab just to confirm which setting did the trick.
                And that is it: reply-to. 👍

                • M
                  michmoor LAYER 8 Rebel Alliance @mcury
                  last edited by Feb 28, 2025, 2:51 PM

                  @mcury

                  So the way I see it, the options are either to disable reply-to on the entire system OR to disable reply-to on individual firewall rules under the VTI interface.

                  I wonder what the security implications are, if any.

                  • M
                    mcury @michmoor
                    last edited by Feb 28, 2025, 2:55 PM

                    @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

                    So the way i see it is either disable reply-to on the entire system OR disable reply-to on individual firewall rules under the VTI interface.

                    I disabled it only on some firewall rules.

                    These are the rules in my lab environment:
                    ICMP for dpinger (reply-to enabled)
                    TCP 179 (BGP) (reply-to enabled)
                    UDP 3784 (BFD) (reply-to enabled)
                    ICMP for the local LAN (reply-to disabled)
                    TCP 443 for the local LAN (reply-to disabled)

                    I don't think there are security implications involved.
                    As far as I know, reply-to is just a pf mechanism that keeps reply packets from leaving by a different path than the one the original packets arrived on.

                    But let's see what users who really understand what happens under the hood, and how pf and reply-to work, have to say about this.
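
To make the reply-to behavior described above concrete, here is a minimal Python sketch. It is a toy model, not pf itself: the `State` class, the interface names `ipsec1`/`ipsec2`, and the peer address are all hypothetical. It assumes that a state created with reply-to keeps sending replies out of the pinned interface, while a state created without it follows whatever the routing table says at that moment.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class State:
    """Toy stand-in for a firewall state created by an inbound rule."""
    peer: str                # remote host that opened the connection
    reply_to: Optional[str]  # interface pinned by reply-to, None if disabled


def reply_interface(state: State, routes: Dict[str, str]) -> str:
    """Return the interface a reply packet would leave through."""
    if state.reply_to is not None:
        # reply-to: ignore the routing table and use the pinned interface.
        return state.reply_to
    # reply-to disabled: follow the current routing table.
    return routes[state.peer]


def reply_delivered(state: State, routes: Dict[str, str], up: Set[str]) -> bool:
    """A reply only gets through if its egress interface is still up."""
    return reply_interface(state, routes) in up


# Hypothetical lab: two VTI tunnels; the peer was first reached via ipsec1.
PEER = "10.0.0.10"
pinned = State(peer=PEER, reply_to="ipsec1")  # rule with reply-to
unpinned = State(peer=PEER, reply_to=None)    # rule with reply-to disabled

# ipsec1 fails and the routing protocol moves the route to ipsec2.
routes_after = {PEER: "ipsec2"}
up_after = {"ipsec2"}

print(reply_delivered(pinned, routes_after, up_after))    # False
print(reply_delivered(unpinned, routes_after, up_after))  # True
```

In this simplified picture the pinned state keeps pointing at the dead tunnel after the route moves, which would match the stuck sessions reported earlier in the thread; whether pf behaves exactly this way in every filter mode is not something the sketch can confirm.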

                    • M
                      michmoor LAYER 8 Rebel Alliance @mcury
                      last edited by michmoor Feb 28, 2025, 3:07 PM Feb 28, 2025, 3:06 PM

                      @mcury

                      I think a simple fix for this would actually be updated documentation, believe it or not.

                      System > Advanced > Firewall & NAT has the following:

                      [screenshot not available: login required to view]

                      So in my mind, a note should be added here AND in the documentation suggesting that reply-to be disabled when using dynamic routing.

                      @mcury I think you are right that there isn't any harm in disabling reply-to. It's there as a convenience because most customers would likely need it, but in advanced scenarios such as BGP routing, where traffic may be asymmetric, it MUST be disabled. There is no alternative.

                      Great find, my friend. Between this and helping me with Graylog, I think I owe you some beers, dude.

                      • M
                        mcury @michmoor
                        last edited by mcury Feb 28, 2025, 6:24 PM Feb 28, 2025, 3:11 PM

                        @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

                        Great find my friend. Between this and helping me with Graylog i think i owe you some beers dude.

                        Oh, it's Friday :)
                        🍻 🍻

                        • M
                          michmoor LAYER 8 Rebel Alliance @mcury
                          last edited by michmoor Feb 28, 2025, 6:15 PM Feb 28, 2025, 6:12 PM

                          @mcury
                          Speaking to marcos on the side, there is a bit of nuance here.

                          If the IPsec Filter Mode is changed from the default to 'Filter IPsec VTI and Transport on assigned interfaces', then reply-to gets applied by default. I think that is what ultimately breaks convergence. If using that mode, the fix is to do what we suggested. The state policy must also be set to floating.

                          There are just too many gotchas here, and the only way this was discovered was through experimentation.

                          In my humble opinion, a change needs to go in so that, if the IPsec Filter Mode is changed from the default, all rules created under the VTI firewall tabs have reply-to disabled and the state policy set to floating on the backend. Just a simple drop-down and apply.

                          FRR has been broken for over a year, potentially longer. I'm glad you solved it, but this is a tad much to take into account if you are a network administrator who simply wants failover.

                          Another added wrinkle: if you do not use IPsec with BGP/OSPF and simply have pfSense with multiple ISPs running BGP, as I discovered, you must change the state policy to floating; otherwise traffic gets blackholed.

                          Again, too many gotchas. The default configuration as used today will blackhole traffic if using FRR.
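
The state policy point can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not how pf is implemented: the `FlowState` type, the interface names `wan1`/`wan2`, and the addresses are hypothetical, and the policy strings mirror pf's `if-bound` and `floating` state-policy values. The assumption is that an interface-bound state only matches packets on the interface where it was created, while a floating state matches the same flow on any interface.

```python
from typing import NamedTuple, Tuple

# (src, dst, proto, sport, dport)
Flow = Tuple[str, str, str, int, int]


class FlowState(NamedTuple):
    flow: Flow
    iface: str  # interface on which the state was created


def state_matches(state: FlowState, flow: Flow, iface: str, policy: str) -> bool:
    """Would a packet of `flow` seen on `iface` match this state?"""
    if state.flow != flow:
        return False
    if policy == "if-bound":
        # Interface-bound: the packet must use the original interface.
        return state.iface == iface
    if policy == "floating":
        # Floating: any interface is acceptable for this flow.
        return True
    raise ValueError(f"unknown state policy: {policy}")


# A connection is first seen on wan1; after a BGP path change,
# later packets of the same flow show up on wan2 instead.
flow = ("192.0.2.10", "198.51.100.20", "tcp", 50000, 443)
state = FlowState(flow=flow, iface="wan1")

print(state_matches(state, flow, "wan2", "if-bound"))  # False
print(state_matches(state, flow, "wan2", "floating"))  # True
```

Under the interface-bound policy the packet on `wan2` finds no matching state, which in this model corresponds to the blackholing described in the post.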

                          • M
                            mcury @michmoor
                            last edited by Feb 28, 2025, 6:37 PM

                            If the IPsec Filter Mode is changed from the default to 'Filter IPsec VTI and Transport on assigned interfaces' then reply-to gets applied by default. I think that is what ultimately breaks convergence. If using that mode then the fix is to do what we suggested.

                            I tested both modes, and both work with reply-to disabled.

                            Also the state policy mode must be set to floating.

                            Indeed, this is how it is currently set here, but there is an option that does that automatically for IPsec rules if I'm not mistaken, check image posted in my answer below.

                            There are just to many gotchas here and the only way this was discovered was through experimentation.

                            I ran a bunch of tests here, even with different state options: keep, none (creating outbound floating rules), and sloppy.

                            In my humble opinion, a change needs to go in that if the IPsec Filter Mode is changed from the default then on the backend all rules created under the VTI Firewall tables have reply-to negated along with state policy mode changed to floating. Just a simple drop-down and apply.

                            I don't know what the best approach would be, perhaps give the option you mentioned and update the IPsec VTI documents highlighting this.

                            FRR has been broken for over a year, potentially longer. Im glad you solved it but this is a tad much to take into account if you are an network administrator who simply wants failover.

                            I agree.

                            Another added wrinkle is that if you do not use IPsec with BGP/OSPF and you simply have pfsense with multiple ISP providers doing BGP as i discovered, you must change the state policy mode to floating otherwise traffic gets blackholed.

                            I think that is already the default for IPsec VTI tunnels.

                            [image not available: login required to view]

                            Again -- too many gotchas. Default configuration as used today will blackhole traffic if using FRR.

                            The first thing I would do is update the documentation with this workaround, especially in the OSPF/BGP sections of the FRR documentation.

                            • M marcosm referenced this topic on Feb 28, 2025, 6:59 PM
                            • M
                              michmoor LAYER 8 Rebel Alliance @mcury
                              last edited by Feb 28, 2025, 8:19 PM

                              @mcury

                              OK, so I've made some changes to my default configuration since you identified the issue.

                              This pertains to me only, but when I set up a new firewall AND the firewall is at the edge of the network, I do the following:

                              1. If there is a single ISP, disable the gateway monitoring action. It is enabled by default, but the problem is that if there is an ISP hiccup, all packages and services get restarted. I learned this the hard way: if I get packet loss even for a few seconds, all the VPN tunnels get restarted along with BGP. Why? That's the gateway monitoring action.

                              2. If pfSense is at the edge of the network and is doing OSPF/BGP:
                                a. Firewall State Policy gets changed to Floating States.
                                b. "Disable reply-to" is checked globally.

                              I wouldn't mind the defaults as they are, but the problem is that there is little to no documentation on how the defaults behave.
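
The checklist above can be written down as a small Python function, purely as an illustration. The dictionary keys (`isp_count`, `gateway_monitoring_action`, and so on) are made up for this sketch and do not correspond to real pfSense configuration tags; the example values are hypothetical and are not a claim about what any pfSense version ships as defaults.

```python
from typing import Dict, List


def recommended_changes(fw: Dict[str, object]) -> List[str]:
    """Return the checklist items that this (hypothetical) firewall has not applied."""
    todo: List[str] = []

    # Point 1: with a single ISP, the gateway monitoring action only causes restarts.
    if fw["isp_count"] == 1 and fw["gateway_monitoring_action"]:
        todo.append("disable gateway monitoring action")

    # Point 2: edge firewall running OSPF/BGP.
    if fw["at_edge"] and fw["dynamic_routing"]:
        if fw["state_policy"] != "floating":
            todo.append("set firewall state policy to floating")
        if not fw["reply_to_disabled_globally"]:
            todo.append("disable reply-to globally")

    return todo


# Hypothetical edge firewall with one ISP, running BGP, nothing changed yet.
example = {
    "isp_count": 1,
    "gateway_monitoring_action": True,
    "at_edge": True,
    "dynamic_routing": True,
    "state_policy": "if-bound",
    "reply_to_disabled_globally": False,
}

for item in recommended_changes(example):
    print("-", item)
```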

                              • M
                                mcury @michmoor
                                last edited by Feb 28, 2025, 9:02 PM

                                @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

                                @mcury

                                Ok so ive made some changes to my default configuration since you identified the issue.

                                This pertains to me only but when i set up a new firewall AND the firewall is at the edge of the network i do the following

                                1. If a single ISP, disable gateway monitoring action. The default is that its enabled but the problem is if there is a ISP hiccup, all packages and services gets restarted. I learned this the hard way. If i get packet loss even for a few seconds, all the VPN tunnels get restarted along with BGP. Why? Thats the gateway monitoring action.

                                2. If pfsense is at the edge of the network and is doing OSPF/BGP
                                  a. Firewall State Policy gets changed to Floating States
                                  b. Disable reply-to is checked off globally.

                                I wouldn't mind the defaults as they are but the problem is there is little to no documentation on how the defaults behave.

                                Point number 1 is something that I'll start doing; I hadn't thought about that before 👍
                                Point number 2: I'm not sure it should be disabled globally. I'm still trying to figure out something that keeps the benefits of reply-to but still warns the user about that specific scenario.

                                I'm recording a video of the lab working with reply-to disabled.
                                I'll post it somewhere soon.

                                • M
                                  michmoor LAYER 8 Rebel Alliance @mcury
                                  last edited by Feb 28, 2025, 9:31 PM

                                  @mcury said in BGP convergence with BFD working smoothly with the settings below.:

                                  Point number 2, I'm not sure if it should be disabled globally, I'm still trying to figure something that would get the benefits of reply-to but still give the user some warning about that specific scenario.

                                  That's why I mentioned the case where pfSense is at the edge of the network: internet-facing and doing OSPF/BGP. In that case you are more than likely in a multi-WAN scenario, so in my opinion disabling reply-to is OK.

                                  • M
                                    mcury @michmoor
                                    last edited by Feb 28, 2025, 10:01 PM

                                    @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

                                    Thats why i mentioned if pfsense is at the Edge of the network - internet facing and doing OSPF/BGP. In that case you are more than likely in a multi-wan scenrio so in my opinion disabling reply-to is OK.

                                    👍
                                    Agreed.
                                    Sent you a PM, hope you don't mind..

                                    • A
                                      andrew_cb
                                      last edited by Mar 18, 2025, 1:10 AM

                                      Nice work @michmoor @mcury @marcosm !

                                      I'm not using BGP or OSPF but your troubleshooting has been an interesting read.

                                      • M
                                        michmoor LAYER 8 Rebel Alliance @andrew_cb
                                        last edited by Mar 18, 2025, 1:42 PM

                                        Redmine has been updated to reflect the testing done by @mcury, so there is now official guidance on handling this setup with dynamic routing.

                                        I would take it a step further and say that if your firewall is running dynamic routing protocols (BGP/OSPF), you should disable reply-to system-wide and make sure you are using the floating state policy. I wouldn't do it on a per-rule basis, as that is not scalable and is error-prone: it is easy to miss a rule that needs those advanced options set.
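
The scalability argument can be sketched in a few lines of Python. Everything here is hypothetical: the rule names, the `disable_reply_to` key, and the assumption that a system-wide setting overrides every individual rule. The point is only to show how a single forgotten rule slips through the per-rule approach.

```python
from typing import Dict, List


def rules_still_using_reply_to(rules: List[Dict[str, object]],
                               global_disable: bool) -> List[str]:
    """Names of rules that would still have reply-to applied."""
    if global_disable:
        # Assumption: the system-wide option covers every rule.
        return []
    return [str(r["name"]) for r in rules if not r["disable_reply_to"]]


# Hypothetical rule set on a VTI tab; one rule was added later and missed.
rules = [
    {"name": "VTI: LAN ICMP", "disable_reply_to": True},
    {"name": "VTI: LAN HTTPS", "disable_reply_to": True},
    {"name": "VTI: LAN SSH", "disable_reply_to": False},  # the forgotten rule
]

print(rules_still_using_reply_to(rules, global_disable=False))  # ['VTI: LAN SSH']
print(rules_still_using_reply_to(rules, global_disable=True))   # []
```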

                                        • M
                                          mcury @michmoor
                                          last edited by Mar 18, 2025, 3:43 PM

                                          @michmoor said in BGP convergence with BFD working smoothly with the settings below.:

                                          Redmine has been updated to reflect the testing done by @mcury so there is official guidance regarding treating this set up with dynamic routing.

                                          Glad to see that..
                                          They even tested with HA.. Thanks @marcosm for testing. 👍 👍

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.