BGP convergence with BFD working smoothly with the settings below.
-
@michmoor said in BGP convergence with BFD working smoothly with the settings below.:
Related to this??
https://redmine.pfsense.org/issues/14630
Yep, exactly that.
Try disabling reply-to on the inbound rules.
That is all that is required; there is no need to change the state settings, leave them at the default.
That seems to fix all the problems. I'm currently testing this in a lab with BGP and also for a customer that has OSPF, and so far there have been no problems at all.
Just disable reply-to in a lab environment, test it and then update here.
I really want to know if I'm the only one seeing this working smoothly.
LAB is running 2.7.2 with all patches applied.
My customer is running 24.03 on an SG-3100 (IPsec Filter Mode: Filter IPsec VTI and Transport on assigned interfaces, block all tunnel mode traffic) connecting to 2.7.2 with the default IPsec Filter Mode.
-
Bringing in @marcosm as he was glancing at this from another forum post not that long ago.
This is an exciting find and something that has been "broken" for some time now, but I am curious about two things:
- Why does disabling reply-to fix it? [edit] The VTI is a logical interface, so in theory traffic should return to the source. The underlying path may change, of course, but the VTI should be constant.
- What is the long-term fix for this? Just disable reply-to on every rule created under an IPsec VTI tab that may be doing dynamic routing? What if I have an IPsec VTI that isn't using routing?
Seriously though, this is really huge because FRR has not worked as intended for quite some time.
-
@michmoor said in BGP convergence with BFD working smoothly with the settings below.:
This is an exciting find
I have been struggling with this for a long, long time.
When it finally worked, I started a new lab just to confirm which setting did the trick.
And that is it: reply-to.
-
So the way I see it, either disable reply-to on the entire system OR disable reply-to on individual firewall rules under the VTI interface.
What are the security implications, if any, I wonder.
-
@michmoor said in BGP convergence with BFD working smoothly with the settings below.:
So the way I see it, either disable reply-to on the entire system OR disable reply-to on individual firewall rules under the VTI interface.
I disabled it only on some firewall rules.
These are for my lab environment:
ICMP for dpinger is allowed with reply-to enabled.
TCP 179 (BGP) is allowed with reply-to enabled.
UDP 3784 (BFD) is allowed with reply-to enabled.
ICMP for the local LAN (reply-to disabled).
TCP 443 for the local LAN (reply-to disabled).
I don't think there are security implications involved.
As far as I know, reply-to is only a pf mechanism to keep reply packets from taking a different path than the one they originally arrived on.
But let's see what users who really understand what happens under the hood, and how pf and reply-to work, have to say about this.
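To make the mechanism described above a little more concrete, here is a toy Python model of how a reply's egress interface could be chosen with and without reply-to. This is only a sketch and not pf itself: the `State` class, the helper functions, the interface names `vti0`/`vti1`, and the prefix are all hypothetical, and it does not claim to explain the exact failure in the linked Redmine issue.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class State:
    """Toy connection state created when an inbound rule passes a packet."""
    in_iface: str             # interface the first packet arrived on
    reply_to: Optional[str]   # interface pinned for replies, or None


def create_state(in_iface: str, reply_to_enabled: bool) -> State:
    """With reply-to enabled, replies are pinned to the arrival interface."""
    return State(in_iface, in_iface if reply_to_enabled else None)


def reply_egress(state: State, routes: Dict[str, str], dest: str) -> str:
    """Pick the interface a reply leaves on (hypothetical simplification)."""
    if state.reply_to is not None:
        return state.reply_to   # pinned: the routing table is ignored
    return routes[dest]         # not pinned: follow the current route


# Hypothetical scenario: a request arrived on vti0, but the routing
# protocol now prefers vti1 for the return path.
routes = {"10.0.0.0/24": "vti1"}
print(reply_egress(create_state("vti0", True), routes, "10.0.0.0/24"))   # vti0
print(reply_egress(create_state("vti0", False), routes, "10.0.0.0/24"))  # vti1
```

In this simplified picture, reply-to keeps the reply on the arrival interface even after the routing table has moved on, which is the kind of mismatch that can matter when a dynamic routing protocol changes the preferred path.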
-
I think a simple fix for this would actually be updated documentation, believe it or not.
System:Advanced:Firewall & NAT has the following
So in my mind, make a note here AND in the documentation suggesting that it be disabled when using dynamic routing.
@mcury I think you are right that there isn't any harm in disabling reply-to. It's there as a benefit, as most customers would likely need it, but in advanced scenarios such as BGP routing, where traffic may be asymmetric, it MUST be disabled. There is no alternative.
Great find, my friend. Between this and helping me with Graylog, I think I owe you some beers, dude.
-
@michmoor said in BGP convergence with BFD working smoothly with the settings below.:
Great find, my friend. Between this and helping me with Graylog, I think I owe you some beers, dude.
Oh, it's Friday :)
-
@mcury
Speaking to Marcos on the side, there is a bit of nuance here. If the IPsec Filter Mode is changed from the default to 'Filter IPsec VTI and Transport on assigned interfaces', then reply-to gets applied by default. I think that is what ultimately breaks convergence. If using that mode, then the fix is to do what we suggested. Also, the state policy mode must be set to floating.
There are just too many gotchas here, and the only way this was discovered was through experimentation.
In my humble opinion, a change needs to go in so that if the IPsec Filter Mode is changed from the default, then on the backend all rules created under the VTI firewall tabs have reply-to negated, along with the state policy mode changed to floating. Just a simple drop-down and apply.
FRR has been broken for over a year, potentially longer. I'm glad you solved it, but this is a tad much to take into account if you are a network administrator who simply wants failover.
Another added wrinkle is that if you do not use IPsec with BGP/OSPF and you simply have pfSense with multiple ISPs doing BGP, as I discovered, you must change the state policy mode to floating, otherwise traffic gets blackholed.
Again -- too many gotchas. Default configuration as used today will blackhole traffic if using FRR.
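As a rough illustration of why the state policy matters here, the following Python sketch models the difference between interface-bound and floating states. It is a deliberately simplified assumption of how state matching works, not pf's actual implementation; the `StateTable` class, the `wan1`/`wan2` names, and the example addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

Flow = Tuple[str, str, str, int]  # (src, dst, proto, dst_port)


@dataclass
class StateTable:
    """Toy state table; policy is either "if-bound" or "floating"."""
    policy: str
    _states: Set[Tuple[Flow, Optional[str]]] = field(default_factory=set)

    def __post_init__(self) -> None:
        if self.policy not in ("if-bound", "floating"):
            raise ValueError(f"unknown state policy: {self.policy}")

    def _key(self, flow: Flow, iface: str) -> Tuple[Flow, Optional[str]]:
        # Interface-bound states remember the interface; floating ones do not.
        return (flow, iface if self.policy == "if-bound" else None)

    def add(self, flow: Flow, iface: str) -> None:
        self._states.add(self._key(flow, iface))

    def matches(self, flow: Flow, iface: str) -> bool:
        return self._key(flow, iface) in self._states


flow = ("192.0.2.10", "198.51.100.20", "tcp", 443)

bound = StateTable("if-bound")
bound.add(flow, "wan1")
print(bound.matches(flow, "wan2"))     # False: state is tied to wan1

floating = StateTable("floating")
floating.add(flow, "wan1")
print(floating.matches(flow, "wan2"))  # True: state matches on any interface
```

Under this assumption, traffic for an existing connection that shows up on a different interface after a routing change finds no matching state with the interface-bound policy, which is consistent with the blackholing described above.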
-
If the IPsec Filter Mode is changed from the default to 'Filter IPsec VTI and Transport on assigned interfaces', then reply-to gets applied by default. I think that is what ultimately breaks convergence. If using that mode, then the fix is to do what we suggested.
Tested with both modes; both work with reply-to disabled.
Also, the state policy mode must be set to floating.
Indeed, this is how it is currently set here, but there is an option that does that automatically for IPsec rules, if I'm not mistaken. Check the image posted in my answer below.
There are just too many gotchas here, and the only way this was discovered was through experimentation.
I ran a bunch of tests here, even using different state options: keep, none (creating outbound floating rules), sloppy.
In my humble opinion, a change needs to go in so that if the IPsec Filter Mode is changed from the default, then on the backend all rules created under the VTI firewall tabs have reply-to negated, along with the state policy mode changed to floating. Just a simple drop-down and apply.
I don't know what the best approach would be; perhaps provide the option you mentioned and update the IPsec VTI documents to highlight this.
FRR has been broken for over a year, potentially longer. I'm glad you solved it, but this is a tad much to take into account if you are a network administrator who simply wants failover.
I agree.
Another added wrinkle is that if you do not use IPsec with BGP/OSPF and you simply have pfSense with multiple ISPs doing BGP, as I discovered, you must change the state policy mode to floating, otherwise traffic gets blackholed.
I think that is already the default for IPsec VTI tunnels.
Again -- too many gotchas. Default configuration as used today will blackhole traffic if using FRR.
First thing I would do is update the documentation with this workaround, especially in the OSPF/BGP section of the FRR documentation.
-
OK, so I've made some changes to my default configuration since you identified the issue.
This pertains to me only, but when I set up a new firewall AND the firewall is at the edge of the network, I do the following:
1. If there is a single ISP, disable the gateway monitoring action. It is enabled by default, but the problem is that if there is an ISP hiccup, all packages and services get restarted. I learned this the hard way. If I get packet loss even for a few seconds, all the VPN tunnels get restarted along with BGP. Why? That's the gateway monitoring action.
2. If pfSense is at the edge of the network and is doing OSPF/BGP:
a. Firewall State Policy gets changed to Floating States.
b. "Disable reply-to" is checked globally.
I wouldn't mind the defaults as they are but the problem is there is little to no documentation on how the defaults behave.
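The personal checklist above can be summarised as a small decision function. This is only a sketch of one poster's preferences as stated in the post: the function name and the dictionary keys are hypothetical labels chosen for illustration, not real pfSense configuration keys.

```python
from typing import Dict, Union


def recommended_settings(single_isp: bool,
                         edge_with_dynamic_routing: bool) -> Dict[str, Union[str, bool]]:
    """Return the changes from defaults suggested in the post (hypothetical keys)."""
    settings: Dict[str, Union[str, bool]] = {}
    if single_isp:
        # Point 1: avoid restarting packages and services on a brief ISP hiccup.
        settings["gateway_monitoring_action"] = "disabled"
    if edge_with_dynamic_routing:
        # Point 2a: switch the firewall state policy to floating states.
        settings["state_policy"] = "floating"
        # Point 2b: check the global "Disable reply-to" option.
        settings["disable_reply_to_globally"] = True
    return settings


# Example: a single-ISP edge firewall running OSPF/BGP.
print(recommended_settings(single_isp=True, edge_with_dynamic_routing=True))
```

A firewall that matches neither condition gets an empty dictionary, meaning the defaults are left alone.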
-
@michmoor said in BGP convergence with BFD working smoothly with the settings below.:
OK, so I've made some changes to my default configuration since you identified the issue.
This pertains to me only, but when I set up a new firewall AND the firewall is at the edge of the network, I do the following:
1. If there is a single ISP, disable the gateway monitoring action. It is enabled by default, but the problem is that if there is an ISP hiccup, all packages and services get restarted. I learned this the hard way. If I get packet loss even for a few seconds, all the VPN tunnels get restarted along with BGP. Why? That's the gateway monitoring action.
2. If pfSense is at the edge of the network and is doing OSPF/BGP:
a. Firewall State Policy gets changed to Floating States.
b. "Disable reply-to" is checked globally.
I wouldn't mind the defaults as they are but the problem is there is little to no documentation on how the defaults behave.
Point number 1 is something that I'll start doing; I didn't think about that before.
Point number 2: I'm not sure it should be disabled globally. I'm still trying to figure out something that would keep the benefits of reply-to but still give the user some warning about that specific scenario.
I'm recording a video of the lab working with reply-to disabled.
Soon I'll post it somewhere.
-
@mcury said in BGP convergence with BFD working smoothly with the settings below.:
Point number 2: I'm not sure it should be disabled globally. I'm still trying to figure out something that would keep the benefits of reply-to but still give the user some warning about that specific scenario.
That's why I mentioned the case where pfSense is at the edge of the network: internet-facing and doing OSPF/BGP. In that case you are more than likely in a multi-WAN scenario, so in my opinion disabling reply-to is OK.
-
@michmoor said in BGP convergence with BFD working smoothly with the settings below.:
That's why I mentioned the case where pfSense is at the edge of the network: internet-facing and doing OSPF/BGP. In that case you are more than likely in a multi-WAN scenario, so in my opinion disabling reply-to is OK.
Agreed.
Sent you a PM, hope you don't mind..