IPsec site to site dropping every 49-55 minutes
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
@andrew_cb Just looking at this again as I know there is a time difference, but shouldn't this be the configuration if we do not want the IPsec traffic to go out the WAN?
Action => Block
Interface => IPsec
Direction => out
Address Family => IPv4
Protocol => Any
Source => LAN
Destination => WAN RTR1 / WAN RTR2 / VIP WAN
Maybe I'm thinking about this wrong, but this would keep IPsec traffic from the LAN going out over the WAN, or am I wrong?
PS, the system is still dropping for about 40-50 seconds every 55-ish minutes. I'm going to get the logs again shortly.
If the remote subnets are being removed from the firewall's routing table then they are sent out the default route (i.e. 0.0.0.0 / WAN).
When traffic is sent out the WAN, it gets NAT'ed so that the source address is now the WAN IP, and then it is sent out the WAN interface.
A state is created on the WAN interface with the source address being the WAN IP (due to NAT) and the destination will be the remote VPN subnet.
This state will persist and cause the firewall to keep sending traffic destined for the remote VPN subnet out the WAN instead.
Eventually a program will time out and stop transmitting. When the program tries again, it usually uses a different source port, which causes a new session to be created on the firewall. If the VPN's P2s are established, then the new traffic is correctly sent over the VPN tunnel.
The floating block rule is set on the WAN interface with source of WAN Address because NAT is performed before the firewall rules are processed.
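As a rough illustration of why the rule matches on the WAN address, here is a minimal Python sketch of the match logic. It is not how pf evaluates rules internally, and the addresses and the `leaks_to_wan` helper are made up for illustration (documentation ranges, not anything from this thread).

```python
import ipaddress

# Hypothetical values for illustration only; substitute your own.
WAN_ADDRESS = ipaddress.ip_address("203.0.113.10")
REMOTE_VPN_SUBNETS = [ipaddress.ip_network("10.20.0.0/16")]


def leaks_to_wan(src: str, dst: str) -> bool:
    """Return True if a packet leaving the WAN interface should be blocked.

    By the time the outbound rule is checked, NAT has already rewritten the
    source to the WAN address, so the rule matches on that address rather
    than on the original LAN source.
    """
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    to_remote_vpn = any(destination in net for net in REMOTE_VPN_SUBNETS)
    return source == WAN_ADDRESS and to_remote_vpn
```

A packet from the WAN address to an address inside a remote VPN subnet is flagged. Ordinary internet traffic from the WAN address is not.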
-
@andrew_cb Okay, I think i understand now.
This means that the rule should look as follows:
Action => Block
Interface => WAN/WAN VIP
Direction => out
Address Family => IPv4
Protocol => Any
Source => WAN/WAN VIP
Destination => For Site A = Site B Subnets && for Site B = Site A Subnets
Let me know if I've got it right this time!
Thanks again,
TSoF
-
@andrew_cb Well, I have the rule set up as shown above, and it still timed out. Maybe I just need to set P2 to time out at something like 24 hours, then apply it and reboot at around 6:00 AM, so that it would take months before a timeout lands in working hours.
It just seems odd that I have "Make before break" enabled, yet it isn't doing that.
Thinking about it, I have the source set to WAN SUBNETS, as this uses CARP and a VIP IP address. Should I change it to just the single IP address that is the VIP?
-
So I have set my firewall rule to block on the WAN VIP address to see if it will stop that 45-50 second downtime.
I also performed a test and setup my tunnel times as follows:
Phase 1:
- Lifetime 604,800 (7 days/168 hours)
- ReKey 604,200 (10 minutes less than lifetime)
- ReAuth 160 (2 minutes) (This might need to be 604,640)
- Rand Time 60 (1 minute)
Phase 2:
- Lifetime 86,400 (1 day/24 hours)
- ReKey 85,800 (10 minutes less than lifetime)
- Rand Time 60 (1 minute)
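A quick way to sanity-check timers like these is to work out the window in which a rekey can start. This is a minimal sketch, assuming strongSwan-style semantics where a random value of up to Rand Time is subtracted from the Rekey Time. The function names are made up for illustration, and the Phase 2 values are taken as lifetime, rekey time, and rand time in that order.

```python
from typing import Tuple


def rekey_window(rekey_time: int, rand_time: int) -> Tuple[int, int]:
    """Earliest and latest second at which a rekey may start (assumed semantics)."""
    return (rekey_time - rand_time, rekey_time)


def margin(life_time: int, rekey_time: int) -> int:
    """Seconds left between the latest rekey start and hard expiry of the SA."""
    return life_time - rekey_time


# Values from the test configuration above, in seconds.
p1 = rekey_window(604_200, 60)
p2 = rekey_window(85_800, 60)

print(p1, margin(604_800, 604_200))  # (604140, 604200) 600
print(p2, margin(86_400, 85_800))    # (85740, 85800) 600
```

Under that assumption both phases would start rekeying within a one-minute window and leave a 10-minute margin before the SA expires.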
The results were disastrous. The tunnels were resetting every 45-50 seconds, and multiple connections were visible in the IPsec status.
I reset everything back to the original mentioned above, but we're still timing out at the 55-ish minute mark, which is not good during business hours as phone calls and app connections are failing.
I do have a question. If I am using make before break, do I still need "Phase One Child SA Close Action" to be restart? Shouldn't it be close?
Any thoughts? I'm about ready to move on and try wireguard.
-
I did a little checking and the tunnel is going down every 53 minutes 30 seconds (give or take a second).
Then it is down for approximately 38 - 40 seconds.
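For anyone who wants to measure this the same way, here is a minimal Python sketch that turns reachability samples into outage durations and the gaps between outages. It assumes you already have `(timestamp, reachable)` samples from something like your own ping loop. The function names are hypothetical and nothing here reads pfSense logs.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Outage = Tuple[datetime, timedelta]


def find_outages(samples: Iterable[Tuple[datetime, bool]]) -> List[Outage]:
    """Collapse (timestamp, reachable) samples into (start, duration) outages.

    An outage that is still open when the samples end is ignored.
    """
    outages: List[Outage] = []
    down_since = None
    for when, reachable in samples:
        if not reachable and down_since is None:
            down_since = when  # first failed sample of a new outage
        elif reachable and down_since is not None:
            outages.append((down_since, when - down_since))
            down_since = None
    return outages


def gaps_between(outages: List[Outage]) -> List[timedelta]:
    """Time from the start of each outage to the start of the next one."""
    starts = [start for start, _ in outages]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]
```

If the gaps come out nearly identical each time, that points toward a timer-driven event such as a rekey rather than a random link problem.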
This is super strange that I have make before break enabled, but it is still doing this.
Thanks in advance!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
We are running CARP and both sites are running identical hardware (Dell R620) with pfSense 24.11-RELEASE.
A couple of things.
- Are you using the WAN CARP address to establish your IPsec tunnels?
- For your IPsec tunnels, do you have gateway monitoring enabled?
- Have you enabled DPD?
-
@michmoor Thanks for the reply!
1.) I am using the shared Virtual IP address so that way if RTR1 goes down RTR2 can provide the services.
2.) I need to find that feature. inside Phase One there is a "Gateway Duplicate" that is disabled. Do you know where the "gateway monitoring enabled" is by any chance?
3.) DPD is enabled in Phase One on both side of the tunnel.
Thanks again!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
2.) I need to find that feature. inside Phase One there is a "Gateway Duplicate" that is disabled. Do you know where the "gateway monitoring enabled" is by any chance?
System > Routing > Gateways
-
@michmoor I found the Gateway monitoring setting in: System > Routing > Gateways > Edit
Gateway Monitoring - Disable Gateway Monitoring
This will consider this gateway as always being up.
Is this what you are talking about? Reading it, without it being checked gateway monitoring is enabled.
I went ahead and disabled the "Gateway Monitoring" and "Gateway Action."
Thanks again,
TSoF
-
@TheStormsOfFury
I care more about Gateway Action. If that's set, let's unselect it for now. I'm assuming you are NOT in a multi-WAN configuration.
What that does is that if there is an issue with your gateway monitor IP (packet loss, jitter), this will bring down the IPsec connection in your case.
-
@michmoor said in IPsec site to site dropping every 49-55 minutes:
@TheStormsOfFury
I care more about Gateway Action. If that's set, let's unselect it for now. I'm assuming you are NOT in a multi-WAN configuration.
What that does is that if there is an issue with your gateway monitor IP (packet loss, jitter), this will bring down the IPsec connection in your case.
Correct, we are in a single WAN configuration. They were both "enabled" (un-checked), and I went ahead and checked them on both sites.
TSoF
-
@TheStormsOfFury great. Let’s monitor IPsec stability.
How soon will you know if it dropped?
Edit: to be clear, disable gateway action is checked? It should be.
-
@michmoor In about 5 minutes. That will be the 53 minute mark. I did make the changes on both sites as well.
Thanks!
TSoF
-
@michmoor no dice. Still timed out at 53 minutes and 29 seconds for 39 seconds.
Thanks for the suggestions! I'm open to more if you have any.
My logs roll over so quickly that I'll have to wait, as the entries have already been pushed out.
Thanks again!
TSoF
-
Have you read the suggestions here?
https://docs.netgate.com/pfsense/en/latest/troubleshooting/ipsec-connections.html#dpd-is-unsupported-and-one-side-drops-while-the-other-remains
-
@michmoor said in IPsec site to site dropping every 49-55 minutes:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/ipsec-connections.html#dpd-is-unsupported-and-one-side-drops-while-the-other-remains
DPD is enabled on both sites; however, I did not have the periodic keepalive or ping set. I went ahead and enabled that, and also changed the "child actions" from restart / reconnect to close and clear SA. This time when I reset the tunnels, it created only one connection instead of multiple.
Now we wait 53 minutes and see what happens.
Thanks again!
TSoF
-
@TheStormsOfFury
If this doesn't fix it, I strongly feel there is some mismatch between the two sides. If you don't mind, please share your P1/P2 settings from each side. Pictures preferred.
-
@TheStormsOfFury Yes, that rule looks correct now.
If you enable logging and call it something like "Block VPN subnets leaks to WAN" you can check under Status > System Logs > Firewall and see all the times that the rule is triggered.
-
@TheStormsOfFury It might help to increase all the IPsec logging by one so that you can gather more data about what is happening during the re-keying. Also increase the size of the IPsec log so that more information is visible before being overwritten.
Also, does RTR2 show anything in its logs? I wonder if, due to the way that IPsec is part of the kernel, RTR2 may be responding to some of the traffic during the re-key?
-
So, inside VPN > IPsec > Advanced Settings
There is a list of 16 IPsec logging controls. Which would you recommend increasing so we can get the best results??
https://imgur.com/2g4WGXh