IPsec site to site dropping every 49-55 minutes
-
@michmoor I found the Gateway monitoring setting in: System > Routing > Gateways > Edit
Gateway Monitoring - Disable Gateway Monitoring
This will consider this gateway as always being up. Is this what you are talking about? As I read it, gateway monitoring is enabled when the box is unchecked.
I went ahead and disabled the "Gateway Monitoring" and "Gateway Action."
Thanks again,
TSoF
-
@TheStormsOfFury
I care more about Gateway Action. If that's set, let's unselect it for now. I'm assuming you are NOT in a multi-WAN configuration. What that does is: if there is an issue with your gateway monitor IP (packet loss, jitter), it will bring down the IPsec connection in your case.
-
@michmoor said in IPsec site to site dropping every 49-55 minutes:
@TheStormsOfFury
I care more about Gateway Action. If that's set, let's unselect it for now. I'm assuming you are NOT in a multi-WAN configuration. What that does is: if there is an issue with your gateway monitor IP (packet loss, jitter), it will bring down the IPsec connection in your case.
Correct, we are in a single-WAN configuration. Both settings were "enabled" (unchecked), and I went ahead and checked them on both sites.
TSoF
-
@TheStormsOfFury great. Let’s monitor IPsec stability.
How soon will you know if it dropped?
Edit: to be clear, disable gateway action is checked? It should be.
-
@michmoor In about 5 minutes. That will be the 53 minute mark. I did make the changes on both sites as well.
Thanks!
TSoF
-
@michmoor no dice. Still timed out at 53 minutes and 29 seconds for 39 seconds.
Thanks for the suggestions! I'm open to more if you have any.
My logs roll over so quickly that the entries have already been pushed out, so I'll have to wait for the next drop.
Thanks again!
TSoF
-
Have you read the suggestions here?
https://docs.netgate.com/pfsense/en/latest/troubleshooting/ipsec-connections.html#dpd-is-unsupported-and-one-side-drops-while-the-other-remains
-
@michmoor said in IPsec site to site dropping every 49-55 minutes:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/ipsec-connections.html#dpd-is-unsupported-and-one-side-drops-while-the-other-remains
DPD is enabled on both sites; however, I did not have the periodic keepalive or ping set. I went ahead and enabled that, and also changed the "child actions" from restart / reconnect to close and clear SA. This time when I reset the tunnels, instead of creating multiple connections, it only created one.
Now we wait 53 minutes and see what happens.
Thanks again!
TSoF
-
@TheStormsOfFury
If this doesn't fix it, I strongly feel there is some mismatch between the two. Would you mind sharing your P1/P2 settings from each side? Pictures preferred.
-
@TheStormsOfFury Yes, that rule looks correct now.
If you enable logging and call it something like "Block VPN subnets leaks to WAN" you can check under Status > System Logs > Firewall and see all the times that the rule is triggered.
-
@TheStormsOfFury It might help to increase all the IPsec logging by one so that you can gather more data about what is happening during the re-keying. Also increase the size of the IPsec log so that more information is visible before being overwritten.
Also, does RTR2 show anything in its logs? Given the way that IPsec is part of the kernel, I wonder if RTR2 is responding to some of the traffic during the re-key.
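For reference, a rough sketch of why a drop every 49-55 minutes would line up with re-keying. The numbers are assumptions, not taken from this thread: a 3600-second lifetime, a rekey time at 90% of it, and a rand time at 10% of it. As far as I know, strongSwan subtracts a random value of up to the rand time from the rekey time, so check the actual values in your P1/P2 settings.

```python
# Sketch of the window in which a rekey can start.
# ASSUMPTIONS (not stated in this thread): lifetime of 3600 s,
# rekey time = 90% of lifetime, rand time = 10% of lifetime.

def rekey_window(life_time_s, rekey_pct=90, rand_pct=10):
    """Return (earliest, latest) rekey start, in seconds after the SA is established."""
    rekey_time = life_time_s * rekey_pct // 100
    rand_time = life_time_s * rand_pct // 100
    # A random value in [0, rand_time] is subtracted from rekey_time.
    return rekey_time - rand_time, rekey_time

earliest, latest = rekey_window(3600)
print(f"rekey expected between {earliest // 60} and {latest // 60} minutes")
```

If the lifetimes configured on your tunnel give a window that covers the 53-minute mark, that points at the rekey rather than at gateway monitoring.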
-
So, inside VPN > IPsec > Advanced Settings
There is a list of 16 IPsec logging controls. Which would you recommend increasing so we can get the best results??
https://imgur.com/2g4WGXh
-
@michmoor So it still closed, and I'm now going to paste copies of the P1/P2 configs per site.
While I was taking these images, I was confirming that they were all identical. Let me know if you see something I missed.
Site 1 Phase 1: https://imgur.com/eldBRXO
- Part 1: https://imgur.com/G4kEzHl
- Part 2: https://imgur.com/x44xLAj
- Part 3: https://imgur.com/N70XlmN
Site 1 Phase 2 ONE - Part 1: https://imgur.com/DYdObqD
- Part 2: https://imgur.com/iwBOlBM
Site 1 Phase 2 TWO - Part 1: https://imgur.com/m1R7THi
- Part 2: https://imgur.com/Q7PiEI0
Site 2 Phase 1: https://imgur.com/xgtr7Rh
- Part 1: https://imgur.com/jckl5jQ
- Part 2: https://imgur.com/x44xLAj
- Part 3: https://imgur.com/TBvYo0b
Site 2 Phase 2 ONE - Part 1: https://imgur.com/PkvVr1R
- Part 2: https://imgur.com/aBcSz2n
Site 2 Phase 2 TWO - Part 1: https://imgur.com/Of4Phes
- Part 2: https://imgur.com/aksgmlq
-
@TheStormsOfFury
Thanks for this, and thanks for being organized in how you presented the pictures.
Curious. For picture https://imgur.com/Of4Phes
I noticed that the Local Network is set to 'Network' 10.0.1.0, which is different from the other site. Is this network not directly connected to pfSense? Is it routed (another gateway/router behind pfSense)?
edit: I don't think that's the problem just curious. Trying to better understand the environment.
-
@michmoor You're welcome. And I just lay it out how I can see it in my head lol!
So that network is an OpenVPN connection for off-site individuals, and I took the instructions from the site on how to configure the back and forth.
That said, I also don't think it's the issue, as I have tried deleting it and the 53-minute timeout keeps happening.
Thanks!
TSoF
-
@andrew_cb RTR2 shows basically the same as 1.
I ended up setting all of the logging details to "diag", so in about 53 minutes I'll have better logs on the IPsec connection and I'll post them here!
Thank you!
TSoF
-
Okay, so I was finally able to be there when it happened and catch the logs:
Timeout started at 21:15:42 and ended at 21:16:21 (site a) and started at 21:15:41 and ended at 21:16:21 (site b)
The log was too long to post here; I captured two minutes in total for both sides.
https://pastebin.com/raw/FrBXWYaw
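As a quick sanity check on those timestamps, here is a small sketch that works out how long each side was down. The function name is just for illustration.

```python
from datetime import datetime

def outage_seconds(start, end, fmt="%H:%M:%S"):
    """Seconds between two same-day clock times."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

print("site a:", outage_seconds("21:15:42", "21:16:21"))  # 39
print("site b:", outage_seconds("21:15:41", "21:16:21"))  # 40
```

So both sides were down for roughly 40 seconds, which matches the 39 seconds seen on the earlier drop.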
Also, @andrew_cb I misunderstood what you were asking about RTR2. I was thinking site 2 / site B, and you were asking about RTR2 at site A/B.
I didn't capture the right time window, but I will on the next one! I looked, and the only things I am seeing are the following:
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds
Apr 14 21:39:34 charon 41423 02[JOB] events on fds: 25[w]
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 5 fds
Apr 14 21:39:34 charon 41423 02[JOB] watcher got notification, rebuilding
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds
Apr 14 21:39:34 charon 41423 02[JOB] events on fds: 25[r]
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 5 fds
Apr 14 21:39:34 charon 41423 02[JOB] watcher got notification, rebuilding
Apr 14 21:39:34 charon 41423 06[CFG] vici client 216 disconnected
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 5 fds
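With logging at "diag", most of the output is [JOB] watcher noise like the above. A minimal sketch for filtering it out of a saved log follows. It assumes every line has the same shape as this excerpt; the regex and the function name are illustrative, not anything pfSense provides.

```python
import re

# Matches lines shaped like the excerpt above, e.g.
# "Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) charon \d+ "
    r"(?P<thread>\d+)\[(?P<subsys>[A-Z]+)\] (?P<msg>.*)$"
)

def filter_charon(lines, drop=("JOB",)):
    """Yield (timestamp, subsystem, message) for lines whose subsystem is not in drop."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m["subsys"] not in drop:
            yield m["ts"], m["subsys"], m["msg"]

sample = [
    "Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds",
    "Apr 14 21:39:34 charon 41423 06[CFG] vici client 216 disconnected",
]
for entry in filter_charon(sample):
    print(entry)
```

On the two sample lines this keeps only the [CFG] entry. Passing a different `drop` tuple narrows the output further.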
-
I went through your logs and nothing is sticking out. Do you have other IPsec tunnels? If so are they having the same problems?
-
@michmoor This is the only tunnel we have lol.
I had considered WireGuard, but I understand that it is not dependable. Then again, at this point neither is IPsec, although I understand WireGuard is worse.
We switched from OpenVPN to IPsec because we were just not getting the speeds needed across the tunnel. From what I understand, that is because the version of OpenVPN on pfSense is single-threaded and cannot handle speeds above 100-200Mbps, and we have 1000Mbps symmetric uplinks at both locations.
What is your thought on the comparison between ipsec vs wireguard vs openvpn?
Thanks again!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
What is your thought on the comparison between ipsec vs wireguard vs openvpn?
I use Netgate appliances not white box so from a hardware support perspective our experiences will be different.
For example, I have options to use AES-NI, QAT or IPsec-MB for cryptographic acceleration, or DCO for OpenVPN. I don't have throughput limitations from hardware.
From experience, I have had no issues with WireGuard. The only caveat is that in a High Availability setup it's not as seamless as IPsec. You can read about it here
If i had to choose, i would go with Wireguard.