IPsec site to site dropping every 49-55 minutes
-
@TheStormsOfFury
If this doesn't fix it, I strongly feel there is some mismatch between the two sides. Would you mind sharing your P1/P2 settings from each side? Pictures preferred.
-
@TheStormsOfFury Yes, that rule looks correct now.
If you enable logging and call it something like "Block VPN subnets leaks to WAN" you can check under Status > System Logs > Firewall and see all the times that the rule is triggered.
-
@TheStormsOfFury It might help to increase all the IPsec logging levels by one so that you can gather more data about what is happening during the re-keying. Also increase the size of the IPsec log so that more information is visible before being overwritten.
Also, does RTR2 show anything in its logs? I wonder if, due to the way that IPsec is part of the kernel, RTR2 is maybe responding to some of the traffic during the re-key?
-
So, inside VPN > IPsec > Advanced Settings
There is a list of 16 IPsec logging controls. Which would you recommend increasing so we can get the best results??
https://imgur.com/2g4WGXh
-
@michmoor So it still closed, and I'm now going to paste copies of the P1/P2 configs per site.
While I was taking these images, I was confirming that they were all identical. Let me know if you see something I missed.
Site 1 Phase 1: https://imgur.com/eldBRXO
- Part 1: https://imgur.com/G4kEzHl
- Part 2: https://imgur.com/x44xLAj
- Part 3: https://imgur.com/N70XlmN
Site 1 Phase 2 ONE - Part 1: https://imgur.com/DYdObqD
- Part 2: https://imgur.com/iwBOlBM
Site 1 Phase 2 TWO - Part 1: https://imgur.com/m1R7THi
- Part 2: https://imgur.com/Q7PiEI0
Site 2 Phase 1: https://imgur.com/xgtr7Rh
- Part 1: https://imgur.com/jckl5jQ
- Part 2: https://imgur.com/x44xLAj
- Part 3: https://imgur.com/TBvYo0b
Site 2 Phase 2 ONE - Part 1: https://imgur.com/PkvVr1R
- Part 2: https://imgur.com/aBcSz2n
Site 2 Phase 2 TWO - Part 1: https://imgur.com/Of4Phes
- Part 2: https://imgur.com/aksgmlq
-
@TheStormsOfFury
Thanks for this, and thanks for being organized in how you presented the pictures. Curious: for picture https://imgur.com/Of4Phes
I noticed that the Local Network is set to 'Network' 10.0.1.0, which is different from the other site. Is this network not directly connected to pfSense? Is it routed (another gateway/router behind pfSense)?
edit: I don't think that's the problem just curious. Trying to better understand the environment.
-
@michmoor You're welcome. And I just lay it out how I can see it in my head lol!
So that network is an OpenVPN connection for off-site individuals, and I took the instructions from the site on how to configure the back and forth.
That said, I also don't think it's the issue, as I have tried deleting it and the 53 minute timeout keeps happening.
Thanks!
TSoF
-
@andrew_cb RTR2 shows basically the same as 1.
I ended up setting all of the logging details to "diag", so in about 53 minutes I'll have better logs on the IPsec connection and I'll post them here!
Thank you!
TSoF
-
Okay, so I was finally able to be there when it happened and catch the logs:
The timeout started at 21:15:42 and ended at 21:16:21 (site A), and started at 21:15:41 and ended at 21:16:21 (site B).
The log was too long to put in here, so I grabbed two minutes in total from both sides.
https://pastebin.com/raw/FrBXWYaw
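For reference, the outage length implied by the timestamps above can be worked out with a few lines of Python. This is only a quick sketch: the function name `outage_seconds` is made up for illustration, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime

def outage_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Timestamps as reported in this post.
site_a = outage_seconds("21:15:42", "21:16:21")
site_b = outage_seconds("21:15:41", "21:16:21")
print(f"site A: {site_a}s, site B: {site_b}s")
```

So each drop lasts roughly 40 seconds on both sides, starting within a second of each other.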
Also, @andrew_cb I misunderstood what you were asking about RTR2. I thought you meant site 2 / site B, but you were asking about RTR2 at site A/B.
I didn't catch the right time window, but I will next time! I looked, and the only things I am seeing are the following:
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds
Apr 14 21:39:34 charon 41423 02[JOB] events on fds: 25[w]
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 5 fds
Apr 14 21:39:34 charon 41423 02[JOB] watcher got notification, rebuilding
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds
Apr 14 21:39:34 charon 41423 02[JOB] events on fds: 25[r]
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 5 fds
Apr 14 21:39:34 charon 41423 02[JOB] watcher got notification, rebuilding
Apr 14 21:39:34 charon 41423 06[CFG] vici client 216 disconnected
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 5 fds
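With everything at "diag", most of the output is `[JOB]` watcher chatter like the lines above. A rough sketch of how the noise could be filtered out is below. The line layout in the regex is inferred only from this excerpt, and the helper name `interesting` and the choice to skip `JOB` are assumptions for illustration, not anything pfSense provides.

```python
import re

# A few sample lines copied from the excerpt above.
LOG = """\
Apr 14 21:39:34 charon 41423 02[JOB] watcher is observing 6 fds
Apr 14 21:39:34 charon 41423 02[JOB] events on fds: 25[w]
Apr 14 21:39:34 charon 41423 02[JOB] watcher got notification, rebuilding
Apr 14 21:39:34 charon 41423 06[CFG] vici client 216 disconnected
"""

# Assumed layout: "<Mon> <day> <time> charon <pid> <thread>[<SUBSYSTEM>] <message>"
LINE_RE = re.compile(r"^(\w{3} +\d+ [\d:]+) charon \d+ \d+\[(\w+)\] (.*)$")

def interesting(lines, skip=("JOB",)):
    """Return (timestamp, subsystem, message) for lines outside the skipped subsystems."""
    kept = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) not in skip:
            kept.append(match.groups())
    return kept

for timestamp, subsystem, message in interesting(LOG.splitlines()):
    print(timestamp, subsystem, message)
```

On the sample above, only the `[CFG] vici client 216 disconnected` line survives the filter, which makes it easier to spot anything rekey-related around the time of the drop.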
-
I went through your logs and nothing is sticking out. Do you have other IPsec tunnels? If so are they having the same problems?
-
@michmoor This is the only tunnel we have lol.
I had considered WireGuard, but I understand that it is not dependable. Then again, at this point neither is IPsec, although I understand it's worse.
We switched from OpenVPN to IPsec because we're just not getting the speeds we need across the tunnel. From what I understand, that is because the version of OpenVPN on pfSense is single-threaded and cannot handle speeds above 100-200Mbps, and we have 1000Mbps symmetric uplinks at both locations.
What are your thoughts on IPsec vs WireGuard vs OpenVPN?
Thanks again!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
What is your thought on the comparison between ipsec vs wireguard vs openvpn?
I use Netgate appliances not white box so from a hardware support perspective our experiences will be different.
For example, I have options to use AES-NI, QAT or IPsec-MB for cryptographic acceleration, or DCO for OpenVPN. I don't have throughput limitations from hardware. From experience, I have had no issues with WireGuard. The only caveat is that in a High Availability setup it's not as seamless as IPsec. You can read about it here
If I had to choose, I would go with WireGuard.
-
@michmoor I would love to give your reply a thumbs up, but apparently you have to have 5 something, and I have no clue how to get that.
Anyway, I'm going to look at WireGuard; however, I upped my P1 timeout, rekey, and expiry times to 7 days, then 10 under for rekey and 2 under for expiry, and I've gone ahead and upped the P2 to 1 day with rekey at 5 minutes under.
That was at 13:44 and we are now at 16:17 and we haven't had a drop yet.
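Since the pfSense lifetime fields are entered in seconds, here is a small sketch converting the values mentioned above. Only the figures that have explicit units in this post are converted; the "10 under" and "2 under" P1 offsets are left out because their units aren't stated. The variable names are made up for illustration.

```python
MINUTE = 60
HOUR = 60 * MINUTE
DAY = 24 * HOUR

# New values described in this post.
p1_life = 7 * DAY                   # P1 lifetime: 7 days
p2_life = 1 * DAY                   # P2 lifetime: 1 day
p2_rekey = p2_life - 5 * MINUTE     # P2 rekey: 5 minutes under the lifetime

# The drop interval from the thread title, for comparison with the old settings.
drop_window = (49 * MINUTE, 55 * MINUTE)

print("P1 life:", p1_life)
print("P2 life:", p2_life)
print("P2 rekey:", p2_rekey)
print("Drop window (s):", drop_window)
```

That gives 604800 for the P1 lifetime, 86400 and 86100 for the P2 lifetime and rekey, and a drop window of 2940 to 3300 seconds to compare against whatever the old lifetime and rekey values were.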