IPsec site to site dropping every 49-55 minutes
-
@Viragonmann
So I found this setting that says "Make before Break," and I am wondering if that is the issue I am running into as it is breaking, then renegotiating then reconnecting, which would cause that issue. So it is now enabled; however, my latency went way up for some reason.
My Phase One is:
Life Time: 28800
Rekey Time: 25920
Reauth Time: 0
Rand Time: 2880Phy Phase Two(s) are:
Life Time: 4000
ReKey Time: 3600
Rand Time: 400Now, in their instructions; it shows that these should match site-to-site. Is this the case in your experience?
-
On Site A:
Apr 8 18:31:59 charon 25819 15[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.0.1.0/24|/0 => match: 10.0.1.0/24|/0 Apr 8 18:31:59 charon 25819 15[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.2.0.0/16|/0 => no match Apr 8 18:31:59 charon 25819 15[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.0.1.0/24|/0 => no match Apr 8 18:31:59 charon 25819 15[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.2.0.0/16|/0 => match: 10.2.0.0/16|/0
On Site B:
Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> config: 10.1.0.0/16|/0, received: 10.1.0.0/16|/0 => match: 10.1.0.0/16|/0 Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> config: 10.1.0.0/16|/0, received: 10.1.0.0/16|/0 => match: 10.1.0.0/16|/0 Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> selecting traffic selectors for other: Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.0.1.0/24|/0 => match: 10.0.1.0/24|/0 Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.2.0.0/16|/0 => no match Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.0.1.0/24|/0 => no match Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.2.0.0/16|/0 => match: 10.2.0.0/16|/0 Apr 8 18:31:59 charon 93875 09[CFG] <con1|1> selecting traffic selectors for us:
It looks like this triggers a bunch of deleting of CHILD_SA's which seems to break the tunnel.
What are the actual P2 subnets on each side?
You might also be experiencing "leakage" during the rekeying, where during the brief interruption, the IPsec routes disappear from the routing table and so pfSense sends the traffic out the WAN interfaces. This causes states to become established which causes it to continue to try sending traffic out the WAN until the state expires.
You can prevent this by creating a floating rule, with source interface WAN, destination <VPN subnet>, direction Out, action Block.
This causes any attempts to send the VPN subnet traffic out the WAN to be blocked (dropped) and prevents an outbound state from being established on the WAN interface. -
@andrew_cb Thank you very much for the reply.
Site A Phase 2 remote subnets
P2-1:- Local Subnet: LAN Subnet (10.1.0.0/16
- Remote Subnet: 10.2.0.0/16 (Site B LAN Subnet)
P2-2: - Local Subnet: LAN Subnet (10.1.0.0/16)
- Remote Subnet: 10.0.1.0/24 (for a road warrior open VPN connection at site B)
Site B Phase 2 remote subnets
P2-1- Local Subnet: LAN Subnet (10.2.0.0/16)
- Remote Subnet 10.1.0.0/16 (Site A LAN Subnet)
P2-2 - Local Subnet: 10.0.1.0/24 (for a road warrior open VPN connection at site B)
- Remote Subnet: 10.1.0.0/16 (Site A LAN Subnet)
I thought I turned on "Make before break" last night, but I apparently was so tired I forgot to apply the changes and restart the VPN. I just did this now, show hopefully this might fix the issue; however, I am going to also implement your suggestion of the floating rule.
In fact I'm looking at this now, but I do have a quick question about the config:
Action => Block
Interface ... Should this be WAN or IPsec?
Direction => out
Address Family => IPv4
Protocol => Any
Source ... should be WAN Address? I'm running HA so I have VIP, should it be the VIP Address? or should I do one for WAN and a separate for VIP?
Destination ... I'm going to need 1 for both remote subnets on Site A and just the Site B subnetHow close am I on this configuration?
-
Well, I was hoping the "Make before break" setting would have resulted this issue, yet at 58 minutes it trimmed out for about 45 seconds again.
-
@andrew_cb Just looking at this again as I know there is a time difference, but shouldn't this be the configuration if we do not want the IPsec traffic to go out the WAN?
Action => Block
Interface => IPsec
Direction => out
Address Family => IPv4
Protocol => Any
Source => LAN
Destination => WAN RTR1 / WAN RTR2 / VIP WANMaybe i'm thinking about this wrong, but this would keep IPsec traffic from the lan going out over the wan, or am I wrong?
PS, the system is still dropping for about 40-50 seconds every 55-ish minutes. I'm going to get the logs again shortly.
-
Here are the updated logs for the timeout session that just happened:
I am wondering if i should change the ReKey on the Phase2 as it is happening at 55 minutes on a 60 expiration and a 3-4 minute random.
I started getting ping timeouts at 14:59:02 and service return at 3:00:03
Logs on Site A:
Apr 9 15:00:33 charon 81451 09[IKE] <con1|1> nothing to initiate Apr 9 15:00:33 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 15:00:33 charon 81451 09[ENC] <con1|1> parsed INFORMATIONAL response 27 [ ] Apr 9 15:00:33 charon 81451 09[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:33 charon 81451 09[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:33 charon 81451 09[ENC] <con1|1> generating INFORMATIONAL request 27 [ ] Apr 9 15:00:33 charon 81451 09[IKE] <con1|1> activating IKE_DPD task Apr 9 15:00:33 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 15:00:33 charon 81451 09[IKE] <con1|1> queueing IKE_DPD task Apr 9 15:00:33 charon 81451 09[IKE] <con1|1> sending DPD request Apr 9 15:00:23 charon 81451 09[IKE] <con1|1> nothing to initiate Apr 9 15:00:23 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 15:00:23 charon 81451 09[ENC] <con1|1> parsed INFORMATIONAL response 26 [ ] Apr 9 15:00:23 charon 81451 09[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:23 charon 81451 09[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:23 charon 81451 09[ENC] <con1|1> generating INFORMATIONAL request 26 [ ] Apr 9 15:00:23 charon 81451 09[IKE] <con1|1> activating IKE_DPD task Apr 9 15:00:23 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 15:00:23 charon 81451 09[IKE] <con1|1> queueing IKE_DPD task Apr 9 15:00:23 charon 81451 09[IKE] <con1|1> sending DPD request Apr 9 15:00:13 charon 81451 09[IKE] <con1|1> nothing to initiate Apr 9 15:00:13 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 15:00:13 charon 81451 09[ENC] <con1|1> parsed INFORMATIONAL response 25 [ ] Apr 9 15:00:13 charon 81451 09[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:13 charon 81451 10[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:13 charon 81451 10[ENC] <con1|1> generating INFORMATIONAL request 25 [ ] Apr 9 15:00:13 charon 81451 10[IKE] <con1|1> activating IKE_DPD task Apr 9 15:00:13 charon 81451 10[IKE] <con1|1> activating new tasks Apr 9 15:00:13 charon 81451 10[IKE] <con1|1> queueing IKE_DPD task Apr 9 15:00:13 charon 81451 10[IKE] <con1|1> sending DPD request Apr 9 15:00:03 charon 81451 10[IKE] <con1|1> nothing to initiate Apr 9 15:00:03 charon 81451 10[IKE] <con1|1> activating new tasks Apr 9 15:00:03 charon 81451 10[ENC] <con1|1> parsed INFORMATIONAL response 24 [ ] Apr 9 15:00:03 charon 81451 10[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:03 charon 81451 10[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:03 charon 81451 10[ENC] <con1|1> generating INFORMATIONAL request 24 [ ] Apr 9 15:00:03 charon 81451 10[IKE] <con1|1> activating IKE_DPD task Apr 9 15:00:03 charon 81451 10[IKE] <con1|1> activating new tasks Apr 9 15:00:03 charon 81451 10[IKE] <con1|1> queueing IKE_DPD task Apr 9 15:00:03 charon 81451 10[IKE] <con1|1> sending DPD request Apr 9 14:59:53 charon 81451 10[IKE] <con1|1> nothing to initiate Apr 9 14:59:53 charon 81451 10[IKE] <con1|1> activating new tasks Apr 9 14:59:53 charon 81451 10[ENC] <con1|1> parsed INFORMATIONAL response 23 [ ] Apr 9 14:59:53 charon 81451 10[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 14:59:53 charon 81451 10[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 14:59:53 charon 81451 10[ENC] <con1|1> generating INFORMATIONAL request 23 [ ] Apr 9 14:59:53 charon 81451 10[IKE] <con1|1> activating IKE_DPD task Apr 9 14:59:53 charon 81451 10[IKE] <con1|1> activating new tasks Apr 9 14:59:53 charon 81451 10[IKE] <con1|1> queueing IKE_DPD task Apr 9 14:59:53 charon 81451 10[IKE] <con1|1> sending DPD request Apr 9 14:59:47 charon 81451 09[IKE] <con1|1> nothing to initiate Apr 9 14:59:47 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 14:59:47 charon 81451 09[CHD] <con1|1> CHILD_SA con1{6} state change: DELETED => DESTROYING Apr 9 14:59:47 charon 81451 09[IKE] <con1|1> activating CHILD_DELETE task Apr 9 14:59:47 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 14:59:47 charon 81451 09[IKE] <con1|1> queueing CHILD_DELETE task Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> nothing to initiate Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> CHILD_SA con1{6} state change: DELETING => DELETED Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> CHILD_SA closed Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> received DELETE for ESP CHILD_SA with SPI cfe718f6 Apr 9 14:59:42 charon 81451 09[ENC] <con1|1> parsed INFORMATIONAL response 22 [ D ] Apr 9 14:59:42 charon 81451 09[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 14:59:42 charon 81451 09[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 14:59:42 charon 81451 09[ENC] <con1|1> generating INFORMATIONAL request 22 [ D ] Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> CHILD_SA con1{6} state change: REKEYED => DELETING Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> sending DELETE for ESP CHILD_SA with SPI cb24b68e Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> closing CHILD_SA con1{6} with SPIs cb24b68e_i (99971985 bytes) cfe718f6_o (135223892 bytes) and TS 10.1.0.0/16|/0 === 10.0.1.0/24|/0 10.2.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> CHILD_REKEY task Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> reinitiating already active tasks Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> CHILD_SA con1{6} state change: REKEYING => REKEYED Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> outbound CHILD_SA con1{7} established with SPIs c0ed26f2_i c421f173_o and TS 10.1.0.0/16|/0 === 10.0.1.0/24|/0 10.2.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> SPI 0xc421f173, src 207.162.137.151 dst 75.61.85.193 Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> adding outbound ESP SA Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> CHILD_SA con1{7} state change: INSTALLING => INSTALLED Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> inbound CHILD_SA con1{7} established with SPIs c0ed26f2_i c421f173_o and TS 10.1.0.0/16|/0 === 10.0.1.0/24|/0 10.2.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> SPI 0xc421f173, src 207.162.137.151 dst 75.61.85.193 Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> registering outbound ESP SA Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> SPI 0xc0ed26f2, src 75.61.85.193 dst 207.162.137.151 Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> adding inbound ESP SA Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> using HMAC_SHA2_256_128 for integrity Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> using AES_CBC for encryption Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> CHILD_SA con1{7} state change: CREATED => INSTALLING Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.0.1.0/24|/0 => match: 10.0.1.0/24|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.2.0.0/16|/0 => no match Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.0.1.0/24|/0 => no match Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.2.0.0/16|/0 => match: 10.2.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> selecting traffic selectors for other: Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> config: 10.1.0.0/16|/0, received: 10.1.0.0/16|/0 => match: 10.1.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> config: 10.1.0.0/16|/0, received: 10.1.0.0/16|/0 => match: 10.1.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> selecting traffic selectors for us: Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> selected proposal: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> configured proposals: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> received proposals: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> proposal matches Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> selecting proposal: Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> received ESP_TFC_PADDING_NOT_SUPPORTED, not using ESPv3 TFC padding Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> received ESP_TFC_PADDING_NOT_SUPPORTED notify Apr 9 14:59:42 charon 81451 09[ENC] <con1|1> parsed CREATE_CHILD_SA response 21 [ N(ESP_TFC_PAD_N) SA No KE TSi TSr ] Apr 9 14:59:42 charon 81451 09[NET] <con1|1> received packet: from 75.61.85.193[500] to 207.162.137.151[500] (496 bytes) Apr 9 14:59:42 charon 81451 09[NET] <con1|1> sending packet: from 207.162.137.151[500] to 75.61.85.193[500] (512 bytes) Apr 9 14:59:42 charon 81451 09[ENC] <con1|1> generating CREATE_CHILD_SA request 21 [ N(REKEY_SA) N(ESP_TFC_PAD_N) SA No KE TSi TSr ] Apr 9 14:59:42 charon 81451 09[CHD] <con1|1> CHILD_SA con1{6} state change: INSTALLED => REKEYING Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> establishing CHILD_SA con1{7} reqid 1 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> configured proposals: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> 10.0.1.0/24|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> 10.2.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> proposing traffic selectors for other: Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> 10.1.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> 10.1.0.0/16|/0 Apr 9 14:59:42 charon 81451 09[CFG] <con1|1> proposing traffic selectors for us: Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> activating CHILD_REKEY task Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> activating new tasks Apr 9 14:59:42 charon 81451 09[IKE] <con1|1> queueing CHILD_REKEY task Apr 9 14:59:42 charon 81451 08[KNL] creating rekey job for CHILD_SA ESP/0xcfe718f6/75.61.85.193 Apr 9 14:58:06 charon 81451 08[CFG] vici client 3008 disconnected Apr 9 14:58:06 charon 81451 09[CFG] vici client 3008 requests: list-sas Apr 9 14:58:06 charon 81451 09[CFG] vici client 3008 registered for: list-sa Apr 9 14:58:06 charon 81451 08[CFG] vici client 3008 connected Apr 9 14:58:01 charon 81451 09[CFG] vici client 3007 disconnected Apr 9 14:58:01 charon 81451 08[CFG] vici client 3007 requests: list-sas Apr 9 14:58:01 charon 81451 08[CFG] vici client 3007 registered for: list-sa Apr 9 14:58:01 charon 81451 08[CFG] vici client 3007 connected
On Site B:
Apr 9 15:00:33 charon 80312 13[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:33 charon 80312 13[ENC] <con1|1> generating INFORMATIONAL response 27 [ ] Apr 9 15:00:33 charon 80312 13[ENC] <con1|1> parsed INFORMATIONAL request 27 [ ] Apr 9 15:00:33 charon 80312 13[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:23 charon 80312 14[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:23 charon 80312 14[ENC] <con1|1> generating INFORMATIONAL response 26 [ ] Apr 9 15:00:23 charon 80312 14[ENC] <con1|1> parsed INFORMATIONAL request 26 [ ] Apr 9 15:00:23 charon 80312 14[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:13 charon 80312 14[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:13 charon 80312 14[ENC] <con1|1> generating INFORMATIONAL response 25 [ ] Apr 9 15:00:13 charon 80312 14[ENC] <con1|1> parsed INFORMATIONAL request 25 [ ] Apr 9 15:00:13 charon 80312 14[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 15:00:03 charon 80312 14[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 15:00:03 charon 80312 14[ENC] <con1|1> generating INFORMATIONAL response 24 [ ] Apr 9 15:00:03 charon 80312 14[ENC] <con1|1> parsed INFORMATIONAL request 24 [ ] Apr 9 15:00:03 charon 80312 14[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 14:59:53 charon 80312 14[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 14:59:53 charon 80312 14[ENC] <con1|1> generating INFORMATIONAL response 23 [ ] Apr 9 14:59:53 charon 80312 14[ENC] <con1|1> parsed INFORMATIONAL request 23 [ ] Apr 9 14:59:53 charon 80312 14[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 14:59:47 charon 80312 14[IKE] <con1|1> nothing to initiate Apr 9 14:59:47 charon 80312 14[IKE] <con1|1> activating new tasks Apr 9 14:59:47 charon 80312 14[CHD] <con1|1> CHILD_SA con1{6} state change: DELETED => DESTROYING Apr 9 14:59:47 charon 80312 14[IKE] <con1|1> activating CHILD_DELETE task Apr 9 14:59:47 charon 80312 14[IKE] <con1|1> activating new tasks Apr 9 14:59:47 charon 80312 14[IKE] <con1|1> queueing CHILD_DELETE task Apr 9 14:59:42 charon 80312 14[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (80 bytes) Apr 9 14:59:42 charon 80312 14[ENC] <con1|1> generating INFORMATIONAL response 22 [ D ] Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> outbound CHILD_SA con1{7} established with SPIs c421f173_i c0ed26f2_o and TS 10.0.1.0/24|/0 10.2.0.0/16|/0 === 10.1.0.0/16|/0 Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> SPI 0xc0ed26f2, src 75.61.85.193 dst 207.162.137.151 Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> adding outbound ESP SA Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> CHILD_SA con1{6} state change: DELETING => DELETED Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> CHILD_SA closed Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> CHILD_SA con1{6} state change: REKEYED => DELETING Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> sending DELETE for ESP CHILD_SA with SPI cfe718f6 Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> closing CHILD_SA con1{6} with SPIs cfe718f6_i (106476374 bytes) cb24b68e_o (129569664 bytes) and TS 10.0.1.0/24|/0 10.2.0.0/16|/0 === 10.1.0.0/16|/0 Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> received DELETE for ESP CHILD_SA with SPI cb24b68e Apr 9 14:59:42 charon 80312 14[ENC] <con1|1> parsed INFORMATIONAL request 22 [ D ] Apr 9 14:59:42 charon 80312 14[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (80 bytes) Apr 9 14:59:42 charon 80312 14[NET] <con1|1> sending packet: from 75.61.85.193[500] to 207.162.137.151[500] (496 bytes) Apr 9 14:59:42 charon 80312 14[ENC] <con1|1> generating CREATE_CHILD_SA response 21 [ N(ESP_TFC_PAD_N) SA No KE TSi TSr ] Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> CHILD_SA con1{6} state change: REKEYING => REKEYED Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> CHILD_SA con1{6} state change: INSTALLED => REKEYING Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> CHILD_SA con1{7} state change: INSTALLING => INSTALLED Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> inbound CHILD_SA con1{7} established with SPIs c421f173_i c0ed26f2_o and TS 10.0.1.0/24|/0 10.2.0.0/16|/0 === 10.1.0.0/16|/0 Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> SPI 0xc0ed26f2, src 75.61.85.193 dst 207.162.137.151 Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> registering outbound ESP SA Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> SPI 0xc421f173, src 207.162.137.151 dst 75.61.85.193 Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> adding inbound ESP SA Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> using HMAC_SHA2_256_128 for integrity Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> using AES_CBC for encryption Apr 9 14:59:42 charon 80312 14[CHD] <con1|1> CHILD_SA con1{7} state change: CREATED => INSTALLING Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> config: 10.1.0.0/16|/0, received: 10.1.0.0/16|/0 => match: 10.1.0.0/16|/0 Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> config: 10.1.0.0/16|/0, received: 10.1.0.0/16|/0 => match: 10.1.0.0/16|/0 Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> selecting traffic selectors for other: Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.0.1.0/24|/0 => match: 10.0.1.0/24|/0 Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> config: 10.0.1.0/24|/0, received: 10.2.0.0/16|/0 => no match Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.0.1.0/24|/0 => no match Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> config: 10.2.0.0/16|/0, received: 10.2.0.0/16|/0 => match: 10.2.0.0/16|/0 Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> selecting traffic selectors for us: Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> selected proposal: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> configured proposals: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> received proposals: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_2048/NO_EXT_SEQ Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> proposal matches Apr 9 14:59:42 charon 80312 14[CFG] <con1|1> selecting proposal: Apr 9 14:59:42 charon 80312 14[IKE] <con1|1> received ESP_TFC_PADDING_NOT_SUPPORTED, not using ESPv3 TFC padding Apr 9 14:59:42 charon 80312 14[ENC] <con1|1> parsed CREATE_CHILD_SA request 21 [ N(REKEY_SA) N(ESP_TFC_PAD_N) SA No KE TSi TSr ] Apr 9 14:59:42 charon 80312 14[NET] <con1|1> received packet: from 207.162.137.151[500] to 75.61.85.193[500] (512 bytes) Apr 9 14:57:56 charon 80312 16[CFG] vici client 3068 disconnected Apr 9 14:57:56 charon 80312 16[CFG] vici client 3068 requests: list-sas Apr 9 14:57:56 charon 80312 16[CFG] vici client 3068 registered for: list-sa Apr 9 14:57:56 charon 80312 16[CFG] vici client 3068 connected Apr 9 14:57:51 charon 80312 16[CFG] vici client 3067 disconnected Apr 9 14:57:51 charon 80312 05[CFG] vici client 3067 requests: list-sas
-
Same thing just happened again. 54 minutes after the last one it went down for roughly 45 seconds.
Thanks again all!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
@andrew_cb Just looking at this again as I know there is a time difference, but shouldn't this be the configuration if we do not want the IPsec traffic to go out the WAN?
Action => Block
Interface => IPsec
Direction => out
Address Family => IPv4
Protocol => Any
Source => LAN
Destination => WAN RTR1 / WAN RTR2 / VIP WANMaybe i'm thinking about this wrong, but this would keep IPsec traffic from the lan going out over the wan, or am I wrong?
PS, the system is still dropping for about 40-50 seconds every 55-ish minutes. I'm going to get the logs again shortly.
If the remote subnets are being removed from the firewall's routing table then they are sent out the default route (i.e. 0.0.0.0 / WAN).
When traffic is sent out the WAN, it gets NAT'ed so that the source address is now the WAN IP, and then it is sent out the WAN interface.
A state is created on the WAN interface with the source address being the WAN IP (due to NAT) and the destination will be the remote VPN subnet.
This state will persist and cause the firewall to keep sending traffic destined for the remote VPN subnet out the WAN instead.Eventually a program will timeout and stop transmitting. When the program tries again, it usually uses a different source port, which causes a new session to be created on the firewall. If the VPN's P2 are established, then the new traffic is correctly sent over the VPN tunnel.
The floating block rule is set on the WAN interface with source of WAN Address because NAT is performed before the firewall rules are processed.
-
@andrew_cb Okay, I think i understand now.
This means that the rule should look as follows:
Action => Block
Interface => WAN/WAN VIP
Direction => out
Address Family => IPv4
Protocol => Any
Source => WAN/WAN VIP
Destination => For Site A = Site B Subnets && for Site B = Site A SubnetsLet me know if I've got it right this time!
Thanks again,
TSoF
-
@andrew_cb well, I have the rules as this one setup above, and it still timed out. Maybe I just need to set P2 to timeout at like 24 hours and then apply it and reboot at like 6:00 AM that way it will take it months to timeout during the working period.
It just seems odd that I have "Make before break" enabled, yet it isn't doing that.
Thinking about it, I have it set to WAN SUBNETS as the source as this uses CARP and VIP Ip address. Should I change it to just the "IP address" that is the VIP IP?
-
So i have my firewall rule to block on the WAN VIP address to see if it will stop that 45-50 second down time.
I also performed a test and setup my tunnel times as follows:
Phase 1:
- Lifetime 604,800 (7 days/168 hours)
- ReKey 604,200 (10 minutes less than lifetime)
- ReAuth 160 (2 minutes) (This might need to be 604,640)
- Rand Time 60 (1 minute)
Phase 2:
- 86,400 (1 day/24 hours)
- 85,800 (10 minutes less than lifetime)
- 60 (1 minute)
The results were disastrously. The tunnels were resting every 45-50 seconds and it was creating multiple connection visible in the ipsec status.
I reset everything back to the original mentioned above, but we're still timing out at the 55-ish minute mark, which is not good during business hours as phone calls and app connections are failing.
I do have a question. If i am using make before break, do i still need "Phase One Child SA Close Action" to be restart? Shouldn't it be close?
Any thoughts? I'm about ready to move on and try wireguard.
-
I did a little checking and the tunnel is going down every 53 minutes 30 seconds (give or take a second).
Then it is down for approximately 38 - 40 seconds.
This is super strange that I have make before break enabled, but it is still doing this.
Thanks in advance!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
We are running CARP and both sites are running identical hardware (Dell R620) with pfSense 24.11-RELEASE.
A couple of things.
- Are you using the WAN CARP address to establish your IPsec tunnels?
- For your IPsec tunnels. Do you have gateway monitoring enabled?
- Have you enabled DPD ?
-
@michmoor Thanks for the reply!
1.) I am using the shared Virtual IP address so that way if RTR1 goes down RTR2 can provide the services.
2.) I need to find that feature. inside Phase One there is a "Gateway Duplicate" that is disabled. Do you know where the "gateway monitoring enabled" is by any chance?
3.) DPD is enabled in Phase One on both side of the tunnel.
Thanks again!
TSoF
-
@TheStormsOfFury said in IPsec site to site dropping every 49-55 minutes:
2.) I need to find that feature. inside Phase One there is a "Gateway Duplicate" that is disabled. Do you know where the "gateway monitoring enabled" is by any chance?
System /Routing / Gateways
-
@michmoor I found the Gateway monitoring setting in: System > Routing > Gateways > Edit
Gateway Monitoring - Disable Gateway Monitoring
This will consider this gateway as always being up.Is this what you are talking about? Reading it, without it being checked gateway monitoring is enabled.
I went ahead and disabled the "Gateway Monitoring" and "Gateway Action."
Thanks again,
TSoF
-
@TheStormsOfFury
I care more about Gateway Action. If that's set, lets unselect for now. Im assuming you are NOT in a multi-wan configurationWhat that does is that if there is an issue with your gateway monitor IP (packet loss, jitter) this will bring down the IPsec connection in your case.
-
@michmoor said in IPsec site to site dropping every 49-55 minutes:
@TheStormsOfFury
I care more about Gateway Action. If that's set, lets unselect for now. Im assuming you are NOT in a multi-wan configurationWhat that does is that if there is an issue with your gateway monitor IP (packet loss, jitter) this will bring down the IPsec connection in your case.
Correct, we are in a single WAN configuration. They were both "enabled" or "un-checked" and i went ahead and "checked" them on both sites
TSoF
-
@TheStormsOfFury great. Let’s monitor IPsec stability.
How soon will you know if it dropped?Edit: to be clear, disable gateway action is checked? It should be
-
@michmoor In about 5 minutes. That will be the 53 minute mark. I did make the changes on both sites as well.
Thanks!
TSoF