Update from 2.3 to 2.4.2 Causes individual Phase 2's to not reconnect after IKE rekey



  • First post, so I apologize if I don't provide enough information, but it's a weird problem.

    I have multiple virtual PFSenses in a datacenter that have IPSEC IKEv2 tunnels to different remote sites with different hardware, most notably PFSense, SonicWALL, and Fortigate. Ever since we updated the datacenter PFSenses to 2.4.3, using the same config as 2.3, we're having sporadic issues where a P2 won't come back after a rekey. When I say "a P2" I mean exactly that: in some of the tunnels I have 5 P2's, and exactly one of the set is the problem one, and it's always the same one. In other tunnels I only have 1 P2, and it happens to that one.

    This issue does not present itself on the PFSense->PFSense IPSEC tunnels, only the PFSense->Other Vendor tunnels.

    When you look at the status, the traffic appears to be all one way, and you can't ping or access resources across the tunnel. Simply disconnecting the P2 from either side of the tunnel fixes it immediately.

    It is sporadic, in that I can go days of 8 hour rekeys without a single problem, then have one day where the problem P2 has to be manually disconnected all 3 times it rekeys that day. I had the problem occur today on a tunnel that had not presented the problem any time in the last 3 weeks or so, while another tunnel on that same PFSense is my #1 problem issue.

    In our standard remote site config, the P1 and the P2 have the same lifetime of 8 hours. Again, we never had a problem before now. On the one tunnel on the one PFSense that is the most problematic, I have changed the P1 to a 24-hour lifetime, and changed the P2's to 8 hours and ~3 minutes to minimize the odds of the P1 and P2 rekeying at the same time. It didn't make any difference. It still just sporadically happens.

    I've crawled the forum and Google looking for this problem, and there are some things that seem similar, but so far, nothing has helped. Any of the PFSenses that I still have on 2.3 are not experiencing this issue, only 2.4+ and I'm now nervous about upgrading any of my other ones.

    If there are any log or config entries that would help, please let me know and I can provide them. I'm just looking for some guidance because I'm pulling my hair out on this one.
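
    As a rough illustration of the lifetime staggering described above (8h/8h versus 24h P1 with 8h 3m P2's), here is a minimal Python sketch that counts how often a P1 and a P2 rekey would fall due within a minute of each other over one week. It assumes a simplified model in which a rekey happens exactly when the lifetime expires; real IKE daemons usually rekey somewhat earlier, with some randomization, so treat the numbers as indicative only. The function names and the 60-second window are hypothetical choices for this example.

```python
HOUR = 3600
WEEK = 7 * 24 * HOUR


def rekey_times(lifetime_s, horizon_s):
    """Times (seconds since tunnel start) at which a rekey falls due.

    Simplifying assumption: the rekey fires exactly at lifetime expiry.
    """
    return list(range(lifetime_s, horizon_s + 1, lifetime_s))


def near_collisions(p1_lifetime_s, p2_lifetime_s, horizon_s, window_s=60):
    """Pairs of (P1 time, P2 time) that are at most window_s seconds apart."""
    p1 = rekey_times(p1_lifetime_s, horizon_s)
    p2 = rekey_times(p2_lifetime_s, horizon_s)
    return [(a, b) for a in p1 for b in p2 if abs(a - b) <= window_s]


# Original setup: P1 and P2 both at 8 hours.
same = near_collisions(8 * HOUR, 8 * HOUR, WEEK)

# Staggered setup: P1 at 24 hours, P2 at 8 hours and 3 minutes.
staggered = near_collisions(24 * HOUR, 8 * HOUR + 180, WEEK)

print(len(same), len(staggered))
```

    Under this model the staggered lifetimes remove every simultaneous rekey in the first week, which fits the observation that the problem continuing afterwards points away from P1/P2 rekey collisions as the cause.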



  • This is my frequent flier.

    0_1528821757542_8f21f0b2-f863-41c4-9620-8a1481daac2c-image.png

    There are 4 identical P2's with the below config, but only one of them goes down. It's connected to a SonicWALL on the other side, and SonicWALL doesn't allow for separate P2 settings. Again, ran solidly for over a year on 2.3 with no config changes.

    0_1528821872253_30e2c460-7c25-4424-8c1b-4ee8364e524a-image.png



  • Were you able to resolve this problem? I am having what seems to be the same problem.



  • Nope. No resolution. So far, the best we’ve been able to do is change the P1 and P2 rekeys to 24 hours, try to keep them going during the day, and have a quick VPN setup on the on-call person’s phone so they can disconnect the P1 real quick as soon as the alert comes in that the site is unreachable. Not a solution at all, and it’s driving me nuts while I’m on call. I guess this is what we get for using Community-based software and not paying for software support from Netgate...
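
    The "traffic appears to be all one way" symptom from the first post could in principle drive that alert directly, instead of waiting for the site to become unreachable. Below is a minimal Python sketch of the check, assuming you already have some way to sample the per-P2 byte counters twice (how you collect them is not shown, and the `SaCounters` type and function names are hypothetical). It flags a P2 when bytes moved in exactly one direction between two samples.

```python
from dataclasses import dataclass


@dataclass
class SaCounters:
    """One sample of a P2's byte counters (hypothetical structure)."""
    bytes_in: int
    bytes_out: int


def counter_delta(prev, curr):
    """Bytes moved between samples; a drop means the counter restarted
    (for example after a rekey), so the new value is taken as the delta."""
    return curr if curr < prev else curr - prev


def is_one_way(prev, curr, min_bytes=1):
    """True when traffic flowed in exactly one direction between samples."""
    d_in = counter_delta(prev.bytes_in, curr.bytes_in)
    d_out = counter_delta(prev.bytes_out, curr.bytes_out)
    return (d_in >= min_bytes) != (d_out >= min_bytes)


# Outbound keeps growing while inbound is frozen: suspicious.
stuck = is_one_way(SaCounters(1000, 1000), SaCounters(1000, 5000))

# Both directions growing: healthy.
healthy = is_one_way(SaCounters(1000, 1000), SaCounters(2000, 5000))

print(stuck, healthy)
```

    An idle tunnel is deliberately not flagged, so a real monitor would probably want to require the one-way condition across several consecutive samples before alerting.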



  • Same here. I'm on version 2.4.3 and have an IPSEC VPN with 6 P2's with a rekey lifetime of 1 hour and a P1 with a rekey lifetime of 8 hours. I've noticed that even with the P1 up, all hosts behind the P2 networks are offline. So I have to manually go to Status > IPSEC to disconnect and reconnect the P1 tunnel.

    When this rekey occurs and nobody is monitoring, I've noticed that the tunnel is down for about 5-10 minutes and then reconnects on its own. The remote peer is a Check Point device. Any idea how to keep all sessions behind a P2 tunnel alive?

    Thanks



  • I upgraded to 2.4.3 just before you posted this response. I'm still having the problem. I did find a similar thread here.



  • I've definitely had that particular issue before. In this case now though, the Old P2's aren't hanging around. There's still only 1 P2, and that one only takes traffic one-way. It's really odd.



  • Does anyone have the same issue with PFSense on both sides?



  • Guys, are you experiencing an issue similar to the one I am having?

    https://forum.netgate.com/topic/132900/ipsec-phase-2-duplicate-causes-vpn-tunnel-to-get-stuck



  • leo.f - I haven't had the problem with any of my setups that have PFSense on both sides. Currently I have about a dozen sites that have IPSEC from PFSense to PFSense, and we've never seen the problem on any of those links.

    Harow - I absolutely have seen that issue more than once, but so far, it hasn't actually impacted functioning. It just displays with tons of P2's but the primary P2 still functions fine. I always considered it a cosmetic GUI glitch.



  • @dkase279 Mine prevents the tunnel from working, as client machines cannot ping through to my main site via the VPN. I'm going to log a call with Netgate if possible, as it's preventing service. I also might put logs on here once it happens again.