FRR seeing IPsec tunnels disappearing
-
Anything that would cause the interface to disappear should be shown by the default IPSec log settings.
I'd expect it to be in the system log too. Nothing else in the routing log either? Just FRR suddenly unable to find the interfaces?
-
IPsec acknowledges that the interface went away.
It's even saying that it disappeared.
Deactivated and Disappeared are red flags to me.
-
I think I found the restart event with cause.
Edit: Oh man... so I do have a few IPsec tunnels using the DNS name of a remote gateway instead of an IPv4 address. My theory is that the IP changes, the ipsecdns process picks it up, and restarts all tunnels. I really hope that's not the case, but if so, that's really bad.
-
Mmm, indeed. IIRC there's something specific about the way FRR interacts with it there.
Looks like it may be related to this:
https://redmine.pfsense.org/issues/10503
Though in your case no gateway actually goes down?
-
@stephenw10 No gateways go down.
The incident happened again just now, and what's good is that I now know what to look for.
Is there a way to find out which gateway is changing its IP?
Also, should I open a Redmine?
-
Hmm, if it's actually a gateway I'd expect to see that logged there and in the gateways log.
If it's just a remote IPSec node that changed IP that's probably in the resolver log.
-
The only entries in the gateway log are the following. There is packet loss, but nothing showing a complete loss. Considering all VPN tunnels bounce, these error messages make sense.
The resolver log shows nothing useful. I see pfSense checking the local cache for DNS, but I don't see any related errors.
-
Hmm, it does seem to have triggered something at 14:19:13 though. Was there anything in the system log leading up to that? I can just about see there was a newipsecdns call then.
-
@stephenw10 Yep, a restart event.
-
Hmm, so in both cases the first thing logged is 'Restarting IPsec tunnels'?
That would normally be triggered by something else. Were any tunnels being renewed at that point?
-
@stephenw10 That is correct, that is the first thing logged.
-
Is it possible that coincided with the renew time for the tunnel using an FQDN remote endpoint?
-
I believe it does, for both incidents, even though the IP change and the restart are a few minutes apart, so it doesn't seem to happen right away.
Incident one: restart event at around 09:38
./pfblockerng/dns_reply.log:DNS-reply,Oct 7 09:32:25,resolver,A,A,300,vpn.server4u.in,127.0.0.1,124.123.66.69,IN
./pfblockerng/dns_reply.log:DNS-reply,Oct 7 09:37:25,resolver,A,A,300,vpn.server4u.in,127.0.0.1,103.127.188.125,IN
./pfblockerng/dns_reply.log:DNS-reply,Oct 7 09:42:41,resolver,A,A,300,vpn.server4u.in,127.0.0.1,124.123.66.69,IN
Incident two: 14:18
./dns_reply.log:DNS-reply,Oct 7 14:14:03,resolver,A,A,300,vpn.networkzz.co.in,127.0.0.1,210.89.55.63,IN <---
./dns_reply.log:DNS-reply,Oct 7 14:18:33,resolver,A,A,300,vpn.networkzz.co.in,127.0.0.1,202.88.209.151,IN
I'm happy that we found something that is reproducible.
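For anyone wanting to spot which remote endpoint is flipping, here is a rough Python sketch that scans pfBlockerNG dns_reply.log lines and reports every time a name's answer differs from the previous one. The field positions (timestamp in column 2, name in column 7, answer in column 9) are an assumption inferred only from the sample lines above, and the function name `find_ip_changes` is just illustrative, so treat it as a starting point rather than a tested tool.

```python
# Sample taken from the "incident one" lines quoted above.
SAMPLE = [
    "./pfblockerng/dns_reply.log:DNS-reply,Oct 7 09:32:25,resolver,A,A,300,vpn.server4u.in,127.0.0.1,124.123.66.69,IN",
    "./pfblockerng/dns_reply.log:DNS-reply,Oct 7 09:37:25,resolver,A,A,300,vpn.server4u.in,127.0.0.1,103.127.188.125,IN",
    "./pfblockerng/dns_reply.log:DNS-reply,Oct 7 09:42:41,resolver,A,A,300,vpn.server4u.in,127.0.0.1,124.123.66.69,IN",
]


def find_ip_changes(lines):
    """Return (timestamp, hostname, old_answer, new_answer) for each change.

    Assumed layout (inferred from the samples, not from documentation):
    DNS-reply,<timestamp>,resolver,<type>,<type>,<ttl>,<name>,<client>,<answer>,IN
    """
    last_answer = {}  # hostname -> most recent answer seen
    changes = []
    for line in lines:
        # Skip any grep-style "file:" prefix in front of the record.
        start = line.find("DNS-reply,")
        if start == -1:
            continue
        fields = line[start:].strip().split(",")
        if len(fields) < 9:
            continue  # not the layout we expect; ignore it
        timestamp, hostname, answer = fields[1], fields[6], fields[8]
        previous = last_answer.get(hostname)
        if previous is not None and previous != answer:
            changes.append((timestamp, hostname, previous, answer))
        last_answer[hostname] = answer
    return changes


if __name__ == "__main__":
    for ts, host, old, new in find_ip_changes(SAMPLE):
        print(f"{ts}  {host}: {old} -> {new}")
```

Comparing the reported timestamps against the 'Restarting IPsec tunnels' entries in the system log should show whether each restart follows a change for one of the FQDN endpoints.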
-
Hmm, OK so did those endpoints actually change? Are they FQDNs that resolve to several IPs?
I'd guess there is some timeout there that has to add up over those 4 minutes.
Either way I agree it should not affect all IPSec tunnels.
-
@stephenw10 said in FRR seeing IPsec tunnels disappearing:
Hmm, OK so did those endpoints actually change? Are they FQDNs that resolve to several IPs?
Yep, those endpoints do resolve to several IPs. One of those I know for sure, because I remember setting it up recently.
I did open a Redmine for it for tracking purposes. I don't think there is any workaround for this other than getting into the weeds of how IPsec is configured/built.
-
Hmm, do they all resolve to several IPs? Conversely, do you have any that resolve to only one IP and don't cause this?
Like, is this being triggered because it's resolving a different IP address every time, or just because it is re-resolving at all?
-
@stephenw10 said in FRR seeing IPsec tunnels disappearing:
do they all resolve IPs? Converse
I have a few IPsec tunnels that are by IP only. I suspect this is triggered every time it detects a change in the IP when pfSense goes to resolve the name.
-
Yup. Are you able to test that by adding a host override so it always resolves to the same IP?
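To confirm the override is actually pinning the name, something like the following Python sketch could be run from a machine that uses the firewall's resolver. It resolves a name repeatedly and records each time the answer set changes. The function names, the polling interval, and the hostname in the usage note are all placeholders, and it only reflects what the system resolver on that machine returns.

```python
import socket
import time


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def watch(hostname, rounds=5, interval=60.0, resolver=resolve_ipv4, sleep=time.sleep):
    """Resolve hostname `rounds` times and return the distinct consecutive answers.

    A result with exactly one entry means the answer never changed while
    watching; more than one entry means the resolver handed out different
    addresses over time.
    """
    seen = []
    for i in range(rounds):
        addrs = tuple(resolver(hostname))
        if not seen or seen[-1] != addrs:
            seen.append(addrs)
        if i < rounds - 1:
            sleep(interval)
    return seen
```

For example, `watch("vpn.example.com", rounds=10, interval=300)` (hypothetical hostname) polls every five minutes. With a working host override it should return a single entry.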
-
@stephenw10 that’s a good idea. Setting up one now. I’ll observe overnight maybe for a few days.
Have you discussed this internally?