25.07(.1) change to DynDNS? Incorrect VIP detection for GW groups!
-
Hi,
we are using a DynDNS address for a customer with a Multi-WAN IPsec VPN that requires automatic failover to another WAN without any reconfiguration on the remote side.
So the remote end, which we do NOT control, has set up their side of the P1 tunnel with the DynDNS address as the remote ID. To make that work, we created a gateway group with both WANs and set the address of each gateway to the appropriate VIP that is used for the VPN. Up until Plus 24.11 that worked absolutely fine.
With 25.07.1 something changed:
- GW Group is still working as intended (it seems) and hasn't changed
- The IPsec configuration (P1) of that setup still uses the above GW Group and picks up the correct VIPs for the connection. So instead of e.g. 192.0.2.2/.3 (node IPs), it uses .8 (the VPN VIP) just fine. You can see in the VPN logs that it tries to use .8 as the outgoing IP.
- DynDNS is broken:
We set it up as type "custom" and used a DynDNS provider that offers an API URI where we could use the %IP% placeholder. "Interface to monitor" and "Interface to send update from" are BOTH set to the GW Group that is also used by the VPN.
When you now hit "Force Update", the DynDNS URI gets called, but the %IP% placeholder wrongly inserts the node IP (.2) instead of the VIP from the GW Group (see the sketch below).
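To illustrate what I mean, here is roughly how the %IP% substitution plays out; just a sketch, the provider URL, hostname and token are made-up placeholders and the IPs reuse the examples from above:

```python
# Sketch of how a "custom" DynDNS update URL is built from the %IP% placeholder.
# The URL, host and token are hypothetical; the IPs match the examples above.
UPDATE_URL = "https://dyn.example.net/api/update?host=vpn.example.com&token=SECRET&myip=%IP%"

NODE_IP = "192.0.2.2"    # interface (node) address - what 25.07.1 wrongly inserts
GROUP_VIP = "192.0.2.8"  # CARP VIP set as the gateway group's address - what 24.11 used

def build_update_url(template: str, ip: str) -> str:
    """Substitute the %IP% placeholder the way the custom DynDNS type does."""
    return template.replace("%IP%", ip)

print(build_update_url(UPDATE_URL, GROUP_VIP))  # expected behaviour (up to 24.11)
print(build_update_url(UPDATE_URL, NODE_IP))    # observed behaviour on 25.07.1
```

As far as I can tell, the only thing that changed between the two releases is which of those two addresses ends up in the placeholder.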
So to verify that, I checked in our lab:
- create a dummy CARP WAN VIP with any other IP you can recognize
- use 24.11
- create a dummy GW Group with WAN as Tier 1 and select the Dummy IP created above as the address
- head over to dyndns and set up a test DynDNS update which selects the GW Group
- force update
the dummy IP is used successfully, as intended.
- instead use 25.07.1
- do the above steps
- force update
the node IP is wrongly used, which makes the IPsec P1 unusable because it mismatches the dedicated VPN VIP used by the cluster (a quick resolver check like the one below makes this easy to spot).
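In case it helps anyone reproduce this, here is the quick check I used; just a sketch, the hostname and dummy VIP are placeholders, and DNS caching/TTL can delay the result:

```python
# Check which address actually landed in DNS after "Force Update":
# resolve the DynDNS name and compare it against the dummy CARP VIP.
import socket

DDNS_HOST = "vpn.example.com"   # hypothetical test hostname
EXPECTED_VIP = "203.0.113.50"   # the dummy CARP WAN VIP created above

resolved = sorted({ai[4][0] for ai in socket.getaddrinfo(DDNS_HOST, None, socket.AF_INET)})
print("resolved:", resolved)
print("OK" if EXPECTED_VIP in resolved else "MISMATCH: the node IP was published instead of the VIP")
```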
Also, the WAN VIPs aren't listed at all in the interface selection (OK, they weren't in the older version either), but it would be extremely useful to have them there for cases like the above (e.g. simple DNS-failover WAN handling for proxy, VPN and similar services).
Cheers
-
Try applying the commit/patch from here: https://redmine.pfsense.org/issues/16326
-
@stephenw10 said in 25.07(.1) change to DynDNS? Incorrect VIP detection for GW groups!:
Try applying the commit/patch from here: https://redmine.pfsense.org/issues/16326
I tried applying #691852a2 from that post, but after applying it and force updating, the IP is still the node IP?
Is that only for 2.8.1 and somehow incompatible with 25.07.1?
Edit: It seems this only happens on the standby node. But I'm not as sure as Marcos about DDNS only running on the master node and not on the standby, as I've seen DDNS update messages on the standby node quite a few times. So perhaps a toggle or function to actively disable DDNS on a CARP standby node would solve that AND clear up the confusion about the wrongly displayed IPs?
-
That patch should apply to both 2.8.1 and 25.07.1.
So it is working as expected on the primary node? When it's master?
-
@stephenw10 said in 25.07(.1) change to DynDNS? Incorrect VIP detection for GW groups!:
That patch should apply to both 2.8.1 and 25.07.1.
So it is working as expected on the primary node? When it's master?
Yes, it's working on the master ONLY, but the problem is that the standby WILL update it, too. Even though it's not the master, it still occasionally updates the configured DynDNS entries - there is NO block preventing that on the standby node. We've had two IPsec tunnel outages because of this so far and had to manually disable DDNS on the standby so it doesn't interfere.
It would be great if either the DDNS patch could simply work on both nodes as before, or DDNS itself were made CARP-aware like other services, so the standby node wouldn't jump in with the wrong IPs.
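Something along these lines is what I have in mind; just a rough sketch in Python, not pfSense code, and the ifconfig parsing, interface name and vhid are assumptions on my side:

```python
# Rough idea of a CARP-aware guard: only run the DynDNS update when the
# CARP vhid backing the VIP reports MASTER on this node.
import subprocess

def carp_is_master(interface: str, vhid: int) -> bool:
    """Return True if the given CARP vhid on the interface reports MASTER."""
    out = subprocess.run(["ifconfig", interface], capture_output=True, text=True).stdout
    for line in out.splitlines():
        tokens = line.split()
        # FreeBSD prints e.g. "carp: MASTER vhid 8 advbase 1 advskew 0"
        if tokens[:1] == ["carp:"] and "vhid" in tokens:
            if tokens[tokens.index("vhid") + 1] == str(vhid):
                return tokens[1] == "MASTER"
    return False

if carp_is_master("igb0", 8):   # interface name and vhid are placeholders
    print("MASTER: run the DynDNS update")
else:
    print("BACKUP: skip the update so the node IP never gets published")
```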
I think it's the daily cron job that still triggers DDNS on the standby, but since I'm off site for the week I couldn't check the logs in person.
Thanks,
\jegr
-
Hmm, OK the expected behaviour here is that the backup node tries to run the dyndns update script but fails because the CARP VIP is in backup.
It should log an error in the system log. If you have verbose logging enabled, it should show the error and the VIP address it's trying to use.
Do you see that error logged?
Do you see it successfully update with the interface IP on the backup even with the patch applied?
-
@stephenw10 said in 25.07(.1) change to DynDNS? Incorrect VIP detection for GW groups!:
Do you see it successfully update with the interface IP on the backup even with the patch applied?
That. With the patch applied, it simply updates and puts the wrong node IP into DynDNS, which kills the VPN tunnel after around an hour when the remote site renews the P1 and fails the ID check. With luck the master node runs AFTER it and overwrites the change, but we saw the failure happen multiple times and have disabled DDNS on the standby node completely so it doesn't interfere with the VPN anymore. The customer is of course not that happy with a manual solution.
-
Does it not log the expected error at all then? I.e. it always does the wrong thing?
-
I updated the redmine with a new patch. Let us know if that helps.