Workaround needed for IPsec VTI limitation with dynamic remote gateways (0.0.0.0 not supported)
-
Hi pfSense community,
I've run into a frustrating limitation with IPsec VTI tunnels and dynamic remote endpoints, and I'm hoping someone has a clever workaround.
For context, per the pfSense documentation, VTI requires a specific remote endpoint IP address—it doesn't support "0.0.0.0" like policy-based or road warrior setups do. This makes sense for interface binding, but it's a pain for dynamic scenarios.
My setup and issue:
I have multiple road warrior IPsec tunnels (using Mobile Clients) configured with remote gateway 0.0.0.0, which work great. To handle inbound connections securely, I've added WAN firewall rules allowing UDP 500/4500 only from geoIP-approved countries; no issues there.
For site-to-site VTI tunnels, I'm forced to use a FQDN (dynamic DNS) for the remote gateway since the peers have dynamic public IPs.
This mostly works, but intermittently, DNS resolution hiccups (e.g., propagation delays or temporary outages) cause the tunnel to drop during rekeying or reconnection.
On restart, Phase 1 authentication fails with auth errors because the local side's resolved IP (from the FQDN) no longer matches the peer's current IP. The tunnel stays down until I manually intervene (e.g., flush DNS cache, restart IPsec or access the remote side and force dyndns to update again).
Policy-based IPsec and road warriors aren't affected here; they handle dynamics fine with 0.0.0.0. But I need VTI for easier routing (e.g., dynamic protocols like BGP/OSPF over the tunnel).
What I'm looking for:
Any workarounds to make VTI play nice with dynamic remotes?
Ideas like:
A script/hook to auto-update the P1 remote gateway IP when DNS changes (and trigger IPsec reload)?
Custom strongSwan config tweaks to ignore IP mismatches during re-auth?
Has anyone dealt with this in production? Happy to share logs/config snippets if needed.
pfSense versions: 2.7.2 on the dynamic IP side, and 2.8.1 on the responder-only side with a static IP.
Thanks in advance, I appreciate any insights!
Related logs:
Nov 12 06:54:30 charon 70880 10[CFG] vici client 14700 disconnected
Nov 12 06:54:30 charon 70880 13[IKE] <con2|629> IKE_SA con2[629] state change: CONNECTING => DESTROYING
Nov 12 06:54:30 charon 70880 13[CHD] <con2|629> CHILD_SA con2{2329} state change: CREATED => DESTROYING
Nov 12 06:54:30 charon 70880 13[IKE] <con2|629> received AUTHENTICATION_FAILED notify error
Nov 12 06:54:30 charon 70880 13[ENC] <con2|629> parsed IKE_AUTH response 1 [ N(AUTH_FAILED) ]
-
I tried moving from PSK to mutual certificates, but the remote gateway is still validated: if the peer's current IP does not match the IP resolved from the FQDN, the connection fails with a NO_PROPOSAL_CHOSEN (NO_PROP) error.
No joy.
To work around this limitation, I'm now testing a script that complements pfSense's dyndns client by checking the gateway status every once in a while from a cron schedule.
Single WAN:
If the public IP matches the IP the FQDN resolves to (the script queries 8.8.8.8 for the test), do nothing.
If it doesn't match, force an update.
Notes:
You will need to change the following in the script:
/etc/rc.dyndns.update 0 # zero is the dyndns ID; if you have more than one, check the firewall config XML to confirm which ID you want the script to update.
WAN_IF="ix3" # Your WAN interface (used for IP binding now)
DDNS_HOST="mydomain.duckdns.org" # Your DDNS hostname
Make sure ix3 is also the interface configured in dyndns.
#!/bin/sh

# Config: Customize these
WAN_IF="ix3"                      # Your WAN interface (used for IP binding now)
DDNS_HOST="mydomain.duckdns.org"  # Your DDNS hostname
MAX_RETRIES=3                     # Max attempts per run
RETRY_DELAY=30                    # Seconds between retries
DNS_TIMEOUT=10                    # Timeout in seconds for dig queries

# Function to validate IPv4 address
is_valid_ip() {
    [ -n "$1" ] && echo "$1" | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && \
        echo "$1" | awk -F. '{for(i=1;i<=4;i++)if($i<0||$i>255)exit 1;exit 0}'
}

# Get Public WAN IP for specific interface (multi-WAN compatible)
WAN_IP=$(curl -s --interface $WAN_IF http://ifconfig.me/ip 2>/dev/null)
if [ -z "$WAN_IP" ]; then
    WAN_IP=$(curl -s --interface $WAN_IF https://api.ipify.org 2>/dev/null)
fi
if [ -z "$WAN_IP" ] || [ "$WAN_IP" = "0.0.0.0" ] || ! is_valid_ip "$WAN_IP"; then
    exit 1
fi

# Resolve DDNS IP with timeout and retry logic for DNS failure
DNS_RETRIES=3
DNS_RETRY_DELAY=5
DDNS_IP=""
for dns_i in $(seq 1 $DNS_RETRIES); do
    DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
    if is_valid_ip "$DDNS_IP"; then
        break
    else
        if [ $dns_i -lt $DNS_RETRIES ]; then
            sleep $DNS_RETRY_DELAY
        fi
    fi
done
if ! is_valid_ip "$DDNS_IP"; then
    exit 1
fi

# If match (green), exit
if [ "$WAN_IP" = "$DDNS_IP" ]; then
    exit 0
fi

# Mismatch (red state): retry updates
for i in $(seq 1 $MAX_RETRIES); do
    /etc/rc.dyndns.update 0  # Forces all DDNS entries; add --id if multiple
    sleep $RETRY_DELAY

    # Re-check after update: resolve with retry logic
    NEW_DDNS_IP=""
    for dns_i in $(seq 1 $DNS_RETRIES); do
        NEW_DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
        if is_valid_ip "$NEW_DDNS_IP"; then
            break
        else
            if [ $dns_i -lt $DNS_RETRIES ]; then
                sleep $DNS_RETRY_DELAY
            fi
        fi
    done
    if ! is_valid_ip "$NEW_DDNS_IP"; then
        continue
    fi
    if [ "$WAN_IP" = "$NEW_DDNS_IP" ]; then
        exit 0
    fi
done
exit 1

Multi-WAN and gateway group:
Notes:
You will need to change the following in the script:
/etc/rc.dyndns.update 0 # zero is the dyndns ID, if you have more, check the firewall config xml to confirm which ID you want the script to update.
DDNS_HOST="mydomain.duckdns.org" # Your DDNS hostname
PRIMARY_GW and SECONDARY_GW: use the same names you configured through the GUI.
PRIMARY_IF is the interface of the tier 1 gateway in the gateway group used by dyndns ID 0.
SECONDARY_IF is the interface of the tier 2 gateway in the gateway group used by dyndns ID 0.

#!/bin/sh

# Config: Customize these
DDNS_HOST="mydomain.duckdns.org"  # Your DDNS hostname
PRIMARY_GW="NETGW"                # Primary gateway name in group (full name as shown in status)
SECONDARY_GW="OIGW"               # Secondary gateway name in group (full name as shown in status)
PRIMARY_IF="ix3"                  # Interface for primary gateway (e.g., ix0) - UPDATE TO YOUR ACTUAL INTERFACE
SECONDARY_IF="ix2"                # Interface for secondary gateway (e.g., ix1) - UPDATE TO YOUR ACTUAL INTERFACE
MAX_RETRIES=3                     # Max attempts per run
RETRY_DELAY=30                    # Seconds between retries
DNS_TIMEOUT=10                    # Timeout in seconds for dig queries

# Function to validate IPv4 address
is_valid_ip() {
    [ -n "$1" ] && echo "$1" | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && \
        echo "$1" | awk -F. '{for(i=1;i<=4;i++)if($i<0||$i>255)exit 1;exit 0}'
}

# Get status output
GATEWAY_STATUS=$(pfSsh.php playback gatewaystatus)

# Determine active interface
PRIMARY_STATUS=$(echo "$GATEWAY_STATUS" | awk "/^$PRIMARY_GW / {print \$7}")
if [ "$PRIMARY_STATUS" = "online" ]; then
    ACTIVE_IF="$PRIMARY_IF"
else
    SECONDARY_STATUS=$(echo "$GATEWAY_STATUS" | awk "/^$SECONDARY_GW / {print \$7}")
    if [ "$SECONDARY_STATUS" = "online" ]; then
        ACTIVE_IF="$SECONDARY_IF"
    else
        exit 1
    fi
fi

# Get Public WAN IP for active interface
WAN_IP=$(curl -s --interface $ACTIVE_IF http://ifconfig.me/ip 2>/dev/null)
if [ -z "$WAN_IP" ]; then
    WAN_IP=$(curl -s --interface $ACTIVE_IF https://api.ipify.org 2>/dev/null)
fi
if [ -z "$WAN_IP" ] || [ "$WAN_IP" = "0.0.0.0" ] || ! is_valid_ip "$WAN_IP"; then
    exit 1
fi

# Resolve DDNS IP with timeout and retry logic for DNS failure
DNS_RETRIES=3
DNS_RETRY_DELAY=5
DDNS_IP=""
for dns_i in $(seq 1 $DNS_RETRIES); do
    DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
    if is_valid_ip "$DDNS_IP"; then
        break
    else
        if [ $dns_i -lt $DNS_RETRIES ]; then
            sleep $DNS_RETRY_DELAY
        fi
    fi
done
if ! is_valid_ip "$DDNS_IP"; then
    exit 1
fi

# If match, exit
if [ "$WAN_IP" = "$DDNS_IP" ]; then
    exit 0
fi

# Mismatch: retry updates
for i in $(seq 1 $MAX_RETRIES); do
    /etc/rc.dyndns.update 0  # Forces all DDNS entries; add --id if multiple
    sleep $RETRY_DELAY

    # Re-check after update: resolve with retry logic
    NEW_DDNS_IP=""
    for dns_i in $(seq 1 $DNS_RETRIES); do
        NEW_DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
        if is_valid_ip "$NEW_DDNS_IP"; then
            break
        else
            if [ $dns_i -lt $DNS_RETRIES ]; then
                sleep $DNS_RETRY_DELAY
            fi
        fi
    done
    if ! is_valid_ip "$NEW_DDNS_IP"; then
        continue
    fi
    if [ "$WAN_IP" = "$NEW_DDNS_IP" ]; then
        exit 0
    fi
done
exit 1

I've been testing this for a few hours now, running every 15 minutes; so far no problems.
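Before scheduling either script, the IPv4 validator can be exercised on its own. This is only a small sketch: it reuses is_valid_ip as posted above, and the sample inputs are arbitrary test values (203.0.113.7 comes from a documentation address range), not addresses from my setup.

```sh
#!/bin/sh
# Same validator as in the scripts above: non-empty, dotted-quad shape,
# and every octet within 0..255.
is_valid_ip() {
    [ -n "$1" ] && echo "$1" | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && \
        echo "$1" | awk -F. '{for(i=1;i<=4;i++)if($i<0||$i>255)exit 1;exit 0}'
}

# Arbitrary sample inputs: one well-formed address and a few that
# curl or dig could plausibly hand back on failure.
for candidate in "203.0.113.7" "256.1.1.1" "1.2.3" "not-an-ip" ""; do
    if is_valid_ip "$candidate"; then
        echo "valid:   '$candidate'"
    else
        echo "invalid: '$candidate'"
    fi
done
```

If anything other than the first sample is reported as valid, the grep or awk on that box behaves differently from what the scripts expect.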
If someone decides to test this in the future, please update here and also share if you have any improvements.
-
I'm running a similar setup with about 12 sites, 9 of which have dynamic public IPs. The sites use VTI IPsec in a partial mesh, and each site has a connection to a hub site, which runs pfSense, also has a dynamic IP, and acts as responder only. It's a mixed environment of vendors; not all sites run pfSense. I have no issues with the IPsec tunnels reconnecting when the DSL sync at one site falls apart and re-establishes shortly after that.
I configured my dynamic DNS service down to 15 seconds TTL for the records and this seems to be sufficient to keep this setup working without too much administrative overhead. The unbound resolver must have cache-min-ttl set to 0 to honor the 15 seconds TTL for these records.
-
@Averlon said in Workaround needed for IPsec VTI limitation with dynamic remote gateways (0.0.0.0 not supported):
I configured my dynamic DNS service down to 15 seconds TTL for the records and this seems to be sufficient to keep this setup working without too much administrative overhead.
Hi, that could be an option, but a 15-second DNS TTL is too aggressive, as I see it.
These VTIs are running BGP through multiple paths, so I don't see a problem if it takes 2 minutes to re-establish a peer connection.
The main purpose of the scripts above is to avoid that issue: 'Ah, oh has the DNS been updated accordingly? Why is the tunnel down again? Let me check.'
And once you're there, you find the DynDNS Status page in pfSense showing red, so you have to manually force an update, and then the tunnel comes back up.
This has been happening a lot, especially with Starlink connections.
Based on this, I think the script will do the job. Run it every 5 or 10 minutes, you choose.
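For reference, a minimal sketch of what the schedule could look like as a system crontab entry for the 10-minute case. It assumes the script was saved as /root/ddns_check.sh and made executable; that path is only a placeholder. On pfSense the same fields can be entered through the Cron package rather than editing a file by hand.

```
# minute  hour  mday  month  wday  who   command
*/10      *     *     *      *     root  /root/ddns_check.sh
```

Changing */10 to */5 gives the 5-minute variant.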
-
@mcury said in Workaround needed for IPsec VTI limitation with dynamic remote gateways (0.0.0.0 not supported):
This mostly works, but intermittently, DNS resolution hiccups (e.g., propagation delays or temporary outages) cause the tunnel to drop during rekeying or reconnection.
On restart, Phase 1 authentication fails with auth errors because the local side's resolved IP (from the FQDN) no longer matches the peer's current IP. The tunnel stays down until I manually intervene (e.g., flush DNS cache, restart IPsec or access the remote side and force dyndns to update again).
The 15 seconds may sound like a somewhat aggressive timer for the TTLs of the records, but that's the DNS provider's problem in the first place. With these timers I don't have any issues with dynamic DNS updates, or with VTIs staying down and needing manual intervention. I mainly have DSL and fiber connections to deal with. I cannot tell how Starlink connections behave in that context.
Not all providers allow such aggressive timers, so the script would be an alternative, if a low TTL option isn't available.
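To see which TTL a provider actually hands out, and whether the local resolver honors it, a rough sketch like the following may help. It assumes dig is available on the box; answer_ttl is a made-up helper name, and the hostname in the usage note is just the placeholder used earlier in this thread.

```sh
#!/bin/sh
# Print the TTL column (second field) of the first A record found in
# `dig +noall +answer` output read from stdin.
answer_ttl() {
    awk '$4 == "A" { print $2; exit }'
}

# Optional direct use: ./ttl_check.sh <hostname> [resolver]
# Needs network access; the resolver defaults to 8.8.8.8.
if [ -n "${1:-}" ]; then
    dig +noall +answer "$1" A "@${2:-8.8.8.8}" | answer_ttl
fi
```

Running it once against 8.8.8.8 and once against 127.0.0.1 (for example `./ttl_check.sh mydomain.duckdns.org 127.0.0.1`) lets you compare the two values. If the local value is consistently much higher than what the provider serves, the resolver is probably enforcing a minimum cache TTL.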
-
@Averlon Indeed.
There are valid use cases for both options.
Thanks for the feedback
