Netgate Discussion Forum

    Workaround needed for IPsec VTI limitation with dynamic remote gateways (0.0.0.0 not supported)

    • M Offline
      mcury Rebel Alliance
      last edited by mcury

      Hi pfSense community, I've run into a frustrating limitation with IPsec VTI tunnels and dynamic remote endpoints, and I'm hoping someone has a clever workaround.

      For context, per the pfSense documentation, VTI requires a specific remote endpoint IP address—it doesn't support "0.0.0.0" like policy-based or road warrior setups do. This makes sense for interface binding, but it's a pain for dynamic scenarios.

      My setup and issue: I have multiple road warrior IPsec tunnels (using Mobile Clients) configured with remote gateway 0.0.0.0, which work great. To handle inbound connections securely, I've added WAN firewall rules allowing UDP 500/4500 only from GeoIP-approved countries, with no issues there.

      For site-to-site VTI tunnels, I'm forced to use a FQDN (dynamic DNS) for the remote gateway since the peers have dynamic public IPs.
      This mostly works, but intermittently, DNS resolution hiccups (e.g., propagation delays or temporary outages) cause the tunnel to drop during rekeying or reconnection.
      On restart, Phase 1 authentication fails with auth errors because the local side's resolved IP (from the FQDN) no longer matches the peer's current IP. The tunnel stays down until I manually intervene (e.g., flush DNS cache, restart IPsec or access the remote side and force dyndns to update again).

      Policy-based IPsec and road warriors aren't affected here—they handle dynamics fine with 0.0.0.0. But I need VTI for easier routing (e.g., dynamic protocols like BGP/OSPF over the tunnel).

      What I'm looking for: any workarounds to make VTI play nice with dynamic remotes?
      Ideas like:
      A script/hook to auto-update the P1 remote gateway IP when DNS changes (and trigger an IPsec reload)?
      Custom strongSwan config tweaks to ignore IP mismatches during re-auth?
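A minimal sketch of the first idea, assuming it runs from cron on the side that holds the FQDN as its remote gateway. It only remembers the last address the peer's name resolved to and runs a reload command when that address changes. `PEER_HOST`, `STATE_FILE` and `RELOAD_CMD` are illustrative names, and the default reload command (the `restartipsec` playback of `pfSsh.php`) is an assumption to verify on your pfSense version before relying on it.

```shell
#!/bin/sh
# Sketch: reload IPsec only when the peer's DDNS name resolves to a new address.
# PEER_HOST, STATE_FILE and RELOAD_CMD are illustrative assumptions.
PEER_HOST="${PEER_HOST:-peer.example.org}"
STATE_FILE="${STATE_FILE:-/tmp/vti_peer_ip.last}"
RELOAD_CMD="${RELOAD_CMD:-/usr/local/sbin/pfSsh.php playback restartipsec}"

# Print the first IPv4-looking answer for host $1, or nothing on failure.
resolve_peer() {
    dig +short +time=5 "$1" A 2>/dev/null | grep -E '^[0-9]+(\.[0-9]+){3}$' | head -n1
}

# Return 0 if $1 is non-empty and differs from the last recorded address.
# With no state file yet, any valid address counts as a change.
peer_ip_changed() {
    new_ip="$1"
    [ -n "$new_ip" ] || return 1
    last_ip=""
    [ -f "$STATE_FILE" ] && last_ip=$(cat "$STATE_FILE")
    [ "$new_ip" != "$last_ip" ]
}

# Resolve the peer, and on a change record the new address and reload IPsec.
check_and_reload() {
    ip=$(resolve_peer "$PEER_HOST")
    if peer_ip_changed "$ip"; then
        echo "$ip" > "$STATE_FILE"
        $RELOAD_CMD
    fi
}
```

Calling `check_and_reload` at the end of the file turns this into a cron job. A failed lookup returns an empty string and is ignored, so a DNS outage alone never restarts the tunnels.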

      Has anyone dealt with this in production? Happy to share logs/config snippets if needed.

      pfSense versions: 2.7.2 on the dynamic-IP side, and 2.8.1 on the responder-only side with a static IP.
      Thanks in advance—appreciate any insights!

      Related logs:

      Nov 12 06:54:30	charon	70880	10[CFG] vici client 14700 disconnected
      Nov 12 06:54:30	charon	70880	13[IKE] <con2|629> IKE_SA con2[629] state change: CONNECTING => DESTROYING
      Nov 12 06:54:30	charon	70880	13[CHD] <con2|629> CHILD_SA con2{2329} state change: CREATED => DESTROYING
      Nov 12 06:54:30	charon	70880	13[IKE] <con2|629> received AUTHENTICATION_FAILED notify error
      Nov 12 06:54:30	charon	70880	13[ENC] <con2|629> parsed IKE_AUTH response 1 [ N(AUTH_FAILED) ]
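The lines above are from the initiator: it parsed an IKE_AUTH response carrying AUTH_FAILED, so the rejection came from the remote side. A small sketch for spotting which connections are in that state, assuming charon log lines in the format shown above are fed on stdin (the function name is illustrative):

```shell
# Read charon log lines on stdin and list the connection names (e.g. con2)
# whose peer answered with an AUTHENTICATION_FAILED notify.
failed_connections() {
    grep 'received AUTHENTICATION_FAILED notify error' |
        sed -n 's/.*<\(con[0-9]*\)|[0-9]*>.*/\1/p' |
        sort -u
}
```

On pfSense the IPsec log is normally /var/log/ipsec.log, so `failed_connections < /var/log/ipsec.log` would be a typical call; treat that path as an assumption if your logging is configured differently.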
      

      dead on arrival, nowhere to be found.

      • M Offline
        mcury Rebel Alliance @mcury
        last edited by

        I tried moving from PSK to mutual certificates, but the remote gateway is still validated: if the IP that the FQDN resolves to does not match the peer's actual IP, the tunnel does not connect and fails with a NO_PROPOSAL_CHOSEN (NO_PROP) error.

        No joy.

        To work around this limitation, I'm now testing a script that complements pfSense's DynDNS client by periodically checking the gateway status on a cron schedule.

        Single WAN:
        If the public IP matches the IP that the FQDN resolves to (the script queries 8.8.8.8 for this check), do nothing.
        If it doesn't match, force a DynDNS update.

        Notes:
        You will need to change the following in the script:
        /etc/rc.dyndns.update 0 # 0 is the DynDNS entry ID; if you have more than one entry, check the firewall's config XML to confirm which ID you want the script to update.
        WAN_IF="ix3" # Your WAN interface (curl binds to it when looking up the public IP)
        DDNS_HOST="mydomain.duckdns.org" # Your DDNS hostname

        Make sure ix3 is also the interface configured in the DynDNS entry.

        #!/bin/sh
        
        # Config: Customize these
        WAN_IF="ix3"  # Your WAN interface (used for IP binding now)
        DDNS_HOST="mydomain.duckdns.org"  # Your DDNS hostname
        MAX_RETRIES=3  # Max attempts per run
        RETRY_DELAY=30  # Seconds between retries
        DNS_TIMEOUT=10  # Timeout in seconds for dig queries
        
        # Function to validate IPv4 address
        is_valid_ip() {
            [ -n "$1" ] && echo "$1" | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && \
            echo "$1" | awk -F. '{for(i=1;i<=4;i++)if($i<0||$i>255)exit 1;exit 0}'
        }
        
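        # Get the public IP as seen from the WAN interface (curl binds to it, so this also works on multi-WAN).
        # --max-time keeps a stalled lookup service from hanging the cron job.
        WAN_IP=$(curl -s --max-time 10 --interface "$WAN_IF" http://ifconfig.me/ip 2>/dev/null)
        if [ -z "$WAN_IP" ]; then
            WAN_IP=$(curl -s --max-time 10 --interface "$WAN_IF" https://api.ipify.org 2>/dev/null)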
        fi
        if [ -z "$WAN_IP" ] || [ "$WAN_IP" = "0.0.0.0" ] || ! is_valid_ip "$WAN_IP"; then
            exit 1
        fi
        
        # Resolve DDNS IP with timeout and retry logic for DNS failure
        DNS_RETRIES=3
        DNS_RETRY_DELAY=5
        DDNS_IP=""
        for dns_i in $(seq 1 $DNS_RETRIES); do
            DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
            if is_valid_ip "$DDNS_IP"; then
                break
            else
                if [ $dns_i -lt $DNS_RETRIES ]; then
                    sleep $DNS_RETRY_DELAY
                fi
            fi
        done
        
        if ! is_valid_ip "$DDNS_IP"; then
            exit 1
        fi
        
        # If match, green—exit
        if [ "$WAN_IP" = "$DDNS_IP" ]; then
            exit 0
        fi
        
        # Mismatch: Red state—retry updates
        for i in $(seq 1 $MAX_RETRIES); do
            /etc/rc.dyndns.update 0  # Forces all DDNS entries; add --id if multiple
            sleep $RETRY_DELAY
        
            # Re-check after update: resolve with retry logic
            NEW_DDNS_IP=""
            for dns_i in $(seq 1 $DNS_RETRIES); do
                NEW_DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
                if is_valid_ip "$NEW_DDNS_IP"; then
                    break
                else
                    if [ $dns_i -lt $DNS_RETRIES ]; then
                        sleep $DNS_RETRY_DELAY
                    fi
                fi
            done
        
            if ! is_valid_ip "$NEW_DDNS_IP"; then
                continue
            fi
        
            if [ "$WAN_IP" = "$NEW_DDNS_IP" ]; then
                exit 0
            fi
        done
        
        exit 1
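The DNS lookup loop appears twice in the script above. One way to avoid the duplication is a shared helper, sketched here under a few assumptions: `resolve_with_retry` and `DNS_SERVER` are illustrative names, and the extra `grep` is an addition that skips any CNAME line `dig +short` may print before the address.

```shell
#!/bin/sh
# Sketch of a shared lookup helper; DNS_SERVER is an added, illustrative variable.
DNS_SERVER="${DNS_SERVER:-8.8.8.8}"
DNS_TIMEOUT="${DNS_TIMEOUT:-10}"
DNS_RETRIES="${DNS_RETRIES:-3}"
DNS_RETRY_DELAY="${DNS_RETRY_DELAY:-5}"

# Validate a dotted-quad IPv4 address (same check as the script above).
is_valid_ip() {
    [ -n "$1" ] && echo "$1" | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && \
    echo "$1" | awk -F. '{for(i=1;i<=4;i++)if($i<0||$i>255)exit 1;exit 0}'
}

# Print the first valid A record for host $1; return 1 if every attempt fails.
resolve_with_retry() {
    attempt=1
    while [ "$attempt" -le "$DNS_RETRIES" ]; do
        ip=$(dig +short +time="$DNS_TIMEOUT" "$1" A @"$DNS_SERVER" 2>/dev/null |
            grep -E '^[0-9]+(\.[0-9]+){3}$' | head -n1)
        if is_valid_ip "$ip"; then
            echo "$ip"
            return 0
        fi
        # Pause between attempts, but not after the last one.
        [ "$attempt" -lt "$DNS_RETRIES" ] && sleep "$DNS_RETRY_DELAY"
        attempt=$((attempt + 1))
    done
    return 1
}
```

With this in place, both lookups reduce to `DDNS_IP=$(resolve_with_retry "$DDNS_HOST") || exit 1` and `NEW_DDNS_IP=$(resolve_with_retry "$DDNS_HOST") || continue`.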
        

        Multi-WAN and gateway group:

        Notes:
        You will need to change the following in the script:
        /etc/rc.dyndns.update 0 # 0 is the DynDNS entry ID; if you have more than one entry, check the firewall's config XML to confirm which ID you want the script to update.
        DDNS_HOST="mydomain.duckdns.org" # Your DDNS hostname
        PRIMARY_GW and SECONDARY_GW: use the same gateway names you configured through the GUI.
        PRIMARY_IF is the interface of the tier 1 gateway in the gateway group used by DynDNS ID 0.
        SECONDARY_IF is the interface of the tier 2 gateway in the gateway group used by DynDNS ID 0.

        #!/bin/sh
        
        # Config: Customize these
        DDNS_HOST="mydomain.duckdns.org"  # Your DDNS hostname
        PRIMARY_GW="NETGW"  # Primary gateway name in group (full name as shown in status)
        SECONDARY_GW="OIGW"  # Secondary gateway name in group (full name as shown in status)
        PRIMARY_IF="ix3"  # Interface for primary gateway (e.g., ix0) - UPDATE TO YOUR ACTUAL INTERFACE
        SECONDARY_IF="ix2"  # Interface for secondary gateway (e.g., ix1) - UPDATE TO YOUR ACTUAL INTERFACE
        MAX_RETRIES=3  # Max attempts per run
        RETRY_DELAY=30  # Seconds between retries
        DNS_TIMEOUT=10  # Timeout in seconds for dig queries
        
        # Function to validate IPv4 address
        is_valid_ip() {
            [ -n "$1" ] && echo "$1" | grep -qE '^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$' && \
            echo "$1" | awk -F. '{for(i=1;i<=4;i++)if($i<0||$i>255)exit 1;exit 0}'
        }
        
        # Get status output
        GATEWAY_STATUS=$(pfSsh.php playback gatewaystatus)
        
        # Determine active interface
        PRIMARY_STATUS=$(echo "$GATEWAY_STATUS" | awk "/^$PRIMARY_GW / {print \$7}")
        
        if [ "$PRIMARY_STATUS" = "online" ]; then
            ACTIVE_IF="$PRIMARY_IF"
        else
            SECONDARY_STATUS=$(echo "$GATEWAY_STATUS" | awk "/^$SECONDARY_GW / {print \$7}")
            if [ "$SECONDARY_STATUS" = "online" ]; then
                ACTIVE_IF="$SECONDARY_IF"
            else
                exit 1
            fi
        fi
        
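        # Get the public IP as seen from the active interface (curl binds to it).
        # --max-time keeps a stalled lookup service from hanging the cron job.
        WAN_IP=$(curl -s --max-time 10 --interface "$ACTIVE_IF" http://ifconfig.me/ip 2>/dev/null)
        if [ -z "$WAN_IP" ]; then
            WAN_IP=$(curl -s --max-time 10 --interface "$ACTIVE_IF" https://api.ipify.org 2>/dev/null)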
        fi
        if [ -z "$WAN_IP" ] || [ "$WAN_IP" = "0.0.0.0" ] || ! is_valid_ip "$WAN_IP"; then
            exit 1
        fi
        
        # Resolve DDNS IP with timeout and retry logic for DNS failure
        DNS_RETRIES=3
        DNS_RETRY_DELAY=5
        DDNS_IP=""
        for dns_i in $(seq 1 $DNS_RETRIES); do
            DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
            if is_valid_ip "$DDNS_IP"; then
                break
            else
                if [ $dns_i -lt $DNS_RETRIES ]; then
                    sleep $DNS_RETRY_DELAY
                fi
            fi
        done
        
        if ! is_valid_ip "$DDNS_IP"; then
            exit 1
        fi
        
        # If match, exit
        if [ "$WAN_IP" = "$DDNS_IP" ]; then
            exit 0
        fi
        
        # Mismatch: retry updates
        for i in $(seq 1 $MAX_RETRIES); do
            /etc/rc.dyndns.update 0 # Forces all DDNS entries; add --id if multiple
            sleep $RETRY_DELAY
        
            # Re-check after update: resolve with retry logic
            NEW_DDNS_IP=""
            for dns_i in $(seq 1 $DNS_RETRIES); do
                NEW_DDNS_IP=$(dig +short +time=$DNS_TIMEOUT $DDNS_HOST A @8.8.8.8 2>/dev/null | head -n1)
                if is_valid_ip "$NEW_DDNS_IP"; then
                    break
                else
                    if [ $dns_i -lt $DNS_RETRIES ]; then
                        sleep $DNS_RETRY_DELAY
                    fi
                fi
            done
        
            if ! is_valid_ip "$NEW_DDNS_IP"; then
                continue
            fi
        
            if [ "$WAN_IP" = "$NEW_DDNS_IP" ]; then
                exit 0
            fi
        done
        
        exit 1
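The gateway selection in the script above interpolates the gateway name into an awk regex and is fixed at two tiers. A possible variant, sketched with illustrative function names, compares the name exactly and accepts any number of name/interface pairs in tier order. It keeps the script's assumption that the status is the 7th column of the `pfSsh.php playback gatewaystatus` output.

```shell
# Print the status column for gateway $1 from gatewaystatus output on stdin.
# Column 7 is the position the script above relies on.
gateway_status() {
    awk -v gw="$1" '$1 == gw { print $7; exit }'
}

# $1 is the captured status text; the remaining arguments are
# gateway-name/interface pairs in tier order. Prints the interface of the
# first gateway reported "online", or returns 1 if none is.
pick_active_if() {
    status_text="$1"
    shift
    while [ "$#" -ge 2 ]; do
        if [ "$(printf '%s\n' "$status_text" | gateway_status "$1")" = "online" ]; then
            echo "$2"
            return 0
        fi
        shift 2
    done
    return 1
}
```

In the script this would replace the nested if block with `ACTIVE_IF=$(pick_active_if "$GATEWAY_STATUS" "$PRIMARY_GW" "$PRIMARY_IF" "$SECONDARY_GW" "$SECONDARY_IF") || exit 1`.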
        

        I've been testing this for a few hours now, running it every 15 minutes, and so far there have been no problems.
        If anyone decides to test this in the future, please post an update here and share any improvements you have.

        • A Offline
          Averlon @mcury
          last edited by

          I'm running a similar setup with about 12 sites, 9 of which have dynamic public IPs. The sites use VTI IPsec in a partial mesh, and each site has a connection to a hub site, which runs pfSense, also has a dynamic IP, and acts as responder only. It's a mixed-vendor environment; not all sites run pfSense. I have no issues with the IPsec tunnels reconnecting when the DSL sync at a site drops and re-establishes shortly afterwards.
          I configured my dynamic DNS service down to 15 seconds TTL for the records and this seems to be sufficient to keep this setup working without too much administrative overhead. The Unbound resolver must have cache-min-ttl set to 0 to honor the 15-second TTL for these records.
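A quick way to confirm that the local resolver really serves the short TTL is to look at the TTL field of the answer. The sketch below assumes `dig +noall +answer` output on stdin, where the second field is the remaining TTL; the function names and the hostname in the usage note are illustrative.

```shell
# Read `dig +noall +answer` output on stdin and print the TTL (2nd field)
# of the first A record.
answer_ttl() {
    awk '$4 == "A" { print $2; exit }'
}

# Return 0 if the first A record on stdin has a TTL of at most $1 seconds.
ttl_at_most() {
    ttl=$(answer_ttl)
    [ -n "$ttl" ] && [ "$ttl" -le "$1" ]
}
```

For example, `dig +noall +answer site1.example.org A @127.0.0.1 | ttl_at_most 15` would succeed only if the resolver on the firewall hands out the record with 15 seconds or less left. A larger value suggests the resolver is enforcing a minimum cache TTL.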

          • M Offline
            mcury Rebel Alliance @Averlon
            last edited by

            @Averlon said in Workaround needed for IPsec VTI limitation with dynamic remote gateways (0.0.0.0 not supported):

            I configured my dynamic DNS service down to 15 seconds TTL for the records and this seems to be sufficient to keep this setup working without too much administrative overhead.

            Hi, that could be an option, but a 15-second DNS TTL is too aggressive, as I see it.
            These VTIs are running BGP through multiple paths, so I don't see a problem if it takes 2 minutes to re-establish a peer connection.

            The main purpose of the scripts above is to avoid this situation: 'Oh, has the DNS been updated? Why is the tunnel down again? Let me check.'
            And once you're there, you find the DynDNS Status page in pfSense showing red, so you have to manually force an update, and only then does the tunnel come back up.
            This has been happening a lot, especially with Starlink connections.

            Based on this, I think the script will do the job. Run it every 5 or 10 minutes, you choose.
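With a short interval, one thing to watch is overlap: the scripts sleep between retries and wait on DNS timeouts, so a run that hits repeated failures can last several minutes. A minimal guard against two runs at once, assuming a lock directory under /tmp (`LOCK_DIR` and `run_exclusive` are illustrative names):

```shell
#!/bin/sh
# Sketch: skip this run if the previous one is still in progress.
LOCK_DIR="${LOCK_DIR:-/tmp/ddns_check.lock}"

# Run "$@" only if no other instance holds the lock. mkdir is atomic, so
# two concurrent runs cannot both succeed in creating the directory.
run_exclusive() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        return 75   # another run is still in progress
    fi
    status=0
    "$@" || status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}
```

The cron job would then call something like `run_exclusive /root/ddns_check.sh`, where the script path is hypothetical. If a run is killed before it reaches `rmdir`, the lock directory stays behind and has to be removed by hand.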

            • A Offline
              Averlon @mcury
              last edited by

              @mcury said in Workaround needed for IPsec VTI limitation with dynamic remote gateways (0.0.0.0 not supported):

              This mostly works, but intermittently, DNS resolution hiccups (e.g., propagation delays or temporary outages) cause the tunnel to drop during rekeying or reconnection.
              On restart, Phase 1 authentication fails with auth errors because the local side's resolved IP (from the FQDN) no longer matches the peer's current IP. The tunnel stays down until I manually intervene (e.g., flush DNS cache, restart IPsec or access the remote side and force dyndns to update again).

              15 seconds may sound like a rather aggressive TTL for the records, but that's the DNS provider's problem in the first place. With these timers I don't have any issues with dynamic DNS updates or with VTIs staying down and needing manual intervention. I mainly have DSL and fiber connections to deal with, so I can't tell how Starlink connections behave in that context.

              Not all providers allow such aggressive timers, so the script would be an alternative, if a low TTL option isn't available.

              • M Offline
                mcury Rebel Alliance @Averlon
                last edited by

                @Averlon Indeed.
                There are valid use cases for both options.

                Thanks for the feedback 👍

                Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.