VPN IPsec Remote gateway using DDNS doesn't update



  • I'm using 2.0 snapshot 2.0-BETA4  (i386) built on Wed Nov 3 14:11:01 EDT 2010

    I have several VPN IPsec site-to-site tunnels that the Remote gateway are using FQDN which in turn are DDNS.  One of the remote site has changed it's IP address.  Although the FQDN reflect the changes but Pfsense didn't update the IP address and hence the tunnel is broken.

    I need to go to VPN -> IPsec -> Edit Phase 1 of the tunnel, disabled and re-enabled it to establish again.

    -Raylund



  • I'm using 2.0-BETA4  (i386) built on Tue Dec 21 15:02:48 EST 2010. I have the exact same problem. When the remote site's dynamic IP address gets changed, the tunnel still seems to be associated with the old IP (you can see it in the ipsec logs) even though the dyndns hostname resolves to the new address. Just like Raylund, the only way to fix it seems to be to disable phase 1 and then re-enable it.



  • This has happened twice today already. Here is an example of some of the log messages showing the OLD dyndns IP address:

    racoon: [Unknown Gateway/Dynamic]: ERROR: pfkey DELETE received: ESP <static ip="" of="" local="" endpoint="">[500]-><the old="" dyndns="" ip="" address="" of="" remote="" endpoint="">[500] spi=...
    racoon: INFO: IPsec-SA request for <the old="" dyndns="" ip="" address="" of="" remote="" endpoint=""> queued due to no phase1 found
    racoon: [Unknown Gateway/Dynamic]: INFO: initiate new phase 1 negotiation: <static ip="" of="" local="" endpoint="">[500]<=><the old="" dyndns="" ip="" address="" of="" remote="" endpoint="">[500]
    racoon: [Unknown Gateway/Dynamic]: ERROR: phase2 negotiation failed due to time up waiting for phase1\. ESP <the old="" dyndns="" ip="" address="" of="" remote="" endpoint="">[0]-><static ip="" of="" local="" endpoint="">[0]</static></the></the></static></the></the></static>
    

    Rebooting the remote endpoint (the one with the dynamic IP) does not fix it. Clicking the yellow button on the pfSense IPSec status page on the local endpoint does not fix it. Deleting the SAD entries for the old address does not fix it. The only way I found to fix it was to disable/renable phase 1. It seems like racoon might be caching the old IP address even though the Remote gateway setting is set to the dyndns host name, not the ip.



  • @cyboc:

    It seems like racoon might be caching the old IP address even though the Remote gateway setting is set to the dyndns host name, not the ip.

    There needs to be some caching of the DNS entry.

    Is it supposed to work with host names whose IP address changes? What should cause a refresh of the DNS entry?

    A few minutes with google didn't turn up anything definitive but that may be because I don't know enough about IPSEC VPNs to type an intelligent search string. I did turn up a couple of example configurations which all used IP addresses rather than host names.



  • Try a snapshot from tomorrow and put the name in the phase1 rather than the ip so the utility in pfSense which monitors these DDNS names can reload the tunnel when that changes.



  • How long is it likely to take this utility to notice the change?

    I'm asking to help people set appropriate expectations. My production pfSense polls at 5 minute intervals myip.dnsomatic.com to see if its public IP address has changed and if so updates my dynamic DNS registration. That update could take some time to trickle through the name servers and then this utility needs to notice the IP address for the name has changed.

    Seems to me there is probably some scope in there for some tuning if you want to minimise VPN down time on IP address change.

    In an "average" case when the VPN does recover from the IP address change there is probably enough time for a user to get hot and bothered over the down time.



  • Ermal, thanks for your reply.

    @ermal:

    Try a snapshot from tomorrow and put the name in the phase1 rather than the ip so the utility in pfSense which monitors these DDNS names can reload the tunnel when that changes.

    Two follow up questions:

    1. Are you are referring to the "Remote Gateway" field in phase1? If so, it is already configured with the dyndns host name, not the IP address.

    2. Is the utility you are talking about the same one that monitors dynamic host names in Firewall Aliases?



  • @wallabybob:

    How long is it likely to take this utility to notice the change?

    If its the same one that is used for monitoring dynamic host names in Firewall Aliases, then I believe it is maximum 5 minutes between checks. In my case, my dyndns host has a TTL of 60 seconds. If the utility checked just before the TTL expired, I suppose it could take up to maybe 10 minutes for the utility to see the new IP (if my math is correct).

    In my case, even after a couple of hours, the VPN was still seeing the old IP address.



  • No its another one.
    dnswatch its called. You should have it running if you specified a name there.
    By default it polls every 60secs.
    The latest one should even leave a log entry saying " ….. reloading tunnel..."



  • Okay I tried with 2.0-BETA5  (i386) built on Mon Jan 24 07:08:15 EST 2011 but the tunnel is still going down when the dyndns address changes. It's probably something that I have misconfigured. I'll provide settings below and perhaps someone can tell me where I have gone wrong.

    I am creating a site-to-site ipsec tunnel between a static IP on one side and a dynamic IP on the other side. Attached are the screenshots of phase 1 settings on the static side and the dynamic side.

    When I first create the tunnel, it does come up. Then when the IP address of the dynamic side changes and the dyndns record is updated to point to the new address, the tunnel goes down and never comes back up. Furthermore, I do not see the message "reloading tunnel" in any of the logs, even after several minutes. Note that dig on both sides resolves the dyndns hostname to the new IP address after the 1 minute TTL expires, which is expected.

    After the dynamic IP changes, here is what the ipsec log looks like on the static side:

    Jan 25 10:47:59 	racoon: INFO: DPD: remote (ISAKMP-SA spi=a381e4e883deda80:309a4ee047e84806) seems to be dead.
    Jan 25 10:47:59 	racoon: INFO: DPD: remote (ISAKMP-SA spi=34f90f64acc71194:d3be4f7d53336652) seems to be dead.
    Jan 25 10:48:00 	racoon: [Unknown Gateway/Dynamic]: INFO: ISAKMP-SA deleted <static-ip>[500]-<old-dynamic-ip>[500] spi:a381e4e883deda80:309a4ee047e84806
    Jan 25 10:48:00 	racoon: [Unknown Gateway/Dynamic]: INFO: ISAKMP-SA deleted <static-ip>[500]-<old-dynamic-ip>[500] spi:34f90f64acc71194:d3be4f7d53336652</old-dynamic-ip></static-ip></old-dynamic-ip></static-ip>
    

    And here is what the ipsec log looks like on the dynamic side after the IP changes:

    Jan 25 10:47:30 	racoon: [Self]: INFO: <new-dynamic-ip>[500] used as isakmp port (fd=18)
    Jan 25 10:47:30 	racoon: INFO: <new-dynamic-ip>[500] used for NAT-T
    Jan 25 10:47:30 	racoon: [Self]: INFO: 172.16.3.1[500] used as isakmp port (fd=19)
    Jan 25 10:47:30 	racoon: INFO: 172.16.3.1[500] used for NAT-T
    Jan 25 10:47:30 	racoon: INFO: unsupported PF_KEY message REGISTER
    Jan 25 10:47:33 	racoon: ERROR: bind 1 (Can't assign requested address)
    Jan 25 10:47:33 	racoon: ERROR: sendfromto failed
    Jan 25 10:47:59 	racoon: INFO: DPD: remote (ISAKMP-SA spi=a381e4e883deda80:309a4ee047e84806) seems to be dead.
    Jan 25 10:47:59 	racoon: INFO: DPD: remote (ISAKMP-SA spi=34f90f64acc71194:d3be4f7d53336652) seems to be dead.
    Jan 25 10:48:00 	racoon: [TestToLbg]: INFO: ISAKMP-SA deleted <old-dynamic-ip>[500]-<static-ip>[500] spi:a381e4e883deda80:309a4ee047e84806
    Jan 25 10:48:00 	racoon: [TestToLbg]: INFO: ISAKMP-SA deleted <old-dynamic-ip>[500]-<static-ip>[500] spi:34f90f64acc71194:d3be4f7d53336652</static-ip></old-dynamic-ip></static-ip></old-dynamic-ip></new-dynamic-ip></new-dynamic-ip>
    

    I kept watching the logs for several minutes and there never appeared to be any attempt to re-establish the tunnel with the new IP. In fact, there were no new log messages at all after the ones listed above.

    What have I done wrong?






  • I wonder if dnswatch has crashed? When I first create the tunnel and bring it up, I notice the following files get created on the static side of the vpn:

    • /var/etc/dnswatch-ipsec.hosts

    • /var/run/dnswatch-ipsec.pid

    The file dnswatch-ipsec.hosts contains the correct dyndns hostname that I entered in the Remote gateway setting of the tunnel. I am guessing that the dnswatch-ipsec.pid file is supposed to contain the pid of the dnswatch process that is supposed to be checking for dynamic IP changes. However, when I use ps to look up that pid, I don't find it. Actually, I don't find any process named dnswatch (or even similar name) in either ps or http://<myrouterip>/status.php#Processes. So I think that maybe it runs once and then either quits or crashes?</myrouterip>



  • @cyboc:

    I wonder if dnswatch has crashed? When I first create the tunnel and bring it up, I notice the following files get created on the static side of the vpn:

    • /var/etc/dnswatch-ipsec.hosts

    • /var/run/dnswatch-ipsec.pid

    Forgot to mention, here are the (obfuscated) contents of dnswatch-ipsec.hosts:

    <obfuscated>.dyndns.org=value</obfuscated>
    

    Is the right hand side supposed to be "value" or is it supposed to be the IP address that the hostname resolves to?



  • Try running it by hand
    dnswatch /var/run/dnswatch-ipsec.pid 60 /etc/rc.newipsecdns /var/etc/dnswatch-ipsec.hosts

    See if you get any entry on log or if it runs normally.



  • Hi Ermal,

    I tried running dnswatch by hand. Here are the results:

    • no error messages in console

    • no error messages in logs

    • the pid changes in the file /var/run/dnswatch-ipsec.pid

    • the /var/etc/dnswatch-ipsec.hosts still has the same contents as above (i.e. <obfuscated>.dyndns.org=value)</obfuscated>

    • the tunnel does not reload

    • dnswatch not showing in process list



  • Ermal,

    I did some more troubleshooting to see if I can help.

    I double-checked the logs and still find no errors when I run dnswatch with the exact parameters you gave me above. To make sure errors are in fact being logged, I next tried passing bad parameters to dnswatch such as a negative interval or bad hostfile name. As expected, when I pass bad parameters to dnswatch, errors do get logged (e.g. "Invalid interval" and "unable to open configuration file"). So dnswatch is correctly logging errors.

    I looked over the dnswatch.c code from git. Perhaps this code is having no effect?:

    /* go into background */
    if (daemon(0, 0) == -1)
    	exit(1);
    

    Obviously, it's not exiting here otherwise I wouldn't get some of the bad parameter error messages that are generated below that line of code. But maybe, for some reason, it's not going into the background despite daemon() returning a good return code? Dunno. Just grasping at straws here.

    Anyway, the logs never show "DNS lookup for host %s failed" or "hostname %s ip changed to %s, reloading ipsec tunnel."

    It seems that dnswatch just exits prematurely for some unknown reason.



  • Hi Ermal,

    I noticed you've made some recent commits involving replacing dnswatch with filterdns, which I assume is in response to this thread. Thank you for that. :-)

    I wasn't sure if you were still working on these changes but I thought I would try today's build anyway (2.0-BETA5  (i386) built on Fri Jan 28 05:30:15 EST 2011).

    In the system log, I can see an attempt to startup filterdns but it fails with the following errors:

    Jan 28 16:04:54 	filterdns: unable to open configuration file
    Jan 28 16:04:54 	php: : The command '/usr/local/sbin/filterdns -p /var/run/filterdns-ipsec.pid -i 60 -c /var/etc/filterdns_ipsec.hosts -d 1' returned exit code '71', the output was 'filterdns: open file'
    

    I think the problem is that filterdns is trying to open the config file /var/etc/filterdns_ipsec.hosts but that file doesn't exist. The closest file name that exists in that directory is filternds-ipsec.hosts:

    # ls /var/etc/ | grep filter
    filternds-ipsec.hosts
    

    Notice the two differences between the expected file name and the actual file name:

    1. "nds" instead of "dns"
    2. "-" instead of "_"

    I hope that helps. I'm glad to help test this some more if you make some more commits.



  • Thank you for noticing that.
    The fix is simple https://rcs.pfsense.org/projects/pfsense/repos/mainline/commits/44ab93a4188b7530425ec74b5f832b5ab33fd1c4 if you do not want to wait for a snapshot.



  • Hi Ermal,

    Thank you for your quick response to my previous message.

    I tried 2.0-BETA5  (i386) built on Mon Jan 31 07:16:37 EST 2011 but I still see this filterdns error in the system logs:

    Jan 31 09:25:31 	filterdns: unable to open configuration file
    Jan 31 09:25:31 	php: : The command '/usr/local/sbin/filterdns -p /var/run/filterdns-ipsec.pid -i 60 -c /var/etc/filterdns-ipsec.hosts -d 1' returned exit code '71', the output was 'filterdns: open file'
    

    It appears that you have fixed the "_" vs "-" error but the "nds" vs "dns" error is still there. The command in the log message above expects to find the file filterdns-ipsec.hosts but the actual file in /var/etc is spelled like this: filternds-ipsec.hosts. Note the transposition of the letters "n" and "d".

    # ls /var/etc/ | grep filter
    filternds-ipsec.hosts
    

    I will continue to test this in new builds and I will report back here.


  • Rebel Alliance Developer Netgate

    Should be fixed in new snaps.



  • Thanks Jimp,

    I also just noticed another error message in the build 2.0-BETA5  (i386) built on Mon Jan 31 07:16:37 EST 2011. I don't know if it is already fixed in the next snaps but I thought I should report it anyway.

    When I uncheck "Enable IPSec", hit Save, then check "Enable IPSec" and hit Save again, I see the following error message at the top of the page:

    Warning: Invalid argument supplied for foreach() in /etc/inc/vpn.inc on line 908
    

    You can see it in the attached screenshot.

    Here is the code in that spot, it it helps:

    
    /* start filterdns, if necessary */
    if (count($filterdns_list) > 0) {
      $interval = 60;
      if (!empty($ipseccfg['dns-interval']) && is_numeric($ipseccfg['dns-interval']))
        $interval = $ipseccfg['dns-interval'];
    
      $hostnames = "";
      array_unique($filterdns_list);
      foreach ($hostname as $filterdns_list) ###### <---line 908
        $hostnames .= "cmd {$hostname} '/etc/rc.newipsecdns'\n";
      file_put_contents("{$g['varetc_path']}/filternds-ipsec.hosts", $hostnames);
    
      killbypid("{$g['varrun_path']}/filterdns-ipsec.pid");
      mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-ipsec.pid -i {$interval} -c {$g['varetc_path']}/filterdns-ipsec.hosts -d 1");
    }
    



  • Rebel Alliance Developer Netgate

    Fixed now.

    That should have been:

    foreach ($filterdns_list as $hostname)
    


  • Hi Jimp,

    Thanks for your very quick response. I applied your changes locally, in advance of the next snapshots. The error messages are now gone from the system log and from the IPSec config page. However, when the IP address on the dynamic side of the tunnel changes, the tunnel does not come back up. The change is detected by filterdns (because the system log shows "php: : IPSEC: One or more IPSEC tunnel endpoints has changed IP. Refreshing.") but the tunnel just does not come back up.

    Attached are a series of obfuscated screenshots from both the static side and the dynamic before and after the dyndns IP address change. The address of the dynamic side was changed from x.x.x.198 to x.x.x.195. The address of the static side is 216.x.x.x. The are a lot of screenshots so please pay close attention to the words "staticSide", "dynamicSide", "Before" and "After" in the file names to figure out which is which.

    (the screenshots are spread out over the next couple of posts because of attachment limits in this forum)

    It is interesting to note that after the address change, the racoon service on the static side stopped, as shown in the screenshot "staticSide-ServicesStatus-AfterDyndnsChange.PNG". To get the tunnel back up, I had to start the stopped racoon service on the static side and restart the running racoon service on the dynamic side.










  • more screeshots










  • even more screenshots














  • Last set of screenshots. Phew!












  • I'm encountering the same problem. Did you find a solution for it?

    Christof



  • @cegner:

    I'm encountering the same problem. Did you find a solution for it?

    Christof

    Did you try the latest firmware version?



  • Yes, but the problem still exists.

    2.0-RC1 (i386)
    built on Fri Mar 4 23:09:48 EST 2011



  • Please post again te system logs and ipsec logs



  • Here you go.

    At 04:00 the PPPoE connection was terminated. After the reconnect the IPSEC tunnel does not come up again (for hours) without a reload as the remote peer IP address has changed. It does not seem to lookup the hostname again.

    Mar  7 03:20:37 fw racoon: INFO: initiate new phase 2 negotiation: X.X.219.99[500]<=>X.X.197.44[500]
    Mar  7 03:20:37 fw racoon: INFO: IPsec-SA expired: ESP/Tunnel X.X.197.44[500]->X.X0.219.99[500] spi=143188784(0x888e330)
    Mar  7 03:20:37 fw racoon: INFO: IPsec-SA established: ESP X.X.219.99[500]->X.X.197.44[500] spi=201427959(0xc018bf7)
    Mar  7 03:20:37 fw racoon: INFO: IPsec-SA established: ESP X.X.219.99[500]->X.X.197.44[500] spi=13656620(0xd0622c)
    Mar  7 04:00:13 fw racoon: INFO: caught signal 15
    Mar  7 04:00:13 fw racoon: INFO: racoon process 30653 shutdown
    Mar  7 04:00:20 fw racoon: INFO: @(#)ipsec-tools 0.8.0.RC (http://ipsec-tools.sourceforge.net)
    Mar  7 04:00:20 fw racoon: INFO: @(#)This product linked OpenSSL 0.9.8n 24 Mar 2010 (http://www.openssl.org/)
    Mar  7 04:00:20 fw racoon: INFO: Reading configuration from "/var/etc/racoon.conf"
    Mar  7 04:00:20 fw racoon: INFO: X.X.200.246[4500] used for NAT-T
    Mar  7 04:00:20 fw racoon: INFO: X.X.200.246[4500] used as isakmp port (fd=19)
    Mar  7 04:00:20 fw racoon: INFO: X.X.200.246[500] used for NAT-T
    Mar  7 04:00:20 fw racoon: INFO: X.X.200.246[500] used as isakmp port (fd=20)
    Mar  7 04:00:20 fw racoon: INFO: unsupported PF_KEY message REGISTER
    Mar  7 04:00:20 fw racoon: ERROR: such policy already exists. anyway replace it: 192.168.1.254/32[0] 192.168.1.0/24[0] proto=any dir=out
    Mar  7 04:00:20 fw racoon: ERROR: such policy already exists. anyway replace it: 192.168.1.0/24[0] 192.168.1.254/32[0] proto=any dir=in
    Mar  7 04:00:20 fw racoon: ERROR: such policy already exists. anyway replace it: 192.168.1.0/24[0] 192.168.10.0/24[0] proto=any dir=out
    Mar  7 04:00:20 fw racoon: ERROR: such policy already exists. anyway replace it: 192.168.10.0/24[0] 192.168.1.0/24[0] proto=any dir=in
    Mar  7 04:00:29 fw racoon: INFO: IPsec-SA request for X.X.197.44 queued due to no phase1 found.
    Mar  7 04:00:29 fw racoon: INFO: initiate new phase 1 negotiation: X.X.200.246[500]<=>X.X.197.44[500]
    Mar  7 04:00:29 fw racoon: INFO: begin Aggressive mode.
    Mar  7 04:01:01 fw racoon: [X.X.197.44] ERROR: phase2 negotiation failed due to time up waiting for phase1. ESP X.X.197.44[0]->X.X.200.246[0]
    Mar  7 04:01:01 fw racoon: INFO: delete phase 2 handler.
    Mar  7 04:01:19 fw racoon: ERROR: phase1 negotiation failed due to time up. 1f856cf72bf9c322:0000000000000000
    Mar  7 04:04:55 fw racoon: INFO: IPsec-SA request for X.X.197.44 queued due to no phase1 found.
    Mar  7 04:04:55 fw racoon: INFO: initiate new phase 1 negotiation: X.X.200.246[500]<=>X.X.197.44[500]
    Mar  7 04:04:55 fw racoon: INFO: begin Aggressive mode.
    Mar  7 04:05:26 fw racoon: [X.X.197.44] ERROR: phase2 negotiation failed due to time up waiting for phase1. ESP X.X.197.44[0]->X.X.200.246[0]


Log in to reply