Netgate Discussion Forum

    pfSense <-> pfSense: IPsec Tunnels Losing Connectivity

    15 Posts 10 Posters 38.6k Views
      csnf

      Same issue here. I have 2.0 on the main IPsec tunnel endpoint and 1.2.3 on 8 other machines, and the tunnels randomly stop passing data. I have to restart racoon to get things working again. This only started after I upgraded to 2.0. I hope somebody can isolate this issue.

        Zeon

        Hey guys,
        Just to let you all know I'm going to try what was suggested in this thread:
        http://forum.pfsense.org/index.php/topic,41617.0.html

        So I'll disable NAT traversal (NAT-T) and dead peer detection (DPD) and see how that goes.

          jmarquez

          Hi all.

          Same frustrating problem here with 2 VPNs using pfSense 2.0.1 on all sides.

          I read in some post that this only happens from version 2.0 up, so I might downgrade to 1.2.3 as this issue makes the VPN connection unusable.

          Hope this is fixed soon.

          Regards,
          Jesus

            cmb

            @jmarquez:

            I read in some post that this only happens from version 2.0 up, so I might downgrade to 1.2.3 as this issue makes the VPN connection unusable.

            That's not true, it happens on occasion with every IPsec implementation on every device in the world. 2.0.x does not have any general IPsec problems. It's almost always related to misconfiguration. For the symptoms described here, the most common cause is mismatched lifetimes on P1 and/or P2, though at times it can be circumstances where you need DPD enabled.

            There isn't enough info here on any of the reported issues to troubleshoot, and every issue is likely a different cause, so if you're having issues please start your own thread with specifics - IPsec logs from both sides in particular.

            Zeon - this one's your thread, post your IPsec logs from the other end. The bit shown here just shows one end renegotiated successfully.
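The mismatch check described in this post can be sketched in a few lines of Python. This is only an illustration: `TunnelEnd` and `find_mismatches` are hypothetical names, not part of pfSense or racoon, and the values would have to be copied by hand from the Phase 1 and Phase 2 settings on each side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelEnd:
    """Hypothetical summary of one side's IPsec settings (not a pfSense API)."""
    name: str
    p1_lifetime: int   # Phase 1 lifetime in seconds
    p2_lifetime: int   # Phase 2 lifetime in seconds
    dpd_enabled: bool


def find_mismatches(a: TunnelEnd, b: TunnelEnd) -> list[str]:
    """Return a description of every compared setting that differs between the ends."""
    problems = []
    for field, label in (
        ("p1_lifetime", "Phase 1 lifetime"),
        ("p2_lifetime", "Phase 2 lifetime"),
        ("dpd_enabled", "DPD enabled"),
    ):
        left, right = getattr(a, field), getattr(b, field)
        if left != right:
            problems.append(f"{label}: {a.name}={left} vs {b.name}={right}")
    return problems


# Made-up example values, chosen so that two of the three settings differ.
site_a = TunnelEnd("site-a", p1_lifetime=28800, p2_lifetime=3600, dpd_enabled=True)
site_b = TunnelEnd("site-b", p1_lifetime=86400, p2_lifetime=3600, dpd_enabled=False)
for line in find_mismatches(site_a, site_b):
    print(line)
```

An empty result only means these three settings agree; it says nothing about proposals, identifiers, or anything else that the logs from both sides would reveal.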

              jmarquez

              Don't get me wrong cmb.

              I'm really happy using pfSense. I think that it is a great piece of code.
              I agree with you that every person's IPsec issue is likely different. Mine is similar to the ones reported in this thread only in that the tunnels drop randomly.

              In my particular case, I followed the steps from the thread Zeon linked (http://forum.pfsense.org/index.php/topic,41617.0.html) and the tunnels have not dropped so far.

              All the best.

                Zeon

                Hi CMB,
                Firstly, I can say that after a few days with DPD and NAT-T disabled I have had no further dropouts and couldn't be happier. This is true across 6 separate tunnels, some with latency of 1ms and others as high as 30ms (throughput of the internet connections is anywhere between 30 Mbps and 100 Mbps).

                Unfortunately I don't have the logs of the problem anymore but will try to recreate them one weekend for the benefit of the other users on here.

                Out of interest, when is DPD needed? I have had situations where I knocked a cable out for up to 10 seconds and the tunnel still seemed to work fine once I plugged it back in.

                  cmb

                  @Zeon:

                  Firstly, I can say that after a few days with DPD and NAT-T disabled I have had no further dropouts and couldn't be happier. This is true across 6 separate tunnels, some with latency of 1ms and others as high as 30ms (throughput of the internet connections is anywhere between 30 Mbps and 100 Mbps).

                  Disabling NAT-T where you don't need it is a good thing to do. For DPD, as long as it's enabled on both sides with the same settings you should be good. That's what we use on all of ours internally.

                  @Zeon:

                  Unfortunately I don't have the logs of the problem anymore but will try to recreate them one weekend for the benefit of the other users on here.

                  Out of interest, when is DPD needed? I have had situations where I knocked a cable out for up to 10 seconds and the tunnel still seemed to work fine once I plugged it back in.

                  DPD helps in circumstances where one end drops an SA and the other end doesn't recognize that the SA is no longer valid; without it you have to force a restart of one or both ends. That may be a reboot on one side or the other (primarily an unplanned one like a power outage or yanking the plug; an orderly reboot should tell the other end to clear the SA), or an IP change on one of the sides where there are dynamic WANs. Those are the two most common cases that I can think of offhand. Just knocking a cable out for a few seconds or even minutes is no big deal, unless you happen to get a new IP when it's reconnected (with dynamic WANs, the link coming up forces a reconnect to your ISP, which with some ISPs will get you a new IP). If you still have the same IP, the existing SA is still valid and will work fine.
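To make the DPD behaviour described above concrete, here is a minimal Python sketch of a simplified detection model: a probe is sent every `delay` seconds, and the peer is declared dead after `max_fail` consecutive unanswered probes. The function name and the default values are illustrative assumptions, and this is not racoon's actual implementation.

```python
def dpd_detection_time(probe_results, delay=10, max_fail=5):
    """Return the time in seconds at which the peer is declared dead, or None.

    probe_results: iterable of booleans, one per probe, True if the peer answered.
    A probe is sent every `delay` seconds, and `max_fail` consecutive unanswered
    probes mark the peer as dead. Simplified model, not racoon's actual logic.
    """
    failures = 0
    for index, answered in enumerate(probe_results, start=1):
        # Any answer resets the counter; only consecutive misses count.
        failures = 0 if answered else failures + 1
        if failures >= max_fail:
            return index * delay
    return None


# Peer answers three probes, then loses power and never answers again.
power_outage = [True] * 3 + [False] * 5
# Cable knocked out briefly: a single probe is missed, then answers resume.
short_glitch = [True, False, True, True]

print(dpd_detection_time(power_outage))  # 80
print(dpd_detection_time(short_glitch))  # None
```

In this model a short cable pull never reaches the failure threshold, which matches the point that a brief outage with an unchanged IP leaves the existing SA usable.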

                    maldex

                    Stumbling across this thread reminds me of the same issue I had a while ago, which was quite annoying, including against Astaro 8.2.
                    I wouldn't swear to it, but cross-checking my config now, one of the configuration changes left over from the performance tests we did quite a while ago (<v2.01) is that we're using Blowfish in Phase1 now. It never happened again, so I completely forgot about this. I'm using my 2.01 (dyn-IP) now against both pfsense 2.01 (also dyn-IP) and Astaro V8.3 (fixed-IP):

                    • All have public IPs (no NAT involved, NAT Traversal disabled)
                    • Default Mutual PSK, Main mode (btw i thought this cannot work with ipsec by definition? well done!!! :)) , My & Peer IP Address, Default Policy Gen. and Proposal Checking.
                    • Phase1:
                      – Encryption algorithm:  Blowfish 256
                      – Hash algorithm: SHA1
                      – DH key group: 5   and  Lifetime: 86400
                      – DPD: Enabled, 10 Detection and 5 retries
                    • Phase2:
                      – Encryption algorithms: AES 256  (Only this, no other proposal)
                      – Hash algorithms: MD5  (Only this, no other proposal)
                      – PFS key group: 5 and  Lifetime: 86400.  Auto Ping remote Host is set

                    Yes, the encryption and hashing are not the same in phase 1 and 2, but even the tunnel with 2x phase2 works stably now. Sorry, can't provide more details.

                    I'll let you guys know if I encounter a 'stalled' vpn again.

                    cheers
                    Josh

                      boogieshafer

                      on the pfsense side, try setting the P1 Policy Generation to "unique"

                      i was having similar issues on subsequent reconnects with the Shrew client, where restarting the pfsense ipsec process would clear the issue

                      i did NOT need to disable NAT-T or DPD, just changing the P1 Policy Generation setting from "default" to "unique" was the only change i made

                        dhatz

                        It seems that several people are reporting IPsec VPN issues with pfsense 2.x (note: which includes the recent ipsec-tools 0.8.0). While some problems may be due to misconfiguration (e.g. the racoon / mpd conflict), the pfsense<->pfsense VPN scenario should be trouble-free.

                        As most of the problems posted here seem to be related to rekeying,  I've been searching the ipsec-tools-devel mailing lists for clues. Check the following discussions:

                        http://old.nabble.com/why-is-SA-lifetime-kilobyte-limit-disabled-in-racoon–td31648198.html

                        Even if Node-A thinks the IPsec-SA is expired at this time, Node-B doesn't
                        think so, i.e. the states of the IPsec-SA are mismatched.

                        Understand – similar things already happen with time-based
                        lifetimes if there is a clock skew between the two boxes.
                        (This is particularly bad if the oldest available SA is used
                        by the kernel.)

                        Racoon's strategy of rekeying is "Initiator do it." If Node-B
                          is responder, Node-A doesn't start rekeying even if IPsec-SA is
                          expired.
                        That sounds like a bug in racoon.  It seems that if either end is
                        unsatisfied with the SA, that end should trigger a new one.

                        I'd also call this a shortcoming at least. The standards are
                        weak, and one doesn't know how other implementations behave.
                        It would be safer if both sides did care about renegotiations.

                        But the key
                        question is what the other implementations do, and what the standard says.

                        I've just tried OpenBSD's isakmpd (the oldish version in pkgsrc).
                        It initiates a Phase 2 exchange if the soft timeout on its
                        side expires, even if it was responder initially. (It randomizes
                        the soft timeouts to minimize the chance that both sides start
                        the exchange simultaneously.)
                        RFC 2409 says that both sides can initiate rekeying. "Can" --
                        this is not much of a guideline for implementors.
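The randomized soft timeout mentioned in this excerpt can be illustrated with a short Python sketch. The 80-95% window, the function name, and the seeds are assumptions made for the example; they are not taken from isakmpd or racoon.

```python
import random


def soft_timeout(hard_lifetime, rng, low=0.80, high=0.95):
    """Pick a soft (rekey) timeout as a random fraction of the hard lifetime.

    The 80-95% window is an illustrative assumption, not isakmpd's actual values.
    """
    return hard_lifetime * rng.uniform(low, high)


# Each side draws its own soft timeout; seeds are fixed only for repeatability.
rng_a = random.Random(1)
rng_b = random.Random(2)
lifetime = 3600  # Phase 2 hard lifetime in seconds

a = soft_timeout(lifetime, rng_a)
b = soft_timeout(lifetime, rng_b)
first = "A" if a < b else "B"
print(f"A rekeys at {a:.0f}s, B at {b:.0f}s; {first} initiates first")
```

Because the two sides draw independently, whichever side hits its soft timeout first starts the Phase 2 exchange, and the chance of both starting in the same instant is small.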

                        I can see the argument that especially with a 24h or less
                        lifetime, AES doesn't need volume-based rekeying.

                        OK, I was more concerned about interoperability. What if
                        the other side insists in some volume limit?

                        I've just tried OpenBSD's isakmpd (the oldish version in pkgsrc).
                        It initiates a Phase 2 exchange if the soft timeout on its
                        side expires, even if it was responder initially. (It randomizes
                        the soft timeouts to minimize the chance that both sides start
                        the exchange simultaneously.)
                        RFC 2409 says that both sides can initiate rekeying. "Can" --
                        this is not much of a guideline for implementors.

                        True, but it seems the original responder initiating a renegotiation is
                        the only reasonable behavior.

                        At the very least, it would appear to suggest that if the original
                        initiator rejects an attempt on the part of the original responder to
                        rekey, that's a bug.

                        True, but it seems the original responder initiating a renegotiation is
                        the only reasonable behavior.

                        If both sides start rekeying at the same time, there is/was a problem of
                        SA selection.

                        The two rekeying sessions make two pairs of IPsec-SAs. racoon can
                        do this, and IPsec implementations (kernel side) do one of the following:

                        a. Use oldest IPsec-SA to send and keep all IPsec-SAs to receive(KAME)
                        b. Use newest IPsec-SA to send and keep all IPsec-SAs to receive(Fast IPsec)
                        c. Use newest IPsec-SA to send/receive and purge older IPsec-SAs

                        Of course, c. is bad behavior, but small implementations (kernel side)
                        may handle only one session and one key pair at a time.
                        Standards don't prohibit this. This problem exists between the IKE
                        standards and the IPsec standards. It seems IKEv2 makes this cleaner.

                        Today, most implementations select b. or have a configuration option for it.
                        And racoon isn't used on anything other than KAME, Fast IPsec, or Linux (a. or b.)
                        I think your logic actually works fine. But racoon is an old product,
                        so it hasn't caught up with recent trends.
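The three kernel behaviours (a, b, c) listed in this excerpt can be modelled with a small Python sketch. It is a deliberate simplification: real SAs are per direction, and `SA`, `select_outbound`, and `accepted_inbound` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SA:
    spi: int           # security parameter index identifying the SA
    created_at: float  # creation time in seconds


def select_outbound(sas, policy):
    """Choose the SA used for sending: 'oldest' models (a), 'newest' models (b) and (c)."""
    if policy == "oldest":
        return min(sas, key=lambda sa: sa.created_at)
    if policy == "newest":
        return max(sas, key=lambda sa: sa.created_at)
    raise ValueError(f"unknown policy: {policy}")


def accepted_inbound(sas, purge_older):
    """Return the SAs still accepted for receiving; purge_older=True models (c)."""
    if purge_older:
        return [max(sas, key=lambda sa: sa.created_at)]
    return list(sas)


# Two SAs created by near-simultaneous rekeying from both sides.
sas = [SA(spi=0x1001, created_at=100.0), SA(spi=0x1002, created_at=100.5)]

peer_sends_on = select_outbound(sas, "oldest")       # peer behaves like (a)
we_accept = accepted_inbound(sas, purge_older=True)  # we behave like (c)
print(peer_sends_on in we_accept)  # False: traffic on the older SA is dropped
```

With behaviours (a) and (b) both sides keep every SA for receiving, so either choice of outbound SA is still accepted; only the purge in (c) makes the combination fail.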

                        http://marc.info/?l=ipsec-tools-devel&m=129905181832157&w=2
                        http://marc.info/?l=ipsec-tools-devel&m=129916127621017&w=2

                        let me revive the discussion on an active negotiation,
                        as opposed to a passive daemon. Until recently my use
                        of IPsec was tied to isakmpd, ipsecctl, and OpenBSD
                        and my views are conditioned by this fact. There the
                        IPsec daemon is normally active in initiating its
                        negotiations at startup, unless told to configure
                        a passive listener for a particular tunnel/transport.
                        At the other extreme there is even a so called
                        active-only setting.

                        The implicit and default setting in racoon-0.7.3 is
                        "passive off", but this still waits for a demand to be
                        detected. Thus the mode is better described as "passive
                        until harshly bugged to get going"! The need to ping
                        and wait for a ridiculously long delay should not be
                        acceptable in most circumstances. Forgive me for the
                        criticism, but to me this is a design flaw. It is a
                        question of dependability and of trust to erect the
                        desired IPsec tunnels already at booting time.

                        Funny: when we tried to switch from racoon to isakmpd at work, a long
                        long time ago, this is one of the things we noticed on our TODO list:
                        patch isakmpd to negotiate SAs only when traffic comes to the tunnel :-)

                        And this is how things should (can ?) be done according to RFC 2367
                        which provide SADB_ACQUIRE PFkey message….

                        Now, doing comparative browsing in the sources 0.7.3
                        and 0.8, the actual use of the variable PASSIVE in
                        "struct remoteconf" has indeed expanded somewhat.
                        Is the code progressing or maturing into a state
                        that allows an actively negotiating daemon? I.e.,
                        without waiting for traffic demand before commencing?

                        Not afaik.
                        Feel free to provide a patch for that. It would not be so
                        complicated to parse all config and start negotiation for needed
                        tunnels, but there are also setups where we want to have tunnels
                        negotiated only when needed (so when traffic comes to the tunnel), so
                        a patch will need to provide this feature as optional.
                        The best would be to have a peer-based (or sainfo based ?) token for
                        that.

                        Please also note that it is quite easy to generate dummy
                        traffic for the needed tunnels when you activate the configuration if
                        you want.
                        And of course you can generate dummy traffic from time to time to ensure the
                        tunnel will always be up.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.