Netgate Discussion Forum

    IPSEC + VTI + IKEV2 - will not auto-reconnect

    26 Posts 8 Posters 5.2k Views
    • B
      bbrendon @bbrendon
      last edited by

      @Derelict
      This is still a constant battle for me. The tunnel is down now, and I tried pinging the remote tunnel IP in pfsense using the VTI interface as the source address.

      The log is below.

      May 31 19:30:12	charon	65059	14[CFG] trap not found, unable to acquire reqid 1000
      May 31 19:30:12	charon	65059	01[KNL] creating acquire job for policy 1.1.123.153/32|/0 === 2.2.142.61/32|/0 with reqid {1000}
      May 31 19:30:09	charon	65059	01[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found
      May 31 19:30:09	charon	65059	01[NET] <con3000|181> sending packet: from 1.1.123.153[500] to 50.196.146.217[500] (80 bytes)
      May 31 19:30:09	charon	65059	01[ENC] <con3000|181> generating INFORMATIONAL response 1040 [ ]
      May 31 19:30:09	charon	65059	01[ENC] <con3000|181> parsed INFORMATIONAL request 1040 [ ]
      May 31 19:30:09	charon	65059	01[NET] <con3000|181> received packet: from 50.196.146.217[500] to 1.1.123.153[500] (80 bytes)
      May 31 19:30:08	charon	65059	09[CFG] vici client 2831 disconnected
      May 31 19:30:08	charon	65059	09[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 out failed, not found
      May 31 19:30:08	charon	65059	09[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found
      May 31 19:30:08	charon	65059	09[CFG] vici client 2831 requests: list-sas
      May 31 19:30:08	charon	65059	12[CFG] vici client 2831 registered for: list-sa
      May 31 19:30:08	charon	65059	09[CFG] vici client 2831 connected
      May 31 19:30:06	charon	65059	15[CFG] trap not found, unable to acquire reqid 1000
      May 31 19:30:06	charon	65059	12[KNL] creating acquire job for policy 1.1.123.153/32|/0 === 2.2.142.61/32|/0 with reqid {1000}
      May 31 19:30:02	charon	65059	12[CFG] vici client 2830 disconnected
      May 31 19:30:02	charon	65059	12[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 out failed, not found
      May 31 19:30:02	charon	65059	12[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found
      May 31 19:30:02	charon	65059	12[CFG] vici client 2830 requests: list-sas
      May 31 19:30:02	charon	65059	12[CFG] vici client 2830 registered for: list-sa
      May 31 19:30:02	charon	65059	15[CFG] vici client 2830 connected
      May 31 19:30:00	charon	65059	06[CFG] vici client 2829 disconnected
      May 31 19:30:00	charon	65059	11[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 out failed, not found
      May 31 19:30:00	charon	65059	11[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found
      May 31 19:30:00	charon	65059	11[CFG] vici client 2829 requests: list-sas
      May 31 19:30:00	charon	65059	11[CFG] vici client 2829 registered for: list-sa
      May 31 19:30:00	charon	65059	06[CFG] vici client 2829 connected
      May 31 19:30:00	charon	65059	09[CFG] trap not found, unable to acquire reqid 1000
      May 31 19:30:00	charon	65059	11[KNL] creating acquire job for policy 1.1.123.153/32|/0 === 2.2.142.61/32|/0 with reqid {1000}
      May 31 19:29:59	charon	65059	11[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found
      

      Any help would be great.
      tia.
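The pattern in the log above (each kernel acquire followed immediately by "trap not found") can be tallied with a short script. This is a minimal sketch, assuming the tab-separated layout shown in the post (timestamp, daemon, PID, message). The name `summarize_charon_log` is made up for illustration and is not part of pfSense or strongSwan.

```python
import re
from collections import Counter

# Matches the message part of a charon line, e.g. "14[CFG] trap not found, ...".
# Layout assumption: two-digit thread number, three-letter subsystem in
# brackets, an optional "<connection|id>" tag, then free text.
MSG_RE = re.compile(r"(\d{2})\[([A-Z]{3})\]\s+(?:<[^>]+>\s+)?(.*)")

def summarize_charon_log(lines):
    """Count acquire attempts and 'trap not found' failures in charon log lines."""
    counts = Counter()
    for line in lines:
        match = MSG_RE.search(line)
        if not match:
            continue
        _, subsystem, text = match.groups()
        if subsystem == "KNL" and text.startswith("creating acquire job"):
            counts["acquire_jobs"] += 1
        elif subsystem == "CFG" and text.startswith("trap not found"):
            counts["trap_not_found"] += 1
    return counts

# Three lines copied from the log in the post above.
sample = [
    "May 31 19:30:12\tcharon\t65059\t14[CFG] trap not found, unable to acquire reqid 1000",
    "May 31 19:30:12\tcharon\t65059\t01[KNL] creating acquire job for policy 1.1.123.153/32|/0 === 2.2.142.61/32|/0 with reqid {1000}",
    "May 31 19:30:09\tcharon\t65059\t01[KNL] <con3000|181> querying policy 0.0.0.0/0|/0 === 0.0.0.0/0|/0 in failed, not found",
]
print(summarize_charon_log(sample))
```

If the two counters stay equal over a longer capture, every on-demand attempt is being rejected rather than retried, which narrows the problem to the missing trap policy rather than to packet loss.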

      • B
        bbrendon @bbrendon
        last edited by

        The only thing I could find is the keyingtries setting [0], which maybe should be set to forever? This problem seems to occur mostly when a connectivity issue lasts more than 5-15 minutes or so.

        The issue, though, is that I don't see a way to set it in pfSense.

        [0] https://wiki.strongswan.org/projects/strongswan/wiki/connsection

        • B
          BarronC
          last edited by BarronC

          Was there ever any resolution on this? I have the same problem when the Internet drops. This usually happens when Starlink does a firmware upgrade early in the morning. In the morning I see the VPN is down and I have to click reconnect, even when there is interesting traffic.

          • dotdashD
            dotdash @BarronC
            last edited by

            @barronc
            I've also come across this issue with VTI tunnels not reconnecting after an outage. I used the script referenced here: https://www.reddit.com/r/PFSENSE/comments/ceg1qb/ipsec_site_to_site_no_auto_restart/ as a starting point. I created a simpler script and run it via cron before work starts in the office for the day, and periodically throughout the day.

            • B
              bbrendon @dotdash
              last edited by

              @dotdash Not resolved here. Thanks for the link.

              • jimpJ
                jimp Rebel Alliance Developer Netgate
                last edited by

                On one end, set Child SA Close Action to Restart/Reconnect. Do not set it on both sides or you'll likely end up with duplicate child SAs due to collisions in negotiation.

                VTI cannot be triggered on demand because VTI does not support trap policies.

                Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                Need help fast? Netgate Global Support!

                Do not Chat/PM for help!

                • B
                  bbrendon @jimp
                  last edited by

                  @jimp said in IPSEC + VTI + IKEV2 - will not auto-reconnect:

                  On one end, set Child SA Close Action to Restart/Reconnect.

                  Yes, but that setting doesn't solve the reconnect issue.

                  • jimpJ
                    jimp Rebel Alliance Developer Netgate
                    last edited by

                    That's exactly what it does in my testing. It will keep trying to reconnect.

                    Perhaps there is some other edge case I'm not aware of, but there isn't anything else to be done in strongSwan other than setting the child SA close action.

                    It already tries to start VTIs when loaded, and by setting that option for child SA close actions it will reconnect any time they are cleared.

                    At least for me it's been quite persistent about it.

                    • S
                      shellbr
                      last edited by

                      I'm having this same problem. VTI does not reconnect if the site is down for more than 5 minutes.
                      I've also been able to reproduce this in a lab environment. It's only a problem when one side is set to Responder Only, as is required at one of my sites due to NAT beyond my control. Tunnels that are not limited to responder only work fine because, even though one side gives up after 5 minutes and changes to Disconnected, the service is still listening for inbound requests, so the tunnel comes back up as soon as the far side is back online. This becomes a problem when only one side can initiate.

                      It would be nice to see an option to make it retry every x seconds and do it indefinitely.

                      • dotdashD
                        dotdash @jimp
                        last edited by

                        @jimp
                        When I was testing, the tunnel would reconnect after a short outage, but if I left it down for an hour, it would never re-establish. The other quirk I came across was that we got packet loss after switching to VTI until we set the MSS to 1360.
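For reference, the arithmetic behind an MSS of 1360 can be sketched as follows. It assumes IPv4 and TCP headers without options (20 bytes each) and a 1500-byte WAN MTU. The real ESP overhead depends on the cipher, NAT-T and padding, so the headroom figure is only indicative. It also treats 1360 as the effective TCP MSS, which may not match how a particular pfSense field interprets the value.

```python
IPV4_HEADER = 20   # bytes, no IP options (assumption)
TCP_HEADER = 20    # bytes, no TCP options (assumption)
WAN_MTU = 1500     # bytes, typical Ethernet MTU (assumption)

def packet_size_for_mss(mss: int) -> int:
    """Size of the inner IPv4 packet that carries one full TCP segment."""
    return mss + IPV4_HEADER + TCP_HEADER

inner = packet_size_for_mss(1360)
headroom = WAN_MTU - inner
print(f"inner packet: {inner} bytes, room left for IPsec overhead: {headroom} bytes")
```

With these assumptions a full segment becomes a 1400-byte inner packet, leaving 100 bytes under the WAN MTU for the tunnel encapsulation.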

                        • jimpJ
                          jimp Rebel Alliance Developer Netgate
                          last edited by

                          I thought about this some more and have some ideas. I don't know if I'll get to them soon, though.

                          See https://redmine.pfsense.org/issues/12169

                          • M
                            mdomnis @jimp
                            last edited by

                            @jimp I'm playing with IPSEC + VTI + IKEv2 on the 2.6 RC and I am still seeing the tunnel (P1 and P2) remain down after a WAN outage of > ~5 minutes and subsequent recovery. I have tried setting the Child SA Close Action to Restart/Reconnect on one side, but that does not seem to help. I confirmed that the code listed in https://redmine.pfsense.org/issues/12169 is in place on my test box.

                            I'm not sure if having gateway monitoring enabled on the VTI would help in this situation, but I was forced to disable that on older versions due to ipsec restarting all tunnels any time ANY VTI gateway went down. So if I had a HQ site with 10 VTI tunnels to branch sites, any time ONE of them suffered an outage, there would be brief outages to all branches when the gateway goes down and again when it comes back up. No good. Disabling gateway monitoring on the VTI gateways helped tremendously with that problem, but now I'm having issues with the tunnel not reconnecting after a lengthy outage.

                            To me it seems like it retries for a certain period of time (5 minutes ish) and then beyond that, it's never going to reconnect itself and an admin will have to go and manually reconnect the tunnel.
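To give a sense of how a bounded retry window arises, here is a small sketch of an exponential-backoff retransmission schedule. The parameter values (4 second timeout, base 1.8, 5 retransmissions) are strongSwan's documented defaults as far as I know. Whether pfSense overrides them, and how many separate attempts are made before the tunnel is left down, is not established in this thread, so treat this as an illustration rather than an explanation of the 5-minute figure.

```python
def retransmit_schedule(timeout=4.0, base=1.8, tries=5):
    """Return the waits (seconds) before each retransmission and before giving up."""
    # One wait per retransmission, plus a final wait before the exchange
    # is abandoned, each growing by the factor `base`.
    return [timeout * base ** n for n in range(tries + 1)]

waits = retransmit_schedule()
total = sum(waits)
print([round(w, 1) for w in waits], round(total, 1))
```

With those values a single exchange is abandoned after a bit under three minutes. Once the attempts are exhausted, nothing brings a VTI tunnel back, since there is no trap policy for traffic to trigger a new negotiation.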

                            • B
                              bbrendon @mdomnis
                              last edited by

                              @mdomnis Thank you for testing this on the betas.

                              I don't run betas, but our solution was to create a cron job that pings across the tunnel and, if the ping fails, restarts IPsec. It's very hacky but was the only workable option other than replacing pfSense. In the future I might try WireGuard since VTI has been broken for many years now.
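A watchdog of this kind could look roughly like the sketch below. It is not the script used by anyone in this thread. It assumes a Python 3 interpreter is available on the firewall and uses the FreeBSD `ping` flags `-c` (count) and `-S` (source address). The function names are illustrative, and the recovery command is left as a parameter because the right way to restart IPsec depends on the pfSense version.

```python
import subprocess

def tunnel_is_up(remote_ip, source_ip, run=subprocess.run):
    """Ping the remote tunnel address, sourcing from the local VTI address."""
    # FreeBSD ping: -c sets the packet count, -S sets the source address.
    result = run(
        ["ping", "-c", "3", "-S", source_ip, remote_ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def check_and_recover(remote_ip, source_ip, recover_cmd, run=subprocess.run):
    """Run recover_cmd only when the ping across the tunnel fails.

    Returns True if recovery was triggered, False if the tunnel answered.
    """
    if tunnel_is_up(remote_ip, source_ip, run=run):
        return False
    run(recover_cmd)
    return True

# Example call from a cron-driven script (placeholder addresses and command,
# replace with your own values):
#   check_and_recover("10.0.0.2", "10.0.0.1", ["/path/to/restart-ipsec"])
```

The `run` parameter exists only so the logic can be exercised without sending real pings. Note that restarting the whole IPsec service interrupts every tunnel on the box, so a per-tunnel recovery command is preferable where one is available.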

                              • jimpJ
                                jimp Rebel Alliance Developer Netgate @mdomnis
                                last edited by

                                @mdomnis

                                Did you enable the new option on the VTI P2 to activate the new keep-alive feature?

                                [Image: screenshot of the keep-alive option on the VTI P2]

                                • A
                                  ay @jimp
                                  last edited by ay

                                  @jimp
                                  For VTI tunnels, should we still be setting one side to responder only in 2.6?

                                  Troubleshooting Duplicate IPsec SA Entries

                                  • jimpJ
                                    jimp Rebel Alliance Developer Netgate @ay
                                    last edited by

                                    That is best. Set one side to connect + keep alive, set the other side to responder only.

                                    • M
                                      mdomnis @jimp
                                      last edited by

                                      @jimp I just enabled that option now and in my first two tests of > 5 minute outage, it seemed to do the job. Not sure why I didn't see that there. Doesn't help for 2.5.2, but I'll have to be patient I guess. :)

                                      Are you able to comment on the use of gateway monitoring on the VTI gateways? It was enabled by default just pointing to the remote side IP in the /30, but the results I had when testing with 2.5.2 were not good in that if I had a HQ site with 10 VTI tunnels to branch sites, any time ONE of the branches suffered an outage, there would be brief outages to all branches when the gateway goes down and again when it comes back up. I believe this was due to IPSEC restarting. TAC suggested disabling the gateway monitoring and it has helped get me much more stable. Wondering if it makes sense to update the docs with this advice or perhaps even default it to disabled for VTI gateways? Or if there is another fix in the works that might only restart the tunnel having an issue and not all of them.

                                      Thanks.

                                      • jimpJ
                                        jimp Rebel Alliance Developer Netgate
                                        last edited by

                                        Gateway monitoring seems to work fine for me but YMMV. If it gives you trouble, turn it off. For most uses of VTI it isn't all that necessary, though it is nice to know if the VPN is experiencing packet loss.

                                        Some uses of VTI such as for failover with policy routing would need to keep monitoring enabled. Some people fail from VTI to an interface, or an interface to VTI, or VTI to VTI.

                                        The defaults are just the defaults and can easily be changed by users who need or prefer different behavior.

                                        • S
                                          shellbr
                                          last edited by

                                          I just upgraded one side of a tunnel (the side that always initiates) to 2.6.0. It was a great time to test since I needed to take the other (responder only) site down for a 2 hour window anyway. When I brought the network back up, the IPsec tunnel was reconnected by the opposite side, so that's a success! I still have the responder only side on 2.5.2 and will upgrade it soon. Anyway, great work and I'm so glad the new feature provided in 2.6 resolved this long-standing issue. Thanks to all the developers!

                                          • B
                                            bbrendon @jimp
                                            last edited by

                                            Hi @jimp .

                                            Regarding the "Keep Alive - Enable periodic keep alive check" option, should that be enabled on both sides or just the side initiating the connection?

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.