Netgate Discussion Forum

    IPsec fails to renegotiate after loss of a peer

      Rockets

      Is IPsec renegotiating properly now in 1.2.3-RC3?

        jimp Rebel Alliance Developer Netgate

        Yes, it is working now as far as my tests show, both in deliberate tests and in running it at home through some Internet stability issues. It seems to work fine as far as I can tell.

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

          Rockets

          Jimp, where would I find RC3? It's not on the official mirrors, only RC1. Or is RC3 a current snapshot? I'm using embedded.

            jimp Rebel Alliance Developer Netgate

            @Rockets:

            Jimp, where would I find RC3? It's not on the official mirrors, only RC1. Or is RC3 a current snapshot? I'm using embedded.

            It's only in snapshots at the moment, but there will probably be an "official" cut of RC3 (or perhaps RC4?) before release.

            Here are the NanoBSD (new embedded system) snapshots:
            http://snapshots.pfsense.org/FreeBSD_RELENG_7_2/pfSense_RELENG_1_2/nanobsd/?C=M;O=D

              bkm

              I am seeing an issue that seems to be the same. I am testing the 1.2.3 RC snapshot from 20090924. I have tunnels set up to two separate sites, each with a Netopia router on the other end. The tunnels are working when I leave work in the evening. When I get to work in the morning, they are not working. The IPsec status page (SAD) shows that the tunnels are up. If I restart racoon, the tunnel status goes down. I then ping a site and everything gets renegotiated and it works again.
              I currently have a lifetime of 28800 for phase 1 and 86400 for phase 2.
              I am willing to test a couple of things for a day or two if someone has a suggestion. After that I will need to put my pfSense box into production without the tunnels, and I will be limited in what I can try.

                jimp Rebel Alliance Developer Netgate

                @bkm:

                I am seeing an issue that seems to be the same. I am testing the 1.2.3 RC snapshot from 20090924. I have tunnels set up to two separate sites, each with a Netopia router on the other end. The tunnels are working when I leave work in the evening. When I get to work in the morning, they are not working. The IPsec status page (SAD) shows that the tunnels are up. If I restart racoon, the tunnel status goes down. I then ping a site and everything gets renegotiated and it works again.
                I currently have a lifetime of 28800 for phase 1 and 86400 for phase 2.
                I am willing to test a couple of things for a day or two if someone has a suggestion. After that I will need to put my pfSense box into production without the tunnels, and I will be limited in what I can try.

                It might help to know more about these tunnels, at least this much: Are they static tunnels or mobile clients? Are they using main mode or aggressive mode? Do you have DPD enabled? Keep Alive? What shows up in the logs when the tunnels are broken?

                And anything else you can think of.

                  fairchild

                  I'm using the latest 2.0 snapshot. Do you recommend leaving DPD enabled? I don't have access to the logs right now, so I can't post them, but it appears that when a tunnel goes down because of the Internet connection on my end or the other end, I have to restart the racoon service on both ends for the tunnel to reestablish. This is between two pfSense boxes… I don't even want to get started on my Linksys VPN tunnel issues.

                    cmb

                    @fairchild:

                    I'm using the latest 2.0 snapshot

                    Don't. That's not going to be stable. Pretty sure the 7.2/2.0 builds still use NAT-T which has renegotiation issues, and the 8 snapshots likely don't have a proper ipsec-tools either.

                      fairchild

                      @cmb:

                      @fairchild:

                      I'm using the latest 2.0 snapshot

                      Don't. That's not going to be stable. Pretty sure the 7.2/2.0 builds still use NAT-T which has renegotiation issues, and the 8 snapshots likely don't have a proper ipsec-tools either.

                      Damn… well, now that I think of it, I really only switched to check out the new GUI and the multiple DynDNS accounts for each of my WANs. So IPsec is much more stable and up to date in the RC versions? I would have figured newer code with more features was included in 2.0 for VPNs, specifically IPsec.

                        cmb

                        @fairchild:

                        So IPsec is much more stable and up to date in the RC versions?

                        yes

                          bkm

                          OK, I enabled DPD (60 sec) and I believe that this fixed part of the problem. A couple of my tunnels stayed up overnight, or at least reconnected. One of the tunnels wouldn't reconnect, though, until I deleted the numerous SAD entries and restarted racoon. (Actually, a second tunnel stopped after restarting racoon, and I again had to delete its SAD entries, but this time without restarting racoon.) It appears that some of the SAD entries were not getting dropped properly. The IPsec status showed that the tunnels were up. The failure may have coincided with the lifetime setting. I think I may change my lifetimes to a shorter time frame so that I can try to duplicate the behavior. Below are some of the error messages.

                          racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
                          Sep 29 12:23:18 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
                          Sep 29 12:22:48 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
                          Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: ESP MyWanIPxx.xx[0]->RemoteIPxx.xx[0] spi=2670541922(0x9f2d3c62)
                          Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: ESP RemoteIPxx.xx[0]->MyWanIPxx.xx[0] spi=37322644(0x2397f94)
                          Sep 29 12:22:40 racoon: [VPN Name]: INFO: respond new phase 2 negotiation: MyWanIPxx.xx[0]<=>RemoteIPxx.xx[0]

                          Background: The tunnels are static tunnels to Netopia routers, one tunnel to each site or router.
                          Aggressive mode is used. For the Keep Alive I am using the remote LAN address of the Netopia router.

                          After I duplicate the problem, I will provide more log info. Most was overwritten before I thought of copying it.
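For anyone comparing log lines like the ones above against the SAD tab, here is a minimal Python sketch that pulls the source, destination, and SPI out of the "IPsec-SA established" messages. The regular expression is an assumption derived only from the two sample lines quoted in this post, so other racoon builds may format the line differently; the function name `parse_established` is made up for illustration.

```python
import re

# Pattern inferred from the two sample lines in this thread (an assumption,
# not a documented racoon log format).
SA_LINE = re.compile(
    r"IPsec-SA established: ESP (?P<src>\S+)\[\d+\]->(?P<dst>\S+)\[\d+\] "
    r"spi=(?P<dec>\d+)\(0x(?P<hex>[0-9a-fA-F]+)\)"
)


def parse_established(lines):
    """Return (src, dst, spi) tuples for every 'IPsec-SA established' line."""
    found = []
    for line in lines:
        match = SA_LINE.search(line)
        if match is None:
            continue
        spi = int(match.group("dec"))
        # The decimal and hex forms should agree; skip the line if they do not.
        if spi != int(match.group("hex"), 16):
            continue
        found.append((match.group("src"), match.group("dst"), spi))
    return found


sample = [
    "Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: "
    "ESP MyWanIPxx.xx[0]->RemoteIPxx.xx[0] spi=2670541922(0x9f2d3c62)",
    "Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: "
    "ESP RemoteIPxx.xx[0]->MyWanIPxx.xx[0] spi=37322644(0x2397f94)",
]

pairs = parse_established(sample)
for src, dst, spi in pairs:
    print(f"{src} -> {dst}  spi={spi} (0x{spi:x})")
```

A healthy tunnel should show one entry per direction, so more than two SPIs for the same pair of addresses would match the "numerous SAD entries" symptom described above.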

                            jimp Rebel Alliance Developer Netgate

                            @fairchild:

                            I would have figured newer code with more features was included in 2.0 for vpns specifically ipsec

                            Not in this case. Even so, newer code and more features don't usually translate to more stability, especially in an alpha release :)

                              jimp Rebel Alliance Developer Netgate

                              @bkm:

                              OK, I enabled DPD (60 sec) and I believe that this fixed part of the problem. A couple of my tunnels stayed up overnight, or at least reconnected. One of the tunnels wouldn't reconnect, though, until I deleted the numerous SAD entries and restarted racoon. (Actually, a second tunnel stopped after restarting racoon, and I again had to delete its SAD entries, but this time without restarting racoon.) It appears that some of the SAD entries were not getting dropped properly. The IPsec status showed that the tunnels were up. The failure may have coincided with the lifetime setting. I think I may change my lifetimes to a shorter time frame so that I can try to duplicate the behavior. Below are some of the error messages.

                              racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
                              Sep 29 12:23:18 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
                              Sep 29 12:22:48 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.
                              Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: ESP MyWanIPxx.xx[0]->RemoteIPxx.xx[0] spi=2670541922(0x9f2d3c62)
                              Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: ESP RemoteIPxx.xx[0]->MyWanIPxx.xx[0] spi=37322644(0x2397f94)
                              Sep 29 12:22:40 racoon: [VPN Name]: INFO: respond new phase 2 negotiation: MyWanIPxx.xx[0]<=>RemoteIPxx.xx[0]

                              Background: The tunnels are static tunnels to Netopia routers, one tunnel to each site or router.
                              Aggressive mode is used. For the Keep Alive I am using the remote LAN address of the Netopia router.

                              After I duplicate the problem, I will provide more log info. Most was overwritten before I thought of copying it.

                              The full log is in /var/log/ipsec.log, and you can view it by executing the command: clog /var/log/ipsec.log

                              It may have just been gone from the GUI, which only shows a limited number of lines. Also, it would help to show the logs in normal order, not reverse order. If you have the reverse order box checked on Status > System Logs, Settings tab, uncheck it and save, then copy/paste the logs. There was an old bug that caused the IPsec logs to ignore this setting, but it was fixed before the snapshot you said you were running. You may also want to update to the most recent snapshot to be sure you really have the most current updates.
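If the logs have already been copied in reverse order, they can be flipped offline before posting. The following is a minimal Python sketch, assuming the pasted lines are newest first and start with a syslog-style `Sep 29 12:23:18` timestamp. The year is not in the log, so `ASSUMED_YEAR` is a placeholder, and lines without a timestamp are dropped rather than guessed at.

```python
from datetime import datetime

# Assumption: syslog stamps carry no year; 2009 matches the snapshot date
# mentioned earlier in the thread.
ASSUMED_YEAR = 2009


def stamp(line):
    """Parse a leading 'Sep 29 12:23:18' style timestamp; None if absent."""
    try:
        return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def chronological(lines):
    """Return dated lines oldest first. Input is assumed to be newest first."""
    dated = [(stamp(line), line) for line in reversed(lines)]
    dated = [pair for pair in dated if pair[0] is not None]
    # sorted() is stable, so lines sharing a second keep their reversed order.
    return [line for _, line in sorted(dated, key=lambda pair: pair[0])]


pasted = [
    "racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.",
    "Sep 29 12:23:18 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.",
    "Sep 29 12:22:48 racoon: ERROR: fatal INVALID-SPI notify messsage, phase1 should be deleted.",
    "Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: ESP MyWanIPxx.xx[0]->RemoteIPxx.xx[0] spi=2670541922(0x9f2d3c62)",
    "Sep 29 12:22:41 racoon: [VPN Name]: INFO: IPsec-SA established: ESP RemoteIPxx.xx[0]->MyWanIPxx.xx[0] spi=37322644(0x2397f94)",
    "Sep 29 12:22:40 racoon: [VPN Name]: INFO: respond new phase 2 negotiation: MyWanIPxx.xx[0]<=>RemoteIPxx.xx[0]",
]

ordered = chronological(pasted)
invalid_spi = [line for line in ordered if "INVALID-SPI" in line]
for line in ordered:
    print(line)
print(len(invalid_spi), "INVALID-SPI notifications with a timestamp")
```

Read in this order, the sample shows the phase 2 negotiation and the two SAs being established first, with the INVALID-SPI notifications arriving afterwards.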

                              One more thing: When testing these settings, be sure to stop and restart racoon after making your changes, to be on the safe side and to be sure the SAD and SPD are clear.

                                fairchild

                                @jimp:

                                One more thing: When testing these settings, be sure to stop and restart racoon after making your changes, to be on the safe side and to be sure the SAD and SPD are clear.

                                Sorry if this sounds ignorant, but do you mean delete all the SPD and SAD entries?

                                  jimp Rebel Alliance Developer Netgate

                                  @fairchild:

                                  @jimp:

                                  One more thing: When testing these settings, be sure to stop and restart racoon after making your changes, to be on the safe side and to be sure the SAD and SPD are clear.

                                  Sorry if this sounds ignorant, but do you mean delete all the SPD and SAD entries?

                                  Stopping and restarting racoon should do this. If you want to do it by hand, run:
                                  setkey -F
                                  setkey -F -P
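The first command flushes the SAD and the second flushes the SPD. If the flush needs to be repeated between test runs, it could be wrapped in a small script. This is only a sketch, assuming Python 3 is available wherever it runs and that `setkey` is on the PATH; the function name `flush_ipsec_state` and the dry-run default are illustrative choices, not part of pfSense.

```python
import subprocess

# The two flush commands quoted above: SAD first, then SPD.
FLUSH_COMMANDS = [
    ["setkey", "-F"],
    ["setkey", "-F", "-P"],
]


def flush_ipsec_state(dry_run=True):
    """Print the flush commands; execute them only when dry_run is False.

    Actually running them needs root and tears down every active tunnel.
    """
    for argv in FLUSH_COMMANDS:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return FLUSH_COMMANDS


planned = flush_ipsec_state()  # dry run: prints the commands, runs nothing
```

Defaulting to a dry run is deliberate, since a flush on a production box drops all tunnels, not just the one under test.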

                                    bkm

                                    Thanks for the advice. I knew the logs had to be somewhere. I will also restart racoon from now on after changes are made.

                                    "Sorry if this sounds ignorant, but do you mean delete all the SPD and SAD entries?"

                                    In the GUI, I would go to Status > IPsec, then the SAD tab, and delete the entries that corresponded to the tunnel I was having problems with. Normally, I see two entries for each tunnel: MyWanIP to RemoteWanIP and RemoteWanIP to MyWanIP. When I was having problems, this area would fill up with entries from its repeated attempts to connect. There is an X beside each entry that deletes it. Restarting racoon also deletes them, though. (I was experimenting a little when deleting manually.) I did not change anything on the SPD tab.

                                    I did change the lifetime settings to 1800 and 3600 on two of the tunnels so that I could see if they came back up after an expiration without waiting until tomorrow. They did come up this time. If a tunnel is down in the morning, I will post a few log entries.

                                      bkm

                                      One of the tunnels went down again last night. Attached is my IPsec log. The tunnel was working at 17:00 on 9/29/09 and was not working at the end of the log on 9/30/09. The log was modified to remove the IP addresses. My WAN IP address is listed as MyWanIP. The remote WAN IP of the tunnel that went down is listed as RemoteWanIPTunnel1. Any advice is welcome. Thanks.

                                      tunnel1down.txt

                                        bkm

                                        Well, the tunnel came back up without any action from me, other than that I tried to ping the site twice. The first ping timed out. An hour later I tried again and got an immediate reply. For the Keep Alive address, I am using the remote LAN IP. Should I be using the remote WAN IP instead?

                                          bkm

                                          I ran a ping test on my tunnels last night to see how long and how often they were down. The test ran for about 15 hours. Tunnel 1 went down 6 times averaging 30-35 minutes each time. Tunnel 2 went down 3 times also averaging 30-35 minutes each time. Tunnel 3 stayed up the whole time. I didn't count the times where the tunnel was only down for about a minute, however this would also be a problem for our VPNs. I cannot verify that times under a couple minutes were not due to a line problem since I alternated my pings between sites. The range for the downtime was between 25 and 45 minutes. Most occurrences appeared to be at 30-35 minutes.
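A ping test like this can be summarised with a few lines of Python. The sketch below assumes the results are available as (timestamp, reachable) pairs in time order. The sample data is hypothetical, built only to resemble the 25 to 45 minute outages described above, and the two-minute cutoff mirrors the decision not to count very short drops.

```python
from datetime import datetime, timedelta


def outages(samples):
    """Return (start, end) pairs from the first failed ping to the next good one.

    samples: list of (datetime, reachable) tuples in time order.
    An outage still open when the data ends is not reported.
    """
    result = []
    down_since = None
    for when, reachable in samples:
        if not reachable and down_since is None:
            down_since = when
        elif reachable and down_since is not None:
            result.append((down_since, when))
            down_since = None
    return result


def long_outages(samples, minimum=timedelta(minutes=2)):
    """Keep only outages lasting at least `minimum`."""
    return [(start, end) for start, end in outages(samples) if end - start >= minimum]


# Hypothetical data: one ping every 5 minutes, two good, nine failed, two good.
base = datetime(2009, 10, 1, 5, 50)
flags = [True, True] + [False] * 9 + [True, True]
samples = [(base + timedelta(minutes=5 * i), ok) for i, ok in enumerate(flags)]

found = long_outages(samples)
for start, end in found:
    print(f"down {start:%H:%M} to {end:%H:%M} ({(end - start).seconds // 60} min)")
```

With a 5 minute ping interval each measured outage can be off by up to one interval at either end, which is worth keeping in mind when comparing durations against lifetime settings.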

                                          My lifetime settings for tunnel 1 and 2 are at 1800 (30 min) for phase 1 and 3600 (60 min) for phase 2. My lifetimes for tunnel 3 are 28800 and 86400. Tunnel 3's phase 2 lifetime did not expire during the test.
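To put those lifetimes next to the length of the test, here is a small Python sketch of the arithmetic. It assumes, purely for illustration, that each SA was negotiated at the start of the roughly 15 hour test and expires exactly at its configured lifetime; real rekeying usually starts somewhat earlier, so the counts are rough upper bounds and say nothing about cause.

```python
# "About 15 hours", per the post above.
TEST_SECONDS = 15 * 3600

# Lifetimes in seconds, as stated in the post.
LIFETIMES = {
    "tunnel 1/2": {"phase 1": 1800, "phase 2": 3600},
    "tunnel 3": {"phase 1": 28800, "phase 2": 86400},
}


def expirations(lifetime_s, window_s=TEST_SECONDS):
    """Whole lifetimes that fit in the window (assumes an SA starts at time 0)."""
    return window_s // lifetime_s


for tunnel, phases in LIFETIMES.items():
    for phase, seconds in phases.items():
        print(
            f"{tunnel} {phase}: {seconds} s = {seconds / 3600:g} h, "
            f"up to {expirations(seconds)} expirations in the test window"
        )
```

Under those assumptions tunnel 3's phase 2 lifetime (24 hours) never elapses within the test, which agrees with the observation above, while the SAs for tunnels 1 and 2 turn over many times.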

                                          Attached is a partial IPSec log. Tunnel 1 went down sometime between 5:56 and 6:01. It came back up between 6:41 and 6:47 in case someone wants to look. The clocks between the two machines could be slightly off.

                                          I will try to upgrade again in the next day or so and test again, but I am not very hopeful at this point.
                                          Any suggestions are welcome.
                                          Thanks

                                          IPSec100109.txt

                                            althornin

                                            I have noticed similar behavior with a pfSense-to-Linksys VPN.
                                            Basically, at some unspecified time, the tunnel appears to die, and it stays down for the duration of the phase 2 lifetime (in my case).
                                            I've seen the same behavior without pfSense in the mix, though, using just the Linksys VPN routers (which I am slowly replacing with pfSense boxes).

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.