Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    IPSEC/CARP behaviour

    2.0-RC Snapshot Feedback and Problems - RETIRED
    4
    11
    6.0k
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • J
      Juve
      last edited by

      Hi all,

      Using latest snapshot in a cluster having a site to site IPSEC tunnel with a 1.2.3 peer (VPN endpoint is the WAN CARP), the second node try to establish SAs with the peer from times to times, which can result in a loss of communication since the SA got established with the slave node. I have hundred of clusters in the wild, this is the first time I see this behaviour.
      If I stop/start racoon after the reboot of the second node, SAs establishment attempt still occurs but fails with a "send error" and then it does not affect the peer anymore.

      It's Friday, perhaps I need to go home ;-)

      1 Reply Last reply Reply Quote 0
      • C
        cmb
        last edited by

        And the IPsec is on the CARP IP, not WAN?

        1 Reply Last reply Reply Quote 0
        • J
          Juve
          last edited by

          Yes, it's using the WAN CARP VIP as the tunnel endpoint.
          I've just checked the logs on the two systems and I found what is happening.
          Each time the admin makes a change on the primary node, the configuration is synced to the slave (normal behaviour), but CARP interfaces are switched off and then on while the configuration is synced, racoon get restarted, the secondary node then seems to become master for 1 second, time to send a SA request to the peer. If you make many changes on the master configuration, sometimes the SA get establish.

          X.X.X.X is the WAN CARP VIP

          System logs on the slave node at the time it receive a config sync:

          Apr 8 20:20:02 php: /xmlrpc.php: Resyncing OpenVPN instances.
          Apr 8 20:20:02 apinger: Starting Alarm Pinger, apinger(8874)
          Apr 8 20:20:02 kernel: vip126: link state changed to DOWN
          Apr 8 20:20:02 kernel: vip126: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:02 kernel: vip124: link state changed to DOWN
          Apr 8 20:20:02 kernel: vip124: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:02 kernel: vip125: link state changed to DOWN
          Apr 8 20:20:02 kernel: vip125: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:01 kernel: vip122: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip122: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:01 kernel: vip121: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip121: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:01 apinger: Exiting on signal 15.
          Apr 8 20:20:01 php: /xmlrpc.php: ROUTING: change default route to Z.Z.Z.Z
          Apr 8 20:20:01 php: : Config sync not being done because of missing sync IP (normal on secondary systems).
          Apr 8 20:20:01 kernel: vip126: link state changed to UP
          Apr 8 20:20:01 kernel: vip126: INIT -> MASTER (preempting)
          Apr 8 20:20:01 kernel: vip126: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip125: link state changed to UP
          Apr 8 20:20:01 kernel: vip125: INIT -> MASTER (preempting)
          Apr 8 20:20:01 kernel: vip125: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip124: link state changed to UP
          Apr 8 20:20:01 kernel: vip124: INIT -> MASTER (preempting)
          Apr 8 20:20:01 kernel: vip124: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip122: link state changed to UP
          Apr 8 20:20:01 kernel: vip122: INIT -> MASTER (preempting)
          Apr 8 20:20:01 kernel: vip122: link state changed to DOWN
          Apr 8 20:20:01 check_reload_status: reloading filter
          Apr 8 20:20:01 kernel: vip121: link state changed to UP
          Apr 8 20:20:01 kernel: vip121: INIT -> MASTER (preempting)
          Apr 8 20:20:01 kernel: vip121: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip126: link state changed to DOWN
          Apr 8 20:20:01 kernel: vip126: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:01 check_reload_status: syncing firewall
          Apr 8 20:20:00 kernel: vip124: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip124: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:00 kernel: vip125: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip125: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:00 kernel: vip122: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip122: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:00 kernel: vip121: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip121: MASTER -> BACKUP (more frequent advertisement received)
          Apr 8 20:20:00 kernel: vip126: link state changed to UP
          Apr 8 20:20:00 kernel: vip126: INIT -> MASTER (preempting)
          Apr 8 20:20:00 kernel: carp4: changing name to 'vip126'
          Apr 8 20:20:00 kernel: vip125: link state changed to UP
          Apr 8 20:20:00 kernel: vip125: INIT -> MASTER (preempting)
          Apr 8 20:20:00 kernel: carp3: changing name to 'vip125'
          Apr 8 20:20:00 kernel: vip124: link state changed to UP
          Apr 8 20:20:00 kernel: vip124: INIT -> MASTER (preempting)
          Apr 8 20:20:00 kernel: carp2: changing name to 'vip124'
          Apr 8 20:20:00 kernel: vip122: link state changed to UP
          Apr 8 20:20:00 kernel: vip122: INIT -> MASTER (preempting)
          Apr 8 20:20:00 kernel: carp1: changing name to 'vip122'
          Apr 8 20:20:00 php: : Config sync not being done because of missing sync IP (normal on secondary systems).
          Apr 8 20:20:00 kernel: vip121: link state changed to UP
          Apr 8 20:20:00 kernel: vip121: INIT -> MASTER (preempting)
          Apr 8 20:20:00 kernel: carp0: changing name to 'vip121'
          Apr 8 20:20:00 check_reload_status: syncing firewall
          Apr 8 20:20:00 kernel: vip126: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip125: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip124: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip122: link state changed to DOWN
          Apr 8 20:20:00 kernel: vip121: link state changed to DOWN

          IPSEC logs on the slave at the time of the sync:
          Apr 8 20:20:03 racoon: WARNING: unrecognized route message with rtm_type: 4
          Apr 8 20:20:03 racoon: WARNING: unrecognized route message with rtm_type: 4
          Apr 8 20:20:03 racoon: WARNING: unrecognized route message with rtm_type: 4
          Apr 8 20:20:03 racoon: WARNING: unrecognized route message with rtm_type: 4
          Apr 8 20:20:01 racoon: [Self]: INFO: X.X.X.X[500] used as isakmp port (fd=17)
          Apr 8 20:20:01 racoon: [Self]: INFO: X.X.X.X[500] used for NAT-T
          Apr 8 20:20:01 racoon: [Self]: INFO: X.X.X.X[4500] used as isakmp port (fd=16)
          Apr 8 20:20:01 racoon: [Self]: INFO: X.X.X.X[4500] used for NAT-T
          Apr 8 20:20:00 racoon: ERROR: failed to begin ipsec sa negotication.
          Apr 8 20:20:00 racoon: ERROR: phase1 negotiation failed due to send error. 2f22f846442bd455:0000000000000000
          Apr 8 20:20:00 racoon: INFO: begin Identity Protection mode.
          Apr 8 20:20:00 racoon: [Remote Peer]: INFO: initiate new phase 1 negotiation: X.X.X.X[500]<=>Y.Y.Y.Y[500]
          Apr 8 20:20:00 racoon: [Remote Peer]: INFO: IPsec-SA request for Y.Y.Y.Y queued due to no phase1 found.
          Apr 8 20:20:00 racoon: [Self]: INFO: X.X.X.X[500] used as isakmp port (fd=17)
          Apr 8 20:20:00 racoon: [Self]: INFO: X.X.X.X[500] used for NAT-T
          Apr 8 20:20:00 racoon: [Self]: INFO: X.X.X.X[4500] used as isakmp port (fd=16)
          Apr 8 20:20:00 racoon: [Self]: INFO: X.X.X.X[4500] used for NAT-T

          1 Reply Last reply Reply Quote 0
          • J
            Juve
            last edited by

            I confirm what I said before.

            Doing multiple changes on the primary node makes the secondary node reconfigure it's CARP VIP.
            Secondary node negotiate new SA with the peer ! Then remote network becomes unaccessible.

            Do you know what changed in the configuration sync ?

            1 Reply Last reply Reply Quote 0
            • J
              Juve
              last edited by

              anyone else experiencing the same ?

              Thx ;)

              1 Reply Last reply Reply Quote 0
              • C
                cmb
                last edited by

                Yeah that makes sense, I see the state change too, which if you hit the IPsec at just the right time will trigger it to come up and cause issues.
                http://redmine.pfsense.org/issues/1433

                1 Reply Last reply Reply Quote 0
                • M
                  mdima
                  last edited by

                  @Juve:

                  anyone else experiencing the same ?

                  Thx ;)

                  I see "strange behaviours" when I am making many changes to the primary node… sometimes the backup node just hangs with Kernel error 12... hope this is the answer to the problem and will be fixed soon!

                  Thanks,
                  Michele

                  1 Reply Last reply Reply Quote 0
                  • J
                    Juve
                    last edited by

                    Thank you cmb.
                    I'll check the ticket status on redmine and will apply an update as soon as a patch is available and I'll give you a feedback.

                    1 Reply Last reply Reply Quote 0
                    • J
                      Juve
                      last edited by

                      I saw on redmine that Ermal had made a patch for the CARP behaviour.
                      So I have updated my 2.0 cluster to the latest snapshot (pfSense-Full-Update-2.0-RC1-i386-20110426-1925)

                      I have tested the behaviour during a configuration resync.
                      IPSECstill doing the same.

                      The problem of the CARP switch other is still here. Every time I make a change on the master (using the LAN CARP as the pfsense admin URL), at the time of the mouse click on the "Apply changes" button, I get a SSL Certificate error because I'm switching to the slave and then back to the master.

                      I saw on #1433 that Ermal said nothing had changed in the sync process, so why is this happening ?
                      It has never been the same with 1.2.3.

                      1 Reply Last reply Reply Quote 0
                      • J
                        Juve
                        last edited by

                        Using the latest RC2 snaphsot the problem of being switched to the slave when using webconfigurator though lan CARP still occurs from time to time.
                        When making many changes within few minutes, the problem ca occurs every two click on "Apply changes".

                        Anyone experiencing the same ?

                        1 Reply Last reply Reply Quote 0
                        • E
                          eri--
                          last edited by

                          There were some fixes done yesterday and some more will come on this today.

                          1 Reply Last reply Reply Quote 0
                          • First post
                            Last post
                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.