Netgate Discussion Forum

    Problem with Carp after upgrade to RC3

    2.0-RC Snapshot Feedback and Problems - RETIRED
    • fneto

      Hi all!

      I have a CARP setup that had worked flawlessly for a month on two pfSense RC1 machines. Yesterday I upgraded both boxes to RC3, and after 20 minutes of use my network went down.

      I don't know what happened. I tested every single device in my network trying to restore communication, and after almost 6 hours of testing I turned off the passive (backup) server. As soon as I turned it off, the network started working again.

      I didn't change any other setting on pfSense; I only upgraded using the manual firmware option in the System menu and chose the RC3 tgz file.

      Below is the configuration of my network:

      Each server has 4 NICs: 1 WAN / 1 LAN / 1 DMZ / 1 CARP monitoring (crossover cable)
      I have 8 CARP IPs: 1 for DMZ, 1 for LAN, 6 for WAN

      The LAN interfaces of these 2 servers are plugged into one switch. On the same switch I have a proxy server (CentOS) running in bridge mode, and the proxy server's other NIC is connected to the main network switch.

      This setup worked for a month until the upgrade to RC3, so I'd like to ask whether anyone can tell what could have happened in this case!

      If I remove the LAN cable from the backup, the network starts functioning again, so I removed the cable and for now I'm trying to downgrade from RC3 to RC1. But if this is a problem in RC3, I think you should look at what has changed!!

      Thanks!

      • fneto

        I forgot to mention: I'm always using the i386 version.

        • fneto

          Hi! It's me again!

          We rolled back to the previous state: we downgraded both servers to RC1, but the problem, which never happened before, occurs again. Now, with both servers on RC1, failover doesn't work.

          If I remove the LAN cable from just one of the 2 servers, my network starts working. But if I plug the LAN cable back in, after 4 to 10 minutes the network stops responding.

          The strange thing is that it only happens on the LAN interface. All other interfaces continue working without any problem!

          I'm desperate right now because I need to fix it somehow, and I can't figure out what it could be: a problem with pfSense, a hardware problem (faulty NIC), or something else!

          Could someone please help me with any tip? Thank you very much!

          • fneto

            Hi all!!

            I was monitoring the system, and when the problem happens I only see a message like this "CLOGJ|##" on both firewalls!!

            What could be happening?

            • David Handelman

              Maybe you should consider dropping the cluster for now.
              Just use one machine for a few days and gather more info.

              While this is happening, can you access the GUI from WAN or DMZ?
              Good luck.

              • fneto

                Well, now everything is working again without any modification. The only difference is that only 10% of my users are working now; all the rest have left for the weekend and turned off their PCs.

                Could it be anything related to MAC address spoofing or a duplicated MAC address on my network??

                Thanks!

                • David Handelman

                  I don't think so, but anything is possible.
                  What hardware are you using?

                  • cmb

                    The problem sounds like the switch is hanging onto MACs in its CAM table past the point when CARP switches over, so while the master brings up its CARP IPs, the switch is still sending the traffic destined for their MACs to the backup (or vice versa). Eventually that would resolve itself once the entry expired in the switch and it moved that MAC over to the correct port.

                    The symptoms are also somewhat similar to having a conflicting MAC address. The usual cause of that is having two CARP or VRRP IPs with the same VHID on the same network, as they'll share the same MAC address.
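
To illustrate why two virtual IPs with the same VHID collide, here is a minimal Python sketch. It assumes the well-known convention that CARP and VRRP (for IPv4) build the virtual MAC as 00:00:5e:00:01:XX, where XX is the VHID in hex. The inventory entries and the names `virtual_mac` and `find_vhid_conflicts` are hypothetical; the thread does not state which VHIDs are actually in use.

```python
from collections import defaultdict


def virtual_mac(vhid: int) -> str:
    """Return the virtual MAC for a VHID (assumed 00:00:5e:00:01:XX scheme)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    return f"00:00:5e:00:01:{vhid:02x}"


def find_vhid_conflicts(groups):
    """Group (name, segment, vhid) entries by segment and virtual MAC.

    Returns only the (segment, mac) keys claimed by more than one entry.
    """
    by_key = defaultdict(list)
    for name, segment, vhid in groups:
        by_key[(segment, virtual_mac(vhid))].append(name)
    return {key: names for key, names in by_key.items() if len(names) > 1}


# Hypothetical inventory: names, segments and VHIDs are made up for the example.
groups = [
    ("pfSense CARP LAN", "LAN", 1),
    ("pfSense CARP DMZ", "DMZ", 2),
    ("other router VIP", "LAN", 1),  # same VHID on the same segment
]

conflicts = find_vhid_conflicts(groups)
for (segment, mac), names in conflicts.items():
    print(f"{segment}: {mac} is shared by {', '.join(names)}")
```

The same VHID on different segments (LAN vs. DMZ) is not flagged, since the MACs never meet on one broadcast domain.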

                    • fneto

                      I've checked the configuration of all the CARP IPs and everything seems OK. We have another pfSense server plugged into the same switch, but with an extra ADSL link.

                      On this pfSense server we have only an internet link and the LAN cable, with a different IP address.
                      We don't have CARP settings on this server because it's a standalone server. I have only 3 rules that allow 3 specific machines to browse the internet using this ADSL link.

                      And on the pfSense CARP pair (main firewall master/backup) I have 3 rules that forward packets coming from these 3 PCs out to the internet through the standalone pfSense server.

                      To clarify for you!

                      Main Firewall
                      Master: 10.48.3.252
                      Slave: 10.48.3.253
                      Carp IP: 10.48.3.254 (main gateway for the whole network)

                      Standalone firewall
                      IP: 10.48.3.251

                      When packets for port TCP/80 come from 10.48.3.150, 10.48.3.146 and 10.48.3.179, the main firewall routes them to the standalone firewall.

                      On the other side of the main firewall we have two Cisco routers in load balancing and failover, with the same scheme I think (2 specific IPs and 1 virtual IP for both routers), but I have never had any problem in the internet segment, nor in the DMZ; only in my LAN segment, where I have just the 1 CARP IP that I mentioned above.
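
The forwarding behaviour described here can be modelled roughly as below. This is only a Python sketch of the matching logic, not pfSense rule syntax; the addresses come from the post, while `select_gateway` and the constant names are hypothetical.

```python
from ipaddress import ip_address

# Addresses taken from the post above.
STANDALONE_GW = ip_address("10.48.3.251")
POLICY_SOURCES = {
    ip_address(a) for a in ("10.48.3.150", "10.48.3.146", "10.48.3.179")
}


def select_gateway(src: str, proto: str, dst_port: int):
    """Return the gateway override for a packet, or None for default routing.

    Only TCP/80 traffic from the three listed PCs is sent to the
    standalone firewall; everything else follows the normal path.
    """
    if ip_address(src) in POLICY_SOURCES and proto == "tcp" and dst_port == 80:
        return STANDALONE_GW
    return None


print(select_gateway("10.48.3.150", "tcp", 80))   # matches the policy
print(select_gateway("10.48.3.150", "tcp", 443))  # wrong port, default routing
print(select_gateway("10.48.3.10", "tcp", 80))    # other PC, default routing
```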

                      Thanks!

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.