Netgate Discussion Forum

    CARP strange behaviour on all networks

    15 Posts 5 Posters 4.1k Views
      PDJ

      Anyone?
      I really don't know what it could be; I didn't find much about this on the forums or on other pages.

        PDJ

        Do I have to report this as a bug?
        Since 2.1 we have had nothing but problems with the network: we have 14 networks and they all go down after a while.
        Do we have to use a different password for every VHID?

          PDJ

          I have switched the backup server off, because it was very unstable.
          So what should I do? How can I fix this problem?
          Anybody?

          Does it help to become a gold member?

            ssheikh

            What does your MBUF usage look like?
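
For anyone following along: on the FreeBSD base that pfSense runs on, `netstat -m` prints the mbuf statistics. Below is a minimal sketch for turning that into a single percentage, assuming the cluster line has the usual `current/cache/total/max mbuf clusters in use` shape. The helper name `mbuf_cluster_pct` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -m` output on stdin and prints the
# percentage of mbuf clusters allocated (total relative to max), rounded down.
# Assumes a line such as:
#   1022/1008/2030/25600 mbuf clusters in use (current/cache/total/max)
mbuf_cluster_pct() {
    awk '/mbuf clusters in use/ {
        split($1, f, "/")      # f[1]=current f[2]=cache f[3]=total f[4]=max
        if (f[4] > 0) printf "%d\n", (f[3] * 100) / f[4]
        exit
    }'
}

# On the firewall itself (not executed here):
#   netstat -m | mbuf_cluster_pct
```

A value that creeps towards 100 before each collapse would point at mbuf exhaustion rather than at CARP itself.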

              nothing

              If I were you, I would disconnect all the networks and leave just the WAN and 1 LAN and if this works, start connecting the rest of LANs one by one to see when it fails.

                PDJ

                Thanks for the answer.

                I have done that, but the problem is that with all networks connected it runs for a while and then suddenly collapses, sometimes after an hour, sometimes after a day.
                Leaving all the networks disconnected for a day is not an option; that would mean downtime on a lot of services.

                @ssheikh: good question, I'll check that.

                  PDJ

                  It has been a while; we decided to let it rest for a while and disable CARP.

                  Now we have made a test network with the same hardware and I found out something very strange.
                  First of all, when the master goes down and comes up again, the slave won't hand the master role back.
                  When I run tcpdump on the slave, I get:

                  IP 192.168.20.252 > 224.0.0.18: VRRPv2, Advertisement, vrid 5, prio 240, authtype none, intvl 1s, length 36

                  The funny thing is that the master is configured with skew 0, not 240. Where is that 240 coming from?

                  When I manually set the skew to 250 on the backup machine, I see it switch back to slave and the master becomes master again.

                  But what is causing the strange, unstable behaviour? And why is the prio set to 240?
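
A note on reading that capture: tcpdump commonly decodes CARP with its VRRP printer because both use IP protocol 112, so `vrid` corresponds to the CARP VHID and `prio` to the advskew actually being advertised on the wire. The sketch below summarises such a capture per sender, assuming lines shaped like the one quoted above. The function name `carp_ads` and the interface `em0` are placeholders, not taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper: reads tcpdump text output on stdin and prints
# "source vhid=N advskew=N" for every CARP advertisement it recognises.
# Lines that do not match the expected VRRPv2 format are silently skipped.
carp_ads() {
    sed -n 's/^.*IP \([0-9.]*\) > 224\.0\.0\.18: VRRPv2, Advertisement, vrid \([0-9]*\), prio \([0-9]*\),.*$/\1 vhid=\2 advskew=\3/p'
}

# On the firewall (em0 is an assumed interface name, not executed here):
#   tcpdump -nli em0 ip proto 112 | carp_ads
```

Running this on both boxes at once makes it easy to see which node advertises which skew for each VHID at the moment the roles start flapping.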

                    podilarius

                    Don't know. I checked mine and it is listed in tcpdump as:
                    <externalip> > 224.0.0.18: VRRPv2, Advertisement, vrid 124, prio 0, authtype none, intvl 1s, length 36, addrs(7): <removed to protect privacy>

                    It does this on all my CARP stuff. I am on 2.1 final, but all my configs are upgrades and not new installs.

                    Drop to console and report the output of this back:
                    grep -e advskew -e subnet /cf/conf/config.xml
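
The grep output lists the skew and subnet lines separately, which gets hard to read with many VIPs. As a rough sketch, the pairs can be printed one per line, assuming each virtual IP sits in its own `<vip>` element with `<advskew>` and `<subnet>` on separate lines. The `<vip>` element name and the helper name `vip_skews` are assumptions, so check them against your own config.xml.

```shell
#!/bin/sh
# Hypothetical helper: reads config.xml on stdin and prints
# "subnet advskew=N" for every <vip> block that carries an advskew.
# Assumes one XML element per line, as in a pretty-printed config.
vip_skews() {
    awk '
        /<vip>/     { subnet = ""; skew = "" }
        /<subnet>/  { gsub(/.*<subnet>|<\/subnet>.*/, "");   subnet = $0 }
        /<advskew>/ { gsub(/.*<advskew>|<\/advskew>.*/, ""); skew = $0 }
        /<\/vip>/   { if (skew != "") print subnet, "advskew=" skew }
    '
}

# On the firewall (not executed here):
#   vip_skews < /cf/conf/config.xml
```

Comparing this output between master and backup shows whether the configured skews really differ from the 240 seen in tcpdump.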

                      PDJ

                      Thanks for the answer. I found more info: it has something to do with preempt. If one interface fails, the rest are set to advskew 240 so that all interfaces switch over together. (That's not something I prefer, but since 2006 you can't change this; pfSense enables it by default.)
                      However, in my case both boxes do the same thing. The result is that all interfaces have advskew 240 on both master and slave, and with 20 CARP networks that brings both boxes down because of the constant switching master -> backup -> master...

                      I have set net.inet.carp.preempt to 0 in the system tunables, but it is not changing.
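
For reference, as far as I recall the FreeBSD 8.x carp(4) manual page, `net.inet.carp.preempt` also fails CARP interfaces over as a group (advskew goes to 240 on all of them when one CARP-enabled physical interface goes down), and a read-only `net.inet.carp.suppress_preempt` counter is non-zero while such a problem is detected. Here is a small sketch for checking both at once, assuming your build exposes both sysctls. `carp_preempt_state` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `sysctl` output ("name: value" lines) on stdin
# and prints both CARP values plus a one-word verdict.
# Prints "unknown" if either sysctl is missing from the input.
carp_preempt_state() {
    awk -F': ' '
        $1 == "net.inet.carp.preempt"          { preempt = $2 }
        $1 == "net.inet.carp.suppress_preempt" { suppress = $2 }
        END {
            if (preempt == "" || suppress == "") { print "unknown"; exit }
            state = (suppress + 0 > 0) ? "suppressed" : "ok"
            printf "preempt=%s suppress_preempt=%s %s\n", preempt, suppress, state
        }'
}

# On the firewall (not executed here):
#   sysctl net.inet.carp.preempt net.inet.carp.suppress_preempt | carp_preempt_state
```

If both boxes report `suppressed` at the same time, each one is advertising the demoted skew, which would fit the flapping described above.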

                        podilarius

                        On your backup FW, do you have configuration sync turned on?
                        Personally, if I have one link fail, I would need all to fail over. Mostly this is because I will need to bring down the master for maintenance. Also because if the WAN died, I don't want any LAN to go to the box where the WAN link failed. If it's one of the LANs, sure, it's not that big a deal; it will just go out the other WAN port. But you never know.

                          PDJ

                          For me it's easier to have only one network fail over. The setup is such that the slave doesn't have all features (no backup WAN connection), so only one network is left without failover when a network fails.
                          If all networks switch independently, I can still shut the master down; all its networks will go down and the slave will take over all of them.

                          I have created a stable situation again. I found out that when there is an open network (both pfSense boxes are set to init on it), the network becomes unstable within a couple of hours.

                          But I still want independent failover; I don't get why the option has been taken out.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.