Netgate Discussion Forum

    Intermittent interface blips leading to brief CARP failovers

    HA/CARP/VIPs
    • Derelict (LAYER 8, Netgate)

      dedicated Ethernet port for direct CARP connection between firewalls.

      Don't confuse CARP with pfsync.

      CARP should happen locally on your switches and has nothing to do with gateway up or down status. You need solid layer 2 between the interfaces in the failover group (those sharing the CARP VIP).

      The sync interface has nothing to do with which node is master or backup for any particular CARP VIP.

      Chattanooga, Tennessee, USA
      A comprehensive network diagram is worth 10,000 words and 15 conference calls.
      DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
      Do Not Chat For Help! NO_WAN_EGRESS(TM)

      • dz-015

        @Derelict:

        Don't confuse CARP with pfsync.

        I wasn't, I was just using incorrect/misleading terminology in that bit of my description of our setup, in my haste to get the post written so I could ask for help. Apologies for any confusion.

        Did you have any thoughts or suggestions regarding these issues I've described?

        • Derelict (LAYER 8, Netgate)

          Figure out why you're dropping/delaying CARP packets between your interfaces.

          • dz-015

            @KOM:

            Now that you know where to look, I would check again after you have detected the latest failover.  See if there is any correlation between the time it starts flapping and other network quality events.

            It looks as if there's been one failover event which has caused the number of "out" errors to increase. Next time I'll hopefully check it in real time.

            What I'm not sure about, though, is where to go from there. If I know that the "out" errors are linked to the failover events, how does that knowledge benefit me, and what can I do about it?

            • dz-015

              So, further to the above, a failover event just occurred and the number of "out" errors increased by 1.

              So, what further investigation can I do to find ways of resolving this problem? There's nothing further in the logs and nothing in any console or kernel output that I can find when logging in via SSH. I'm a bit stuck for ideas really!
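
              One way to make the correlation concrete is to snapshot the per-interface output error counter before and after a failover and compare the two. The following is a minimal sketch, assuming the counters come from the text of `netstat -i -n` on FreeBSD with `Oerrs` as the second-to-last column. The interface names (`re0`, `re1`), MAC addresses, and counter values are made-up placeholders, not data from this thread.

```python
from typing import Dict

# Illustrative sample shaped like FreeBSD `netstat -i -n` output.
# All names and numbers here are hypothetical placeholders.
SAMPLE_TEMPLATE = """\
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
re0    1500 <Link#1>      00:11:22:33:44:55   120034     0     0    98211     3     0
re1    1500 <Link#2>      00:11:22:33:44:56   904512     0     0   871203 {re1_oerrs:>5}     0
lo0   16384 <Link#3>                             412     0     0      412     0     0
"""

SAMPLE_BEFORE = SAMPLE_TEMPLATE.format(re1_oerrs=7)
SAMPLE_AFTER = SAMPLE_TEMPLATE.format(re1_oerrs=8)


def output_errors(netstat_text: str) -> Dict[str, int]:
    """Return {interface: Oerrs} taken from the link-level rows only."""
    counters: Dict[str, int] = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        # Keep only rows whose Network column is <Link#N>; this skips the
        # header and the per-address rows, which may show "-" for errors.
        if len(fields) < 9 or not fields[2].startswith("<Link#"):
            continue
        # Count from the right so a blank Address column does not shift
        # the index: the last field is Coll, the one before it is Oerrs.
        oerrs = fields[-2]
        if oerrs.isdigit():
            counters[fields[0]] = int(oerrs)
    return counters


def new_output_errors(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return {interface: increase} for interfaces whose Oerrs went up."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }


if __name__ == "__main__":
    delta = new_output_errors(output_errors(SAMPLE_BEFORE), output_errors(SAMPLE_AFTER))
    print(delta)
```

              Taking a snapshot on a schedule and logging the timestamp whenever the difference is non-empty would give a list of times to line up against the CARP state changes in the system log. It only confirms the correlation; it does not say whether the NIC, the cable, or the switch port is at fault.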

              • Derelict (LAYER 8, Netgate)

                This is all in your layer 2 switching, dude, not pfSense. CARP will work with or without a gateway on the interface. See also your CARP on your LAN interface (no gateway).

                What kind of switch are you using?  How is it configured? Are the ports taking errors?

                • dz-015

                  @Derelict:

                  This is all in your layer 2 switching, dude, not pfSense.

                  How have you come to this conclusion?

                  @Derelict:

                  CARP will work with or without a gateway on the interface. See also your CARP on your LAN interface (no gateway).

                  The problem is with the LAN interface, as I explained in my original post. There is indeed no gateway on the LAN interface. I only mentioned gateways in response to KOM who suggested that I should look in Status - System Logs - Gateways and report what was in there.

                  @Derelict:

                  What kind of switch are you using?  How is it configured? Are the ports taking errors?

                  2 x Cisco WS-C2960S switches for redundancy. One pfSense firewall goes into one switch, the other firewall into the other switch. Each server has NIC bonding configured, with one NIC going into one switch and the other NIC going into the other switch. So the whole infrastructure is completely redundant.

                  Everything's working fine. There are no apparent errors on the switches. There are no NIC-related errors on the servers. The only issue is the intermittent, brief CARP failover on pfSense on the LAN interface, which seems to correspond to the "out" errors incrementing on the LAN interface.

                  • Derelict (LAYER 8, Netgate)

                    @dz-015:

                    Everything's working fine.

                    Look again. Your CARP is failing.

                    Try new cables. Try Intel NICs - Realtek sucks.

                    • dz-015

                      @Derelict:

                      Look again. Your CARP is failing.

                      That's what this entire thread is about and why I posted the question originally. Not sure what your point is.

                      @Derelict:

                      Try new cables. Try Intel NICs - Realtek sucks.

                      Thanks for the suggestions. Earlier in the thread I said "it suggests that perhaps the NICs are having some issues, so maybe I need to consider hardware upgrades to machines with more robust NICs" so it seems you're potentially confirming my suspicions.

                      • dz-015

                        For the benefit of anyone reading this with similar problems in future: I replaced the Mini-ITX firewalls with new pfSense SG appliances and the NIC/CARP errors went away. I therefore conclude that the Realtek NICs in the old hardware weren't up to the job.

                        • Derelict (LAYER 8, Netgate)

                          This can be added to the growing list of "Realtek sucks" threads.

                          I have had zero problems with a pair of APUs, however.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.