Netgate Discussion Forum

    2X Dell R515 servers and 2.1-RC0 CARP

    2.1 Snapshot Feedback and Problems - RETIRED
    • wabashky

      We have two Dell R515 servers running:

      2.1-RC0 (amd64)
      built on Mon Jul 22 03:26:44 EDT 2013
      FreeBSD 8.3-RELEASE-p8

      We have two separate WANs: one 50/50 fiber pipe and a 15/2 cable line for failover. We are successfully using CARP with no issues, and failover internet works fine as well. We have six virtual interfaces and some 60 IPsec tunnels to our remote office.

      The problem is intermittent downtime, even though the firewall stays 'up' (not one dropped packet externally). There are absolutely NO corresponding logs to explain the issue. We host multiple systems at different sites, and each of these outages, though only seconds long, causes the tunnels to drop, so every one of them affects us.

      We had an old Supermicro running 2.0.1 that had no outages, but the box's reliability wasn't the best (loud as could be and no hard drive space left) and it would slow down … no outages, but slowdowns. We figured bringing in these BOSS Dell servers as our routers would resolve that. :)

      Any input is appreciated.

      • ssheikh

        Do you lose internet connectivity as well during the "down time"?

        • wabashky

          Yes, we do, but only for as long as it takes a ping to recover. And then it's from our backup internet, not our primary.

          • wabashky

            And like I said, NOTHING in logs …

            • ssheikh

              So if it is switching to the backup internet connection then your default gateway is switching. Do you see evidence of that happening in the System | Gateways log?

              Is the Monitor IP being used a reliable pingable system and have the thresholds been changed from their defaults?

              Is CARP also by any chance failing over from Master to backup?

              • wabashky

                I guess there is something in the logs :) I didn't know about this log.

                Jul 30 21:37:03 apinger: ALARM: gw(199.68.254.225) *** down ***
                Jul 30 21:37:03 apinger: ALARM: RRGW(67.53.57.105) *** down ***
                Jul 30 21:37:22 apinger: alarm canceled: RRGW(67.53.57.105) *** down ***
                Jul 30 21:37:22 apinger: alarm canceled: gw(199.68.254.225) *** down ***
                Aug 1 15:24:13 apinger: Exiting on signal 15.
                Aug 1 15:24:14 apinger: Starting Alarm Pinger, apinger(27599)
                Aug 1 15:24:15 apinger: SIGUSR1 received, writting status.
                Aug 1 15:46:04 apinger: ALARM: gw(199.68.254.225) *** down ***
                Aug 1 15:46:14 apinger: alarm canceled: gw(199.68.254.225) *** down ***
                Aug 1 15:46:51 apinger: ALARM: gw(199.68.254.225) *** down ***
                Aug 1 15:47:00 apinger: alarm canceled: gw(199.68.254.225) *** down ***
                Aug 1 18:40:23 apinger: ALARM: gw(199.68.254.225) *** down ***
                Aug 1 18:40:25 apinger: alarm canceled: gw(199.68.254.225) *** down ***

                The examples above are from the gateway logs. It's like 10 seconds of down, then up again.
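
                For anyone who wants to measure these blips rather than eyeball them, here is a minimal sketch in Python that pairs each apinger ALARM line with its matching "alarm canceled" line and reports how long each gateway was marked down. The function name `outage_durations` and the regular expression are illustrative assumptions based only on the log format pasted above. Syslog timestamps carry no year, so 2013 is assumed from the build date.

```python
import re
from datetime import datetime

# Log lines copied from the gateway log excerpt above.
LOG = """\
Jul 30 21:37:03 apinger: ALARM: gw(199.68.254.225) *** down ***
Jul 30 21:37:03 apinger: ALARM: RRGW(67.53.57.105) *** down ***
Jul 30 21:37:22 apinger: alarm canceled: RRGW(67.53.57.105) *** down ***
Jul 30 21:37:22 apinger: alarm canceled: gw(199.68.254.225) *** down ***
Aug 1 15:24:13 apinger: Exiting on signal 15.
Aug 1 15:46:04 apinger: ALARM: gw(199.68.254.225) *** down ***
Aug 1 15:46:14 apinger: alarm canceled: gw(199.68.254.225) *** down ***
Aug 1 15:46:51 apinger: ALARM: gw(199.68.254.225) *** down ***
Aug 1 15:47:00 apinger: alarm canceled: gw(199.68.254.225) *** down ***
Aug 1 18:40:23 apinger: ALARM: gw(199.68.254.225) *** down ***
Aug 1 18:40:25 apinger: alarm canceled: gw(199.68.254.225) *** down ***
"""

# Assumed pattern, derived from the pasted lines only.
LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) apinger: "
    r"(?P<kind>ALARM|alarm canceled): (?P<gw>\w+)\([\d.]+\) \*\*\* down \*\*\*$"
)


def outage_durations(log_text, year=2013):
    """Return (gateway, start_time, seconds_down) for each alarm/cancel pair."""
    open_alarms = {}  # gateway name -> time the alarm was raised
    outages = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip start/stop/status lines
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        gw = m["gw"]
        if m["kind"] == "ALARM":
            open_alarms[gw] = ts
        elif gw in open_alarms:
            start = open_alarms.pop(gw)
            outages.append((gw, start, int((ts - start).total_seconds())))
    return outages


for gw, start, seconds in outage_durations(LOG):
    print(f"{start:%b %d %H:%M:%S}  {gw:<5} down for {seconds}s")
```

                On the excerpt above this reports outages of 19, 19, 10, 9 and 2 seconds. The 19-second event hit both gateways at the same instant.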

                I assume the IPs we monitor are good. They are from our ISP.

                CARP isn't failing over. We did force a failover (by unplugging) and that worked, so it does function correctly.

                • ssheikh

                  Check your gateway monitors. Your pings to them are dying.

                  If this is because of the link being saturated with traffic then try relaxing your gateway monitoring thresholds.
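
                  To illustrate why relaxing a threshold changes what gets flagged, here is a rough Python sketch of a sliding-window loss check. It is a toy model, not apinger's actual algorithm: the function `alarm_states`, the window size, the thresholds and the probe trace are all hypothetical values chosen for the example.

```python
from collections import deque


def alarm_states(probe_results, window, loss_threshold_pct):
    """For each probe, report whether the gateway would be flagged down.

    Toy model: the gateway is flagged down once the window is full and the
    share of lost probes in it reaches loss_threshold_pct.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` results
    states = []
    for ok in probe_results:
        recent.append(ok)
        loss_pct = 100 * recent.count(False) / len(recent)
        states.append(len(recent) == window and loss_pct >= loss_threshold_pct)
    return states


# Hypothetical trace: a busy link drops 3 of 10 monitor pings.
probes = [True, True, False, True, False, True, True, False, True, True]

strict = alarm_states(probes, window=5, loss_threshold_pct=20)
relaxed = alarm_states(probes, window=5, loss_threshold_pct=60)

print("strict threshold raises alarm: ", any(strict))
print("relaxed threshold raises alarm:", any(relaxed))
```

                  In this model the same trace trips the 20% threshold but never the 60% one. A relaxed threshold only changes when the gateway is declared down. It does not remove the underlying packet loss.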

                  • wabashky

                    Thanks. We will try that out over a couple of days and see if it works.

                    • wabashky

                      That didn't work for us. We were still going down, just not being notified about it or logging it as often. BUT!

                      We found this doc on pfsense.org

                      http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards

                      and (knock on wood) we have not had an outage at all today! Maybe we are good? We will monitor over the next week and I'll update if we find anything.

                      Thanks, ssheikh.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.