Netgate Discussion Forum

2X Dell R515 servers and 2.1-RC0 CARP

2.1 Snapshot Feedback and Problems - RETIRED
  • W
    wabashky
    last edited by Aug 3, 2013, 2:53 AM

    We have two Dell R515 servers running:

    2.1-RC0 (amd64)
    built on Mon Jul 22 03:26:44 EDT 2013
    FreeBSD 8.3-RELEASE-p8

    We have two separate WANs: a 50/50 fiber link as primary and a 15/2 cable connection for failover. CARP is working with no issues, and Internet failover works fine as well. We have six virtual interfaces and some 60 IPsec tunnels to our remote office.

    The problem is intermittent downtime, even though the firewall appears 'up' from outside (not one dropped packet externally). There are absolutely NO corresponding logs to explain the issue. We host multiple systems at different sites, and each downtime, though only seconds long, causes the tunnels to drop, so every outage affects us.

    We had an old Supermicro running 2.0.1 with no outages, but the box wasn't the most reliable (loud as could be and no hard drive space left), and it would slow down … no outages, but slowdowns. We figured bringing in these BOSS Dell servers as our routers would resolve that. :)

    Any input is appreciated.

    • S
      ssheikh
      last edited by Aug 3, 2013, 12:49 PM

      Do you lose internet connectivity as well during the "down time"?

      • W
        wabashky
        last edited by Aug 3, 2013, 11:03 PM

        Yes, we do, but only for about as long as it takes a ping to recover. And then it's coming from our backup internet, not our primary.

        • W
          wabashky
          last edited by Aug 3, 2013, 11:05 PM

          And like I said, NOTHING in logs …

          • S
            ssheikh
            last edited by Aug 4, 2013, 12:48 AM

            So if it is switching to the backup internet connection then your default gateway is switching. Do you see evidence of that happening in the System | Gateways log?

            Is the Monitor IP being used a reliable pingable system and have the thresholds been changed from their defaults?

            Is CARP also by any chance failing over from Master to backup?

            • W
              wabashky
              last edited by Aug 4, 2013, 4:21 AM

              I guess there is something in the logs :) I didn't know about this log.

              Jul 30 21:37:03 apinger: ALARM: gw(199.68.254.225) *** down ***
              Jul 30 21:37:03 apinger: ALARM: RRGW(67.53.57.105) *** down ***
              Jul 30 21:37:22 apinger: alarm canceled: RRGW(67.53.57.105) *** down ***
              Jul 30 21:37:22 apinger: alarm canceled: gw(199.68.254.225) *** down ***
              Aug 1 15:24:13 apinger: Exiting on signal 15.
              Aug 1 15:24:14 apinger: Starting Alarm Pinger, apinger(27599)
              Aug 1 15:24:15 apinger: SIGUSR1 received, writting status.
              Aug 1 15:46:04 apinger: ALARM: gw(199.68.254.225) *** down ***
              Aug 1 15:46:14 apinger: alarm canceled: gw(199.68.254.225) *** down ***
              Aug 1 15:46:51 apinger: ALARM: gw(199.68.254.225) *** down ***
              Aug 1 15:47:00 apinger: alarm canceled: gw(199.68.254.225) *** down ***
              Aug 1 18:40:23 apinger: ALARM: gw(199.68.254.225) *** down ***
              Aug 1 18:40:25 apinger: alarm canceled: gw(199.68.254.225) *** down ***

              The examples above are from the gateway logs. Typically it's around 10 seconds down, then up again.
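To put numbers on those blips, the log lines above can be paired up (each `ALARM` with its matching `alarm canceled`) to get the length of every outage. The following is only a minimal sketch: the function name `outage_durations`, the regular expression, and the `year=2013` default are assumptions made for illustration (the log lines carry no year), and it only understands the two `*** down ***` line shapes shown in this post.

```python
import re
from datetime import datetime

# Matches only the two apinger line shapes quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}) apinger: "
    r"(?P<event>ALARM|alarm canceled): "
    r"(?P<name>[^(\s]+)\((?P<ip>[^)]+)\) \*\*\* down \*\*\*"
)

SAMPLE = """\
Jul 30 21:37:03 apinger: ALARM: gw(199.68.254.225) *** down ***
Jul 30 21:37:03 apinger: ALARM: RRGW(67.53.57.105) *** down ***
Jul 30 21:37:22 apinger: alarm canceled: RRGW(67.53.57.105) *** down ***
Jul 30 21:37:22 apinger: alarm canceled: gw(199.68.254.225) *** down ***
Aug 1 15:46:04 apinger: ALARM: gw(199.68.254.225) *** down ***
Aug 1 15:46:14 apinger: alarm canceled: gw(199.68.254.225) *** down ***
Aug 1 15:46:51 apinger: ALARM: gw(199.68.254.225) *** down ***
Aug 1 15:47:00 apinger: alarm canceled: gw(199.68.254.225) *** down ***
Aug 1 18:40:23 apinger: ALARM: gw(199.68.254.225) *** down ***
Aug 1 18:40:25 apinger: alarm canceled: gw(199.68.254.225) *** down ***
""".splitlines()


def outage_durations(lines, year=2013):
    """Return (gateway_name, start_time, seconds_down) for each completed alarm.

    The year is an assumption: syslog-style timestamps do not include one.
    """
    open_alarms = {}  # (name, ip) -> time the alarm was raised
    outages = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip startup/shutdown and any other apinger messages
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (m["name"], m["ip"])
        if m["event"] == "ALARM":
            open_alarms[key] = when
        elif key in open_alarms:
            start = open_alarms.pop(key)
            outages.append((m["name"], start, int((when - start).total_seconds())))
    return outages


if __name__ == "__main__":
    for name, start, seconds in outage_durations(SAMPLE):
        print(f"{start:%b %d %H:%M:%S}  {name:<5} down for {seconds}s")
```

Run against the lines quoted above, this reports alarms lasting between 2 and 19 seconds, which gives a concrete baseline to compare against after changing monitoring thresholds or NIC settings.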

              I assume the IPs we monitor are good. They are from our ISP.

              CARP isn't failing over. We did force a failover (by unplugging), and that worked, so it does function correctly.

              • S
                ssheikh
                last edited by Aug 4, 2013, 4:35 AM

                Check your gateway monitors. Your pings to them are dying.

                If this is because the link is saturated with traffic, then try relaxing your gateway monitoring thresholds.

                • W
                  wabashky
                  last edited by Aug 4, 2013, 5:02 AM

                  Thanks. We will try that out over a couple of days and see if it works.

                  • W
                    wabashky
                    last edited by Aug 5, 2013, 9:20 PM

                    That didn't work for us. We were still going down, just not being notified about it or logging it as often. BUT!

                    We found this doc on pfsense.org:

                    http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards

                    and (knock on wood) we have not had an outage at all today! Maybe we are good? We will monitor over the next week and I'll update if we find anything.

                    Thanks ssheikh

                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.