    WAN2 goes down for packet loss, doesn't come back up until gateways page viewed

    • S
      SteveITS last edited by

      I have a client's SG-2440 with two WANs (the same unit as in https://forum.netgate.com/topic/147889/member-down-triggering-with-0-loss, though that is probably not relevant since that issue is on WAN1). It runs 2.4.4-p3. On multiple occasions this year, when WAN2 has gone down it has stayed down until I log in to the router and view the gateways page, at which point pfSense suddenly realizes the connection is up again. Logs (newest first):

      Jul 6 18:01:32 	php-cgi 		notify_monitor.php: Message sent to support@example.com OK
      Jul 6 18:01:30 	php-fpm 	77236 	/system_gateways.php: 77236MONITOR: WAN2_DHCP is available now, adding to routing group GWGROUP 8.8.8.8|172.16.0.51|WAN2_DHCP|18.194ms|0.461ms|0.0%|none
      Jul 6 18:01:22 	php-fpm 	3179 	/index.php: Successful login for user 'admin' from: 173.x.x.x (Local Database)
      Jul 6 17:06:12 	check_reload_status 		Reloading filter
      Jul 6 17:06:12 	check_reload_status 		Restarting OpenVPN tunnels/interfaces
      Jul 6 17:06:12 	check_reload_status 		Restarting ipsec tunnels
      Jul 6 17:06:12 	check_reload_status 		updating dyndns WAN2_DHCP
      Jul 6 17:06:12 	rc.gateway_alarm 	99608 	>>> Gateway alarm: WAN2_DHCP (Addr:8.8.8.8 Alarm:0 RTT:18.124ms RTTsd:.421ms Loss:13%)
      Jul 6 17:03:26 	php-cgi 		notify_monitor.php: Message sent to support@example.com OK
      Jul 6 17:03:26 	php-fpm 	3179 	/rc.openvpn: MONITOR: WAN2_DHCP is down, omitting from routing group GWGROUP 8.8.8.8|172.16.0.51|WAN2_DHCP|18.312ms|0.485ms|22%|down
      Jul 6 17:03:25 	check_reload_status 		Reloading filter
      Jul 6 17:03:25 	check_reload_status 		Restarting OpenVPN tunnels/interfaces
      Jul 6 17:03:25 	check_reload_status 		Restarting ipsec tunnels
      Jul 6 17:03:25 	check_reload_status 		updating dyndns WAN2_DHCP
      Jul 6 17:03:25 	rc.gateway_alarm 	79942 	>>> Gateway alarm: WAN2_DHCP (Addr:8.8.8.8 Alarm:1 RTT:18.312ms RTTsd:.460ms Loss:21%)
      

      It doesn't matter whether I log in an hour or a couple of days later: the gateway comes back immediately when I view the system_gateways.php page. Is there some way to get pfSense to realize on its own that WAN2 is online again?

      WAN1 doesn't seem to have this problem.
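The pipe-separated status at the end of each MONITOR line in the log can be split into named fields, which makes the "down" and "available" entries easier to compare. This is only a minimal PHP sketch: `parse_monitor_status()` is a hypothetical helper, not a pfSense function, and the field names are guesses inferred from the values and from the matching rc.gateway_alarm lines.

```php
<?php
// Hypothetical helper for illustration; not part of pfSense.
// Field names are assumptions inferred from the values in the log lines.
function parse_monitor_status(string $line): array
{
    $keys = ['monitor_ip', 'source_ip', 'gateway', 'rtt', 'rtt_stddev', 'loss', 'status'];
    $values = explode('|', trim($line));
    if (count($values) !== count($keys)) {
        throw new InvalidArgumentException('Expected ' . count($keys) . ' pipe-separated fields');
    }
    return array_combine($keys, $values);
}

// Status strings copied from the 17:03:26 and 18:01:30 log entries.
$down = parse_monitor_status('8.8.8.8|172.16.0.51|WAN2_DHCP|18.312ms|0.485ms|22%|down');
$up   = parse_monitor_status('8.8.8.8|172.16.0.51|WAN2_DHCP|18.194ms|0.461ms|0.0%|none');

echo $down['gateway'], ': ', $down['status'], ' at ', $down['loss'], " loss\n";
echo $up['gateway'], ': ', $up['status'], ' at ', $up['loss'], " loss\n";
```

Read this way, the two entries differ only in the measured values and the final status field, which fits the gateway having recovered well before the status was re-evaluated.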

      • S
        serbus last edited by

        Hello!

        Could be related to:

        https://redmine.pfsense.org/issues/9450

        John

        • S
          SteveITS last edited by

          Hmm, sounds similar. dpinger logged:

          Jul 6 17:06:12 dpinger WAN2_DHCP 8.8.8.8: Clear latency 18124us stddev 421us loss 13%
          Jul 6 17:03:25 dpinger WAN2_DHCP 8.8.8.8: Alarm latency 18312us stddev 460us loss 21%

          So dpinger cleared the gateway-down alarm because packet loss had dropped back under 20%?

          I definitely do not have to save the gateway, though I have clicked the edit button to open it. Next time I can try just sitting on the system_gateways.php page for a bit to see if it sends the email.
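The threshold reasoning in the question can be checked against the two dpinger lines with a few lines of PHP. This is a rough sketch under assumptions: `loss_state()` is a hypothetical helper, not dpinger's actual code; it assumes 20% is the configured high loss threshold and that loss must exceed it (rather than merely equal it) to raise the alarm.

```php
<?php
// Hypothetical helper for illustration; not part of dpinger or pfSense.
// Assumes a 20% high loss threshold and that loss must exceed it to alarm.
function loss_state(int $lossPercent, int $highThreshold = 20): string
{
    return $lossPercent > $highThreshold ? 'Alarm' : 'Clear';
}

// Loss values taken from the two dpinger log lines in this post.
foreach (['17:03:25' => 21, '17:06:12' => 13] as $time => $loss) {
    echo $time, ' loss ', $loss, '% -> ', loss_state($loss), "\n";
}
```

Under those assumptions the output matches the log: alarm at 21%, clear at 13%. That would mean dpinger itself recovered at 17:06:12, and what stayed stale was the gateway group's view of the member.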

          • S
            serbus last edited by

            Hello!

            If you want to go the kludge route, you could run some code like this when the gateway status is out of sync:

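            /***********************************************************************/
            #!/usr/local/bin/php-cgi -q
            <?php
            // Gateway group helpers, including get_gwgroup_members().
            require_once("gwlb.inc");

            // -g takes the gateway group name.
            $options = getopt("g:");

            $members = [];
            $gwgroup = "";

            // Only read -g if it was passed, to avoid an undefined index notice.
            if (!empty($options['g'])) {
                $gwgroup = $options['g'];
            }

            // Looking up the group members may prompt pfSense to re-check each gateway.
            if (!empty($gwgroup)) {
                $members = get_gwgroup_members($gwgroup);
            }

            var_dump($members);

            ?>
            /***********************************************************************/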

            Run...

            php /saved/here/named_this.php -g="GWGRP_Name"

            ...from a shell/cron/DiagCommandPrompt/etc...

            This might prod get_gwgroup_members_inner() to reactivate the member.

            John

            • N
              netblues @serbus last edited by

              I wouldn't trust pinging Google DNS to judge gateway availability. I have seen Google rate-limit pings, which makes the monitor pings fail while everything else still works.
              You can always find something closer within your ISP for such checks.

              As for the Redmine bug, just hitting edit certainly doesn't do anything until you save.
              I don't see this on other multi-WAN setups, though.

              • S
                SteveITS last edited by

                I edited that a bit and ran this from Diagnostics/Command Prompt:

                require_once("gwlb.inc");
                $members = [];
                $gwgroup = 'GWGROUP';
                if (!empty($gwgroup)) {
                    $members = get_gwgroup_members($gwgroup);
                }
                var_dump($members);
                

                That reconnected the gateway as you theorized. In practice of course just viewing the gateways page is easier. :)

                Re: which IP to ping, I've tried picking an ISP router partway up the chain, but over time those can change. Since this is at a client's site, that would be difficult to correct if the link goes down, though in this case both WANs would likely not drop together. Pinging the ISP's router at the other end of the patch cable is of course not that helpful, though I've seen people leave the monitoring IP empty, which does exactly that. :)

                • S
                  serbus last edited by

                  Hello!

                  You could schedule the command to run every so often, and then you wouldn't have to log in to refresh the group manually.

                  John

                  • S
                    SteveITS last edited by

                    I set up a cron job. Before that, some interesting notes for posterity:

                    Twice in the last couple of weeks the WAN2 gateway status reset by itself at 1:01 am. There is a cron job that runs /etc/rc.dyndns.update at that time. No, we don't have DDNS set up. There are, however, other days in the last few months when it did not reset itself at that time. It's unclear why.

                    I found by accident this morning that if I edit/add a firewall rule and save/reload, that also updates the gateway status.

                    • S
                      SteveITS @netblues last edited by

                      @netblues For what it's worth, switching the gateway monitor target away from Google DNS didn't "prevent" the packet loss.

                      • S
                        SteveITS @SteveITS last edited by SteveITS

                        I noticed this was fixed in 2.5/21.2:
                        https://redmine.pfsense.org/issues/10546
                        "In this case, pfsense will consider a gateway down when it has actually returned to a normal state, necessitating administrator action to return it back to a proper state."
