Netgate Discussion Forum

    2.0RC1 Multi-Wan: No default gateway/route change after link failure.

    2.0-RC Snapshot Feedback and Problems - RETIRED
    39 Posts 13 Posters 22.9k Views
    • S
      sot010174

      It did work indeed… Once. ???

      I had both interfaces up, then as usual, I removed Virtua's cable and pfsense changed the route to WAN's gateway (YAY!!!). When Virtua was back online, pfsense recognized that and rewrote the rules back to virtua's gateway (Yay x2!).

      However, the second time I tried it, it did rewrite the route to the WAN's gateway when Virtua failed but it didn't revert to Virtua when it was back online (Default gateway).

      Oh well...

      Here's the full log after restoring virtua's access and then removing the cable again:
      Apr 6 22:32:51 kernel: arpresolve: can't allocate llinfo for 201.17.96.1
      Apr 6 22:32:56 kernel: arpresolve: can't allocate llinfo for 201.17.96.1
      Apr 6 22:33:05 kernel: arpresolve: can't allocate llinfo for 201.17.96.1
      Apr 6 22:33:49 kernel: em2: link state changed to UP (Plugged cable back - 1st time)
      Apr 6 22:33:49 check_reload_status: Linkup starting em2
      Apr 6 22:33:49 php: : DEVD Ethernet attached event for opt1
      Apr 6 22:33:49 php: : HOTPLUG: Configuring interface opt1
      Apr 6 22:33:50 dhclient: PREINIT
      Apr 6 22:33:50 dhclient[54413]: DHCPREQUEST on em2 to 255.255.255.255 port 67
      Apr 6 22:33:50 dhclient[54413]: DHCPACK from 201.17.96.1
      Apr 6 22:33:50 dhclient: REBOOT
      Apr 6 22:33:50 dhclient: Starting add_new_address()
      Apr 6 22:33:50 dhclient: ifconfig em2 inet 201.17.110.81 netmask 255.255.240.0 broadcast 201.17.111.255
      Apr 6 22:33:50 dhclient: New IP Address (em2): 201.17.110.81
      Apr 6 22:33:50 dhclient: New Subnet Mask (em2): 255.255.240.0
      Apr 6 22:33:50 dhclient: New Broadcast Address (em2): 201.17.111.255
      Apr 6 22:33:50 dhclient: New Routers (em2): 201.17.96.1
      Apr 6 22:33:50 dhclient: Adding new routes to interface: em2
      Apr 6 22:33:50 dhclient: /sbin/route add default 201.17.96.1
      Apr 6 22:33:50 dhclient: Creating resolv.conf
      Apr 6 22:33:50 check_reload_status: rc.newwanip starting em2
      Apr 6 22:33:51 php: : rc.newwanip: Informational is starting em2.
      Apr 6 22:33:51 dhclient[54413]: bound to 201.17.110.81 – renewal in 5400 seconds.
      Apr 6 22:33:51 php: : rc.newwanip: on (IP address: 201.17.110.81) (interface: opt1) (real interface: em2).
      Apr 6 22:33:51 php: : ROUTING: change default route to 201.17.96.1
      Apr 6 22:33:51 apinger: alarm canceled: VIRTUA(201.17.96.1) *** down ***
      Apr 6 22:33:51 apinger: Exiting on signal 15.
      Apr 6 22:33:51 check_reload_status: reloading filter
      Apr 6 22:33:52 php: : Gateways status could not be determined, considering all as up/active.
      Apr 6 22:33:52 php: : Gateways status could not be determined, considering all as up/active.
      Apr 6 22:33:52 php: : Gateways status could not be determined, considering all as up/active.
      Apr 6 22:33:52 check_reload_status: reloading filter
      Apr 6 22:33:53 apinger: Starting Alarm Pinger, apinger(13659)
      Apr 6 22:33:53 php: : Gateways status could not be determined, considering all as up/active.
      Apr 6 22:33:53 php: : Gateways status could not be determined, considering all as up/active.
      Apr 6 22:33:53 php: : Gateways status could not be determined, considering all as up/active.
      Apr 6 22:34:40 apinger: ALARM: VIRTUA(201.17.96.1) *** down *** - Second time it's down
      Apr 6 22:34:50 check_reload_status: reloading filter
      Apr 6 22:34:51 php: : Default gateway down setting WAN as default!
      Apr 6 22:34:51 php: : MONITOR: VIRTUA is down, removing from routing group
      Apr 6 22:34:51 php: : MONITOR: VIRTUA is down, removing from routing group
      Apr 6 22:34:51 php: : MONITOR: VIRTUA is down, removing from routing group
      Apr 6 22:35:07 dnsmasq[10400]: reading /etc/resolv.conf
      Apr 6 22:35:07 dnsmasq[10400]: using nameserver 200.165.132.154#53
      Apr 6 22:35:07 dnsmasq[10400]: using nameserver 200.149.55.142#53
      Apr 6 22:35:07 dnsmasq[10400]: using nameserver 201.17.0.95#53
      Apr 6 22:35:07 dnsmasq[10400]: using nameserver 201.17.0.94#53
      Apr 6 22:35:13 apinger: alarm canceled: VIRTUA(201.17.96.1) *** down *** - UP again, but no route change  :(
      Apr 6 22:35:23 check_reload_status: reloading filter

      Update 01: It seems that if the GATEWAY fails, pfSense won't revert the routes. However, if the INTERFACE goes down, it does rewrite them. I'm going to test some more.

      Update 02: That's it. If the gateway goes offline while the interface stays up, pfSense won't revert the route BACK to the default GW once it recovers (it does change the route to a backup, however). Removing the cable instead makes the fix work as intended.
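The pattern described in the two updates can be spotted mechanically in a system log. The following is only a rough Python sketch, not anything pfSense ships: it assumes that "Default gateway down setting ... as default!" marks a failover, that "ROUTING: change default route to" marks the route being restored, and that an apinger "alarm canceled" seen while the failover is still active means a gateway recovered without a failback. The function name `stuck_failbacks` is hypothetical, and the sample lines are copied from the log above with the hand-written annotations trimmed.

```python
import re

# Lines taken from the log excerpt in this post (annotations removed).
LOG = """\
Apr 6 22:33:51 php: : ROUTING: change default route to 201.17.96.1
Apr 6 22:33:51 apinger: alarm canceled: VIRTUA(201.17.96.1) *** down ***
Apr 6 22:34:40 apinger: ALARM: VIRTUA(201.17.96.1) *** down ***
Apr 6 22:34:51 php: : Default gateway down setting WAN as default!
Apr 6 22:35:13 apinger: alarm canceled: VIRTUA(201.17.96.1) *** down ***
Apr 6 22:35:23 check_reload_status: reloading filter
"""

# Assumed meaning of each message; see the paragraph above.
FAILOVER = re.compile(r"Default gateway down setting (\w+) as default")
RESTORE = re.compile(r"ROUTING: change default route to")
CANCELED = re.compile(r"apinger: alarm canceled: (\w+)\(")


def stuck_failbacks(log_text):
    """Return gateways that recovered while the failover route was still active."""
    failover_active = False
    stuck = []
    for line in log_text.splitlines():
        if FAILOVER.search(line):
            failover_active = True
        elif RESTORE.search(line):
            # The default route was rewritten, so nothing is stuck any more.
            failover_active = False
            stuck.clear()
        else:
            match = CANCELED.search(line)
            if match and failover_active:
                stuck.append(match.group(1))
    return stuck


print(stuck_failbacks(LOG))  # ['VIRTUA']
```

In the cable-replug case the "ROUTING: change default route" line appears before the alarm is canceled, so the same check reports nothing.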

      • T
        torsurfer

        Updated to snapshot 20110406-1323, and I'm thrilled to report it now works for me!  ;D BIG thank you to Ermal for the hard work. (Squid is 'cruising' along smoothly now for clients.)

        This is what I've been doing before applying the Snapshot. I took Ermal's suggestion to edit the WAN gateway and 'ticked' the checkbox 'default gateway'. I didn't do this before, as I thought this was not needed if I wanted to load-balance the traffic. Anyhow…. I re-tested. Took down WAN, checked the routing table... Nope. No dice. Default route for OPT WAN was not written to the table. Pinging the Web from pfSense just returned 'no trace route'.

        (Btw, in reply to Ermal's earlier question to me, I've no idea why 192.168.2.1 is not in the routing table. But I guess the routing table does have 192.168.2.0/24 (link #3). That ought to do it, I suppose. I've even manually keyed-in the IP in the OPT WAN gateway to replace 'dynamic', but it didn't change the routing table entry.)

        After I applied the Snapshot though, everything just started to work. If WAN is down, a default route is written to the table based on OPT WAN. And when WAN is back up again, default route is re-written to the table based on WAN. I tested this four times, worked in all (yay!).

        I think I've had better luck than sot010174: in all my tests I just took down the gateway, and the default route was properly written to the table (and vice versa). I didn't need to take down the interface for this to work.

        Good luck sot010174!

        • E
          eri--

          Good that it worked for you torsurfer.

          sot010174 you skipped the interesting part of the log you posted at the end :)

          • S
            sot010174

            Hi ermal! It will work for me too eventually, I'm sure !  :D

            That's the interesting bit. There aren't any additional entries on the log. I'll repeat the scenario and post everything here.

            CAP01: This happens whenever I shut down Virtua's gateway (notice that the interface stays up).
            CAP02: The screenshot of the log panel before I restore Virtua's gateway.
            CAP03: Virtua's gateway is UP again. pfSense recognizes that but doesn't restore the default route (it keeps WAN's route, previously set when Virtua's gateway went down).

            Can you please provide the command to activate the auto route changing mechanism if possible? Maybe if I force it to run even if it doesn't detect any changes it will restore the correct settings.

            Oh, and another (bad) thing: I left the test router online and went to sleep. When I woke up this morning, no default route was set. I examined the system.log and saw that both ISPs had a busy night and there were times that both links went down. I'm going to conduct further testing and I'll report back.

            Anyway, thanks a lot for looking into this issue.

            CAP01.JPG
            CAP02.JPG
            CAP03.JPG

            • L
              lnaimi

              So for a load-balancing and failover configuration, is it necessary to set a default gateway? I have 3 gateways and have not set any of them as the default, but I have not yet seen how it behaves when a gateway is down.

              • E
                eri--

                lnaimi, it is not necessary to set it, since pfSense will assume one by default: WAN.
                Please do not hijack the thread since my comments are only related to the topic.

                • S
                  sot010174

                  Well an update:

                  I managed to test again (just once, I was in a hurry) the scenario in which both gateways go down simultaneously.

                  1. This time, I simply removed both WAN and Virtua's cable. With the log cleared, I watched the results (2GWDown.log).

                  2. Cleared the log again and plugged in WAN's ethernet cable (WANUP.log). pfSense recognized the link up, restored the PPPOE link, but not the route (no default route in the table).

                  3. Cleared the log and plugged in Virtua's ethernet cable (VIRUPAFWAN.log). pfSense recognized the link up, got its (same) IP configuration and WROTE Virtua's default route in the table.

                  I guess if I had done what I did before (shutting down Virtua's gateway and letting the interface stay up) I would have ended up with no route, as happened previously. I think I couldn't even surf the web.

                  I tried to rename a zip file to .txt and attach to this message but I couldn't, so here's a link to it:

                  Logfiles:
                  http://www.mediafire.com/?xh4d08orgka6kyr

                  • L
                    lnaimi

                    Thanks

                    • C
                      crzykidd

                      I seem to have this exact same problem.  I am still on the official RC1 build (AMD64).  I have tried with and without a default gateway checked.  netstat -rn just shows the same default route regardless of what happens.  Both of my connections are ethernet to fiber feeds, so they don't go down very often.  I will follow this thread and upgrade to a build once it has been confirmed fixed.  If there is a build in which this should work, I will test it and report back.

                      Thanks,
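For anyone who wants to check the default route from a script rather than by eye, here is a minimal Python sketch that pulls the default gateway out of `netstat -rn` style text. The function name `default_gateway` is made up for illustration, and the sample table is not real output from any poster's box: it only reuses addresses mentioned in this thread in the usual FreeBSD column layout, with the interface names assumed.

```python
def default_gateway(netstat_output):
    """Return the gateway of the first 'default' route, or None if there is none."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # A route line starts with the destination, followed by the gateway.
        if len(fields) >= 2 and fields[0] == "default":
            return fields[1]
    return None


# Illustrative sample only (assumed layout and interface names).
SAMPLE = """\
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            201.17.96.1        UGS         0     1234    em2
192.168.2.0/24     link#3             U           0      567    em1
"""

print(default_gateway(SAMPLE))
```

Running this before and after taking a gateway down shows whether the default route actually moved; a result of `None` corresponds to the "no default route in the table" case reported earlier in the thread.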

                      • D
                        DimC

                        I have the same problem (2.0-RC1 (i386) built on Tue Apr 19 00:04:38 EDT 2011).
                        If the gateway goes offline and the interface stays up, pfsense won't revert the routes BACK to the default GW.
                        Is there a solution for this problem?

                        Thank you.

                        • L
                          lp

                          Same problem on 2.0-RC1 (i386) built on Mon May 2 05:54:35 EDT 2011 .
                          I'll probably take a look at the source code a bit later.

                          • S
                            sergu61

                            After May 18th, the same problem is back! :(

                            I had to roll back to the snapshot "pfSense-Full-Update-2.0-RC1-i386-20110517-2328.tgz". On that one the default route switches automatically:
                            ---------
                            May 20 2:36:45 PM pfsense php:: Default gateway down setting 2_GW as default!
                            May 20 2:36:45 PM pfsense php:: MONITOR: 1_GW is down, removing from routing group

                            On the newer snapshots (after May 18th), the first of these lines is missing from system.log!

                            Sergu61

                            • E
                              eri--

                              The change has been backed out since it caused issues and i have plans to put a knob under system->advanced to allow enabling it.

                              Give me time to come to it.

                              • S
                                sot010174

                                @ermal:

                                The change has been backed out since it caused issues and i have plans to put a knob under system->advanced to allow enabling it.

                                Give me time to come to it.

                                Please include a "timed refresh" to this feature. In my case I think it would solve all my problems, something like:

                                Recheck default gateway route every ___ minutes. (and it would call the auto-change function or write the default gateway route if all gateways are down).
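To make the request concrete, here is a rough Python sketch of what such a timed refresh could look like. It is only an illustration of the idea above, not pfSense code: every name in it (`recheck_once`, `run_timed_refresh`, and the `get_status`, `auto_change` and `write_default_route` callbacks) is hypothetical, and the real gateway monitoring and route-writing would have to be plugged in by whoever implements it.

```python
import time


def recheck_once(status, auto_change, write_default_route):
    """One timed check. `status` maps gateway name -> True if the gateway is up."""
    if any(status.values()):
        # At least one gateway is up: let the auto-change logic pick the route.
        auto_change(status)
        return "auto_change"
    # All gateways are down: write the configured default route back anyway.
    write_default_route()
    return "write_default"


def run_timed_refresh(interval_minutes, get_status, auto_change,
                      write_default_route, rounds, sleep=time.sleep):
    """Run `rounds` checks, waiting `interval_minutes` after each one."""
    actions = []
    for _ in range(rounds):
        actions.append(recheck_once(get_status(), auto_change, write_default_route))
        sleep(interval_minutes * 60)
    return actions
```

The `sleep` parameter is there only so the loop can be exercised without actually waiting; a real scheduler such as cron would replace the loop entirely.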

                                • L
                                  lhodgkins

                                  Hi Guys,
                                      Just curious what the status of this issue is…  Has the fix been incorporated into the latest snapshot yet??

                                  • S
                                    sergu61

                                    In the file system_advanced_misc.php
                                    in line:
                                    …..
                                    />
                                    …..

                                    need to change:
                                    lb_use_sticky    ->      gw_switch_default.

                                    After that, everything starts to work!

                                    • E
                                      eri--

                                      I just fixed this.

                                      • S
                                        sergu61

                                        Thank you!
                                        Now the system has become even more comfortable!

                                        • P
                                          phospher

                                          Awesome. I've been waiting for this fix as well.  How long does it usually take for the snapshot to get built?

                                        • M
                                          Michael Sh.

                                            I corrected it a little. For me it works better:
                                            1. It restores the configured default gateway if that gateway is up.
                                            2. It skips IPv6 gateways for now.
                                            3. It doesn't touch the routing table unless it needs to.
                                            
                                            if (isset($config['system']['gw_switch_default'])) {
                                                /*
                                                 * NOTE: The code below is meant to replace the default gateway when it goes down.
                                                 *       This lets services that run on pfSense itself, and are not handled by PBR, keep working.
                                                 */
                                                $upgw = "";
                                                $dfltgwfound = false;
                                                foreach ($gateways_arr as $gwname => $gwsttng) {
                                                    /* Only IPv4 gateways are considered; IPv6 gateways are skipped for now */
                                                    if (is_ipaddrv4($gwsttng['gateway'])) {
                                                        if (isset($gwsttng['defaultgw'])) {
                                                            $dfltgwfound = true;
                                                            /* Prefer the configured default gateway whenever it is up */
                                                            if (!stristr($gateways_status[$gwname]['status'], "down"))
                                                                $upgw = $gwname;
                                                        }
                                                        /* Otherwise remember the first gateway that is up as a fallback */
                                                        if (empty($upgw) && !stristr($gateways_status[$gwname]['status'], "down"))
                                                            $upgw = $gwname;
                                                    }
                                                }
                                                /* No gateway is flagged as default: use WAN if it is up */
                                                if ($dfltgwfound == false) {
                                                    $gwname = convert_friendly_interface_to_friendly_descr("wan");
                                                    if (!stristr($gateways_status[$gwname]['status'], "down"))
                                                        $upgw = $gwname;
                                                }
                                                if (!empty($upgw)) {
                                                    if ($gateways_arr[$upgw]['gateway'] == "dynamic")
                                                        $gateways_arr[$upgw]['gateway'] = get_interface_gateway($gateways_arr[$upgw]['friendlyiface']);
                                                    if (is_ipaddr($gateways_arr[$upgw]['gateway'])) {
                                                        exec("/usr/bin/netstat -rnf inet | /usr/bin/awk '/default/ {print $2}'", $currgwip);
                                                        /* Compare whole addresses so the route is only rewritten when it really differs */
                                                        if (empty($currgwip[0]) || $gateways_arr[$upgw]['gateway'] != $currgwip[0]) {
                                                            log_error("Setting default gateway '{$upgw}' will replace (IP={$currgwip[0]})!");
                                                            mwexec("/sbin/route delete -inet default; /sbin/route add -inet default {$gateways_arr[$upgw]['gateway']}");
                                                        }
                                                    }
                                                }
                                                unset($upgw, $dfltgwfound, $currgwip, $gwname, $gwsttng);
                                            }
                                            
                                            

                                            Testing is not easy, unfortunately: I get frequent kernel panics when the interfaces flap.
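Since testing on a live box is awkward, the selection rules from the PHP fragment above can be tried out off the firewall. The following Python model is a loose sketch of those rules only and leaves out everything that touches the system (netstat, route, dynamic gateway lookup). The function names, the `wan_name` parameter and the 192.0.2.x / 198.51.100.x addresses are assumptions made for the example; only 201.17.96.1 and the gateway names come from this thread.

```python
import ipaddress


def is_ipv4(value):
    """True if `value` parses as an IPv4 address (so 'dynamic' and IPv6 are skipped)."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def pick_default_gateway(gateways, status, wan_name="WAN"):
    """
    gateways: dict of name -> {"gateway": address, "defaultgw": bool}, in config order
    status:   dict of name -> status string; anything containing "down" counts as down
    Returns the name of the gateway the default route should use, or None.
    """
    def is_up(name):
        return "down" not in status.get(name, "").lower()

    upgw = None
    default_found = False
    for name, cfg in gateways.items():
        if not is_ipv4(cfg["gateway"]):
            continue
        if cfg.get("defaultgw"):
            default_found = True
            if is_up(name):
                upgw = name  # the configured default wins whenever it is up
        if upgw is None and is_up(name):
            upgw = name  # first gateway that is up, kept as a fallback
    if not default_found and is_up(wan_name):
        upgw = wan_name  # nothing flagged as default: fall back to WAN
    return upgw


def needs_route_change(target_ip, current_ip):
    """Only rewrite the default route when the address really differs."""
    return target_ip != current_ip


gateways = {
    "WAN": {"gateway": "192.0.2.1"},
    "VIRTUA": {"gateway": "201.17.96.1", "defaultgw": True},
}
print(pick_default_gateway(gateways, {"WAN": "none", "VIRTUA": "down"}))  # WAN
print(pick_default_gateway(gateways, {"WAN": "none", "VIRTUA": "none"}))  # VIRTUA
```

The second call is the failback case discussed throughout this thread: once the configured default gateway is up again, it is chosen over the backup.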

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.