Netgate Discussion Forum

    Traffic is not re-routed over secondary internet connection (PPPOE), once it returns from being down.

    Plus 22.05 Development Snapshots (Retired)
    32 Posts 4 Posters 3.9k Views
    • B
      BNetworker
      last edited by BNetworker

      Interesting stuff here: no filter reload is logged after WAN2 comes back up. What caught my eye is the line "WAN2_PPPOE is available now, adding to routing group WAN1WAN2". I don't see any equivalent line for the WAN2WAN1 gateway group.

      A manual filter reload fixed it again. Everything back to normal.

      May 26 17:22:44	ppp	27297	[opt1] IPCP: state change Ack-Sent --> Opened
      May 26 17:22:44	ppp	27297	[opt1] IPCP: LayerUp
      May 26 17:22:44	ppp	27297	[opt1] 174.x.x.x -> 67.x.x.x
      May 26 17:22:44	check_reload_status	634	rc.newwanip starting pppoe0
      May 26 17:22:44	ppp	27297	[opt1] IFACE: Up event
      May 26 17:22:44	ppp	27297	[opt1] IFACE: Rename interface ng0 to pppoe0
      May 26 17:22:44	ppp	27297	[opt1] IFACE: Add description "WAN2"
      May 26 17:22:45	php-fpm	16062	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
      May 26 17:22:45	php-fpm	16062	/rc.newwanip: rc.newwanip: on (IP address: 174.x.x.x) (interface: WAN2[opt1]) (real interface: pppoe0).
      May 26 17:22:46	php-fpm	16062	/rc.newwanip: MONITOR: WAN2_PPPOE is available now, adding to routing group WAN1WAN2
      May 26 17:22:46	php-fpm	16062	67.x.x.x|174.x.x.x|WAN2_PPPOE|7.753ms|0.132ms|0.0%|online|none
      May 26 17:22:46	php-fpm	16062	/rc.newwanip: Gateway, NONE AVAILABLE
      May 26 17:22:46	php-fpm	16062	/rc.newwanip: IP Address has changed, killing states on former IP Address 174.x.x.x.
      May 26 17:22:46	php-fpm	16062	/rc.newwanip: Resyncing OpenVPN instances for interface WAN2.
      May 26 17:22:47	php-fpm	16062	/rc.newwanip: Creating rrd update script
      May 26 17:22:47	php-fpm	16062	/rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - 174.x.x.x -> 174.x.x.x - Restarting packages.
      May 26 17:22:47	check_reload_status	634	Starting packages
      May 26 17:22:47	php	21531	notify_monitor.php: Message sent to xxxxxxxx@gmail.com OK
      May 26 17:22:48	php-fpm	79431	/rc.start_packages: Restarting/Starting all packages.
      

      I waited a while and nothing changed, so I did the manual filter reload:

      May 26 18:59:27	check_reload_status	634	Reloading filter
      
      • jimpJ
        jimp Rebel Alliance Developer Netgate
        last edited by

        Based on the log messages from rc.newwanip, the code path it took did run filter_configure_sync(), but maybe that happened too early in the process and it needs another reload after the later items are reconfigured.

        You could try a change like this to see if it makes a difference:

        diff --git a/src/etc/rc.newwanip b/src/etc/rc.newwanip
        index 34aa4c602d..ae2c68fa38 100755
        --- a/src/etc/rc.newwanip
        +++ b/src/etc/rc.newwanip
        @@ -279,9 +279,7 @@ if (!is_ipaddr($oldip) || ($curwanip != $oldip) || file_exists("{$g['tmp_path']}
                if (empty($config['interfaces'][$interface]['ipaddrv6'])) {
                        unlink_if_exists("{$g['tmp_path']}/{$interface}_upstart6");
                }
        -} else {
        -       /* signal filter reload */
        -       filter_configure();
         }
         
        +filter_configure();
         ?>
        

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • jimpJ
          jimp Rebel Alliance Developer Netgate
          last edited by

          I just tested this in my lab on 2.7.0 snapshots, and even without the above change it seems to work for me. After a PPPoE interface reconnected, I checked the rules and the interface was back in the gateway groups where it should be the preferred tier.

          • B
            BNetworker
            last edited by

            @jimp - Thanks for the additional info. I made the suggested change:

            unlink_if_exists("{$g['tmp_path']}/{$interface}_upstart6");
            	}
            
            }
            
            +filter_configure();
            ?>
            

            And I'm super happy to report that the filter did reload, and WAN2 came up in the gateways. Clients are routing NEW traffic out WAN2 as expected now:

            May 27 08:34:35	ppp	27297	[opt1] IPCP: LayerUp
            May 27 08:34:35	ppp	27297	[opt1] 174.x.x.x -> 67.x.x.x
            May 27 08:34:35	check_reload_status	634	rc.newwanip starting pppoe0
            May 27 08:34:35	ppp	27297	[opt1] IFACE: Up event
            May 27 08:34:35	ppp	27297	[opt1] IFACE: Rename interface ng0 to pppoe0
            May 27 08:34:35	ppp	27297	[opt1] IFACE: Add description "WAN2"
            May 27 08:34:36	php-fpm	16062	/rc.newwanip: rc.newwanip: Info: starting on pppoe0.
            May 27 08:34:36	php-fpm	16062	/rc.newwanip: rc.newwanip: on (IP address: 174.x.x.x) (interface: WAN2[opt1]) (real interface: pppoe0).
            May 27 08:34:38	php-fpm	16062	/rc.newwanip: MONITOR: WAN2_PPPOE is available now, adding to routing group WAN1WAN2
            May 27 08:34:38	php-fpm	16062	67.x.x.x|174.x.x.x|WAN2_PPPOE|7.966ms|0.128ms|0.0%|online|none
            May 27 08:34:38	php-fpm	16062	/rc.newwanip: Gateway, NONE AVAILABLE
            May 27 08:34:38	php-fpm	16062	/rc.newwanip: IP Address has changed, killing states on former IP Address 174.x.x.x.
            May 27 08:34:38	php-fpm	16062	/rc.newwanip: Resyncing OpenVPN instances for interface WAN2.
            May 27 08:34:38	php-fpm	16062	/rc.newwanip: Creating rrd update script
            May 27 08:34:38	php-fpm	16062	/rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - 174.x.x.x.x -> 174.x.x.x - Restarting packages.
            May 27 08:34:38	check_reload_status	634	Starting packages
            May 27 08:34:38	check_reload_status	634	Reloading filter
            
            # Gateways
            GWWAN_DHCP = " route-to ( ix3 73.x.x.1 ) "
            GWWAN2_PPPOE = " route-to ( pppoe0 67.x.x.10 ) "
            GWWAN1WAN2 = "  route-to { ( ix3 73.x.x.1 )  }  "
            GWWAN2WAN1 = "  route-to { ( pppoe0 67.x.x.10 )  }  "
            
            • jimpJ
              jimp Rebel Alliance Developer Netgate
              last edited by

              Interesting. I'm curious why it works OK for me here in my lab without that change.

              Without knowing more about why it helps I'm hesitant to commit the change as-is. Though it should be reasonably safe from what I can see.

              • B
                BNetworker
                last edited by BNetworker

                @jimp - As you can see, I had accidentally left the + in:

                +filter_configure();
                

                Funny thing is it still resolved the issue. Not sure if it still ran the command, or if it was removing the other code that allowed it to work. I took out the + and tested, still works.

                • jimpJ
                  jimp Rebel Alliance Developer Netgate
                  last edited by

                  The + in that context is fairly harmless: it would affect the return value of the function, but the return value isn't checked, so it's just tossed out.
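To illustrate the point about the stray +, here is a minimal stand-alone sketch. It does not use any pfSense code: `ReloadCounter` and `request_reload()` are hypothetical stand-ins for a function that is called only for its side effect. In PHP, a leading + is the unary plus operator applied to the call's result, so the call itself still runs and the converted value is simply discarded.

```php
<?php
// Hypothetical stand-ins, not pfSense functions: a counter and a function
// that is called only for its side effect (bumping the counter).
final class ReloadCounter
{
    public static $count = 0;
}

function request_reload(): int
{
    // The side effect we care about; the return value is never inspected.
    return ++ReloadCounter::$count;
}

request_reload();   // plain call
+request_reload();  // unary plus on the result: the call still executes

echo ReloadCounter::$count, PHP_EOL; // prints 2
```

Both statements increment the counter, which matches the observation in this thread that the patched file behaved the same with and without the +.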

                  I made https://redmine.pfsense.org/issues/13228 to track this for the next release. For now, you can add that change as a System Patches package entry and set it to auto-apply.

                  • B
                    BNetworker
                    last edited by

                    Thanks @jimp - Will do. Let me know if you need any more testing, or can think of a way to further troubleshoot / debug.

                    • B
                      BNetworker
                      last edited by

                      Looking through the code, it must be matching this section:

                      if (!is_ipaddr($oldip) || ($curwanip != $oldip) || file_exists("{$g['tmp_path']}/{$interface}_upstart4") ||
                         (!is_ipaddrv4($config['interfaces'][$interface]['ipaddr']) && ($config['interfaces'][$interface]['ipaddr'] != 'dhcp'))) {
                      

                      Because we get this in the log, which comes from below that if statement:

                      May 27 08:34:38	php-fpm	16062	/rc.newwanip: IP Address has changed, killing states on former IP Address 174.x.x.107.
                      

                      The filter reload that is called then is:

                      filter_configure_sync();
                      

                      Since we are matching that section, we would skip this else and not actually do the filter_configure():

                      } else {
                      	/* signal filter reload */
                      	filter_configure();
                      

                      Is filter_configure_sync() functionally the same as the filter_configure() we manually put in?
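The branch analysis in this post can be condensed into a small stand-alone model. This is a sketch under assumptions, not the real rc.newwanip: `newwanip_reload_calls()` and its two flags are hypothetical names, and "later steps" stands in for whatever the script does after the early reload (the logs show an OpenVPN resync, the rrd script, and a package restart in that branch).

```php
<?php
// Toy model of the branch structure discussed in this thread. It is NOT the
// real rc.newwanip; the function and flag names are made up for illustration.

/**
 * Returns the ordered list of reload-related steps for one pass of the script.
 *
 * $ipChanged: true when the "IP address has changed" branch is taken.
 * $patched:   true to model the diff that moves filter_configure() out of
 *             the else branch to the end of the script.
 */
function newwanip_reload_calls(bool $ipChanged, bool $patched): array
{
    $calls = [];
    if ($ipChanged) {
        // Direct reload, early in the branch.
        $calls[] = 'filter_configure_sync';
        // Remaining work in the same branch.
        $calls[] = 'later steps';
    } elseif (!$patched) {
        // Original else branch: signal a filter reload.
        $calls[] = 'filter_configure';
    }
    if ($patched) {
        // Patched script: always signal a reload at the very end.
        $calls[] = 'filter_configure';
    }
    return $calls;
}

// Original script, IP-changed branch: no reload after the later steps.
echo implode(' -> ', newwanip_reload_calls(true, false)), PHP_EOL;
// prints: filter_configure_sync -> later steps

// Patched script, same branch: a reload is signalled once everything is done.
echo implode(' -> ', newwanip_reload_calls(true, true)), PHP_EOL;
// prints: filter_configure_sync -> later steps -> filter_configure
```

In this model the only difference the patch makes on the IP-changed path is the extra reload signalled at the end, which is consistent with the "Reloading filter" line that appears last in the May 27 log.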

                      • jimpJ
                        jimp Rebel Alliance Developer Netgate
                        last edited by

                        Both methods end up running filter_configure_sync() but one is directly running the function and the other sends the event through the event queue which can introduce a little delay before it gets executed.

                        • B
                          BNetworker
                          last edited by BNetworker

                          As a test, in rc.newwanip, I put it all back to default, then changed line 222 from

                          filter_configure_sync();
                          

                          to

                          filter_configure();
                          

                          Leaving the else at the bottom in place, it also functions correctly. In the logs I see the filter reloading much sooner, but it still works, so I'm not sure it's a timing issue. Maybe there is another issue with the filter_configure_sync() call.

                          check_reload_status	634	Reloading filter
                          
                          • jimpJ
                            jimp Rebel Alliance Developer Netgate
                            last edited by

                            IIRC, it has to call filter_configure_sync() on that code path because some of the functions called after it need the data it updates to be in place before they run. With filter_configure(), the reload may happen after those functions run, which leads to other problems.

                            Doing it again at the end is probably the safest way to handle it without (re)introducing other hard to chase down problems.
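The difference between running the reload directly and sending it through a queue, and why a second reload at the end is the conservative option, can be sketched with a toy model. Everything here is hypothetical (`ToyReloader` and its methods are not pfSense APIs), and the queue is drained deterministically at the end, whereas the real event queue is asynchronous and its timing is not guaranteed.

```php
<?php
// Toy model of "run now" versus "queue for later". Names are hypothetical;
// this does not reproduce pfSense's real event handling.
final class ToyReloader
{
    public $log = [];     // order in which things actually happen
    private $pending = 0; // number of queued reload requests

    // Direct call: the reload happens immediately.
    public function reloadSync(): void
    {
        $this->log[] = 'reload (direct)';
    }

    // Queued call: only records that a reload was requested.
    public function reloadQueued(): void
    {
        $this->pending++;
    }

    // Any other work done by the script.
    public function step(string $name): void
    {
        $this->log[] = $name;
    }

    // Processes queued requests; in this model that happens after the script.
    public function drain(): void
    {
        for ($i = 0; $i < $this->pending; $i++) {
            $this->log[] = 'reload (queued)';
        }
        $this->pending = 0;
    }
}

$r = new ToyReloader();
$r->reloadSync();        // early and direct, so later steps can rely on it
$r->step('later steps'); // work that may change what the rules should be
$r->reloadQueued();      // extra reload signalled at the very end
$r->drain();

echo implode(' -> ', $r->log), PHP_EOL;
// prints: reload (direct) -> later steps -> reload (queued)
```

Replacing the early direct call with a queued one in this model would put the reload after "later steps", which is the ordering problem described above; keeping the direct call and adding a queued one at the end covers both needs.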

                            • B
                              BNetworker @jimp
                              last edited by

                              @jimp Sure, that makes sense. I was just hoping to give you as much info as possible to try and narrow it down. I'm not too sure where to go from here to help find the root cause.

                              It would be interesting if @w0w could reproduce the (temp) fix with his setup as well.

                              • w0wW
                                w0w @BNetworker
                                last edited by w0w

                                @bnetworker
                                Can you provide steps to reproduce this issue?
                                I am asking because I have had this issue several times but could not find how to trigger it. It does not happen every time the PPPoE connection goes down, even if it's an ISP failure or whatever.

                                • B
                                  BNetworker @w0w
                                  last edited by BNetworker

                                  @w0w

                                  The way I can trigger it (100% of the time) here is to drop (unplug) the DSL line going into the modem/bridge, then plug it back in. It will renegotiate, and then I'm stuck with the blank gateway. As you said, if you drop Ethernet (from the modem/bridge to the Netgate box), it's been functioning correctly.

                                  • w0wW
                                    w0w @BNetworker
                                    last edited by

                                    @bnetworker
                                    I have plain PPPoE, no modem, just ethernet cable. I'll try some other methods tomorrow, I hope, and let you know.

                                    • w0wW
                                      w0w @BNetworker
                                      last edited by w0w

                                      @bnetworker
                                      No, I cannot reproduce this on 22.05.b.20220524.1701. Which build do you have now?

                                      • B
                                        BNetworker @w0w
                                        last edited by

                                        @w0w

                                        22.05.b.20220524.0600, but I've had this issue on every recent version. So, it may be a difference in config that is causing the issue. My setup is

                                        DSL -> Modem in Bridge Mode (Carrier VLAN setup here) -> pfSense (Auth here)

                                        • w0wW
                                          w0w @BNetworker
                                          last edited by

                                          @bnetworker
                                          How did you configure the default gateway? Mine is configured as group and using tiers to prioritize which one is the primary.

                                          • B
                                            BNetworker @w0w
                                            last edited by

                                            @w0w -

                                            Yes, the overall default gateway is my primary gateway group, WAN1WAN2, with WAN1 having Tier 1 priority and WAN2 Tier 2.

                                            But... in the firewall rules for INSIDE, I have explicitly set up the WAN1WAN2 gateway group as the default gateway. The Guest network explicitly has WAN2WAN1.

                                            Now that the filters are reloading at the end of the rc.newwanip, I've had zero failover issues. It's been working great.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.