Netgate Discussion Forum

    Reset States on Recovery of Tier 1 WAN in Gateway Group

    Routing and Multi WAN
    17 Posts 4 Posters 2.3k Views
    • X
      Ximulate
      last edited by

      In my slow journey in trying to get a working Failover Gateway group, I just experienced the issue mentioned here:
      https://redmine.pfsense.org/issues/8#note-50

      I have a Gateway Group set up for failover using one Tier 1 WAN and one Tier 2 WAN. When the Tier 1 WAN fails, the group switches to Tier 2 just fine. However, after the primary (Tier 1) WAN recovers, VoIP/SIP devices maintain their connections through the backup (Tier 2) WAN. The Tier 2 WAN is a metered connection, so keeping those connections there eats up a lot of unnecessary bandwidth. Manually resetting states brings the SIP connections back to the primary WAN.

      Since this issue has been brought up before, has it been resolved, and if so, am I just missing a setting somewhere? If not, is there a "hack" that can automate killing the states when the Tier 1 WAN comes back up?

      • X
        Ximulate
        last edited by

        Posting this as a possible solution:
        https://github.com/mk-fg/pfsense-scripts

        I'm not particularly proficient at bash, so I haven't studied this enough to determine if it works for my situation.

        • X
          Ximulate
          last edited by

          Based on a comment in this thread:
          https://forum.netgate.com/topic/33246/kill-state/2

          I've added the following code:

          d=192.168.XXX; for i in YY1 YY2 YY3 YY4; do pfctl -k $d.$i; done
          

          as a cron job that executes twice a day, where YY1 through YY4 are the VoIP devices on the 192.168.XXX network. Not particularly elegant, but simple.

          • dragoangelD
            dragoangel @Ximulate
            last edited by dragoangel

            @Ximulate A more elegant way to do this, I think, is a cron job that runs every minute and checks your public IP via curl. If the IP is NOT the Tier 1 address, it writes a temp file. Validate the curl output with a regex and exit the script if it is not a valid IP (for example, when the internet is down). The second part of the script checks whether that temp file exists and your IP IS the Tier 1 address; if so, it resets the states and removes the temp file. With such a script you never reset states unnecessarily, and your states return to Tier 1 much sooner, within about 2 minutes for example.
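A rough sketch of that two-phase idea, written as a plain sh function so the decision logic can be tried without touching the firewall. The function name `decide_action`, the flag file path, the lookup URL, and the sample addresses are all made up for illustration, and this has not been run on pfSense itself.

```shell
#!/bin/sh
# decide_action CURRENT_IP TIER1_IP FLAG_FILE   (hypothetical helper)
# Prints one of: skip | mark | reset | none
decide_action() {
    current_ip=$1; tier1_ip=$2; flag=$3
    # Bail out if the lookup did not return something shaped like an IPv4 address
    if ! printf '%s\n' "$current_ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
        echo skip
        return 0
    fi
    if [ "$current_ip" != "$tier1_ip" ]; then
        : > "$flag"      # phase 1: remember that traffic left via another WAN
        echo mark
    elif [ -e "$flag" ]; then
        rm -f "$flag"    # phase 2: back on Tier 1 after a recorded failover
        echo reset
    else
        echo none        # on Tier 1 and no failover recorded: do nothing
    fi
}

# Example use from cron (left commented out; URL and address are placeholders):
# ip=$(curl -s --max-time 10 https://ip-lookup.example/)
# if [ "$(decide_action "$ip" 203.0.113.10 /tmp/wan_failover.flag)" = reset ]; then
#     pfctl -F states   # flushes the whole state table; pfctl -k is narrower
# fi
```

Keeping the decision separate from the `pfctl` call makes it easy to dry-run the logic by hand before letting cron act on it.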

            Latest stable pfSense on 2x XG-7100 and 1x Intel Xeon Server, running mutiWAN, he.net IPv6, pfBlockerNG-devel, HAProxy-devel, Syslog-ng, Zabbix-agent, OpenVPN, IPsec site-to-site, DNS-over-TLS...
            Unifi AP-AC-LR with EAP RADIUS, US-24

            • dragoangelD
              dragoangel
              last edited by dragoangel

              You can also use /tmp/xxx_defaultgw instead of curl (but then compare against the WAN gateway, not the WAN IPs). Either way, the temp file is needed to record the routing change for the second stage of the script: states are reset only if the routing actually changed.
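The gateway-file variant could look something like the sketch below. It assumes the file holds a gateway address on its first line, which I have not checked against what pfSense actually writes there; `gw_matches` and the sample address are hypothetical, and the file path is passed in rather than guessed.

```shell
#!/bin/sh
# gw_matches FILE EXPECTED_GW   (hypothetical helper)
# Succeeds when FILE is readable and its first line equals EXPECTED_GW.
gw_matches() {
    gw=""
    [ -r "$1" ] || return 1
    # read fails on a last line without a trailing newline but still fills gw
    IFS= read -r gw < "$1" || [ -n "$gw" ] || return 1
    [ "$gw" = "$2" ]
}

# Example (commented out; substitute the real file for your WAN interface):
# if gw_matches /path/to/defaultgw_file 203.0.113.1; then
#     echo "default gateway is the Tier 1 gateway"
# fi
```

The result can stand in for the curl lookup: write the temp file while the check fails, and reset states once it succeeds again.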

              • R
                renat_kaa @Ximulate
                last edited by

                @Ximulate Did you specify a DNS server for each of the multi-WAN members? It is very important for correct switching. https://docs.netgate.com/pfsense/en/latest/routing/multi-wan.html#dns-considerations

                • dragoangelD
                  dragoangel @renat_kaa
                  last edited by

                  @Renat He doesn't have a problem on pfSense itself, so what would DNS potentially fix in his case? Did you see any issues with domain resolution somewhere?
                  P.S. Your recommendation is wrong in most cases: anybody can use any DNS server, whether the system is multi-WAN or not. The main goal is to use DNS servers reachable from both WANs, or specific ones for each WAN. I never like using ISP resolvers; they are mostly not as good as the public ones: 1.1.1.1 (Cloudflare), OpenDNS (Cisco), Quad9, etc.

                  • R
                    renat_kaa @dragoangel
                    last edited by

                    @dragoangel, I got it, thanks! Anyway, this worked for me and for a number of pfSense users. And I didn't say anything about ISP DNS. Public DNS is a good point.

                      • X
                        Ximulate @dragoangel
                        last edited by

                        @dragoangel Your suggestions look good, thank you. I'm barely proficient at bash scripting, so I'll make small improvements as necessary or time permits.

                        • X
                          Ximulate @renat_kaa
                          last edited by

                          @Renat Yes, I did. Thank you for the suggestion anyway.

                          • V
                            venix91 @Ximulate
                            last edited by

                            @Ximulate The link you provided ( https://github.com/mk-fg/pfsense-scripts ) worked great for my needs. I have a Netgate SG3100. My primary WAN is a cable modem connection that gets flaky in the evening, so I got a Netgear LB1121 LTE modem and a Ting GSM SIM to use for failover. It works great, except that when the gateway group fails back to the cable modem gateway, many states are left alive on the metered LTE connection. I was able to band-aid this by manually killing the states, but I wanted it to be automatic. I struggled to find a way to automatically kill the states on fallback to the primary gateway, and until I came across your post I thought it was hopeless. The gateway_change_conn_reset.sh script from that GitHub page did it for me. Now I have a script that works perfectly across reboots and everything.

                            Thanks.

                            • X
                              Ximulate @venix91
                              last edited by

                              @venix91 Great to hear! I have basically the same failover setup: Netgear with Ting. I haven't gone any further than the cron job I posted above, so I'm glad to know the script will work when I'm ready to move on, and glad to know this post helped out others.

                              • X
                                Ximulate
                                last edited by Ximulate

                                Here's my latest script. It runs as a cron job every hour.
                                // Checks the WAN IP as reported by an external service (OpenDNS)
                                // Grabs the IPs of the primary and backup WAN interfaces
                                // Compares the reported IP to the primary WAN IP and, if they match, kills the states on the backup IP

                                Code is executing, but need time to see if it actually behaves as expected (kills the backup wan states).

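                                # Public IP as seen by an external resolver (OpenDNS)
                                reported_ip="$(drill myip.opendns.com @resolver1.opendns.com | grep 'myip.opendns.com.' | grep -w -E -o '([0-9]{1,3}[.]){3}[0-9]{1,3}')"
                                # Addresses assigned to the primary (igb0) and backup (igb2) WAN interfaces
                                primary_ip="$(ifconfig igb0 | grep -w -E -o 'inet ([0-9]{1,3}[.]){3}[0-9]{1,3}' | grep -w -E -o '([0-9]{1,3}[.]){3}[0-9]{1,3}')"
                                backup_ip="$(ifconfig igb2 | grep -w -E -o 'inet ([0-9]{1,3}[.]){3}[0-9]{1,3}' | grep -w -E -o '([0-9]{1,3}[.]){3}[0-9]{1,3}')"
                                # Act only when both lookups succeeded and traffic is leaving via the primary WAN;
                                # without the -n checks, two failed lookups would compare equal as empty strings
                                if [ -n "$reported_ip" ] && [ -n "$backup_ip" ] && [ "$reported_ip" = "$primary_ip" ]; then
                                    # Kill all states originating from the backup WAN address
                                    pfctl -k "$backup_ip"
                                fi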

                                Edit: replaced the dig command with drill to correct an issue that occurs when the script is scheduled in crontab
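One caveat with the grep pattern used in the script: `([0-9]{1,3}[.]){3}[0-9]{1,3}` also matches strings such as 999.999.999.999. A stricter check could be sketched as below; `is_ipv4` is a made-up helper name, not something shipped with pfSense.

```shell
#!/bin/sh
# is_ipv4 ADDR   (hypothetical helper)
# Succeeds only for dotted-quad addresses whose four octets are each 0-255.
is_ipv4() {
    case $1 in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, stray chars, or empty octet
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1        # split the address on dots into $1..$n
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Example (commented out): skip the run when the lookup returned junk
# is_ipv4 "$reported_ip" || exit 0
```

Calling it on the looked-up address before the comparison keeps a failed or garbled DNS answer from triggering `pfctl`.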

                                  • X
                                    Ximulate
                                    last edited by

                                    The problem with these scripts is that they will kill an active phone conversation. I'm not sure how to resolve that.

                                    • X
                                      Ximulate
                                      last edited by

                                      Before trying the scripts, you may want to check whether "Firewall Optimization" is set to Normal or Aggressive. The VoIP configuration docs suggest Conservative, which could be aggravating this particular problem. I've bumped mine to Aggressive, but I have no idea whether this will cause other issues.

                                      https://docs.netgate.com/pfsense/en/latest/book/config/advanced-firewall-nat.html#config-advanced-firewall-optimization
                                      https://docs.netgate.com/pfsense/en/latest/book/config/advanced-firewall-nat.html#state-timeouts

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.