Netgate Discussion Forum

    How to configure failback for WAN1 up

    Routing and Multi WAN
    • kapara

      Had script created…

      Have not tested yet though....

      https://forum.pfsense.org/index.php?topic=113643.0

      Skype ID:  Marinhd

      • jmonline

        Just to show you what I have posted on Redmine Bug #5090:

        In simple terms, take a VoIP/SIP phone service: if a connection fails over from the primary WAN1 connection to a secondary WAN2 connection, at what point should that VoIP/SIP connection be expected to fall back onto WAN1 when it becomes available again? Are you saying that with state killing on failback it would move these sessions immediately?
        Or how long would/should the state remain open on the WAN2 connection?

        We are currently having real problems with this on 2 client sites setup as follows:

        WAN1 - ADSL connection just used for VoIP traffic
        WAN2 - EFM higher bandwidth connection used for all internet access, VPN etc.

        Gateway group named "EFMFirst"
        WAN2 EFM - Tier 1
        WAN1 ADSL - Tier 2

        Gateway group named "DSLFirst"
        WAN1 ADSL - Tier 1
        WAN2 EFM - Tier 2

        Firewall Rules for Voice network:
        Traffic set to Gateway: DSLFirst

        Firewall Rules for LAN network:
        Traffic set to Gateway: EFMFirst

        The problem is that if the ADSL line drops, the VoIP traffic goes onto the EFM connection. This is fine for a short period of time, but due to the other traffic on this line the bandwidth is not enough so we can get call quality issues. This is not a problem for a short period of time (better to have some phone service than none at all).

        When the ADSL line comes back online (Status>Gateways confirms this), the VoIP traffic stays over the EFM connection. Looking at the State table you can see the TCP & UDP traffic stuck to WAN2.

        It can be left for 24hrs and still the VoIP traffic will be on the wrong WAN. It will never move the traffic back onto the ADSL connection where it should be. Therefore the call quality issues remain due to the lack of bandwidth.

        What would you suggest, is this truly not a bug?
        Is there not something that can force the states to re-associate with the firewall rule, and therefore the correct WAN gateway, after a specified period of time perhaps?

        Also if you Kill the 2 States for each VoIP phone in the Diagnostics > States section, they re-appear straight away on the same ports and interfaces as they were previously.
        This is done by filtering the states list by the IP address of the device. You can then see both UDP states (one on the internal network and one on the WAN). Then press the "Kill States" button. This removes the 2 states very briefly, but then they reappear, still on the wrong WAN interface.
        They have definitely cleared since the Byte count returns down to 0KB and starts counting again.
        Surely clearing the state should have forced it to reconnect and follow the current rule and gateway group to the correct gateway??

        • kapara

          Not sure why you are having an issue.  Possibly due to your 2 rules for the same traffic.

          I have all my phones on a VLAN and there is only one rule, the default one, which uses the gateway group.  If I go and kill all the states for the phones on the backup interface, they reconnect via the correct gateway.  Again, maybe something is amiss in your setup.

          • jmonline

            @kapara:

            Not sure why you are having an issue.  Possibly due to your 2 rules for the same traffic.

            I have all my phones on a VLAN and there is only one rule, the default one, which uses the gateway group.  If I go and kill all the states for the phones on the backup interface, they reconnect via the correct gateway.  Again, maybe something is amiss in your setup.

            Yeah I too have all the phones in a VLAN.
            The Voice Network (VLAN30) has a rule which sends traffic over the Gateway group "DSLFirst"
            The LAN Network (VLAN1) has a rule which sends traffic over the Gateway group "EFMFirst"

            That is why I have 2 rules, because I have 2 networks (each using a different WAN as their primary/default).

            I have found that killing states will work very occasionally, but most of the time the states stay open on the wrong WAN.

            • kapara

              you have one rule for each interface or 2 rules for each interface?

              • jmonline

                @kapara:

                you have one rule for each interface or 2 rules for each interface?

                1 Rule for each interface

                • kapara

                  from SSH or from gui try to run the following command:

                  pfctl -i igb0 -k 192.168.65.0/24

                  where igbX is your backup interface and the subnet is what is used by your phones

                  • jmonline

                    @kapara:

                    from SSH or from gui try to run the following command:

                    pfctl -i igb0 -k 192.168.65.0/24

                    where igbX is your backup interface and the subnet is what is used by your phones

                    My backup WAN interface is called WAN_EFM.
                    My Voice network is on 10.10.30.0/24

                    I ran pfctl -i WAN_EFM -k 10.10.30.0/24 and I got the result:
                    killed 0 states from 1 sources and 0 destinations.

                    Yet if I look at the state table, select the Interface as WAN_EFM, and Filter expression as 10.10.30 I can see a whole list of UDP states, one for each phone.
                    If I look at the WAN_DSL interface there are no states open for the phones.

                    I'll print an output of the states below with the WAN & PBX IPs masked.

                    WAN_EFM udp 135.196.xxx.xxx:42190 (10.10.30.39:14079) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.346 K / 7.021 K 4.39 MiB / 2.71 MiB
                    WAN_EFM udp 135.196.xxx.xxx:9175 (10.10.30.49:58472) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.379 K / 7.045 K 4.42 MiB / 2.71 MiB
                    WAN_EFM udp 135.196.xxx.xxx:47285 (10.10.30.42:25810) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.453 K / 7.131 K 4.48 MiB / 2.76 MiB
                    WAN_EFM udp 135.196.xxx.xxx:20572 (10.10.30.53:59061) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.453 K / 7.125 K 4.48 MiB / 2.76 MiB
                    WAN_EFM udp 135.196.xxx.xxx:4430 (10.10.30.40:12615) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.428 K / 7.106 K 4.46 MiB / 2.74 MiB
                    WAN_EFM udp 135.196.xxx.xxx:25173 (10.10.30.38:50089) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.433 K / 7.111 K 4.46 MiB / 2.75 MiB
                    WAN_EFM udp 135.196.xxx.xxx:36676 (10.10.30.5:57001) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 7.438 K / 7.093 K 4.24 MiB / 2.74 MiB
                    WAN_EFM udp 135.196.xxx.xxx:20383 (10.10.30.26:12710) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 8.817 K / 8.472 K 5.27 MiB / 3.68 MiB
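To see at a glance which interface the phone states are pinned to, a listing like the one above can be tallied per interface. This is only a minimal sketch: it assumes the states have been saved as plain text with the interface label in the first column, as in the pasted listing, and `count_states_by_iface` is a hypothetical helper name, not a pfSense tool.

```shell
# Hypothetical helper: count state lines per interface for a given address prefix.
# Assumes one state per line with the interface label in column 1,
# as in the Diagnostics > States listing pasted above.
count_states_by_iface() {
    prefix="$1"   # e.g. "10.10.30." (plain substring match, not a CIDR test)
    awk -v p="$prefix" '
        index($0, p) { n[$1]++ }                  # line mentions the prefix
        END          { for (i in n) print i, n[i] }
    ' | LC_ALL=C sort
}
```

Usage, with a hypothetical file name: `count_states_by_iface 10.10.30. < states.txt`. If every phone shows up under WAN_EFM and none under the ADSL interface, the states are still pinned to the backup WAN.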

                    • kapara

                      What if you use the actual interface instead of the label?

                      • jmonline

                        @kapara:

                        What if you use the actual interface instead of the label?

                        pfctl -i igb2 -k 10.10.30.0/24 gives me:
                        killed 0 states from 1 sources and 0 destinations

                        pfctl -i opt1 -k 10.10.30.0/24 gives me:
                        killed 0 states from 1 sources and 0 destinations

                        I don't get it because the EFM connection is on the physical interface igb2.

                        Status > Interfaces gives me this:

                        WAN_EFM Interface (opt1, igb2)
                        Status: up
                        MAC Address: 00:0d:b9:xx:xx:xx
                        IPv4 Address: 135.196.xxx.xxx
                        Subnet mask IPv4: 255.255.255.252
                        Gateway IPv4: 135.196.xxx.xxx
                        IPv6 Link Local: fe80::xxx:xxx:fe41:73f6%igb2
                        MTU: 1500
                        Media: 100baseTX <full-duplex>
                        In/out packets: 70894297/45691236 (43.12 GiB/17.40 GiB)
                        In/out packets (pass): 70894297/45691236 (43.12 GiB/17.40 GiB)
                        

                        Yet the state table is still full of states on the WAN_EFM connection and there's none on the WAN_DSL where it should be going because WAN_DSL is Tier 1 in the Gateway group.

                        • jmonline

                          I have just Reset the whole firewall state table from Diagnostics > States > Reset States

                          This has made no difference, connections are still on WAN_EFM even though WAN_ADSL is showing up and online.

                          • kapara

                            Try removing -i and the interface.  Be aware this may kill all connections for the subnet to both interfaces

                            • kapara

                              In your gateway group the 2 interfaces are on different tiers? Or same tier?

                              • kapara

                                Maybe specify the IP of the endpoint. You might have to run 2 commands, one to and one from the IPs.

                                Based on the state below, it makes sense that no states were killed:

                                WAN_EFM  udp  135.196.xxx.xxx:42190 (10.10.30.39:14079) -> 185.83.xxx.xxx:5060  MULTIPLE:MULTIPLE  7.346 K / 7.021 K  4.39 MiB / 2.71 MiB

                                pfctl -i igb0 -k 192.168.65.0/24 -k 135.196.xxx.xxx

                                pfctl -i igb0 -k <IP of VoIP system> -k 192.168.65.0/24

                                or

                                pfctl -i igb0 -k 135.196.xxx.xxx

                                pfctl -i igb0 -k 192.168.65.0/24

                                -h     Help.

                                -i interface
                                    Restrict the operation to the given interface.

                                -k host
                                    Kill all of the state entries originating from the specified
                                    host.  A second -k host option may be specified, which will kill
                                    all the state entries from the first host to the second host.
                                    For example, to kill all of the state entries originating from
                                    host:

                                # pfctl -k host

                                To kill all of the state entries from host1 to host2:

                                # pfctl -k host1 -k host2
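
Following the man page excerpt above, both directions between the phone subnet and the VoIP server can be covered with two `-k host1 -k host2` invocations. The sketch below only prints the commands so they can be reviewed before being run on the firewall. `print_kill_cmds` is a hypothetical helper name, and whether these commands actually move the states back to the preferred WAN is not confirmed anywhere in this thread.

```shell
# Hypothetical dry-run helper: print the pfctl commands that would kill
# states in both directions between a network and a remote host.
# Nothing is executed here; review the output before running it.
print_kill_cmds() {
    net="$1"      # e.g. 10.10.30.0/24 (the phone subnet)
    remote="$2"   # e.g. the address of the hosted VoIP server
    # States originating from the phones toward the server.
    printf 'pfctl -k %s -k %s\n' "$net" "$remote"
    # States recorded as originating from the server toward the phones.
    printf 'pfctl -k %s -k %s\n' "$remote" "$net"
}
```

For example, `print_kill_cmds 10.10.30.0/24 203.0.113.10` prints the two commands, where 203.0.113.10 is a documentation placeholder for the real server address. Omitting `-i` keeps the match independent of which interface the state is bound to, at the cost of also touching states on the LAN side.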

                                • jmonline

                                  Yes my WAN Gateways are on different tiers


                                  To confirm a couple of things:

                                  My WAN_EFM connection is on opt1, igb2
                                  My WAN_EFM connection IP is the one starting 135.196.xxx.xxx

                                  My WAN_ADSL connection is on wan, pppoe0
                                  My WAN_ADSL connection IP is the one starting 82.152.xxx.xxx

                                  My LAN Interface is on lan, igb1
                                  This has a network of 10.10.1.0/24
                                  It is used for general PC & Servers

                                  My 30VOICELAN is on opt3, igb1_vlan30
                                  It has a network of 10.10.30.0/24
                                  It is used for all VoIP phone devices

                                  My External VoIP Server is hosted in a datacenter and is the IP beginning 185.83.xxx.xxx

                                  Gateway group named "EFMFirst"
                                  Tier 1 - WAN_EFM
                                  Tier 2 - WAN_ADSL

                                  Gateway group named "DSLFirst"
                                  Tier 1 - WAN_ADSL
                                  Tier 2 - WAN_EFM

                                  Firewall Rules for LAN network:
                                  Traffic set to Gateway: EFMFirst

                                  Firewall Rules for 30VOICELAN network:
                                  Traffic set to Gateway: DSLFirst


                                  If the WAN_ADSL connection goes down, the state table confirms that the states for the voice traffic are now going over the WAN_EFM connection (135.196.xxx.xxx).

                                  When the WAN_ADSL connection comes back UP, none of the states ever return to the WAN_ADSL connection.

                                  If you Reset the firewall state table all the states go back to the correct paths (LAN devices over the EFM connection and VOICELAN devices over the DSL connection)!!

                                  Resetting the firewall state table is a bit overkill since it kills all the states on every device/connection.

                                  Shouldn't I be able to kill just the states of the 30VOICELAN devices which are going over the wrong connection (WAN_EFM)?

                                  Interestingly, yesterday I connected a brand new VoIP phone to the network (after the WAN_ADSL connection had been down earlier that day). It connected to my hosted VoIP server through the WAN_EFM connection, even though the WAN_ADSL connection was UP and this device had never had any states on the router. Does this mean that when WAN_ADSL came back up earlier that day (before I connected this new device), something in pfSense failed to trigger the firewall rules/gateways to follow the correct path? The gateway status always reports correctly: when a connection comes back UP, the status reports Online, and vice versa.

                                  What command should I be running in pfctl to Kill all of the states for devices on the 30VOICELAN network to trigger the devices to register on the correct connection?

                                  If I run
                                  ```
                                  pfctl -k 10.10.30.0/24
                                  ```

                                  If I run
                                  ```
                                  pfctl -i igb2 -k 10.10.30.0/24
                                  ```
                                  this tells me _0 states from 1 sources and 0 destinations_ have been killed
                                  
                                  Yet if I look at the state table I can still see:
                                  
                                  WAN_EFM udp 135.196.xxx.xxx:29023 (10.10.30.11:38251) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 4.932 K / 4.71 K 2.94 MiB / 1.82 MiB
                                  WAN_EFM udp 135.196.xxx.xxx:2239 (10.10.30.54:37815) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 5.155 K / 4.679 K 3.07 MiB / 1.80 MiB
                                  WAN_EFM udp 135.196.xxx.xxx:44077 (10.10.30.46:26578) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 5.151 K / 4.675 K 3.07 MiB / 1.80 MiB
                                  WAN_EFM udp 135.196.xxx.xxx:10148 (10.10.30.22:22774) -> 185.83.xxx.xxx:5060 MULTIPLE:MULTIPLE 5.472 K / 4.954 K 3.26 MiB / 2.18 MiB
                                  30VOICELAN udp 185.83.xxx.xxx:5060 -> 10.10.30.25:41959 MULTIPLE:MULTIPLE 309 / 321 138 KiB / 196 KiB
                                  30VOICELAN udp 185.83.xxx.xxx:5060 -> 10.10.30.11:38251 MULTIPLE:MULTIPLE 252 / 263 99 KiB / 161 KiB
                                  30VOICELAN udp 185.83.xxx.xxx:5060 <- 10.10.30.52:52783 MULTIPLE:MULTIPLE 266 / 254 163 KiB / 101 KiB
                                  30VOICELAN udp 185.83.xxx.xxx:5060 <- 10.10.30.38:39870 MULTIPLE:MULTIPLE 264 / 252 161 KiB / 99 KiB
                                  30VOICELAN udp 185.83.xxx.xxx:5060 <- 10.10.30.49:20139 MULTIPLE:MULTIPLE 264 / 252 161 KiB / 99 KiB
                                  
                                  Note the above is just a sample of the state table; there are essentially 2 states for every VoIP device (one showing on the WAN_EFM side and one showing on the 30VOICELAN side).
                                  
                                  What pfctl command should I be using to force all of these states to go back to the correct connections?
                                  
                                  The **Reset the firewall state table** command does the job but is not targeted enough.
                                  
                                  Why does a newly attached device go over the wrong WAN (following an earlier disconnection/reconnection) until the firewall state table is reset? Is this a clue as to what's going on?
                                  
                                  I hope that gives enough information…. :)
                                  
                                  Thanks
                                  James
                                  • kapara

                                    But did you try to use the public ip in the statement?

                                    • jmonline

                                      @kapara:

                                      But did you try to use the public ip in the statement?

                                      Yes

                                      pfctl -k 185.83.xxx.xxx -k 10.10.30.0/24
                                      

                                      This prints: killed 2 states from 1 sources and 1 destinations

                                      Yet the state table doesn't change and the states are still over the wrong WAN connection.

                                      • kapara

                                        :o

                                        Makes no sense.  The command's documentation seems to say that this is its purpose.  I'm curious which command is executed when you kill all the states.

                                        I am hoping to go this weekend to the site where I have 2 WAN connections with VoIP to test.  I am really intrigued and frustrated…  I will try to post this to some BSD forums.

                                        • jmonline

                                          I have posted some more information here https://forum.pfsense.org/index.php?topic=93998.msg632887#msg632887 in response to some questions on the same subject.

                                          • Derelict LAYER 8 Netgate

                                            If you use pfctl -vss you will get the age of the state. That might be good information when troubleshooting this.
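
One way to use that suggestion is to pair each state on the backup WAN with its age, which shows whether a state predates the failback or was created after it. This is a sketch under assumptions: it expects the verbose layout in which each state line is followed by an indented line beginning with `age HH:MM:SS,`, and `state_ages` is a hypothetical helper name. Check the raw `pfctl -vss` output on your version first.

```shell
# Hypothetical helper: print "AGE STATE-LINE" for states on one interface.
# Assumes `pfctl -vss` prints each state line followed by an indented
# line starting with "age HH:MM:SS,"; adjust if your output differs.
state_ages() {
    iface="$1"   # real interface name, e.g. igb2, not the GUI label
    awk -v ifc="$iface" '
        $1 == ifc            { state = $0; next }   # remember the state line
        $1 == "age" && state { sub(/,$/, "", $2)    # strip trailing comma
                               print $2, state
                               state = "" }
        $1 != "age"          { state = "" }         # other interface: forget
    '
}
```

On the firewall this would be run as `pfctl -vss | state_ages igb2`. An age longer than the time since WAN_ADSL recovered means the state survived the failback; a shorter age means it was created on the backup WAN afterwards.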

                                            Chattanooga, Tennessee, USA
                                            A comprehensive network diagram is worth 10,000 words and 15 conference calls.
                                            DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
                                            Do Not Chat For Help! NO_WAN_EGRESS(TM)

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.