Gateway tier priority backwards?

  • Hi folks - maybe I'm reading the help pages wrong.  What I understand is that in a gateway group, each interface has a priority value (tier).  That is, interfaces defined as tier 1 within the gateway group should take priority over interfaces in the same gateway group that are defined as tier 2.

    Not sure why but my tier 1 interfaces seem to be taking lower priority than the tier 2.  Meaning, the amount of traffic coming into and out of my tier 2 defined interfaces is much greater than the tier 1 interfaces.  I have checked logs and such and my tier 1 interfaces are not going down.



  • ** bump  **

    Do I understand the documentation correctly? Is this a bug perhaps?  My tier 2 interfaces take priority over the tier 1 interfaces.  A side note: my interfaces are OpenVPN clients.

  • Your VPN provider probably pushes a default route that overrides the policy routing.

    Check "Don't pull routes" and try again.

  • I have checked both of these on all VPN sessions (for both tier 1 and tier 2 providers):

    • don't pull routes
    • don't add/remove routes

  • Hi folks - here is an example of what I'm talking about: 1 day of traffic. Notice that provider B appears to be handling nearly all the traffic.

    VPN Provider A Connection # 1 UDP4 up Tue Jan 2 0:16:44 2018 14.62 MiB / 629 KiB

    VPN Provider A Connection # 2 UDP4 up Tue Jan 2 0:16:39 2018 16.01 MiB / 4.67 MiB

    VPN Provider A Connection # 3 UDP4 up Tue Jan 2 0:16:39 2018 2.22 GiB / 95.97 MiB

    VPN Provider A Connection # 4 UDP4 up Tue Jan 2 0:16:40 2018 16.01 MiB / 4.63 MiB

    VPN Provider A Connection # 5 UDP4 up Tue Jan 2 0:16:40 2018 16.02 MiB / 4.62 MiB

    VPN Provider B Connection # 1 UDP4 up Tue Jan 2 14:53:30 2018 601.69 MiB / 1.14 GiB

    VPN Provider B Connection # 2 UDP4 up Tue Jan 2 14:53:29 2018 4.01 GiB / 3.48 GiB

    VPN Provider B Connection # 3 UDP4 up Tue Jan 2 14:53:25 2018 99.69 MiB / 1.44 GiB

    VPN Provider B Connection # 4 UDP4 up Tue Jan 2 14:53:33 2018 181.82 MiB / 1.54 GiB

    VPN Provider B Connection # 5 UDP4 up Tue Jan 2 14:53:32 2018 1.54 GiB / 2.48 GiB

  • LAYER 8 Netgate

    What is defined on your gateway groups? WAN interfaces or VPN provider gateways?

    Post your policy routing rules and gateway group configurations.

  • Each OpenVPN client has its own interface.
    Each of these interfaces, including WAN, is in the single gateway group.

    Gateway group configuration:
    WAN = Never
    All VPN interfaces = Tier 1
    Trigger = packet loss or high latency

    LAN rule routes traffic out of gateway group.

  • LAYER 8 Netgate

    Load balancing has no way to know how much traffic a particular state will end up transferring when it is created. It balances states among connections, not traffic. The fact that you show approximately the same number of states on each connection means it is working.

    ETA: Moving to Multi-WAN

  • I'm not following your statement.  If you look at the 2 providers, VPN A shows megabytes and VPN B shows gigabytes.  If I understand correctly, the LB method is round robin, so over time these numbers should be roughly the same.

  • LAYER 8 Netgate

    It doesn't matter. They both have 5 states. Again, when you make a connection it has no idea how much traffic is going to go over it.

    If VPN A has 5 states established and VPN B has 4, the new connection goes out VPN B. How much traffic has gone over the states in the past is not evaluated. In fact, the one with the most traffic on it might be idle at the time the new state is created and never transfer another byte.
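To illustrate the point above, here is a minimal sketch, not pf's actual code. It assumes a simplified model in which each new connection goes to the gateway holding the fewest states, states never expire, and per-connection traffic is drawn from a heavy-tailed distribution that the balancer cannot see in advance. The function name `simulate` and all parameters are hypothetical.

```python
import random

def simulate(num_gateways=2, num_connections=1000, seed=1):
    """Assign each new connection to the gateway holding the fewest states."""
    rng = random.Random(seed)
    states = [0] * num_gateways      # states created per gateway
    traffic = [0.0] * num_gateways   # arbitrary traffic units per gateway
    for _ in range(num_connections):
        # Pick the gateway with the fewest states; ties go to the lowest index.
        gw = min(range(num_gateways), key=lambda g: states[g])
        states[gw] += 1
        # The size is only known after the decision. A heavy-tailed draw means
        # most connections are tiny and a few are huge.
        traffic[gw] += rng.paretovariate(1.1)
    return states, traffic

states, traffic = simulate()
print("states per gateway: ", states)
print("traffic per gateway:", [round(t) for t in traffic])
```

In this model the state counts come out even by construction, while the traffic totals can differ widely because a handful of large connections dominate whichever gateway they happened to land on.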

  • Are there any plans to change the LB algorithm, for example to least connections?

  • LAYER 8 Netgate

    No. I made a reference already that shows why that would be folly. A state that has transferred 100GB might never transfer another byte. There is no way for the firewall to know or predict with any accuracy what is going to happen based on what has happened. And it is impossible to move states between interfaces to correct later.

    You can skew the algorithm regarding the number of states put on each circuit with the gateway weights but that is about it.

    For example two gateways in a group, both tier 1. The first gateway has a weight of 4, the second a weight of 1. 4 out of 5 states - or 80% - will be created on the first gateway, 20% on the second.
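The weight example above can be sketched as follows. This is an illustration of the 4:1 ratio only, assuming a simple scheme in which each gateway appears in a repeating schedule as many times as its weight. The names `weighted_schedule`, `GW1` and `GW2` are hypothetical, and the order in which the firewall actually interleaves the gateways may differ.

```python
from collections import Counter
from itertools import cycle, islice

def weighted_schedule(weights):
    """Yield gateway names forever, each appearing `weight` times per cycle."""
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(slots)

# Two tier 1 gateways with weights 4 and 1, as in the example above.
schedule = weighted_schedule({"GW1": 4, "GW2": 1})
counts = Counter(islice(schedule, 100))  # place 100 new states
print(counts["GW1"], counts["GW2"])
```

Out of every 5 new states, 4 land on the first gateway and 1 on the second. As before, this controls where states are created, not how much traffic each state later carries.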

    Another reference:

  • I did an experiment.  I turned off all but 1 of the VPN interfaces with VPN B.  That left me with

    5 VPN A clients running
    1 VPN B client running

    The gateway group detected that the 4 VPN B clients were down, and now all traffic is being routed through the sole VPN B client.

    The only difference between the two VPN client configurations is that VPN B uses a TLS key whereas VPN A does not.

    This looks like a defect. I have 30ish clients running various activities behind this firewall.  I should see a significant increase in the VPN A clients activity but I am not.

  • LAYER 8 Netgate

    Post your rules.

    There is not a problem with Load Balancing. It does what it does very well. See the other thread I posted. Every time I test it because someone claims it doesn't work right it works fine. Not going to do it again.

    5 VPN A clients running
    1 VPN B client running

    I don't understand at all what you are doing there. Going to need a much better description.

    If you have 5 VPN clients running to one provider they all need assigned interfaces and they all need gateways in the gateway group if you want them to be utilized in that manner.

  • I've pseudo-posted my rules above.  What do you want to see specifically? Glad to grab screenshots of the areas you want to look at.

  • LAYER 8 Netgate

    Start with a screen shot of the gateway group. If you don't have 6 gateways there you're not going to be utilizing 6 OpenVPN client instances.

  • ** edit - didn't know you had requested screen shot.  screen grab for gateway group :

    I ran another test 2 days ago.

    • All OpenVPN clients have assigned interfaces.  The only change from the default interface settings is that blocking bogon networks is checked.

    • VPN A does not have a specific TLS key.

    • VPN A has 5 openvpn client sessions / interfaces

    • VPN B does have a specific TLS key

    • VPN B has 1 openvpn client session / interface

    • All 6 are defined in the gateway group.

    • Within the gateway group WAN is set to never

    • All 6 interfaces are set to tier 1

    VPN A#1 1.06 MiB / 50 KiB
    VPN A#2 1.27 MiB / 710 KiB
    VPN A#3 1.41 MiB / 888 KiB
    VPN A#4 1.65 MiB / 1.64 MiB
    VPN A#5 1.27 MiB / 709 KiB

    VPN B#1 2.52 GiB / 4.18 GiB

    • I have 30 clients behind this firewall and the above information is for 2 days of collection
    • VPN A interfaces only begin taking traffic when I specifically stop the openvpn client session of VPN B

    Is there something unique about load balancing and a TLS key being used with the openvpn client, gateway group or some other dependency?

  • As Derelic already pointed out: The Loadbalancer balances connections, not traffic.

    How do you know that your clients are actually creating new connections all the time?
    Those 2.52/4.18 GiB you see on VPN B#1 could be from a single connection.

  • @GruensFroeschli:

    As Derelic already pointed out: The Loadbalancer balances connections, not traffic.

    How do you know that your clients are actually creating new connections all the time?
    Those 2.52/4.18 GiB you see on VPN B#1 could be from a single connection.

    I understood this even before it was mentioned.  However, it is evidence that either it's not balancing properly or there is something I don't understand.  Do you think 30 clients over 2 days are going to transfer only kilobytes of traffic? That doesn't pass my smoke test.

    ***edit: I just dumped the active states on the firewall.  The VPN A interfaces 1 - 5 are not in the table, with the exception of these entries:

    VPN A 3 icmp xx.xx.xx.xx:7611 -> xx.xx.xx.xx:7611 0:00 20.866 K / 0 571 KiB / 0 B
    VPN A 5 icmp xx.xx.xx.xx:8466 -> xx.xx.xx.xx:8466 0:00 20.867 K / 0 571 KiB / 0 B
    VPN A 4 tcp xx.xx.xx.xx:63300 (xx.xx.xx.xx:63032) -> xx.xx.xx.xx:443 ESTABLISHED:ESTABLISHED 5.449 K / 5.46 K 231 KiB / 762 KiB
    VPN A 1    icmp xx.xx.xx.xx:7229 -> xx.xx.xx.xx:7229 0:00 20.866 K / 0 571 KiB / 0 B
    VPN A 2 icmp xx.xx.xx.xx:7271 -> xx.xx.xx.xx:7271 0:00 20.865 K / 0 571 KiB / 0 B
    VPN A 4 icmp xx.xx.xx.xx:8068 -> xx.xx.xx.xx:8068 0:00 20.866 K / 0 571 KiB / 0 B
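Since the balancer works on states rather than bytes, the useful number to pull out of a dump like the one above is the state count per interface. Here is a minimal sketch for tallying it. It assumes the pasted layout shown above (interface name, then protocol, then addresses), which may not match other output formats, and the helper name `states_per_interface` is hypothetical.

```python
import re
from collections import Counter

# Interface name is everything before the first protocol token.
STATE_LINE = re.compile(r"^(?P<iface>.+?)\s+(?P<proto>icmp|tcp|udp)\s")

def states_per_interface(lines):
    """Count state-table entries per interface name."""
    counts = Counter()
    for line in lines:
        match = STATE_LINE.match(line.strip())
        if match:
            counts[match.group("iface")] += 1
    return counts

# Entries copied from the post above, truncated after the addresses.
sample = [
    "VPN A 3 icmp xx.xx.xx.xx:7611 -> xx.xx.xx.xx:7611",
    "VPN A 5 icmp xx.xx.xx.xx:8466 -> xx.xx.xx.xx:8466",
    "VPN A 4 tcp xx.xx.xx.xx:63300 (xx.xx.xx.xx:63032) -> xx.xx.xx.xx:443",
    "VPN A 1    icmp xx.xx.xx.xx:7229 -> xx.xx.xx.xx:7229",
    "VPN A 2 icmp xx.xx.xx.xx:7271 -> xx.xx.xx.xx:7271",
    "VPN A 4 icmp xx.xx.xx.xx:8068 -> xx.xx.xx.xx:8068",
]

counts = states_per_interface(sample)
for iface, n in sorted(counts.items()):
    print(iface, n)
```

Running the same tally over the full state table, including the VPN B interface, would show directly how the states are split between the gateways.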

  • LAYER 8 Netgate

    That is just one connection. Not a connection from 30 clients.

    That is the amount of traffic that has been transmitted over THAT connection since its creation.

    Every TCP connection gets its own state.

    Load balancing works fine, though it often doesn't match users' misunderstandings about how it should be behaving. See the other thread.

  • My state table had roughly 500+ states.  What you see above is everything for the VPN A interfaces out of those 500+ states.  What specifically should I take from the link you suggested?

    I'm going to build a load generator, slam my firewall with thousands of states to put a serious load on it, and come back to this thread with the results.  Maybe it's just a matter of the load on my firewall being small.

  • LAYER 8 Netgate

    That's exactly what those graphs represent. Trex generating approximately 350K states through 4- and 8-interface load balance configurations.

    Works fine.
