Interface Groups vs LAGG: Multi-Wan DNS Streaming Service Problems


  • I've just added a failover WAN to my system:

    • I've created a gateway group with my primary WAN as "Tier 1" and the failover WAN as "Tier 2".
    • Modified my firewall rules to use this gateway group where I want to use it.
    • The failover WAN is not used for anything else.
      State table is empty other than the ping test
    • Everything seems to work as expected except streaming services (Hulu, Netflix)...sometimes. When they act up, the services report network problems and will not connect.
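
As a rough illustration of the tiered setup described above (a minimal sketch, not pfSense's actual implementation; the `Gateway` class, the `active_gateways` function and the gateway names are all hypothetical), a failover group sends traffic to the lowest-numbered tier that still has an online gateway:

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    name: str
    tier: int     # lower number = higher priority
    online: bool  # result of the gateway monitor (for example a ping test)


def active_gateways(group):
    """Return the online gateways in the lowest-numbered tier that has any."""
    online = [gw for gw in group if gw.online]
    if not online:
        return []
    best_tier = min(gw.tier for gw in online)
    return [gw for gw in online if gw.tier == best_tier]


group = [Gateway("WAN_PRIMARY", 1, True), Gateway("WAN_FAILOVER", 2, True)]
print([gw.name for gw in active_gateways(group)])  # ['WAN_PRIMARY']

group[0].online = False  # primary monitor reports the gateway as down
print([gw.name for gw in active_gateways(group)])  # ['WAN_FAILOVER']
```

Under this model the Tier 2 WAN carries traffic only while Tier 1 is marked down, which is why DNS queries showing up on the failover WAN during normal operation look wrong.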

    Based on what I have read about multi-WAN, I should have a unique DNS server for each WAN gateway under "System/General Setup/DNS Server Settings"
    ...but this is where the problem seems to originate.

    My understanding is that, having added the DNS server for the failover the way I did, the firewall will use it just like any other DNS server.
    My suspicion is that if the streaming service resolves an address via the failover WAN DNS entry and then tries to use that address on the primary WAN, the address may not be reachable via the primary route, because streaming services have local routes/servers within each ISP (we're talking Google and AT&T as the ISPs).

    If I remove the failover WAN entry from the DNS list, the problem goes away...but then I'd have no DNS during failover...right?

    So my questions are

    • How do I configure the system such that the DNS entries are "isolated"
      ...only use DNS via the primary when it is active and from the failover when it is active?
    • What about the remote gateway default DNS server?
      ...Is this a potential problem? I'd think so?
      ...In my DNS server list, in addition to the DNS servers I have explicitly added in "System/General Setup/DNS Server Settings", I have: 192.168.10.1 (primary) and 192.168.1.254 (secondary)

    TIA for any help


  • As I was getting no responses here, I posed the question on the FreeBSD forum...albeit a bit more generically, as a question about unbound behavior.
    Their reply: "use lagg"
    I can see how using lagg would address my DNS problem (as there is only one interface to deal with), so now my question becomes:

    • What are the tradeoffs between "interface groups" and "lagg" for a failover dual WAN?

    I did a quick search and really don't see this addressed anywhere, so would appreciate someone educating me.
    Thanks


  • I'll continue to talk to myself I guess.

    • LAGG would work for the DNS problem (as there is effectively only "one" interface), but would break the failover, as it appears to mark an interface down only when the link is "gone" (no ping monitoring capability)
      ...I mean really...whose "link" goes down these days unless somebody pulls a plug?
    • Tiered Gateway Groups (as reported above) do not properly (IMO) handle the DNS...so that is broken also. Another reason to keep traffic off the failover gateway is when the failover is metered (you don't want to pay for useless DNS queries, do you?)
      ...Found on Reddit...a guy trying to figure out this same problem for a different reason.

    Anyhow...I'm pulling the source down to see if I can figure out a solution.
    If anyone else agrees this is a problem, I'll file a bug report.

  • Rebel Alliance Developer Netgate

    If you have links from two different providers (or even the same provider but not the same L2) then LAGG wouldn't even be possible. LAGG requires cooperation at L2 to work properly and acts as a single interface.

    Interface groups do not help you for WANs or failover.

    "How do I configure the system such that the DNS entries are "isolated"
    ...only use DNS via the primary when it is active and from the failover when it is active?

    Don't set gateways for any DNS server. Leave the DNS resolver in resolver mode (not forwarding mode). Set the default gateway to be your gateway group. Then DNS queries will always use whichever WAN is currently the default gateway.

    Alternately, don't use the firewall for DNS. Put DNS servers on clients directly or have another local DNS system like a Pi-hole. The traffic from local clients will hit your failover rules and be respected as anything else would be.

    "What about the remote gateway default DNS server?"
    ...Is this a potential problem? I'd think so?

    Probably not. In most cases you'll get more or less the same DNS responses from different links, so there wouldn't be any kind of confusion like you describe. Not unless one of your DNS servers is doing something shady, or your two links somehow end up being treated as being in two different countries, proxied, etc., or something else that streaming providers think is fishy.

    ...In my DNS server list in addition to those DNS servers I have explicitly added in "System/General Setup/DNS Server Settings", I have: 192.168.10.1 (primary) and 192.168.1.254 (secondary)

    If those are coming from your WAN CPE/Modem devices then (a) you need to disable NAT/routing features in those and use them as bridges, double NAT from your modems is probably not helping matters, and (b) you can tell the firewall to ignore upstream DNS from dynamic WANs. On System > General, uncheck DNS Server Override.


  • @jimp said in Interface Groups vs LAGG: Multi-Wan DNS Streaming Service Problems:

    Don't set gateways for any DNS server. Leave the DNS resolver in resolver mode (not forwarding mode). Set the default gateway to be your gateway group. Then DNS queries will always use whichever WAN is currently the default gateway.

    I completely agree...and think that is what I am doing...
    [three screenshots]

    However, I am still getting DNS traffic on the failover interface...as shown by packet capture and DNSleaktest
    [screenshot]
    192.168.1.65 is the failover gateway.
    [screenshot]
    And looking at /var/unbound/unbound.conf, I see:

    outgoing-interface: 192.168.10.100
    outgoing-interface: 2605:a601:a628:f000:20c:29ff:fe2c:a1a4
    outgoing-interface: 192.168.1.65
    outgoing-interface: 2600:1700:1850:e00:20c:29ff:fe2c:a1b8

    Thanks so much for the time to reply, but I'm not sure what I'm missing...is there something else that needs to change?


  • @jimp said in Interface Groups vs LAGG: Multi-Wan DNS Streaming Service Problems:

    If those are coming from your WAN CPE/Modem devices then (a) you need to disable NAT/routing features in those and use them as bridges, double NAT from your modems is probably not helping matters, and (b) you can tell the firewall to ignore upstream DNS from dynamic WANs. On System > General, uncheck DNS Server Override.

    These gateways cannot be put into bridge mode (unfortunately), but I'll try changing the override setting to get pfSense to stop using them.


  • @jmbraben said in Interface Groups vs LAGG: Multi-Wan DNS Streaming Service Problems:

    These gateways cannot be put into bridge mode (unfortunately), but I'll try changing the override setting to get pfSense to stop using them.

    I unchecked the DNS Server Override...now...I'm really confused. My System Information is showing:
    [screenshot]
    However, /var/unbound/unbound.conf still shows (this is based on the "Outgoing Network Interfaces" dialog...right?):
    outgoing-interface: 192.168.10.100
    outgoing-interface: 2605:a601:a628:f000:20c:29ff:fe2c:a1a4
    outgoing-interface: 192.168.1.65
    outgoing-interface: 2600:1700:1850:e00:20c:29ff:fe2c:a1b8
    As there is no "forward-zone" in unbound.conf, none of my "General Setup" "DNS Servers" are even being used...correct?

    In any case, the Failover WAN is still being used for DNS queries based on being an "outgoing-interface"...right? But if I remove the failover network as an "outgoing-interface", I will have no DNS when the primary fails...so I think I'm back to where I started?

    outgoing-interface: <ip address or ip6 netblock>
                  Interface to use to connect to the network. This interface is
                  used to send queries to authoritative servers and receive their
                  replies.
    
  • Rebel Alliance Developer Netgate

    Don't select any outgoing interfaces, so the OS can decide on its own which egress path to use.

    By selecting both you're forcing it to use both all the time. That's not what you want.
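
A quick way to see which source addresses the resolver is pinned to is to scan the generated config for `outgoing-interface:` lines. The following is a minimal sketch, assuming the file uses the plain `key: value` form quoted earlier in this thread; the helper name `outgoing_interfaces` is made up for illustration, and the sample addresses are the ones posted above.

```python
def outgoing_interfaces(conf_text):
    """Return the addresses named on outgoing-interface: lines, in file order."""
    addrs = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.startswith("outgoing-interface:"):
            # Split only on the first colon so IPv6 addresses stay intact.
            addrs.append(line.split(":", 1)[1].strip())
    return addrs


sample = """\
server:
outgoing-interface: 192.168.10.100
outgoing-interface: 2605:a601:a628:f000:20c:29ff:fe2c:a1a4
outgoing-interface: 192.168.1.65
outgoing-interface: 2600:1700:1850:e00:20c:29ff:fe2c:a1b8
"""

for addr in outgoing_interfaces(sample):
    print(addr)
```

On the firewall itself, the same function could be fed the contents of /var/unbound/unbound.conf. An empty result means no source addresses are pinned, so the OS routing table (and therefore the current default gateway) decides the egress path.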


  • @jimp said in Interface Groups vs LAGG: Multi-Wan DNS Streaming Service Problems:

    Don't select any outgoing interfaces, so the OS can decide on its own which egress path to use.

    OK, I am IMMENSELY grateful for the help...because I would have never thought "all" would be the correct choice. Based on the documentation:

    Outgoing Network Interfaces: Specific interface(s) to use for sourcing outbound queries. By default any interface may be used. Can be useful for selecting a specific WAN or local interface for VPN queries.
    
    outgoing-interface: <ip address or ip6 netblock>
     ****If none are given the default (all) is used.****
    

    it would seem "all" would use every interface (including a VPN client which obviously I would NOT want to use generally).

    [screenshot]

    Anyhow, with "all" selected there are NO "outgoing-interface" records in /var/unbound/unbound.conf
    dnsleaktest looks good (only primary wan dns being used)
    And there are NO DNS queries on the failover WAN.
    😂

    I would politely suggest a documentation change may be helpful.