[solved] BUG: pfsense 2.4.4 update_breaks http/https from LAN - workaround



  • After upgrading from 2.4.3 I found my web access no longer worked from either of my LAN segments, though nslookup, traceroute, smtp/pop, etc. still worked.

    I have two VPN connections with a gateway group set up:
    System/Routing/Gateways

    • Default IPv4 = VPNGW

    System/Routing/Gateway Groups
    VPNGW = VPN1 Tier1, VPN2 Tier2

    Firewall Rules
    WAN
    WAN - Block All IPv4/6

    LAN
    LAN - Block All IPv4/6
    CHINA - Allow IP Table GW = WAN
    NET - Allow DNSSEC/TELNET GW = VPNGW
    POP - Allow SMTPS/POP3S GW = VPNGW
    HTTP - Allow HTTP/HTTPS GW = VPNGW **

    The above config worked in 2.4.3 but stopped working after upgrading to 2.4.4.
    I found that setting the GW for HTTP to * resolved the issue, so the new config is

    LAN
    LAN - Block All IPv4/6
    CHINA - Allow IP Table GW = WAN
    NET - Allow DNSSEC/TELNET GW = VPNGW
    POP - Allow SMTPS/POP3S GW = VPNGW
    HTTP - Allow HTTP/HTTPS GW = *

    traceroute port 80 shows it going via VPN, but i imagine we should be able to force the gateway,
    even if the default gateway is the same.


  • Netgate Administrator

    Is this in 2.4.4p2? There were some issues with the new default gateway group setting in 2.4.4 but they were mostly resolved in p1.

    Is it actually routing http/s traffic over the VPN in that setup? There should be no difference in having VPNGW set as default gateway or policy routing to it.

    Steve



  • it is p2 and should be no difference, but for some reason there is.
    according to traceroute port 80 it goes via the vpn as intended


  • Netgate Administrator

    The clients are using assigned DNS servers directly?

    If they are using unbound in pfSense I could imagine some change in the default gateway handling making a difference there perhaps.

    Steve



  • clients point to pfsense LAN address (2x) for DNS on standard port 53.
    53 is blocked in/out of wan
    853 is allowed and all DNS servers for pfsense are DNSSEC


  • Netgate Administrator

    Ah so your DNS servers are local effectively but you were sending http/s traffic over VPN.

    I could see some things having a problem with that but not all.

    In 2.4.3 you could not set a gateway group as the default gateway so what was it set to there?

    I would set it back and run a packet capture on the VPN interface. What is actually leaving across it and what comes back.

    Steve



  • In the previous setup where it was working prior to upgrading to 2.4.4;

    • default gateway was WAN interface
    • all incoming/outgoing blocked
    • allow rules had the GW set for vpn pool (2x ExpressVPN)
    • all 80/443/POP/etc goes via vpn
    • voip goes over WAN because i had problems with calls dropping when going via the VPN

    Under 2.4.4 if the above settings are used, the following problems are exhibited

    • access to 80/443 pages timeout and get stuck
    • POP/SMTP work
    • SIP registration works (with rules pointing to WAN)

    to fix the 80/443 access, i have to change the settings to;

    • default gateway is VPN pool (2x ExpressVPN)
    • all incoming/outgoing blocked
    • allow rules point to * and not specific gateway
    • voip sip fails to register when pointing to WAN gateway
    • voip sip registers when pointing to * - but then i am back to calls dropping over the VPN

    In summary,

    • prior to 2.4.4 the possible use cases for the default vs specified gateways all worked
    • under 2.4.4 the possible use cases for the default vs specific gateways do not work

  • Netgate Administrator

    The big difference that would apply here between 2.4.3 and 2.4.4 was the additional default gateway handling code.

    If it wasn't specifically defined in 2.4.3 it could be set to the wrong gateway or no gateway in 2.4.4. That was fixed in 2.4.4p1.

    Changing the default gateway to the VPN group will mean Unbound uses it for DNS. That should remove any DNS issues with traffic using different routes.

    You will have to see what's actually failing when you send that traffic in a packet capture.

    Steve



  • I have done some testing, and it seems, whenever i set a gateway other than the default gateway,
    i get bad udp checksum on the pfsense to client path.

    When i set the default gateway to the WAN, the VOIP box works ok because it uses the default gateway,
    however, browsing 80/443 uses the VPN Pool gateway and i get the below.

    192.168.20.x.53 > 192.168.20.x.35918: [bad udp cksum 0xa9ef -> 0x21fd!] 22042 q: AAAA? star.c10r.facebook.com. 1/0/1

    When I set the default gateway to the VPN Pool, browsing works because it is using the VPN as default gateway. The voip box returns the same bad udp checksum errors as above, as it is using the WAN as a specified gateway.
    In both cases, traffic from the client to the pfsense box has udp checksum ok. The persistent udp checksum errors occur only in the pfsense-to-client direction, and only DNS has this problem.
    Recap: the client is pointed to pfsense for DNS (53), and pfsense (853) points to DNSSEC servers.


  • Netgate Administrator

    That checksum error could just be hardware offloading on the NIC. You can try disabling all the offloading in System > Advanced > Networking. You usually have to reboot for that to apply or at least re-save the interfaces.

    Steve



  • That doesn't really make sense Steve. If it was the case, the problem would not be restricted to a specific port and/or direction. This problem is very clearly only on port 53 from the pfsense to the client direction. This to me seems more like an issue with DNS Resolver and/or queries coming in on 53 and forwarding on 853


  • Netgate Administrator

    If you have 'Hardware Checksum Offloading' enabled that's exactly how it can appear in a packet capture for transmitted packets.

    https://www.wireshark.org/docs/wsug_html_chunked/ChAdvChecksums.html

    I would not normally expect that to be for just the DNS traffic but you should disable it anyway to be sure it's not just a quirk of the capture.

    Steve
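
A minimal sketch of the effect Steve describes: with checksum offloading the NIC fills in the final checksum after the capture point, so a capture on the sending host can show a value that does not verify. The Python below computes the UDP checksum over the IPv4 pseudo-header (RFC 768, RFC 1071). The addresses, ports and payload are made-up illustration values, not data from the capture above, and the "unfinished" packet is only a stand-in for whatever the kernel leaves in the field.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header that UDP includes in its checksum (RFC 768)."""
    return struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,
        socket.IPPROTO_UDP,
        udp_length,
    )


def build_segment(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """UDP header (with the given checksum field) followed by the payload."""
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a segment whose checksum field is currently zero."""
    value = internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return value or 0xFFFF  # a computed 0 is transmitted as 0xFFFF


def checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver-side check: the sum including the checksum field must be zero."""
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment) == 0


# Hypothetical DNS reply from the firewall's LAN address to a client.
SRC, DST = "192.168.20.5", "192.168.20.10"
PAYLOAD = b"dns-reply-bytes"

correct = udp_checksum(SRC, DST, build_segment(53, 35918, PAYLOAD))
on_the_wire = build_segment(53, 35918, PAYLOAD, correct)
# Stand-in for what a local capture sees before the NIC finishes the field:
# any value other than the correct one fails verification.
unfinished = build_segment(53, 35918, PAYLOAD, correct ^ 0x0001)
```

Here `checksum_ok(SRC, DST, on_the_wire)` holds while `checksum_ok(SRC, DST, unfinished)` does not, even though the client would only ever receive the first form. A capture on the receiving client is the way to tell a real corruption from this capture artifact.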



  • @stephenw10 Thanks Steve, I stand corrected. That actually appears to have solved the problem.


  • Netgate Administrator

    The entire problem or just removed the errors from the packet capture?

    Usually those checksum errors are just false positives, annoying but ultimately harmless.

    Steve



  • disabling the checksum offloading allowed browsing to return to normal.
    the only problem i am having now is linux and debian services upgrading via the vpn. Seems a name resolution problem, but most likely unrelated (although it did work in the previous version as well)


  • Netgate Administrator

    Mmm, interesting. I wonder if a driver update enabled that on your NICs on FreeBSD 11.2. Or indeed broke it.

    Steve



  • fyi my system has 4*Intel WG82583 10/100M/1000M Lan


  • Netgate Administrator

    Hmm, curious. There almost certainly were updates to igb in FreeBSD 11.2. I run that in almost everything though.

    Are those part of an SoC? It looks like they are not. There's a lot of fake Intel chips about unfortunately. Speculation at this point. I'd just leave it disabled and move on.

    Steve



  • My box is a J1900 N10 with two LAN segments and one WAN.
    Unfortunately, i still think there is an issue with the Gateway Pool Routing. Although disabling the h/w checksum offloading enables browsing, i am having problems with POP/SMTP and with linuxmint being able to find package update servers. They all come up with 0kb/s speed. If i switch the gateway from the VPN to default it works. So back to the original problem, none of which existed in 2.4.3.

    What i can say, i am using 2x ExpressVPN access points as a failover, and if i set the specific gateways to the individual access interface, it works ok. it is only a problem when selecting a pool, per my original problem description. so for me 2.4.4 definitely breaks a setup that was previously working under 2.4.2 and 2.4.3



  • Steve, I still have issues with linuxmint and debian clients accessing the package updates via the vpn and google mail struggles. these all previously worked. One thing i notice under the Diag - Interfaces.
    DNS servers all show below the WAN interface.
    127.0.0.1
    1.1.1.1
    9.9.9.9
    1.0.0.1
    9.9.9.10
    81.3.27.54
    46.182.19.48
    In actual fact, they are assigned via the general settings per
    1.1.1.1, 9.9.9.9 = vpn1
    1.0.0.1, 9.9.9.10 = vpn2
    81.3.27.54, 46.182.19.48 = wan


  • Netgate Administrator

    I'm pretty sure that's just a display artifact. Check the routing table. DNS servers set to a specific WAN should show a static route via that gateway.

    Steve



  • Steve, i got to the bottom of the problem i am having. Sorted and confirmed DNS redirecting to the box, and then out over 853. Confirmed with android devices that ignore dhcp and try and connect to google.

    I have an IPv4/6 blocking rule for everything on all interfaces and only allow selected ports within the LAN and separately selected ports via the WAN

    I have 2 ExpressVPNs as Tier 1 and 2. Routing correctly shows 1x Quad9 + 1x Cloudflare DNS per connection.

    When both connections are up browsing is working but things like android updates, linux updates, etc don't work. This is true, whether i use the pool gateway or the individual gateways.

    When i force one of the VPNs down, then everything is working.
    So it seems i am missing something, perhaps route blocking or similar? Do you have some idea?


  • Netgate Administrator

    Hmm, if the VPNs are in a gateway group just one of them going down would not normally do anything.

    The fact that it changes anything implies perhaps you are still allowing the VPN connection to add new routes to the system, or even a new default route.

    Or potentially you have rules that are not applied when the gateway is down. Check in System > Advanced > Misc. Do you have 'Skip rules when gateway is down' enabled?

    Steve



  • I had it checked, and now unchecked. Doesn't make any difference. As soon as i enable the 2nd VPN the android clients are unable to update (as a practical example). In the gateway group VPN1 = Tier 1 and VPN2 = Tier 2 based on member down. So in theory, VPN2 shouldn't even be touched, but it seems like some kind of routing problem, although strangely browsing still works.
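
A minimal sketch of the failover behaviour expected from tiers here, assuming a gateway group uses only the lowest-numbered tier that still has an online member. The function name and data shapes are hypothetical; this illustrates the expected logic and is not pfSense's actual code.

```python
def active_members(group: dict, online: set) -> list:
    """Return the gateways that should carry traffic for a tiered group.

    group  maps gateway name -> tier number (1 is the most preferred)
    online is the set of gateway names currently considered up
    """
    candidates = [(tier, name) for name, tier in group.items() if name in online]
    if not candidates:
        return []  # every member is down
    best_tier = min(tier for tier, _ in candidates)
    return sorted(name for tier, name in candidates if tier == best_tier)


# Group as described in the thread: VPN1 on Tier 1, VPN2 on Tier 2.
VPNGW = {"VPN1": 1, "VPN2": 2}
```

Under that assumption, `active_members(VPNGW, {"VPN1", "VPN2"})` selects only VPN1, and VPN2 is selected only once VPN1 is marked down. Two members on the same tier would both be returned, which is the load-balancing case.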


  • Netgate Administrator

    Hmm. Probably need to dig into the ruleset and states here to see what rules are passing the traffic and where.



  • something much more fundamental Steve.

    I turned all rules off and only opened DNS and HTTP/HTTPS from LAN to Gateway Pool.

    With 2x OpenVPN Client active; i can browse the net, but android updates time out.
    Shutdown one of the OpenVPN clients, and android updates takes off.


  • Banned

    @gwaitsi said in [solved] BUG: pfsense 2.4.4 update_breaks http/https from LAN - workaround:

    With 2x OpenVPN Client active; i can browse the net, but android updates time out.
    Shutdown one of the OpenVPN clients, and android updates takes off.

    Well maybe the android update servers block that VPN provider/IP.



  • @grimson Nope. Both VPN1 and 2 are ExpressVPN access points in different countries. If either one is disabled, it works. It is only when both are enabled at the same time that it doesn't work. Android was just a quick/easy example. Linux Mint updates break: only the standard servers are found, at a much lower bandwidth, and no local servers. When only 1 VPN is enabled, the linux standard servers increase bandwidth dramatically and the local servers can be found.


  • Netgate Administrator

    Appears like it's load-balancing between those gateways but they should be failover right?

    Check how they show in /tmp/rules.debug.

    Steve



  • yep, trigger level is member down.
    not sure what i am looking for m8, but here is what i see

    Gateways

    GWRED_DHCP = " route-to ( em0 192.168.0.1 ) "
    GWLUXVPN_VPNV4 = " route-to ( ovpnc1 10.149.0.73 ) "
    GWVPNNLD_VPNV4 = " route-to ( ovpnc3 10.156.0.129 ) "
    GWExpressVPN = " route-to { ( ovpnc1 10.149.0.73 ) } "

    Load balancing anchor

    rdr-anchor "relayd/*"

    UPnPd rdr anchor

    rdr-anchor "miniupnpd"

    anchor "relayd/*"
    anchor "openvpn/*"
    anchor "ipsec/*"

    VPN Rules

    anchor "tftp-proxy/*"



  • @stephenw10 I am guessing the issue relates to the pushing of the DNS?

    Options error: option 'route' cannot be used in this context ([PUSH-OPTIONS])
    Options error: option 'dhcp-option' cannot be used in this context ([PUSH-OPTIONS])
    Options error: option 'redirect-gateway' cannot be used in this context ([PUSH-OPTIONS])
    PUSH: Received control message: 'PUSH_REPLY,redirect-gateway def1,dhcp-option DNS 10.154.0.1,comp-lzo no,route 10.154.0.1,topology net30,ping 10,ping-restart 60,ifconfig 10.154.1.18 10.154.1.17,peer-id 23'
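
A small sketch for reading that log: it splits the PUSH_REPLY string into individual options so the routing-related ones the server pushes are easy to see. The helper names and the `ROUTING_OPTIONS` set are hypothetical; the sample string is the one from the log above.

```python
ROUTING_OPTIONS = {"route", "redirect-gateway", "dhcp-option"}


def parse_push_reply(message: str) -> dict:
    """Split an OpenVPN PUSH_REPLY string into {option_name: [argument strings]}."""
    prefix = "PUSH_REPLY,"
    if not message.startswith(prefix):
        raise ValueError("not a PUSH_REPLY message")
    options = {}
    for item in message[len(prefix):].split(","):
        name, _, args = item.strip().partition(" ")
        options.setdefault(name, []).append(args)
    return options


def routing_related(options: dict) -> dict:
    """Keep only the options that would change routes or DNS on the client."""
    return {name: args for name, args in options.items() if name in ROUTING_OPTIONS}


SAMPLE = (
    "PUSH_REPLY,redirect-gateway def1,dhcp-option DNS 10.154.0.1,comp-lzo no,"
    "route 10.154.0.1,topology net30,ping 10,ping-restart 60,"
    "ifconfig 10.154.1.18 10.154.1.17,peer-id 23"
)
```

`routing_related(parse_push_reply(SAMPLE))` leaves `redirect-gateway`, `dhcp-option` and `route`, which are the same three options named in the "Options error" lines.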


  • Netgate Administrator

    Hmm, well those options look broken. I assume the servers are sending them to you.

    You might try the options in the OpenVPN client 'do not pull routes' and or 'do not add routes'. Those may prevent whatever routing changes are happening.

    You definitely don't want it redirecting the gateway. Luckily that's a broken option!

    Your file does not have a load-balance group shown. The behaviour is very much like that but can't be for that reason.

    Steve
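
A minimal sketch of the check on /tmp/rules.debug: count the members inside each gateway's `route-to`. It assumes a load-balanced group would list more than one `( interface next-hop )` pair in the braces; the regex and function name are hypothetical, and the lines are the ones posted above.

```python
import re

# Matches one "( interface next-hop )" pair inside a route-to definition.
MEMBER = re.compile(r"\(\s*(\S+)\s+(\S+)\s*\)")


def gateway_members(line: str):
    """Return (gateway_name, [(interface, next_hop), ...]) for one definition line."""
    name, _, value = line.partition("=")
    return name.strip(), MEMBER.findall(value)


RULES_DEBUG_LINES = [
    'GWRED_DHCP = " route-to ( em0 192.168.0.1 ) "',
    'GWLUXVPN_VPNV4 = " route-to ( ovpnc1 10.149.0.73 ) "',
    'GWVPNNLD_VPNV4 = " route-to ( ovpnc3 10.156.0.129 ) "',
    'GWExpressVPN = " route-to { ( ovpnc1 10.149.0.73 ) } "',
]

MEMBERS = dict(gateway_members(line) for line in RULES_DEBUG_LINES)
```

`MEMBERS["GWExpressVPN"]` contains the single pair `("ovpnc1", "10.149.0.73")`, so on this output the group resolves to one VPN only, which matches a failover group rather than a load-balanced one.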



  • it is definitely to do with the openvpn client config. using "force gateway" down doesn't help.
    Only disabling one of either of the clients works.


  • Netgate Administrator

    Ok so what actually changes when you disable one? Or presumably enable it again, breaking stuff?

    Different routes? Different DNS?

    Steve



  • Below is the working route/gateway table. Firewall rules used to push traffic to VPN

    Internet:
    Destination        Gateway            Flags     Netif Expire
    default            192.168.0.1        UGS         em0
    1.1.1.1            10.156.0.29        UGHS     ovpnc3
    9.9.9.9            10.156.0.29        UGHS     ovpnc3
    10.156.0.29        link#9             UH       ovpnc3
    10.156.0.30        link#9             UHS         lo0
    46.182.19.48       192.168.0.1        UGHS        em0
    81.3.27.54         192.168.0.1        UGHS        em0
    91.xx.xx.xx        10.156.0.29        UGHS     ovpnc3
    127.0.0.1          link#6             UH          lo0
    192.168.0.0/24     link#1             U           em0
    192.168.0.234      link#1             UHS         lo0
    192.168.20.0/24    link#2             U           em1
    192.168.20.5       link#2             UHS         lo0
    192.168.21.0/24    link#4             U           em3
    192.168.21.5       link#4             UHS         lo0
    

    Table from when it does not work.

    Internet:
    Destination        Gateway            Flags     Netif Expire
    default            192.168.0.1        UGS         em0
    1.0.0.1            10.149.0.13        UGHS     ovpnc1
    1.1.1.1            10.156.0.29        UGHS     ovpnc3
    9.9.9.9            10.156.0.29        UGHS     ovpnc3
    9.9.9.10           10.149.0.13        UGHS     ovpnc1
    10.149.0.13        link#10            UH       ovpnc1
    10.149.0.14        link#10            UHS         lo0
    10.156.0.29        link#9             UH       ovpnc3
    10.156.0.30        link#9             UHS         lo0
    46.182.19.48       192.168.0.1        UGHS        em0
    81.3.27.54         192.168.0.1        UGHS        em0
    85.xx.xx.xx        10.149.0.13        UGHS     ovpnc1
    91.xx.xx.xx        10.156.0.29        UGHS     ovpnc3
    127.0.0.1          link#6             UH          lo0
    192.168.0.0/24     link#1             U           em0
    192.168.0.234      link#1             UHS         lo0
    192.168.20.0/24    link#2             U           em1
    192.168.20.5       link#2             UHS         lo0
    192.168.21.0/24    link#4             U           em3
    192.168.21.5       link#4             UHS         lo0
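
To answer the "what actually changes" question, the two tables can be compared mechanically. This is a minimal sketch with hypothetical helper names; the table text is copied from the two listings above.

```python
def parse_routes(text: str) -> dict:
    """Map destination -> (gateway, interface) from netstat-style route output."""
    routes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] == "Destination":
            continue  # skip the header and short lines
        destination, gateway, _flags, netif = fields[:4]
        routes[destination] = (gateway, netif)
    return routes


def added_routes(before: dict, after: dict) -> dict:
    """Routes present in `after` but not in `before`."""
    return {dest: hop for dest, hop in after.items() if dest not in before}


WORKING = """
Destination        Gateway            Flags     Netif Expire
default            192.168.0.1        UGS         em0
1.1.1.1            10.156.0.29        UGHS     ovpnc3
9.9.9.9            10.156.0.29        UGHS     ovpnc3
10.156.0.29        link#9             UH       ovpnc3
10.156.0.30        link#9             UHS         lo0
46.182.19.48       192.168.0.1        UGHS        em0
81.3.27.54         192.168.0.1        UGHS        em0
91.xx.xx.xx        10.156.0.29        UGHS     ovpnc3
127.0.0.1          link#6             UH          lo0
192.168.0.0/24     link#1             U           em0
192.168.0.234      link#1             UHS         lo0
192.168.20.0/24    link#2             U           em1
192.168.20.5       link#2             UHS         lo0
192.168.21.0/24    link#4             U           em3
192.168.21.5       link#4             UHS         lo0
"""

FAILING = """
Destination        Gateway            Flags     Netif Expire
default            192.168.0.1        UGS         em0
1.0.0.1            10.149.0.13        UGHS     ovpnc1
1.1.1.1            10.156.0.29        UGHS     ovpnc3
9.9.9.9            10.156.0.29        UGHS     ovpnc3
9.9.9.10           10.149.0.13        UGHS     ovpnc1
10.149.0.13        link#10            UH       ovpnc1
10.149.0.14        link#10            UHS         lo0
10.156.0.29        link#9             UH       ovpnc3
10.156.0.30        link#9             UHS         lo0
46.182.19.48       192.168.0.1        UGHS        em0
81.3.27.54         192.168.0.1        UGHS        em0
85.xx.xx.xx        10.149.0.13        UGHS     ovpnc1
91.xx.xx.xx        10.156.0.29        UGHS     ovpnc3
127.0.0.1          link#6             UH          lo0
192.168.0.0/24     link#1             U           em0
192.168.0.234      link#1             UHS         lo0
192.168.20.0/24    link#2             U           em1
192.168.20.5       link#2             UHS         lo0
192.168.21.0/24    link#4             U           em3
192.168.21.5       link#4             UHS         lo0
"""

BEFORE, AFTER = parse_routes(WORKING), parse_routes(FAILING)
NEW = added_routes(BEFORE, AFTER)
```

`NEW` holds five entries, all tied to ovpnc1: the tunnel endpoints 10.149.0.13 and 10.149.0.14, the host routes for 1.0.0.1 and 9.9.9.10, and 85.xx.xx.xx. The default route is identical in both tables, so the second client is not replacing it.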
    

    This is the traceroute (which works) from when both VPNs are up.

    traceroute to mintlinux.mirror.wearetriple.com (93.187.10.106), 30 hops max, 60 byte packets
     1  10.156.0.1 (10.156.0.1)  30.850 ms *  30.785 ms
     2  * v741.ce01.ams-01.nl.leaseweb.net (37.48.118.60)  30.738 ms *
     3  * ae-5.cr01.ams-01.nl.leaseweb.net (81.17.33.128)  30.667 ms *
     4  be-111.bb03.ams-01.leaseweb.net (31.31.38.200)  30.622 ms * be-112.bb03.ams-01.leaseweb.net (31.31.38.204)  30.578 ms
     5  * triple-it.telecity2.nl-ix.net (193.239.116.57)  37.565 ms *
     6  mirror.wearetriple.com (93.187.10.106)  37.501 ms  25.516 ms *
    

    however, this is the error from the linux package manager and as i say, on the whole http browsing works.

    Failed to fetch http://mintlinux.mirror.wearetriple.com/packages/dists/tessa/InRelease  Cannot initiate the connection to mintlinux.mirror.wearetriple.com:80 (2a00:1f00:dc06:10::106). - connect (101: Network is unreachable) Could not connect to mintlinux.mirror.wearetriple.com:80 (93.187.10.106), connection timed
    

    you can see it is resolving in the application error message, so i don't think it is a dns issue. If i shutdown either of the VPN clients, this will work.
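
A minimal sketch for separating the two failures in that error message: resolve the name, then try a TCP connection to each returned address and report the outcome per address family. The `probe` function is hypothetical and would be run from the affected LAN client; a resolution failure raises `socket.gaierror` before any connection is attempted.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> list:
    """Resolve `host`, then try a TCP connect to every address returned.

    Returns a list of (family, address, outcome) tuples, where outcome is
    "ok" or "failed: <reason>".
    """
    results = []
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        try:
            with socket.socket(family, socktype, proto) as conn:
                conn.settimeout(timeout)
                conn.connect(sockaddr)
            outcome = "ok"
        except OSError as exc:  # includes timeouts and "Network is unreachable"
            outcome = f"failed: {exc}"
        results.append((label, sockaddr[0], outcome))
    return results
```

Calling `probe("mintlinux.mirror.wearetriple.com", 80)` with both VPN clients up, and again with one shut down, would show whether the IPv4 connect itself times out or whether only the IPv6 attempt fails.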

