[solved] BUG: pfSense 2.4.4 update breaks http/https from LAN - workaround

4o4rh

After upgrading from 2.4.3 I found my web access no longer worked from either of my LAN segments, though nslookup, traceroute, smtp/pop, etc. were still working.

I have two VPN connections with a gateway group set up:
System/Routing/Gateways

Default IPv4 = VPNGW

System/Routing/Gateway Groups
VPNGW = VPN1 Tier1, VPN2 Tier2

Firewall Rules
WAN
WAN - Block All IPv4/6

LAN
LAN - Block All IPv4/6
CHINA - Allow IP Table GW = WAN
NET - Allow DNSSEC/TELNET GW = VPNGW
POP - Allow SMTPS/POP3S GW = VPNGW
HTTP - Allow HTTP/HTTPS GW = VPNGW **

The above config worked in 2.4.3 but stopped working after upgrading to 2.4.4.
I found that setting the GW for HTTP to * resolved the issue, so the new config is:

LAN
LAN - Block All IPv4/6
CHINA - Allow IP Table GW = WAN
NET - Allow DNSSEC/TELNET GW = VPNGW
POP - Allow SMTPS/POP3S GW = VPNGW
HTTP - Allow HTTP/HTTPS GW = *

traceroute on port 80 shows it going via the VPN, but I imagine we should be able to force the gateway,
even if the default gateway is the same.
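
For readers unfamiliar with gateway groups, here is a minimal sketch of how a tiered group like VPNGW above is generally understood to choose its active gateway: the lowest-numbered tier that still has an online member wins, and the next tier is used only when that tier is entirely down. The `Gateway` class and `active_gateways` function are hypothetical names for illustration, not pfSense code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gateway:
    name: str
    tier: int      # lower number = higher priority
    online: bool   # result of gateway monitoring (assumed known)


def active_gateways(group: List[Gateway]) -> List[str]:
    """Return the names of the gateways traffic would use right now."""
    online = [g for g in group if g.online]
    if not online:
        return []  # whole group is down
    best_tier = min(g.tier for g in online)
    # Several gateways on the same tier would share the traffic.
    return [g.name for g in online if g.tier == best_tier]


if __name__ == "__main__":
    vpngw = [Gateway("VPN1", 1, True), Gateway("VPN2", 2, True)]
    print(active_gateways(vpngw))   # VPN1 is preferred while it is up
    vpngw[0].online = False
    print(active_gateways(vpngw))   # fails over to VPN2
```

With VPN1 on Tier 1 and VPN2 on Tier 2, this models a pure failover pair rather than load balancing.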

stephenw10

Is this in 2.4.4p2? There were some issues with the new default gateway group setting in 2.4.4 but they were mostly resolved in p1.

Is it actually routing http/s traffic over the VPN in that setup? There should be no difference in having VPNGW set as default gateway or policy routing to it.

Steve

4o4rh

It is p2, and there should be no difference, but for some reason there is.
According to traceroute on port 80, it goes via the VPN as intended.

stephenw10

The clients are using assigned DNS servers directly?

If they are using unbound in pfSense I could imagine some change in the default gateway handling making a difference there perhaps.

Steve

4o4rh

Clients point to the pfSense LAN address (2x) for DNS on the standard port 53.
53 is blocked in/out of WAN.
853 is allowed, and all DNS servers for pfSense are DNSSEC.

stephenw10

Ah so your DNS servers are local effectively but you were sending http/s traffic over VPN.

I could see some things having a problem with that but not all.

In 2.4.3 you could not set a gateway group as the default gateway so what was it set to there?

I would set it back and run a packet capture on the VPN interface. See what is actually leaving across it and what comes back.

Steve

4o4rh

In the previous setup where it was working prior to upgrading to 2.4.4;

default gateway was WAN interface
all incoming/outgoing blocked
allow rules had the GW set for vpn pool (2x ExpressVPN)
all 80/443/POP/etc goes via vpn
voip goes over WAN because i had problems with calls dropping when going via the VPN

Under 2.4.4 if the above settings are used, the following problems are exhibited

access to 80/443 pages times out and gets stuck
POP/SMTP work
SIP registration works (with rules pointing to WAN)

to fix the 80/443 access, i have to change the settings to;

default gateway is VPN pool (2x ExpressVPN)
all incoming/outgoing blocked
allow rules point to * and not specific gateway
voip sip fails to register when pointing to WAN gateway
voip sip registers when pointing to * - but then i am back to calls dropping over the VPN

In summary,

prior to 2.4.4 the possible use cases for the default vs specified gateways all worked
under 2.4.4 the possible use cases for the default vs specific gateways do not work

stephenw10

The big difference that would apply here between 2.4.3 and 2.4.4 was the additional default gateway handling code.

If it wasn't specifically defined in 2.4.3 it could be set to the wrong gateway or no gateway in 2.4.4. That was fixed in 2.4.4p1.

Changing the default gateway to the VPN group will mean Unbound uses it for DNS. That should remove any DNS issues with traffic using different routes.

You will have to see what's actually failing when you send that traffic in a packet capture.

Steve

4o4rh

I have done some testing, and it seems that whenever I set a gateway other than the default gateway,
I get bad UDP checksums on the pfSense-to-client path.

When I set the default gateway to the WAN, the VoIP box works OK because it uses the default gateway.
However, browsing on 80/443 uses the VPN Pool gateway and I get the below.

192.168.20.x.53 > 192.168.20.x.35918: [bad udp cksum 0xa9ef -> 0x21fd!] 22042 q: AAAA? star.c10r.facebook.com. 1/0/1

When I set the default gateway to the VPN Pool, browsing works because it is using the VPN as the default gateway. The VoIP box returns the same bad UDP checksum errors as above, as it is using the WAN as a specified gateway.
In both cases, the UDP checksum is OK from the client to the pfSense box. The persistent UDP checksum errors occur only from pfSense to the client, and only DNS has this problem.
Recap: the clients point to pfSense for DNS (53), and pfSense (853) points to DNSSEC servers.
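
To check the "only port 53, only one direction" observation across a whole capture rather than by eye, the bad-checksum lines can be tallied. This is a rough sketch that assumes verbose tcpdump text output in the same shape as the line quoted above, where the first hex value is taken to be the checksum found in the packet and the second the one tcpdump computed. The function name `parse_bad_cksum` is made up for illustration.

```python
import re
from typing import Optional

# Matches e.g. "a.b.c.d.53 > a.b.c.d.35918: [bad udp cksum 0xa9ef -> 0x21fd!]"
BAD_CKSUM = re.compile(
    r"(?P<src>\S+)\.(?P<sport>\d+) > (?P<dst>\S+)\.(?P<dport>\d+): "
    r"\[bad udp cksum (?P<found>0x[0-9a-f]+) -> (?P<expected>0x[0-9a-f]+)!\]"
)


def parse_bad_cksum(line: str) -> Optional[dict]:
    """Return the ports and checksums from a bad-checksum line, or None."""
    m = BAD_CKSUM.search(line)
    if m is None:
        return None
    return {
        "src": m.group("src"),
        "sport": int(m.group("sport")),
        "dst": m.group("dst"),
        "dport": int(m.group("dport")),
        "found": int(m.group("found"), 16),
        "expected": int(m.group("expected"), 16),
    }


if __name__ == "__main__":
    sample = ("192.168.20.x.53 > 192.168.20.x.35918: "
              "[bad udp cksum 0xa9ef -> 0x21fd!] 22042 q: AAAA? "
              "star.c10r.facebook.com. 1/0/1")
    print(parse_bad_cksum(sample))
```

Counting the results by source port and direction would show whether the errors really are limited to replies sent from port 53 on the firewall.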

stephenw10

That checksum error could just be hardware offloading on the NIC. You can try disabling all the offloading in System > Advanced > Networking. You usually have to reboot for that to apply or at least re-save the interfaces.

Steve

4o4rh

That doesn't really make sense, Steve. If that were the case, the problem would not be restricted to a specific port and/or direction. This problem is very clearly only on port 53 in the pfSense-to-client direction. To me this seems more like an issue with DNS Resolver and/or queries coming in on 53 and being forwarded on 853.

stephenw10

If you have 'Hardware Checksum Offloading' enabled that's exactly how it can appear in a packet capture for transmitted packets.

https://www.wireshark.org/docs/wsug_html_chunked/ChAdvChecksums.html

I would not normally expect that to be for just the DNS traffic but you should disable it anyway to be sure it's not just a quirk of the capture.

Steve
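
To illustrate why a capture on the sending host can flag this: the UDP checksum covers a pseudo-header (source IP, destination IP, protocol, length) plus the UDP header and payload. With checksum offloading the NIC fills the field in after the capture point, so the capture sees a value that does not verify. Below is a minimal sketch of the standard calculation (RFC 768 style). The addresses and payload are hypothetical, since the real ones are redacted in the thread, and the function names are made up for illustration.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum, padding odd-length data with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_len: int) -> bytes:
    # zero byte, protocol 17 (UDP), UDP length
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_len))


def build_segment(sport: int, dport: int, payload: bytes, cksum: int = 0) -> bytes:
    return struct.pack("!HHHH", sport, dport, 8 + len(payload), cksum) + payload


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a segment whose checksum field is currently zero."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return (~total & 0xFFFF) or 0xFFFF  # a computed 0 is sent as 0xFFFF


def is_valid(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A correct segment sums to 0xFFFF when its checksum field is included."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF


if __name__ == "__main__":
    src, dst = "192.168.20.1", "192.168.20.2"  # hypothetical addresses
    unfilled = build_segment(53, 35918, b"dns reply")  # what a pre-NIC capture might see
    filled = build_segment(53, 35918, b"dns reply", udp_checksum(src, dst, unfilled))
    print(is_valid(src, dst, unfilled), is_valid(src, dst, filled))
```

The client still receives the filled-in version if the NIC does its job, which is why such errors in a capture are often cosmetic.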

4o4rh

@stephenw10 Thanks Steve, I stand corrected. That actually appears to have solved the problem.

stephenw10

The entire problem or just removed the errors from the packet capture?

Usually those checksum errors are just false positives, annoying but ultimately harmless.

Steve

4o4rh

Disabling the checksum offloading allowed browsing to return to normal.
The only problem I am having now is Linux and Debian services upgrading via the VPN. It seems to be a name resolution problem, but most likely unrelated (although it did work in the previous version as well).

stephenw10

Mmm, interesting. I wonder if a driver update enabled that on your NICs on FreeBSD 11.2. Or indeed broke it.

Steve

4o4rh

fyi my system has 4*Intel WG82583 10/100M/1000M Lan

stephenw10

Hmm, curious. There almost certainly were updates to igb in FreeBSD 11.2. I run that in almost everything though.

Are those part of an SoC? It looks like they are not. There's a lot of fake Intel chips about unfortunately. Speculation at this point. I'd just leave it disabled and move on.

Steve

4o4rh

My box is a J1900 N10 with two LAN segments and one WAN.
Unfortunately, I still think there is an issue with the Gateway Pool Routing. Although disabling the h/w checksum offloading enables browsing, I am having problems with POP/SMTP and with Linux Mint being able to find package update servers. They all come up with 0kb/s speed. If I switch the gateway from the VPN to default, it works. So back to the original problem, none of which existed in 2.4.3.

What I can say is that I am using 2x ExpressVPN access points as a failover, and if I set the specific gateways to the individual access interface, it works OK. It is only a problem when selecting a pool, per my original problem description. So for me 2.4.4 definitely breaks a setup that was previously working under 2.4.2 and 2.4.3.

4o4rh

Steve, I still have issues with Linux Mint and Debian clients accessing the package updates via the VPN, and Google Mail struggles. These all previously worked. One thing I notice under Diag - Interfaces:
the DNS servers all show below the WAN interface.
127.0.0.1
1.1.1.1
9.9.9.9
1.0.0.1
9.9.9.10
81.3.27.54
46.182.19.48
In actual fact, they are assigned to gateways in the general settings as follows:
1.1.1.1, 9.9.9.9 = vpn1
1.0.0.1, 9.9.9.10 = vpn2
81.3.27.54, 46.182.19.48 = wan