OpenVPN and pf scrub and big packets
-
The example scenario is 2 offices with a site-to-site OpenVPN connection between them. To replicate it I have set up a local pair of pfSense boxes with a subnet in the middle between them:
LAN 10.49.208.250/22 (happened to be the LAN on my home pfSense)
pfSenseA
OPT1 10.49.221.250
10.49.221.0/24 - middle subnet
WAN 10.49.221.1
pfSenseB
LAN 192.168.1.1/24
Put allow rules for everything. Added a gateway on pfSenseA pointing to pfSenseB WAN 10.49.221.1 and a route through that gateway to 192.168.1.0/24. Turned off outbound NAT on pfSenseB.
Now things route inside these 3 subnets without any NAT happening.
From a client (192.168.1.100) in pfSenseB LAN I can ping a client (10.49.211.0) in pfSenseA LAN. I can use any length ping. It happily gets fragmented when the resulting packet would get bigger than the 1500 MTU. The packet fragments pass back and forth through pfSenseB and pfSenseA and can be seen with packet capture - all good.
Now I add a site-to-site OpenVPN server listening on pfSenseA OPT1 and an OpenVPN client connecting from pfSenseB WAN across to the pfSenseA OPT1 server. I set the local/remote networks so that the traffic between 192.168* and 10.49* travels across the OpenVPN link. The tunnel subnet is 172.16.0.0/24.
Ping up to 1472 bytes works fine. I can see the echo and reply packet with packet capture at each interface:
Normal working example with ping 10.49.211.0 -l 1472
A - at source LAN
15:28:35.427672 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11818, length 1480
15:28:35.431684 IP 10.49.211.0 > 192.168.1.100: ICMP echo reply, id 1, seq 11818, length 1480
B - out client OpenVPN
15:27:55.182118 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11814, length 1480
15:27:55.185847 IP 10.49.211.0 > 192.168.1.100: ICMP echo reply, id 1, seq 11814, length 1480
C - into server OpenVPN
21:11:14.548049 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11806, length 1480
21:11:14.549585 IP 10.49.211.0 > 192.168.1.100: ICMP echo reply, id 1, seq 11806, length 1480
D - out destination LAN
21:12:00.686812 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11810, length 1480
21:12:00.688455 IP 10.49.211.0 > 192.168.1.100: ICMP echo reply, id 1, seq 11810, length 1480
Increase the ping length to 1473 and there is trouble:
Now with ping 10.49.211.0 -l 1473
A - at source LAN
15:31:22.523248 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11830, length 1480
15:31:22.523280 IP 192.168.1.100 > 10.49.211.0: ip-proto-1
15:31:22.525687 IP 172.16.0.1 > 192.168.1.100: ICMP host 10.49.211.0 unreachable, length 36
B - out client OpenVPN
15:30:06.414688 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11826, length 1480
15:30:06.414711 IP 192.168.1.100 > 10.49.211.0: ip-proto-1
15:30:06.417065 IP 172.16.0.1 > 192.168.1.100: ICMP host 10.49.211.0 unreachable, length 36
C - into server OpenVPN
21:10:01.591288 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11802, length 1480
21:10:01.591500 IP 172.16.0.1 > 192.168.1.100: ICMP host 10.49.211.0 unreachable, length 36
21:10:01.591625 IP 192.168.1.100 > 10.49.211.0: ip-proto-1
D - out destination LAN
21:09:03.522662 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11798, length 1481
At step C (arriving at the OpenVPN server) the system responds (sourced from 172.16.0.1 the server tunnel IP) with an unreachable message. That message is actually sent even before the 2nd small fragment of the full echo is received???
Then at step D a packet of length 1481 is sent out LAN - with 20 byte IPv4 header that is actually a total of 1501 bytes, bigger than the LAN MTU???
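To make the size arithmetic explicit, here is a rough Python sketch of how an ICMP echo of a given ping length should split into IPv4 fragments. It assumes a 20-byte IPv4 header with no options, an 8-byte ICMP header and a 1500-byte MTU; the function name fragment_icmp_echo is just made up for illustration and is not anything from pfSense or OpenVPN.

```python
def fragment_icmp_echo(ping_payload, mtu=1500, ip_header=20, icmp_header=8):
    """Return (offset, ip_payload_bytes, total_ip_bytes) for each fragment.

    Assumes an IPv4 header without options (20 bytes) and an 8-byte ICMP header.
    """
    remaining = ping_payload + icmp_header  # the whole ICMP message
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_chunk = (mtu - ip_header) // 8 * 8
    fragments = []
    offset = 0
    while remaining > 0:
        chunk = min(remaining, max_chunk)
        fragments.append((offset, chunk, chunk + ip_header))
        offset += chunk
        remaining -= chunk
    return fragments


for size in (1472, 1473):
    print(size, fragment_icmp_echo(size))
```

With these assumptions a 1472 ping fits exactly in one 1500-byte packet, while a 1473 ping should leave as a 1500-byte first fragment plus a tiny second fragment carrying 1 byte. That matches the "length 1480" line followed by the "ip-proto-1" line in the captures at A, B and C.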
The Windows client at 10.49.211.0 does not do anything with that. But interestingly if it is a Linux client it can receive the "too big" packet and make an echo reply. The echo reply is later than the "unreachable" that was generated in pfSense somewhere, so it gets back to the source node 192.168.1.100 too late to be any use.
Increase the ping length:
Now with ping 10.49.211.0 -l 1600
A - at source LAN
15:32:42.707428 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11834, length 1480
15:32:42.707462 IP 192.168.1.100 > 10.49.211.0: ip-proto-1
15:32:42.709975 IP 172.16.0.1 > 192.168.1.100: ICMP host 10.49.211.0 unreachable, length 36
B - out client OpenVPN
15:33:29.309912 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11838, length 1480
15:33:29.309931 IP 192.168.1.100 > 10.49.211.0: ip-proto-1
15:33:29.312388 IP 172.16.0.1 > 192.168.1.100: ICMP host 10.49.211.0 unreachable, length 36
C - into server OpenVPN
21:19:14.038766 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11842, length 1480
21:19:14.038973 IP 172.16.0.1 > 192.168.1.100: ICMP host 10.49.211.0 unreachable, length 36
21:19:14.039154 IP 192.168.1.100 > 10.49.211.0: ip-proto-1
D - out destination LAN
21:20:16.577948 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11846, length 1608
All similar stuff, but now packet capture sees a length 1608 packet being put on pfSenseA LAN!
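If anyone wants to check their own captures for the same symptom, something like the following Python sketch could flag lines where the reported length plus the IP header exceeds the interface MTU. It assumes the capture text looks like the lines in this post (trailing "length N" being the IP payload size) and a 20-byte IPv4 header without options; the helper name oversized is hypothetical.

```python
import re

# Matches the trailing "length N" that the capture prints on ICMP lines.
LENGTH_RE = re.compile(r"length (\d+)\s*$")


def oversized(capture_lines, mtu=1500, ip_header=20):
    """Return (total_ip_bytes, line) for every line that would exceed the MTU.

    Assumes "length" is the IP payload size and the IPv4 header has no options.
    """
    hits = []
    for line in capture_lines:
        match = LENGTH_RE.search(line)
        if match:
            total = int(match.group(1)) + ip_header
            if total > mtu:
                hits.append((total, line))
    return hits


capture = [
    "21:12:00.686812 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11810, length 1480",
    "21:09:03.522662 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11798, length 1481",
    "21:20:16.577948 IP 192.168.1.100 > 10.49.211.0: ICMP echo request, id 1, seq 11846, length 1608",
]
for total, line in oversized(capture):
    print(total, line)
```

Fed the three step D lines from this post, it leaves the 1472 ping alone and flags the other two as 1501 and 1628 bytes on a 1500 MTU LAN.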
I tried some advanced options in OpenVPN at both ends:
tun-mtu 1500;fragment 1300;mssfix;
That made no difference. The issue is where the packet fragments pop out of the OpenVPN tunnel and presumably pf tries to reassemble them, but gets it wrong somehow and something in the stack generates the "unreachable" response. But something else does reassemble the packet and transmit it on LAN without fragmenting it to match the LAN MTU.
If I do System->Advanced->Firewall, "Disable Firewall Scrub" then any length ping works. There is no attempt by pf to reassemble packet fragments, and so they seem to be passed through OK from OpenVPN and out onto LAN.
"Disable Firewall Scrub" is a global thing and will turn off the scrub functions for all interfaces. I do not expect to need to use it in this situation. There seems to be some underlying problem with packet fragments coming out of OpenVPN and being delivered to pf and being mistreated - firstly the "unreachable" response is generated, secondly the packet is actually reassembled and the too-big packet is transmitted on LAN.
Is there a bug in here somewhere?
Or am I missing some configuration option?
Or does everyone just "Disable Firewall Scrub" and forget about it?
For reference the OpenVPN server conf:
dev ovpns3
verb 1
dev-type tun
tun-ipv6
dev-node /dev/tun3
writepid /var/run/openvpn_server3.pid
#user nobody
#group nobody
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto udp
cipher AES-128-CBC
auth SHA1
up /usr/local/sbin/ovpn-linkup
down /usr/local/sbin/ovpn-linkdown
local 10.49.221.250
ifconfig 172.16.0.1 172.16.0.2
lport 1194
management /var/etc/openvpn/server3.sock unix
push "route 10.0.0.0 255.0.0.0"
route 192.168.0.0 255.255.252.0
secret /var/etc/openvpn/server3.secret
OpenVPN client conf:
dev ovpnc1
verb 1
dev-type tun
tun-ipv6
dev-node /dev/tun1
writepid /var/run/openvpn_client1.pid
#user nobody
#group nobody
script-security 3
daemon
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
proto udp
cipher AES-128-CBC
auth SHA1
up /usr/local/sbin/ovpn-linkup
down /usr/local/sbin/ovpn-linkdown
local 10.49.221.1
lport 0
management /var/etc/openvpn/client1.sock unix
remote 10.49.221.250 1194
ifconfig 172.16.0.2 172.16.0.1
route 10.0.0.0 255.0.0.0
secret /var/etc/openvpn/client1.secret
resolv-retry infinite
Both pfSense running 2.2.3-RELEASE
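As a sanity check that the two test hosts really are covered by the tunnel routes in the confs above, here is a small Python sketch using the standard ipaddress module. The helper name covered is made up; the addresses and masks are the ones from this post. It only tests address membership and says nothing about which route the kernel actually prefers.

```python
import ipaddress


def covered(host, route_net, route_mask):
    """True if host falls inside the network given as address plus netmask."""
    network = ipaddress.ip_network(f"{route_net}/{route_mask}")
    return ipaddress.ip_address(host) in network


# Server side "route 192.168.0.0 255.255.252.0" should cover the pfSenseB LAN client.
print(covered("192.168.1.100", "192.168.0.0", "255.255.252.0"))
# Client side "route 10.0.0.0 255.0.0.0" should cover the pfSenseA LAN client.
print(covered("10.49.211.0", "10.0.0.0", "255.0.0.0"))
# 10.49.211.0 is an ordinary host address inside the /22 LAN of pfSenseA.
lan = ipaddress.ip_interface("10.49.208.250/22").network
print(lan, ipaddress.ip_address("10.49.211.0") in lan)
```

All three checks come out true, so the routing side of the config looks consistent with the working 1472 ping.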
-
Did you find a solution?