[UPDATE: Resolved] Outbound NAT: Some ICMP Echo Replies are silently dropped
-
Hi,
I'm seeing some odd behaviour with ICMP and I'm hoping someone can point me in the right direction.
My setup is as follows:
ISP---[PPPoE]---WAN[pfSense]LAN---Client
My pfSense is virtualised, using VirtIO drivers, on top of KVM/Proxmox. I have turned off all checksumming. Because of PPPoE, my MTU is 1492 for packets heading to the Internet. Traffic in general passes fine, though I have noticed some edge cases of TCP stalling/dropping, which looks to me like an MTU problem. So I went looking and ended up here today.
From my client, a Linux PC, I see the following behaviour:
(Note: all pings are run with Do Not Fragment requested.)
Ping with a very large size and the expected reply:
tim@micro:~$ ping -M do -s 1490 8.8.8.8 -c1
PING 8.8.8.8 (8.8.8.8) 1490(1518) bytes of data.
ping: local error: Message too long, mtu=1500
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
Ping with 1444 size - I get a reply as expected.
tim@micro:~$ ping -M do -s 1444 8.8.8.8 -c1
PING 8.8.8.8 (8.8.8.8) 1444(1472) bytes of data.
1452 bytes from 8.8.8.8: icmp_seq=1 ttl=55 time=36.4 ms
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 36.456/36.456/36.456/0.000 ms
Ping with a size of 1465
tim@micro:~$ ping -M do -s 1465 8.8.8.8 -c1
PING 8.8.8.8 (8.8.8.8) 1465(1493) bytes of data.
ping: local error: Message too long, mtu=1492
^C
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
Ping with a size of 1464 (one less than previous ping)
tim@micro:~$ ping -M do -s 1464 8.8.8.8 -c1
PING 8.8.8.8 (8.8.8.8) 1464(1492) bytes of data.
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
The last one there is dropped. No "must fragment" is received.
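For reference, the second number ping prints in parentheses is the full IP packet size: the payload given with -s plus 8 bytes of ICMP header plus 20 bytes of IPv4 header (assuming no IP options). A minimal Python sketch of that arithmetic, with helper names that are purely illustrative:

```python
IP_HEADER = 20    # IPv4 header, assuming no options
ICMP_HEADER = 8   # ICMP echo header


def ip_packet_size(payload: int) -> int:
    """Total IPv4 packet size for a ping sent with `-s payload`."""
    return payload + ICMP_HEADER + IP_HEADER


def fits(payload: int, mtu: int) -> bool:
    """True if the echo request fits in one packet at the given MTU."""
    return ip_packet_size(payload) <= mtu


if __name__ == "__main__":
    # The four payload sizes tried above, checked against the PPPoE MTU.
    for size in (1444, 1464, 1465, 1490):
        print(size, ip_packet_size(size), fits(size, 1492))
```

By this arithmetic, -s 1464 produces exactly 1492 bytes, so the request itself fits the PPPoE MTU, and -s 1465 is the first size that does not.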
However, when I run tcpdump on my pppoe0 interface while sending ping -M do -s 1464 8.8.8.8, I see both the pings leaving the box AND the replies being received:
tim@micro:~$ ping -M do -s 1464 8.8.8.8 -c 3
PING 8.8.8.8 (8.8.8.8) 1464(1492) bytes of data.
--- 8.8.8.8 ping statistics ---
----------------------------- TCP DUMP ON PPPoE Interface OF ABOVE 3 PINGS ---------------------------------
[2.4.2-RELEASE][admin@trogdor.muppetz.com]/root: tcpdump -i pppoe0 icmp and not host 202.56.33.251
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on pppoe0, link-type NULL (BSD loopback), capture size 262144 bytes
10:27:13.670050 IP 202-137-243-17.static.nownz.co.nz > google-public-dns-a.google.com: ICMP echo request, id 21470, seq 1, length 1472
10:27:13.706961 IP google-public-dns-a.google.com > 202-137-243-17.static.nownz.co.nz: ICMP echo reply, id 21470, seq 1, length 1448
10:27:13.707021 IP google-public-dns-a.google.com > 202-137-243-17.static.nownz.co.nz: icmp
10:27:14.697492 IP 202-137-243-17.static.nownz.co.nz > google-public-dns-a.google.com: ICMP echo request, id 21470, seq 2, length 1472
10:27:14.731000 IP google-public-dns-a.google.com > 202-137-243-17.static.nownz.co.nz: ICMP echo reply, id 21470, seq 2, length 1448
10:27:14.731098 IP google-public-dns-a.google.com > 202-137-243-17.static.nownz.co.nz: icmp
10:27:15.737654 IP 202-137-243-17.static.nownz.co.nz > google-public-dns-a.google.com: ICMP echo request, id 21470, seq 3, length 1472
10:27:15.771040 IP google-public-dns-a.google.com > 202-137-243-17.static.nownz.co.nz: ICMP echo reply, id 21470, seq 3, length 1448
10:27:15.771090 IP google-public-dns-a.google.com > 202-137-243-17.static.nownz.co.nz: icmp
Doing the same command and doing a tcpdump on the LAN interface, I see only 1-way traffic:
tim@micro:~$ ping -M do -s 1464 8.8.8.8 -c 3
PING 8.8.8.8 (8.8.8.8) 1464(1492) bytes of data.
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2059ms
----------------------------- TCP DUMP ON LAN Interface OF ABOVE 3 PINGS ---------------------------------
[2.4.2-RELEASE][admin@trogdor.muppetz.com]/root: tcpdump -i vtnet1 icmp and not host 202.56.33.251
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vtnet1, link-type EN10MB (Ethernet), capture size 262144 bytes
10:28:55.129186 IP micro.muppetz.com > google-public-dns-a.google.com: ICMP echo request, id 1873, seq 1, length 1472
10:28:56.148912 IP micro.muppetz.com > google-public-dns-a.google.com: ICMP echo request, id 1873, seq 2, length 1472
10:28:57.189876 IP micro.muppetz.com > google-public-dns-a.google.com: ICMP echo request, id 1873, seq 3, length 1472
If I look at the session table for 3 failing pings I see:
LAN icmp 192.168.0.5:1874 -> 8.8.8.8:1874 0:0 3 / 0 4 KiB / 0 B
WAN icmp 202.137.243.17:32606 (192.168.0.5:1874) -> 8.8.8.8:32606 0:0 3 / 0 4 KiB / 0 B
If I look at the session table for 3 passing pings I see:
LAN icmp 192.168.0.5:1880 -> 8.8.8.8:1880 0:0 3 / 3 4 KiB / 4 KiB
WAN icmp 202.137.243.17:48100 (192.168.0.5:1880) -> 8.8.8.8:48100 0:0 3 / 3 4 KiB / 4 KiB
To recap:
ping -M do -s 1444 is the maximum I can send and get replies from my Linux PC.
ping -M do -s 1445 through -s 1464 are dropped silently somewhere in pfSense: I see the replies arrive on the PPPoE interface with tcpdump, but I don't see them egress the LAN interface, and the Linux PC never gets a reply.
ping -M do -s 1465 gives me ping: local error: Message too long, mtu=1492 as expected.
Aaahhhh, I've just found they're being dropped by the firewall!
But why?
Feb 12 10:45:03 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,1042,0,+,1,icmp,1468,8.8.8.8,202.137.243.17,reply,33195,11448
Feb 12 10:45:03 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,1042,1448,none,1,icmp,44,8.8.8.8,202.137.243.17,
Feb 12 10:45:04 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,1590,0,+,1,icmp,1468,8.8.8.8,202.137.243.17,reply,33195,21448
Feb 12 10:45:04 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,1590,1448,none,1,icmp,44,8.8.8.8,202.137.243.17,
Feb 12 10:45:05 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,2061,0,+,1,icmp,1468,8.8.8.8,202.137.243.17,reply,33195,31448
Feb 12 10:45:05 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,2061,1448,none,1,icmp,44,8.8.8.8,202.137.243.17,
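To make those entries easier to read, here is a rough Python sketch that pulls the IPv4 fields out of a filterlog line. The field positions are an assumption based on the pfSense raw filter log layout (interface, action, IP ID, fragment offset, flags, length, source, destination), and treating "+" as the more-fragments flag is also an assumption. FragmentEntry and parse_filterlog are hypothetical names, not part of pfSense.

```python
from dataclasses import dataclass

IP_HEADER = 20  # IPv4 header, assuming no options


@dataclass
class FragmentEntry:
    interface: str
    action: str
    ip_id: int
    offset: int           # fragment offset as logged (bytes, judging by the values)
    more_fragments: bool  # assumption: "+" in the flags field means MF is set
    length: int           # total IP length of this packet or fragment
    src: str
    dst: str


def parse_filterlog(line: str) -> FragmentEntry:
    """Parse the CSV part of an IPv4 filterlog line (assumed field order)."""
    csv = line.partition("filterlog: ")[2] or line
    f = csv.split(",")
    return FragmentEntry(
        interface=f[4],
        action=f[6],
        ip_id=int(f[12]),
        offset=int(f[13]),
        more_fragments=(f[14] == "+"),
        length=int(f[17]),
        src=f[18],
        dst=f[19],
    )


# The first two entries from the log above (same IP ID, 1042).
lines = [
    "Feb 12 10:45:03 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,1042,0,+,1,icmp,1468,8.8.8.8,202.137.243.17,reply,33195,11448",
    "Feb 12 10:45:03 trogdor filterlog: 9,,,1000000103,pppoe0,match,block,in,4,0x0,,56,1042,1448,none,1,icmp,44,8.8.8.8,202.137.243.17,",
]
frags = [parse_filterlog(l) for l in lines]

# Sum of the payload carried by each fragment (IP length minus IP header).
icmp_bytes = sum(e.length - IP_HEADER for e in frags)

if __name__ == "__main__":
    for e in frags:
        print(e)
    print("reassembled ICMP bytes:", icmp_bytes)
```

Read this way, each pair of entries looks like two fragments of a single echo reply: a first fragment of 1468 bytes with more fragments to follow, then a 44-byte fragment at offset 1448. Together they carry 1472 bytes of ICMP, which matches the 1464-byte payload plus the 8-byte ICMP header. That suggests the replies from 8.8.8.8 arrive fragmented, though this is an inference from the log rather than something confirmed here.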
What rule is causing them to be discarded?
-
Probably bad form to reply to myself?
Anyway, I can reproduce this from the pfsense CLI itself:
ping 1444 = works
ping 1445-1464 = silent fail
ping 1465 = ICMP Must Fragment received
[2.4.2-RELEASE][admin@trogdor.muppetz.com]/var/log: ping -D -s 1444 -c1 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 1444 data bytes
1452 bytes from 8.8.8.8: icmp_seq=0 ttl=56 time=36.418 ms
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 36.418/36.418/36.418/0.000 ms
[2.4.2-RELEASE][admin@trogdor.muppetz.com]/var/log: ping -D -s 1445 -c1 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 1445 data bytes
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
[2.4.2-RELEASE][admin@trogdor.muppetz.com]/var/log: ping -D -s 1464 -c1 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 1464 data bytes
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
[2.4.2-RELEASE][admin@trogdor.muppetz.com]/var/log: ping -D -s 1465 -c1 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 1465 data bytes
36 bytes from localhost (127.0.0.1): frag needed and DF set (MTU 1492)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 05d5 0000 0 0000 40 01 2519 202.137.243.17 8.8.8.8
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
So it doesn't appear the LAN interface even forms part of the problem.
-
Cause Found!
If Disable Firewall Scrub is ticked (i.e. the firewall scrub is disabled), then the packets are dropped.
If the option is unticked (the default), the packets pass.
I'm still curious as to why this is. I've noticed it only happens towards Google (8.8.8.8 and 8.8.4.4).
Towards other hosts, it didn't matter whether this option was ticked or not.
I'd love to fully understand why. Does anyone with knowledge of the pf scrub option have any thoughts?