Tracking down bad MTU
-
Hey team,
So I have a strange issue I'm trying to confirm and was looking for some input.
Network is small, Cisco Catalyst 3560-CX with 10Gb LACP bond to a PFSense box I have running. The NIC is an Intel 10Gb card (not sure how to get exact model from shell).
A couple of months ago I moved to a PFSense box and since then have not been able to get my TMobile Home CellSpot to work correctly. There are a few threads on the site about both the TMobile and ATT towers having issues, and I've tried all the fixes:
Changing the MTU on the interfaces and on the switches, disabling scrubbing and hardware offloading, clearing the fragment bits. Nothing seems to work. As far as I can tell, the MTU is a solid 1500 everywhere, but in a capture I still see fragmentation happening, and that is the only thing wrong with the traffic that I can find.
Also, after switching, the PS3 and PS4 can both see that fragmentation is happening and show a warning when configuring the network interfaces (also something I've seen on the thread before).
I hate chasing ghosts, but I really think there might be something not matching up here MTU-wise. Any input would be great. Thanks!
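For anyone who wants to confirm from a capture that packets really are fragments (rather than trusting a console warning), here is a minimal Python sketch that decodes the fragmentation fields of a raw IPv4 header. The function name fragment_info is made up for illustration, and it assumes you have already extracted the first 20 bytes of the IP header from the capture.

```python
import struct


def fragment_info(ipv4_header: bytes) -> dict:
    """Decode the fragmentation-related fields of a raw IPv4 header.

    Hypothetical helper: expects at least the fixed 20-byte IPv4 header.
    """
    if len(ipv4_header) < 20:
        raise ValueError("need at least 20 bytes of IPv4 header")
    if ipv4_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")

    # Bytes 2-3: total length. Bytes 6-7: 3 flag bits + 13-bit fragment offset.
    (total_length,) = struct.unpack("!H", ipv4_header[2:4])
    (flags_and_offset,) = struct.unpack("!H", ipv4_header[6:8])

    dont_fragment = bool(flags_and_offset & 0x4000)
    more_fragments = bool(flags_and_offset & 0x2000)
    offset_bytes = (flags_and_offset & 0x1FFF) * 8  # offset is in 8-byte units

    return {
        "total_length": total_length,
        "dont_fragment": dont_fragment,
        "more_fragments": more_fragments,
        "offset_bytes": offset_bytes,
        # A packet is a fragment if more follow it or it is not the first piece.
        "is_fragment": more_fragments or offset_bytes > 0,
    }
```

A packet with the MF bit set or a non-zero offset is a fragment; a packet with only DF set is a normal, unfragmented packet that asks not to be split.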
Here are some snips of config. Notice the JUMBO_MTU option on both the ix and Lagg interface:
From the switch:
home-acc-1#show system mtu
System MTU size is 1500 bytes
System Jumbo MTU size is 1500 bytes
System Alternate MTU size is 1500 bytes
Routing MTU size is 1500 bytes
home-acc-1#show interfaces | i MTU
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 10000 Kbit/sec, DLY 1000 usec,
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,

From the shell on the FW:
[2.3.3-RELEASE][admin@home-rtr-1]/root: ifconfig
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=c400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
    ether 90:e2:ba:54:37:bc
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (10Gbase-SR <full-duplex,rxpause,txpause>)
    status: active
ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=c400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
    ether 90:e2:ba:54:37:bc
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: no carrier
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=42098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWTSO>
    ether 00:25:64:9a:ce:2e
    inet6 fe80::225:64ff:fe9a:ce2e%em0 prefixlen 64 scopeid 0x3
    inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: no carrier
pflog0: flags=100<PROMISC> metric 0 mtu 33160
pfsync0: flags=0<> metric 0 mtu 1500
    syncpeer: 224.0.0.240 maxupd: 128 defer: on
    syncok: 1
enc0: flags=0<> metric 0 mtu 1536
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
    options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
    inet 127.0.0.1 netmask 0xff000000
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=c400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
    ether 90:e2:ba:54:37:bc
    inet6 fe80::92e2:baff:fe54:37bc%lagg0 prefixlen 64 scopeid 0x8
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    laggproto lacp lagghash l2,l3,l4
    laggport: ix0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
    laggport: ix1 flags=0<>
lagg0_vlan200: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 90:e2:ba:54:37:bc
    inet6 fe80::92e2:baff:fe54:37bc%lagg0_vlan200 prefixlen 64 scopeid 0x9
    inet 10.1.2.1 netmask 0xffffff00 broadcast 10.1.2.255
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    vlan: 200 vlanpcp: 0 parent interface: lagg0
lagg0_vlan5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 90:e2:ba:54:37:bc
    inet6 fe80::92e2:baff:fe54:37bc%lagg0_vlan5 prefixlen 64 scopeid 0xa
    inet 10.10.113.30 netmask 0xfffffc00 broadcast 10.10.115.255
    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    vlan: 5 vlanpcp: 0 parent interface: lagg0
lagg0_vlan100: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 90:e2:ba:54:37:bc
    inet6 fe80::92e2:baff:fe54:37bc%lagg0_vlan100 prefixlen 64 scopeid 0xb
    inet 10.1.1.1 netmask 0xffffff00 broadcast 10.1.1.255
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    vlan: 100 vlanpcp: 0 parent interface: lagg0
lagg0_vlan600: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 90:e2:ba:54:37:bc
    inet6 fe80::92e2:baff:fe54:37bc%lagg0_vlan600 prefixlen 64 scopeid 0xc
    inet 10.1.6.1 netmask 0xffffff00 broadcast 10.1.6.255
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    vlan: 600 vlanpcp: 0 parent interface: lagg0
lagg0_vlan300: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    ether 90:e2:ba:54:37:bc
    inet6 fe80::92e2:baff:fe54:37bc%lagg0_vlan300 prefixlen 64 scopeid 0xd
    inet 10.1.3.1 netmask 0xffffff00 broadcast 10.1.3.255
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect
    status: active
    vlan: 300 vlanpcp: 0 parent interface: lagg0
-
JUMBO_MTU listed there is just a capability of the NIC. Not a setting. What matters there is mtu 1500.
Why do you think you have an MTU problem? Simply introducing pfSense will not create one of those unless you deliberately change the default 1500-byte MTUs.
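To make the capability-versus-setting distinction concrete, here is a rough Python sketch that pulls only the configured mtu value per interface out of ifconfig text and ignores the options= capability list entirely. The sample text is a trimmed copy of the output posted above; the function names and the default ignore list are assumptions for illustration, not anything pfSense ships.

```python
import re

# Trimmed sample in the shape of FreeBSD ifconfig output.
SAMPLE = """\
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=c400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=c400b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,TXCSUM_IPV6>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
"""

# Only the interface header line carries the configured MTU ("mtu <digits>").
# JUMBO_MTU on the options line never matches: it is upper case, has no
# number after it, and sits on an indented line.
IFACE_RE = re.compile(r"^(\S+): flags=.*?\bmtu (\d+)", re.MULTILINE)


def configured_mtus(ifconfig_text: str) -> dict:
    """Map interface name -> configured MTU."""
    return {name: int(mtu) for name, mtu in IFACE_RE.findall(ifconfig_text)}


def unexpected_mtus(mtus: dict, expected: int = 1500,
                    ignore: tuple = ("lo0", "pflog0", "enc0")) -> dict:
    """Return interfaces whose MTU differs from `expected`.

    Pseudo-interfaces in `ignore` normally have other MTUs and are skipped.
    """
    return {name: mtu for name, mtu in mtus.items()
            if mtu != expected and name not in ignore}


if __name__ == "__main__":
    mtus = configured_mtus(SAMPLE)
    print(mtus)
    print(unexpected_mtus(mtus))
```

If the second dictionary comes back empty, every real interface is at 1500 regardless of what capabilities the NIC advertises.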
-
I went down the rabbit hole of the FW being the issue for a couple reasons:
-Old gateway (off the shelf home wifi) did not have this issue, was using the same switch
-The Cellspot will connect home successfully when wired directly to the internet (no NAT or FW)
-Same switch being used in both scenarios
The only way I can consistently recreate the issue is when the FW is directly in the path of the data. For the VLAN this device is on (VLAN200), the rules are SRC: VLAN200 DST: any ACTION: permit
Also on the NAT side, nothing fancy, just static NAT mapping off the WAN for anything in rfc1918 private space.
I keep coming back to the MTU issue because for other users on the site who had similar issues, setting the MTU to 1500 seemed to solve the problem, but it has not for me.
At one point, yes, I had an MTU of 9000, but I have since reverted. The issue was occurring before and after the 9000 setup.
Also, the capture traffic looks good (DNS queries, etc.), except for the fragmentation, and based on some research and some experience, IPsec tunnels (specifically the one this device looks to be trying to form back to TMobile) do not work correctly when the data is fragmented.
Just looking for some input further than what I've done. I've run out of things to try.
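As a sanity check on the numbers involved, here is a small Python sketch of standard IPv4 fragmentation arithmetic for a 1500-byte MTU. It assumes a 20-byte IPv4 header with no options and an 8-byte ICMP header; the function names are illustrative only.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
ICMP_HEADER = 8    # bytes, ICMP echo header


def max_unfragmented_icmp_payload(mtu: int = 1500) -> int:
    """Largest ping payload that fits in one packet at this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def fragment_payload_sizes(ip_payload_len: int, mtu: int = 1500) -> list:
    """Split an IP payload into per-fragment payload sizes.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the fragment offset field counts in 8-byte units.
    """
    if ip_payload_len < 0:
        raise ValueError("payload length must be non-negative")
    room = mtu - IPV4_HEADER
    per_fragment = room // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    if ip_payload_len <= room:
        return [ip_payload_len]  # fits in one packet, no fragmentation

    sizes = []
    remaining = ip_payload_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


if __name__ == "__main__":
    print(max_unfragmented_icmp_payload(1500))
    print(fragment_payload_sizes(2000, mtu=1500))
```

So at MTU 1500 the largest ping payload that should pass with the don't-fragment bit set is 1472 bytes. On FreeBSD-based systems such as pfSense I believe that test is `ping -D -s 1472 <host>`, but check ping(8) on your version before relying on the flag. If 1472 fails somewhere along the path while smaller sizes pass, something on that path has an effective MTU below 1500.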
-
Are you sure you reverted EVERYTHING back from jumbo frames?
"Also on the NAT side, nothing fancy, just static NAT mapping off the WAN for anything in rfc1918 private space."
What do you mean by static NAT mapping?
You might want to back up your config, go back to factory defaults, and see if your microcell comes up. You probably want to factory reset it too and let it sit overnight. They are temperamental wenches.
Look in the DHCP leases, get its IP address, and look at the states. They will probably look perfectly normal: probably either UDP 500 and UDP 4500, or UDP 500 and protocol ESP.
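For reference when reading the state table, here is a tiny Python sketch that labels a state by IP protocol number and destination port. The numbers are the standard ones (IKE on UDP 500, IPsec NAT traversal on UDP 4500, ESP as IP protocol 50); the function itself is hypothetical and only illustrates what the two combinations mentioned above mean.

```python
from typing import Optional

UDP_PROTO = 17
ESP_PROTO = 50
IKE_PORT = 500
NAT_T_PORT = 4500


def classify_state(ip_proto: int, dst_port: Optional[int] = None) -> str:
    """Label a firewall state as IKE, NAT-T, ESP, or other."""
    if ip_proto == ESP_PROTO:
        return "ESP (native IPsec payload, no UDP wrapper)"
    if ip_proto == UDP_PROTO and dst_port == IKE_PORT:
        return "IKE (tunnel negotiation)"
    if ip_proto == UDP_PROTO and dst_port == NAT_T_PORT:
        return "NAT-T (IKE/ESP encapsulated in UDP)"
    return "other"


if __name__ == "__main__":
    for proto, port in [(17, 500), (17, 4500), (50, None), (17, 53)]:
        print(proto, port, "->", classify_state(proto, port))
```

Seeing UDP 500 plus UDP 4500 suggests the tunnel detected NAT and switched to UDP encapsulation; UDP 500 plus ESP suggests it did not.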