Web sites not loading when accessing pfSense through VLAN trunk.
-
Some Realtek NICs have broken long frame support, so when you're trying to pass packets that have the full 1500 MTU and then add the VLAN tag, they refuse to send or receive them. Your symptoms match that scenario 100%. Not sure anyone has done VLANs on those boxes, but it'd be far from the first NIC issues people have seen on them.
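To make the arithmetic behind that explanation concrete, here is a minimal sketch in Python. It assumes standard Ethernet framing (14-byte header, 4-byte 802.1Q tag, 4-byte FCS). The function names are made up for illustration, and the 1518-byte limit is just the classic untagged maximum, not a measured property of any particular Realtek chip.

```python
# Standard Ethernet framing sizes in bytes (assumed, not measured on the box).
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
VLAN_TAG = 4     # 802.1Q tag inserted after the source MAC
FCS = 4          # frame check sequence trailer


def frame_size(mtu: int, tagged: bool) -> int:
    """On-wire frame size (header through FCS) for a packet that fills the MTU."""
    return ETH_HEADER + (VLAN_TAG if tagged else 0) + mtu + FCS


def max_tagged_mtu(max_frame: int) -> int:
    """Largest MTU that still fits in max_frame once a VLAN tag is added."""
    return max_frame - ETH_HEADER - VLAN_TAG - FCS


print(frame_size(1500, tagged=False))  # 1518
print(frame_size(1500, tagged=True))   # 1522
print(max_tagged_mtu(1518))            # 1496
```

If a NIC really cannot handle anything beyond 1518 bytes, a tagged full-size packet is 4 bytes too long, which is why lowering the VLAN MTU to 1496 is the usual thing to try.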
-
I am having difficulty understanding the various flags reported by ifconfig but it seems to me that the vlanmtu flag is used by the VLAN driver to determine whether or not the interface supports VLAN frames larger than 1500. If it is set and the interface in fact does not support this then it could be the cause of many problems. There is a suggested solution to this:
The vlan driver automatically recognizes devices that natively support
long frames for vlan use and calculates the appropriate frame MTU based
on the capabilities of the parent interface. Some other interfaces not
listed above may handle long frames, but they do not advertise this ability
of theirs. The MTU setting on vlan can be corrected manually if used
in conjunction with such a parent interface.
My own X700 box has died completely so I can't check. :(
What MTU size has the VLAN driver determined is correct on your box? What flags are reported by ifconfig on the re interfaces?
Steve
-
Here's a quote from YongHyeon PYUN, the re(4) maintainer/author:
@http://freebsd.1045724.n5.nabble.com/Abysmal-re-4-performance-under-8-1-STABLE-mid-August-td3946608.html:
I'm sure this has nothing to do that this issue.
If you want to disable checksum offloading of VLAN
interface, use vlan interface instead of parent interface
of the VLAN interface(i.e. ifconfig vlan0 -txcsum -rxcsum).
And you can't disable VLAN_MTU on re(4). There is no
reason to disable supporting VLAN oversized frames.
So perhaps a manual MTU reduction is necessary.
Steve
-
Thanks for all the insight!
So far I tried disabling all the hardware features on the LAGG interfaces and reducing the MTU. I'll try doing one thing at a time this weekend, just wanted to see the result of the extreme set of changes. I had to delete the LAGG and all the VLANs to be able to change the underlying interfaces.
There are two differences I noticed so far:
1 - Now I can ping VLAN PCs with all the packet sizes with no packet loss. I.e. running ping -v -c 1 -g 1470 -G 1492 -S 192.168.10.1 192.168.10.64 on the pfSense box doesn't exhibit packet loss anymore.
2 - While pinging the pfSense box via the VLAN interfaces (with hw. features disabled), when using packet sizes that didn't work before (~1474, basically MTU - 28), there are no echo replies detected in tcpdump. Previously, echo replies were logged but nothing got out.
Also, no watchdog timeouts logged yet, but I've done no stress testing yet either.
Certain web sites are still inaccessible when connecting via VLAN interfaces. Actually, I think it even got worse since I can't even load imgur now, while previously it was just a matter of refreshing the page until the main pic loaded.
Next up I'll try disabling the LAGG. Even though I don't suspect it of inducing errors, it doesn't make changing interface settings any easier. It did inherit all the relevant changes I made to the interfaces, like MTU and hw. features.
-
2 - while pinging the pfSense box via the VLAN interfaces (with hw. features disabled), when using packet sizes that didn't work before (~1474, basically MTU - 28), there are no echo replies detected in tcpdump. Previously, echo replies were logged but nothing got out.
Hmm, anything in the firewall log? Did you reinstate the firewall rules? Easily overlooked. ;)
Looks like you're making some progress.
Steve
-
What I meant in point 2:
PE1900, pinging normal interface with HW acceleration enabled:
root@bobeus:~# ping -c3 -s 1474 -I 192.168.2.16 192.168.2.1
PING 192.168.2.1 (192.168.2.1) from 192.168.2.16 : 1474(1502) bytes of data.
^C
--- 192.168.2.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2015ms
tcpdump on pfSense box:
23:58:29.303739 IP 192.168.2.16 > 192.168.2.1: ICMP echo request, id 22611, seq 1, length 1480
23:58:29.303762 IP 192.168.2.16 > 192.168.2.1: icmp
23:58:29.303871 IP 192.168.2.1 > 192.168.2.16: ICMP echo reply, id 22611, seq 1, length 1480
23:58:29.303875 IP 192.168.2.1 > 192.168.2.16: icmp
23:58:30.316443 IP 192.168.2.16 > 192.168.2.1: ICMP echo request, id 22611, seq 2, length 1480
23:58:30.316464 IP 192.168.2.16 > 192.168.2.1: icmp
23:58:30.316517 IP 192.168.2.1 > 192.168.2.16: ICMP echo reply, id 22611, seq 2, length 1480
23:58:30.316521 IP 192.168.2.1 > 192.168.2.16: icmp
23:58:31.329564 IP 192.168.2.16 > 192.168.2.1: ICMP echo request, id 22611, seq 3, length 1480
23:58:31.329586 IP 192.168.2.16 > 192.168.2.1: icmp
23:58:31.329646 IP 192.168.2.1 > 192.168.2.16: ICMP echo reply, id 22611, seq 3, length 1480
23:58:31.329650 IP 192.168.2.1 > 192.168.2.16: icmp
PE1900, pinging a VLAN interface with hw acceleration disabled:
root@bobeus:~# ping -c3 -s 1468 -I 192.168.10.64 192.168.10.1
PING 192.168.10.1 (192.168.10.1) from 192.168.10.64 : 1468(1496) bytes of data.
^C
--- 192.168.10.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2015ms
tcpdump on pfSense box:
[2.0.1-RELEASE][root@pfsense.bobnet]/root(19): tcpdump -i re3_vlan128 host 192.168.10.64
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on re3_vlan128, link-type EN10MB (Ethernet), capture size 96 bytes
00:03:14.496355 IP 192.168.10.64 > 192.168.10.1: ICMP echo request, id 22616, seq 1, length 1476
00:03:15.500857 IP 192.168.10.64 > 192.168.10.1: ICMP echo request, id 22616, seq 2, length 1476
00:03:16.505921 IP 192.168.10.64 > 192.168.10.1: ICMP echo request, id 22616, seq 3, length 1476
00:03:19.529010 ARP, Request who-has 192.168.10.1 tell 192.168.10.64, length 42
00:03:19.529036 ARP, Reply 192.168.10.1 is-at 00:90:7f:2e:84:db (oui Unknown), length 28
^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
PE1900, pinging a VLAN interface with a smaller payload:
root@bobeus:~# ping -c3 -s 1452 -I 192.168.10.64 192.168.10.1
PING 192.168.10.1 (192.168.10.1) from 192.168.10.64 : 1452(1480) bytes of data.
1460 bytes from 192.168.10.1: icmp_req=1 ttl=64 time=0.503 ms
1460 bytes from 192.168.10.1: icmp_req=2 ttl=64 time=0.443 ms
1460 bytes from 192.168.10.1: icmp_req=3 ttl=64 time=0.434 ms
--- 192.168.10.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.434/0.460/0.503/0.030 ms
tcpdump on pfSense box:
[2.0.1-RELEASE][root@pfsense.bobnet]/root(25): tcpdump -i re3_vlan128 host 192.168.10.64
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on re3_vlan128, link-type EN10MB (Ethernet), capture size 96 bytes
00:08:47.328097 IP 192.168.10.64 > 192.168.10.1: ICMP echo request, id 22659, seq 1, length 1460
00:08:47.328201 IP 192.168.10.1 > 192.168.10.64: ICMP echo reply, id 22659, seq 1, length 1460
00:08:48.332135 IP 192.168.10.64 > 192.168.10.1: ICMP echo request, id 22659, seq 2, length 1460
00:08:48.332183 IP 192.168.10.1 > 192.168.10.64: ICMP echo reply, id 22659, seq 2, length 1460
00:08:49.336806 IP 192.168.10.64 > 192.168.10.1: ICMP echo request, id 22659, seq 3, length 1460
00:08:49.336850 IP 192.168.10.1 > 192.168.10.64: ICMP echo reply, id 22659, seq 3, length 1460
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
I've also looked at the interface itself, no replies visible either.
Pinging with large payloads (2000+) works well, whenever the packets are fragmented.
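For anyone trying to match the numbers in the transcripts above, here is a small Python sketch of how the ping payload size maps to the lengths shown by ping and tcpdump. It assumes IPv4 with a 20-byte header (no options), an 8-byte ICMP header, and a sender MTU of 1500. The function names are illustrative only. It reproduces the sizes but does not explain why the replies go missing.

```python
IP_HEADER = 20   # IPv4 header without options (assumed)
ICMP_HEADER = 8  # ICMP echo header


def icmp_length(payload: int) -> int:
    """The 'length' tcpdump prints for an unfragmented echo request."""
    return payload + ICMP_HEADER


def ip_length(payload: int) -> int:
    """The total IP packet size, the number ping prints in parentheses."""
    return icmp_length(payload) + IP_HEADER


def sender_fragments(payload: int, mtu: int = 1500) -> bool:
    """True if the sender has to fragment because the packet exceeds its MTU."""
    return ip_length(payload) > mtu


for payload in (1474, 1468, 1452):
    print(payload, icmp_length(payload), ip_length(payload), sender_fragments(payload))
# 1474 1482 1502 True
# 1468 1476 1496 False
# 1452 1460 1480 False
```

The 1474-byte ping is the only one the sender fragments, which matches the extra "icmp" lines in the first capture.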
Question - if I set the interface/VLAN interface MTU very low, say 300, should a ping of 1200 bytes directed to it be automatically split up into smaller chunks? Or should it always fail (like it does here)? I think I need to read up on the basics…
-
If you don't have 'do not fragment' set then it should simply fragment the packets into suitably sized frames. The problem is how it decides whether it needs to do that and how it decides what a suitable size is.
To be honest this is now well outside my own experience! ;)
Steve
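As a rough illustration of what "suitably sized" means when a sender does fragment, here is a Python sketch of the IPv4 rule that every fragment except the last carries a multiple of 8 data bytes. It assumes a 20-byte IP header with no options, and fragment_sizes is a made-up helper, not anything from pfSense or FreeBSD. It only models a sender splitting a packet for its own outgoing MTU and says nothing about what a receiver with a smaller MTU will accept.

```python
def fragment_sizes(ip_payload_len: int, mtu: int, ip_header: int = 20) -> list[int]:
    """Sizes (header included) of the IPv4 fragments a sender would emit.

    ip_payload_len is everything after the IP header, for example the ICMP
    header plus the ping data.
    """
    # Data per fragment is rounded down to a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any data")
    sizes = []
    remaining = ip_payload_len
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + ip_header)
        remaining -= chunk
    return sizes


# A 1200-byte ping (plus the 8-byte ICMP header) leaving an interface with MTU 300.
print(fragment_sizes(1200 + 8, 300))   # [300, 300, 300, 300, 108]
# The 1474-byte ping from earlier in the thread at MTU 1500.
print(fragment_sizes(1474 + 8, 1500))  # [1500, 22]
```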
-
Question - if I set the interface/VLAN interface MTU very low, say 300, should a ping of 1200 bytes directed to it be automatically split up into smaller chunks? Or should it always fail (like it does here)? I think I need to read up on the basics…
It'll get dropped, can't accept frames larger than your MTU. Nothing in the path to fragment it.
-
I tried, tested, tuned and couldn't get it to work in a reasonable amount of time. So I dropped the idea of using VLANs on the pfSense/firebox combo.
Instead of aggregating four ports into a LAGG and passing VLANs through that, I just mapped four ports on the switch to different VLANs and set up the interfaces normally. It works fine this way, but I really liked the idea of having a theoretical throughput of 400Mb/s to play with, along with a flexible number of VLAN interfaces to control.
-
A disappointing result, but hopefully it will save someone else some time. ::)
I'm sure it could be made to work but whether it would be worth the effort or not is debatable. It would probably be easier to just put an Intel gigabit card in the PCI slot, with the case mods that requires.
Steve