MTU not being set correctly via VLAN, LAGG
-
I'm trying to set up jumbo frames under 2.2 Alpha (Wed Jul 09 18:07:44 CDT 2014 Build). While this seems to work fine when using raw interfaces (igb driver), I'm encountering issues using VLANs and/or LAGG groups. There seem to be two (potentially related) issues occurring:
- VLANs and LAGG groups ignore MTU settings greater than 1500 when set via the VLAN or LAGG group, but work when set via the parent interface or set to values less than 1500.
- Setting the VLAN or LAGG MTU does not update the MTU of the parent interfaces (seems related to https://redmine.pfsense.org/issues/2786).
For example, if I start out with a new interface called "TEST" and set it to use igb1 with no MTU or MSS set, I get the following default MTU of 1500, as expected:
TEST -> igb1
TEST MTU = DEFAULT (1500)
[2.2-ALPHA][admin@pf]/root(16): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
Now when I set the TEST interface to an MTU of 3000, I get an MTU of 3000, again as expected:
TEST -> igb1
TEST MTU = 3000
[2.2-ALPHA][admin@pf]/root(17): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 3000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
Now I set up a new VLAN with ID=2 and PARENT=igb1. I remove all MTU settings from the TEST interface to force the default (should be 1500) and set the TEST interface to use igb1_vlan2. Instead of resetting both igb1 and igb1_vlan2 to 1500 as you would expect, the old igb1 MTU of 3000 stays active, and the igb1_vlan2 interface uses the same value:
TEST -> igb1_vlan2
TEST MTU = DEFAULT (1500)
[2.2-ALPHA][admin@pf]/root(30): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 3000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
igb1_vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 3000
Furthermore, when I set TEST to use an MTU of 6000, again we see no change in either igb1_vlan2 or igb1 (we would expect an MTU of 6000 for both):
TEST -> igb1_vlan2
TEST MTU = 6000
[2.2-ALPHA][admin@pf]/root(30): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 3000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
igb1_vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 3000
But when I set TEST to use igb1 again, the 6000 MTU now gets applied to both igb1_vlan2 and igb1:
TEST -> igb1
TEST MTU = 6000
[2.2-ALPHA][admin@pf]/root(30): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 6000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
igb1_vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 6000
I see the same issue when using LAGG (e.g. LACP) interfaces. If I set up a new LAGG interface with PARENT=igb4+igb5 and set TEST to use it with the MTU still set to 6000, I get the default MTUs of 1500:
TEST -> lagg1
TEST MTU = 6000
lagg1 -> igb4, igb5
[2.2-ALPHA][admin@pf]/root(33): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 6000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
igb1_vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 6000
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
Likewise, if I combine VLANs with LAGG by setting vlan2's parent to lagg1, the MTU of 6000 continues to have no effect:
TEST -> lagg1_vlan2
TEST MTU = 6000
lagg1 -> igb4, igb5
[2.2-ALPHA][admin@pf]/root(35): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 6000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
lagg1_vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
Finally, when I point TEST at lagg1_vlan2 and set its MTU to a value less than 1500, lagg1_vlan2 does seem to get updated, but again the change does not propagate to the interfaces underneath it (e.g. igb4, igb5, and lagg1):
TEST -> lagg1_vlan2
TEST MTU = 1000
lagg1 -> igb4, igb5
[2.2-ALPHA][admin@pf]/root(36): ifconfig | grep mtu
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 6000
igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
igb5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
...
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
lagg1_vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1000
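For anyone who wants to reproduce the comparison without reading every ifconfig line, here is a rough sketch that reduces the output to name/MTU pairs and flags a VLAN whose MTU differs from its parent. The function name mtu_report is made up for this example, and inferring the parent by stripping the _vlanN suffix (lagg1_vlan2 -> lagg1) is an assumption based only on the interface names seen above.

```sh
#!/bin/sh
# Reduce `ifconfig` header lines to "name mtu" pairs and flag VLANs whose MTU
# differs from their parent. The parent is inferred from the pfSense-style
# name (igb1_vlan2 -> igb1); that naming rule is an assumption of this sketch.
mtu_report() {
    awk '
        / mtu [0-9]+/ {
            name = $1; sub(/:$/, "", name)
            # The MTU is the field that follows the literal word "mtu".
            for (i = 1; i < NF; i++) if ($i == "mtu") mtu[name] = $(i + 1)
            order[++n] = name
        }
        END {
            for (k = 1; k <= n; k++) {
                name = order[k]; parent = name; note = ""
                if (sub(/_vlan[0-9]+$/, "", parent) && (parent in mtu) && mtu[parent] != mtu[name])
                    note = " (parent " parent " is " mtu[parent] ")"
                print name, mtu[name] note
            }
        }'
}
```

On a live system this would be run as `ifconfig | mtu_report`. It only reports what ifconfig shows; it does not change any interface.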
Like I said, there seem to be two bugs here:
1. MTU settings on VLAN and LAGG interfaces do not correctly propagate to the parent interfaces as expected (e.g. https://redmine.pfsense.org/issues/2786).
2. VLAN and LAGG interfaces seem to ignore MTU values > 1500, even when the parent interfaces support these MTU values. VLAN and LAGG interfaces will, however, take MTU values greater than 1500 when the parent interface is directly set to use such values.
Thoughts?
-
Can anyone confirm that they see these same issues in 2.2?
How about 2.1? I don't have a 2.1 install handy at the moment, so it's hard for me to say whether these are long-standing issues with the way pfSense handles MTU settings on LAGG and VLAN interfaces or whether this is a regression in 2.2.
I can open up a bug report if others feel that these are actual issues/regressions.
-
Not sure about your specific issue with the igb driver, but because I have a FreeBSD server that uses LAGG (starting with 9.0 and now on 10), I know there's a limitation in the LAGG and VLAN drivers: the settings of the first device are treated as the settings of all devices, since they must ALL have the same settings: http://www.freebsd.org/cgi/man.cgi?query=lagg&apropos=0&sektion=4&manpath=FreeBSD+9.1-RELEASE&arch=default&format=html
I think this is the standard implementation too, since when I tried it on Solaris and Linux, I noticed the same instructions and warnings about MTU and link aggregation. I'm guessing this is due to frame ordering and fragmentation problems otherwise (standard UDP wouldn't work, since you can't do discovery). As for why this applies to VLAN as well as LAGG, perhaps they share a bunch of code (both involving frame-tagging protocols).
But this could be a FreeBSD limitation or oversight, since if fragmentation and/or MTU discovery problems occur, your throughput can go way down (10x less throughput or more!). In fact, a little bit of Googling confirms that setting MTUs can be a bit of a minefield, which is probably why IEEE was adamant about not standardizing Jumbo Frames (it's mainly supported by a manufacturer consensus): http://blog.ipspace.net/2011/07/all-mtus-are-not-same.html
-
I don't think this is an OS or driver issue: when I manually set the interface MTUs, I can get things to work as required. The issue seems to be that the pfSense interface is not correctly translating MTUs for VLAN and LAGG interfaces into the necessary underlying config changes.
I get that there are limits on needing to use the same MTU across various VLAN and LAGG devices, but the current pfSense 2.2 interface doesn't even seem to account for that fact or properly sync the MTUs between LAGG- and VLAN-related interfaces.
-
It seems like the issue is that pfSense isn't setting the MTU for the interfaces used to create the LAGG device prior to creating the LAGG device. According to the BSD docs, a LAGG interface will use the MTU of the first member interface added to it. Thus, pfSense needs to set the MTU for the underlying physical interfaces to the proper value prior to creating the LAGG interface (and thus also prior to creating any of the VLAN interfaces).
This seems to correspond to what I see playing around on the command line: setting the MTU of a non-LAGG-affiliated interface works just fine, but trying to set the MTU of a LAGG-affiliated interface or of the LAGG interface itself leads to "ifconfig: ioctl (set mtu): Invalid argument" errors.
It also seems that pfSense's existing model of setting the MTU at the top-level abstract interface (e.g. WAN, LAN) breaks down when using derivative interfaces like VLANs or LAGGs, or when assigning multiple top-level interfaces to a single physical device. MTU is really a per-physical-device setting (or a per-VLAN setting, assuming the VLAN MTU is less than or equal to the MTU of the underlying device), not a per-top-level-abstract-interface setting. Thus, it might make more sense to have a separate config page where you set the MTUs for each physical device, LAGG, or VLAN, enforcing all the necessary rules at that point (e.g. not allowing the user to set VLAN MTUs higher than their underlying interface MTUs, setting the MTU for any LAGG member to the MTU of the LAGG interface, etc). That would seemingly make the MTU configuration more straightforward and more resistant to impossible configuration requests in any case going beyond a simple 1 abstract interface -> 1 physical interface mapping.
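To make the ordering concrete, here is a rough dry-run sketch: it only prints the ifconfig commands in the order described above (member MTUs first, then the LAGG), it does not run them. The function name lagg_mtu_plan is made up for illustration, LACP is assumed as the protocol only because that is what I tested with, and I have not confirmed that this is the sequence pfSense itself would need to use.

```sh
#!/bin/sh
# Print (not run) the ifconfig commands in the order suggested above:
# set the MTU on every member first, then create the LAGG and add the ports.
# Usage: lagg_mtu_plan MTU LAGG MEMBER...
# lagg_mtu_plan is a hypothetical helper, not part of pfSense or FreeBSD.
lagg_mtu_plan() {
    plan_mtu=$1; plan_lagg=$2; shift 2
    plan_ports=""
    for plan_member in "$@"; do
        echo "ifconfig $plan_member mtu $plan_mtu up"
        plan_ports="$plan_ports laggport $plan_member"
    done
    echo "ifconfig $plan_lagg create"
    # LACP is an assumption here; other laggproto values exist.
    echo "ifconfig $plan_lagg laggproto lacp$plan_ports up"
}
```

For example, `lagg_mtu_plan 6000 lagg1 igb4 igb5` prints four lines: the two member MTU commands, then `ifconfig lagg1 create`, then `ifconfig lagg1 laggproto lacp laggport igb4 laggport igb5 up`. Any VLAN on top of lagg1 would be created only after that.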
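The two rules mentioned above could be checked with something as small as the following sketch. Both function names are hypothetical and nothing here is taken from the pfSense code; it only illustrates the kind of validation such a config page would do before touching any interface.

```sh
#!/bin/sh
# Hypothetical validation helpers for the two rules named above.
# Neither function exists in pfSense; the names are illustrative only.

# Rule 1: a VLAN MTU must not exceed the MTU of its parent interface.
# Usage: vlan_mtu_ok VLAN_MTU PARENT_MTU
vlan_mtu_ok() {
    [ "$1" -le "$2" ]
}

# Rule 2: every LAGG member must carry the same MTU as the LAGG itself.
# Usage: lagg_mtu_ok LAGG_MTU MEMBER_MTU...
lagg_mtu_ok() {
    want_mtu=$1; shift
    for got_mtu in "$@"; do
        [ "$got_mtu" -eq "$want_mtu" ] || return 1
    done
    return 0
}
```

With the values from my tests, `vlan_mtu_ok 1000 1500` succeeds (lagg1_vlan2 at 1000 on lagg1 at 1500), while `vlan_mtu_ok 6000 1500` fails, which is the request the GUI currently accepts without complaint.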
For now, using MTUs other than 1500 in complex (LAGG, etc) interface setups doesn't seem to work in pfSense.
Thoughts?
-
It looks like at least one other user encountered the lack of MTU support atop LAGG interfaces using 2.1, so this must not be a new "issue": https://forum.pfsense.org/index.php?topic=50444.15.
I can give the workaround mentioned in that thread a try, but it's quite the hack.
Any chance we could add this as a feature request for 2.2? Seems as good a time to fix it as any. I can create a feature request in Redmine if need be.
-
I opened a bug for this at https://redmine.pfsense.org/issues/3774.