Poor VLAN and NAT Performance



  • Hi, I have a pfSense box (2.2.2-RELEASE, amd64) with:

    bge WAN
    bge LAN
    lagg (4x em)

    Over the lagg I have:
    untagged VLAN + many VLANs

    hardware is:
    Intel(R) Xeon(R) CPU 3070 @ 2.66GHz
    Current: 1666 MHz, Max: 2666 MHz
    2 CPUs: 1 package(s) x 2 core(s)

    memory usage: 10% of 2GB ECC (waiting for 8GB, already ordered)

    MBUF usage: 22%

    Every virtual Windows Server sits on a VLAN with /30 subnets
    The problem is that with dedicated VLANs the connection is very poor, but if I use the untagged VLAN performance is quite good.

    I have a Dell PowerConnect 2824 with the spanning tree bridge priority value already tuned.

    I enabled fastforwarding and tuned the kernel for em and bge cards. I don't know how to investigate further on the problem.

    Thanks,
    Giacomo


  • LAYER 8 Global Moderator

    " I don't know how to investigate further on the problem."

    Same here, since there's not really anything to go on. Why would you be ordering more RAM if you're currently only using 10% of your 2GB? Why do you think going to 8 is needed?

    So what is talking to what that you believe the performance is bad? What performance do you see on your untagged vs your tagged? Are you talking about performance between VMs? You mention virtual Windows servers, but give no details of anything behind your switch.

    So you have a LAN, and then a lagg using different NICs?



  • Sorry for not providing more details.

    I have Citrix XenServer with active-active bonding (4x em), with an untagged VLAN and tagged VLANs for the VMs.

    The tagged VLANs are very slow: an HTTP download runs at about 10 KiB/s, versus 1.2 MiB/s over the untagged VLAN.

    The switch is configured with a LAG on each group of ports; on the other end BSD uses lagg and Linux uses active-active bonding.

    I can't get LACP to work either.



  • Are you running the firewall on bare metal or inside of Xen? Which of these is in Xen?

    It sounds like the firewall's on bare metal and the Windows servers are inside of Xen?

    The symptom of such very slow download sounds a lot like one of the various possible TCP checksum bugs with Xen or its host OS.



  • pfSense is on bare metal.

    XenServer hosts Windows Server 2012 R2 guests.

    Offload is disabled in the Xen driver, which improved performance, but we still have heavy packet loss and slow networking.

    If I add a network without a VLAN to the guests, networking is OK.

    If I use a VLAN-tagged network, the performance is BAD.

    The switch has four LAGs:

    LAG 2: pfSense (4x em 1Gb)
    LAG 3: FreeNAS (3x em 1Gb)
    LAG 4: XenServer hosts (4x em 1Gb)
    LAG 5: XenServer management (2x em 1Gb)


  • LAYER 8 Global Moderator

    And how is the performance without the lag?



  • Actually, I'm deploying servers without a VLAN tag (VLAN 1) over the LAG and performance is very good.

    I'll try disabling the LAG and report back.

    It looks like there is some problem with VLAN tagging. DHCP also fails sometimes with it, so I think there is a serious problem, but I can't tell whether it's the XenServer NIC driver. The NIC is an Intel 82571EB (copper).
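
    For anyone following along, the only difference on the wire between the two cases is the 4-byte 802.1Q tag inserted after the source MAC. The sketch below is illustrative only: it parses a raw Ethernet frame and reports the VLAN ID if a tag is present. The frames, MAC addresses and VLAN 100 are made-up values, not taken from this setup.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id(frame: bytes):
    """Return the VLAN ID of an Ethernet frame, or None if it is untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType or TPID
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 16:
        raise ValueError("truncated 802.1Q tag")
    # The 16-bit TCI follows the TPID; its low 12 bits carry the VLAN ID
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF


# Hypothetical frames: same MACs, one untagged IPv4, one tagged with VLAN 100
macs = bytes.fromhex("ffffffffffff" "020000000001")
payload = b"\x00" * 46
untagged = macs + struct.pack("!H", 0x0800) + payload
tagged = macs + struct.pack("!HH", TPID_8021Q, 100) + struct.pack("!H", 0x0800) + payload

print(vlan_id(untagged))
print(vlan_id(tagged))
```

    Everything after the tag is the same frame, which is why a driver or offload feature that mishandles those 4 bytes can break tagged traffic while untagged traffic stays fine.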


  • Banned

    Disable active/active and run active/standby, unless you balance with route based on IP hash or a source MAC hash.
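
    The reason a hash matters: with a static LAG, each flow should stay on one member link, so the switch never sees the same MAC hopping between ports. A minimal sketch of the idea, loosely modeled on XOR-style layer 2 hashing; the exact fields and formula differ between implementations, and the function name and MAC addresses here are hypothetical.

```python
def pick_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick a member link from the last byte of each MAC address.

    Simplified illustration: real bonding drivers may mix in more fields
    (packet type, IP addresses, ports) depending on the configured policy.
    """
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % n_links


# Hypothetical addresses for one VM talking to its gateway over a 4-link bond
vm = "02:00:00:00:00:0a"
gw = "02:00:00:00:00:01"

# Every packet of this flow hashes to the same link
links_used = {pick_link(vm, gw, 4) for _ in range(1000)}
print(links_used)
```

    The same source and destination always map to the same link, so packets of a flow are not spread across links and cannot arrive out of order because of the bond.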



  • Now I'll try installing every update and every new driver. I've already updated the switch. Too bad my switch does not support LACP; next time I'll buy a better one. Intel released a new driver for my card last month, but Citrix has not released the update. I'm not sure whether compiling it myself is supported.

    I'll write updates after a few t



  • Given that traffic that isn't tagged by Xen is fine, and traffic that is tagged by Xen is problematic, that's definitely a problem on the Xen server somewhere. The firewall has no ability to tell whether Xen is tagging traffic or not, much less any way to treat it differently depending on that.

    I strongly suspect some kind of checksum offloading issue, though I don't work with Xen enough to be able to suggest where or why. Google suggests:
    http://wiki.xenproject.org/wiki/Xen_Networking
    "With the DomUs bridged to VLAN interfaces, some optimizations need to be disabled or tcp and udp connections will fail." It's also possible they won't fail but have significant performance problems in that circumstance.

    If that doesn't help, that's a question where you're likely to get better answers on a Xen-related forum, where you'll find a larger audience with in-depth Xen expertise.
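
    To illustrate why a checksum offload problem shows up as loss and retransmits rather than an outright failure, here is a minimal sketch of the 16-bit ones' complement Internet checksum (RFC 1071 style). It is a simplification: a real TCP checksum also covers a pseudo-header, and the sample bytes are just the RFC's example data, not a capture from this setup.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
csum = internet_checksum(payload)
print(hex(csum))  # 0x220d

# Receiver-side check: data plus a correct checksum verifies to zero
good = payload + csum.to_bytes(2, "big")
print(internet_checksum(good) == 0)  # True

# Checksum field left blank for the NIC to fill in, but never filled
unfilled = payload + b"\x00\x00"
print(internet_checksum(unfilled) == 0)  # False
```

    With offload enabled, the sender leaves the field for the hardware to complete. If the tagged path skips that step, the receiver drops the segment and TCP has to retransmit, which would look like the very slow downloads described above.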



  • Yes, it's definitely a Xen problem.

    Now I have a big question: how can I have many isolated networks on the same Xen and pfSense interfaces? VLANs were the "traditional solution". Other than subnetting each server into a /30 with a dedicated gateway, obviously…
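
    For a sense of scale on the /30-per-server approach, a rough sketch using Python's ipaddress module. The 10.0.0.0/24 block is a hypothetical example, not the addressing used here. Each /30 has exactly two usable addresses, one for the gateway and one for the server.

```python
import ipaddress

# Hypothetical address block to carve up; substitute your own range
block = ipaddress.ip_network("10.0.0.0/24")

# Split the block into /30 networks, one per server
subnets = list(block.subnets(new_prefix=30))
print(len(subnets), "subnets of size /30")

for net in subnets[:3]:
    # A /30 has two usable hosts: first for the gateway, second for the server
    gateway, server = net.hosts()
    print(f"{net}: gateway {gateway}, server {server}")
```

    A /24 yields 64 such networks, so this spends four addresses per server and needs one gateway address on the firewall for each of them.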

