PfSense WAN Traffic Incredibly Slow on XenServer 6.5 SP1


  • Hi,

    Here is our setup:

    xenserver-primary: 2 x 6-core Xeon X5660 @ 2.8GHz, 60GB DDR3, 4 x WD Red 1TB RAID10, 3 x 1Gbit ethernet ports in use (WAN, LAN, CCTV [OPT1])
    xenserver-secondary: 2 x 6-core Xeon L5640 @ 2.27GHz, 48GB DDR3, 4 x WD Red 1TB RAID10, 3 x 1Gbit ethernet ports in use (WAN, LAN, CCTV)

    We have a Dell PowerConnect 5324 split into 3 x 8-port VLANs, effectively acting as three switches, so that each pair of matching interfaces sits on its own network: 2 x WAN on one, 2 x LAN on another, and 2 x CCTV on the remaining one. We also have a Hitron modem connected to the same VLAN as the WAN interfaces. Unfortunately, we're double-NATed because the modem-only mode doesn't work properly, so we've pointed the modem's DMZ at pfSense as a partial fix.

    We were happily running pfSense on xenserver-primary for a few months with good internet speeds of 200Mbps+.

    Yesterday, I moved the VM to xenserver-secondary, and although the VM fired up without issue, we found that WAN traffic was exceptionally slow: around 1Mbps according to speedtest.net. After numerous reboots of the VM and the hardware involved, I transferred the VM back to the original server and the speed was right back where it should be. I tried the same again today to rule out any other variables, and sure enough, the speed dropped to 1Mbps. I moved it back, and voilà: we're back to full speed.

    Can you think why this might be? The hardware on these two hosts is so similar that I wouldn't think it would make a difference, and I should point out that the CPU usage on the offending host was barely more than idle while pfSense was running, so I don't think the CPU is the bottleneck.

    I hope my setup is clear enough; I can create a diagram if not.

    Thanks



  • Another thing to try: wrongly configured power settings in the BIOS can make XenServer incredibly slow.
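
    On HP ProLiant hosts this is usually the Power Regulator / Power Profile setting; for a quick test, set it to static high performance. You can also check from Dom0 whether the cores are being clocked down (a rough sketch, assuming xenpm is available, as it is on stock XenServer):

        # Show the current frequency-scaling governor and P-state usage per core
        xenpm get-cpufreq-para

        # Force the performance governor for a test
        xenpm set-scaling-governor performance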


  • @johnkeates:

    See: https://forum.pfsense.org/index.php?topic=88467.0

    Thanks, John. I implemented that particular fix a while ago for the LAN interface, as physical machines could access the net fine but VMs on the same host couldn't. Unless I've misunderstood something, I don't believe this problem is related. I double-checked the settings regardless, and checksum offloading is still turned off for the LAN interface.
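
    For anyone else landing here, that fix boils down to disabling TX checksum offload on the pfSense VIFs from Dom0, roughly like this (the VM name-label is specific to my setup, and the VM needs a reboot for it to take effect):

        # Find the UUIDs of the pfSense VM's virtual interfaces
        xe vif-list vm-name-label=pfSense params=uuid,device

        # Disable TX checksum offload on each VIF (repeat per UUID)
        xe vif-param-set uuid=<vif-uuid> other-config:ethtool-tx=off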

    @Nagilum:

    Another thing to try: wrongly configured power settings in the BIOS can make XenServer incredibly slow.

    Would that be power settings for the host's BIOS, or for the VM's BIOS? Also, which settings in particular should I look at? Thanks.


  • For a test, I installed a fresh copy of pfSense on xenserver-secondary (the one having issues: a ProLiant G6). I connected the WAN interface to the modem and created a private network called "Testlab LAN". I disabled checksum offload on this private network (predictably, there was no internet until I did) and connected a Windows VM to it for testing. Again, I got woefully slow internet, so that rules out the original pfSense VM as the problem. There has to be a difference in how the G6 host is configured compared to the working host (a ProLiant G7). All other VMs on the G6 have no issues, but pfSense doesn't like it.
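
    Since "Testlab LAN" is a whole private network rather than a single VIF, I set the offload flag on the network object so that any VIF plugged into it inherits the setting (a sketch of the commands as on XenServer 6.5; attached VMs need their VIFs re-plugged or a reboot afterwards):

        # Find the UUID of the private network
        xe network-list name-label="Testlab LAN" params=uuid

        # Disable TX checksum offload for every VIF attached to this network
        xe network-param-set uuid=<network-uuid> other-config:ethtool-tx=off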


  • How is the LAN-LAN speed (e.g. as measured with iperf)?
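
    Something like this would do (iperf2 syntax; the server address is a placeholder):

        # On one LAN host
        iperf -s

        # On another LAN host: run a 30-second test towards the server
        iperf -c 192.168.1.10 -t 30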


  • @johnkeates:

    How is the LAN-LAN speed (e.g. as measured with iperf)?

    At the moment the VMs are transferring over to another host so that I can install XenServer 7.0 on the problematic one, so I'll have to try benchmarking it when it's up and running. Although I never tested performance with iperf, I was getting typical GbE speeds when transferring files. One thing I should perhaps note is that I needed to disable VT-d on the problematic host in order to install XenServer 6.5 in the first place; it would crash otherwise. I know that VT-d is often associated with graphics cards, but I wonder if pfSense requires more direct access to the NICs than can be achieved without VT-d.
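
    For what it's worth, one way to see what the hypervisor thinks about VT-d is to grep its boot log from Dom0 (a rough check, assuming the xl tool is present, as it is on XenServer 6.5):

        # Look for VT-d / IOMMU initialisation messages from Xen
        xl dmesg | grep -i -e iommu -e vt-d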


  • Update:

    I found the source of the slow WAN traffic: the HP NC364T quad-port NIC. It doesn't matter whether I'm running XenServer 6.5 or 7.0: pfSense does not like using any of the ports on that NIC for WAN traffic. If I switch to the onboard Intel NIC for WAN, I get full speed. The strange thing is that I can use the HP card for LAN traffic with no such issue. So now I have the LAN and OPT1 traffic on the HP NIC and the WAN traffic on the onboard NIC, and I'm getting full speed.
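
    In case it helps anyone, the offload state of the physical ports can be compared from Dom0 with ethtool (eth3 here is just a placeholder for one of the NC364T ports; the actual interface names will differ):

        # Show which offloads are currently enabled on the port
        ethtool -k eth3

        # Turn off the usual suspects for a test
        ethtool -K eth3 tx off rx off tso off gso off gro off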

    Why would only WAN traffic be a problem?


  • @tyh:

    Update:

    Why would only WAN traffic be a problem?

    Probably because of MTU, offload, or connection differences. Did you assign the card via VT-d or SR-IOV, or did you use bridges? I have the same card in a SuperMicro X10SLV running Xen 4.4 with VT-d, and all four ports are working fine.
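
    For comparison, this is roughly how the VT-d assignment looks on my plain-Xen box (xl toolstack; the PCI address is a placeholder, check yours with lspci):

        # In Dom0: list devices that are ready for passthrough
        xl pci-assignable-list

        # In the pfSense DomU config file: hand one NC364T port to the guest
        pci = [ '0000:04:00.0' ]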


  • Interesting. VT-d is actually disabled on xenserver-secondary (the ProLiant G6), because all XenServer versions refuse to install or boot with it enabled. In my initial post, I said it was only xenserver-secondary having issues, but it turns out that xenserver-primary also has problems when using that particular network card for WAN. I'm going to say that I'm also not using SR-IOV, mainly because this is the first time I've heard of it.

    If the problem is related to VT-d, it makes sense that both hosts are affected, as CPU masking will likely have disabled VT-d for xenserver-primary (the G7) too, to keep the pool homogeneous.
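
    If anyone wants to verify that, the effective CPU feature masks of the two hosts can be compared from Dom0 (a sketch, assuming the cpu_info host parameter as on XenServer 6.x):

        # Run on each host and diff the "features" fields
        xe host-param-get uuid=<host-uuid> param-name=cpu_info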


  • Strange. That'd mean that the XenServer product is using a Dom0 that doesn't play nice with the hardware. I checked my inventory: I have two HP servers running Xen (not XenServer, just Xen), a DL180 G6 and a DL360 G6, and they are doing just fine. They are nearly identical setups: Xen 4.6 and Xen 4.8, but both with Debian 8 as the Dom0 host, pfSense 2.3.2-p1 as the firewall, and a bunch of DomUs, all Linux (a mix of Debian and Fedora). I'm using the internal NICs on both of them and pulling 100Mbit up/down (branch office uplinks).