Extremely slow networking under Hyper-V on Intel NICs
-
I am running pfSense 2.4.4-p2 x64 on Hyper-V. The VM host is a Dell R720 with Intel I350-T rNDC quad-port gigabit NICs and 2 x Xeon E5-2690 (16 cores), running Windows Server 2016. I've assigned the pfSense VM 4 GB RAM and 4 CPUs.
The pfSense VM is set up with two vNICs. One is attached to a virtual switch that is attached to one of the physical NICs and is acting as a WAN port. This physical NIC is connected directly to a gigabit cable modem.
The second vNIC is attached to a second virtual switch to which all of the other VMs are also attached. A second physical NIC is also attached to this virtual switch, acting as a LAN port to feed physical machines on the network via a 16 port gigabit switch.
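For reference, that layout corresponds to roughly the following PowerShell on the host (the switch, NIC, and VM names here are only illustrative of my setup):

    # WAN switch: bound to its own physical NIC, host OS kept off it
    # (-EnableIov can only be set when the switch is created)
    New-VMSwitch -Name "WAN" -NetAdapterName "NIC1" -AllowManagementOS $false -EnableIov $true

    # LAN switch: shared by the other VMs plus a second physical NIC feeding the physical LAN
    New-VMSwitch -Name "LAN" -NetAdapterName "NIC2" -AllowManagementOS $true -EnableIov $true

    # Two vNICs in the pfSense VM, one per switch
    Add-VMNetworkAdapter -VMName "pfSense" -Name "WAN" -SwitchName "WAN"
    Add-VMNetworkAdapter -VMName "pfSense" -Name "LAN" -SwitchName "LAN"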
The physical NICs are configured to enable SR-IOV, and both of the vNICs in the VM are also configured to use SR-IOV, with VMQ disabled, although enabling VMQ (and/or disabling SR-IOV) makes no difference to the problem. The other VMs run fastest with SR-IOV enabled and VMQ disabled, so this is how I have the pfSense VM set up as well.
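The SR-IOV/VMQ part is set per physical NIC and per vNIC; roughly (again, adapter and VM names are illustrative):

    # Confirm the physical NICs expose SR-IOV, and enable it on them
    Get-NetAdapterSriov
    Enable-NetAdapterSriov -Name "NIC1", "NIC2"

    # On the pfSense vNICs: SR-IOV on (IovWeight > 0), VMQ off (VmqWeight 0)
    Set-VMNetworkAdapter -VMName "pfSense" -IovWeight 100 -VmqWeight 0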
I am seeing extremely slow speeds on Internet traffic through pfSense. If I disconnect the WAN network cable from the back of the server and plug it into my laptop (so my laptop is connected directly to the modem using the exact same cable the server uses), I see ~810 mbps down and ~46 mbps up, which is reasonably close to the "up to 1000 mbps/50 mbps" advertised. So the physical link is good.
However, running through pfSense is a different story. Physical machines connected to the LAN, as well as virtual machines connected to the LAN virtual switch, are seeing fairly consistent Internet speeds of only 57 mbps down and 7 mbps up. While the speed test is downloading at 57 mbps, pfSense reports its CPU utilization at 20%, and the Hyper-V host reports the VM using 1% of total available CPU resources.
I have Hardware Checksum Offloading, Hardware TCP Segmentation Offloading, and Hardware Large Receive Offloading all disabled; however, I have tested with them on, and it makes no difference to the slow performance.
I tried installing the speedtest Python app and running it from the FreeBSD command line inside the pfSense VM, and it reported 310 mbps down and 8.1 mbps up. I know this is not wholly accurate, as it is the firewall VM itself talking to the WAN rather than traffic routed through pfSense, but it at least shows that the VM is capable of higher speeds than clients behind it are getting.
I did some LAN tests through the virtual LAN switch. Transferring files between a physical machine on the network and VMs on the virtual LAN switch, I see ~950 mbps bidirectionally, which is the link speed. This tells me that the physical network layer is fine, as are the virtual switch and the virtual network cards.
At this point I've isolated and eliminated:
- Physical network
- Virtual switches
- Physical NIC configuration
All of these are operating correctly. The only thing left is pfSense itself, or perhaps FreeBSD's interface with the Hyper-V host? pfSense is fairly close to a bare-metal installation.
I see other people reporting that this type of configuration is working great for them. I've read a ton of "pfSense is slow on Hyper-V" posts, and most seem to be related to Broadcom NICs or other issues that don't exist on my setup.
Any ideas or help in further diagnosing the issue would be most appreciated!
-
@ITFlyer Sounds like a problem with the WAN vSwitch or possibly the Ethernet port? Have you already deleted and recreated the vSwitch, or tried allocating the vSwitch to another port? I think these cards run the ports in pairs, so I'd try using the other pair on the same card, or one of the other cards altogether. I don't have your bandwidth, but I get my advertised 300/25 through a tiny 2012 R2 Hyper-V VM on my 10-year-old quad core, using 2 cores and 1.5 GB RAM, even when I was using Realteks (now using an I340-T4). Inside pfSense, I have all offloading disabled (tried both ways), but all advanced features enabled on the physical NICs at stock settings (SR-IOV n/a on the I340) and VMQ enabled on the vSwitches. I also use a fixed disk size and fixed memory, since pfSense didn't seem to honor it anyway.
-
Hi provels,
Thanks for writing. OK, with this in mind, I did one more test just now.
I spun up a new Windows VM on my Hyper-V host. I then shut down the pfSense VM, bound the WAN vSwitch to the Windows VM, and started it up. The Windows VM was now connected to the WAN exactly the same way pfSense is.
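(For anyone wanting to repeat this test, the vSwitch swap is basically a few PowerShell commands; the test VM name below is just an example of mine:)

    Stop-VM -Name "pfSense"
    # Attach the test VM's vNIC to the existing WAN switch
    Connect-VMNetworkAdapter -VMName "WinTest" -SwitchName "WAN"
    Start-VM -Name "WinTest"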
I then ran a speed test on the Windows VM. The results: 796 mbps down, 40 mbps up.
So that rules out pretty much everything except pfSense itself. Both virtual switches are transferring at gigabit speeds, with no discernible server load. The NICs are handling the traffic without any problems. All the other VMs are communicating with one another and with other physical hosts at gigabit speeds.
The only slowness is...pfSense. I've pretty much eliminated everything else as a factor, so the problem is something within pfSense, its configuration, or perhaps the way in which it interfaces with the vSwitches.
Interestingly, I'm seeing a LOT of dropped packets in my pfSense queues. Like in the tens of thousands. Any time I try to use any kind of bandwidth through pfSense, the Drops counter on the corresponding queue starts counting upwards steadily. I see other people mentioning that their Drops counters stay locked at zero. So there is a bottleneck somewhere.
-
@ITFlyer And all the offloading options are disabled in System/Advanced/Networking? You say it's pretty much "bare metal", so no packages, etc.? You could revert to OOB with Diags/Factory Defaults and rerun setup. Might want to search the r/PFSENSE sub at Reddit, too. Maybe try reverting the physical NICs to factory settings as well. Don't know, but I'll think about it, sorry. If you want to compare any other settings, let me know.
EDIT - https://www.reddit.com/r/PFSENSE/comments/7po5zf/lots_of_packet_loss_and_increased_ping/dsk11bp?utm_source=share&utm_medium=web2x
-
That's correct - no packages. The only thing installed over and above the original distro was the speedtest Python app, and that was only to help diagnose the slowness that already existed.
The physical NICs are at factory settings already - they default to SR-IOV enabled. I should also have mentioned that everything in the server is running the latest firmware.
-
@ITFlyer I think someone with 2016 and i350 NICs will need to chime in. I'm a couple steps behind. Even my latest i340 Win driver is from 2013... Sorry.
Maybe try the 2.5.0 dev version.
-
@ITFlyer If you have an open NIC port, how about creating a third vSwitch using that, binding only the pfSense LAN to that, and running the cable out to your physical switch, where your other physical and virtual hosts can link to it, then rerun your speed tests?
EDIT - I suppose you have ruled out duplex mismatch, right? Everything auto-neg?
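Roughly, assuming a spare port (the names below are only examples):

    # New switch on the spare physical port, host OS kept off it
    New-VMSwitch -Name "LAN2" -NetAdapterName "NIC3" -AllowManagementOS $false

    # Move only the pfSense LAN vNIC over to the new switch
    Connect-VMNetworkAdapter -VMName "pfSense" -Name "LAN" -SwitchName "LAN2"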
-
I hate when people write a post like this and then never follow up with the results. It's been a couple weeks, so I thought I would post what I ended up doing.
I put in many, many hours of research and trials of FreeBSD under Hyper-V. The other VMs on the host all had no problems moving gigabit-speed bidirectional traffic, so I knew there was no hardware issue. The other VMs all had SR-IOV enabled. Disabling SR-IOV on those VMs dropped their throughput to rates similar to what I was seeing on the pfSense VM. So even though the pfSense VM had SR-IOV enabled in Hyper-V, it appeared that it was not actually utilizing it.
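For what it's worth, one rough way to see this from the host side is to compare the SR-IOV counters on the switch and on the vNICs (the names below are only illustrative of my setup):

    # Does the switch actually expose virtual functions, and are any in use?
    Get-VMSwitch "WAN" | Format-List Name, Iov*

    # Do the pfSense vNICs report SR-IOV resources assigned, or only requested?
    Get-VMNetworkAdapter -VMName "pfSense" | Format-List Name, Status, Iov*

On a guest that is really using SR-IOV, the Iov* fields should show assigned resources rather than zeros.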
I also tried creating and configuring an OPNsense VM, which is also FreeBSD-based. It ran marginally faster (10 mbps faster upstream, the same speed downstream), but for the most part it suffered from the same performance issues as pfSense. From the research that I did, it appears that the issue is simply that SR-IOV is not supported on FreeBSD when running under Hyper-V. This chart seems to bear that out:
https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/supported-freebsd-virtual-machines-on-hyper-v#table-legend
So I decided to table the idea of running pfSense as a VM for now, until SR-IOV support is added. In the meantime, I bought a four-core HP T620+ and added a four-port Intel I350-T4 NIC (same one that's in my VM host).
Running pfSense on this little box, it handles gigabit traffic without even slightly breaking a sweat. This confirms that the I350-T4 NIC is not the issue; it is the lack of SR-IOV support for FreeBSD under Hyper-V - and there's nothing I can do about that until Microsoft or FreeBSD or whoever decides to implement it.
-
You said "Disabling SR-IOV on those VMs dropped their throughput to similar rates as I was seeing on the pfSense VM host."
Something is going on that is not SR-IOV related. I have a Hyper-V host running 2016 w/ an Intel I350-T4 and it can easily hit 113 MB/s going from vLAN to LAN without SR-IOV or VMQ. I can also max out my 250/40 internet without issue.
My motherboard doesn't support SR-IOV so it's not an option here. However from my understanding SR-IOV and VMQ were designed for 10 gbps or higher links.
When you say you are disabling VMQ do you mean in the VM properties page, or in the actual host OS via powershell?
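To be explicit about the two places, since they're easy to conflate (adapter/VM names below are just examples):

    # Per-VM setting (what the "virtual machine queue" checkbox under the vNIC's
    # Hardware Acceleration page controls)
    Set-VMNetworkAdapter -VMName "pfSense" -VmqWeight 0

    # Host-side setting on the physical NIC itself
    Disable-NetAdapterVmq -Name "NIC1"
    Get-NetAdapterVmq   # confirm Enabled shows False for that NIC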
-
I am not sure that your scenario is apples to apples. I am having almost exactly the same issue as the OP.
FiOS gigabit connection
- direct through the Verizon router from a laptop is 900 mbps
- no packages, no Snort, etc., just a few basic routes and a NAT or two
- speedtest-cli 450 mbps max
- LAN traffic through pfSense 250 mbps max
- speedtest through the firewall < 50% CPU on pfSense
- Hyper-V host CPU < 5% during the speed test
- VMQ and other settings disabled, same as the OP.
I have tried everything under the sun to resolve this. Next step is a bare-metal install - then on to another firewall if that does not work.
-
idk, but I have a Dell R710 with Ubuntu Server and VirtualBox, Suricata on all the interfaces, and I get full speed.
It seems like a problem with Windows to me, and frankly it's not something to be surprised about.
-
What is "full speed"?
FWIW - I don't doubt it could be a windows issue.
~20-year MS partner here that fully migrated to Mac OS a year ago and will NEVER look back. I no longer use Office and do just about anything I can to avoid deploying Windows in an SMB environment. It is an unmitigated disaster from the desktop to the data center, and it gets worse every day.
-
I mean that with or without pfSense in the middle I get the same speed, near the limit of what my ISP advertises: FTTC 200/20.
The last Windows server I had to manage was Windows SBS 2003, and I had really big trouble managing the DNS server/DC; nothing was working as expected. Instead, I love that you can configure a server where you just need to read (and understand) a commented text file, rather than searching all over a not-very-human-friendly regedit whenever you have trouble. Maybe 20 years ago it wasn't like today, but nowadays I've never come across a real reason in an SMB environment to have a Windows server over a Linux one.
I must admit that I don't have any experience with Mac server.
-
I don't use Mac server (it was garbage and is pretty much discontinued). We just avoid Windows server (and desktop) whenever possible for most clients now. In many cases we deploy Terminal Services (RDS) on Server 2008 running a Windows 7 desktop. That, too, is coming to an end due to 3rd-party software requirements. That said, 90% of my customers' applications are SaaS in one form or another anyway.
In any case, I think the relevant issue here is not the bottleneck IN pfSense, which could be many things.
What appears to be the real issue is that numerous people are having trouble getting anything north of 500 mbps from the pfSense VM to the WAN when hosted on Hyper-V.
-
Yes, indeed, we are going off topic and into personal preferences. Anyway, I agree with you.
-
@ITFlyer According to this article, you have to change the MAC address on the virtual switch after disabling VMQ.
See https://www.dell.com/support/article/us/en/04/sln132131/windows-server-slow-network-performance-on-hyper-v-virtual-machines-with-virtual-machine-queue-vmq-enabled?lang=en
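If I'm reading that workaround right, the host-side version would be something like the following - the NIC name and MAC here are placeholders (00:15:5D is just the Hyper-V range):

    # Turn VMQ off on the physical NIC behind the affected switch
    Disable-NetAdapterVmq -Name "NIC1"

    # Assign the VM's adapter a new static MAC (placeholder value)
    Set-VMNetworkAdapter -VMName "pfSense" -StaticMacAddress "00155D3A2B01"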
When you say you used a fixed disk size, are you sure? In 2016/2019 you have to create the fixed disk outside the New VM wizard. You have to go to New > Disk on the upper right-hand side and select the fixed disk type, then attach it to the new VM that you created with the "attach a disk later" option. If you manually typed in something other than 127 GB in the wizard, it is still a dynamic vhdx.
With a dynamic vhdx I typically get between 40-60ish MB/s in my guests. In my guests with a fixed vhdx I get 113 MB/s without wavering.
You can convert a disk from dynamic to fixed via the disk tools, or create a new disk and reinstall pfSense if you are in fact using a dynamic disk.
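Both routes can also be done from PowerShell, which sidesteps the wizard quirk entirely (the paths and size below are just examples):

    # Create a fixed-size VHDX up front and attach it to the VM
    New-VHD -Path "D:\VMs\pfSense-fixed.vhdx" -SizeBytes 20GB -Fixed
    Add-VMHardDiskDrive -VMName "pfSense" -Path "D:\VMs\pfSense-fixed.vhdx"

    # Or convert an existing dynamic disk (VM must be off; this writes a new file)
    Convert-VHD -Path "D:\VMs\pfSense.vhdx" -DestinationPath "D:\VMs\pfSense-fixed.vhdx" -VHDType Fixed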
-
You can configure the Hyper-V with additional resources and use these tips in the name of the speed:
<snipped by mod - screams spammy on OLD thread>
-
I appreciate you taking the time to post your spammy list of stuff along with a link promoting your own site, but what does any of this have to do with the specific networking issue we're talking about?
@UhimU said in Extremely slow networking under Hyper-V on Intel NICs:
You can configure the Hyper-V with additional resources and use these tips in the name of the speed:
Enable Hyper-V Integration Services
Use fixed VHD files
Don’t use Hyper-V snapshots as a Hyper-V backup alternative
Configure the size of paging files
Do not create -
I think the information can be useful to people, and I was just sharing my thoughts. Sorry if you took this as spam.
-
Sorry, but I am with @ITFlyer - and I edited your post to remove what amounts to keywords and a link.
Glad you're wanting to help - but keep it on topic to the question at hand. And why would you join a forum and, minutes later, add such a post to an almost year-old thread? Because it mentions something related to what you're wanting to promote is what it looked like.