Unbelievably bad performance
-
I would try disabling the paravirtualised drivers for the pfSense VM to test that.
Yeah, forcing the VM to e1000 would be ideal and would likely fix the issue. From some brief searching, though, it doesn't appear easy, if it's possible at all, to force Xen to present a specific NIC to the VM. Ugly; every other hypervisor handles that far, far better.
-
This is a known issue in upstream FreeBSD 10 after they incorporated the Xen paravirtualized drivers into the standard kernel. It's not exactly pfSense's fault.
Yeah, forcing the VM to e1000 would be ideal and would likely fix the issue. From some brief searching, though, it doesn't appear easy, if it's possible at all, to force Xen to present a specific NIC to the VM. Ugly; every other hypervisor handles that far, far better.
It's definitely possible. There's a wrapper script for QEMU in
```
/opt/xensource/libexec/qemu-dm-wrapper
```
Anyways, I've been experiencing the same network performance issues in pfSense 2.2 snapshots, both on XenServer 6.2 and XenServer Creedence RC. However, I haven't found any way to remove or blacklist drivers _in the kernel_ the way one would on Linux (e.g. rmmod or adding bootloader parameters). So the only workaround I've found to revert to emulated NICs is to recompile the BSD kernel without the PVHVM drivers. I've [written instructions here](https://code.dingcorp.com/frederick.ding/pfsense-tools/wikis/removing-pvhvm), tested a few weeks ago, though it's a convoluted process to recompile a kernel.
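To sketch what that wrapper hack might look like (untested, and it assumes XenServer's emulated default shows up in the qemu-dm arguments as model=rtl8139; inspect the real invocation on your host before trying anything like this):
```
#!/bin/sh
# Hypothetical shim: move the stock script to qemu-dm-wrapper.real and
# install this in its place. It rewrites the emulated NIC model from
# rtl8139 to e1000 before handing off. Note it would affect every HVM
# guest on the host.
for a in "$@"; do
  # append the rewritten arg, then drop the original from the front
  set -- "$@" "$(printf '%s' "$a" | sed 's/model=rtl8139/model=e1000/')"
  shift
done
exec /opt/xensource/libexec/qemu-dm-wrapper.real "$@"
```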
-
So how is it these drivers cause the packets to show up on the physical NIC of the host, but not get answered? While I can see how drivers can cause problems in virtualization, from the sniffs it sure looks like the info is put on the physical NIC. Is there something wrong with the info put on the wire? Mangled packets that the other side doesn't like and doesn't see? I did not look that deep into it, just followed the stream. If the other side actually saw the traffic, then yeah, we'd have to look deeper into why the packet is there but not being seen, etc.
-
I agree with you that it looks like there's no reply, and hence an external problem. The 404 response is reaching the client correctly, though?
However, in light of the known issues with the xn(4) drivers in FreeBSD 10, it seems unproductive to continue without testing a standard NIC driver, even if it's re(4). That also fits with the fact that it worked fine under 2.1.5.
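A quick way to confirm which driver actually attached, from the console shell (a sketch; device names vary per system):
```
# List PCI network devices with vendor/driver info
pciconf -lv | grep -B3 -i network
# xn* = Xen netfront (paravirtualized), em*/re* = emulated/real NICs
dmesg | grep -E '^(xn|em|re)[0-9]'
```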
Steve
-
So how is it these drivers cause the packets to show up on the physical NIC of the host, but not get answered?
I'm pretty confident, judging by the packet captures, that it's because some packets are ending up with bad checksums. It doesn't matter that they're getting there; they're dropped for that reason.
It's definitely possible. There's a wrapper script for QEMU in
```
/opt/xensource/libexec/qemu-dm-wrapper
```
Ah good, thanks for the tip, at least it's possible and hopefully that'll help others.
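In the meantime, a quick test anyone on a Xen snapshot can run from the console shell is turning checksum offload off on the xn interfaces (temporary only; it won't survive a reboot, and xn0 is just an example name):
```
# Check the current offload flags (look for TXCSUM/RXCSUM in options=)
ifconfig xn0
# Disable tx/rx checksum offload as a test; repeat per xn interface
ifconfig xn0 -txcsum -rxcsum
```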
-
But the invalid checksum is most likely due to it just being offloaded, etc. I see that so much in sniffs that I have even turned off checking for it.
-
But the invalid checksum is most likely due to it just being offloaded, etc. I see that so much in sniffs that I have even turned off checking for it.
That's true most of the time where you see bad checksums, but the inconsistency here is what rules that out. If hardware checksum offloading were at fault, everything would have bad checksums, yet some of those packets have valid checksums. Also, where hardware checksum offloading is to blame, the checksum is almost always 0 in the capture, which also isn't the case here.
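If you want to check for this in your own captures, tcpdump validates checksums in verbose mode. A sketch (capture.pcap is a placeholder; capture on the receiving side, since the sender's NIC fills in offloaded checksums after the capture point):
```
# Flag packets whose IP/TCP checksums don't verify
tcpdump -r capture.pcap -vv 2>/dev/null | grep -Ei 'incorrect|bad cksum'
```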
-
good points.. I will keep that in mind when looking at future sniffs ;)
-
I'm seeing very similar issues to the OP's, using KVM via Proxmox 3.3, running on an AMD FX-8350 system with a quad-port Intel NIC.
2.1.5 runs perfectly. Upgrading to the very latest 2.2 RC seems to migrate fine, but upon boot it won't pass any traffic except ICMP.
I have tried both the paravirtualized NIC drivers and the e1000 drivers. No change.
I did try a bare-bones install of the 2.2 RC in a new VM using the e1000 drivers, and with a very minimal configuration it did appear to work correctly. So it seems that some aspect of the migrated configuration is causing problems; I haven't had a chance yet to figure out which part.
I will probably try disabling the offloaded checksum calc first (it's easy; see the commands below), and if that doesn't fix it, start removing components of the existing config to see what is causing issues.
It's a moderately simple pfSense config: no modules, no VLANs. It does have one WAN and two LAN ports (running as emX), multiple port forwards, schedules, and logging. It's running as a pure firewall appliance, so DNS/DHCP, SIP/Asterisk, VPN/strongSwan, etc. all run on different internal hosts.
If necessary I can certainly build the whole config again…
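For reference, the checkbox I mean is under System > Advanced > Networking in the webGUI ("Disable hardware checksum offload"). The one-off shell equivalent would be something like this, assuming the three ports come up as em0 through em2 (a temporary test; it won't survive a reboot):
```
# Disable tx/rx checksum offload on each port as a test
for i in em0 em1 em2; do ifconfig "$i" -txcsum -rxcsum; done
```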
-
@johnkeates:
I posted this in a different thread, I hope it's okay to semi-double post
You're more than welcome to cross-post solutions across however many threads are relevant. :) There are probably a dozen different threads around here on this same root issue, and many people only follow specific threads, so they'd otherwise miss a fix for the same problem posted in a different thread.
pf does have a history of breaking checksums in certain areas, though I can't say I've seen any of that recently outside of this particular issue with Xen. It's probably a combination of pf+xn from the sound of your description. You can take your /tmp/rules.debug file, copy it over to stock FreeBSD, kldload pf && pfctl -f rules.debug (assuming the stock system has the same NICs) and see what happens. I'm definitely curious about the results.
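Something like this, assuming a stock FreeBSD 10 test box with the same NIC layout (the hostname is a placeholder):
```
# On the pfSense VM: copy the generated ruleset off the box
scp /tmp/rules.debug root@freebsd-test:/tmp/
# On the stock FreeBSD box:
kldload pf                  # load the pf kernel module
pfctl -f /tmp/rules.debug   # load the pfSense-generated ruleset
pfctl -e                    # enable pf
pfctl -s rules              # sanity-check what actually loaded
```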