Very slow traffic from other VM's through pfSense on XenServer



  • First identify the UUIDs of the VIFs:
    xe vm-vif-list uuid=VMUUID

    And disable the offload settings:
    xe vif-param-set uuid=VIFUUID other-config:ethtool-gso="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-ufo="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-tso="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-sg="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-tx="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-rx="off"

    Then shut down and start the VM.

    I used this on both a XenServer 6.5 host and a 6.2 host later upgraded to 6.5. On both, it gave the other VMs internet access again.

    I ran the xe commands on a XenServer private network, so I hope the speed degradation will only affect traffic that involves that network.
    I think both the pfSense VM and the other VMs need to be restarted to get useful speed.
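
The steps above can be sketched as a small dom0 helper. This is only an illustration, not a tested production script: the function names, the `run` wrapper and the DRY_RUN variable are made up here, and it assumes the `xe` CLI is available in the XenServer dom0.

```shell
#!/bin/sh
# Sketch: disable the six offload settings from the post on every VIF of one VM.
# Assumption: executed in the XenServer dom0, where `xe` is on the PATH.
# Set DRY_RUN=1 to print the commands instead of executing them.

OFFLOADS="gso ufo tso sg tx rx"

# run CMD...: execute the command, or just print it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# disable_offloads VIFUUID [VIFUUID...]
disable_offloads() {
    for vif in "$@"; do
        for opt in $OFFLOADS; do
            run xe vif-param-set uuid="$vif" other-config:ethtool-"$opt"=off
        done
    done
}

# vifs_of_vm VMUUID: print the VIF UUIDs of the VM, one per line.
# `--minimal` makes xe print a bare comma-separated list.
vifs_of_vm() {
    xe vif-list vm-uuid="$1" params=uuid --minimal | tr ',' '\n'
}
```

A possible invocation is `DRY_RUN=1 disable_offloads $(vifs_of_vm VMUUID)` to preview the commands, then the same call without DRY_RUN, followed by a shutdown and start of the VM.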



  • @phadm:

    You have to disable the offload function at the VIF at the XenServer.
    First identify the uuid of the VIF's:

    Which VIF? Local or WAN or both?

    Thanks,
    Florian


  • LAYER 8 Netgate

    I did it on all.



  • This helped me too. I only did this for my LAN port.

    In my setup it seemed to be sufficient to execute:
    xe vif-param-set uuid=VIFUUID other-config:ethtool-tx="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-rx="off"



  • @jpenninkhof:

    This helped me too. I only did this for my LAN port.

    In my setup it seemed to be sufficient to execute:
    xe vif-param-set uuid=VIFUUID other-config:ethtool-tx="off"
    xe vif-param-set uuid=VIFUUID other-config:ethtool-rx="off"

    I can confirm that the LAN port should be enough. On a related note, did someone install the XenServer Tools in the VM?



  • Hi,

    I updated my XenServer 6.2 to 6.5 a few days ago with my pfSense 2.1.5 VM, with no issues.

    Then I updated pfSense to 2.2 WITH XENTOOLS (xe-guest-utilities 6.0.2_3) and got the same issue!

    I installed the Xen tools using this method: http://blog.feld.me/posts/2014/07/pfsense-on-citrix-xenserver/ (Thanks feld!)

    It looks like the issue remains even with the Xen tools :/

    Can anyone confirm?


  • LAYER 8 Netgate

    Yes.  It's broken.



  • damn !

    But a question remains: was it working well in the snapshots? Was it working well with a previous version of the Xen tools?

    In this thread
    https://forum.pfsense.org/index.php?topic=86827.0
    it looks like an issue with the xn NIC …
    Maybe a previous version would work?


  • LAYER 8 Netgate

    No.

    Just disable the tx/rx like in the above until FreeBSD and/or Citrix fixes it.



  • Ok

    I did the above fix and it finally works.

    Thanks folks !



  • My Internet speed normally is 20 Mb/s down and 2 Mb/s up.

    I deployed pfSense 2.2-RELEASE X64 in XenServer 6.5

    Without modification, the pfSense 2.2 would only muster 5 Mb/s down, and 0.06 Mb/s up. Painful.

    I applied the changes to the LAN side VIF and the upload speed went back to full 2 Mb/s. The WAN speed did not improve.

    I applied the changes to the WAN side VIF and the download speed went back up to 20 Mb/s.

    Eureka!



  • It's just the tx-offload setting that needs to be changed, rx-offload is fixed-on.

    I can confirm the problem and fix with Debian Wheezy/Xen 4.1.4 dom0.

    ethtool -K ${dev} tx off in vif-bridge online did the trick.

    The issue wasn't submitted to freebsd-bugs so far, now it is:
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197344
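
For a plain Xen dom0, the change described above could be wrapped in a small helper like the one below. This is a sketch under assumptions: the names `run`, DRY_RUN and `vif_tx_off` are invented for illustration, and where exactly the call belongs in the vif-bridge script depends on the Xen version in use.

```shell
#!/bin/sh
# Sketch: turn off TX checksum offload on a single backend VIF in dom0.
# Set DRY_RUN=1 to print the ethtool command instead of executing it.

# run CMD...: execute the command, or just print it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# vif_tx_off DEV: only acts on names that look like vifN.M, so that a
# bridge or physical NIC passed in by mistake is left untouched.
vif_tx_off() {
    case "$1" in
        vif[0-9]*.[0-9]*)
            run ethtool -K "$1" tx off
            ;;
        *)
            echo "skipping $1: not a vifN.M device" >&2
            return 1
            ;;
    esac
}
```

Inside the `online` branch of vif-bridge this would be called as `vif_tx_off "${dev}"`, using the same `${dev}` variable as in the post.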



  • Interesting - only appears to apply to virtual interfaces.

    My pfSense VM is running in xen 4.2 (Centos 6.6 dom0) and has no speed issues, but I'm using pci-passthrough to give 2 dedicated hardware NICs (off a dual-port Intel card) to pfSense for LAN/WAN  (so that DMZ/intranet are physically separate too).



    Thanks johnkeates for putting that up here. It really helped me sort this out.

    One thing to note is that disabling tx offload using ethtool -K does not persist across guest reboots or live migration, because the dom-id and the assigned VIF change, while xe vif-param-set other-config:ethtool-tx="off" does.

    Is there any downside to using the vif-param-set option, or are the two basically equivalent?



  • @johnkeates:

    You only need to disable checksum offloading on the hypervisor side of pfSense's interface.

    Any interface that does DomU-DomU communication on pfSense's side produces un-checksummed packets which get dropped by PF in BSD.

    sudo ethtool -K $interface tx off

    where $interface is the VIF on the Xen Dom0 side is enough. Setting TX off on the bridge forces the Dom0 to calculate ALL checksums on ALL packets no matter where they come from or where they are going. This is not a smart idea since it creates a lot of calculations where they might not be needed. So if the pfSense DomU is on vif123.0 you run: sudo ethtool -K vif123.0 tx off

    Sorry noob question here,

    I am using a Xen implementation on an unRAID distribution. When you say Dom0 side, are you talking about the VIF that is spun up with the pfSense VM? When I run ifconfig to list my interfaces, I just don't really know how to identify the interface you are referring to.

    Sorry for the noob question again
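
One way to find the interface in question: backend VIFs in Dom0 are named vif<domain id>.<device index>, as in the vif123.0 example above. The helper below is a hedged sketch; the function name `vifs_for_domid` and the domain name "pfsense" in the usage note are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: pick the Dom0 backend interfaces that belong to one guest.
# Interface names are read from stdin, one per line, so the function can be
# fed from `ls /sys/class/net` or any other listing.

# vifs_for_domid DOMID: print only the names of the form vif<DOMID>.<N>.
vifs_for_domid() {
    grep -E "^vif$1\.[0-9]+\$"
}
```

With the xl toolstack, `xl domid pfsense` prints the domain id of a guest named pfsense, so `ls /sys/class/net | vifs_for_domid "$(xl domid pfsense)"` would list its VIFs. Each of those can then be passed to `ethtool -K <name> tx off`.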


  • LAYER 8 Netgate

    It's all here:

    https://forum.pfsense.org/index.php?topic=85797.msg475906#msg475906

    I recently just rebuilt my test stack and all I did was the tx and rx on every NIC which is still probably more than is necessary but it worked.



  • @johnkeates:

    You only need to disable checksum offloading on the hypervisor side of pfSense's interface.

    Any interface that does DomU-DomU communication on pfSense's side produces un-checksummed packets which get dropped by PF in BSD.

    sudo ethtool -K $interface tx off

    where $interface is the VIF on the Xen Dom0 side is enough. Setting TX off on the bridge forces the Dom0 to calculate ALL checksums on ALL packets no matter where they come from or where they are going. This is not a smart idea since it creates a lot of calculations where they might not be needed. So if the pfSense DomU is on vif123.0 you run: sudo ethtool -K vif123.0 tx off

    Thank you for taking the time to explain this. I turned TX off on the pfSense VIF and all was good. Happy days.



  • Hello all…

    Thanks for the information - sure helped us solve this but I have some more information that wasn't clear to me from all posted here.

    This issue only seems to apply where Pf is communicating with hosts within the same xen host (dom0).

    We use xenserver 6.2 fwiw. We have two xen dom0 - pf was natting for two services - one on dom0-a and one on dom0-b

    pf itself was located on dom0-b
    The dom0-a service worked perfectly after the update to 2.2.2 - the dom0-b service did not.

    For people new to xenserver / for completeness, we used:
    xe vm-list
    #then find the uuid of your pf vm
    xe vif-list vm-uuid={uuid of the vm from above}
    #note the uuid of the vif - not the network you want to change!
    #for each vif you can check the status:
    xe vif-param-get uuid={uuid of vif} param-name=other-config
    xe vif-param-set uuid={uuid of vif} other-config:ethtool-tx="off"

    For what it's worth I was able to turn off tx on only the LAN interface (which nats for the dom0-b service).

    I tried but did not need to keep offload off for the WAN interface which seems to get proper checksum as it leaves the dom0 through the physical nic.

    Once complete, you need to reboot the pf VM. The setting will persist across reboots.

    Hope that helps someone else :-)

    Mitch
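
The check-then-set sequence above could be scripted roughly as follows. This is a sketch: the function names are invented here, and it assumes that xe prints the other-config map in the form "key: value; key: value", which should be verified on your own host before relying on it.

```shell
#!/bin/sh
# Sketch: only set ethtool-tx=off on a VIF when it is not already set.

# tx_offload_is_off OTHER_CONFIG: succeeds if the given other-config string
# contains the entry "ethtool-tx: off".
# Assumption: entries are separated by ";" and written as "key: value".
tx_offload_is_off() {
    printf '%s\n' "$1" | tr ';' '\n' | sed 's/^ *//; s/ *$//' \
        | grep -qx 'ethtool-tx: off'
}

# ensure_tx_off VIFUUID: query the VIF and apply the setting if needed.
ensure_tx_off() {
    cfg=$(xe vif-param-get uuid="$1" param-name=other-config)
    if tx_offload_is_off "$cfg"; then
        echo "$1: ethtool-tx already off"
    else
        xe vif-param-set uuid="$1" other-config:ethtool-tx=off
    fi
}
```

After `ensure_tx_off {uuid of vif}` the pf VM still has to be rebooted for the change to take effect, as noted in the post.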



  • I've been running pfsense 2.2 on XenServer 6.2 for a while with the mentioned offloads disabled and it's been working great. I believe since I upgraded to XenServer 6.5 (or when I upgraded to 6.5 SP1) pfsense only works as before on one specific host in the pool. I have 3 hosts in the pool and when pfsense is running on 2 of them it is very slow, but on the 3rd host it works fine.

    How come?



  • Without knowing your network I can only guess… but see if this makes sense.

    What I found was that if the pfsense was routing traffic for vm's on other systems (outside the xen box itself) then things worked - the offload worked as expected as the offload is added at the nic as the data leaves the xen server.

    When I was routing traffic that was contained by the virtual network on the same xen host, that's when it didn't work - until I disabled the offloads - you only need to disable on the paths which you see the performance issues in my opinion - but you have to think it through.

    Cheers.


  • LAYER 8 Netgate

    The stack in the diagram in my sig is all on XenServer 6.5.  Works fine as long as the checksumming is turned off.



  • Well, this issue is when traffic flows from external machines through pfsense wan-interface to resources on the internal lan.

    The host where this works has different hardware (including different NICs) than the other two hosts in the pool. So when I migrate or restart pfSense on host 1 or 2, I can't get through the firewall from the outside (i.e. it's so slow that it doesn't work). But with pfSense on host 3 it works as expected.

    Before it worked on all 3 hosts. Now the pfsense is not protected against host failure.



  • @Gr1pen:

    Well, this issue is when traffic flows from external machines through pfsense wan-interface to resources on the internal lan.

    The host where this works has different hardware (including different NICs) than the other two hosts in the pool. So when I migrate or restart pfSense on host 1 or 2, I can't get through the firewall from the outside (i.e. it's so slow that it doesn't work). But with pfSense on host 3 it works as expected.

    Before it worked on all 3 hosts. Now the pfsense is not protected against host failure.

    What are the eth specs when it's failing? And is it a live migration or a shutdown-boot migration?
    If you want to protect against failure, it's better to use pfSense's failover options instead of hypervisor-based failover.



  • I think he was trying to do that but he perceived one pfsense to work and two others not to work.

    I'll try to explain it another way… the interface (if any) which transmits traffic to machines on the same physical xen server needs to have tx check sums turned off as I noted in my post. That's the only interface affected.

    If you have a pf on xen and it does not route for any hosts on the same xen box you don't see any problem.

    This would affect any traffic to which check sums would be applicable (all I think?) - so it would affect carp traffic too I imagine IF your pf boxes were on the same network - if they are on different boxes the carp traffic will be fine.

    Just turn off the tx checksums for all the pfSense interfaces if you don't understand what I mean. The method I described survives rebooting and only affects the pf VMs you apply the changes to.

    Hope that clarifies. Cheers.



  • Perhaps my explanation was not so clear. The offload settings mentioned here have been applied on all interfaces of pf from the start, when I was running it on XenServer 6.2. That fixed the problem then and pf worked perfectly fine on all 3 hosts. It was like living in a dream where the streets were paved with gold and there was free candy for everyone.

    After upgrading to XS 6.5/SP1, pf only works on 1 host. It doesn't matter if I live migrate or shut down and restart on another host. It ONLY works on "host 3".

    I am only running 1 instance of pfSense, and sure, it may be better to run 2 or more in an HA setup, but that's not really the question here. I had a fine working setup. But not anymore. The candy is all gone and the only change is XS, which has been upgraded.

    In reply to johnkeates: I don't know what eth specs I should look into…?



  • @Gr1pen:

    In reply to johnkeates: I don't know what eth specs I should look into…?

    Use XE to get all the vif specs from the working pf hypervisor and one non-functional hypervisor, as well as ethtool parameters for both.
    We're looking for other variables that might mess with the in-memory transport, because that's where VirtIO related issues seem to lie.
    If you could post those 4 outputs it'd help us diagnose.
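
To compare the ethtool output of a working and a failing host, a small diff helper may be enough. This is a sketch under assumptions: the function name `feature_diff` and the file names in the usage note are invented, and the files are expected to hold the saved output of `ethtool -k <vif>` from each host.

```shell
#!/bin/sh
# Sketch: show only the offload features that differ between two saved
# `ethtool -k` outputs (one from the working host, one from a failing host).

# feature_diff FILE_A FILE_B: lines starting with "<" come from FILE_A,
# lines starting with ">" come from FILE_B. Prints nothing if they match.
# The header line of `ethtool -k` will also show up when the device names
# differ between the two hosts.
feature_diff() {
    diff "$1" "$2" | grep '^[<>]'
}
```

For example, after saving `ethtool -k vif5.0 > host3.txt` on the working host and the equivalent on a failing one as host1.txt, `feature_diff host3.txt host1.txt` lists the settings that are not the same.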



  • My bad…

    I noticed that the interfaces on the 2 failing XenServer hosts had been reordered for some reason. Correcting this solved my problem, hence it was not related to pfSense.

    I am thankful for your effort to help out and apologize for confusing you!



  • @Gr1pen:

    My bad…

    I noticed that the interfaces on the 2 failing XenServer hosts had been reordered for some reason. Correcting this solved my problem, hence it was not related to pfSense.

    I am thankful for your effort to help out and apologize for confusing you!

    Glad you got it fixed!



  • Just to keep this updated.

    This problem still happens on XenServer 7.0 with pfSense 2.3.1.



  • @viniciusferrao:

    Just to keep this updated.

    This problem still happens on XenServer 7.0 with pfSense 2.3.1.

    Yep, until it's fixed in upstream FreeBSD it won't get fixed, ever.



  • @johnkeates:

    @viniciusferrao:

    Just to keep this updated.

    This problem still happens on XenServer 7.0 with pfSense 2.3.1.

    Yep, until it's fixed in upstream FreeBSD it won't get fixed, ever.

    Just figured I'd update this thread on these issues.  It looks like FreeBSD 11 is adding dom0 support for Xen, so hopefully these issues will be fixed.  I'm just getting a virtualized setup going, with support for 32 bit ending soon, so I may try pfSense 2.4 to see how it works out of the box with Xen.

    Here is a link to the freebsd support, though it will be experimental at this stage:

    https://wiki.freebsd.org/Xen



  • @johnkeates:

    @gothicman02:

    @johnkeates:

    @viniciusferrao:

    Just to keep this updated.

    This problem still happens on XenServer 7.0 with pfSense 2.3.1.

    Yep, until it's fixed in upstream FreeBSD it won't get fixed, ever.

    Just figured I'd update this thread on these issues.  It looks like FreeBSD 11 is adding dom0 support for Xen, so hopefully these issues will be fixed.  I'm just getting a virtualized setup going, with support for 32 bit ending soon, so I may try pfSense 2.4 to see how it works out of the box with Xen.

    Here is a link to the freebsd support, though it will be experimental at this stage:

    https://wiki.freebsd.org/Xen

    I suppose that could actually fix the netback/netfront problems because it will be BSD on the other end too. Interesting.

    Yes, very.  Although there is still some work to do.  I got the latest 2.4 snapshot running (as of March 18th) with FreeBSD 11.0-p8 under XenServer 7.1 with all patches, and the issues with checksum offloading still exist.  Disabling it, though only on the rx and tx side, still fixes the issue, but I do believe there is a slight performance drop like others have said here.  I haven't tested local file transfers yet, but I do notice a slight drop in internet bandwidth.  I'll do more testing when I have time.



  • So as I understand it, we need an upstream fix from FreeBSD for this to be magically solved once and for all. What about workarounds? Can someone summarize what steps to take so we can add it to the Wiki under Virtualization / Xen?

    Out of curiosity, is it the same with other environments, like KVM or ESXI?



  • Hi guys, is this solution still necessary? I mean, I've already disabled "hardware checksum offloading" and I'm running XS 7.2 with Citrix DVSC... I just can't reach the internet from my second server, while all the VMs hosted on my pool master are working fine...
    pfSense is my gateway, running on the master.


  • LAYER 8 Netgate

    Yes. It is still necessary if you want to use the PV NICs.

    You can also put this in /boot/loader.conf.local:

    hw.xen.disable_pv_nics=1

    Your interfaces will now present as reX and you will not have to make those VM checksum changes. But they won't be paravirtualized.
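
Adding the tunable can be done by hand in an editor. As an alternative, the sketch below appends it only when it is not already present; the helper name `ensure_line` is made up for illustration. A loader tunable like this is read at boot, so the VM needs a reboot afterwards.

```shell
#!/bin/sh
# Sketch: add a line to a config file without creating duplicates.

# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
# The file is created if it does not exist yet.
ensure_line() {
    [ -f "$1" ] || : > "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}
```

On the pfSense VM this would be used as `ensure_line /boot/loader.conf.local 'hw.xen.disable_pv_nics=1'`.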



  • @bbmitch

    I'm wondering... this thread is 4 years old... and problem is still there.
    The trick xe vif-param-set uuid={uuid of vif} other-config:ethtool-tx="off" is still working also.

    Any idea when this will be fixed?

    Thanks



  • Hello, I run my pfSense on XenServer. I changed both network cards with

    xe vif-param-set uuid=... other-config:ethtool-rx="off"
    and
    xe vif-param-set uuid=... other-config:ethtool-rx="on"

    and the internet speed got better. But I still have a problem with some web pages: some pages open, some don't, some take a very long time to open, and sometimes everything works fine :(.

    Any idea ??

    Thanks.



