Netgate Discussion Forum

    Unbelievably bad performance

    General pfSense Questions
    49 Posts 7 Posters 12.7k Views
    • D
      Douglas Haber

      @johnpoz:

      In a perfect world, trying to track this down.. I would sniff at the physical interface of your host, on both pfsense interfaces, and then at the VM interface.

      This gives us full path..  And allows us to validate that inbound packets are getting all the way to the vm client behind pfsense - it answers and then pfsense sends that back and it goes out the physical interface of the hypervisor host..

      http://douglashaber.com/dump/hypervisor.cap
      http://douglashaber.com/dump/WANCapture.cap
      http://douglashaber.com/dump/LANCapture.cap

      Warning - the hypervisor capture is pretty big.

      • johnpozJ
        johnpoz LAYER 8 Global Moderator

        Ok, followed one connection - see attached.

        Physical on the left, VM pfsense on the right.

        So you see the SYN come in from 6.46 to pfsense 6.38, saying hey I want to talk to you from port 38877 to your port 80.

        Then you see the SYN,ACK back and the ACK to that - typical handshake..

        Now 6.46 sends a GET for some HTML..  you see the ACK back that says ok, got your GET.. Then he sends the 404..  But he never gets an ACK back from 6.46 for the 404..  So he sends the 404 again, and again - those are the retransmissions.

        So clearly pfsense put that on its virtual interface..  And as you can see on the left it's also on the physical HOST interface..  So why does 6.46 never send back an ACK??  Did he not get it??  Your issue is between the physical interface of the host and that 6.46 box..  Pfsense is doing exactly what it's been asked to do..

        I see the 404 go out on the physical capture.. So why does 6.46 not ACK??  Did he get it and send an ACK, and then that ACK got lost?.. It never shows up on the physical… Can you sniff on the 6.46 host??
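The reasoning above (a data segment goes out repeatedly and no ACK ever covers it) can be sketched as a small check over a simplified trace. This is only an illustration: the `Segment` fields, the function name, and the sequence numbers are made up, the hosts are labelled "6.38" and "6.46" as in the post, and sequence-number wraparound is ignored.

```python
from collections import namedtuple

# Simplified view of one captured TCP segment (illustrative field names).
Segment = namedtuple("Segment", "src dst seq ack payload_len")


def find_unacked_retransmissions(segments, sender):
    """Return (seq, times_sent) for payload segments that `sender` sent more
    than once and that the peer never acknowledged.

    Assumes a single connection and no sequence-number wraparound.
    """
    sent = {}        # seq -> [payload_len, times_seen]
    highest_ack = 0  # highest ACK number received by `sender`
    for s in segments:
        if s.src == sender and s.payload_len > 0:
            entry = sent.setdefault(s.seq, [s.payload_len, 0])
            entry[1] += 1
        elif s.dst == sender:
            highest_ack = max(highest_ack, s.ack)
    return [
        (seq, count)
        for seq, (length, count) in sorted(sent.items())
        if count > 1 and highest_ack < seq + length
    ]


# Made-up trace shaped like the capture described above.
trace = [
    Segment("6.46", "6.38", seq=1, ack=1, payload_len=100),    # GET
    Segment("6.38", "6.46", seq=1, ack=101, payload_len=0),    # ACK of the GET
    Segment("6.38", "6.46", seq=1, ack=101, payload_len=300),  # 404
    Segment("6.38", "6.46", seq=1, ack=101, payload_len=300),  # retransmission
    Segment("6.38", "6.46", seq=1, ack=101, payload_len=300),  # retransmission
]

print(find_unacked_retransmissions(trace, "6.38"))
```

If an ACK for the 404 from 6.46 were present in the trace, the function would return an empty list, which is the case a healthy path would show.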

        oknoackback.png

        An intelligent man is sometimes forced to be drunk to spend time with his fools
        If you get confused: Listen to the Music Play
        Please don't Chat/PM me for help, unless mod related
        SG-4860 24.11 | Lab VMs 2.7.2, 24.11

        • stephenw10S
          stephenw10 Netgate Administrator

          This thread is a great example in diagnostics.  :)

          However it does seem hard to explain why it should have worked perfectly under pfSense 2.1.5 and not 2.2 if the error exists outside the host box.  :-\

          Have you read this: https://forum.pfsense.org/index.php?topic=85797.msg475906#msg475906

          I would be disabling the paravirtualised drivers for the pfSense VM to test that.

          Steve

          • C
            cmb

            @stephenw10:

            I would be disabling the paravirtualised drivers for the pfSense VM to test that.

            Yeah, forcing the VM to e1000 would be ideal and likely would fix the issue. From some brief searching though it doesn't appear easy, if possible at all, to force Xen to present a specific NIC to the VM. Ugly, every other hypervisor handles that far, far better.

            • F
              frederickding

              This is a known issue in upstream FreeBSD 10 after they incorporated the Xen paravirtualized drivers in the standard kernel. It's not exactly pfSense's fault.

              @cmb:

              Yeah, forcing the VM to e1000 would be ideal and likely would fix the issue. From some brief searching though it doesn't appear easy, if possible at all, to force Xen to present a specific NIC to the VM. Ugly, every other hypervisor handles that far, far better.

              It's definitely possible. There's a wrapper script for QEMU in `/opt/xensource/libexec/qemu-dm-wrapper`.

              Anyways, I've been experiencing the same network performance issues in pfSense 2.2 snapshots, both on XenServer 6.2 and XenServer Creedence RC.
              
              However, I haven't found any way to remove or blacklist drivers _in the kernel_ the way one would on Linux (e.g. rmmod or adding bootloader parameters). So, the only workaround I've found, to revert to emulated NICs, is to recompile the BSD kernel without PVHVM drivers. I've [written instructions here](https://code.dingcorp.com/frederick.ding/pfsense-tools/wikis/removing-pvhvm), tested a few weeks ago, though it's a convoluted process to recompile a kernel.
              • johnpozJ
                johnpoz LAYER 8 Global Moderator

                So how is it these drivers cause the packets to show up on the physical NIC of the host - but not get answered??  While I can see how drivers can cause problems in virt.. from the sniffs it sure looks like the info is put on the physical NIC.. Is there something wrong with the info put on the wire?  Mangled packets that the other side doesn't like and doesn't see??  I did not look that deep into it - just followed the stream..  If the other side actually saw the traffic then yeah, we would have to look deeper into why the packet is there but he's not seeing it, etc..

                • stephenw10S
                  stephenw10 Netgate Administrator

                  I agree with you that it looks like there's no reply and hence an external problem. The 404 response is reaching the client correctly though?

                  However in light of the known issues with the xn(4) drivers in FreeBSD 10 it seems unproductive to continue without testing a standard NIC driver, even if it's re(4). This fits the fact it worked fine under 2.1.5 also.

                  Steve

                  • C
                    cmb

                    @johnpoz:

                    So how is it these drivers cause the packets to show up on the physical NIC of the host - but not get answered??

                    I'm pretty confident judging by the packet captures it's because some packets are ending up with bad checksums, so it doesn't matter that they're getting there, they're dropped for that reason.

                    @frederickding:

                    It's definitely possible. There's a wrapper script for QEMU in `/opt/xensource/libexec/qemu-dm-wrapper`.

                    Ah good, thanks for the tip, at least it's possible and hopefully that'll help others.

                    • johnpozJ
                      johnpoz LAYER 8 Global Moderator

                      But the invalid checksum is most likely due to it just being offloaded, etc.  I see that so much in sniffs that I have even turned off checking for it.

                      • C
                        cmb

                        @johnpoz:

                        But the invalid checksum is most likely due to it just being offloaded, etc.  I see that so much in sniffs that I have even turned off checking for it.

                        That's true most of the time you see bad checksums, but not where it's inconsistent. Everything would have bad checksums if hardware checksum offloading were at fault, and some of those packets have valid checksums. Also, where hardware checksum offloading is to blame, the checksum is almost always 0 in the capture, which isn't the case here either.
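To make that distinction concrete, here is a rough sketch that sorts a captured TCP segment into "valid", "zero" (the pattern described above for offloading) or "invalid" (nonzero but wrong). It uses the standard RFC 1071 Internet checksum over the IPv4 pseudo-header plus the segment. The function names, the 192.0.2.x addresses, and the sample segment are assumptions made up for illustration and are not taken from the captures in this thread.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum_state(src_ip: str, dst_ip: str, segment: bytes) -> str:
    """Classify the checksum of a TCP segment (header + payload) over IPv4."""
    pseudo = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 6, len(segment))  # zero, protocol 6, TCP length
    )
    stored = struct.unpack("!H", segment[16:18])[0]  # checksum field
    # Summing a segment that already carries a correct checksum yields 0.
    if internet_checksum(pseudo + segment) == 0:
        return "valid"
    return "zero" if stored == 0 else "invalid"


# Made-up segment: port 80 -> 38877, checksum field left at 0.
SRC, DST = "192.0.2.38", "192.0.2.46"  # placeholder addresses
header = struct.pack("!HHIIBBHHH", 80, 38877, 1, 101, 5 << 4, 0x18, 65535, 0, 0)
unsummed = header + b"HTTP/1.1 404 Not Found\r\n"

# Fill in the correct checksum, then corrupt one payload byte.
pseudo = socket.inet_aton(SRC) + socket.inet_aton(DST) + struct.pack("!BBH", 0, 6, len(unsummed))
good = unsummed[:16] + struct.pack("!H", internet_checksum(pseudo + unsummed)) + unsummed[18:]
corrupted = good[:-1] + bytes([good[-1] ^ 0xFF])

for name, seg in [("good", good), ("unsummed", unsummed), ("corrupted", corrupted)]:
    print(name, tcp_checksum_state(SRC, DST, seg))
```

A capture where every outbound segment lands in the same non-valid bucket points at offloading on the capturing host; a mix of "valid" and "invalid" on the same path is the inconsistent pattern described above.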

                        • johnpozJ
                          johnpoz LAYER 8 Global Moderator

                          good points..  I will keep that in mind when looking at future sniffs ;)

                          • B
                            beyondcrazy

                            I'm seeing very similar issues to the OP, using KVM via Proxmox 3.3. Running on an AMD FX-8350 system with a quad-port Intel NIC.

                            2.1.5 is running perfectly. Upgrading to the very latest 2.2 RC seems to migrate fine, but after boot it won't pass any traffic except ICMP.

                            Have tried both paravirtualized nic drivers, as well as the e1000 drivers. No change.

                            I did try a bare bones install of rc2.2 in a new vm using e1000 drivers, and with very minimal configuration it did appear to work correctly. So it seems that some aspect of the migrated configuration is causing problems. I haven't had a chance yet to figure out what portion.

                            Will probably try disabling the offloaded checksum calc first (it's easy), and if that doesn't fix it, start removing components of the existing config to see what is causing issues.
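For a quick test from a shell, checksum offload can usually be switched off per interface with FreeBSD's `ifconfig` using the `-rxcsum` and `-txcsum` options. The sketch below only builds that command and, unless `dry_run` is turned off, does not execute anything. The function names and the `em0` interface name are assumptions for illustration, the change made this way is not expected to survive a reboot, and the pfSense GUI also has a "Disable hardware checksum offload" option under System > Advanced > Networking.

```python
import subprocess

# ifconfig options that turn off receive/transmit checksum offload on FreeBSD.
CSUM_OFFLOAD_OFF = ["-rxcsum", "-txcsum"]


def offload_disable_command(interface: str) -> list[str]:
    """Build the ifconfig argv that disables checksum offload on `interface`."""
    return ["ifconfig", interface, *CSUM_OFFLOAD_OFF]


def disable_checksum_offload(interface: str, dry_run: bool = True) -> list[str]:
    """Return the command; run it (needs root on the firewall) if dry_run is False."""
    cmd = offload_disable_command(interface)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


# "em0" is a placeholder; substitute the real WAN/LAN interface names.
print(" ".join(disable_checksum_offload("em0")))
```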

                            Moderately simple pfsense system config. No modules, no vlans. Does have 1 wan and two lan ports (running as emX), multiple ports forwards, schedules, logging. It's running as a pure fw appliance. So, dns/dhcp, sip/asterisk, vpn/strongswan, etc, all running on different internal hosts.

                            If necessary I can certainly build the whole config again…

                            • C
                              cmb

                              @johnkeates:

                              I posted this in a different thread, I hope it's okay to semi-double post

                              You're more than welcome to cross-post solutions across however many threads are relevant.  :) There are probably a dozen different threads around here on this same root issue, and many people only follow specific threads and would otherwise miss a fix posted in a different one.

                              pf does have a history of breaking checksums in certain areas, though I can't say I've seen any of that recently outside of this particular issue with Xen. It's probably a combination of pf+xn from the sound of your description. You can take the /tmp/rules.debug file, copy it over to stock FreeBSD, run kldload pf && pfctl -f rules.debug (assuming the stock system has the same NICs) and see what happens. I'm definitely curious about the results.
