Netgate Discussion Forum

    Intervlan performance slow on my C2758 atom 8 core.

    General pfSense Questions
    27 Posts 8 Posters 6.0k Views
    • GomezAddams

      You are for sure only using one member of your port channel for transfers between two nodes. Cisco switches don't round-robin across port channels. By default, they hash the source MAC address and come up with a member of the port channel to send the data across. Transfers between two MAC addresses will always go across the same member of a port channel. You can modify this to use combinations of source and destination MAC address or IP address, but the net result is the same - your data is going to use only one link.
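To make the hashing behaviour concrete, here is a minimal Python sketch of how a switch might map a source/destination MAC pair to one port-channel member. The XOR-and-modulo hash, the function names, and the MAC addresses are assumptions for illustration only; real Cisco hardware uses its own hash. The property that matters is the same: a given address pair always lands on the same link.

```python
def mac_to_int(mac):
    """Convert a MAC address such as '00:11:22:33:44:55' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def pick_member(src_mac, dst_mac, n_links, mode="src-dst-mac"):
    """Return the index of the port-channel member a frame would use.

    Illustrative hash only (XOR + modulo), not the switch's real algorithm.
    """
    if mode == "src-mac":
        key = mac_to_int(src_mac)
    elif mode == "src-dst-mac":
        key = mac_to_int(src_mac) ^ mac_to_int(dst_mac)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return key % n_links


# Hypothetical hosts: every frame between them maps to the same member link.
nas = "00:11:22:33:44:55"
pc = "66:77:88:99:aa:bb"
links_used = {pick_member(nas, pc, 6) for _ in range(1000)}
print(links_used)  # a set containing exactly one link index
```

However many frames the two hosts exchange, the set never grows beyond one entry, which is why a single transfer cannot exceed the speed of one member link.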

      However, that doesn't explain the 50% drop in throughput (unless there is other fairly heavy traffic). You might try running Wireshark on one of the transfer hosts and watching the actual traffic. See if anything jumps out at you.

      • FlashEngineer

        Exactly. Even if it were using only one link, with nothing else happening on the network, that link is full-duplex 1 Gbps (1 Gbps in each direction, 2 Gbps in total).

        I'm using src-dst-mac as the load-balancing algorithm.

        • Derelict LAYER 8 Netgate

          Your expectations on switched versus routed performance on a single TCP stream might be too high.

          Chattanooga, Tennessee, USA
          A comprehensive network diagram is worth 10,000 words and 15 conference calls.
          DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
          Do Not Chat For Help! NO_WAN_EGRESS(TM)

          • Guest

            So I'm not sure what's up with pfsense.  I have the same hardware as the one in pfsense store,

            Which unit exactly are you comparing your board to?

            2nd best (2758 atom 8 core). I'm running 8 total nics with 16 gb RAM.

            Four NICs are soldered onto the board. What kind of NICs are the others?

            The uplink to my switch Cisco 2960x, is a 6 port LACP lagg.

            Why a LAG (LACP) and not a 10 GbE or SFP+ uplink? Is this a Layer 2 or a Layer 3 switch?

            If I'm transferring a file via FTP within same vlan, which bypasses pfsense all together, I'm getting around 95MB/sec or close to 900+mbps.  So nothing wrong with my switch.

            How large is the file you are transferring? And how is pfSense bypassed in this case?
            Is this a Layer 3 switch that routes between the VLANs itself, or does it only switch inside each VLAN?

            When doing intervlan transfer, obviously going through pfsense, it drops to about 40MB/sec so maybe around 450-500mbps.

            A better test method is iPerf or NetIO, with one device acting as the server and another as the
            client. That is independent of the application protocol and tells you more than a copy or FTP test.
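As a rough illustration of what such a memory-to-memory test measures, here is a small Python sketch that pushes a fixed amount of data over a TCP connection and reports the rate. It is a stand-in, not a replacement for iPerf or NetIO: the function names and sizes are assumptions, and as written it only runs over the loopback interface of one machine. To exercise a router, the sending and receiving halves would have to run on hosts in different VLANs.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes handed to the socket per send call


def measure_throughput(total_bytes=32 * 1024 * 1024):
    """Send total_bytes over a local TCP connection.

    Returns (bytes_received, rate_in_mbit_per_s).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {"bytes": 0}

    def serve():
        # Receiver side: drain the connection until the sender closes it.
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                result["bytes"] += len(data)

    receiver = threading.Thread(target=serve)
    receiver.start()

    payload = b"\0" * CHUNK
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        sent = 0
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()  # wait until everything has been received
    elapsed = time.perf_counter() - start
    server.close()

    return result["bytes"], result["bytes"] * 8 / elapsed / 1e6


if __name__ == "__main__":
    received, rate = measure_throughput()
    print(f"received {received} bytes at {rate:.0f} Mbit/s")
```

Because no disk or file-sharing protocol is involved, a test of this kind separates network throughput from storage and application effects.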

            I don't think for this hardware and my setup I would only be transferring half a gigabit through pfsense.

            Me neither, but in many cases a user thinks he owns a real pfSense powerhouse, turns on many
            features, serves many other things and installs nearly every package he can find.  ;)
            OK, not really all of them, but many, and when traffic then goes through the pfSense box
            more slowly than he expected, all hell breaks loose.

            Unless I'm wrong about their hardware they are selling.

            No, why? They sell both types of hardware: the SG-8860, based on the Intel C2758 SoC, and
            the C2758 1U, which is the closer match to your Supermicro board. One thing, though: I really
            doubt that any of us could get it working the way they do, because their units ship with a pre-tuned,
            fine-tuned pfSense installation! Some hints and tips on what to enable would certainly be nice, but I
            don't think we could reproduce their configuration ourselves, and that is not something money can buy either.

            So no one else has issues with firewall throughput?

            Not with this board, as far as I can see; it is really powerful in my eyes.
            If your switch is a Layer 3 device, you could let it do the routing between
            the VLANs, which would be much faster and have lower latency. There are
            two camps on this, though: one prefers it and the other does not.

            • Did you enable the PowerD (Hiadaptive) option?
              It lets the CPU use all available frequencies as needed.
            • Do you use an SSD or mSATA drive with TRIM support enabled?
              Not urgent, but nice to have.
            • Did you raise the mbuf limit to 1,000,000?
              With your amount of RAM that will be no problem.

            If you are using LACP to handle the LAG, a single transfer is pinned to one line (cable or port), and
            the other lines only come into use for other flows. There are often many more options that you should try
            before claiming that the Supermicro hardware is slow or lame. You could experiment with settings such as
            active - active and active - passive, or consider a static LAG (without LACP); then you might be able to
            set up round robin with active - active, so that all links (LAN ports or cables in use) are filled
            together rather than only one.

            A LAG is mainly a way to work around a bottleneck, which mostly occurs when many users or clients
            connect to one server. It can be a nice feature to have in some rare cases, but a real 10 GbE or
            SFP+ uplink is often the better way around this.

            Dynamic LAG (LACP): automatic configuration via LACP
            Balancing: hashing algorithm
            Mode: active - active

            Static LAG (no LACP): manual configuration
            Balancing: round robin (one possibility)
            Mode: active - active

            If the LAG changes often (ports being added or removed), a dynamic LAG (LACP) may be the
            better choice; for a LAG that never changes, a static one could be better.
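The difference between the two balancing styles can be sketched in a few lines of Python. This is a toy model under stated assumptions: the flow label, the CRC32-based hash, and the `distribute` helper are made up for illustration and do not reproduce any vendor's implementation.

```python
import zlib
from collections import Counter


def distribute(frame_flows, n_links, policy):
    """Count how many frames each member link carries.

    frame_flows holds one flow label per frame, e.g. "nas->pc".
    """
    counts = Counter()
    for i, flow in enumerate(frame_flows):
        if policy == "hash":
            # Same flow label -> same hash -> same link, every time.
            link = zlib.crc32(flow.encode()) % n_links
        elif policy == "round-robin":
            # Frames are dealt out to the links in turn, regardless of flow.
            link = i % n_links
        else:
            raise ValueError(f"unknown policy: {policy}")
        counts[link] += 1
    return counts


# One large transfer (a single flow) of 600 frames over a 6-port LAG.
single_flow = ["nas->pc"] * 600
hashed = distribute(single_flow, 6, "hash")
round_robin = distribute(single_flow, 6, "round-robin")
print(len(hashed), "link(s) used with hashing")
print(len(round_robin), "link(s) used with round robin")
```

With hashing, all 600 frames land on one link; with round robin, each of the six links carries 100. The catch with round robin is that frames of one TCP stream can arrive out of order, which tends to hurt throughput, so it is not automatically the faster choice.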

            Again, if your switch is a Layer 3 device with one or more SFP+ ports, I would move the
            VLAN routing to it, and over a 10 GbE interface you might get better results and more speed.

            • FlashEngineer

              @Derelict:

              Your expectations on switched versus routed performance on a single TCP stream might be too high.

              Might be, but I migrated from my old box running Zeroshell on a similar LAGG with LACP but weaker hardware: an AMD Athlon X2 2.8 GHz dual core with Intel PCIe NICs. There I achieved higher throughput, not 900+ but around 800 Mbps or 85 MB/sec through the firewall.

              • Guest

                Zeroshell is based on Linux and is leaner; it is programmed closer to the hardware and
                therefore runs more smoothly, so it could well be that Linux performs better on older
                hardware.

                • FlashEngineer

                  @BlueKobold:

                  Which unit exactly are you comparing your board to?

                  I have the exact same server/MB/cpu as this.  The only improvements are 16GB ram vs 8gb, and dual 120GB SSD for mirror.  I even have the supermicro 4 port gigabit PCIe adapter they sell.

                  https://store.pfsense.org/C2758/

                  Four NICs are soldered onto the board. What kind of NICs are the others?

                  As stated, the supermicro 4 gigabit PCIe card.

                  Why a LAG (LACP) and not a 10 GbE or SFP+ uplink? Is this a Layer 2 or a Layer 3 switch?

                  Didn't think I need 10GbE uplink for my home network, I just need concurrent gigabit throughput, not more than that, but not half gigabit throughput either.  I have a C2960X in L2 mode.

                  How large is the file you are transferring? And how is pfSense bypassed in this case?
                  Is this a Layer 3 switch that routes between the VLANs itself, or does it only switch inside each VLAN?

                  About 7GB.  Within same VLAN ID, the switch will handle the transfer and not send the packets to the trunk LAGG on pfSense.  Switch again is L2 and not routing, pfSense is doing routing between VLANs.

                  A better test method is iPerf or NetIO, with one device acting as the server and another as the
                  client. That is independent of the application protocol and tells you more than a copy or FTP test.

                  Probably, but the actual use case is what matters here: FTP or SMB transfers. I tried different machines, and all show the same speed through pfSense.

                  Me neither, but in many cases a user thinks he owns a real pfSense powerhouse, turns on many
                  features, serves many other things and installs nearly every package he can find.  ;)
                  OK, not really all of them, but many, and when traffic then goes through the pfSense box
                  more slowly than he expected, all hell breaks loose.

                  I don't have anything cpu intensive turned on yet, like snort or squid etc.  It's basic setup just with multiple VLANs and rules between the vlans, that's it.

                  No, why? They sell both types of hardware: the SG-8860, based on the Intel C2758 SoC, and
                  the C2758 1U, which is the closer match to your Supermicro board. One thing, though: I really
                  doubt that any of us could get it working the way they do, because their units ship with a pre-tuned,
                  fine-tuned pfSense installation! Some hints and tips on what to enable would certainly be nice, but I
                  don't think we could reproduce their configuration ourselves, and that is not something money can buy either.

                  The hardware is exactly the same as I linked above with several improvements which shouldn't impact NIC performance between VLANs.  What exactly are they doing that is fine tuning that everyone else can't get?  Settings should be able to use for anyone.  This is open source right?

                  Not with this board, as far as I can see; it is really powerful in my eyes.
                  If your switch is a Layer 3 device, you could let it do the routing between
                  the VLANs, which would be much faster and have lower latency. There are
                  two camps on this, though: one prefers it and the other does not.

                  • Did you enable the PowerD (Hiadaptive) option?
                    It lets the CPU use all available frequencies as needed.
                  • Do you use an SSD or mSATA drive with TRIM support enabled?
                    Not urgent, but nice to have.
                  • Did you raise the mbuf limit to 1,000,000?
                    With your amount of RAM that will be no problem.

                  If you are using LACP to handle the LAG, a single transfer is pinned to one line (cable or port), and
                  the other lines only come into use for other flows. There are often many more options that you should try
                  before claiming that the Supermicro hardware is slow or lame. You could experiment with settings such as
                  active - active and active - passive, or consider a static LAG (without LACP); then you might be able to
                  set up round robin with active - active, so that all links (LAN ports or cables in use) are filled
                  together rather than only one.

                  A LAG is mainly a way to work around a bottleneck, which mostly occurs when many users or clients
                  connect to one server. It can be a nice feature to have in some rare cases, but a real 10 GbE or
                  SFP+ uplink is often the better way around this.

                  Dynamic LAG (LACP): automatic configuration via LACP
                  Balancing: hashing algorithm
                  Mode: active - active

                  Static LAG (no LACP): manual configuration
                  Balancing: round robin (one possibility)
                  Mode: active - active

                  If the LAG changes often (ports being added or removed), a dynamic LAG (LACP) may be the
                  better choice; for a LAG that never changes, a static one could be better.

                  Again, if your switch is a Layer 3 device with one or more SFP+ ports, I would move the
                  VLAN routing to it, and over a 10 GbE interface you might get better results and more speed.

                  Yup everything is setup except PowerD, 1000000 mbufs and trim enabled SSD.

                  I'm leaning towards the LAGG setup, maybe LACP isn't good but had no issues prior on my old setup with zeroshell.  no 10GbE on my switch, just a base catalyst 2960x model.

                  • heper

                    I've read posts in the past that claimed a drastic performance increase when enabling PowerD.

                    • GomezAddams

                      I don't think your port channel is the problem.

                      For grins, can you spin up a linux live distro on the hardware and configure it to just route between the VLANs for a comparison test? It would be interesting.

                      You might also try installing HyperV on the hardware and running pfsense as a virtual. You'll most likely see worse performance, but you never know…

                      I will warn you that setting up free HyperV outside a windows domain is a royal PITA. I can send you docs if you want to try.

                      It may be that you just lose that much performance by virtue of all the work that routing takes when you are doing it in software. There is a reason that Cisco can charge $$$ for their layer 3 switches.

                      • FlashEngineer

                        Yeah, it shouldn't be the port channel, since the file I'm transferring is hosted on my NAS, which has a 2-port LACP link to the switch. Even if I'm on the same VLAN, it goes through a port channel.

                        • Guest

                          About 7GB.  Within same VLAN ID, the switch will handle the transfer and not send the packets to the trunk LAGG on pfSense.  Switch again is L2 and not routing, pfSense is doing routing between VLANs.

                          Why should a 7 GB file have to go through the firewall? And then you wonder about the performance,
                          or think the LAG is mismatched? I don't think so.

                          Yup everything is setup except PowerD, 1000000 mbufs and trim enabled SSD.

                          OK, that is fine then.

                          I'm leaning towards the LAGG setup, maybe LACP isn't good but had no issues prior on my old setup with zeroshell.  no 10GbE on my switch, just a base catalyst 2960x model.

                          I have only a small and very cheap switch with 2 SFP ports: one is connected to the NAS and one to a
                          server. The pfSense firewall is connected to "only" a 1 GBit/s port, but in return it only has to
                          route the WAN - LAN traffic, while the switch does all the LAN routing. If the firewall fails
                          at some point, the LAN traffic keeps flowing without interruption.

                          The hardware is exactly the same as I linked above with several improvements which shouldn't
                          impact NIC performance between VLANs.

                          OK

                          What exactly are they doing that is fine tuning that everyone else can't get?

                          Because they know the hardware that ships with the pre-installed version of pfSense,
                          they can apply tuning that matches exactly this hardware and unleashes its full power.
                          And yes, it is the same version we both are using, only with some tuning, because
                          when they sell the hardware they know exactly what is being sold. For the community
                          version no such tuning can be done, because the developers do not know what
                          kind of hardware we are all using or will use!

                          Settings should be able to use for anyone.

                          Yes, for sure they are, but I really don't think we all have as much wisdom and deep knowledge
                          about pfSense as the developers do! And if they know the hardware because they sell it themselves,
                          they can do more than we are able to. Or how much do you know about such things?

                          This is open source right?

                          Yes, open source for sure. But if you offer only the software, without knowing what kind of hardware
                          will be in use at the customer site, what would you tune or pre-tune? If you sell the hardware and
                          the software together, pre-installed, you know the hardware platform exactly and can pre-install
                          and tune the very same pfSense community version we are all using, but with the deeper knowledge
                          of the developers that the rest of us will never have. No more, but also no less.

                          But back to your problem: what settings were you using in ZeroShell?
                          Are they the same as now? And again, a LAG is more for the use case where
                          many clients connect to one other device, like your NAS, because each
                          transfer is placed on one line and other transfers can then use the next one.

                          • FlashEngineer

                            HiAdaptive did not do anything :(

                            Zeroshell was the same as what pfSense is doing, LACP with 6 ports.

                            I had no issues with zeroshell, ran about 850mbps throughput.  And yes LAG is primarily for multiple users connected to my NAS to transfer files.

                            • GomezAddams

                              I think you'd be wise to do a wireshark capture of an FTP session to look for things like retransmissions or tcp zero windows. You might be able to tweak your systems' tcp parameters to get better throughput.

                              Ordinarily, I recommend against using jumbo frames on gigabit (and even on 10gb except for iSCSI), but in your case reducing the number of packets that pfsense has to look at might boost your performance.
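To put a number on the jumbo-frame idea, the following Python sketch estimates how many full-size frames per second a saturated gigabit link carries at two MTU values. The 38-byte per-frame overhead is the usual Ethernet framing cost (14-byte header, 4-byte FCS, 8-byte preamble, 12-byte inter-frame gap); the 802.1Q VLAN tag and IP/TCP headers are ignored here, so treat the results as approximations.

```python
ETHERNET_OVERHEAD = 38  # header 14 + FCS 4 + preamble 8 + inter-frame gap 12


def packets_per_second(link_bps, mtu, overhead=ETHERNET_OVERHEAD):
    """Frames per second on a saturated link carrying only full-size frames."""
    bits_on_wire = (mtu + overhead) * 8
    return link_bps / bits_on_wire


standard = packets_per_second(1_000_000_000, 1500)
jumbo = packets_per_second(1_000_000_000, 9000)
print(f"MTU 1500: {standard:,.0f} packets/s")
print(f"MTU 9000: {jumbo:,.0f} packets/s")
print(f"reduction factor: {standard / jumbo:.1f}x")
```

At line rate that is roughly 81,000 packets per second with standard frames versus roughly 13,800 with 9000-byte jumbo frames, so the router has to make about one sixth as many per-packet forwarding decisions for the same amount of data.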

                              Lastly, you might want to consider installing ESX or HyperV (ESX probably wouldn't have drivers for your supermicro NIC) and use pfsense for firewall, and something like zeroshell for intervlan routing.

                              Or, buy a layer three switch.

                              • Guest

                                HiAdaptive did not do anything

                                Oh, that's a pity. Normally it does the following: when the machine is under load it uses the full
                                2.4 GHz, and when less power is needed it saves electricity by running the CPU at only 600 MHz or
                                800 MHz as required. If it is not enabled, the CPU frequency may stay at a static 600 MHz or
                                800 MHz, which will not deliver the performance and the throughput that you need from time
                                to time!

                                Zeroshell was the same as what pfSense is doing, LACP with 6 ports.

                                Do you remember settings like "active - active", or anything else that you are not
                                using or configuring this time with pfSense?

                                • FlashEngineer

                                  @BlueKobold:

                                  HiAdaptive did not do anything

                                  Oh, that's a pity. Normally it does the following: when the machine is under load it uses the full
                                  2.4 GHz, and when less power is needed it saves electricity by running the CPU at only 600 MHz or
                                  800 MHz as required. If it is not enabled, the CPU frequency may stay at a static 600 MHz or
                                  800 MHz, which will not deliver the performance and the throughput that you need from time
                                  to time!

                                  Zeroshell was the same as what pfSense is doing, LACP with 6 ports.

                                  Do you remember settings like "active - active", or anything else that you are not
                                  using or configuring this time with pfSense?

                                  It was just configured in the interfaces file, like on any Linux distro.

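                                  # Debian-style interfaces stanza for the bonded uplink
                                  auto bond0
                                  iface bond0 inet static
                                  address 192.168.1.10
                                  gateway 192.168.1.1
                                  netmask 255.255.255.0
                                  # mode 4 is 802.3ad (LACP)
                                  bond-mode 4
                                  # check link state every 100 ms
                                  bond-miimon 100
                                  # member NICs are attached in their own stanzas
                                  bond-slaves none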
                                  

                                  Something like that, there's nothing really to state active or passive.

                                  I do have a layer 3 switch. The Cisco C2960X-48TS-L is not a full L3 switch, but it has routing capabilities and ACLs.

                                  I just don't know if I can define all the same rules in the switch and also allow certain hosts/networks outbound on pfSense to different gateways (OpenVPN clients)

                                  From what I've read, you're supposed to use a transit network from the switch to pfsense so pfsense doesn't really know the internal vlans of the switch.  In this case I don't think I can selectively route traffic outbound to different OpenVPN gateways.

                                  • Guest

                                    From what I've read, you're supposed to use a transit network from the switch to pfsense so pfsense doesn't really know the internal vlans of the switch.  In this case I don't think I can selectively route traffic outbound to different OpenVPN gateways.

                                    You could create a VLAN 50, for example, and the gateway of this VLAN 50 would then be the IP
                                    address of the pfSense box! Then you can set up routes to all the other VLANs and everything will be fine. That's it.

                                    OK, perhaps you don't want to go this way, but it is a really fine solution: all LAN traffic gets routed
                                    at nearly wire speed, depending on the power of your switch, and the entire LAN stays alive if the
                                    pfSense box is rebooted or fails.

                                    • GomezAddams

                                      Your C2960X-48TS-L is not a layer 3 switch, it runs the LANBase feature set. No routing possible.

                                      If your switch is a C2960XR switch, then you are in tall cotton - by all means use it for your inter-VLAN routing and inter-VLAN access control lists.

                                      Set up your VLANs on your switch and use it to route between them. Create a private IP network between the switch and the pfsense box and make the switch's default route the IP address of pfsense.

                                      I don't think the 2960XR will originate any routing protocols, so you'll have to create routes on the pfsense box that route your VLAN subnets back to the switch.
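The transit-network layout described above can be sanity-checked with a short Python sketch before touching any device. All addresses, subnets, and the `pfsense_static_routes` helper are hypothetical; the function only produces a list of (destination, gateway) pairs of the kind you would then enter as static routes on the pfSense side.

```python
import ipaddress


def pfsense_static_routes(vlan_subnets, transit_net, switch_ip):
    """Return (destination, gateway) pairs pointing each VLAN subnet at the switch.

    Raises ValueError if the switch address is outside the transit network
    or if a VLAN subnet overlaps the transit network.
    """
    transit = ipaddress.ip_network(transit_net)
    gateway = ipaddress.ip_address(switch_ip)
    if gateway not in transit:
        raise ValueError(f"{gateway} is not inside transit network {transit}")

    routes = []
    for subnet in vlan_subnets:
        net = ipaddress.ip_network(subnet)
        if net.overlaps(transit):
            raise ValueError(f"{net} overlaps transit network {transit}")
        routes.append((str(net), str(gateway)))
    return routes


# Hypothetical plan: two VLANs routed on the switch, a /30 transit link to pfSense.
routes = pfsense_static_routes(
    ["192.168.10.0/24", "192.168.20.0/24"],
    transit_net="10.255.255.0/30",
    switch_ip="10.255.255.2",
)
for destination, gateway in routes:
    print(f"route {destination} via {gateway}")
```

In this plan the switch's default route would point at the pfSense end of the transit link (10.255.255.1 in this made-up addressing), so only internet-bound traffic crosses the firewall.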

                                      You will be much happier with this setup.

                                      • Derelict LAYER 8 Netgate

                                        I wish the netgate guys would chime in on threads like this.


                                        • FlashEngineer

                                          Sorry didn't reply back for a while on this thread.

                                          I think I've figured out the issue, maybe not, who knows.

                                          But basically the LAGG algorithm is sending and receiving the file transfer on the same port on pfSense, so it's doing a full-duplex transfer. Now theoretically, gigabit Ethernet can handle 2000 Mbps total. But I ran iperf between 2 machines using the simultaneous option, and the max I was able to get was about 450 Mbps both ways at the same time. Not sure why. Anyhow, when I transfer a file in the other direction, the algorithm uses 2 ports on pfSense, so then I'm getting closer to 1 Gb in that direction.

                                          Either way, I think I will upgrade to 10Gbe with the Chelsio card, that should solve any Gb bottlenecks.

                                          • Guest

                                            But basically the LAGG algorithm is sending/receiving the file transfer on the same port on pfSense, so it's doing full duplex transfer.

                                            I am not really sure, but it all depends on the configuration you made! You can also configure it so
                                            that one LAN port handles RX and the other handles TX. Then you get:

                                            • 1 GBit/s > TX
                                            • 1 GBit/s > RX

                                            And that might then amount to only 1 GBit/s and not 2 GBit/s! But the LAG (LACP) as a whole
                                            certainly forms an aggregated 2 GBit/s fat pipe!

                                            Now theoretically, the gigabit ethernet can handle 2000mbps total.

                                            That is exactly the point where, in my eyes, you are making an error in reasoning!
                                            A 1 GBit/s line (cable) can send and receive 1 GBit/s over the 4 wire pairs of the cable in each
                                            direction; that is 1 GBit/s in each direction, not 2 GBit/s in one direction.

                                            But I ran iperf between 2 machines using the simultaneous option, and the max I was able to get was about 450 Mbps both ways at the same time. Not sure why.

                                            The theoretical maximum throughput of a 1 GBit/s line is 125 MByte/s (1,000 MBit/s) in each
                                            direction. Your 450 MBit/s is only about 56 MByte/s, so even after allowing for the TCP/IP
                                            overhead that has to be added on top, you are still well below that maximum, or am I
                                            wrong about this?
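Since megabits and megabytes get mixed up easily in this thread, here is the conversion as a tiny Python sketch, applied to the figures quoted above. It assumes decimal units (1 MByte = 8 MBit); if a tool reports MiB/s instead, the MBit/s values come out roughly 5% higher.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert MBit/s to MByte/s (8 bits per byte, decimal prefixes)."""
    return mbit_per_s / 8


def mbyte_to_mbit(mbyte_per_s):
    """Convert MByte/s to MBit/s (8 bits per byte, decimal prefixes)."""
    return mbyte_per_s * 8


print(mbit_to_mbyte(1000))  # gigabit line rate in MByte/s
print(mbit_to_mbyte(450))   # the simultaneous iperf result in MByte/s
print(mbyte_to_mbit(95))    # same-VLAN FTP transfer in MBit/s
print(mbyte_to_mbit(40))    # inter-VLAN transfer through pfSense in MBit/s
```

So a gigabit link tops out at 125 MByte/s per direction, 450 MBit/s is 56.25 MByte/s, and the 95 MB/sec and 40 MB/sec file transfers correspond to 760 and 320 MBit/s of payload respectively.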

                                            Anyhow  when I transfer a file the other direction, the algorithm uses 2 ports on pfSense, so then I'm getting closer to 1Gb in that direction.

                                            Then perhaps the network load you produced with iPerf was not high enough?

                                            Either way, I think I will upgrade to 10Gbe with the Chelsio card, that should solve any Gb bottlenecks.

                                            In my eyes it is the best option today! The Chelsio card offloads tasks such as VLAN handling
                                            to an ASIC/FPGA on the NIC, and it has better driver support in pfSense. So you can
                                            offload many TCP/IP tasks from your pfSense box, and on top of that you will save ports and
                                            get more throughput than now.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.