Best way to reduce game latency



  • Right now I have a 15/3 Mbit cable connection and would like to keep my latency low while playing World of Warcraft while other people watch Netflix at the same time.  I set up HFSC with the wizard and the game runs smooth, but I suffer from occasional lag spikes (not as many as I had before setting up traffic shaping, though).

    My question is:
    - If using HFSC, how do I tell the shaper which rule has higher priority, so I can put Netflix at the back?
    - Should I use some other scheduler to achieve this?

    thanks in advance.



  • HFSC does not have a notion of priority. The question is whether the lag spikes are caused by misconfiguration or by not being aggressive enough with limiting bandwidth.

    If HFSC is too difficult to troubleshoot yourself, you may be better off with fq_codel. This long discussion shows how to set it up with 2.4.x, nearer the end: https://forum.pfsense.org/index.php?topic=126637.0

    It is turn-key easy: just set your bandwidth. I actually plan on going this route in the near-ish future, but HFSC has been working near-perfectly for me for a long while.



  • Which of the 15 pages in that thread is 'near the end'?  ;D  Every page is crammed with tech jargon and CLI incantations, with every Tom, Dick, and Harry chiming in.



  • @Harvy66:

    HFSC does not have a notion of priority.

    OK, I understand this now.

    @Harvy66:

    The question is whether the lag spikes are caused by misconfiguration or by not being aggressive enough with limiting bandwidth.

    I wish I knew which one it is. I'm not an expert at traffic shaping, so I didn't really fiddle much with the queues; they're all set to whatever values the wizard chose. I only copied the floating rules to adapt to the different games I play.

    Thanks for the suggestion, I'll read the topic you linked.



  • Yes, it is true that the thread on fq_codel has gotten quite long.  However, most of those posts are us trying to figure out how to make the deployment simple (since it's not in the GUI yet) and how to properly tweak the algorithm's parameters.

    Having said that, here is the basic implementation in a few steps:

    1)  Set up limiters - at minimum you'll need to create two root limiters and then create one queue under each root limiter.  You can set up more queues if required/desired.  This is also where you set your bandwidth limits.
    2)  Apply the queues to the necessary firewall rules (e.g. to the LAN rule(s) that allow your outbound traffic, in the "In/Out Pipe" section).
    3)  Enable fq_codel via the command line (you can SSH into the firewall for that) by issuing the following command:

    
    ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel
    
    

    To validate that the command has indeed enabled fq_codel, issue this command:

    
    ipfw sched show
    
    

    If all looks good (you should now see fq_codel listed in the output), go ahead and test whether performance is acceptable.  If not, you can make changes by tweaking the algorithm's default parameters and/or your bandwidth limits.  For instance, you may have to increase the algorithm's target latency if you have a connection with a slower upload speed, or decrease your bandwidth limits if e.g. your upload/download speeds aren't stable (see the sketch in the notes below).

    4)  To make sure that your settings stick between reboots, install the Shellcmd add-on package in pfSense.  Once you have done that, add the command from step 3 to Shellcmd.

    Some additional notes:
    1)  On setting up limiters:  See post #121 in the thread:  https://forum.pfsense.org/index.php?topic=126637.msg754199#msg754199
    2)  On tweaking algorithm parameters:  See post #198 (and following) in the thread:  https://forum.pfsense.org/index.php?topic=126637.msg769665#msg769665
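
    3)  On tweaking parameters from the command line: you change them by re-issuing the sched command with explicit values. Here is a hedged sketch; the values below are illustrative rather than recommendations, and target, interval, quantum, limit, flows, and ecn are the knobs that appear in the ipfw sched show output:

    
    # Illustrative only: a higher target suits slower uplinks; a smaller quantum favors small packets
    ipfw sched 1 config pipe 1 type fq_codel target 10ms interval 100ms quantum 300
    ipfw sched 2 config pipe 2 type fq_codel target 10ms interval 100ms quantum 300
    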

    Hope this helps.



  • Thank you so much, tman222, this has solved 50% of my problems… now my question is: how do I set up traffic shaping so I can prioritize games over Netflix?



  • Are you seeing an issue that requires you to use priorities?



  • @Harvy66:

    Are you seeing an issue that requires you to use priorities?

    Even though I can tell bufferbloat is almost gone from the dslreports tests, I can still see increased latency when someone is downloading a big file or watching Netflix.



  • Have you tried further reducing your provisioned bandwidth just to see if it helps? In theory, properly configured fq_codel should pretty much keep the latency of other flows to within a few milliseconds, except when two flows get assigned to the same bucket.



  • @Harvy66:

    …except when two flows get assigned to the same bucket.

    How can I monitor buckets?



  • @areynot:

    @Harvy66:

    Are you seeing an issue that requires you to use priorities?

    Even though I can tell bufferbloat is almost gone from the dslreports tests, I can still see increased latency when someone is downloading a big file or watching Netflix.

    There are a few more things you can try before considering moving on to a different scheduling algorithm:

    1)  As Harvy66 already mentioned, you can try reducing your bandwidth limits further (especially if your upload/download speeds aren't very stable).
    2)  To favor interactive flows, try reducing the quantum parameter from the default.  Start reading here for more details on how to go about this:  https://forum.pfsense.org/index.php?topic=126637.msg769665#msg769665
    3)  One last thing you could do is create multiple queues under your limiters and assign each of them a weight.  For instance, let's assume you had two queues under each limiter, each with weight 50.  This would guarantee each queue at least 50% of the bandwidth, and more (up to the full amount) when the other queue has no traffic in it.  Once you have those queues created, you'd create the relevant firewall rules to match your game traffic, assign one set of queues to them, and then assign the other set of queues to the rule that handles the remaining traffic (e.g. web traffic, Netflix, etc.).  This is a little trickier to set up, so I would recommend trying 1) and 2) first; a sketch of the underlying limiter configuration follows below.
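
    To make 3) concrete, here is a hedged sketch of what the generated /tmp/rules.limiter could look like with two 50/50 weighted queues per limiter.  The pipe numbers and bandwidths are illustrative (based on the 15/3 connection mentioned earlier), and the mask/weight syntax follows the rules.limiter output posted later in this thread:

    
    # download pipe with two equally weighted queues (illustrative values)
    pipe 1 config bw 15Mb
    queue 1 config pipe 1 weight 50 mask dst-ip 0xffffffff
    queue 2 config pipe 1 weight 50 mask dst-ip 0xffffffff
    # upload pipe with two equally weighted queues
    pipe 2 config bw 3Mb
    queue 3 config pipe 2 weight 50 mask src-ip 0xffffffff
    queue 4 config pipe 2 weight 50 mask src-ip 0xffffffff
    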

    Hope this helps.



  • Thank you so much for helping me… I will try all of these options and let you know how it goes =)



  • @tman222:

    @areynot:

    @Harvy66:

    Are you seeing an issue that requires you to use priorities?

    Even though I can tell bufferbloat is almost gone from the dslreports tests, I can still see increased latency when someone is downloading a big file or watching Netflix.

    3)  One last thing you could do is create multiple queues under your limiters and assign each of them a weight.  For instance, let's assume you had two queues under each limiter, each with weight 50.  This would guarantee each queue at least 50% of the bandwidth, and more (up to the full amount) when the other queue has no traffic in it.  Once you have those queues created, you'd create the relevant firewall rules to match your game traffic, assign one set of queues to them, and then assign the other set of queues to the rule that handles the remaining traffic (e.g. web traffic, Netflix, etc.).  This is a little trickier to set up, so I would recommend trying 1) and 2) first.

    Can you please give me a hand with this?

    Do I create a floating rule for games and assign one of the two 50/50 pipes to it, and leave the other one on the default LAN rule?

    Thanks so much, tman222.



  • I have similar needs, except I'm trying to make a second set of queues for my Twitch stream traffic so that I don't drop RTMP packets, but the weight settings don't seem to make any difference. If I stream and run a speed test, a consistent number of stream packets get dropped. Chances are I have something misconfigured, but at least the test is easy for me to reproduce, so I'll know if I get it working; if I figure it out I'll chime in, because doing the same with my game traffic wouldn't work any differently.



  • I have some "issues" with the download queue, so I'd like to tell you what I have done so far.

    I am on pfSense 2.4.2_1 and I have a symmetrical 1000 Mbit line. The dslreports
    image from before is attached, and my dslreports result looks as expected (image_1…).

    1. Creating limiters (screenshots attached for the upload part; the download part is the same but with a different name)
    • Upload (limited to 900 Mbit)

    • highUp 75

    • defaultUp 25

    • lowUp 5

    • Download (limited to 900 Mbit)

    • HighDown

    • defaultDown

    • lowDown

    2. Creating floating rules
      I created six floating rules in total, but I'm only going to show the default ones in the screenshots;
      the other ones are basically clones anyway.

    3. Installing the Shellcmd package and adding
      ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel

    4. Horrible results; something is not working right on the download side, dunno what it is :D

    I added an imgur album so you can just take a look at all the screenshots: https://imgur.com/a/bkIuA. Maybe @tman222 has an idea what I am doing wrong :(



  • @zwck:

    I have some "issues" with the download queue, so I'd like to tell you what I have done so far.

    I am on pfSense 2.4.2_1 and I have a symmetrical 1000 Mbit line. The dslreports
    image from before is attached, and my dslreports result looks as expected (image_1…).

    1. Creating limiters (screenshots attached for the upload part; the download part is the same but with a different name)
    • Upload (limited to 900 Mbit)

    • highUp 75

    • defaultUp 25

    • lowUp 5

    • Download (limited to 900 Mbit)

    • HighDown

    • defaultDown

    • lowDown

    2. Creating floating rules
      I created six floating rules in total, but I'm only going to show the default ones in the screenshots;
      the other ones are basically clones anyway.

    3. Installing the Shellcmd package and adding
      ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel

    4. Horrible results; something is not working right on the download side, dunno what it is :D

    I added an imgur album so you can just take a look at all the screenshots: https://imgur.com/a/bkIuA. Maybe @tman222 has an idea what I am doing wrong :(

    I have not set this up with matching floating rules before, but one thing I noticed right away looking at your screenshots is that you are missing the source and destination masks in your upload and download queues.

    For each of your download queues, choose "Destination addresses" for the Mask.  For each of your upload queues, choose "Source addresses" for the Mask.
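
    To see the effect, the generated /tmp/rules.limiter should then contain mask lines like the following (a sketch; the exact pipe and queue numbers depend on creation order, as the rules.limiter output later in this thread shows):

    
    # download queue: dynamic per-destination queues (Mask = Destination addresses)
    queue 1 config pipe 1 mask dst-ip 0xffffffff
    # upload queue: dynamic per-source queues (Mask = Source addresses)
    queue 2 config pipe 2 mask src-ip 0xffffffff
    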

    Hope this helps.



  • fq_codel is great at reducing latency on its own. Adding complexity by having more queues may actually make it worse, though of course that's not related to the issue you're seeing.



  • @tman222:

    stuff

    Not sure what you mean here. Would you mind sending me some screenshots or uploading them here? I thought the floating rules were necessary. I just added Source for my upload limiters and Destination for my download limiters, with the same results :(



  • @Harvy66:

    fq_codel is great at reducing latency on its own. Adding complexity by having more queues may actually make it worse, though of course that's not related to the issue you're seeing.

    What would be the easiest setup here? I don't mind not dealing with queues :D



  • @zwck:

    @tman222:

    stuff

    Not sure what you mean here. Would you mind sending me some screenshots or uploading them here? I thought the floating rules were necessary. I just added Source for my upload limiters and Destination for my download limiters, with the same results :(

    Actually, the most basic setup requires only an upload and a download limiter with one queue under each, and no matching firewall rules.

    Here's how you would set that up:

    First, remove your existing settings including your matching firewall rules you created for fq_codel.

    Next:
    1)  Create an upload and a download limiter and set their bandwidth limits.
    2)  Create one queue under the upload limiter (in your case, let's call it "in") and make sure the Mask field is set to "Source Addresses".  Leave the Weight field empty.
    3)  Create one queue under the download limiter (in your case, let's call it "out") and make sure the Mask field is set to "Destination Addresses".  Leave the Weight field empty.
    4)  Next go to your LAN interface and find the rule that allows outbound traffic to the internet (e.g. your default allow-all rule).  Under that rule's settings, go to Advanced Options, In/Out Pipe.
    5)  For the In Pipe, use the queue you created under the upload limiter, in your case the "in" queue.
    6)  For the Out Pipe, use the queue you created under the download limiter, in your case the "out" queue.
    7)  Enable fq_codel with this command:  ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel
    8)  Run a speed test and check for bufferbloat (a sketch of the resulting dummynet configuration follows below).
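
    If everything is configured as above, the generated /tmp/rules.limiter should end up looking roughly like this (a sketch: the bandwidths are illustrative and the pipe numbering depends on the order in which the limiters were created, as the rules.limiter output later in this thread shows):

    
    pipe 1 config bw 900Mb
    queue 1 config pipe 1 mask dst-ip6 /128 dst-ip 0xffffffff
    
    pipe 2 config bw 900Mb
    queue 2 config pipe 2 mask src-ip6 /128 src-ip 0xffffffff
    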

    Harvy66 is right that fq_codel is pretty good at reducing latency without having to filter traffic into different queues first and then applying fq_codel.  In my case I'm only using multiple weighted queues to control the total amount of bandwidth available to different VLANs, rather than the amount of bandwidth available to different traffic on the same interface/VLAN.  The latter may still be possible (e.g. with matching firewall rules), but unfortunately I don't have any specific experience with such a setup.

    Hope this helps.



  • I must be doing something wrong.



  • @zwck:

    I must be doing something wrong.

    Try this:

    1. On the command line issue this command:  ipfw pipe flush
    2. Then go ahead and reset your firewall states.
    3. Then issue this command on the command line:  ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel
    4. Try another speed test.

    What do the results look like now?

    Hope this helps.



  • @tman222:

    @zwck:

    I must be doing something wrong.

    Try this:

    1. On the command line issue this command:  ipfw pipe flush
    2. Then go ahead and reset your firewall states.
    3. Then issue this command on the command line:  ipfw sched 1 config pipe 1 type fq_codel && ipfw sched 2 config pipe 2 type fq_codel
    4. Try another speed test.

    What do the results look like now?

    Hope this helps.

    First off, thank you for helping me! That's really great! Unfortunately this did not change the outcome significantly; at least, I get the same result.

    Could there be anything else besides the traffic shaper that influences this? What's surprising to me is that the upload part of the speed test just works flawlessly, no bufferbloat and constant high throughput; it's only the download that really is not working well. When I remove the traffic shaper it's the opposite.



  • After updating to 2.4.3, no change.



  • @zwck:

    After updating to 2.4.3, no change.

    Something still seems off here.  Do you have any other firewall rules (floating or otherwise) or traffic-shaping settings enabled that are impacting traffic coming to or from your LAN and/or WAN?  Besides setting up the limiters and queues, are there any other changes you made while trying to implement fq_codel that you might have forgotten to undo?  Can you provide screenshots again so we can see if anything does not look correct?  Also, what happens if you raise the limiters to 930 or 940 Mbit?  Any difference?

    Hope this helps.



  • Hey tman222,

    So, I have some port-forwarding rules activated for some services on other machines, but other than that nothing really. As you suggested, I put the in and out pipe on the LAN rule instead of creating floating rules, and deactivated/deleted all the other rules I had on. When I come home from work I'll upload some screenshots or a video. Maybe there is something obviously wrong and I am just too much of a beginner.  Thanks again for all the help and effort you've put into my problems.



  • Hey tman222,

    So here are basically all my settings regarding firewalling and limiters. Could I have messed something up with NAT or DNS that could cause a problem like this?

    https://imgur.com/a/5z4zM

    Edit/Update:
    When I limit the download to 500 Mbit I don't get any bufferbloat; as soon as I go above that, it feels like the download just crashes… any suggestions are welcome.



  • @zwck:

    Hey tman222,

    So here are basically all my settings regarding firewalling and limiters. Could I have messed something up with NAT or DNS that could cause a problem like this?

    https://imgur.com/a/5z4zM

    Edit/Update:
    When I limit the download to 500 Mbit I don't get any bufferbloat; as soon as I go above that, it feels like the download just crashes… any suggestions are welcome.

    The only things I see right now in those WAN rules that I'm a little suspicious of are the two haproxy rules that pass HTTP/HTTPS traffic on ports 80 and 443.  What does this NAT redirect do exactly?  If you disable those two rules temporarily, does it make a difference?

    Also, are you running any IDS/IPS (e.g. Snort) on your interfaces?  If so, do you see any improvement when you disable it?

    What are the hardware specs of your pfSense box?

    Hope this helps.



  • Hi,

    The haproxy rules direct incoming traffic on ports 80 and 443 to the internal haproxy, which routes to my personal blog and a speed test, https://speed.zwck.de, so nothing critical. However, if I disable the haproxy rules the results are the same. I also don't have Snort running.

    My system is an older i5 system with 4 GB of RAM and 4 Intel NICs. I am thinking maybe something is set up wrongly in the general setup, maybe DNS? I really have no idea.

    The thing is, if I flush the pipes ;) (ipfw pipe flush and reload the filters), the sched resets to WF2Q+, of course; when I then perform the dslreports speed tests, the speeds are as expected: 900 Mbit, quite constant, and with limited bufferbloat. However, when I have fq_codel on, the download just crashes hard: it goes up to 900, then stops (bufferbloat 35 seconds), then drops to 40 Mbit with an average of 350 or so. It's really weird. I checked my CPU performance, states, and all, but nothing seems to bottleneck this.



  • @zwck:

    Hi,

    The haproxy rules direct incoming traffic on ports 80 and 443 to the internal haproxy, which routes to my personal blog and a speed test, https://speed.zwck.de, so nothing critical. However, if I disable the haproxy rules the results are the same. I also don't have Snort running.

    My system is an older i5 system with 4 GB of RAM and 4 Intel NICs. I am thinking maybe something is set up wrongly in the general setup, maybe DNS? I really have no idea.

    The thing is, if I flush the pipes ;) (ipfw pipe flush and reload the filters), the sched resets to WF2Q+, of course; when I then perform the dslreports speed tests, the speeds are as expected: 900 Mbit, quite constant, and with limited bufferbloat. However, when I have fq_codel on, the download just crashes hard: it goes up to 900, then stops (bufferbloat 35 seconds), then drops to 40 Mbit with an average of 350 or so. It's really weird. I checked my CPU performance, states, and all, but nothing seems to bottleneck this.

    Thanks for the additional information.  Your particular case is indeed interesting because fq_codel looks like it's working fine on the upload side, but not on the download for some reason.  It seems like there is a constraint somewhere, whether it's physical or some type of processing constraint.

    In any case, there are a few more things we can try:

    1. If you increase the limiters from 900 Mbit to 930 or 940 Mbit, do you see any difference?
    2. Regarding your system specs, what make and model of Intel NICs do you have in your system?
    3. Given that yours is a very fast connection (symmetric gigabit), we might want to try tuning the NIC parameters a bit to see if it helps:

    For example, see these two threads and pfSense wiki entry:

    https://forum.pfsense.org/index.php?topic=113496.0
    https://forum.pfsense.org/index.php?topic=132345
    https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards

    In particular, I would be curious about the rx and tx descriptors (rxd, txd), the rx and tx process limits, the number of queues, and the nmbclusters setting on your system.

    You can easily access these values from the command line, e.g.:  sysctl -a | grep hw.igb.txd  and so on.  Do note that depending on the type of Intel NICs you have, you may need to use "em" instead of "igb".
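
    For example, to read the relevant values in one go (a sketch; these are the stock FreeBSD igb(4) and kernel sysctl names):

    
    # Dump descriptor counts, process limits, queue count, and mbuf cluster limit
    sysctl hw.igb.rxd hw.igb.txd hw.igb.rx_process_limit hw.igb.tx_process_limit hw.igb.num_queues kern.ipc.nmbclusters
    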

    I actually also have a symmetric gigabit fiber connection and was able to improve performance somewhat after tuning some of these parameters.

    Hope this helps.



  • Hey tman222,

    Thanks, man, for the help. When I up the limit to 930 or 940 the same thing happens; no real improvement.

    The NICs are https://ark.intel.com/products/64404/Intel-Ethernet-Controller-I211-AT and if I check the parameters, the following shows up.

    These are my current values; maybe I should play around with them.

    hw.igb.txd: 1024
    hw.igb.rxd: 1024
    
    net.pf.states_hashsize: 32768
    net.pf.source_nodes_hashsize: 8192
    
    hw.igb.tx_process_limit: -1
    hw.igb.rx_process_limit: 100  
    
    net.inet.tcp.syncache.hashsize: 512
    net.inet.tcp.syncache.bucketlimit: 30
    

    If I want to change them, I most likely have to put them into System Tunables, right?



  • @zwck:

    Hey tman222,

    Thanks, man, for the help. When I up the limit to 930 or 940 the same thing happens; no real improvement.

    The NICs are https://ark.intel.com/products/64404/Intel-Ethernet-Controller-I211-AT and if I check the parameters, the following shows up.

    These are my current values; maybe I should play around with them.

    hw.igb.txd: 1024
    hw.igb.rxd: 1024
    
    net.pf.states_hashsize: 32768
    net.pf.source_nodes_hashsize: 8192
    
    hw.igb.tx_process_limit: -1
    hw.igb.rx_process_limit: 100  
    
    net.inet.tcp.syncache.hashsize: 512
    net.inet.tcp.syncache.bucketlimit: 30
    

    If I want to change them, I most likely have to put them into System Tunables, right?

    Hi again,

    Yes, you can change those settings either in the System Tunables section under Advanced Settings, or you can put them in /boot/loader.conf.local.

    To begin, I would change the following:

    hw.igb.txd: 2048
    hw.igb.rxd: 2048

    hw.igb.tx_process_limit: -1
    hw.igb.rx_process_limit: -1    (100 is probably too low for a fast connection like yours).

    Also, what value do you have for kern.ipc.nmbclusters?  If it's less than 131072, I would change it to 131072 to start and see if that offers any improvement, as outlined here:

    https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards
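
    Put together, the corresponding /boot/loader.conf.local entries would look something like this (a sketch; loader tunables use the name="value" form and take effect after a reboot):

    
    hw.igb.txd="2048"
    hw.igb.rxd="2048"
    hw.igb.tx_process_limit="-1"
    hw.igb.rx_process_limit="-1"
    kern.ipc.nmbclusters="131072"
    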


    Let's see if changing those parameters offers some improvement.  Hope this helps.



  • So I completely reinstalled pfSense from scratch and just set up the traffic shaper. Same results as before.

    Then i added

    hw.igb.txd: 2048
    hw.igb.rxd: 2048

    hw.igb.tx_process_limit: -1
    hw.igb.rx_process_limit: -1

    but besides using more memory, nothing really changed. My kern.ipc.nmbclusters value is twice that much.  What's next? It's 3 AM and I just restored everything to the previous state… :( Thanks, tman222, for all your help; I am really clueless :(



  • Hmmm, this is indeed perplexing and I'm running out of ideas, unfortunately.  However, there's an alternative we can try.  Instead of using dummynet (limiters) and fq_codel, we can emulate the behavior of fq_codel using ALTQ traffic shaping with the FAIRQ scheduler and CoDel-controlled queues.  The performance of this is similar to fq_codel.  Would you be willing to try that?

    Here's how you would set it up:

    1. First, remove all your fq_codel limiters and associated queues, both from Firewall/Traffic Shaper and from your firewall rules.
    2. Next go to the Firewall/Traffic Shaper/By Interface tab.
    3. For your WAN interface, choose scheduler type FAIRQ and set the bandwidth to 900 Mbit/s.  Check "Enable/disable discipline and its children" and hit Save.
    4. Next go to the bottom and click "Add new Queue".
    5. In the queue settings choose a name, then choose the default priority of 1.  For "Queue Limit", choose either 512 or 1024 (the default is 50, which is too low for your connection speed).  For scheduler options, check "Default Queue" and "Codel Active Queue".  For bandwidth, choose 900 Mbit/s.  Check "Enable/disable discipline and its children".  Click Save to save the queue settings.
    6. Repeat steps 3-5 for your LAN interface (a rough sketch of the resulting ALTQ rules follows below).
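
    For reference, the ALTQ rules pfSense generates for this setup (visible in /tmp/rules.debug) should look roughly like the sketch below.  Treat it as illustrative only: the exact syntax varies by pfSense version, and the interface and queue name here are placeholders.

    
    # hypothetical interface igb0 and queue name qFairqDefault
    altq on igb0 fairq bandwidth 900Mb queue { qFairqDefault }
    queue qFairqDefault on igb0 bandwidth 900Mb qlimit 1024 fairq ( codel , default )
    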

    Once you have done that, run a speed test again.  What does the performance look like?

    Hope this helps.



  • Hiii,

    This is exactly the way I had it set up before, based on this article: http://www.speedtest.net/insights/blog/maximized-speed-non-gigabit-internet-connection/ which led me to the whole fq_codel thread here :D

    The tests are great: I mainly get ABA grades, which is better than FCA. However, I would really like to know what is off with my system such that the fq_codel setup isn't working. Might it be the RAM or something similar?



  • Thanks for getting back to me.  So it's good to know that ALTQ FAIRQ + Codel does work in your case.  However, we should be able to get fq_codel working as well using dummynet (limiters).

    I have a few more questions for you:

    1. Is there anything special about your symmetric gigabit connection (e.g. are you using PPPoE or something like that)?
    2. What pfSense add-ons/plug-ins are you running, if any?
    3. When you installed pfSense from scratch, did you also re-enable all your WAN NAT firewall rules, or did you try shaping with just the defaults (i.e. no special firewall rules on WAN and/or LAN)?  I'd be curious to see what results look like with just the system defaults (i.e. no special firewall rules and no add-ons/plug-ins).
    4. Can you do me a favor and show me screenshots again of your limiter and queue settings and firewall rules, as well as the fq_codel configuration (output) from the command line?  I just want to check one more time to make sure we didn't miss anything obvious.

    Hope this helps.



  • Hey Tman222,

    I am trying to answer this to the best of my ability. I don't think there is anything special about my fiber connection; it's an FTTH setup:

    Fiber cable -> TP-LINK MC220L, 1x SFP 1000Base-SX/LX/LH, 1x RJ45 1000Base-T (Media converter) + TP-LINK TL-SM321B, SFP, Simplex, LX/LC (Transceiver) -> RJ45 -> PFSENSE

    PFSENSE:
    Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
    4 1Gbit Intel NIC i211-AT
    120GB SSD
    4GB Ram

    pfSense packages: typically shellcmd103, haproxy0552, nmap1441, ntopng0811, and pfblockerng2122; however, at the moment only haproxy is on.

    After resetting pfSense I changed the IP of the box, created the limiters, changed the in/out pipe of the default LAN allow-all rule, set the traffic shaping to fq_codel through the command line, and ran the dslreports test.

    I did not change anything regarding NAT or other rules; everything should be set to default, such as NAT reflection and so on.

    1. https://imgur.com/a/5z4zM (this is still how I have it set up)

    At the moment I have my download limit at 500 and upload at 890.
    /tmp/rules.limiter:

    
    pipe 1 config  bw 500Mb
    queue 1 config pipe 1 mask dst-ip6 /128 dst-ip 0xffffffff
    
    pipe 2 config  bw 890Mb
    queue 2 config pipe 2 mask src-ip6 /128 src-ip 0xffffffff
    
    

    and  ipfw sched show

    
    00001: 510.000 Mbit/s    0 ms burst 0
    q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
        mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
     sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
     FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
       Children flowsets: 1
    BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
      0 ip           0.0.0.0/0             0.0.0.0/0      809    32614  0    0   0
    00002: 890.000 Mbit/s    0 ms burst 0
    q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
        mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
     sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
     FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
       Children flowsets: 2
      0 ip           0.0.0.0/0             0.0.0.0/0     1219696 1819422727 416 622680 787
    
    

    When uploading, traffic seems to go through it,

    and upon downloading, the same thing:

    
    00001: 510.000 Mbit/s    0 ms burst 0
    q00001  50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
        mask:  0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
     sched 1 type FQ_CODEL flags 0x0 0 buckets 1 active
     FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
       Children flowsets: 1
    BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
      0 ip           0.0.0.0/0             0.0.0.0/0     209511 312499033 164 245320 377
    00002: 890.000 Mbit/s    0 ms burst 0
    q00002  50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
        mask:  0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
     sched 2 type FQ_CODEL flags 0x0 0 buckets 1 active
     FQ_CODEL target 5ms interval 100ms quantum 1514 limit 10240 flows 1024 ECN
       Children flowsets: 2
      0 ip           0.0.0.0/0             0.0.0.0/0     1242    50904  0    0   0
    
    

    In the Advanced > Interfaces tab, after setting up pfSense, "Disable hardware TCP segmentation offload" and "Disable hardware large receive offload" are ticked.  Is that alright, or should I be able to untick them?



  • Hi again,

    I still feel like there is a bottleneck somewhere and that is why you are seeing poor performance above 500 Mbit/s.  However, I'm not quite sure yet where that bottleneck is, and while you do have a slower CPU (and an ultra-low-power one at that), I'm not 100% convinced that's it.

    So, we can do some troubleshooting to try to find where the bottleneck is occurring on your system.  Please see this link:

    https://bsdrp.net/documentation/technical_docs/performance

    And go down to the section, "Where is the bottleneck?" at the bottom.

    Can you try some of the tools suggested there and report back the results?  I would run one test at 500 Mbit with fq_codel enabled and one at 900 Mbit with fq_codel enabled to see what differences/issues might show up.
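
    For instance, these are the kinds of observability commands that page suggests (and that are used further down in this thread), worth comparing between the 500 Mbit and 900 Mbit runs:

    
    top -CHIP              # per-CPU and per-thread load; look for a saturated core or the dummynet thread
    vmstat -z | head -1 ; vmstat -z | grep -i mbuf    # mbuf allocation counters and failures
    vmstat -i              # interrupt rates per NIC queue
    netstat -ihw 1         # per-second interface packets, errors, and drops
    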

    Hope this helps.



  • So I walked through the bottleneck part of the link you posted. I basically did as you suggested and ran some of the commands while idle and under download and upload conditions, for 500/800 and 800/800 Mbit limits. To be honest, there is no noticeable difference I can observe during this time. I'll post the results here; maybe you (or someone else) can spot some critical issues.

    top -CHIP for limits 500/800
    idle

    
    last pid: 95907;  load averages:  0.29,  0.20,  0.12                                                up 0+18:14:12  11:40:29
    445 processes: 5 running, 408 sleeping, 32 waiting
    CPU 0:  0.0% user,  0.0% nice,  1.2% system,  0.0% interrupt, 98.8% idle
    CPU 1:  0.4% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.6% idle
    CPU 2:  0.8% user,  0.0% nice,  0.8% system,  0.8% interrupt, 97.7% idle
    CPU 3:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
    Mem: 87M Active, 246M Inact, 475M Wired, 152K Buf, 3037M Free
    ARC: 164M Total, 22M MFU, 135M MRU, 2420K Anon, 751K Header, 4455K Other
         77M Compressed, 197M Uncompressed, 2.55:1 Ratio
    Swap: 2048M Total, 2048M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME     CPU COMMAND
       11 root       155 ki31     0K    64K CPU1    1  18.0H  99.45% idle{idle: cpu1}
       11 root       155 ki31     0K    64K RUN     2  17.9H  99.12% idle{idle: cpu2}
       11 root       155 ki31     0K    64K CPU3    3  17.9H  98.78% idle{idle: cpu3}
       11 root       155 ki31     0K    64K CPU0    0  17.9H  98.59% idle{idle: cpu0}
        0 root       -92    -     0K  4288K -       0  10:38   0.98% kernel{dummynet}
    86267 root        20    0   299M   174M nanslp  2   5:58   0.64% ntopng{ntopng}
       12 root       -92    -     0K   512K WAIT    2   2:53   0.23% intr{irq269: igb1:que 0}
    57848 root        20    0 22116K  5124K CPU2    2   0:00   0.22% top
    28709 root        20    0 43056K 15236K kqread  3   6:12   0.17% haproxy
       12 root       -92    -     0K   512K WAIT    1   1:59   0.15% intr{irq267: igb0:que 1}
    86267 root        20    0   299M   174M nanslp  2   0:42   0.11% ntopng{ntopng}
    
    

    download

    last pid: 16016;  load averages:  0.27,  0.20,  0.13                                               up 0+18:14:57  11:41:14
    445 processes: 5 running, 408 sleeping, 32 waiting
    CPU 0:  0.8% user,  0.0% nice,  8.3% system, 13.0% interrupt, 78.0% idle
    CPU 1:  0.4% user,  0.0% nice,  0.4% system,  5.9% interrupt, 93.3% idle
    CPU 2:  2.4% user,  0.0% nice,  1.6% system,  7.1% interrupt, 89.0% idle
    CPU 3:  4.3% user,  0.0% nice,  2.8% system,  8.7% interrupt, 84.3% idle
    Mem: 87M Active, 246M Inact, 475M Wired, 152K Buf, 3037M Free
    ARC: 164M Total, 22M MFU, 135M MRU, 2424K Anon, 750K Header, 4455K Other
         77M Compressed, 197M Uncompressed, 2.55:1 Ratio
    Swap: 2048M Total, 2048M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME     CPU COMMAND
       11 root       155 ki31     0K    64K CPU1    1  18.0H  89.47% idle{idle: cpu1}
       11 root       155 ki31     0K    64K CPU2    2  18.0H  88.66% idle{idle: cpu2}
       11 root       155 ki31     0K    64K RUN     3  17.9H  86.68% idle{idle: cpu3}
       11 root       155 ki31     0K    64K CPU0    0  17.9H  79.79% idle{idle: cpu0}
        0 root       -92    -     0K  4288K -       0  10:39  10.64% kernel{dummynet}
       12 root       -92    -     0K   512K WAIT    0   2:03   9.20% intr{irq266: igb0:que 0}
       12 root       -92    -     0K   512K WAIT    1   2:00   8.86% intr{irq267: igb0:que 1}
    86267 root        22    0   299M   174M bpf     0   2:06   7.04% ntopng{ntopng}
    86267 root        22    0   299M   174M bpf     3   1:36   6.80% ntopng{ntopng}
       12 root       -92    -     0K   512K WAIT    2   2:54   6.42% intr{irq269: igb1:que 0}
       12 root       -92    -     0K   512K WAIT    3   2:10   5.70% intr{irq270: igb1:
    

    upload

    
    last pid: 16016;  load averages:  0.43,  0.24,  0.14                                     up 0+18:15:16  11:41:33
    445 processes: 5 running, 408 sleeping, 32 waiting
    CPU 0:  5.1% user,  0.0% nice, 17.3% system, 11.0% interrupt, 66.7% idle
    CPU 1:  9.8% user,  0.0% nice,  1.6% system, 11.4% interrupt, 77.3% idle
    CPU 2:  5.1% user,  0.0% nice,  4.7% system,  8.6% interrupt, 81.6% idle
    CPU 3:  4.7% user,  0.0% nice,  2.4% system,  9.4% interrupt, 83.5% idle
    Mem: 87M Active, 246M Inact, 475M Wired, 152K Buf, 3037M Free
    ARC: 164M Total, 22M MFU, 135M MRU, 2424K Anon, 750K Header, 4455K Other
         77M Compressed, 197M Uncompressed, 2.55:1 Ratio
    Swap: 2048M Total, 2048M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME     CPU COMMAND
       11 root       155 ki31     0K    64K RUN     3  18.0H  83.21% idle{idle: cpu3}
       11 root       155 ki31     0K    64K CPU2    2  18.0H  82.15% idle{idle: cpu2}
       11 root       155 ki31     0K    64K RUN     1  18.0H  80.52% idle{idle: cpu1}
       11 root       155 ki31     0K    64K RUN     0  17.9H  68.66% idle{idle: cpu0}
        0 root       -92    -     0K  4288K -       3  10:42  21.03% kernel{dummynet}
    86267 root        26    0   299M   174M bpf     0   2:08  11.87% ntopng{ntopng}
    86267 root        25    0   299M   174M bpf     1   1:38  11.45% ntopng{ntopng}
    
    

    top -CHIP for limits 800/800
    idle

    last pid: 42737;  load averages:  0.17,  0.20,  0.14                                     up 0+18:16:22  11:42:39
    445 processes: 5 running, 408 sleeping, 32 waiting
    CPU 0:  0.0% user,  0.0% nice,  1.2% system,  0.0% interrupt, 98.8% idle
    CPU 1:  0.4% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.2% idle
    CPU 2:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
    CPU 3:  0.4% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.2% idle
    Mem: 88M Active, 247M Inact, 477M Wired, 152K Buf, 3033M Free
    ARC: 165M Total, 22M MFU, 135M MRU, 2424K Anon, 757K Header, 4457K Other
         77M Compressed, 197M Uncompressed, 2.55:1 Ratio
    Swap: 2048M Total, 2048M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME     CPU COMMAND
       11 root       155 ki31     0K    64K RUN     2  18.0H  99.47% idle{idle: cpu2}
       11 root       155 ki31     0K    64K CPU3    3  18.0H  99.10% idle{idle: cpu3}
       11 root       155 ki31     0K    64K CPU0    0  17.9H  98.95% idle{idle: cpu0}
       11 root       155 ki31     0K    64K CPU1    1  18.1H  97.41% idle{idle: cpu1}
        0 root       -92    -     0K  4288K -       0  10:46   0.85% kernel{dummynet}
    86267 root        20    0   299M   174M nanslp  3   5:58   0.49% ntopng{ntopng}
    

    download (when it crashes)

    last pid: 61056;  load averages:  0.17,  0.19,  0.13                                     up 0+18:17:00  11:43:17
    445 processes: 7 running, 408 sleeping, 30 waiting
    CPU 0:  4.1% user,  0.0% nice, 13.5% system, 16.5% interrupt, 65.9% idle
    CPU 1:  2.2% user,  0.0% nice,  0.4% system, 13.9% interrupt, 83.5% idle
    CPU 2:  4.5% user,  0.0% nice,  5.2% system,  9.4% interrupt, 80.9% idle
    CPU 3:  7.1% user,  0.0% nice,  4.5% system, 14.2% interrupt, 74.2% idle
    Mem: 87M Active, 247M Inact, 484M Wired, 152K Buf, 3027M Free
    ARC: 165M Total, 22M MFU, 135M MRU, 2268K Anon, 756K Header, 4458K Other
         77M Compressed, 197M Uncompressed, 2.55:1 Ratio
    Swap: 2048M Total, 2048M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME     CPU COMMAND
       11 root       155 ki31     0K    64K RUN     1  18.1H  83.40% idle{idle: cpu1}
       11 root       155 ki31     0K    64K RUN     2  18.0H  82.88% idle{idle: cpu2}
       11 root       155 ki31     0K    64K CPU3    3  18.0H  79.58% idle{idle: cpu3}
       11 root       155 ki31     0K    64K CPU0    0  17.9H  69.30% idle{idle: cpu0}
        0 root       -92    -     0K  4288K -       2  10:47  16.37% kernel{dummynet}
       12 root       -92    -     0K   512K CPU0    0   2:07  14.98% intr{irq266: igb0:que 0}
       12 root       -92    -     0K   512K WAIT    1   2:04  12.14% intr{irq267: igb0:que 1}
    

    upload totally fine

    last pid: 61192;  load averages:  0.20,  0.20,  0.14                                     up 0+18:17:17  11:43:34
    445 processes: 5 running, 408 sleeping, 32 waiting
    CPU 0:  3.1% user,  0.0% nice, 18.5% system, 14.2% interrupt, 64.2% idle
    CPU 1:  6.7% user,  0.0% nice,  1.2% system, 11.0% interrupt, 81.1% idle
    CPU 2:  3.1% user,  0.0% nice,  2.4% system,  7.1% interrupt, 87.4% idle
    CPU 3:  5.5% user,  0.0% nice,  3.9% system,  8.3% interrupt, 82.3% idle
    Mem: 87M Active, 247M Inact, 484M Wired, 152K Buf, 3027M Free
    ARC: 165M Total, 22M MFU, 135M MRU, 2436K Anon, 756K Header, 4458K Other
         77M Compressed, 197M Uncompressed, 2.55:1 Ratio
    Swap: 2048M Total, 2048M Free
    
      PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME     CPU COMMAND
       11 root       155 ki31     0K    64K RUN     3  18.0H  82.81% idle{idle: cpu3}
       11 root       155 ki31     0K    64K CPU2    2  18.0H  82.50% idle{idle: cpu2}
       11 root       155 ki31     0K    64K CPU1    1  18.1H  81.87% idle{idle: cpu1}
       11 root       155 ki31     0K    64K CPU0    0  17.9H  67.74% idle{idle: cpu0}
        0 root       -92    -     0K  4288K -       0  10:49  21.24% kernel{dummynet}
       12 root       -92    -     0K   512K WAIT    0   2:09  11.88% intr{irq266: igb0:que 0}
    86267 root        24    0   299M   174M bpf     2   2:11  11.66% ntopng{ntopng}
    86267 root        24    0   299M   174M bpf     2   1:41  11.33% ntopng{ntopng}
       12 root       -92    -     0K   512K WAIT    1   2:06   9.44% intr{irq267: 
    

    mbuf statistics

    
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -z | head -1 ; vmstat -z | grep -i mbuf #idle
    
    ITEM                   SIZE  LIMIT       USED     FREE      REQ FAIL SLEEP
    mbuf_packet:            256, 1574895,   16790,    6992,74675913,   0,   0
    mbuf:                   256, 1574895,     412,    2371,77540344,   0,   0
    mbuf_cluster:          2048, 1000000,   23782,       6,   23782,   0,   0
    mbuf_jumbo_page:       4096, 123038,        0,     698, 4288582,   0,   0
    mbuf_jumbo_9k:         9216,  36455,        0,       0,       0,   0,   0
    mbuf_jumbo_16k:       16384,  20506,        0,       0,       0,   0,   0
    
    download
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -z | head -1 ; vmstat -z | grep -i mbuf
    ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
    mbuf_packet:            256, 1574895,   16790,    6992,75836656,   0,   0
    mbuf:                   256, 1574895,     412,    2371,78304908,   0,   0
    mbuf_cluster:          2048, 1000000,   23782,       6,   23782,   0,   0
    mbuf_jumbo_page:       4096, 123038,       0,     698, 4288590,   0,   0
    mbuf_jumbo_9k:         9216,  36455,       0,       0,       0,   0,   0
    mbuf_jumbo_16k:       16384,  20506,       0,       0,       0,   0,   0
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -z | head -1 ; vmstat -z | grep -i mbuf
    ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
    mbuf_packet:            256, 1574895,   16790,    6992,75836986,   0,   0
    mbuf:                   256, 1574895,     412,    2371,78305106,   0,   0
    mbuf_cluster:          2048, 1000000,   23782,       6,   23782,   0,   0
    mbuf_jumbo_page:       4096, 123038,       0,     698, 4288590,   0,   0
    mbuf_jumbo_9k:         9216,  36455,       0,       0,       0,   0,   0
    mbuf_jumbo_16k:       16384,  20506,       0,       0,       0,   0,   0
    
    upload
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -z | head -1 ; vmstat -z | grep -i mbuf
    ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
    mbuf_packet:            256, 1574895,   17065,    6717,76391497,   0,   0
    mbuf:                   256, 1574895,     412,    2371,78486631,   0,   0
    mbuf_cluster:          2048, 1000000,   23782,       6,   23782,   0,   0
    mbuf_jumbo_page:       4096, 123038,       0,     698, 4288590,   0,   0
    mbuf_jumbo_9k:         9216,  36455,       0,       0,       0,   0,   0
    mbuf_jumbo_16k:       16384,  20506,       0,       0,       0,   0,   0
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -z | head -1 ; vmstat -z | grep -i mbuf
    ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
    mbuf_packet:            256, 1574895,   17108,    6674,77061386,   0,   0
    mbuf:                   256, 1574895,     412,    2371,78718590,   0,   0
    mbuf_cluster:          2048, 1000000,   23782,       6,   23782,   0,   0
    mbuf_jumbo_page:       4096, 123038,       0,     698, 4288590,   0,   0
    mbuf_jumbo_9k:         9216,  36455,       0,       0,       0,   0,   0
    mbuf_jumbo_16k:       16384,  20506,       0,       0,       0,   0,   0
    
    

    IRQs for each queue and NIC look good…

    
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -i
    interrupt                          total       rate
    irq4: uart0                          224          0
    irq23: ehci0                      137353          2
    cpu0:timer                      71932468       1048
    cpu1:timer                      18262419        266
    cpu2:timer                      25164107        367
    cpu3:timer                      19066067        278
    irq265: hdac0                         35          0
    irq266: igb0:que 0              10573218        154
    irq267: igb0:que 1               6880452        100
    irq268: igb0:link                      5          0
    irq269: igb1:que 0              18310531        267
    irq270: igb1:que 1               6241534         91
    irq271: igb1:link                      4          0
    irq272: igb2:que 0                    32          0
    irq273: igb2:que 1                    32          0
    irq274: igb2:link                      2          0
    irq275: igb3:que 0                 68643          1
    irq276: igb3:que 1                 68643          1
    irq277: igb3:link                      1          0
    irq278: ahci0                     819394         12
    Total                          177525164       2586
    [2.4.3-RC][admin@inferno.zwck.lan]/root: vmstat -i
    
    

    Netstat






  • Okay.

    What I am about to write is probably going to frustrate you, but I think it might not be a pfSense issue but instead a browser issue.

    I have performed the test with limits at 950/950:

    chrome (download crashes; upload performs perfectly)
    firefox (download performs perfectly; upload performs perfectly; there is some bufferbloat, sure, but not to the extent of what Chrome does)
    edge (just crashes)

    To be honest, I am not sure if this is normal.