    Netgate Discussion Forum

    bufferbloat with fq_codel after update to 23.01

    Traffic Shaping
    • T
      tomashk last edited by tomashk

      Edit:
      The observations below are wrong. As I explain in a later post, I see the same behavior in the previous version (22.05).

      Original description:

      Hi,
      I wonder if I am the only one observing worse fq_codel performance after the upgrade to 23.01.

      Before upgrade
      With 22.05 I had an almost standard fq_codel configuration for my 600/30 Mbps connection:
      info_Zrzut ekranu 2023-02-17 210058.jpg
      Each time I ran the bufferbloat test I got something like the result below (varying by at most +/- 3 ms depending on the day). I had been getting results like that for months.
      Zrzut ekranu 2023-02-17 205918.jpg

      After upgrade
      Now when I check bufferbloat on 23.01 (with the same configuration), the test looks the same for the first half of the download test, but after a few seconds it gradually gets worse. An example of one of the better results I got is below:
      6b2a602d-57fa-4dae-92f8-d52f2d42c082-image.png
      The change isn't big, but on 23.01 I have never gotten anything better than 40 ms for the "Download Active" result.

      Additional notes

      • I'm using Proxmox, and for the tests above:
        • I restored the backup with 22.05 and ran the tests
        • I restored the backup with 23.01 and ran the second test
        • so all of it was run multiple times within 10 minutes
        • before each test I restarted pfSense (it doesn't change the results, but I wanted to be sure I was doing it properly)
      • I'm using passthrough for the NICs, and pciconf -lv reports the device name 'Ethernet Controller I225-V' for this igc
      • Intel(R) Celeron(R) J4125 CPU @ 2.00GHz
      • In case it is needed I'm using pfblocker

      If additional info is needed please let me know.

      • S
        SteveITS @tomashk last edited by

        @tomashk said in bufferbloat with fq_codel after update to 23.01:

        In case it is needed I'm using pfblocker

        If you have Wildcard Blocking (TLD) enabled ensure it isn't chewing up CPU during the test:
        https://redmine.pfsense.org/issues/13884

        Steve

        Only install packages for your version, or risk breaking it. If yours is older, select it in System/Update/Update Settings.
        When upgrading, let it finish; do not reboot early. Allow 10-15 minutes, or more depending on packages and device speed.

        • T
          tomashk @SteveITS last edited by

          @steveits said in bufferbloat with fq_codel after update to 23.01:

          If you have Wildcard Blocking (TLD) enabled ensure it isn't chewing up CPU during the test:
          https://redmine.pfsense.org/issues/13884

          I don't have it enabled
          28466e44-264e-4aad-a5cc-3a26679c62a7-image.png

          • T
            tomashk last edited by

            Also, I wanted to rule out any additional factors that might invalidate those observations. For example, do you know of other sites for testing bufferbloat? I used https://www.waveform.com/tools/bufferbloat, but maybe I should check with something else. I don't know whether http://www.dslreports.com/ is a good choice, because for me it offers only one server.

            • dennypage
              dennypage @tomashk last edited by

              @tomashk I did a check of my install against Waveform, and I do not see a degradation with 23.01.

              First thing I would do is to check CPU utilization during the test and see if you are seeing high CPU utilization. If so, what's using up the CPU?

              For reference, I am on a 6100 (Atom). My connection is a bit slower than yours at 300/30, however I see peak CPU utilization of under 20%. My test results are here if you want to see them.

              FWIW, depending upon how your ISP implements its own rate limiting, your limiter setting of 555Mb is probably a bit high. Even in your prior results, you have a significant bump in latency. You might try lowering the limiter rate to get that under control. My guess would be somewhere around the 510-525Mb range.
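
The advice above amounts to setting the limiter somewhat below the provisioned rate and stepping it down until the latency under load stops climbing. A minimal sketch of that arithmetic, assuming the 600 Mbit/s download rate from the original post. The function name and the fractions are illustrative choices for this example, not pfSense settings:

```python
def candidate_limiter_rates(line_rate_mbps, fractions=(0.925, 0.90, 0.875, 0.85)):
    """Return limiter bandwidths (Mbit/s) to try, highest first.

    The fractions are assumptions picked to span the values discussed in
    the thread (555 Mb down to roughly 510 Mb on a 600 Mb line).
    """
    return [round(line_rate_mbps * f) for f in fractions]


# Rates to enter in the download limiter, one bufferbloat test per value.
print(candidate_limiter_rates(600))
```

Testing each value a few times, rather than once, helps separate a real improvement from run-to-run noise.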

              • T
                tomashk @dennypage last edited by

                @dennypage Thank you for your suggestion. I'll check it out later. And just for reference - are your settings for fq_codel similar or completely different?

                Also, I think it would be great if someone could provide a good way to profile limiters like this. Because right now I'm just changing the settings at random and hoping for the best :)

                • dennypage
                  dennypage @tomashk last edited by

                  @tomashk said in bufferbloat with fq_codel after update to 23.01:

                  @dennypage And just for reference - are your settings for fq_codel similar or completely different?

                  From what I can see in your status snapshot, they are the same except for the bandwidth settings.

                  • T
                    tomashk @dennypage last edited by

                    @dennypage Looks like I'll have to do a bit more research. Even with the limiter at 555Mbps the CPU was below 50%. I tried lowering the speed, but the funny thing is that even when I changed it to 450Mbps for download, the results were maybe 5-10ms better. Obviously something is wrong, but I suspect that there is something special about my configuration :). I'll probably come back to this once I've had a bit more time to observe it. Or maybe compare its behavior with 22.05.

                    • dennypage
                      dennypage @tomashk last edited by

                      @tomashk said in bufferbloat with fq_codel after update to 23.01:

                      @dennypage Even with the limiter at 555Mbps the CPU was below 50%. I tried lowering the speed, but the funny thing is that even when I changed it to 450Mbps for download, the results were maybe 5-10ms better.

                      What is your hardware? Approaching 50% seems pretty high.

                      When you lowered to 450Mb, what was your throughput?

                        • T
                          tomashk @dennypage last edited by

                          @dennypage So I found a temporary solution. I set it to 550Mbps and changed the setting for the VM with pfsense. I had 2 cores assigned and changed it to 4. Now I'm getting
                          9dd68373-91ca-4b6b-8e17-e1dc7d17b6d1-image.png

                          So something that's acceptable to me while I'm playing with the settings. And at the moment I have to stop because other people will be using the network.

                          I have an Intel(R) Celeron(R) J4125, so only 4 cores, but this Proxmox host only runs this pfSense VM and a container with the UniFi controller. So it will do for now.

                          • dennypage
                            dennypage @tomashk last edited by

                            @tomashk I'm glad you found a solution.

                            • T
                              tomashk last edited by tomashk

                              It seems I was wrong. With 4 cores it just works a little better and gets an A once in a while, maybe once or twice in 20 tests. On the dashboard, the maximum CPU usage I saw was between 20 and 25%. top -HaSP doesn't show anything working hard either.

                              I guess at this point I should ask for investigation tips. This could be anything now

                              • some bug in the limiter implementation
                              • something between the new kernel and proxmox
                              • Neighbour made a voodoo doll to influence my router ;)

                              You never know.

                              Of course I'll share if I learn something useful and I'm grateful for any suggestions.

                              I'm also going to look at version 22.05 a bit more closely. It may be that the same problem exists there and I haven't investigated it well enough.

                              • Bob.Dig
                                Bob.Dig LAYER 8 @tomashk last edited by

                                @tomashk This test site is highly dependent on your ISP, peering, and maybe the time of day.
                                A better test would be to ping something nearby while running speedtest.net at the same time.
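
One way to put a number on that ping-based test is to capture ping output once while the line is idle and once while a speed test is running, then compare the median round-trip times. A rough sketch, assuming the usual `time=... ms` format that ping prints. The function names and the sample data are made up for illustration:

```python
import re
import statistics

# Matches the RTT field in typical ping output, e.g. "time=11.2 ms".
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtts(ping_output):
    """Extract round-trip times (ms) from captured ping output."""
    return [float(m.group(1)) for m in RTT_PATTERN.finditer(ping_output)]


def latency_increase(idle_output, loaded_output):
    """Median RTT under load minus median idle RTT, in ms."""
    idle = parse_rtts(idle_output)
    loaded = parse_rtts(loaded_output)
    if not idle or not loaded:
        raise ValueError("no RTT samples found in ping output")
    return statistics.median(loaded) - statistics.median(idle)


# Invented sample captures (192.0.2.1 is a documentation address).
IDLE_SAMPLE = """\
64 bytes from 192.0.2.1: icmp_seq=0 ttl=57 time=11.2 ms
64 bytes from 192.0.2.1: icmp_seq=1 ttl=57 time=10.8 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=57 time=11.0 ms
"""
LOADED_SAMPLE = """\
64 bytes from 192.0.2.1: icmp_seq=0 ttl=57 time=35.1 ms
64 bytes from 192.0.2.1: icmp_seq=1 ttl=57 time=48.9 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=57 time=41.0 ms
"""

print(f"{latency_increase(IDLE_SAMPLE, LOADED_SAMPLE):.1f} ms added under load")
```

The median is used instead of the mean so that a single delayed reply does not dominate the comparison.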

                                pfSense on Hyper-V

                                Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                                • T
                                  tomashk @Bob.Dig last edited by

                                  @bob-dig Thanks for the suggestion. I hadn't thought of it that way.

                                  I guess I was right to look at an older version (22.05). After a short test, I found a similar problem there as well. Since version 22.05 worked fine for me for months (in terms of bufferbloat), I assume that something has changed recently. So I will focus on what seems more likely:

                                  • ISP has changed something (one of those using DOCSIS)
                                  • something has changed in proxmox
                                  • my configuration (pfsense or proxmox) is not very good

                                  So I have a lot to analyze. And sorry for the initial wrong guess. I should have checked version 22.05 more carefully instead of assuming that because it worked before, it still works now.

                                  But I'll try that later, because now I'd be disturbing others on the same network.

                                  • Cool_Corona
                                    Cool_Corona last edited by

                                    Got this...

                                    c24d435e-2ef3-487a-bf6a-31223751c36e-billede.png

                                    • dennypage
                                      dennypage @tomashk last edited by

                                      @tomashk One thing to bring forward then. When you lowered to 450Mb, what was your throughput?

                                      The reason that I asked was to confirm that your limiter assignment rules were actually being hit.

                                      • T
                                        tomashk @dennypage last edited by

                                        @dennypage I'll test it again when I get back, but I'm pretty sure the limiter was used. When I was testing I usually had the dashboard open, and sometimes also the 'limiter info' page. Limiter info showed the usual activity (packet counts increasing, drops, etc.) when the test started. The dashboard showed traffic graphs:

                                        • for LAN, it is what was set for the limiter +/- 5 Mbps
                                        • for WAN, I see about 5 to 10 Mbps more traffic than the LAN output

                                        Since I've been wrong a few times, I'll check again later and post whether I remembered this correctly.
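
The throughput question above is a sanity check: if the limiter is lowered to 450 Mb and the measured download still runs well above that, the limiter assignment rules are probably not being hit. A small sketch of that comparison. The function name, the verdict labels, and the 10 Mbit/s tolerance are assumptions made for this example:

```python
def limiter_verdict(limiter_mbps, observed_mbps, tolerance_mbps=10.0):
    """Classify a measured throughput against the configured limiter rate.

    "bypassed":     traffic runs clearly above the limiter, so the rules
                    are likely not matching.
    "capped":       throughput sits at the limiter rate, as expected.
    "inconclusive": throughput is well below the limiter, so something
                    else is the bottleneck and the test says nothing.
    """
    if observed_mbps > limiter_mbps + tolerance_mbps:
        return "bypassed"
    if observed_mbps >= limiter_mbps - tolerance_mbps:
        return "capped"
    return "inconclusive"


print(limiter_verdict(450, 447))  # limiter lowered to 450, test measured 447
```

Lowering the limiter far below the line rate for this check makes the "capped" and "bypassed" cases easy to tell apart.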

                                        • dennypage
                                          dennypage @tomashk last edited by

                                          @tomashk, I am using floating rules to perform limiter assignment, with no ackqueue. FWIW, I also have a floating rule just prior to exclude ICMP echo request/reply from the limiter rules.

                                          • I
                                            ibbetsion last edited by

                                            You're not the only one. I am seeing terrible bufferbloat performance after upgrading. This and some other CPU-related issues have caused me to revert to 22.06 (thx ZFS!).
