    A definitive, example-driven, HFSC Reference Thread

sideout

Yes, you always look at it as traffic coming in from the WAN, because that is how the shaper is designed to work, before NAT happens.  The only way to shape like that is to use floating rules with the Interface set to WAN; I use direction "any" on my rules.
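A hedged pf-level sketch of what such a floating rule boils down to (the interface and queue names here are invented for illustration):

    # Floating rule, Interface WAN, direction out: the queue is assigned by
    # the rule that creates the state, so classify explicitly here
    pass out on igb0 inet proto tcp from any to any port 443 keep state queue (qWeb, qACK)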

Derelict (Netgate)

So I ran into a problem tonight.  I wanted to take a specific OPT2 device, 192.168.225.65, and place it in a "Penalty Box" for egress to the WAN.  I created an alias, Penaltybox, containing host 192.168.225.65, and created a floating rule on WAN out placing anything sourced from that alias into qPenaltyBox.

There is also a pass any/any/any rule on OPT2 that assigns no queues.

No traffic was ever placed in qPenaltyBox.  I cleared states several times; zero packets were ever put into qPenaltyBox.

Does the pass rule on OPT2 create the state with no queue assigned before the floating rule has a chance to assign the queue?

I did manage to get this traffic into qPenaltyBox by creating a rule on OPT2 that passed traffic from the Penaltybox host and marked it with "PB".  I then created a floating rule on WAN out putting all traffic from any to any marked with PB into qPenaltyBox.
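For reference, a hedged sketch of what that working pair of rules reduces to at the pf level (the interface names are invented; <Penaltybox> is the alias table):

    # OPT2 rule: pass traffic from the penalty-box host and tag the state "PB"
    pass in on igb2 from <Penaltybox> to any tag PB keep state

    # Floating rule, WAN out: match the internal tag and assign the queue
    pass out on igb0 tagged PB keep state queue qPenaltyBox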

Ecnerwal

In the stickies for traffic shaping, there is a note from Ermal (perhaps collecting several postings together, from the look of it and the thread it links to) that makes reference to needing to kill the any/any rule.  That effectively means we need to nail down getting everything else classified and sorted.  There are also the (to me, thus far) confusing bits about order sensitivity in firewall rule processing, and that being different for interface rules vs. floating rules.  In short, I often find things don't do what I think they should do there: I read, I think I get it, I try things, they don't work as expected based on my reading, lather, rinse, repeat.  Perhaps if I had a month and nothing else to do...

Ermal wrote:

Now back to why you need to disable the anti-lockout rule and the default LAN rule.
The pf packet filter is stateful, and once it registers a state for a stream of traffic it will not check the ruleset again.
In the packet filter used by pfSense, traffic is assigned to a queue by specifying the queue explicitly on the rule that matches the traffic, i.e. the rule that creates the state.
The default anti-lockout rule is the same as the default LAN rule, just created automatically to prevent the user from doing stupid things.
But this rule is too generic: it matches all the traffic passing from LAN, so nothing else in the ruleset gets executed.  As such it sends all the traffic to the default queue, which is not what the user wants with a QoS policy on.
The same applies to the default LAN rule pfSense ships with.  Since you now have to explicitly choose the queue the traffic goes to when creating a rule, there is no easy solution other than disabling these rules and writing more finely tuned rules that classify traffic into the proper queues.
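To make that concrete, here is a hedged pf-style sketch of what such finely tuned classify rules might look like once the catch-all is gone (the interface, ports, and queue names are invented for illustration):

    # Classify traffic into explicit queues as each state is created
    pass in on igb1 proto tcp from igb1:network to any port { 22 443 } keep state queue (qWeb, qACK)
    pass in on igb1 proto udp from igb1:network to any port 53 keep state queue qDNS

    # Everything else still needs a pass rule, now pointing at the default queue
    pass in on igb1 from igb1:network to any keep state queue qDefault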

killerb81

Maybe someone can expand on WHY this is; I just know that the order of rule processing works like this:

Interface rules (WAN, LAN, OPT) are applied on the first matching rule, working from top to bottom.
Floating rules are applied on the LAST matching rule from top to bottom, unless a rule has "Quick" set, in which case the first match wins and evaluation stops.

Hope this helps.
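A hedged pf-level illustration of that difference (interface and queue names invented):

    # Floating rules without "quick": the LAST matching rule assigns the queue
    pass out on igb0 from any to any keep state queue qDefault
    pass out on igb0 proto udp from any to any port 123 keep state queue qNTP   # NTP ends up in qNTP

    # With "quick": evaluation stops at the FIRST matching rule
    pass out quick on igb0 proto udp from any to any port 123 keep state queue qNTP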

killerb81

@Derelict:

I did manage to get this traffic into qPenaltyBox by creating a rule on OPT2 that passed traffic from the Penaltybox host and marked it with "PB".  I then created a floating rule on WAN out putting all traffic from any to any marked with PB into qPenaltyBox.

Can you expand on the marking functionality?  I have some ideas on how this would be useful to me, but I'm not 100% sure how to implement it.
I want to mark certain packets in the LAN rules, then match those marked packets in the outgoing WAN rules (floating rules) to put them on a certain gateway.

I posted a thread about it here: https://forum.pfsense.org/index.php?topic=83972.msg460314#msg460314

Thank you!

Derelict (Netgate)

Not sure what you want.  It's pretty simple.  In the advanced section of a firewall rule you can either mark a packet or match based on a previous mark.

killerb81

Do these marks stay on the packets for upstream pfSense instances to read?
(I know this question is a little (a lot) off the thread topic)… sorry.

Derelict (Netgate)

I highly doubt it.  I don't know where they'd put them in the frame/packet.

Verified:

"Tags are internal identifiers. Tags are not sent out over the wire."

http://www.openbsd.org/faq/pf/tagging.html

You should be able to match on a pf internal tag and set a DSCP value instead.  That should survive the trip if your gear is set to trust it.
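A hedged sketch of that idea in OpenBSD-style pf syntax; whether "set tos" is available on rules varies by pf version (FreeBSD's pf has historically exposed this via scrub's set-tos instead), so treat this as illustrative only:

    # Match states tagged PB internally and stamp outgoing packets with DSCP EF
    match out on igb0 tagged PB set tos ef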

Nullity

Edit: I created a thread dedicated to understanding HFSC, focusing on its "decoupled bandwidth and delay" capabilities.  That thread will (hopefully) have a better revision of this post.
https://forum.pfsense.org/index.php?topic=89367.0

----Original post----

Great thread! :)
Though, I see nothing about the primary reason for HFSC's adoption: the separation of bandwidth and delay allocation.  Without employing this feature, HFSC is only a slight improvement over previous class-based hierarchical link-sharing algorithms.

Only use real-time when you require it; it is "unfair", unlike link-share.  Please read the documentation for more details.  Generally, only NTP and VoIP should be using real-time queueing.

The wording used in the HFSC paper is "decoupled delay and bandwidth allocation".  Usually, you can only allocate bandwidth and delay together.  Let's say you allocate 25kbit to NTP (average speed).  Sadly, this means that in a worst-case scenario a 1500byte (12kbit) NTP packet may take approximately 500ms to completely transmit upon receipt (12kbit at 25kbit/s = 480ms).  This 500ms delay is unacceptable for NTP.

HFSC allows you to allocate not only an average bitrate of 25kbit but also an initial bandwidth, or "burst" as it is sometimes called.  To improve the delay of NTP we set m1 to 480kbit (80% of my 600kbit upper-limit for upload, which I think is the max for real-time allocation), which means a 1500byte (12kbit) packet would send in 25ms; we set d to 26ms (giving d a little room to relax, I add 1ms to the 25ms); then we set m2 to 25kbit.  Now NTP packets are guaranteed to send within 26ms.  A hell of an improvement over 500ms, and I still have only 25kbit of my 600kbit upload allocated.  NTP packets are now allocated the delay of a 480kbit connection but the bandwidth of a 25kbit connection.  This is delay and bandwidth, decoupled.
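For reference, a minimal sketch of that curve in pf.conf ALTQ syntax, assuming the 600kbit upload above and invented interface and queue names:

    altq on igb0 bandwidth 600Kb hfsc queue { qNTP, qDefault }
    # realtime(m1 d m2): 480Kb for the first 26ms of backlog, then 25Kb steady state
    queue qNTP bandwidth 25Kb hfsc(realtime(480Kb 26 25Kb))
    queue qDefault bandwidth 575Kb hfsc(default)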

Delay is measured as the time between the last bit being received and the last bit being transmitted.

Here are some links that helped me.
http://man7.org/linux/man-pages/man7/tc-hfsc.7.html
http://www.sonycsl.co.jp/person/kjc/software/TIPS.txt
http://serverfault.com/questions/105014/does-anyone-really-understand-how-hfsc-scheduling-in-linux-bsd-works
http://linux-ip.net/articles/hfsc.en/
…and any texts I could find by the HFSC authors.  The papers include plenty of non-academic explanation, so do not be afraid to read them.

Please post any questions or corrections. :)

Harvy66

HFSC does not affect how quickly a packet is serialized, but it does affect when a packet is sent.  An issue you can hit on low-bandwidth connections like your example is that a 1500-byte MTU is quite large relative to the time slices the scheduler is targeting.

The fq_codel work mentions this same issue when determining which bucket to dequeue from next: connections below 10Mb need to increase their target latency to accommodate large packets.  100Mb+ is optimal for 1500-byte MTUs and the default 5ms target.
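The serialization arithmetic behind that rule of thumb (my numbers, not from the fq_codel docs):

    t = \frac{1500 \times 8\,\mathrm{bit}}{R} \approx 0.12\,\mathrm{ms}\ (R = 100\,\mathrm{Mb/s}),\quad 1.2\,\mathrm{ms}\ (10\,\mathrm{Mb/s}),\quad 12\,\mathrm{ms}\ (1\,\mathrm{Mb/s})

so as the link slows, a single full-size packet consumes an ever larger share of a 5ms target, and below roughly 2.4Mb/s one packet exceeds the target outright.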

Realtime is useful for any traffic you feel should have crazy-low jitter, but it should not rely on link-share at all.  If your traffic makes real use of link-share, then realtime is not a good fit; just use link-share.  This goes along with the link-utilization guidance that many upstream providers have been giving over the years: a link is considered 100% "full" at 80% utilization, because prior to 80% utilization the buffers are primarily empty, and as you get past 80% the buffers start to grow.  Many hardware QoS implementations, even on high-end managed switches, use the same logic: if the port is below 80% utilization, QoS is disabled.

The same idea applies to realtime in HFSC.  If realtime traffic is at or below 80%, there should be roughly "0" dequeue latency, even if the connection as a whole is at 100%.  The remaining 20% above that 80% is your link-share and is subject to increasing amounts of jitter as you approach 100%, but realtime should be nearly unaffected.  "Zero" latency is relative to the quantum of time that HFSC is targeting.

I have a relatively stable ping to Google, with fractional milliseconds of variation when averaged over 30+ seconds.  With HFSC on pfSense, I can be at 100% utilization and not see a difference.  The measurements I was taking at the time used hrping, which reports jitter within a single standard deviation.  When my upload was at 100%, the jitter was "identical" down to the tenths position (0.1ms).  I probably get crazy-good results because I have a 1Gb connection that is rate-limited to 100Mb, which means my NIC can put packets on the line really fast relative to my shaped bandwidth.  If I were trying to move 1Gb over a 1Gb link, it probably wouldn't be as stable, but I'm sure it would still be "great".

Nullity

@Harvy66:

The same idea applies to realtime in HFSC.  If realtime traffic is at or below 80%, there should be roughly "0" dequeue latency, even if the connection as a whole is at 100%.

If my 600kbit upload is 80% utilized with a 480kbit backlog and then receives a 1500byte (12kbit) packet, that packet sees 800ms of added delay (480kbit at 600kbit/s) on top of the best-case 20ms (12kbit at 600kbit/s), without QoS.  I also do not agree with your statement that "Many hardware QoS implementations, even on high-end managed switches, use the same logic: if the port is below 80% utilization, QoS is disabled."

Have you read the HFSC paper(s)?
HFSC is all about delay improvements over previous queueing algorithms by decoupling bandwidth and delay.  It says this in the introduction to the paper.  It is not really a point that can be argued, because they have the mathematical proofs and ran simulations to back up their claims.

Harvy66

@Nullity:

If my 600kbit upload is 80% utilized with a 480kbit backlog and then receives a 1500byte (12kbit) packet, that packet sees 800ms of added delay (480kbit at 600kbit/s) on top of the best-case 20ms (12kbit at 600kbit/s), without QoS.

HFSC is all about delay improvements over previous queueing algorithms by decoupling bandwidth and delay.

                            "If my 600kbit upload is 80% utilized with a 480kbit backlog" you mean 100% utilized? A backlog indicates that packets are coming in faster than they're going out, which means your interface is at 100%. And your packets are not actually transferred at 12kb/s, they're transferred at full line rate.

                            "decoupling bandwidth and delay" just means latency is kept stable while bandwidth is honored through advanced scheduling, include noting how large the head packet is in each queue because dequeuing large packets takes longer than smaller ones.

                            It took me a bit to find a speedtest server that I didn't get my full 100mb to, but I found some in Europe that gave me around 80Mb. My queue sizes were pretty much 0 the entire time, a few blips into the teens. My upload did manage to reach 100 for a brief moment during the TCP building phase, which then backed off and stabilized with a 0 queue. My point being that if you're below 80% utilization, your queue should be pretty much empty the entire time.

Nullity

@Harvy66:

"If my 600kbit upload is 80% utilized with a 480kbit backlog": you mean 100% utilized?  A backlog indicates that packets are coming in faster than they're going out, which means your interface is at 100%.  And your packets are not actually transferred at 12kb/s; they're transferred at full line rate.

I meant 12kbits as a size, not a bitrate; I was trying to simplify the bit/byte conversions.
About the utilization percentage: during a 1-second time-span, if I send 480kbits through a 600kbit connection, is it not 80% utilized?  (I am kinda confused about this myself, but perhaps it is a conversation for another thread.)

Just for clarity, can you point out what corrections I need to make to my original post?
I have gotten a bit lost in our long-winded posts.

Regarding how HFSC defines delay and what "decoupled bandwidth and delay" means, here is an excerpt about "Delay and fairness properties of H-FSC" and "Real-time Guarantees" from an HFSC paper: http://www.ecse.rpi.edu/homepages/koushik/shivkuma-teaching/sp2003/case/CaseStudy/stoica-hfsc-ton00.pdf

For the rest of the discussion, we consider the arrival time of a packet to be the time when the last bit of the packet has been received, and the departing time to be the time when the last bit of the packet has been transmitted.
…
Clearly, H-FSC achieves much lower delays for both audio and video sessions. The reduction in delay with H-FSC is especially significant for the audio session. This is a direct consequence of H-FSC’s ability to decouple delay and bandwidth allocation.

To achieve decoupled delay and bandwidth you must use a two-part service curve by setting the m1, d, and m2 parameters.  This is not done automatically and cannot be achieved any other way (except via the somewhat deprecated umax/dmax parameters).  I do not think "decoupled bandwidth and delay" means what you think it means.

Can you please cite some sources with your posts?
I would rather focus on understanding the HFSC papers and share personal anecdotes later.

Harvy66

@Nullity:

I meant 12kbits as a size, not a bitrate; I was trying to simplify the bit/byte conversions.
About the utilization percentage: during a 1-second time-span, if I send 480kbits through a 600kbit connection, is it not 80% utilized?

To achieve decoupled delay and bandwidth you must use a two-part service curve by setting the m1, d, and m2 parameters.

Backlog and utilization are two separate things.  A backlog means packets are enqueuing faster than they're dequeuing, which also means your interface is momentarily at 100%.  You can have 80% utilization without any backlog/buffering, and more often than not you do.  The further you get past 80%, the more common buffering becomes.  And 80% doesn't mean a smooth 80%; it's an average, which means time spent above and below 80%.

Ahh, it seems I misunderstood your 12kbit example.

"Decoupling" bandwidth and latency is about how heavily buffering has been used by naive traffic shapers.  Bandwidth and delay (buffering) have gone hand-in-hand for a long time in many algorithms because coupling them is much simpler to implement.  What is hard is giving a queue a certain amount of bandwidth while maintaining guarantees on latency.

Harvy66

Realtime can only use up to 80% but gets a delay guarantee; linkshare can use up to 100% but can have "delay issues".  80% is fine for non-bulk flow types: it is plenty for my games, DNS, ICMP, NTP, etc., and I can keep those kinds of traffic within guaranteed delays.  HTTP and P2P like to have up to 100% but are less latency-sensitive.  I'm not talking about large latency, just dequeuing jitter.

It is generally "bad" if a queue has realtime bandwidth assigned but does not have enough realtime to satisfy its bandwidth needs.  It would typically be undesirable for your VoIP to have 1Mb of realtime and 1Mb of linkshare when it needs 2Mb of bandwidth.  I assume the primary thing that happens is that as you go further and further beyond your realtime, jitter starts to approach that of linkshare; but below the realtime limit, all bandwidth should be "delay free".
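A hedged ALTQ sketch of that sizing advice (bandwidths, interface, and queue names invented): give the VoIP queue a realtime guarantee that covers its whole expected load rather than splitting the load across realtime and linkshare:

    altq on igb0 bandwidth 10Mb hfsc queue { qVoIP, qDefault }
    # realtime covers the full 2Mb the phones actually need; linkshare alone
    # carries no delay guarantee
    queue qVoIP bandwidth 2Mb hfsc(realtime 2Mb)
    queue qDefault bandwidth 8Mb hfsc(default)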

In theory, if your total bandwidth usage is below 80%, there is no real difference between linkshare and realtime.  In practice, packets arrive in bursts for one reason or another, so HFSC can smooth out those bursts and keep jitter low.  For constant-rate UDP-style traffic there is little that can be done, but with TCP, the per-flow packet rate will attempt to stabilize.

I wonder what the target delay is for HFSC in pfSense, because it could interact badly with Codel, which is also delay-sensitive.  This would probably become an issue as you approach link saturation, above 80% utilization, because HFSC is also "fair".

Based on the abstract summaries of HFSC, it sounds like there is no real "target delay" so much as a tweaking of the quantum of bytes to dequeue per iteration.  I think HFSC's "delay" is effectively its quantum.  If you assume the quantum is 10,000 bytes, which is a number I have seen thrown around, then your delay will be roughly capped by the time it takes to transfer 10,000 bytes of data.  I think this is why certain minimum connection speeds get recommended: a large quantum on a slow connection can mess up the delays.  But your quantum can be no smaller than your MTU.

Another issue that can affect delays is the thread scheduler.  If the thread is only woken every 10ms, then the packet scheduler can be no more accurate than that.  With pfSense on my current box, I see the CPU timer is 2,000/s, which is 0.5ms.

Nullity

@Harvy66:

Based on the abstract summaries of HFSC, it sounds like there is no real "target delay" so much as a tweaking of the quantum of bytes to dequeue per iteration.  I think HFSC's "delay" is effectively its quantum.

When you say things like this it is painfully obvious that you have not read the HFSC papers.  The target delay is set by the user with the m1, d, and m2 parameters.  If you read more than the paper's abstract you would know this.

In the HFSC paper I cited earlier, dequeueing and delay are defined as two separate things and should not be used interchangeably.  Dequeueing overhead is measured, in the cited paper, at below 20 microseconds on outdated hardware; delay is measured and configured in milliseconds.  This is another example suggesting that you have not read the HFSC papers.

The term "quantum" is not used once in the HFSC paper I cited, so your use of this term is confusing.  The paper also says nothing about a minimum connection speed.  Perhaps you are confusing HFSC with Codel?

The standard kern.hz is 1000 (though NanoBSD uses 100), so changing this setting is unnecessary for most users.  You are correct that system clock tick rates are important, but aside from that you are bordering on spreading misinformation.

Again, please cite your sources.

Harvy66

@Nullity:

The target delay is set by the user with the m1, d, and m2 parameters.

The term "quantum" is not used once in the HFSC paper I cited, so your use of this term is confusing.  The paper also says nothing about a minimum connection speed.

If you actually read about implementations of HFSC, they all talk about a quantum.  m1 and d are completely optional if you don't care about burst; you can still benefit from the service curves without those two parameters.  The notion of a quantum is extremely widespread in byte-oriented buffer management, with ever-so-slightly different usages of the term but much the same meaning everywhere.  For example, fq_codel uses the term "quantum" nearly identically.

A quantum in this context is the maximum number of bytes to be dequeued in one pass/iteration/whatever.  The scheduler looks at the head packet of each queue and combines the current service curves with the size of each head packet to determine which packets get processed this quantum.  Once the total number of bytes the quantum represents has been consumed, the scheduler decides whether another quantum must be consumed or whether to go to sleep.  With HFSC, the priority determines the order in which packets are consumed within the quantum, but has no influence on which packets will be consumed.

If you have an older 10ms timer, many quantums may be consumed at once in order to play catch-up, so packet scheduling tends to be more bursty.

Derelict (Netgate)

Please start another thread.  HFSC Theory or something.  This one is supposed to be for configuration examples.

Nullity

@Harvy66:

If you actually read about implementations of HFSC, they all talk about a quantum.

Please post a link to an HFSC paper that includes references to quantums.
Edit: Please PM me the link, since our fruitless back-and-forth is cluttering this thread.

@Derelict:

Please start another thread.  HFSC Theory or something.  This one is supposed to be for configuration examples.

Apologies, Derelict. :(  I will delete my posts.
Could you add the few links from https://forum.pfsense.org/index.php?topic=79589.msg494188#msg494188 and perhaps http://www.ecse.rpi.edu/homepages/koushik/shivkuma-teaching/sp2003/case/CaseStudy/stoica-hfsc-ton00.pdf to the OP?

Derelict (Netgate)

Please don't delete.  It's good content.  If anything, ask a mod to move them to a new thread.
