Traffic not going to Limiters after 2.4.4
-
@jacotec said in Traffic not going to Limiters after 2.4.4:
+1 for a quick fix. In my opinion this issue is far too critical to wait weeks for a -p1 release!
There's a presentation video on limiters from August 2018 for the upcoming 2.4.4 release. I can't understand how that presentation could be given when the feature seems to have been completely untested.
As much as I love pfSense and appreciate the work Netgate puts in, I really wonder how such a bug can make it into a release version ...
Because maybe it isn't quite that clear and it doesn't affect everyone?
There are a large number of users with limiters on 2.4.4 working just fine, with traffic using the limiters as expected. You need only peek at the FQ_CODEL thread for evidence.
-
If someone has a limiter problem where the queues DO NOT show up, including if you re-created them, I'd like to see the contents of the limiters from config.xml from before the upgrade as well as after. The section I'm looking for is the <dnshaper> ... </dnshaper> section. There should not be anything too private in there, with a possible exception of a masked subnet if you used that.
I'd also like to see the contents of /tmp/rules.limiter, ipfw pipe show, and ipfw queue show.
And as always, make sure you reset states between any limiter config change or test.
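For anyone who wants to pull just that section out of a config backup before posting it, here is a minimal Python sketch. It assumes the backup is well-formed XML containing a <dnshaper> element; the sample document and the element names inside <dnshaper> are hypothetical and heavily trimmed, not taken from a real configuration.

```python
import xml.etree.ElementTree as ET


def extract_dnshaper(config_xml_text):
    """Return the <dnshaper> section as a string, or None if it is absent."""
    root = ET.fromstring(config_xml_text)
    # The section may be the root itself or nested somewhere below it.
    node = root if root.tag == "dnshaper" else root.find(".//dnshaper")
    if node is None:
        return None
    # tostring() also emits trailing whitespace after the element; strip it.
    return ET.tostring(node, encoding="unicode").strip()


# Hypothetical, trimmed config used only to illustrate the call.
SAMPLE = """<pfsense>
  <system><hostname>firewall</hostname></system>
  <dnshaper>
    <queue><name>DownLimiter</name></queue>
  </dnshaper>
</pfsense>"""

section = extract_dnshaper(SAMPLE)
print(section)
```

Running it once against the pre-upgrade backup and once against the current config gives two snippets that can be compared side by side or attached as text files.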
-
@jacotec It's not a GUI issue (in my case), check my first post, there is the output of ipfw pipe show and ipfw queue show. Pipes and subqueues are created properly.
-
@jimp Please find the requested info here: https://jaycloud.de/f/4a4b8a11ff4a49cfb179/
There seems to be no command "ipfw limiter show":
ipfw: bad command `limiter'
Let me know if you need any more information
-
That should have been ipfw queue show, sorry. I edited the message.
-
@jimp OK, that one is empty. I've updated my document above.
-
You have queues defined but they are not loaded. Do your firewall rules have the queue selected or the base limiter itself?
Also the "after" settings look like they were changed after the upgrade. Was that what you have right now after attempting to make changes, or from immediately after the upgrade?
-
@jimp The base limiters were still there after the update, but the child queues were completely gone. My floating rules in the firewall are still there, but they were using the child queues before the update. After the update the child queue assignments in all floating rules were gone, just showing "none". So pfSense deleted the configured child queues during the update.
My child queues are no longer available as a selection for the In/Out pipe of the rules, I see only the base limiters there.
I changed the base queues to "FQCodel" later, after the update, right ... hoping that I could see / recreate my children after changing the settings. Which did not happen. But the children had vanished before that, right after the update and still with the old settings.
Do you think it would make sense to delete the dnshaper section from the XML, reboot and recreate the limiters and children in the GUI to see if they would work then?
-
@jimp
So, I've deleted all my limiters and deassigned the queues from the firewall rules.
I was then able to reconfigure all limiters and queues from scratch, and the child queues are now showing up and I can reassign them to the firewall rules.
Did a "reset states", did a reboot - but traffic is not going to the queues.
On the console I periodically see errors:
config_aqm Unable to configure flowset, flowset busy
I've then changed all limiters / queues back to "TailDrop" / "FIFO", Reset states ... I don't see the error messages above, but still all limiters and queues are showing "0 flows" in the limiter info. :-(
Any ideas?
-
Now this forum software goes crazy as well ... wanted to add my ipfw output to my post and the forum repeatedly shows the error message "post was flagged as spam by Akismet".
So I can't post my ipfw output here. Stupid piece of software!
-
@jacotec said in Traffic not going to Limiters after 2.4.4:
Now this forum software goes crazy as well ... wanted to add my ipfw output to my post and the forum repeatedly shows the error message "post was flagged as spam by Akismet".
So I can't post my ipfw output here. Stupid piece of software!
Post it as a text attachment and not in the body of a message.
-
OK, I found that the /tmp/rules.limiter file did not show all of the queues after I had recreated them, and the ones which were there did not match what I had configured.
Interestingly, the file did get a new timestamp after I applied a change in the limiters GUI, but there was no change in the file content!
I deleted the file completely and touched a queue in the GUI, then applied the limiters again in the GUI. The file which was created now looks good!
Afterward I needed to touch every firewall rule again (I set them all to the base limiter and applied, then set them back to the child queues and applied again). I now see some traffic in the queues.
It's hard to say this quickly whether it's working as expected, as the limiter info is neither fast nor very detailed, but I'll observe it.
-
@jacotec Hey!! I'm glad you're having the same problem I initially reported at the beginning of my post. For a while I thought I was being ignored by the developers...
Now that you have recreated your subqueues in the GUI (which of course is an upgrade bug, they shouldn't have been deleted in the first place), we can move forward and analyze why rules are not sending traffic to subqueues, which IMHO is a serious bug because it involves low-level pieces of software (kernel, pf, etc).
So, @jimp here are my contents:
[2.4.4-RELEASE][root@firewall]/root: ipfw queue show
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 20 lmax 0 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
q00002 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
q00003 50 sl. 0 flows (256 buckets) sched 2 weight 20 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
q00004 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
[2.4.4-RELEASE][root@firewall]/root: ipfw pipe show
00001: 9.500 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 950.000 Kbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active
[2.4.4-RELEASE][root@firewall]/root: cat /tmp/rules.limiter
pipe 1 config bw 9500Kb droptail
sched 1 config pipe 1 type fifo
queue 1 config pipe 1 weight 20 mask dst-ip6 /128 dst-ip 0xffffffff droptail
queue 2 config pipe 1 weight 1 mask dst-ip6 /128 dst-ip 0xffffffff droptail
pipe 2 config bw 950Kb droptail
sched 2 config pipe 2 type fifo
queue 3 config pipe 2 weight 20 mask src-ip6 /128 src-ip 0xffffffff droptail
queue 4 config pipe 2 weight 1 mask src-ip6 /128 src-ip 0xffffffff droptail
I will also post two of my rules that use limiters, taken from /tmp/rules.debug:
pass in quick on $LAN $GWTC_failover_WiFi inet proto { tcp udp } from 192.168.211.0/24 to any port $navegacion_libre tracker 0100000101 keep state dnqueue( 3,1) label "USER_RULE: LAN Nav"
pass in quick on $GUEST $GWTC_failover_WiFi inet proto { tcp udp } from 10.0.4.0/24 to any port $navegacion_libre tracker 1522121250 keep state dnqueue( 4,2) label "USER_RULE: Guest nav"
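A quick way to confirm that every queue referenced by a firewall rule is actually defined in the limiter ruleset is to compare the two files mechanically. The Python sketch below is only an illustration: it assumes the line shapes seen in this thread ("queue N config pipe M" in /tmp/rules.limiter and "dnqueue( a,b)" in /tmp/rules.debug), the function names are made up for this example, and the sample rules are shortened.

```python
import re


def defined_queues(rules_limiter_text):
    """Map queue number -> parent pipe number from 'queue N config pipe M'."""
    pairs = re.findall(r"\bqueue\s+(\d+)\s+config\s+pipe\s+(\d+)", rules_limiter_text)
    return {int(queue): int(pipe) for queue, pipe in pairs}


def referenced_queues(rules_debug_text):
    """Collect every queue number that appears in a dnqueue(a,b) clause."""
    refs = set()
    for a, b in re.findall(r"dnqueue\(\s*(\d+)\s*,\s*(\d+)\s*\)", rules_debug_text):
        refs.update((int(a), int(b)))
    return refs


def missing_queues(rules_limiter_text, rules_debug_text):
    """Queue numbers used by firewall rules but not defined as limiter queues."""
    defined = set(defined_queues(rules_limiter_text))
    return sorted(referenced_queues(rules_debug_text) - defined)


# Sample data adapted from the output posted in this thread.
LIMITER = r"""pipe 1 config bw 9500Kb droptail
sched 1 config pipe 1 type fifo
queue 1 config pipe 1 weight 20 mask dst-ip6 /128 dst-ip 0xffffffff droptail
queue 2 config pipe 1 weight 1 mask dst-ip6 /128 dst-ip 0xffffffff droptail
pipe 2 config bw 950Kb droptail
sched 2 config pipe 2 type fifo
queue 3 config pipe 2 weight 20 mask src-ip6 /128 src-ip 0xffffffff droptail
queue 4 config pipe 2 weight 1 mask src-ip6 /128 src-ip 0xffffffff droptail"""

RULES = r"""pass in quick on $LAN inet proto { tcp udp } from 192.168.211.0/24 to any keep state dnqueue( 3,1) label "USER_RULE: LAN Nav"
pass in quick on $GUEST inet proto { tcp udp } from 10.0.4.0/24 to any keep state dnqueue( 4,2) label "USER_RULE: Guest nav" """

print(defined_queues(LIMITER))
print(missing_queues(LIMITER, RULES))
```

An empty result from missing_queues means the rules and the limiter file agree on queue numbers, so any remaining problem lies elsewhere. A non-empty result would point at a stale or incomplete /tmp/rules.limiter.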
I can also confirm that if I move traffic to the base limiters as @jacotec did, I can see some activity. But of course we lose the dynamic bandwidth assignment when doing that:
[2.4.4-RELEASE][root@firewall]/root: ipfw pipe show
00001: 9.500 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 1 active
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip 0.0.0.0/0 0.0.0.0/0 6 4581 0 0 0
00002: 950.000 Kbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 1 active
  0 ip 0.0.0.0/0 0.0.0.0/0 6 654 0 0 0
Hope we can work together to debug this problem and find a solution. Thanks.
Victor
-
@vpreatoni In my opinion your /tmp/rules.limiter at least seems to match your configuration. When you move all your firewall rules to the base queues, then press "reload the firewall" ... and then move all rules back to the child queues (and reload the firewall again) - do you still see no traffic in the limiters?
I've changed mine to "Codel" / "QFQ" now instead of "Taildrop/FIFO". Here I see the traffic in the children.
As mentioned, I needed to delete my /tmp/rules.limiter file and let pfSense recreate it. But in my case the file did not reflect my config. Maybe you should do the same to be on the safe side? Do this before you switch your firewall rules away and back.
-
Just a shot in the dark, did you use fq_codel in the previous pfSense version? If yes, do you still have a shell command, system patch or script active that was used to switch your limiters to fq_codel?
I'm running a pretty complex HFSC based traffic shaping setup with multiple (child) queues, and that upgraded fine from 2.4.3p1 to 2.4.4 and works flawlessly here.
-
@jacotec WOW WOW WOW!!!! We have something!!! Changing Limiter to Codel/QFQ as suggested made it work.
In my case I just changed the Download Limiter, and u can see how dynamic Download queues get filled in:
Limiters:
00001: 9.500 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 AQM CoDel target 5ms interval 100ms NoECN
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 950.000 Kbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active
Queues:
q00001 50 sl. 5 flows (256 buckets) sched 1 weight 20 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 90 ip 0.0.0.0/0 192.168.211.11/0 9 4635 0 0 0
 94 ip 0.0.0.0/0 192.168.211.15/0 1 330 0 0 0
 99 ip 0.0.0.0/0 192.168.211.50/0 112 20579 0 0 0
108 ip 0.0.0.0/0 192.168.211.61/0 5 260 0 0 0
109 ip 0.0.0.0/0 192.168.211.60/0 94 22025 0 0 0
q00002 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
q00003 50 sl. 0 flows (256 buckets) sched 2 weight 20 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
q00004 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
I wouldn't imagine that choosing a more complex scheduler would make it work...
Anyway, I see this as a "Yellow alarm", because there should be no reason for the Taildrop/FIFO aqm/sched combination to fail.
PS: I didn't need to delete /tmp/rules.limiter. As soon as I applied the Limiters config, it began to work.
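When comparing before and after output like the above, it can help to reduce the queue listing to a flow count per queue. The following Python sketch is an illustration only: it assumes queue header lines of the form "qNNNNN 50 sl. N flows" as pasted in this thread, the function name is invented for the example, and the sample is a shortened copy of the output above.

```python
import re


def flows_per_queue(queue_show_text):
    """Map queue id (e.g. 'q00001') -> flow count taken from header lines."""
    pattern = re.compile(r"\b(q\d+)\s+\d+\s+sl\.\s+(\d+)\s+flows")
    return {name: int(flows) for name, flows in pattern.findall(queue_show_text)}


# Shortened sample adapted from the output posted in this thread.
SAMPLE = """q00001 50 sl. 5 flows (256 buckets) sched 1 weight 20 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 90 ip 0.0.0.0/0 192.168.211.11/0 9 4635 0 0 0
 94 ip 0.0.0.0/0 192.168.211.15/0 1 330 0 0 0
q00002 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 1500 pri 0 droptail
q00003 50 sl. 0 flows (256 buckets) sched 2 weight 20 lmax 0 pri 0 droptail
q00004 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail"""

counts = flows_per_queue(SAMPLE)
idle = sorted(name for name, flows in counts.items() if flows == 0)
print(counts)
print("idle queues:", idle)
```

In this sample only the first download queue carries flows, which matches the observation that only the Download Limiter was switched to Codel/QFQ.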
-
@grimson Nope, I've never used that. I was on a regular 2.4.3 installation.
-
@vpreatoni Good to hear it works for you now as well!
-
Can see that our traffic shaper is nonfunctional as of 2.4.4 in terms of per-host dynamic bandwidth shaping.
The workaround for the missing queues / inability to create queues in 2.4.4 was to delete all limiters and queues, then recreate them, then reassign the queues to the firewall rules. This part worked eventually. Then we found out that the limiter diagnostic info (ipfw pipe show and ipfw queue show) was not functional with Taildrop/FIFO. Switching to Codel/QFQ allowed monitoring the queues using ipfw pipe show and ipfw queue show, but while the overall bandwidth limiting was working (capping to the max allocated bandwidth), the shaper was NOT actually shaping, either with Taildrop/FIFO or with Codel/QFQ. Codel/QFQ eventually caused the system to crash and it had to be manually restarted.
The whole point of the setup was to do the amazing per-host dynamic bandwidth dividing that pfSense was so good at. Can confirm now that although the limiters/queues are recreated and working to limit the maximum aggregate bandwidth, the mask by destination address (Down_LAN) or source address (Up_LAN) does not seem to work. These queues are under the DownLimiter and UpLimiter limiters. Up_LAN is assigned to the In pipe on a LAN interface firewall rule and Down_LAN is assigned to the Out pipe. All of this was working before 2.4.4! The hosts always showed identical traffic during peak usage, dividing the total bandwidth evenly. This is nonfunctional now.
Is there any way I can directly verify whether traffic is actually going through the per-host queues when using Taildrop/FIFO, even though ipfw pipe show and ipfw queue show do not show that happening?
-
Update: Have now switched to Codel/Round Robin. This combination seems to work -- traffic goes to child queues as expected and we can achieve the per-host dynamic bandwidth allocation. Would be nice if any other combinations including Taildrop with FIFO or QFQ would also work in the future to try and find optimal settings.