Traffic not going to Limiters after 2.4.4
-
@vpreatoni said in Traffic not going to Limiters after 2.4.4:
Hi jimp, sorry for disturbing.
Reading bug #8956 in detail, I can see it's a different situation. That report is about not being able to create queues under each limiter. The workaround for that is to manually delete all limiters from the XML file and start from scratch. I filed a new bug, https://redmine.pfsense.org/issues/8973, because in this case the queues are properly created: they are shown in the GUI, and I double-checked with the ipfw pipe show command that the queues are there.
Could you check whether this is just a GUI issue where the traffic is not shown (but the limiters themselves are working), or whether they aren't working at all?
I'd delete my shaper config via the XML file as well and redo it if that would solve it, but as far as I understand your post this still won't help me, even when I'm able to create the queues in the GUI.
-
@jacotec said in Traffic not going to Limiters after 2.4.4:
+1 for a quick fix. This issue is way too critical to wait weeks for a -p1 release in my opinion!
There's a presentation video on limiters from August 2018 for the upcoming 2.4.4 release. I can't understand why a presentation was given when the feature seems to be completely untested.
As much as I love pfSense and appreciate the work Netgate puts in, I really wonder how such a bug can make it into a release version ...
Because maybe it isn't quite that clear and it doesn't affect everyone?
There are a large number of users with limiters on 2.4.4 working just fine, with traffic using the limiters as expected. You need only peek at the FQ_CODEL thread for evidence.
-
If someone has a limiter problem where the queues DO NOT show up, including if you re-created them, I'd like to see the contents of the limiters from config.xml from before the upgrade as well as after. The section I'm looking for is the <dnshaper> ... </dnshaper> section. There should not be anything too private in there, with a possible exception of a masked subnet if you used that.
I'd also like to see the contents of /tmp/rules.limiter, ipfw pipe show, and ipfw queue show.
And as always, make sure you reset states between any limiter config change or test.
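For anyone collecting the requested section, here is a minimal Python sketch that pulls only the <dnshaper> block out of a saved copy of config.xml. It assumes <dnshaper> is a direct child of the root element; the sample XML and the element names inside it are made up for illustration and are not taken from a real configuration.

```python
import xml.etree.ElementTree as ET


def extract_dnshaper(config_xml_text):
    """Return the <dnshaper> section as text, or None if it is absent."""
    root = ET.fromstring(config_xml_text)
    # Assumption: <dnshaper> sits directly under the config root element.
    section = root.find("dnshaper")
    if section is None:
        return None
    return ET.tostring(section, encoding="unicode").strip()


# Made-up sample; the elements inside <dnshaper> are placeholders only.
SAMPLE = """<pfsense>
  <system><hostname>firewall</hostname></system>
  <dnshaper>
    <queue><name>DownLimiter</name></queue>
  </dnshaper>
</pfsense>"""

if __name__ == "__main__":
    print(extract_dnshaper(SAMPLE))
```

Running it against the "before" and "after" copies of the file gives two snippets that can be compared side by side and checked for private data before posting.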
-
@jacotec It's not a GUI issue (in my case), check my first post, there is the output of ipfw pipe show and ipfw queue show. Pipes and subqueues are created properly.
-
@jimp Please find the requested info here: https://jaycloud.de/f/4a4b8a11ff4a49cfb179/
There seems to be no command "ipfw limiter show":
ipfw: bad command `limiter'
Let me know if you need any more information
-
That should have been ipfw queue show, sorry. I edited the message.
-
@jimp OK, that one is empty. I've updated my document above.
-
You have queues defined but they are not loaded. Do your firewall rules have the queue selected or the base limiter itself?
Also the "after" settings look like they were changed after the upgrade. Was that what you have right now after attempting to make changes, or from immediately after the upgrade?
-
@jimp The base limiters were still there after the update, but the child queues were completely gone. My floating rules in the firewall are still there, but they were using the child queues before the update. After the update the child queue assignment in all floating rules was gone, just showing "none". So pfSense deleted the configured pipes at this point after the update.
My child queues are no longer available as a selection for the In/Out pipe of the rules; I see only the base limiters there.
I changed the base queues to "FQCodel" later, after the update, right ... hoping that I could see / recreate my children after changing the settings. Which did not happen. But the children had already vanished before that, right after the update and still with the old settings.
Do you think it would make sense to delete the dnshaper section from the XML, reboot and recreate the limiters and children in the GUI to see if they would work then?
-
@jimp
So, I've deleted all my limiters and deassigned the queues from the firewall rules.
I was then able to reconfigure all limiters and queues from scratch, and the child queues are now showing up and I can reassign them to the firewall rules. Did a "reset states", did a reboot, but traffic is not going to the queues.
On the console I periodically see errors:
config_aqm Unable to configure flowset, flowset busy
I've then changed all limiters / queues back to "TailDrop" / "FIFO", Reset states ... I don't see the error messages above, but still all limiters and queues are showing "0 flows" in the limiter info. :-(
Any ideas?
-
Now this forum software goes crazy as well ... wanted to add my ipfw output to my post and the forum repeatedly shows the error message "post was flagged as spam by Akismet".
So I can't post my ipfw output here. Stupid piece of software!
-
@jacotec said in Traffic not going to Limiters after 2.4.4:
Now this forum software goes crazy as well ... wanted to add my ipfw output to my post and the forum repeatedly shows the error message "post was flagged as spam by Akismet".
So I can't post my ipfw output here. Stupid piece of software!
Post it as a text attachment and not in the body of a message.
-
OK, I found that the /tmp/rules.limiter file did not show all of the queues after I recreated them, and the ones which were there did not match what I had configured.
It was interesting that the file did get a new timestamp after I've applied a change in the limiters GUI, but there was no change in the file content!
I've deleted the file completely and touched a queue in the GUI, then I've applied the limiters again in the GUI. The file which was created now looks good!
Afterward I needed to touch every firewall rule again (I've set them all to the base limiter and applied, then set them back to the child queues and applied again). I now see some traffic in the queues.
It's hard to say this quickly whether it's working as expected, as the limiter info is neither fast nor very detailed, but I'll observe it.
-
@jacotec Hey!! I'm glad you're having the same problem I initially reported at the beginning of my post. For a while I thought I was being ignored by developers...
Now that you have recreated your subqueues in the GUI (their deletion is of course an upgrade bug; they shouldn't have been deleted in the first place), we can move forward and analyze why rules are not sending traffic to subqueues, which IMHO is a serious bug because it involves low-level pieces of software (kernel, pf, etc).
So, @jimp, here are my contents:
[2.4.4-RELEASE][root@firewall]/root: ipfw queue show
q00001 50 sl. 0 flows (256 buckets) sched 1 weight 20 lmax 0 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
q00002 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 0 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
q00003 50 sl. 0 flows (256 buckets) sched 2 weight 20 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
q00004 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000

[2.4.4-RELEASE][root@firewall]/root: ipfw pipe show
00001: 9.500 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 950.000 Kbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active

[2.4.4-RELEASE][root@firewall]/root: cat /tmp/rules.limiter
pipe 1 config bw 9500Kb droptail
sched 1 config pipe 1 type fifo
queue 1 config pipe 1 weight 20 mask dst-ip6 /128 dst-ip 0xffffffff droptail
queue 2 config pipe 1 weight 1 mask dst-ip6 /128 dst-ip 0xffffffff droptail
pipe 2 config bw 950Kb droptail
sched 2 config pipe 2 type fifo
queue 3 config pipe 2 weight 20 mask src-ip6 /128 src-ip 0xffffffff droptail
queue 4 config pipe 2 weight 1 mask src-ip6 /128 src-ip 0xffffffff droptail
I will also post two of my rules that use limiters, from /tmp/rules.debug:
pass in quick on $LAN $GWTC_failover_WiFi inet proto { tcp udp } from 192.168.211.0/24 to any port $navegacion_libre tracker 0100000101 keep state dnqueue( 3,1) label "USER_RULE: LAN Nav"
pass in quick on $GUEST $GWTC_failover_WiFi inet proto { tcp udp } from 10.0.4.0/24 to any port $navegacion_libre tracker 1522121250 keep state dnqueue( 4,2) label "USER_RULE: Guest nav"
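One consistency check that can be done on these two files is to confirm that every queue number referenced by a dnqueue(...) clause in /tmp/rules.debug is actually declared in /tmp/rules.limiter. The Python sketch below does that on text copied from the listings above. The function names are hypothetical, and the regular expressions only assume the statement shapes visible in this thread, not the full syntax of either file.

```python
import re


def defined_queues(rules_limiter_text):
    """Queue numbers declared by 'queue N config ...' statements."""
    found = re.findall(r"\bqueue\s+(\d+)\s+config\b", rules_limiter_text)
    return {int(n) for n in found}


def referenced_queues(rules_debug_text):
    """Queue numbers used in dnqueue(a) or dnqueue(a,b) clauses."""
    refs = set()
    pattern = r"dnqueue\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)"
    for first, second in re.findall(pattern, rules_debug_text):
        refs.add(int(first))
        if second:
            refs.add(int(second))
    return refs


def missing_queues(rules_limiter_text, rules_debug_text):
    """Queue numbers the rules reference but the limiter file lacks."""
    missing = referenced_queues(rules_debug_text) - defined_queues(rules_limiter_text)
    return sorted(missing)


# Text copied from the listings in this post (rules shortened to the clause).
LIMITER = """pipe 1 config bw 9500Kb droptail
sched 1 config pipe 1 type fifo
queue 1 config pipe 1 weight 20 mask dst-ip6 /128 dst-ip 0xffffffff droptail
queue 2 config pipe 1 weight 1 mask dst-ip6 /128 dst-ip 0xffffffff droptail
pipe 2 config bw 950Kb droptail
sched 2 config pipe 2 type fifo
queue 3 config pipe 2 weight 20 mask src-ip6 /128 src-ip 0xffffffff droptail
queue 4 config pipe 2 weight 1 mask src-ip6 /128 src-ip 0xffffffff droptail"""

RULES = """keep state dnqueue( 3,1) label "USER_RULE: LAN Nav"
keep state dnqueue( 4,2) label "USER_RULE: Guest nav" """

if __name__ == "__main__":
    print(missing_queues(LIMITER, RULES))
```

An empty result only says the numbers line up between the two files. It says nothing about whether the kernel has loaded the queues or whether traffic reaches them.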
I can also confirm that if I move traffic to the base limiters as @jacotec did, I can see some activity. But of course we lose the dynamic bandwidth assignment if we do that:
[2.4.4-RELEASE][root@firewall]/root: ipfw pipe show
00001: 9.500 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 1 active
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
  0 ip 0.0.0.0/0 0.0.0.0/0 6 4581 0 0 0
00002: 950.000 Kbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 1 active
  0 ip 0.0.0.0/0 0.0.0.0/0 6 654 0 0 0
Hope we can work together to debug this problem and find a solution. Thanks.
Victor
-
@vpreatoni In my opinion at least your /tmp/rules.limiter seems to match your configuration. When you move all your firewall rules to the base queues, then press "reload the firewall" ... and then move all rules back to the child queues (reload firewall again then) - do you still see no traffic in the limiters?
I've changed mine to "Codel" / "QFQ" now instead of "Taildrop/FIFO". Here I see the traffic in the children.
As mentioned, I needed to delete my /tmp/rules.limiter file and let pfSense recreate it, but in my case the file did not reflect my config. Maybe you should do the same to be on the safe side? Do this before you switch your firewall rules away and back.
-
Just a shot in the dark: did you use fq_codel in the previous pfSense version? If yes, do you still have a shell command, system patch or script active that was used to switch your limiters to fq_codel?
I'm running a pretty complex HFSC-based traffic shaping setup with multiple (child) queues, and that upgraded fine from 2.4.3p1 to 2.4.4 and works flawlessly here.
-
@jacotec WOW WOW WOW!!!! We have something!!! Changing the limiter to Codel/QFQ as suggested made it work.
In my case I just changed the Download limiter, and you can see how the dynamic Download queues get filled in:
Limiters:
00001: 9.500 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 AQM CoDel target 5ms interval 100ms NoECN
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 950.000 Kbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active

Queues:
q00001 50 sl. 5 flows (256 buckets) sched 1 weight 20 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 90 ip 0.0.0.0/0 192.168.211.11/0 9 4635 0 0 0
 94 ip 0.0.0.0/0 192.168.211.15/0 1 330 0 0 0
 99 ip 0.0.0.0/0 192.168.211.50/0 112 20579 0 0 0
108 ip 0.0.0.0/0 192.168.211.61/0 5 260 0 0 0
109 ip 0.0.0.0/0 192.168.211.60/0 94 22025 0 0 0
q00002 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
q00003 50 sl. 0 flows (256 buckets) sched 2 weight 20 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
q00004 50 sl. 0 flows (256 buckets) sched 2 weight 1 lmax 0 pri 0 droptail
    mask: 0x00 0xffffffff/0x0000 -> 0x00000000/0x0000
I wouldn't imagine that choosing a more complex scheduler would make it work...
Anyway, I see this as a "Yellow alarm", because there should be no reason for the Taildrop/FIFO AQM/scheduler combination to fail. PS: I didn't need to delete /tmp/rules.limiter. As soon as I applied the limiter config, it began to work.
-
@grimson Nope, I've never used that. I was on a regular 2.4.3 installation.
-
@vpreatoni Good to hear it works for you now as well!
-
Can see that our traffic shaper is nonfunctional now as of 2.4.4 in terms of per-host dynamic bandwidth shaping.
Work around for the missing queues/inability to create queues in 2.4.4 was to delete all limiters & queues, then recreate them, then reassign queues to firewall rules. This part worked eventually. Then, found out that the limiter diagnostic info was not functional with Taildrop/FIFO (ipfw pipe show and ipfw queue show). Switching to Codel/QFQ allowed monitoring queues using ipfw pipe show and ipfw queue show, but while the overall bandwidth limiting was working (capping to the max allocated bandwidth), the shaper was NOT actually shaping, either with Taildrop/FIFO or with Codel/QFQ. Codel/QFQ caused system to crash eventually and had to be manually restarted.
Whole point of the setup was to do the amazing per-host dynamic bandwidth dividing that pfSense was so good at. Can confirm now that although the limiters/queues are recreated and working to limit the maximum aggregate bandwidth, the mask by destination address (Down_LAN) or source address (Up_LAN) does not seem to work. These queues are under the DownLimiter and UpLimiter limiters. Up_LAN is assigned to the In pipe on a LAN interface firewall rule and Down_LAN is assigned to the Out pipe. All of this was working before 2.4.4! The hosts always showed identical traffic during peak usage, dividing the total bandwidth evenly. This is nonfunctional now.
Is there any way I can directly verify whether traffic is actually going through the per-host queues when using Taildrop/FIFO, given that ipfw pipe show and ipfw queue show do not show it happening?
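When ipfw queue show does report flows, its output can be summarized per queue to see which hosts landed in which child queue. Below is a rough Python sketch that parses text in the shape posted earlier in this thread. The regular expressions are assumptions based only on those listings and may not cover every output variant, and the script can only report what ipfw prints, so it does not help in the Taildrop/FIFO case where the counters stay at zero.

```python
import re

# Header line of a queue, e.g. "q00001 50 sl. 5 flows (256 buckets) ..."
QUEUE_RE = re.compile(r"^q(\d+)\s+\d+ sl\.\s+(\d+) flows")
# Bucket line: bucket, protocol, source, destination, total packets, total bytes
BUCKET_RE = re.compile(r"^\s*\d+\s+\S+\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)")


def parse_queue_show(text):
    """Map queue number -> {'flows': int, 'hosts': [(src, dst, pkts, bytes)]}."""
    queues = {}
    current = None
    for line in text.splitlines():
        header = QUEUE_RE.match(line.strip())
        if header:
            current = int(header.group(1))
            queues[current] = {"flows": int(header.group(2)), "hosts": []}
            continue
        bucket = BUCKET_RE.match(line)
        if bucket and current is not None:
            src, dst, pkts, nbytes = bucket.groups()
            queues[current]["hosts"].append((src, dst, int(pkts), int(nbytes)))
    return queues


# Sample copied from the "Queues:" output posted earlier in the thread.
SAMPLE = """q00001 50 sl. 5 flows (256 buckets) sched 1 weight 20 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000
BKT Prot ___Source IP/port____ ____Dest. IP/port____ Tot_pkt/bytes Pkt/Byte Drp
 90 ip 0.0.0.0/0 192.168.211.11/0 9 4635 0 0 0
 94 ip 0.0.0.0/0 192.168.211.15/0 1 330 0 0 0
 99 ip 0.0.0.0/0 192.168.211.50/0 112 20579 0 0 0
108 ip 0.0.0.0/0 192.168.211.61/0 5 260 0 0 0
109 ip 0.0.0.0/0 192.168.211.60/0 94 22025 0 0 0
q00002 50 sl. 0 flows (256 buckets) sched 1 weight 1 lmax 1500 pri 0 droptail
    mask: 0x00 0x00000000/0x0000 -> 0xffffffff/0x0000"""

if __name__ == "__main__":
    for number, info in parse_queue_show(SAMPLE).items():
        print(number, info["flows"], [host[1] for host in info["hosts"]])
```

Saving the command output to a file a few times during a busy period and feeding each copy to parse_queue_show gives a crude view of whether per-host buckets appear at all.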
-
Update: Have now switched to Codel/Round Robin. This combination seems to work -- traffic goes to child queues as expected and we can achieve the per-host dynamic bandwidth allocation. Would be nice if any other combinations including Taildrop with FIFO or QFQ would also work in the future to try and find optimal settings.
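As a reference point for what "per-host dynamic bandwidth allocation" should look like when it works, here is a small Python sketch of the idealized split: each active per-host flow gets a share of the limiter bandwidth proportional to its queue weight. This is a simplified model, not a description of how any particular dummynet scheduler behaves in practice, and whether a given scheduler honors weights at all is an assumption. The 9500 Kbit/s figure and the weights 20 and 1 are borrowed from the configuration posted earlier in the thread.

```python
def ideal_shares(total_kbit, weights):
    """Split total_kbit among active flows in proportion to their weights."""
    if not weights:
        return []
    total_weight = sum(weights)
    return [total_kbit * weight / total_weight for weight in weights]


if __name__ == "__main__":
    # Five active hosts in the same child queue (equal weights): even split.
    print(ideal_shares(9500, [1, 1, 1, 1, 1]))
    # Two hosts in a weight-20 queue and one host in a weight-1 queue.
    print(ideal_shares(9500, [20, 20, 1]))
```

If the observed per-host rates during a saturated download are nowhere near these proportions, that is a hint the mask or the child queue assignment is not taking effect.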
-
After some testing, I can confirm CoDel/QFQ is pretty fucked up! My server restarts after every 2 or 3 days of use, and I get flooded with:
qfq_dequeue BUG/* non-workconserving leaf */
I'm attaching my debug info here: 0_1539455038230_textdump.tar.0
Could any dev please reply to this post? I'm happy to provide any debugging information or do some testing, but IT IS NOT SERIOUS TO RELEASE A VERSION SO FUCKED UP WITH BUGS, and there was no rush to do it. 2.4.3_1 was working fine. Our servers are production machines.
-
@vpreatoni said in Traffic not going to Limiters after 2.4.4:
After some testing, I can confirm CoDel/QFQ is pretty fucked up! My server restarts after every 2 or 3 days of use, and I get flooded with:
qfq_dequeue BUG/* non-workconserving leaf */
This problem was driving me crazy, but I fixed it by doing this:
- Removed every limiter on traffic shaper page.
- Full reboot of pfsense box
- Recreated the limiters and I even used different names for them, just in case...
- Assigned the new limiters to the relevant firewall rules.
That fixed my problem.
-
I'm also having problems with limiters and shaper in general.
I upgraded to 2.4.4, and the limiters disappeared. I recreated them, but there was no incoming traffic at all. Upload traffic seemed OK. So I reverted to 2.4.3 (it's a VM), but then I realized that in 2.4.3 I cannot edit the limiters or the shaper without bad things happening. Just modifying the bandwidth on the limiters resulted in very low incoming bandwidth (something like 700 kbps instead of 3.5 Mbps). So I decided to delete all limiters and re-create them, and I got the same problem as with the upgrade to 2.4.4: no incoming traffic. Removing the traffic shaper (CBQ) also resulted in the same problem. Fortunately, I just reverted to a previous snapshot, which restored normal functionality. But it seems I cannot modify anything at all on the shapers/limiters without bad things happening. Not good.
Maybe part of the issue was already on 2.4.3?
Adding new functionality is nice, but lots of things can go wrong with every change. To me, reliability is much more important than new stuff, and lately I have second thoughts about clicking the update button.
-
Howdy folks, I don't know if this is relevant, but I will just throw it out here. I have the traffic shaper using CoDel as per the Netgate video and started to get log entries like (config_aqm unable to configure flow set). The WAN connection would also break.
This all started when I changed the interface settings from "default" to "autoselect". After making the change to autoselect, everything ran fine for 8 hours with no indication of problems. Reboots of pfSense and the modem changed nothing.
I then changed the interfaces back to default and all problems evaporated.
When I made the changes, reboots and state table reset were done per best practice.
Using: 2.4.4
Supermicro C2558
cable
I hope this is helpful.
-
I see that there's a 2.4.4-p1 version now. Has this problem been fixed there?
-
@fsr said in Traffic not going to Limiters after 2.4.4:
I see that there's a 2.4.4-p1 version now. Has this problem been fixed there?
Yes, assuming it was this: https://redmine.pfsense.org/issues/8973
-
@fsr Yes, it's fixed! Now it is clear which scheduler is the default one (WF2Q+), and it works perfectly.
I haven't tested QFQ yet, but I'm pretty happy with the behavior of the CoDel AQM with the WF2Q+ scheduler. Some other issues have been solved in 2.4.4_1 too, like the unbound memory leak.
-
Excellent!! Thanks a lot!!