Limiter Issue
-
We are using a couple of Netgate XG-7100s running 21.05.2-RELEASE (amd64).
We have been using limiters for quite a while, but somehow we got them to "break". It started when we changed the limiter In / Out settings in the corresponding firewall rule to a new limiter offering more bandwidth on that VLAN. pfSense reported the following error at the time:
18:54:05 There were error(s) loading the rules: /tmp/rules.debug:367: syntax error - The line in question reads [367]:
pass in quick on $HMVLAN161VA1 inet from 10.120.161.0/24 to ! $Private_IP_Range tracker 1633693771 keep state dnpipe ( 3,) label "USER_RULE: Default Allow Internet"
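The rule quoted in that error ends in `dnpipe ( 3,)`, i.e. one pipe number followed by a comma and nothing after it, which is presumably what the rule parser rejects. As a rough aid, here is a minimal Python sketch that scans rule lines (for example, from a copy of /tmp/rules.debug) for such incomplete pairs. The function name, the regex and the second sample rule are illustrative assumptions, not anything pfSense ships.

```python
import re

# Matches "dnpipe ( a, b )" and captures both slots; either slot may be empty.
DNPIPE_PAIR = re.compile(r"dnpipe\s*\(\s*(\d*)\s*,\s*(\d*)\s*\)")

def find_incomplete_dnpipe(lines):
    """Return (line_number, line) for rules whose dnpipe pair lacks a number."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = DNPIPE_PAIR.search(line)
        if match and (match.group(1) == "" or match.group(2) == ""):
            hits.append((lineno, line.rstrip("\n")))
    return hits

sample = [
    # The rule from the error message above.
    'pass in quick on $HMVLAN161VA1 inet from 10.120.161.0/24 to ! '
    '$Private_IP_Range tracker 1633693771 keep state dnpipe ( 3,) '
    'label "USER_RULE: Default Allow Internet"',
    # Hypothetical well-formed rule, for contrast only.
    'pass in quick on $LAN inet from any to any keep state dnpipe ( 1,2) '
    'label "hypothetical good rule"',
]

for lineno, _rule in find_incomplete_dnpipe(sample):
    print(f"line {lineno}: incomplete dnpipe pair")
```

To check a real file, pass `open(path).readlines()` for a copy of the rules file instead of `sample`.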
Since then, any limiter we add or modify in the firewall rules no longer works. As soon as we assign one to a rule, no traffic passes through that rule anymore. In this case it is our "Default Internet" rule, which "breaks" the whole internet connection for the corresponding VLAN. The web interface does not show any error, though.
Looking at the log files we see the following error every time we change in / out pipes in firewall rules:
/rc.filter_configure_sync: The command '/sbin/ipfw /tmp/rules.limiter' returned exit code '64', the output was 'Line 2: need a pipe/flowset/sched number'
We have already:
- Restarted the firewall appliance (reboot & halt)
- Deleted the corresponding interfaces including all the associated firewall rules
- Deleted all limiters and manually added them again
- Deleted all traffic shaping queues
Has anyone observed something similar before? Any ideas how to fix it? ;-)
Thanks!!
-
@hannesclp If you need to get it working can you restore from config backup, or from Diagnostics/Backup & Restore/Config History if no backup was saved?
-
@steveits Thanks for your quick reply! Unfortunately we do not have a very recent backup, since we cannot pinpoint exactly when it happened the first time.
The appliance is off-site, though, which is why I have not tried a restore yet. Judging from the second error, it seems the OS level is somehow out of sync with the pfSense software?
-
Hello!
Maybe out of sync...
I have a very simple limiter setup. The config.xml for the limiters looks like this:
<dnshaper>
  <queue>
    <name>3MbpsOutDown_DMZ</name>
    <number>1</number>
    <qlimit></qlimit>
    <plr></plr>
    <description></description>
    <bandwidth>
      <item>
        <bw>3</bw>
        <burst></burst>
        <bwscale>Mb</bwscale>
        <bwsched>none</bwsched>
      </item>
    </bandwidth>
    <enabled>on</enabled>
    <buckets></buckets>
    <mask>none</mask>
    <maskbits></maskbits>
    <maskbitsv6></maskbitsv6>
    <delay>0</delay>
    <sched>wf2q+</sched>
    <aqm>droptail</aqm>
    <ecn></ecn>
    <param_codel_target>5</param_codel_target>
    <param_codel_interval>100</param_codel_interval>
  </queue>
  <queue>
    <name>3MbpsInUp_DMZ</name>
    <number>2</number>
    <qlimit></qlimit>
    <plr></plr>
    <description></description>
    <bandwidth>
      <item>
        <bw>3</bw>
        <burst></burst>
        <bwscale>Mb</bwscale>
        <bwsched>none</bwsched>
      </item>
    </bandwidth>
    <enabled>on</enabled>
    <buckets></buckets>
    <mask>none</mask>
    <maskbits></maskbits>
    <maskbitsv6></maskbitsv6>
    <delay>0</delay>
    <sched>wf2q+</sched>
    <aqm>droptail</aqm>
    <ecn></ecn>
  </queue>
  <queue>
    <name>1MbpsInUp_DMZ</name>
    <number>3</number>
    <qlimit></qlimit>
    <plr></plr>
    <description></description>
    <bandwidth>
      <item>
        <bw>1</bw>
        <burst></burst>
        <bwscale>Mb</bwscale>
        <bwsched>none</bwsched>
      </item>
    </bandwidth>
    <enabled>on</enabled>
    <buckets></buckets>
    <mask>none</mask>
    <maskbits></maskbits>
    <maskbitsv6></maskbitsv6>
    <delay>0</delay>
    <sched>wf2q+</sched>
    <aqm>droptail</aqm>
    <ecn></ecn>
  </queue>
</dnshaper>
From this, pfSense generates the following /tmp/rules.limiter file:
pipe 1 config bw 3Mb droptail
sched 1 config pipe 1 type wf2q+
pipe 2 config bw 3Mb droptail
sched 2 config pipe 2 type wf2q+
pipe 3 config bw 1Mb droptail
sched 3 config pipe 3 type wf2q+
It would be interesting to see what you get from the Diagnostics -> Limiter Info page, if anything.
My output looks like:
Limiters:
00001: 3.000 Mbit/s 0 ms burst 0
q131073 50 sl. 0 flows (1 buckets) sched 65537 weight 0 lmax 0 pri 0 droptail
 sched 65537 type FIFO flags 0x0 0 buckets 0 active
00002: 3.000 Mbit/s 0 ms burst 0
q131074 50 sl. 0 flows (1 buckets) sched 65538 weight 0 lmax 0 pri 0 droptail
 sched 65538 type FIFO flags 0x0 0 buckets 0 active
00003: 1.000 Mbit/s 0 ms burst 0
q131075 50 sl. 0 flows (1 buckets) sched 65539 weight 0 lmax 0 pri 0 droptail
 sched 65539 type FIFO flags 0x0 0 buckets 0 active

Schedulers:
00001: 3.000 Mbit/s 0 ms burst 0
 sched 1 type WF2Q+ flags 0x0 0 buckets 0 active
00002: 3.000 Mbit/s 0 ms burst 0
 sched 2 type WF2Q+ flags 0x0 0 buckets 0 active
00003: 1.000 Mbit/s 0 ms burst 0
 sched 3 type WF2Q+ flags 0x0 0 buckets 0 active
You can also check:
ipfw pipe show
ipfw sched show
at the command line.
John
-
@serbus John, this helped a lot! Thank you! Comparing my /tmp/rules.limiter file to yours, I was able to identify the error. It seems that the first limiter is somehow broken. Its number is missing in the /tmp/rules.limiter file:
pipe config bw 100Mb mask dst-ip6 /128 dst-ip 0xffffff00 droptail
sched config pipe mask dst-ip6 /128 dst-ip 0xffffff00 type wf2q+
pipe 3 config bw 100Mb mask src-ip6 /128 src-ip 0xffffff00 droptail
sched 3 config pipe 3 mask src-ip6 /128 src-ip 0xffffff00 type wf2q+
pipe 4 config bw 200Mb mask src-ip6 /128 src-ip 0xffffff00 droptail
sched 4 config pipe 4 mask src-ip6 /128 src-ip 0xffffff00 type wf2q+
....
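The kind of check done by eye here (a `pipe` or `sched` command with no number after it, which matches ipfw's "need a pipe/flowset/sched number" complaint) can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and it assumes one ipfw command per line in the file.

```python
def find_unnumbered(lines):
    """Return (line_number, keyword) for pipe/sched/queue commands lacking a number."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        tokens = line.split()
        if not tokens or tokens[0] not in ("pipe", "sched", "queue"):
            continue  # blank line or some other command
        # The token right after the keyword should be the object number.
        if len(tokens) < 2 or not tokens[1].isdigit():
            problems.append((lineno, tokens[0]))
    return problems

# First four commands from the broken file quoted above.
sample = [
    "pipe config bw 100Mb mask dst-ip6 /128 dst-ip 0xffffff00 droptail",
    "sched config pipe mask dst-ip6 /128 dst-ip 0xffffff00 type wf2q+",
    "pipe 3 config bw 100Mb mask src-ip6 /128 src-ip 0xffffff00 droptail",
    "sched 3 config pipe 3 mask src-ip6 /128 src-ip 0xffffff00 type wf2q+",
]

for lineno, keyword in find_unnumbered(sample):
    print(f"line {lineno}: '{keyword}' has no number")
```

It only looks at the token directly after the keyword, so a missing number later in the line (such as the one after `config pipe`) would need an extra check.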
This time I was not able to delete the limiter. A "Bad Gateway" error from nginx appears. The logs show the following error:
Feb 9 21:01:38 kernel pid 96267 (php-fpm), jid 0, uid 0: exited on signal 11 (core dumped)
Feb 9 21:01:38 nginx 2022/02/09 21:01:38 [error] 96770#100213: *539 upstream prematurely closed connection while reading response header from upstream, client: abc, server: , request: "GET /firewall_shaper_vinterface.php?pipe=Upload_100mbit&queue=Upload_100mbit&action=delete HTTP/2.0", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: "xyz.com:10443", referrer: "https://xyc.com:10443/firewall_shaper_vinterface.php?pipe=Upload_100mbit&queue=Upload_100mbit&action=show"
I can disable it, though, which I did. Traffic through the other limiters is "flowing" again :-)
-
Hello!
Maybe there is something in the dnshaper section of config.xml that is causing the signal 11 from PHP. Someone who knows more than I do would have to comment on how to safely fix the config, i.e. by editing config.xml. Kind of scary on a remote install...???
There is also a "hidden" action option in the limiter GUI config page that is supposed to reset all limiter configs and related firewall rules. I have never used it and have no idea what the correct parameters are, but if you are feeling desperate or adventurous, you could try from a logged-in browser session:
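One read-only way to look for something odd in the dnshaper section is to parse a copy of config.xml and list limiters whose `<number>` is empty or not numeric. The Python sketch below does that. That the broken limiter has an empty `<number>` is only a guess based on the missing pipe number in /tmp/rules.limiter, and the XML sample is hypothetical apart from the limiter name taken from the nginx log.

```python
import xml.etree.ElementTree as ET

def limiters_without_number(xml_text):
    """Return names of top-level <dnshaper> limiters lacking a numeric <number>."""
    root = ET.fromstring(xml_text)
    # Accept either a full config or a bare <dnshaper> fragment.
    shaper = root if root.tag == "dnshaper" else root.find(".//dnshaper")
    if shaper is None:
        return []
    bad = []
    for queue in shaper.findall("queue"):
        name = (queue.findtext("name") or "").strip()
        number = (queue.findtext("number") or "").strip()
        if not number.isdigit():
            bad.append(name)
    return bad

# Hypothetical fragment: the empty <number> is an assumption, not taken from
# the real config discussed in this thread.
sample = """
<dnshaper>
  <queue>
    <name>Upload_100mbit</name>
    <number></number>
  </queue>
  <queue>
    <name>3MbpsInUp_DMZ</name>
    <number>2</number>
  </queue>
</dnshaper>
"""

print(limiters_without_number(sample))
```

It only inspects top-level limiters, not child queues nested inside them, and it never writes to the file.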
<pfsense url stuff>/firewall_shaper_vinterface.php?pipe=Upload_100mbit&queue=Upload_100mbit&action=resetall
...using the name of the limiter pipe/queue that is corrupt, or maybe just...
<pfsense url stuff>/firewall_shaper_vinterface.php?action=resetall
The code still might crap out, even in the resetall action. YMMV
John
-
@serbus Thanks yet again ;-) I think I am not too adventurous in terms of trying this hidden function. Would a backup restore correct the error? Or is it essentially the same as editing the config.xml?
-
Hello!
I don't know if a backup/restore of the config would reparse/repair any limiter problems.
The official config editing instructions are good, but I am not sure what you would change/edit/delete in the dnshaper section to fix the problem.
John
-
@serbus Hi John, since I cannot delete the limiter right now, I would try to "delete" it within the XML file. I will be on-site on Friday and give it a try...
Has anyone else any other suggestions?