Netgate Discussion Forum

    Strange error: There were error(s) loading the rules: pfctl: pfctl_rules

    General pfSense Questions
    102 Posts 13 Posters 16.1k Views
      Flole @stephenw10

      Have you heard anything back from the developers yet? Just imagine the consequences if this bug hits in a critical environment, it should be fixed ASAP.

        jacko

        I also just had this error on my SG-1100, and nothing seemed to trigger it. I rebooted, it worked for about 2 minutes, and then the network connection dropped again. I could not ping anything, including the router. After a second reboot it is still running so far.
        I was running pfBlockerNG 3.1.0_4, but I've now disabled it.

        These are the errors:

        There were error(s) loading the rules: pfctl: pfctl_rules - The line in question reads [0]: @ 2022-08-11 22:26:17
        There were error(s) loading the rules: pfctl: pfctl_rules - The line in question reads [0]: @ 2022-08-11 22:26:20
        There were error(s) loading the rules: pfctl: pfctl_rules - The line in question reads [0]: @ 2022-08-11 22:26:29
        There were error(s) loading the rules: pfctl: pfctl_rules - The line in question reads [0]: @ 2022-08-11 22:37:02

          Flole @jacko

          @jacko Just be glad that it failed in a "block-all" state for you. An "allow-all" state is much worse, especially if it's unnoticed....

            stephenw10 Netgate Administrator

            Mmm, unfortunately this is unhelpful:
            ioctl(3,DIOCXBEGIN,0xbfbfd9d0) ERR#16 'Device busy'

            The issue has already happened and pf is no longer responding to pfctl. What we'd need there is to see the truss output from the first invocation of pfctl after boot. But that's not easy.
            We are looking at it but we've not been able to replicate it locally. Yet.

            I opened a bug to track it. Add any new info you have there:
            https://redmine.pfsense.org/issues/13408

            Steve

              Flole @stephenw10

              @stephenw10 Maybe it would help to add additional debug output in pf's code? Is it clear where the wrong branch is taken/where the error is actually thrown? If that's unclear it's probably a good idea to figure that out first. One possible way is probably the one I described above, but is there another one? If it's clear what state the firewall ends up in then it's easier to figure out potential ways how it could end up in that state.

                artooro @stephenw10

                @stephenw10 Thanks. What I'll attempt tomorrow is to edit https://github.com/pfsense/pfsense/blob/60a2fa6b6f1a59f3f86933265fbb48e25f652bfc/src/etc/inc/filter.inc#L527 to run pfctl under truss and write the output to a log file, to see if we can get something helpful there.
                I have a couple of systems where it's pretty easy to reproduce.

                  Flole @artooro

                  @artooro I'd make a copy of every rules file that's loaded there as well, just to see whether loading those files in the same order causes the issue.

                    Flole @Flole

                    After some reboots it started working again for me as well. I noticed that my GIF tunnels did not come up this time. Maybe it is related? Are others affected by this also using (multiple) GIF tunnels?

                      ChrisJenk @Flole

                      @flole said in Strange error: There were error(s) loading the rules: pfctl: pfctl_rules:

                      After some reboots it started working again for me as well. I noticed that my GIF tunnels did not come up this time. Maybe it is related? Are others affected by this also using (multiple) GIF tunnels?

                      I have a single GIF tunnel (a 6in4 for HE Tunnelbroker).

                        stephenw10 Netgate Administrator

                        Hmm, that could be a clue. Though I haven't seen it on any test box I have that has GIF tunnels. So maybe something else is required in addition to that.

                          artooro

                          I've been able to capture the initial pfctl that fails via truss which I've attached here:
                          1-truss_pfctl_1660664796.txt

                          I don't have any GIF tunnels, but do use WireGuard interfaces.

                          If anyone is interested in capturing this themselves, what I did was edit /etc/inc/filter.inc, comment out line 527, and add the following quick and dirty code:

                          		// Run pfctl under truss; $rules_error collects the combined stdout/stderr, one line per entry.
                          		$_grbg = exec("/usr/bin/truss /sbin/pfctl -o basic -f {$g['tmp_path']}/rules.debug 2>&1", $rules_error, $rules_loading);
                          		$rval = 0;
                          		// The last truss line reports the exit status of pfctl; end() returns false on an empty array.
                          		if (end($rules_error) === "process exit, rval = 1") {
                          			$rval = 1;
                          			file_notice("filter_load", "pfctl process exit, rval = 1", "Filter Reload", "");
                          		}
                          		// Save the trace as /root/<rval>-truss_pfctl_<unix timestamp>.txt
                          		$date = new DateTime();
                          		$truss_filename = "/root/" . $rval . "-truss_pfctl_" . $date->getTimestamp() . ".txt";
                          		file_put_contents($truss_filename, implode("\n", $rules_error));
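Once a few of these trace files have been collected, the failing call can be picked out automatically. The following is a minimal Python sketch, not part of pfSense, that scans truss output for failed system calls. It assumes the failure lines look like the two quoted in this thread (for example `ioctl(3,DIOCXBEGIN,0xbfbfd9d0) ERR#16 'Device busy'`). The function names are made up for illustration.

```python
import re

# Matches failed calls in the form seen in this thread, e.g.
#   ioctl(3,DIOCXBEGIN,0xbfbfd9d0) ERR#16 'Device busy'
TRUSS_ERR = re.compile(
    r"^(?P<call>\w+)\((?P<args>.*)\)\s+ERR#(?P<errno>\d+)\s+'(?P<msg>[^']*)'"
)


def failed_syscalls(lines):
    """Yield (line_number, syscall, args, errno, message) for each failed call."""
    for number, line in enumerate(lines, start=1):
        match = TRUSS_ERR.match(line.strip())
        if match:
            yield (
                number,
                match["call"],
                match["args"],
                int(match["errno"]),
                match["msg"],
            )


def first_failed_ioctl(lines):
    """Return the request name of the first failed ioctl, or None if there is none."""
    for _number, call, args, _errno, _msg in failed_syscalls(lines):
        parts = args.split(",")
        if call == "ioctl" and len(parts) >= 2:
            return parts[1].strip()
    return None


if __name__ == "__main__":
    # The two failure lines quoted in this thread.
    sample = [
        "ioctl(3,DIOCXBEGIN,0xbfbfd9d0) ERR#16 'Device busy'",
        "ioctl(3,DIOCADDRULENV,0xbfbfdae4) ERR#16 'Device busy'",
    ]
    for entry in failed_syscalls(sample):
        print(entry)
```

To use it on a saved trace, read the file with `open(path).read().splitlines()` and pass the resulting list to `failed_syscalls`.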
                          
                            Flole @artooro

                            @artooro Maybe you should post the one that ran right before it as well, just in case something there was already messed up. But it looks promising to me. I believe @kprovost is the pfctl and kernel expert, so maybe he can spot a potential bug based on that truss output?

                            Keep in mind that if you're using the captive portal, pfctl is invoked in (at least) one other location as well.

                              stephenw10 Netgate Administrator

                              Yup I pinged him. Late where he is though, I hope he's enjoying a beer by now. 😉

                                artooro @Flole

                                @flole I checked and the previous run is the same as all the successful invocations.

                                  kprovost @artooro

                                  @artooro Can you add a pfctl -x loud invocation before that truss'd pfctl?

                                  That truss output was already more illuminating, but I'm still not able to reproduce anything like this issue.

                                  It seems to fail with ioctl(3,DIOCADDRULENV,0xbfbfdae4) ERR#16 'Device busy', after it's already added a bunch of other rules without issue.
                                  EBUSY almost certainly means we've failed in pf_ioctl_addrule(), when checking ticket numbers.

                                  So I'm guessing we're seeing a race condition here, where something else (possibly another pfctl, possibly something else that adds rules or addresses) is running at the same time. The pfctl -x loud should let us work out if it's a ruleset or a pool ticket. That in turn might give us a hint.
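To make the suspected race concrete, here is a toy Python model of a ticket check. It is not pf's code and says nothing about pf's internals beyond the guess above. The class and method names are invented for illustration. The assumed behaviour is that opening a transaction issues a new ticket, and that adding a rule with a ticket that is no longer current fails with EBUSY.

```python
import errno


class ToyRuleset:
    """Toy stand-in for a ruleset guarded by a transaction ticket (not pf code)."""

    def __init__(self):
        self._ticket = 0
        self.inactive_rules = []

    def begin(self):
        """Open a transaction: issue a new ticket and reset the pending rules."""
        self._ticket += 1
        self.inactive_rules = []
        return self._ticket

    def add_rule(self, ticket, rule):
        """Queue a rule; a ticket that is no longer current fails with EBUSY."""
        if ticket != self._ticket:
            raise OSError(errno.EBUSY, "Device busy")
        self.inactive_rules.append(rule)


if __name__ == "__main__":
    ruleset = ToyRuleset()

    # Loader A starts a transaction and adds rules without trouble.
    ticket_a = ruleset.begin()
    ruleset.add_rule(ticket_a, "rule 1 from loader A")
    ruleset.add_rule(ticket_a, "rule 2 from loader A")

    # Something else opens its own transaction in the middle of A's load.
    ticket_b = ruleset.begin()

    # Loader A's next rule now carries a stale ticket and is rejected.
    try:
        ruleset.add_rule(ticket_a, "rule 3 from loader A")
    except OSError as exc:
        print("loader A failed:", errno.errorcode[exc.errno])
```

In this model the first loader fails partway through, after several successful additions, which matches the shape of the truss output: a run of successful calls followed by one `Device busy`.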

                                  I've seen mention of pfBlockerNG, is everyone affected running that?

                                    ChrisJenk @kprovost

                                    @kprovost said in Strange error: There were error(s) loading the rules: pfctl: pfctl_rules:

                                    I've seen mention of pfBlockerNG, is everyone affected running that?

                                    I'm not even sure what pfBlockerNG is. I'm not consciously running it (unless it is something that always runs as standard).

                                      stephenw10 Netgate Administrator

                                      It's a package. You have to actively install it.

                                      Steve

                                        artooro @kprovost

                                        Sure, I've added pfctl -x loud to be executed before the main pfctl execution, so I'm just waiting for it to happen again.

                                        This router does have the third-party adam:ONE package, which creates pf rules in the userrules anchor. So it's possible that both are attempting to add a rule at the same time.
                                        This wasn't an issue prior to pfSense Plus 22.05, though, because pfSense would previously detect the device busy error and try again. Now pfctl locks up, for lack of a more technical term, and additional attempts fail.
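The retry behaviour described above can be sketched generically. This Python snippet is an illustration of the pattern only, not pfSense's actual retry logic, which is not shown in this thread. `load_rules` is a hypothetical callable that returns an `(exit_code, output_text)` pair, and the attempt count and delay are arbitrary.

```python
import time


def load_with_retry(load_rules, attempts=3, delay=1.0, sleep=time.sleep):
    """Call load_rules() again while it fails with a 'Device busy' message.

    load_rules is a hypothetical callable returning (exit_code, output_text).
    Returns the (exit_code, output_text) pair from the last attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(1, attempts + 1):
        result = load_rules()
        exit_code, output = result
        # Stop on success, or on any failure that is not a busy device.
        if exit_code == 0 or "Device busy" not in output:
            break
        if attempt < attempts:
            sleep(delay)
    return result
```

A retry like this only helps if the busy condition is transient. In the failure reported here, later attempts keep failing too, so retrying alone would not recover.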

                                        I've also seen someone mention they had snort installed which may have the same symptom if snort is adding a block rule while pfctl runs.

                                          Flole @artooro

                                          @artooro Yes, I'm the one who has snort installed.

                                            Flole @kprovost

                                            @kprovost said in Strange error: There were error(s) loading the rules: pfctl: pfctl_rules:

                                            EBUSY almost certainly means we've failed in pf_ioctl_addrule(), when checking ticket numbers.

                                            Is there any reason why there is absolutely no debug output when things fail? If a log message like "ticket number x is causing issues" were written to dmesg, this would be a lot easier to debug. This code isn't executed often, so the additional log output wouldn't really slow things down, especially if it's only emitted in failure scenarios.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.