Anyone know what this error could mean.
-
Thank you Steve for the info.
I've read through the topic mentioned in the ticket.
Someone there wrote that the issue started after he changed the OpenVPN config.
Actually, besides installing mtr, I also changed the OpenVPN config at nearly the same time, so yes, it might be the OpenVPN changes rather than the mtr installation that triggered the issue.
-
As far as I'm aware it should only happen with Layer 2 type rules which wouldn't normally appear in an OpenVPN config.
Are you running the captive portal?
-
I don’t use captive portal.
The topic mentioned in the ticket states that only captive portal incorporates layer 2 rules; however, at least 3 people there say that they don’t use captive portal at all.
-
For most users who see that, it clears at reboot. Is that the case here, or is it now persistent since you made the OpenVPN change?
Can you roll back that config change as a test?
-
I haven't rebooted the box in case further debugging is needed, so the issue is still present.
The firewall is forwarding some traffic, but because of this error I assume not all rules are applied. I've read that a reboot made the situation worse for someone, which was another reason not to reboot, so I was thinking of leaving it as it is for now.
Since a reboot likely clears the issue, is the goal to restore the previous configuration without rebooting and see whether the error messages stop? Do I understand correctly? If yes, I am happy to visit the firewall in the evening and restore the config.
Do I need to have a USB drive with the 22.05 image handy? :)
-
If you somehow create an invalid ruleset, the packet filter will refuse to load it and continue with the existing ruleset. That can be confusing because it appears as though any new firewall rules you add are not being respected. In that situation, if you reboot, the invalid ruleset cannot be loaded at boot and you may end up with no rules. Generally bad!
However this error is not due to a bad ruleset. It happens when the code gets stuck loading a layer 2 rule as I understand it.
Check your /tmp/rules.debug file for any layer 2 (ether) rules like:
# Captive Portal
ether pass on { ix2 } tag "cpzoneid_2_rdr"
ether anchor "cpzoneid_2_auth/*" on { ix2 }
ether anchor "cpzoneid_2_passthrumac/*" on { ix2 }
ether anchor "cpzoneid_2_allowedhosts/*" on { ix2 }
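A minimal sketch of that check as a small shell function, assuming ether rules in the generated file begin with the keyword `ether` at the start of a line, as in the captive portal example above. The function name `find_ether_rules` is made up for illustration and is not part of pfSense.

```shell
#!/bin/sh
# List layer 2 (ether) rules in a pf rules file, skipping comment lines.
# Prints matching lines with their line numbers.
# Exit status 1 means no ether rules were found (grep's own convention).
find_ether_rules() {
    # Default path is the file named in this thread; pass another to override.
    rules_file="${1:-/tmp/rules.debug}"
    # Only match lines whose first word is "ether"; lines starting with "#"
    # are comments and are not matched by this pattern.
    grep -nE '^[[:space:]]*ether[[:space:]]' "$rules_file"
}
```

Calling `find_ether_rules` with no argument reads /tmp/rules.debug. No output and a non-zero exit status would suggest there are no static ether rules in the file, though it says nothing about rules added dynamically to anchors.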
Steve
-
[22.05-RELEASE][root@g.localdomain]/root: grep ether /tmp/rules.debug
[22.05-RELEASE][root@g.localdomain]/root:
It seems there are no Layer 2 rules.
-
Hmm, in that case it's unclear what might have caused that... unless you had ether rules at some point?
-
There were no ether rules configured in the past.
This is a very simple firewall, only 3 packages installed (aws-wizard, ipsec-profile-wizard, openvpn-client-export), HFSC scheduler with floating match rules and tagging for Internet upstream, some vlan interfaces with inbound rules. I think I had one static route previously but it was deleted a long time ago. Nothing else.
-
Well, as far as I know there is no way to clear that error once it has been hit other than rebooting, so I'm not sure you have any other choice beyond just allowing it to run as it is with the current ruleset.
Without any ether rules there I would expect it to boot back OK.
-
Try running
pfSsh.php playback pfanchordrill
at the command line. Make sure there are no ether rules that have been added dynamically.
-
Thank you for the command, I was just able to run it. No ether rules have been added dynamically.
[22.05-RELEASE][root@g.localdomain]/root: pfSsh.php playback pfanchordrill
ipsec rules/nat contents:
miniupnpd rules/nat contents:
natearly rules/nat contents:
natrules rules/nat contents:
openvpn rules/nat contents:
tftp-proxy rules/nat contents:
userrules rules/nat contents:
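With more anchors the output gets harder to eyeball, so here is a rough sketch of a filter that prints only the anchors that actually have something under them. It assumes each anchor header looks like "<anchor> rules/nat contents:" as in the output above and that any non-blank line following a header belongs to that anchor. The function name `nonempty_anchors` is made up for illustration.

```shell
#!/bin/sh
# Read pfanchordrill-style output on stdin and print the name of every
# anchor that has at least one non-blank line under its header.
# Assumed header format (taken from the output in this thread):
#   "<anchor> rules/nat contents:"
nonempty_anchors() {
    awk '
        # Header line: remember which anchor we are in, print nothing yet.
        /rules\/nat contents:[ \t]*$/ { anchor = $1; next }
        # Any non-blank line after a header counts as content for it;
        # each anchor name is printed only once.
        anchor != "" && NF > 0 && !(anchor in seen) {
            seen[anchor] = 1
            print anchor
        }
    '
}
```

It could be used as `pfSsh.php playback pfanchordrill | nonempty_anchors`. Empty output would match what is shown above: no anchor has any rules loaded into it, as far as this parsing can tell.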
-
Was that after rebooting?
We discovered a bug that meant dynamic ether rules were not being removed when you disable or remove a captive portal instance. So if you had a captive portal instance defined at any point since you last rebooted, the rules could have remained and might trigger this issue. Since they would not be created at boot again, rebooting resolves it. That probably explains the behaviour many people have seen.
Steve
-
I haven't rebooted the device since the upgrade to version 22.05, which was months ago.
[22.05-RELEASE][root@g.localdomain]/root: uptime
4:02PM up 163 days, 4:11, 1 user, load averages: 0.08, 0.04, 0.01
But more importantly, I haven't enabled captive portal or dynamic ether rules. No features were ever configured other than those I described earlier.
@turrican64 said in Anyone know what this error could mean.:
This is a very simple firewall, only 3 packages installed (aws-wizard, ipsec-profile-wizard, openvpn-client-export), HFSC scheduler with floating match rules and tagging for Internet upstream, some vlan interfaces with inbound rules. I think I had one static route previously but it was deleted a long time ago. Nothing else.
-
Hmm, well if it's getting stuck in a similar way then that's likely some new bug, or at least a variant of the known one with ether rules.
If it is that, it will likely be resolved by rebooting, and that's the only way I'm aware of to clear the stuck loader process.
Steve
-
I've read that the developers have done enough testing, that no more tests are required, and that the fix is already in the new kernel. However, since you mentioned this might be a new bug and the box seems to be working OK in this state, I will not reboot it for now in case the developers would like to do some further checks later.
-
You can read through the thread where this was initially diagnosed here:
https://forum.netgate.com/topic/173923/strange-error-there-were-error-s-loading-the-rules-pfctl-pfctl_rules
I doubt anything can be learned at this point, since the errors reported from here on will always be 'device busy'. As shown there, we need to see the truss output leading up to the point where it gets stuck, which is the first time pfctl is run, for those who were hitting it consistently.
Steve
-