error(s) loading the rules: pfctl: DIOCADDRULENV: No such file or directory
-
@kprovost said in error(s) loading the rules: pfctl: DIOCADDRULENV: No such file or directory:
dtrace -n 'fbt::pf_ioctl_addrule:return { printf("@%x => %d", arg0, arg1); stack(); }'
I also have this error
-
@JonathanLee I'm a little confused. That truss output shows no errors. It also doesn't show it printing anything.
Did you see the error when manually running this command?
pfctl -g -f /tmp/rules.debug
-
@kprovost I ran the command directly after the error showed in the GUI.
-
@JonathanLee That would mean that the error only happens intermittently, which is even stranger.
Can you reproduce the error?
If so, keep the above dtrace command running while you reproduce it and then supply that output.
-
@kprovost It only happens in 24; it does not occur on my other boot environments. It also only occurs directly after a reboot, and only when I have access control list rules that are marked as both IPv4 and IPv6. If I do not have them set that way, it does not occur.
For example: a rule that blocks both IPv4 and IPv6 to OPT1, rules like that.
-
@JonathanLee I have a theory about what's happening here. Basically, the error message is misleading because we're not actually getting 'ENOENT'. The error handling code in pfctl is printing the wrong error.
The cause is likely to be a simple conflict between two processes trying to update rules at the same time. That's something the PHP should handle, but because the error doesn't match what it expects, it doesn't.
If you're comfortable editing the PHP code it's a fairly simple thing to test:
--- /etc/inc/filter.inc.orig	2024-07-26 12:09:54.964680000 +0000
+++ /etc/inc/filter.inc	2024-07-26 12:10:15.221720000 +0000
@@ -624,7 +624,7 @@
 				break;
 			}
 			if (strstr($_grbg, "DIOCADDALTQ: Device busy") ||
-			    strstr($_grbg, "DIOCADDRULE: Device busy") ||
+			    strstr($_grbg, "DIOCADDRULE") ||
 			    strstr($_grbg, "DIOCXCOMMIT: Device busy")) {
 				// when busy status is returned retry after a short pause
 				usleep(200000);//try again after 200 ms..unless it still fails after 10x
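For anyone who wants to see the idea outside of filter.inc, the retry check can be sketched roughly like this in shell. This is only an illustration, not the pfSense code. The function names (is_retryable, retry_on_busy) and the MAX_TRIES and RETRY_DELAY variables are made up for the example, and checking the exit status before looking at the output is an assumption of the sketch. The 10 tries and the 200 ms pause come from the comment in the diff.

```shell
#!/bin/sh
# Illustrative sketch only; helper names are hypothetical, not from pfSense.

# Return 0 if the captured output looks like a transient error worth retrying.
is_retryable() {
    case "$1" in
        *"DIOCADDALTQ: Device busy"*) return 0 ;;
        *"DIOCADDRULE"*)              return 0 ;;  # substring also matches DIOCADDRULENV
        *"DIOCXCOMMIT: Device busy"*) return 0 ;;
    esac
    return 1
}

# Run the given command; retry while it fails with a retryable message.
# MAX_TRIES (default 10) and RETRY_DELAY (default 0.2 s) are assumed knobs.
retry_on_busy() {
    tries=0
    while :; do
        tries=$((tries + 1))
        if out=$("$@" 2>&1); then
            return 0
        else
            rc=$?
        fi
        # Give up on the last attempt or on any error we do not recognise.
        if [ "$tries" -ge "${MAX_TRIES:-10}" ] || ! is_retryable "$out"; then
            printf '%s\n' "$out" >&2
            return "$rc"
        fi
        sleep "${RETRY_DELAY:-0.2}"
    done
}

# Example call (not executed here):
#   retry_on_busy pfctl -g -f /tmp/rules.debug
```

Because the pattern is just DIOCADDRULE, it matches both "DIOCADDRULE: Device busy" and the "DIOCADDRULENV: No such file or directory" message from this thread, so either one leads to another attempt.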
So, in human terms, edit /etc/inc/filter.inc and on line 627 change
strstr($_grbg, "DIOCADDRULE: Device busy") ||
to
strstr($_grbg, "DIOCADDRULE") ||
(i.e. remove ': Device busy'). That ought to make the code match the error and retry in case of concurrent updates.
-
@kprovost Should this also go into Redmine? This could be a patch as well…
-
@JonathanLee Let’s confirm first.
-
@kprovost Give me a minute, I have to boot that environment. I am testing IPv6 static assignments and Squid right now, and it is working well. Let me swap boot environments and use this config for 24 too.
-
@kprovost Done...
Version 24.03-RELEASE (arm64)
with IPv6 tunnel broker over functional SSL intercept Squid
Before.....
After
Will update if error returns
-
This caused issues with rule creation: the order of the ACL rules would move around once the busy condition was changed. This happens during configuration changes.
-
@JonathanLee What sort of issues?
(I'm on holiday, so there will be no further progress for the next two weeks.)
-
@kprovost have a great vacation.
-
I experienced the same error recently. Attached is the output from the PuTTY CLI commands requested in earlier posts. I do not believe I was able to reproduce the issue. The PuTTY output was approx. 6 MB and the Netgate file upload accepts only 2 MB, so I cut off a large portion of the end and don't know whether the useful content is missing or not. A lot of the lines appear identical.
Machine is:
Boot Environment: default
Current Base System: 24.03
Latest Base System: 24.03
Status: Up to date.
-
@clawsonn In my case, I had a bad WAN connection that was triggering this issue. It was also making HAProxy crash. As soon as I disabled that WAN (it was a 4G backup), everything went back to normal.