DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208
-
Seems like it must be something like that. I can't see any driver changes that could present like this directly.
Though perhaps it could be something in the SFP module since I've nothing on our 8300 test boxes that also use ice(4) NICs.
-
@w0w said in DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208:
@stephenw10
Yep, it is possible that I have changed it. On mine:
machdep.hwpstate_pkg_ctrl: 0
So no difference between versions on my systems.
-
@stephenw10 said in DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208:
Seems like it must be something like that. I can't see any driver changes that could present like this directly.
Though perhaps it could be something in the SFP module since I've nothing on our 8300 test boxes that also use ice(4) NICs.
Replaced this ipolex SFP+ DAC:
drivername: ice0 plugged: SFP/SFP+/SFP28 Unknown (Copper pigtail) vendor: ipolex PN: SFP-H10GB-CU1M SN: WTS11J72204 DATE: 2019-07-24
With a freshly purchased 10Gtek one:
drivername: ice0 plugged: SFP/SFP+/SFP28 Unknown (Copper pigtail) vendor: OEM PN: CAB-10GSFP-P1M SN: CSC241010630178 DATE: 2024-10-22
It didn't change anything and the issue remains. I wasn't really expecting a difference as I had been through my stock of SFP+ DACs but best to be sure I guess.
So far the only way to stop the issue is to revert to 24.11 and below. The problem only manifests itself with 25.03b.
-
kern.ipc.tls.enable: 1
on 25.03
I doubt that this option has any real effect, but for now it's the only difference I can see in the kernel. At the very least, it can be used by both the ice driver and the kernel itself.
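For anyone wanting to repeat that comparison, here is a rough sketch in Python. It assumes you have saved the `sysctl -a` output from each version as text; the sample dumps below are hypothetical, and the 24.11 value of kern.ipc.tls.enable in them is an assumption rather than something taken from a real box.

```python
def parse_sysctl(text):
    """Parse `sysctl -a` style output ("name: value" per line) into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        # Skip continuation lines of multi-line values and anything malformed.
        if sep and name and " " not in name:
            values[name] = value.strip()
    return values


def diff_sysctl(old_text, new_text):
    """Return {name: (old, new)} for OIDs that differ or exist on one side only."""
    old, new = parse_sysctl(old_text), parse_sysctl(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Hypothetical sample data; the 24.11 value of kern.ipc.tls.enable is assumed.
old_dump = "machdep.hwpstate_pkg_ctrl: 0\nkern.ipc.tls.enable: 0\n"
new_dump = "machdep.hwpstate_pkg_ctrl: 0\nkern.ipc.tls.enable: 1\n"

for name, (before, after) in diff_sysctl(old_dump, new_dump).items():
    print(f"{name}: {before} -> {after}")
```

An OID that exists on only one side shows up with None for the missing value, which helps spot tunables that were added or removed between releases.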
-
@stephenw10 said in DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208:
Seems like it must be something like that. I can't see any driver changes that could present like this directly.
Do we still need to add
ice_ddp_load="YES"
to the loader.conf.local file, or are we done with that tunable? I don't have it added but I presumed all is well given that the system shows that it is loaded:
ice0: <Intel(R) Ethernet Connection E823-L for SFP - 1.43.2-k> mem 0xf0000000-0xf7ffffff,0xfa010000-0xfa01ffff at device 0.0 numa-domain 0 on pci11
ice0: Loading the iflib ice driver
ice0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.41.0, track id 0xc0000001.
ice0: fw 5.5.17 api 1.7 nvm 2.28 etid 80011e36 netlist 0.1.7000-1.25.0.f083a9d5 oem 1.3200.0
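If you want to check for that load message across several ports or boots, a minimal Python sketch along these lines might help. It only looks for the "successfully loaded" wording shown in the boot output above, and the function name is made up for illustration. It says nothing about whether the tunable is still required.

```python
import re

# Matches the ice(4) message shown in the boot output above.
DDP_LOADED = re.compile(
    r"^(?P<dev>ice\d+): The DDP package was successfully loaded: "
    r"(?P<name>.+?) version (?P<version>[\d.]+),"
)


def ddp_packages(dmesg_text):
    """Return {device: (package_name, version)} for each DDP load message found."""
    found = {}
    for line in dmesg_text.splitlines():
        match = DDP_LOADED.match(line.strip())
        if match:
            found[match["dev"]] = (match["name"], match["version"])
    return found


sample = """\
ice0: Loading the iflib ice driver
ice0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.41.0, track id 0xc0000001.
ice0: fw 5.5.17 api 1.7 nvm 2.28 etid 80011e36 netlist 0.1.7000-1.25.0.f083a9d5 oem 1.3200.0
"""

print(ddp_packages(sample))
```

A device that is missing from the result did not log the success message in the text you fed in, which would be the cue to look more closely.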
-
I'm not sure if this unbound error is relevant but as it appears around the time of these events:
php-fpm 16215 /rc.newwanipv6: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1749405403] unbound[41019:0] error: bind: address already in use [1749405403] unbound[41019:0] fatal error: could not open ports'
Looking at sockets:
IPv4 System Socket Information
USER  COMMAND  PID    FD  PROTO  LOCAL  FOREIGN
root  php-fpm  16215  4   udp4   *:*    *:*

IPv6 System Socket Information
USER  COMMAND  PID    FD  PROTO  LOCAL  FOREIGN
root  php-fpm  16215  5   udp6   *:*    *:*
It's not an area I am familiar with.
-
No, that's ugly but shouldn't be an issue. It tries to start Unbound too rapidly while it's already running, so the second instance can't bind its ports. That should not stop the running one.
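To illustrate the "address already in use" part in isolation, here is a small Python sketch. It is not how pfSense or Unbound do anything; it just shows that a second bind to a UDP port another socket already holds fails with EADDRINUSE. The helper name and the loopback address are choices made for the example.

```python
import errno
import socket


def udp_port_in_use(host, port):
    """Return True if binding a UDP socket to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # Some other failure, e.g. permission denied on a low port.
    return False


# Hold an OS-assigned loopback port open, then probe it the way a second
# daemon instance would when it starts up.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as first:
    first.bind(("127.0.0.1", 0))
    held_port = first.getsockname()[1]
    print(udp_port_in_use("127.0.0.1", held_port))
```

The first socket keeps working throughout, which matches the point that the failed second start should not take the running resolver down.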
-
Not grabbing the bunting just yet, but the short-lived 0606 beta booted OK and without all the noise about the interface allegedly having issues. It did have one cycle of reacting to a non-existent interface issue while running, but the improvement at boot was quite a change.
Now running the 0610 beta and still no issues at boot and no false interface issues for 30 hours+.
Did a bug get caught and shot?
-
No not as far as I know. Which is interesting! Hmmm
-
@stephenw10 said in DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208:
No not as far as I know. Which is interesting! Hmmm
It could be a big coincidence of course but my last hotplug event:
2025-06-11 15:35:26.596764+01:00 php-fpm 5667 /rc.linkup: HOTPLUG: Configuring interface opt3
2025-06-11 15:35:26.596749+01:00 php-fpm 5667 /rc.linkup: DEVD Ethernet attached event for opt3
2025-06-11 15:35:26.596686+01:00 php-fpm 5667 /rc.linkup: Hotplug event detected for LAN(opt3) dynamic IP address (4: 10.0.1.1, 6: track6)
2025-06-11 15:35:26.596176+01:00 check_reload_status 648 Reloading filter
...was over 52 hours ago.
I'll keep monitoring.
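For the monitoring, something like this Python sketch could work out the time since the last hotplug entry from an exported system log. It assumes the timestamp format shown in the log lines above, with one entry per line; the function names and the "now" value in the sample are hypothetical.

```python
from datetime import datetime, timedelta


def last_hotplug(log_text):
    """Return the timestamp of the newest 'Hotplug event detected' entry, or None."""
    latest = None
    for line in log_text.splitlines():
        if "Hotplug event detected" not in line:
            continue
        parts = line.split(" ", 2)  # date, time+offset, rest of the entry
        if len(parts) < 3:
            continue
        stamp = datetime.fromisoformat(f"{parts[0]} {parts[1]}")
        if latest is None or stamp > latest:
            latest = stamp
    return latest


def hours_since(stamp, now):
    """Elapsed time between two timezone-aware datetimes, in hours."""
    return (now - stamp) / timedelta(hours=1)


sample = (
    "2025-06-11 15:35:26.596686+01:00 php-fpm 5667 /rc.linkup: Hotplug event "
    "detected for LAN(opt3) dynamic IP address (4: 10.0.1.1, 6: track6)\n"
    "2025-06-11 15:35:26.596176+01:00 check_reload_status 648 Reloading filter\n"
)

stamp = last_hotplug(sample)
# Hypothetical "now" for the example; on a live system you would use
# datetime.now().astimezone() instead.
now = stamp + timedelta(hours=52)
print(f"{hours_since(stamp, now):.1f} hours since last hotplug event")
```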
-
The fix for VIPs on PPPoE went into that beta. But I'm not sure how that would affect LAN...
-
Over 72 hrs since my last boot and zero issues with the false interface errors. Packages are happy and DNS resolver has a healthy cache again. Looking like a fix.
@stephenw10 said in DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208:
The fix for VIPs on PPPoE went into that beta. But I'm not sure how that would affect LAN...
Well it didn't really look like a genuine LAN issue from the start. Multiple DACs, vendors and different physical interfaces all showed the same issue.
Regress to v24.11 and the issue went away. Simple DHCP on the WAN to another router and the issue went away again. Remove IPv6 and the issue went away. Use v25.03b with the old PPPoE and the issue went away.
With v25.03b + PPPoE + IPv6 the problem asserted itself, perhaps with the odd unsolicited RA in the mix producing the loose periodicity noted. That and perhaps the new SFP28 driver.
Anyway, I am just a sample size of one but so far it all looks good with beta 0610.
I've got logs pre and post fix if required.
-
Nice. More fixes are incoming for various things.
If you're able to test this patch that would be good: https://forum.netgate.com/post/1218007
It works fine everywhere I've tested it but, as we are finding, there are significant differences in pppoe connections.
-
@stephenw10
This one? https://nc.netgate.com/nextcloud/s/bt2fWWjdzT4KFHy
What should it do?
It does not work for me though:
Patch does not apply cleanly (detail)
Patch does not revert cleanly (detail)
Debug Result: Fail
This patch does not apply or revert cleanly. The patch settings may be incorrect, the patch content may not be relevant to this version, or the patch may depend upon another separate patch which must be applied first.
-
Oh it's probably for 2.8 and you're on the latest 25.03-beta. I'll get a diff for that tomorrow.
It allows for ISPs that don't send RAs. Otherwise the gateway is never populated. But we need to be sure it doesn't break anything in the process! (I'm pretty sure it doesn't.)
-
@stephenw10 said in DNS resolver exiting when loading pfblocker 25.03.b.20250409.2208:
Oh it's probably for 2.8 and you're on the latest 25.03-beta. I'll get a diff for that tomorrow.
Any news about diff for 25.03, @stephenw10?
-
Checking...
-
OK try this. It should apply against the June 10th 25.03 build.
-
@stephenw10
Awesome, I applied the patch and am testing.