pfBlockerNG-devel v3.1.0_7 update - Unbound Issue
-
This afternoon I tried the following.
-
As per @Gertjan's instructions in this thread, I disabled pfBlockerNG-devel completely, stopped the unbound service, checked/killed any unbound instances (there weren't any), then started the unbound service again and finally re-enabled pfBlockerNG-devel with the DNSBL option enabled.
-
"Force Reload All" completed successfully. Snippet as follows:
Clearing all DNSBL Feeds
Added DNSBL Unbound python integration settings
Adding DNSBL Unbound python mounts:
Creating: /var/unbound/usr/local/bin
Mounting: /usr/local/bin
Creating: /var/unbound/usr/local/lib
Mounting: /usr/local/lib
DNS Resolver ( enabled ) unbound.conf modifications:
Added DNSBL Unbound Python mode
Added DNSBL Unbound Python mode script
VIP address(es) configured
Restarting DNSBL Service
Stopping Unbound Resolver.
Unbound stopped in 2 sec.
Additional mounts (DNSBL python):
Mounting: /lib
Mounting: /dev
Mounting: /var/log/pfblockerng
Mounting: /usr/local/share/GeoIP
Starting Unbound Resolver... completed [ 12/9/22 16:49:12 ]
Restarting DNSBL Service (DNSBL python)
DNSBL update [ 0 | PASSED ]... completed
All was working fine for about 18 minutes. The first and last entries in dns_reply.log are below:
DNS-reply,Dec 9 16:49:14,resolver,NULL,SOA,3600,_ta-4f66,127.0.0.1,SOA,unk
DNS-reply,Dec 9 17:07:35,reply,A,A,5,teams.office.com,10.2.0.1,NXDOMAIN,unk
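For anyone wanting to check the gap between two dns_reply.log entries without doing the arithmetic by hand, here is a minimal Python sketch. It assumes only that the second comma-separated field is the timestamp; the helper name `entry_time` is made up for illustration, and the year 2022 is taken from the reload log above because the entries themselves carry no year.

```python
from datetime import datetime

def entry_time(line: str, year: int = 2022) -> datetime:
    """Return the timestamp held in the second comma-separated field."""
    stamp = line.split(",")[1]
    # The log entry has no year, so one is supplied explicitly.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

first = "DNS-reply,Dec 9 16:49:14,resolver,NULL,SOA,3600,_ta-4f66,127.0.0.1,SOA,unk"
last = "DNS-reply,Dec 9 17:07:35,reply,A,A,5,teams.office.com,10.2.0.1,NXDOMAIN,unk"

gap = entry_time(last) - entry_time(first)
print(gap)  # 0:18:21
```

That gives 18 minutes and 21 seconds between the first and the last logged reply.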
-
After the last entry, all services including unbound are still running, but DNS queries to the pfSense box fail and are not being logged either.
-
I then tried to disable the DNSBL option in pfBlockerNG-devel, followed by "Force Reload All". This throws up errors and eventually hangs:
Removing DNSBL Unbound python integration settings
DNS Resolver ( disabled ) unbound.conf modifications:
Removed DNSBL Unbound Python mode
Removed DNSBL Unbound Python mode script
Stop Service DNSBL
Stopping Unbound Resolver..............................
Removing DNSBL Unbound python mounts:
Unmounting: /usr/local/bin
Removing: /var/unbound/usr/local/bin
Unmounting: /usr/local/lib
Removing: /var/unbound/usr/local/lib
Removing: /var/unbound/usr/local
Removing: /var/unbound/usr
Starting Unbound Resolver.
DNSBL disabled - Unbound conf update FAIL
*** Fix error(s) and a Force Reload required! ***
====================
[1670606400] unbound[83318:0] error: bind: address already in use
[1670606400] unbound[83318:0] fatal error: could not open ports
====================
Stopping Unbound Resolver..............................
Removing DNSBL Unbound python mounts:
Removing: /var/unbound/usr/local
Removing: /var/unbound/usr
Starting Unbound Resolver.
The only way to get DNS resolution working again is to reboot the pfSense box.
As I mentioned in my earlier post, I've not had any issues with pfBlockerNG-devel for some time. Assuming it is not possible to roll back to v3.1.0_6, I'm hoping somebody can help me resolve this issue?
Thanks in advance.
-
-
@aberdino said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Stopping Unbound Resolver..............................
That's not a pfBlockerNG issue.
Every dot is 'a second' during which pfBlocker has instructed unbound to stop, but it hasn't stopped yet.
After 20 seconds or so (count the dots), pfBlocker gives up and continues.
But it can't start unbound, as the old instance is still running, so it gets the 'can't bind to ports (53)' error: the ports are still occupied by the initial instance that never stopped.
Can you put unbound in a more verbose mode? Services > DNS Resolver > Advanced Settings, log level 2 or even 3. Then you can see what it does from the moment it receives a "stop".
Also try switching from Python mode to Unbound mode and see what happens.
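The "address already in use" error is ordinary socket behaviour and can be reproduced without unbound. This is a rough Python sketch, not anything pfBlockerNG or unbound actually runs: the helper `try_bind` is made up, and an OS-chosen UDP port on 127.0.0.1 stands in for port 53 so that it runs without root.

```python
import errno
import socket

def try_bind(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Bind a UDP socket to host:port, closing it again if the bind fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    return sock

# The "old instance": takes a free port (0 lets the OS pick one).
first = try_bind(0)
port = first.getsockname()[1]

# The "new instance": tries the same port while the old one still holds it.
try:
    second = try_bind(port)
except OSError as exc:
    conflict = exc.errno == errno.EADDRINUSE
    print("second bind failed:", exc)
else:
    conflict = False
    second.close()

# Once the old instance really has gone, the same bind succeeds.
first.close()
third = try_bind(port)
third.close()
```

The second bind only fails while the first socket is open, which matches a new unbound being started while the old one is still holding its ports.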
@aberdino said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
All was working fine for about 18 minutes,
Unbound stops answering ....
From LAN ?
All LANs, if you have more than one ?
From 127.0.0.1 ?
What are your unbound settings ? (interfaces selected ?)
Are there interface events ? See main system log ?
Example: when I have a bad LAN connection (the cable, the switch in front of it, the LAN NIC) or some other very frequent NIC event in pfSense, the main system log shows lines like "igc0: link state changed to UP" or DOWN, where igc0 is a NIC, or some OpenVPN event, etc. Each such event can make unbound restart. If this happens very often, it will 'fall'.
And of course, the well-known one, which I mention just to be sure: the 'Register DHCP leases in the DNS Resolver' option.
If this one is set, unbound will get restarted on every DHCP lease.
If you have many devices, or a Chinese brain-dead DHCP client, or some other bad wifi event (so DHCP gets initiated every time wifi comes back), then unbound can get restarted a lot.
You can see this happening in the unbound Resolver log.
Typically, you don't want the resolver to restart more than once a day ... if possible even less.
My pfBlockerNG only syncs feeds once a week ( ! ), so it will restart unbound once a week at most (maybe less, as feeds do not always change).
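To put a number on how often the resolver restarts, something like the Python sketch below could be run over the resolver log. It rests on assumptions: that the log lines are syslog-style and begin with month and day, and that every start shows up as unbound's "start of service" message. The sample lines (host name, PIDs, version) are made up for illustration, as is the function name.

```python
from collections import Counter

def restarts_per_day(lines):
    """Count unbound start-ups per calendar day in resolver log lines."""
    counts = Counter()
    for line in lines:
        if "start of service" in line:
            # Assumed layout: "<month> <day> <time> <host> unbound[...]: ..."
            month, day = line.split()[:2]
            counts[f"{month} {day}"] += 1
    return counts

# Hypothetical sample lines, not taken from a real system.
sample_log = [
    "Dec 9 03:00:05 pfSense unbound[1201]: [1201:0] info: start of service (unbound 1.13.2).",
    "Dec 9 03:00:06 pfSense unbound[1201]: [1201:0] info: generate keytag query _ta-4f66. NULL IN",
    "Dec 9 11:42:17 pfSense unbound[3377]: [3377:0] info: start of service (unbound 1.13.2).",
    "Dec 9 16:49:12 pfSense unbound[8331]: [8331:0] info: start of service (unbound 1.13.2).",
    "Dec 10 03:00:04 pfSense unbound[9120]: [9120:0] info: start of service (unbound 1.13.2).",
]

counts = restarts_per_day(sample_log)
for day, n in counts.items():
    print(day, n)
```

A day with dozens of starts would point at interface or DHCP events rather than at the feed sync.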
@AberDino
As a sanity check and to let you know that you're not alone, I am experiencing the exact same issue. I am running pfSense 2.6.0 and had been running pfBlockerNG 3.1.0_6 since it was released. Both pfSense and pfB 3.1.0_6 had been running without any issues. After upgrading from _6 to _7, Unbound would fail to resolve requests on all configured interfaces until DNSBL was disabled and pfSense was rebooted. The errors that I see in the logs match yours almost exactly. For the record, my DNSBL is/was running in "Unbound Python Mode".
Either whatever is causing this issue is not widespread or not that many people on 2.6.0 have applied the _7 upgrade yet, otherwise there would probably be more people posting about this issue. In any case, I will be watching this thread and the forum to see if anyone is able to help in finding the root cause and any possible fixes and to see if more people experience this issue. I am ready & willing to provide any additional information that may be requested in order to help figure what our configurations have in common that may be causing this issue. Until then, I'm going to continue trying to see if I can find the root cause and how to resolve it.
Edit:
@gertjan said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Try also to switch from Python to unbound mode
When I switch from "Unbound Python Mode" to "Unbound Mode", the force reload completes and all appears to work fine.
-
I haven't had any issues since upgrading pfBlockerNG to 3.1.0_7. I am running pfSense 2.6.0, and DNSBL is running in "Unbound Python Mode".
I don't have any ideas at the moment as to what could be causing your problem.
@jdeloach
Thanks. With so many different configuration possibilities, I'm sure that it's some sort of edge case that isn't being handled. But whatever the cause, I'd much rather find & fix it than have to run without DNSBL in "Unbound Python Mode".
Currently, I am trying to see if having Wireguard and/or HAProxy running is causing the DNS Resolver to not stop and then stop resolving.
For now, even though I have a little less functionality due to not having my regex-based list being filtered, at least I am able to run in "Unbound Mode". That's better than no DNSBL at all. :)
Back to troubleshooting...
Thanks again.
-
Thanks for the replies and suggestions.
@gertjan said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
That's not a pfBlockerNG issue.
Well, pfBlocker interacts with unbound, and it appears that something has changed in v3.1.0_7 compared to v3.1.0_6, as I did not make any other changes and all was working fine before.
@gertjan said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Can you put unbound in more verbose mode ? Services > DNS Resolver > Advanced Settings level 2 or even 3.
Now set to level 3, so tomorrow I will recreate the issue and see what's being logged.
@gertjan said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Try also to switch from Python to unbound mode
If I do that, I lose "no AAAA" filtering, which is not an option for me.
@gertjan said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Unbound stops answering ....
From LAN ?
All LANs, if you have more than one ?
From 127.0.0.1 ?
What are your unbound settings ? (interfaces selected ?)
Are there interface events ? See main system log ?
I've got multiple VLAN interfaces and unbound stops responding on all of them. When I use the pfSense "DNS Lookup" function (Diagnostics menu), it takes a long time to resolve, I guess because it eventually falls back to an upstream DNS server. In unbound I've got 'All' interfaces selected. The only other options selected are 'Enable DNSSEC Support', 'Enable Forwarding Mode' and 'Register DHCP static mappings in the DNS Resolver'. For the avoidance of doubt, the 'Register DHCP leases in the DNS Resolver' is NOT ticked. Interfaces are stable (no connects/disconnects reported).
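To answer the "from where does it stop answering" question more directly, a small UDP probe with a short timeout can be pointed at each listening address in turn. This is only a sketch in Python: `build_query` and `probe` are illustrative names, the query is a hand-built plain A-record question with no EDNS, and the two-second timeout is an arbitrary choice.

```python
import socket
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (one question, recursion desired)."""
    # Header: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A by default) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)

def probe(server: str, name: str, timeout: float = 2.0) -> bool:
    """Return True if the server sends back any reply carrying our query ID."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and "connection refused"
            return False
    return len(reply) >= 12 and reply[:2] == query[:2]
```

Calling `probe("127.0.0.1", "teams.office.com")` on the firewall itself, and then the same with each VLAN interface address from a client, would show whether the resolver is silent everywhere or only on some interfaces.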
@gertjan said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
My pfBlockerNG only syncs feeds once a week
I've got mine set up to sync once a day (during the night).
@thexman said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
As a sanity check and to let you know that you're not alone, I am experiencing the exact same issue.
Thanks, it is good to know I'm not alone! There must be something in common with our setups.
@thexman said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Currently, I am trying to see if having Wireguard and/or HAProxy running is causing the DNS Resolver to not stop and then stop resolving.
I don't have those services running.
-
I spent a bit of time trying to troubleshoot the issue earlier today. As was alluded to in a prior post, when unbound stops responding to requests, the service can't be stopped. I tried to stop it through the GUI, and I also tried to stop the multiple instances by issuing KILL commands for the PIDs through an SSH shell; neither way would stop the service. As you said, the only way to get unbound working again was to reboot, either with DNSBL disabled or by switching DNSBL from Unbound Python Mode to Unbound Mode.
@aberdino said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
I don't have those services running.
Thanks for the confirmation. I ruled out WireGuard and HAProxy early on by stopping both services and switching back to Unbound Python Mode. It made no difference: as soon as I did a Force Reload of DNSBL, unbound would hang. I also compared the unbound .conf files in a working state and in a hung unbound state. I didn't see anything that was out of the ordinary. Of course, when unbound stops responding, the CPU spikes up a bit and stays there, even when no DNS requests are inbound.
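For anyone who wants to repeat the config comparison, a sketch using Python's standard `difflib` is below. The two snapshot strings are placeholders, not my actual config, and `conf_diff` is just an illustrative helper name.

```python
import difflib

def conf_diff(working: str, hung: str) -> list[str]:
    """Return a unified diff of two unbound.conf snapshots (empty if identical)."""
    return list(difflib.unified_diff(
        working.splitlines(),
        hung.splitlines(),
        fromfile="unbound.conf (working)",
        tofile="unbound.conf (hung)",
        lineterm="",
    ))

# Placeholder snapshots for illustration only.
working_conf = "server:\nverbosity: 1\nport: 53\n"
hung_conf = "server:\nverbosity: 3\nport: 53\n"

for line in conf_diff(working_conf, hung_conf):
    print(line)
```

An empty result means the two snapshots are identical, which is effectively what I saw apart from expected differences.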
As you said, if this isn't a pfB issue, it certainly has something to do with the pfB upgrade package because pfB was working fine on my box with 3.1.0_6 and all of the previous versions back to 2.x.
@aberdino said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
I've got multiple VLAN interfaces
Well that's something our configurations have in common. I also have VLAN interfaces.
Hopefully, either someone more knowledgeable with the changes that were made and how it may be affecting our systems will have a solution or it becomes a more widespread issue as people start applying the upgrade and it'll get more attention.
-
@thexman said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
I also have VLAN interfaces.
When you don't use any VLANs, does the issue go away ?
That would be rather easy to demonstrate, I guess.
@thexman said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
I also tried to stop the multiple instances by issuing KILL commands for the PIDs through an SSH shell, neither way would stop the service.
So your issue isn't related to pfB at all.
pfB doesn't change unbound, the process.
If "python mode" is used, only this line gets added to the unbound config file:
python-script: pfb_unbound.py
But whether this method is used or not, unbound is already using Python for other built-in functionality.
In pfB "unbound mode", nothing gets changed.
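A quick way to confirm which mode a given config is in is to look for that directive. The Python sketch below assumes the directive appears on a line of its own as shown above; the function name `python_script` and the sample strings are made up for illustration.

```python
def python_script(conf_text: str):
    """Return the script named by a python-script directive, or None."""
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if line.startswith("python-script:"):
            return line.split(":", 1)[1].strip().strip('"')
    return None

# Illustrative snippets, not complete unbound.conf files.
with_python = "python:\npython-script: pfb_unbound.py\n"
without_python = "server:\nverbosity: 1\n"

print(python_script(with_python))     # pfb_unbound.py
print(python_script(without_python))  # None
```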
-
@aberdino said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
Now set to level 3, so tomorrow I will recreate the issue and see what's being logged.
See resolver.zip file, but it doesn't seem to provide any clues as logging stops when unbound 'hangs', without any errors reported.
As it's the "no AAAA" filtering which is the most important to me (HE IPv6 tunnel), I've disabled the pfBlockerNG DNSBL option and I'm back to using a "no AAAA" python script with unbound. As @TheXman says, perhaps if/when it does become a more widespread issue, somebody will get to the bottom of it.
-
-
@aberdino See post:
https://forum.netgate.com/topic/176350/pfblockerng-devel-v3-1-0_7-v3-1-0_14/42
-
@steveits said in pfBlockerNG-devel v3.1.0_7 update - Unbound Issue:
See post:
https://forum.netgate.com/topic/176350/pfblockerng-devel-v3-1-0_7-v3-1-0_14/42
Thanks, I followed the instructions to go back to the previous pfb_unbound.py version, and it appears to have resolved my issues with unbound becoming unresponsive, so at least that confirms it is pfBlockerNG related.
-
-