CRON stall when reloading UNBOUND
-
Hi! I'm a newbie using pfBlockerNG.
I can't find the error that causes the pfBlockerNG cron job to fail when it stops and restarts the Unbound resolver.
The cron job had been "running" for a whole day (i.e. the process never ended), and I resolved it by disabling pfBlockerNG and DNSBL, rebooting my Intel PC running pfSense, and re-enabling them. It then worked for 10 cycles (the cron job runs once an hour by default), but this morning it froze again at 8:00 am, when the cron job started.
This is the last part of the log:
Stopping Unbound Resolver..............................
Additional mounts: No changes required.
Starting Unbound Resolver.
DNSBL disabled - Unbound conf update FAIL
*** Fix error(s) and a Force Reload required! ***
====================
[1618054734] unbound[362:0] error: bind: address already in use
[1618054734] unbound[362:0] fatal error: could not open ports
====================
Stopping Unbound Resolver..............................
Additional mounts:
Starting Unbound Resolver.
**Saving configuration [ 04/10/21 08:42:40 ]**
At 08:42:40 I decided to disable pfBlockerNG and DNSBL and reboot again. Ten hours later the system is up with no other issues.
Because it's a critical service (I work at a healthcare company) and it's Saturday, I decided not to re-enable it until Monday.
Any help is welcome!!
-
EDIT: I was looking through the logs and found this in system.log:
Apr 10 07:56:58 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (16 occurrences)
Apr 10 07:58:09 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (12 occurrences)
Apr 10 07:59:27 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (20 occurrences)
Apr 10 08:00:00 pfSense php[58548]: [pfBlockerNG] Starting cron process.
Apr 10 08:00:30 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (96 occurrences)
Apr 10 08:01:35 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (706 occurrences)
Apr 10 08:02:39 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (592 occurrences)
Apr 10 08:03:40 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (616 occurrences)
Apr 10 08:04:43 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (301 occurrences)
Apr 10 08:05:44 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (250 occurrences)
Apr 10 08:06:44 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1372 occurrences)
Apr 10 08:07:44 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1082 occurrences)
Apr 10 08:08:44 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1164 occurrences)
Apr 10 08:09:44 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1551 occurrences)
Apr 10 08:10:44 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1339 occurrences)
Apr 10 08:11:46 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1459 occurrences)
Apr 10 08:12:46 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1777 occurrences)
Apr 10 08:13:47 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1394 occurrences)
Apr 10 08:14:49 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1712 occurrences)
Apr 10 08:15:50 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (2304 occurrences)
Apr 10 08:16:50 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1295 occurrences)
Apr 10 08:17:51 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (726 occurrences)
Apr 10 08:18:51 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1534 occurrences)
Apr 10 08:19:54 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1434 occurrences)
Apr 10 08:20:54 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: Listen queue overflow: 193 already in queue awaiting acceptance (1120 occurrences)
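To see how the overflow counts change around the 08:00 cron start, the "occurrences" figure can be pulled out of each kernel line. This is only a rough sketch: the regular expression is written against the lines quoted above, and the names `OVERFLOW_RE` and `overflow_counts` are made up for illustration.

```python
import re

# Pattern written against the "Listen queue overflow" lines quoted above.
# Lines that do not match (e.g. the pfBlockerNG cron start) are skipped.
OVERFLOW_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}) \S+ kernel: sonewconn: "
    r"pcb 0x[0-9a-f]+: Listen queue overflow: "
    r"\d+ already in queue awaiting acceptance "
    r"\((?P<count>\d+) occurrences?\)"
)


def overflow_counts(lines):
    """Return a list of (timestamp, occurrences) for each overflow line."""
    results = []
    for line in lines:
        match = OVERFLOW_RE.match(line.strip())
        if match:
            results.append((match.group("stamp"), int(match.group("count"))))
    return results


# Four lines copied from the system.log excerpt above.
sample = [
    "Apr 10 07:59:27 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: "
    "Listen queue overflow: 193 already in queue awaiting acceptance (20 occurrences)",
    "Apr 10 08:00:00 pfSense php[58548]: [pfBlockerNG] Starting cron process.",
    "Apr 10 08:00:30 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: "
    "Listen queue overflow: 193 already in queue awaiting acceptance (96 occurrences)",
    "Apr 10 08:01:35 pfSense kernel: sonewconn: pcb 0xfffff8001ad2e5b8: "
    "Listen queue overflow: 193 already in queue awaiting acceptance (706 occurrences)",
]

for stamp, count in overflow_counts(sample):
    print(stamp, count)
```

Feeding it the whole excerpt shows the counts staying in the tens before 08:00 and climbing into the hundreds and thousands afterwards.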
-
Which cron?
pfBlockerNG puts 5 cron tasks in place.
What are these:
Listen queue overflow: 193 already in queue awaiting acceptance (1120 occurrences)
I've never seen that before.
-
@gertjan Thanks for your response!
In my pfSense installation I don't have (or can't find...) a Cron submenu under "Services". I'm using a PC as host, running 2.5.0-RELEASE (amd64).
About the errors in the logs, I found a topic that says it's a package-related message --> sonewconn errors
I've restarted pfB and DNSBL and it's working fine, but it randomly crashes when reloading the filter / restarting the Unbound service because of the (default) cron settings.
I'm a bit confused... hehehehe!
-
@gerardomdp said in CRON stall when reloading UNBOUND:
confused
My fault.
It isn't there by default.
But you want it! (By default.)
Go here:
System > Package Manager > Available Packages
and install pfSense's simplest package: a GUI extension that does nothing but show the installed cron tasks.
-
@gertjan Thanks, really THANKS !!!!
I was trying to find the cron jobs from the shell as root, but the system says there is no cron job for that user.
Well... the cron job that randomly falls into a black hole and stays forever in "starting unbound service" is the one that is marked. It's the only one of the five pfB-related cron jobs that runs every hour:
/usr/local/bin/php /usr/local/www/pfblockerng/pfblockerng.php cron >> /var/log/pfblockerng/pfblockerng.log 2>&1
When the system "crashes" the log says this:
Apr 13 10:17:07 pfSense php-fpm[8145]: /status_services.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1618319827] unbound[73340:0] error: bind: address already in use [1618319827] unbound[73340:0] fatal error: could not open ports'
Why could this happen?
Any suggestions welcome!
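For lining up the unbound error with system.log: the number in square brackets in front of the unbound message looks like a Unix epoch timestamp. A small sketch of converting it, where the function name `unbound_stamp_to_utc` is just made up for this example:

```python
import re
from datetime import datetime, timezone

# A 10-digit number in square brackets, as in "[1618319827] unbound[73340:0] ...".
STAMP_RE = re.compile(r"\[(\d{10})\]")


def unbound_stamp_to_utc(line):
    """Return the bracketed epoch value as a UTC datetime, or None if absent."""
    match = STAMP_RE.search(line)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


line = "[1618319827] unbound[73340:0] error: bind: address already in use"
print(unbound_stamp_to_utc(line).isoformat())  # 2021-04-13T13:17:07+00:00
```

That is 13:17:07 UTC, which would match the 10:17:07 stamp in system.log if the box runs on a UTC-3 timezone (an assumption, the timezone is not stated in the thread).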
-
This error happens when unbound was stopped (or crashed?) and, at that moment, is still bound to the interface (== it is still listening on one or more interfaces on port 53).
A new unbound instance is launched, but it can't start because there is already a process "using these ports".
That's when the message is shown.
What are you using as a pfSense device? What version?
See also the release notes for the new version 2.5.1 that came out yesterday: https://docs.netgate.com/pfsense/en/latest/releases/21-02-2_2-5-1.html
Also: unbound crashing... what happens when you remove some / most / all of the feeds from pfBlockerNG?
Which 'mode' are you using for pfBlockerNG? Unbound mode or Python mode?
-
@gertjan Thanks (again :) )
Let's go step by step...
This error happens when unbound was stopped (or crashed?) and, at that moment, is still bound to the interface (== it is still listening on one or more interfaces on port 53).
Before the crash I don't see this error, so I assume it is due to the Unbound state or something related.
A new unbound instance is launched, but it can't start because there is already a process "using these ports". That's when the message is shown.
Yes, it can't run because a previously started instance is still alive. I can't see what causes this state.
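The "address already in use" situation discussed above can be reproduced in isolation, without touching Unbound or port 53. This is only a minimal sketch using plain UDP sockets on loopback with an OS-chosen port. The helper `try_bind_udp` is a made-up name, and the real Unbound also opens TCP sockets, which this does not cover.

```python
import errno
import socket


def try_bind_udp(host, port):
    """Try to bind a UDP socket; return (socket, None) or (None, errno value)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        sock.close()
        return None, exc.errno
    return sock, None


# First "instance": takes a free port on loopback (port 0 lets the OS pick one).
first, err_first = try_bind_udp("127.0.0.1", 0)
port = first.getsockname()[1]

# Second "instance": asks for the same address and port while the first is alive.
second, err_second = try_bind_udp("127.0.0.1", port)
print("second bind refused:", err_second == errno.EADDRINUSE)

# Once the first socket is gone, the same bind succeeds.
first.close()
third, err_third = try_bind_udp("127.0.0.1", port)
print("bind after close ok:", err_third is None)
third.close()
```

So the message itself only says that something still holds the port when the new instance starts. It does not say why the old instance is still there.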
What are you using as a pfSense device? What version?
See also the release notes for the new version 2.5.1 that came out yesterday: https://docs.netgate.com/pfsense/en/latest/releases/21-02-2_2-5-1.html
My device is a PC based on an Asus motherboard with the H81 chipset, an Intel Celeron @ 2.8 GHz, 4 GB RAM and a 120 GB HDD. When the cron job fails, CPU utilization goes to 50-70%; normally it is under 10% (7% average). RAM is about 15-18% occupied.
I saw the update related to DNS Resolver/Unbound. On Monday I will upgrade to v2.5.1 and see what happens.
Also: unbound crashing... what happens when you remove some / most / all of the feeds from pfBlockerNG?
I don't know, I've been using the default feeds. I could change the settings and see what the result is.
Which 'mode' are you using for pfBlockerNG? Unbound mode or Python mode?
I'm using "Unbound mode".
Hope that's clear... ;)
-
@gerardomdp said in CRON stall when reloading UNBOUND:
I don't know, I've been using the default feeds. I could change the settings and see what the result is.
Which 'mode' are you using for pfBlockerNG? Unbound mode or Python mode?
I'm using "Unbound mode".
Hope that's clear... ;)
On a 4 GB box, you have to limit the size of the DNSBL database in Unbound mode, as it will drain loads of memory. Enabling all feeds is problematic. Remove the biggest lists, Force Reload All, and tune until it doesn't crash.
Enabling Resolver Live Sync may help as it doesn't reload Unbound on Cron Update, but expect a delay when restarting Unbound with too many DNSBL definitions.
The listen queue overflow is probably caused by a network driver or tuning.
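To decide which lists count as "the biggest", one option is to count the entries per list file. A rough sketch follows. The directory and the `*.txt` pattern are placeholders (where pfBlockerNG keeps its DNSBL files is not stated in this thread), and `rank_lists_by_size` is a made-up helper name.

```python
import tempfile
from pathlib import Path


def rank_lists_by_size(directory, pattern="*.txt"):
    """Return [(filename, entry_count)], largest list first."""
    ranked = []
    for path in Path(directory).glob(pattern):
        with path.open(errors="replace") as handle:
            # Count non-empty lines that are not comments.
            count = sum(
                1 for line in handle
                if line.strip() and not line.lstrip().startswith("#")
            )
        ranked.append((path.name, count))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Demo with two throwaway files instead of real feeds.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "small_feed.txt").write_text("# comment\na.example\nb.example\n")
    Path(tmp, "big_feed.txt").write_text(
        "".join(f"host{i}.example\n" for i in range(1000))
    )
    demo_ranking = rank_lists_by_size(tmp)

for name, count in demo_ranking:
    print(name, count)
```

Entry count is only a proxy for memory use, but it gives an order in which to try removing lists.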
-
@ronpfs said in CRON stall when reloading UNBOUND:
Enabling Resolver Live Sync may help as it doesn't reload Unbound on Cron Update
Live Sync means that pfBlockerNG3 loads unbound with flat DNSBL files. A very nice solution when there are 10, 100, or a couple of thousand lines == DNSBL entries.
It starts being bad when there are hundreds of thousands, or millions.
For every DNS lookup, unbound has to parse the entire list. These lists have to be read into unbound's memory space to be used.
The Python mode does the same thing. Now it's the external Python that parses the same lists, but it's far more optimized to do so. And there is a bonus: lists can be changed on disk without unbound being restarted when there are major list updates. So unbound restarts less often, or even not at all.
-
@gertjan said in CRON stall when reloading UNBOUND:
Live Sync means that pfBlockerNG3 loads unbound with flat DNSBL files.
When enabled, updates to the DNS Resolver DNSBL database will be performed Live without reloading the Resolver.
During Cron Updates, it updates Unbound with only the changes, using unbound-control, so there is no interruption and no memory shortage. However, an Unbound Restart/Reload will check the *.conf files and then load all *.conf files into its db, draining memory.
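The "only changes" idea can be pictured as a set difference between the previous and the new list. The sketch below is just that picture, not pfBlockerNG's actual code (which is not shown in this thread). `dnsbl_delta` and the example domains are made up.

```python
def dnsbl_delta(old_domains, new_domains):
    """Return (to_add, to_remove) between two DNSBL domain lists."""
    old_set, new_set = set(old_domains), set(new_domains)
    to_add = sorted(new_set - old_set)       # entries only in the new list
    to_remove = sorted(old_set - new_set)    # entries that disappeared
    return to_add, to_remove


previous = ["ads.example", "tracker.example", "old.example"]
current = ["ads.example", "tracker.example", "new.example"]

to_add, to_remove = dnsbl_delta(previous, current)
print("add:", to_add)
print("remove:", to_remove)
```

With a large list where only a few entries change per update, the delta stays small, which is why a live update is so much lighter than a full restart that loads every entry again.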