CPU usage increase suddenly
-
@stephenw10 Interestingly, when I try to start the DNSBL service, it fails. Nothing in the log that's helpful.
pfBlockerNG's IP blocking is working without issue. It's just the DNS sinkholing that's failing. My curiosity here is piqued.
Any commands you can recommend to gain some insight?
-
@stephenw10 Restarted the entire pfBlockerNG package and now it's functioning. The increase in CPU usage has come back, and it's unbound related.
How can I diagnose this better? Below is when I had pfBlockerNG enabled without DNSBL. Then I turned it back on.
-
It's just really weird that unbound, tied in with pfBlockerNG, is acting so strangely.
-
-
@Phizix Historically, CPU utilization hasn't been an issue on my SG-6100. I have a baseline, so that's how I know this is an issue. Right now, although DNSBL is the problem, it's not causing any system instability. I would like to know why it's acting this way, if there is indeed an issue, which I suspect there is.
-
Makes sense if you had a baseline to compare to. So indeed more CPU usage than normal.
I am curious what you find when you solve it. Was there a recent package update?
Phizix
-
@Phizix I'll keep this thread as updated as I can. I started a Reddit post on it, so I hope the maintainer can respond there as well. @BBcan177
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C    TIME    WCPU COMMAND
36790 unbound       1 118    0   294M   260M CPU3     3  551:03 100.96% unbound
 8387 www           1  20    0    44M    23M kqread   3   41:13   1.27% haproxy
59321 root          5  20    0   666M   578M nanslp   3  187:01   0.79% suricata
18304 root          1  20    0    12M  2488K kqread   3    7:11   0.75% dhcpleases
23975 root          5  20    0   859M   711M nanslp   3  172:29   0.70% suricata
98310 root          5  20    0   745M   667M nanslp   2  145:47   0.58% suricata
30622 root          1  20    0    13M  3152K select   0  105:28   0.44% syslogd
35711 root          1  20    0    13M  3584K bpf      3   66:04   0.22% filterlog
96223 root          1  20    0    14M  4000K CPU2     2    0:00   0.10% top
60163 nobody        1  20    0    16M  4996K select   3    0:32   0.08% softflowd
46104 zabbix        1   4    0    24M    11M select   1    5:45   0.08% zabbix_agentd
13937 root         17  68    0   107M    28M sigwai   1    7:27   0.08% charon
45925 zabbix        1  20    0    24M    11M select   1    5:39   0.07% zabbix_agentd
60519 dhcpd         1  20    0    25M    13M select   1    0:49   0.07% dhcpd
31822 root          1  20    0    18M  8012K select   0    9:39   0.05% openvpn
48305 root          3  20    0    69M    35M kqread   1    1:26   0.05% syslog-ng
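For anyone who wants to compare snapshots like this against a baseline over time, here is a minimal Python sketch that picks out the processes above a CPU threshold from saved `top` output. It assumes the 12-column, non-thread layout shown above; the function names and the 50% threshold are made up for illustration, and the sample is just the first two rows from this post.

```python
def parse_top_line(line):
    """Parse one process row of `top` output (12-column, non-thread layout)."""
    # COMMAND is the 12th field and may contain spaces, so split at most 11 times.
    fields = line.split(None, 11)
    if len(fields) < 12 or not fields[0].isdigit():
        return None  # header, blank, or unexpected line
    return {
        "pid": int(fields[0]),
        "user": fields[1],
        "wcpu": float(fields[10].rstrip("%")),
        "command": fields[11].strip(),
    }


def busy_processes(top_text, threshold=50.0):
    """Return rows whose WCPU is at or above the (hypothetical) threshold."""
    rows = (parse_top_line(line) for line in top_text.splitlines())
    return [row for row in rows if row and row["wcpu"] >= threshold]


# First two rows from the snapshot in this post.
SAMPLE = """\
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C    TIME    WCPU COMMAND
36790 unbound       1 118    0   294M   260M CPU3     3  551:03 100.96% unbound
 8387 www           1  20    0    44M    23M kqread   3   41:13   1.27% haproxy
"""

for row in busy_processes(SAMPLE):
    print(row["pid"], row["user"], row["command"], row["wcpu"])
```

Feeding it periodic snapshots would at least show when unbound leaves its usual range, though it says nothing about why.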
-
Problem solved. Reinstalled pfBlockerNG and made sure I had the 'Keep settings' option enabled.
It was really a last resort. I didn't know if a reinstall would fix it, but I knew there was something wrong with the configuration.
I had a custom DNS block list that was blocking example.com. I removed it a while ago, but I noticed the domain was still getting sinkholed. I triple checked to make sure the domain wasn't listed, but pfBlockerNG was indeed blocking it.
After the reinstall I'm back to baseline. Weird bug in the package, but without other tools to debug I can't say why the package freaked out the way it did. I also can't reproduce the problem anymore.
-
-
The problem has come back.
Restarting unbound or DNSBL doesn't solve the problem. The only solution is to disable DNSBL, and then CPU utilization goes back to normal.
I honestly have no idea and I'm at a loss.
I reinstalled the package from scratch, without saving any settings.
   11 root     187 ki31     0B    64K RUN      2 862:26  87.99% [idle{idle: cpu2}]
   11 root     187 ki31     0B    64K CPU3     3 835:22  84.28% [idle{idle: cpu3}]
   11 root     187 ki31     0B    64K CPU1     1 851:44  77.69% [idle{idle: cpu1}]
18451 unbound   68    0   235M   202M kqread   1   7:55  46.19% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
18451 unbound   68    0   235M   202M kqread   1   0:00  46.00% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
18451 unbound   68    0   235M   202M kqread   0   0:00  46.00% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
18451 unbound   68    0   235M   202M kqread   2   0:00  46.00% /usr/local/sbin/unbound -c /var/unbound/unbound.conf{unbound}
-
Interesting... Checking the DNS logs, I see the same script being loaded over and over again.
I'm still poking around. No clue right now :)
What's also weird is that unbound keeps restarting.
-
Can you disable Python mode in pfBlocker to test?
If you run
ps -auxwwd
can you see what script is actually running?
-
@stephenw10 I think I solved it. I'm fairly confident it's solved now... I hope.
The clues are in the DNS logs. I noticed Unbound kept restarting, and I remembered reading on the forums a while ago that DHCP registration causes Unbound to restart. I do have registration enabled for all VLANs, so I wasn't totally buying that as a reason. Regardless, I reviewed the DHCP configuration for each VLAN, and what do I find?
Ahhh, this aggressive lease timer. I stood up new DNS servers for a VLAN and needed clients to switch over quickly. I never changed it back until today. I switched back to the defaults and CPU utilization dropped back down to normal baseline levels.
Someone correct me if I'm wrong, but I thought the DHCP registration issue that needed Unbound to restart was solved in the latest release?
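For anyone chasing the same symptom, a rough Python sketch of the log check described above: count how often Unbound logs a fresh start per minute. It keys on Unbound's "start of service" message and assumes classic BSD-style syslog timestamps; the sample lines, hostname, and version string below are invented for illustration, so adjust the parsing to whatever your resolver log actually looks like.

```python
import re
from collections import Counter

START_MARKER = "start of service"  # logged by Unbound each time it starts


def restarts_per_minute(log_text):
    """Count Unbound start messages, bucketed by 'Mon DD HH:MM'."""
    counts = Counter()
    for line in log_text.splitlines():
        if "unbound" in line and START_MARKER in line:
            # Assumes lines begin with a timestamp like 'Mar  3 10:15:02'.
            match = re.match(r"(\w{3}\s+\d+\s+\d{2}:\d{2})", line)
            if match:
                counts[match.group(1)] += 1
    return counts


# Hypothetical log excerpt, not taken from the affected firewall.
SAMPLE_LOG = """\
Mar  3 10:15:02 fw unbound[18451]: [18451:0] info: start of service (unbound 1.13.2).
Mar  3 10:15:31 fw unbound[18451]: [18451:0] info: service stopped (unbound 1.13.2).
Mar  3 10:15:33 fw unbound[18460]: [18460:0] info: start of service (unbound 1.13.2).
Mar  3 10:16:04 fw unbound[18472]: [18472:0] info: start of service (unbound 1.13.2).
"""

for minute, count in sorted(restarts_per_minute(SAMPLE_LOG).items()):
    print(minute, count)
```

More than the odd start per hour is a hint that something (here, DHCP registration with a very short lease) keeps kicking the resolver.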
-
Part of that issue was solved, but it still restarts Unbound to load the new values every time. Which is... suboptimal!
Yes, 60s is very short. Any reason it was set to that?
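To put a rough number on why 60s hurts, here is a back-of-the-envelope Python sketch. It assumes clients renew at the standard T1 point of half the lease time (the RFC 2131 default) and treats each renewal as a potential trigger for the reload described above. The client count of 25 and the 7200s comparison value are illustrative assumptions, not figures from this thread.

```python
def renewals_per_hour(lease_seconds, clients):
    """Estimate DHCP renewals per hour, assuming renewal at T1 = lease / 2."""
    renew_interval = lease_seconds / 2  # RFC 2131 default renewal time (T1)
    return clients * 3600 / renew_interval


# 25 clients is a made-up number; 7200 s is used only as a comparison point.
print(renewals_per_hour(60, 25))    # 3000.0
print(renewals_per_hour(7200, 25))  # 25.0
```

Under those assumptions the short lease produces over a hundred times more lease events per hour, which fits with Unbound spending most of its time restarting and reloading.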
-
@stephenw10 said in CPU usage increase suddenly:
Yes, 60s is very short. Any reason it was set to that?
I stood up new DNS servers and wanted devices to cut over right away, which worked but caused an issue for me.
This entire issue smelled like a config problem, but I couldn't prove it at the time. I went against my rule of rebooting the firewall, as I truly dislike doing that, especially if things were working before.