Puzzling CPU Usage
-
@LPD7 said in Puzzling CPU Usage:
Also b4 I forget I also updated PFB to 3.2.0_18 yesterday as well so that could also be the reason as well.
possible - I guess (also not likely)
read this
https://forum.netgate.com/topic/190361/pfblockerng-devel-3-2-0_18/7?_=1727956822864
remember the pfblocker you are running "_18" is -devel code - so there are most likely "things" -- if you are "tinkering" that is ok, but on production likely best to stick to release code. IMHO (and don't try to go back now stay on _18)
Curious - have you rebooted at any point after doing any of these various updates / changes?
-
Appreciate your insights; this is helping me get a handle on how to diagnose oddities. What I ran across is that within the last half hour my box lost connectivity to the internet. I recycled the provider router, and the connection was not restored until I rebooted my box. When I did, CPU utilization shot up to 35+% and has remained there. This is where my curiosity is rooted and why I am looking for ways to get to the bottom of it, as it does not make sense to me. If it's normal based on what I am running or have configured, then that will be one answer, but the inconsistencies are what have my interest piqued. The following is the output from the console "top" command. Based on this output and simple math I can't see where the 35% is being used; my math comes up to 12.61%, and the WCPU numbers seem to support this. Thank you for indulging me on this.
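The "simple math" above (adding up the WCPU column) could be sketched roughly as follows. This is only an illustration: the sample rows are made up, not the actual output from this box, and the parser assumes a header line containing `WCPU` with no spaces inside any column before it. It reproduces the addition only and does not explain where the remaining usage goes.

```python
def sum_wcpu(top_output: str, skip_idle: bool = True) -> float:
    """Add up the WCPU column of top-style output, optionally skipping idle threads."""
    total = 0.0
    wcpu_col = None
    for line in top_output.splitlines():
        fields = line.split()
        if "WCPU" in fields:
            # Header row: remember which column holds WCPU.
            wcpu_col = fields.index("WCPU")
            continue
        if wcpu_col is None or len(fields) <= wcpu_col:
            continue
        value = fields[wcpu_col]
        if not value.endswith("%"):
            continue
        command = " ".join(fields[wcpu_col + 1:])
        if skip_idle and "idle" in command:
            continue
        total += float(value.rstrip("%"))
    return total


# Hypothetical rows for illustration only (not taken from the thread).
SAMPLE = """\
  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU COMMAND
   11 root     187 ki31    0B   64K CPU0   0  10:00 95.00% [idle{idle: cpu0}]
  100 root      20    0   50M   30M select 1   0:10  8.50% php-fpm
  200 root      20    0   20M   10M bpf    0   0:05  4.00% example-daemon
"""

print(f"non-idle WCPU total: {sum_wcpu(SAMPLE):.2f}%")
```

Pasting the real `top` output into `SAMPLE` would give the same kind of total as the hand calculation.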
-
@jrey I did do a reboot at some point; I can't say specifically at which point in my journey that was done.
I appreciate your insights into devel and prod versions of PFB. I like to tinker, but I'm not sure whether how I use PFB would be in the realm of tinkering or normal use. Since I have been using devel for as long as I have used PFB, I can't tell the difference. I would probably have to see devel and prod side by side to see the differences. If I were to decide to go with the prod version, is it as simple as just doing the install from the package manager, or would I have to remove devel and then do the install? Also, would I have to rebuild my PFB config after the change, or could I restore a previous backup to get back to that point?
Thanks for the link, I will look it over and see if a light bulb goes off.
-
Hmm, after 25mins I would have expected any boot processes to have finished.
Does that mystery usage eventually subside?
-
@LPD7 said in Puzzling CPU Usage:
I would probably have to see devel and prod side by side to see the differences.
For a long time they were the same, but now they are different again.
by tinkering I wasn't implying that you are coding, but rather tinkering is being done in dev and sometimes it is just better to stay on the production release, until that tinkering by devs stabilizes. (but don't go back now)
What I ran across is that within the last half hour my box lost connectivity to the internet. I recycled the provider router and connection was not restored until I rebooted my box.
can you clarify this - I don't take that to mean that the boot process took 25 or more minutes - did it?
pfblocker has nothing to do with connectivity to the internet if the connection is lost and didn't come back up - until you rebooted that is likely something else --- pfb is not responsible for maintaining the connection. (you aren't blocking your ISP are you?)
what kind of connection?
are all the alias pfb lists that you are building actually used in a rule?
in particular pfb_NAmerica_v4 is that in an Allow or Deny Rule?
-
@stephenw10 Hi Stephen, yes, I have noticed that after the system has been up and running for a while (I haven't actually timed it, but say an hour or so) the system CPU usage goes down. As of this message my CPU usage is at 5-7% with an occasional spike which doesn't seem to last very long. Now I am guessing that you saw something related to bootup that was strange, did you get that from the fractional WCPU percentages .88, .59 and so on or did you see that somewhere else? Anything to worry about or that I can correct?
-
@jrey Hi Jrey, no, the system took about 2 mins or so to boot up, which seemed like the normal time frame. I didn't think PFB caused the connection issue; I just noticed that after it rebooted the CPU usage was again high, and I expected it to be lower since everything would have been flushed and the system started from scratch. The fact that I see lower CPU % numbers some time after reboot, and that spikes are limited in duration rather than sustained, says something to me; what that is exactly is what I am trying to figure out. Stephen noted something about the boot processes; he may have seen something in the screenshot that was out of whack, but I don't know yet. As long as the CPU % goes down and spikes are short in duration I think I can live with that, as long as the consensus is that there is no underlying issue. Again, this all comes from my knowing how the box behaved prior to my cleaning up feeds and lists and the pfb 3.2.0_7 bug that I had to correct, and now learning what the new normal should look like. Appreciate your feedback.
-
@LPD7 said in Puzzling CPU Usage:
Now I am guessing that you saw something related to bootup that was strange, did you get that from the fractional WCPU percentages .88, .59 and so on or did you see that somewhere else?
Nope, nothing jumps out there other than the fact there seems to be usage that isn't accounted for.
I just know that at boot a bunch of things run that can take some time to complete. But I would normally have expected that to complete well within 25mins.
Is there anything logged when the usage goes back down to base levels?
-
@stephenw10 When you say logged do you mean the output using the Top -HaSP command?
I just ran the command and included the output below, cpu % at this time was 6%.
-
@stephenw10 Can you tell me why when I run commands in the SSH shell why I get permission denied? I am playing around to get familiar/comfortable with using the shell to troubleshoot and tried the pftop command, which is supposed to run from the shell, and I get the permission denied response. I do recall reading somewhere that anyone other than "admin" can only run a subset of commands and the rest are reserved for the admin.
Update: I re-enabled the admin account and am able to get to the enhanced shell. I am reluctant to keep the admin account active, and I think doing so contradicts suggestions regarding securing PFS. But since only LAN devices can access the box via SSH, I am less concerned for now. I think it would be a good idea for users who are configured as admins to have functionality somewhere in the middle, so there is a balance between security and functionality.
-
@LPD7 said in Puzzling CPU Usage:
When you say logged do you mean the output using the Top -HaSP command?
No I mean in the system log. If something finishes or exits it may log something there.
@LPD7 said in Puzzling CPU Usage:
Can you tell me why when I run commands in the SSH shell why I get permission denied?
Yup it's probably because you weren't logged in as root/admin.
-
@stephenw10 How far back do you want to see for the log? I have about 15 pages and a lot of that is "exiting on signal" and "now monitoring attacks" messages. Also, would that be from the System>General tab?
Update: I have included the pages from the System>General tab that had details other than "exiting on signal" and "now monitoring attacks" messages, as there aren't as many. I sometimes see the PID message 16962 about a memory issue but haven't dug into this yet. Hope this helps. PS: when you see "attack" messages, these are from my failed login attempts.
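For anyone wanting to do the same trimming on an exported copy of the log, here is a minimal sketch that drops the two repetitive messages and keeps everything else. The sample lines are invented placeholders (the daemon names are hypothetical), not real entries from this system; only the quoted message phrases come from this thread.

```python
from typing import List, Sequence

# The two repetitive messages mentioned above.
REPETITIVE_PHRASES = ("exiting on signal", "now monitoring attacks")


def interesting_lines(log_text: str,
                      ignore: Sequence[str] = REPETITIVE_PHRASES) -> List[str]:
    """Return log lines that do not contain any of the ignored phrases."""
    kept = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        lowered = line.lower()
        if any(phrase in lowered for phrase in ignore):
            continue
        kept.append(line)
    return kept


# Invented placeholder lines, for illustration only.
SAMPLE_LOG = """\
daemon-a: exiting on signal 15
daemon-b: now monitoring attacks
daemon-c: cannot allocate memory
"""

for line in interesting_lines(SAMPLE_LOG):
    print(line)
```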
-
@LPD7 said in Puzzling CPU Usage:
"exiting on signal" and "now monitoring attacks"
these are normal
sometimes see the PID message 16962 about a memory issue
this one and it says "cannot allocate memory" --- look in pfblockerng.log and you might find the table is overflowing
look for lines near the end of the last list update, something like this - what do you see?
pfSense Table Stats
-------------------
table-entries hard limit 600000
Table Usage Count 142826
notice the error references the same list I asked about here..
@jrey said in Puzzling CPU Usage:
are all the alias pfb lists that you are building actually used in a rule ?
in particular pfb_NAmerica_v4 is that in an Allow or Deny Rule?
-
@jrey Thanks Jrey, I noticed in the log that the only message is " ASN Token not defined. Terminating Download. " and nothing more. I see a number of feeds that show the ASN token not defined message, and I don't recall these feeds requiring one. I am going to look at them and see if I can suss out the problem. I reached the max uploads, so I have posted what I can from the logs. I need to look into your alias question. I know the feeds are capturing packets, as indicated in the dashboard, but how they are configured is not something I looked at; I just inferred that once a feed was set up the filtering would just happen. I will upload what I can find, as it is not something I looked at before.
Log entries:
Update: As per your question regarding lists/rules for the feeds yes based on what I can take from the below SS they are being blocked, rejected, and matched, does this answer your question?
-
@jrey said in Puzzling CPU Usage:
something like this - what do you see?
pfSense Table Stats
table-entries hard limit 600000
Table Usage Count 142826
notice the error references the same list I asked about here..
Some interesting things in the log parts you have provided, but I don't think you went down far enough (or at least I'm not seeing this section in what you provided)
it should be very close to (if not just before the logging of UPDATE PROCESS ENDED.)
What exactly is your expectation for the NAmerica rule at the top of the rules list?
-
@jrey I found that section in the logs and included below
My goal for any of the lists is to reduce/remove ads (including YouTube), limit access to questionable URLs, IPs, domains, etc., spam, and so on. The NAmerica v4 you asked about is actually very active:
I think it is associated with the IP>GeoIP Summary list, which I think is set up as part of the install. I know I selected the ip4 country list some time back.
I was seeing a slowdown in browser loads on various devices in the network and thought maybe I had overdone it with the feeds, so I disabled PFB and DNSBL, reloaded (cron), and tested the browsers. There was no change, so I feel that so far PFB is not slowing things down.
Does the order of the feeds within the rules list make a difference in performance or anything else?
Let me know what you are seeing from the SS and what you think could be done to achieve the goals with lower system utilization and/or greater performance. Thank you.
-
So generally the rule of thumb is you want the top number (900000 in your case) to be 2 times the size of the bottom number. You obviously know where to change that because you have.
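As a rough way to check that rule of thumb against the log, something like the sketch below would do. It assumes the two lines appear in the log text in the form quoted earlier in this thread ("table-entries hard limit N" and "Table Usage Count N"); the real log layout may differ, and the function name is just made up for this example.

```python
import re
from typing import Optional, Tuple


def table_headroom(log_text: str) -> Optional[Tuple[int, int, bool]]:
    """Return (hard_limit, usage_count, limit_is_at_least_2x_usage), or None if not found."""
    limit = re.search(r"table-entries hard limit\s+(\d+)", log_text)
    usage = re.search(r"Table Usage Count\s+(\d+)", log_text)
    if not (limit and usage):
        return None
    limit_n = int(limit.group(1))
    usage_n = int(usage.group(1))
    # Rule of thumb from this thread: hard limit should be about 2x the usage count.
    return limit_n, usage_n, limit_n >= 2 * usage_n


# Numbers from the example quoted earlier in the thread.
SAMPLE = """\
pfSense Table Stats
-------------------
table-entries hard limit 600000
Table Usage Count 142826
"""

print(table_headroom(SAMPLE))
```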
However that top rule that you have is a match rule, right... here is the summary from your log.
"The match action is unique to floating rules. A rule with the match action will not pass or block a packet, but only match it for purposes of assigning traffic to queues or limiters for traffic shaping. Match rules do not work with Quick enabled."
but you have no queue "queue = none" and likely don't need one in most cases.
you will see traffic logged because traffic will "match", but honestly likely not doing what you think.
Based on your stated goal having the entire NAmerica list of IPs especially on a match rule is not serving the purpose you might be thinking. So it is taking resources and providing no value to your end goal.
The list of NAmerica is sourced from MaxMind, I don't use MM for GeoIP, but you should be able to "unselect" NAmerica and it will actually make no difference to traffic, but a difference in resource usage.
The strategy for blocking "ads" then is usually DNSBL and then perhaps creating a list of specific block rules for ones that slip through. But DNSBL is only part of that.
Ads that appear within YouTube content are a very different thing, because often the ads come from the same servers as the content they are embedded within. So it is harder to block one without blocking both.
The rules are one thing to control the traffic source and destination. Blocking ads can be part of that, but generally other techniques are better at doing that.
-
How long after boot are those logs from?
You can see pfBlocker updating, which can use a lot of CPU, but that would be shown in the top output.
-
@stephenw10 said in Puzzling CPU Usage:
How long after boot are those logs from?
Yes, and
while true - the log file appears to be from the scheduled "midnight" run - so that particular log does not appear to be related specifically to a reboot.
Once we sort out the rules and list sizes, we'll look at the "ASN Token not defined, Terminating download" messages that are scattered throughout the file..
We still need to fix these as well
All three have been discussed in other topics. One won't work because the download now requires an accept button press every time a download happens, one is in fact an empty list, and one needs a small code change to the _18 -devel package.
Currently they are not causing operational issues and so are minor issues.
Yes, pfblocker will cause CPU usage when updating. One way we can tie the spike together would be to see both the start time (which we have) and the update complete time (which I do not see in what has been provided), and also the update schedule. When pfblocker is updating and logging all the stuff we can see (errors and all), if that process takes say 30 minutes to run (we don't know), the processor would appear to be spiked during that time. If the schedule is "default" and, with the error, it is running hourly and attempting to do everything again because it has not been successful, then every hour you get 50% of the hour in a spiked position.
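The arithmetic behind that 50% figure is just run time divided by schedule interval. A tiny sketch, using the guessed 30-minute run time from above (the actual duration is not known from the logs provided):

```python
def spiked_fraction(run_minutes: float, interval_minutes: float) -> float:
    """Fraction of each schedule interval spent in an update run (capped at 1.0)."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return min(run_minutes / interval_minutes, 1.0)


# Guessed 30-minute run on an hourly schedule, as in the example above.
print(f"{spiked_fraction(30, 60):.0%} of each hour")
```

Once the real start and complete times are known from the log, plugging them in would show whether the spikes line up with the update schedule.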