2.5.1-RELEASE - Dashboard's "Firewall logs" widget kills the CPU and effectively DoS's the GUI
-
Mmm, good point. That's huge however you look at it! Whether or not it's rotating correctly.
I wonder if displaying the widget is triggering the rotation. Or if simply trying to open that file to display the data is an issue.
-
@e-1-1 what all are you logging? Is there a lot of spam in it? Are you logging all pass traffic with lots and lots of users?
What I mean by spam - like say multicast or broadcast traffic on your local side because it doesn't match up to your lan net filters? Or noise from the internet like loads and loads of udp p2p traffic?
278MBytes just seems like a lot of logging - that maybe could be trimmed down? If its filled with a lot of unwanted/needed sorts of traffic.
my logs might be smaller than normal since I don't log a lot of internet noise that I just don't care about. I only log common udp ports, and I only log syn hits on wan.
And I don't log any sort of noise from my lan side. Only stuff that is explicitly blocked, and not passed traffic going out.
[21.05-RELEASE][admin@sg4860.local.lan]/var/log: ls -ahl filter* -rw------- 1 root wheel 264K Jun 6 07:27 filter.log -rw------- 1 root wheel 500K Jun 6 02:14 filter.log.0 -rw------- 1 root wheel 500K Jun 5 15:25 filter.log.1 -rw------- 1 root wheel 499K Jun 5 07:25 filter.log.2 [21.05-RELEASE][admin@sg4860.local.lan]/var/log:
But 278MB for less than a day just seems like a lot.. Either you are pushing a hell of lot of traffic or maybe logging a lot of noise that may not need to be logged.
-
@johnpoz Thanks, with "Log Compression" set to "none", the system logs firewall tab opens much faster, CPU isn't taxed as much:
21653 root 25 0 104M 22516 S 0.0 1.1 1:19.01 | `- php-fpm: pool nginx 26182 root 26 0 11456 2336 S 0.0 0.1 0:00.00 | | `- sh -c /bin/cat '/var/log/filter.log' | /usr/bin/tail -r -n 350 26330 root 26 0 10628 1664 S **7.4** 0.1 0:01.74 | | `- /bin/cat /var/log/filter.log 26532 root 74 0 10876 1844 R **9.1** 0.1 0:02.21 | | `- /usr/bin/tail -r -n 350
Not sure if this is the way forward, without compression though.
-
@stephenw10 Agree. Wonder if there is a button in the web GUI labeled "Rotate logs for {function X} now" that I can use.
Good thing that changing the log compression setting erased the older logs.
Well, good for me now, otherwise I'd be not happy for these deletions. Guess it would be too time consuming to program a routine that decompresses and recompresses the old logs in the new desired state, if at all (with setting "none").Can test 2.5.2 on the slowest box, as it is the easiest to restore in case of bork. But didn't see anything seemingly related to this on the public Redmine issues for 2.5.2. Missing something?
-
@johnpoz A box protecting a heavily attacked, publicly NAT-ed daemon, let's say. Details in the other reply.
-
@stephenw10 My guess is that the code reading the logs is tightly coupled with the PHP web render thread, effectively crippling the GUI. If it weren't, I'd have no problem with logs loading slowly, but now in the best case it's a looong delay, in the worst case it's a HTTP 504 displayed.
-
@johnpoz "Log firewall default blocks" is enabled. Wonder if it's there enabled by default, the UI doesn't tell. <-- point to improve the interface perhaps.
Only a few reject and deny rules tied to pfBlocker's Feodo abuse list are logged, however those caught in total less than 10 MB since last box reboot nine hours ago.
No pass rules traffic is logged. It would cripple the storage, terabytes a day pass through pf.I guess your second definition of spam applies here, assisted by the snort2c table populated by Suricata.
Both the slowest and fastest box push 50 to 300 Mbps constantly, so that'd be the culprit for the logs' size.
However, my main concern is that coupling between log reading and webpage generation or loading, in both the dashboard widget and the system logs -> firewall page.
-
@e-1-1 said in 2.5.1-RELEASE - Dashboard's "Firewall logs" widget kills the CPU and effectively DoS's the GUI:
Wonder if it's there enabled by default
Yeah logging default block is on by default.
-
@e-1-1 :
Yeah logging default block is on by default.
Being DOSsed means (to me) : instead of using the firewall widget, have the logs send to a 'remote' log server which is more capable of handling 'huge' quantities of log lines.
The firewall widget is nice, but as soon as things heat up, it will slow down the access to the default login page of pfsense = the dashboard. Not good ...
The widget is nice, for the days everything is calm and nothing is happening.And of course, uncheck the default block log :
as it will only learn you that IPv4 goes from 1.1.1.1 to 254.254.254.254 ;)
-
https://docs.netgate.com/pfsense/en/latest/monitoring/logs/settings.html#log-rotation-settings
Read the notes, especially on the compression settings.
-
@jimp Thanks for linking that.
Looks like "none" is indeed the way forward to improve page load performance, current compress ratio is beautiful:
[2.5.1-RELEASE][root@hostname]/var/log: zfs get compression,compressratio zroot/var NAME PROPERTY VALUE SOURCE zroot/var compression lz4 inherited from zroot zroot/var compressratio 8.32x -
Due to job "conditioning", let's say, that pushes me to get rid of root causes and find error avoidance methods, I gotta get the following paragraphs out of my mind and here, "on paper":
I'd hereby push for a GUI hint towards this "hey, don't choose any compression, you are already using a ZFS dataset with compression enabled and it's x.y% efficient, no need for further CPU cycles to be wasted."
As well, I'd push for this "Log Compression" setting to be disabled by default when an install is done on a ZFS filesystem/volume manager, if it's not already.
And for a "best practice" guide (one can dream, something like Cisco's wireless best practice page in their wireless controllers' web interface) directly in the GUI, after an upgrade a message could pop-up saying "hey we noticed you use ZFS, but have log compression enabled. It's not ideal. Wanna change to none? warning - all previously logs will be deleted. Yes/No".
To conclude, even with "none" set, neither the dashboard, nor the firewall log section load as fast as I'd like.
Yes, @Gertjan, log shipping to a SIEM is nice, however due to resource constraints not every box/cluster that I manage benefits from this. Therefore the need to at least be able to view the system logs -> firewall tab, unfortunately with blocked entries for statistics. Now that I got to this, another thing comes to mind - if the dashboard widget and system logs are set to retrieve the last, say, 250 log lines, why is the webpage generating code trying to read all the logs, including the older ones? If it starts to read the current filter.log and it has reached the line limit, it should stop there and then imo. Ah, another one for the PHP logic.From the first post, my wish was a decoupling between the page load code and the log filter load code. Hope this is possible to do, sometime in a later release.
-
@e-1-1 said in 2.5.1-RELEASE - Dashboard's "Firewall logs" widget kills the CPU and effectively DoS's the GUI:
zfs get compression,compressratio zroot/var
Your getting a better ratio than me.. I would assume that is because your logs are bigger? Lots more info that can be compressed for better overall ratio.
[21.05-RELEASE][admin@sg4860.local.lan]/: zfs get compression,compressratio zroot/var NAME PROPERTY VALUE SOURCE zroot/var compression lz4 inherited from zroot zroot/var compressratio 3.45x - [21.05-RELEASE][admin@sg4860.local.lan]/:
Or maybe you have other logs that compress better? I am not running any sort of ips, so there are none of those logs, etc.
-
@johnpoz I guess that's the case, filter.log reached a bigger size today.
/var/log: ls -alh filter* -rw------- 1 root wheel 378M Jun 8 01:41 filter.log
Not sure why it's not rotating though. Meh.. maybe after next reboot it sorts itself out.
Edit - or maybe I know?!
From one of my previous posts:
"Log Rotation Size (Bytes)" is default at 512000.
(can't find quote format button)
Actually, that field has no value, value 512000 is faintly visible there as a grey, washed out text. Do you think setting something there will trigger a log rotation sometime in the future via a built-in cron job or something similar?
-
The 'cron' package shows :
so every minute the command /usr/sbin/newsyslog is run.
It reads its default config file /etc/newsyslog.conf in which its told to read /var/etc/newsyslog.conf.d/pfSense.conf
In that file you can find :/var/log/userlog root:wheel 600 3 1024 * B /var/log/utx.log root:wheel 644 3 1024 * B /var/log/system.log root:wheel 600 7 1024 * JC /var/log/filter.log root:wheel 600 7 1024 * JC /var/log/dhcpd.log root:wheel 600 7 1024 * JC /var/log/vpn.log root:wheel 600 7 1024 * JC /var/log/poes.log root:wheel 600 7 1024 * JC /var/log/l2tps.log root:wheel 600 7 1024 * JC /var/log/openvpn.log root:wheel 600 7 1024 * JC /var/log/portalauth.log root:wheel 600 7 1024 * JC /var/log/ipsec.log root:wheel 600 7 1024 * JC /var/log/ppp.log root:wheel 600 7 1024 * JC /var/log/wireless.log root:wheel 600 7 1024 * JC /var/log/nginx.log root:wheel 600 7 1024 * JC /var/log/ntpd.log root:wheel 600 7 1024 * JC /var/log/gateways.log root:wheel 600 7 1024 * JC /var/log/resolver.log root:wheel 600 7 1024 * JC /var/log/routing.log root:wheel 600 7 1024 * JC /var/log/auth.log root:wheel 600 7 1024 * JC
This corresponds with :
where 1024 x 1024 (K) = 1 048 576, so each file will be approximatively 1 mega byte on my pfSense.
The default is probably 512 Kbytes, as the help text states. -
We had already discussed internally about what to do by default for ZFS cases on fresh installs. Initially we didn't opt to change anything because ZFS is smart enough to not compress already compressed data (it tries first, but if it doesn't save enough, it skips it) but in practice it appears to be slower anyhow, so it's worth disabling when we know it's not ideal. I made https://redmine.pfsense.org/issues/12011 to look into that.
I also made https://redmine.pfsense.org/issues/12012 for improving the GUI help text.
As for how the logs are parsed, I know it's inefficient but it's not an easily solvable optimization. This is done for some logs like the filter log to ensure that even if the user wants 50 lines, that they actually get 50 parsed lines and not an empty log if there were 50 suppressed entries or entries which could not be parsed. This is also necessary for searching because the user wants N results, not to search for items within the N lines returned. I'm not sure if going file by file or chunks of files is going to end up being better given all the extra overhead involved in fetching parts and iterating. I made https://redmine.pfsense.org/issues/12013 for that but it's not likely to get any traction in the near future given the nature of the problem.
Most users who need that kind of efficiency and large log sizes are better served by pushing logs to a more appropriate central log server. I know a lot of people rely on the firewall for that, but they really shouldn't. It isn't reliable for log retention, a log server must be used, especially in cases where legalities/regulations are involved.
-
@gertjan Aha! Roger that.
So it looks like an empty general "Log Rotation Size (Bytes)" actually leads to a whooping
var/log: less ../etc/newsyslog.conf.d/pfSense.conf | egrep filter /var/log/filter.log root:wheel 600 7 2097151.3671875 * C
How it ended up like that, might've found out below
.
I set the general parameter in the GUI to 512000, now pfSense.conf has still (?!) that "2097151.3671875" value for /var/log/filter.log. Set it to 1048576 like in your example, still "2097151.3671875".
Digging deeper, in the firewall log's own "Log Rotation Size (Bytes)" parameter, found value "2147483000". Culprit found, value deleted, "1048576" is now shown as a fading grey text.
Saved, waited a couple of minutes, pfSense.conf still showing "2097151.3671875". Manually set to 512000, saved => pfSense.conf shows "500". OK so far.Then cleared "Log Rotation Size (Bytes)", leaving the washed out default value. Save=> pfSense.conf still shows "500". It gets this from the general setting, right? Then I guess "1048576" should not be shown at all greyed out in the firewall tab, but a hint be present that if it's not set, it gets the general default or user-defined setting.
-
@jimp Appreciate it.
As for the "quick mode" log parsing, if I may add a point - the "read only X parsable lines" for me is applicable only in two cases:
- the dashboard firewall widget that by default gets the last few logs, sorted by time from the newest. Zero need afaik to parse more than those
- the default (freshly opened, no filters) firewall log in both normal and dynamic view.
For the rest, parse away :|.
On the efficiency, log volume and "central log server" topic I agree. It's a waste of time to use anything else than dedicated distributed log aggregation, indexing and searching clusters, for example. However, in regulated and air gapped environments, I can't use pfSense, no matter how much I would like to. It's simply not "enterprise-y" enough. Left some feedback last year on this subject in the web form, I hope it sparked some discussions internally. Won't detail here, it's simply too many features to list and it's off-topic.
-
The firewall log tosses out log entries which are very spammy and useless in some cases (like some IPv6 multicast entries which get dropped by default rules and are practically worthless), so it has to fetch significantly more than we really want to ensure there is enough data.
Guess too low and we get a ton of user complaints about "why is it only showing me 20 entries when I set it to 50" and so on.
-
@jimp thanks for all your hard work. Have you thought about Wireguard concept (less is more). This concept could be a game-changer for pfSense dev team.
-
@akegec said in 2.5.1-RELEASE - Dashboard's "Firewall logs" widget kills the CPU and effectively DoS's the GUI:
Have you thought about Wireguard concept
That's an understatement.
I guess jimp has thought a lot about Wireguard recently.