Squidguard category filtering silently fails with large blacklist - a workaround
-
@dbmandrake Please take a moment to look at my hypothetical solution for the Squidguard lists above, and let me know what you think.
-
@jonathanlee I know this sounds off-topic but it's dead on.
Don't block the world and call it a bug :)
This is a hardware limitation issue. Try adding more memory first and then add many IP aliases. The real question is network design.
By default pfSense blocks everything. How can we add more whitelists?
Do you really want employees accessing everything or is this "Home use only?"
Sorry to be abrupt but this solves your problem. Better hardware and more RAM.
-
@jonathanlee Re: Containers / Jails.
This seems like a massive degree of overkill. What problem do they solve, exactly?
The reason why the ramdisk exists at all is for small devices with limited storage, where the temporary disk space needed to extract the plain ASCII version of the blocklists (around 300MB for the blocklist I'm using) would cause the device to run out of disk space.
I'm just not seeing how a container solves the problem of lack of disk space.
It also speeds up the extraction and importing process since what are essentially temp files don't have to be written out to disk.
However, on a server with a decent-sized SSD there isn't really any advantage to using the ramdisk apart from a slight speed increase. The disadvantage is that it can fail with larger blocklists, and because of inadequate error checking the failure is not detected: the incomplete blocklist is imported into squidguard without complaint, which then silently breaks your filter categories. This is a big problem in an environment like a school, which has a duty of care not to allow pupils to access certain kinds of websites.
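To make the size problem concrete, here is a minimal Python sketch of a pre-flight check that compares the uncompressed size of a blocklist archive with the free space on the extraction target (ramdisk or SSD). This is not how the pfSense package does it; the function names and the `headroom` safety factor are made up for illustration.

```python
import shutil
import tarfile


def uncompressed_size(archive_path):
    """Sum the sizes of all regular files inside a tar archive, in bytes."""
    with tarfile.open(archive_path, "r:*") as tar:
        return sum(m.size for m in tar.getmembers() if m.isfile())


def fits(archive_path, target_dir, headroom=1.2):
    """Return True if target_dir has room for the extracted archive.

    headroom is an arbitrary safety factor (an assumption, not a measured
    value) to leave space for filesystem overhead and the binary database.
    """
    needed = uncompressed_size(archive_path) * headroom
    free = shutil.disk_usage(target_dir).free
    return needed <= free
```

A check like this, run against the ramdisk mount point before extraction, would let the import refuse to start (or fall back to disk) instead of failing halfway through.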
If I do write a patch to add a ramdisk enable/disable preference option I will also write a patch to fix the error checking so that a failure due to exceeding the ramdisk size (when enabled) is reported to the user and the incomplete blocklists do not overwrite the currently active ones.
I would like to do this; it's just a matter of finding the time to work on it, as I'm bogged down with too many things at the moment.
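The error-checking idea described above (report the failure, and never let an incomplete blocklist overwrite the active one) could look roughly like this in Python. It is only a sketch of the general "extract to staging, swap on success" pattern, not the actual squidguard package code, which is PHP. The function name and the `.old` suffix are hypothetical.

```python
import os
import shutil
import tarfile
import tempfile


def update_blocklist(archive_path, live_dir):
    """Extract archive_path to a staging dir and swap it in only on success.

    Returns True if live_dir was replaced, False if the old list was kept.
    """
    live_dir = os.path.abspath(live_dir)
    parent = os.path.dirname(live_dir)
    # Stage next to live_dir so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix="blocklist-staging-", dir=parent)
    try:
        with tarfile.open(archive_path, "r:*") as tar:
            tar.extractall(staging)  # raises on a full disk or a bad archive
    except (OSError, tarfile.TarError, EOFError):
        shutil.rmtree(staging, ignore_errors=True)
        return False  # live_dir is untouched

    old = live_dir + ".old"  # hypothetical name for the previous copy
    shutil.rmtree(old, ignore_errors=True)
    if os.path.exists(live_dir):
        os.rename(live_dir, old)
    os.rename(staging, live_dir)
    shutil.rmtree(old, ignore_errors=True)
    return True
```

The caller can show an error to the user whenever the function returns False, and the filter keeps running on the previous, complete blocklist.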
-
@mikeinnyc Sorry, but your comments are way off the mark - "Better hardware and more ram" doesn't solve anything. You clearly haven't read and/or understood the original post, or grasped the issue with the fixed-size ramdisk which is currently a part of the blocklist import process.
The hardware is absolutely capable of working with a blocklist of the size in question - without even breaking a sweat. Once the limitation of the small, fixed size ramdisk is removed, that is.
-
@dbmandrake I'm thinking that post was some type of spam. I could be wrong.
-
@dbmandrake I actually forgot about the speed of SSDs today. I hoped the hypothetical solution would also help with the downtime when updating blacklists: on my firewall everything goes offline during blacklist updates, and the firewall can't use the full blacklist because of the same issue you described and solved. My system is the MAX, so it has an extra 30 GB SSD on it. Additionally, it could protect blacklist uptime if a bad blacklist update corrupted something, because the filter could fall back to the other container if that ever happened. Kind of like HAProxy just for blacklists, with a primary and a secondary. High availability.
Thanks for looking at that post. I just wanted some input on it with regard to Squidguard, along with more visibility for FreeBSD jails and the possibility of retooling them for something else.
-
@jonathanlee said in Squidguard category filtering silently fails with large blacklist - a workaround:
@dbmandrake I actually forgot about the speeds of SSD drives today, the hypothetical solution I hoped would also help solve the issue with downtime when updating blacklists, on my firewall everything goes offline during blacklist updates, and can't use the full blacklist because of the issue you described.
I've been using the full size blacklist since before I started this thread without issue - with the patch to disable the ramdisk. No issues have cropped up yet, in fact the firewall hasn't been rebooted since before this thread was started. I actually have a second firewall running this patch as well as I've had to temporarily set up a second proxy server for a slightly different use case.
Regarding going offline during the update, I'd have to check but as far as I know Squid doesn't go offline during the extraction of the tar file - which is the longest part of the process.
I think it's only offline for a few seconds at the end of the import process for the same amount of time as if you'd pressed the Apply button in the squidguard config page, which forces squidguard to re-read the on disk version of the blacklist binary database into memory.
But I should run a test to time how long the proxy is out of action. I have mine scheduled to do the blacklist update automatically at 2am anyway so if the proxy is down for a few seconds at 2am nobody cares.
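One rough way to run that timing test is to probe the proxy port once a second during an update and record the longest gap. The Python sketch below assumes Squid is listening on its default port 3128; the address in the usage note is a placeholder. A successful TCP connect only shows that the port is accepting connections, not that filtering is working, so treat the result as an approximation.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Given (timestamp, is_up) pairs in time order, return the longest span
    in seconds between a first failed probe and the next successful one.
    An outage still in progress at the end of the samples is not counted."""
    longest = 0.0
    down_since = None
    for ts, is_up in samples:
        if not is_up and down_since is None:
            down_since = ts
        elif is_up and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None
    return longest


def watch(host, port, duration, interval=1.0):
    """Probe host:port every `interval` seconds for `duration` seconds."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), port_is_open(host, port)))
        time.sleep(interval)
    return samples
```

For example, starting `longest_outage(watch("192.168.1.1", 3128, 600))` just before triggering a blacklist update would report the longest stretch, in seconds, during which the proxy port refused connections over those ten minutes.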
Not sure what you mean when you say "everything" goes offline on your firewall when the blacklist updates - only the proxy (and transparent proxy) will be affected, all other traffic is unaffected.
-
@dbmandrake Everything on my network is pointed at the proxy, plus I run WPAD. What I mean is that when Squidguard updates the blacklist, the proxy starts to update and users have no internet access until it is restored. I am running a Netgate SG2100-MAX, which only has 4 GB of RAM. The update takes a bit longer for me, around 5 minutes, which is long enough to stop a streaming movie. I need to set it to update in the AM too. Again, I am on a 6 meg DSL line.
-
@jonathanlee For automatic scheduled updates, see my post in another thread.
-
@dbmandrake thanks for the information on the auto update.