https://oisd.nl
-
@cukal The code to execute the pre/post scripts for "Advanced Tuneables" is in /usr/local/pkg/pfblockerng/pfblockerng.inc and, AFAICT, exists ONLY for IPv4/IPv6 lists. The expected "exec" call for the pre/post DNSBL scripts is absent. By intention or mistake, IDK. I pondered and puzzled over the same thing. The wording on the web page suggests this feature is intended to be operational.
@cukal said in https://oisd.nl:
Is there a way to enable this functionality?
With a bit of coding, yes. As things stand now, AFAICT, no. Having a pre-script for DNSBL would be helpful to clean up input formats that are currently foreign to pfblockerng.
The "exec" call for the pre-script in pfblockerng.inc is in this block of code:
// IPv4/6 Advanced Tunable - (Pre Script processing)
if ($pfb_script_pre && file_exists("{$pfb_script_pre}")) {
    pfb_logger("\nExecuting pre-script: {$list['script_pre']}\n", 1);
    $file_dwn_esc = escapeshellarg("{$file_dwn}.orig");
    exec("{$pfb_script_pre} {$file_dwn_esc} {$list['vtype']} {$elog}");
}
And the "exec" call for the post-script in pfblockerng.inc is in this block of code:
// IP v4/6 Advanced Tunable - (Post Script processing)
if ($pfb_script_post && file_exists("{$pfb_script_post}")) {
    pfb_logger("\nExecuting post-script: {$list['script_pre']}\n", 1);
    $file_org_esc = escapeshellarg("{$file_dwn}.orig");
    exec("{$pfb_script_post} {$file_org_esc} {$list['vtype']} {$elog}");
}
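For illustration only, a DNSBL pre-script could look something like the Python sketch below. It assumes (this is NOT confirmed for DNSBL, where the exec call is absent) that the script would receive the path of the downloaded .orig file as its first argument, the way the IPv4/IPv6 call above does. The "domain;category" input format and the names clean_line and rewrite_feed are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical DNSBL pre-script: rewrite a feed in place as one domain per line."""
import sys


def clean_line(line):
    """Return the bare domain from a 'domain;category' line, or None to drop it."""
    line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
    if not line:
        return None
    domain = line.split(";", 1)[0].strip().lower()
    return domain or None


def rewrite_feed(path):
    """Rewrite the file at `path` in place, keeping only the cleaned domains."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        cleaned = [d for d in map(clean_line, fh) if d]
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(cleaned) + ("\n" if cleaned else ""))
    return len(cleaned)


if __name__ == "__main__" and len(sys.argv) > 1:
    # argv[1]: path to the downloaded file (assumed to mirror the IPv4/IPv6 call).
    print(f"kept {rewrite_feed(sys.argv[1])} domains")
```

The remaining arguments in the exec call (the list type and the log redirection) are ignored here to keep the sketch small.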
-
@andrebrait said in https://oisd.nl:
Only for traditional hosts-style lists
I realized after reading your reply that I had asked and you answered this question already.
I can see its use in those cases.
@andrebrait said in https://oisd.nl:
But it's done this way right now because 1. adding a custom Python script or some custom native binary (compiled or otherwise) would add more complexity to the package and 2. without zero-downtime reloads, de-duplicating in the script initialization would prolong the downtime.
Makes sense.
@andrebrait said in https://oisd.nl:
Since zero-downtime reloads are essential for my use-case anyway, I'd already have to implement it, so I'm killing two birds with one stone. And it's arguably easy to do anyway (I can probably do it the next time I have more than a couple free hours, next week or so).
I'll give it a whirl when it is ready.
-
@totowentsouth I integrated your fix.
Could you pull the latest pfblockerng-adblock branch (or pfblockerng-next) and check that it's been fixed?
-
@andrebrait Patch looks good and no lines in py_error.log while running pfblockerng-next at 69f3d0455363411179a763a3c39e03b0b027b4a0.
Thanks!
-
@totowentsouth New code in the branch https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-adblock
If you're willing to test it, it would be more than appreciated!
-
@andrebrait Yep. I have installed this on my secondary pfSense box. So far, 1 wrinkle and zero problems to report. I will exercise it over the coming days.
FWIW, I created a patch from 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..4683d6825a55667677803bda8444d14eb30ddf71
I removed hunk #38 for pfblockerng.inc from this patch due to a conflict. AFAICT, the change is already in 3.2.0_7? Afterwards, the patch applied cleanly, so all is well.
Hunk #38 of net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc:
@@ -10732,7 +10962,7 @@ function pfblockerng_php_pre_deinstall_command() {
 	if (config_path_enabled('system','earlyshellcmd')) {
 		$a_earlyshellcmd = config_get_path('system/earlyshellcmd', '');
 		if (preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd)) {
-			config_set_path('system','earlyshellcmd',
+			config_set_path('system/earlyshellcmd',
 			    preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd, PREG_GREP_INVERT));
 		}
 	}
The wrinkle I've encountered is with the switch to use fontawesome 6 (a999ce5a96e22ab54317e7079b1871e1661f7218). I am wrestling with fonts on my machine and have some improvements. Will pfSense deliver these fonts if the host/browser does not have them installed?
-
@totowentsouth said in https://oisd.nl:
@andrebrait Yep. I have installed this on my secondary pfSense box. So far, 1 wrinkle and zero problems to report. I will exercise it over the coming days.
FWIW, I created a patch from 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..4683d6825a55667677803bda8444d14eb30ddf71
I removed hunk #38 for pfblockerng.inc from this patch due to a conflict. AFAICT, the change is already in 3.2.0_7? Afterwards, the patch applied cleanly, so all is well.
Hunk #38 of net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc:
@@ -10732,7 +10962,7 @@ function pfblockerng_php_pre_deinstall_command() {
 	if (config_path_enabled('system','earlyshellcmd')) {
 		$a_earlyshellcmd = config_get_path('system/earlyshellcmd', '');
 		if (preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd)) {
-			config_set_path('system','earlyshellcmd',
+			config_set_path('system/earlyshellcmd',
 			    preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd, PREG_GREP_INVERT));
 		}
 	}
Zero idea about it. Perhaps a recent change from the upstream devel branch that merged cleanly for me but not for you for some reason?
The wrinkle I've encountered is with the switch to use fontawesome 6 (a999ce5a96e22ab54317e7079b1871e1661f7218). I am wrestling with fonts on my machine and have some improvements. Will pfSense deliver these fonts if the host/browser does not have them installed?
Yes, same for me. I have no idea what their plans are.
I'll update the branch later and create a version that's based off of the upstream main branch to see if I observe any differences regarding that.
-
@andrebrait I noticed pfblockerng is blocking many innocuous domains after adding the EasyPrivacy list. Examples of domains I would expect to resolve include starbucks.com, sendcloud.net, substack.com, and substackcdn.com.
https://easylist.to/easylist/easyprivacy.txt
In EasyPrivacy list, the entries for these domains are:
.sendcloud.net/track/
.starbucks.com/a/
.substack.com/o/$image
.substackcdn.com/open?$image
A peek at what the final lists contain for two of the blocked domains:
/var/db/pfblockerng: grep -nR sendcloud dnsbl/
dnsbl/hagezi_pro_plus.txt:43088:,sctrack.sendcloud.net,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/hagezi_pro_plus.txt:49793:,track.sendcloud.org,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/hagezi_pro_plus.txt:51974:,tracking.sendcloud.sc,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/EasyList_Privacy.txt:4012:,sendcloud.net,,0,EasyList_Privacy,DNSBL_EasyList,0
/var/db/pfblockerng: grep -nR substack\.com dnsbl/
dnsbl/OISD_big.txt:480:,0x00000000000.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:66539:,etharticles.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:146726:,publicationgroup.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:178172:,teamproject.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:184591:,tradestrategy.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:188726:,uniproject.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:202727:,web3projects.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:203007:,webpublic.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/StevenBlack_hosts.txt:7301:,email.mg1.substack.com,,2,StevenBlack_hosts,DNSBL_Collections,0
dnsbl/EasyList_Privacy.txt:4337:,substack.com,,0,EasyList_Privacy,DNSBL_EasyList,0
EasyPrivacy seems to be one of the few that lists entries in this manner and with domains that one would expect to resolve.
BTW, this issue appears to exist in the pfblockerng-next branch as well. An entry in EasyList_France that follows the same pattern as those above is in the final output. This box is running pfblockerng-next code:
[23.09.1-RELEASE][admin@pfSense.localdomain]/var/db/pfblockerng: grep -nR 468\.60\.gif dnsbl
dnsbl/C_EasyList_France.txt:228:,468.60.gif,,0,C_EasyList_France,DNSBL_EasyList,0
My pihole loads the same EasyList_France list. Although it shows ".468.60.gif" in the results for "Search Adlists", when I run dig @<pihole.ip> 468.60.gif, pihole reports the query as blocked by an external source, i.e. pfblockerng. After removing EasyList_France from pfblockerng, dig @<pihole.ip> 468.60.gif returns an answer, so it was indeed pfblockerng blocking the lookup, even though pihole listed the entry for some reason.
Let me know if you need more information. I have my main computer behind the latest pfblockerng code and am loading it with lists to give it a thorough workout (at least as far as the adblocking goes).
-
@andrebrait I made this change to my pfblockerng install:
Edited to include exclusion of entries beginning with a hyphen.
diff --git a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
index c332706eba77..eca18a486157 100644
--- a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
+++ b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
@@ -8730,6 +8730,11 @@ function sync_package_pfblockerng($cron='') {
 	if (!$liteparser) {
 		$lite = FALSE;
+		# entries that start with a period are probably ABP style.
+		$beginswith = substr($line, 0, 1);
+		if ($beginswith == '.' || $beginswith == '-') {
+			continue;
+		}
 		if (strpos($line, '.') !== FALSE && ctype_alnum(str_replace('.', '', $line))) {
 			$lite = TRUE;
There are also entries such as _c.gif, _adobe_analytics.js, and _stat.php which, viewed as domains, have a TLD that is unregistered as of today, AFAICT.
-
Inside the next if block, leading and trailing periods are pruned from the line:
// Remove leading/trailing dots
$line = trim(trim($line), '.');
-
@totowentsouth thanks for looking into it.
As far as I can tell, it's an actual bug. It should not parse entries that start with a period. Those entries in EasyList lingo mean other things, not domain names.
The intended behavior is to only parse entries that start with || or @@|| (those are exclusions) and end with ^ (or ^| as sometimes that happens).
The entries you listed should have been skipped and ignored. Those are URL patterns, and DNSBL shouldn't use them.
I'll look into it as soon as possible. Thanks for the detailed information and the code snippets :)
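As a rough illustration of that intended behavior (a Python sketch, not the actual pfBlockerNG code, which is PHP), a line would only be accepted when the whole thing is a domain-only rule. The names RULE and parse_rule are made up for this example.

```python
import re

# Matches "||domain^" or "@@||domain^", optionally followed by "|".
# Anything with a path, a wildcard or $options fails to match and is skipped.
RULE = re.compile(
    r"^(@@)?\|\|([a-z0-9](?:[a-z0-9.-]*[a-z0-9])?)\^\|?$",
    re.IGNORECASE,
)


def parse_rule(line):
    """Return (domain, is_exception) for a domain-only rule, else None."""
    m = RULE.match(line.strip())
    if not m:
        return None
    return m.group(2).lower(), m.group(1) is not None


print(parse_rule("||ads.example.com^"))      # ('ads.example.com', False)
print(parse_rule("@@||good.example.com^|"))  # ('good.example.com', True)
print(parse_rule(".sendcloud.net/track/"))   # None
```

Rules that carry options after the ^ (for example $third-party) are rejected by this sketch as well, which is an assumption on my part and stricter than some parsers.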
-
@totowentsouth said in https://oisd.nl:
Inside the next if block, leading and trailing periods are pruned from the line:
// Remove leading/trailing dots
$line = trim(trim($line), '.');
That still shouldn't let those entries be parsed. They have forward slashes in them, for example, so the fact it still manages to parse them is quite weird. Those should be skipped :/
I'll take a look later this weekend.
EDIT: yup, never mind. Found it:
// If '/' character found, remove characters after '/'
if (strpos($line, '/') !== FALSE) {
    $line = strstr($line, '/', TRUE);
}
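To see how the URL patterns collapse into bare domains, here is a rough Python re-creation of just the two PHP steps quoted in this thread (cut at the first '/', then trim whitespace and dots). The function name legacy_normalize and the order of the two steps are assumptions; the real parser does more than this.

```python
def legacy_normalize(line):
    """Approximate two PHP steps from the thread; ordering is assumed."""
    line = line.strip()
    if "/" in line:
        # strstr($line, '/', TRUE): keep only what precedes the first '/'
        line = line.split("/", 1)[0]
    # trim(trim($line), '.'): drop whitespace, then leading/trailing dots
    return line.strip().strip(".")


for entry in (".sendcloud.net/track/", ".starbucks.com/a/",
              ".substack.com/o/$image", ".substackcdn.com/open?$image"):
    print(entry, "->", legacy_normalize(entry))
# .sendcloud.net/track/ -> sendcloud.net
# .starbucks.com/a/ -> starbucks.com
# .substack.com/o/$image -> substack.com
# .substackcdn.com/open?$image -> substackcdn.com
```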
-
@andrebrait That makes sense. Thank you for the explanation.
I noticed too that the EasyPrivacy list and a few other "easylist" styled lists begin with entries that pfblockerng considers "typical host feed format", i.e., $easylist remains set to its initial value of FALSE. After an entry following the "easylist" style is read, $easylist is set to TRUE. For the remaining lines in the file, execution of the block beginning with if (!$easylist) { is then skipped.
Is the intent to process the list in "normal" mode until discovery of an "easylist" style entry?
Is the intent to support lists containing a mixture of styles? If the entries in EasyPrivacy are shuffled such that a raw domain entry comes after an "easylist" style entry, might that throw a wrench into processing, since $easylist is TRUE after the first "easylist" style entry is found?
-
@totowentsouth well, I'd say that it's unusual for lists to contain both things, so I assume that's why the code works the way it does. But I think it'd be safe to do it on a per-line basis, because EasyList entries for a domain name always start with || or @@|| and end with ^ or ^|.
So if a line matches that, we parse it as EasyList. Otherwise, we don't.
I guess this would likely be safer and likely more correct. And either way, it should ignore those entries, especially given they have a /.
I think the original intent there was to trim // comments at the end, or some lists which contained example.com/ for some reason. Either way, there are better ways to do that. I'm gonna check it out and fix it.
Could you provide a link to the list files? Or do you mean the EasyPrivacy URL that is in the pfBlockerNG feeds tab?
-
@andrebrait Yes, the EasyPrivacy URL https://easylist.to/easylist/easyprivacy.txt in the pfBlockerNG feeds tab is the same. I create groups and provide the URLs in lieu of using the feeds tab.
-
@totowentsouth I split the check to determine whether it's an EasyList and the parsing. Now there's a first pass through the file for checking for the EasyList headers and entries before moving on to the actual parsing (which I also refined).
I checked and the offending entries are not ending up in the file anymore. Let me know if you can reproduce the fix.
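Roughly, the two-pass idea looks like the Python sketch below. This is not the code in the branch (which is PHP); the names is_easylist and blocked_domains are made up, exceptions (@@) are simply skipped here, and treating everything that is not a domain-only rule as ignorable in EasyList mode is an assumption.

```python
import re

HEADER = re.compile(r"^\[Adblock Plus[^\]]*\]", re.IGNORECASE)
DOMAIN_RULE = re.compile(r"^(?:@@)?\|\|[a-z0-9.-]+\^\|?$", re.IGNORECASE)


def is_easylist(lines):
    """First pass: True if a header or any domain-only rule is present."""
    return any(
        HEADER.match(l.strip()) or DOMAIN_RULE.match(l.strip()) for l in lines
    )


def blocked_domains(lines):
    """Second pass: pick the parser once, then apply it to every line."""
    lines = list(lines)
    easylist = is_easylist(lines)
    out = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("!", "#", "[")):
            continue  # blank lines, comments and headers
        if easylist:
            # Keep only "||domain^" rules; URL patterns and exceptions are skipped.
            if DOMAIN_RULE.match(line) and not line.startswith("@@"):
                out.append(line[2:].rstrip("|").rstrip("^").lower())
        else:
            # Hosts-style: "0.0.0.0 host" or a bare host, minus inline comments.
            fields = line.split("#", 1)[0].split()
            if fields:
                out.append(fields[-1].lower())
    return out
```

Because the style is decided before any entry is parsed, the order of entries inside the file no longer matters.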
-
@andrebrait I updated my patch to include 4da5a631ae8d82a109fa7880429eff63c4cfa46f and all is well when using the EasyPrivacy list. Thanks!
-
@totowentsouth I gave it some polishing, cleaned up the commit history and produced the pfblockerng-adblock-clean branch (now on 7c3a4eaef2c714c9d97466ec2430e7e867cfd414).
Could you give it a last go so I have someone else test it?
-
@andrebrait I updated a pfSense box to 7c3a4eaef2c714c9d97466ec2430e7e867cfd414.
I think DNSBL is no longer extracting and storing the IP addresses found in lists...
This particular pfSense install was using pfblockerng-next, i.e. before pfblockerng-adblock. FWIW, I uninstalled pfblockerng and removed orphaned files. Then I installed pfblockerng-devel and applied a patch to install 7c3a4. I have yet to try pfblockerng-adblock.
In particular, DNSBLIP_v4.txt is absent and original/DNSBL_v4.orig has only one entry, 127.1.7.7.
Here is an example of a list that includes domains and IPv4:
https://malware-filter.gitlab.io/malware-filter/phishing-filter.txt
I will do more testing and verification in the next day or so.
Edit & Update: https://malware-filter.gitlab.io/malware-filter/phishing-filter-agh.txt is their adblock style. After switching to this list, the IPs are extracted. All is well now.
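For anyone wanting to check a feed by hand, splitting the IPv4 entries from the domain entries can be sketched in Python as below. This is only a checking aid under my own assumptions, not how pfblockerng does the extraction; split_entries is a made-up name and the input is assumed to be already reduced to one bare entry per line.

```python
import ipaddress


def split_entries(entries):
    """Separate IPv4 addresses from everything else (assumed to be hostnames)."""
    ips, domains = [], []
    for entry in entries:
        entry = entry.strip().lower()
        if not entry:
            continue
        try:
            ips.append(str(ipaddress.IPv4Address(entry)))
        except ipaddress.AddressValueError:
            # Not a valid dotted-quad, so treat it as a domain name.
            domains.append(entry)
    return ips, domains


ips, domains = split_entries(["192.0.2.10", "phish.example", " 198.51.100.7 "])
print(ips)      # ['192.0.2.10', '198.51.100.7']
print(domains)  # ['phish.example']
```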