https://oisd.nl
-
@totowentsouth said in https://oisd.nl:
@andrebrait Yep. I have installed this on my secondary pfSense box. So far, 1 wrinkle and zero problems to report. I will exercise it over the coming days.
FWIW, I created a patch from 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..4683d6825a55667677803bda8444d14eb30ddf71
I removed hunk #38 for pfblockerng.inc from this patch due to a conflict. AFAICT, the change is already in 3.2.0_7 ?? -- afterwards, the patch applied clean so all is well.
Hunk #38 of net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc:
@@ -10732,7 +10962,7 @@ function pfblockerng_php_pre_deinstall_command() {
     if (config_path_enabled('system','earlyshellcmd')) {
         $a_earlyshellcmd = config_get_path('system/earlyshellcmd', '');
         if (preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd)) {
-            config_set_path('system','earlyshellcmd',
+            config_set_path('system/earlyshellcmd',
                 preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd, PREG_GREP_INVERT));
         }
     }
Zero idea about it. Perhaps a recent change from the upstream devel branch that merged cleanly for me but not for you for some reason?
The wrinkle I've encountered is with the switch to use fontawesome 6 (a999ce5a96e22ab54317e7079b1871e1661f7218). I am wrestling with fonts on my machine and have some improvements. Will pfSense deliver these fonts if the host/browser does not have them installed?
Yes, same for me. I have no idea what their plans are.
I'll update the branch later and create a version that's based off of the upstream main branch to see if I observe any differences regarding that.
-
@andrebrait I noticed pfblockerng is blocking many innocuous domains after adding the EasyPrivacy list. Examples of domains I would expect to resolve include starbucks.com, sendcloud.net, substack.com, and substackcdn.com.
https://easylist.to/easylist/easyprivacy.txt
In EasyPrivacy list, the entries for these domains are:
.sendcloud.net/track/
.starbucks.com/a/
.substack.com/o/$image
.substackcdn.com/open?$image
A peek at what the final lists contain for two of the blocked domains:
/var/db/pfblockerng: grep -nR sendcloud dnsbl/
dnsbl/hagezi_pro_plus.txt:43088:,sctrack.sendcloud.net,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/hagezi_pro_plus.txt:49793:,track.sendcloud.org,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/hagezi_pro_plus.txt:51974:,tracking.sendcloud.sc,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/EasyList_Privacy.txt:4012:,sendcloud.net,,0,EasyList_Privacy,DNSBL_EasyList,0
/var/db/pfblockerng: grep -nR substack\.com dnsbl/
dnsbl/OISD_big.txt:480:,0x00000000000.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:66539:,etharticles.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:146726:,publicationgroup.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:178172:,teamproject.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:184591:,tradestrategy.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:188726:,uniproject.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:202727:,web3projects.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:203007:,webpublic.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/StevenBlack_hosts.txt:7301:,email.mg1.substack.com,,2,StevenBlack_hosts,DNSBL_Collections,0
dnsbl/EasyList_Privacy.txt:4337:,substack.com,,0,EasyList_Privacy,DNSBL_EasyList,0
EasyPrivacy seems to be one of the few that lists entries in this manner and with domains that one would expect to resolve.
BTW, this issue appears to exist in the pfblockerng-next branch as well. An entry in EasyList_France that follows the same pattern as those above ends up in the final output. This box is running pfblockerng-next code:
[23.09.1-RELEASE][admin@pfSense.localdomain]/var/db/pfblockerng: grep -nR 468\.60\.gif dnsbl
dnsbl/C_EasyList_France.txt:228:,468.60.gif,,0,C_EasyList_France,DNSBL_EasyList,0
My pihole loads the same EasyList_France and, although it shows ".468.60.gif" in the results for "Search Adlists", when I run dig @<pihole.ip> 468.60.gif, pihole reports the query as blocked externally, i.e. by pfblockerng. After removing EasyList_France from pfblockerng, dig @<pihole.ip> 468.60.gif returns an answer, so it was indeed pfblockerng blocking the lookup even though pihole listed the entry for some reason.
Let me know if you need more information. I have my main computer behind the latest pfblockerng code and am loading it with lists to give it a thorough workout (at least as far as the adblocking goes).
-
@andrebrait I made this change to my pfblockerng install:
Edited to include exclusion of entries beginning with a hyphen.
diff --git a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
index c332706eba77..eca18a486157 100644
--- a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
+++ b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
@@ -8730,6 +8730,11 @@ function sync_package_pfblockerng($cron='') {
     if (!$liteparser) {
         $lite = FALSE;
+        # entries that start with a period are probably ABP style.
+        $beginswith = substr($line, 0, 1);
+        if ($beginswith == '.' || $beginswith == '-') {
+            continue;
+        }
         if (strpos($line, '.') !== FALSE && ctype_alnum(str_replace('.', '', $line))) {
             $lite = TRUE;
There are also entries such as _c.gif, _adobe_analytics.js, and _stat.php which, viewed as domains, have a TLD that is unregistered as of today, AFAICT.
-
Inside the next if block, leading and trailing periods are pruned from the line:
// Remove leading/trailing dots
$line = trim(trim($line), '.');
-
@totowentsouth thanks for looking into it.
As far as I can tell, it's an actual bug. It should not parse entries that start with a period. Those entries in EasyList lingo mean other things, not domain names.
The intended behavior is to only parse entries that start with || or @@|| (those are exclusions) and end with ^ (or ^| as sometimes that happens).
The entries you listed should have been skipped and ignored. Those are URL patterns, and DNSBL shouldn't use them.
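To make the intended rule concrete, here is a minimal Python sketch (not the actual PHP in pfblockerng.inc). It accepts only lines that start with || or @@|| and end with ^ or ^|, and rejects everything else, including the URL patterns listed earlier. The function name, the regex, and the example.* domains are all hypothetical, and rule options such as a trailing $third-party are deliberately not handled.

```python
import re

# Hypothetical pattern for a domain-only EasyList rule:
#   "||domain^" (block) or "@@||domain^" (exclusion), optionally ending in "^|".
_DOMAIN_RULE = re.compile(
    r"^(@@)?\|\|([A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?)\^\|?$"
)


def parse_easylist_domain(line):
    """Return ("block" | "allow", domain) for a domain-only rule, else None."""
    match = _DOMAIN_RULE.match(line.strip())
    if match is None:
        return None  # URL pattern, comment, cosmetic rule, etc.
    kind = "allow" if match.group(1) else "block"
    return kind, match.group(2).lower()


if __name__ == "__main__":
    for sample in [
        "||tracker.example^",
        "@@||cdn.example^|",
        ".sendcloud.net/track/",
        ".substack.com/o/$image",
        "_adobe_analytics.js",
    ]:
        print(repr(sample), "->", parse_easylist_domain(sample))
```

With a check like this, anything containing a slash or starting with a period never reaches the domain list in the first place.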
I'll look into it as soon as possible. Thanks for the detailed information and the code snippets :)
-
@totowentsouth said in https://oisd.nl:
Inside the next if block, leading and trailing periods are pruned from the line:
// Remove leading/trailing dots
$line = trim(trim($line), '.');
That still shouldn't let those entries be parsed. They have forward slashes in them, for example, so the fact it still manages to parse them is quite weird. Those should be skipped :/
I'll take a look later this weekend.
EDIT: yup, never mind. Found it:
// If '/' character found, remove characters after '/'
if (strpos($line, '/') !== FALSE) {
    $line = strstr($line, '/', TRUE);
}
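For anyone following along, the combined effect of the two quoted PHP snippets can be reproduced with a small Python sketch. This only mimics those two steps; the function name is hypothetical, and the order in which the real code applies them is an assumption (for these inputs either order gives the same result).

```python
def old_style_normalise(line):
    """Mimic the two quoted PHP steps; the ordering is an assumption."""
    line = line.strip()
    # PHP: if (strpos($line, '/') !== FALSE) { $line = strstr($line, '/', TRUE); }
    if "/" in line:
        line = line.split("/", 1)[0]  # keep only the text before the first '/'
    # PHP: $line = trim(trim($line), '.');
    return line.strip().strip(".")


if __name__ == "__main__":
    for entry in [
        ".sendcloud.net/track/",
        ".starbucks.com/a/",
        ".substack.com/o/$image",
        ".substackcdn.com/open?$image",
    ]:
        print(entry, "->", old_style_normalise(entry))
```

Each URL pattern collapses to a bare domain, which matches the sendcloud.net and substack.com rows in the grep output from EasyList_Privacy.txt.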
-
@andrebrait That makes sense. Thank you for the explanation.
I noticed too that the EasyPrivacy list and a few other "easylist" styled lists begin with entries that pfblockerng considers "typical host feed format", i.e., $easylist remains set to its initial value of FALSE. After an entry following the "easylist" style is read, $easylist is set to TRUE. For the remaining lines in the file, execution of the block beginning with if (!$easylist) { is then skipped.
Is the intent to process the list in "normal" mode until discovery of an "easylist" style entry?
Is the intent to support lists containing a mixture of styles? If the entries in EasyPrivacy are shuffled such that a raw domain entry is after an "easylist" style entry, then might that throw a wrench in processing since, as is currently the case, $easylist is TRUE after the first "easylist" style entry is found?
-
@totowentsouth well, I'd say that it's unusual for lists to contain both things, so I assume that's why the code works the way it does, but I think it'd be safe to do it on a per-line basis because the EasyList syntax for a domain name is always going to start with || or @@|| and end with ^ or ^|.
So if a line matches that, we parse it as EasyList. Otherwise, we don't.
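The difference between the two approaches can be sketched in Python as below. Both functions are hypothetical simplifications: parse_sticky models the behaviour described above, where a flag flips on the first EasyList-style entry and plain entries are skipped from then on, while parse_per_line decides for each line independently. The regexes and the example.* names are assumptions for illustration, not the package's code.

```python
import re

# Hypothetical patterns, for illustration only.
RULE = re.compile(r"^(?:@@)?\|\|([A-Za-z0-9.-]+)\^\|?$")      # ||domain^ style
PLAIN = re.compile(r"^[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+$")     # bare domain


def parse_sticky(lines):
    """Once one EasyList rule is seen, plain-domain lines are no longer parsed."""
    easylist = False
    domains = []
    for line in lines:
        line = line.strip()
        rule = RULE.match(line)
        if rule:
            easylist = True
            domains.append(rule.group(1))
        elif not easylist and PLAIN.match(line):
            domains.append(line)
    return domains


def parse_per_line(lines):
    """Classify every line on its own, with no state carried between lines."""
    domains = []
    for line in lines:
        line = line.strip()
        rule = RULE.match(line)
        if rule:
            domains.append(rule.group(1))
        elif PLAIN.match(line):
            domains.append(line)
    return domains


mixed = ["first.example", "||ads.example^", "later.example", ".sendcloud.net/track/"]
print(parse_sticky(mixed))
print(parse_per_line(mixed))
```

In the sticky version, later.example is dropped because it comes after the first EasyList-style rule; the per-line version keeps it, and both ignore the URL pattern.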
I guess this would likely be safer and likely more correct. And either way, it should ignore those entries, especially given they have a /.
I think the original intent there was to trim // comments at the end, or some lists which contained example.com/ for some reason. Either way, there are better ways to do that. I'm gonna check it out and fix it.
Could you provide a link to the list files? Or do you mean the EasyPrivacy URL that is in the pfBlockerNG feeds tab?
-
@andrebrait Yes, the EasyPrivacy URL https://easylist.to/easylist/easyprivacy.txt in the pfBlockerNG feeds tab is the same. I create groups and provide the URLs in lieu of using the feeds tab.
-
@totowentsouth I split the check that determines whether it's an EasyList from the parsing itself. Now there's a first pass through the file that checks for the EasyList headers and entries before moving on to the actual parsing (which I also refined).
I checked and the offending entries are not ending up in the file anymore. Let me know if you can reproduce the fix.
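As a rough illustration of the two-pass idea (a Python sketch under stated assumptions, not the committed PHP): the first pass only decides whether the file looks like an EasyList, based on a header such as [Adblock Plus 2.0] or at least one ||domain^ entry, and the second pass parses with the matching rules. The helper names and regexes are hypothetical, and exclusion rules (@@||) are ignored here for brevity.

```python
import re

# Hypothetical patterns, for illustration only.
HEADER = re.compile(r"^\[Adblock(?: Plus)?", re.IGNORECASE)
RULE = re.compile(r"^\|\|([A-Za-z0-9.-]+)\^\|?$")
PLAIN = re.compile(r"^[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+$")


def looks_like_easylist(lines):
    """Pass 1: detect an EasyList header or at least one ||domain^ entry."""
    return any(HEADER.match(l.strip()) or RULE.match(l.strip()) for l in lines)


def parse_feed(lines):
    """Pass 2: parse with the rules that match the detected format."""
    lines = list(lines)
    if looks_like_easylist(lines):
        # EasyList mode: keep only domain rules, skip URL patterns and comments.
        return [m.group(1) for l in lines if (m := RULE.match(l.strip()))]
    # Host-feed mode: keep bare domains only.
    return [l.strip() for l in lines if PLAIN.match(l.strip())]


easy = ["[Adblock Plus 2.0]", "! Title: demo", "||ads.example^", ".sendcloud.net/track/"]
print(parse_feed(easy))
```

Because the format is decided for the whole file before any entry is parsed, the order of entries inside the file no longer matters.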
-
@andrebrait I updated my patch to include 4da5a631ae8d82a109fa7880429eff63c4cfa46f and all is well when using the EasyPrivacy list. Thanks!
-
@totowentsouth I gave it some polishing, cleaned up the commit history and produced the pfblockerng-adblock-clean branch (now on 7c3a4eaef2c714c9d97466ec2430e7e867cfd414).
Could you give it a last go so I have someone else test it?
-
@andrebrait I updated a pfSense box to 7c3a4eaef2c714c9d97466ec2430e7e867cfd414.
I think the extraction of IP addresses in DNSBL is no longer working: those IPs are not being extracted and stored.
This particular pfSense install was using pfblockerng-next, i.e. before pfblockerng-adblock. FWIW, I uninstalled pfblockerng and removed orphaned files. Then I installed pfblockerng-devel and applied a patch to install 7c3a4. I have yet to try pfblockerng-adblock.
In particular, DNSBLIP_v4.txt is absent and original/DNSBL_v4.orig has only one entry, 127.1.7.7.
Here is an example of a list that includes domains and IPv4:
https://malware-filter.gitlab.io/malware-filter/phishing-filter.txt
I will do more testing and verification in the next day or so.
Edit & Update: https://malware-filter.gitlab.io/malware-filter/phishing-filter-agh.txt is their adblock style. After switching to this list, the IPs are extracted. All is well now.
-
@andrebrait I began a solution for automated test coverage of pfBlockerNG's DNSBL and IP list consolidation. The setup is a little involved and undocumented. I'll flesh out some documentation for it over the next few days. It is on GitHub at babilon/pfblockerng-tests. I'm now able to trivially run a suite of tests against changes to pfBlockerNG.
-
@andrebrait Functionally, everything appears well. I noticed these duplicate calls to shell functions:
diff --git a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
index df3dc385c5f2..03e9990d64cd 100644
--- a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
+++ b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
@@ -9119,8 +9119,6 @@ function sync_package_pfblockerng($cron='') {
     // Consolidate all exclusions
     exec("{$pfb['script']} dnsbl_py_assemble_exclusions_file unused unused unused {$elog}");
-    exec("{$pfb['script']} dnsbl_py_assemble_redundants_file unused unused unused {$elog}");
-
     // Process Whitelists
     foreach ($postprocess_dnsbl as $header_esc) {
@@ -9139,8 +9137,6 @@ function sync_package_pfblockerng($cron='') {
         exec("{$pfb['script']} dnsbl_py_remove_redundant {$header_esc} unused unused {$elog}");
     }
-    exec("{$pfb['script']} dnsbl_py_cleanup_exclusions_file unused unused unused {$elog}");
-    exec("{$pfb['script']} dnsbl_py_cleanup_redundants_file unused unused unused {$elog}");
 }
--
-
@totowentsouth the function names are slightly different. One set assembles/removes the master exclusions file and the other assembles/removes the master "might make other entries redundant" file.
Because EasyLists can also contain exclusions, in order to minimize the processed lists as much as possible, I've added a post-processing step to process all files and remove block entries that would be nullified by exclusions, as well as a step to remove redundant entries (e.g. mail.google.com becomes redundant if a wildcard rule for google.com exists).
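The post-processing described above could look roughly like this in Python. This is a sketch of the idea only, not the dnsbl_py_* shell functions: the function names are hypothetical, and treating an exclusion as nullifying only an identical block entry is an assumption made to keep the example small.

```python
def is_covered(domain, wildcards):
    """True if domain equals, or is a subdomain of, any wildcard entry."""
    labels = domain.split(".")
    # Check the domain itself and every parent: a.b.c -> a.b.c, b.c, c
    return any(".".join(labels[i:]) in wildcards for i in range(len(labels)))


def postprocess(blocked, wildcard_blocked, excluded):
    """Drop block entries nullified by exclusions or made redundant by wildcards.

    Assumption: an exclusion only cancels an identical entry.
    """
    excluded = set(excluded)
    wildcards = set(wildcard_blocked) - excluded
    kept = []
    for domain in blocked:
        if domain in excluded:
            continue  # nullified by an exclusion
        if is_covered(domain, wildcards):
            continue  # redundant: a wildcard rule already blocks it
        kept.append(domain)
    return sorted(wildcards), kept


wild, kept = postprocess(
    blocked=["mail.google.com", "tracker.example", "ok.example"],
    wildcard_blocked=["google.com"],
    excluded=["ok.example"],
)
print(wild, kept)
```

Here mail.google.com is dropped because the google.com wildcard already covers it, and ok.example is dropped because an exclusion cancels it.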
The old logic already did that a bit, but in a different manner.
Or am I missing what you're referring to?
-
@andrebrait my bad on the duplication claim. I shoulda tried <shift># and I'd have seen the difference.
All is well. I retract my previous claims of issues. Sorry for any inconvenience.
I've applied the latest to all my pfSense boxes BTW.