https://oisd.nl
-
@BBcan177 Can you chime in? Can we get an update to pfblocker to support the new format?
-
@michmoor pfBlockerNG already supports the AdBlock syntax for regular records (e.g. domain only, no wildcards) for both Unbound and Python modes.
Those are the ones in the format ||domain.com^
I'm not sure they work as wildcards (per the AdBlock spec, IIRC, they should), but they're definitely parsed by it.
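For anyone unfamiliar with the format, here is a rough Python sketch of recognising a plain ||domain.com^ line. This is not pfBlockerNG's parser; the helper name and the regex are assumptions for illustration only, and it deliberately rejects wildcards, paths, and $options.

```python
import re

# Hypothetical helper, not pfBlockerNG code: matches only the plain
# "||domain^" form, with no wildcards, paths, or $options.
ABP_DOMAIN = re.compile(r"^\|\|([a-z0-9](?:[a-z0-9.-]*[a-z0-9])?)\^$")


def parse_abp_line(line):
    """Return the domain from an AdBlock-style line, or None if it is not one."""
    match = ABP_DOMAIN.match(line.strip().lower())
    return match.group(1) if match else None
```

With this sketch, parse_abp_line("||domain.com^") gives "domain.com", while a hosts-style line such as "domain.com" gives None.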
I'm adding support for wildcard blacklist (Python mode only) and whitelist (both modes, though it works best in Python mode) here: https://github.com/pfsense/FreeBSD-ports/pull/1303
-
@andrebrait gotcha thank you
-
FWIW, my interim sledgehammer fix for the erratic, unpredictable behavior of "Wildcard Blocking (TLD)", and to improve support for ABP syntax until a proper solution is available, is: add all domains to the zone file.
From https://oisd.nl/:
- PfBlockerNG (Note that pfBlockerNG does support wildcard blocking, but it's implementation is wack; It won't block subdomains to already listed subdomains, eg g.doubleclick.net should block; adclick.g.doubleclick.net, adx.g.doubleclick.net, captive.googleads.g.doubleclick.net etc, but it does not.)
I have observed the behavior described above. Furthermore, if a domain listed in a block list is more than 5 levels, it is automatically excluded from the TLD list, i.e., it will not be added to the redirect zone file.
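To make the expected behavior concrete, here is a small Python sketch of the suffix walk that the oisd.nl note implies: a listed name should also block everything beneath it. This is only an illustration of the expectation, not how pfBlockerNG or Unbound implement it; the function name and the set-based lookup are assumptions.

```python
def is_blocked(qname, blocked):
    """True if qname or any parent domain is in the blocked set."""
    labels = qname.rstrip(".").lower().split(".")
    # Try the full name first, then strip one leading label at a time.
    # Stop before the bare TLD so "net" alone is never checked.
    for i in range(len(labels) - 1):
        if ".".join(labels[i:]) in blocked:
            return True
    return False


blocked = {"g.doubleclick.net"}
print(is_blocked("adclick.g.doubleclick.net", blocked))            # True
print(is_blocked("captive.googleads.g.doubleclick.net", blocked))  # True
print(is_blocked("doubleclick.net", blocked))                      # False
```

Note that the walk has no depth limit, so a name with more than 5 levels is handled the same way as any other.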
For this sledgehammer patch to work, "Wildcard Blocking (TLD)" must be disabled (AFAICT) so that all domains are treated as sub-domains and added to a 'transparent' zone (see tld_analysis in pfblockerng.inc).
It may be wise to ensure your box has enough RAM to support the number of domains before slamming all domains into the zone file. Code in pfblockerng.inc checks that there's enough RAM to support redirect zones. 8GB of RAM allows for 2.5 million domains when DNSBL Mode is "Unbound python mode" and "DNSBL Blocking" (Enable the DNSBL python blocking mode) is enabled. OISD and HaGeZi's Pro list is approx. 383k domains. I have a few snowflakes that I whitelist and others that add to the set.
My sledgehammer patch writes the final set to the unbound_py_zone file:
--- pfblockerng.inc 2023-10-08 15:54:01.742896000 -0500
+++ pfblockerng.inc.rab 2023-10-08 15:45:10.531307000 -0500
@@ -8942,7 +8942,7 @@
 unlink_if_exists($pfb['unbound_py_data']);
 unlink_if_exists($pfb['unbound_py_zone']);
 unlink_if_exists($pfb['unbound_py_count']);
- rename("{$pfb['dnsbl_file']}.raw", $pfb['unbound_py_data']);
+ rename("{$pfb['dnsbl_file']}.raw", $pfb['unbound_py_zone']);
 }
 }
I applied this to my pfblockerNG instances today and am happy to see behaviors I expected of the "Wildcard Blocking (TLD)" feature. Also nice to use the ABP files which are a bit smaller in size as they condense common subdomains.
I look forward to seeing @andrebrait's solution. In the meantime, I think this will hold me over.
-
@totowentsouth If I provide you with instructions, would you be willing to test my solution?
I can send you the commands to scp the files to the right places on your pfSense machine and whatnot. Just let me know if you'd be willing to do it and if you're comfortable with the command line.
-
@andrebrait Yes, I can put your solution through some testing on my secondary pfSense box. I am comfortable with the command line. I peeked at your patch on github. I can grab the source from there or wherever you advise. I'm happy to help.
-
@totowentsouth you can check out my branch then and copy over the inc, sh and php files to their folder, which is /usr/local/www/pfblockerng/
For the Python file, it goes in /var/unbound
EDIT: reminder that the changed files are in and for the -devel version
-
@andrebrait A brief update: I installed pfBlockerNG-devel 3.2.0_6 and installed the modified files from https://github.com/andrebrait/FreeBSD-ports/tree/pffblockerng_devel_whitelist_regex to their respective locations.
After reloading my current set of lists (a mixture of ABP style and raw domains), dig queries are returning what I'd expect. I will continue to poke things, toggle options, etc.
Question: Regarding your pull request on github, CONTRIBUTING.md in FreeBSD-ports states they do not accept pull requests. Does someone else on github forward the patches to the FreeBSD team?
-
@totowentsouth outdated documentation. The repository has been accepting PRs for a while now.
-
@andrebrait I have encountered an undesirable outcome.
I have TLD wildcard disabled. BTW, I'm wondering if it should be eliminated with this set of changes?
I am utilizing multiple lists with ABP style and raw domains. Two lists (A and B) happen to contain:
||000free.us^
And a 3rd list (C) contains:
000free.us
www.000free.us
The three lists are in the same DNSBL Group.
The three lists are in the group in the following order: A, B, C. That is, C, the raw domains list, is last in the group. I have not tried ordering the lists in reverse order.
When feeding these three lists to pfblockerng with https://github.com/pfsense/FreeBSD-ports/pull/1303, /var/unbound/pfb_py_data.txt contains four entries for "000free.us". Two entries end with 1 (from lists A and B) and two (from list C) end with a 0. The entries appear in the output in the same order as the lists in the group: the entry from list A is first in the file, followed by the entry from list B, followed by the two entries from list C.
,000free.us,,0,A,DNSBL_Compilation,1
,000free.us,,0,B,DNSBL_Compilation,1
,000free.us,,0,C,DNSBL_Compilation,0
,www.000free.us,,0,C,DNSBL_Compilation,0
The output, /var/unbound/pfb_py_data.txt, without 1303 has 000free.us from list A and www.000free.us from C. No duplicate entries.
,000free.us,,0,A,DNSBL_Compilation
,www.000free.us,,0,C,DNSBL_Compilation
Especially troubling are the dig results:
dig 000free.us
;; ANSWER SECTION:
000free.us. 60 IN A 0.0.0.0

dig www.000free.us
;; ANSWER SECTION:
www.000free.us. 60 IN A 0.0.0.0

dig blarg.www.000free.us
;; ANSWER SECTION:
blarg.www.000free.us. 30 IN A 104.x.x.x

dig blarg.000free.us
;; ANSWER SECTION:
blarg.000free.us. 30 IN A 104.x.x.x
On one hand, the use of non-regex/ABP entries combined with regex/ABP may be nonsensical. On the other hand, I've developed an expectation that pfblockerng eliminates duplicates and prefers the most hardened result.
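To illustrate what I mean by preferring the most hardened result, here is a rough Python sketch. It assumes the trailing column added by the PR marks an ABP-style (wildcard) entry when it is 1, which is only my reading of the sample rows; the function name is made up and this is not code from pfBlockerNG.

```python
import csv
import io


def prefer_wildcard(rows):
    """Keep one row per domain, preferring a row whose last column is '1'."""
    best = {}
    for row in rows:
        domain = row[1]
        kept = best.get(domain)
        # First sighting wins unless a later row upgrades 0 -> 1.
        if kept is None or (kept[-1] != "1" and row[-1] == "1"):
            best[domain] = row
    return list(best.values())


# Assumed sample, modelled on the rows shown in this post.
sample = """\
,000free.us,,0,C,DNSBL_Compilation,0
,000free.us,,0,A,DNSBL_Compilation,1
,000free.us,,0,B,DNSBL_Compilation,1
,www.000free.us,,0,C,DNSBL_Compilation,0
"""
rows = list(csv.reader(io.StringIO(sample)))
for row in prefer_wildcard(rows):
    print(",".join(row))
```

For this sample, only the list A row survives for 000free.us, regardless of where the list C row appears in the input.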
A brief side-track to TLD wildcard enabled scenario: /var/unbound/pfb_py_zone.txt has three entries for the domain 000free.us; one entry for each of the lists that contained it.
,000free.us,,0,A,DNSBL_Compilation,1
,000free.us,,0,B,DNSBL_Compilation,1
,000free.us,,0,C,DNSBL_Compilation,0
Furthermore, dig 000free.us returns a valid IP.
What are your thoughts? Let me know if more information would be helpful. I'll continue poking it.
-
@totowentsouth The duplication isn't expected, and neither is the subdomain not matching even when there's an AdBlock-style entry for it (they are always implicitly wildcards, similar to having TLD matching turned ON + having the domain on a hosts-style list).
I'll have to add both here and see why it's matching like that.
Btw, I made some changes yesterday. Would you mind pulling them and testing again?
Just a reminder that the Python file goes in a different place than the other files and that you need to go to Update -> Reload and force it to reload DNSBL (or All) after replacing the files.
-
@totowentsouth ok, just by looking at the lists I think I figured out what's happening.
Will try something later :)
-
@totowentsouth I think I fixed it. Can you pull and try with the new files?
-
@andrebrait The short synopsis: DNS queries at 2995ad5f6927ace4a32a479d1375fe5123e1f1e6 are behaving as one would expect.
The details:
For each iteration of changes, I install all modified files from your branch:
git diff --name-only 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..HEAD
To make testing before and after your changes trivial, I create custom system patches.
On the pfSense box, after changing the source files and before running an update, I delete the cached and processed files:
cd /var/db/pfblockerng/
rm -rf dnsbl/ dnsblorig/ dnsblalias/
I leave files in /var/unbound/ unmodified. I've noticed pfblockerng reliably deletes and/or updates them on an update/reload. pfblockerng does not always delete files in dnsblorig - for good reason as the cached file could be used again if desired. pfblockerng appears to update the files in dnsbl/ as expected though I delete them manually to ensure the generated files are fresh.
Also, FWIW, I host these files on a local web server to avoid hammering the maintainers' servers as I'm reloading these lists more frequently than once a day for testing purposes.
Question: There are still duplicate entries for domains that appear in multiple lists. Is this something you are or will be addressing?
An undesirable behavior I had not noticed before today: when TLD wildcard is enabled, the log file py_error.log contains a line for each domain in pfb_py_zone.txt stating:
<date timestamp>|ERROR| [pfBlockerNG]: Failed to parse: pfb_py_zone.txt: ['','zzxedr.xyz', '', '0', 'A', 'DNSBL_Compilation', '1']
<date timestamp>|ERROR| [pfBlockerNG]: Failed to parse: pfb_py_zone.txt: ['','000free.us', '', '0', 'C', 'DNSBL_Compilation', '0']
With TLD wildcard disabled, pfb_py_zone.txt is absent and no new errors are written to py_error.log.
Folks that have TLD wildcard enabled from previous versions of pfBlockerNG will have a flood of errors in py_error.log after upgrading to the version that includes this implementation.
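My guess, and it is only a guess, is that whatever reads pfb_py_zone.txt still expects the older row width and trips on the extra trailing column. A tolerant reader could look roughly like the Python sketch below. The function name, the dict keys, and the meaning I attach to each column are all assumptions drawn from the sample rows in this thread, not pfBlockerNG's documented format.

```python
def parse_row(row):
    """Parse a 6-column (older) or 7-column (newer) row into a dict.

    Column meanings are inferred from the samples in this thread and are
    assumptions, not pfBlockerNG's documented format.
    """
    if len(row) not in (6, 7):
        raise ValueError(f"unexpected column count: {len(row)}")
    return {
        "domain": row[1],
        "feed": row[4],
        "group": row[5],
        # Older rows have no trailing flag; treat it as unset.
        "flag": row[6] if len(row) == 7 else None,
    }
```

With something along these lines, rows written by either version would load, and only genuinely malformed rows would be reported.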
Question: Would you like to continue sharing feedback on this forum thread or migrate to the github pull request? I am comfortable with either.
This is looking great so far. I'll continue to hammer at the changes you push and I will share my findings as my time allows. Thank you!
-
@totowentsouth that's very useful feedback. @BBcan177 had alerted me about possible issues when upgrading due to me having added a few more columns to the intermediate CSV files. I thought I had fixed that earlier today but it seems I missed a line somewhere. Will fix it later today.
Here or GitHub are both ok for me. Whichever you prefer.
I guess on GitHub things can be a bit more structured / easier to format and suggest stuff.
And I have a far more precarious setup than you do: I literally just scp the files to my pfSense VM to test stuff.
-
@totowentsouth regarding duplicates, they're expected and can't be avoided.
It's a lot easier to just parse everything as-is inside the Python code and handle the possibility of duplicates there than to try and remove them from the .whitelist files before that.
-
@andrebrait re github vs here: I am new to the development processes for pfSense/pfBlockerNG. When/how does a merge request get clearance to merge to the main branch? Will sharing feedback on github increase the likelihood of preventing a 'bad' commit getting merged to the main branch?
I like the system patches when all of the pieces align. I have too often forgotten a change I made to the live files and flailed with the output of the "debug". I am learning to adopt better practices to avoid such pain.
I noticed new commits, so I installed 49d2fde485847bf291bd6be1d7fba110ff5f52c0 and ran an update. Now domains that are not explicitly blocked fail to resolve. The explicitly blocked queries return the expected 'blocked' IP. In my setup, this is 0.0.0.0.
For example, "dig fls.doubleclick.net" expectedly returns 0.0.0.0:
; <<>> DiG 9.16.44-Debian <<>> fls.doubleclick.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4245
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;fls.doubleclick.net. IN A
;; ANSWER SECTION:
fls.doubleclick.net. 60 IN A 0.0.0.0
And "dig google.com" reports a failure:
; <<>> DiG 9.16.44-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41375
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com. IN A
I'm happy to provide more details if you are not already aware of this behavior. The new debug mode is off. TLD wildcard is also off.
FWIW, when I revert to 2995ad5f6927ace4a32a479d1375fe5123e1f1e6, all DNS queries behave as expected. Blocked are 'blocked' and not-blocked are resolved to a valid IP.
Any ideas?
Let me know if I should hold off on testing new commits until you want them looked at.
Cheers and Thank you!
-
@totowentsouth I'll respond to the other questions properly later :)
As for the issue, could you checkout this branch and check if it happens there?
https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-next
-
@andrebrait said in https://oisd.nl:
As for the issue, could you checkout this branch and check if it happens there?
https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-next
DNS queries are functional at 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c and TLD wildcard appears to be working too
Thanks!
-
@andrebrait said in https://oisd.nl:
It's a lot easier to just parse everything as-is inside the Python code and handle the possibility of duplicates there than to try and remove them from the .whitelist files before that.
I put together a solution to optionally prune duplicates from all DNSBL files after pfBlockerNG does its job in sanitizing. In the end, I do see that the expense of pruning the duplicates is great, especially with a handful of large block lists on a low resource machine. I'm sure aggregate lists like OISD and HaGeZi already prune duplicates.
Well, I wonder if you would be interested in sharing your thoughts and opinions on such a solution. I pushed a commit to https://github.com/babilon/FreeBSD-ports/tree/pfblockerng_devel_prune_duplicates. It is based off of pfblockerng-next.
@andrebrait said in https://oisd.nl:
@totowentsouth I'll respond to the other questions properly later :)
Sounds great. I found a link to https://docs.netgate.com/pfsense/en/latest/development/pull-request.html which may answer some of my questions.