https://oisd.nl
-
@andrebrait A brief update: I installed pfBlockerNG-devel 3.2.0_6 and copied the modified files from https://github.com/andrebrait/FreeBSD-ports/tree/pffblockerng_devel_whitelist_regex to their respective locations.
After reloading my current set of lists (a mixture of ABP style and raw domains), dig queries are returning what I'd expect. I will continue to poke things, toggle options, etc.
Question: Regarding your pull request on GitHub, CONTRIBUTING.md in FreeBSD-ports states they do not accept pull requests. Does someone else on GitHub forward the patches to the FreeBSD team?
-
@totowentsouth outdated documentation. The repository has been accepting PRs for a while now.
-
@andrebrait I have encountered an undesirable outcome.
I have TLD wildcard disabled. BTW, I'm wondering if it should be eliminated with this set of changes?
I am utilizing multiple lists with ABP style and raw domains. Two lists (A and B) happen to contain:
||000free.us^
And a 3rd list (C) contains:
000free.us www.000free.us
The three lists are in the same DNSBL Group.
The three lists are in the group in the following order: A, B, C. That is, C, the raw domains list, is last in the group. I have not tried ordering the lists in reverse. When feeding these three lists to pfblockerng with https://github.com/pfsense/FreeBSD-ports/pull/1303, /var/unbound/pfb_py_data.txt contains four entries for "000free.us". Two entries end with 1 (from lists A and B) and two (from list C) end with 0. The entries in the output appear in the same order as the lists in the group: the entry from list A is first in the file, followed by the entry from list B, followed by the two entries from list C.
,000free.us,,0,A,DNSBL_Compilation,1
,000free.us,,0,B,DNSBL_Compilation,1
,000free.us,,0,C,DNSBL_Compilation,0
,www.000free.us,,0,C,DNSBL_Compilation,0
The output, /var/unbound/pfb_py_data.txt, without 1303 has 000free.us from list A and www.000free.us from C. No duplicate entries.
,000free.us,,0,A,DNSBL_Compilation
,www.000free.us,,0,C,DNSBL_Compilation
Especially troubling are the dig results:
dig 000free.us
;; ANSWER SECTION:
000free.us.             60   IN   A   0.0.0.0

dig www.000free.us
;; ANSWER SECTION:
www.000free.us.         60   IN   A   0.0.0.0

dig blarg.www.000free.us
;; ANSWER SECTION:
blarg.www.000free.us.   30   IN   A   104.x.x.x

dig blarg.000free.us
;; ANSWER SECTION:
blarg.000free.us.       30   IN   A   104.x.x.x
On one hand, mixing non-regex/ABP entries with regex/ABP entries may be nonsensical. On the other hand, I've come to expect pfblockerng to eliminate duplicates and prefer the most hardened result.
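To illustrate the deduplication I'd expect, here's a minimal Python sketch that keeps the most hardened entry per domain, assuming the pfb_py_data.txt column layout shown above (domain in the second field, wildcard flag in the last). Function and variable names are mine, not pfBlockerNG's:

```python
import csv

def dedup_prefer_wildcard(lines):
    """Collapse duplicate domains, keeping a wildcard entry (last
    field '1') over an exact-match one ('0') when both appear."""
    best = {}
    for row in csv.reader(lines):
        # assumed layout, taken from the output above:
        # ,domain,,0,list,DNSBL_Compilation,wildcard_flag
        domain, flag = row[1], row[-1]
        if domain not in best or (flag == '1' and best[domain][-1] == '0'):
            best[domain] = row
    return list(best.values())

entries = [
    ",000free.us,,0,A,DNSBL_Compilation,1",
    ",000free.us,,0,B,DNSBL_Compilation,1",
    ",000free.us,,0,C,DNSBL_Compilation,0",
    ",www.000free.us,,0,C,DNSBL_Compilation,0",
]
for row in dedup_prefer_wildcard(entries):
    print(','.join(row))
# keeps one 000free.us (wildcard, from list A) plus www.000free.us
```

Note this still keeps www.000free.us even though the wildcard covers it; pruning subdomains that fall under a wildcard would be a separate step.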
A brief side-track to TLD wildcard enabled scenario: /var/unbound/pfb_py_zone.txt has three entries for the domain 000free.us; one entry for each of the lists that contained it.
,000free.us,,0,A,DNSBL_Compilation,1
,000free.us,,0,B,DNSBL_Compilation,1
,000free.us,,0,C,DNSBL_Compilation,0
Furthermore, dig 000free.us returns a valid IP.
What are your thoughts? Let me know if more information would be helpful. I'll continue poking it.
-
@totowentsouth The duplication isn't expected, and neither is the failure to match the subdomain when there's an AdBlock-style entry for it (ABP entries are always implicit wildcards, similar to having TLD matching turned ON plus having the domain on a hosts-style list).
I'll have to add both here and see why it's matching like that.
Btw, I made some changes yesterday. Would you mind pulling them and testing again?
Just a reminder that the Python file goes in a different place than the other files and that you need to go on Update -> Reload and force it to reload DNSBL (or All) after replacing the files.
-
@totowentsouth ok, just by looking at the lists I think I figured out what's happening.
Will try something later :)
-
@totowentsouth I think I fixed it. Can you pull and try with the new files?
-
@andrebrait The short synopsis: DNS queries at 2995ad5f6927ace4a32a479d1375fe5123e1f1e6 are behaving as one would expect.
The details:
For each iteration of changes, I install all modified files from your branch:
git diff --name-only 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..HEAD
To make testing before and after your changes trivial, I create custom system patches.
On the pfSense box, after changing the source files and before running an update, I delete the cached and processed files:
cd /var/db/pfblockerng/
rm -rf dnsbl/ dnsblorig/ dnsblalias/
I leave files in /var/unbound/ unmodified; I've noticed pfblockerng reliably deletes and/or updates them on an update/reload. pfblockerng does not always delete files in dnsblorig/ - for good reason, as the cached file could be reused if desired. pfblockerng appears to update the files in dnsbl/ as expected, though I delete them manually to ensure the generated files are fresh.
Also, FWIW, I host these files on a local web server to avoid hammering the maintainers' servers as I'm reloading these lists more frequently than once a day for testing purposes.
Question: There are still duplicate entries for domains that appear in multiple lists. Is this something you are or will be addressing?
Another undesirable behavior I noticed today: when TLD wildcard is enabled, the log file py_error.log contains a line for each domain in pfb_py_zone.txt stating:
<date timestamp>|ERROR| [pfBlockerNG]: Failed to parse: pfb_py_zone.txt: ['', 'zzxedr.xyz', '', '0', 'A', 'DNSBL_Compilation', '1']
<date timestamp>|ERROR| [pfBlockerNG]: Failed to parse: pfb_py_zone.txt: ['', '000free.us', '', '0', 'C', 'DNSBL_Compilation', '0']
With TLD wildcard disabled, pfb_py_zone.txt is absent and no new errors are written to py_error.log.
Folks that have TLD wildcard enabled from previous versions of pfBlockerNG will have a flood of errors in py_error.log after upgrading to the version that includes this implementation.
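For what it's worth, the flood suggests the parser expects a fixed column count. A tolerant parser that accepts both the old 6-column and new 7-column rows would avoid it. This is only a sketch of the idea, not pfBlockerNG's actual code; the field names are my guesses from the log lines above:

```python
def parse_zone_row(row):
    """Accept both the pre-change 6-column and new 7-column
    pfb_py_zone.txt layouts, defaulting the wildcard flag to '1'
    for legacy rows so old TLD-wildcard data keeps its meaning.
    Field names here are guesses inferred from the error log lines."""
    if len(row) == 7:
        _, domain, _, log, feed, group, wildcard = row
    elif len(row) == 6:
        _, domain, _, log, feed, group = row
        wildcard = '1'
    else:
        raise ValueError('unexpected column count: {!r}'.format(row))
    return {'domain': domain, 'log': log, 'feed': feed,
            'group': group, 'wildcard': wildcard}

# new-format row keeps its flag; legacy row gets the default
print(parse_zone_row(['', '000free.us', '', '0', 'C', 'DNSBL_Compilation', '0']))
print(parse_zone_row(['', 'zzxedr.xyz', '', '0', 'A', 'DNSBL_Compilation']))
```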
Question: Would you like to continue sharing feedback on this forum thread or migrate to the github pull request? I am comfortable with either.
This is looking great so far. I'll continue to hammer at the changes you push and I will share my findings as my time allows. Thank you!
-
@totowentsouth that's very useful feedback. @BBcan177 had alerted me about possible issues when upgrading due to me having added a few more columns to the intermediate CSV files. I thought I had fixed that earlier today but it seems I missed a line somewhere. Will fix it later today.
Here or GitHub are both ok for me. Whichever you prefer.
I guess on GitHub things can be a bit more structured / easier to format and suggest stuff.
And I have a far more precarious setup than you do: I literally just scp the files to my pfSense VM to test stuff.
-
@totowentsouth regarding duplicates, they're expected and can't be avoided.
It's a lot easier to just parse everything as-is inside the Python code and handle the possibility of duplicates there than to try and remove them from the .whitelist files before that.
-
@andrebrait re GitHub vs here: I am new to the development processes for pfSense/pfBlockerNG. When/how does a pull request get clearance to merge to the main branch? Will sharing feedback on GitHub increase the likelihood of preventing a 'bad' commit from getting merged to the main branch?
I like the system patches when all of the pieces align. I have too often forgotten a change I made to the live files and flailed at the output while "debugging". I am learning to adopt better practices to avoid such pain.
I noticed new commits, so I installed 49d2fde485847bf291bd6be1d7fba110ff5f52c0 and ran an update; now domains not explicitly blocked fail to resolve. Explicitly blocked queries return the expected 'blocked' IP, which in my setup is 0.0.0.0.
For example, "dig fls.doubleclick.net" expectedly returns 0.0.0.0:
; <<>> DiG 9.16.44-Debian <<>> fls.doubleclick.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4245
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;fls.doubleclick.net.   IN   A

;; ANSWER SECTION:
fls.doubleclick.net.   60   IN   A   0.0.0.0
And "dig google.com" reports a failure:
; <<>> DiG 9.16.44-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41375
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.   IN   A
I'm happy to provide more details if you are not already aware of this behavior. The new debug mode is off. TLD wildcard is also off.
FWIW, when I revert to 2995ad5f6927ace4a32a479d1375fe5123e1f1e6, all DNS queries behave as expected. Blocked are 'blocked' and not-blocked are resolved to a valid IP.
Any ideas?
Let me know if I should hold off on testing new commits until you want attention on them.
Cheers and Thank you!
-
@totowentsouth I'll respond to the other questions properly later :)
As for the issue, could you checkout this branch and check if it happens there?
https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-next
-
@andrebrait said in https://oisd.nl:
As for the issue, could you checkout this branch and check if it happens there?
https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-next
DNS queries are functional at 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c and TLD wildcard appears to be working too.
Thanks!
-
@andrebrait said in https://oisd.nl:
It's a lot easier to just parse everything as-is inside the Python code and handle the possibility of duplicates there than to try and remove them from the .whitelist files before that.
I put together a solution to optionally prune duplicates from all DNSBL files after pfBlockerNG finishes its sanitizing. In the end, I do see that the cost of pruning duplicates is substantial, especially with a handful of large block lists on a low-resource machine. I'm sure aggregate lists like OISD and HaGeZi already prune duplicates.
Well, I wonder if you would be interested in sharing your thoughts and opinions on such a solution. I pushed a commit to https://github.com/babilon/FreeBSD-ports/tree/pfblockerng_devel_prune_duplicates. It is based off of pfblockerng-next.
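For reference, the core of the pruning idea boils down to dropping exact-match domains that are already covered by a wildcard entry. A simplified Python sketch (function names are mine; the actual script in the repo handles more cases):

```python
def prune_covered(exact_domains, wildcard_domains):
    """Drop exact-match domains that are equal to, or a subdomain
    of, any wildcard domain."""
    wildcards = set(wildcard_domains)
    kept = []
    for d in exact_domains:
        labels = d.split('.')
        # check d itself and every parent suffix against the wildcard set
        covered = any('.'.join(labels[i:]) in wildcards
                      for i in range(len(labels)))
        if not covered:
            kept.append(d)
    return kept

print(prune_covered(['000free.us', 'www.000free.us', 'example.org'],
                    ['000free.us']))
# ['example.org']
```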
@andrebrait said in https://oisd.nl:
@totowentsouth I'll respond to the other questions properly later :)
Sounds great. I found a link to https://docs.netgate.com/pfsense/en/latest/development/pull-request.html which may answer some of my questions.
-
@totowentsouth there is already a somewhat unrefined deduplication and pruning process that happens in the shell script that processes the lists.
It's somewhat ok and it serves the exact function you describe in your commit message, allowing you to assess how many unique domains a new list brings you.
But the current one is just a simple string matcher and/or regular expression matcher and it's fragile/not the most efficient minimizer.
Your commit actually gave me a few ideas on how to achieve what we need, but it needs some modifications to account for the different types of matching.
One thing I need to change (that's in the existing code) is that the PHP code tries to differentiate between the Python and Unbound paths too early. It does that because it validates the configuration with Unbound for each list, but I honestly think we can split those things into separate steps (and that we can trust our own process for identifying valid domain names, etc., better).
I need to sit down and work on it, but namely those would be:
- Download all lists as-is
- Parse each of them, ignore comments, identify the format and save those to a set of separate files for each list:
- Blacklist exact-match domain names
- Blacklist wildcard (a.k.a. domain and all subdomains) (only for AdBlock lists or Hosts lists for public TLDs)
- Blacklist regex-like entries (only for AdBlock lists; these should likely not be real regexes in the end, since those lists don't contain real regexes the way we use them, only star characters, which can all be reduced to a "match any" rule or a simple "match this regex" rule for that specific piece of the domain name). (Not applicable to Unbound mode)
- Same as above but for whitelists
- For each of those, build a tree of domain names for each type of rule, identifying duplicates and reducing the tree on each additional list processed.
From there, we have two possible approaches:
- (Required for Unbound, faster but more memory intensive for Python) generate a minimal set of rules and create the corresponding final files for the mode
- (Only Python) keep everything as a tree and handle the rules that way, so we don't have different types of matching inside the Python code and don't need to, for example, iterate over all regular expressions (other than the user-defined ones, for which we have no choice). We can test DFS and BFS to find out which is more efficient for domain resolution (there are surely papers on that out there). Note that this will be slower for simple matches, and can be slower for wildcards too, but it can be more memory efficient as it doesn't require a separate dictionary for each type of rule.
Only after this we should take the time to put the lists in their specific formats (for Unbound or Python mode) so they can be parsed.
But all this is such a big breakage relative to how we do things that it's going to take more than the time I have to figure it out properly.
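To give an idea, the tree I have in mind could be a suffix trie over reversed domain labels, where a wildcard node makes everything beneath it redundant. A rough Python sketch, just an illustration and not the final implementation:

```python
class DomainTrie:
    """Suffix trie over reversed domain labels.

    insert('000free.us', wildcard=True) covers the domain and every
    subdomain; wildcard=False matches only the exact name. The '#'
    key stores rule markers and can't collide with a real label."""

    def __init__(self):
        self.root = {}

    def insert(self, domain, wildcard):
        node = self.root
        for label in reversed(domain.lower().split('.')):
            node = node.setdefault(label, {})
        if wildcard:
            node.clear()  # deeper entries are redundant under a wildcard
        node.setdefault('#', set()).add('*' if wildcard else '=')

    def matches(self, qname):
        node = self.root
        labels = list(reversed(qname.lower().split('.')))
        for i, label in enumerate(labels):
            if label not in node:
                return False
            node = node[label]
            marks = node.get('#', ())
            if '*' in marks:                      # wildcard: covers all suffixes
                return True
            if '=' in marks and i == len(labels) - 1:
                return True                       # exact: whole name only
        return False

trie = DomainTrie()
trie.insert('000free.us', wildcard=True)    # like ||000free.us^
trie.insert('example.org', wildcard=False)  # like a hosts-style entry
print(trie.matches('blarg.www.000free.us'))  # True: wildcard covers subdomains
print(trie.matches('sub.example.org'))       # False: exact entry only
```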
-
@andrebrait My assessment after initial deployment was based on a grave coding error that I introduced while deploying to my pfSense box. After correcting this error, the numbers are reasonable. My updated initial commit reflects the correct numbers (I did a force push for that oopsie). I mention this in case the commit you pulled and read had the gnarly numbers. The memory used by the current script peaks at around the same level as running pfBlockerNG in Unbound (non-Python) mode with the same set of lists. I've since tuned the solution and added command line argument parsing and regex deduplication support.
@andrebrait said in https://oisd.nl:
It's somewhat ok and it serves the exact function you describe in your commit message, allowing you to assess how many unique domains a new list brings you.
But the current one is just a simple string matcher and/or regular expression matcher and it's fragile/not the most efficient minimizer.
Yes. The current mechanism does an admirable job at deduplication despite its limitations. With the introduction of ABP style matching, the limitations of the existing deduplication are more bothersome to me albeit NOT problematic for pfBlockerNG.
Grep utopia, to me, is to find ONE match in a single list. Finding the same/similar match multiple times across three lists leaves me wondering if DNSBL is broken or if I'm using a suboptimal set of lists. I realize my use case of aggregate lists plus supplements is contributing to my woes. Aggregate lists are wonderful for a wide spectrum of users, who naturally have varying requirements. Some of those requirements do not apply to me, so I supplement the aggregate lists.
Anyway...
FWIW #1, I have been running 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c for a week or so (along with pfb_dnsbl_prune.py as a final dedup process) on both of my pfSense machines and have not observed issues. That I haven't seen an issue does not carry a lot of weight since I've not had time to poke and observe output.
FWIW #2, I have just updated both pfSense boxes to 3bea1f276e7fd557e285c999cde11202690ea81f. All is well.
I am very excited for the official release of this regex/ABP style matching. I will continue to run the 'next' versions until its release.
FWIW #3, I pushed the pfb_dnsbl_prune.py script to a separate repo: https://github.com/babilon/dedup-domains/tree/main. I have also synchronized the script to my fork of FreeBSD-ports to then easily make a system patch. I intend the license to match pfBlockerNG's license in case any of it may continue to be inspirational or useful.
Question #1: In each IP Group, within the "Advanced Tuneables" section, are two list boxes. One is labeled "Pre-process Script" and the second is labeled "Post-process Script". The code that runs the selected script appears to be in pfblockerng.inc:
// IP v4/6 Advanced Tunable - (Post Script processing)
if ($pfb_script_post && file_exists("{$pfb_script_post}")) {
    pfb_logger("\nExecuting post-script: {$list['script_pre']}\n", 1);
    $file_org_esc = escapeshellarg("{$file_dwn}.orig");
    exec("{$pfb_script_post} {$file_org_esc} {$list['vtype']} {$elog}");
}
In each DNSBL Group, within the "Advanced Tuneables" section, are two list boxes with the same set of labels as is seen on the IP Group page. However, AFAICT, the code that would run the DNSBL's pre- and post- scripts is absent!
(Q1) Do you know if this is a known issue? A search for "advanced tuneables" resulted in one match: #12882. This is not the issue I am looking for. I considered utilizing this functionality to run pfb_dnsbl_prune.py until I realized it's per group list. This leads to Question 2:
Question #2: What are your thoughts on a feature that allows users to optionally run a Pre- and/or Post- entire-DNSBL-process script?
I am thinking of a long-term solution to run scripts like pfb_dnsbl_prune.py that act on the collective input/output of the DNSBL process. A solution that does not involve injecting changes to the core pfBlockerNG code to execute.
-
@totowentsouth said in https://oisd.nl:
Yes. The current mechanism does an admirable job at deduplication despite its limitations. With the introduction of ABP style matching, the limitations of the existing deduplication are more bothersome to me albeit NOT problematic for pfBlockerNG.
Honestly, I ended up finding out it's not that bad for Python mode. It's not optimal, but it doesn't have to be.
I'd like to remove Unbound mode first and then try to improve it. DNSBL code is too full of branches right now, because of both modes being present.
FWIW #1, I have been running 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c for a week or so (along with pfb_dnsbl_prune.py as a final dedup process) on both of my pfSense machines and have not observed issues. That I haven't seen an issue does not carry a lot of weight since I've not had time to poke and observe output.
FWIW #2, I have just updated both pfSense boxes to 3bea1f276e7fd557e285c999cde11202690ea81f. All is well.
I am very excited for the official release of this regex/ABP style matching. I will continue to run the 'next' versions until its release.
There's a small bug in the TLD Analysis step, I think, but I'll be fixing it tomorrow. Thanks a lot for testing it!
FWIW #3, I pushed the pfb_dnsbl_prune.py script to a separate repo: https://github.com/babilon/dedup-domains/tree/main. I have also synchronized the script to my fork of FreeBSD-ports to then easily make a system patch. I intend the license to match pfBlockerNG's license in case any of it may continue to be inspirational or useful.
Thanks. I'll check it out when I have time.
Question #1: In each IP Group, within the "Advanced Tuneables" section, are two list boxes. One is labeled "Pre-process Script" and the second is labeled "Post-process Script". The code that runs the selected script appears to be in pfblockerng.inc:
// IP v4/6 Advanced Tunable - (Post Script processing)
if ($pfb_script_post && file_exists("{$pfb_script_post}")) {
    pfb_logger("\nExecuting post-script: {$list['script_pre']}\n", 1);
    $file_org_esc = escapeshellarg("{$file_dwn}.orig");
    exec("{$pfb_script_post} {$file_org_esc} {$list['vtype']} {$elog}");
}
In each DNSBL Group, within the "Advanced Tuneables" section, are two list boxes with the same set of labels as is seen on the IP Group page. However, AFAICT, the code that would run the DNSBL's pre- and post- scripts is absent!
(Q1) Do you know if this is a known issue? A search for "advanced tuneables" resulted in one match: #12882. This is not the issue I am looking for. I considered utilizing this functionality to run pfb_dnsbl_prune.py until I realized it's per group list. This leads to Question 2:
I had never heard of or stumbled upon it. The pruning step should be integrated into the pipeline at a deeper level. It should be easier once a huge chunk of the DNSBL Unbound mode code is removed.
Question #2: What are your thoughts on a feature that allows users to optionally run a Pre- and/or Post- entire-DNSBL-process script?
I am thinking of a long-term solution to run scripts like pfb_dnsbl_prune.py that act on the collective input/output of the DNSBL process. A solution that does not involve injecting changes to the core pfBlockerNG code to execute.
It could be interesting and it would allow for a ton of flexibility, but that should likely come after removing the Unbound mode. Letting power users tap into your stuff to make it more powerful is almost always a good idea. We can think of it kinda like its own plugin system.
With Python, the sky is the limit.
-
@andrebrait Hello again. It has been a long minute. I updated my pfSense machines to a75ebaa5febdb1718cf05646f5abd51cb0f0ae47 of pfblockerng-next and noticed, on one of the two machines, regular entries in py_error.log of the form:
2023-11-11 10:08:22,845|ERROR| [pfBlockerNG]: Exception caught: Traceback (most recent call last):
  File "pfb_unbound.py", line 82, in _log
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "pfb_unbound.py", line 1835, in operate
    entry = {'q_name': q_name, 'b_type': b_type, 'p_type': p_type, 'key': key, 'log': log_type, 'feed': feed, 'group': group, 'b_eval': b_eval}
                                                           ^^^^^^
UnboundLocalError: cannot access local variable 'p_type' where it is not associated with a value
My hotfix patch, which suppresses the endless repetition of errors, is:
diff --git a/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py b/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py
index e7268755c3..bc247f0942 100644
--- a/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py
+++ b/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py
@@ -1797,11 +1797,12 @@ def operate(id, event, qstate, qdata):
     if block_result and not cached_block:
+        p_type = 'Python'
+
         # Determine if domain is in HSTS database (Null blocking)
         if hstsDB:
             debug('[{}]: checking HSTS for: {}', q_name_original, block_name)
-            p_type = 'Python'

             # Determine if TLD is in HSTS database
             if tld in pfb['hsts_tlds']:
IDK what is tickling this code. The fix was easy enough, so I've not bothered looking further. It's interesting that only one of the two pfSense machines is reporting this. They do use different sets of DNSBL lists and other lists. The one reporting the errors is the primary firewall and has all the devices behind it; the second, which is not reporting errors, is in a kind of "lab network" and doesn't see much traffic. Maybe some DNS queries are tickling this.
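For context, the error is the classic Python gotcha where a variable is assigned only inside a conditional branch. A minimal illustration, with hypothetical names loosely mirroring the code above:

```python
def classify(hsts_enabled):
    if hsts_enabled:
        p_type = 'Python'     # assigned only on this branch
    # when hsts_enabled is False, p_type was never bound, so the
    # reference below raises UnboundLocalError, like the traceback
    return {'p_type': p_type}

def classify_fixed(hsts_enabled):
    p_type = 'Python'         # hoisted above the conditional, as in the patch
    if hsts_enabled:
        pass                  # branch-specific HSTS work would go here
    return {'p_type': p_type}

try:
    classify(False)
except UnboundLocalError as exc:
    print('caught:', exc)

print(classify_fixed(False))  # {'p_type': 'Python'}
```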
-
@totowentsouth ah, thanks. Yeah, silly oversight. I'm gonna commit the fix at some point today.
Thanks for testing it!
I'm now working on zero-downtime reloads for the Python mode, btw, so let me know if you'd like to also test that in the coming weeks :)
-
@andrebrait Excellent!
@andrebrait said in https://oisd.nl:
I'm now working on zero-downtime reloads for the Python mode, btw, so let me know if you'd like to also test that in the coming weeks :)
Yes, absolutely. Zero-downtime reloads sound intriguing.
At some point in the near-ish future, I'll update my routers to pfSense 23.09. Will these upcoming changes work in the next/latest version of pfSense?
Additional remarks:
- Is "Wildcard blocking (TLD)" relevant anymore? FWIW, I have abandoned use of it going forward.
- I noticed the DNSBL update procedure now executes a step described as "Removing redundant DNSBL entries". My unsolicited feedback: it is good to see pruning of redundant entries, and unfortunate that the process is as slow as it is, lengthening the DNSBL update process further. I am biased toward speedy updates now because I have an operational C implementation of the Python script that deduplicates the collective set. I realize the tools used for removing redundant entries are the limiting factor; it is what it is.
-
@totowentsouth said in https://oisd.nl:
At some point in the near-ish future, I'll update my routers to pfSense 23.09. Will these upcoming changes work in the next/latest version of pfSense?
They should, yes.
Additional remarks:
- Is "Wildcard blocking (TLD)" relevant anymore? FWIW, I have abandoned use of it going forward.
Only for traditional hosts-style lists
- I noticed the DNSBL update procedure now executes a step described as "Removing redundant DNSBL entries". My unsolicited feedback: it is good to see pruning of redundant entries, and unfortunate that the process is as slow as it is, lengthening the DNSBL update process further. I am biased toward speedy updates now because I have an operational C implementation of the Python script that deduplicates the collective set. I realize the tools used for removing redundant entries are the limiting factor; it is what it is.
Ah, yes. It's a somewhat slow step, indeed. The silver lining is that the DNS resolution is still up while that runs, and while it does take extra processing, it's not supposed to be hitting too hard on the performance of the firewall while it's running.
But it's done this way right now because 1. adding a custom Python script or some custom native binary (compiled or otherwise) would add more complexity to the package and 2. without zero-downtime reloads, de-duplicating in the script initialization would prolong the downtime.
Since I'll be implementing zero-downtime reloads, we can do everything inside the Python initialization/reload code and not have to worry about downtime at all. The only time it would go down is on a pfBlockerNG version upgrade or when manually restarting Unbound.
Since zero-downtime reloads are essential for my use-case anyway, I'd already have to implement it, so I'm killing two birds with one stone. And it's arguably easy to do anyway (I can probably do it the next time I have more than a couple free hours, next week or so).