https://oisd.nl

totowentsouth

As for the issue, could you check out this branch and see if it happens there?

https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-next

DNS queries are functional at 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c and TLD wildcard appears to be working too
Thanks!

totowentsouth

@andrebrait said in https://oisd.nl:

It's a lot easier to just parse everything as-is inside the Python code and handle the possibility of duplicates there than to try and remove them from the .whitelist files before that.

I put together a solution to optionally prune duplicates from all DNSBL files after pfBlockerNG does its job in sanitizing. In the end, I do see that the expense of pruning the duplicates is great, especially with a handful of large block lists on a low resource machine. I'm sure aggregate lists like OISD and HaGeZi already prune duplicates.

Well, I wonder if you would be interested in sharing your thoughts and opinions on such a solution. I pushed a commit to https://github.com/babilon/FreeBSD-ports/tree/pfblockerng_devel_prune_duplicates. It is based off of pfblockerng-next.
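
For readers who have not opened the commit, the general idea of pruning duplicates across the whole set of lists can be sketched as below. This is not the code from pfb_dnsbl_prune.py. It is a minimal illustration that assumes each list has already been reduced to plain domain names, and the function name `prune_across_lists` is made up for the example.

```python
def prune_across_lists(lists):
    """Keep each domain only in the first list that contains it.

    `lists` maps a list name to an iterable of domain names, in priority
    order (dicts preserve insertion order). Returns a mapping from list
    name to the domains that were not already seen in an earlier list.
    """
    seen = set()
    pruned = {}
    for name, domains in lists.items():
        unique = []
        for domain in domains:
            # Normalize so "B.com." and "b.com" count as the same entry.
            domain = domain.strip().lower().rstrip(".")
            if domain and domain not in seen:
                seen.add(domain)
                unique.append(domain)
        pruned[name] = unique
    return pruned
```

The length of each pruned list is then the number of unique domains that list contributes beyond the ones processed before it.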

@andrebrait said in https://oisd.nl:

@totowentsouth I'll respond to the other questions properly later :)

Sounds great. I found a link to https://docs.netgate.com/pfsense/en/latest/development/pull-request.html which may answer some of my questions.

andrebrait

@totowentsouth there is already a somewhat unrefined deduplication and pruning process that happens in the shell script that processes the lists.

It's somewhat ok and it serves the exact function you describe in your commit message, allowing you to assess how many unique domains a new list brings you.

But the current one is just a simple string matcher and/or regular expression matcher and it's fragile/not the most efficient minimizer.

Your commit actually gave me a few ideas on how to achieve what we need, but it needs some modifications to account for the different types of matching.

One thing I need to change (that's in the existing code) is that the PHP code tries to differentiate between the Python and Unbound paths too early. It does that because it validates the configuration with Unbound for each list, but I honestly think we can split those things into separate steps (and that we can trust our own process for identifying valid domain names, etc., better).

I need to sit down and work on it, but namely those would be:

Download all lists as-is
Parse each of them, ignore comments, identify the format and save those to a set of separate files for each list:

Blacklist exact-match domain names
Blacklist wildcard (a.k.a. domain and all subdomains) (only for AdBlock lists or Hosts lists for public TLDs)
Blacklist regex-like entries (only for AdBlock lists). These should likely not end up as real regexes, since those lists don't actually contain real regexes (not the way we use them), only star characters, each of which can be reduced to a "match any" rule, or to a simple regular expression, for that specific piece of the domain name. (Not applicable to Unbound mode)
Same as above but for whitelists

For each of those, build a tree of domain names for each type of rule, identifying duplicates and reducing the tree on each additional list processed.

From there, we have two possible approaches:

(Required for Unbound, faster but more memory intensive for Python) generate a minimal set of rules and create the corresponding final files for the mode
(Only Python) keep everything as a tree and handle the rules that way, so we don't have different types of matching inside the Python code and don't need to, for example, iterate over all regular expressions (other than the user-defined ones, for which we have no choice). We can test DFS and BFS to find out which is more efficient for domain resolution (there are surely existing papers on that out there). Note that this will be slower for simple matches, and can be slower for wildcards too, but it can be more memory efficient as it doesn't require a separate dictionary for each type of rule.

Only after this we should take the time to put the lists in their specific formats (for Unbound or Python mode) so they can be parsed.

But all this is such a big breakage relative to how we do things that it's going to take more than the time I have to figure it out properly.
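
To make the tree idea above a bit more concrete, here is a minimal sketch of a domain tree keyed by reversed labels that rejects duplicates and collapses entries covered by a wildcard. It is only an illustration under simple assumptions (exact and wildcard rules only, no regex-like entries, no whitelists). The names `Node`, `insert`, `is_blocked` and `rules` are hypothetical and do not exist in pfBlockerNG.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One label in the reversed-domain tree (com -> example -> ads)."""
    children: dict = field(default_factory=dict)
    exact: bool = False     # this exact name is blocked
    wildcard: bool = False  # this name and all of its subdomains are blocked


def insert(root, domain, wildcard=False):
    """Add a rule; return False if an existing rule already covers it."""
    node = root
    for label in reversed(domain.lower().strip(".").split(".")):
        if node.wildcard:
            return False  # an ancestor wildcard already covers this name
        node = node.children.setdefault(label, Node())
    if node.wildcard or (node.exact and not wildcard):
        return False  # duplicate of an existing rule
    if wildcard:
        node.wildcard = True
        node.exact = False
        node.children.clear()  # drop rules made redundant by the wildcard
    else:
        node.exact = True
    return True


def is_blocked(root, name):
    """Walk the tree from the TLD down; stop early on a wildcard."""
    node = root
    for label in reversed(name.lower().strip(".").split(".")):
        node = node.children.get(label)
        if node is None:
            return False
        if node.wildcard:
            return True
    return node.exact


def rules(node, suffix=()):
    """Yield the reduced (domain, is_wildcard) rule set held in the tree."""
    if node.exact or node.wildcard:
        yield ".".join(suffix), node.wildcard
    for label, child in node.children.items():
        yield from rules(child, (label,) + suffix)
```

Feeding every list through `insert` gives the duplicate count for free (each `False` return is a redundant entry), and `rules` produces the flat set that the first approach would write out to the final files.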

totowentsouth

@andrebrait My assessment after initial deployment was based on a grave coding error that I introduced during deployment to my pfSense box. After correcting this error, the numbers are reasonable. My updated initial commit of changes reflects the correct numbers (I did a force push for that oopsie). I say this in case the commit you pulled and read had the gnarly numbers. The memory used by the current script peaks at around the same as when running pfBlockerNG in Unbound (non-Python) mode with the same set of lists. I've since tuned the solution, added command line argument parsing and regex deduplication support.

@andrebrait said in https://oisd.nl:

It's somewhat ok and it serves the exact function you describe in your commit message, allowing you to assess how many unique domains a new list brings you.

But the current one is just a simple string matcher and/or regular expression matcher and it's fragile/not the most efficient minimizer.

Yes. The current deduplication does an admirable job at deduplication despite its limitations. With the introduction of ABP style matching, the limitations of the existing deduplication are more bothersome to me albeit NOT problematic for pfBlockerNG.

Grep utopia, to me, is to find ONE match in a single list. Finding the same/similar match multiple times in three lists leaves me wondering if DNSBL is broken or if I'm using a suboptimal set of lists. I realize my use case of aggregate lists plus additional is contributing to my woes. Aggregate lists are wonderful for a wide spectrum of users which naturally have varying requirements. Some of those requirements do not apply to me and so I supplement those aggregate lists.

Anyway...

FWIW #1, I have been running 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c for a week or so (along with pfb_dnsbl_prune.py as a final dedup process) on both of my pfSense machines and have not observed issues. That I haven't seen an issue does not carry a lot of weight since I've not had time to poke and observe output.

FWIW #2, I have just updated both pfSense boxes to 3bea1f276e7fd557e285c999cde11202690ea81f. All is well.
I am very excited for the official release of this regex/ABP style matching. I will continue to run the 'next' versions until its release.

FWIW #3, I pushed the pfb_dnsbl_prune.py script to a separate repo https://github.com/babilon/dedup-domains/tree/main. I have also synchronized the script to my fork of FreeBSD to then easily make a system patch. I intend the license to match the pfBlockerNG's license in case anything of it may continue to be inspirational or useful.

Question #1: In each IP Group, within the "Advanced Tuneables" section, are two list boxes. One is labeled "Pre-process Script" and the second is labeled "Post-process Script". The code that runs the selected script appears to be in pfblockerng.inc:

// IP v4/6 Advanced Tunable - (Post Script processing)
if ($pfb_script_post && file_exists("{$pfb_script_post}")) {
        pfb_logger("\nExecuting post-script: {$list['script_pre']}\n", 1);
        $file_org_esc = escapeshellarg("{$file_dwn}.orig");
        exec("{$pfb_script_post} {$file_org_esc} {$list['vtype']} {$elog}");
}

In each DNSBL Group, within the "Advanced Tuneables" section, are two list boxes with the same set of labels as is seen on the IP Group page. However, AFAICT, the code that would run the DNSBL's pre- and post- scripts is absent!

(Q1) Do you know if this is a known issue? A search for "advanced tuneables" resulted in one match: #12882. This is not the issue I am looking for. I considered utilizing this functionality to run pfb_dnsbl_prune.py until I realized it's per group list. This leads to Question 2:

Question #2: What are your thoughts on a feature that allows users to optionally run a Pre- and/or Post- entire-DNSBL-process script?
I am thinking of a long-term solution for running scripts like pfb_dnsbl_prune.py that act on the collective input/output of the DNSBL process; a solution that does not involve injecting changes into the core pfBlockerNG code in order to execute them.

andrebrait

@totowentsouth said in https://oisd.nl:

Yes. The current deduplication does an admirable job at deduplication despite its limitations. With the introduction of ABP style matching, the limitations of the existing deduplication are more bothersome to me albeit NOT problematic for pfBlockerNG.

Honestly, I ended up finding out it's not that bad for Python mode. It's not optimal, but it doesn't have to be.

I'd like to remove Unbound mode first and then try to improve it. DNSBL code is too full of branches right now, because of both modes being present.

FWIW #1, I have been running 57d776e635e08c7925c46a83b6b8ebbdd9f64d4c for a week or so (along with pfb_dnsbl_prune.py as a final dedup process) on both of my pfSense machines and have not observed issues. That I haven't seen an issue does not carry a lot of weight since I've not had time to poke and observe output.

FWIW #2, I have just updated both pfSense boxes to 3bea1f276e7fd557e285c999cde11202690ea81f. All is well.
I am very excited for the official release of this regex/ABP style matching. I will continue to run the 'next' versions until its release.

There's a small bug in the TLD Analysis step, I think, but I'll be fixing it tomorrow. Thanks a lot for testing it!

FWIW #3, I pushed the pfb_dnsbl_prune.py script to a separate repo https://github.com/babilon/dedup-domains/tree/main. I have also synchronized the script to my fork of FreeBSD to then easily make a system patch. I intend the license to match the pfBlockerNG's license in case anything of it may continue to be inspirational or useful.

Thanks. I'll check it out when I have time.

Question #1: In each IP Group, within the "Advanced Tuneables" section, are two list boxes. One is labeled "Pre-process Script" and the second is labeled "Post-process Script". The code that runs the selected script appears to be in pfblockerng.inc:
// IP v4/6 Advanced Tunable - (Post Script processing)
if ($pfb_script_post && file_exists("{$pfb_script_post}")) {
        pfb_logger("\nExecuting post-script: {$list['script_pre']}\n", 1);
        $file_org_esc = escapeshellarg("{$file_dwn}.orig");
        exec("{$pfb_script_post} {$file_org_esc} {$list['vtype']} {$elog}");
}
In each DNSBL Group, within the "Advanced Tuneables" section, are two list boxes with the same set of labels as is seen on the IP Group page. However, AFAICT, the code that would run the DNSBL's pre- and post- scripts is absent!

(Q1) Do you know if this is a known issue? A search for "advanced tuneables" resulted in one match: #12882. This is not the issue I am looking for. I considered utilizing this functionality to run pfb_dnsbl_prune.py until I realized it's per group list. This leads to Question 2:

I had never heard of or stumbled upon it. The pruning step should be integrated into the pipeline at a deeper level. It should be easier once a huge chunk of the DNSBL Unbound mode code is removed.

Question #2: What are your thoughts on a feature that allows users to optionally run a Pre- and/or Post- entire-DNSBL-process script?
I am thinking of a long-term solution for running scripts like pfb_dnsbl_prune.py that act on the collective input/output of the DNSBL process; a solution that does not involve injecting changes into the core pfBlockerNG code in order to execute them.

It could be interesting and it would allow for a ton of flexibility, but that should likely come after removing the Unbound mode. Letting power users tap into your stuff to make it more powerful is almost always a good idea. We can think of it kinda like its own plugin system.

With Python, the sky is the limit.

totowentsouth

@andrebrait Hello again. It has been a long minute. I updated my pfSense machines to a75ebaa5febdb1718cf05646f5abd51cb0f0ae47 of pfblockerng-next and noticed that one of the two is logging recurring entries in py_error.log of the form:

2023-11-11 10:08:22,845|ERROR| [pfBlockerNG]: Exception caught: 
	Traceback (most recent call last):
	  File "pfb_unbound.py", line 82, in _log
	    return func(*args, **kwargs)
	           ^^^^^^^^^^^^^^^^^^^^^
	  File "pfb_unbound.py", line 1835, in operate
	    entry = {'q_name': q_name, 'b_type': b_type, 'p_type': p_type, 'key': key, 'log': log_type, 'feed': feed, 'group': group, 'b_eval': b_eval}
	                                                           ^^^^^^
	UnboundLocalError: cannot access local variable 'p_type' where it is not associated with a value

My hotfix patch, which suppresses the endless repetition of errors, is:

diff --git a/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py b/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py
index e7268755c3..bc247f0942 100644
--- a/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py
+++ b/net/pfSense-pkg-pfBlockerNG-devel/files/var/unbound/pfb_unbound.py
@@ -1797,11 +1797,12 @@ def operate(id, event, qstate, qdata):
         
         if block_result and not cached_block:
             
+            p_type = 'Python'
+
             # Determine if domain is in HSTS database (Null blocking)
             if hstsDB:
                 debug('[{}]: checking HSTS for: {}', q_name_original, block_name)
 
-                p_type = 'Python'
 
                 # Determine if TLD is in HSTS database
                 if tld in pfb['hsts_tlds']:

IDK what is tickling this code. The fix was easy enough, so I've not bothered looking further. It's interesting that only one of the two pfSense machines is reporting this. They use different sets of DNSBL lists and other lists. The one reporting the errors is the primary firewall and has all the devices behind it. The second one, which is not reporting errors, is in a kind of "lab network" and doesn't see much traffic. Maybe some DNS queries are tickling this.
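
Judging only from the diff above, the assignment sat inside the `if hstsDB:` branch, so a blocked query handled while `hstsDB` is falsy would reach the later `entry = {...}` line with `p_type` never assigned. Whether that explains the difference between the two machines is a guess. The mechanism itself can be reproduced in a few lines (the function names here are hypothetical, not taken from pfb_unbound.py):

```python
def build_entry(hsts_db_loaded):
    """Mirrors the buggy shape: p_type is assigned on one branch only."""
    if hsts_db_loaded:
        p_type = "Python"
    # Raises UnboundLocalError when the branch above was skipped.
    return {"p_type": p_type}


def build_entry_fixed(hsts_db_loaded):
    """Mirrors the hotfix: assign before the conditional."""
    p_type = "Python"
    if hsts_db_loaded:
        pass  # HSTS checks would go here
    return {"p_type": p_type}
```

Calling `build_entry(False)` raises `UnboundLocalError`, while `build_entry_fixed(False)` returns normally.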

andrebrait

@totowentsouth ah, thanks. Yeah, silly oversight. I'm gonna commit the fix at some point today.

Thanks for testing it!

I'm now working on zero-downtime reloads for the Python mode, btw, so let me know if you'd like to also test that in the coming weeks :)

totowentsouth

@andrebrait Excellent!

@andrebrait said in https://oisd.nl:

I'm now working on zero-downtime reloads for the Python mode, btw, so let me know if you'd like to also test that in the coming weeks :)

Yes, absolutely. Zero-downtime reloads sounds intriguing.

At some point in the near-ish future, I'll update my routers to pfSense 23.09. Will these upcoming changes work in the next/latest version of pfSense?

Additional remarks:

Is "Wildcard blocking (TLD)" relevant anymore? FWIW, I have abandoned use of it going forward.
I noticed the DNSBL update procedure now executes a step described as "Removing redundant DNSBL entries". My unsolicited feedback is that it is good to see pruning of redundant entries, and unfortunate that the process is as slow as it is, lengthening the DNSBL update process further. I am biased toward speedy updates because I now have an operational C implementation of the Python script to deduplicate the collective set. I realize the tools used for removing redundant entries are the limiting factor; it is what it is.

andrebrait

@totowentsouth said in https://oisd.nl:

At some point in the near-ish future, I'll update my routers to pfSense 23.09. Will these upcoming changes work in the next/latest version of pfSense?

They should, yes.

Additional remarks:

Is "Wildcard blocking (TLD)" relevant anymore? FWIW, I have abandoned use of it going forward.

Only for traditional hosts-style lists

I noticed the DNSBL update procedure now executes a step described as "Removing redundant DNSBL entries". My unsolicited feedback is that it is good to see pruning of redundant entries, and unfortunate that the process is as slow as it is, lengthening the DNSBL update process further. I am biased toward speedy updates because I now have an operational C implementation of the Python script to deduplicate the collective set. I realize the tools used for removing redundant entries are the limiting factor; it is what it is.

Ah, yes. It's a somewhat slow step, indeed. The silver lining is that the DNS resolution is still up while that runs, and while it does take extra processing, it's not supposed to be hitting too hard on the performance of the firewall while it's running.

But it's done this way right now because 1. adding a custom Python script or some custom native binary (compiled or otherwise) would add more complexity to the package and 2. without zero-downtime reloads, de-duplicating in the script initialization would prolong the downtime.

Since I'll be implementing zero-downtime reloads, we can do everything inside the Python initialization/reload code and not have to worry about downtime at all. The only time it would go down is on a pfBlockerNG version upgrade or when manually restarting Unbound.

Since zero-downtime reloads are essential for my use-case anyway, I'd already have to implement it, so I'm killing two birds with one stone. And it's arguably easy to do anyway (I can probably do it the next time I have more than a couple free hours, next week or so).
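
As a rough picture of what a zero-downtime reload could look like, the usual pattern is to build the new data set on the side while the old one keeps answering queries, then swap a single reference. The sketch below is a generic illustration of that pattern, not the implementation planned for pfb_unbound.py. The `Blocklist` class and the `loader` callback are assumptions made for the example, and it relies on CPython rebinding an attribute in a single step, so a reader sees either the old set or the new one.

```python
import threading


class Blocklist:
    def __init__(self, domains):
        self._active = frozenset(domains)
        # Serializes concurrent reloads only; lookups never take the lock.
        self._reload_lock = threading.Lock()

    def is_blocked(self, name):
        # Reads whichever set is currently published.
        return name in self._active

    def reload(self, loader):
        """Build a fresh set with `loader`, then publish it in one step."""
        with self._reload_lock:
            fresh = frozenset(loader())  # slow part; old set keeps serving
            self._active = fresh         # swap the reference
```

Because the new data lands in a set, exact duplicates disappear as a side effect of loading, which is one way de-duplication could move into the reload step without adding downtime.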

cukal

I stumbled upon this thread while looking for documentation regarding the pre/post script. I'm on 2.6.0.
Am I correct in assuming scripts defined in the "Advanced Tuneables" section do not get executed?

Is there a way to enable this functionality?

Gr,
C


totowentsouth

@cukal The code to execute the pre/post scripts for "Advanced Tuneables" is in /usr/local/pkg/pfblockerng/pfblockerng.inc and AFAICT exists ONLY for IPv4 / IPv6 lists. AFAICT, the expected "exec" call for the pre/post DNSBL scripts is absent. By intention or mistake, IDK. I pondered and puzzled over the same. The verbiage on the web page suggests that this feature is intended to be operational.

@cukal said in https://oisd.nl:

Is there a way to enable this functionality?

With a bit of coding, yes. As things stand now, AFAICT, no. Having a pre-script for DNSBL would be helpful to clean up input formats that are currently foreign to pfblockerng.

The "exec" call for the pre-script in pfblockerng.inc is in this block of code:

// IPv4/6 Advanced Tunable - (Pre Script processing)
if ($pfb_script_pre && file_exists("{$pfb_script_pre}")) {
        pfb_logger("\nExecuting pre-script: {$list['script_pre']}\n", 1);
        $file_dwn_esc = escapeshellarg("{$file_dwn}.orig");
        exec("{$pfb_script_pre} {$file_dwn_esc} {$list['vtype']} {$elog}");
}

And the "exec" call for the post-script in pfblockerng.inc is in this block of code:

// IP v4/6 Advanced Tunable - (Post Script processing)
if ($pfb_script_post && file_exists("{$pfb_script_post}")) {
	pfb_logger("\nExecuting post-script: {$list['script_pre']}\n", 1);
	$file_org_esc = escapeshellarg("{$file_dwn}.orig");
	exec("{$pfb_script_post} {$file_org_esc} {$list['vtype']} {$elog}");
}
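
For anyone who wants to experiment, the two calls above pass the script a file path and the list's vtype as positional arguments (what `$elog` expands to is not shown here). A hook script along the following lines could consume them. This is only a sketch: the function names are made up, and it assumes the file is plain text with one entry per line.

```python
import sys


def dedup_lines(lines):
    """Return the non-empty entries in order, keeping first occurrences."""
    seen = set()
    kept = []
    for line in lines:
        entry = line.strip()
        if entry and entry not in seen:
            seen.add(entry)
            kept.append(entry)
    return kept


def main(argv):
    """argv[1] is the list file, argv[2] the vtype passed by the exec call."""
    if len(argv) < 3:
        print("usage: hook.py <list-file> <vtype>", file=sys.stderr)
        return 2
    path, vtype = argv[1], argv[2]
    with open(path, encoding="utf-8", errors="replace") as fh:
        kept = dedup_lines(fh)
    with open(path, "w", encoding="utf-8") as fh:
        fh.writelines(entry + "\n" for entry in kept)
    print(f"{vtype}: kept {len(kept)} unique entries in {path}")
    return 0

# A real script would end with: sys.exit(main(sys.argv))
```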

totowentsouth

@andrebrait said in https://oisd.nl:

Only for traditional hosts-style lists

I realized after reading your reply that I had asked and you answered this question already.
I can see its use in those cases.

@andrebrait said in https://oisd.nl:

But it's done this way right now because 1. adding a custom Python script or some custom native binary (compiled or otherwise) would add more complexity to the package and 2. without zero-downtime reloads, de-duplicating in the script initialization would prolong the downtime.

Makes sense.

@andrebrait said in https://oisd.nl:

Since zero-downtime reloads are essential for my use-case anyway, I'd already have to implement it, so I'm killing two birds with one stone. And it's arguably easy to do anyway (I can probably do it the next time I have more than a couple free hours, next week or so).

I'll give it a whirl when it is ready.

andrebrait

@totowentsouth I integrated your fix.

Could you pull the latest pfblockerng-adblock branch (or pfblockerng-next) and check that it's been fixed?

totowentsouth

@andrebrait Patch looks good and no lines in py_error.log while running pfblockerng-next at 69f3d0455363411179a763a3c39e03b0b027b4a0.

Thanks!

andrebrait

@totowentsouth New code in the branch https://github.com/andrebrait/FreeBSD-ports/tree/pfblockerng-adblock

If you're willing to test it, it would be more than appreciated!

totowentsouth

@andrebrait Yep. I have installed this on my secondary pfSense box. So far, 1 wrinkle and zero problems to report. I will exercise it over the coming days.

FWIW, I created a patch from 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..4683d6825a55667677803bda8444d14eb30ddf71
I removed hunk #38 for pfblockerng.inc from this patch due to a conflict. AFAICT, the change is already in 3.2.0_7 ?? -- afterwards, the patch applied clean so all is well.

Hunk #38 of net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc:

@@ -10732,7 +10962,7 @@ function pfblockerng_php_pre_deinstall_command() {
 		if (config_path_enabled('system','earlyshellcmd')) {
 			$a_earlyshellcmd = config_get_path('system/earlyshellcmd', '');
 			if (preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd)) {
-				config_set_path('system','earlyshellcmd',
+				config_set_path('system/earlyshellcmd',
 								preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd, PREG_GREP_INVERT));
 			}
 		}

The wrinkle I've encountered is with the switch to use fontawesome 6 (a999ce5a96e22ab54317e7079b1871e1661f7218). I am wrestling with fonts on my machine and have some improvements. Will pfSense deliver these fonts if the host/browser does not have them installed?

andrebrait

@totowentsouth said in https://oisd.nl:

@andrebrait Yep. I have installed this on my secondary pfSense box. So far, 1 wrinkle and zero problems to report. I will exercise it over the coming days.

FWIW, I created a patch from 6b9d2aa2b78193bd8ce83d0c0e0793f157d3ed77..4683d6825a55667677803bda8444d14eb30ddf71
I removed hunk #38 for pfblockerng.inc from this patch due to a conflict. AFAICT, the change is already in 3.2.0_7 ?? -- afterwards, the patch applied clean so all is well.

Hunk #38 of net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc:
@@ -10732,7 +10962,7 @@ function pfblockerng_php_pre_deinstall_command() {
 		if (config_path_enabled('system','earlyshellcmd')) {
 			$a_earlyshellcmd = config_get_path('system/earlyshellcmd', '');
 			if (preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd)) {
-				config_set_path('system','earlyshellcmd',
+				config_set_path('system/earlyshellcmd',
 								preg_grep("/pfblockerng.sh aliastables/", $a_earlyshellcmd, PREG_GREP_INVERT));
 			}
 		}

Zero idea about it. Perhaps a recent change from the upstream devel branch that merged cleanly for me but not for you for some reason?

The wrinkle I've encountered is with the switch to use fontawesome 6 (a999ce5a96e22ab54317e7079b1871e1661f7218). I am wrestling with fonts on my machine and have some improvements. Will pfSense deliver these fonts if the host/browser does not have them installed?

Yes, same for me. I have no idea what their plans are.

I'll update the branch later and create a version that's based off of the upstream main branch to see if I observe any differences regarding that.

totowentsouth

@andrebrait I noticed pfblockerng is blocking many innocuous domains after adding the EasyPrivacy list. Examples of domains I would expect to resolve include starbucks.com, sendcloud.net, substack.com, and substackcdn.com.

https://easylist.to/easylist/easyprivacy.txt

In EasyPrivacy list, the entries for these domains are:

.sendcloud.net/track/
.starbucks.com/a/
.substack.com/o/$image
.substackcdn.com/open?$image

A peek at what the final lists contain for two of the domains that are blocked:

/var/db/pfblockerng: grep -nR sendcloud dnsbl/
dnsbl/hagezi_pro_plus.txt:43088:,sctrack.sendcloud.net,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/hagezi_pro_plus.txt:49793:,track.sendcloud.org,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/hagezi_pro_plus.txt:51974:,tracking.sendcloud.sc,,2,hagezi_pro_plus,DNSBL_hagezi_pro_plus,1
dnsbl/EasyList_Privacy.txt:4012:,sendcloud.net,,0,EasyList_Privacy,DNSBL_EasyList,0

/var/db/pfblockerng: grep -nR substack\.com dnsbl/
dnsbl/OISD_big.txt:480:,0x00000000000.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:66539:,etharticles.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:146726:,publicationgroup.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:178172:,teamproject.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:184591:,tradestrategy.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:188726:,uniproject.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:202727:,web3projects.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/OISD_big.txt:203007:,webpublic.substack.com,,2,OISD_big,DNSBL_Collections,1
dnsbl/StevenBlack_hosts.txt:7301:,email.mg1.substack.com,,2,StevenBlack_hosts,DNSBL_Collections,0
dnsbl/EasyList_Privacy.txt:4337:,substack.com,,0,EasyList_Privacy,DNSBL_EasyList,0

EasyPrivacy seems to be one of the few that lists entries in this manner and with domains that one would expect to resolve.

BTW, this issue appears to exist in the pfblockerng-next branch as well. An entry in EasyList_France that follows the same pattern as those above is in the final output. This box is running pfblockerng-next code:

[23.09.1-RELEASE][admin@pfSense.localdomain]/var/db/pfblockerng: grep -nR 468\.60\.gif dnsbl
dnsbl/C_EasyList_France.txt:228:,468.60.gif,,0,C_EasyList_France,DNSBL_EasyList,0

My pihole loads the same EasyList_France, and although it shows ".468.60.gif" in the results for "Search Adlists", when I run dig @<pihole.ip> 468.60.gif, pihole reports the query as blocked externally, i.e. by pfblockerng. After removing EasyList_France from pfblockerng, dig @<pihole.ip> 468.60.gif returns an answer, so it was indeed pfblockerng blocking the lookup even though pihole listed it for some reason.

Let me know if you need more information. I have my main computer behind the latest pfblockerng code and am loading it with lists to give it a thorough workout (at least as far as the adblocking goes).

totowentsouth

@andrebrait I made this change to my pfblockerng install:

Edited to include exclusion of entries beginning with a hyphen.

diff --git a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
index c332706eba77..eca18a486157 100644
--- a/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
+++ b/net/pfSense-pkg-pfBlockerNG-devel/files/usr/local/pkg/pfblockerng/pfblockerng.inc
@@ -8730,6 +8730,11 @@ function sync_package_pfblockerng($cron='') {
 												if (!$liteparser) {
 
 													$lite = FALSE;
+													# entries that start with a period are probably ABP style.
+													$beginswith = substr($line, 0, 1);
+													if ($beginswith == '.' || $beginswith == '-') {
+														continue;
+													}
 													if (strpos($line, '.') !== FALSE &&
 													    ctype_alnum(str_replace('.', '', $line))) {
 														$lite = TRUE;

There are entries such as _c.gif, _adobe_analytics.js, _stat.php which, viewed as domains, have an unregistered TLD as of today AFAICT.

totowentsouth

Inside the next if block, leading and trailing periods are pruned from the line:

// Remove leading/trailing dots
$line = trim(trim($line), '.');
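
Putting the two PHP snippets together shows how an entry like ".468.60.gif" can end up in the final output as "468.60.gif". Below is a rough Python rendering of the check and the trim. The function names are hypothetical, and Python's `str.strip()` and `str.isalnum()` are only approximate stand-ins for PHP's `trim()` and `ctype_alnum()`.

```python
def passes_lite_check(line):
    """Approximates: strpos($line, '.') !== FALSE &&
    ctype_alnum(str_replace('.', '', $line))"""
    without_dots = line.replace(".", "")
    return "." in line and without_dots.isascii() and without_dots.isalnum()


def trim_dots(line):
    """Approximates: trim(trim($line), '.')"""
    return line.strip().strip(".")


def looks_like_abp_fragment(line):
    """The guard added by the patch: skip entries starting with '.' or '-'."""
    return line[:1] in (".", "-")
```

With these, ".468.60.gif" passes the lite check (its characters are all alphanumeric once the dots are removed) and is trimmed to "468.60.gif", whereas the patched guard would skip it before the check runs. An entry such as ".sendcloud.net/track/" fails this particular check because of the slashes, so it must be reaching the output through a different path.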