Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs)
-
@dark-baritone said in Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs):
Given a line like this: https://github.com/pfsense/FreeBSD-ports/blob/devel/net/filterdns/files/filterdns.c#L502 can you give me insight into how I would increase the log level so I can get better debugging output to help me figure out the issue?
Ah, you want the debug level to be higher, like 9 ?
You've found line 502.
Look for the place where the global variable 'debug' is set.
So, find where 'filterdns' is executed by pfSense, and set the '-d' command line option yourself to a higher value.
It's here :
https://github.com/pfsense/pfsense/blob/6bf3e080f56facab1f00e29acd24dff62d5bd707/src/etc/inc/system.inc#L1649
mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-route.pid -i {$interval} -c {$g['varetc_path']}/filterdns-route.hosts -d 1");
You'll know what to do now
-
@Gertjan thank you! Ok so I tried that. I opened system.inc and updated the line to be:
mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-route.pid -i {$interval} -c {$g['varetc_path']}/filterdns-route.hosts -d 9");
and I'm still only getting two lines in my log files. I even rebooted the system but examples below are the only lines I'm seeing in /var/log/resolver.log:
filterdns[88435]: Adding Action: pf table: hosts_application_containers host: myapp.mydomain.com
filterdns[88435]: Adding host myapp.mydomain.com
Even when I know that domain's IP is not being added to the table (because the table is empty).
-
Hi !
What is the output in the console after executing this command?
ps -ax | grep filterdns
If you see only two entries in the log and nothing else, it is possible that the program crashes at startup.
-
@Konstanti said in Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs):
ps-ax | grep filterdns
lol - that's a syntax error.
You mean:
[24.11-RELEASE][root@pfSense.bhf.tld]/root: ps -ax | grep filterdns
70495 - Is 0:14.75 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
98260 - Is 0:06.38 /usr/local/sbin/filterdns -p /var/run/filterdns-cpzone1-cpah.pid -i 300 -c /var/etc/filterdns-cpzone1-captiveportal.conf -d 1
I've changed the 'd 1' to 'd 9' myself, and, to be sure, I've rebooted.
[24.11-RELEASE][root@pfSense.bhf.tld]/root: ps -ax | grep filterdns
81666 - Is 0:00.01 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
88517 - Is 0:00.00 /usr/local/sbin/filterdns -p /var/run/filterdns-cpzone1-cpah.pid -i 300 -c /var/etc/filterdns-cpzone1-captiveportal.conf -d 1
Still 'd 1' ?!
Let's fact check :
[24.11-RELEASE][root@pfSense.bhf.tld]/root: grep 'd 9' /etc/inc/system.inc
mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-route.pid -i {$interval} -c {$g['varetc_path']}/filterdns-route.hosts -d 9");
Hummmmm.
Again - search better, more precise :
[24.11-RELEASE][root@pfSense.bhf.tld]/root: grep "-d 9" /etc/inc/*
/etc/inc/system.inc: mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-route.pid -i {$interval} -c {$g['varetc_path']}/filterdns-route.hosts -d 9");
Now - are there any other places where filterdns is started?
[24.11-RELEASE][root@pfSense.brit-hotel-fumel.net]/root: grep "\-d 1" /etc/inc/*
/etc/inc/captiveportal.inc: " -i 300 -c {$cp_filterdns_filename} -d 1");
/etc/inc/filter.inc: mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns.pid -i {$resolve_interval} -c {$g['varetc_path']}/filterdns.conf -d 1");
/etc/inc/ipsec.inc: mwexec_bg("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-ipsec.pid -i {$interval} -c {$g['varetc_path']}/ipsec/filterdns-ipsec.hosts -d 1");
/etc/inc/ipsec.inc: mwexec_bg("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns-ipsec.pid -i {$interval} -c {$g['varetc_path']}/ipsec/filterdns-ipsec.hosts -d 1");
... yes !
4 more places where filterdns is started !
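The grep search above can also be sketched as a small script that lists every debug level passed to filterdns. This is only an illustration: the function name find_debug_levels and the sample lines are made up for the example, and it simply looks for "-d N" on any line that mentions filterdns.

```python
import re
from typing import Iterable, List, Tuple

# Matches a "-d <number>" option preceded by whitespace.
DEBUG_FLAG = re.compile(r"\s-d\s+(\d+)")


def find_debug_levels(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (line_number, debug_level) for lines that mention filterdns.

    Hypothetical helper for illustration; it is not part of pfSense.
    """
    found = []
    for number, line in enumerate(lines, start=1):
        if "filterdns" not in line:
            continue
        match = DEBUG_FLAG.search(line)
        if match:
            found.append((number, int(match.group(1))))
    return found


# Sample input modelled on the grep output in this post.
sample = [
    'mwexec("/usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1");',
    "$unrelated = true;",
    '" -i 300 -c {$cp_filterdns_filename} -d 9");',
]
levels = find_debug_levels(sample)
```

Any line that still reports level 1 is an invocation that was not edited yet.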
If you use neither the captive portal nor IPsec, editing /etc/inc/filter.inc will do the trick.
-
I would have chosen a slightly different path (for testing only).
For example, from the console:
- kill 81666
- /usr/local/sbin/filterdns -f -i 300 -c /var/etc/filterdns.conf -d 9
In this case, I use the -f option to debug the program, without making any changes to the PF code.
Or edit /etc/inc/filter.inc:
mwexec("/usr/local/sbin/filterdns -p {$g['varrun_path']}/filterdns.pid -i {$resolve_interval} -c {$g['varetc_path']}/filterdns.conf -d 9");
-
@Konstanti
Way easier and better, I upvote.
-
@dark-baritone Just to ask what is https://docs.netgate.com/pfsense/en/latest/config/advanced-firewall-nat.html#firewall-maximum-table-entries set to? If using pfBlocker or anything that generates a lot of entries the advice I'd heard long ago was to set it to 2 million and increase as necessary.
-
ps -ax | grep filterdns
@Konstanti returns nothing!
Now, I know that it runs at SOME point because there are SOME logs, but it's very telling that it's not CURRENTLY running.
-
@SteveITS Yeah I have it set extremely high. Like 80,000,000 (I have plenty of free memory). Just because I wanted to make sure that wasn't causing a problem.
-
@Konstanti @Gertjan doing it this way works perfectly for logging.
Ok so for one, when I run it manually, I'm seeing:
filterdns: could not start host thread for test.domain-not-being-added.org
Sent to stderr in the shell.
So that seems to be happening either here or here. Either of those seem to error when it gets to here or here. But I might be wrong about that.
In the log I do see:
filterdns[65219]: [105139] ("filterdns.c":675 check_hostname_create()): Creating a new thread for host test.domain-not-being-added.org
For the domains that are actually adding to the tables, I see log lines like:
filterdns[65219]: [978477] ("filterdns.c":507 host_dns()): found address 123.123.123.123 for host test2.domain-being-added-correctly.org
filterdns[65219]: [978477] ("filterdns.c":434 addr_add()): adding address 123.123.123.123 for test2.domain-being-added-correctly.org
For full disclosure just in case it matters, the table with the domain where the thread is failing is fairly large but doesn't seem prohibitively so and is currently sitting at around 7,500 entries. The alias with the domain where the thread is failing has about 260 domains listed in it. I haven't seen any documentation anywhere on limits to sizes other than "1k per table entry" and "all tables must fit within about half of the max table entries size". I have about 14GB of RAM free which should be plenty.
Thanks again to everyone for the help. If there's something specific you all need out of the debug logs or anything else, I'm happy to provide.
-
@dark-baritone said in Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs):
@Konstanti @Gertjan doing it this way works perfectly for logging.
Ok so for one, when I run it manually, I'm seeing:
filterdns: could not start host thread for test.somedomain.org
Sent to stderr in the shell.
So either here or here both of which seem to error when it gets to here or here? I might be wrong about that.
In the log I do see:
filterdns[65219]: [105139] ("filterdns.c":675 check_hostname_create()): Creating a new thread for host test.domain-not-being-added.org
For the domains that are actually adding to the tables, I see log lines like:
filterdns[65219]: [978477] ("filterdns.c":507 host_dns()): found address 123.123.123.123 for host test2.domain-being-added-correctly.org
filterdns[65219]: [978477] ("filterdns.c":434 addr_add()): adding address 123.123.123.123 for test2.domain-being-added-correctly.org
But the other hosts that aren't showing up only show up as the two lines that I pasted in my previous comment.
For full disclosure just in case it matters, the table where the thread is failing is fairly large but doesn't seem prohibitively so and is currently sitting at around 7,500 entries. The alias containing the domain that is erroring with the thread error has about 260 domains listed in it.
Thanks again to everyone for the help. If there's something specific you all need out of the debug logs or anything else, I'm happy to provide.
Somewhere along the way, when I briefly researched the history of filterdns problems before making my initial post in this thread, I encountered a link that said out-of-the-box FreeBSD has a built-in limit on the number of threads a process can spawn. Perhaps you are hitting that limit?
Here is one old post about increasing the limit: https://serverfault.com/questions/134616/increasing-freebsd-threads.
Here is an old post from 2009 to the FreeBSD mailing list: https://lists.freebsd.org/pipermail/freebsd-threads/2009-April/004554.html.
Maybe you are hitting a limit with a large alias? However, you would expect a more meaningful error message like "... can't create additional threads.." or "... exceeded thread limit ..." or something similar.
I see a lot of Google hits on "Linux thread limits", but very few results that address FreeBSD. I suspect FreeBSD has its own internal limits, both for max threads across the whole system and for max threads launched per process.
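A rough way to reason about such a limit is to compare the number of hostnames handed to filterdns with the per-process thread limit. The sketch below assumes one resolver thread per hostname and assumes the relevant FreeBSD tunable is kern.threads.max_threads_per_proc; that name does not appear in this thread, and the numbers in any usage are illustrative only.

```python
# Assumption: FreeBSD exposes the per-process thread limit under this name.
# On the firewall, `sysctl kern.threads.max_threads_per_proc` would print it.
ASSUMED_SYSCTL = "kern.threads.max_threads_per_proc"


def parse_sysctl_value(output: str) -> int:
    """Parse sysctl output given as 'name: value' or as a bare value."""
    return int(output.strip().rsplit(":", 1)[-1])


def threads_short(hostname_count: int, limit: int, other_threads: int = 1) -> int:
    """Return how many threads are missing (0 if the limit is not reached).

    Assumes one resolver thread per hostname plus `other_threads` for the
    rest of the process; both are simplifications, not measured values.
    """
    return max(0, hostname_count + other_threads - limit)
```

For example, threads_short(2000, parse_sysctl_value("1500")) reports a shortfall, while a few hundred hostnames against the same hypothetical limit does not.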
-
@dark-baritone This is sort of a side note, but each hostname will create its own filterdns process, so that would be 260 processes for 260 names. It's not terribly efficient, but they are not that large. Possibly some sort of process limit?
I would lightly question though why you need to resolve 260 names every few minutes. Normally that's used for dynamic DNS or similar.
Any chance one is invalid, like a wildcard or something that doesn't resolve? Can you try adding them to another alias and see what happens? I realize that would take time but...
The other 7240 entries in the alias are just IPs?
There is this, but at 80m you're pretty large anyway.
https://docs.netgate.com/pfsense/en/latest/firewall/aliases.html#alias-sizing-concerns
-
@SteveITS Between IPv4 and IPv6 and the fact that some of these aliases are occasionally returning different IPs, those 260 domains are responsible for all of those IPs. I know having aliases that return different IPs is not the best use of aliases, but up until this point it's been working great for me. The resolution of the domain name hasn't been off so much that I've been being blocked when I shouldn't be and makes maintenance of my firewall rules manageable.
-
@bmeeks WTF man. Increasing the max threads did it!
I had to increase the limit by about 2,000 threads to make it work, but eventually the entire thing ran with no problems!
I'd love to get everyone's input on whether 7,500 values in a table is just like absolutely insane or just "unadvised" and/or anything else I'm doing that could be done a "better" way. Presumably most people don't run into this issue.
Although it seems like there should be a better way to understand that this is happening other than going through what I just went through hahaha
THANKS AGAIN TO EVERYONE WHO JUMPED INTO THIS THREAD!! I know your all's time is valuable and I appreciate it more than you could know.
-
@dark-baritone said in Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs):
Increasing the max threads did it!
Glad that worked. It was an educated guess.
To answer your other question about the alias size, I think it is fair to say the designers of the filterdns logic probably did not anticipate more than 100 entries in a FQDN alias.
Sounds like either the pfSense code needs to check the entries count in an alias assigned to filterdns and squawk if a limit is exceeded, or else restructure the code a bit so multiple FQDN hosts are resolved per thread (instead of launching a separate thread for each host).
-
@dark-baritone pfBlocker (I think just -devel currently due to the underlying provider changes) can set up aliases using ASN (IP subnets registered to a company).
You could make a redmine.pfsense.org report that the "filterdns: could not start host thread" message is not logged (which is what I gather from your posts).
If I had to guess, I'd guess there is a performance crossover between "start 100 processes that trigger every 5 minutes on their own" and "have one thread spawn 100 processes every 5 minutes."
-
@SteveITS said in Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs):
I'd guess there is a performance crossover between "start 100 processes that trigger every 5 minutes on their own" and "have one thread spawn 100 processes every 5 minutes."
I was thinking something more like --
Start a thread that resolves 4 hostnames every 5 minutes instead of a single hostname every 5 minutes. The started threads do their work, then sleep for 5 minutes to wake again and resolve the host. Right now I believe it is one thread per host. It would cut down on total threads if each thread resolved a few more hosts. And that still should not overwork that thread too much.
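The idea above can be sketched roughly as follows. This is not how filterdns is written; it is a minimal Python illustration with made-up names (batch, resolve_in_batches), a caller-supplied resolver function, and a single resolution pass instead of the sleep-and-repeat loop.

```python
import threading
from typing import Callable, Dict, List, Sequence, Tuple


def batch(hosts: Sequence[str], size: int) -> List[List[str]]:
    """Split the host list into consecutive groups of at most `size`."""
    return [list(hosts[i:i + size]) for i in range(0, len(hosts), size)]


def resolve_in_batches(
    hosts: Sequence[str],
    resolver: Callable[[str], List[str]],
    size: int = 4,
) -> Tuple[Dict[str, List[str]], int]:
    """Resolve hosts with one thread per batch instead of one per host.

    Returns the results and the number of threads that were started.
    """
    results: Dict[str, List[str]] = {}
    lock = threading.Lock()

    def worker(group: List[str]) -> None:
        for host in group:
            addresses = resolver(host)
            with lock:  # protect the shared dict
                results[host] = addresses

    threads = [threading.Thread(target=worker, args=(g,)) for g in batch(hosts, size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, len(threads)
```

With 260 hostnames and a batch size of 4, this starts 65 threads instead of 260; a real implementation would also have each thread sleep for the interval and then resolve its batch again.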
-
I searched and it looks like it's already being tracked: https://redmine.pfsense.org/issues/15708
-
@dark-baritone said in Alias Entries Are Not Being Added To The Tables (Even Hardcoded IPs):
I searched and it looks like it's already being tracked: https://redmine.pfsense.org/issues/15708
...which links to https://docs.netgate.com/pfsense/en/latest/troubleshooting/filterdns-thread-errors.html, though with a slightly different error message.