Domain Override results in both A record and SERVFAIL response
-
What zone type are you using btw, transparent (default) or static or something else? And what acls do you have set?
I tried duplicating your setup with 2 different pfsense boxes: my normal pfsense uses home.arpa, my 2nd pfsense uses test.mydomain.tld, and it does the same thing with A records.
So if I ask either one for a name I created on my 2nd pfsense, nas.test.mydomain.tld, I get an answer:
$ dig @192.168.9.34 nas.test.mydomain.tld

; <<>> DiG 9.16.50 <<>> @192.168.9.34 nas.test.mydomain.tld
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20195
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1432
;; QUESTION SECTION:
;nas.test.mydomain.tld. IN A

;; ANSWER SECTION:
nas.test.mydomain.tld. 3600 IN A 10.20.30.40

;; Query time: 2 msec
;; SERVER: 192.168.9.34#53(192.168.9.34)
;; WHEN: Mon Sep 23 14:10:51 Central Daylight Time 2024
;; MSG SIZE rcvd: 66

; <<>> DiG 9.16.50 <<>> @192.168.9.253 nas.test.mydomain.tld
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18896
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nas.test.mydomain.tld. IN A

;; ANSWER SECTION:
nas.test.mydomain.tld. 3130 IN A 10.20.30.40

;; Query time: 0 msec
;; SERVER: 192.168.9.253#53(192.168.9.253)
;; WHEN: Mon Sep 23 14:10:18 Central Daylight Time 2024
;; MSG SIZE rcvd: 66
Here is a query for that same nas FQDN with type AAAA, for which there is no record:
$ dig @192.168.9.34 nas.test.mydomain.tld AAAA

; <<>> DiG 9.16.50 <<>> @192.168.9.34 nas.test.mydomain.tld AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3482
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1432
;; QUESTION SECTION:
;nas.test.mydomain.tld. IN AAAA

;; Query time: 1 msec
;; SERVER: 192.168.9.34#53(192.168.9.34)
;; WHEN: Mon Sep 23 14:12:25 Central Daylight Time 2024
;; MSG SIZE rcvd: 50

$ dig @192.168.9.253 nas.test.mydomain.tld AAAA

; <<>> DiG 9.16.50 <<>> @192.168.9.253 nas.test.mydomain.tld AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53301
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nas.test.mydomain.tld. IN AAAA

;; Query time: 0 msec
;; SERVER: 192.168.9.253#53(192.168.9.253)
;; WHEN: Mon Sep 23 14:13:45 Central Daylight Time 2024
;; MSG SIZE rcvd: 50
If I ask for some record that doesn't exist, I get NXDOMAIN from both:
$ dig @192.168.9.34 nas1.test.mydomain.tld

; <<>> DiG 9.16.50 <<>> @192.168.9.34 nas1.test.mydomain.tld
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39172
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1432
;; QUESTION SECTION:
;nas1.test.mydomain.tld. IN A

;; Query time: 1 msec
;; SERVER: 192.168.9.34#53(192.168.9.34)
;; WHEN: Mon Sep 23 14:14:31 Central Daylight Time 2024
;; MSG SIZE rcvd: 51

$ dig @192.168.9.253 nas1.test.mydomain.tld

; <<>> DiG 9.16.50 <<>> @192.168.9.253 nas1.test.mydomain.tld
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 42385
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nas1.test.mydomain.tld. IN A

;; Query time: 6 msec
;; SERVER: 192.168.9.253#53(192.168.9.253)
;; WHEN: Mon Sep 23 14:14:44 Central Daylight Time 2024
;; MSG SIZE rcvd: 51
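For anyone comparing these by hand: the outcomes in this thread (an answer, an empty NOERROR reply, NXDOMAIN, SERVFAIL) can be told apart from the dig header alone. Below is a rough Python sketch of that; the function name classify_dig_output and the labels "ANSWER" and "NODATA" are my own, not anything dig or unbound prints.

```python
import re


def classify_dig_output(text: str) -> str:
    """Classify dig output using only its header lines.

    Returns "ANSWER" (NOERROR with records), "NODATA" (NOERROR with an
    empty answer section), or the raw rcode for anything else, such as
    "NXDOMAIN" or "SERVFAIL". The labels are illustrative only.
    """
    status = re.search(r"status:\s*([A-Z]+)", text)
    answers = re.search(r"ANSWER:\s*(\d+)", text)
    if status is None or answers is None:
        return "UNPARSEABLE"
    rcode = status.group(1)
    if rcode != "NOERROR":
        return rcode
    return "ANSWER" if int(answers.group(1)) > 0 else "NODATA"


if __name__ == "__main__":
    header = (
        ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3482\n"
        ";; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n"
    )
    print(classify_dig_output(header))
```

The distinction that matters for this thread is "NODATA" versus "SERVFAIL": the first says the name exists but has no record of that type, the second says the lookup itself failed.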
I am using static as my zone type. If I typo something in my own local domain, I sure don't want unbound trying to resolve that. I am using an allow ACL on the 2nd pfsense, and on my normal one I am using the allow snoop ACL.
I normally use this setting in my pfsense, because I have no desire to go look up AAAA and no devices currently have IPv6.. I only turn that on for testing, and when I do I turn off that setting.
#private-address: ::/0 # filters out all AAAA !
It's currently turned off because I was testing your AAAA query; let me turn it back on. Nope, no change. Let me create an AAAA record on the 2nd pfsense and see what happens when I query it from my pfsense that has no AAAA set.
-
@johnpoz I'm currently set as "transparent" on both pfsense systems. I've just tried changing both to "static" and see no change in behavior. I also tried "type transparent" and this seemed to resolve the issue. I'm not sure I understand why that is though....
-
Type Transparent:
Similar to Transparent but it also passes through queries where the name matches but the type does not. For example, if a client queries for an AAAA record but only an A record exists, the AAAA query is passed on rather than resulting in a negative response.
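To make the difference between the three zone types concrete, here is a simplified Python model of how I read the local-zone descriptions in the unbound.conf documentation. It is only a sketch: the function local_zone_response and its return labels are hypothetical, and it ignores wildcards, CNAMEs, and empty non-terminals.

```python
def local_zone_response(zone_type, local_data, qname, qtype):
    """Model how an unbound local zone treats a query (simplified).

    local_data maps a fully qualified name to the set of record types
    defined locally for it. Returns one of:
      "answer"   - served from local data
      "nodata"   - NOERROR with an empty answer section
      "nxdomain" - name does not exist
      "resolve"  - passed on to normal resolution/forwarding
    """
    types_for_name = local_data.get(qname)
    if types_for_name is not None and qtype in types_for_name:
        return "answer"

    name_exists = types_for_name is not None
    if zone_type == "static":
        # Never leaves the box: wrong type is nodata, unknown name is nxdomain.
        return "nodata" if name_exists else "nxdomain"
    if zone_type == "transparent":
        # Unknown names are resolved normally; known name with wrong type is nodata.
        return "nodata" if name_exists else "resolve"
    if zone_type == "typetransparent":
        # Both unknown names and wrong types are resolved normally.
        return "resolve"
    raise ValueError(f"unsupported zone type: {zone_type}")


if __name__ == "__main__":
    data = {"nas.test.mydomain.tld.": {"A"}}
    for zone in ("static", "transparent", "typetransparent"):
        print(zone, local_zone_response(zone, data, "nas.test.mydomain.tld.", "AAAA"))
```

In this model the AAAA query for a name that only has an A record is the one case where typetransparent differs from the other two: it is passed on instead of being answered locally with an empty reply.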
Static should work too.. I use static..
I would have to look a bit closer. You're using just a sub domain for your delegation, with a domain override, right? So your 1st pfsense is example.com and your 2nd NS is location1.example.com?
-
@johnpoz local pfsense is in the domain "location2.example.com". Remote pfsense is in "location1.example.com". I have a domain override set up on local pfsense for "location1.example.com" to use the IP of the remote pfsense.
To confirm, "static" does not work, only "type transparent" works. With any of the settings, the remote pfsense returns a "no answer" for AAAA requests, but only when the remote pfsense is set to "type transparent" does the local pfsense return a "no answer" back to the client instead of a SERVFAIL. I suppose the underlying difference is with "type transparent", the remote pfsense is passing the request on to its upstream DNS servers, whereas with "static" or "transparent" it is answering directly; but I don't see how that should matter as the answer back to the local pfsense is the same in all cases.
-
@rtadams89 I could see a case for servfail from the 1st pfsense to the client.. Because it wasn't able to lookup what was asked for.
What version of pfsense are you using btw. Could be some change in the version of unbound on it.. In my test the 1st pfsense is 24.03, 2nd pfsense was just a 2.7.2 vm..
24.03 is running 1.19.3 of unbound
2.7.2 is on 1.19.1
If the answer is truly passed on, you should get back an NX and SOA.
;; AUTHORITY SECTION:
example.com. 1800 IN SOA ns.icann.org. noc.dns.icann.org. 2024081420 7200 3600 1209600 3600
What happens when you actually query a public NS for whatever FQDN you're asking your other local NS for?
Where do the 2 NS forward to, or do they resolve for stuff that is not a local resource? That could come into play. In my test the 2nd NS just resolves if there is no local record, same as my 1st NS. And I am set to static, so if there is no local resource in its domain it would just send back NX.
-
@johnpoz Both pfsense instances are 2.7.2-RELEASE
Both pfsense instances are setup with 8.8.8.8 and 1.1.1.1 as their DNS servers
-
Well, it seems like you have it working how you want with type transparent. But to be honest, why do you care what gets returned for some AAAA query if you have no AAAA records?
If you have no AAAA records, do you even use IPv6 externally? If not, I would just turn off answering any AAAA with the setting I posted above.
I even turned off AAAA in my browser, because it's stupid to ask for an AAAA record if you don't even have an IPv6 address ;)
-
@johnpoz I hit an edge case. I have some uptime monitoring software which takes a hostname as input. It resolves that hostname to both A and AAAA address (this is not configurable) and then attempts to connect to the returned IPv4/IPv6 addresses. If it gets back "no answer" it figures there is no IPv4/IPv6 record and just does not try to connect to that address. However, if it gets back a SERVFAIL or other error, it figures there is a problem and reports that the hostname it is monitoring is down.
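Roughly, the behavior described above looks like the following Python sketch. This is only my approximation of what the monitoring software does, not its actual code; the function plan_checks and the outcome labels are made up for illustration.

```python
def plan_checks(results):
    """Decide what the monitor does with per-type lookup outcomes.

    results maps a record type ("A", "AAAA") to an outcome label:
    "ANSWER", "NODATA", or an error rcode such as "SERVFAIL".
    Returns ("down", []) if any lookup errored, otherwise
    ("probe", [types that returned addresses]).
    """
    errored = [rtype for rtype, outcome in results.items()
               if outcome not in ("ANSWER", "NODATA")]
    if errored:
        # Any lookup error is treated as the host being down.
        return ("down", [])
    # An empty answer just means "no address of this family", so skip it.
    return ("probe", [rtype for rtype, outcome in results.items()
                      if outcome == "ANSWER"])


if __name__ == "__main__":
    print(plan_checks({"A": "ANSWER", "AAAA": "NODATA"}))
    print(plan_checks({"A": "ANSWER", "AAAA": "SERVFAIL"}))
```

So a host with a perfectly good A record still gets reported as down when the AAAA lookup comes back as SERVFAIL instead of an empty answer.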
-
Hi,
I'm experiencing the same issue:
- Two pfSense boxes connected via IPsec, let's call them P1 and P2
- P2 has a domain override set up so that it will query P1
- If I query P2 for a host on the overridden domain, I get the expected reply for the "A" record and an error for the "AAAA" record
- If I query P1 for the same, I get the expected reply for the "A" record and an empty result for the "AAAA" record
I've attached:
- DNS status screenshot after the query of P2
- tcpdump output on P1 showing the dns requests
- dns reply on wireshark
It seems that empty answers are treated like timed out answers.
Is this the expected behaviour?
-
@rtadams89 Is it the check-mk agent? Have you managed to solve it?