Netgate Discussion Forum

    Unbound stops resolving when Domain Overrides DNS not answering

    DHCP and DNS
    23 Posts 7 Posters 4.5k Views
    • J
      John41
      last edited by

      I will take a look at those options.

      As you proposed, I have been thinking about running a DNS server so that I can be a secondary for the zone I am currently forwarding to. This is not a critical application, so in my case it might not be worth the overhead.

      Thanks,

      John

      • iorxI
        iorx
        last edited by

        Hi again!

        Now I'm experiencing this with a newly installed 2.4.5p1. The setup:
        IPsec to the main office.
        The fix with the LAN gateway and route.
        Domain override in Unbound.

        If the connection is lost for a brief moment, long enough for Unbound to time out, it stops resolving the overridden domain.
        I believe we came to the conclusion that Unbound marks the server as unreachable or something similar and just doesn't bother to ask again.

        Any new ideas on how to make pfSense/Unbound not give up so easily? Or is it possible for a script to detect that Unbound has "tombstoned" the entries?

        Would switching back to DNS Forwarder be a solution, maybe?

        • iorxI
          iorx
          last edited by

          No response? This is an issue. How do I go about getting some attention for it?

          • bmeeksB
            bmeeks @iorx
            last edited by

            @iorx said in Unbound stops resolving when Domain Overrides DNS not answering:

            No response? This is an issue. How do I go about getting some attention for it?

            You can register and submit bug reports on the Redmine site here: https://redmine.pfsense.org/projects/pfsense.

            Be prepared to fully describe in the report the actual bug and the steps required to reliably recreate the bug.

            • johnpozJ
              johnpoz LAYER 8 Global Moderator
              last edited by

              To your other question: you can ask Unbound which servers it would query for a given name:

              unbound-control -c /var/unbound/unbound.conf lookup www.example.com
              

              It should list your domain override NS, and then info about that NS.

              You could also use unbound-control's flush_negative command to flush all negative data.

              An intelligent man is sometimes forced to be drunk to spend time with his fools
              If you get confused: Listen to the Music Play
              Please don't Chat/PM me for help, unless mod related
              SG-4860 24.11 | Lab VMs 2.8, 24.11

              • iorxI
                iorx
                last edited by

                @johnpoz said in Unbound stops resolving when Domain Overrides DNS not answering:

                unbound-control -c /var/unbound/unbound.conf lookup

                Nice. I will see if I can find a way to trigger a flush when resolution stops for the overrides (to work around the problem until there is a better solution).
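A minimal sketch of what such a trigger could look like, assuming a cron-style check. The host name, zone, and name server IP are placeholders, and the `probe`, `decide`, and `run_check` helpers are hypothetical names. Whether a flush actually clears this particular failure is untested here; `flush_negative`, `flush_infra`, and `flush_zone` are simply the unbound-control commands that drop negative answers, server reachability state, and cached data for a zone.

```sh
#!/bin/sh
# Watchdog sketch. All values below are placeholders, not taken from a real setup.
CONF=/var/unbound/unbound.conf       # config path used earlier in this thread
TEST_HOST=host01.remotesite.local    # a name inside the overridden domain
ZONE=remotesite.local                # the overridden domain
OVERRIDE_NS=10.0.0.53                # hypothetical DNS server behind the VPN
RESOLVER=127.0.0.1                   # Unbound on the firewall itself
DIG=${DIG:-dig}

# Print "ok" if server $1 returns at least one A record for $2, else "fail".
# dig +short reports errors as ";; ..." lines, so those are filtered out.
probe() {
    answer=$("$DIG" @"$1" "$2" A +short +time=2 +tries=1 2>/dev/null | grep -v '^;' || true)
    if [ -n "$answer" ]; then echo ok; else echo fail; fi
}

# Decide what to do from the two probe results (resolver first, direct second).
decide() {
    if [ "$1" = ok ]; then
        echo none     # Unbound answers, nothing to do
    elif [ "$2" = ok ]; then
        echo flush    # remote NS answers directly, but Unbound does not
    else
        echo wait     # remote NS unreachable, flushing would not help
    fi
}

# Run both probes and flush Unbound's caches only when that could help.
run_check() {
    action=$(decide "$(probe "$RESOLVER" "$TEST_HOST")" "$(probe "$OVERRIDE_NS" "$TEST_HOST")")
    if [ "$action" = flush ]; then
        unbound-control -c "$CONF" flush_negative
        unbound-control -c "$CONF" flush_infra all
        unbound-control -c "$CONF" flush_zone "$ZONE"
    fi
    echo "$action"
}
```

Calling `run_check` every minute or so from cron would be one way to use it. The direct probe is there so that a flush is only attempted when the remote name server is reachable but Unbound still fails.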

                For the moment I'm testing DNS Forwarder instead, but I have experienced some weirdness there too. But the Forwarder is "dumb" isn't it? No caching? So maybe the last time it stopped working was related to IPsec; I need to check that further.

                But I know Unbound has this issue. I'll try to create a bug report with reproducible steps to trigger the problem.

                • johnpozJ
                  johnpoz LAYER 8 Global Moderator
                  last edited by johnpoz

                  @iorx said in Unbound stops resolving when Domain Overrides DNS not answering:

                  But the Forwarder is "dumb" isn't it? No caching?

                  Not sure where you would have gotten that idea; it caches. It would really be pretty pointless if it didn't.

                  Here I enabled dnsmasq on port 5353 (so I didn't have to turn off unbound), then asked it how big its cache is

                  $ dig @192.168.9.253 -p 5353 +short chaos txt cachesize.bind
                  "10000"
                  

                  A simple way to see whether something is cached is to look at how fast it resolves. If you get an answer in 0 or a couple of ms, versus how long it would take to forward the query to wherever you're forwarding and back, your answer was returned from cache.

                  You can also ask, as with the command above, how many hits your cache has had.

                  $ dig @192.168.9.253 -p 5353 +short chaos txt hits.bind
                  "2"
                  

                  Do a query for something a few times, and then check it again - see the number go up..

                  $ dig @192.168.9.253 -p 5353 +short chaos txt hits.bind
                  "7"
                  

                  You can ask it how many misses its had

                  $ dig @192.168.9.253 -p 5353 +short chaos txt misses.bind
                  "1"
                  

                  Keep in mind I just enabled it 30 seconds ago and have only done a query for www.google.com; I'm not actually using it, etc.

                  You can get info for cachesize.bind, insertions.bind, evictions.bind, misses.bind, hits.bind, auth.bind and servers.bind
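To pull all of those counters in one go, something like the following loop could work. It is only a sketch: the `dnsmasq_stats` function name is made up, and the server address and port are whatever you pointed dnsmasq at (192.168.9.253 and 5353 in the examples above).

```sh
#!/bin/sh
# Counter names as listed in the post above.
DNSMASQ_STATS="cachesize insertions evictions misses hits auth servers"
DIG=${DIG:-dig}

# Query every <name>.bind counter from the dnsmasq instance at $1 (port $2, default 53)
# and print one "name value" line per counter.
dnsmasq_stats() {
    server="$1"
    port="${2:-53}"
    for stat in $DNSMASQ_STATS; do
        # Drop ";; ..." error lines so a timeout shows up as "no answer".
        value=$("$DIG" @"$server" -p "$port" +short chaos txt "$stat.bind" 2>/dev/null | grep -v '^;' || true)
        printf '%-12s %s\n' "$stat" "${value:-no answer}"
    done
}
```

For the setup shown above that would be `dnsmasq_stats 192.168.9.253 5353`. Running it before and after a few repeated lookups makes the hits counter easy to compare.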

                  There is a way to get it to dump its cache to syslog too: you have to set it to log queries and then send it SIGUSR1.

                  -q, --log-queries
                       Log the results of DNS queries handled by dnsmasq. Enable a full 
                       cache dump on receipt of SIGUSR1.
                  

                  Unbound is a much more robust DNS option.

                  Check out the dnsmasq man page for other info
                  https://linux.die.net/man/8/dnsmasq

                  BTW, the fact that it caches is right in its description ;)

                  Name
                  dnsmasq - A lightweight DHCP and caching DNS server. 
                  

                  • iorxI
                    iorx
                    last edited by

                    I understand now that I had the forwarder's (dnsmasq) capabilities and function backwards. I didn't read up enough on that, my apologies.
                    Many thanks for the awesome explanation!

                    I'll go ahead and try to build a reproducible lookup scenario. I'm going to test the behavior of both dnsmasq and Unbound with domain overrides.

                    • johnpozJ
                      johnpoz LAYER 8 Global Moderator
                      last edited by

                      A simple test I would do when you feel you're not resolving something over your VPN connection, be it IPsec or OpenVPN, is to do a direct query yourself with your favorite lookup tool (dig, host, nslookup). Do you get a response?

                      If not, then there is no possible way Unbound or dnsmasq could either. If you do, then you need to figure out why Unbound or dnsmasq does not: did they lose their binding to the interface that would allow them to query down the VPN connection? Exactly what sort of response do you get: timeout, REFUSED, SERVFAIL, NXDOMAIN?
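As a rough sketch of that comparison: the helper below pulls the status out of dig's header line, so the same name can be asked directly at the override NS and then at the resolver on pfSense. The `dig_status` and `compare_paths` names are made up for illustration, and the IPs in the usage line are placeholders.

```sh
#!/bin/sh
# Read dig output on stdin and print the response status
# (NOERROR, NXDOMAIN, SERVFAIL, REFUSED, ...), TIMEOUT, or UNKNOWN.
dig_status() {
    awk '
        /status: [A-Z]+/ {
            # Header looks like: ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1234"
            sub(/.*status: /, ""); sub(/,.*/, ""); print; found = 1; exit
        }
        /timed out|no servers could be reached/ { print "TIMEOUT"; found = 1; exit }
        END { if (!found) print "UNKNOWN" }
    '
}

# Ask the override NS directly, then ask the resolver, and print both results.
# $1 = name to look up, $2 = override NS IP, $3 = resolver IP
compare_paths() {
    direct=$(dig @"$2" "$1" A +time=2 +tries=1 2>&1 | dig_status)
    via=$(dig @"$3" "$1" A +time=2 +tries=1 2>&1 | dig_status)
    echo "direct=$direct resolver=$via"
}
```

For example, `compare_paths host01.remotesite.local 10.0.0.53 127.0.0.1` (placeholder addresses). A line like `direct=NOERROR resolver=SERVFAIL` would point at the resolver rather than the VPN, while `direct=TIMEOUT` would point at the tunnel or the remote NS.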

                      Was what you were looking for not cached? If it was cached, you should have gotten a response whether or not you could talk to that other NS.

                      I am not clear enough on how routing in pfSense works with IPsec, or on which interface you're binding Unbound to. But the setup least likely to fail is to set Unbound to use only localhost as its outgoing interface. It should then use routing to get to wherever you set up a domain override, or for normal resolving/forwarding. If it has a route to the IP you set up in your domain override that says go over the VPN, it should do that.

                      If it had a binding issue with its outgoing interface that failed for some reason (for example, a reconnection of the VPN without a restart of Unbound), then sure, it could have problems. Using localhost as the outgoing interface could remedy that.

                      Another option, when you're doing odd stuff with VPN connections that could reconnect and affect an application's binding to an interface/IP, is to move the NS off pfSense and put it on your network, so anything it tries to talk to is routed normally, just like for any other client on your network.

                      • iorxI
                        iorx @johnpoz
                        last edited by iorx

                        @johnpoz

                        (necroposting, sorry for that. but I felt the need to follow up)

                        To begin with, I never thanked you for educating and helping me on the subject! Thanks!

                        This has been brewing for a while, I've gone back and forth, tested stuff and given up.

                        Short info/summary:
                        "remotesite.local" points to a DNS on the other side of a VPN connection. An override in Unbound.
                        "localsite.n23" is the local network where I am.
                        Unbound stops resolving "remotesite.local" hosts after a while. It works again for a while after restarting Unbound, and then stops resolving remotesite.local again.

                        Today, after some extreme Google-fu, I realized something: the only overrides that stop resolving are those ending with .local.

                        What led me to this conclusion was this:

                        As one can see (logs below), at 17:18 it was able to resolve hosts at the remote site. At 17:19 it couldn't anymore. Checking the Unbound logs, I found that it's not even trying to resolve anything on the .local domain.
                        I googled the issue and found that someone had a similar problem with .local overrides that just stopped responding:
                        domain-overrides-stop-resolving-periodically-they-only-resume-after-the-service-has-been-restarted
                        The solution there was to add an override for ".local" that points to a DNS server. I tested that: a "local" override that points to 127.0.0.1.

                        This was a couple of hours ago and it looks like it's working.
                        The reason .local was used for the remote domain is ancient: it's a Windows domain created when Microsoft "best practice" was to create local FQDNs with .local at the end.

                        Unbound log:

                        Mar 18 17:19:24 	unbound 	52338 	[52338:3] info: validation success host01.remotesite.local. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:3] info: validator operate: query host01.remotesite.local. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:3] info: finishing processing for host01.remotesite.local. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:3] info: resolving host01.remotesite.local. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:3] info: validator operate: query host01.remotesite.local. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:2] info: validation success host01.remotesite.local.localsite.n23. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:2] info: validator operate: query host01.remotesite.local.localsite.n23. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:2] info: finishing processing for host01.remotesite.local.localsite.n23. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:2] info: resolving host01.remotesite.local.localsite.n23. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:2] info: validator operate: query host01.remotesite.local.localsite.n23. AAAA IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:0] info: validation success host01.remotesite.local.localsite.n23. A IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:0] info: validator operate: query host01.remotesite.local.localsite.n23. A IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:0] info: finishing processing for host01.remotesite.local.localsite.n23. A IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:0] info: resolving host01.remotesite.local.localsite.n23. A IN
                        Mar 18 17:19:24 	unbound 	52338 	[52338:0] info: validator operate: query host01.remotesite.local.localsite.n23. A IN
                        Mar 18 17:18:04 	unbound 	52338 	[52338:2] info: validation success host01.remotesite.local. A IN
                        Mar 18 17:18:04 	unbound 	52338 	[52338:2] info: validator operate: query host01.remotesite.local. A IN
                        Mar 18 17:18:04 	unbound 	52338 	[52338:2] info: finishing processing for host01.remotesite.local. A IN
                        Mar 18 17:18:04 	unbound 	52338 	[52338:2] info: resolving host01.remotesite.local. A IN
                        Mar 18 17:18:04 	unbound 	52338 	[52338:2] info: validator operate: query host01.remotesite.local. A IN 
                        
                          • M
                            masupilamie
                            last edited by masupilamie

                            Can confirm iorx's "workaround" works. It seems the TLD needs to be added as a domain override pointing back at the resolver itself (127.0.0.1) when one subdomain of that TLD is used for local resolution and another subdomain is used for remote resolution via a domain override.

                            In my case my local network uses main.lan and the remote site uses remote.lan.
                            Adding only remote.lan as a domain override pointing to the remote site's DNS server made it work for less than a minute after flushing Unbound's cache. Adding "lan" as a domain override pointing to 127.0.0.1 made DNS resolution for remote.lan stable.
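Since the failure showed up within about a minute, a timestamped probe loop is one way to check whether resolution stays stable after adding the TLD override. This is just a sketch: `probe_loop` is a made-up helper, and the host name in the usage line is a placeholder under remote.lan.

```sh
#!/bin/sh
DIG=${DIG:-dig}

# Query $1 at server $2, $3 times, sleeping $4 seconds between probes.
# Prints one line per probe: "HH:MM:SS name ok|fail".
probe_loop() {
    name="$1"
    server="$2"
    count="$3"
    interval="$4"
    i=0
    while [ "$i" -lt "$count" ]; do
        # dig +short reports errors as ";; ..." lines, so filter those out.
        answer=$("$DIG" @"$server" "$name" A +short +time=2 +tries=1 2>/dev/null | grep -v '^;' || true)
        if [ -n "$answer" ]; then result=ok; else result=fail; fi
        echo "$(date +%H:%M:%S) $name $result"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
    done
    return 0
}
```

For example, `probe_loop host01.remote.lan 127.0.0.1 30 10` probes every 10 seconds for about five minutes. Repeated answers may come from cache until the record's TTL expires, so a run that outlasts the TTL says more than a short one.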

                            configured Domain Overrides
                            Screenshot 2025-01-19 at 20.55.04.png

                            pfsense version: 2.7.2

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.