Netgate Discussion Forum

    DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times

    DHCP and DNS
    176 Posts 6 Posters 33.3k Views
    • johnpozJ
      johnpoz LAYER 8 Global Moderator @RickyBaker
      last edited by

      @RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

      All the mac addresses were "no vendor" results on a mac address lookup

      If I were to guess - those would be mobile devices, Apple or Android - they love to use made-up MAC addresses - you know, for your privacy ;) You can turn it off on the device, so it uses its actual MAC.

      An intelligent man is sometimes forced to be drunk to spend time with his fools
      If you get confused: Listen to the Music Play
      Please don't Chat/PM me for help, unless mod related
      SG-4860 24.11 | Lab VMs 2.8, 24.11

      • GertjanG
        Gertjan @RickyBaker
        last edited by Gertjan

        @RickyBaker

        As probably already said above (I didn't check) : you don't want unbound to get restarted every xx seconds (minutes).
        So : uncheck this one :

        8bfbc4d1-407c-4404-82ec-3602c8648aa0-image.png

        From now on, you should see very few :

        750c297d-584a-44de-8adc-632c913b37d1-image.png

        Maybe once a day ?

        And remember : under pfBlockerng control, unbound can also get restarted.

        To see unbound (DNS) activity, I use this :

        tail -f /var/unbound/var/log/pfblockerng/dns_reply.log
        

        as I have pfBlocker already running.
        You can set unbound logging back to "Level 1 basic operations".

        What you can also try: use the unbound settings as preconfigured by Netgate.
        Deactivate forwarding.
        Ditch 8.8.8.8 etc.
        You'll be using the default resolving.

        This is what I'm using :

        8f9b646a-92f0-4d43-acfc-a9f987daf43a-image.png

        and is rock solid for close to a decade.
        Don't worry about 8.8.8.8 etc, they will get over it ;)

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • johnpozJ
          johnpoz LAYER 8 Global Moderator @Gertjan
          last edited by

          @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

          Don't worry about 8.8.8.8 etc, they will get over it ;)

          hahaha - made me laugh.. Oh man they are going to wonder why Ricky stopped asking for dns..

          • R
            RickyBaker @Gertjan
            last edited by RickyBaker

            First, thanks for all the screenshots and suggestions, really appreciate the time, i'm going nuts.

            @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

            So : uncheck this one :

            Since @johnpoz's first message I've had this disabled in the DNS Resolver settings menu:
            b9527ca6-4537-4676-b70a-ecafbfbf265d-image.png
            are you still seeing this constant restarting in the current log? Is there somewhere else i can disable or does a similar setting live elsewhere in the menu?

            @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

            And remember : under pfBlockerng control, unbound can also get restarted.

            I still don't think i have pfblockerng installed. I believe its an installed package that would show up in the Firewall drop down:
            b8deee27-b058-48e6-9162-793b5b4b1385-image.png
            b0b21785-1fb4-4fd3-b1f5-dea60cc22e40-image.png

            @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

            To see unbound (DNS) activity, I use this :

            I don't seem to have a var folder under /var/unbound/ Is this an issue with being on 2.7.0 and not 2.7.2? i would love to try to update but I'm running into a roadblock there as well:
            befe104a-eb14-4b3a-b147-5e4e0e776c27-image.png
            4f1d8594-e17a-4384-a6c9-682e3caf4294-image.png
            After posting this I will go back to this thread and see if there's any other suggestions to try besides what I already have: https://forum.netgate.com/topic/184670/issue-with-going-from-2-7-0-to-2-7-2/9
            edit: there didn't appear to be anything more I hadn't tried besides installing 2.7.2 from scratch and restoring the backup, which i'm not currently prepared to do

            @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

            This is what I'm using :

            512aa83a-d38e-4bb4-bc9d-dd0eced084bc-image.png
            14a4df6b-f4ce-49cf-8004-f548f61b9f00-image.png
            f42b00d8-0e8b-4a67-81c3-316ec9f15f70-image.png
            Differences I see are:

            • I have All selected under Outgoing Network Interface, I've changed that to LocalHost to match yours but i don't really understand this setting

            • I have System Domain Local Zone Type set to Transparent and not static like you. I don't understand this option and can switch it if the above doesn't fix anything

            • I disabled DNSSEC per @johnpoz suggestion in the first post

            • I don't have Python Module enabled (should I?)

            • i have a different SSL cert but I assume that's just a personal one you created

            I also don't understand what the domain overrides at the end are, should I trash them?

            Again thanks for everything. anything to attempt is appreciated

            • R
              RickyBaker @Gertjan
              last edited by

              @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

              Don't worry about 8.8.8.8 etc, they will get over it ;)

              sorry missed this part:
              7c7a08ff-0426-4e00-b5e9-7525eeff3cb3-image.png
              I removed those 4 other dns after that first post as well, are they also still showing up in the logs?

              • GertjanG
                Gertjan @RickyBaker
                last edited by Gertjan

                @RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                I have All selected under Outgoing Network Interface, I've changed that to LocalHost to match yours but i don't really understand this setting
                

                All is the default (I guess). The local DNS guru explained to me somewhere that Localhost was 'better'.

                I have System Domain Local Zone Type set to Transparent and not static like you. I don't understand this option and can switch it if the above doesn't fix anything
                

                I can read this (as an explanation). You probably have to translate. I've chosen 'static' for my use, but there is really no important differences with transparent.

                I disabled DNSSEC per @johnpoz suggestion in the first post
                

                DNSSEC can be used without issues if you do not forward. You don't, so you can use it.
                Disabling it won't hurt, though.

                I don't have Python Module enabled (should I?)
                

                You can. DNSSEC, if activated, uses Python. And pfBlockerng - but you don't use pfBlockerng .
                It's activated by default.

                i have a different SSL cert but I assume that's just a personal one you created
                

                Pick one you've listed. Certs are not used by default.

                I also don't understand what the domain overrides at the end are, should I trash them?

                These are added by the pfSense admin = you. Not Netgate.
                But I know why these (your) domains are listed : it will disable registration checking of some known adobe (photoshop, to name it) software ^^
                If you don't have the cracked Photoshop installed on one of your PCs, you can remove them all.

                • R
                  RickyBaker @Gertjan
                  last edited by

                  @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                  You can. DNSSEC, if activated, uses Python. And pfBlockerng - but you don't use pfBlockerng .
                  It's activated by default.

                  any suggestions on what options to select for the Python Module after enabling? If I understood you correctly you were suggesting that it's necessary IF i enable DNSSEC support? But that DNSSEC isn't necessary?

                  @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                  Pick one you've listed. Certs are not used by default.

                  I picked my name, which was a user I created for OpenVPN i think. I assumed you were implying that picking "Webconfigurator Default" was suboptimal/wrong?

                  • GertjanG
                    Gertjan @RickyBaker
                    last edited by

                    @RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                    any suggestions

                    119744a1-3ece-4254-a444-2284b83e440e-image.png

                    The resolver can do DNSSEC for you. It's not mandatory.
                    But if you know what DNSSEC is, you will activate it. It will not secure every DNS request, as most domain names are not signed yet. But when they are, why not secure the request?

                    @RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                    I picked my name

                    TLS can be used by unbound to secure the TLS channels, like the control channel, commonly on 853. Any cert will do.

                    • johnpozJ
                      johnpoz LAYER 8 Global Moderator @Gertjan
                      last edited by johnpoz

                      The cert means nothing - unless you want unbound to be a dot server to your clients. Forget about the cert that is listed. It matters when you want to answer dot queries, not when you want to make them.

                      • R
                        RickyBaker @johnpoz
                        last edited by

                        @johnpoz @Gertjan thanks all

                        I've enabled DNSSEC again since having it disabled didn't really help my issue. Is there another log I should increase the logging on? What level should I have the DNS resolver on? I'm still experiencing the issue: yesterday for about 20 minutes (longest yet) I couldn't open a webpage on my phone, but I was concurrently streaming the Knicks game on YouTube TV.

                        I've also added all those "no vendor" MAC addresses I couldn't explain with randomly assigned DHCP leases to the whitelist block list and they've yet to come back and I've yet to discover anything broken. Just an update.

                        Any advice on getting it updated to 2.7.2 without doing a full clean install and restore?

                        • johnpozJ
                          johnpoz LAYER 8 Global Moderator @RickyBaker
                          last edited by johnpoz

                          @RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                          I've enabled DNSSEC again since it didn't really help my issue having it disabled

                          if you are forwarding, dnssec should be disabled. It might seem to cause no issues with a few queries, but I can tell you it's going to be problematic at some point; even the quad9 faq says to disable it when forwarding. It is pointless if you forward: either the place you forward to does dnssec or it doesn't, and telling unbound to do it as well isn't going to do anything other than cause you issues at some point.

                          • R
                            RickyBaker @johnpoz
                            last edited by

                            @johnpoz I'm no longer forwarding per your first post in this thread.

                            • johnpozJ
                              johnpoz LAYER 8 Global Moderator @RickyBaker
                              last edited by

                              @RickyBaker Long thread and I haven't paid close attention, so I couldn't tell if you had switched back to forwarding or not. Yeah, if you're resolving then dnssec is good to have enabled.

                              • R
                                RickyBaker @johnpoz
                                last edited by

                                @johnpoz great thanks for circling back

                                • R
                                  RickyBaker @RickyBaker
                                  last edited by

                                  https://pastebin.com/SFR8BXb0

                                  Woke up from a nap and experienced one of the longest internet outages of this whole saga. It was out at 3:14 when I tried to open venmo and was out for over 20 minutes before it came back. the above is the DNS resolver log but I think i have the log level dialed too high cause 2000 entries didn't even go back 2 minutes. I've changed it back to Log Level 1 but could someone check it out and see if there's any clues in there (or what log level I should have it at)? Or is there another log that I should also be monitoring? Is it possible the problem is purely something with the wifi and Ubiquiti?

                                  • GertjanG
                                    Gertjan @RickyBaker
                                    last edited by

                                    @RickyBaker

                                    Same thoughts here: a high log level actually buries the details you're looking for, as there is only 2 minutes' worth of info.
                                    If you have some disk space left, you can make the log files bigger.

                                    48bc948c-57dc-4564-b2e1-9af964888543-image.png

                                    If needed, you can make the log retention a bit smaller - I've "7", you can make it 5 or 4.

                                    You can also make this, one

                                    0bc21cb3-4150-4daf-922f-0b75f6638b5d-image.png

                                    a bit bigger.

                                    The actual goal is :
                                    As soon as you find a situation where a device has no access anymore, you have to check:
                                    Does access without using DNS work? For example, ping 8.8.8.8 from that device.
                                    Also double check: does the device have a valid IP, gateway and DNS set at that moment?
                                    Example :

                                    ipconfig /all
                                    

                                    and check the duration of the lease, the gateway and the DNS (both should point to the IP of pfSense).

                                    Check on the device if "DNS" works :

                                    C:\Users\Gauche>nslookup www.google.com
                                    Server:  pfSense.bhf.tld
                                    Address:  2a01:cb19:907:bedf:92ec:77ff:fe29:392c
                                    
                                    Non-authoritative answer:
                                    Name:    www.google.com
                                    Addresses:  2a00:1450:4007:81a::2004
                                              142.250.201.164
                                    

                                    Take note: for me, both IPv6 and IPv4 work.

                                    Then (also) check on pfSense if resolving works :

                                    dig @127.0.0.1 www.google.com +short
                                    

                                    and then

                                    dig @192.168.1.1 www.google.com +short
                                    

                                    where 192.168.1.1 is your LAN interface.

                                    Check if unbound is up and running :

                                    [24.03-RELEASE][root@pfSense.bhf.tld]/root: ps ax | grep 'unbound'
                                    74113  -  Ss       4:32.60 /usr/local/sbin/unbound -c /var/unbound/unbound.conf
                                    ....
                                    ....
                                    

                                    and

                                    [24.03-RELEASE][root@pfSense.bhf.tld]/root: sockstat | grep 'unbound'
                                    unbound  unbound    74113 3   udp6   *:53                  *:*
                                    unbound  unbound    74113 4   tcp6   *:53                  *:*
                                    unbound  unbound    74113 5   udp4   *:53                  *:*
                                    unbound  unbound    74113 6   tcp4   *:53                  *:*
                                    unbound  unbound    74113 9   tcp4   127.0.0.1:953         *:*
                                    ...
                                    ...
                                    ...
                                    ...
                                    

                                    With the unbound log level set to "1", the log will still contain every restart (a controlled stop and then a start):

                                    grep "stopped" /var/log/resolver.log
                                    .....
                                    
                                    <30>1 2024-05-06T00:15:24.852356+02:00 pfSense.bhf.tld unbound 12814 - - [12814:0] info: service stopped (unbound 1.19.3).
                                    
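To see whether the restarts cluster around the outages, the same grep can be turned into a per-hour count. This is only a minimal sketch: it assumes the log lines use the timestamp layout shown in the sample line above (second field like `2024-05-06T00:15:24...`), and `count_unbound_stops` is a made-up helper name, not something that ships with pfSense.

```shell
#!/bin/sh
# Hypothetical helper: count "service stopped" lines per hour in a resolver log.
# Assumes field 2 is a timestamp such as 2024-05-06T00:15:24.852356+02:00,
# so its first 13 characters are the date plus the hour.
count_unbound_stops() {
    logfile="$1"
    awk '/service stopped/ { n[substr($2, 1, 13)]++ }
         END { for (h in n) print h, n[h] }' "$logfile" | sort
}

# Example call (path taken from the grep above):
#   count_unbound_stops /var/log/resolver.log
```

Each output line is an hour followed by the number of stops in that hour. One or two per day would match the "maybe once a day" expectation; many per hour would point at something restarting unbound.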

                                    Btw: the current unbound version is 1.19.3, as I'm using 24.03.
                                    pfSense 2.8.0 will be coming out soon.
                                    Not that the version really matters (imho), as I also used 1.17.x for a long time and don't recall having any issues.

                                    @RickyBaker said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                                    Is it possible the problem is purely something with the wifi and Ubiquiti?

                                    For me, an AP should do just what it is meant to do: be a radio-to-wire signal converter.
                                    True, an AP can do a lot more, and can really break the connection for you.
                                    When testing connectivity issues, add APs and other gadgets later on, when you know the wired connection works well.
                                    The same thing goes for L3 'smart, VLAN based' switches: only use them when the bare-bones network works well.

                                    • johnpozJ
                                      johnpoz LAYER 8 Global Moderator @RickyBaker
                                      last edited by

                                      @RickyBaker did you actually do some dns queries while you were having the issue, to both unbound and say external dns?

                                      You should log queries and replies as well if you're wanting to troubleshoot dns not working.

                                      in your custom options box in unbound

                                      server:
                                      log-queries: yes
                                      log-replies: yes
                                      
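In pfSense the custom options box normally lives under Services > DNS Resolver > General Settings. Once reply logging is on, the resolver log can be summarised by response code, which answers the "servfail or nx?" question for a whole outage window at once. A rough sketch only: it assumes each logged reply line carries the response code (NOERROR, NXDOMAIN, SERVFAIL or REFUSED) as its own whitespace-separated word, and `rcode_summary` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count DNS response codes found in a resolver log.
# Assumption: every logged reply line contains the rcode as a separate word.
rcode_summary() {
    awk '
        {
            # Scan the fields and count the first response code found on the line.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^(NOERROR|NXDOMAIN|SERVFAIL|REFUSED)$/) { n[$i]++; break }
        }
        END { for (r in n) print r, n[r] }
    ' "$1" | sort
}

# Example call (adjust the path if your resolver log lives elsewhere):
#   rcode_summary /var/log/resolver.log
```

A log dominated by NOERROR during an "outage" would suggest that unbound itself is answering and that the problem sits somewhere between the client and pfSense.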

                                      what is the response, timeout talking to unbound, servfail, nx?

                                      fire up your fav dns tool, nslookup, dig, doggo, host, etc. and actually validate what is failing. If you look up some fqdn, do you get a response? If so, what is the response: did it work, or did unbound return servfail or nxdomain?

                                      Does unbound answer for local resources, like the pfsense fqdn? Does something that is cached work, while only new queries fail? You can view what is in your cache

                                      [23.09.1-RELEASE][admin@sg4860.home.arpa]/root: unbound-control -c /var/unbound/unbound.conf dump_cache | grep forum.netgate.com
                                      forum.netgate.com.      2452    IN      A       208.123.73.71
                                      msg forum.netgate.com. IN A 32896 1 2452 3 1 1 3 6 
                                      forum.netgate.com. IN A 0
                                      [23.09.1-RELEASE][admin@sg4860.home.arpa]/root: 
                                      

                                      If that fails, then do a query directed to some external NS like quad9 or google - do those work?
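For reading the result of such a query: dig prints a header line containing `status: ...`. NOERROR means the server answered, NXDOMAIN ("nx") means the name does not exist, and SERVFAIL means the resolver could not obtain an answer. No status line at all usually means the query timed out. A small sketch that boils dig's output down to one word; `classify_dig_output` is a made-up name, and the address in the usage comment is a placeholder for your pfSense LAN IP.

```shell
#!/bin/sh
# Hypothetical helper: read full dig output on stdin and print one word:
# the DNS status (NOERROR, NXDOMAIN, SERVFAIL, ...), TIMEOUT, or UNKNOWN.
classify_dig_output() {
    awk '
        /status: / {
            # Header looks like: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1234
            s = $0
            sub(/.*status: /, "", s)
            sub(/,.*/, "", s)
            status = s
        }
        /timed out|no servers could be reached/ { timeout = 1 }
        END {
            if (status != "") print status
            else if (timeout) print "TIMEOUT"
            else print "UNKNOWN"
        }
    '
}

# Example calls (192.168.1.1 stands in for the pfSense LAN address):
#   dig @192.168.1.1 www.google.com | classify_dig_output
#   dig @9.9.9.9 www.google.com | classify_dig_output
```

Running it once against pfSense and once against an external resolver during an outage gives the two answers asked for here.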

                                      • R
                                        RickyBaker @Gertjan
                                        last edited by

                                        @Gertjan said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                                        As soon as you find a situation where a device has no access anymore, you have to check :
                                        Does the access without using DNS works ? For example, ping 8.8.8.8 from that device.
                                        Also double check : does the device has a valid IP, gateway and dns set at that moment ?

                                        this is really helpful, thank you, i will try to screenshot this post and enact it as well as possible the minute i notice an outage. BTW I tested a hard wired PC when i had an outage and also observed dns connectivity issues fwiw. but all of this is a very good framework for continuing the troubleshooting

                                        • johnpozJ
                                          johnpoz LAYER 8 Global Moderator @RickyBaker
                                          last edited by

                                          @RickyBaker how do you know that dns is failing? Are you doing an actual query with a tool like dig or nslookup?

                                          Or your browser just doesn't load - for all you know your browser is using doh..

                                          When you have the issue, can your client ping its gateway (pfsense)? Can you ping the internet via IP, 8.8.8.8 for example?

                                          If you can not ping pfsense, then you have a local network issue most likely. If you can not ping the internet - maybe just your internet is out. If you can ping pfsense, can you do a query for pfsense name, this should always work even if the internet is down. Only reason it wouldn't is you can't actually talk to pfsense, or unbound is not running.

                                          Doing some basic connectivity tests and dns queries should point to where your actual problem is.
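The checks above boil down to a small decision table. A minimal sketch, assuming you record each check as "yes" or "no"; `triage` is a hypothetical function name, and the address and hostname in the comments are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: map three yes/no observations to a likely cause.
#   $1 = client can ping pfSense (its gateway)
#   $2 = client can ping an internet IP such as 8.8.8.8
#   $3 = a DNS query to pfSense for the pfSense name itself succeeds
triage() {
    gw="$1"; inet="$2"; localname="$3"
    if [ "$gw" != "yes" ]; then
        echo "local network issue likely (client cannot reach pfSense)"
    elif [ "$localname" != "yes" ]; then
        echo "unbound not running or not reachable on pfSense"
    elif [ "$inet" != "yes" ]; then
        echo "internet connection is probably down"
    else
        echo "connectivity ok; test external names next"
    fi
}

# Example wiring on a Linux or macOS client (placeholders throughout):
#   ping -c 1 192.168.1.1 >/dev/null 2>&1 && gw=yes || gw=no
#   ping -c 1 8.8.8.8 >/dev/null 2>&1 && inet=yes || inet=no
#   dig @192.168.1.1 pfsense.home.arpa | grep -q 'status: NOERROR' && ln=yes || ln=no
#   triage "$gw" "$inet" "$ln"
```

The order follows the reasoning in the post: gateway first, then the local name that should resolve even without internet, then the internet itself.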

                                          • R
                                            RickyBaker @johnpoz
                                            last edited by

                                            @johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                                            did you actually do some dns queries while you were having the issue, to both unbound and say external dns?

                                            at one point when i finally realized i could use dig on the pfsense itself I ran the command you posted to 8.8.8.8 and it worked successfully but I need to test this more thoroughly (i.e. other linux devices not the pfsense) and try 8.8.8.8 as well as google.com. thanks for reminding.

                                            @johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                                            in your custom options box in unbound

                                            I'm sorry, where would I find/set these options?

                                            @johnpoz said in DNS_PROBE_FINISHED_NXDOMAIN sporadically for anywhere from 30secs to 10min. works flawlessly at all other times:

                                            what is the response, timeout talking to unbound, servfail, nx?

                                            again, very sorry but how would I do this? I don't even KNOW what servfail and nx are. In fact, reading the rest of the suggestions I can tell this is an important framework for isolating the issue, but it's just far beyond my grasp of the tools at play. I will google each individual term in hopes of understanding better, but if there's a more specific command you could include for me to run and post, that would be very helpful.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.