Netgate Discussion Forum

pfSense resolver stops working

DHCP and DNS
  • M
    maverickws @johnpoz
    last edited by Jul 27, 2022, 7:33 PM

    @johnpoz ah man I just restarted the service ;_;

    will check that out tomorrow. In the meantime, and seriously, what would be a reasonable expectation for the unbound version to be bumped in a patch release?

    1 Reply Last reply Reply Quote 0
    • M
      maverickws @johnpoz
      last edited by maverickws Jul 28, 2022, 1:49 PM Jul 28, 2022, 9:25 AM

      @johnpoz good morning guys,

      So this morning we saw the connection to Stripe API failing. Tests from the pfSense:

      [22.05-RELEASE][root@pf.net]/root: dig stripe.com A
      
      ; <<>> DiG 9.16.26 <<>> stripe.com A
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48056
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
      
      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 1332
      ;; QUESTION SECTION:
      ;stripe.com.			IN	A
      
      ;; Query time: 0 msec
      ;; SERVER: 127.0.0.1#53(127.0.0.1)
      ;; WHEN: Thu Jul 28 10:19:09 WEST 2022
      ;; MSG SIZE  rcvd: 39
      
      [22.05-RELEASE][root@pf.net]/root: nslookup stripe.com
      Server:		127.0.0.1
      Address:	127.0.0.1#53
      
      ** server can't find stripe.com: SERVFAIL
      

      Checked that stackoverflow.com was still on the DNS cache. So if I do that:

      [22.05-RELEASE][root@pf.net]/root: dig stackoverflow.com A
      
      ; <<>> DiG 9.16.26 <<>> stackoverflow.com A
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9120
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
      
      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 1332
      ;; QUESTION SECTION:
      ;stackoverflow.com.		IN	A
      
      ;; ANSWER SECTION:
      stackoverflow.com.	191	IN	A	151.101.129.69
      stackoverflow.com.	191	IN	A	151.101.193.69
      stackoverflow.com.	191	IN	A	151.101.65.69
      stackoverflow.com.	191	IN	A	151.101.1.69
      
      ;; Query time: 0 msec
      ;; SERVER: 127.0.0.1#53(127.0.0.1)
      ;; WHEN: Thu Jul 28 10:23:13 WEST 2022
      ;; MSG SIZE  rcvd: 110
      
      [22.05-RELEASE][root@pf.net]/root: nslookup stackoverflow.com
      Server:		127.0.0.1
      Address:	127.0.0.1#53
      
      Non-authoritative answer:
      Name:	stackoverflow.com
      Address: 151.101.129.69
      Name:	stackoverflow.com
      Address: 151.101.193.69
      Name:	stackoverflow.com
      Address: 151.101.65.69
      Name:	stackoverflow.com
      Address: 151.101.1.69
      
      

      I get the answers and no error.
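For anyone who wants to script this comparison, here is a minimal sketch (not something posted in the thread) that pulls the response status and answer count out of dig's header lines. The function name `dig_status` is hypothetical, and the two sample strings are simply copied from the outputs above.

```python
import re

def dig_status(dig_output: str) -> dict:
    """Extract the response status and the answer count from dig's header lines."""
    status = re.search(r"status:\s*([A-Z]+)", dig_output)
    answers = re.search(r"ANSWER:\s*(\d+)", dig_output)
    return {
        "status": status.group(1) if status else None,
        "answers": int(answers.group(1)) if answers else 0,
    }

# Header lines copied from the two dig runs above.
SERVFAIL_SAMPLE = """\
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48056
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
"""
NOERROR_SAMPLE = """\
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9120
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
"""

print(dig_status(SERVFAIL_SAMPLE))  # uncached name: resolver fails
print(dig_status(NOERROR_SAMPLE))   # cached name: resolver still answers
```

Running a check like this against one name that is likely cached and one that is not would show the same split as above: cached names still answer while fresh lookups return SERVFAIL.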

      EDIT:
      After these last tests I've added the do-ip6: no option to the resolver. So far, we haven't had any more hiccups. Hoping it mitigates the issue until a proper fix is out at least.

      L 1 Reply Last reply Jul 29, 2022, 2:24 AM Reply Quote 0
      • L
        lohphat @maverickws
        last edited by Jul 29, 2022, 2:24 AM

        @maverickws said in pfSense resolver stops working:

        EDIT:
        After these last tests I've added the do-ip6: no option to the resolver. So far, we haven't had any more hiccups. Hoping it mitigates the issue until a proper fix is out at least.

        I wonder what's the current tally of those with IPv6 enabled and unbound having issues? I conjectured that unbound has been silently suffering memory/heap issues since the updates it has received from 22.01 onward, and that disabling IPv6 makes the problem go away because the memory overhead is reduced.

        Methinks we might have a smoking gun to warrant looking at unbound's memory footprint.

        SG-3100 24.11-RELEASE (arm) | Avahi (2.2_6) | ntopng (5.6.0_1) | openvpn-client-export (1.9.5) | pfBlockerNG-devel (3.2.1_20) | System_Patches (2.2.20_1)

        1 Reply Last reply Reply Quote 0
        • M
          maverickws
          last edited by Jul 29, 2022, 9:18 AM

          Well truth is I haven't had this issue since I've added the do-ip6: no option.
          Everything's running smoothly, no more failed queries. I'm even considering re-enabling dhcp leases just to see if it really has any impact on this or was all due to that option.

          In either case, I also have a pfSense with 22.05 at home and I don't have this issue.
          The difference in the setups would be, the datacenter segment where this was happening only has IPv6 enabled locally, while at my place I do have an IPv6 WAN connection.

          I don't think it's memory related (could be wrong ofc); I've never seen the pfSense anywhere near its limits of either memory or CPU.

          L 1 Reply Last reply Jul 29, 2022, 5:04 PM Reply Quote 0
          • G
            Gertjan @maverickws
            last edited by Jul 29, 2022, 9:45 AM

            @maverickws said in pfSense resolver stops working:

            And we're all OK with that?

            Nope, so I removed the check mark on the "register dhcp leases" setting and was done with it.

            See the existing redmine->pfSense bug reports about this subject.

            Some possible solutions have been mentioned already.
            We all wait for that person that is willing to write the code, some others to test it.
            The usual development sequence.

            Even on 'big' networks with a lot of PC type devices that are always connected, this is (nearly) not noticeable.
            But then came the connect disconnect connect disconnect connect disconnect type of device : our smartphones that go out of wifi range, come back into wifi range, etc. That triggers a new DHCP sequence with the now known side effects. Now you have issues.

            And things became worse : the market was flooded with cheap no-brain devices that renew their lease every 7200 seconds, no matter what.
            So, it's true : that cheap connected doorbell gadget can really destroy your DNS stability.

            @maverickws said in pfSense resolver stops working:

            I also have a pfSense at home which is one version behind (22.01), with pfBlockerNG and these issues do not happen.

            The behaviour of unbound + the dhcpleases process that restarts unbound hasn't changed for the last 2, 3 years or so. It's a pain, we all agree. But a pain with a "go away" button ;)

            If your device @home is a PC, linked up by cable, and asks for a 48 hours lease, it will renew every 24 hours. That's ok.
            If your device has a static IP, it will not initiate a DHCP request == unbound doesn't get restarted by "dhcpleases".
            It all depends on these kinds of details.
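A rough back-of-the-envelope sketch of why the renewal interval matters. It assumes, as described above, that every lease renewal leads to one unbound restart while DHCP lease registration is enabled, and that a well-behaved client renews at half its lease time. The function names and the figure of 20 gadgets are hypothetical, chosen only for illustration; roaming phones that reconnect at random are not modelled.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def renew_interval(lease_seconds: float) -> float:
    """Assumed renewal point: half the lease time (48 h lease -> renew every 24 h)."""
    return lease_seconds / 2

def renewals_per_day(interval_seconds: float) -> float:
    """How many DHCP renewals one device performs per day at a fixed interval."""
    return SECONDS_PER_DAY / interval_seconds

wired_pc = renewals_per_day(renew_interval(48 * 3600))  # PC with a 48 h lease
gadget = renewals_per_day(7200)                         # device renewing every 7200 s

print(wired_pc)     # 1.0
print(gadget)       # 12.0
print(20 * gadget)  # 240.0 renewals per day for 20 such gadgets
```

Under those assumptions a single wired PC accounts for about one restart a day, while a handful of short-interval gadgets can add up to several restarts per hour.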

            @maverickws said in pfSense resolver stops working:

            I can't imagine if it were hundreds or thousands.

            If you need to know the host name (often pure BS like HUAWEI_P30-91b3ex3ab3c5d), that is, you want the "HUAWEI_P30-91b3ex3ab3c5d" in your DNS cache, then yeah, you have an issue.

            That's why I added all (the ones I need to know by host name as they have a GUI or something like that) my known home and company devices as static MAC leases.
            I had to enter 50+ static leases over the last ....10 years ? - and this works fine for me now.

            No "help me" PM's please. Use the forum, the community will thank you.
            Edit : and where are the logs ??

            M 1 Reply Last reply Jul 29, 2022, 10:08 AM Reply Quote 0
            • M
              maverickws @Gertjan
              last edited by maverickws Jul 29, 2022, 10:09 AM Jul 29, 2022, 10:08 AM

              @gertjan thank you for your comment, but unfortunately it seems like it's a bit focused on the dhcp leases option when in truth that option had absolutely nothing to do with it.
              I have had it disabled for two days (just scroll up, I said when it was disabled) and the problem did not cease.

              The problem only ceased with the do-ip6: no option.

              So, despite understanding your explanation (and even agreeing that there isn't any requirement for enabling the dhcp leases, which are not enabled) it focused on something different.

              If your device @home is a PC, linked up by cable, and asks for a 48 hours lease, it will renew every 24 hours. That's ok.

              Just to give an idea, I have an office at home where the mrs and I both work, with two desks, 2 computers, a PBX and IP phones, a local server, and users' devices (phones, smart wearables etc). You still have to account for smart TVs, smart vacuum cleaners, smart scales, smart "different kinds of" alarms, and others. I know we have AT LEAST 20 devices connected at any given moment, and I think I'm undercounting.

              The funny thing is that here at home we have absolutely no issue whatsoever.

              So we must understand I can compare two different segments: Segment A let's say it's home-office, and segment B is the datacenter. So let's compare:

              Segment A:

              • release 22.05;
              • Has WAN IPv6;
              • Has pfBlockerNG;
              • Register DHCP Leases enabled;
              • Has huge lists;
              • No issues have been registered.

              Segment B:

              • release 22.05;
              • Doesn't have WAN IPv6, only local;
              • Does NOT have ANY extra packages except Service Watchdog;
              • Register DHCP Leases disabled;
              • Does not have any kind of huge lists;
              • Issues occur constantly until the do-ip6: no option is added to the resolver.

              I agree with the static lease approach, I do it too. Just that the issue is unrelated because it's not due to that register dhcp lease option.

              johnpozJ G 2 Replies Last reply Jul 29, 2022, 10:26 AM Reply Quote 0
              • johnpozJ
                johnpoz LAYER 8 Global Moderator @maverickws
                last edited by johnpoz Aug 11, 2022, 2:04 PM Jul 29, 2022, 10:26 AM

                @maverickws said in pfSense resolver stops working:

                not due to that register dhcp lease option.

                I don't think so either.. It's just that it's been a common pain point.. Other than the cache clear that happens when unbound restarts, normally this shouldn't be an issue that anyone would really notice. But depending on the setup, the normally quick restart of unbound can take longer - if for whatever reason unbound takes any amount of time to restart and it's happening multiple times an hour, say, this could be noticed by users and become an issue.

                I think we all would like for register of dhcp leases not to cause a restart of unbound - when that might happen not sure..

                But suggesting to turn it off has become a common answer to many users saying they are having issues with unbound.

                But I think these latest issues are not that unbound is not running, or is restarting just as you're trying to resolve, since clearly unbound is resolving local and cached items.. For whatever reason it's having an issue resolving stuff that is no longer in its cache.

                What that might be I do not know.. But you have provided some good info toward figuring that out, I think.. And it's nice to hear that the do-ip6 setting has at least minimized the issues you're seeing.

                The thread that was linked to seems for sure related to what some users have been seeing, and I hope whatever that is gets cleared up when unbound on pfsense is updated from the current 1.15 version, be that with some point release like 22.05.1 or when 22.11 comes out - or maybe 2.7 will have a newer version of unbound?

                Part of the overall problem with finding stuff like this is - well I believe you are seeing an issue, but why am I not etc.. If it was some simple bug then you would think everyone would be seeing it, and not just some specific configuration.. So the hard part is figuring out what the specific thing is in user A's config that is causing this, while users B and C are not seeing it, etc

                An intelligent man is sometimes forced to be drunk to spend time with his fools
                If you get confused: Listen to the Music Play
                Please don't Chat/PM me for help, unless mod related
                SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                1 Reply Last reply Reply Quote 0
                • G
                  Gertjan @maverickws
                  last edited by Jul 29, 2022, 10:56 AM

                  @maverickws said in pfSense resolver stops working:

                  it seems like it's a bit focused on the dhcp leases option

                  True.

                  When the "do-ip6: no" trick, over time, resolves the "resolver stops answering" issue, then that's not DHCP related at all.
                  Actually, the regular restarting of the resolver would make your issue go away : a stalled resolver gets restarted so it will answer again.

                  What I don't understand : I'm using IPv6 and I'm using IPv4.
                  Not that I really need it to work, but I like to have these two up and running.

                  The core question is : Why should your IPv6 be different from mine ?
                  Why does your unbound choke on IPv6 - and not mine ?
                  Or isn't this an IPv6 issue, and is the "do-ip6: no" just a way to cut the number of DNS requests in half, thus lowering internal buffer usage, or just lowering the chance the issue pops up ?

                  No "help me" PM's please. Use the forum, the community will thank you.
                  Edit : and where are the logs ??

                  M 1 Reply Last reply Jul 29, 2022, 11:51 AM Reply Quote 0
                  • M
                    maverickws @Gertjan
                    last edited by Jul 29, 2022, 11:51 AM

                    @gertjan I'm assuming for the same reason that it doesn't choke on my soho setup and it does on the datacenter:

                    At home I have a valid ipv6 wan connection, so I'm assuming it does some resolving via IPv6 link to the world. So at home, since there's a valid WAN IPv6 link, no problems.

                    At the DC, the IPv6 is only enabled locally, this setup does not have external IPv6 connectivity. And I'm assuming this is the exact point that makes the difference.

                    G 1 Reply Last reply Jul 29, 2022, 12:22 PM Reply Quote 0
                    • G
                      Gertjan @maverickws
                      last edited by Jul 29, 2022, 12:22 PM

                      @maverickws said in pfSense resolver stops working:

                      the IPv6 is only enabled locally, this setup does not have external IPv6 connectivity.

                      Imagine :
                      All local devices see IPv6 on their network interface, and all (modern) OSes will prefer IPv6 over IPv4, so DNS requests will be 'AAAA' first. unbound collects all the AAAA info and gives the result back to the local devices, which will initiate an IPv6 connection to the (remote) host.
                      Nothing comes back, the connection will time out, and after a while, everything restarts, this time using classic A requests to get an A for the host.
                      Take note : unbound knows that there is no IPv6 available, and will ask for AAAA over an IPv4 UDP or TCP connection. That's not an issue.
                      IMHO : Informing your local LAN that the DNS/Gateway doesn't 'speak' IPv6 should accelerate overall network fluidity.
                      The local devices can very well talk 'IPv6' among themselves on their local LAN, that's OK.

                      You could also add IPv6 to your DC, he.net IPv6 Tunnel Broker offers you a free static /48 and is rock solid, easy to implement with pfSense. I've been using their services for years already.

                      No "help me" PM's please. Use the forum, the community will thank you.
                      Edit : and where are the logs ??

                      johnpozJ 1 Reply Last reply Jul 29, 2022, 12:24 PM Reply Quote 0
                      • johnpozJ
                        johnpoz LAYER 8 Global Moderator @Gertjan
                        last edited by johnpoz Aug 11, 2022, 1:57 PM Jul 29, 2022, 12:24 PM

                        @gertjan the do-ip6 has nothing to do with AAAA or A, it has to do with unbound using IPv6 to make the query or answer the query.

                           do-ip6: <yes or no>
                                  Enable or disable whether ip6 queries are  answered  or  issued.
                                  Default  is yes.  If disabled, queries are not answered on IPv6,
                                  and queries are not sent on IPv6 to  the  internet  nameservers.
                                  With  this option you can disable the ipv6 transport for sending
                                  DNS traffic, it does not impact the contents of the DNS traffic,
                                  which may have ip4 and ip6 addresses in it.
                        

                        If your goal is to not return AAAA to the client when it asks for it, say for google.com, you can use the option

                        private-address: ::/0

                        An intelligent man is sometimes forced to be drunk to spend time with his fools
                        If you get confused: Listen to the Music Play
                        Please don't Chat/PM me for help, unless mod related
                        SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                        G 1 Reply Last reply Jul 29, 2022, 12:38 PM Reply Quote 0
                        • G
                          Gertjan @johnpoz
                          last edited by Jul 29, 2022, 12:38 PM

                          @johnpoz said in pfSense resolver stops working:

                          it has to do with unbound using IPv6 to make the query or answer the query.

                          @gertjan said in pfSense resolver stops working:

                          Take note : unbound knows that there is no IPv6 available

                          should be : no IPv6 over WAN available.
                          I was convinced that a :

                          still permitted local IPv6 :

                          [22.05-RELEASE][root@pfSense.my-local-mess.net]/root: sockstat -l | grep ":53"
                          unbound  unbound    60716 3  udp4   *:53                  *:*
                          unbound  unbound    60716 4  tcp4   *:53                  *:*
                          unbound  unbound    60716 7  udp6   *:53                  *:*
                          unbound  unbound    60716 8  tcp6   *:53                  *:*
                          

                          I redid the test.
                          The manual and you are right.
                          I see now :

                          [22.05-RELEASE][root@pfSense.getting-better.net]/root: sockstat -l | grep ":53"
                          unbound  unbound    47871 3  udp4   *:53                  *:*
                          unbound  unbound    47871 4  tcp4   *:53                  *:*
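
The before/after check above can also be scripted. This is only a sketch: the function name `ipv6_dns_listeners` is hypothetical, and it assumes the column layout shown in the `sockstat -l` output above (user, command, PID, FD, protocol, local address, foreign address). The sample strings are copied from the two listings in this post.

```python
def ipv6_dns_listeners(sockstat_output: str) -> list:
    """Return the IPv6 protocols (udp6/tcp6) found listening on port 53."""
    hits = []
    for line in sockstat_output.splitlines():
        fields = line.split()
        # Assumed columns: USER COMMAND PID FD PROTO LOCAL FOREIGN
        if len(fields) >= 6 and fields[4] in ("udp6", "tcp6") and fields[5].endswith(":53"):
            hits.append(fields[4])
    return hits

# Listings copied from the post: first without, then with "do-ip6: no".
BEFORE = """\
unbound  unbound    60716 3  udp4   *:53                  *:*
unbound  unbound    60716 4  tcp4   *:53                  *:*
unbound  unbound    60716 7  udp6   *:53                  *:*
unbound  unbound    60716 8  tcp6   *:53                  *:*
"""
AFTER = """\
unbound  unbound    47871 3  udp4   *:53                  *:*
unbound  unbound    47871 4  tcp4   *:53                  *:*
"""

print(ipv6_dns_listeners(BEFORE))  # IPv6 listeners present
print(ipv6_dns_listeners(AFTER))   # no IPv6 listeners left
```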
                          

                          No "help me" PM's please. Use the forum, the community will thank you.
                          Edit : and where are the logs ??

                          johnpozJ 1 Reply Last reply Jul 29, 2022, 12:49 PM Reply Quote 0
                          • johnpozJ
                            johnpoz LAYER 8 Global Moderator @Gertjan
                            last edited by Jul 29, 2022, 12:49 PM

                            @gertjan said in pfSense resolver stops working:

                            knows that there is no IPv6 available

                            You know what ticks me off about dns clients... There is no IPv6 on my psk network, where my rokus sit.. Yet they still ask for AAAA. Why are you asking for an IPv6 address when you don't even have an IPv6 address?? Well it has a link-local address, but come on!!

                            An intelligent man is sometimes forced to be drunk to spend time with his fools
                            If you get confused: Listen to the Music Play
                            Please don't Chat/PM me for help, unless mod related
                            SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                            1 Reply Last reply Reply Quote 0
                            • M
                              maverickws
                              last edited by Jul 29, 2022, 1:31 PM

                              I figure doing the IPv6 lookup makes sense on the local network considering local IPv6 is enabled.
                              Let's say web-server1 and db-server1 are using the ipv6 link locally. They still need to ask the resolver who that host is, and it will return the A and AAAA records. Since IPv6 takes precedence, it makes sense locally.

                              Now what really is the issue here is that unbound is unable to distinguish between local link connectivity and wide-network connectivity, so I'm assuming it tries to query the root servers with IPv6, where no IPv6 connection to that destination is available.

                              In the end I bet that, if looked at closely, those issues will all be related to this (as local ipv6 connectivity is enabled by default iirc) where users don't have IPv6 wan.

                              What would be interesting to understand as well is why this behaviour has changed from previous versions of unbound to the current state. Clearly some sort of logic was present before that prevented this from happening, and it is now gone.

                              johnpozJ 1 Reply Last reply Jul 29, 2022, 2:23 PM Reply Quote 0
                              • johnpozJ
                                johnpoz LAYER 8 Global Moderator @maverickws
                                last edited by Jul 29, 2022, 2:23 PM

                                @maverickws yeah I guess

                                But come on, these streaming boxes don't normally do anything locally. If you do not have a GUA Ipv6 address, why waste cycles asking for AAAA

                                An intelligent man is sometimes forced to be drunk to spend time with his fools
                                If you get confused: Listen to the Music Play
                                Please don't Chat/PM me for help, unless mod related
                                SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                                M 1 Reply Last reply Jul 29, 2022, 2:35 PM Reply Quote 0
                                • M
                                  maverickws @johnpoz
                                  last edited by Jul 29, 2022, 2:35 PM

                                  @johnpoz but in my case they aren't streaming boxes. They're application servers, database servers and the like. The webserver/dbserver was an accurate example of local connections here. We never connect to the web server using IPv6, but the web server does connect to services internally using IPv6. Or used to, I guess.

                                  johnpozJ 1 Reply Last reply Jul 29, 2022, 3:00 PM Reply Quote 0
                                  • johnpozJ
                                    johnpoz LAYER 8 Global Moderator @maverickws
                                    last edited by johnpoz Jul 29, 2022, 3:02 PM Jul 29, 2022, 3:00 PM

                                    @maverickws sorry I might have gotten a bit off topic, I was just bitching about IPv6 dns clients in general...

                                    To me if you don't have a GUA, or at least ULA - there is zero point to asking for AAAA, sure ok maybe you have link local, but link local addresses don't belong in DNS..

                                    https://www.ietf.org/rfc/rfc4472.txt
                                    Operational Considerations and Issues with IPv6 DNS

                                    Section 2.1

                                    Link-local addresses should never be published in DNS (whether in
                                    forward or reverse tree), because they have only local (to the
                                    connected link) significance [WIP-DC2005].
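
To make the GUA / ULA / link-local distinction concrete, here is a small sketch using Python's standard `ipaddress` module. The function name `ipv6_scope` and the example addresses are illustrative, not taken from anyone's network in this thread.

```python
import ipaddress

# Unique local addresses (ULA) live in fc00::/7.
ULA_NET = ipaddress.ip_network("fc00::/7")

def ipv6_scope(addr: str) -> str:
    """Classify an IPv6 address as link-local, ULA, GUA, or other."""
    ip = ipaddress.IPv6Address(addr.split("%")[0])  # drop a zone id such as %em0
    if ip.is_link_local:   # fe80::/10, only meaningful on the connected link
        return "link-local"
    if ip in ULA_NET:
        return "ULA"
    if ip.is_global:
        return "GUA"
    return "other"

for example in ("fe80::1%em0", "fd12:3456:789a::1", "2001:4860:4860::8888"):
    print(example, "->", ipv6_scope(example))
```

By the argument above, a client whose only IPv6 address classifies as link-local has little reason to ask for AAAA records, and such an address should not be published in DNS.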

                                    An intelligent man is sometimes forced to be drunk to spend time with his fools
                                    If you get confused: Listen to the Music Play
                                    Please don't Chat/PM me for help, unless mod related
                                    SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                                    1 Reply Last reply Reply Quote 1
                                    • L
                                      lohphat @maverickws
                                      last edited by lohphat Jul 30, 2022, 6:50 AM Jul 29, 2022, 5:04 PM

                                      @maverickws said in pfSense resolver stops working:

                                      I don't think it's memory related (could be wrong ofc); I've never seen the pfSense anywhere near its limits of either memory or CPU.

                                      It's related to memory allocation unbound uses internally for its local data, not the entire memory on the appliance running out.

                                      See earlier post regarding unbound release 1.16.0 github notes

                                      SG-3100 24.11-RELEASE (arm) | Avahi (2.2_6) | ntopng (5.6.0_1) | openvpn-client-export (1.9.5) | pfBlockerNG-devel (3.2.1_20) | System_Patches (2.2.20_1)

                                      1 Reply Last reply Reply Quote 0
                                      • M
                                        maverickws
                                        last edited by Aug 9, 2022, 7:24 PM

                                        Hi guys I have an update on this, will update if it goes the other way:

                                        I was doing some changes on my home pfsense (where I have pfblockerng etc) and all of a sudden dns went haywire.
                                        Ended up having to add the do-ip6: no option, but that really wasn't making sense as I had updated ages ago and hadn't had issues so far. PLUS I have IPv6 here working well.

                                        So in the end I remembered I had enabled the Experimental Bit 0x20 Support option.
                                        Disabled it, haven't had issues since. It's been a couple of hours.
                                        So I'm wondering how your setups are configured and what the conflict could be.

                                        johnpozJ 1 Reply Last reply Aug 9, 2022, 7:27 PM Reply Quote 0
                                        • johnpozJ
                                          johnpoz LAYER 8 Global Moderator @maverickws
                                          last edited by Aug 9, 2022, 7:27 PM

                                          @maverickws Have had that enabled for YEARS.. zero issues with it.

                                          An intelligent man is sometimes forced to be drunk to spend time with his fools
                                          If you get confused: Listen to the Music Play
                                          Please don't Chat/PM me for help, unless mod related
                                          SG-4860 24.11 | Lab VMs 2.7.2, 24.11

                                          M 1 Reply Last reply Aug 9, 2022, 8:12 PM Reply Quote 0
                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.