Netgate Discussion Forum

Unbound Resolver starts returning SERVFAIL after resolving certain hostnames

DHCP and DNS
  • D
    Derelict LAYER 8 Netgate
    Feb 10, 2015, 2:03 AM

    Coming from here: https://forum.pfsense.org/index.php?topic=87491.msg488407#msg488407  Credit and apologies to those over there for isolating this way to reproduce.

    New thread since this looks to me like a different issue from whatever's going on with DNS server hijacking.

    I am running Unbound in Resolver mode with DNSSEC enabled.  I can routinely tickle this by asking unbound to resolve:

    
    ns3.csof.net
    and/or
    api-nyc01.exip.org
    
    Note that that exip.org hostname has csof name servers.
    
    ns3.csof.net.		600	IN	A	195.22.26.199
    api-nyc01.exip.org.	 10	IN	A	195.22.26.248
    
    

    Note that both of those are in a known hostile netblock.

    Anyway, my resolver has been running fine for days.  No problems until I asked it to resolve those two hostnames.  After doing so, apparently random domains start returning SERVFAIL.

    $ dig forum.pfsense.org

    ; <<>> DiG 9.8.3-P1 <<>> forum.pfsense.org
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 30471
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;forum.pfsense.org. IN A

    ;; Query time: 1781 msec
    ;; SERVER: 192.168.223.1#53(192.168.223.1)
    ;; WHEN: Mon Feb  9 17:46:41 2015
    ;; MSG SIZE  rcvd: 35

    There's one example.  This happens until unbound is restarted.  I did this a couple of times, the last one with unbound at log level 5.  I haven't really looked at the logs yet.
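For anyone trying to reproduce this, a minimal sketch of how the check could be scripted: it pulls the status out of dig's text output so a batch of saved lookups can be scanned for SERVFAIL. The function names `dig_status` and `failing_names` are made up for illustration, and the sample is trimmed from the output above.

```python
import re

# Matches the header line dig prints, e.g.
# ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 30471"
HEADER_RE = re.compile(r"->>HEADER<<-.*?status:\s*([A-Z]+)")


def dig_status(dig_output):
    """Return the response status from dig's text output, or None if absent."""
    match = HEADER_RE.search(dig_output)
    return match.group(1) if match else None


def failing_names(results):
    """Given {hostname: dig_output}, return the hostnames answered with SERVFAIL."""
    return sorted(
        name for name, output in results.items()
        if dig_status(output) == "SERVFAIL"
    )


# Sample trimmed from the dig output in this post.
sample = """;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 30471
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
"""

print(dig_status(sample))  # prints SERVFAIL
```

Feeding it the saved output for a handful of unrelated domains before and after querying the two hostnames should show whether the failures only start afterwards.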

    Chattanooga, Tennessee, USA
    A comprehensive network diagram is worth 10,000 words and 15 conference calls.
    DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
    Do Not Chat For Help! NO_WAN_EGRESS(TM)

    • D
      Derelict LAYER 8 Netgate
      Feb 10, 2015, 2:56 AM

      Without DNSSEC enabled, all I had to do was query these two domain names and then I got this:

      gridbug:etc cjl$ dig www.pfsense.org

      ; <<>> DiG 9.8.3-P1 <<>> www.pfsense.org
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51593
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

      ;; QUESTION SECTION:
      ;www.pfsense.org. IN A

      ;; ANSWER SECTION:
      www.pfsense.org. 10 IN A 195.22.26.248

      ;; AUTHORITY SECTION:
      org. 172779 IN NS ns1.csof.net.
      org. 172779 IN NS ns2.csof.net.
      org. 172779 IN NS ns3.csof.net.
      org. 172779 IN NS ns4.csof.net.

      ;; Query time: 159 msec
      ;; SERVER: 192.168.223.1#53(192.168.223.1)
      ;; WHEN: Mon Feb  9 18:48:39 2015
      ;; MSG SIZE  rcvd: 129

      This looks bad.
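The giveaway in that answer is the AUTHORITY section: NS records for all of `org.` pointing at csof.net. A rough sketch of how that could be flagged automatically from dig's text output follows. The function name `suspicious_delegations` and the idea of matching against a list of suspect suffixes are assumptions for illustration, not anything unbound or pfSense provides.

```python
def suspicious_delegations(dig_output, suspect_suffixes=("csof.net.",)):
    """Return (zone, nameserver) pairs from the AUTHORITY section whose
    nameserver is, or sits under, one of the suspect suffixes."""
    hits = []
    in_authority = False
    for raw in dig_output.splitlines():
        line = raw.strip()
        if line.startswith(";;"):
            # Any ";;" line starts a new section or trailer.
            in_authority = line.startswith(";; AUTHORITY SECTION")
            continue
        if not in_authority or not line:
            continue
        fields = line.split()
        # Expected record layout: zone, TTL, class, type, target
        if len(fields) == 5 and fields[3] == "NS":
            zone, target = fields[0], fields[4].lower()
            if any(target == s or target.endswith("." + s) for s in suspect_suffixes):
                hits.append((zone, target))
    return hits


# Sample trimmed from the dig output in this post.
sample = """;; ANSWER SECTION:
www.pfsense.org. 10 IN A 195.22.26.248

;; AUTHORITY SECTION:
org. 172779 IN NS ns1.csof.net.
org. 172779 IN NS ns2.csof.net.

;; Query time: 159 msec
"""

for zone, ns in suspicious_delegations(sample):
    print(zone, ns)
```

A hit on a top-level zone such as `org.` means the cache now holds a bogus delegation for every name under it, which would explain why unrelated domains break.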

      • C
        cmb
        Feb 10, 2015, 4:21 AM (last edited Feb 10, 2015, 4:26 AM)

        Do you have "harden glue" enabled on the Advanced tab of Unbound? If not, is it still replicable with that enabled?

        • A
          agreenfield1
          Feb 10, 2015, 4:44 AM

          @cmb:

          Do you have "harden glue" enabled on the Advanced tab of Unbound? If not, is it still replicable with that enabled?

          I had experienced the same issue as Derelict, and was able to replicate it in the same way.  I did not have 'harden glue' enabled.  After doing so, I have not been able to replicate the issue!

          Should the default setting for harden-glue be enabled?  The documentation for unbound suggests yes (https://www.unbound.net/documentation/unbound.conf.html), but it was definitely not enabled by default on my system.

          • K
            kejianshi
            Feb 10, 2015, 4:52 AM (last edited Feb 10, 2015, 4:57 AM)

            My settings include…

            In Services: DNS Resolver: Advanced

            Harden Glue

            Harden DNSSEC data

            Unwanted Reply Threshold (10 million)

            Prefetch Support

            Prefetch DNS Key Support

            All those on - I had asked about 10x whether those might be recommended, without an answer.  After trying them for a couple of weeks, I'd say "Yes" - definitely.

            DNSSEC is on also and it's not in forwarder mode.  Anyway - I'd recommend trying with these settings.

            Be sure to reboot everything and clear DNS Cache on all clients after.
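To check those settings against the generated config rather than the GUI, something like the sketch below might help. It reads unbound.conf text and reports which of the options are not at the listed value. The mapping from GUI labels to option names (Harden Glue -> `harden-glue`, Harden DNSSEC data -> `harden-dnssec-stripped`, Unwanted Reply Threshold -> `unwanted-reply-threshold`, Prefetch Support -> `prefetch`, Prefetch DNS Key Support -> `prefetch-key`) is an assumption, and the function names and sample config are made up for illustration.

```python
# Assumed unbound.conf equivalents of the GUI settings listed above.
RECOMMENDED = {
    "harden-glue": "yes",
    "harden-dnssec-stripped": "yes",
    "unwanted-reply-threshold": "10000000",
    "prefetch": "yes",
    "prefetch-key": "yes",
}


def parse_options(conf_text):
    """Collect 'name: value' pairs from unbound.conf text, ignoring comments."""
    options = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        value = value.strip()
        if value:  # skip clause headers such as "server:"
            options[key.strip()] = value
    return options


def missing_hardening(conf_text):
    """Return the recommended options that are absent or set differently."""
    options = parse_options(conf_text)
    return sorted(k for k, v in RECOMMENDED.items() if options.get(k) != v)


# Hypothetical config fragment, not taken from a real pfSense install.
sample = """server:
    harden-glue: no
    harden-dnssec-stripped: yes
    prefetch: yes  # refresh popular entries before they expire
"""

print(missing_hardening(sample))
# prints ['harden-glue', 'prefetch-key', 'unwanted-reply-threshold']
```

This is a plain line-based reader, not a full unbound.conf parser, so it does not follow `include:` files or distinguish between clauses.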

            • D
              Derelict LAYER 8 Netgate
              Feb 10, 2015, 4:56 AM

              Harden Glue appears to correct this, but that's pretty anecdotal.

              • C
                cmb
                Feb 10, 2015, 4:58 AM

                Judging by the DNS traffic I captured when replicating that, harden glue should fix it. I changed the default in new configs to enabled, and we'll add config upgrade code so anyone who doesn't already have it enabled will have that changed upon upgrade to 2.2.1.

                • K
                  kejianshi
                  Feb 10, 2015, 5:01 AM

                  Not so much anecdotal.

                  People are poisoning your cache either with malicious DNS records or with man-on-the-side attacks or both.

                  Those settings are to prevent such things.  Although, IMHO the DNS protocol is a broken piece of crap and needs to be replaced with something that both encrypts and authenticates.

                  I'm sure that would introduce some latency, but my god…  It's ridiculous.  Current DNS is about as secure as FTP and equally in need of being phased out.

                  • D
                    doktornotor Banned
                    Feb 10, 2015, 7:43 AM (last edited Feb 10, 2015, 7:48 AM)

                    @Derelict:

                    Harden Glue appears to correct this, but that's pretty anecdotal.

                    Never could reproduce this local issue… I have harden-glue: yes enabled everywhere. So, sounds like a pretty good guess I'd say.

                    @cmb: Can we get harden-referral-path exposed in the GUI as well? (Probably not default on, but visible.) Also, harden-below-nxdomain.

Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.