Netgate Discussion Forum

    DNS crashing every ~ 36 hours or so and unbound has to be restarted.

      gawainxx

      I've been having an issue where DNS craps out every other day or so, and I have to restart the DNS (unbound?) service to restore it.

      The only thing I've really done in the time period before this began was configuring log forwarding to a Splunk instance...
      I suppose it could be related and I might try turning it back off although that will hamper my ability to collect troubleshooting information.

      Can someone please provide any insight such as possible causes and processes or other things to focus on when looking through the syslogs?

        gawainxx

        Think I may have found something.

        I see the following entries right before a complete absence of any logging data from unbound until the service is restarted.

        12/6/19 3:13:42.000 AM
        Dec  6 03:13:39 unbound: [92636:0] fatal error: Could not read config file: /unbound.conf. Maybe try unbound -dd, it stays on the commandline to see more errors, or unbound-checkconf
        host = gateway   source = udp:7001   sourcetype = pfsense:unbound

        12/6/19 3:13:42.000 AM
        Dec  6 03:13:39 unbound: [92636:0] fatal error: Could not read config file: /unbound.conf. Maybe try unbound -dd, it stays on the commandline to see more errors, or unbound-checkconf
        host = gateway   source = udp:7001   sourcetype = pfsense:unbound

        12/6/19 3:13:42.000 AM
        Dec  6 03:13:39 unbound: [92636:0] notice: Restart of unbound 1.9.1.
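
To spot this kind of silence in an exported log file, a minimal Python sketch along these lines may help. It assumes BSD-style syslog lines like the ones above, a year supplied by the caller (syslog timestamps omit it), and an arbitrary 10-minute threshold. `parse_ts` and `find_gaps` are illustrative names, and the second sample line is invented for the demonstration.

```python
from datetime import datetime, timedelta


def parse_ts(line, year=2019):
    """Parse the leading 'Dec  6 03:13:39' timestamp of a syslog line.

    Syslog timestamps carry no year, so the caller has to supply one.
    A log spanning New Year is not handled by this sketch.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, min_gap=timedelta(minutes=10), year=2019):
    """Return (last_seen, next_seen, last_line) for each silence longer than min_gap."""
    gaps = []
    prev_ts, prev_line = None, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            ts = parse_ts(line, year)
        except ValueError:
            # Not a syslog line (for example a Splunk metadata row): skip it.
            continue
        if prev_ts is not None and ts - prev_ts > min_gap:
            gaps.append((prev_ts, ts, prev_line))
        prev_ts, prev_line = ts, line
    return gaps


sample = [
    "Dec  6 03:13:39 unbound: [92636:0] fatal error: Could not read config file: /unbound.conf.",
    # Hypothetical line standing in for the first entry after a manual restart.
    "Dec  6 09:02:10 unbound: [1234:0] notice: init module 0: validator",
]

for start, end, last in find_gaps(sample):
    print(f"silent from {start} to {end}; last line before the gap: {last}")
```

The last line before each gap is usually the most interesting one to read, which is why the function returns it alongside the two timestamps.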
        
          Gertjan

          @gawainxx said in DNS crashing every ~ 36 hours or so and unbound has to be restarted.:

          host = gateway   source = udp:7001   sourcetype = pfsense:unboun

          Hi,
          Can you show the unbound.conf file ?
          It's here : /var/unbound/unbound.conf

          (and not here in the root = /unbound.conf)

          No "help me" PM's please. Use the forum, the community will thank you.
          Edit : and where are the logs ??

              gawainxx @Gertjan

              @Gertjan
              Sorry, I had a derp moment and was accessing the wrong server.

              
              ##########################
              # Unbound Configuration
              ##########################
              
              ##
              # Server configuration
              ##
              server:
              
              chroot: /var/unbound
              username: "unbound"
              directory: "/var/unbound"
              pidfile: "/var/run/unbound.pid"
              use-syslog: yes
              port: 53
              verbosity: 1
              hide-identity: yes
              hide-version: yes
              harden-glue: yes
              do-ip4: yes
              do-ip6: yes
              do-udp: yes
              do-tcp: yes
              do-daemonize: yes
              module-config: "validator iterator"
              unwanted-reply-threshold: 0
              num-queries-per-thread: 4096
              jostle-timeout: 200
              infra-host-ttl: 900
              infra-cache-numhosts: 10000
              outgoing-num-tcp: 10
              incoming-num-tcp: 10
              edns-buffer-size: 4096
              cache-max-ttl: 86400
              cache-min-ttl: 0
              harden-dnssec-stripped: yes
              msg-cache-size: 4m
              rrset-cache-size: 8m
              
              num-threads: 2
              msg-cache-slabs: 2
              rrset-cache-slabs: 2
              infra-cache-slabs: 2
              key-cache-slabs: 2
              outgoing-range: 4096
              #so-rcvbuf: 4m
              auto-trust-anchor-file: /var/unbound/root.key
              prefetch: no
              prefetch-key: no
              use-caps-for-id: no
              serve-expired: no
              # Statistics
              # Unbound Statistics
              statistics-interval: 0
              extended-statistics: yes
              statistics-cumulative: yes
              
              # TLS Configuration
              tls-cert-bundle: "/etc/ssl/cert.pem"
              
              # Interface IP(s) to bind to
              interface-automatic: yes
              interface: 0.0.0.0
              interface: ::0
              
              # Outgoing interfaces to be used
              
              # DNS Rebinding
              # For DNS Rebinding prevention
              private-address: 10.0.0.0/8
              private-address: ::ffff:a00:0/104
              private-address: 172.16.0.0/12
              private-address: ::ffff:ac10:0/108
              private-address: 169.254.0.0/16
              private-address: ::ffff:a9fe:0/112
              private-address: 192.168.0.0/16
              private-address: ::ffff:c0a8:0/112
              private-address: fd00::/8
              private-address: fe80::/10
              # Set private domains in case authoritative name server returns a Private IP address
              private-domain: "_msdcs.britannia2.local"
              domain-insecure: "_msdcs.britannia2.local"
              private-domain: "britannia2.local"
              domain-insecure: "britannia2.local"
              
              
              # Access lists
              include: /var/unbound/access_lists.conf
              
              # Static host entries
              include: /var/unbound/host_entries.conf
              
              # dhcp lease entries
              include: /var/unbound/dhcpleases_entries.conf
              
              
              
              # Domain overrides
              include: /var/unbound/domainoverrides.conf
              
              
              # Unbound custom options
              server: 
              # Allow plex to work over LAN
              private-domain: "plex.direct"
              # Configuration for Britannia2.local with the PDC of mordred.britannia2.local
              local-data: "_ldap._tcp.your.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.Default-First-Site-Name._sites.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.pdc._msdcs.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.gc._msdcs.britannia2.local 600 IN SRV 0 100 3268 mordred.britannia2.local"
              local-data: "_ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.britannia2.local 600 IN SRV 0 100 3268 mordred.britannia2.local"
              local-data: "_ldap._tcp.30e36ab8-a6ac-4c64-85aa-0fbeb612a33b.domains._msdcs.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "d4f866aa-a210-4c29-81a2-ebb256bdef7d._msdcs.britannia2.local 600 IN CNAME mordred.britannia2.local"
              local-data: "_kerberos._tcp.dc._msdcs.britannia2.local 600 IN SRV 0 100 88 mordred.britannia2.local"
              local-data: "_kerberos._tcp.Default-First-Site-Name._sites.dc._msdcs.britannia2.local 600 IN SRV 0 100 88 mordred.britannia2.local"
              local-data: "_ldap._tcp.dc._msdcs.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_kerberos._tcp.britannia2.local 600 IN SRV 0 100 88 mordred.britannia2.local"
              local-data: "_kerberos._tcp.Default-First-Site-Name._sites.britannia2.local 600 IN SRV 0 100 88 mordred.britannia2.local"
              local-data: "_gc._tcp.britannia2.local 600 IN SRV 0 100 3268 mordred.britannia2.local"
              local-data: "_gc._tcp.Default-First-Site-Name._sites.britannia2.local 600 IN SRV 0 100 3268 mordred.britannia2.local"
              local-data: "_kerberos._udp.britannia2.local 600 IN SRV 0 100 88 mordred.britannia2.local"
              local-data: "_kpasswd._tcp.britannia2.local 600 IN SRV 0 100 464 mordred.britannia2.local"
              local-data: "_kpasswd._udp.britannia2.local 600 IN SRV 0 100 464 mordred.britannia2.local"
              local-data: "_ldap._tcp.ForestDnsZones.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.Default-First-Site-Name._sites.ForestDnsZones.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.DomainDnsZones.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "_ldap._tcp.Default-First-Site-Name._sites.DomainDnsZones.britannia2.local 600 IN SRV 0 100 389 mordred.britannia2.local"
              local-data: "britannia2.local 600 IN A 192.168.4.5"
              local-data: "britannia2.local  600 IN A 192.168.4.5"
              local-data: "gc._msdcs.britannia2.local 600 IN A 192.168.4.5"
              local-data: "gc._msdcs.britannia2.local 600 IN A 192.168.4.5"
              local-data: "ForestDnsZones.britannia2.local 600 IN A 192.168.4.5"
              local-data: "ForestDnsZones.britannia2.local 600 IN A 192.168.4.5"
              local-data: "DomainDnsZones.britannia2.local 600 IN A 192.168.4.5"
              local-data: "DomainDnsZones.britannia2.local 600 IN A 192.168.4.5"
              
              
              ###
              # Remote Control Config
              ###
              include: /var/unbound/remotecontrol.conf
              
              
                Gertjan

                Looks pretty normal to me.

                  stephenw10 Netgate Administrator

                  Do you have pfBlocker installed with DNS-BL enabled? I don't see it in the conf file, but it would update the file, potentially causing a problem.

                  Steve

                    Gertjan

                    @gawainxx said in DNS crashing every ~ 36 hours or so and unbound has to be restarted.:

                    config file

                    https://github.com/NLnetLabs/unbound/blob/e828d678bafb7ef0df32623f6883bc4bdc07dc5b/daemon/unbound.c#L664

                    The config file path is actually OK: /unbound.conf is the file name relative to the chroot directory (/var/unbound).
                    Did the chroot go wrong? => File system errors?
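
A small Python sketch of that path mapping, assuming the logged name is reported relative to the chroot as described above. `host_path` and `path_in_chroot` are illustrative helpers, not part of unbound or pfSense.

```python
from pathlib import PurePosixPath


def host_path(chroot, path_inside):
    """Map a path as seen inside a chroot to the path on the host filesystem."""
    inside = PurePosixPath(path_inside)
    if not inside.is_absolute():
        raise ValueError("expected an absolute path as seen inside the chroot")
    # Drop the leading "/" and re-root the remaining parts under the chroot.
    return PurePosixPath(chroot).joinpath(*inside.parts[1:])


def path_in_chroot(chroot, path_on_host):
    """Inverse mapping: how a host path looks from inside the chroot.

    Raises ValueError if the host path is not located under the chroot.
    """
    relative = PurePosixPath(path_on_host).relative_to(chroot)
    return PurePosixPath("/") / relative


# With "chroot: /var/unbound" from the posted config:
print(host_path("/var/unbound", "/unbound.conf"))
print(path_in_chroot("/var/unbound", "/var/unbound/unbound.conf"))
```

Read this way, the "/unbound.conf" in the error message and the /var/unbound/unbound.conf on disk are the same file, so the error points at the read failing rather than at a wrong file name.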

                      gawainxx

                      Died again, help plox!

                        stephenw10 Netgate Administrator

                        Need more info to help further. What's logged in the system log when it fails? Or just before it fails?

                          gawainxx @stephenw10

                          @stephenw10 I'll grab those logs for you in a bit, once I'm able to access my network again. I'm currently remote and locked out due to the issue.

                          My logs are divided by process within Splunk.
                          Is there a specific process that would have the most relevant log data?

                            gawainxx @gawainxx

                            Here are some logs, they are csvs renamed to .txt
                            Unbound logs from 10:40am - 3 PM
                            1576040675_650.txt
                            System logs from 10:40am - 3 PM
                            1576040737_651.txt

                            I've also purged the DC-related entries from my unbound config to see if that makes a difference, as I really only use them for labs/training stuff. I also adjusted my firewall rules so that I can reach the router webUI over OpenVPN; it would have saved me a lot of headache if I could have just restarted it via OpenVPN.

                              Gertjan

                              unbound is stopped and restarted.

                              More logs are needed to see which process is doing this. It could also be a hardware event like a "LINK DOWN / LINK UP"

                              Btw : this "plunked" unbound log is close to totally unreadable : possible to see the original one ?
                              And while testing, can snort be sent on a holiday ? What is snort protecting ?

                                Gertjan @Gertjan

                                @Gertjan said in DNS crashing every ~ 36 hours or so and unbound has to be restarted.:

                                File system errors ?

                                ?

                                  stephenw10 Netgate Administrator

                                  Yeah, very hard to read that. You should export the snort logs separately and not log them to the main system log; that makes it much easier to see actual system events.
                                  But anyway nothing seems to be logged there, not much to go on.

                                  Running a filesystem check is probably a good idea.

                                  Steve

                                    gawainxx @stephenw10

                                    @stephenw10
                                    Unfortunately the system logs have already looped and only go as far back as this morning.

                                    I've set up service watchdog to monitor the unbound process which will hopefully prevent the issue from causing extended outages while I work on getting it figured out. Snort is protecting my home network as well as a few miscellaneous things, mostly running for added security, I'm using one of the lighter pre-defined snort ruleset bundles.

                                    I've exported the data from splunk in a raw format, perhaps that will be closer to the original?

                                    Here are my logs from yesterday. The outage was around 10:56am, after which there is an absolute absence of unbound log data until I had someone at home restart the server via console.

                                    UnboundIssues_SystemLogs.txt
                                    UnboundIssues_UnboundLogs.txt
                                    UnboundIssues_SnortLogs.txt
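
Separately from the Service Watchdog package, a resolver can also be probed from another machine to record exactly when it stops answering. The following is a minimal Python sketch using only the standard library. The function names are illustrative, and the server address and query name in the usage comment are placeholders rather than values from this thread.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def resolver_answers(server, name="example.com", timeout=2.0, port=53):
    """Return True if the server sends back a DNS response to our query.

    Any response with a matching ID and the QR bit set counts, including
    error responses; the point is only to see whether the daemon is alive.
    """
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable hosts
            return False
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


# Usage (placeholder address for the pfSense LAN interface):
#   print(resolver_answers("192.168.1.1"))
```

Run periodically and logged with a timestamp, a probe like this gives an outage time that does not depend on unbound's own logging, which is exactly what goes quiet here.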

                                      stephenw10 Netgate Administrator

                                      Nothing logged but that file error seems like a permissions issue.

                                      I would definitely run the file system check. I would consider just reinstalling and restoring, it's usually pretty quick.

                                      Steve

                                          gawainxx @stephenw10

                                          @stephenw10

                                          Thanks,
                                          Here are the things I'm currently planning to do, in order, moving to the next one if I see the service failure in the logs afterwards.

                                          • Gutting all non-critical code from my unbound.conf (Awaiting results on this currently).
                                          • SSHing to the router and running a filesystem check.
                                          • Toggling snort
                                          • Toggling Avahi
                                          • Toggling NUT
                                          • Reload and restore

                                          Seem fair?

                                            bmeeks @gawainxx

                                            @gawainxx said in DNS crashing every ~ 36 hours or so and unbound has to be restarted.:

                                            @stephenw10

                                            Thanks,
                                            Here are the things I'm currently planning to do, in order, moving to the next one if I see the service failure in the logs afterwards.

                                            • Gutting all non-critical code from my unbound.conf (Awaiting results on this currently).
                                            • SSHing to the router and running a filesystem check.
                                            • Toggling snort
                                            • Toggling Avahi
                                            • Toggling NUT
                                            • Reload and restore

                                            Seem fair?

                                            Just an FYI. Snort and Unbound have absolutely nothing to do with each other in terms of Unbound starting or stopping. However, the DNSBL function of pfBlockerNG does rewrite the unbound.conf file and that can lead to Unbound issues.

                                            While troubleshooting it is certainly prudent to stop Snort to remove that variable, but Snort running or not will have no impact on Unbound stopping and failing to restart.
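
One way to test whether something is rewriting unbound.conf around the time of a failure is to fingerprint the file periodically and note when the fingerprint changes. This is a minimal Python sketch of that idea. The helper names are illustrative, the path in the usage comment is the one given earlier in the thread, and nothing here is specific to pfBlockerNG.

```python
import hashlib
from pathlib import Path


def digest(data: bytes) -> str:
    """SHA-256 fingerprint of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def file_digest(path) -> str:
    """Fingerprint of a file's current contents."""
    return digest(Path(path).read_bytes())


def check_once(path, last_digest):
    """Return (changed, current_digest) for one polling round.

    The first round (last_digest is None) never reports a change.
    """
    current = file_digest(path)
    changed = last_digest is not None and current != last_digest
    return changed, current


# Usage sketch: call this from a scheduled job and log the time of each change.
#   changed, last = check_once("/var/unbound/unbound.conf", last)
```

If a change is logged shortly before each outage, that points toward whatever rewrites the file; if the file never changes, that variable can be ruled out.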

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.