Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Unbound Failing after Short Outage

    Scheduled Pinned Locked Moved DHCP and DNS
    2 Posts 2 Posters 329 Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • G
      guardian Rebel Alliance
      last edited by guardian

      For some reason I am losing DNS Resolution after a short ISP outage. I'm hoping someone can give me some guidance as to how make sure that unbound restarts promptly when internet comes back after an outage. I have highlighted my questions.

      I'm running pfSense community edition 2.6.0 on a generic J1900 box with Intel NICs and 4GB of RAM. Resource utilization doesn't seem to be an issue.

      My wife complained of "no internet", and when I logged into pfSense and did a ping 8.8.8.8 I got ping replies so I restarted unbound. After doing so things were working again.

      Here's my analysis of what I believe what happened (I have posted more complete logs below for context).

      WAN em0 experienced a short outage:

      Nov  1 10:46:21 fwrouter rc.gateway_alarm[8390]: >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:17.900ms RTTsd:2.951ms Loss:21%)
      Nov  1 10:51:46 fwrouter kernel: em0: link state changed to DOWN
      Nov  1 10:51:53 fwrouter kernel: em0: link state changed to UP
      

      FYI this message is from a watchdog script that I built for my VPN:

      Nov  1 10:50:01 fwrouter root[25004]: ***** 22/11/01-10:50:00-VPN_UDP_TOS_VPNV4-Attempting to issue restart to openvpn service... openvpn has been restarted.
      

      I had trouble awhile back with the VPN losing connectivity, and a manual restart of the OpenVPN service was required to get it running agaion, so I just wrote a simple script that restarts the VPN every 5 minutes if it goes down as long as the WAN is up. Given the time stamp and the Gateway alarm, it appears as if connectivity was lost about 10:45

      Unbound Crashed:

      Nov  1 10:50:33 fwrouter php-fpm[47855]: /rc.linkup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1667314233] unbound[36793:0] error: bind: address already in use [1667314233] unbound[36793:0] fatal error: could not open ports'
      

      What is this? I assume it is due to the Unbound issue.

      Nov  1 10:50:34 fwrouter php-fpm[47855]: /rc.linkup: The command '/usr/sbin/arp -s '10.10.50.1' '00:28:1a:XX:XX:X5'' returned exit code '1', the output was 'arp: writing to routing socket: Operation not permitted'
      

      What is this? Does this mean Unbound is running out of cache memory?
      debug: cache memory msg=516099 rrset=1547265 infra=578247 val=0

      Is this is because the link is down? Or is there likely something else going on?

      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: request nleditor.osi.office.net. has exceeded the maximum number of glue fetches 60 to a single delegation point
      
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: request nleditor.osi.office.net. has exceeded the maximum number of glue fetches 60 to a single delegation point
      

      I have noticed a number of posts regarding unbound being "somewhat unreliable", but I was wondering if this is likely due to misconfiguration on my part and/or if there is a good workaround so I don't have to deal with a long term internet outage?

      Any suggestions/assistance is much appreciated, and I have included more log info below.

      Relevent syslog Entries:

      Nov  1 10:46:21 fwrouter rc.gateway_alarm[8390]: >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:17.900ms RTTsd:2.951ms Loss:21%)
      Nov  1 10:46:21 fwrouter check_reload_status[396]: updating dyndns WAN_DHCP
      Nov  1 10:46:21 fwrouter check_reload_status[396]: Restarting IPsec tunnels
      Nov  1 10:46:21 fwrouter check_reload_status[396]: Restarting OpenVPN tunnels/interfaces
      Nov  1 10:46:21 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:46:21 fwrouter rc.gateway_alarm[9850]: >>> Gateway alarm: VPN_UDP_TOS_VPNV4 (Addr:8.8.4.4 Alarm:1 RTT:36.706ms RTTsd:2.263ms Loss:22%)
      Nov  1 10:46:21 fwrouter check_reload_status[396]: updating dyndns VPN_UDP_TOS_VPNV4
      Nov  1 10:46:21 fwrouter check_reload_status[396]: Restarting IPsec tunnels
      Nov  1 10:46:21 fwrouter check_reload_status[396]: Restarting OpenVPN tunnels/interfaces
      Nov  1 10:46:21 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:46:22 fwrouter php-fpm[33266]: /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
      Nov  1 10:46:22 fwrouter php-fpm[33266]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
      Nov  1 10:46:22 fwrouter php-fpm[47855]: /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
      Nov  1 10:46:22 fwrouter php-fpm[47855]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use VPN_UDP_TOS_VPNV4.
      Nov  1 10:48:32 fwrouter kernel: em0: link state changed to DOWN
      Nov  1 10:48:32 fwrouter check_reload_status[396]: Linkup starting em0
      Nov  1 10:48:33 fwrouter php-fpm[80175]: /rc.linkup: DEVD Ethernet detached event for wan
      Nov  1 10:48:34 fwrouter php-fpm[80175]: /rc.linkup: Removing static route for monitor 8.8.4.4 and adding a new route through 10.15.110.7
      Nov  1 10:48:34 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:48:36 fwrouter rc.gateway_alarm[96558]: >>> Gateway alarm: VPN_UDP_TOS_VPNV4 (Addr:8.8.4.4 Alarm:1 RTT:0.000ms RTTsd:0.000ms Loss:100%)
      Nov  1 10:48:36 fwrouter check_reload_status[396]: updating dyndns VPN_UDP_TOS_VPNV4
      Nov  1 10:48:36 fwrouter check_reload_status[396]: Restarting IPsec tunnels
      Nov  1 10:48:36 fwrouter check_reload_status[396]: Restarting OpenVPN tunnels/interfaces
      Nov  1 10:48:36 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:48:37 fwrouter php-fpm[33266]: /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
      Nov  1 10:48:37 fwrouter php-fpm[33266]: /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use VPN_UDP_TOS_VPNV4.
      Nov  1 10:48:39 fwrouter check_reload_status[396]: Linkup starting em0
      Nov  1 10:48:39 fwrouter kernel: em0: link state changed to UP
      Nov  1 10:48:40 fwrouter php-fpm[47855]: /rc.linkup: DEVD Ethernet attached event for wan
      Nov  1 10:48:40 fwrouter php-fpm[47855]: /rc.linkup: HOTPLUG: Configuring interface wan
      Nov  1 10:49:00 fwrouter sshguard[15203]: Exiting on signal.
      Nov  1 10:49:00 fwrouter sshguard[86330]: Now monitoring attacks.
      Nov  1 10:49:21 fwrouter kernel: ovpnc1: link state changed to DOWN
      Nov  1 10:49:21 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:49:51 fwrouter check_reload_status[396]: rc.newwanip starting em0
      Nov  1 10:49:51 fwrouter php-fpm[47855]: /rc.linkup: Gateway, none 'available' for inet6, use the first one configured. ''
      Nov  1 10:49:51 fwrouter check_reload_status[396]: Restarting IPsec tunnels
      Nov  1 10:49:52 fwrouter php-fpm[17928]: /rc.newwanip: rc.newwanip: Info: starting on em0.
      Nov  1 10:49:52 fwrouter php-fpm[17928]: /rc.newwanip: rc.newwanip: on (IP address: 99.XXX.XXX.XXX) (interface: WAN[wan]) (real interface: em0).
      Nov  1 10:49:52 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:49:58 fwrouter php-fpm[47855]: /rc.linkup: The command '/usr/sbin/arp -s '10.10.50.1' '00:28:1a:XX:XX:X5'' returned exit code '1', the output was 'arp: writing to routing socket: Operation not permitted' 
      Nov  1 10:49:59 fwrouter check_reload_status[396]: updating dyndns wan
      Nov  1 10:50:00 fwrouter sshguard[86330]: Exiting on signal.
      Nov  1 10:50:00 fwrouter sshguard[82308]: Now monitoring attacks.
      Nov  1 10:50:01 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:50:01 fwrouter root[25004]: ***** 22/11/01-10:50:00-VPN_UDP_TOS_VPNV4-Attempting to issue restart to openvpn service... openvpn has been restarted.
      Nov  1 10:50:33 fwrouter php-fpm[47855]: /rc.linkup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1667314233] unbound[36793:0] error: bind: address already in use [1667314233] unbound[36793:0] fatal error: could not open ports' 
      Nov  1 10:50:34 fwrouter php-fpm[47855]: /rc.linkup: The command '/usr/sbin/arp -s '10.10.50.1' '00:28:1a:XX:XX:X5'' returned exit code '1', the output was 'arp: writing to routing socket: Operation not permitted' 
      Nov  1 10:50:34 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:51:46 fwrouter kernel: em0: link state changed to DOWN
      Nov  1 10:51:46 fwrouter check_reload_status[396]: Linkup starting em0
      Nov  1 10:51:47 fwrouter php-fpm[38020]: /rc.linkup: DEVD Ethernet detached event for wan
      Nov  1 10:51:48 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:51:53 fwrouter check_reload_status[396]: Linkup starting em0
      Nov  1 10:51:53 fwrouter kernel: em0: link state changed to UP
      Nov  1 10:51:54 fwrouter php-fpm[91311]: /rc.linkup: DEVD Ethernet attached event for wan
      Nov  1 10:51:54 fwrouter php-fpm[91311]: /rc.linkup: HOTPLUG: Configuring interface wan
      Nov  1 10:53:06 fwrouter check_reload_status[396]: rc.newwanip starting em0
      Nov  1 10:53:06 fwrouter php-fpm[91311]: /rc.linkup: Gateway, none 'available' for inet6, use the first one configured. ''
      Nov  1 10:53:06 fwrouter check_reload_status[396]: Restarting IPsec tunnels
      Nov  1 10:53:07 fwrouter php-fpm[80175]: /rc.newwanip: rc.newwanip: Info: starting on em0.
      Nov  1 10:53:07 fwrouter php-fpm[80175]: /rc.newwanip: rc.newwanip: on (IP address: 99.XXX.XXX.XXX) (interface: WAN[wan]) (real interface: em0).
      Nov  1 10:53:07 fwrouter check_reload_status[396]: Reloading filter
      Nov  1 10:53:12 fwrouter php-fpm[91311]: /rc.linkup: The command '/usr/sbin/arp -s '10.10.50.1' '00:28:1a:XX:XX:X5'' returned exit code '1', the output was 'arp: writing to routing socket: Operation not permitted' 
      Nov  1 10:53:12 fwrouter check_reload_status[396]: updating dyndns wan
      Nov  1 10:53:46 fwrouter php-fpm[91311]: /rc.linkup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1667314426] unbound[41349:0] error: bind: address already in use [1667314426] unbound[41349:0] fatal error: could not open ports' 
      Nov  1 10:53:47 fwrouter php-fpm[91311]: /rc.linkup: The command '/usr/sbin/arp -s '10.10.50.1' '00:28:1a:XX:XX:X5'' returned exit code '1', the output was 'arp: writing to routing socket: Operation not permitted' 
      Nov  1 10:53:47 fwrouter check_reload_status[396]: Reloading filter
      

      Sample resolver.log Entries:

      Nov  1 10:47:59 guardian unbound[74982]: [74982:2] debug: sending to target: <com.> 192.52.178.30#53
      Nov  1 10:47:59 guardian unbound[74982]: [74982:2] debug: cache memory msg=516099 rrset=1547265 infra=578247 val=0
      Nov  1 10:48:00 guardian unbound[74982]: [74982:3] debug: iterator[module 0] operate: extstate:module_wait_reply event:module_event_noreply
      Nov  1 10:48:00 guardian unbound[74982]: [74982:3] info: iterator operate: query canadaeast1-0.pushnp.svc.ms. A IN
      Nov  1 10:48:00 guardian unbound[74982]: [74982:3] info: processQueryTargets: canadaeast1-0.pushnp.svc.ms. A IN
      Nov  1 10:48:00 guardian unbound[74982]: [74982:3] info: sending query: canadaeast1-0.pushnp.svc.ms. A IN
      Nov  1 10:48:00 guardian unbound[74982]: [74982:3] debug: sending to target: <ms.> 185.28.194.194#53
      Nov  1 10:48:00 guardian unbound[74982]: [74982:3] debug: cache memory msg=516099 rrset=1547265 infra=578247 val=0
      Nov  1 10:48:00 guardian unbound[74982]: [74982:2] debug: cache memory msg=516099 rrset=1547265 infra=578247 val=0
      Nov  1 10:48:00 guardian unbound[74982]: [74982:1] debug: cache memory msg=516099 rrset=1547265 infra=578247 val=0
      Nov  1 10:48:00 guardian unbound[74982]: [74982:0] debug: iterator[module 0] operate: extstate:module_wait_reply event:module_event_noreply
      Nov  1 10:48:00 guardian unbound[74982]: [74982:0] info: iterator operate: query ssl.gstatic.com. A IN
      Nov  1 10:48:00 guardian unbound[74982]: [74982:0] info: processQueryTargets: ssl.gstatic.com. A IN
      Nov  1 10:48:00 guardian unbound[74982]: [74982:0] info: sending query: ssl.gstatic.com. A IN
      Nov  1 10:48:00 guardian unbound[74982]: [74982:0] debug: sending to target: <gstatic.com.> 216.239.36.10#53
      Nov  1 10:48:00 guardian unbound[74982]: [74982:0] debug: cache memory msg=516099 rrset=1547265 infra=578247 val=0
      Nov  1 10:48:01 guardian unbound[74982]: [74982:3] debug: iterator[module 0] operate: extstate:module_wait_reply event:module_event_noreply
      [SNIP]
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] info: sending query: nleditor.osi.office.net. A IN
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: sending to target: <net.> 192.31.80.30#53
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: error sending query to auth server 192.31.80.30 port 53
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] info: processQueryTargets: nleditor.osi.office.net. A IN
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: request nleditor.osi.office.net. has exceeded the maximum number of glue fetches 60 to a single delegation point
      [SNIP]
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] info: sending query: . NS IN
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: sending to target: <.> 193.0.14.129#53
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: error sending query to auth server 193.0.14.129 port 53
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] info: processQueryTargets: . NS IN
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] debug: request . has exceeded the maximum number of glue fetches 34 to a single delegation point
      Nov  1 10:48:36 guardian unbound[74982]: [74982:0] info: sending query: . NS IN
      [SNIP]
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: new pside target D.ROOT-SERVERS.NET. A IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: iterator[module 0] operate: extstate:module_state_initial event:module_event_pass
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: iterator operate: query D.ROOT-SERVERS.NET. A IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: resolving D.ROOT-SERVERS.NET. A IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: priming . IN NS
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: return error response REFUSED
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: iterator[module 0] operate: extstate:module_wait_subquery event:module_event_pass
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: iterator operate: query . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: sending query: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: sending to target: <.> 199.9.14.201#53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: error sending query to auth server 199.9.14.201 port 53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: sending query: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: sending to target: <.> 192.203.230.10#53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: error sending query to auth server 192.203.230.10 port 53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: sending query: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: sending to target: <.> 192.36.148.17#53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: error sending query to auth server 192.36.148.17 port 53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: sending query: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: sending to target: <.> 192.33.4.12#53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] debug: error sending query to auth server 192.33.4.12 port 53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: . NS IN
      [SNIP]
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] debug: iterator[module 0] operate: extstate:module_wait_subquery event:module_event_pass
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] info: iterator operate: query m.root-servers.net. A IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] info: processQueryTargets: m.root-servers.net. A IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] debug: request has exceeded the maximum  number of nxdomain nameserver lookups with 6
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] debug: return error response SERVFAIL
      [SNIP]
      Nov  1 10:49:00 guardian unbound[74982]: [74982:3] info: sending query: . NS IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: processQueryTargets: shadow.mxrouting.net. AAAA IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] debug: sending to target: <org.> 199.19.54.1#53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:0] debug: sending to target: <.> 202.12.27.33#53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:0] debug: error sending query to auth server 202.12.27.33 port 53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:2] info: sending query: shadow.mxrouting.net. AAAA IN
      Nov  1 10:49:00 guardian unbound[74982]: [74982:1] debug: error sending query to auth server 199.19.54.1 port 53
      Nov  1 10:49:00 guardian unbound[74982]: [74982:3] debug: sending to target: <.> 198.97.190.53#53
      

      If you find my post useful, please give it a thumbs up!
      pfSense 2.7.2-RELEASE

      1 Reply Last reply Reply Quote 0
      • G
        Gawzirabaws
        last edited by

        i was having similar symptoms for months after an isp change. elusive, intermittent failures. certain sites faring well, others almost unusable. then a day later, seemingly random redistribution of problematic sites and working sites.

        figured out my isp was handing out ip6 and ip4 addresses to dhcp clients, while only allowing port 53 traffic over ip4 for some ungodly reason.

        SOLVED: putting "do-ip6: no" in unbound.conf cleared everything right up. AAAA records still come back but everything port 53 up and downstream of my resolver now happens over ip4.

        sadly i guess i cannot access any of the zero domains known to me with their authoritative nameservers only available via ip6 unless i use my isps cache (lazy bums actually only offer up google's in the dhcp response anyway; no thank you)

        1 Reply Last reply Reply Quote 0
        • First post
          Last post
        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.