Unbound stopped working



  • Tonight unbound service stopped but I don't know why. The log shows:

    Apr 21 01:42:56	filterdns		merge_config: configuration reload
    Apr 21 01:42:52	unbound	13074:0	info: 2.000000 4.000000 1
    Apr 21 01:42:52	unbound	13074:0	info: 1.000000 2.000000 7
    Apr 21 01:42:52	unbound	13074:0	info: 0.524288 1.000000 29
    Apr 21 01:42:52	unbound	13074:0	info: 0.262144 0.524288 85
    Apr 21 01:42:52	unbound	13074:0	info: 0.131072 0.262144 208
    Apr 21 01:42:52	unbound	13074:0	info: 0.065536 0.131072 384
    Apr 21 01:42:52	unbound	13074:0	info: 0.032768 0.065536 481
    Apr 21 01:42:52	unbound	13074:0	info: 0.016384 0.032768 455
    Apr 21 01:42:52	unbound	13074:0	info: 0.008192 0.016384 216
    Apr 21 01:42:52	unbound	13074:0	info: 0.004096 0.008192 16
    Apr 21 01:42:52	unbound	13074:0	info: 0.002048 0.004096 5
    Apr 21 01:42:52	unbound	13074:0	info: 0.001024 0.002048 1
    Apr 21 01:42:52	unbound	13074:0	info: 0.000512 0.001024 5
    Apr 21 01:42:52	unbound	13074:0	info: 0.000256 0.000512 1
    Apr 21 01:42:52	unbound	13074:0	info: 0.000000 0.000001 430
    Apr 21 01:42:52	unbound	13074:0	info: lower(secs) upper(secs) recursions
    Apr 21 01:42:52	unbound	13074:0	info: [25%]=0.0128569 median[50%]=0.0350161 [75%]=0.0882347
    Apr 21 01:42:52	unbound	13074:0	info: histogram of recursion processing times
    Apr 21 01:42:52	unbound	13074:0	info: average recursion processing time 0.073219 sec
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 3: requestlist max 37 avg 2.67326 exceeded 0 jostled 0
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 3: 2689 queries, 365 answers from cache, 2324 recursions, 51 prefetch, 0 rejected by ip ratelimiting
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 2: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
    Apr 21 01:42:52	unbound	13074:0	info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Apr 21 01:42:52	unbound	13074:0	info: service stopped (unbound 1.8.1).
    
    

    It also doesn't start automatically again, so I have to fix it myself. It neverr happened before. Any idea to find out what was wrong?

    I use pfsense 2.4.4-p2.

    EDIT: after checking system logs: I saw this:

    Apr 21 01:42:55	php-fpm	31234	/rc.newwanipv6: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1555803775] unbound[8292:0] error: can't bind socket: Can't assign requested address for 2a02:8071:800:0:xxx port 53 [1555803775] unbound[8292:0] fatal error: could not open ports'
    Apr 21 01:42:55	php-fpm	11145	/rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1555803775] unbound[97811:0] error: can't bind socket: Can't assign requested address for 2a02:8071:800:0:xxx port 53 [1555803775] unbound[97811:0] fatal error: could not open ports'
    

    What does that mean?



  • It seems that this happens when WAN IPv6 shortly goes down.





  • I saw this. But why is this happening? I was usimg this installation for a long time without problems.



  • Why ?
    I should have to debug your system to answer that ;)

    What I told in the other thread might be the problem : Unbound is started, and the "code" goes on. To get back right away to re-kill unbound - but no PID file created yet, we're just several micro seconds later .... so no killing - but a 1 sec wait.
    Now, after that one sec, unbound really started .... and get started again in the code => error : it's already running.

    My solution, as posted in the other thread : wait 1 sec also after starting unbound.



  • Strange that this appeared now. I will test this later.



  • Another question: why is Unbound restarting if there is a IP change or WAN goes down and up? It also cleans the cache and thats bad. Can't it be running all the time?


  • LAYER 8 Global Moderator

    Do you have it bound to all interfaces for listening?

    I just pulled my wan interface plug on pfsense, and sure it started a lot of things that listion on that and use that - my vpn connection, my gif connection for HE, etc. etc..

    But unbound did not restart.. But I don't have it listen on all, I just have it listen on my lan side ports.



  • I also have it only to my LAN and VLAN interfaces and localhost. Right now it happens again but also started again:

    Apr 22 13:08:05	unbound	54631:0	info: start of service (unbound 1.8.1).
    Apr 22 13:08:05	unbound	54631:0	notice: init module 1: iterator
    Apr 22 13:08:05	unbound	54631:0	notice: init module 0: validator
    Apr 22 13:08:02	unbound	8296:0	info: 2.000000 4.000000 6
    Apr 22 13:08:02	unbound	8296:0	info: 1.000000 2.000000 17
    Apr 22 13:08:02	unbound	8296:0	info: 0.524288 1.000000 80
    Apr 22 13:08:02	unbound	8296:0	info: 0.262144 0.524288 317
    Apr 22 13:08:02	unbound	8296:0	info: 0.131072 0.262144 692
    Apr 22 13:08:02	unbound	8296:0	info: 0.065536 0.131072 1009
    Apr 22 13:08:02	unbound	8296:0	info: 0.032768 0.065536 1938
    Apr 22 13:08:02	unbound	8296:0	info: 0.016384 0.032768 2427
    Apr 22 13:08:02	unbound	8296:0	info: 0.008192 0.016384 1193
    Apr 22 13:08:02	unbound	8296:0	info: 0.004096 0.008192 140
    Apr 22 13:08:02	unbound	8296:0	info: 0.002048 0.004096 38
    Apr 22 13:08:02	unbound	8296:0	info: 0.001024 0.002048 14
    Apr 22 13:08:02	unbound	8296:0	info: 0.000512 0.001024 7
    Apr 22 13:08:02	unbound	8296:0	info: 0.000256 0.000512 6
    Apr 22 13:08:02	unbound	8296:0	info: 0.000128 0.000256 2
    Apr 22 13:08:02	unbound	8296:0	info: 0.000064 0.000128 2
    Apr 22 13:08:02	unbound	8296:0	info: 0.000032 0.000064 1
    Apr 22 13:08:02	unbound	8296:0	info: 0.000000 0.000001 2114
    Apr 22 13:08:02	unbound	8296:0	info: lower(secs) upper(secs) recursions
    Apr 22 13:08:02	unbound	8296:0	info: [25%]=0.00940569 median[50%]=0.0264054 [75%]=0.0591151
    Apr 22 13:08:02	unbound	8296:0	info: histogram of recursion processing times
    Apr 22 13:08:02	unbound	8296:0	info: average recursion processing time 0.057755 sec
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 3: requestlist max 54 avg 2.25742 exceeded 0 jostled 0
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 3: 13459 queries, 3456 answers from cache, 10003 recursions, 447 prefetch, 0 rejected by ip ratelimiting
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 2: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 1: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
    Apr 22 13:08:02	unbound	8296:0	info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
    Apr 22 13:08:02	unbound	8296:0	info: service stopped (unbound 1.8.1).
    

    System log at that time:

    Apr 22 13:10:30	check_reload_status		Reloading filter
    Apr 22 13:10:26	check_reload_status		Syncing firewall
    Apr 22 13:08:09	php-fpm	353	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:07	php-fpm	353	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:05	php-fpm	353	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:05	php-fpm	353	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:05	php-fpm	353	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:05	php-fpm	353	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:02	check_reload_status		Syncing firewall
    Apr 22 13:08:01	php-fpm	12113	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:01	php-fpm	12113	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:01	php-fpm	12113	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    Apr 22 13:08:01	php-fpm	12113	/services_unbound.php: Gateway, switch to: WAN_DHCP6
    

  • LAYER 8 Global Moderator

    My BAD!!! on that - not sure how I didn't see the log.. But I have unbound logging all queries so log fills up quick on the gui..

    I just did another test.. Where I looked at uptime and pid of pfsense, then pulled the wan connection and put it back... And yes clearly it restarts.

    [2.4.4-RELEASE][admin@sg4860.local.lan]/root: /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
    version: 1.8.1
    verbosity: 3
    threads: 4
    modules: 2 [ validator iterator ]
    uptime: 729 seconds
    options: control(ssl)
    unbound (pid 6774) is running...
    [2.4.4-RELEASE][admin@sg4860.local.lan]/root: /usr/local/sbin/unbound-control -c /var/unbound/unbound.conf status
    version: 1.8.1
    verbosity: 3
    threads: 4
    modules: 2 [ validator iterator ]
    uptime: 7 seconds
    options: control(ssl)
    unbound (pid 97634) is running...
    [2.4.4-RELEASE][admin@sg4860.local.lan]/root: 
    

    You can see the uptime reset, and the pid changed.


Log in to reply