Unbound crashes periodically with signal 11
-
I first noticed yesterday that the windows 10 client had no internet access, even after disabling and reenabling the network adapter. The console was still alive, so I rebooted it. It came up okay, so I updated it and left it until today. I noticed again today that the client had no internet access. I rebooted and checked the log. The general log is completely empty from yesterday until I logged into the console to reboot.
Here are the log messages where the gap is located:
Jan 18 12:32:03 login 82321 login on ttyv0 as root Jan 17 17:55:55 kernel pid 42275 (unbound), jid 0, uid 59: exited on signal 11Is anyone else experiencing this?
-
@bimmerdriver said in pfSense 2.50 snapshots have been dying for the past couple of days:
I first noticed yesterday that the windows 10 client had no internet access, even after disabling and reenabling the network adapter. The console was still alive, so I rebooted it. It came up okay, so I updated it and left it until today. I noticed again today that the client had no internet access. I rebooted and checked the log. The general log is completely empty from yesterday until I logged into the console to reboot.
Here are the log messages where the gap is located:
Jan 18 12:32:03 login 82321 login on ttyv0 as root Jan 17 17:55:55 kernel pid 42275 (unbound), jid 0, uid 59: exited on signal 11Is anyone else experiencing this?
It might be useful to the developers to know if you have any packages running that interact with
unbound, such as pfBlockerNG-devel. If you have such a package, I would try disabling the package to see if the package interaction withunboundis the issue, or ifunboundalone is crashing. -
@bmeeks
I too have the same problem. Only unbound dies... -
@maverick_slo
Other packages ?
Latest snapshot ?
Developers can only actually do something if you add details. -
Snapshot:
2.5.0-DEVELOPMENT (amd64) built on Sun Jan 17 00:06:34 EST 2021Packages:
acme
Cron
haproxy-devel
openvpn-client-export
snortLog:
Jan 19 07:57:13 unbound 40894 [40894:0] info: service stopped (unbound 1.13.0).
Jan 19 08:00:52 kernel pid 40894 (unbound), jid 0, uid 59: exited on signal 11
Jan 19 08:05:33 dhcpleases 26712 Could not deliver signal HUP to process 40894: No such processWas working 3-4 days ago without any issues.
-
@maverick_slo said in pfSense 2.50 snapshots have been dying for the past couple of days:
Jan 19 07:57:13 unbound 40894 [40894:0] info: service stopped (unbound 1.13.0).
Jan 19 08:00:52 kernel pid 40894 (unbound), jid 0, uid 59: exited on signal 11
Jan 19 08:05:33 dhcpleases 26712 Could not deliver signal HUP to process 40894: No such processThe list of packages and relevant log entries helps the developers. From the log snippet I quoted above, it appears the
unboundservice was told to "stop", and then about 2 minutes later the kernel logs the Signal 11 from theunbounddaemon, which is a SEGFAULT or memory violation. -
Sorry for the delay to reply with further information. This system has no packages installed. It's running on a windows 2019 hyper-v server. The lan and single windows 10 client are also virtual. It's configured with basic mostly default settings with a native dual-stack connection to the ISP. The prefix is a /56. Since I posted, the system operated continuously with no problems, so perhaps something changed when I updated to 2.5.0.a.20210120.1500.
-
doing update to 2.5.0.a.20210120.1500 now.
lets see -
Unbound stops, requires manual restart several times a day.
Version: 2.5.0.a.20210118.2350
Unbound 1.13.0
Packages: apcupsd 0.3.91_92021-01-21 04:43:35.153316+00:00 unbound 66192 [66192:0] info: 0.002048 0.004096 1 2021-01-21 04:43:35.153307+00:00 unbound 66192 [66192:0] info: lower(secs) upper(secs) recursions 2021-01-21 04:43:35.153296+00:00 unbound 66192 [66192:0] info: [25%]=0 median[50%]=0 [75%]=0 2021-01-21 04:43:35.153287+00:00 unbound 66192 [66192:0] info: histogram of recursion processing times 2021-01-21 04:43:35.153277+00:00 unbound 66192 [66192:0] info: average recursion processing time 0.002168 sec 2021-01-21 04:43:35.153262+00:00 unbound 66192 [66192:0] info: server stats for thread 3: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:43:35.153246+00:00 unbound 66192 [66192:0] info: server stats for thread 3: 3 queries, 2 answers from cache, 1 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:43:35.152091+00:00 unbound 66192 [66192:0] info: 0.131072 0.262144 3 2021-01-21 04:43:35.152076+00:00 unbound 66192 [66192:0] info: 0.065536 0.131072 1 2021-01-21 04:43:35.152061+00:00 unbound 66192 [66192:0] info: lower(secs) upper(secs) recursions 2021-01-21 04:43:35.152052+00:00 unbound 66192 [66192:0] info: [25%]=0.131072 median[50%]=0.174763 [75%]=0.218453 2021-01-21 04:43:35.152039+00:00 unbound 66192 [66192:0] info: histogram of recursion processing times 2021-01-21 04:43:35.152028+00:00 unbound 66192 [66192:0] info: average recursion processing time 0.175769 sec 2021-01-21 04:43:35.152014+00:00 unbound 66192 [66192:0] info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:43:35.151997+00:00 unbound 66192 [66192:0] info: server stats for thread 2: 4 queries, 0 answers from cache, 4 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:43:35.150737+00:00 unbound 66192 [66192:0] info: 0.032768 0.065536 1 2021-01-21 04:43:35.150728+00:00 unbound 66192 [66192:0] info: lower(secs) upper(secs) recursions 2021-01-21 04:43:35.150717+00:00 unbound 66192 [66192:0] info: [25%]=0 median[50%]=0 [75%]=0 2021-01-21 04:43:35.150708+00:00 unbound 66192 [66192:0] info: histogram of recursion processing times 2021-01-21 04:43:35.150698+00:00 unbound 66192 [66192:0] info: average recursion processing time 0.049706 sec 2021-01-21 04:43:35.150689+00:00 unbound 66192 [66192:0] info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:43:35.150673+00:00 unbound 66192 [66192:0] info: server stats for thread 1: 2 queries, 1 answers from cache, 1 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:43:35.149631+00:00 unbound 66192 [66192:0] info: 0.131072 0.262144 2 2021-01-21 04:43:35.149620+00:00 unbound 66192 [66192:0] info: 0.002048 0.004096 1 2021-01-21 04:43:35.149611+00:00 unbound 66192 [66192:0] info: lower(secs) upper(secs) recursions 2021-01-21 04:43:35.149600+00:00 unbound 66192 [66192:0] info: [25%]=0 median[50%]=0 [75%]=0 2021-01-21 04:43:35.149591+00:00 unbound 66192 [66192:0] info: histogram of recursion processing times 2021-01-21 04:43:35.149580+00:00 unbound 66192 [66192:0] info: average recursion processing time 0.145391 sec 2021-01-21 04:43:35.149567+00:00 unbound 66192 [66192:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:43:35.149546+00:00 unbound 66192 [66192:0] info: server stats for thread 0: 3 queries, 0 answers from cache, 3 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:43:35.149002+00:00 unbound 66192 [66192:0] info: service stopped (unbound 1.13.0). 2021-01-21 04:42:07.648917+00:00 unbound 66192 [66192:0] info: start of service (unbound 1.13.0). 2021-01-21 04:42:07.642700+00:00 unbound 66192 [66192:0] notice: init module 0: iterator 2021-01-21 04:42:07.641500+00:00 unbound 66192 [66192:0] notice: Restart of unbound 1.13.0. 2021-01-21 04:42:07.640307+00:00 unbound 66192 [66192:0] info: server stats for thread 3: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:42:07.640290+00:00 unbound 66192 [66192:0] info: server stats for thread 3: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:42:07.639234+00:00 unbound 66192 [66192:0] info: server stats for thread 2: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:42:07.639218+00:00 unbound 66192 [66192:0] info: server stats for thread 2: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:42:07.638136+00:00 unbound 66192 [66192:0] info: 0.131072 0.262144 1 2021-01-21 04:42:07.638121+00:00 unbound 66192 [66192:0] info: lower(secs) upper(secs) recursions 2021-01-21 04:42:07.638106+00:00 unbound 66192 [66192:0] info: [25%]=0 median[50%]=0 [75%]=0 2021-01-21 04:42:07.638097+00:00 unbound 66192 [66192:0] info: histogram of recursion processing times 2021-01-21 04:42:07.638086+00:00 unbound 66192 [66192:0] info: average recursion processing time 0.218760 sec 2021-01-21 04:42:07.638077+00:00 unbound 66192 [66192:0] info: server stats for thread 1: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:42:07.638060+00:00 unbound 66192 [66192:0] info: server stats for thread 1: 1 queries, 0 answers from cache, 1 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:42:07.636993+00:00 unbound 66192 [66192:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0 2021-01-21 04:42:07.636974+00:00 unbound 66192 [66192:0] info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting 2021-01-21 04:42:07.636447+00:00 unbound 66192 [66192:0] info: service stopped (unbound 1.13.0). 2021-01-21 04:41:56.104192+00:00 unbound 66192 [66192:0] info: start of service (unbound 1.13.0).David
-
@maverick_slo
did not help: Jan 21 06:45:09 kernel pid 5252 (unbound), jid 0, uid 59: exited on signal 11 -
@n8it said in pfSense 2.50 snapshots have been dying for the past couple of days:
2021-01-21 04:43:35.149002+00:00 unbound 66192 [66192:0] info: service stopped (unbound 1.13.0).
2021-01-21 04:42:07.648917+00:00 unbound 66192 [66192:0] info: start of service (unbound 1.13.0).What/who is restarting unbound ? Check other logs, like DHCP.
How often does it restart.Btw : the unbound log sequence you showed is the normal one for stop and start : on stop it dumps some stats for every internal thread.
-
@gertjan
For me it is just this log line: Jan 21 06:45:09 kernel pid 5252 (unbound), jid 0, uid 59: exited on signal 11
Nothing in unbound logs or anywhere else.
Started few days ago. -
-
DHCP logs do show DHCP is sending a HUP to DNS daemon. It happens more than 12 times a day. Thanks for telling me to check.
code_text 2021-01-21 06:10:05.551715+00:00 dhcpd 49502 DHCPACK on 192.168.1.89 to 04:79:b7:ad:9c:7a via igb1 2021-01-21 06:10:05.551693+00:00 dhcpd 49502 DHCPREQUEST for 192.168.1.89 from 04:79:b7:ad:9c:7a via igb1 2021-01-21 06:07:14.269409+00:00 dhcpleases 35662 Sending HUP signal to dns daemon(43713) 2021-01-21 06:07:14.253979+00:00 dhcpd 49502 DHCPACK on 192.168.1.101 to ac:5f:3e:fe:91:3e (DesGS7) via igb1 2021-01-21 06:07:14.252255+00:00 dhcpd 49502 DHCPREQUEST for 192.168.1.101 from ac:5f:3e:fe:91:3e (DesGS7) via igb1 2021-01-21 06:06:52.752895+00:00 dhcpleases 35662 Sending HUP signal to dns daemon(43713) 2021-01-21 06:06:52.738449+00:00 dhcpleases 35662 Sending HUP signal to dns daemon(43713) 2021-01-21 06:06:52.721929+00:00 dhcpd 49502 Server starting service. 2021-01-21 06:06:52.721763+00:00 dhcpd 49502 Sending on Socket/fallback/fallback-net 2021-01-21 06:06:52.721705+00:00 dhcpd 49502 Sending on BPF/igb1/40:62:31:0a:d9:7b/192.168.1.0/24 2021-01-21 06:06:52.721631+00:00 dhcpd 49502 Listening on BPF/igb1/40:62:31:0a:d9:7b/192.168.1.0/24 2021-01-21 06:06:52.721562+00:00 dhcpd 49502 Sending on BPF/igb2/40:62:31:0a:d9:7c/192.168.100.0/24 2021-01-21 06:06:52.721536+00:00 dhcpd 49502 Listening on BPF/igb2/40:62:31:0a:d9:7c/192.168.100.0/24 2021-01-21 06:06:52.719470+00:00 dhcpd 49502 Wrote 24 leases to leases file. 2021-01-21 06:06:52.719236+00:00 dhcpd 49502 Wrote 0 new dynamic host decls to leases file. 2021-01-21 06:06:52.719187+00:00 dhcpd 49502 Wrote 0 deleted host decls to leases file. 2021-01-21 06:06:52.719107+00:00 dhcpd 49502 Wrote 0 class decls to leases file. 2021-01-21 06:06:52.718612+00:00 dhcpd 49502 For info, please visit https://www.isc.org/software/dhcp/ 2021-01-21 06:06:52.718602+00:00 dhcpd 49502 All rights reserved. 2021-01-21 06:06:52.718593+00:00 dhcpd 49502 Copyright 2004-2020 Internet Systems Consortium. 2021-01-21 06:06:52.718582+00:00 dhcpd 49502 Internet Systems Consortium DHCP Server 4.4.2 2021-01-21 06:06:52.718525+00:00 dhcpd 49502 PID file: /var/run/dhcpd.pid 2021-01-21 06:06:52.718493+00:00 dhcpd 49502 Database file: /var/db/dhcpd.leases 2021-01-21 06:06:52.718428+00:00 dhcpd 49502 Config file: /etc/dhcpd.conf 2021-01-21 06:06:52.716270+00:00 dhcpd 49502 For info, please visit https://www.isc.org/software/dhcp/ 2021-01-21 06:06:52.716232+00:00 dhcpd 49502 All rights reserved. 2021-01-21 06:06:52.716207+00:00 dhcpd 49502 Copyright 2004-2020 Internet Systems Consortium. 2021-01-21 06:06:52.716056+00:00 dhcpd 49502 Internet Systems Consortium DHCP Server 4.4.2 2021-01-21 06:06:51.481598+00:00 dhcpleases 35662 Could not deliver signal HUP to process 30993: No such process. 2021-01-21 06:06:51.481494+00:00 dhcpleases 35662 Sending HUP signal to dns daemon(30993) 2021-01-21 06:00:48.433505+00:00 dhcpleases 23469 Sending HUP signal to dns daemon(30993) 2021-01-21 06:00:48.419710+00:00 dhcpd 36710 Server starting service. 2021-01-21 06:00:48.419547+00:00 dhcpd 36710 Sending on Socket/fallback/fallback-net 2021-01-21 06:00:48.419432+00:00 dhcpd 36710 Sending on BPF/igb1/40:62:31:0a:d9:7b/192.168.1.0/24 2021-01-21 06:00:48.419387+00:00 dhcpd 36710 Listening on BPF/igb1/40:62:31:0a:d9:7b/192.168.1.0/24 2021-01-21 06:00:48.419220+00:00 dhcpd 36710 Sending on BPF/igb2/40:62:31:0a:d9:7c/192.168.100.0/24 2021-01-21 06:00:48.419168+00:00 dhcpd 36710 Listening on BPF/igb2/40:62:31:0a:d9:7c/192.168.100.0/24 2021-01-21 06:00:48.416935+00:00 dhcpd 36710 Wrote 24 leases to leases file. 2021-01-21 06:00:48.416727+00:00 dhcpd 36710 Wrote 0 new dynamic host decls to leases file. 2021-01-21 06:00:48.416680+00:00 dhcpd 36710 Wrote 0 deleted host decls to leases file. 2021-01-21 06:00:48.416615+00:00 dhcpd 36710 Wrote 0 class decls to leases file. 2021-01-21 06:00:48.416054+00:00 dhcpd 36710 For info, please visit https://www.isc.org/software/dhcp/ 2021-01-21 06:00:48.416044+00:00 dhcpd 36710 All rights reserved. 2021-01-21 06:00:48.416035+00:00 dhcpd 36710 Copyright 2004-2020 Internet Systems Consortium. 2021-01-21 06:00:48.416024+00:00 dhcpd 36710 Internet Systems Consortium DHCP Server 4.4.2 2021-01-21 06:00:48.415966+00:00 dhcpd 36710 PID file: /var/run/dhcpd.pid 2021-01-21 06:00:48.415930+00:00 dhcpd 36710 Database file: /var/db/dhcpd.leases 2021-01-21 06:00:48.415865+00:00 dhcpd 36710 Config file: /etc/dhcpd.conf 2021-01-21 06:00:48.413782+00:00 dhcpd 36710 For info, please visit https://www.isc.org/software/dhcp/ 2021-01-21 06:00:48.413752+00:00 dhcpd 36710 All rights reserved. 2021-01-21 06:00:48.413709+00:00 dhcpd 36710 Copyright 2004-2020 Internet Systems Consortium. 2021-01-21 06:00:48.413590+00:00 dhcpd 36710 Internet Systems Consortium DHCP Server 4.4.2 -
You have a single LAN with single W10 client and you see this :
@n8it said in pfSense 2.50 snapshots have been dying for the past couple of days:
2021-01-21 06:06:52.752895+00:00 dhcpleases 35662 Sending HUP signal to dns daemon(43713)
2021-01-21 06:06:52.738449+00:00 dhcpleases 35662 Sending HUP signal to dns daemon(43Who are these 192.168.1.101 - 192.168.1.89
Why is the dhcpd server restarting so often ?
If the LAN goes down, it's normal that clients trigger a DHCP sequence. But your LAN is virtual ... -
@gertjan well that resolved it.
Thanks for the tip now to find root cause... -
it looks like dhcp tries to restart the unbound service because of a new dhcp lease but unbound does not start again.
if I enter this
kill -1 `cat /var/run/unbound.pid`in the comandline on my production system unbound restarts on the 2.5 snapshot unbound dies and doesn't start again.
can anyone confirm this?
-
@supermaster
Yup. -
@supermaster said in pfSense 2.50 snapshots have been dying for the past couple of days:
it looks like dhcp tries to restart the unbound
dhcp == the DHCP client, running on a WAN type interface to get a WAN IP from an upstream DHCP server, like your ISP router, or your ISP.
dhcpd is a server type app that receive DHCP requests from dhcp clients on your LAN.
I presume that your dhcp is the dhcpd process.Ins't
kill -1 `cat /var/run/unbound.pid`just killing it ? It's not a 'restart' command.
Also, check if the PID, the integer in the file is actually the PID of the running instance of unbound.
ps ax | grep 'unbound'to see how many unbound instances are running, and if the PID corresponds.
If the dhcpd is restarting unbound to often, set up unbound to ignore new DHCP leases - this works well with 2.4.5-p1.
-
yes i mean the dhcpd process. the problem is not that dhpcd restarts unbound too often.
if I understand the code correctly https://github.com/pfsense/FreeBSD-ports/blob/devel/sysutils/dhcpleases/files/dhcpleases.c
syslog(LOG_INFO, "Sending HUP signal to dns daemon(%u)", pidno); if (kill((pid_t)pidno, SIGHUP) < 0) syslog(LOG_ERR, "Could not deliver signal HUP to process %d: %m.", pidno)it will send kil -HUP (kill -1 is the same as HUP) to unobund but i dont find where unbound is startet after that again.
this works in version 2.4.5 without problems because unbound just starts again when you do a kill -HUP but not in 2.5 since the update to unbound 1.13
