Unbound ipv6 problems anyone?
-
Something else is afoot here. I have rebooted multiple times, using dhcp6 before RA, and I cannot make it fall over.
Bimmer, is it possible that sometimes there is a slow response from your ISP for the v6 lease, hence the 'sometimes' its OK?
It maybe that if that is the case that there is a bit of a catch 22 going on, in that rc.newwanipv6 cannot restart unbound as it's still in the booting up process. Also, and I've not delved into that bit of the code before, unbound is being killed and restarted, if already running, rather than a SIGHUP; I need to study that a bit more to see if there is a reason for that.
- note *
Having studied and read Jimp's note that unbound should be killed then that answers that question. Whilst there is already work going on in this area I will not mess with it, too many cooks….
-
Sorry for the slow response. I've been away all week… I'll try running wireshark to see what's happening. I suspect it will the same as it always was. I have a suspicion this problem has been present all along, but I haven't noticed it because unbound was getting restarted so often. Possibly now that it's not getting restarted, the problem is visible. That would explain why I didn't notice it before recently, along with the other changes.
-
I captured icmpv6 and dhcpv6 packets with wireshark during a reboot. Everything appears the way it always has. There are no significant delays in any of the replies.
-
I created bug #7750 "unbound refuses ipv6 queries after reboot" for this issue.
-
I just did another reboot to capture log info. It's consistent that unbound starts before the PD process has completed.
Aug 2 13:12:15 dhcp6c 11947 add an address 2001:*:*:*:215:5dff:fe5c:e21e/64 on hn0 Aug 2 13:12:15 dhcp6c 11947 dhcp6c Received REQUEST Aug 2 13:12:15 dhcp6c 11947 Sending Request Aug 2 13:12:14 dhcp6c 11947 Sending Solicit Aug 2 13:12:13 dhcp6c 11867 skip opening control port Aug 2 13:12:13 dhcp6c 11867 failed initialize control message authentication Aug 2 13:12:13 dhcp6c 11867 failed to open /usr/local/etc/dhcp6cctlkey: No such file or directory
Aug 2 13:12:27 unbound 24093:1 info: response for . NS IN Aug 2 13:12:27 unbound 24093:1 info: priming . IN NS Aug 2 13:12:27 unbound 24093:1 info: resolving ipv6.msftconnecttest.com. A IN Aug 2 13:12:27 unbound 24093:1 info: priming . IN NS Aug 2 13:12:27 unbound 24093:1 info: resolving ipv6.msftconnecttest.com. AAAA IN Aug 2 13:12:27 unbound 24093:1 info: priming . IN NS Aug 2 13:12:27 unbound 24093:1 info: resolving 0.pfsense.pool.ntp.org. A IN Aug 2 13:12:14 unbound 24093:0 info: start of service (unbound 1.6.3). Aug 2 13:12:14 unbound 24093:0 notice: init module 1: iterator Aug 2 13:12:14 unbound 24093:0 notice: init module 0: validator
-
I've been trying to recreate this, as I'm also on Telus using dhcp6c before RA, but so far no luck, if I reboot from an upgrade or otherwise, everything still works fine. For curiosity sake, I even looked to see when my unbound is starting and it happens after I get my prefix. I'm running on the latest 2.4 snapshot. I'm in AB on FTTH with the T3200M bridged to port1 with a bare metal install on an SG-2440.
Aug 2 17:10:24 dhcp6c 23456 dhcp6c Received REQUEST Aug 2 17:10:24 dhcp6c 23456 nameserver[0] 2001:568:ff09:10c::53 Aug 2 17:10:24 dhcp6c 23456 nameserver[1] 2001:568:ff09:10b::122 Aug 2 17:10:24 dhcp6c 23456 make an IA: PD-0 Aug 2 17:10:24 dhcp6c 23456 create a prefix 2001:56a:73d0:3c00::/56 pltime=140733193402432, vltime=14700 Aug 2 17:10:24 dhcp6c 23456 add an address 2001:56a:73d0:3c01:208:a3ff:fe0a:5ad6/64 on igb1_vlan101 Aug 2 17:10:24 dhcp6c 23456 executes /var/etc/dhcp6c_wan_dhcp6withoutra_script.sh Aug 2 17:10:29 unbound 48227:0 notice: init module 0: validator Aug 2 17:10:29 unbound 48227:0 notice: init module 1: iterator Aug 2 17:10:29 unbound 48227:0 info: start of service (unbound 1.6.3). Aug 2 17:10:30 unbound 48227:0 info: resolving s.youtube.com. A IN
-
That's really strange. Your dhcp6c is running well ahead of mine, relative to unbound.
I was on a T2200H but just switched to a T3200M. I didn't expect it would make any difference and it didn't.
All unbound settings are default.
I've tried a clean install in a fresh VM (running on hyper-v), but that didn't make any difference either.
![Capture 1.PNG_thumb](/public/imported_attachments/1/Capture 1.PNG_thumb)
![Capture 1.PNG](/public/imported_attachments/1/Capture 1.PNG)
![Capture 2.PNG](/public/imported_attachments/1/Capture 2.PNG)
![Capture 2.PNG_thumb](/public/imported_attachments/1/Capture 2.PNG_thumb) -
I enabled dhcp debug mode to see if there would be any earlier messages.
Here are the first messages (newest at the top):
Aug 2 18:09:14 dhcp6c 11792 failed initialize control message authentication Aug 2 18:09:14 dhcp6c 11792 failed to open /usr/local/etc/dhcp6cctlkey: No such file or directory Aug 2 18:09:14 dhcp6c 11792 extracted an existing DUID from /var/db/dhcp6c_duid: 00:01:00:01:20:d5:a4:23:00:15:5d:5c:e2:1e
Here are the last messages:
Aug 2 18:09:19 dhcp6c 11840 got an expected reply, sleeping. Aug 2 18:09:19 dhcp6c 11840 removing server (ID: 00:03:00:01:0c:a4:02:29:50:01) Aug 2 18:09:19 dhcp6c 11840 removing an event on hn1, state=REQUEST Aug 2 18:09:19 dhcp6c 11840 script "/var/etc/dhcp6c_wan_dhcp6withoutra_script.sh" terminated Aug 2 18:09:19 dhcp6c dhcp6c REQUEST on hn1 - running rc.newwanipv6 Aug 2 18:09:17 dhcp6c 11840 executes /var/etc/dhcp6c_wan_dhcp6withoutra_script.sh Aug 2 18:09:17 dhcp6c 11840 add an address 2001:*:*:*:215:5dff:fe5c:e21e/64 on hn0 Aug 2 18:09:17 dhcp6c 11840 create a prefix 2001:*:*:*::/56 pltime=140733193402432, vltime=14700 Aug 2 18:09:17 dhcp6c 11840 make an IA: PD-0 Aug 2 18:09:17 dhcp6c 11840 nameserver[1] 2001:568:ff09:10b::122 Aug 2 18:09:17 dhcp6c 11840 nameserver[0] 2001:568:ff09:10a::55 Aug 2 18:09:17 dhcp6c 11840 dhcp6c Received REQUEST
Here are the first unbound messages:
Aug 2 18:09:16 unbound 24030:0 info: start of service (unbound 1.6.3). Aug 2 18:09:16 unbound 24030:0 notice: init module 1: iterator Aug 2 18:09:16 unbound 24030:0 notice: init module 0: validator
-
It would appear that something is odd with your system Bimmer, otherwise Chris and Ned over here would be suffering also, and they are not having any issues.
Time for you and Scott to do some comparisons :)
So, let's start with the basics, Scott, what are you running pfSense on?
-
@marjohn56:
It would appear that something is odd with your system Bimmer, otherwise Chris and Ned over here would be suffering also, and they are not having any issues.
Time for you and Scott to do some comparisons :)
So, let's start with the basics, Scott, what are you running pfSense on?
Something is strange, that's for sure.
Do you notice these messages in the dhcp log? They are the very first dhcp6c messages
Aug 3 13:38:05 dhcp6c 10883 failed initialize control message authentication Aug 3 13:38:05 dhcp6c 10883 failed to open /usr/local/etc/dhcp6cctlkey: No such file or directory Aug 3 13:38:05 dhcp6c 10883 extracted an existing DUID from /var/db/dhcp6c_duid: 00:01:00:01:20:d5:a4:23:00:15:5d:5c:e2:1e Aug 3 13:38:02 dhclient Creating resolv.conf
I'm wondering if perhaps they are slowing down the execution of dhcp6c.
-
@marjohn56:
So, let's start with the basics, Scott, what are you running pfSense on?
I'm running it on an official pfSense (Negate) SG-2440 appliance.
Do you notice these messages in the dhcp log? They are the very first dhcp6c messages
I believe I have these messages in my log too, I will confirm and post when I get home from work.
-
Ignore those messages, they are in everyone's log and have been since day one. dhcp6c is looking for the a key that does not exist and will not exist as pfSense does not use that method of control, it's just noise in the log. I could stop it but it really is nothing.
-
@marjohn56:
Ignore those messages, they are in everyone's log and have been since day one. dhcp6c is looking for the a key that does not exist and will not exist as pfSense does not use that method of control, it's just noise in the log. I could stop it but it really is nothing.
If they are normal, I don't care that they are there. I was just wondering if dhcp6c was blocking momentarily because of some issue.
-
@marjohn56:
So, let's start with the basics, Scott, what are you running pfSense on?
I'm running it on an official pfSense (Negate) SG-2440 appliance.
I'm running on a hyper-v server. The guest has 2000 MB of RAM and 2 processors. It's basically idling. The server is lightly loaded.
Do you notice these messages in the dhcp log? They are the very first dhcp6c messages
I believe I have these messages in my log too, I will confirm and post when I get home from work.
It sounds like they are a non-issue.
-
@marjohn56:
It would appear that something is odd with your system Bimmer, otherwise Chris and Ned over here would be suffering also, and they are not having any issues.
Here is some further info that makes me wonder why dhcp6c and unbound messages start to appear in the log around the same time during the reboot. Not clear what is happening during the "Configuring WAN interface…done." phase, but it's a few steps before unbound is started, yet dhcp6c is only barely into the startup phase by the time unbound is starting.
![pfsense console.PNG](/public/imported_attachments/1/pfsense console.PNG)
![pfsense console.PNG_thumb](/public/imported_attachments/1/pfsense console.PNG_thumb) -
And here are a few more log messages from general:
Aug 4 09:19:22 php-fpm 291 /rc.newwanipv6: Removing static route for monitor fe80::ea4:2ff:fe29:5001 and adding a new route through fe80::ea4:2ff:fe29:5001%hn1 Aug 4 09:19:22 php-fpm 291 /rc.newwanipv6: ROUTING: setting IPv6 default route to fe80::ea4:2ff:fe29:5001%hn1 Aug 4 09:19:22 php-fpm 291 /rc.newwanipv6: ROUTING: setting default route to 154.20.116.1 Aug 4 09:19:22 php-fpm 291 /rc.newwanipv6: rc.newwanipv6: on (IP address: fe80::215:5dff:fe5c:e21d%hn1) (interface: wan) (real interface: hn1). Aug 4 09:19:22 php-fpm 291 /rc.newwanipv6: rc.newwanipv6: Info: starting on hn1. Aug 4 09:19:21 rtsold Received RA specifying route fe80::ea4:2ff:fe29:5001 for interface wan(hn1) Aug 4 09:19:17 kernel done. Aug 4 09:19:16 php-cgi rc.bootup: ROUTING: setting default route to 154.20.116.1 Aug 4 09:19:16 kernel .done. Aug 4 09:19:15 kernel pflog0: promiscuous mode enabled Aug 4 09:19:15 php-cgi rc.bootup: Resyncing OpenVPN instances. Aug 4 09:19:15 kernel hn0: link state changed to UP Aug 4 09:19:15 check_reload_status Linkup starting hn0 Aug 4 09:19:15 php-cgi rc.bootup: Starting dhcp6 client for interface wan hn1 in DHCP6 without RA mode
The first unbound log message appeared at 09:19:17.
-
scrapped my post as you already said its defaults, I would follow the advice given and compare ipv6 settings.
-
I'm using the same ipv6 settings that I've been using for months which are correct for my ISP. The only variable is whether or not to send a prefix hint and that makes no difference. Without the other settings are they are, the PD process will not complete.
I had thought this was a new problem since the other recent changes to unbound, but I restarted my 2.3.4 system and noticed it does the same thing. As I said before, I think this slipped through the cracks before because unbound used to restart so often on its own. However, when I rebooted my 2.3.4 system, I checked nslookup google.com immediately after booting and it failed. After restarting unbound, it worked.
Unless someone has another explanation, this problem appears to be caused by unbound starting before the PD process is complete.
-
I did a couple more packet captures with wireshark, two reboots from the 2.3.4 system and two from the 2.4 beta system.
The 2.3.4 system took 2.34 seconds and 1.36 seconds. The 2.4 beta system took 4.84 seconds both times.
Both systems have identical WAN ipv6 settings. It's interesting to notice that pfsense 2.3.4 appears to send the RS after sending the second dhcp request before the reply is received, rather than apparently waiting for the reply as pfsense 2.4 beta does. I have never noticed this behaviour before. It could be a coincidence.
You can see that the link-local address of the edge router is the same in both cases, so any difference is from pfsense, not the edge router.
![2.4 B.PNG](/public/imported_attachments/1/2.4 B.PNG)
![2.4 B.PNG_thumb](/public/imported_attachments/1/2.4 B.PNG_thumb) -
When I get some time to play I'll put something together for you. Don't know when that will be as real work is exceptionally busy at the moment.