IPv6 issues: multiple dhcp6c processes on an interface, random interface ID



  • I'm working on two IPv6 issues that equally affect 2.2.5/2.2.6 and the 2.3 beta series:

    Issue 1: Multiple dhcp6c processes on one interface, leading to broken IPv6 connectivity
    Issue 2: Random interface identifiers when interfaces connect at boot time

    Full information and a candidate patch are in the thread in the IPv6 forum. The reference to 2.2.5 in the thread title relates to the history - the thread equally relates to 2.3.

    The correct logic to fix Issue 1 is not entirely clear. I believe the patch improves pfSense's behaviour, but the complexity of the issue means the patch might fail to fix some cases that are currently broken. The patch might also cause a regression: it could stop dhcp6c from starting at all on connections that currently work.

    I'd be grateful if as many people as possible using SLAAC, DHCPv6 and/or DHCP-PD (prefix delegation) could try the patch and provide feedback. It would be helpful if those giving feedback gave outline details of their connection - especially the settings for "IPv4 Configuration Type", "IPv6 Configuration Type" and "Use IPv4 connectivity as parent interface" (in the "DHCP6 Client Configuration" section).

    If you hit problems, revert the patch and reboot - you will be back to where you were before. Don't forget to re-apply the patch if you update to a later beta.

    Issue 2 affects PPP type connections, leading to a random interface identifier (the least significant 64 bits of the IPv6 address) for the link-local address and any address allocated by SLAAC on the first connection following boot. The root cause is documented in the linked post and the workaround seems to be solid.
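
    To check whether the interface identifier changed between boots, the low 64 bits of two observed addresses can be compared. A minimal Python sketch, run on any machine with Python 3; the helper name `interface_id` and both addresses are made up for illustration:

```python
import ipaddress

def interface_id(addr: str) -> str:
    """Return the low 64 bits of an IPv6 address as four hex groups."""
    # Drop a zone suffix such as "%pppoe0" before parsing.
    ip = ipaddress.IPv6Address(addr.split("%", 1)[0])
    iid = int(ip) & ((1 << 64) - 1)
    groups = [(iid >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]
    return ":".join(f"{g:04x}" for g in groups)

# Made-up link-local addresses, as if noted after two successive boots.
first_boot = "fe80::21b:21ff:fe12:3456%pppoe0"
second_boot = "fe80::9c4e:1aff:fe77:0b02%pppoe0"

print(interface_id(first_boot))
print(interface_id(second_boot))
print("identifier changed:", interface_id(first_boot) != interface_id(second_boot))
```

    If the identifier differs from one boot to the next on a PPP WAN, that is consistent with Issue 2.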

    All feedback is welcome. My hope is to get this to a point where I can submit a pull request in confidence that the patch makes things better.

    I'm aware that Issue 2 needs a redmine bug. Issue 1 probably needs a redmine bug as well - I'll sort those out in due course.



  • David, as far as I'm concerned your patch works and does not break my existing working config.
    Go ahead :)



  • Thanks much for your efforts on this, David.

    Having followed the threads here, and reviewed and tested the change, I'm confident enough in it now that I'd merge the pull request to master as it stands. Go ahead and get a bug ticket and pull request opened and I'll merge.



  • Well guys I would just like to add that this truly is an amazing community :)



  • CF with full pfSense-2.3-BETA-2g-i386-nanobsd-20160109-0142 and IPv6-PPP-patch : All OK :)
    Same config and results as pfSense-2.2.6.

    
    <wan>
        <enable></enable>
        <if>pppoe1</if>
        <alias-address></alias-address>
        <alias-subnet>32</alias-subnet>
        <spoofmac></spoofmac>
        <blockpriv></blockpriv>
        <ipaddr>pppoe</ipaddr>
        <ipaddrv6>dhcp6</ipaddrv6>
        <dhcp6-duid></dhcp6-duid>
        <dhcp6-ia-pd-len>16</dhcp6-ia-pd-len>
        <dhcp6-ia-pd-send-hint></dhcp6-ia-pd-send-hint>
        <dhcp6prefixonly></dhcp6prefixonly>
        <dhcp6usev4iface></dhcp6usev4iface>
        <adv_dhcp6_interface_statement_send_options>ia-pd 0</adv_dhcp6_interface_statement_send_options>
        <adv_dhcp6_id_assoc_statement_address_id>0</adv_dhcp6_id_assoc_statement_address_id>
        <adv_dhcp6_id_assoc_statement_prefix_enable>Selected</adv_dhcp6_id_assoc_statement_prefix_enable>
        <adv_dhcp6_id_assoc_statement_prefix_id>0</adv_dhcp6_id_assoc_statement_prefix_id>
        <adv_dhcp6_config_advanced>Selected</adv_dhcp6_config_advanced>
    </wan>
    
    


  • @cmb:

    Thanks much for your efforts on this, David.

    Having followed the threads here, and reviewed and tested the change, I'm confident enough in it now that I'd merge the pull request to master as it stands. Go ahead and get a bug ticket and pull request opened and I'll merge.

    I hope to get the redmine bugs filed and pull requests made later today.

    I'm proposing to open one bug per issue, but submit pull requests that contain both fixes - one for master and one for RELENG_2_2. I'll put an @cbuechler reference somewhere in the pull requests so that you know when this is done.



  • I don't know if this is related or not, but since the approximate time of this update, my native IPv6 DHCP/PD connection seems broken.

    Initially, the WAN interface acquires an IPv6 address and the LAN 'tracks' with a prefix.  Upon renew, the following is logged, and the WAN IPv6 public IP is removed:

    /rc.newwanipv6: rc.newwanipv6: Failed to update WAN[wan] IPv6, restarting…

    dhcp6c is running and renewing both the prefix and the WAN address (confirmed on the DHCP server), but the public WAN address never gets reassigned to the interface.

    Any recommendations for further debugging on this?



  • @quantumx:

    I don't know if this is related or not, but since the approximate time of this update, my native IPv6 DHCP/PD connection seems broken.

    It's not related at all. These patches have yet to be merged to pfSense, so you would only be running with them if you had installed them using the instructions earlier in the thread.

    Nothing in the patches touches /etc/rc.newwanipv6. The only way these patches would upset your connection is if they somehow stopped dhcp6c running at all on your WAN interface. If this has happened, it's a regression that needs sorting out. What does ps -auwwx | grep dhcp6c show?
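
    For anyone who wants to check pasted ps output for Issue 1, here is a rough Python sketch that counts dhcp6c processes per interface. It assumes the interface name is the last argument, as in the dhcp6c command lines quoted in this thread. The function name is made up, and the second process line (PID 24999) is invented to show what a duplicate would look like:

```python
from collections import Counter

def dhcp6c_per_interface(ps_output: str) -> Counter:
    """Count dhcp6c processes per interface in pasted ps output."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        for i, field in enumerate(fields):
            # Match the executable path only, so "grep dhcp6c" is ignored.
            if field.endswith("/dhcp6c") and i < len(fields) - 1:
                counts[fields[-1]] += 1  # assumption: interface is the last argument
                break
    return counts

SAMPLE = """\
24158  -  Is     0:00.09 /usr/local/sbin/dhcp6c -d -c /var/etc/dhcp6c_wan.conf -p /var/run/dhcp6c_em1.pid em1
24999  -  Is     0:00.02 /usr/local/sbin/dhcp6c -d -c /var/etc/dhcp6c_wan.conf -p /var/run/dhcp6c_em1.pid em1
59438  0  S+     0:00.00 grep dhcp6c
"""

duplicates = {ifname: n for ifname, n in dhcp6c_per_interface(SAMPLE).items() if n > 1}
print(duplicates)
```

    An empty result means at most one dhcp6c per interface; no dhcp6c line at all for the WAN interface would point to the regression described above.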

    @quantumx:

    /rc.newwanipv6: rc.newwanipv6: Failed to update WAN[wan] IPv6, restarting…

    For whatever reason, /etc/rc.newwanipv6 thinks your WAN interface doesn't have a valid IPv6 address or doesn't have an IPv6 address at all.

    What does ifconfig pppoe0 (or whatever your WAN interface is) show when the error is generated? Are there any dhcp6c errors in the DHCP log?
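
    As a rough aid when reading the ifconfig output, this Python sketch lists the inet6 addresses that are not link-local. It is not what /etc/rc.newwanipv6 does internally; the function name and the sample output (which uses the 2001:db8::/32 documentation prefix) are made up for illustration:

```python
import ipaddress

def non_link_local_inet6(ifconfig_output: str) -> list:
    """Return inet6 addresses that are neither link-local nor loopback."""
    found = []
    for line in ifconfig_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "inet6":
            # Strip the "%ifname" zone suffix that ifconfig prints on link-local addresses.
            ip = ipaddress.IPv6Address(fields[1].split("%", 1)[0])
            if not (ip.is_link_local or ip.is_loopback or ip.is_unspecified):
                found.append(str(ip))
    return found

# Made-up, simplified ifconfig fragments.
ONLY_LINK_LOCAL = """\
pppoe0: flags=... mtu 1492
	inet6 fe80::21b:21ff:fe12:3456%pppoe0 prefixlen 64 scopeid 0x8
"""
WITH_PUBLIC = ONLY_LINK_LOCAL + "\tinet6 2001:db8:1:2::1 prefixlen 64\n"

print(non_link_local_inet6(ONLY_LINK_LOCAL))
print(non_link_local_inet6(WITH_PUBLIC))
```

    An empty list at the moment the error is logged would match the symptom of the public WAN address having been removed.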



  • @David_W:

    @quantumx:

    I don't know if this is related or not, but since the approximate time of this update, my native IPv6 DHCP/PD connection seems broken.

    It's not related at all. These patches have yet to be merged to pfSense, so you would only be running with them if you had installed them using the instructions earlier in the thread.

    That's what I was figuring, though it wasn't clear whether he'd actually applied the patch. I guess this change might fix his situation if it's one where dhcp6c ended up running twice, though I'm not sure it fails with that message in that situation.

    Regardless, what David asked for are the best steps to troubleshoot. If you haven't applied his patch, please start a new thread, as your problem has no relation to this one. Alternatively, you might want to try applying his patch and then report back here, as it's likely to be merged soon.



  • Apologies.  No offence intended  ;)  This just seemed like a likely spot to start.

    dhcp6c is running (ps -auwwx | grep dhcp6c):

    /usr/local/sbin/dhcp6c -d -c /var/etc/dhcp6c_wan.conf -p /var/run/dhcp6c_em1_vlan30.pid em1_vlan30

    Digging…..



  • David, after 2 weeks with the patch applied on a 2.2.6 system everything is ok here, no problems whatsoever.



  • Hi David,

    I had both IPv6 issues, and after running the patch on 2.3 for over a week everything is still working as it should. Before the patch I had to go in and manually kill the multiple processes almost every 12 hours, and it would take me 15-20 min to get a usable IPv6 address on PPPoE.

    I have not had any issues after multiple reboots, it immediately gives me a usable IPv6 connection and there are no extra processes running.

    Thanks,

    Robbert



  • Please merge it :)



  • Since David hasn't been around for a bit, I went ahead and merged his patch.
    https://redmine.pfsense.org/issues/5621

    Those who were using his patch, please remove that patch and upgrade tomorrow (or later) and report back.



  • Hi Chris!
    Would gitsync also do it?

    Br,Greg



  • @maverick_slo:

    Would gitsync also do it?

    Yes, but if you haven't done an upgrade in the past ~24 hours, you'll break the system if you only gitsync as some changes yesterday require a PHP upgrade.



  • OK so I can do upgrade+gitsync and I can test now :) ?



  • @maverick_slo:

    OK so I can do upgrade+gitsync and I can test now :) ?

    Yes.



  • I'm not sure if I have the latest fix or not but I am still having lots of issues with many dhcpd -6 and dhcpleases6 processes.

    My pfSense version is:

    2.3-BETA (amd64)
    built on Fri Jan 29 10:31:24 CST 2016
    FreeBSD 10.3-PRERELEASE

    right now I see:

    
    [2.3-BETA][admin@pfs.dv.loc]/root: ps Ax | grep dhcp
    24158  -  Is     0:00.09 /usr/local/sbin/dhcp6c -d -c /var/etc/dhcp6c_wan.conf -p /var/run/dhcp6c_em1.pid em1
    61348  -  Ss     0:00.01 /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid em0 
    61720  -  Is     0:00.00 /usr/local/sbin/dhcpleases6 -c /usr/local/bin/php-cgi -f /usr/local/sbin/prefixes.php|/bin/sh -l /var/dhcpd/var/db/dhcpd6.leases
    70366  -  Is     0:00.00 /usr/local/sbin/dhcpleases -l /var/dhcpd/var/db/dhcpd.leases -d dv.loc -p /var/run/unbound.pid -u /var/unbound/dhcpleases_entries.conf -h /var/etc/hosts
    89153  -  Ss     0:06.13 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -P /var/run/syslog.pid -f /var/etc/syslog.conf -b 172.22.22.254
    90907  -  Ss     0:00.02 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0 ue0
    91854  -  Ss     0:00.02 /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid em0 ue0
    92199  -  Is     0:00.00 /usr/local/sbin/dhcpleases6 -c /usr/local/bin/php-cgi -f /usr/local/sbin/prefixes.php|/bin/sh -l /var/dhcpd/var/db/dhcpd6.leases
    97815  -  Ss     0:00.01 /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid em0 ue0
    98191  -  Is     0:00.00 /usr/local/sbin/dhcpleases6 -c /usr/local/bin/php-cgi -f /usr/local/sbin/prefixes.php|/bin/sh -l /var/dhcpd/var/db/dhcpd6.leases
    59438  0  S+     0:00.00 grep dhcp
    
    

    My problem may be different because it seems to have something to do with my OPT1 (ue0) interface going away and coming back many times an hour:

    
    [2.3-BETA][admin@pfs.dv.loc]/root: clog /var/log/system.log | grep linkup | tail -n 10
    Jan 29 18:38:34 pfs php-fpm[30595]: /rc.linkup: HOTPLUG: Configuring interface opt1
    Jan 29 18:38:36 pfs php-fpm[30595]: /rc.linkup: The command '/usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0 ue0' returned exit code '1', the output was 'Internet Systems Consortium DHCP Server 4.2.8 Copyright 2004-2015 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Wrote 0 deleted host decls to leases file. Wrote 0 new dynamic host decls to leases file. Wrote 0 leases to leases file. Listening on BPF/ue0/a0:ce:c8:01:c6:ca/172.22.23.0/24 Sending on   BPF/ue0/a0:ce:c8:01:c6:ca/172.22.23.0/24 Listening on BPF/em0/00:90:fb:38:84:96/172.22.22.0/24 Sending on   BPF/em0/00:90:fb:38:84:96/172.22.22.0/24 Can't bind to dhcp address: Address already in use Please make sure there is no other dhcp server running and that there's no entry for dhcp or bootp in /etc/inetd.conf.   Also make sure you are not running HP JetAdmin software, which includes a bootp server.  If you did not get this software
    Jan 29 18:39:51 pfs php-fpm[47071]: /rc.linkup: DEVD Ethernet detached event for opt1
    Jan 29 18:39:51 pfs php-fpm[47071]: /rc.linkup: DEVD Ethernet attached event for opt1
    Jan 29 18:39:51 pfs php-fpm[47071]: /rc.linkup: HOTPLUG: Configuring interface opt1
    Jan 29 18:39:54 pfs php-fpm[47071]: /rc.linkup: The command '/usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0 ue0' returned exit code '1', the output was 'Internet Systems Consortium DHCP Server 4.2.8 Copyright 2004-2015 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Wrote 0 deleted host decls to leases file. Wrote 0 new dynamic host decls to leases file. Wrote 0 leases to leases file. Listening on BPF/ue0/a0:ce:c8:01:c6:ca/172.22.23.0/24 Sending on   BPF/ue0/a0:ce:c8:01:c6:ca/172.22.23.0/24 Listening on BPF/em0/00:90:fb:38:84:96/172.22.22.0/24 Sending on   BPF/em0/00:90:fb:38:84:96/172.22.22.0/24 Can't bind to dhcp address: Address already in use Please make sure there is no other dhcp server running and that there's no entry for dhcp or bootp in /etc/inetd.conf.   Also make sure you are not running HP JetAdmin software, which includes a bootp server.  If you did not get this software
    Jan 29 18:39:55 pfs php-fpm[65925]: /rc.linkup: DEVD Ethernet detached event for opt1
    Jan 29 18:39:55 pfs php-fpm[67730]: /rc.linkup: DEVD Ethernet attached event for opt1
    Jan 29 18:39:55 pfs php-fpm[67730]: /rc.linkup: HOTPLUG: Configuring interface opt1
    Jan 29 18:39:58 pfs php-fpm[67730]: /rc.linkup: The command '/usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0 ue0' returned exit code '1', the output was 'Internet Systems Consortium DHCP Server 4.2.8 Copyright 2004-2015 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Wrote 0 deleted host decls to leases file. Wrote 0 new dynamic host decls to leases file. Wrote 0 leases to leases file. Listening on BPF/ue0/a0:ce:c8:01:c6:ca/172.22.23.0/24 Sending on   BPF/ue0/a0:ce:c8:01:c6:ca/172.22.23.0/24 Listening on BPF/em0/00:90:fb:38:84:96/172.22.22.0/24 Sending on   BPF/em0/00:90:fb:38:84:96/172.22.22.0/24 Can't bind to dhcp address: Address already in use Please make sure there is no other dhcp server running and that there's no entry for dhcp or bootp in /etc/inetd.conf.   Also make sure you are not running HP JetAdmin software, which includes a bootp server.  If you did not get this software
    
    

    Does this seem familiar or do I have another problem?  Maybe it's because of the USB network interface?
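
    To put a number on how often the interface flaps, the detach events in the log excerpt above can be counted per minute. A small Python sketch, using lines copied from that excerpt; the regular expression and function name are illustrative and assume the syslog line layout shown:

```python
import re
from collections import Counter

# Captures "Mon DD HH:MM" and the interface name from rc.linkup detach lines.
DETACH = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d):\d\d .*DEVD Ethernet detached event for (\S+)"
)

def detaches_per_minute(log_text: str) -> Counter:
    """Count detach events keyed by (interface, minute)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DETACH.match(line.strip())
        if match:
            counts[(match.group(2), match.group(1))] += 1
    return counts

SAMPLE = """\
Jan 29 18:39:51 pfs php-fpm[47071]: /rc.linkup: DEVD Ethernet detached event for opt1
Jan 29 18:39:51 pfs php-fpm[47071]: /rc.linkup: DEVD Ethernet attached event for opt1
Jan 29 18:39:51 pfs php-fpm[47071]: /rc.linkup: HOTPLUG: Configuring interface opt1
Jan 29 18:39:55 pfs php-fpm[65925]: /rc.linkup: DEVD Ethernet detached event for opt1
Jan 29 18:39:55 pfs php-fpm[67730]: /rc.linkup: DEVD Ethernet attached event for opt1
"""

for (ifname, minute), n in sorted(detaches_per_minute(SAMPLE).items()):
    print(ifname, minute, n)
```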



  • @iamzam:


    Does this seem familiar or do I have another problem?  Maybe it's because of the USB network interface?

    Different. Those are hotplug issues within your own site. This topic is about dhcp6c on WAN connections to the global Internet via an ISP…



  • @iamzam:

    My problem may be different because it seems to have something to do with my OPT1 (ue0) interface going away and coming back many times an hour:

    As hda says, you have a different issue - but one that is somewhat related to that discussed in this thread.

    Your issue happens because your USB network interface keeps going down and coming back up. It really would be in your interests to fix or replace the interface if possible, as an interface should not behave this way.

    The root cause is that nothing is terminating the dhcpd / dhcpleases6 / dhcpleases processes when the interface goes down. It may well be that interface_bring_down() (/etc/inc/interfaces.inc from around line 1219) should do this, though it would be important to check for possible side effects before making this change because this function is used in /etc/rc.linkup when a hotpluggable interface goes down (as in your scenario) but also in many other places.
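
    The leftover processes are visible in the ps listing earlier in the thread: several dhcpd -6 processes name the same pid file. A Python sketch that groups dhcpd PIDs by their `-pf` argument and reports any pid file claimed by more than one process; the function name is made up, and the input lines are copied from that listing:

```python
from collections import defaultdict

def shared_pidfiles(ps_output: str, flag: str = "-pf") -> dict:
    """Map each pid file named by more than one process to those PIDs."""
    groups = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split()
        # Expect the PID first and the flag followed by a path.
        if fields and fields[0].isdigit() and flag in fields[:-1]:
            pidfile = fields[fields.index(flag) + 1]
            groups[pidfile].append(int(fields[0]))
    return {path: pids for path, pids in groups.items() if len(pids) > 1}

SAMPLE = """\
61348  -  Ss     0:00.01 /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid em0
90907  -  Ss     0:00.02 /usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid em0 ue0
91854  -  Ss     0:00.02 /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid em0 ue0
97815  -  Ss     0:00.01 /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid em0 ue0
"""

print(shared_pidfiles(SAMPLE))
```

    A pid file can record only one PID, so any process stopped via the pid file would leave the other two running, which fits the picture of nothing terminating them when the interface goes down.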