if_pppoe problems with php-fpm causing loops. (resolved)
-
So I have spent the past hour or two diagnosing major issues with if_pppoe.
They are the following.
1 - When a connection is initiated, whether by the boot process, reconfiguring the interface, or cycling the interface, the PPPoE WAN automatically cycles roughly every 10 seconds. This of course has a snowball effect on VPNs, services etc.
2 - The fixed /128 IPv6 address I get from my ISP, designed to sit directly on the firewall, only appears on my WAN interface when using the legacy PPPoE. The prefixes for my LAN still work fine, but there is no automatic DHCP6 allocation on WAN anymore.
#1 is the biggest problem. After some time I noticed that if I select option 16 in the console/ssh menu to terminate all php-fpm services, the PPPoE settles down and stops cycling. After doing this I may need to start anything that went down mid cycle, such as dpinger or unbound. This only works until the next time PPPoE is connected, at which point it starts another PHP-FPM doom loop.
#2 might not be if_pppoe specific but rather 2.8.0 specific. I can see from the console that when the system booted I had the WAN IPv6. I was then disabling things to try and stabilise the PPPoE, and when I enabled DHCP6 again it hasn't worked since. I validated the config against a backup and have verified it is configured the same, with the exception that pfSense now configures '<adv_dhcp6_prefix_selected_interface>wan</adv_dhcp6_prefix_selected_interface>' instead of '<adv_dhcp6_prefix_selected_interface>lan</adv_dhcp6_prefix_selected_interface>' for the pppoe interface. However, even after manually changing that, it didn't bring back the WAN IPv6.
I do plan to ask my ISP to check their logs, to see if they see anything unusual.
-
Ok it is logging just in the main log section, so having a look before it gets pushed off the screen.
Jul 4 09:07:10 php 8818 /usr/local/sbin/ppp-ipv6: Accept router advertisements on interface pppoe2
Jul 4 09:07:09 php 5494 /usr/local/sbin/ppp-ipv6: Starting rtsold process on wan(pppoe2)
Jul 4 09:07:09 php 5494 /usr/local/sbin/ppp-ipv6: Starting DHCP6 client for interfaces pppoe2
Jul 4 09:07:09 php 5494 /usr/local/sbin/ppp-ipv6: Accept router advertisements on interface pppoe2
Jul 4 09:07:09 kernel 8108 if_pppoe: pppoe2: failed to set default route 17
Jul 4 09:07:09 kernel 8108 pppoe2: link state changed to UP
Jul 4 09:07:09 kernel 8108 if_pppoe: pppoe2: failed to clear IP address: 49
Jul 4 09:07:09 kernel 8108 pppoe2: link state changed to DOWN
Jul 4 09:07:09 kernel 8108 pppoe2: link state changed to UP
Jul 4 09:07:01 kernel 8101 pppoe2: received unexpected PADO
Jul 4 09:07:01 kernel 8101 pppoe2: received unexpected PADO
Jul 4 09:07:01 kernel 8101 pppoe2: link state changed to DOWN
Now for what I think is the possible trigger of the cycle.
Jul 4 09:07:01 php-fpm 68331 /interfaces.php: The command '/sbin/ifconfig 'pppoe2' inet 'censored'/'32' alias ' returned exit code '1', the output was 'ifconfig: in_exec_nl(): Empty IFA_LOCAL/IFA_ADDRESS ifconfig: ioctl (SIOCAIFADDR): Invalid argument'
It looks like it is reporting a failure to add an IP alias I configured as a virtual IP. However, this IP does get added; it is on the interface when I look.
Jul 4 09:07:01 php-fpm 68331 /interfaces.php: calling interface_dhcpv6_configure.
Jul 4 09:07:01 php 53849 /usr/local/sbin/ppp-ipv6: The command '/sbin/ifconfig 'pppoe2' inet6 -accept_rtadv' returned exit code '1', the output was 'ifconfig: interface pppoe2 does not exist'
Jul 4 09:07:01 php 46706 /usr/local/sbin/ppp-ipv6: The command '/sbin/ifconfig 'pppoe2' inet6 -accept_rtadv' returned exit code '1', the output was 'ifconfig: interface pppoe2 does not exist'
Jul 4 09:07:01 php-fpm 68331 /interfaces.php: The command '/sbin/ifconfig 'pppoe2' inet6 -accept_rtadv' returned exit code '1', the output was 'ifconfig: interface pppoe2 does not exist'
Jul 4 09:07:00 kernel 8100 pppoe2: link state changed to DOWN
I will update this post in a moment with what happens when I remove the alias.
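For anyone wanting to measure how often the link is flapping rather than eyeball it, here is a rough sketch that counts the "link state changed" lines per interface. The function name count_link_flaps is made up for illustration, and it assumes the log lines look like the kernel lines pasted above.

```shell
#!/bin/sh
# Hypothetical helper: count "link state changed to UP/DOWN" events per
# interface from syslog-style lines on stdin.
count_link_flaps() {
    awk '
        /link state changed to (UP|DOWN)/ {
            # The interface name is the field ending in ":" right before "link".
            for (i = 1; i < NF; i++) {
                if ($(i + 1) == "link" && $i ~ /:$/) {
                    ifname = substr($i, 1, length($i) - 1)
                    count[ifname " " $NF]++
                }
            }
        }
        END {
            for (key in count) print key, count[key]
        }
    ' | sort
}

# Example (assumes the system log is plain text at this path):
#   count_link_flaps < /var/log/system.log
```

Each output line is the interface, the state, and how many times that transition was logged, so a cycling PPPoE shows up as DOWN and UP counts that keep climbing.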
-
Ok, so the alias was causing the loop. I have tested with manual commands, and it looks like I cannot manually add any IPs to the pppoe interface; I just get an error similar to the one in the log. So if_pppoe seems to have issues with post-connection reconfiguration.
I also suggest removing the abort in the PHP script that terminates pppoe when an alias or DHCP6 failure occurs.
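Since the log reported a failure for the alias even though the address was on the interface, it can help to check the interface state directly instead of trusting the exit code. A minimal sketch, assuming FreeBSD-style ifconfig output where each IPv4 address is on a line starting with "inet"; has_inet_addr is a hypothetical name and 192.0.2.10 is just a documentation address.

```shell
#!/bin/sh
# Hypothetical helper: read ifconfig output on stdin and succeed (exit 0)
# only if the given IPv4 address appears on an "inet" line.
has_inet_addr() {
    awk -v want="$1" '
        $1 == "inet" && $2 == want { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example:
#   if ifconfig pppoe2 | has_inet_addr 192.0.2.10; then
#       echo "alias is present"
#   else
#       echo "alias is missing"
#   fi
```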
-
I am back on mpd pppoe now, and can confirm the IPv6 is back, as is the alias, with no warnings. I can also manually add aliases to the pppoe interface.
If I google the error it seems to lead to DCO VPN discussions, so I am not quite sure what's going on, but I have pinpointed what is going wrong.
I am willing to try it again if anyone comes on here with suggestions.
-
The issue is already known and reported, but linking this post on redmine in case anything is missed.
https://redmine.pfsense.org/issues/16235
-
@chrcoluk said in if_pppoe serious problems with php-fpm causing loops.:
https://redmine.pfsense.org/issues/16235
Try the patch linked there. That should resolve it, it did in all our test cases. But pppoe has quite a lot of variation so yours might be a new edge case that would be good to test/fix before 2.8.1.
-
@stephenw10 I will do tonight thanks, I need to determine if the IPv6 issue is a separate bug or will be ok with the patch.
-
A bunch of work has gone into the IPv6 setup since 2.8.0, mostly around the variation in how ISPs send RAs.
Try the patch here: https://redmine.pfsense.org/issues/16265
-
@stephenw10 Ok, I have just done it now: patched the file and rebooted with if_pppoe. The good news is it came online with no loops and all services running. The bad news is the IPv6 is still an issue. I already raised the IPv6 problem on redmine and will add a note to say I have now tested it with the patch.
-
OK so it just fails to pull a lease at all? Also no PD?
Are you requesting a PD only?
Do you have 'Do not wait for RAs' set?
The behaviour of that setting may be different. Some users have reported needing it set or unset with if_pppoe. It is ISP dependent.
Who is your ISP, if you can say?
-
@stephenw10 The ISP is AAISP. (UK)
The configuration is to request prefix/information via IPv4 connectivity.
Prefix delegation set to none.
No prefix hint, 'do not wait for RA' unticked, 'request only prefix' unticked.
The IPv6 subnets are working as normal on my LAN interfaces, not requested via DHCP6, but normally AAISP assign a single /128 to the pppoe interface on the firewall for its own use. This is what is failing. Without touching anything other than the enable if_pppoe box, it works on mpd.
I am leaving this booted on if_pppoe now since it is at least stable, so I will be able to try and fix it by tinkering, if that's possible.
I added extra info here.
https://redmine.pfsense.org/issues/16300
See here for some documentation on what AAISP do.
https://support.aa.net.uk/IPv6
It is that 2001: that is failing to attach.
-
@chrcoluk said in if_pppoe serious problems with php-fpm causing loops.:
The ISP is AAISP. (UK)
Nice! That explains the control you have then.
Do you see it trying to pull a dhcpv6 lease? Do you see it logging a RA after it connects at v4?
-
@stephenw10 Yeah. Bear in mind I am no expert on dhcp6c, but what I am seeing suggests it is managing to pull something. It then runs '/var/etc/dhcp6c_wan_script.sh', which seems to fail. I posted these logs on the redmine, and I will post snippets here as well for you.
Sorry, I also forgot to mention I have now tried the 'do not wait for RA' option as well.
Here is the end of it.
Jul 4 13:46:23 dhcp6c 21264 exiting
Jul 4 13:46:23 dhcp6c 21264 script "/var/etc/dhcp6c_wan_script.sh" terminated
Jul 4 13:46:23 dhcp6c 22610 script "/var/etc/dhcp6c_wan_script.sh" cannot be executed safely
Jul 4 13:46:23 dhcp6c 22610 lstat failed: No such file or directory
Jul 4 13:46:23 dhcp6c 21264 executes /var/etc/dhcp6c_wan_script.sh
Jul 4 13:46:23 dhcp6c 21264 removing an event on pppoe2, state=INIT
Jul 4 13:46:23 dhcp6c 21264 exit without release
Here is all of it. I just noticed there is an error at the start as well. Whether that also happens when it succeeds on mpd I don't know, as I wasn't looking.
Jul 4 13:46:23 dhcp6c 21264 exiting
Jul 4 13:46:23 dhcp6c 21264 script "/var/etc/dhcp6c_wan_script.sh" terminated
Jul 4 13:46:23 dhcp6c 22610 script "/var/etc/dhcp6c_wan_script.sh" cannot be executed safely
Jul 4 13:46:23 dhcp6c 22610 lstat failed: No such file or directory
Jul 4 13:46:23 dhcp6c 21264 executes /var/etc/dhcp6c_wan_script.sh
Jul 4 13:46:23 dhcp6c 21264 removing an event on pppoe2, state=INIT
Jul 4 13:46:23 dhcp6c 21264 exit without release
Jul 4 13:46:22 dhcp6c 21264 reset a timer on pppoe2, state=INIT, timeo=0, retrans=891
Jul 4 13:46:22 dhcp6c 21004 called
Jul 4 13:46:22 dhcp6c 21004 called
Jul 4 13:46:22 dhcp6c 21004 <3>end of sentence [;] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>end of closure [}] (1)
Jul 4 13:46:22 dhcp6c 21004 <13>begin of closure [{] (1)
Jul 4 13:46:22 dhcp6c 21004 <13>[0] (1)
Jul 4 13:46:22 dhcp6c 21004 <13>[na] (2)
Jul 4 13:46:22 dhcp6c 21004 <3>[id-assoc] (8)
Jul 4 13:46:22 dhcp6c 21004 <3>end of sentence [;] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>end of closure [}] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>comment [# we'd like some nameservers please] (35)
Jul 4 13:46:22 dhcp6c 21004 <3>end of sentence [;] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>["/var/etc/dhcp6c_wan_script.sh"] (31)
Jul 4 13:46:22 dhcp6c 21004 <3>[script] (6)
Jul 4 13:46:22 dhcp6c 21004 <3>end of sentence [;] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>[domain-name] (11)
Jul 4 13:46:22 dhcp6c 21004 <3>[request] (7)
Jul 4 13:46:22 dhcp6c 21004 <3>end of sentence [;] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>[domain-name-servers] (19)
Jul 4 13:46:22 dhcp6c 21004 <3>[request] (7)
Jul 4 13:46:22 dhcp6c 21004 <3>comment [# request stateful address] (26)
Jul 4 13:46:22 dhcp6c 21004 <3>end of sentence [;] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>[0] (1)
Jul 4 13:46:22 dhcp6c 21004 <3>[ia-na] (5)
Jul 4 13:46:22 dhcp6c 21004 <3>[send] (4)
Jul 4 13:46:22 dhcp6c 21004 <3>begin of closure [{] (1)
Jul 4 13:46:22 dhcp6c 21004 <5>[pppoe2] (6)
Jul 4 13:46:22 dhcp6c 21004 <3>[interface] (9)
Jul 4 13:46:22 dhcp6c 21004 skip opening control port
Jul 4 13:46:22 dhcp6c 21004 failed initialize control message authentication
Jul 4 13:46:22 dhcp6c 21004 failed to open /usr/local/etc/dhcp6cctlkey: No such file or directory
Jul 4 13:46:22 dhcp6c 21004 extracted an existing DUID from /var/db/dhcp6c_duid: <censored>
-
Both logs are reverse order I assume?
-
@stephenw10 yes, newest at the top.
-
Ok, I have done a little more testing. The dhcp6c client was not staying in a running state, and it is supposed to stay running.
I first tried to just manually start it after pppoe2 was online, but it just logs sending a solicit every 2 seconds with not much else happening. Also, '/var/etc/dhcp6c.conf' is unpopulated.
I then rebooted into legacy mode, and '/var/etc/dhcp6c.conf' is populated with dhcp6c running as a daemon.
The contents are as below.
interface pppoe2 {
    send ia-na 0; # request stateful address
    request domain-name-servers;
    request domain-name;
    script "/var/etc/dhcp6c_wan_script.sh"; # we'd like some nameservers please
};
id-assoc na 0 { };
I then rebooted back into if_pppoe mode, let the IPv4 come online, manually populated the above file to match, and then started dhcp6c from the command line with the same syntax as is used automatically, and it's now working.
I think it's a timing issue: dhcp6c is called too early on if_pppoe. I will update the redmine post.
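If anyone wants to test the timing theory by hand, something like the following could be used to hold off until the config file actually has content before starting the client. This is only a sketch of the idea, not how pfSense does it; wait_for_nonempty is a made-up name and the 30 second limit is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until FILE exists and is
# non-empty, giving up after TRIES attempts. Returns 0 when the file is ready.
wait_for_nonempty() {
    file=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        [ -s "$file" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    # Final check so that a TRIES value of 0 still reports the current state.
    [ -s "$file" ]
}

# Example: only go on to start dhcp6c by hand once the config is populated.
#   if wait_for_nonempty /var/etc/dhcp6c.conf 30; then
#       echo "config populated, safe to start dhcp6c"
#   else
#       echo "config still empty after 30 seconds"
#   fi
```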
-
@chrcoluk said in if_pppoe serious problems with php-fpm causing loops.:
I think its a timing issue, dhcp6c is called too early on if_pppoe
I just want to clarify — you're not using a multi-WAN configuration, right? And no CARP?
-
@w0w There is only one active pppoe. I think it is named pppoe2 because of an old WAN config that was disabled ages ago.
Not using CARP.
-
@chrcoluk said in if_pppoe serious problems with php-fpm causing loops.:
/var/etc/dhcp6c_wan_script.sh
What's in that script in each pppoe mode?
-
@stephenw10 I already know the contents in if_pppoe mode. I don't know when I can next reboot, as I have been annoying people with all the stuff I am doing, but I will check it on mpd as soon as possible. I think this one is without debug activated.
#!/bin/sh
# This shell script launches /etc/rc.newwanipv6 with a interface argument.
dmips=${new_domain_name_servers}
dmnames=${new_domain_name}
case $REASON in
REBIND)
    /usr/bin/logger -t dhcp6c "dhcp6c rebind on pppoe2"
    ;;
REQUEST|RELEASE)
    /usr/bin/logger -t dhcp6c "dhcp6c RELEASE, REQUEST or EXIT on pppoe2 running rc.newwanipv6"
    /usr/local/sbin/fcgicli -f /etc/rc.newwanipv6 -d "interface=pppoe2&dmnames=${dmnames}&dmips=${dmips}&reason=${REASON}"
    ;;
RENEW|INFO)
    /usr/bin/logger -t dhcp6c "dhcp6c RENEW on pppoe2 running rc.newwanipv6"
    /usr/local/sbin/fcgicli -f /etc/rc.newwanipv6 -d "interface=pppoe2&dmnames=${dmnames}&dmips=${dmips}&reason=${REASON}"
esac