PPPoE WAN does not restart correctly after reconfiguring interfaces.
-
hi
, the main problem I see is that when a new mpd process is started, the previously started processes are checked (whether the file /var/run/pppoe_wan.pid exists) if the file exists, a command is sent to terminate the process.Apr 17 05:55:01 gate1 ppp[90881]: process 90881 started, version 5.9
Apr 17 05:55:01 gate1 ppp[90881]: waiting for process 47767 to die...Then, within 30 seconds (30 attempts), the closing of the previous mpd process is checked again. If this does not happen, then the program terminates (your case)
Apr 17 05:55:31 gate1 ppp[90881]: waiting for process 47767 to die...
Apr 17 05:55:32 gate1 ppp[90881]: can't lock /var/run/pppoe_wan.pid after 30 attempts. In your case, the previous process, for some unknown reason, was completed not within 30 seconds, but after 2 minutes and 30 seconds .
Apr 17 05:57:32 gate1 ppp[47767]: process 47767 terminated
-
BTW on the latest 24.03 RC builds I can't replicate this by issuing /usr/local/sbin/pfSctl -c 'interface reload wan' as before.
EDIT:
Got something, after switching to secondary firewall
2024-04-20 13:19:36.596539+03:00 ppp 98029 process 98029 terminated 2024-04-20 13:19:36.595640+03:00 ppp 98029 [wan_link0] Link: Shutdown 2024-04-20 13:19:36.498909+03:00 ppp 98029 [wan] Bundle: Shutdown 2024-04-20 13:19:35.807384+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:34.804613+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:34.496426+03:00 ppp 98029 [wan_link0] LCP: state change Closed --> Initial 2024-04-20 13:19:34.496379+03:00 ppp 98029 [wan_link0] LCP: Down event 2024-04-20 13:19:34.496330+03:00 ppp 98029 [wan_link0] LCP: Close event 2024-04-20 13:19:34.496281+03:00 ppp 98029 [wan_link0] Link: giving up after 0 reconnection attempts 2024-04-20 13:19:34.496227+03:00 ppp 98029 [wan_link0] Link: DOWN event 2024-04-20 13:19:34.495982+03:00 ppp 98029 [wan_link0] LCP: LayerFinish 2024-04-20 13:19:34.495923+03:00 ppp 98029 [wan_link0] LCP: state change Closing --> Closed 2024-04-20 13:19:34.495866+03:00 ppp 98029 [wan_link0] LCP: rec'd Terminate Ack #3 (Closing) 2024-04-20 13:19:34.495045+03:00 ppp 98029 [wan_link0] LCP: LayerDown 2024-04-20 13:19:34.494921+03:00 ppp 98029 [wan_link0] LCP: SendTerminateReq #3 2024-04-20 13:19:34.494857+03:00 ppp 98029 [wan] IPCP: state change Closed --> Initial 2024-04-20 13:19:34.494806+03:00 ppp 98029 [wan] IPCP: Down event 2024-04-20 13:19:34.494749+03:00 ppp 98029 [wan] IPCP: Close event 2024-04-20 13:19:34.494668+03:00 ppp 98029 [wan] Bundle: Status update: up 0 links, total bandwidth 9600 bps 2024-04-20 13:19:34.494075+03:00 ppp 98029 [wan_link0] Link: Leave bundle "wan" 2024-04-20 13:19:34.494024+03:00 ppp 98029 [wan_link0] LCP: state change Opened --> Closing 2024-04-20 13:19:34.493975+03:00 ppp 98029 [wan_link0] LCP: Close event 2024-04-20 13:19:34.493916+03:00 ppp 98029 [wan_link0] Link: CLOSE event 2024-04-20 13:19:34.493714+03:00 ppp 98029 [wan] IPCP: rec'd Terminate Ack #5 (Closed) 2024-04-20 13:19:34.493498+03:00 ppp 98029 [wan] Bundle: closing link "wan_link0"... 2024-04-20 13:19:34.493448+03:00 ppp 98029 [wan] Bundle: No NCPs left. Closing links... 2024-04-20 13:19:34.493399+03:00 ppp 98029 [wan] IPCP: LayerFinish 2024-04-20 13:19:34.493340+03:00 ppp 98029 [wan] IPCP: state change Closing --> Closed 2024-04-20 13:19:34.493281+03:00 ppp 98029 [wan] IPCP: rec'd Terminate Ack #4 (Closing) 2024-04-20 13:19:34.492968+03:00 ppp 98029 [wan] IPCP: SendTerminateReq #5 2024-04-20 13:19:34.492804+03:00 ppp 98029 [wan] IFACE: Set description "WAN" 2024-04-20 13:19:34.492667+03:00 ppp 98029 [wan] IFACE: Rename interface pppoe0 to pppoe0 2024-04-20 13:19:34.492452+03:00 ppp 98029 [wan] IFACE: Down event 2024-04-20 13:19:34.492224+03:00 ppp 98029 [wan] IFACE: Removing IPv4 address from pppoe0 failed(IGNORING for now. This should be only for PPPoE friendly!): Can't assign requested address 2024-04-20 13:19:33.789243+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:32.747960+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:31.746052+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:30.744279+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:29.742824+03:00 ppp 92838 waiting for process 98029 to die... 2024-04-20 13:19:29.740665+03:00 ppp 92838 process 92838 started, version 5.9 2024-04-20 13:19:29.742544+03:00 ppp 98029 [wan] IPCP: LayerDown 2024-04-20 13:19:29.742278+03:00 ppp 98029 [wan] IPCP: SendTerminateReq #4 2024-04-20 13:19:29.742176+03:00 ppp 98029 [wan] IPCP: state change Opened --> Closing 2024-04-20 13:19:29.742077+03:00 ppp 98029 [wan] IPCP: Close event 2024-04-20 13:19:29.741967+03:00 ppp 98029 [wan] IFACE: Close event 2024-04-20 13:19:29.741737+03:00 ppp 98029 caught fatal signal TERM 2024-04-20 13:19:29.738833+03:00 ppp 92838 Multi-link PPP daemon for FreeBSD
It does appear different. It seems to be connected to the 'Multilink extensions over single link' option. I can't replicate the issue after unchecking this option and issuing
/usr/local/sbin/pfSctl -c 'interface reload wan'
However, when it's checked, it consistently fails to reconnect. It doesn't seem to be a direct bug, but there's definitely something wrong when the multilink configuration is applied. The primary firewall has this option unchecked.
But I remember I have had exactly the same behaviour as posted here https://forum.netgate.com/post/1163644 can't find now when it's changed
-
Huh, that's interesting. That's not set by default, was there some reason you enabled it?
-
@stephenw10
At some point, I was testing the capabilities of the provider and forgot to disable this option. However, it was definitely not related to the initially discussed issue, which is why I brought up this topic. -
@stephenw10 said in PPPoE WAN does not restart correctly after reconfiguring interfaces.:
With the other WAN disabled for v6 does the mpd process finish without that min wait?
Seems so. See attached log file:
PPPoE not failing.txt -
Hmm, yeah 2.5s vs 2.5mins! Interestng.
Can I assume you don't have Multilink extensions enabled?
-
"Multilink extensions over single link" is not enabled.
-
It seems that after upgrading to version 24.03 the PPPoE fails only exactly after every other periodic reset or manual interface reload while there is a DHCP6 configuration on the other WAN interface. This leads to the question whether the issue is related to one of the files changed for that upgrade. I am puzzled.
-
Hmm, interesting. Unexpected!
How does the logging differ between the two resets?
-
Hi Steve, I didn't particularly want to add additional noise to this thread but I also have the occasional hiccup when taking the PPPoE WAN down and back up again.
Since the changes in 24.03 these are of a lower order of magnitude than previously they do still happen. Whilst I could not grab a log and point at it is my sense that this only happens when bringing the PPPoE WAN up now (since the changes) rather than both down and up.
I've not been doing any true controlled testing (due to the cloud hanging over those of us willing to test betas but likely to be billed for it in the near future) but the changes have made a considerable difference to both my Netgate and non-Netgate hardware.
However, they do still happen and when they do they can be oddly successive when they do happen, which causes considerable frustration. Other times I can go through multiple WAN down/up commands and have no issue at all.
Regards
️ -
Hmm, well if you can grab any logs to compare that may help here.
-
Here's a log including both the successful interface reset as well as the one with failing PPPoE (pfsense 24.03).
PPPoE not failing and failing v24.03.txt -
Hmm, are you sure you didn't label those the wrong way around? The first log looks like a failure and the second a success.
-
Yes, of course the labels should be swapped.
-
@stephenw10
Have you been able to dig into this?
It seems one of the changes made to version 24.03 was Revision 30d46b63. Since the behaviour changed with 24.03, the fault might be related to one of the files touched in that change. -
@stephenw10
Have you passed this on to the developers or do I need to somehow open a bug on Redmine? -
Mmm, checking what happened here...
-
After Update to 24.03 i have the same issue. Every night my ISP forces a reset and every night my PPPOE (Netgate 4100) will not reconnect. This is really frustrating:
°PPPOE Log°(Sep 15 03:20:01 ppp 48609 Multi-link PPP daemon for FreeBSD
Sep 15 03:20:01 ppp 48609 process 48609 started, version 5.9
Sep 15 03:20:01 ppp 4389 caught fatal signal TERM
Sep 15 03:20:01 ppp 4389 [wan] IFACE: Close event
Sep 15 03:20:01 ppp 4389 [wan] IPCP: Close event
Sep 15 03:20:01 ppp 4389 [wan] IPCP: state change Opened --> Closing
Sep 15 03:20:01 ppp 4389 [wan] IPCP: SendTerminateReq #4
Sep 15 03:20:01 ppp 4389 [wan] IPCP: LayerDown
Sep 15 03:20:01 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:01 ppp 4389 [wan] IFACE: Removing IPv4 address from pppoe0 failed(IGNORING for now. This should be only for PPPoE friendly!): Can't assign requested address
Sep 15 03:20:01 ppp 4389 [wan] IPV6CP: Close event
Sep 15 03:20:01 ppp 4389 [wan] IPV6CP: state change Opened --> Closing
Sep 15 03:20:01 ppp 4389 [wan] IPV6CP: SendTerminateReq #2
Sep 15 03:20:01 ppp 4389 [wan] IPV6CP: LayerDown
Sep 15 03:20:02 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:03 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:04 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:05 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:06 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:07 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:08 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:09 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:10 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:11 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:12 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:13 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:14 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:15 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:16 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:17 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:18 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:19 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:20 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:21 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:22 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:23 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:24 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:25 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:26 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:27 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:28 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:29 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:30 ppp 48609 waiting for process 4389 to die...
Sep 15 03:20:31 ppp 48609 can't lock /var/run/pppoe_wan.pid after 30 attempts
Sep 15 03:20:52 ppp 4389 [wan] IFACE: Down event
Sep 15 03:20:52 ppp 4389 [wan] IFACE: Rename interface pppoe0 to pppoe0
Sep 15 03:20:52 ppp 4389 [wan] IFACE: Set description "WAN"
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: SendTerminateReq #3
Sep 15 03:20:52 ppp 4389 [wan] IPCP: SendTerminateReq #5
Sep 15 03:20:52 ppp 4389 [wan] IPCP: rec'd Terminate Ack #4 (Closing)
Sep 15 03:20:52 ppp 4389 [wan] IPCP: state change Closing --> Closed
Sep 15 03:20:52 ppp 4389 [wan] IPCP: LayerFinish
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: rec'd Terminate Ack #2 (Closing)
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: state change Closing --> Closed
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: LayerFinish
Sep 15 03:20:52 ppp 4389 [wan] Bundle: No NCPs left. Closing links...
Sep 15 03:20:52 ppp 4389 [wan] Bundle: closing link "wan_link0"...
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: rec'd Terminate Request #3 (Opened)
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: state change Opened --> Stopping
Sep 15 03:20:52 ppp 4389 [wan_link0] Link: Leave bundle "wan"
Sep 15 03:20:52 ppp 4389 [wan] Bundle: Status update: up 0 links, total bandwidth 9600 bps
Sep 15 03:20:52 ppp 4389 [wan] IPCP: Close event
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: Close event
Sep 15 03:20:52 ppp 4389 [wan] IPCP: Down event
Sep 15 03:20:52 ppp 4389 [wan] IPCP: state change Closed --> Initial
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: Down event
Sep 15 03:20:52 ppp 4389 [wan] IPV6CP: state change Closed --> Initial
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: SendTerminateAck #4
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: LayerDown
Sep 15 03:20:52 ppp 4389 [wan_link0] PPPoE: connection closed
Sep 15 03:20:52 ppp 4389 [wan_link0] Link: DOWN event
Sep 15 03:20:52 ppp 4389 [wan_link0] Link: giving up after 0 reconnection attempts
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: Close event
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: state change Stopping --> Closing
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: Down event
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: LayerFinish
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: state change Closing --> Initial
Sep 15 03:20:52 ppp 4389 [wan_link0] Link: CLOSE event
Sep 15 03:20:52 ppp 4389 [wan_link0] LCP: Close event
Sep 15 03:20:54 ppp 4389 [wan] Bundle: Shutdown
Sep 15 03:20:54 ppp 4389 [wan_link0] Link: Shutdown
Sep 15 03:20:54 ppp 4389 process 4389 terminated)I have to downgrade to an older Version of Pfsense as i only have this problem with 24.03.
-
@Malec
Same for me.
I still encounter a race condition that causes mpd to start twice and get stuck, but I wrote my own script as a workaround. Not sure what the real status of the problem is. Still waiting for development snapshots available to re-test. -
@w0w Were you able to test with the older 24.08 public snapshot?