Missing WAN uptime and missing default gateway on dashboard

rcoleman-netgate

@phil_d
Verify that your WAN is configured for DHCP, and also that the routing table is using the WAN for its default route (System->Routing, the Default IPv4 route option at the bottom of the page).

Many thanks for the tips; however, I've tried all manner of default settings. I can confirm that routing shows as attached, and because it hasn't come up correctly (despite all traffic working fine) it just shows dynamic for the gateway address. I also tried restoring pfSense to defaults and configuring it only for PPPoE, and it did the same thing. It is random: roughly 4 out of 10 connection attempts come up fine.

WAN is set to PPPoE for IPv4, and IPv6 is set to DHCPv6.

rcoleman-netgate

@phil_d I would run a pcap (Diagnostics->Packet Capture) on the PPPoE interface, disconnect it, then reconnect and see if the service is assigning you a DHCP Route.

[If you have never used packet capturing] Download the pcap file and check it in a program like Wireshark to see what the DHCP return states.

@rcoleman-netgate Yes, I've used packet capture quite a bit, so I will try that.

Just to clarify: if I watch the dashboard whilst PPPoE comes up, the uptime count starts and I see the default gateway on the dashboard. Then, a few seconds later, the uptime and the default gateway disappear from the dashboard. All connectivity remains working, however. It is just that the dashboard stops displaying the uptime and the gateway.

If I manually add in the tmp files pppoe1_ip, pppoe1_router, pppoe1up which I noticed were missing when this happened, the dashboard sorts itself out, and it seems to be these files that get deleted after a few seconds or so and are needed by the dashboard. I suppose the question is: What script adds and deletes these files? I can then have a look or add some debugging.
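For anyone wanting to check for the same symptom, here is a minimal sketch that lists which of the three signpost files are missing. The function name missing_signposts and the optional directory argument are hypothetical, added for illustration; only the three file names come from the description above.

```shell
#!/bin/sh
# Report which of the dashboard signpost files are missing for an interface.
# Usage: missing_signposts <interface> [directory]
# The directory defaults to /tmp; the argument exists only so the check
# can be tried against a scratch directory.
missing_signposts() {
	ifname="$1"
	dir="${2:-/tmp}"
	for suffix in _ip _router up; do
		if [ ! -f "${dir}/${ifname}${suffix}" ]; then
			echo "${ifname}${suffix}"
		fi
	done
}

# Example: missing_signposts pppoe1
```

If it prints nothing, all three files are present.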

I wonder if it is a timing or race condition hence the randomness? My PPPoE comes up instantly, and this issue only started after moving to a new ISP where it is so much quicker to log in and establish a connection. Could it be the process of clearing the old tmp files is sometimes happening after the new files have already been written, and so they are getting deleted?
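One way to gather evidence for a timing issue like this is to poll the signpost files while PPPoE reconnects and log each change of state with a timestamp. This is only a rough sketch: the function names signpost_state and watch_signposts are hypothetical, and polling can miss any change that is shorter than the polling interval.

```shell
#!/bin/sh
# Print a one-line present/absent summary of the three signpost files.
signpost_state() {
	ifname="$1"; dir="${2:-/tmp}"
	state=""
	for suffix in _ip _router up; do
		if [ -f "${dir}/${ifname}${suffix}" ]; then
			state="${state} ${ifname}${suffix}=present"
		else
			state="${state} ${ifname}${suffix}=absent"
		fi
	done
	echo "${state# }"
}

# Poll <count> times, <interval> seconds apart, and print a timestamped
# line only when the summary differs from the previous poll.
watch_signposts() {
	ifname="$1"; dir="$2"; count="$3"; interval="$4"
	last=""
	i=0
	while [ "$i" -lt "$count" ]; do
		now=$(signpost_state "$ifname" "$dir")
		if [ "$now" != "$last" ]; then
			echo "$(date +%s) ${now}"
			last="$now"
		fi
		i=$((i + 1))
		sleep "$interval"
	done
}

# Example: watch_signposts pppoe1 /tmp 60 1
```

Running the example in a shell while reconnecting would show whether the files appear and then vanish a few seconds later.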

@rcoleman-netgate I fixed the issue. The problem seems to be in /usr/local/sbin/ppp-linkdown. This script is called on PPP down for both IPv6 and IPv4, and the last thing it does when called is remove the IPv4 signpost files from /tmp. The script has checks that restrict other actions to IPv6 or IPv4, but the IPv4 signpost files are always deleted, even when the call relates to IPv6.

So I've added an if statement so that the signpost files are only deleted if the call is for IPv4, and that has resolved it. This may be a quirk of my ISP or some sort of race condition. After IPv4 comes up and while the IPv6 gateway is configuring, ppp-linkdown is called for the IPv6 interface. Perhaps that is to clear things up before ppp-linkup for IPv6 is called, or perhaps my ISP brings IPv6 up, drops it, and then brings it up again. Either way, this linkdown call for IPv6 seems to incorrectly remove the IPv4 signpost files.

This is how I have modified the script; see the if statement around the /bin/rm commands at the end of the script.

#!/bin/sh
#
# ppp-linkdown
#
# part of pfSense (https://www.pfsense.org)
# Copyright (c) 2004-2013 BSD Perimeter
# Copyright (c) 2013-2016 Electric Sheep Fencing
# Copyright (c) 2014-2022 Rubicon Communications, LLC (Netgate)
# All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

IF="${1}"
PROTOCOL="${2}"
LOCAL_IP="${3}"

if [ -f /tmp/${IF}up ] && [ -f /conf/${IF}.log ]; then
	seconds=$((`date -j +%s` - `/usr/bin/stat -f %m /tmp/${IF}up`))
	/usr/local/sbin/ppp-log-uptime.sh $seconds ${IF} &
fi

if echo "${IF}" | /usr/bin/egrep -qv "ppp[0-9]+"; then
	/etc/rc.kill_states ${IF} ${LOCAL_IP}
fi

if [ "${PROTOCOL}" == "inet" ] && [ -s "/tmp/${IF}_defaultgw" ]; then
	GW=`head -n 1 /tmp/${IF}_defaultgw`
	DGW=`/sbin/route -n get -inet default | /usr/bin/awk '/gateway:/ {print $2}'`
	# Only remove the default gateway if it matches the gateway for this interface. See redmine #1837
	if [ "${GW}" = "${DGW}" ]; then
		/sbin/route -q delete default ${GW}
	fi
fi

if [ "${PROTOCOL}" == "inet6" ]; then
	/usr/local/sbin/ppp-ipv6 ${IF} down
fi
# delete the node just in case mpd cannot do that
/usr/sbin/ngctl shutdown ${IF}:
if [ -f "/var/etc/nameserver_${IF}" ]; then
	# Remove old entries
	for nameserver in `cat /var/etc/nameserver_${IF}`; do
		/sbin/route -q delete ${nameserver} >/dev/null 2>&1
	done
	/bin/rm -f /var/etc/nameserver_${IF}
fi
# Do not remove gateway used during filter reload.

# Add check to only remove signpost files when this is an IPv4 network link down
if [ "${PROTOCOL}" == "inet" ]; then
	/bin/rm -f /tmp/${IF}_router
	/bin/rm -f /tmp/${IF}up
	/bin/rm -f /tmp/${IF}_ip
fi

/usr/local/sbin/pfSctl -c 'service reload dns'
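
To try the guard in isolation, away from a live firewall, the same logic can be pulled into a small function and run against a scratch directory. This is a sketch only: the function name cleanup_signposts and its directory argument are hypothetical and are not part of the pfSense script.

```shell
#!/bin/sh
# Remove the IPv4 signpost files only when the link-down call is for IPv4.
# Usage: cleanup_signposts <interface> <protocol> [directory]
# The directory argument is hypothetical; it lets the function be
# exercised somewhere other than /tmp.
cleanup_signposts() {
	ifname="$1"; protocol="$2"; dir="${3:-/tmp}"
	if [ "${protocol}" = "inet" ]; then
		rm -f "${dir}/${ifname}_router" "${dir}/${ifname}up" "${dir}/${ifname}_ip"
	fi
}

# Example: cleanup_signposts pppoe1 inet6 /path/to/scratch   (files are kept)
#          cleanup_signposts pppoe1 inet  /path/to/scratch   (files are removed)
```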

stephenw10

Did you have 'Use IPv4 connectivity as parent interface' set for the WAN DHCPv6?

@stephenw10 Yes, that is/was checked. All I've changed is that script. With the change, everything works perfectly: the gateway and uptime always display correctly, no matter how many times I test dropping PPP and bringing it back up. If I reinstate the original script file, 6 in 10 connection attempts show the gateway and uptime for just a few seconds before they disappear.

stephenw10

Hmm. Did you try without that checked?

@stephenw10 This isn't a connectivity issue, as I explained in my first post. It is just that the gateway and uptime on the dashboard vanish, while connectivity remains. The fix is in that script change, and all works perfectly.

I have to use the 'Use IPv4 connectivity as parent interface' option for the WAN DHCPv6, as it is required by my ISP (all my ISPs have been the same); without that option there is no IPv6.

stephenw10

Hmm, interesting. I use that exact setting here and have done for years. I've never seen that issue though.
The real bug here seems to be that it's running ppp-linkdown at all for a DHCP interface. I could imagine that parent interface setting coming into play there though.

@stephenw10 I've added it to https://redmine.pfsense.org/issues/13552#change-63114. Someone has suggested it is a duplicate, but I don't see that it is. I've also added a note about the IPv6 tmp files being left in place: they do NOT appear to get deleted at all, which may cause any script that relies on the existence of those files to indicate an up interface to fail in some way.

I also found someone else potentially running into the same issue: https://www.reddit.com/r/PFSENSE/comments/e00han/no_wan_uptime_stat_on_dashboard/

Hopefully it will get reviewed and a fix will go into the next version. It definitely appears to be an oversight to delete the IPv4 tmp files for an IPv6 call to that script. I've tested dropping and reconnecting a dozen times now, and the uptime and the gateway always appear correctly on the dashboard.

stephenw10

Mmm, I could see those being related but not a direct duplicate. They could certainly have the same root cause. I agree that the v6 files not being removed could be causing that other bug, at least partly.

Steve

josh256

Same issue on clean 2.7.2 install (bare metal)

Solution:
Install system_patches in package manager, apply all patches, reboot ;)