WAN does not renew after reboot
-
@stephenw10 said in WAN does not renew after reboot:
That's what happens when you replug the WAN.
From a user perspective, the problem is that the interface gets itself into a state where the user needs to unplug and then replug the WAN Ethernet cable.
Perhaps the edge case I'm encountering is that the Ethernet link on my modem can be up while the WAN DHCP server is not accessible. If that situation persists, pfsense ends up with a lease on the WAN line without an IPv4 gateway address (supplied by the ISP DHCP), isolating pfsense from the internet.
In Australia there are effectively two ISPs, which further exacerbates the issue. The link to the exchange is owned and run by NBN; the subsequent link and DHCP are owned by the ISP the user chooses. As a result, the cable modem being up is not synonymous with the ISP WAN DHCP being accessible.
-
Mmm, the interesting thing here is that in the vast majority of cases dhcp pulls a lease as expected. If it fails to find a dhcp server it just keeps trying. If you boot without the WAN connected it will pull a lease as soon as it's connected.
So it looks like what's happening here must be some timing issue. For whatever reason the WAN is not linked when rc.bootup tries to initialise it, so it fails. But it then becomes linked without triggering a new dhclient process on WAN. I would guess that because bootup is still running, rc.linkup is not called.
@RyanM When you generated that log was it only pfSense that was rebooted? The upstream router was not? Yet it still hadn't linked by that point?
Steve
-
dhcp.log :
At 11:39:41
Feb 4 11:39:41 router php[392]: rc.bootup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was ''
but it fails, probably because WAN (and LAN, for that matter) is not ready yet : there is no 'WAN' interface to bind to, so dhclient fails.
edit :
Or something like :
/etc/inc/interfaces.inc : line 40 :pfSense_interface_flags($interface, IFF_UP);
this line fires for "WAN" but it takes several (5 ... ) seconds to take effect, too late for dhclient.
edit end.
Or, 5 seconds later, igb0 (WAN) and igb1 (LAN) are coming up :
At 11:39:44 (3 seconds later) :
Feb 4 11:39:44 router check_reload_status[374]: Linkup starting igb0
Feb 4 11:39:44 router kernel:
Feb 4 11:39:44 router kernel: igb0: link state changed to UP
....
Feb 4 11:39:45 router check_reload_status[374]: Linkup starting igb1
Feb 4 11:39:45 router kernel: igb1: link state changed to UP
My theory : dhclient has been fired up just too early. And for the rest of the duration of "rc.bootup", interface LINK events are not acted upon ( ? ).
At Feb 4 11:39:51 (7 seconds later) :
Feb 4 11:39:51 router root[10534]: Bootup complete
Let's give 'dhclient' a second chance, at the end of the main bootup script, as it was launched just a couple of seconds too early.
Install this lightweight pfSense package (shellcmd) :
The command to copy paste :
/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output
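If the real problem is that the link is not up yet when dhclient is started, a slightly more defensive variant is to wait for the link first. This is only a sketch, not something pfSense ships: the function names `link_is_active` and `wait_for_link` are made up for illustration, and it assumes igb0 is the WAN NIC and that `ifconfig` prints "status: active" once the link is up.

```sh
#!/bin/sh
# Sketch only: wait for the WAN link before giving dhclient its second chance.
# Assumptions: igb0 is the WAN NIC (as in the log above) and FreeBSD's
# ifconfig prints "status: active" once the link is up.

# link_is_active IFACE: succeeds when ifconfig reports an active link.
link_is_active() {
    ifconfig "$1" 2>/dev/null | grep -q 'status: active'
}

# wait_for_link IFACE TRIES DELAY: poll up to TRIES times, DELAY seconds apart.
# Returns 0 as soon as the link is active, 1 if it never comes up.
wait_for_link() {
    iface=$1
    tries=$2
    delay=$3
    while [ "$tries" -gt 0 ]; do
        if link_is_active "$iface"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Example (not run here):
#   wait_for_link igb0 10 1 && \
#     /sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output
```

The 10 tries / 1 second in the example are arbitrary; the log above suggests the link needed a few seconds, so anything in that range should cover it.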
-
Yes, I would expect that to work.
It's interesting that you are hitting this at all though. There is something unusual happening in the link timing.
Steve
-
@stephenw10 said in WAN does not renew after reboot:
The dhcp and system logs.
It happened again for me, so I collected some data in case it helps identify the problem.
Summary
- pfsense 2.5.2-RELEASE (amd64)
- My ISP reports my network terminal device (NTD) had been online for about 1.5 hours, i.e. since about 18:55
- No internet connection confirmed from about 20:15
- Internet connection restored by pfsense -> Status -> interfaces -> WAN interface -> Release WAN
Prior to fix
After WAN interface -> Release WAN
pfsense -> Status -> System logs -> System -> General (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - System - after Release WAN - Redacted.txt
pfsense -> Status -> System logs -> System -> DHCP (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - DHCP - After Release WAN - Redacted.txt
-
@patch said in WAN does not renew after reboot:
pfsense -> Status -> System logs -> System -> General (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - System - after Release WAN - Redacted.txt
pfsense -> Status -> System logs -> System -> DHCP (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - DHCP - After Release WAN - Redacted.txt
The system log :
In less than 10 minutes you have more than 10 WAN cable disconnect / connect events :
Change that cable (bad cable or bad plugs ?), or change (swap with LAN) the WAN NIC - or use another or new NIC for WAN.
This might also explain why it takes 'ages' to "software activate" the interface, just before the dhcp client is launched. It takes too much time to activate.
Feb 8 20:28:36 dhclient 17482 bound to MyWanIP -- renewal in 900 seconds.
The ISP gives you a 900 seconds lease. That means the renewal will take place every 7 and a half minutes.
This isn't wrong, it works, but it creates a lot of IP renegotiation. IMHO : that's not ok - but you can't do anything about that.
Are you using 4G/5G, Wi-Fi or something comparable for ISP access ?
A "normal" DHCP lease duration is more like a day or several days.
-
@gertjan
I believe there was an upstream fault which was restored about 1.5 hours (I don't know the exact time) before I noticed the loss of internet connection and fixed it.
I don't have control of the upstream fault behaviour. However, I would like pfsense to restore the connection when the upstream fault is resolved.
So my concern is: why did pfsense not try to establish a WAN DHCP connection after 19:11, but instead required me to manually trigger a DHCP "Release WAN" at 20:28?
As to why the ISP fault behaviour may be atypical, I effectively have two ISPs (but that's how it is done in Australia)
- User chooses an ISP. The ISP determines the user's WAN IP address and gateway address (so I assume it controls the WAN DHCP)
- The physical connection from the ISP to the user is provided by NBN (National Broadband Network), including the network terminal device on the user's premises. In my case it is an HFC (hybrid fibre-coaxial) connection, so the NBN modem converts coax to Ethernet.
- I connect from the Ethernet port on the NBN modem to my pfsense router.
As a result, with upstream faults the network terminal device may recover at a different time than the WAN DHCP server.
So while I can't really exclude a cable or NIC fault with equipment I own, I'm not convinced that is actually the problem here.
- My ISP stats show my NTD has been up since 19:22 (after the time pfsense tried to access the DHCP server). My connection has had 34 flaps (HFC is not a good technical design, as faults on the shared cable result in short service outages for all customers sharing that cable; we have had some rain today, which probably explains the high number of flaps).
- pfsense logs show no attempt to re-establish WAN DHCP for over an hour; the attempts stopped near the time the NTD came online again.
- The fault was cleared by manually forcing a "Release WAN" in pfsense (no cables were touched)
But perhaps I'm wrong again.
-
@patch said in WAN does not renew after reboot:
WAN DHCP connection after 19:11
Look at 19:10 : your WAN physical connection went down.
No more need for the DHCP client to stay in memory : the interface (WAN) isn't there any more.
The gateway alarm also starts to signal the loss of the WAN :
Feb 8 19:11:00 rc.gateway_alarm 94104 >>> Gateway alarm: WAN_DHCP (Addr:MyWanGateway Alarm:1 RTT:13.200ms RTTsd:1.154ms Loss:33%)
I did not see it coming back up until the end : Feb 8 20:29:37
Normally, when it comes up again, a 'LINK' event is launched.
This starts the dhcpclient (because it's the WAN interface).
One of the actions during the 'LINK' event is : activating the interface. But for some reason, this takes (too much) time / more than normal. The DHCP client fails ....
Again : I didn't see this happening in your latest system log. For me, WAN (igb0) never came back after 19:11.
Again : this could be a cable/plug issue. Or a NIC issue (both sides).
Or the upstream device that loves to take the connection down for whatever reason.
-
Sorry, I edited my post to show more information.
@gertjan said in WAN does not renew after reboot:
Or the upstream device that loves to take the connection down for whatever reason.
Yes that.
My connection flapped 32 times because it is HFC and it has rained here today.
ISP reports NTD online since 19:22. pfsense made no attempt to reconnect after that until I manually forced it at 20:29
-
Try this : between your ISP device and pfSense, in the WAN cable : place a switch.
@patch said in WAN does not renew after reboot:
19:22.
Then no more WAN interface flapping since then.
So all ok.
If the WAN interface stays up : the dhcp client process keeps 'attached' to it, and does its work = renewing every 450 seconds.
-
@gertjan said in WAN does not renew after reboot:
If the WAN interface stays up : the dhcp client process keeps 'attached' to it, and does its work = renewing every 450 seconds.
450 seconds = 7.5 minutes.
-
I'm not sure when the WAN Ethernet interface to pfsense came up but:
- Internet connection up at 19:22, at which time I suspect the pfsense Ethernet interface was already up and it had already given up looking for a DHCP response (the 32 flaps in a day may have been the cause)
- 19:22 + 7.5 minutes << 20:29 when I had to manually force "Release WAN"
So I would be more than happy with pfsense retrying by itself every 7.5 minutes, but that's not happening when the pfsense WAN Ethernet connection is up but no WAN DHCP server has been found. It sat in this state for over an hour.
-
@patch said in WAN does not renew after reboot:
19:22 + 7.5 minutes
Look at your DHCP logs :
I see the DHCP client stopping, because igb0/WAN interface goes away.
Feb 8 19:10:57 dhclient 36309 connection closed
Feb 8 19:10:57 dhclient 36309 exiting.
It was never (re) started.
@patch said in WAN does not renew after reboot:
So I would be more than happy with pfsense looking itself every 7.5 minutes but that's not happening when the pfsense Wan Ethernet connection is up but no Wan DHCP has been found.
The dhcp-client can work : if the WAN is up and stays up.
And if the WAN goes down, ok, when it goes up, the dhcp client should be restarted on WAN.
What you see is : WAN is chain gunned. No need to go technical here : this is not ok.
Again : to keep WAN up : put a simple switch on the WAN side, between pfSense and your ISP device. This will permit WAN to stay up, removing one issue.
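The stuck state discussed here (link up, but no dhclient attached to WAN) can be detected from the shell. The following is a rough sketch under assumptions, not how pfSense handles it internally: `needs_dhclient` is a made-up helper name, igb0 is assumed to be the WAN NIC, and the `pgrep -f` pattern assumes the interface name shows up in the dhclient process command line or title.

```sh
#!/bin/sh
# Sketch only: detect "link is up but no dhclient is attached to the interface".
# Assumptions: FreeBSD ifconfig prints "status: active" for an active link,
# and a running dhclient shows the interface name in its command line/title.

# needs_dhclient IFACE: succeeds only when the link is active and no
# dhclient process mentioning IFACE can be found.
needs_dhclient() {
    iface=$1
    # Link must be up, otherwise there is nothing to do yet.
    ifconfig "$iface" 2>/dev/null | grep -q 'status: active' || return 1
    # A dhclient that mentions the interface is considered already attached.
    if pgrep -f "dhclient.*${iface}" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Example (not run here), from a script file rather than an inline cron line:
#   needs_dhclient igb0 && /sbin/dhclient -c /var/etc/dhclient_wan.conf igb0
```

Keep the example in its own script file: an inline cron command containing both "dhclient" and "igb0" would itself match the `pgrep -f` pattern. Starting dhclient by hand also bypasses whatever pfSense's own link scripts would do, so treat this as a diagnostic or stop-gap only.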
-
Hmm, there's something else going on there. It's not a timing-at-reboot issue, for starters.
More importantly I don't see a single log showing:
kernel igb0: link state changed to UP
Or DOWN. Is the link actually flapping or is it just race conditions with a number of scripts?
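A quick way to check for those kernel messages from the shell is to count them per interface. This is a minimal sketch: `count_link_events` is a hypothetical helper, and the log path in the example comment is an assumption (plain-text system log); adjust both to your install.

```sh
#!/bin/sh
# Sketch only: count kernel "link state changed" messages for one interface.

# count_link_events IFACE LOGFILE: print "UP=<n> DOWN=<m>".
count_link_events() {
    iface=$1
    log=$2
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e"; the printed count is still "0".
    up=$(grep -c "${iface}: link state changed to UP" "$log" || true)
    down=$(grep -c "${iface}: link state changed to DOWN" "$log" || true)
    echo "UP=${up} DOWN=${down}"
}

# Example (assumed log path, not run here):
#   count_link_events igb0 /var/log/system.log
```

If both counts are zero while the other scripts still report the WAN going away, the NIC driver never saw a physical link change.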
Steve
-
@stephenw10 said in WAN does not renew after reboot:
Is the link actually flapping
It is flapping upstream (probably due to moisture in joints on the cable shared between a group of NBN customers).
I don’t actually know what the NBN modem does to the Ethernet line to pfsense when this occurs.
-
Do you only have one WAN? If so, try disabling the gateway monitoring action for it. It looks like there are a bunch of scripts being fired each time it flaps that are doing nothing useful.
Are you running Snort or Suricata in in-line mode? That could explain why it's showing the link go down/up but the NIC driver isn't reporting that.
Steve
-
@stephenw10 said in WAN does not renew after reboot:
Do you only have one WAN? If so try disabling the gateway monitoring action for it.
Yes, one WAN. I will disable monitoring to see if that helps.
Testing will be a challenge as the failure is not frequent (about 1 / month atm).
@stephenw10 said in WAN does not renew after reboot:
Are you running Snort or Suricata in in-line mode?
Neither.
I have loaded pfBlockerNG-devel but not actually configured it, and have noticed no change since.
No other packages have been loaded.
-
Hmm, well unclear why it's logging a hotplug event then. But disabling monitoring action is a good test anyway.
-
@stephenw10 said in WAN does not renew after reboot:
How exactly is it connected? Just to the main router or via some wifi bridge/extension?
Right now I just have the WAN plugged into a switch on my network at this location. When I ship it home, it will be connected directly to the ethernet cable coming from the fiber optic "modem".
@bingo600 good idea on disconnect/reconnect WAN after boot. I may see if that is a temporary work-around.
@stephenw10 said in WAN does not renew after reboot:
@RyanM When you generated that log was it only pfSense that was rebooted? The upstream router was not? Yet it still hadn't linked by that point?
Steve
Correct, I did not reboot the modem/router. I just booted pfSense.
@Gertjan fascinating write-up on the timing of when dhclient is initialized and when my interfaces are "up".
I am going to try adding that shellcmd package and the suggested command here after boot to see if it helps it pull an IP on this network. And I understand the behavior may be different at the other location, but I would at least like to see it work here first...
-
I used a different cable and am now seeing "1000baseT <full-duplex>" for WAN. I also moved the router to be on a switch connected directly to the Google WiFi that is acting as the router. Now my pfSense router is pulling an IP for the WAN when it boots. I think I am good. Will hopefully be able to ship this box home and get a friend to plug it in and get internet at home working again.
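For anyone wanting to check the negotiated media the same way from the shell, here is a small sketch. `link_media` is a made-up helper name, and it assumes the FreeBSD `ifconfig` layout where the line starts with "media:"; igb0 in the example is just the WAN NIC used earlier in this thread.

```sh
#!/bin/sh
# Sketch only: print what ifconfig reports on its "media:" line.

# link_media IFACE: print the media description, e.g. the negotiated speed.
link_media() {
    ifconfig "$1" 2>/dev/null | sed -n 's/^[[:space:]]*media:[[:space:]]*//p'
}

# Example (not run here):
#   link_media igb0
```

A bad cable often shows up as a lower negotiated speed than expected, which is why the change to "1000baseT <full-duplex>" after the swap is a good sign.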
-
Nice result!