WAN does not renew after reboot
-
@stephenw10 said in WAN does not renew after reboot:
The dhcp and system logs.
It happened again for me, so I collected some data in case it helps identify the problem.
Summary
- pfsense 2.5.2-RELEASE (amd64)
- My ISP reports my network terminal device had been online for about 1.5 hours, i.e. since about 18:55
- No internet connection, confirmed from about 20:15
- Internet connection restored by pfsense -> Status -> interfaces -> WAN interface -> Release WAN
Prior to fix
After WAN interface -> Release WAN
pfsense -> Status -> System logs -> System -> General (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - System - after Release WAN - Redacted.txt
pfsense -> Status -> System logs -> System -> DHCP (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - DHCP - After Release WAN - Redacted.txt
-
@patch said in WAN does not renew after reboot:
pfsense -> Status -> System logs -> System -> General (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - System - after Release WAN - Redacted.txt
pfsense -> Status -> System logs -> System -> DHCP (redacted - MyWanIP, MyWanGateway, MyWanBroadcast.255)
Log - DHCP - After Release WAN - Redacted.txt
The system log:
In less than 10 minutes you have more than 10 WAN cable disconnect/connect events:
Change that cable (bad cable or bad plugs?), or change the WAN NIC (swap it with LAN), or use another or a new NIC for WAN.
This might also explain why it takes 'ages' to "software activate" the interface, just before the dhcp client is launched. It takes too much time to activate.
Feb 8 20:28:36 dhclient 17482 bound to MyWanIP -- renewal in 900 seconds.
The ISP gives you a 900-second lease. That means the renewal will take place every 7 and a half minutes.
This isn't wrong, it works, but it creates a lot of IP renegotiation. IMHO that's not OK, but you can't do anything about that.
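For reference, the time of the next scheduled renewal can be read straight off a dhclient "bound to" line like the one quoted above: the line states how many seconds remain until the next renewal attempt. The sketch below is only an illustration. It assumes the log format exactly as pasted in this thread, and the year is an assumption because these timestamps do not carry one. The names BOUND_RE and next_renewal are made up for the example.

```python
import re
from datetime import datetime, timedelta

# Matches lines in the format pasted in this thread, e.g.
#   Feb 8 20:28:36 dhclient 17482 bound to MyWanIP -- renewal in 900 seconds.
BOUND_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) dhclient \d+ "
    r"bound to (?P<addr>\S+) -- renewal in (?P<secs>\d+) seconds\.$"
)


def next_renewal(line, year=2022):
    """Return the datetime of the next renewal attempt, or None if the
    line is not a dhclient 'bound to' message.

    The year is an assumption: the log timestamp does not include one.
    """
    m = BOUND_RE.match(line.strip())
    if m is None:
        return None
    bound_at = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
    return bound_at + timedelta(seconds=int(m["secs"]))


if __name__ == "__main__":
    sample = "Feb 8 20:28:36 dhclient 17482 bound to MyWanIP -- renewal in 900 seconds."
    print(next_renewal(sample))
```

For the line from this thread, the next renewal attempt falls at 20:43:36.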
Are you using 4G/5G or Wifi or something comparable as ISP access?
A "normal" DHCP lease duration is more like a day or several days.
-
@gertjan
I believe there was an upstream fault which was restored about 1.5 hours (I don't know the exact time) before I noticed the loss of internet connection and fixed it.
I don't have control of the upstream fault behaviour. However, I would like pfsense to restore the connection when the upstream fault is resolved.
So my concern is why pfsense did not try to establish a WAN DHCP connection after 19:11 but instead required me to manually trigger a DHCP "Release WAN" at 20:28.
As to why the ISP fault behaviour may be atypical, I effectively have two ISPs (but that's how it is done in Australia):
- The user chooses an ISP. The ISP determines the user's WAN IP address and gateway address (so I assume it controls the WAN DHCP).
- The physical connection from the ISP to the user is provided by NBN (National Broadband Network), including the network terminal device on the user's premises. In my case it is an HFC (hybrid fibre-coaxial) connection, so the NBN modem converts coax to Ethernet.
- I connect from the Ethernet port on the NBN modem to my pfsense router.
As a result, with upstream faults, the network terminal device may recover at a different time than the WAN DHCP server.
So while I can't really exclude a cable or NIC fault with equipment I own, I'm not convinced that is actually the problem here.
- My ISP stats show my NTD has been up since 19:22 (after the time pfsense tried to access the DHCP server). My connection has had 34 flaps (HFC is not a good technical design, as faults on the shared cable result in short service outages for all customers sharing that cable; we have had some rain today, which probably explains the high number of flaps).
- pfsense logs show no attempt to re-establish WAN DHCP for over an hour; the attempts stopped near the time the NTD came online again.
- The fault was cleared by manually forcing a "Release WAN" in pfsense (no cables were touched).
But perhaps I'm wrong again.
-
@patch said in WAN does not renew after reboot:
WAN DHCP connection after 19:11
Look at 19:10: your WAN physical connection went down.
No more need for the dhcp client to stay in memory: the interface (WAN) isn't there any more.
The gateway alarm also starts to signal the loss of the WAN:
Feb 8 19:11:00 rc.gateway_alarm 94104 >>> Gateway alarm: WAN_DHCP (Addr:MyWanGateway Alarm:1 RTT:13.200ms RTTsd:1.154ms Loss:33%)
I did not see it coming back up until the end: Feb 8 20:29:37
Normally, when it comes up again, a 'LINK' event is launched.
This starts the dhcp client (because it's the WAN interface).
One of the actions during the 'LINK' event is activating the interface. But for some reason, this takes too much time, more than normal. The dhcp client fails ....
Again: I didn't see this happening in your latest system log. For me, WAN (igb0) never came back after 19:11.
Again: this could be a cable/plug issue. Or a NIC issue (on either side).
Or the upstream device that loves to take the connection down for whatever reason.
-
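When going through a long system log, it can help to pull the numbers out of the gateway alarm lines rather than reading them one by one. This is a rough sketch that assumes the line layout of the rc.gateway_alarm entry quoted above. The names ALARM_RE and parse_gateway_alarm are invented for the example and are not part of pfSense.

```python
import re

# Layout assumed from the line quoted in this thread:
#   ... Gateway alarm: WAN_DHCP (Addr:MyWanGateway Alarm:1 RTT:13.200ms RTTsd:1.154ms Loss:33%)
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<name>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>\d+) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)


def parse_gateway_alarm(line):
    """Return the fields of a gateway alarm log line as a dict, or None
    if the line does not contain a gateway alarm message."""
    m = ALARM_RE.search(line)
    if m is None:
        return None
    return {
        "gateway": m["name"],
        "addr": m["addr"],
        "alarm": int(m["alarm"]),
        "rtt_ms": float(m["rtt"]),
        "rttsd_ms": float(m["rttsd"]),
        "loss_pct": int(m["loss"]),
    }


if __name__ == "__main__":
    sample = (
        "Feb 8 19:11:00 rc.gateway_alarm 94104 >>> Gateway alarm: WAN_DHCP "
        "(Addr:MyWanGateway Alarm:1 RTT:13.200ms RTTsd:1.154ms Loss:33%)"
    )
    print(parse_gateway_alarm(sample))
```

Filtering the parsed entries by loss_pct over time gives a quick picture of when the gateway started and stopped answering.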
Sorry, I edited my post to show more information.
@gertjan said in WAN does not renew after reboot:
Or the upstream device that loves to take the connection down for whatever reason.
Yes that.
My connection flapped 32 times because it is HFC and it has rained here today.
ISP reports NTD online since 19:22. pfsense made no attempt to reconnect after that until I manually forced it at 20:29.
-
Try this: place a switch in the WAN cable, between your ISP device and pfSense.
@patch said in WAN does not renew after reboot:
19:22.
Then no more WAN interface flapping since then.
So all ok.
If the WAN interface stays up : the dhcp client process keeps 'attached' to it, and does its work = renewing every 450 seconds.
-
@gertjan said in WAN does not renew after reboot:
If the WAN interface stays up : the dhcp client process keeps 'attached' to it, and does its work = renewing every 450 seconds.
450 seconds = 7.5 minutes.
I'm not sure when the WAN Ethernet interface to pfsense came up but:
- Internet connection was up at 19:22, at which time I suspect the pfsense Ethernet interface was already up and it had already given up looking for a DHCP response (the 32 flaps in a day may have been the cause).
- 19:22 + 7.5 minutes << 20:29 when I had to manually force "Release WAN"
So I would be more than happy with pfsense looking by itself every 7.5 minutes, but that's not happening when the pfsense WAN Ethernet connection is up but no WAN DHCP has been found. It sat in this state for over an hour.
-
@patch said in WAN does not renew after reboot:
19:22 + 7.5 minutes
Look at your DHCP logs:
I see the DHCP client stopping, because the igb0/WAN interface goes away.
Feb 8 19:10:57 dhclient 36309 connection closed
Feb 8 19:10:57 dhclient 36309 exiting.
It was never (re)started.
@patch said in WAN does not renew after reboot:
So I would be more than happy with pfsense looking by itself every 7.5 minutes, but that's not happening when the pfsense WAN Ethernet connection is up but no WAN DHCP has been found.
The dhcp client can work if the WAN is up and stays up.
And if the WAN goes down, OK: when it comes up again, the dhcp client should be restarted on WAN.
What you see is: WAN is chain gunned. No need to go technical here: this is not OK.
Again: to keep WAN up, put a simple switch on the WAN side, between pfSense and your ISP device. This will permit WAN to stay up, removing one issue.
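The "exited and never restarted" pattern described above can be checked mechanically on an exported DHCP log. The sketch below assumes the line format as pasted in this thread (timestamp, "dhclient", PID, message) and oldest-first ordering. The function name unrecovered_exit is made up for the illustration.

```python
import re

# Format assumed from the lines quoted in this thread, e.g.
#   Feb 8 19:10:57 dhclient 36309 exiting.
DHCLIENT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) dhclient (?P<pid>\d+) (?P<msg>.*)$"
)


def unrecovered_exit(lines):
    """Return (timestamp, pid) of the last dhclient 'exiting.' message if no
    dhclient process with a different PID logs anything after it, else None.

    `lines` must be in chronological order (oldest first).
    """
    last_exit = None
    for line in lines:
        m = DHCLIENT_RE.match(line.strip())
        if m is None:
            continue  # not a dhclient line
        if m["msg"] == "exiting.":
            last_exit = (m["stamp"], m["pid"])
        elif last_exit is not None and m["pid"] != last_exit[1]:
            # A new dhclient process logged something: the client came back.
            last_exit = None
    return last_exit


if __name__ == "__main__":
    log = [
        "Feb 8 19:10:57 dhclient 36309 connection closed",
        "Feb 8 19:10:57 dhclient 36309 exiting.",
    ]
    print(unrecovered_exit(log))
```

With only the two lines from 19:10:57 it reports PID 36309 as the last client seen. Once the 20:28:36 "bound to" line from PID 17482 is appended, it returns None because a new client has taken over.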
-
Hmm, there's something else going on there. It's not a timing-at-reboot issue for starters.
More importantly I don't see a single log showing:
kernel igb0: link state changed to UP
Or DOWN. Is the link actually flapping or is it just race conditions with a number of scripts?
Steve
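To check an exported system log for the kernel link messages mentioned above, something along these lines would do. It is only a sketch: it assumes the message text "link state changed to UP/DOWN" as quoted in this thread, and the helper name link_transitions is hypothetical.

```python
import re

# Matches the kernel message quoted in this thread, e.g.
#   kernel igb0: link state changed to UP
LINK_RE = re.compile(
    r"kernel\W+(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)"
)


def link_transitions(lines, ifname="igb0"):
    """Count kernel link UP/DOWN messages for one interface."""
    counts = {"UP": 0, "DOWN": 0}
    for line in lines:
        m = LINK_RE.search(line)
        if m and m["ifname"] == ifname:
            counts[m["state"]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "kernel igb0: link state changed to DOWN",
        "kernel igb0: link state changed to UP",
        "kernel igb1: link state changed to UP",
    ]
    print(link_transitions(sample))
```

If both counts stay at zero while the gateway alarms keep firing, the NIC driver is not reporting a physical link change, which is the situation described in the post above.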
-
@stephenw10 said in WAN does not renew after reboot:
Is the link actually flapping
It is flapping upstream (probably due to moisture in joints on the cable shared between a group of NBN customers).
I don’t actually know what the NBN modem does to the Ethernet line to pfsense when this occurs.
-
Do you only have one WAN? If so try disabling the gateway monitoring action for it. It looks like there are a bunch of scripts being fired each time it flaps that are doing nothing useful.
Are you running Snort or Suricata in in-line mode? That could explain why it's showing the link go down/up but the NIC driver isn't reporting that.
Steve
-
@stephenw10 said in WAN does not renew after reboot:
Do you only have one WAN? If so try disabling the gateway monitoring action for it.
Yes, one WAN. I will disable the monitoring action to see if that helps.
Testing will be a challenge as the failure is not frequent (about once a month at the moment).
@stephenw10 said in WAN does not renew after reboot:
Are you running Snort or Suricata in in-line mode?
Neither.
I have loaded pfBlockerNG-devel but noticed no change since, and have not actually configured it.
No other packages have been loaded.
-
Hmm, well unclear why it's logging a hotplug event then. But disabling monitoring action is a good test anyway.
-
@stephenw10 said in WAN does not renew after reboot:
How exactly is it connected? Just to the main router or via some wifi bridge/extension?
Right now I just have the WAN plugged into a switch on my network at this location. When I ship it home, it will be connected directly to the ethernet cable coming from the fiber optic "modem".
@bingo600 good idea on disconnect/reconnect WAN after boot. I may see if that is a temporary work-around.
@stephenw10 said in WAN does not renew after reboot:
@RyanM When you generated that log was it only pfSense that was rebooted? The upstream router was not? Yet it still hadn't linked by that point?
Steve
Correct, I did not reboot the modem/router. I just booted pfSense.
@Gertjan fascinating write-up on the timing of when dhclient is initialized and when my interfaces are "up".
I am going to try adding that shellcmd package and the suggested command here after boot to see if it helps it pull an IP on this network. And I understand the behavior may be different at the other location, but I would at least like to see it work here first...
-
I used a different cable and am now seeing "1000baseT <full-duplex>" for WAN. I also moved the router to be on a switch connected directly to the Google WiFi that is acting as the router. Now my pfSense router is pulling an IP for the WAN when it boots. I think I am good. Will hopefully be able to ship this box home and get friend to plug it in and get internet at home working again.
-
Nice result!
-
Did a test where I plugged pfSense router directly into cable modem and was able to connect on mobile phone using VPN. I think I am in business. Thanks for the help.
-
This is still an issue. Neither of my pfSense boxes takes a WAN address by DHCP if the pfSense box is up before the fiber modem connection, which is always the case after a power outage. That means you have to drive to the location to press the renew DHCP button. This cannot be right and is a real problem.
-
@hoegge
That was discussed here recently. Also there were solutions or workarounds given: https://forum.netgate.com/topic/177429/3100-tries-to-configure-wan-before-fiber-modem-has-uplink/3
-
@viragomann Thanks a lot - I'll try that. I still don't understand why Netgate has not made the client continue retrying if it has no lease. It must be a common problem, and in my case I have to drive 200 km to fix it :-(
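For anyone wanting to script a check for the state described in this thread (link up, but no WAN lease), here is a minimal sketch. It is not the workaround from the linked topic and not anything pfSense ships. It assumes FreeBSD-style `ifconfig` output ("status: active" and "inet" lines), and the command used to trigger the renew is deliberately left as a caller-supplied argument because the right command depends on the setup. All function names are made up for the example.

```python
import re
import subprocess

# IPv4 lines in FreeBSD-style ifconfig output look like:
#   \tinet 192.0.2.10 netmask 0xffffff00 broadcast 192.0.2.255
INET_RE = re.compile(r"^\s*inet (\d{1,3}(?:\.\d{1,3}){3})\b", re.MULTILINE)


def ipv4_addresses(ifconfig_output):
    """Return all IPv4 addresses listed in the ifconfig output."""
    return INET_RE.findall(ifconfig_output)


def needs_renew(ifconfig_output):
    """True when the link is reported active but no IPv4 address is set."""
    link_up = "status: active" in ifconfig_output
    return link_up and not ipv4_addresses(ifconfig_output)


def check_once(ifname, renew_cmd):
    """Run ifconfig for `ifname`; if the link is up without an address,
    run `renew_cmd` (a list of arguments chosen by the caller; which
    command is appropriate is an assumption left to the reader)."""
    out = subprocess.run(
        ["ifconfig", ifname], capture_output=True, text=True, check=True
    ).stdout
    if needs_renew(out):
        subprocess.run(renew_cmd, check=False)
        return True
    return False
```

Something like this would have to be run periodically, for example from a scheduled job. The decision logic is kept in needs_renew so it can be tested against saved ifconfig output without touching the interface.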