Constant rerooting
-
My internet connection here is flakey. There is only one infrastructure provider. The whole neighbourhood is experiencing short, random outages.
This would not be a real problem -- mine is not a mission-critical network -- except that every time the ISP's connection goes down, even for less than a minute, the only way I can get the internet back is to reroot from the console. Sometimes when I check connectivity in the morning, overnight backups and updates have failed.
I have disabled Gateway Monitoring and Gateway Action. I would be happy with a momentary delay in communications until the connection comes back up.
Why do I have to restart services each time there is a short duration loss of connection? What setting am I missing that would allow pfSense to recover gracefully and automatically without any human intervention?
-
@RickyO Hmm, that's kind of strange, since nothing should really “happen” on pfSense if you are not using the gateway monitor/gateway action.
I think you need to identify which service is actually the culprit on your pfSense. It seems likely that it’s the DHCP client that “dies” when it gets no replies to its renew frames… It should just continue to ask for a renew, or go into discover mode if the lease times out completely.
Next time - don't do a reroot. Log in to the UI, go to Status > Interfaces and click “Release WAN” and then “Renew WAN”. If that has no effect, do it again - this time with “Relinquish Lease” selected. See if this solves the problem.
I’m a little unsure whether that action also restarts Unbound (the DNS server) in the background. The DNS server could be the other culprit, so you should also test whether simply restarting it wakes up your internet access. Go to Status > Services and restart Unbound. See if that recovers the internet without redoing the DHCP lease on WAN.
-
@RickyO said in Constant rerooting:
Why do I have to restart services each time there is a short duration loss of connection? What setting am I missing that would allow pfSense to recover gracefully and automatically without any human intervention?
And you're right.
For a device, whatever device, a wired network connection (typically RJ45; Wi-Fi is a special case, as it breaks constantly and that's 'normal') is considered to be always "UP".
And if it breaks, like when you rip the cable out of your PC and put it back in, the connection comes back without you doing anything special.
I'm pretty sure you already have tried this.
So, between your pfSense LAN interface and your other devices, everything works fine.
WAN is just another interface, with its network partially in your house, and going outside, to your ISP equipment.
You are correct: when I remove the power from my ISP router, which is connected to the WAN interface of my pfSense, and then power it back on, my WAN == Internet connection comes back without me doing anything.
The same thing happens when I remove the fiber cable from my ISP router, wait a bit, and put it back in.
The thing is: your WAN connection isn't behaving the "known" way.
It breaks all the time.
And when it comes back, it comes back in a 'strange' way. Normally you wouldn't care, but if pfSense can't rebuild the connection ... well, that's bad.
Plenty reasons to stop using that ISP ...
You didn't say what type of WAN connection you use: DHCP? Some other type, like PPPoE?
DHCP is extremely resilient and has been in use for about three decades now. It's rock solid. So it should work. If it doesn't, well, that says a lot about your ISP ...
There are times you can't fix what is broken, as you're on the wrong side of the wire.
-
@keyser Thank you for responding.
I tried what you suggested. When the connection went down and stayed down I did the "Release WAN/Renew WAN" and the connection was restored.
The next time it went down I tried restarting unbound. That had no effect, so I doubt that it's a DNS issue. I am using Cloudflare instead of the ISP's DNS and have not allowed DNS Server Override.
This would seem to affirm what @Gertjan suggested: that it's a problem with my ISP, not my network.
-
FWIW, I have a similar problem on Comcast cable. If I reboot the modem (Netgear CM600) from its web interface, pfSense never comes back online. I haven't bothered juggling services or anything, I just reboot (not reroot) the FW. Actually, I had never heard of rerooting before this thread. Like the OP, it's just me so not a big deal, more curious than complaining. Gateway Status cycles from Online/Packetloss/Unknown. Rinse/repeat.
-
@Gertjan said in Constant rerooting:
Plenty reasons to stop using that ISP ...
I agree, but really have no options. One provider has a monopoly in the neighbourhood. They own all the coax cable hung from the utility poles and into the houses -- a legacy of the old cable TV days.
My ISP leases a line and provides my connection along with a digital "landline" and our mobile phone service. When it is up I am getting the advertised 500 Mbps. When it goes down I sometimes also lose the connection for my digital phone box and have to restart it.
This has been going on for a long time. We used to experience "buffering" while watching streaming TV and I just thought it was a connection speed issue. When our granddaughter started using our network I thought it wise to build a better router and have a separate network for WiFi, home automation and entertainment.
I installed Endian Firewall on an older PC and started experiencing the frequent lost connections. I thought it might be a hardware issue and installed Endian on a higher-spec PC (3.40GHz processor, 8 GB RAM, 240GB SSD). The problems continued. I installed pfSense on the newer PC, thinking it was a software problem. The frequent dropped connections continue. At least I can just "reroot" or “Release WAN” and then “Renew WAN” as @keyser suggested. Getting the connection back with Endian necessitated a complete reboot.
My ISP is apologetic but really unable to do much about the leased line. I am preparing a formal complaint about the coax line owner.
-
Seems like a DHCP issue.
Anything in the dhcp logs?
Does it reconnect if you manually release/renew from Status > Interfaces?
-
@provels If you are running your firewall device "headless", there is only a reboot option in the web GUI Diagnostics/Reboot. That does indeed restart everything.
I have a keyboard and monitor attached to the PC that runs pfSense. The monitor displays a simple FreeBSD-style page with numbered options. One of the options is "Reboot system", followed by a further choice of a complete reboot or a "reroot", which just restarts services and not the entire OS. The reroot is much faster -- a factor when you're forced to do it a few times a day...
-
You can reroot from the gui, you just have to select it:
-
@RickyO @stephenw10 Thanks for the tips!
-
@RickyO said in Constant rerooting:
@keyser Thank you for responding.
I tried what you suggested. When the connection went down and stayed down I did the "Release WAN/Renew WAN" and the connection was restored.
The next time it went down I tried restarting unbound. That had no effect, so I doubt that it's a DNS issue. I am using Cloudflare instead of the ISP's DNS and have not allowed DNS Server Override.
This would seem to affirm what @Gertjan suggested: that it's a problem with my ISP, not my network.
Excellent tests - now you know.
I’m 99.9% sure your ISP is running “DHCP Authentication” which means your ISP requires a full DHCP Discover/acknowledge process to actually “open” the line for a client. The client can only be the MAC address that completed the DHCP discover procedure.
It’s not at all uncommon to use that “lockdown” mechanism on the ISP side.
What is uncommon is that your authentication does not stick with the ISP after a line outage. So the next thing I’m pretty sure of is that the outage is caused by a problem in the ISP's equipment or line, because your authentication should stick unless the ISP's concentrator sensed a linkdown when you experienced the line going down. (There is a difference between linkdown and just no service.) With a linkdown all authentications are invalidated and have to be done from scratch again. I have the exact same issue on a couple of ISPs.
The problem is this: all standard equipment just follows DHCP procedure, and as long as no linkdown has been sensed (possible change of network), the client will never do a new full DHCP discover/acknowledge procedure while the current DHCP lease is valid - it will only try a DHCP renew (which is not enough for DHCP authentication).
The ISP's router, however, is specifically configured to either never do renews (always a full discover), or to have a gateway monitor that triggers a full discover cycle. But most likely it also senses the “linkdown” and just does a full DHCP discover. A bridge-mode router, though, might not relay the linkdown to the device behind it, and then you have the problem.
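To make the renew-versus-discover behaviour concrete, here is a minimal Python sketch of the client timers. It assumes the RFC 2131 default timer values (T1 at 50% and T2 at 87.5% of the lease) and a client whose renew attempts all go unanswered; the function and state names are made up for illustration and are not taken from dhclient.

```python
from enum import Enum


class State(Enum):
    BOUND = "bound"          # lease valid, client is silent
    RENEWING = "renewing"    # unicast renew to the leasing server
    REBINDING = "rebinding"  # broadcast renew to any server
    INIT = "init"            # lease expired, full discover starts


def client_state(elapsed_s: float, lease_s: float) -> State:
    """State of a client whose renew attempts are never answered."""
    t1 = 0.5 * lease_s    # RFC 2131 default renewal time
    t2 = 0.875 * lease_s  # RFC 2131 default rebinding time
    if elapsed_s < t1:
        return State.BOUND
    if elapsed_s < t2:
        return State.RENEWING
    if elapsed_s < lease_s:
        return State.REBINDING
    return State.INIT


# With a 24 hour lease and no linkdown event, a full discover
# only starts once the whole lease has run out.
lease = 24 * 3600
for hours in (1, 13, 22, 24):
    print(hours, client_state(hours * 3600, lease).name)
```

So as long as the interface never reports a linkdown, nothing forces the client back to a full discover before the lease expires.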
@stephenw10 Can you think of some custom DHCP settings in pfSense that will have pfSense do a full dhcp discover cycle on a gateway action trigger?
But really: you should contact the ISP and have them look into the outages - they are likely also visible on ports in their devices.
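As a rough idea of what "full discover on a gateway trigger" could look like outside of pfSense's own settings, here is a Python sketch of a watchdog that counts consecutive failed probes and then calls an action. Everything in it is an assumption for illustration: the names `Watchdog`, `ping_ok` and `run` are hypothetical, and the action that actually releases and renews the WAN lease is deliberately left as a callable you would have to supply yourself.

```python
import subprocess
import time
from typing import Callable


class Watchdog:
    """Fire an action after `threshold` consecutive failed probes."""

    def __init__(self, threshold: int, action: Callable[[], None]) -> None:
        self.threshold = threshold
        self.action = action
        self.failures = 0

    def observe(self, reachable: bool) -> bool:
        """Record one probe result; return True if the action fired."""
        if reachable:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.action()
            self.failures = 0  # start counting again for the next attempt
            return True
        return False


def ping_ok(host: str, timeout_s: float = 3.0) -> bool:
    """Send one ICMP echo; True when the ping command exits with 0."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def run(host: str, watchdog: Watchdog, interval_s: float = 10.0) -> None:
    """Probe forever; intended to be started by hand, not on import."""
    while True:
        watchdog.observe(ping_ok(host))
        time.sleep(interval_s)
```

The counting is kept separate from the probe so it can be tried out with fake results before it is allowed to touch a real interface.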
-
@stephenw10 Yes. When I do "Release WAN/Renew WAN" from Status/Interfaces the connection is restored.
I have tried to find things in the logs without success (or knowing where to look...)
For example, when I go to Status/System Logs/DHCP and filter for "WARN" in the message I get mostly stuff about multi-threading and queue size.
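In case it helps with knowing what to look for: instead of filtering on "WARN", it may be more useful to filter on the DHCP message names themselves. Below is a small Python sketch that does this on exported log text. It assumes the WAN client's messages carry the process name "dhclient" and end up in the same log; the sample lines are made up purely to show the filter working and are not real pfSense output.

```python
import re
from typing import Iterable, List

# Standard DHCP message type names as they usually appear in client logs.
DHCP_MESSAGE = re.compile(
    r"\bDHCP(DISCOVER|OFFER|REQUEST|ACK|NAK|DECLINE|RELEASE)\b"
)


def dhcp_events(lines: Iterable[str], process: str = "dhclient") -> List[str]:
    """Keep only lines from `process` that mention a DHCP message type."""
    return [
        line.rstrip("\n")
        for line in lines
        if process in line and DHCP_MESSAGE.search(line)
    ]


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    "Jan 01 03:00:00 fw dhclient[1234]: DHCPREQUEST on em0\n",
    "Jan 01 03:00:05 fw some-daemon[99]: WARN queue size\n",
    "Jan 01 03:00:07 fw dhclient[1234]: DHCPACK from 192.0.2.1\n",
]

for event in dhcp_events(SAMPLE):
    print(event)
```

A long run of DHCPREQUEST lines with no DHCPACK around the time of an outage would point at renews going unanswered.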
-
Hmm, so since the link never actually goes down the dhclient is never re-fired.
You could just set a much shorter DHCP lease. If it fails to renew it should eventually try discover. And you can set the number of times it tries that I believe.
-
@stephenw10 Yeah, that could be a “workaround”.
But the timers under the DHCP advanced configuration are requested lease times. Does that mean pfSense will use those settings regardless of the lease time the DHCP server returns (if it does not grant/honour the client's request)?
Setting the lease time to e.g. 10 minutes should make sure that pfSense is never more than 10 minutes from coming back up automatically once the line comes back.
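The arithmetic behind that "never more than 10 minutes" can be sketched in a few lines of Python. It assumes that renew attempts get no usable answer, that the client only falls back to a full discover when the lease expires, and that the lease time shown is the one the client actually ends up using; the function name is made up.

```python
def wait_until_discover_s(lease_s: int, restored_after_s: int) -> int:
    """Seconds from the line coming back until the lease expires.

    lease_s:          effective lease time used by the client
    restored_after_s: how long after the lease was obtained the line returned
    """
    if restored_after_s < 0:
        raise ValueError("restored_after_s must not be negative")
    # The client keeps trying to renew until expiry, then starts discover.
    return max(0, lease_s - restored_after_s)


# Line comes back one minute after the lease was obtained.
for lease in (600, 3600, 86400):
    print(lease, wait_until_discover_s(lease, 60))
```

The wait can never be longer than the lease itself, which is why a 10 minute lease caps the extra downtime at 10 minutes and a 24 hour lease can leave the WAN dead until someone intervenes.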
-
The requested time is what is sent to the server in the request. But it can ignore that and send you a much longer lease. And many will do that.
-
@stephenw10 I will double check the next time the connection goes down, but I'm fairly certain that when I went to Status/Interfaces WAN Interface to do "Release WAN", Status and DHCP both said "up".
If I go to Interfaces/WAN DHCP Client Configuration these are my current settings:
-
You can put
dhcp-lease-time XXX
in the Send Options field to request that from the server.
You can put
supersede dhcp-lease-time XXX
in the Option Modifiers field to ignore whatever lease time the server sends you and use that instead. Obviously that should always be less than the lease time the server sends.
-
@Gertjan Thank you for your advice. My connection seems stable for now after some work by the leased line owner.
-
@stephenw10 Thank you for your help with this.
-
@keyser Thank you for your help with this.