Auto-renew DHCP after outage
-
Your gateway monitoring isn't working correctly.
From: System | Routing | Gateways configure a Monitor-IP so the system detects it's not online.
-
Your gateway monitoring isn't working correctly.
From: System | Routing | Gateways configure a Monitor-IP so the system detects it's not online.
On my system, gateway monitoring is disabled, but I don't have a problem with things failing. My cable modem is configured in bridge mode and pfSense has no problem restoring connection after a power failure, as happened last Friday. As discussed earlier, a device "owns" a DHCP address for the duration of the lease time. That means that even if the modem fails, for any reason, and then comes back up, the DHCP address should still be valid and work. Something else is going on with the OP, which is why I asked about the subnet after the failure. We need more info from the OP, including packet captures, to really know what's happening.
-
The modem is the DHCP server (I'd assume). That modem is provided by the Internet provider (UPC Cablecom Switzerland) and I cannot change it or use another model. It has tons of functionality, like IP telephony, WLAN, guest WLAN and many other things, but I've disabled all that and switched it to a modem/router-only mode, where nothing can be configured and WLAN is turned off.
I've tried with a standalone notebook connected only to that modem and the same problem occurs. But as I cannot change it and the problem is fixed after a release/renew, I'd like to configure that somehow.
It's worth mentioning that after a release/renew, when it works again, I often get the same IP address. But before the release/renew I couldn't ping 8.8.8.8 and afterwards I do get an answer, so there's definitely something broken that gets fixed with a release/renew.
And pfSense does recognize this problem somehow, as the dashboard shows "n/a" as the IP on the WAN connection. So I'd like to automate that somehow; whenever there's an "n/a" on the dashboard for WAN, a release/renew should get issued.
Any idea how to achieve this?
I might also complain to the Internet provider, but I suspect that won't get me anywhere.
Regarding the Monitoring, there's a default entry in System/Routing/Gateways, with Name=WAN_DHCP (default) with Monitor IP=external IP. When I click on Edit, I get Gateway=dynamic (Gateway IP address), the Default Gateway checkbox is checked. The monitor IP does respond to ping.
-
I don't see anything that can be done to restore a connection when the failure is detected. One thing you could do is write a shell script that pings the gateway address and if it fails do something like restart the dhcp client.
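A minimal sketch of that idea, not a tested pfSense recipe. The function names and the `PING_CMD` override are made up for illustration, and the recovery command is whatever you pass in (restarting the DHCP client is only one option):

```shell
#!/bin/sh
# Sketch only: helper names and the PING_CMD override are hypothetical.

# Return 0 if the gateway answers at least one of three pings.
# PING_CMD can be overridden (e.g. for a dry run); it defaults to ping.
gateway_alive() {
    ${PING_CMD:-ping} -c 3 "$1" >/dev/null 2>&1
}

# check_gateway GATEWAY RECOVERY_COMMAND...
# Runs the recovery command only when the gateway does not respond.
check_gateway() {
    gw="$1"
    shift
    if gateway_alive "$gw"; then
        echo "gateway $gw reachable"
    else
        echo "gateway $gw unreachable, running: $*"
        "$@"
    fi
}
```

Something like `check_gateway 192.0.2.1 dhclient igb1` would then go into a cron job, with the placeholder address replaced by your real gateway and `igb1` by your WAN interface.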
-
I've got the same issue as OP. My ISP is Ziggo (which basically is UPC..) and cable modem in bridge mode.
I guess I have to start looking for a shell script which renews the IP when a ping to an outside address fails..
@OP, what kind of cable modem do you have? Perhaps it's related to one specific type..
-
Another case of an ISP device that should know it has to do something like down/up the downstream link on an upstream address change.
-
Another case of an ISP device that should know it has to do something like down/up the downstream link on an upstream address change.
Well, in the case of the antenna cable disconnect, after a DHCP release/renew, I get the same IP as before. So it's not only the address change.
-
I've got the same issue as OP. My ISP is Ziggo (which basically is UPC..) and cable modem in bridge mode.
I guess I have to start looking for a shell script which renews the IP when a ping to an outside address fails..
@OP, what kind of cable modem do you have? Perhaps it's related to one specific type..
UPC Cablecom Switzerland uses the so called "Connect Box". Googling for it shows that it seems to be used in other countries as well, including Ziggo.
This is the modem in question:
https://www.broadbandtvnews.com/2015/11/12/upc-cablecom-rolls-out-libertys-new-wi-fi-gateway/
I got the modem exchanged in the meantime, but of course it didn't fix the problem.
Please note that the box has many features like WLAN, routing, etc., but it's configured in modem-only mode (no settings at all in this mode).
If anyone wrote a script, please provide it here (including installation instructions if possible).
-
I have a different modem, the Technicolor TC7200. Then it must be our ISP which has its own view on implementing technology instead of using worldwide standards.
In a different forum post I found someone who has created a script which pings certain external IP addresses; if all of them fail, it resets the WAN interface, then it pings again, and when that fails too it initiates a reboot.
https://forum.pfsense.org/index.php?topic=51786.0
I will test with it asap..
-
@e4ch I had to set pfSense to reject the DHCP info offered by my cable modem when it is not connected to the Internet. That causes pfSense to wait to do a DHCP request until I'm on-line and getting DHCP information from my ISP instead of the internal modem server.
http://pfsense.home/interfaces.php?if=wan
Reject leases from
192.168.100.1
To have the DHCP client reject offers from specific DHCP servers, enter their IP addresses here (separate multiple entries with a comma). This is useful for rejecting leases from cable modems that offer private IP addresses when they lose upstream sync.
-
This is it! Thanks, now it's working fine!
@stan-qaz said in Auto-renew DHCP after outage:
@e4ch I had to set pfSense to reject the DHCP info offered by my cable modem when it is not connected to the Internet. That causes pfSense to wait to do a DHCP request until I'm on-line and getting DHCP information from my ISP instead of the internal modem server.
http://pfsense.home/interfaces.php?if=wan
Reject leases from
192.168.100.1
To have the DHCP client reject offers from specific DHCP servers, enter their IP addresses here (separate multiple entries with a comma). This is useful for rejecting leases from cable modems that offer private IP addresses when they lose upstream sync.
-
I still haven't done anything yet and the problem persists. The reject setting did not solve the issue and I'm a bit reluctant to implement a script that is dependent on external sites - from the sample script, already two of the four sites no longer exist. Also, why should we check external sites, if the problem is somehow clearly detectable? When the dashboard shows "n/a" as WAN IP address, then we have a problem.
Let me show you the situation again:
(not sure how to attach images here, so let me give a description)
On the Dashboard, in Interfaces, the WAN shows as "up", but as IP shows "n/a" when the problem exists (with some network traffic on the chart). If there's no problem, it shows a correct external IP address instead.
In Interface Status, Status and DHCP both show as "up" (both in working / not working case). The IPv4 Address only shows in the working case. When it's not working it's not listed. The DNS servers there are shown as 127.0.0.1 and the four servers from my ISP in the working case. When it's not working because the modem rebooted, it only shows the 127.0.0.1. When it's not working because modem and pfSense rebooted both together, it shows all 5 DNS like in the working case.
I created a log file (I'm on latest version 2.4.4-RELEASE-p2(amd64)).
Here first the system log:
Mar 5 01:04:38 kernel igb1: link state changed to DOWN
Mar 5 01:04:38 check_reload_status Linkup starting igb1
Mar 5 01:04:39 php-fpm 37326 /rc.linkup: DEVD Ethernet detached event for wan
Mar 5 01:04:41 php-fpm 37326 /rc.linkup: Shutting down Router Advertisment daemon cleanly
Mar 5 01:04:41 check_reload_status Reloading filter
Mar 5 01:04:53 rc.gateway_alarm 8999 >>> Gateway alarm: WAN_DHCP (Addr:8X.XXX.XX.1 Alarm:1 RTT:10.948ms RTTsd:8.602ms Loss:21%)
Mar 5 01:04:53 check_reload_status updating dyndns WAN_DHCP
Mar 5 01:04:53 check_reload_status Restarting ipsec tunnels
Mar 5 01:04:53 check_reload_status Restarting OpenVPN tunnels/interfaces
Mar 5 01:04:53 check_reload_status Reloading filter
Mar 5 01:04:54 php-fpm 37326 /rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Mar 5 01:04:54 php-fpm 37326 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Mar 5 01:04:54 php-fpm 37326 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Mar 5 01:04:54 php-fpm 345 /rc.dyndns.update: Dynamic DNS: updatedns() starting
Mar 5 01:04:54 php-fpm 345 /rc.dyndns.update: Dynamic DNS (): running get_failover_interface for wan. found igb1
Mar 5 01:04:54 php-fpm 345 /rc.dyndns.update: Dynamic DNS () There was an error trying to determine the public IP for interface - wan (igb1 ).
<<<Time b>>>
Mar 5 01:06:08 kernel igb1: link state changed to UP
Mar 5 01:06:08 check_reload_status Linkup starting igb1
Mar 5 01:06:09 php-fpm 20955 /rc.linkup: DEVD Ethernet attached event for wan
Mar 5 01:06:09 php-fpm 20955 /rc.linkup: HOTPLUG: Configuring interface wan
<<<Time c>>>
Mar 5 01:07:32 php-fpm 20955 /rc.linkup: calling interface_dhcpv6_configure.
Mar 5 01:07:32 php-fpm 20955 /rc.linkup: Accept router advertisements on interface igb1
Mar 5 01:07:32 php-fpm 20955 /rc.linkup: Starting rtsold process
Mar 5 01:07:34 php-fpm 20955 /rc.linkup: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
Mar 5 01:07:34 php-fpm 20955 /rc.linkup: Default gateway setting Interface WAN_DHCP Gateway as default.
Mar 5 01:07:34 php-fpm 20955 /rc.linkup: Gateway, none 'available' for inet6, use the first one configured. ''
Mar 5 01:07:34 check_reload_status Restarting ipsec tunnels
Mar 5 01:07:35 rtsold 9729 <sendpacket> sendmsg on igb1: Permission denied
Mar 5 01:07:37 check_reload_status updating dyndns wan
Mar 5 01:07:37 check_reload_status Reloading filter
Mar 5 01:07:38 php-fpm 344 /rc.dyndns.update: Dynamic DNS: updatedns() starting
Mar 5 01:07:39 php-fpm 344 /rc.dyndns.update: Dynamic DNS (): running get_failover_interface for wan. found igb1
Mar 5 01:07:39 php-fpm 344 /rc.dyndns.update: Dynamic DNS () There was an error trying to determine the public IP for interface - wan (igb1 ).
Mar 5 01:07:39 rtsold 9729 <sendpacket> sendmsg on igb1: Permission denied
Mar 5 01:07:43 rtsold 9729 <sendpacket> sendmsg on igb1: Permission denied
What I've done:
Starting with fully working pfSense
Turned off the ISP modem (logs until <<<Time b>>>)
Turned on the ISP modem again (logs until <<<Time c>>>); Dashboard showing IP 0.0.0.0 for WAN
After rest of the log: Dashboard now showing "n/a" for WAN
Here the DHCP log for the same period:
Mar 5 01:04:38 dhclient 36543 igb1 link state up -> down
Mar 5 01:04:39 dhclient 32254 connection closed
Mar 5 01:04:39 dhclient 32254 exiting.
<<<Time b>>>
Mar 5 01:06:09 dhclient PREINIT
Mar 5 01:06:09 dhclient 73556 DHCPREQUEST on igb1 to 255.255.255.255 port 67
<<<Time c>>>
Mar 5 01:06:11 dhclient 73556 DHCPREQUEST on igb1 to 255.255.255.255 port 67
Mar 5 01:06:16 dhclient 73556 DHCPREQUEST on igb1 to 255.255.255.255 port 67
Mar 5 01:06:29 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 5 01:06:31 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 3
Mar 5 01:06:34 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 8
Mar 5 01:06:42 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 13
Mar 5 01:06:55 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 20
Mar 5 01:07:15 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 13
Mar 5 01:07:28 dhclient 73556 DHCPDISCOVER on igb1 to 255.255.255.255 port 67 interval 2
Mar 5 01:07:30 dhclient 73556 No DHCPOFFERS received.
Mar 5 01:07:30 dhclient 73556 Trying recorded lease 8X.XXX.XX.X11
Mar 5 01:07:30 dhclient TIMEOUT
Mar 5 01:07:30 dhclient Starting add_new_address()
Mar 5 01:07:30 dhclient ifconfig igb1 inet 8X.XXX.XX.X11 netmask 255.255.240.0 broadcast 255.255.255.255
Mar 5 01:07:30 dhclient New IP Address (igb1): 8X.XXX.XX.X11
Mar 5 01:07:30 dhclient New Subnet Mask (igb1): 255.255.240.0
Mar 5 01:07:30 dhclient New Broadcast Address (igb1): 255.255.255.255
Mar 5 01:07:30 dhclient New Routers (igb1): 8X.XXX.XX.1
Mar 5 01:07:31 dhclient New Routers (igb1): 8X.XXX.XX.1
Mar 5 01:07:32 dhclient Deleting old routes
Mar 5 01:07:32 dhclient 73556 bound: renewal in 27196 seconds.
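The interesting part of the DHCP log is the sequence "No DHCPOFFERS received." followed by "Trying recorded lease". A rough sketch of how one might spot that sequence in exported log text; the function name is hypothetical, and the matched strings are simply the messages quoted in the log, so they may differ on other versions:

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical; the matched strings are
# the dhclient messages as they appear in the DHCP log quoted in this post.

# Read dhclient log lines on stdin; succeed (exit 0) if dhclient gave up
# waiting for offers and then fell back to a previously recorded lease.
fell_back_to_recorded_lease() {
    awk '
        /No DHCPOFFERS received/            { timeout = 1 }
        timeout && /Trying recorded lease/  { fallback = 1 }
        END { exit(fallback ? 0 : 1) }
    '
}
```

Feed it the DHCP log as plain text on stdin; how you export that text depends on the pfSense version.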
Any other ideas than implementing a cron script? And if I have to use a cron script, then how would I need to change it in order to detect this situation without relying on external sites? I'm lacking bash/Linux knowledge here.
-
Ok, I've now also created a script. Instead of using external sites, I'm now just checking if the WAN adapter (igb1) has an IPv4 address. If not, it waits 2 minutes and tries again. If it still has no IPv4 address, it issues the same commands as the other scripts: ifconfig down, ifconfig up, dhclient (not sure if the last one is necessary). It does not include rebooting the firewall, because I think this is overkill. Actually I just wanted to force-renew the DHCP client lease, but I couldn't get that to work, although I looked at what the PHP source is doing.
I'm using this script now:
#!/bin/sh
wan="igb1"
LOGFILE=/var/log/pingtest.log
currip=$(ifconfig $wan | grep "inet " | cut -d " " -f 2)
if test -z "$currip"; then
    echo `date +%Y%m%d.%H%M%S` "Detected empty IP on $wan! Will try again in 120 seconds." >> $LOGFILE
    sleep 120
    currip=$(ifconfig $wan | grep "inet " | cut -d " " -f 2)
    if test -z "$currip"; then
        echo `date +%Y%m%d.%H%M%S` "2nd try: Still empty IP on $wan! Will fix now." >> $LOGFILE
        ifconfig $wan down
        sleep 10
        ifconfig $wan up
        sleep 20
        dhclient $wan
        echo `date +%Y%m%d.%H%M%S` "Fixing done!" >> $LOGFILE
    else
        echo `date +%Y%m%d.%H%M%S` "2nd try: $wan has IP $currip; ok" >> $LOGFILE
    fi
else
    echo `date +%Y%m%d.%H%M%S` "$wan has IP $currip; ok" >> $LOGFILE
fi
If you want to use it, you would have to do the following:
- Diagnostics / Edit File: Enter file name /usr/local/bin/pingtest.sh and paste the file from above in there and click Save. Update igb1 with the name of your WAN interface. That name is visible in Status / Interfaces in the title. For me it says there: "WAN Interface (wan, igb1)"
- Diagnostics / Command: "chmod +x /usr/local/bin/pingtest.sh" (without quotes) and click Execute. This makes the file runnable.
- System / Package Manager / Available Packages: Install Cron. To my understanding, this is just the user interface for Cron.
- Services / Cron / Settings: Leave the existing entries there and add a new one.
- Minute: "*/10" (without quotes) or just "*" for every minute, but every 10 minutes should be fine and avoids filling up the log.
- Other values: "*", User: "root", Command: "/usr/local/bin/pingtest.sh"
From time to time you can take a look at the log file "/var/log/pingtest.log" and maybe delete it to keep it from getting too big.
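Instead of deleting the log by hand, the script could trim it itself. A small sketch, assuming a helper called `trim_log` (a made-up name, not something pfSense provides):

```shell
#!/bin/sh
# Sketch only: trim_log is a hypothetical helper, not part of pfSense.

# trim_log FILE MAX_LINES
# Keep only the last MAX_LINES lines of FILE; do nothing if FILE is missing.
trim_log() {
    file="$1"
    max="$2"
    [ -f "$file" ] || return 0
    tmp="$file.tmp.$$"
    # Write the tail to a temporary file first, then replace the original.
    tail -n "$max" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Calling `trim_log /var/log/pingtest.log 500` at the end of the script would keep roughly the last 500 entries; the number is arbitrary.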
For me this works now if I restart the modem. It even works if I schedule it to every minute; then two scripts will be running at the same time, but due to the 2 minute delay and retest, it works fine too. Without the second test it also worked, but then it already issues a fix while starting up, so it tries to fix it twice. I wanted to avoid that. We'll see how this works in the coming months/years.
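If overlapping runs are a concern, one common approach is a lock directory, since `mkdir` either creates the directory or fails. This is only a sketch; the lock path and function names are made up, and a stale lock left behind after a crash would have to be removed by hand:

```shell
#!/bin/sh
# Sketch only: the lock path and function names are hypothetical.

LOCKDIR="${LOCKDIR:-/tmp/pingtest.lock}"

# mkdir is atomic: it succeeds for exactly one caller, so it works as a lock.
acquire_lock() {
    mkdir "$LOCKDIR" 2>/dev/null
}

# Remove the lock directory again; ignore errors if it is already gone.
release_lock() {
    rmdir "$LOCKDIR" 2>/dev/null
}

# Intended use near the top of the repair script:
#   acquire_lock || exit 0      # another run is still active
#   trap release_lock EXIT      # always release the lock on exit
```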
For me this is a clear bug in pfSense, as this happens with many different modems and searching through the forum shows that many people have this problem. Even if the modem is the culprit, I still think pfSense should be able to recover, because with other clients it works fine.
-
Thanks a lot for posting a workable solution! Same problem here from same provider (UPC, Switzerland).
And yes, I agree that this should be fixed in pfSense (it should be able to automatically overcome such problems), or at least make such a check an option.
-
I have exactly the same problem (UPC, Poland). While searching for a solution I also saw that there were many people with this problem here, on reddit, etc. Thank you for that solution. But I will still try to find a different and easier fix too. Maybe it is possible to somehow do this without cron and scripts.
-
One question for people who have been in the pfSense community longer, because this won't fix itself. Going by the descriptions in this topic, is it enough to create a task at https://redmine.pfsense.org/ ? If not, what should be added? If yes, should it be a bug (pfSense doesn't react in such a situation) or a feature (detection of such a situation should be added)?
At the same time, I would be grateful for pointers into the pfSense code. Could somebody point me to the code (if it exists) responsible for detecting network problems? Even without help I will look for it myself, because this problem is killing me ;), but suggestions would make it easier. And if by chance I find even a partial or imperfect solution, I will be more than happy. (I know we have a working workaround here, but it should be addressed in the pfSense code.)
-
@e4ch @tomashk
I have seen exactly this behavior, where a DHCP-assigned IP is lost for an amount of time equal to the last cached lease time if a DHCP timeout occurs.
I was able to narrow down the root cause to a bug in the FreeBSD DHCP client, dhclient (confirmed: https://lists.freebsd.org/pipermail/freebsd-net/2019-February/052894.html), as well as an apparent bug in the associated dhclient-script provided with pfSense. Both involve handling DHCP protocol timeouts improperly.
I have opened a pfSense bug ticket containing the technical details of my findings as well as a working patch set which addresses this at: https://redmine.pfsense.org/issues/9267
-
Thanks @tomashk! Looks like you found the underlying root cause! I looked at your changes and they sound reasonable, at least the wrong byte that was returned (I don't fully understand the script). Thanks a ton for providing such fixes and posting them at the right place; that helps the maintainers a lot to integrate such fixes more easily into the main branch. Unfortunately, even though you reported this already in January 2019, it doesn't seem to be included in version 2.5 and isn't even in the open issues list, but it is still in the list "new issues". So it might take a while until we can see this in a standard update. There are several threads about this, so this will help many people and I can then finally get rid of my repair script.
-
@e4ch It is fairly easy to hack this fix in to an existing pfSense install; with the toughest/most involved part being getting dhclient rebuilt. Patching the dhclient-script such that it (correctly) returns nonzero when the default gateway in the cached lease is not pingable is trivially doable with the "system patches" package and the patch from the bug ticket.
The easiest way I have found to build the patched dhclient is to just set up a FreeBSD 11.2 VM, build the patched dhclient and copy the binary over to the pfSense host. This will persist until an update is performed. I am more than willing to share my dhclient binary if desired as well.
You might also be able to use the FreeBSD 12 stable branch dhclient verbatim, since it contains the exit status patch; however, I haven't tested this personally.
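To illustrate the idea behind the script change (this is NOT the patch from the ticket, just a sketch of the logic): on a timeout, ping the routers from the cached lease and return nonzero if none of them answers. `any_router_reachable` and `PING_CMD` are made-up names; `new_routers` mimics the variable dhclient hands to its script.

```shell
#!/bin/sh
# Sketch only: NOT the patch from the ticket. any_router_reachable and
# PING_CMD are hypothetical names; new_routers mimics the variable that
# dhclient passes to its script.

# Succeed if at least one router from the (cached) lease answers a ping.
any_router_reachable() {
    for router in "$@"; do
        if ${PING_CMD:-ping} -c 1 "$router" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# In a TIMEOUT-style handler, a nonzero status would tell the caller that
# the recorded lease should not be used:
#   any_router_reachable $new_routers || exit 1
```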
-
I have the same DHCP issue here. ISP is Net1, Bulgaria. Thanks for the script. It helps with mitigating the problem.