WAN disconnected shortly after DHCP renewed -- check_reload_status
-
I am trying to replace my ISP router with a Netgate 6100. Initially I thought it was a DHCP issue, but after several weeks of comparing packet captures I am starting to doubt it. Though it may be related. I would appreciated some help in resolving this. Let me give the details.
I have attached two recent packet captures: "wan-dhcp-kddi.pcapng" is the DHCP process via my ISP router; this works without any issues. "wan-dhcp-pfsense-20211213.pcap" is a packet capture of the DHCP process via pfSense. I am using DHCP override and have tried my best to match the DHCP details of the ISP router. (I am also overriding the MAC address.) The only thing that does not match is the pfSense DHCP adds Option 61 (Client identifier) and Option 12 (Host name), though neither seems to negatively affect the process. Here is the DHCP override:
interface "ix3" { supersede dhcp-server-identifier 255.255.255.255; supersede interface-mtu 0; timeout 60; retry 15; select-timeout 0; initial-interval 1; send dhcp-lease-time 3600; send dhcp-class-identifier "kddi-hgw1.1"; send dhcp-max-message-size 576; # send dhcp-client-identifier = ""; script "/usr/local/sbin/pfSense-dhclient-script"; request subnet-mask,routers,broadcast-address,vendor-encapsulated-options,domain-name,domain-name-servers,host-name; }
Is the script statement to "pfSense-dhclient-script" necessary? Is the placement important or should it be moved before or after the other statements?
The DHCP lease is a little under 24 hours. The first DHCP renew request is success around 12 hours later. There is another renew request roughly 12 hours later, and it seems to be successful. However, another 30 minutes later the WAN connection dies. Here is the syslog messages:
Date Time Level Host Name Category Program Messages 12/13/2021 16:43:02 Notice firewall.dono.local user check_reload_status Syncing firewall 12/13/2021 23:48:54 Notice firewall.dono.local user dhclient RENEW 12/13/2021 23:48:54 Notice firewall.dono.local user dhclient Creating resolv.conf 12/14/2021 03:55:11 Notice firewall.dono.local user check_reload_status Syncing firewall 12/14/2021 03:58:07 Notice firewall.dono.local user check_reload_status Syncing firewall 12/14/2021 11:24:26 Notice firewall.dono.local user check_reload_status Syncing firewall 12/14/2021 11:35:46 Notice firewall.dono.local user dhclient RENEW 12/14/2021 11:35:46 Notice firewall.dono.local user dhclient Creating resolv.conf 12/14/2021 12:00:27 Notice firewall.dono.local user check_reload_status Syncing firewall 12/14/2021 12:07:37 Notice firewall.dono.local user check_reload_status Linkup starting ix3 12/14/2021 12:07:37 Information firewall.dono.local kern kernel ix3: link state changed to DOWN 12/14/2021 12:07:38 Error firewall.dono.local daemon dhclient connection closed 12/14/2021 12:07:38 crit firewall.dono.local daemon dhclient exiting. 12/14/2021 12:07:41 Notice firewall.dono.local user check_reload_status Reloading filter
It looks like the DHCP renewal was success, but 30 minutes later check_reload_status performs a "Linkup starting ix3" and the link goes down. I do not know how to further debug or resolve this and would appreciate some help.
Thank you.
-
@dono said in WAN disconnected shortly after DHCP renewed -- check_reload_status:
12:07:37
At this time, check_reload_status that ix3 (your WAN interface, the dhclient config tells me so) was coming up.
This means it went down just a moment before, the next message "kernel ix3: link state changed to DOWN" confirms this.
The dhclient detects this also : no more WAN interface = no more need for it to run, it exits.Most probably : the ix3: link going down because the upstream devices takes the connection down.
What happens when you put a switch between the WAN and upstream device ?
-
@gertjan said in WAN disconnected shortly after DHCP renewed -- check_reload_status:
What happens when you put a switch between the WAN and upstream device ?
Thank you for the response.
I will try that now. It will take 24 hours to test, so I will post the results then.Thank you.
-
@gertjan said in WAN disconnected shortly after DHCP renewed -- check_reload_status:
Most probably : the ix3: link going down because the upstream devices takes the connection down.
What happens when you put a switch between the WAN and upstream device ?
The test just completed. This time I no longer see logs that the connection went down. I just simply could no longer access the Internet anymore. Let me give the details.
I have attached a packet capture for the DHCP process. Again, the lease is for a little under 24 hours. I can see the first half-time renewal was successful. In previous testing, I have always seen it fail shortly after the second half-time renewal, but this time it failed about 2 hours before that around 16:20 my local time. The logs do not show anything around that time period.
Here are the DHCP logs:
Dec 16 18:09:10 dhclient 26212 Cannot open or create pidfile: No such file or directory Dec 16 18:09:10 dhclient 26521 PREINIT Dec 16 18:09:10 dhclient 27494 EXPIRE Dec 16 18:09:10 dhclient 28109 Deleting old routes Dec 16 18:09:10 dhclient 29672 PREINIT Dec 16 18:09:10 dhclient 26212 DHCPDISCOVER on ix3 to 255.255.255.255 port 67 interval 2 Dec 16 18:09:10 dhclient 26212 DHCPOFFER from 10.199.175.209 Dec 16 18:09:10 dhclient 30274 ARPSEND Dec 16 18:09:12 dhclient 30362 ARPCHECK Dec 16 18:09:12 dhclient 26212 DHCPREQUEST on ix3 to 255.255.255.255 port 67 Dec 16 18:09:12 dhclient 26212 DHCPACK from 10.199.175.209 Dec 16 18:09:12 dhclient 30607 BOUND Dec 16 18:09:12 dhclient 30764 Starting add_new_address() Dec 16 18:09:12 dhclient 30974 ifconfig ix3 inet 118.154.101.178 netmask 255.255.255.128 broadcast 118.154.101.255 Dec 16 18:09:12 dhclient 31180 New IP Address (ix3): 118.154.101.178 Dec 16 18:09:12 dhclient 31442 New Subnet Mask (ix3): 255.255.255.128 Dec 16 18:09:12 dhclient 31563 New Broadcast Address (ix3): 118.154.101.255 Dec 16 18:09:12 dhclient 31781 New Routers (ix3): 118.154.101.129 Dec 16 18:09:12 dhclient 31850 Adding new routes to interface: ix3 Dec 16 18:09:12 dhclient 32751 /sbin/route add -host 118.154.101.129 -iface ix3 Dec 16 18:09:12 dhclient 33318 /sbin/route add default 118.154.101.129 Dec 16 18:09:12 dhclient 33479 Creating resolv.conf Dec 16 18:09:12 dhclient 26212 bound to 118.154.101.178 -- renewal in 42121 seconds. Dec 17 05:51:13 dhclient 34044 DHCPREQUEST on ix3 to 111.87.236.67 port 67 Dec 17 05:51:13 dhclient 34044 DHCPACK from 111.87.236.67 Dec 17 05:51:13 dhclient 52498 RENEW Dec 17 05:51:13 dhclient 52790 Creating resolv.conf Dec 17 05:51:13 dhclient 34044 bound to 118.154.101.178 -- renewal in 42118 seconds.
And here are the syslogs:
Date Time Level Category Program Messages 12/16/2021 18:04:57 Notice user check_reload_status Restarting ipsec tunnels 12/16/2021 18:05:01 Notice user check_reload_status updating dyndns lan 12/16/2021 18:05:01 Notice user check_reload_status Reloading filter 12/16/2021 18:06:04 Notice user check_reload_status Syncing firewall 12/16/2021 18:06:04 Error syslog syslogd exiting on signal 15 12/16/2021 18:06:04 Information kern syslogd kernel boot file is /boot/kernel/kernel 12/16/2021 18:08:57 Information kern kernel ix3: promiscuous mode enabled 12/16/2021 18:09:09 Notice user check_reload_status Linkup starting ix3 12/16/2021 18:09:09 Information kern kernel ix3: link state changed to UP 12/16/2021 18:09:10 Error daemon php-fpm /rc.linkup: DEVD Ethernet attached event for wan 12/16/2021 18:09:10 Error daemon php-fpm /rc.linkup: HOTPLUG: Configuring interface wan 12/16/2021 18:09:10 Error daemon dhclient Cannot open or create pidfile: No such file or directory 12/16/2021 18:09:10 Notice user dhclient PREINIT 12/16/2021 18:09:10 Notice user dhclient EXPIRE 12/16/2021 18:09:10 Notice user dhclient Deleting old routes 12/16/2021 18:09:10 Notice user dhclient PREINIT 12/16/2021 18:09:10 Information daemon dhclient DHCPDISCOVER on ix3 to 255.255.255.255 port 67 interval 2 12/16/2021 18:09:10 Information daemon dhclient DHCPOFFER from 10.199.175.209 12/16/2021 18:09:10 Notice user dhclient ARPSEND 12/16/2021 18:09:12 Notice user dhclient ARPCHECK 12/16/2021 18:09:12 Information daemon dhclient DHCPREQUEST on ix3 to 255.255.255.255 port 67 12/16/2021 18:09:12 Information daemon dhclient DHCPACK from 10.199.175.209 12/16/2021 18:09:12 Notice user dhclient BOUND 12/16/2021 18:09:12 Notice user dhclient Starting add_new_address() 12/16/2021 18:09:12 Notice user dhclient ifconfig ix3 inet 118.154.101.178 netmask 255.255.255.128 broadcast 118.154.101.255 12/16/2021 18:09:12 Notice user dhclient New IP Address (ix3): 118.154.101.178 12/16/2021 18:09:12 Notice user dhclient New Subnet Mask (ix3): 255.255.255.128 12/16/2021 18:09:12 Notice user dhclient New Broadcast Address (ix3): 118.154.101.255 12/16/2021 18:09:12 Notice user dhclient New Routers (ix3): 118.154.101.129 12/16/2021 18:09:12 Notice user dhclient Adding new routes to interface: ix3 12/16/2021 18:09:12 Notice user dhclient /sbin/route add -host 118.154.101.129 -iface ix3 12/16/2021 18:09:12 Notice user dhclient /sbin/route add default 118.154.101.129 12/16/2021 18:09:12 Notice user dhclient Creating resolv.conf 12/16/2021 18:09:12 Notice user check_reload_status rc.newwanip starting ix3 12/16/2021 18:09:12 Information daemon dhclient bound to 118.154.101.178 -- renewal in 42121 seconds. 12/16/2021 18:09:13 Error daemon php-fpm /rc.linkup: Gateway, none 'available' for inet6, use the first one configured. '' 12/16/2021 18:09:13 Notice user check_reload_status Restarting ipsec tunnels 12/16/2021 18:09:14 Error daemon php-fpm /rc.newwanip: rc.newwanip: Info: starting on ix3. 12/16/2021 18:09:14 Error daemon php-fpm /rc.newwanip: rc.newwanip: on (IP address: 118.154.101.178) (interface: WAN[wan]) (real interface: ix3). 12/16/2021 18:09:14 Notice user check_reload_status Reloading filter 12/16/2021 18:09:16 Notice user check_reload_status updating dyndns wan 12/16/2021 18:09:16 Notice user check_reload_status Reloading filter 12/17/2021 5:51:13 Information daemon dhclient DHCPREQUEST on ix3 to 111.87.236.67 port 67 12/17/2021 5:51:13 Information daemon dhclient DHCPACK from 111.87.236.67 12/17/2021 5:51:13 Notice user dhclient RENEW 12/17/2021 5:51:13 Notice user dhclient Creating resolv.conf 12/17/2021 5:51:13 Information daemon dhclient bound to 118.154.101.178 -- renewal in 42118 seconds.
If as suggested this is due to the ISP ending the connection, I presume it is due to the DHCP request. As described in the initial message, I have tried my best to match the DHCP request being done by the ISP router. The only thing that does not match is the pfSense DHCP adds Option 61 (Client identifier) and Option 12 (Host name); I have not yet found a way to remove those. Also, I have noticed that some of the padding is different. Any suggestions on further matching the ISP router DHCP? (Packet capture supplied in the first message.)
I'm open to other suggestions if anyone thinks it may be something else.
Much appreciated.wan-dhcp-pfsense-20211216.pcap
-
@dono said in WAN disconnected shortly after DHCP renewed -- check_reload_status:
I have tried my best to match the DHCP request being done by the ISP router
First things first : is your ISP device a modem or a router ?
Modem : a transparent 'dumb' device that translates 'optic' signals to a Ethernet electric 'pfSense WAN' signal.
Nothing more.
Or is it a router, that can still do "conversion between optics and Ethernet but probably also includes a DHCP server, it's own firewall, it can do NAT etc etc and can even do Phone, and contains an AP, etc.Dec 16 18:09:10 dhclient 27494 EXPIRE
The current lease is expired. Everything has to get redone.
Dec 16 18:09:10 dhclient 26212 DHCPOFFER from 10.199.175.209
Take note ; the DHCP server answering is "10.199.175.209" == RFC1918 !! ( is this a DHCP server in your router / modem ? )
..... Dec 16 18:09:12 dhclient 26212 bound to 118.154.101.178 -- renewal in 42121 seconds
Ok, your good for using 118.a.b.178 for 42121 seconds or 11 hours and env. 45 minutes.
Next day, at 5h51 : 5h50+5h50 = 11h40 later :
Dec 17 05:51:13 dhclient 34044 DHCPREQUEST on ix3 to 111.87.236.67 port 67
and 118.87.236.67 acknowledged the renewal :
Dec 17 05:51:13 dhclient 34044 DHCPACK from 111.87.236.67
118.87.236.67 is not 10.199.175.209 ......
My question is : why is the replying DHCP first server 10.199.175.209 and later on it is 118.87.236.67
118.87.236.67 is a non RFC1918 so for sure a remote ISP based DHCP server.
10.199.175.209 could be remote, but also local (in your ISP router / modem).I'm not saying something is wrong here, but I can imagine that a lease given by 10.199.175.209 isn't renewable by another (is it ?) DHCP server 118.87.236.67
A test :
Stop or disconnect pfSense.
Disconnect the uplink forum you ISP modem router - Reboot it.
Wait for a couple of minutes.
Start the modem router.
Wait for a couple of minutes.
Connect pfSense (WAN)Now, the question is : is a DHCP server answering the pfSense DHCP client ?
If is, is it : 10.199.175.209 ?
It couldn't 'shouldn't) be 118.87.236.67.
Or : the pfSense DHCP client just fails as there is no uplink available.What happens when you instruct the pfSense DHCP client to refuse DHCP afters from 10.199.175.209 ?
Like this : -
Apologies for the delayed response. Work got extra busy with the end of the year.
Let me describe the setup.
ISP --- ONU --- ISP router --- switch
There is a dedicated AP connected to the switch. I am trying to replace the ISP router with the Netate 6100.
I think I found the cause of the link termination. I was too focused on DHCP, but unknown to myself the ONU sends out an 802.1X / MD5 authentication challenge hourly. After swapping the ISP router with the Netgate 6100, those 802.1X authentications are no longer being answered. I guess at the 24 hour mark without any updates the link is then terminated.
I have a Cisco switch that can do perform the supplicant role. Though I need to identify the password. It is hashed, though, so it may take some time.
First things first : is your ISP device a modem or a router ?
It's definitely a router. Very basic though which is why I want to replace it. It has a GUI that I can login to and set some LAN and WAN options, DHCP, port forwards etc. It's doing NAT.
Dec 16 18:09:10 dhclient 26212 DHCPOFFER from 10.199.175.209
Take note ; the DHCP server answering is "10.199.175.209" == RFC1918 !! ( is this a DHCP server in your router / modem ? )
Yeah, I've noticed that as well and thought it odd. The only other device in the office is the ONU, but I do not think it has an IP address. I tried pinging that IP from the pfSense WAN interface but it failed.
Note that the TTL is 30, which--assuming it started at TTL 32--suggests it originated two hops away. I can only assume it is part of the ISP network.
For the record, the 802.1X authentication are L2 frames using MAC addresses. It does not have a L3 IP address.
A test :
Stop or disconnect pfSense.
Disconnect the uplink forum you ISP modem router - Reboot it.
Wait for a couple of minutes.
Start the modem router.
Wait for a couple of minutes.
Connect pfSense (WAN)Now, the question is : is a DHCP server answering the pfSense DHCP client ?
If is, is it : 10.199.175.209 ?
It couldn't 'shouldn't) be 118.87.236.67.
Or : the pfSense DHCP client just fails as there is no uplink available.After pulling the power on ISP ONU, plugging in pfSense, and powering them both on, I was not able to get a DHCP response. This eventually lead me to discover the 802.1X / MD5 authentication mentioned above.
What happens when you instruct the pfSense DHCP client to refuse DHCP afters from 10.199.175.209 ?
Like this :I was not able to test this because the "Reject from" option is missing when using "Configuration Override".
Anyway, the 802.1X / MD5 authentication will take me some time to sort out. If I get it working I'll update the thread. Until then, I'll have to run the 6100 from behind the ISP router.
Thanks for the help. It was much appreciated.
-
@dono said in WAN disconnected shortly after DHCP renewed -- check_reload_status:
I was not able to test this because the "Reject from" option is missing when using "Configuration Override".
When you use
:
you inform pfSense had you have created your own dhcp-client config file. The path to this file was indicated using "Configuration File Override".0
You can add a line ( see dhcp-client documentation, but you already have the doc as you are using the "Configuration Override" option) how to block "10.199.175.209".Example :
When I add "1.2.3.4" like this :
I get a /var/etc/dhclient_wan.conf that shows :
interface "em0" { supersede interface-mtu 0; # DHCP Protocol Timing Values # DHCP Protocol Options reject 1.2.3.4; script "/usr/local/sbin/pfSense-dhclient-script"; }
So, adding
reject 10.199.175.209;
will do.