pfSense WAN dhcp client exiting (error)

bingo600

For the second time, my pfSense (CE 2.5.2-RELEASE, amd64) has failed to get a WAN DHCP IP address after a power failure.

I noticed this in the log.

Jan 21 15:37:24 	php 	400 	rc.bootup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was '' 

Jan 21 15:37:32 	kernel 		igb0: link state changed to UP
Jan 21 15:37:32 	check_reload_status 	378 	Linkup starting igb0

This could be a race condition where the ISP device is slower to boot than the pfSense, meaning the WAN Ethernet link (igb0) could be down when the DHCP client is started. But I'd expect the DHCP client to keep trying, not get "angry" and exit.

The cure: restart the ISP device once again.

Jan 23 19:51:54 	php-fpm 	16308 	/rc.newwanip: rc.newwanip: on (IP address: 192.168.1.x) (interface: WAN[wan]) (real interface: igb0).
Jan 23 19:51:54 	php-fpm 	16308 	/rc.newwanip: rc.newwanip: Info: starting on igb0.
Jan 23 19:51:54 	check_reload_status 	378 	Restarting ipsec tunnels
Jan 23 19:51:54 	php-fpm 	349 	/rc.linkup: Gateway, none 'available' for inet6, use the first one configured. ''
Jan 23 19:51:53 	check_reload_status 	378 	rc.newwanip starting igb0
Jan 23 19:51:44 	php-fpm 	349 	/rc.linkup: HOTPLUG: Configuring interface wan
Jan 23 19:51:44 	php-fpm 	349 	/rc.linkup: DEVD Ethernet attached event for wan
Jan 23 19:51:43 	kernel 		igb0: link state changed to UP
Jan 23 19:51:43 	check_reload_status 	378 	Linkup starting igb0
Jan 23 19:50:25 	check_reload_status 	378 	Reloading filter
Jan 23 19:50:25 	php-fpm 	348 	/rc.linkup: DEVD Ethernet detached event for wan
Jan 23 19:50:24 	kernel 		igb0: link state changed to DOWN
Jan 23 19:50:24 	check_reload_status 	378 	Linkup starting igb0

But that's not really feasible when it's the "summerhouse pfSense"...

I didn't see this when using 2.4.5-p1.

Any hints, fixes, or reasons?
Do I have to switch to a static IP for my pfSense WAN and avoid DHCP on the WAN entirely?
It's RFC1918 on the ISP device, i.e. my "WAN".

/Bingo

stephenw10

Do those error files in /tmp show anything?

bingo600

@stephenw10

Not really...
The igb0 files were overwritten when the ISP box was rebooted.

cat igb0_output
Cannot open or create pidfile: No such file or directory
dhclient 82382 - - PREINIT
DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 2
DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 2
DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 3
DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 3
DHCPOFFER from 192.168.1.1
DHCPREQUEST on igb0 to 255.255.255.255 port 67
DHCPACK from 192.168.1.1
bound to 192.168.1.x -- renewal in 3600 seconds.

Same for the dhcp files

cat dhcpd.sh
/bin/mkdir -p /var/dhcpd
/bin/mkdir -p /var/dhcpd/dev
/bin/mkdir -p /var/dhcpd/etc
/bin/mkdir -p /var/dhcpd/usr/local/sbin
/bin/mkdir -p /var/dhcpd/var/db
/bin/mkdir -p /var/dhcpd/var/run
/bin/mkdir -p /var/dhcpd/usr
/bin/mkdir -p /var/dhcpd/lib
/bin/mkdir -p /var/dhcpd/run
/usr/sbin/chown -R dhcpd:_dhcp /var/dhcpd/*
/bin/cp -n /lib/libc.so.* /var/dhcpd/lib/
/bin/cp -n /usr/local/sbin/dhcpd /var/dhcpd/usr/local/sbin/
/bin/chmod a+rx /var/dhcpd/usr/local/sbin/dhcpd

Well, this one seems to be from the POR (power-on reset):

-rw-r--r--  1 root  wheel      11 Jan 21 15:37 igb0_defaultgw

cat igb0_defaultgw
192.168.1.1

A bit strange, as it seems like it got a default gateway.

But then again, I use OpenVPN to access it, so I can't be 100% sure what happens here. I just assumed there was a DHCP issue because of the error message, and because a reboot of the ISP device (WAN link down and up) makes things "happy".

/Bingo

Gertjan

@bingo600 said in pfSense WAN dhcp client exiting (error):

race condition

It probably is.
When "15:37:24" (first image) happens, there should be a WAN interface, and it should be UP.
Or it was down (when it should have been UP at that moment).
8 seconds later, it comes UP, again?

/etc/rc.bootup, line 227:

interfaces_configure();

At that moment, all known assigned interfaces (WAN, LAN, etc.) should be up.
interfaces_configure(), in /etc/inc/interfaces.inc, winds up executing 'dhclient' on the WAN interface.
Or the WAN was gone for a moment, hence the error.

What you might try: put a switch between the pfSense WAN and the upstream router. This will mitigate the UP-DOWN-UP flapping.
I guess this issue wasn't noticeable in the past because your pfSense booted somewhat slower back then.
Using static settings on WAN is also a solution.

bingo600

@gertjan

I do agree about the race condition.

But I still think that the DHCP client task should "loop" in trying to get an address, even if the IF isn't up.

I'll probably change to a static IP next time I'm there.

stephenw10

Hmm, looks similar to some issues that existed previously, like this: https://redmine.pfsense.org/issues/9484

Try changing the DHCP timeout option to something large enough to allow the modem to boot. Like 900 as suggested there.

Though I agree it should just continue to try. I'm pretty sure there was a bug for that...

Steve

stephenw10

Ah, yes, this: https://redmine.pfsense.org/issues/9267

Though that should be fixed. Check the dhcp logs for the client using the old lease incorrectly.

Steve

bingo600

@stephenw10
I'll probably switch to a fixed IP, as this is the summerhouse.
Any "timing issue" not caught while I'm there testing will make my temperature monitoring unavailable. And during winter these measurements are essential, as we have "water in the pipes".

Thanks anyway.

/Bingo

Gertjan

@bingo600 said in pfSense WAN dhcp client exiting (error):

But I still think that the DHCP client task should "loop" in trying to get an address, even if the IF isn't up.

Think again ;)
When a process like dhclient is launched on a given interface, it needs that one (1) WAN interface, unlike unbound or the nginx web service for the GUI, which are bound to all interfaces. If that interface disappears because it goes down, because the upstream router/modem pulls the connection down, it will get removed from the 'ifconfig' kernel interfaces list; it will not stay active or in a running state.
So dhclient exits.

I think I found out why it comes back when an interface comes online again:
The LINKUP event.
See /etc/rc.bootup, lines 96-99, and devd.conf, lines 56-60.
I 'read' here:

#
# Try to start dhclient on Ethernet-like interfaces when the link comes
# up.  Only devices that are configured to support DHCP will actually
# run it.

To me, this means: when an interface comes up, and it's an interface that should support dhclient, dhclient is executed.
Note that the $subsystem variable isn't defined / used ?!?
So, is this used:

notify 0 {
	match "system"		"IFNET";
	match "type"		"LINK_UP";
	media-type		"ethernet";
	action "service dhclient quietstart $subsystem";
};

I really think that these 6 lines should auto-start dhclient on that interface ($subsystem = set to what?).
Real Plug and Play in action.

Just my 2 cents btw.

Go for the static IP. That's proven technology.

bingo600

@gertjan
I do agree that it's impossible (not feasible) to start a DHCP client process on an IF that's down. But one could loop with a "sleep 5 min", retrying to see if the IF becomes available.

Apparently something happens if/when my special condition is met that doesn't retrigger the DHCP client startup and leaves the now-active WAN IF in an IP address limbo...

But as you mention, I'll switch to a static/fixed IP next time I'm there.

/Bingo
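
For illustration only, the "sleep and retry" idea from the post above could be sketched roughly like this. This is not how pfSense or dhclient actually behave; the function name, its parameters, and the two callbacks (`link_is_up`, `start_client`) are all hypothetical placeholders for "check the interface" and "launch the DHCP client".

```python
import time
from typing import Callable


def wait_for_link_then_run(
    link_is_up: Callable[[], bool],   # hypothetical: True when the WAN link is up
    start_client: Callable[[], int],  # hypothetical: runs the client, returns its exit code
    retry_delay: float = 300.0,       # the "sleep 5 min" from the post
    max_attempts: int = 12,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Keep retrying until the link is up and the client starts cleanly.

    Returns True as soon as start_client() returns exit code 0,
    or False once max_attempts tries have been used up.
    """
    for attempt in range(1, max_attempts + 1):
        # Only try to start the client when the link is actually up.
        if link_is_up() and start_client() == 0:
            return True
        # Do not sleep after the final attempt.
        if attempt < max_attempts:
            sleep(retry_delay)
    return False
```

The callbacks are injected so the loop can be exercised without touching a real interface. On a real box they would have to wrap whatever link check and client start command fit the setup, which is left open here.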

Nosense 0

Exactly the same problem.

pfSense version 2.7.2-RELEASE (amd64)

In the log, the hotplug events loop endlessly.

The WAN interface goes up and down forever.

I have replaced everything: cable, NIC, etc.

The only fixes are rebooting the pfSense or switching to a fixed IP, which I have now done :-(

Incomprehensible...

stephenw10

A link change loop is not what the symptoms of this issue are.

What is causing it to lose link? What do your logs actually show?

Does this also happen after a power outage? So every device boots at the same time?

Nosense 0

@stephenw10 said in pfSense WAN dhcp client exiting (error):

causing it to lose link

Very good question, if only I knew...

- the connection to the ISP is interrupted
- definitely a power outage
- the network cable is unplugged
- ?

The logs show about the same as above.

No, the devices all boot at different times.

Unfortunately, I don't know when it all started, because the pfSense otherwise runs stably for me.

I can therefore only assume that it started when the ISP swapped the modem a few months ago.

But there is a router between the ISP modem and the pfSense, and it does not cause any problems. It's enough to make you despair, because I have unfortunately not found sufficient information in the logs.

Only rebooting the pfSense worked.

stephenw10

Do you have any example logs showing a couple of loops?

Nosense 0

I'm sorry, the logs in the pfSense are limited to 500 lines. This may not help you, because it is a log from the router:

Mar/28/2024 02:03:28 interface,info ether2 Firewall link down
Mar/28/2024 02:03:30 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:05:27 interface,info ether2 Firewall link down
Mar/28/2024 02:05:29 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:07:10 interface,info ether2 Firewall link down
Mar/28/2024 02:07:11 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:09:00 interface,info ether2 Firewall link down
Mar/28/2024 02:09:02 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:10:58 interface,info ether2 Firewall link down
Mar/28/2024 02:11:02 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:12:48 interface,info ether2 Firewall link down
Mar/28/2024 02:12:52 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:14:48 interface,info ether2 Firewall link down
Mar/28/2024 02:14:49 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:16:47 interface,info ether2 Firewall link down
Mar/28/2024 02:16:49 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:18:45 interface,info ether2 Firewall link down
Mar/28/2024 02:18:48 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:20:35 interface,info ether2 Firewall link down

Or where else can I find the log from the pfSense?
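
As a rough aid for reading a router log like the one above, the time between successive "link down" events can be computed with a few lines of Python. This is only a sketch: the timestamp format is taken from the excerpt above, the sample is the first seven lines of that excerpt, and the function names are made up for this example.

```python
from datetime import datetime

# First seven lines of the router log quoted above.
SAMPLE = """\
Mar/28/2024 02:03:28 interface,info ether2 Firewall link down
Mar/28/2024 02:03:30 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:05:27 interface,info ether2 Firewall link down
Mar/28/2024 02:05:29 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:07:10 interface,info ether2 Firewall link down
Mar/28/2024 02:07:11 interface,info ether2 Firewall link up (speed 1G, full duplex)
Mar/28/2024 02:09:00 interface,info ether2 Firewall link down
"""


def link_down_times(log_text):
    """Return the timestamps of all lines that end in 'link down'."""
    times = []
    for line in log_text.splitlines():
        if line.rstrip().endswith("link down"):
            date_part, time_part = line.split()[:2]
            stamp = f"{date_part} {time_part}"
            times.append(datetime.strptime(stamp, "%b/%d/%Y %H:%M:%S"))
    return times


def flap_intervals(log_text):
    """Return the seconds between each pair of successive link-down events."""
    times = link_down_times(log_text)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


print(flap_intervals(SAMPLE))  # prints [119, 103, 110]
```

In this excerpt the link drops roughly every two minutes and comes back within a few seconds each time.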

Nosense 0

Found this on the hard drive:

Mar 28 17:18:11 php-fpm[393]: /rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Mar 28 17:18:11 php-fpm[393]: /rc.newwanip: Creating rrd update script
Mar 28 17:18:13 check_reload_status[433]: Reloading filter
Mar 28 17:18:13 php-fpm[394]: /rc.linkup: Hotplug event detected for WAN(wan) dynamic IP address (4: dhcp)
Mar 28 17:18:13 php-fpm[394]: /rc.linkup: DEVD Ethernet attached event for wan
Mar 28 17:18:13 php-fpm[394]: /rc.linkup: HOTPLUG: Configuring interface wan
Mar 28 17:18:13 check_reload_status[433]: Linkup starting em1
Mar 28 17:18:13 kernel: em1: link state changed to DOWN
Mar 28 17:18:13 php-fpm[393]: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 192.168.xxx.xxx - Restarting packages.
Mar 28 17:18:13 check_reload_status[433]: Starting packages
Mar 28 17:18:13 check_reload_status[433]: Reloading filter
Mar 28 17:18:14 php-fpm[54515]: /rc.start_packages: Restarting/Starting all packages.
Mar 28 17:18:16 kernel: em1: link state changed to UP
Mar 28 17:18:16 check_reload_status[433]: Linkup starting em1
Mar 28 17:18:17 check_reload_status[433]: rc.newwanip starting em1

stephenw10

Hmm. Are you running Snort or Suricata in in-line mode? I'd expect to see some netgraph logs there but that's the only package I'm aware of that can affect the link state.

Nosense 0

Snort: no

Suricata: yes

stephenw10

And specifically in in-line mode?

Do you have a more extensive log? I need to see at least two complete loop cycles to see what's happening there.
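
To check whether a saved system log contains at least two complete loop cycles, something like the following could be used. It is a minimal sketch that assumes the kernel lines look like the "link state changed to UP/DOWN" entries quoted in this thread; the function name and the idea of counting a DOWN followed by an UP as one cycle are choices made for this example.

```python
import re

# Matches e.g. "kernel: em1: link state changed to DOWN"
LINK_RE = re.compile(r"\b(\w+): link state changed to (UP|DOWN)")

# The two kernel lines from the log excerpt posted above.
SAMPLE = """\
Mar 28 17:18:13 kernel: em1: link state changed to DOWN
Mar 28 17:18:16 kernel: em1: link state changed to UP
"""


def count_flap_cycles(log_text, iface):
    """Count DOWN-then-UP cycles for one interface in a log excerpt."""
    cycles = 0
    went_down = False
    for line in log_text.splitlines():
        match = LINK_RE.search(line)
        if not match or match.group(1) != iface:
            continue  # not a link-state line for this interface
        if match.group(2) == "DOWN":
            went_down = True
        elif went_down:
            # An UP that follows a DOWN completes one cycle.
            cycles += 1
            went_down = False
    return cycles


print(count_flap_cycles(SAMPLE, "em1"))  # prints 1
```

The excerpt posted so far contains only one such cycle for em1, which is why a longer log is needed.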

Nosense 0

[101342 - Suricata-Main] 2024-03-28 17:47:05 Notice: suricata: This is Suricata version 7.0.4 RELEASE running in SYSTEM mode

Okay, I'll try to activate the DHCP client again at the weekend to get the log you asked for.