Update to 2.5.0 broke DHCP relay
-
@victor_g
Correction: Having moved the backup file from 2.4.5 across the divide, it is as below.
Errors introduced by rekeying and not reading what's on the screen.
<dhcrelay>
  <enable/>
  <interface>lan</interface>
  <agentoption/>
  <server>192.168.123.1</server>
</dhcrelay>
-
Our dhcp relay service failure on 2.5.0 update seems to be hardware specific.
Netgate XG-1537 = success 2 out of 2 (version 21.02)
VMware 6.5 = success 4 out of 4
Supermicro 1U server (not sure of flavor, rear facing ports) 1 out of 1
Supermicro CSE-505-203B = fail 2 out of 2
Supermicro SYS-5018D-FN8T = fail 4 out of 4
Have not tried bare metal reload on failures yet.
-
I think I found the root cause.
The DHCP server is upstream, behind WAN.
DHCP relay is enabled, for example, only on LAN.
At least I found a hint in syslog:
Feb 22 16:01:17 check_reload_status 363 Syncing firewall
Feb 22 16:01:18 php-fpm 326 /services_dhcp_relay.php: No suitable upstream interfaces found for running dhcrelay!
I guess I know where the problem resides.
Our default configuration sets the DHCP relay only on the other interfaces, not on WAN. Our DHCP servers mostly reside upstream on the WAN side. We have some firewalls where that is different.
/etc/inc/services.inc
$srvifaces = array();
foreach ($srvips as $srcidx => $srvip) {
    $destif = guess_interface_from_ip($srvip);
    if (!empty($destif) && !is_pseudo_interface($destif)) {
        $srvifaces[] = $destif;
    }
}

/* Check for relays in the same subnet as clients so they can bind for
 * either direction (up or down) */
$srvrelayifs = array_intersect($dhcrelayifs, $srvifaces);

/* The server interface(s) should not be in this list */
$dhcrelayifs = array_diff($dhcrelayifs, $srvifaces);

/* Remove the dual-role interfaces from up and down lists */
$srvifaces = array_diff($srvifaces, $srvrelayifs);
$dhcrelayifs = array_diff($dhcrelayifs, $srvrelayifs);

/* fire up dhcrelay */
if (empty($dhcrelayifs) && empty($srvrelayifs)) {
    log_error(gettext("No suitable downstream interfaces found for running dhcrelay!"));
    return; /* XXX */
}
if (empty($srvifaces) && empty($srvrelayifs)) { # Error is here
    log_error(gettext("No suitable upstream interfaces found for running dhcrelay!"));
    return; /* XXX */
}
My DHCP server resides outside of any network directly attached to the firewall, therefore $srvifaces is empty, resulting in the error in syslog.
My fix is to explicitly add the upstream interface if there is none. I am not quite sure this is the best variant; I would think that fixing guess_interface_from_ip() might be a better way.
if (empty($srvifaces)) {
    $srvifaces[] = "vmx0";
}
if (empty($srvifaces) && empty($srvrelayifs)) {
    log_error(gettext("No suitable upstream interfaces found for running dhcrelay!"));
    return; /* XXX */
}
If there is anything else you need to know please let me know.
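For readers who want to follow the list arithmetic without a pfSense box at hand, here is a minimal Python sketch of the same steps. It is not the pfSense code: the function name `classify_relay_interfaces` and the interface name `vmx1` are made up for illustration, and only the two error strings are taken from the services.inc excerpt quoted above.

```python
def classify_relay_interfaces(dhcrelayifs, srvifaces):
    """Split interfaces into downstream, upstream and dual-role lists.

    Mirrors the array_intersect()/array_diff() steps quoted above.
    Returns (result_dict, None) on success or (None, error_message).
    """
    # Interfaces that are both relay (client-side) and server-side interfaces.
    srvrelayifs = [i for i in dhcrelayifs if i in srvifaces]
    # Server interfaces (which include the dual-role ones) leave the downstream list.
    downstream = [i for i in dhcrelayifs if i not in srvifaces]
    # Dual-role interfaces leave the upstream list.
    upstream = [i for i in srvifaces if i not in srvrelayifs]

    if not downstream and not srvrelayifs:
        return None, "No suitable downstream interfaces found for running dhcrelay!"
    if not upstream and not srvrelayifs:
        return None, "No suitable upstream interfaces found for running dhcrelay!"
    return {"down": downstream, "up": upstream, "both": srvrelayifs}, None


# Hypothetical relay interface "vmx1"; the server lookup found no interface.
print(classify_relay_interfaces(["vmx1"], []))
# Same relay interface, but the server was resolved to vmx0.
print(classify_relay_interfaces(["vmx1"], ["vmx0"]))
```

With an empty server-interface list the function returns the "No suitable upstream interfaces" message, which matches the syslog line; once any upstream interface is present, both checks pass.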
-
@victor_g
Further investigation on the upgraded 2.5.0 production environment shows (in /var/log/dhcpd.log):
Feb 22 15:41:48 wight dhcrelay[82265]: Internet Systems Consortium DHCP Relay Agent 4.4.2
Feb 22 15:41:48 wight dhcrelay[82265]: Copyright 2004-2020 Internet Systems Consortium.
Feb 22 15:41:48 wight dhcrelay[82265]: All rights reserved.
Feb 22 15:41:48 wight dhcrelay[82265]: For info, please visit https://www.isc.org/software/dhcp/
**Feb 22 15:41:48 wight dhcrelay[82265]: Unsupported device type 24 for "lo0"**
Feb 22 15:41:48 wight dhcrelay[82265]:
Feb 22 15:41:48 wight dhcrelay[82265]: If you think you have received this message due to a bug rather
Feb 22 15:41:48 wight dhcrelay[82265]: than a configuration issue please read the section on submitting
Feb 22 15:41:48 wight dhcrelay[82265]: bugs on either our web page at www.isc.org or in the README file
Feb 22 15:41:48 wight dhcrelay[82265]: before submitting a bug. These pages explain the proper
Feb 22 15:41:48 wight dhcrelay[82265]: process and the information we find helpful for debugging.
Feb 22 15:41:48 wight dhcrelay[82265]:
Feb 22 15:41:48 wight dhcrelay[82265]: exiting.
So perhaps the default upgrade is adding lo0 to the dhcrelay startup process?
The equivalent from the 2.4.5_1 environment is as follows:
Feb 22 15:47:00 wight dhcrelay: Internet Systems Consortium DHCP Relay Agent 4.4.1
Feb 22 15:47:00 wight dhcrelay: Copyright 2004-2018 Internet Systems Consortium.
Feb 22 15:47:00 wight dhcrelay: All rights reserved.
Feb 22 15:47:00 wight dhcrelay: For info, please visit https://www.isc.org/software/dhcp/
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx4/00:0c:29:10:bf:e3
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx4/00:0c:29:10:bf:e3
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx7.888/00:0c:29:10:bf:15
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx7.888/00:0c:29:10:bf:15
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx6/00:0c:29:10:bf:ed
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx6/00:0c:29:10:bf:ed
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx5/00:0c:29:10:bf:0b
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx5/00:0c:29:10:bf:0b
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx3/00:0c:29:10:bf:01
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx3/00:0c:29:10:bf:01
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx2/00:0c:29:10:bf:d9
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx2/00:0c:29:10:bf:d9
Feb 22 15:47:00 wight dhcrelay: Listening on BPF/vmx1/00:0c:29:10:bf:f7
Feb 22 15:47:00 wight dhcrelay: Sending on BPF/vmx1/00:0c:29:10:bf:f7
Feb 22 15:47:00 wight dhcrelay: Sending on Socket/fallback
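When comparing several boxes, it can help to pull the fatal line out of the log automatically. The following is only a rough sketch: the regular expression is written against the message format seen in the 2.5.0 log above, and the helper name `unsupported_devices` is made up.

```python
import re

# Matches e.g.: dhcrelay[82265]: Unsupported device type 24 for "lo0"
# The [pid] part is optional, as the 2.4.5_1 log lines carry no pid.
FATAL = re.compile(
    r'dhcrelay(?:\[\d+\])?: Unsupported device type (\d+) for "([^"]+)"'
)


def unsupported_devices(log_text):
    """Return (interface, device_type) pairs that dhcrelay rejected."""
    return [(m.group(2), int(m.group(1))) for m in FATAL.finditer(log_text)]


sample = (
    'Feb 22 15:41:48 wight dhcrelay[82265]: All rights reserved.\n'
    'Feb 22 15:41:48 wight dhcrelay[82265]: Unsupported device type 24 for "lo0"\n'
    'Feb 22 15:41:48 wight dhcrelay[82265]: exiting.\n'
)
print(unsupported_devices(sample))
```

On a real system the text would come from reading /var/log/dhcpd.log; an empty result means dhcrelay did not reject any interface in that log.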
-
OK, I have to add that this is a solution for my case only (WAN is always vmx0). The physical interface on another system will surely have a different name.
<interfaces>
  <wan>
    <enable></enable>
    <if>vmx0</if>
Therefore, more generally, it is something like
$config['interfaces']['wan']['if']
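To check which interface name a given box would end up with, the lookup can be tried offline against a saved config. This is only a sketch in Python, not pfSense code: the `<pfsense>` root element and the helper name `wan_interface` are assumptions, and the sample is a cut-down version of the fragment above.

```python
import xml.etree.ElementTree as ET

# Cut-down sample; a real config.xml holds far more than this.
# The <pfsense> root element is an assumption about the file layout.
SAMPLE = """<pfsense>
  <interfaces>
    <wan>
      <enable></enable>
      <if>vmx0</if>
    </wan>
  </interfaces>
</pfsense>"""


def wan_interface(config_xml):
    """Return the physical interface name assigned to WAN, or None."""
    root = ET.fromstring(config_xml)
    node = root.find("./interfaces/wan/if")
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


print(wan_interface(SAMPLE))
```

Returning None instead of a default keeps the caller from silently falling back to a hard-coded name such as vmx0.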
-
@fwcheck There's definitely something odd going on.
In the test scenario, I have a similar environment, with the DHCP server on the dirty (WAN) side, using vmx0. This seems to be working correctly, as I get vmx0 in the $srvifaces list.
But no lo0 in any list.
Now adding more elements to the test environment to find the thing that triggers lo0 to get added to the list.
-
@johnsdixon said in Update to 2.5.0 broke DHCP relay:
But no lo0 in any list.
Isn't that a good thing?
lo0 is the local host, or 127.0.0.1.
dhcrelay can't operate on "lo0":
@johnsdixon said in Update to 2.5.0 broke DHCP relay:
Feb 22 15:41:48 wight dhcrelay[82265]: Unsupported device type 24 for "lo0"
which is rather logical.
-
@gertjan But my production environment generates a startup command for the DHCP relay with lo0 included.
This is not there in 2.4.5_1, but following an upgrade to 2.5.0 it appears, and no functioning DHCP relay process is started by default in that situation. What I'm trying to do is work out what is triggering the inclusion of lo0 within the startup process.
There is no lo0 anywhere in my config, nor has disabling services (e.g. squid, OpenVPN) on the production configuration gained me working DHCP forwarding.
-
Redmine issue created:
https://redmine.pfsense.org/issues/11523
-
I tried the beta of 2.5, and discovered the same thing.
I posted my 2.5 findings in here:
https://forum.netgate.com/topic/157022/not-sure-if-it-is-a-bug-or-not-dhcprelay-in-2-5?_=1614502774329
Hope this helps.
I have upgraded to 2.5, and it seems to be working on my setup.
Cheers
Elfranko
-
@viktor_g
I can confirm that this issue relates to routing, as already mentioned on Redmine, and that it doesn't exist in earlier versions of pfSense.
Take this configuration, where LAN is for management only, WAN is the connection to the Internet router, the DHCP server is on opt2, and the test computer trying to get an IP address via DHCP is on opt3:
<interfaces>
  <wan>
    <enable></enable>
    <if>hn2</if>
    <blockbogons></blockbogons>
    <descr><![CDATA[WAN]]></descr>
    <spoofmac></spoofmac>
    <ipaddr>172.30.0.99</ipaddr>
    <subnet>16</subnet>
    <gateway>WANGW</gateway>
  </wan>
  <lan>
    <enable></enable>
    <if>hn0</if>
    <ipaddr>10.100.0.99</ipaddr>
    <subnet>16</subnet>
    <gateway></gateway>
    <gatewayv6></gatewayv6>
    <descr><![CDATA[LAN]]></descr>
  </lan>
  <opt2>
    <descr><![CDATA[TestDC]]></descr>
    <if>hn3</if>
    <enable></enable>
    <spoofmac></spoofmac>
    <ipaddr>10.199.0.1</ipaddr>
    <subnet>24</subnet>
  </opt2>
  <opt3>
    <descr><![CDATA[Test1]]></descr>
    <if>hn1</if>
    <enable></enable>
    <spoofmac></spoofmac>
    <ipaddr>10.99.1.1</ipaddr>
    <subnet>24</subnet>
  </opt3>
</interfaces>
<staticroutes>
  <route>
    <network>10.0.0.0/8</network>
    <gateway>Null4</gateway>
    <descr><![CDATA[Default bei RFC 1918, Private Class A]]></descr>
  </route>
  <route>
    <network>172.16.0.0/12</network>
    <gateway>Null4</gateway>
    <descr><![CDATA[Default bei RFC 1918, Private Class B]]></descr>
  </route>
  <route>
    <network>192.168.0.0/16</network>
    <gateway>Null4</gateway>
    <descr><![CDATA[Default bei RFC 1918, Private Class C]]></descr>
  </route>
</staticroutes>
<dhcrelay>
  <enable></enable>
  <interface>opt3</interface>
  <agentoption></agentoption>
  <server>10.199.0.11</server>
</dhcrelay>
The null routes are there to keep packets with local addresses from going out to the Internet (the implicit routes of directly attached subnets have higher priority).
With this configuration you cannot start the DHCP relay service (dhcrelay).
If you change the null route "10.0.0.0/8" to something that does not contain the DHCP server's subnet (in my configuration e.g. "10.0.0.0/9"), then everything works fine.
Remark: After modifying routes you have to reboot pfSense, because the already existing routes are not replaced; the modified route is only added (no automatic flush of the routing cache).
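The address arithmetic behind this can be checked with Python's standard `ipaddress` module. This sketch does not reproduce how pfSense looks up routes; it only shows which prefixes contain the DHCP server address. The helper names are made up, and the connected subnets are derived from the interface addresses in the config above.

```python
import ipaddress


def covers(network, address):
    """True if the address falls inside the given network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


def longest_prefix_match(address, routes):
    """Return the (network, target) route with the longest matching prefix, or None."""
    matches = [(net, target) for net, target in routes if covers(net, address)]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


DHCP_SERVER = "10.199.0.11"

# Connected subnets derived from the interface addresses in the config above.
CONNECTED = [
    ("172.30.0.0/16", "hn2"),
    ("10.100.0.0/16", "hn0"),
    ("10.199.0.0/24", "hn3"),
    ("10.99.1.0/24", "hn1"),
]

print(covers("10.0.0.0/8", DHCP_SERVER))  # True: the null route contains the server
print(covers("10.0.0.0/9", DHCP_SERVER))  # False: 10.0.0.0/9 ends at 10.127.255.255
print(longest_prefix_match(DHCP_SERVER, CONNECTED + [("10.0.0.0/8", "Null4")]))
# ('10.199.0.0/24', 'hn3')
```

A longest-prefix lookup still picks the connected /24 on hn3 even with the /8 null route present, which is consistent with the remark that directly attached subnets take priority.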
-
Try to apply Patch ID 7990de53bfc8267d1dd96636a175929a35cbe664 to fix DHCP Relay issue
see https://redmine.pfsense.org/issues/11475
@roland_v said in Update to 2.5.0 broke DHCP relay:
Remark: After modifying Routes you have to reboot pfSense, because already existing routes were not replaced, but the modified route is added (no automatic flush of routing cache).
Could you create a new redmine issue for this?
-
@viktor_g
Because I'm not able to build pfSense from source, I tried the last development snapshot, built on Tue Mar 02 01:18:30 EST 2021.
With this version the DHCP relay works as expected.
For "Error on updating routing table after modifying static routes" I will open a new Redmine issue as requested.
-
@viktor_g said in Update to 2.5.0 broke DHCP relay:
Try to apply Patch ID 7990de53bfc8267d1dd96636a175929a35cbe664 to fix DHCP Relay issue
Thanks, this patch fixed my identical problem.
For future reference, because some people might not know about it: See https://docs.netgate.com/pfsense/en/latest/development/system-patches.html for how to apply such patches to an existing installation.
--
Christian
-
Hi,
I just stepped on the same issue with pfSense 2.6.0.
The patch didn't pass the check, so I can't apply it.
My setup contains several LAN adapters leading to several (/24) subnets.
The DHCP server is reachable on a remote system via OpenVPN.
Can you help me out?
/usr/bin/patch --directory='/' -t --strip '2' -i '/var/patches/6241d97843a1b.patch' --check --forward --ignore-whitespace
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 7990de53bfc8267d1dd96636a175929a35cbe664 Mon Sep 17 00:00:00 2001
|From: Viktor G <viktor@netgate.com>
|Date: Thu, 25 Feb 2021 16:42:35 +0300
|Subject: [PATCH] route_get() optimization. Fixes #11475
|
|---
| src/etc/inc/interfaces.inc | 2 +-
| src/etc/inc/util.inc | 50 +++++++++++++++++++++++++++++---------
| 2 files changed, 39 insertions(+), 13 deletions(-)
|
|diff --git a/src/etc/inc/interfaces.inc b/src/etc/inc/interfaces.inc
|index 35206915d92..307e76edcef 100644
|--- a/src/etc/inc/interfaces.inc
|+++ b/src/etc/inc/interfaces.inc
--------------------------
Patching file etc/inc/interfaces.inc using Plan A...
Ignoring previously applied (or reversed) patch.
Hunk #1 ignored at 6041.
1 out of 1 hunks ignored while patching etc/inc/interfaces.inc
Hmm... The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/src/etc/inc/util.inc b/src/etc/inc/util.inc
|index 6f94b0da41e..bc5178dee61 100644
|--- a/src/etc/inc/util.inc
|+++ b/src/etc/inc/util.inc
--------------------------
Patching file etc/inc/util.inc using Plan A...
Ignoring previously applied (or reversed) patch.
Hunk #1 ignored at 2692.
Hunk #2 ignored at 2707.
Hunk #3 ignored at 2755.
3 out of 3 hunks ignored while patching etc/inc/util.inc
done

/usr/bin/patch --directory='/' -f --strip '2' -i '/var/patches/6241d97843a1b.patch' --check --reverse --ignore-whitespace
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 7990de53bfc8267d1dd96636a175929a35cbe664 Mon Sep 17 00:00:00 2001
|From: Viktor G <viktor@netgate.com>
|Date: Thu, 25 Feb 2021 16:42:35 +0300
|Subject: [PATCH] route_get() optimization. Fixes #11475
|
|---
| src/etc/inc/interfaces.inc | 2 +-
| src/etc/inc/util.inc | 50 +++++++++++++++++++++++++++++---------
| 2 files changed, 39 insertions(+), 13 deletions(-)
|
|diff --git a/src/etc/inc/interfaces.inc b/src/etc/inc/interfaces.inc
|index 35206915d92..307e76edcef 100644
|--- a/src/etc/inc/interfaces.inc
|+++ b/src/etc/inc/interfaces.inc
--------------------------
Patching file etc/inc/interfaces.inc using Plan A...
Hunk #1 succeeded at 6041 (offset -118 lines).
Hmm... The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/src/etc/inc/util.inc b/src/etc/inc/util.inc
|index 6f94b0da41e..bc5178dee61 100644
|--- a/src/etc/inc/util.inc
|+++ b/src/etc/inc/util.inc
--------------------------
Patching file etc/inc/util.inc using Plan A...
Hunk #1 succeeded at 2692 (offset 43 lines).
Hunk #2 failed at 2705.
Hunk #3 succeeded at 2690 (offset 4 lines).
1 out of 3 hunks failed while patching etc/inc/util.inc
done
-
@wurst You can simply upgrade to the latest pfSense version
-
@wurst I am not quite sure I understand your problem correctly:
Your DHCP server is outside of the networks of pfSense, upstream on the OpenVPN interface, right?
Does the DHCP log reveal anything?
cat /var/log/dhcpd.log
If you log into the box and run
# ps aux | grep "dhcrelay"
does it show that dhcrelay is running?
I am not quite sure whether you can work around it using
/usr/local/sbin/dhcrelay [-id <for all interfaces which require DHCP>] -iu <your openvpn-interface> -a -m replace IP_dhcp-server1 IP_dhcp-server2
That might be a fast fix for the problem.
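If the command has to be assembled for several interfaces, a small helper avoids typos. This is a sketch only: it uses just the flags from the command line above (-id, -iu, -a, -m replace), the function name `build_dhcrelay_argv` is made up, and the interface names and server address in the usage line are placeholders.

```python
import shlex


def build_dhcrelay_argv(downstream, upstream, servers,
                        binary="/usr/local/sbin/dhcrelay"):
    """Build a dhcrelay argument list from interface and server lists."""
    if not downstream:
        raise ValueError("at least one downstream interface is required")
    if not servers:
        raise ValueError("at least one DHCP server address is required")
    argv = [binary]
    for ifname in downstream:
        argv += ["-id", ifname]      # client-facing interfaces
    for ifname in upstream:
        argv += ["-iu", ifname]      # interface(s) toward the DHCP server
    argv += ["-a", "-m", "replace"]  # same agent-option flags as in the post
    argv += list(servers)
    return argv


# Placeholder names: one LAN interface, one OpenVPN interface, one server.
argv = build_dhcrelay_argv(["em2"], ["ovpns2"], ["10.1.1.12"])
print(shlex.join(argv))
```

Keeping the result as a list rather than a string means it can be handed to `subprocess.run` without shell quoting issues.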
-
@fwcheck
Hi, sorry for my late reply.
The system went into production, so I used the switch for DHCP relaying.
Now I am back with another system for the next try.
Here's my test with the line you recommended:
/usr/local/sbin/dhcrelay -id em2 -iu ovpns2 -a -m replace 10.1.1.12
Requesting: em2 as upstream: N downstream: Y
Requesting: ovpns2 as upstream: Y downstream: N
Internet Systems Consortium DHCP Relay Agent 4.4.2-P1
Copyright 2004-2021 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Unsupported device type 23 for "ovpns2"
--> Unsupported device type 23
OK. The OpenVPN adapter is not supported.
If I start it without specifying "-iu", it starts:
/usr/local/sbin/dhcrelay -d -id em2 -a -m replace 10.1.1.12
Requesting: em2 as upstream: N downstream: Y
Internet Systems Consortium DHCP Relay Agent 4.4.2-P1
Copyright 2004-2021 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Listening on BPF/em2/00:0c:29:a0:6f:7c
Sending on BPF/em2/00:0c:29:a0:6f:7c
Sending on Socket/fallback
Adding 8-byte relay agent option
Forwarded BOOTREQUEST for 00:0c:29:38:09:f2 to 10.1.1.12
Adding 8-byte relay agent option
Forwarded BOOTREQUEST for 00:0c:29:38:09:f2 to 10.1.1.12
Adding 8-byte relay agent option
Forwarded BOOTREQUEST for 00:0c:29:38:09:f2 to 10.1.1.12
The DHCP request is received by the DHCP server 10.1.1.12, and it replies with a DHCP offer.
The packet arrives at pfSense, but nothing happens...
-
I'm seeing the same issue as @wurst but on a new (currently testing) installation which has not previously seen DHCP relay operational and has not been put into production yet.
At our main site I have pfSense router #1 (2.6.0) running an OpenVPN server; on the LAN network of this router are Microsoft DHCP servers (pfSense #1 LAN address 10.0.1.254/16, DHCP server 10.0.0.200).
At the remote site, pfSense box #2 (2.6.0) runs an OpenVPN client in peer-to-peer mode with layer 3 tunnelling (LAN address 10.20.1.254/16).
Routing is all set up and working well, and I have been doing all my initial testing simply using the local DHCP server on pfSense router #2. For various reasons (including DNS registration, and potential PXE boot to a WDS server) I would prefer to use DHCP relay back to the Microsoft DHCP servers rather than a local DHCP server.
So I disabled DHCP, enabled DHCP relay and it... doesn't work.
[Screenshot of the DHCP relay configuration]
Two VLANs are configured (STUDENT and GUEST), but at the moment I am only testing with the untagged VLAN, LAN.
Tcpdump on pfSense #2 confirms that DHCP discovers are being received from a test client on LAN, but nothing is being forwarded over the OpenVPN link. It wasn't until I looked directly at /var/log/syslog that I noticed this error, which led me to this (and other) threads:
Nov 12 15:46:48 pfSense2 php-fpm[70836]: /services_dhcp_relay.php: No suitable upstream interfaces found for running dhcrelay!
It appears that the dhcp relay daemon is not even being started at all. The "upstream interface" to the DHCP server would be the OpenVPN tunnel - which is already up and running at the time that I'm trying to start the DHCP relay.
Also, I do have interfaces "assigned" to the OpenVPN tunnel; I know you can run an OpenVPN tunnel without doing this, but that can have some limitations.
DHCP relay not working is unfortunate and is a bit of a showstopper for rolling out this OpenVPN tunnel after doing 90% of the work setting it up and testing it. I also don't have a layer 3 switch at the remote site to do DHCP relay, as the site is very small, with only 3 computers and 2 wireless base stations, so it doesn't warrant an expensive switch.
At the moment we have other equipment providing this site-to-site VPN link, which has significant limitations and needs replacing. pfSense would do a much better job of it, if the DHCP relay would work for me...
I'm happy to provide logs and do testing on this issue, and while I'm somewhat of a FreeBSD noob I do have a lot of Linux experience so I can find my way around the command line fairly well.
At the moment the link is not in production; in fact pfSense router #2, which will be at the remote office, is actually in my house right now, so I can test anything without disrupting anyone. It is an ideal test environment.