DHCP Relay broken implementation
-
I had some trouble with DHCP Relay (using the built-in dhcrelay) and investigated a bit.
The strange behaviour was that for every DHCP request from a client, two or three requests reached the DHCP server: three when the client already has an IP address and is renewing it.
In that latter case, the client unicasts the request to the server. The server sees the request from the client and ACKs it. Then dhcrelay on pfSense sends two additional copies of the same request: one with the LAN IP as GIADDR (unnecessary when the client unicasts) and one with the WAN IP as GIADDR (completely wrong). The server ACKs the copy with the LAN IP, of course, and NAKs the copy with the WAN IP, as it should. This is not how a DHCP relay is supposed to work.
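For comparison, here is a minimal Python sketch of what a relay agent is expected to do with a single BOOTREQUEST, as I understand the relay rules in RFC 1542: fill in GIADDR with the address of the interface the request arrived on (only if GIADDR is still zero) and increment the hop count. This is not dhcrelay's code; `BootpPacket`, `relay_request` and the `max_hops` default are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, replace

BOOTREQUEST = 1
ZERO_ADDR = "0.0.0.0"


@dataclass(frozen=True)
class BootpPacket:
    # Only the fields relevant to relaying; a real BOOTP header has more.
    op: int
    giaddr: str = ZERO_ADDR
    hops: int = 0
    xid: int = 0


def relay_request(pkt, rx_iface_addr, max_hops=16):
    """Return the single packet to forward to the server, or None to drop.

    rx_iface_addr is the address of the interface the request was
    received on (the LAN address in the scenario above).
    max_hops is an illustrative limit, not dhcrelay's default.
    """
    if pkt.op != BOOTREQUEST:
        return None  # replies are handled elsewhere
    if pkt.hops >= max_hops:
        return None  # too many relays already, drop
    # Only fill in giaddr if no earlier relay has set it.
    giaddr = pkt.giaddr if pkt.giaddr != ZERO_ADDR else rx_iface_addr
    return replace(pkt, giaddr=giaddr, hops=pkt.hops + 1)
```

With this rule, one request received on the LAN produces exactly one relayed request carrying the LAN address as GIADDR; the WAN address never appears in it.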
I did some digging and a tcpdump showed this exact behavior.
On the LAN you can see the client sending one unicast request:
12:44:42.523961 00:23:4d:8e:5a:23 > 00:50:56:86:10:e8, ethertype IPv4 (0x0800), length 351: (tos 0x0, ttl 128, id 23121, offset 0, flags [none], proto UDP (17), length 337)
CLIENTIP.68 > DHCPSERVERIP.67: [udp sum ok] BOOTP/DHCP, Request from 00:23:4d:8e:5a:23, length 309, xid 0xe8caabd1, Flags [none] (0x0000)
Client-IP CLIENTIP
Client-Ethernet-Address 00:23:4d:8e:5a:23
On the WAN you can see three requests going out to the server:
12:46:03.466191 00:50:56:86:b4:c6 > 00:00:5e:00:01:01, ethertype IPv4 (0x0800), length 351: (tos 0x0, ttl 127, id 23146, offset 0, flags [none], proto UDP (17), length 337)
CLIENTIP.68 > DHCPSERVERIP.67: [udp sum ok] BOOTP/DHCP, Request from 00:23:4d:8e:5a:23, length 309, xid 0x948e2722, Flags [none] (0x0000)
Client-IP CLIENTIP
Client-Ethernet-Address 00:23:4d:8e:5a:23
12:46:03.466280 00:50:56:86:b4:c6 > 00:00:5e:00:01:01, ethertype IPv4 (0x0800), length 351: (tos 0x0, ttl 64, id 38994, offset 0, flags [none], proto UDP (17), length 337)
WANIP.67 > DHCPSERVERIP.67: [bad udp cksum 0x2195 -> 0x3f5d!] BOOTP/DHCP, Request from 00:23:4d:8e:5a:23, length 309, hops 1, xid 0x948e2722, Flags [none] (0x0000)
Client-IP CLIENTIP
Gateway-IP WANIP
Client-Ethernet-Address 00:23:4d:8e:5a:23
12:46:03.466321 00:50:56:86:b4:c6 > 00:00:5e:00:01:01, ethertype IPv4 (0x0800), length 351: (tos 0x0, ttl 64, id 23708, offset 0, flags [none], proto UDP (17), length 337)
WANIP.67 > DHCPSERVERIP.67: [bad udp cksum 0x2195 -> 0x6ff5!] BOOTP/DHCP, Request from 00:23:4d:8e:5a:23, length 309, hops 1, xid 0x948e2722, Flags [none] (0x0000)
Client-IP CLIENTIP
Gateway-IP LANIP
Client-Ethernet-Address 00:23:4d:8e:5a:23
The first request is the unicast from the client (so the client is the sender). The second and third are relayed by dhcrelay (so the WAN IP is the sender), and one of those two has the wrong GIADDR set: the WAN IP instead of the LAN IP.
So the DHCP Server answers correctly:
Apr 22 12:46:03 r2d2 dhcpd: DHCPREQUEST for CLIENTIP from 00:23:4d:8e:5a:23 via em0
Apr 22 12:46:03 r2d2 dhcpd: DHCPACK on CLIENTIP to 00:23:4d:8e:5a:23 via em0
Apr 22 12:46:03 r2d2 dhcpd: DHCPREQUEST for CLIENTIP from 00:23:4d:8e:5a:23 via WANIP: wrong network.
Apr 22 12:46:03 r2d2 dhcpd: DHCPNAK on CLIENTIP to 00:23:4d:8e:5a:23 via WANIP
Apr 22 12:46:03 r2d2 dhcpd: DHCPREQUEST for CLIENTIP from 00:23:4d:8e:5a:23 via LANIP
Apr 22 12:46:03 r2d2 dhcpd: DHCPACK on CLIENTIP to 00:23:4d:8e:5a:23 via LANIP
Searching further through the pfSense subreddit, mailing list and forum, I finally found someone with the same problem:
https://forum.pfsense.org/index.php?topic=19959.msg102605#msg102605
He describes the exact problem. (The thread is closed, though; otherwise I would have resurrected it, because the forum rules say to use existing threads.) But the bug itself is not in pfSense; it is in the dhcrelay program. For relaying to work at all, the interface facing the server (in this case the WAN) has to be included among the -i interfaces at startup, which pfSense does correctly. If it is left out, DHCP relay stops working altogether, because dhcrelay won't listen for the responses. But the design is just flawed.
So, again, I dug further and found posts in mailing lists and forums, some 10 years old and some newer, all describing this problem with no real solution. Then I found something in the Debian bug tracker describing the same problem (it starts with having to pass the WAN interface to -i, and goes on to dhcrelay wrongly listening on it and duplicating or triplicating the requests):
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=648401
But if you read through the thread, there is a patch to dhcrelay (in relay/dhcrelay.c and common/discover.c) that fixes the -i listen problem. With it, dhcrelay listens for replies on all interfaces, but only forwards requests from the -i interfaces. The fact that it also forwards the unicasts is not fixed, but that is not as big a deal as the wrong GIADDR issue. The patch has been submitted to Debian's dhcrelay, but it does not seem to have gone upstream into ISC DHCP. So this issue is fixed in Debian, but nowhere else.
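The rule the patch is described as implementing can be sketched roughly like this in Python. This is only an illustration of the described behaviour, not the actual patch (which is C code in relay/dhcrelay.c and common/discover.c); `should_process` and the interface names are made up for the example.

```python
BOOTREQUEST = 1
BOOTREPLY = 2


def should_process(op, rx_iface, requested_ifaces):
    """Decide whether the relay handles a packet received on rx_iface.

    requested_ifaces is the set of interfaces given with -i.
    Replies are accepted from any interface; requests only from
    the interfaces that were explicitly requested.
    """
    if op == BOOTREPLY:
        return True
    if op == BOOTREQUEST:
        return rx_iface in requested_ifaces
    return False  # unknown op code, ignore


# Hypothetical setup: only the LAN is passed with -i.
requested = {"lan"}
lan_request = should_process(BOOTREQUEST, "lan", requested)
wan_request = should_process(BOOTREQUEST, "wan", requested)
wan_reply = should_process(BOOTREPLY, "wan", requested)
```

Under that rule the WAN no longer has to be listed with -i just to receive the server's replies, so a request seen on the WAN is never relayed with the WAN IP as GIADDR.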
As the submitter of the patch writes:
Here's a patch that fixes the problem for us. It makes dhcrelay listen on all
interfaces and relay BOOTREPLY packets from them, but still only rely
BOOTREQUEST packets from requested interfaces (those with -i).
I don't think pfSense will patch packages that way (or will they?), but as they have ties to the FreeBSD community, maybe they can get this patch into FreeBSD? Or do they have developer ties to ISC?
On the pfSense side, once dhcrelay is patched, it should start dhcrelay with -i only on the selected interfaces, and it should then work as expected.
-
It looks like ISC is saying that's been fixed in 4.3.3 in the last post here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=648401
That is what we already include in 2.3. Which version are you using?
I don't see anything about that in their change log though, at a quick review.
https://kb.isc.org/article/AA-01364
-
I'm on pfSense 2.3.
The link says it has been fixed in that ISC DHCP version on Debian, and on Debian only. They carry their own set of patches, and I'm not sure how or whether they send such fixes upstream, or how ISC handles them. As you said, there is nothing in the official ISC changelogs for 4.3.3 and 4.3.4, so it is not included. 4.3.3-7 is a Debian-internal version.
-
Ah, yeah, at a glance earlier I thought that last message was from ISC, not Debian.
I would hope they've reported that to ISC for inclusion in 4.3.4, though at a quick review of the change log I don't see it mentioned. It's really something ISC needs to fix. We can patch it if necessary, but that shouldn't be required.
If you can open up a bug ticket at redmine.pfsense.org, I'll take a look at that as soon as time permits.
-
Redmine ticket created for this here: https://redmine.pfsense.org/issues/6355
Steve
-
pfSense 2.3.2 snapshots have isc-dhcp43-relay version 4.3.4_1, which has the Debian patch applied. Could you please try it out to make sure it fixes the issue?
-
I still have this problem on 2.3.2.