DHCP Relay and VPN
I've been spending quite a bit of time lately trying to get the following scenario to work, and wanted to post here because I'm curious if anyone else has run into this.
I have two sites, both running pfSense 2.4.5-p1 and connected by a site-to-site OpenVPN tunnel (in TAP mode). Each site has its own DHCP server, and the two servers are set up for failover/load balancing with each other. The idea is that if the DHCP server in Site A goes offline, clients in Site A can still get IP addresses from the DHCP server in Site B (and vice versa). However, I had been largely unsuccessful in getting DHCP relay to work with this configuration, in part because OpenVPN, GRE, GIF, and other tunnel interfaces were removed from the list of interfaces on the DHCP Relay configuration page in 2.4.5.
Based on a lot of testing and packet captures, I can see that the DHCP discover goes across the OpenVPN tunnel and the DHCP server in the other site responds with an offer, but the offer is never passed on to the client because dhcrelay is not listening on the OpenVPN interface. I tried various NAT tricks to intercept the offer and push it to an interface where dhcrelay is listening, but none of them worked. I then found a comment in a bug report (https://redmine.pfsense.org/issues/9466) mentioning that dhcrelay can differentiate between upstream and downstream interfaces (though this functionality is not exposed in the Web GUI), and thought it would be worth trying to add the OpenVPN interface as an upstream interface. On Site B's router, dhcrelay was running with these options:
/usr/local/sbin/dhcrelay -id lagg0.3 -id lagg0.4 -id lagg0.8 -id lagg0.9 -id lagg0.11 -iu lagg0.10 <Site_B_DHCP_Server_IP>
I killed the process and restarted it via command line like this:
/usr/local/sbin/dhcrelay -id lagg0.3 -id lagg0.4 -id lagg0.8 -id lagg0.9 -id lagg0.11 -iu lagg0.10 -iu ovpnc1 <Site_B_DHCP_Server_IP> <Site_A_DHCP_Server_IP>
So far so good. I tested by shutting down the DHCP server in Site B, and clients were still able to get IP addresses when connecting to the network. I'm also not seeing any interface type errors in the DHCP log (see https://redmine.pfsense.org/issues/10341 for details). I haven't had enough time to test the long-term stability of dhcrelay running with this configuration, but I'm cautiously optimistic. I even created a script to apply this configuration at startup (located in /usr/local/etc/rc.d):
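#!/bin/sh
# Stop the dhcrelay instance started by the GUI, then restart it with ovpnc1 added as an upstream interface.
killall -3 dhcrelay
/usr/local/sbin/dhcrelay -id lagg0.3 -id lagg0.4 -id lagg0.8 -id lagg0.9 -id lagg0.11 -iu lagg0.10 -iu ovpnc1 <Site_B_DHCP_Server_IP> <Site_A_DHCP_Server_IP>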
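For anyone adapting this to a different set of interfaces, here is a minimal sketch of the same idea with the interface lists pulled out into variables. The interface names are the ones from my setup above; the two server addresses (192.0.2.10 and 198.51.100.10) are placeholder documentation addresses, not real values, and the function name build_dhcrelay_cmd is just something made up for this example. As written it only prints the command so it can be reviewed before anything is restarted.

```shell
#!/bin/sh
# Sketch: assemble the dhcrelay command line from interface lists.
# The server addresses below are placeholders (assumptions); replace
# them with the real DHCP server IPs for each site.

DOWNSTREAM="lagg0.3 lagg0.4 lagg0.8 lagg0.9 lagg0.11"  # client-facing interfaces (-id)
UPSTREAM="lagg0.10 ovpnc1"                             # server-facing interfaces (-iu)
SERVERS="192.0.2.10 198.51.100.10"                     # local server, then remote server

# Hypothetical helper: prints the full dhcrelay command line.
build_dhcrelay_cmd() {
    cmd="/usr/local/sbin/dhcrelay"
    for ifc in $DOWNSTREAM; do
        cmd="$cmd -id $ifc"
    done
    for ifc in $UPSTREAM; do
        cmd="$cmd -iu $ifc"
    done
    echo "$cmd $SERVERS"
}

# Dry run: show the command that would be executed.
build_dhcrelay_cmd

# To actually apply it, uncomment the two lines below.
# killall -3 dhcrelay
# $(build_dhcrelay_cmd)
```

Keeping the tunnel interface in the upstream list only matches what worked in my testing: dhcrelay listens for server replies on ovpnc1 but does not treat it as a client-facing interface.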
Assuming this is stable long term, I'm tempted to put in a feature request to let users select whether an interface is upstream, downstream, or both in the DHCP Relay GUI, and to allow tunnel interface types to be selected as upstream interfaces only.
Update: feature request created: https://redmine.pfsense.org/issues/10711