Dual WAN DHCP Issues
-
In the DHCP server configuration, there is this option:

Just set 192.168.100.1 there.
-
@mcury Hello, thank you for the response. I tried adding that setting, but it still never gets an IPv4 address.
One item that I didn't mention originally is that IPv6 is sometimes handed out, but it's sporadic:

It never assigns an IPv4 address, before or after adding in the "reject leases from" setting.
I tried turning off IPv6 on the WAN2 interface, and also deleting the IPv6 WAN2 gateway with no luck:



-
Since it showed the 192.168.100.1 gateway address, it must have pulled a lease containing that address. The lease may not have come from that IP, though. Check the DHCP logs to see what is happening.
-
@stephenw10 The dhclient logs only show activity for WAN1, not WAN2.
I ran a packet capture on WAN2 and this is what I got. I ran the capture after doing a modem reboot.
16:52:56.503025 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:04.561132 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:05.503615 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:07.504406 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:12.507199 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:22.507780 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:33.589897 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:35.121168 IP 192.168.0.1 > 224.0.0.22: igmp
16:53:35.845608 IP 192.168.0.1 > 224.0.0.22: igmp
16:53:38.310160 IP 192.168.0.1 > 224.0.0.22: igmp
16:53:38.525158 IP 192.168.0.1 > 224.0.0.22: igmp
16:53:45.505022 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:45.506322 IP 192.168.100.1.67 > 192.168.100.10.68: UDP, length 300
16:53:53.505188 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:53.506328 IP 192.168.100.1.67 > 192.168.100.10.68: UDP, length 300
16:54:05.731285 ARP, Request who-has 192.168.100.10 tell 192.168.100.10, length 28
16:54:07.178225 ARP, Request who-has 192.168.100.1 tell 192.168.100.10, length 28
16:54:07.179570 ARP, Reply 192.168.100.1 is-at <MAC removed>, length 46
16:54:07.179622 IP 192.168.100.10 > 192.168.100.1: ICMP echo request, id 11720, seq 0, length 64
16:54:07.180623 IP 192.168.100.1 > 192.168.100.10: ICMP echo reply, id 11720, seq 0, length 64
16:54:20.253659 IP 192.168.100.10 > 192.168.100.1: ICMP echo request, id 12980, seq 0, length 9
16:54:20.255227 IP 192.168.100.1 > 192.168.100.10: ICMP echo reply, id 12980, seq 0, length 9
16:54:20.761851 IP 192.168.100.10 > 192.168.100.1: ICMP echo request, id 12980, seq 1, length 9
16:54:20.763220 IP 192.168.100.1 > 192.168.100.10: ICMP echo reply, id 12980, seq 1, length 9
16:54:21.777608 ARP, Request who-has 192.168.100.1 tell 192.168.100.10, length 28
16:54:21.778419 ARP, Reply 192.168.100.1 is-at <MAC removed>, length 46
16:54:21.778477 IP 192.168.100.10 > 192.168.100.1: ICMP echo request, id 12980, seq 2, length 9
16:54:21.779491 IP 192.168.100.1 > 192.168.100.10: ICMP echo reply, id 12980, seq 2, length 9
...
16:54:39.519584 ARP, Request who-has 72.X.X.X tell 72.X.X.X, length 46
...


The interface appears to be on a public subnet, but it doesn't get an address. After doing a reboot of the pfSense, it got IP 192.168.100.10. I rebooted the modem after that, and it got IP 192.168.100.1. I still have the "reject leases from" set to "192.168.100.1,192.168.100.10".
-
Right, so it definitely pulled a lease. There should be something shown in the DHCP log when that happened.
But the pcap shows the DHCP response came from 192.168.100.1, so it should have rejected that offer. Again, I'd expect to see something logged.
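To make that check repeatable, here is a minimal Python sketch that scans tcpdump text output for packets sent from UDP port 67 to port 68 and compares the sender against the reject list. It assumes the reject list is matched against the address the reply was sent from, which may not be exactly what dhclient compares, and the function name `server_replies` is only for illustration. Without verbose decoding, the capture also does not show whether a reply is an offer or an ACK.

```python
import re

# Lines copied from the WAN2 capture posted above (tcpdump text output).
CAPTURE = """\
16:53:33.589897 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:45.505022 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:45.506322 IP 192.168.100.1.67 > 192.168.100.10.68: UDP, length 300
16:53:53.505188 IP 0.0.0.0.68 > 255.255.255.255.67: UDP, length 300
16:53:53.506328 IP 192.168.100.1.67 > 192.168.100.10.68: UDP, length 300
"""

# The "reject leases from" list as configured on WAN2 in this thread.
REJECT = {"192.168.100.1", "192.168.100.10"}

# Matches "<time> IP <src>.<sport> > <dst>.<dport>:" on IPv4 UDP lines.
LINE_RE = re.compile(
    r"^(\S+) IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def server_replies(capture_text):
    """Return (time, server_ip, client_ip) for packets from port 67 to port 68."""
    replies = []
    for line in capture_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # ARP, ICMP, IGMP and other lines are ignored
        when, src, sport, dst, dport = m.groups()
        if sport == "67" and dport == "68":
            replies.append((when, src, dst))
    return replies


replies = server_replies(CAPTURE)
for when, server, client in replies:
    verdict = "on the reject list" if server in REJECT else "not on the reject list"
    print(f"{when} reply from {server} to {client}: {verdict}")
```

In this capture every server reply comes from 192.168.100.1, so no other DHCP server shows up as answering on WAN2.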
-
Are both connections from the same ISP?
-
@chpalmer No they are separate ISPs
-
On a 2100 the WAN has a different MAC so shouldn't be a problem. That can be an issue on the 7100.
But, yes, maybe requires a different client identifier?
-
@stephenw10 First modem reboot:


Before the second modem reboot, I added in "WAN2" to the hostname field:

Second reboot:


I also did a pfSense reboot when on the line with the ISP, and they said that they don't see the modem and router bridging.
-
@pfsense_user1 Have you tried rebooting everything but keeping the first modem offline, to see if there is some kind of issue with both modems having the same management address?
It shouldn't matter, but I'm trying to rule certain things out.
-
The 'dhclient', the DHCPv4 client on your (a) WAN, does what it can to get a lease.
It starts with a PREINIT.
Then, as it 'recalls' the last WAN IPv4 address it had, it emits multiple DHCPREQUESTs (one every second) to validate this most recently used IPv4 address.
No answer came back.
OK, so dhclient goes for an all-new lease: it starts to broadcast DHCPDISCOVERs, so that any available DHCP server can answer the request.
No answer, so an incremental delay is used, and the DHCPDISCOVER is repeated.
Up until delay number '12'.
Still no answer. Silence.
Suddenly, someone replied: 192.168.100.1 (and yeah, it took 27m:42s - 26m:47s = close to a minute for this device to answer). This 192.168.100.1 was also rebooting, which would explain its absence all this time?
Anyway, still no dice, as you've instructed that you don't want to use "192.168.100.1" as a DHCP server.
At this moment, dhcp6c also kicks in (same physical WAN interface? Another one?) and this DHCP client receives an answer right away, like within milliseconds.
At 27m:28 your dhclient (IPv4) (the same client, as the process ID 4488 is the same) continues to send DHCPDISCOVERs, and 192.168.100.1 replies straight away, but dhclient doesn't want this answer (lease proposition). It tells you "No DHCPOFFERS received" and that it will try to reuse the (probably previous) 192.168.100.10 lease ... and from that moment on, no more messages from "4488".
But wait, another dhclient process starts talking (logging) now: 13778. It signals a TIMEOUT ... where did this dhclient process 13778 come from? I didn't see its startup logged.
And another one: 14629, which did succeed, as it says "Starting add_new_address()", which makes me think it got a lease (the next line says 192.168.100.10!).
Now things become even more complex: 14629 - 15890 - 16830 - 17302 - 18101; for me, these are all different process IDs. Incredible. A nice definition of a real mess.
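For what it's worth, the growing delay between retries can also be read off the WAN2 packet capture posted earlier in this thread. A minimal Python sketch, assuming every "0.0.0.0.68 > 255.255.255.255.67" line is one dhclient retry (the capture does not show whether a packet is a DHCPREQUEST or a DHCPDISCOVER); the helper name `retry_gaps` is made up for this example:

```python
from datetime import datetime

# Timestamps of the client broadcasts (0.0.0.0.68 > 255.255.255.255.67),
# copied from the WAN2 capture posted earlier in this thread.
BROADCASTS = [
    "16:52:56.503025",
    "16:53:04.561132",
    "16:53:05.503615",
    "16:53:07.504406",
    "16:53:12.507199",
    "16:53:22.507780",
    "16:53:33.589897",
    "16:53:45.505022",
    "16:53:53.505188",
]


def retry_gaps(timestamps):
    """Return the gap in seconds between each pair of consecutive timestamps."""
    times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


gaps = retry_gaps(BROADCASTS)
print([round(g) for g in gaps])  # prints [8, 1, 2, 5, 10, 11, 12, 8]
```

After the first two packets, the gaps in that capture grow from about 1 second to about 12 seconds before the reply from 192.168.100.1 shows up.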
Questions :
Who is 192.168.100.1 ? a local modem ? Router ?
You don't want to get a lease from this device, why ?
I can't see what other device (DHCP server) (from where?) is actually answering.

Before rebooting: the old way:
Power down everything (pfSense cleanly, using the GUI!!).
Now power up WAN circuit "1"
and WAN circuit "2"
and wait a minute or so for everything to boot.
Then, power up pfSense.

Another thing to try: get two simple small switches.
Between each ISP WAN device and the related pfSense WAN, place a switch. One for each WAN. I know, this seems stupid and is less optimal, but it can help, because if the ISP device senses that its LAN gets triggered (as pfSense wakes up and activates its/a WAN), it will reboot, creating some sort of loop of 'restarts'.

The final, best solution would be: as this is a typical situation where the ISP device isn't really foolproof, and needs 'hand holding' and manual intervention at random times, exercise your ultimate consumer's right: pick a better one. This time you know that 'speed' and 'price' are not the most important selection criteria. It's overall stability and ease of access.
And yes, I know, sometimes you don't have a choice.
-
Ah OK, you can see it's trying to use old lease data after failing to get a new lease. Look for /var/db/dhclient.leases.mvneta1.666. That file should show all the recent leases, and it's trying the most recent one.
Either remove the 192.168.100.1 leases from there or just remove that file entirely.
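If you would rather prune the file than delete it, something along these lines could work. This is only a sketch in Python: it assumes the file is a series of "lease { ... }" blocks with no nested braces, each carrying an "option dhcp-server-identifier" line, and the sample text below is hypothetical, not taken from the real file. The name `drop_leases_from` is made up for this example.

```python
import re

# One "lease { ... }" block; assumes no nested braces inside a lease.
LEASE_RE = re.compile(r"lease\s*\{.*?\}\s*", re.DOTALL)


def drop_leases_from(text, server_ip):
    """Return the lease-file text without the leases issued by server_ip."""
    ident = re.compile(
        r"option\s+dhcp-server-identifier\s+" + re.escape(server_ip) + r"\s*;"
    )

    def keep_or_drop(match):
        block = match.group(0)
        return "" if ident.search(block) else block

    return LEASE_RE.sub(keep_or_drop, text)


# Hypothetical sample content, for illustration only.
SAMPLE = """\
lease {
  interface "mvneta1.666";
  fixed-address 192.168.100.10;
  option routers 192.168.100.1;
  option dhcp-server-identifier 192.168.100.1;
}
lease {
  interface "mvneta1.666";
  fixed-address 203.0.113.25;
  option routers 203.0.113.1;
  option dhcp-server-identifier 203.0.113.1;
}
"""

cleaned = drop_leases_from(SAMPLE, "192.168.100.1")
print(cleaned)
```

It is safer to run this against a copy of the lease file and compare the result before replacing anything. Simply removing the file, as suggested above, is the simpler route.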