DHCP not working / self-assigned IP address
-
If you have a switch on LAN and both units are plugged into LAN, it should start to work. Otherwise, disable DHCP failover.
Yes, that's what I did to begin with.
The two units are connected to each other via their OPT1 interfaces, the two LAN interfaces are plugged into the same switch, and my laptop and the other computers are plugged into that switch as well. Our single uplink is plugged into the WAN interface on the first (master) pfSense box. I can reach the WebGUI of the master at 172.17.67.2 and the WebGUI of the backup at 172.17.67.3, and 172.17.67.1 takes me to the master. I'm pretty sure the failover IP in the DHCP settings was set to 172.17.67.3. Maybe it should be 172.17.67.1?
-
No, the failover IP in DHCP should point to the real LAN IP of the other box. So on the main unit, it points to the LAN IP of the backup, and vice versa.
If you go to Status > DHCP Leases, what does the failover status at the top show?
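To make the failover IP rule above concrete, here is a minimal Python sketch that checks a pair of units against it. The `Unit` class and `check_failover_peers` function are hypothetical names made up for illustration (they are not part of pfSense), and the addresses are the ones mentioned earlier in this thread.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    lan_ip: str         # the unit's own (real) LAN address
    failover_peer: str  # value entered as "failover IP" in the DHCP settings


def check_failover_peers(a: Unit, b: Unit, carp_vip: str) -> list:
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    for unit, other in ((a, b), (b, a)):
        if unit.failover_peer == carp_vip:
            problems.append(
                f"{unit.name}: failover IP is the CARP VIP {carp_vip}, "
                f"expected {other.lan_ip}"
            )
        elif unit.failover_peer == unit.lan_ip:
            problems.append(f"{unit.name}: failover IP points at itself")
        elif unit.failover_peer != other.lan_ip:
            problems.append(
                f"{unit.name}: failover IP {unit.failover_peer} is not "
                f"the LAN IP of {other.name} ({other.lan_ip})"
            )
    return problems


master = Unit("master", lan_ip="172.17.67.2", failover_peer="172.17.67.3")
backup = Unit("backup", lan_ip="172.17.67.3", failover_peer="172.17.67.2")
print(check_failover_peers(master, backup, carp_vip="172.17.67.1"))  # prints []
```

With the failover IP set to 172.17.67.1 on either unit, the list would contain one entry for that unit instead of being empty.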
-
I made sure that the DHCP server settings matched on each unit, except for the "failover IP", which was set to 172.17.67.3 on the master (172.17.67.2) and 172.17.67.2 on the backup (172.17.67.3). In the DHCP leases, this is what I saw:
Diagnostics: DHCP leases
Failover Group   My State   Since                 Peer State      Since
"dhcp0"          recover    2000/09/19 16:19:48   unknown-state   2000/09/19 16:19:48
DHCP still wasn't working. When I removed the failover IP settings from both units, DHCP started working, but the two units were handing out leases independently of each other...
-
Hmm, that definitely is the problem then.
But with the failover IPs set appropriately and both units seeing each other on LAN IPs, it should have been working. Can you ping the LAN IP of each box from the other?
Did the CARP IP on LAN show up as MASTER/BACKUP properly?
-
Did the CARP IP on LAN show up as MASTER/BACKUP properly?
What do you mean by this? Where do I check it?
I will try the "ping host" option from the console for you when I get back to the datacenter...
-
Status > CARP.
-
I was able to ping the backup from the master, and vice versa.
CARP status on 172.17.67.2:
Carp Interface   Virtual IP    Status
carp0            172.17.67.1   MASTER
pfSync nodes: 1a0b57df 2203c3b2 2e13f44e
CARP status on 172.17.67.3:
Carp Interface   Virtual IP    Status
carp0            172.17.67.1   BACKUP
pfSync nodes: 3196ee34 ce7021a3
-
What do your LAN rules look like on both boxes?
Anything in the DHCP tab of the system logs on either side?
-
What do your LAN rules look like on both boxes?
According to my config files:
<rule>
  <type>pass</type>
  <descr>Default LAN -> any</descr>
  <interface>lan</interface>
  <source><network>lan</network></source>
  <destination><any></any></destination>
</rule>
After looking through the config files, I noticed this:
<interfaces>
  [...]
  <opt1>
    <descr>Sync</descr>
    <if>vr2</if>
    <bridge></bridge>
    <enable></enable>
    <ipaddr>192.168.2.2</ipaddr>
    <subnet>24</subnet>
    <gateway></gateway>
    <spoofmac></spoofmac>
  </opt1>
</interfaces>
Likewise, the backup has 192.168.2.3 in it. Should those be addresses in my 172.17.67.0/24 subnet? If so, should they be unique, like 172.17.67.4 and .5, or should they match the LAN IP address of each unit?
-
LAN rule should be fine then…
As for the other part, that's the sync interface, which needs its own subnet, so 192.168.2.2 and 192.168.2.3 should not be moved into 172.17.67.0/24.
The DHCP traffic doesn't use the sync interface; it stays completely on LAN.
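As a quick sanity check of the "own subnet" point, the following Python sketch uses the standard `ipaddress` module to confirm that no two interfaces on a unit share a subnet. The `overlapping_pairs` helper and the dictionary layout are assumptions made for illustration; the addresses are the ones quoted in this thread.

```python
import ipaddress


def overlapping_pairs(interfaces: dict) -> list:
    """Return pairs of interface names whose subnets overlap.

    `interfaces` maps an interface name to an "address/prefix" string.
    """
    nets = {
        name: ipaddress.ip_interface(addr).network
        for name, addr in interfaces.items()
    }
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]


# Master unit as configured in this thread: LAN and sync on separate subnets.
master = {"lan": "172.17.67.2/24", "sync": "192.168.2.2/24"}
print(overlapping_pairs(master))  # prints []

# Hypothetical misconfiguration: sync address moved into the LAN subnet.
bad = {"lan": "172.17.67.2/24", "sync": "172.17.67.4/24"}
print(overlapping_pairs(bad))  # prints [('lan', 'sync')]
```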
-
I restored the configurations you sent me onto a pair of VMs and my DHCP worked OK. I pulled an IP, and the status was normal. So it's not making a whole lot of sense why it isn't working for you as-is. I still have a couple more things to test though.
What kind of switch do you have on LAN? Was anything else plugged into it besides the Netgate boxes and your laptop?
-
What kind of switch do you have on LAN? Was anything else plugged into it besides the Netgate boxes and your laptop?
The switch is a Cisco SGE2000. Two other servers and their LOM interfaces are plugged into the switch, so 7 ports are in use if you count my laptop.
-
I just made a VM CARP pair with NanoBSD images and restored your configuration there, and it also worked. (My previous test was with full-install VMs.)
Is there something on the switch that might be impairing DHCP? Can you try a cheap "dumb" switch (non-managed) temporarily to see if it behaves differently? The servers aren't trying to also be DHCP servers, are they?
In every case I tried, I was able to pull an IP, and when I checked the DHCP leases, the failover status on both was "normal".
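If it helps to see which DHCP servers actually answer on that LAN segment, here is a rough Python sketch, not a polished tool. It builds a minimal DHCPDISCOVER (RFC 2131 layout), broadcasts it, and collects the source addresses of any replies. The function names are made up for illustration. It assumes it is run with enough privileges to bind UDP port 68, that no DHCP client on the test machine is already holding that port, and that the broadcast leaves through the interface facing the LAN switch.

```python
import os
import socket
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options


def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER packet, padded to 300 bytes."""
    if len(mac) != 6:
        raise ValueError("mac must be exactly 6 bytes")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,              # op: BOOTREQUEST
        1,              # htype: Ethernet
        6,              # hlen: MAC address length
        0,              # hops
        xid,            # transaction ID, echoed back by servers
        0,              # secs
        0x8000,         # flags: ask servers to reply by broadcast
        b"\x00" * 4,    # ciaddr
        b"\x00" * 4,    # yiaddr
        b"\x00" * 4,    # siaddr
        b"\x00" * 4,    # giaddr
        mac,            # chaddr (struct pads it to 16 bytes)
        b"",            # sname (64 zero bytes)
        b"",            # file (128 zero bytes)
    )
    # Option 53 (message type) = 1 (DISCOVER), then option 255 (end).
    options = bytes([53, 1, 1, 255])
    # Pad with zero bytes up to the traditional 300-byte BOOTP minimum.
    return (header + MAGIC_COOKIE + options).ljust(300, b"\x00")


def find_dhcp_servers(mac: bytes, timeout: float = 3.0) -> set:
    """Broadcast one DISCOVER and return the IPs that send a reply."""
    xid = int.from_bytes(os.urandom(4), "big")
    servers = set()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)  # stop once no reply arrives for `timeout` s
        sock.bind(("", 68))
        sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
        while True:
            try:
                data, (addr, _port) = sock.recvfrom(2048)
            except socket.timeout:
                break
            # Keep only BOOTREPLY packets that echo our transaction ID.
            if len(data) >= 8 and data[0] == 2 and data[4:8] == xid.to_bytes(4, "big"):
                servers.add(addr)
    finally:
        sock.close()
    return servers
```

Calling `find_dhcp_servers(bytes.fromhex("020000000001"))` (a made-up, locally administered MAC) from a machine on the LAN switch would return the set of addresses that answered. An empty set would be consistent with no DHCP server replying at all, while an address other than the two pfSense units would point to another DHCP server on the segment.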
-
Is there something on the switch that might be impairing DHCP?
I have no idea ;) Because of this DHCP issue, I haven't been able to log into the switch to check. I've never used a Cisco managed switch before, anyway... Is there anything in particular I should look for in its configuration that might be causing problems?
Can you try a cheap "dumb" switch (non-managed) temporarily to see if it behaves differently?
Yes, I'll give it a shot.
The servers aren't trying to also be DHCP servers, are they?
no
-
Not sure exactly what it might be in the switch config, but testing against a small unmanaged switch would let us know if that's even the right direction to be looking.
-
Also, the contents of the DHCP log from both the master and the backup will help. Status > System Logs, DHCP tab.
-
I tried with an ASUS GX-D1081... no luck. And there is absolutely nothing in the DHCP logs.
-
Very odd that it is completely blank in the logs. Usually there is at least a startup message.
Not sure how much trouble it would be, but is there any way you can reimage those CF cards with a stock pfSense 1.2.3 image and restore your config? I wonder if it might be something specific to the Netgate images that were preloaded.
-
:\ That is a lot of trouble for me, unless there is a step by step guide available.
-
Not really that difficult, just time-consuming, and it requires a box with a CF reader.
I've got copies of the Netgate images here somewhere, I'll see if I can image them to a VM and try that instead.