DHCP not working / self-assigned IP address
-
What do your LAN rules look like on both boxes?
Anything in the DHCP tab of the system logs on either side?
-
What do your LAN rules look like on both boxes?
according to my config files:
<rule><type>pass</type> <descr>Default LAN -> any</descr> <interface>lan</interface> <source><network>lan</network></source> <destination><any></any></destination></rule>
after looking through the config files, i noticed this:
<interfaces>[...] <opt1><descr>Sync</descr> <if>vr2</if> <bridge></bridge> <enable></enable> <ipaddr>192.168.2.2</ipaddr> <subnet>24</subnet> <gateway></gateway> <spoofmac></spoofmac></opt1></interfaces>
likewise, the backup has 192.168.2.3 in it. should those be addresses in my 172.17.67.0/24 subnet? should they be unique, like 172.17.67.4 and .5, or should they match the LAN IP address of each unit?
-
LAN rule should be fine then…
As for the other part, that's the sync interface, which needs its own subnet.
The DHCP traffic doesn't use the sync interface, it stays completely on LAN.
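To double-check that the sync subnet really is separate from LAN, here is a minimal sketch using Python's standard ipaddress module and the two subnets mentioned in this thread. The helper name subnets_overlap is only for illustration.

```python
import ipaddress

def subnets_overlap(a: str, b: str) -> bool:
    """Return True if the two CIDR networks share any addresses."""
    return ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))

lan = "172.17.67.0/24"   # LAN subnet from the thread
sync = "192.168.2.0/24"  # sync subnet implied by 192.168.2.2 and 192.168.2.3 with a /24 mask

print(subnets_overlap(lan, sync))  # False: the sync interface has its own subnet
```

Since the two do not overlap, the 192.168.2.x addresses on the sync interface are fine as they are.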
-
I restored the configurations you sent me onto a pair of VMs and my DHCP worked OK. I pulled an IP, and the status was normal. So it's not making a whole lot of sense why it isn't working for you as-is. I still have a couple more things to test though.
What kind of a switch do you have on LAN? Was anything else plugged into it besides the netgate boxes and your laptop?
-
What kind of a switch do you have on LAN? Was anything else plugged into it besides the netgate boxes and your laptop?
The switch is a Cisco SGE2000. Two other servers and their LOM interfaces are plugged into the switch…. so, 7 ports are being used on the switch if you count my laptop.
-
I just made a VM CARP pair with NanoBSD images and restored your configuration there and it also worked. (My previous test was with full install VMs)
Is there something on the switch that might be impairing DHCP? Can you try a cheap "dumb" switch (non-managed) temporarily to see if it behaves differently? The servers aren't trying to also be DHCP servers, are they?
In every case I tried, I was able to pull an IP, and when I checked the DHCP leases, the failover status on both was "normal".
-
Is there something on the switch that might be impairing DHCP?
i have no idea ;) because of this DHCP issue, i haven't been able to log into the switch to see. i've never used a cisco managed switch before, anyway… is there anything in particular i should look for in its configuration that might be causing problems?
Can you try a cheap "dumb" switch (non-managed) temporarily to see if it behaves differently?
yes, i'll give it a shot.
The servers aren't trying to also be DHCP servers, are they?
no
-
Not sure exactly what it might be in the switch config, but testing against a small unmanaged switch would let us know if that's even the right direction to be looking.
-
Also, the contents of the DHCP log from master and slave will help. Status > System Logs, DHCP tab.
-
i tried with an ASUS GX-D1081… no such luck. and, there is absolutely nothing in the DHCP logs.
-
Very odd that it is completely blank in the logs. Usually there is at least a startup message.
Not sure how much trouble it would be, but is there any way you can reimage those CF cards with a stock pfSense 1.2.3 image and restore your config? I wonder if it might be something specific to the Netgate images that were preloaded.
-
:\ That is a lot of trouble for me, unless there is a step by step guide available.
-
Not really that difficult, just time consuming really, and requires a box with a CF reader.
I've got copies of the Netgate images here somewhere, I'll see if I can image them to a VM and try that instead.
-
Good news, and bad news.
Good news is I finally reproduced it and have a fix. The bad news is that it is specific to your configuration :-)
When you are using routers in a CARP pair, you can't use Proxy ARP VIPs. These were syncing into the backup config as empty <vip> tags, which was tricking the DHCP server there into thinking it should be primary.
So DHCP didn't work because they both thought they were primary.
The fix could either be to switch to 'other' type VIPs, and define them on both master and backup, or switch them to CARP if you can. Proxy ARP can't work on both routers in a failover configuration.
The other fix, if you really need proxy arp on the main unit, is to patch services.inc to handle the empty tags better. I have a patch if you need it.
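If you want to check a backup config for this yourself, here is a rough sketch in Python that counts empty vip elements. It is not the services.inc patch. The SAMPLE string is a made-up, cut-down config (the address in it is a documentation example), and the helper name empty_vip_count is hypothetical.

```python
import xml.etree.ElementTree as ET

def empty_vip_count(config_xml: str) -> int:
    """Count <vip> elements that have no child elements and no text."""
    root = ET.fromstring(config_xml)
    return sum(
        1
        for vip in root.iter("vip")
        if len(vip) == 0 and not (vip.text or "").strip()
    )

# Hypothetical, cut-down config: one empty entry and one populated CARP entry.
SAMPLE = """
<pfsense>
  <virtualip>
    <vip></vip>
    <vip><mode>carp</mode><subnet>198.51.100.20</subnet></vip>
  </virtualip>
</pfsense>
"""

print(empty_vip_count(SAMPLE))  # 1
```

Anything above zero on the backup would point at the empty tags described above.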
-
Thank you for discovering this. Seems like a bug to me. Is this only for the Netgate installation, or does it happen with a standard installation also?
i don't know if i can use the type "Other" VIP. i have these public IP addresses which will be tied to various hostnames. each public IP address will need to be mapped to various internal IP addresses, depending on what service is accessed. will "Other" VIPs function this way?
-
It happened with the normal images as well once I looked deeper. I don't know why it was OK when I first restored your configuration, but it's possible that dhcpd hadn't reloaded after the empty tags sync'd over.
If your other IPs are in the same subnet as WAN, use CARP. If you have another subnet of IPs and those IPs are routed to your WAN IP, an 'other' type will work.
Using proxy arp IPs with CARP isn't considered a valid configuration, which is why nobody else seems to have hit this.
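The rule of thumb above (same subnet as WAN means CARP, a separate routed subnet means 'other') can be sketched like this. The addresses are documentation examples, not your real ones, and suggest_vip_type is a hypothetical helper. It cannot tell whether a separate subnet is actually routed to your WAN IP, so treat 'other' as a suggestion only.

```python
import ipaddress

def suggest_vip_type(vip: str, wan_network: str) -> str:
    """Suggest 'carp' for a VIP inside the WAN subnet, otherwise 'other'."""
    addr = ipaddress.ip_address(vip)
    net = ipaddress.ip_network(wan_network)
    return "carp" if addr in net else "other"

wan = "198.51.100.16/28"  # example WAN subnet (assumption)

print(suggest_vip_type("198.51.100.20", wan))  # carp
print(suggest_vip_type("203.0.113.5", wan))    # other
```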
-
ok… i will try changing them all to CARP VIPs tonight, if possible, and let you know how it fares.
-
i changed them all to CARP VIPs but, still no luck. i put each address into its own VHID group number but i used the same password for all of them… the password doesn't seem to matter anyway. on 'status -> dhcp leases' the state is still "recover". i notice that if i go to 'status -> CARP', the status for all of the public VIPs is "MASTER" on both the master and backup firewalls. i'm sorry i obviously don't know what i'm doing ;) but i'll learn in time. i'll send ya my current configs. is there any way i can make the firewall config accessible from outside the datacenter? then i don't keep having to make trips back and forth each time....
-
I restored the new configuration files you sent, and all of the CARP IPs show up properly; master on master, and backup on backup. The DHCP daemon came up properly this time too, in a normal state on both sides.
However, I did notice that the WAN IP is set to the same IP address on both of them (x.x.x.16) when it should be different on each box (if you want to use .16 as the shared IP, it must also be a CARP IP), and also the subnet mask of all those CARP IPs should be /28, not /32.
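As a sketch of those two checks (unique WAN IP per box, and CARP VIP masks matching the WAN mask), something like the following would flag both problems. The addresses are placeholders from a documentation range, since the real ones are masked here, and check_carp_wan is a hypothetical helper, not anything built into pfSense.

```python
import ipaddress

def check_carp_wan(master_wan: str, backup_wan: str, vips: list) -> list:
    """Return a list of problems found in the WAN/CARP addressing."""
    problems = []
    m = ipaddress.ip_interface(master_wan)
    b = ipaddress.ip_interface(backup_wan)
    if m.ip == b.ip:
        problems.append("WAN IP is the same on both boxes")
    for vip in vips:
        v = ipaddress.ip_interface(vip)
        if v.network.prefixlen != m.network.prefixlen:
            problems.append(
                f"{v.ip}: mask /{v.network.prefixlen} should be /{m.network.prefixlen}"
            )
    return problems

# Placeholder addresses: same WAN IP on both boxes and a /32 CARP VIP.
print(check_carp_wan("198.51.100.18/28", "198.51.100.18/28", ["198.51.100.20/32"]))

# Corrected layout: distinct WAN IPs, VIP using the WAN's /28 mask.
print(check_carp_wan("198.51.100.18/28", "198.51.100.19/28", ["198.51.100.20/28"]))  # []
```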
-
i don't get why the config works for you but not for me… i even tried rebooting the two boxes and still everything shows up as MASTER. maybe i'll ask the datacenter support staff to look at it.