DHCP not working / self-assigned IP address



  • hello all,

    we purchased the 1u alix.2d3 dual server from netgate, it has two pfsense boxes and the idea was to configure them for failover.  however, because of the way our ISP's network is constructed, we couldn't accomplish it without purchasing some more hardware.  so, i was planning on just using a single pfsense box and at least having the config mirrored to the second pfsense box.

    i plugged our uplink into the WAN port, linked together the two OPT1 ports, plugged my laptop into the LAN port, and turned on the two pfsense units.  unfortunately, my laptop's eth0 didn't receive an address from DHCP, instead it got a 'self-assigned address'… 169.whatever.  after i manually assigned it an address on the correct subnet ('ifconfig eth0 192.168.1.23'), i was able to connect to the webserver of the first pfsense box (http://192.168.1.2) and go through the setup wizard, following the instructions that i got from netgate.

    i changed the internal IP of the pfsense box to some 172. address, etc., and completed the setup wizard, saved the changes, and restarted the pfsense box.  i reconnected the ethernet cable between my laptop and the pfsense box and my laptop still got a self-assigned address.  so again i manually assigned it a 172 address on the same subnet and was able to get to the web interface again.

    i also couldn't look up any hostnames / access the internet yet…

    i tried connecting the LAN port on the pfsense box to our switch and hooked up some of the other computers.  i observed on one of them that it was also getting a self-assigned address.

    what the heck is going on?  according to the netgate instructions, my laptop should have been assigned an address from the DHCP server from the start... but that never happened.  the DHCP server is turned on and all the settings look fine.

    thanks.



  • Have you checked the DHCP log on pfSense?

    When you changed the laptop's IP address to 172.x.x.x did you also change default gateway and DNS server?

    What software does your laptop run? There is a known problem with DHCP on Windows Vista. (There are two laptops in my home running Vista that wouldn't act on a DHCPOFFER until I made a registry change.)



  • i did check the DHCP log in pfsense, but nothing in it caught my attention…. but, that doesn't really mean anything since i'm new to this.  i'll have to take a trip back to the datacenter to get a copy of it, though.

    my laptop is running debian and i wasn't sure exactly how to manually define the gateway and DNS server.  i am running gnome on it, so the network-manager might be interfering.

    the other computer that i checked, which was also getting a self-assigned address, is running os x server.  i ran nmap from my laptop and didn't get any responses from the other computers which are set up for DHCP, indicating to me that they are also getting self-assigned addresses.



  • @scar:

    i did check the DHCP log in pfsense, but nothing in it caught my attention…. but, that doesn't really mean anything since i'm new to this.

    In the web GUI: Status -> System logs, click on the DHCP tab, should show entries like: DHCPREQUEST … from <MAC address of your laptop's NIC> via <pfSense LAN interface name> followed by a DHCPOFFER.

    On your laptop, if you give the shell command tcpdump -i eth0 you should see the DHCP requests and responses.

    These checks verify that pfSense is seeing the DHCP requests and the laptop is seeing the responses.

    @scar:

    my laptop is running debian and i wasn't sure exactly how to manually define the gateway and DNS server.

    If you don't have gateway and DNS server correctly defined it will be difficult to get "full" access to the internet.



  • @wallabybob:

    In the web GUI: Status -> System logs, click on the DHCP tab, should show entries like: DHCPREQUEST … from <MAC address of your laptop's NIC> via <pfSense LAN interface name> followed by a DHCPOFFER.

    no, nothing showed up in the log.  there were entries in it but, since the time has not been synced, i wasn't sure when they occurred.  so, i cleared the log and tried replugging in my laptop to the LAN port on the first pfsense box.  after my laptop obtained the 'self-assigned address', i manually assigned it a 172 IP, and refreshed the DHCP log in pfsense.  there were no events logged.

    @wallabybob:

    On your laptop, if you give the shell command tcpdump -i eth0 you should see the DHCP requests and responses.

    here is the output of that command.  i started the command, then plugged in the cable, then killed the command shortly after obtaining the self-assigned address.



  • Do you have DHCP server enabled on the pfSense LAN interface? Does the pfSense LAN interface have a static IP address?

    The tcpdump output you posted shows a number of DHCP requests with no response (e.g. 20:15:27.264460 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:46:e4:a9:3e (oui Unknown), length 300)

    It also shows a number of ARP requests for 172.17.67.3 with no response. This suggests that 172.17.67.3 is not connected to your laptop. Is 172.17.67.3 the IP address of the pfSense LAN interface? If not, what has IP address 172.17.67.3 and why is your laptop trying to talk with it?

    You seem to have tried a number of different configurations and it's now not clear to me what is what. I suggest you work with:
    pfSense LAN IP: 192.168.1.x/24 with DHCP server enabled and a DHCP address range from 192.168.1.x/24
    pfSense LAN NIC connected to switch connected to laptop. (It's not certain that straight through cables will work without the switch so use a switch to reduce the unknowns.)
    Restart pfSense. On pfSense console start a trace: tcpdump -i re0 port 67 (where you will have to replace re0 by the system name of your LAN interface, it will be something like re0 or vr0 or vr1 or …)
    THEN restart laptop. If laptop doesn't get IP address from pfSense take a look at the pfSense console. You should see a DHCP request like the one above and a response. If you don't see that then check the pfSense interface is UP and RUNNING and check you have the switch plugged into the correct port on the pfSense box.



  • 172.17.67.3 is the address for the second pfsense box, the backup. it is connected to the first pfsense box, 172.17.67.2, via OPT1 interface.

    i guess i'll verify the DHCP server is enabled and try your other suggestions when i go back to the datacenter :\  i had the LAN interface on both pfsense boxes plugged into the switch, which is where all of the computers are plugged into but, when i ran that tcpdump command, it was picking up traffic from the other computers so, i just plugged my laptop directly into the LAN interface on the first pfsense box.



  • @scar:

    172.17.67.3 is the address for the second pfsense box, the backup. it is connected to the first pfsense box, 172.17.67.2, via OPT1 interface.

    Then I suspect that you plugged the laptop into the OPT1 interface (which would explain why there was no response to the ARP who-has 172.17.67.3 from 172.17.67.2 and no DHCP response). Note that the default firewall rules block access from OPT1 - hence your laptop wouldn't be able to access the internet if it was plugged into OPT1 (unless you deliberately created firewall rules to allow that).

    Since you don't seem to be able to use the second pfSense box at present, why not remove it so you have a basic configuration which should be easier to get working. Then when you get a bit more experience and confidence and the additional equipment you can go on to bigger things.


  • Rebel Alliance Developer Netgate

    If you have DHCP failover configured (check the failover peer IP address) and DHCP can't reach its failover peer, it does not hand out addresses. It may be that you are hitting this.



  • @wallabybob:

    @scar:

    172.17.67.3 is the address for the second pfsense box, the backup. it is connected to the first pfsense box, 172.17.67.2, via OPT1 interface.

    Then I suspect that you plugged the laptop into the OPT1 interface

    i definitely plugged into the LAN interface.

    @wallabybob:

    Since you don't seem to be able to use the second pfSense box at present, why not remove it so you have a basic configuration which should be easier to get working. Then when you get a bit more experience and confidence and the additional equipment you can go on to bigger things.

    That's true, but the pfsense boxes came configured like this from Netgate.  i guess i can look into resetting them back to defaults or something, but i'm sure that'll just open up another can of worms for me ;)

    I will check on the other things tonight, if possible, at the datacenter.


  • Rebel Alliance Developer Netgate

    @scar:

    That's true, but the pfsense boxes came configured like this from Netgate.  i guess i can look into resetting them back to defaults or something, but i'm sure that'll just open up another can of worms for me ;)

    I will check on the other things tonight, if possible, at the datacenter.

    …and I'm the one who made their embedded images they put on those devices ;)

    If you have a switch on LAN and both units are plugged into LAN, it should start to work. Otherwise, disable DHCP failover.



  • @jimp:

    If you have a switch on LAN and both units are plugged into LAN, it should start to work. Otherwise, disable DHCP failover.

    yes, that's what i did to begin with.

    the two units are connected to each other via the OPT1 interfaces, the two LAN interfaces are plugged into the same switch, and my laptop and the other computers are plugged into the switch as well.  our single uplink is plugged into the WAN interface on the first (master) pfsense box.  i can access the webgui of the master pfsense by using 172.17.67.2 and i can access the webgui of the backup by using 172.17.67.3, and when i use 172.17.67.1 it goes to the master.  i'm pretty sure the failover IP in the DHCP settings was set to 172.17.67.3, maybe it should be 172.17.67.1?


  • Rebel Alliance Developer Netgate

    No, the failover IP in DHCP should point to the real LAN IP of the other box. So on the main unit, it points to the LAN IP of the backup, and vice versa.

    If you go to Status > DHCP Leases, what does the failover status at the top show?
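
    A minimal sketch of the pairing rule above, not anything pfSense itself runs. The dictionary keys `lan_ip` and `failover_peer` and the function name are hypothetical; the addresses are the ones quoted earlier in this thread (master 172.17.67.2, backup 172.17.67.3, shared CARP address 172.17.67.1).

```python
import ipaddress


def check_failover_peers(master, backup, carp_vip):
    """Return a list of problems with a two-node DHCP failover pairing.

    Each node is a dict with the hypothetical keys 'lan_ip' and
    'failover_peer'. The rule checked: every unit's failover peer must be
    the real LAN IP of the other unit, never the shared CARP address.
    """
    problems = []
    vip = ipaddress.ip_address(carp_vip)
    for name, node, other in (("master", master, backup),
                              ("backup", backup, master)):
        peer = ipaddress.ip_address(node["failover_peer"])
        if peer == vip:
            problems.append(f"{name}: failover peer is the shared CARP VIP {vip}")
        elif peer != ipaddress.ip_address(other["lan_ip"]):
            problems.append(
                f"{name}: failover peer {peer} is not the other unit's "
                f"LAN IP {other['lan_ip']}"
            )
    return problems


# Addresses as described in this thread.
master = {"lan_ip": "172.17.67.2", "failover_peer": "172.17.67.3"}
backup = {"lan_ip": "172.17.67.3", "failover_peer": "172.17.67.2"}

print(check_failover_peers(master, backup, "172.17.67.1"))  # []
```

    Pointing either unit at 172.17.67.1 instead would be reported as a problem, which is the case asked about above.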



  • i made sure that the DHCP server settings matched on each unit, save for the "failover IP", which was set to 172.17.67.3 on the master (172.17.67.2) and 172.17.67.2 on the backup (172.17.67.3).  in the DHCP leases, this is what i saw:

    Diagnostics: DHCP leases
    Failover Group    	My State    	Since    	Peer State    	Since   
    "dhcp0"  	recover  	2000/09/19 16:19:48  	unknown-state  	2000/09/19 16:19:48 
    

    DHCP still wasn't working.  when i removed the failover IP settings from both units, DHCP started working, but both units were separately handing out leases…..


  • Rebel Alliance Developer Netgate

    Hmm, that definitely is the problem then.

    But with the failover IPs set appropriately and both units seeing each other on LAN IPs, it should have been working. Can you ping the LAN IP of each box from the other?

    Did the CARP IP on LAN show up as MASTER/BACKUP properly?



  • @jimp:

    Did the CARP IP on LAN show up as MASTER/BACKUP properly?

    What do you mean by this, where do i check it?

    I will try using the "ping host" option from the console for you when i get back to the datacenter…


  • Rebel Alliance Developer Netgate

    Status > CARP.



  • i was able to ping the backup from the master, and vice-versa.

    CARP status on 172.17.67.2:

    
    Carp Interface  Virtual IP  Status
    carp0           172.17.67.1 MASTER 
    
    pfSync nodes:
    
    1a0b57df
    2203c3b2
    2e13f44e
    
    

    CARP status on 172.17.67.3:

    
    Carp Interface  Virtual IP  Status
    carp0           172.17.67.1 BACKUP 
    
    pfSync nodes:
    
    3196ee34
    ce7021a3
    
    

  • Rebel Alliance Developer Netgate

    What do your LAN rules look like on both boxes?

    Anything in the DHCP tab of the system logs on either side?



  • @jimp:

    What do your LAN rules look like on both boxes?

    according to my config files:

    <rule>
        <type>pass</type>
        <descr>Default LAN -> any</descr>
        <interface>lan</interface>
        <source>
            <network>lan</network>
        </source>
        <destination>
            <any/>
        </destination>
    </rule>
    
    

    after looking through the config files, i noticed this:

    <interfaces>[...]
        <opt1>
            <descr>Sync</descr>
            <if>vr2</if>
            <bridge/>
            <enable/>
            <ipaddr>192.168.2.2</ipaddr>
            <subnet>24</subnet>
            <gateway/>
            <spoofmac/>
        </opt1>
    </interfaces>
    
    

    likewise, the backup has 192.168.2.3 in it.  should those be addresses in my 172.17.67.0/24 subnet?  should they be unique, like 172.17.67.4 and .5, or should they match the LAN IP address of each unit?


  • Rebel Alliance Developer Netgate

    LAN rule should be fine then…

    As for the other part, that's the sync interface, which needs its own subnet.

    The DHCP traffic doesn't use the sync interface, it stays completely on LAN.


  • Rebel Alliance Developer Netgate

    I restored the configurations you sent me onto a pair of VMs and my DHCP worked OK. I pulled an IP, and the status was normal. So it's not making a whole lot of sense why it isn't working for you as-is. I still have a couple more things to test though.

    What kind of a switch do you have on LAN? Was anything else plugged into it besides the netgate boxes and your laptop?



  • @jimp:

    What kind of a switch do you have on LAN? Was anything else plugged into it besides the netgate boxes and your laptop?

    The switch is a Cisco SGE2000.  Two other servers and their LOM interfaces are plugged into the switch…. so, 7 ports are being used on the switch if you count my laptop.


  • Rebel Alliance Developer Netgate

    I just made a VM CARP pair with NanoBSD images and restored your configuration there and it also worked. (My previous test was with full install VMs)

    Is there something on the switch that might be impairing DHCP? Can you try a cheap "dumb" switch (non-managed) temporarily to see if it behaves differently? The servers aren't trying to also be DHCP servers, are they?

    In every case I tried, I was able to pull an IP, and when I checked the DHCP leases, the failover status on both was "normal".



  • @jimp:

    Is there something on the switch that might be impairing DHCP?

    i have no idea ;)  because of this DHCP issue, i haven't been able to log into the switch to see.  i've never used a cisco managed switch before, anyway… is there anything in particular i should look for in its configuration that might be causing problems?

    Can you try a cheap "dumb" switch (non-managed) temporarily to see if it behaves differently?

    yes, i'll give it a shot.

    The servers aren't trying to also be DHCP servers, are they?

    no


  • Rebel Alliance Developer Netgate

    Not sure exactly what it might be in the switch config, but testing against a small unmanaged switch would let us know if that's even the right direction to be looking.


  • Rebel Alliance Developer Netgate

    Also, the contents of the DHCP log from master and slave will help. Status > System Logs, DHCP tab.



  • i tried with an ASUS GX-D1081… no such luck.  and, there is absolutely nothing in the DHCP logs.


  • Rebel Alliance Developer Netgate

    Very odd that it is completely blank in the logs. Usually there is at least a startup message.

    Not sure how much trouble it would be, but is there any way you can reimage those CF cards with a stock pfSense 1.2.3 image and restore your config? I wonder if it might be something specific to the Netgate images that were preloaded.



  • :\  That is a lot of trouble for me, unless there is a step by step guide available.


  • Rebel Alliance Developer Netgate

    Not really that difficult, just time consuming really, and requires a box with a CF reader.

    I've got copies of the Netgate images here somewhere, I'll see if I can image them to a VM and try that instead.


  • Rebel Alliance Developer Netgate

    Good news, and bad news.

    Good news is I finally reproduced it and have a fix. The bad news is that it is specific to your configuration :-)

    When you are using routers in a CARP pair, you can't use Proxy ARP VIPs. These were syncing into the backup config as empty <vip> tags, which was tricking the DHCP server there into thinking it should be primary.

    So DHCP didn't work because they both thought they were primary.

    The fix could either be to switch to 'other' type VIPs, and define them on both master and backup, or switch them to CARP if you can. Proxy ARP can't work on both routers in a failover configuration.

    The other fix, if you really need proxy arp on the main unit, is to patch services.inc to handle the empty tags better. I have a patch if you need it.
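
    One way to spot the empty tags described above in an exported config is sketched here. This is an illustration only, not the services.inc patch: the function name is hypothetical, and the sample XML (a `<virtualip>` section holding `<vip>` entries) is an assumed, cut-down layout rather than a real backup.

```python
import xml.etree.ElementTree as ET


def count_empty_vips(config_xml):
    """Count <vip> elements that have no child elements and no text."""
    root = ET.fromstring(config_xml)
    empty = 0
    for vip in root.iter("vip"):
        has_children = len(vip) > 0
        has_text = bool((vip.text or "").strip())
        if not has_children and not has_text:
            empty += 1
    return empty


# Hypothetical, cut-down config: one populated VIP and two empty ones.
sample = """
<pfsense>
  <virtualip>
    <vip><mode>carp</mode></vip>
    <vip></vip>
    <vip/>
  </virtualip>
</pfsense>
"""

print(count_empty_vips(sample))  # 2
```

    A non-zero count on the backup unit's config would match the symptom described in this post.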



  • Thank you for discovering this.  Seems like a bug to me.  Is this only for the Netgate installation, or does it happen with a standard installation also?

    i don't know if i can use the type "Other" VIP.  i have these public IP addresses which will be tied to various hostnames.  the public IP address will need to be mapped to various internal IP addresses, depending on what service is accessed.  will "Other" VIPs function this way?


  • Rebel Alliance Developer Netgate

    It happened with the normal images as well once I looked deeper. I don't know why it was OK when I first restored your configuration, but it's possible that dhcpd hadn't reloaded after the empty tags sync'd over.

    If your other IPs are in the same subnet as WAN, use CARP. If you have another subnet of IPs and those IPs are routed to your WAN IP, an 'other' type will work.

    Using proxy arp IPs with CARP isn't considered a valid configuration, which is why nobody else seems to have hit this.



  • ok… i will try changing them all to CARP VIPs tonight, if possible, and let you know how it fares.



    i changed them all to CARP VIPs but, still no luck.  i put each address into its own VHID group number but i used the same password for all of them… the password doesn't seem to matter anyway.  on 'status -> dhcp leases' the state is still "recover".  i notice that if i go to 'status -> CARP', the status for all of the public VIPs is "MASTER" on both the master and backup firewalls.  i'm sorry i obviously don't know what i'm doing ;)  but i'll learn in time.  i'll send ya my current configs.  is there any way i can make the firewall config accessible from outside the datacenter?  then i don't keep having to make trips back and forth each time....


  • Rebel Alliance Developer Netgate

    I restored the new configuration files you sent, and all of the CARP IPs show up properly; master on master, and backup on backup. The DHCP daemon came up properly this time too, in a normal state on both sides.

    However, I did notice that the WAN IP is set to the same IP address on both of them (x.x.x.16) when it should be different on each box (if you want to use .16 as the shared IP, it must also be a CARP IP), and also the subnet mask of all those CARP IPs should be /28, not /32.
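
    To illustrate the /28 versus /32 point, here is a small sketch using Python's standard `ipaddress` module. The 203.0.113.x addresses are documentation-range placeholders, not the real WAN addresses from the configs, and `shares_subnet` is a hypothetical helper name.

```python
import ipaddress


def shares_subnet(interface_cidr, vip_cidr):
    """True if the VIP, with its own mask, lands in the interface's network."""
    iface = ipaddress.ip_interface(interface_cidr)
    vip = ipaddress.ip_interface(vip_cidr)
    return vip.network == iface.network


# Placeholder WAN interface address with a /28 mask.
wan = "203.0.113.18/28"

print(shares_subnet(wan, "203.0.113.20/28"))  # True
print(shares_subnet(wan, "203.0.113.20/32"))  # False
```

    With a /32 mask the virtual IP forms a one-address network of its own, so it no longer matches the interface's /28.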



  • i don't get why the config works for you but not for me…  i even tried rebooting the two boxes and still everything shows up as MASTER.  maybe i'll ask the datacenter support staff to look at it.


  • Rebel Alliance Developer Netgate

    When you tried that unmanaged switch, did you try it on the WAN, the LAN, or both? CARP relies on broadcasts (as does DHCP) so if their switches are blocking anything like that it can be an issue.

    If your WANs are on your ISP's switch/network and not in its own VLAN, it's also possible your VHIDs might be conflicting or they may be blocking some broadcast traffic there.

    As for accessing the router remotely, it should be as easy as (a) switching the webgui to use HTTPS under system > general (for security reasons), and (b) adding a firewall rule on WAN that allows TCP traffic in to the destination of "WAN Address" on port 443 (pick https from the list or type in 443). Then you should be able to access the WebGUI from anywhere.

    Moving it to another port (and using that port in the rule) would be even better, something like 4433, 44433, etc. A VPN would be ideal but is much harder to set up.



  • Thanks for the info.  I'll talk to the ISP about it.  I am planning on setting up a VPN in the future.

