/etc/hosts destroyed periodically



  • I noticed quite a while ago that 'register DHCP lease in DNS forwarder' wasn't working properly, but it wasn't a huge deal to me. I installed a new snapshot a couple of days ago, and at first it now works: both leases and statically set hostnames are present in /etc/hosts and resolve fine. However, after some period of time (I'm not sure exactly how long; in my test case it was ~4.5h), /etc/hosts gets wiped and only new leases appear, discarding all old leases and static entries. I believe this might be related to the patch here: http://redmine.pfsense.org/issues/show/374, as I'm pretty sure my previous snapshot was older than that patch, and at least static hostnames worked in that one.

    
    Mar 24 01:04:44 router dnsmasq[8459]: read /etc/hosts - 12 addresses
    Mar 24 01:04:51 router dnsmasq[8459]: read /etc/hosts - 13 addresses
    Mar 24 05:47:33 router dnsmasq[8459]: read /etc/hosts - 13 addresses
    Mar 24 05:47:35 router dnsmasq[8459]: read /etc/hosts - 14 addresses
    Mar 24 05:47:35 router dnsmasq[8459]: read /etc/hosts - 14 addresses
    Mar 24 05:47:35 router dnsmasq[8459]: read /etc/hosts - 13 addresses
    Mar 24 05:47:36 router dnsmasq[8459]: read /etc/hosts - 13 addresses
    Mar 24 05:47:37 router dnsmasq[8459]: read /etc/hosts - 0 addresses
    Mar 24 05:47:37 router dnsmasq[8459]: read /etc/hosts - 1 addresses
    Mar 24 05:47:40 router dnsmasq[8459]: read /etc/hosts - 0 addresses
    Mar 24 06:08:45 router dnsmasq[8459]: read /etc/hosts - 0 addresses
    Mar 24 06:08:48 router dnsmasq[8459]: read /etc/hosts - 1 addresses
    Mar 24 16:59:12 router dnsmasq[8459]: read /etc/hosts - 1 addresses
    Mar 24 16:59:14 router dnsmasq[8459]: read /etc/hosts - 2 addresses
    Mar 24 16:59:15 router dnsmasq[8459]: read /etc/hosts - 0 addresses
    Mar 24 16:59:15 router dnsmasq[8459]: read /etc/hosts - 1 addresses
    Mar 24 16:59:15 router dnsmasq[8459]: read /etc/hosts - 2 addresses
    Mar 24 16:59:22 router dnsmasq[8459]: read /etc/hosts - 0 addresses
    Mar 24 17:59:19 router dnsmasq[8459]: read /etc/hosts - 0 addresses
    Mar 24 17:59:20 router dnsmasq[8459]: read /etc/hosts - 1 addresses
    Mar 24 17:59:20 router dnsmasq[8459]: read /etc/hosts - 2 addresses
    Mar 24 17:59:27 router dnsmasq[8459]: read /etc/hosts - 3 addresses
    Mar 24 17:59:29 router dnsmasq[8459]: read /etc/hosts - 2 addresses
    Mar 24 18:34:25 router dnsmasq[8459]: read /etc/hosts - 2 addresses
    
    

    Re-saving the DNS forwarder configuration seems to reset /etc/hosts and things work again for a while, but the same failure happens again after a couple of hours.

    I'm not sure if this is something the team is aware of. The bug seems to be open with a 'there are still problems' flag, but it doesn't look like anyone's working on it. Just a heads up; this is pretty critical for me.

    Hmm, on a further look this looks like a race condition: I have 11 /etc/rc.parse-isc-dhcpd instances running concurrently. I wonder where they all came from, but I bet that's the problem. It looks like saving the DNS forwarder settings spawns another one without killing off the existing processes; I'm not sure if there's another way this could happen. It seems to make sense to have a PID file and check it at the start of the script as well as wherever it gets spawned from.
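
    A rough sketch of the PID-file check I have in mind. The PID file path and the function name are made up for illustration and are not taken from the actual script; the stale-file takeover at the end is still slightly racy, so treat this as a starting point rather than a finished fix.

```shell
#!/bin/sh
# Hypothetical PID file location; the real script may want a different path.
PIDFILE="${PIDFILE:-/tmp/parse-isc-dhcpd.example.pid}"

# Try to become the single running instance.
# Returns 0 if we now own the PID file, 1 if a live instance already holds it.
acquire_pidfile() {
    # With noclobber (set -C) the redirect fails if the file already exists,
    # so creating the file is atomic. $$ is still the parent's PID in a subshell.
    if ( set -C; echo "$$" > "$PIDFILE" ) 2>/dev/null; then
        return 0
    fi

    # File exists: check whether the recorded process is still alive.
    oldpid=$(cat "$PIDFILE" 2>/dev/null)
    if [ -n "$oldpid" ] && kill -0 "$oldpid" 2>/dev/null; then
        return 1
    fi

    # Stale file left behind by a dead process: take it over.
    echo "$$" > "$PIDFILE"
    return 0
}

# Intended use at the top of the script:
#   acquire_pidfile || exit 0
#   trap 'rm -f "$PIDFILE"' EXIT
```

    Whatever spawns the script could run the same check first, so that re-saving the DNS forwarder settings doesn't pile up extra instances.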

    Edit:

    Also this line:

    cat /var/etc/hosts | grep -v "$1" > /tmp/hosts.tmp
    

    I believe this could wreck things too if the hostname happens to appear in the domain. It should probably match on a word boundary, or at least grep for the full hostname.domain.
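
    Something along these lines would avoid the substring problem by comparing whole fields instead of grepping the raw line. This is only a sketch: the function name is made up, and I'm assuming each hosts line looks like "IP fqdn shortname".

```shell
#!/bin/sh
# Print the hosts file ($2) without any entry whose name fields include
# exactly the given hostname ($1). Hypothetical helper, not from the script.
remove_host() {
    awk -v h="$1" '
        /^[[:space:]]*#/ { print; next }   # pass comment lines through untouched
        {
            keep = 1
            # Field 1 is the IP address; fields 2..NF are names for that address.
            for (i = 2; i <= NF; i++)
                if ($i == h) keep = 0
            if (keep) print
        }
    ' "$2"
}

# Intended use, mirroring the original line:
#   remove_host "$1" /var/etc/hosts > /tmp/hosts.tmp
```

    For example, with a domain of home.lan and a host named "home", the plain grep -v "home" drops every line in the file, while the field comparison only drops the entry for that one host.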


  • Rebel Alliance Developer Netgate

    There was a commit to code in this area in the last day or so, not sure if it's related. You may want to update one more time and see if the problem still happens.

    If there is not an existing ticket open on http://redmine.pfsense.org you may want to open one with these details and/or a link to the thread.



  • I noticed this too with a snapshot that was a couple of weeks old. It usually reset /etc/hosts whenever the DHCP server should have added a new entry.
    I'm currently running a snapshot built on Wed Mar 24 10:33:05 EDT 2010 and don't have any problems anymore.

