DHCP failover: both nodes in recover state



  • Hi all,

    I'm trying to enable DHCP failover on a CARP LAN interface, but both nodes stay in the recover / unknown peer state.

    node1 - node2 - VIP

    10.6.0.1/16 - 10.6.0.2/16 - 10.6.0.3/16

    node1 (primary) has 10.6.0.2 as its failover peer, and pfsync automatically set 10.6.0.1 as the failover peer on node2.

    Firewall rules are dynamically added:
    pass in  quick on $GUESTS1006 proto { tcp udp } from 10.6.0.2 to 10.6.0.1 port = 519 tracker 1000013144 label "allow access to DHCP failover"
    pass in  quick on $GUESTS1006 proto { tcp udp } from 10.6.0.2 to 10.6.0.1 port = 520 tracker 1000013145 label "allow access to DHCP failover"

    and

    pass in  quick on $GUESTS1006 proto { tcp udp } from 10.6.0.1 to 10.6.0.2 port = 519 tracker 1000013144 label "allow access to DHCP failover"
    pass in  quick on $GUESTS1006 proto { tcp udp } from 10.6.0.1 to 10.6.0.2 port = 520 tracker 1000013145 label "allow access to DHCP failover"

    Both nodes are NTP synced (same NTP servers)

    The dhcpd.conf files on both nodes seem to be OK too (node1, the primary, first; then node2, the secondary):

    default-lease-time 7200;
    max-lease-time 86400;
    log-facility local7;
    one-lease-per-client true;
    deny duplicates;
    ping-check true;
    update-conflict-detection false;
    authoritative;
    failover peer "dhcp_opt10" {
      primary;
      address 10.6.0.1;
      port 519;
      peer address 10.6.0.2;
      peer port 520;
      max-response-delay 10;
      max-unacked-updates 10;
      split 128;
      mclt 600;
      load balance max seconds 3;
    }

    subnet 10.6.0.0 netmask 255.255.0.0 {
      pool {
        option domain-name-servers 10.6.0.3;
        deny dynamic bootp clients;
        failover peer "dhcp_opt10";

        range 10.6.1.1 10.6.9.255;
      }

      option routers 10.6.0.3;
      option domain-name "office-people-doc.com";
      option domain-name-servers 10.6.0.3;
      max-lease-time 7200;

    }

    default-lease-time 7200;
    max-lease-time 86400;
    log-facility local7;
    one-lease-per-client true;
    deny duplicates;
    ping-check true;
    update-conflict-detection false;
    authoritative;
    failover peer "dhcp_opt10" {
      secondary;
      address 10.6.0.2;
      port 520;
      peer address 10.6.0.1;
      peer port 519;
      max-response-delay 10;
      max-unacked-updates 10;
     
      load balance max seconds 3;
    }

    subnet 10.6.0.0 netmask 255.255.0.0 {
      pool {
        option domain-name-servers 10.6.0.3;
        deny dynamic bootp clients;
        failover peer "dhcp_opt10";

        range 10.6.1.1 10.6.9.255;
      }

      option routers 10.6.0.3;
      option domain-name "office-people-doc.com";
      option domain-name-servers 10.6.0.3;
      max-lease-time 7200;

    }
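
    One way to double-check that the two failover declarations above really mirror each other (node1's address/port equals node2's peer address/peer port, and the other way round) is a small script. This is only a sketch: the names `parse_failover` and `is_mirrored` are made up for illustration, and the regular expressions only handle the simple layout pasted in this post, not every valid dhcpd.conf.

```python
import re

# Failover declarations copied from the two dhcpd.conf files in this post.
NODE1_CONF = '''
failover peer "dhcp_opt10" {
  primary;
  address 10.6.0.1;
  port 519;
  peer address 10.6.0.2;
  peer port 520;
}
'''

NODE2_CONF = '''
failover peer "dhcp_opt10" {
  secondary;
  address 10.6.0.2;
  port 520;
  peer address 10.6.0.1;
  peer port 519;
}
'''


def parse_failover(conf):
    """Pull role, address/port and peer address/port out of a failover block.

    Hypothetical helper: it only matches the declaration form
    'failover peer "name" { ... }', not the 'failover peer "name";'
    reference used inside a pool.
    """
    block = re.search(r'failover peer "[^"]+"\s*\{(.*?)\}', conf, re.S)
    if block is None:
        raise ValueError("no failover peer declaration found")
    body = block.group(1)

    def grab(pattern):
        match = re.search(pattern, body, re.M)
        return match.group(1) if match else None

    return {
        "role": grab(r"^\s*(primary|secondary);"),
        "address": grab(r"^\s*address\s+([\d.]+);"),
        "port": grab(r"^\s*port\s+(\d+);"),
        "peer_address": grab(r"^\s*peer address\s+([\d.]+);"),
        "peer_port": grab(r"^\s*peer port\s+(\d+);"),
    }


def is_mirrored(a, b):
    """True if one side is primary, the other secondary, and endpoints cross-match."""
    return (
        {a["role"], b["role"]} == {"primary", "secondary"}
        and a["address"] == b["peer_address"]
        and a["port"] == b["peer_port"]
        and b["address"] == a["peer_address"]
        and b["port"] == a["peer_port"]
    )


if __name__ == "__main__":
    print(is_mirrored(parse_failover(NODE1_CONF), parse_failover(NODE2_CONF)))
```

    If the check passes, the declarations themselves are consistent, which points at the network path between the nodes rather than at dhcpd.conf.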

    I removed all DHCP leases on both nodes to start from a clean state, but it made no difference. Both nodes stay in recover mode and do not serve IPs to clients. Where am I going wrong? :-)

    I can see that, on both nodes, nothing is sent or received on ports 519 and 520 on the LAN interface. I think that's the problem, but why?
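
    Since ISC dhcpd failover peers talk to each other over TCP, a plain TCP connect from one node to the other node's failover port is a quick reachability test that does not depend on dhcpd's own logging. A minimal sketch, assuming Python is available on the machine you test from; the function name `failover_port_reachable` is made up for illustration:

```python
import socket


def failover_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    A refused connection, an unreachable host and a timeout all count
    as 'not reachable'; this does not speak the failover protocol itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

    With the addresses from this post, node1 would call `failover_port_reachable("10.6.0.2", 520)` and node2 would call `failover_port_reachable("10.6.0.1", 519)`. A `False` on both sides would be consistent with the nodes not being able to reach each other at all on this interface.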



  • I found something: the two nodes are unable to communicate with each other.

    SNAT for loopback is translated to "interface address", so that should be fine.

    I created a firewall alias containing both real IPs, 10.6.0.1 and 10.6.0.2, and added a rule on interface "GUESTS1006" like:
    any protocol, source "alias", destination interface address

    No luck! The nodes still can't ping each other.

    By the way, I have another interface with CARP, on a different subnet, and there the nodes can ping each other. I can't see the difference… Both interfaces are VLANs, and the CARP configuration is exactly the same, as is the SNAT. The only difference is in the firewall rules, but I tried an any-to-any rule on GUESTS1006 and it does not work either. No packets match the rule, and I can't explain why.

