[SOLVED] CARP Slave DNS Query Refused



  • I've just got one last thing (that I know about) until my CARP config is fully functional. For some reason my pfSense Master DNS is working perfectly fine, but I get the following error when I perform a DNS Query on my pfSense Slave.

    >nslookup google.com 10.10.0.1 (LAN VIP)
    Server:  pfsense.lan
    Address:  10.10.0.1
    
    Non-authoritative answer:
    Name:    google.com
    Addresses:  2607:f8b0:4006:81a::200e
              172.217.12.206
    
    
    >nslookup google.com 10.10.0.13 (master)
    Server:  brain.lan
    Address:  10.10.0.13
    
    Non-authoritative answer:
    Name:    google.com
    Addresses:  2607:f8b0:4006:81a::200e
              172.217.12.206
    
    
    >nslookup google.com 10.10.0.14 (slave)
    Server:  UnKnown
    Address:  10.10.0.14
    
    *** UnKnown can't find google.com: Query refused
    

    I was thinking maybe this was normal and killed the master so the slave would take over. Exact same results. I've compared everything between the two and it is syncing properly. Additionally, I made sure that my firewall rule allows port 53 to 10.10.0.1, 10.10.0.13, and 10.10.0.14.

    Attached are pictures of my settings. I have tried to search and can't find anything on this issue. I've tried every combination of settings I can think of (including removing the custom options) and I still get the exact same results. I've tried turning up the log level and comparing them without much luck. Does anyone have any idea on what I could be doing wrong? Thanks in advance!

    1_1533415855198_Capture2.PNG 0_1533415855198_Capture.PNG

    EDIT: After lots of troubleshooting, I ended up resetting the slave pfSense to factory defaults and reconfiguring everything. After doing that, everything seems to work as it should. Not an ideal answer, but it's not as bad as it could be.
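The "Query refused" line in the nslookup output corresponds to DNS response code 5 (REFUSED): the resolver on 10.10.0.14 received the query and declined it, so the packet was not simply dropped by a firewall rule. A minimal sketch of a probe that reads the response code directly, using only the Python standard library, is below. The function names (`build_query`, `rcode_of`, `probe`) are made up for this illustration, and the thread does not establish why the slave refused.

```python
import socket
import struct

# Response codes from the low 4 bits of the DNS header flags (RFC 1035).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def rcode_of(response: bytes) -> str:
    """Return the response code name carried in a DNS response header."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def probe(server: str, name: str = "google.com", timeout: float = 2.0) -> str:
    """Send one UDP query to server:53 and report the rcode or the failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(512)
        except OSError as exc:  # covers timeouts and unreachable networks
            return f"no answer ({exc})"
    return rcode_of(data)

# Example use against the addresses from the post:
#   for addr in ("10.10.0.1", "10.10.0.13", "10.10.0.14"):
#       print(addr, probe(addr))
```

A timeout here would point at filtering or routing, while REFUSED points at the resolver's own configuration on that node.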



  • Maybe there is something wrong with your outbound NAT rules.
    Post the rule set, please.



  • That thought had crossed my mind, but I don't think that could be it. When it's on master, I'm able to get out to the internet just fine. If I ssh into something on the internet by IP address and disconnect master, it stays connected. If I ssh into something on the internet by IP address and reconnect master, it stays connected.

    I have Manual Outbound NAT rule generation selected. I have 2 rules for each interface. They're the same except for the source.

    Interface  Source        Source Port  Destination  Dest. Port  NAT Address  NAT Port  Static Port
    WAN        127.0.0.0/8   *            *            500         WAN address  *         YES
    WAN        127.0.0.0/8   *            *            *           WAN address  *         NO
    ...
    WAN        10.10.0.0/28  *            *            500         WAN_VIP      *         YES
    WAN        10.10.0.0/28  *            *            *           WAN_VIP      *         NO
    ...
    

    EDIT: Like a dummo, I was NATing my loopback to the VIP. Fixed the chart here to what it currently is. While it was a problem, it wasn't the problem causing my headache here.



  • If you have manual outbound NAT rule generation activated, you also have to set rules on WAN for pfSense itself:

    WAN 127.0.0.0/8 * * * WAN address * NO
    WAN 127.0.0.0/8 * * 500 WAN address * YES
    Set them on the master so they get synced to the slave.



  • You're absolutely right. Thanks for correcting those lines. That didn't seem to solve this problem though. :/



  • Try to reboot the boxes.



  • I've tried rebooting them a few times. No such luck.


  • Netgate

    Can the Secondary node resolve names in Diagnostics > DNS Lookup when it is not holding the WAN CARP VIP (CARP BACKUP)?

    Does this behavior change when it has CARP MASTER?

    Does it behave the same way on the primary (can resolve when MASTER but not BACKUP)?

    Note that this is not active/active failover. If you are going to tell inside hosts to use the firewall as a DNS server, you should point them at a CARP VIP and they should be using the active node only.

    Those rules are incorrect.

    Do not NAT traffic from the WAN interface addresses to the CARP VIP. Do not NAT connections from localhost (127.0.0.1) to the CARP VIP. Only NAT inside addresses to the CARP VIP. Those localhost rules above should both NAT to WAN address. The rules for the /28 are unnecessary unless you need to NO NAT them because they match something further down.

    This assumes that the /28 in those rules is the WAN interface subnet.
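The rule of thumb in this post (firewall-local traffic NATs to the WAN interface address, only inside networks NAT to the CARP VIP) can be sketched as a small sanity check over a rule list. This is an illustration only: the `NatRule` structure and `check_rules` function are hypothetical and do not read a real pfSense configuration; the rule values are the ones quoted in the thread.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NatRule:
    """Hypothetical, simplified view of one outbound NAT rule."""
    interface: str
    source: str        # source network in CIDR notation
    nat_address: str   # e.g. "WAN address" or the name of a CARP VIP


LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def check_rules(rules, vip_names=("WAN_VIP",)):
    """Return warnings for rules that break the loopback/VIP guidance."""
    warnings = []
    has_loopback_rule = False
    for rule in rules:
        src = ipaddress.ip_network(rule.source, strict=False)
        if src.version == 4 and src.subnet_of(LOOPBACK):
            has_loopback_rule = True
            # Traffic from the firewall itself must leave with the node's
            # own WAN address, not the shared CARP VIP.
            if rule.nat_address in vip_names:
                warnings.append(
                    f"{rule.interface}: {rule.source} is NATed to "
                    f"{rule.nat_address}; use the interface address instead"
                )
    if not has_loopback_rule:
        warnings.append("no outbound NAT rule for 127.0.0.0/8 (the firewall itself)")
    return warnings


rules = [
    NatRule("WAN", "127.0.0.0/8", "WAN address"),
    NatRule("WAN", "10.10.0.0/28", "WAN_VIP"),
]
print(check_rules(rules))
```

With the corrected rules from the thread the list of warnings is empty; pointing the loopback rule at `WAN_VIP` would produce one warning.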



  • I wanted to test what would happen if I put the factory defaults on the slave and disabled HA Sync for a moment. It was able to resolve DNS no problem. I'm restoring the config snapshot I took right before factory resetting it now, but I promised the wife today was family day, so I can't continue troubleshooting this right this second. I'll let you know as soon as I am able to. Sorry.

    In the meantime, I believe viragomann set me straight about the loopback NAT. I now have 127.0.0.0/8 NATing to the WAN interface address, not the VIP.

    EDIT: If Master is connected, both can successfully use Diag > DNS Resolve. If only Slave is connected, Slave can use DNS Resolve, Master cannot. The /28 in those rules is my "LAN" subnet. I don't have any rules for my WAN subnet (and I don't think I should, right?)


  • Netgate

    I don't have any rules for my WAN subnet (and I don't think I should, right?)

    No, you shouldn't.

    But why are you natting public addresses on an inside subnet?



  • I'm not quite sure I understand what you're asking. I have a Ubiquiti ER-4 that has Eth0 grabbing a DHCP address, and I have a subnet that both the WAN ports of my pfSense VMs are in. For example, they're in 10.0.0.0/29 and the two interfaces are bridged together. The bridge IP is 10.0.0.1 and that's the default gateway of that subnet. The master pfSense box has 10.0.0.5 as its WAN IP, the slave has 10.0.0.6, and the VIP is 10.0.0.4. Internal networks NAT to 10.0.0.4, and now 127.0.0.1 NATs to its WAN address (10.0.0.5 or 10.0.0.6 depending on which VM it is)

    Is this wrong?

    I've got more time today I believe. I'm going to go ahead and do a factory reset on the slave pfSense and try again. I'm thinking something weird just went screwy since a factory reset pfSense was able to at least resolve DNS for me.
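As a quick cross-check of the addressing described in this post, the sketch below confirms that the gateway, the CARP VIP, and both node addresses are usable hosts in their subnets. The helper name `outside_subnet` is made up for this example; the addresses are the ones given in the thread.

```python
import ipaddress


def outside_subnet(subnet, addresses):
    """Return the addresses that are not usable host addresses in subnet."""
    net = ipaddress.ip_network(subnet)
    usable = set(net.hosts())  # excludes the network and broadcast addresses
    return [a for a in addresses if ipaddress.ip_address(a) not in usable]


# WAN side: gateway, CARP VIP, master, slave
print(outside_subnet("10.0.0.0/29", ["10.0.0.1", "10.0.0.4", "10.0.0.5", "10.0.0.6"]))
# LAN side: CARP VIP, master, slave
print(outside_subnet("10.10.0.0/28", ["10.10.0.1", "10.10.0.13", "10.10.0.14"]))
```

Both calls return an empty list for the addresses quoted here, so the layout itself is consistent.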


  • Netgate

    Since you obfuscated them I assumed they were public. No idea why you would hide RFC1918 addresses.



  • Got it. I wasn't really thinking about it. Thinking about it, you're right. It makes no sense for me to have obfuscated them.

    EDIT: Deobfuscated them through all posts.

    EDIT 2: So I'm not convinced I've got my problem solved just yet, but it's possible. I reset my pfSense slave to factory defaults and have been reconfiguring it from the ground up. So far DNS is still working, but I still have a handful of interfaces to configure. If it was going to have issues, I would have expected DNS to be failing on every interface by this stage, so I'm hopeful. If this does fix it, I have absolutely no idea what was broken.


 

© Copyright 2002 - 2018 Rubicon Communications, LLC