Cannot obtain update status on CARP backup box
-
Generally means it cannot resolve names or make connections out unless it possesses the CARP VIP.
Are you trying to fake CARP/HA with private addresses on the interfaces and one public address for the CARP VIP?
-
No, I have a regular setup with all public IPs on both WAN interfaces on each box.
-
Then you did something strange with the outbound NAT, most likely. Maybe NATing source any to the CARP VIP.
-
NAT config is synced, so both boxes have the same NAT rule set. The master can obtain update status but the backup can't. It's the same after failover: the new backup box cannot obtain update status.
Basically NAT rules look like this:
127.0.0.1/8 to WAN1, WAN2 and LAN interface address (automatically added when switched from auto to manual rules)
192.168.1.0/30 to WAN1, WAN2 and LAN interface address (SYNC interface network, automatically added when switched from auto to manual rules)
LAN net to WAN 1 virtual IP
LAN net to WAN 2 virtual IP
-
Not really interested in seeing your ascii representation of what you think you have done. What you are experiencing is not normal and is the result of a misconfiguration. Most likely in Outbound NAT. Perhaps in DNS Resolver.
-
Turned off "NAT configuration" HA sync. Set "Automatic outbound NAT rule generation" on backup box. Still "Unable to check for updates".
DNS Resolver is disabled.
Three DNS servers set in System / General Setup on both boxes: 209.244.0.3, 8.8.8.8, 8.8.4.4. Gateway, for all three, set to "none".
-
So when you go to Diagnostics > DNS Lookup and try to resolve pkg.pfsense.org what happens?
Open a console/ssh, use option 8 and do this:
drill -I WAN_IP_ADDRESS @209.244.0.3 pkg.pfsense.org
drill -I WAN_IP_ADDRESS @8.8.8.8 pkg.pfsense.org
drill -I WAN_IP_ADDRESS @8.8.4.4 pkg.pfsense.org
What happens?
What gateway is set as the default gateway? Can you ping it?
ping -S WAN_IP_ADDRESS GW_IP_ADDRESS
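If it saves some typing, the drill checks above can be looped so that every source address is tried against every resolver. This is only a rough sketch: the function names check_dns and dns_matrix are made up for illustration, and it assumes drill exits non-zero when no reply comes back (a SERVFAIL answer would still count as "ok" here).

```shell
#!/bin/sh
# Sketch only: try each resolver from each source address and report the result.

# check_dns SOURCE_IP RESOLVER_IP [NAME]
# Succeeds if drill completes the query; assumes a non-zero exit on a network error.
check_dns() {
    c_src=$1
    c_resolver=$2
    c_name=${3:-pkg.pfsense.org}
    drill -I "$c_src" "@$c_resolver" "$c_name" >/dev/null 2>&1
}

# dns_matrix "SRC1 SRC2 ..." "RESOLVER1 RESOLVER2 ..."
# Prints one line per source/resolver pair.
dns_matrix() {
    for m_src in $1; do
        for m_resolver in $2; do
            if check_dns "$m_src" "$m_resolver"; then
                echo "$m_src -> $m_resolver: ok"
            else
                echo "$m_src -> $m_resolver: FAILED"
            fi
        done
    done
}
```

Call it with your own addresses in place of the placeholders, for example: dns_matrix "WAN1_IP_ADDRESS WAN2_IP_ADDRESS" "209.244.0.3 8.8.8.8 8.8.4.4". Running it a few times in a row helps when the failure is intermittent.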
-
Interesting results… DNS lookups against the 2nd and 3rd DNS servers (209.244.0.3 and 8.8.8.8) sometimes fail and sometimes succeed, but often with slow responses. They are always fast for the first and last DNS servers (127.0.0.1 and 8.8.4.4):
Results
Result             Record type
162.208.119.39     A
2610:1c1:3::116    AAAA
Timings
Name server    Query time
127.0.0.1      9 msec
209.244.0.3    5027 msec
8.8.8.8        No response
8.8.4.4        8 msec
WAN 2 is primary gateway, same as on Master box.
drill -I WAN1_IP_ADDRESS will always pass, for any DNS server
drill -I WAN2_IP_ADDRESS usually fails with the following error: "Error: error sending query: Could not send or receive, because of network error", but sometimes passes. The behavior is the same for each DNS server.
Ping is always fine for any WAN and any gateway IP address.
I tried setting WAN1 as the primary gateway, and suddenly the firewall can obtain update status and manage packages.
Now I'm confused. It's obviously a network error, but it's not consistent.
-
What about the same tests from the current CARP master specifically using the interface addresses instead of the CARP VIP? Similar?
-
All tests are successful on the CARP master. Physical interface addresses used.
-
Going to have to packet capture to see what's what. It is certainly not systemic to CARP/HA which works fine. Maybe something with the ISP/upstream or the outside switch. Pay close attention to MAC addresses.
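Something along these lines would do for the capture. It is a sketch, not a prescription: capture_dns is a made-up wrapper name and WAN2_IFACE is a placeholder for whatever the WAN 2 interface is called on your box. The -e flag makes tcpdump print the Ethernet source and destination MAC of every frame, and -n turns off name resolution so the capture itself generates no DNS traffic.

```shell
#!/bin/sh
# Sketch only: capture DNS traffic for one address and show MAC addresses.

# capture_dns IFACE HOST_IP [PACKET_COUNT]
# -e prints link-level (MAC) headers, -n disables name lookups,
# -c stops after PACKET_COUNT packets (50 if not given).
capture_dns() {
    tcpdump -e -n -c "${3:-50}" -i "$1" host "$2" and port 53
}
```

Run capture_dns WAN2_IFACE WAN2_IP_ADDRESS on the backup while repeating the drill test, then run the same capture on the master to see whether the replies to the backup's queries are turning up there instead.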
-
Do you have exactly the same settings for DNS servers in System > General Setup? I don't recall off-hand if those are XMLRPC synced. I don't think so.
-
They are not XMLRPC synced. I've checked again, DNS configs are same on both.
I don't get it, how come ping is successful and DNS resolving isn't?
-
You'll have to packet capture to see.
Question the decision not to XMLRPC sync. Why wouldn't you do that? No wonder the units behave differently.
-
You didn't understand me. I've just confirmed that the DNS servers in System > General Setup are not XMLRPC synced by design. Of course we XMLRPC sync everything :)
The issue is fixed. You were right when you mentioned the ISP. I contacted them and it turned out to be their fault. They had a route that sent all traffic to the WAN VIP address, even though we are directly connected. So all the replies to traffic sent from the backup box's WAN address were delivered to the WAN VIP owned by the primary box.
The only weird thing is that ICMP traffic was passing (ping, traceroute -I), UDP was passing partially (DNS lookup, drill), and (I think) TCP wasn't passing at all (couldn't telnet to any site port 80 from backup box).
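For anyone chasing a similar symptom, a quick per-protocol check from the backup's own address could look like the following. It is a rough sketch under a few assumptions: report and proto_check are made-up helper names, TCP_HOST is a placeholder for any host answering on port 80, and nc stands in for the telnet test I actually used.

```shell
#!/bin/sh
# Sketch only: test ICMP, UDP (DNS) and TCP from one source address.

# report LABEL COMMAND...
# Runs COMMAND quietly and prints LABEL with ok or FAILED.
report() {
    r_label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$r_label: ok"
    else
        echo "$r_label: FAILED"
    fi
}

# proto_check SOURCE_IP RESOLVER_IP TCP_HOST
proto_check() {
    p_src=$1
    p_resolver=$2
    p_web=$3
    report "icmp" ping -c 3 -S "$p_src" "$p_resolver"
    report "udp/53" drill -I "$p_src" "@$p_resolver" pkg.pfsense.org
    report "tcp/80" nc -z -w 5 -s "$p_src" "$p_web" 80
}
```

Example: proto_check WAN2_IP_ADDRESS 8.8.8.8 TCP_HOST. In our case this would have shown ICMP passing while UDP and TCP were unreliable or failing.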
Thank you, very much. You were of great help!