Cannot obtain update status on CARP Backup box
Two pfSense 2.3.2 boxes in CARP with multi-WAN. The CARP backup box cannot obtain update status or download updates. It also cannot download packages or ping 220.127.116.11 from localhost or LAN. This was also the case with v2.1.5, but I updated it offline (which is no longer possible).
Firewall rules with gateway groups are used because of multi-WAN. Manual outbound NAT rules set, using virtual WAN IP addresses.
When the other box becomes secondary, the same happens to it.
Which source address does pfSense use for update / package download?
Any thoughts on how to fix this?
Generally means it cannot resolve names or make connections out unless it possesses the CARP VIP.
Are you trying to fake CARP/HA with private addresses on the interfaces and one public address for the CARP VIP?
No, I have a regular setup with public IPs on both WAN interfaces on each box.
Then you did something strange with the outbound NAT, most likely. Maybe NATing source any to the CARP VIP.
NAT config is synced, so both boxes have the same NAT rule set. The master can obtain update status but the backup can't. It's the same after failover: the new backup box cannot obtain update status.
Basically NAT rules look like this:
127.0.0.0/8 to WAN1, WAN2 and LAN interface address (automatically added when switched from auto to manual rules)
192.168.1.0/30 to WAN1, WAN2 and LAN interface address (SYNC interface network, automatically added when switched from auto to manual rules)
LAN net to WAN 1 virtual IP
LAN net to WAN 2 virtual IP
Not really interested in seeing your ASCII representation of what you think you have done. What you are experiencing is not normal and is the result of a misconfiguration, most likely in Outbound NAT, perhaps in DNS Resolver.
Turned off "NAT configuration" HA sync. Set "Automatic outbound NAT rule generation" on backup box. Still "Unable to check for updates".
DNS Resolver is disabled.
Three DNS servers set in System / General Setup on both boxes: 18.104.22.168, 22.214.171.124, 126.96.36.199. Gateway, for all three, set to "none".
So when you go to Diagnostics > DNS Lookup and try to resolve pkg.pfsense.org what happens?
Open a console/ssh, use option 8 and do this:
What gateway is set as the default gateway? Can you ping it?
ping -S WAN_IP_ADDRESS GW_IP_ADDRESS
Interesting results… DNS lookup against the 2nd and 3rd DNS servers (188.8.131.52 and 184.108.40.206) sometimes fails and sometimes succeeds, but often with slow responses. It's always fast for the first and last DNS servers (127.0.0.1 and 220.127.116.11):
Result Record type
Name server Query time
127.0.0.1 9 msec
18.104.22.168 5027 msec
22.214.171.124 No response
126.96.36.199 8 msec
WAN 2 is the primary gateway, same as on the Master box.
drill -I WAN1_IP_ADDRESS always passes, for any DNS server.
drill -I WAN2_IP_ADDRESS usually fails with the following error: "Error: error sending query: Could not send or receive, because of network error", but sometimes passes. It's the same for each DNS server.
Ping is always fine for any WAN and any gateway IP address.
I tried setting WAN1 as the primary gateway and suddenly the firewall can obtain update status and manage packages.
Now I'm confused. It's obviously a network error, but it's not consistent.
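The per-source DNS test done with drill -I above can also be scripted, which makes it easier to repeat the check for every WAN address and DNS server combination. This is only a rough sketch: build_dns_query and query_from are hypothetical helper names, the query is a bare A-record lookup, and it assumes a Python 3 interpreter is available on the box being tested, which may not be the case on a stock firewall.

```python
import socket
import struct
import time


def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def query_from(source_ip, server_ip, name="pkg.pfsense.org", timeout=3.0):
    """Send one UDP DNS query from source_ip.

    Returns the round-trip time in milliseconds, or None if the query
    could not be sent or no matching reply arrived within the timeout.
    """
    packet = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # Binding to the source address plays the role of drill -I
            sock.bind((source_ip, 0))
            start = time.monotonic()
            sock.sendto(packet, (server_ip, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:
            return None
        elapsed_ms = (time.monotonic() - start) * 1000
    # Accept only a reply that echoes our query ID
    if reply[:2] != packet[:2]:
        return None
    return round(elapsed_ms)
```

Calling query_from(WAN2_IP_ADDRESS, DNS_SERVER_IP) a few dozen times in a loop and counting the None results would put a number on "usually fails, sometimes passes" for each source address.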
What about the same tests from the current CARP master specifically using the interface addresses instead of the CARP VIP? Similar?
All tests are successful on the CARP master, using the physical interface addresses.
Going to have to packet capture to see what's what. It is certainly not systemic to CARP/HA which works fine. Maybe something with the ISP/upstream or the outside switch. Pay close attention to MAC addresses.
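When reading MAC addresses in a capture, it helps to know that a CARP VIP answers from a virtual MAC of the form 00:00:5e:00:01:XX, where XX is the VHID in hex, while traffic to an interface address should carry that box's hardware MAC. A small helper along these lines (carp_vhid is a hypothetical name, not part of any pfSense tooling) can flag which frames are addressed to the VIP:

```python
# First five octets shared by all CARP virtual MAC addresses
CARP_PREFIX = (0x00, 0x00, 0x5E, 0x00, 0x01)


def carp_vhid(mac):
    """Return the VHID if mac is a CARP virtual MAC, otherwise None."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    # Parse as integers so case and zero padding do not matter
    octets = tuple(int(part, 16) for part in parts)
    if octets[:5] == CARP_PREFIX:
        return octets[5]
    return None


print(carp_vhid("00:00:5e:00:01:01"))  # 1
print(carp_vhid("00:0c:29:aa:bb:cc"))  # None
```

If replies destined for the backup's own WAN address show up with a CARP virtual MAC as the destination, something upstream is sending them to the VIP rather than to the backup's interface.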
Do you have exactly the same settings for DNS servers in System > General Setup? I don't recall off-hand if those are XMLRPC synced. I don't think so.
They are not XMLRPC synced. I've checked again, DNS configs are same on both.
I don't get it. How come ping is successful and DNS resolution isn't?
You'll have to packet capture to see.
Question the decision not to XMLRPC sync. Why wouldn't you do that? No wonder the units behave differently.
You didn't understand me. I've just confirmed that DNS servers in System > General Setup are not XMLRPC synced by design. Of course we XMLRPC sync everything :)
The issue is fixed. You were right when mentioning the ISP. I contacted them and it turned out to be their fault. They had a route that sent all traffic to the WAN VIP address, even though we are directly connected. So all the replies to traffic sent from the backup box's WAN address were sent to the WAN VIP, which is owned by the primary box.
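The effect of such a route can be illustrated with a toy routing lookup. This is a simplified model, not the ISP's actual configuration: the addresses come from the RFC 5737 documentation range, and next_hop and receiving_box are made-up names for the sketch.

```python
import ipaddress

# Hypothetical addresses from the RFC 5737 documentation range
WAN_NET = ipaddress.ip_network("203.0.113.0/29")
MASTER_IP = "203.0.113.2"
BACKUP_IP = "203.0.113.3"
CARP_VIP = "203.0.113.4"


def next_hop(dst, routes):
    """Longest-prefix match over (network, via) pairs.

    via=None means the network is directly connected, so the packet is
    delivered straight to dst. Returns None when no route matches.
    """
    addr = ipaddress.ip_address(dst)
    matches = [(net, via) for net, via in routes if addr in net]
    if not matches:
        return None
    _, via = max(matches, key=lambda m: m[0].prefixlen)
    return dst if via is None else via


def receiving_box(dst, routes):
    """Name the firewall that receives a reply addressed to dst."""
    # The CARP VIP is answered by whichever box is currently master
    owner = {MASTER_IP: "master", BACKUP_IP: "backup", CARP_VIP: "master"}
    return owner.get(next_hop(dst, routes))


misrouted = [(WAN_NET, CARP_VIP)]  # assumed faulty route: everything via the VIP
connected = [(WAN_NET, None)]      # expected: subnet is directly connected

print(receiving_box(BACKUP_IP, misrouted))  # master
print(receiving_box(BACKUP_IP, connected))  # backup
```

In this model the master is unaffected either way, since replies to its own address and to the VIP both land on it. That matches the symptom of whichever box is currently backup losing connectivity, including after a failover.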
The only weird thing is that ICMP traffic was passing (ping, traceroute -I), UDP was passing partially (DNS lookup, drill), and (I think) TCP wasn't passing at all (couldn't telnet to any site on port 80 from the backup box).
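The TCP side of that comparison can be checked from a chosen source address in the same way as the ping -S and drill -I tests. A minimal sketch, assuming Python 3 is available on the machine doing the test (tcp_connects_from is a hypothetical helper name):

```python
import socket


def tcp_connects_from(source_ip, host, port=80, timeout=3.0):
    """Return True if a TCP handshake to host:port completes from source_ip."""
    try:
        conn = socket.create_connection(
            (host, port), timeout=timeout, source_address=(source_ip, 0)
        )
    except OSError:
        # Covers bind failures, refused connections, and timeouts
        return False
    conn.close()
    return True
```

Running it once with the interface address and once with the CARP VIP as source_ip, against the same remote host, shows whether the handshake fails only for the interface address.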
Thank you very much. You were a great help!