HA strange behaviour, problems on passive box
-
Hi all,
Need your wisdom here, kind of lost.
I have 2 pfSense boxes in HA, both hosted on VMware 6.5 hosts. I think it used to work, but now for some reason I am having issues. It could be due to version 2.5.0 or to something I changed in the recent past.
Environment:
lan:
vip: 192.168.15.1
pf1: 192.168.15.2
pf2: 192.168.15.3
dmz:
vip: 10.10.30.1
pf1: 10.10.30.2
pf2: 10.10.30.3
When PF1 is the master:
- From LAN I can ping all 3 LAN IPs (VIP, pf1 and pf2)
- From LAN I can ping DMZ VIP and PF1, but ping DMZ PF2 IP fails!
- From DMZ I can ping DMZ VIP and PF1, but ping DMZ PF2 IP fails!
When I disable CARP on pf1, pf2 becomes the master for all VIPs with no issues and pf1 becomes backup.
When PF2 is the master:
- LAN still has network connectivity
- DMZ loses connectivity right away
- From DMZ I cannot ping PF1, PF2 or the VIP; it is as if the DMZ is gone
- From LAN I can ping all 3 LAN IPs (VIP, pf1 and pf2)
- From LAN I can ping DMZ VIP and PF2, but now ping DMZ PF1 IP fails! (it seems that I can only ping a PF IP if it is the master)
I have double-checked the VMware vSwitch and port groups; they look fine to me.
Any ideas are very welcome!
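The checks above can be repeated with a small script. This is only a rough sketch: the addresses are the ones listed under Environment, the function names are made up, and the ping flags assume a Linux `ping` (other systems use different options).

```python
import subprocess

# Addresses from the Environment section of this post.
ADDRESSES = {
    "LAN": {"vip": "192.168.15.1", "pf1": "192.168.15.2", "pf2": "192.168.15.3"},
    "DMZ": {"vip": "10.10.30.1", "pf1": "10.10.30.2", "pf2": "10.10.30.3"},
}

def ping(ip):
    """Send one ICMP echo with a 1 s timeout (flags assume Linux iputils ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def reachability(probe=ping):
    """Probe every address and return {network: {name: reachable}}."""
    return {
        net: {name: probe(ip) for name, ip in hosts.items()}
        for net, hosts in ADDRESSES.items()
    }

def report(matrix):
    """Format the matrix as one 'NET name: ok/FAIL' line per address."""
    lines = []
    for net, hosts in matrix.items():
        for name, ok in hosts.items():
            lines.append(f"{net} {name}: {'ok' if ok else 'FAIL'}")
    return "\n".join(lines)
```

Running `print(report(reachability()))` from a LAN host and from a DMZ host, once with each node as master, gives four result sets that can be compared side by side.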
-
-
So more info:
While PF1 is the master, a server inside the DMZ can see the mac address of VIP and PF1, not PF2:
While PF2 is the master, same server in the DMZ can see the mac addresses of PF1 and PF2, but not VIP:
Another thing: when getting back to PF1 I am getting this message, not sure if it is actually an issue or not:
-
@viniciusbr
That doesn't seem to be normal behavior.
What does the system log say?
-
Logs after the failover:
PF1:
NOTE: check the top line, this one is ONLY for the DMZ, not for the others, not sure why
PF2:
PF2 goes to MASTER as expected:
PF1 has disabled status:
Now let's get all back, cleared the logs first
PF1 logs after becoming master again:
And then that RESET DEMOTION STATUS button appeared again, when clicking it I get new lines in the log:
And PF2 logs after becoming backup:
-
@viniciusbr
Don't disable CARP on PF1, just activate the "persistent maintenance mode".
-
The way you suggested:
PF1 went to backup status as expected and PF2 as master as expected.
NOTES:
- With PF1 as master I still cannot ping the PF1 IP from DMZ;
- With PF2 as master the DMZ is gone: I can only ping the PF1 IP; the VIP and the PF2 IP do not work;
PF1 logs:
PF2 logs:
-
@viniciusbr
Just to be sure, is promiscuous mode enabled on ESXi?
-
-
@viniciusbr
Check out the ARP entry in the PF2 log where you've hidden the IP.
It shows the IP moving from a CARP MAC to a physical MAC when PF2 takes over. That seems strange to me.
Possibly the virtual IP has the wrong type or something else is misconfigured on that interface.
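For reference, CARP virtual MACs have the form 00:00:5e:00:01:XX, where XX is the VHID in hex, so it is easy to tell whether an ARP entry points at the CARP address or at a physical NIC. A minimal sketch of that check; the function name and the sample MACs are made up for illustration and are not taken from the logs in this thread:

```python
import re

# CARP virtual MACs are 00:00:5e:00:01:<VHID as two hex digits>.
CARP_PREFIX = "00:00:5e:00:01:"

def carp_vhid(mac):
    """Return the VHID if mac is a CARP virtual MAC, otherwise None."""
    mac = mac.strip().lower().replace("-", ":")
    if not re.fullmatch(r"([0-9a-f]{2}:){5}[0-9a-f]{2}", mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    if mac.startswith(CARP_PREFIX):
        return int(mac[-2:], 16)
    return None

print(carp_vhid("00:00:5E:00:01:03"))  # 3
print(carp_vhid("00:0c:29:aa:bb:cc"))  # None (00:0c:29 is a VMware OUI)
```

If a VIP's ARP entry resolves to a MAC for which this returns None, hosts are seeing the address on a physical (or virtual machine) NIC rather than on the CARP MAC.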
@viragomann said in HA strange behaviour, problems on passive box:
Possibly the virtual IP has the wrong type or something else is misconfigured on that interface.
I am investigating this, but that one is a public IP, which as far as I understand I am not having issues with.
DMZ is still the problem.
-
Here are the virtual IPs:
PF1:
PF2:
-
Now the basic questions:
- Why can I ping both PF1 and the VIP from LAN, but NOT PF2? (DMZ IPs)
Looking at the firewall, it is allowed:
- same issue from DMZ:
-
I was able to find the root cause of the issue:
It had nothing to do with pfSense: it was a misconfiguration of access/trunk modes between VMware and the switch ports.
Thanks for the help!
-
@viniciusbr
Thanks for coming back with clarification.
-
@viragomann Thanks for your help!