Lost outbound internet on private network after hypervisor moved VMs between nodes during firmware update
-
Hi,
We have a Nutanix cluster that consists of 7 hardware nodes and uses pfSense as its gateway/DHCP server. Recently we updated the firmware of these nodes via Nutanix LCM. LCM live-migrates all running VMs from one node to the other available nodes, updates that node, and repeats the process until all nodes are updated. Most of these VMs have two VNICs, one with a public IP and one with a private IP. After migrating from node to node, most (if not all) lost internet connectivity and I couldn't access them via SSH on their public address. As a workaround, removing the VNICs and re-adding them restored internet connectivity (inbound/outbound) via the public IP and also allowed me to SSH from one VM to another using the private IP.
Adding the new VNICs to the VMs for the private range generated new MAC addresses. I updated these in pfSense, and running netplan apply on the VMs enabled them to obtain the correct private IP via DHCP.
However, there are a handful of VMs that don't have two VNICs; they only have one VNIC for the private IP (LAN). Re-adding the VNIC for these VMs did enable me to SSH to them via a bastion, but they no longer have outbound internet access. They did manage to contact the pfSense gateway to obtain a private IP via DHCP. However, when I run tracepath or ping against the pfSense gateway they can't reach it, although they can tracepath other VMs on the private network.
This is the version of pfSense we're using:
2.4.5-RELEASE (amd64)
If anyone else has come across a similar issue or has ideas that could help me get to the bottom of this, that would be great.
-
Since you changed nothing on pfSense (at least directly), I would go looking for the root cause in the Nutanix cluster update process. My first guess would be that, during the move from node to node, the Nutanix process changed something about the VNICs (it could have been a MAC address, or something related to VLAN IDs if those are used, etc.). Changes to the VNIC could leave pfSense "confused" about which interface is LAN and which is WAN, for example.
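If you want to rule the MAC theory in or out quickly, a rough Python sketch like the one below might help. It assumes you have copied the MAC addresses by hand from the pfSense DHCP static mappings and from the VMs themselves (for example from `ip link` output) into two dictionaries. The VM names and addresses here are made up, and the script does not talk to pfSense or Nutanix at all; it only compares the two lists.

```python
import re


def normalize_mac(mac: str) -> str:
    """Return a MAC as lowercase, colon-separated hex (aa:bb:cc:dd:ee:ff)."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_mac_mismatches(expected: dict, observed: dict) -> dict:
    """Map each VM whose MAC differs to (expected_mac, observed_mac or None)."""
    mismatches = {}
    for vm, mac in expected.items():
        want = normalize_mac(mac)
        got = normalize_mac(observed[vm]) if vm in observed else None
        if want != got:
            mismatches[vm] = (want, got)
    return mismatches


# Hypothetical sample data: replace with your own values.
pfsense_static_mappings = {
    "app-01": "02:00:00:00:00:01",
    "db-01": "02-00-00-00-00-02",
    "cache-01": "02:00:00:00:00:03",
}
vm_reported_macs = {
    "app-01": "02:00:00:00:00:01",
    "db-01": "02:00:00:00:00:AA",  # changed after the VNIC was re-added
}

for vm, (want, got) in find_mac_mismatches(
    pfsense_static_mappings, vm_reported_macs
).items():
    print(f"{vm}: pfSense has {want}, VM reports {got or 'nothing recorded'}")
```

Any VM that shows up in the output has a mapping in pfSense that no longer matches its current VNIC, so that would be the first place I'd look. If nothing shows up, the MACs are probably fine and VLAN tagging on the re-added VNIC would be my next suspect.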