States not synced between VMs
-
I have been looking at this for a few days now and I have been stumped. Similar questions have been answered but I have found no solutions to my problem.
I have two pfSense VMs, one running on Proxmox and the other on ESXi, both on 2.3.3. Configuration sync is working, but the active node shows 1000+ states while the standby shows around 20.
I have a /29 allocated to me: 2 IPs for the pfSense interface addresses, one for the upstream gateway, and 3 for WAN CARP addresses, all in the same subnet. There is also one CARP address for the LAN and one for the DMZ. A dedicated failover interface is configured, state sync is enabled on both nodes, and I have tried both multicast and unicast pfsync configurations.
At the hypervisor level, the 4 interfaces (WAN, LAN, DMZ, and failover) are set up as VLANs and tagged to the single Cisco 3560E switch between the two servers.
None of my troubleshooting indicates any communication problem between the VMs. All CARP IPs on the WAN, LAN, and DMZ show as Master on the active VM and Backup on the standby VM. If I run tcpdump on the standby's failover interface, I can see the pfsync packets arriving, both unicast and multicast, but the standby's state table does not change. Configuration sync also works over the failover interface, the firewall rules for the failover interface are wide open, and I can see an active state for the pfsync traffic on both the active and standby.
Is there anything else I can check or something I overlooked?
-
An update:
I have now tried deploying a fresh pfSense 2.3.3 OVA to both ESXi and Proxmox and restoring the configuration from a backup, but the issue persisted, which suggested a configuration issue. I then factory reset both and built a configuration from scratch with simple rules and no extra packages besides openvm-tools and the Automatic backups, and the issue still persists. I am, again, stumped. Any input would be appreciated; even just a confirmation that similar setups are working on 2.3.3 would be encouraging.
For testing, I just use a simple persistent connection, e.g. telnet to route-server.he.net, and look for the telnet session in the state table on both VMs. I have yet to find the state on the secondary.
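For anyone repeating this test, here is a minimal Python sketch of the check. It assumes you have saved the text of `pfctl -ss` from each node and that the lines look roughly like the samples below. The sample addresses, interface names, and the `find_states` helper are made up for illustration, and the match is deliberately loose (IPv4 only, any endpoint on the given port).

```python
def find_states(pfctl_output, proto, port):
    """Return state lines for `proto` where any endpoint uses `port`.

    Assumes whitespace-separated lines shaped like `pfctl -ss` output:
    <iface> <proto> <addr:port> [(<addr:port>)] <-|-> <addr:port> <STATE:STATE>
    """
    hits = []
    for line in pfctl_output.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[1] != proto:
            continue
        # Compare the text after the last ':' of each token with the port.
        # Tokens such as '->' or 'ESTABLISHED:ESTABLISHED' never match a number.
        for tok in tokens[2:]:
            if tok.strip("()").rsplit(":", 1)[-1] == str(port):
                hits.append(line.strip())
                break
    return hits


# Illustrative samples only (documentation IP ranges), not real captures.
PRIMARY_SAMPLE = """\
vmx1 tcp 198.51.100.7:23 <- 192.168.1.10:51234       ESTABLISHED:ESTABLISHED
vmx0 tcp 203.0.113.2:40211 (192.168.1.10:51234) -> 198.51.100.7:23       ESTABLISHED:ESTABLISHED
vmx0 udp 203.0.113.2:123 -> 198.51.100.9:123       MULTIPLE:MULTIPLE
"""
STANDBY_SAMPLE = """\
vtnet0 udp 203.0.113.3:123 -> 198.51.100.9:123       MULTIPLE:MULTIPLE
"""

if __name__ == "__main__":
    for name, text in (("primary", PRIMARY_SAMPLE), ("standby", STANDBY_SAMPLE)):
        found = find_states(text, "tcp", 23)
        print(f"{name}: {len(found)} telnet state(s)")
```

If the telnet session shows up for the primary but the list is empty for the standby, that matches the symptom described above.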
-
I have resolved the issue. It appears I was hitting a change in pfsync as of pfSense 2.2, as shown here:
https://forum.pfsense.org/index.php?topic=93052.0
https://forum.pfsense.org/index.php?topic=93132.msg519077#msg519077
Since I was using VMXNET3 interfaces in ESXi and VirtIO interfaces in Proxmox, they show up as different hardware because they use different drivers, and pfsync cannot function properly. The workaround in the previous threads was to create a LAGG, but the simpler solution in this case was to change Proxmox to use VMXNET3 interfaces, and my states are synced perfectly now. Changing both VMs to use E1000 interfaces likely would have worked too.
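As a quick sanity check for a similar setup, a small Python sketch like the one below can flag roles whose NIC names differ between the two nodes. It assumes the mismatch is visible in the interface names (on FreeBSD, VMXNET3 NICs appear as vmx devices and VirtIO NICs as vtnet devices). The role-to-NIC mappings and the `mismatched_interfaces` helper are hypothetical; fill them in by hand from each node's interface assignments.

```python
def mismatched_interfaces(primary, secondary):
    """Return {role: (primary_nic, secondary_nic)} for roles whose NIC names differ."""
    return {
        role: (nic, secondary.get(role))
        for role, nic in primary.items()
        if secondary.get(role) != nic
    }


# Hypothetical assignments: ESXi node with VMXNET3, Proxmox node with VirtIO.
primary = {"WAN": "vmx0", "LAN": "vmx1", "DMZ": "vmx2", "SYNC": "vmx3"}
secondary = {"WAN": "vtnet0", "LAN": "vtnet1", "DMZ": "vtnet2", "SYNC": "vtnet3"}

if __name__ == "__main__":
    for role, (a, b) in mismatched_interfaces(primary, secondary).items():
        print(f"{role}: primary uses {a}, secondary uses {b}")
```

An empty result means every role maps to the same NIC name on both nodes, which is the situation I ended up with after switching Proxmox to VMXNET3.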