Secondary router not passing traffic
-
Hi all,
I have two servers, A and B, connected via a switched network, both running XCP-ng/XenServer 8.1.
Both servers A and B have the exact same hardware config and overall configuration. Each server has its own /32 IP, which is used only for management purposes and API.
All the networks (a public subnet and the internal networks) are delivered point-to-point via VLANs.
I configured CARP HA following the tutorials. I have gone through the config step by step several times, and everything is correctly configured:
- Status > CARP shows the same pfSync nodes on both sides
- Configuration done on the primary is reflected on the secondary in under a second
- I can ping without a problem
- All the IPs are reflected on both routers
- IPSec site-to-site VPN uses the CARP WAN IP
- NAT rules NAT to the CARP WAN IP
Testing CARP using either Status > CARP > "Enter persistent CARP maintenance mode" or shutting down the first router:
- Secondary instantly changes from Backup to Master on all CARP IPs
- No traffic to outside, can't access the VMs. IPSec VPN goes down.
Primary router comes up:
- Automatically takes the Master role and the secondary goes to Backup on all CARP IPs
- Traffic starts flowing normally
- VPN quickly recovers
And a weird behaviour:
Whenever I reboot the secondary router, the certificate shows as untrusted and I have to accept the certificate again. Only when I reboot, only on the secondary router.
I'd truly appreciate some help, because I have no clue where to look next.
System Logs on the secondary router (all the logs below are from the secondary firewall) - newest entries on top:
Aug 27 16:40:37 kernel arp: 1.2.3.146 moved from 78:20:94:35:f8:98 to 82:90:7c:4d:88:60 on xn0
Aug 27 16:40:31 kernel arp: 1.2.3.146 moved from 82:90:7c:4d:88:60 to 78:20:94:35:f8:98 on xn0
Aug 27 16:40:07 kernel arp: 1.2.3.146 moved from 78:20:94:35:f8:98 to 82:90:7c:4d:88:60 on xn0
Aug 27 16:39:40 kernel arp: 1.2.3.146 moved from 82:90:7c:4d:88:60 to 78:20:94:35:f8:98 on xn0
Aug 27 16:38:53 kernel arp: 1.2.3.146 moved from 78:20:94:35:f8:98 to 82:90:7c:4d:88:60 on xn0
Aug 27 16:38:17 kernel arp: 1.2.3.146 moved from 82:90:7c:4d:88:60 to 78:20:94:35:f8:98 on xn0
1.2.3.146 is the upstream gateway
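The log above shows the gateway address alternating between two MAC addresses several times within a few minutes. To tally such moves over a longer log export, a minimal Python sketch could look like the following. The function name `summarize_arp_moves` and the `sample` string are hypothetical, and the pattern only assumes the "arp: ... moved from ... to ... on ..." wording seen in the entries above.

```python
import re
from collections import Counter

# Matches kernel messages worded like the entries above:
# "arp: <ip> moved from <old-mac> to <new-mac> on <iface>"
ARP_MOVE = re.compile(
    r"arp: (?P<ip>\d+\.\d+\.\d+\.\d+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<iface>\w+)"
)

def summarize_arp_moves(log_text):
    """Return {(ip, iface): Counter of MACs the IP moved to}."""
    moves = {}
    # finditer works whether entries are one per line or run together
    for m in ARP_MOVE.finditer(log_text):
        key = (m["ip"], m["iface"])
        moves.setdefault(key, Counter())[m["new"]] += 1
    return moves

# Two entries copied from the log above (hypothetical variable name)
sample = (
    "Aug 27 16:40:37 kernel arp: 1.2.3.146 moved from 78:20:94:35:f8:98 "
    "to 82:90:7c:4d:88:60 on xn0 "
    "Aug 27 16:40:31 kernel arp: 1.2.3.146 moved from 82:90:7c:4d:88:60 "
    "to 78:20:94:35:f8:98 on xn0"
)
result = summarize_arp_moves(sample)
for (ip, iface), macs in result.items():
    print(ip, iface, dict(macs))
```

An address that keeps moving back and forth between the same two MACs will show up here with two counters of similar size, which makes the flapping easy to spot.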
When I enable persistent maintenance mode on the primary, this is what shows in the log on the secondary router - newest entries on top:
Aug 27 16:40:52 php-fpm 60821 /rc.carpmaster: HA cluster member "(172.16.140.254@xn2): (DMZ)" has resumed CARP state "MASTER" for vhid 5
Aug 27 16:40:51 php-fpm 12134 /rc.carpmaster: HA cluster member "(1.2.3.155@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 58
Aug 27 16:40:51 php-fpm 60821 /rc.carpmaster: HA cluster member "(1.2.3.153@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 56
Aug 27 16:40:51 php-fpm 12134 /rc.carpmaster: HA cluster member "(1.2.3.152@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 55
Aug 27 16:40:51 php-fpm 348 /rc.carpmaster: HA cluster member "(1.2.3.151@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 54
Aug 27 16:40:51 php-fpm 60821 /rc.carpmaster: HA cluster member "(1.2.3.150@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 53
Aug 27 16:40:51 php-fpm 348 /rc.carpmaster: HA cluster member "(1.2.3.147@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 50
Aug 27 16:40:51 php-fpm 348 /rc.carpmaster: HA cluster member "(172.16.200.254@xn4): (DATA)" has resumed CARP state "MASTER" for vhid 20
Aug 27 16:40:51 php-fpm 12134 /rc.carpmaster: HA cluster member "(172.16.190.254@xn3): (ADM)" has resumed CARP state "MASTER" for vhid 13
Aug 27 16:40:51 php-fpm 60821 /rc.carpmaster: HA cluster member "(172.16.110.254@xn1): (LAN)" has resumed CARP state "MASTER" for vhid 1
Aug 27 16:40:50 check_reload_status Reloading filter
Aug 27 16:40:50 php-fpm 347 /xmlrpc.php: Resyncing OpenVPN instances.
Aug 27 16:40:50 php-fpm 347 /xmlrpc.php: Gateway, none 'available' for inet6, use the first one configured. ''
Aug 27 16:40:50 php-fpm 348 /rc.carpmaster: HA cluster member "(1.2.3.159@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 62
Aug 27 16:40:50 php-fpm 12134 /rc.carpmaster: HA cluster member "(1.2.3.158@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 61
Aug 27 16:40:50 php-fpm 348 /rc.carpmaster: HA cluster member "(1.2.3.157@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 60
Aug 27 16:40:50 php-fpm 12134 /rc.carpmaster: HA cluster member "(1.2.3.156@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 59
Aug 27 16:40:50 php-fpm 348 /rc.carpmaster: HA cluster member "(1.2.3.154@xn0): (WAN)" has resumed CARP state "MASTER" for vhid 57
Aug 27 16:40:49 kernel carp: 62@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 61@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 60@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 5@xn2: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 59@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 58@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 57@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 56@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 55@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 54@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 53@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 50@xn0: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 20@xn4: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 1@xn1: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 kernel carp: 13@xn3: BACKUP -> MASTER (preempting a slower master)
Aug 27 16:40:49 check_reload_status Syncing firewall
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
Aug 27 16:40:49 check_reload_status Carp master event
And when I disable maintenance mode on the primary, the secondary logs this - newest entries on top:
Aug 27 16:41:34 php-fpm 64435 /rc.carpbackup: HA cluster member "(1.2.3.158@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 61
Aug 27 16:41:34 php-fpm 348 /rc.carpbackup: HA cluster member "(1.2.3.153@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 56
Aug 27 16:41:34 php-fpm 8955 /rc.carpbackup: HA cluster member "(1.2.3.152@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 55
Aug 27 16:41:34 php-fpm 60821 /rc.carpbackup: HA cluster member "(1.2.3.151@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 54
Aug 27 16:41:34 php-fpm 12134 /rc.carpbackup: HA cluster member "(1.2.3.150@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 53
Aug 27 16:41:34 php-fpm 64435 /rc.carpbackup: HA cluster member "(1.2.3.147@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 50
Aug 27 16:41:33 check_reload_status Reloading filter
Aug 27 16:41:33 php-fpm 20724 /xmlrpc.php: Gateway, none 'available' for inet6, use the first one configured. ''
Aug 27 16:41:33 php-fpm 347 /rc.carpbackup: HA cluster member "(1.2.3.159@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 62
Aug 27 16:41:33 php-fpm 60821 /rc.carpbackup: HA cluster member "(1.2.3.157@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 60
Aug 27 16:41:33 php-fpm 347 /rc.carpbackup: HA cluster member "(1.2.3.156@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 59
Aug 27 16:41:33 php-fpm 348 /rc.carpbackup: HA cluster member "(1.2.3.155@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 58
Aug 27 16:41:33 php-fpm 12134 /rc.carpbackup: HA cluster member "(1.2.3.154@xn0): (WAN)" has resumed CARP state "BACKUP" for vhid 57
Aug 27 16:41:33 php-fpm 60821 /rc.carpbackup: HA cluster member "(172.16.140.254@xn2): (DMZ)" has resumed CARP state "BACKUP" for vhid 5
Aug 27 16:41:33 php-fpm 12134 /rc.carpbackup: HA cluster member "(172.16.200.254@xn4): (DATA)" has resumed CARP state "BACKUP" for vhid 20
Aug 27 16:41:33 php-fpm 64435 /rc.carpbackup: HA cluster member "(172.16.190.254@xn3): (ADM)" has resumed CARP state "BACKUP" for vhid 13
Aug 27 16:41:33 php-fpm 348 /rc.carpbackup: HA cluster member "(172.16.110.254@xn1): (LAN)" has resumed CARP state "BACKUP" for vhid 1
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn4: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn3: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn2: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn1: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel ifa_maintain_loopback_route: deletion failed for interface xn0: 3
Aug 27 16:41:32 kernel carp: 62@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 61@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 60@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 5@xn2: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 59@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 58@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 57@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 56@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 55@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 54@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 53@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 50@xn0: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 20@xn4: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 1@xn1: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 kernel carp: 13@xn3: MASTER -> BACKUP (more frequent advertisement received)
Aug 27 16:41:32 check_reload_status Syncing firewall
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
Aug 27 16:41:32 check_reload_status Carp backup event
But no traffic goes through the secondary.
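To double-check that every VHID really changed state together (rather than eyeballing the dumps above), a small Python sketch along these lines could summarize the kernel "carp:" transitions. The function name `latest_carp_states` and the `sample` text are hypothetical; the pattern assumes only the "carp: <vhid>@<iface>: OLD -> NEW" wording shown in the logs, and that entries are listed newest first as they are here.

```python
import re

# Matches kernel lines such as
# "carp: 62@xn0: BACKUP -> MASTER (preempting a slower master)"
CARP_EVENT = re.compile(
    r"carp: (?P<vhid>\d+)@(?P<iface>\w+): (?P<old>[A-Z]+) -> (?P<new>[A-Z]+)"
)

def latest_carp_states(log_text, newest_first=True):
    """Return {(vhid, iface): most recent state} from kernel carp lines."""
    states = {}
    for m in CARP_EVENT.finditer(log_text):
        key = (int(m["vhid"]), m["iface"])
        if newest_first:
            # First match per key is the most recent transition
            states.setdefault(key, m["new"])
        else:
            # Later matches overwrite earlier ones
            states[key] = m["new"]
    return states

# Three entries copied from the logs above, newest first
sample = (
    "Aug 27 16:41:32 kernel carp: 62@xn0: MASTER -> BACKUP "
    "(more frequent advertisement received) "
    "Aug 27 16:40:49 kernel carp: 62@xn0: BACKUP -> MASTER "
    "(preempting a slower master) "
    "Aug 27 16:40:49 kernel carp: 5@xn2: BACKUP -> MASTER "
    "(preempting a slower master)"
)
states = latest_carp_states(sample)
for (vhid, iface), state in sorted(states.items()):
    print(f"vhid {vhid} on {iface}: {state}")
```

If every key ends up in the same state after a failover test, the CARP side behaved consistently and the problem is more likely elsewhere in the path.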
-
Is "Synchronize states" checked on both routers in System/HA Sync? Not having states synced would block existing connections but new connections should work.
Possibly something upstream isn't liking the IP changes? Did you look at https://docs.netgate.com/pfsense/en/latest/book/highavailability/high-availability-troubleshooting.html#other-switch-and-layer-2-issues