Netgate Discussion Forum

    CARP failing over when rc.newwanip runs?

    HA/CARP/VIPs
      pfsense-user1234

      I'm having an issue with my CARP setup: it fails over to the other unit, seemingly at random.

      I have two units on AWS, both running CARP on the WAN interface with the AWS HA plugin. When a unit fails over this way, the other unit takes over the route table, and when it fails back to the original master, the route table is not changed back. As a result, traffic stops going out and stops passing over the VPN tunnels.

      In system.log on the master, this seems to happen when rc.newwanip runs. I'm not sure what this script does or how it is triggered; I don't see any interface up/down events in the logs.

      Master Unit
      Sep  3 11:08:47 pfSense-AZa kernel: ena0: ioctl promisc/allmulti
      Sep  3 11:08:48 pfSense-AZa check_reload_status[458]: Carp backup event
      Sep  3 11:08:48 pfSense-AZa check_reload_status[458]: rc.newwanip starting ena0
      Sep  3 11:08:48 pfSense-AZa kernel: carp: 1@ena0: MASTER -> INIT (hardware interface up)
      Sep  3 11:08:48 pfSense-AZa kernel: ena0: promiscuous mode disabled
      Sep  3 11:08:49 pfSense-AZa php-fpm[14697]: /rc.carpbackup: HA cluster member "(192.0.2.101@ena0): (WAN)" has resumed CARP state "BACKUP" for vhid 1
      Sep  3 11:08:49 pfSense-AZa php-fpm[14697]: /rc.newwanip: rc.newwanip: Info: starting on ena0.
      Sep  3 11:08:49 pfSense-AZa php-fpm[14697]: /rc.newwanip: rc.newwanip: on (IP address: 10.200.247.175) (interface: WAN[wan]) (real interface: ena0).
      Sep  3 11:08:49 pfSense-AZa check_reload_status[458]: Carp backup event
      Sep  3 11:08:49 pfSense-AZa php-fpm[14697]: /rc.newwanip: waiting for pfsync...
      Sep  3 11:08:49 pfSense-AZa kernel: ena0: ioctl promisc/allmulti
      Sep  3 11:08:49 pfSense-AZa kernel: ena0: promiscuous mode enabled
      Sep  3 11:08:49 pfSense-AZa kernel: carp: 1@ena0: INIT -> BACKUP (initialization complete)
      Sep  3 11:08:49 pfSense-AZa kernel: carp: demoted by 0 to 0 (pfsync bulk start)
      Sep  3 11:08:50 pfSense-AZa kernel: carp: demoted by 0 to 0 (pfsync bulk done)
      Sep  3 11:08:50 pfSense-AZa php-fpm[95703]: /rc.carpbackup: HA cluster member "(192.0.2.101@ena0): (WAN)" has resumed CARP state "BACKUP" for vhid 1
      Sep  3 11:08:52 pfSense-AZa check_reload_status[458]: Carp master event
      Sep  3 11:08:52 pfSense-AZa kernel: carp: 1@ena0: BACKUP -> MASTER (preempting a slower master)
      Sep  3 11:08:53 pfSense-AZa php-fpm[95703]: /rc.carpmaster: HA cluster member "(192.0.2.101@ena0): (WAN)" has resumed CARP state "MASTER" for vhid 1
      Sep  3 11:08:55 pfSense-AZa php-fpm[95703]: /rc.carpmaster: Couldn't get parameters for vhid  on interface ena1
      Sep  3 11:08:55 pfSense-AZa php-fpm[95703]: /rc.carpmaster: Couldn't determine advbase and advskew for vhid  on interface ena1
      Sep  3 11:08:56 pfSense-AZa kernel: arpresolve: can't allocate llinfo for 10.200.2.1 on ena1
      Sep  3 11:08:56 pfSense-AZa kernel: arpresolve: can't allocate llinfo for 10.200.2.1 on ena1
      Sep  3 11:09:21 pfSense-AZa php-fpm[14697]: /rc.newwanip: pfsync done in 31 seconds.
      Sep  3 11:09:21 pfSense-AZa php-fpm[14697]: /rc.newwanip: Configuring CARP settings finalize...
      Sep  3 11:09:21 pfSense-AZa check_reload_status[458]: Reloading filter
      Sep  3 11:09:21 pfSense-AZa check_reload_status[458]: Reloading filter
      Sep  3 11:09:52 pfSense-AZa php-cgi[36489]: aws_highavail_periodic: New alert found: Resource eipalloc-0a117cb1c304b1f63 has been modified by a lower priority master,
      Sep  3 11:09:52 pfSense-AZa php-cgi[36489]: troubleshooting of CARP vhid wan@1 may be necessary.
      Sep  3 11:09:52 pfSense-AZa php-cgi[36489]: The resource has been restored to the expected state.
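
      To check whether CARP transitions really cluster around rc.newwanip, a small script can scan system.log for both. The following is a minimal sketch, not a pfSense tool: the function names, the 5-second window, and the year (syslog lines carry none) are all assumptions chosen for illustration, and the regular expressions are modeled only on the line formats shown in the excerpt above.

```python
import re
from datetime import datetime

TS = r"(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2})"

# Kernel CARP transitions, e.g.
# "carp: 1@ena0: MASTER -> INIT (hardware interface up)".
CARP_RE = re.compile(
    r"^" + TS + r" \S+ kernel: carp: (?P<vhid>\d+)@(?P<ifname>\S+): "
    r"(?P<old>\w+) -> (?P<new>\w+) \((?P<reason>[^)]*)\)"
)
# "check_reload_status[458]: rc.newwanip starting ena0".
NEWWANIP_RE = re.compile(
    r"^" + TS + r" \S+ check_reload_status\[\d+\]: "
    r"rc\.newwanip starting (?P<ifname>\S+)"
)


def parse_ts(text, year):
    # Collapse the padded day ("Sep  3") and add the assumed year.
    return datetime.strptime(f"{year} {' '.join(text.split())}",
                             "%Y %b %d %H:%M:%S")


def transitions_after_newwanip(lines, window_seconds=5, year=2025):
    """Return CARP transitions logged within window_seconds after an
    rc.newwanip start on the same interface."""
    starts, hits = [], []
    for raw in lines:
        line = raw.strip()
        m = NEWWANIP_RE.match(line)
        if m:
            starts.append((parse_ts(m["ts"], year), m["ifname"]))
            continue
        m = CARP_RE.match(line)
        if not m:
            continue
        when = parse_ts(m["ts"], year)
        for started, ifname in starts:
            delay = (when - started).total_seconds()
            if ifname == m["ifname"] and 0 <= delay <= window_seconds:
                hits.append({"old": m["old"], "new": m["new"],
                             "reason": m["reason"], "delay_s": delay})
                break
    return hits
```

      Fed the master's log lines from this post, this should flag the MASTER -> INIT, INIT -> BACKUP, and BACKUP -> MASTER transitions as following the rc.newwanip start on ena0. Syslog timestamps have one-second resolution, so the delays are approximate.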
      
      

      On the secondary unit, I get a "master timed out" event that switches it from backup to master, and then it changes right back to backup.

      Backup Unit
      Sep  3 11:08:52 pfSense-AZb check_reload_status[458]: Carp master event
      Sep  3 11:08:52 pfSense-AZb check_reload_status[458]: Carp backup event
      Sep  3 11:08:52 pfSense-AZb kernel: carp: 1@ena0: BACKUP -> MASTER (master timed out)
      Sep  3 11:08:52 pfSense-AZb kernel: carp: 1@ena0: MASTER -> BACKUP (more frequent advertisement received)
      Sep  3 11:08:53 pfSense-AZb php-fpm[406]: /rc.carpmaster: HA cluster member "(192.0.2.101@ena0): (WAN)" has resumed CARP state "MASTER" for vhid 1
      Sep  3 11:08:53 pfSense-AZb php-fpm[57097]: /rc.carpbackup: HA cluster member "(192.0.2.101@ena0): (WAN)" has resumed CARP state "BACKUP" for vhid 1
      Sep  3 11:08:55 pfSense-AZb php-fpm[406]: /rc.carpmaster: Couldn't get parameters for vhid  on interface ena1
      Sep  3 11:08:55 pfSense-AZb php-fpm[406]: /rc.carpmaster: Couldn't determine advbase and advskew for vhid  on interface ena1
      Sep  3 11:08:55 pfSense-AZb php-fpm[406]: /rc.carpmaster: Couldn't get parameters for vhid  on interface ena1
      Sep  3 11:08:55 pfSense-AZb php-fpm[406]: /rc.carpmaster: Couldn't determine advbase and advskew for vhid  on interface ena1
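
      Comparing the two logs shows how long the VIP went without a master before the secondary stepped in. The sketch below measures that gap from the kernel CARP lines of both units. It is illustrative only: the helper names and the default year are assumptions, and it relies on both units having synchronized clocks.

```python
import re
from datetime import datetime

# Kernel CARP transition lines, capturing hostname and both states.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: carp: "
    r"\d+@\S+: (?P<old>\w+) -> (?P<new>\w+) \((?P<reason>[^)]*)\)"
)


def carp_events(lines, year=2025):
    """Parse CARP transitions into (time, host, old, new, reason) tuples.
    The year is an assumption because syslog lines omit it."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            stamp = " ".join(m["ts"].split())  # collapse "Sep  3" padding
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            events.append((when, m["host"], m["old"], m["new"], m["reason"]))
    return events


def takeover_gap_seconds(primary_lines, secondary_lines):
    """Seconds between the primary first leaving MASTER and the secondary
    first entering MASTER afterwards, or None if either event is missing."""
    left = [e[0] for e in carp_events(primary_lines) if e[2] == "MASTER"]
    took = [e[0] for e in carp_events(secondary_lines) if e[3] == "MASTER"]
    if not left:
        return None
    first_left = min(left)
    later = [t for t in took if t >= first_left]
    return (min(later) - first_left).total_seconds() if later else None
```

      For the excerpts in this post, the primary leaves MASTER at 11:08:48 and the secondary reports "master timed out" at 11:08:52, so the gap comes out at roughly 4 seconds, give or take the one-second resolution of the timestamps.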
      
      

      Here is a picture of my VIP.
      b4411e58-2168-4e11-872e-d418d4b62cd5-image.png

      I've tried different values for the advertising base (advbase), but this doesn't seem to resolve the issue.
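
      For context on why the base value matters: as far as I understand FreeBSD's CARP implementation, a backup declares the master down after roughly 3 × advbase + advskew / 256 seconds without advertisements. That formula and the value ranges should be verified against carp(4), and the base/skew values below are hypothetical, since the actual VIP settings are only in the screenshot. A quick sketch of the arithmetic:

```python
def master_down_timeout(advbase: int, advskew: int) -> float:
    """Approximate seconds a CARP backup waits before taking over.

    Assumes the FreeBSD rule of thumb 3 * advbase + advskew / 256;
    check carp(4) for the authoritative behaviour.
    """
    if not 1 <= advbase <= 255:
        raise ValueError("advbase is expected to be in 1..255")
    if not 0 <= advskew <= 254:
        raise ValueError("advskew is expected to be in 0..254")
    return 3 * advbase + advskew / 256


# Hypothetical settings for a secondary unit, not taken from the post.
for base in (1, 2, 5):
    print(base, master_down_timeout(base, advskew=100))
```

      Under that assumption, a secondary with base 1 and skew 100 would wait a little under 3.4 seconds, which is shorter than the roughly 4 seconds the primary spent out of MASTER state in the logs above. A larger base would lengthen the timeout, at the cost of slower failover when the primary really is down.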

      The only thing I can find is that the rc.newwanip script runs right before the issue every time. The WAN interface is set to DHCP because it's on AWS, but the interface's IP never changes, as it's set statically in the AWS console. I was thinking of setting the WAN interface to a static IP instead of DHCP as a test, but I'll need to do that after hours. However, I don't believe this should be causing an issue with CARP. I may just have a misconfiguration that I'm not seeing.

      Thanks in advance for the help.

      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.