CARP sync broken? pfSense 2.1.3



  • Hi

    Running 2.1.3, the latest version at the moment.

    I set everything up as described in: https://doc.pfsense.org/index.php/Configuring_pfSense_Hardware_Redundancy_(CARP)

    I checked the password as well… but nothing will sync. The only hint I get under Status > Filter Reload is this:

    The other member is on older configuration version of pfSense. Sync will not be done to prevent problems!…

    On the slave I can see some connections from the master on port 80 and some PFSYNC traffic.

    But no rules, settings, etc. get synced.

    The master is a physical box.
    The slave is a VM in VMware. The special settings for VMware have been enabled as instructed in the doc.

    Both are running the same version, so why am I getting the warning above? Maybe that's why nothing is getting synced?

    Thanks in advance!


  • Rebel Alliance Developer Netgate

    Are both nodes actually running 2.1.3?



  • @jimp:

    Are both nodes actually running 2.1.3?

    Yes, they are both running 2.1.3. Both show:

    2.1.3-RELEASE (amd64)
    built on Thu May 01 15:52:13 EDT 2014
    FreeBSD 8.3-RELEASE-p16

    The 1st one was upgraded from 2.1.0 to 2.1.3 (web update).
    The 2nd one is a clean new install of 2.1.3.


  • Rebel Alliance Developer Netgate

    I think I see a potential problem. I committed a fix, but for a fast test, try this on both (or at least the secondary):

    echo 8.3 > /etc/version_base
    echo 8.3 > /etc/version_kernel

    Then do another sync test.
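
    Before overwriting those two files, it may be worth recording what they currently contain. The following is only a sketch: the helper name `backup_version_files` and the `.bak` suffix are made up for illustration, and the directory defaults to `/etc` as in the commands above.

```shell
# Hypothetical helper (not part of pfSense): print and back up the
# version files that the test above overwrites.
backup_version_files() {
    dir="${1:-/etc}"    # directory holding the files; /etc as in the post
    for name in version_base version_kernel; do
        f="$dir/$name"
        if [ -f "$f" ]; then
            cp -p "$f" "$f.bak"                      # keep a copy next to the original
            printf '%s: %s\n' "$name" "$(cat "$f")"  # show the current value
        else
            printf '%s: missing\n' "$name"
        fi
    done
}
```

    Running `backup_version_files` on each node before the `echo` commands leaves a `.bak` copy that can be moved back if the test changes nothing.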



  • @jimp:

    I think I see a potential problem. I committed a fix, but for a fast test, try this on both (or at least the secondary):

    echo 8.3 > /etc/version_base
    echo 8.3 > /etc/version_kernel

    Then do another sync test.

    I tried this on both, but no difference. Still same issue.

    EDIT:
    Also tried this on both:
    pfSsh.php playback gitsync RELENG_2_1

    Doesn't work after gitsync - same issue.


  • Rebel Alliance Developer Netgate

    Are there any errors in the system log on either one, aside from the notice about the version not matching?



  • After doing gitsync there are more errors, but not related to CARP as far as I can see…

    
    May 20 12:21:01	php: rc.filter_synchronize: The other member is on older configuration version of pfSense. Sync will not be done to prevent problems!
    May 20 12:20:59	check_reload_status: Syncing firewall
    May 20 11:41:58	php: rc.filter_synchronize: The other member is on older configuration version of pfSense. Sync will not be done to prevent problems!
    May 20 11:41:57	kernel: pflog0: promiscuous mode enabled
    May 20 11:41:56	check_reload_status: Syncing firewall
    May 20 11:41:55	check_reload_status: Reloading filter
    May 20 11:41:37	php: rc.filter_synchronize: The other member is on older configuration version of pfSense. Sync will not be done to prevent problems!
    May 20 11:41:35	check_reload_status: Syncing firewall
    May 20 11:41:03	php: rc.restart_webgui: Creating rrd update script
    May 20 11:41:00	kernel: pflog0: promiscuous mode disabled
    May 20 11:41:00	check_reload_status: webConfigurator restart in progress
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.3587) all handlers for /getstats.php? on .php are down.
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 4 load: 1
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-0
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 3 load: 1
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-1
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-2
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-3
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1
    May 20 11:40:59	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-4
    May 20 11:40:58	lighttpd[27710]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi.socket
    May 20 11:40:58	lighttpd[27710]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi.socket
    May 20 11:40:58	lighttpd[27710]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi.socket
    May 20 11:40:58	lighttpd[27710]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi.socket
    May 20 11:40:58	lighttpd[27710]: (mod_fastcgi.c.2779) fcgi-server re-enabled: 0 /tmp/php-fastcgi.socket
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 4 load: 1
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-0
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 3 load: 1
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-1
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 2 load: 1
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-2
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 1 load: 1
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-3
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.3021) backend died; we'll disable it for 1 seconds and send the request to another backend instead: reconnects: 0 load: 1
    May 20 11:40:56	lighttpd[27710]: (mod_fastcgi.c.1754) connect failed: No such file or directory on unix:/tmp/php-fastcgi.socket-4
    May 20 11:40:55	php: pfSsh.php: Ended Configuration upgrade at 11:40:55
    May 20 11:40:55	php: pfSsh.php: Start Configuration upgrade at 11:40:55, set execution timeout to 15 minutes
    May 20 11:26:05	syslogd: kernel boot file is /boot/kernel/kernel
    
    

  • Rebel Alliance Developer Netgate

    Is that on the primary, or the secondary? Be sure to sync both nodes.

    Also look at a config.xml backup from both, see if they really are on the same "version" (see the "version" tag near the top of the config)
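
    A quick way to read that tag without opening the whole file could look like the sketch below. The function name `config_version` is hypothetical, and it assumes the first `<version>` element in the file is the configuration version (later ones may belong to other sections).

```shell
# Hypothetical helper: print the first <version> value found in a
# config.xml file or backup. Assumes that first match is the config version.
config_version() {
    sed -n 's:.*<version>\([^<]*\)</version>.*:\1:p' "$1" | head -n 1
}
```

    For example, `config_version primary-backup.xml` and `config_version secondary-backup.xml` (file names are placeholders) should print the same number if both nodes really are on the same configuration version.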



  • Hi again

    I downloaded both backups and I can see there is a difference between them

    The master, which was originally on 2.0.3 and then upgraded via the "upgrade" button, shows:

    <pfsense><version>10.7</version>
    <lastchange><theme>pfsense_ng</theme>
    <sysctl>

    and the secondary, which is a VM and a clean new install of 2.1.3, shows:

    <pfsense><version>10.1</version>
    <lastchange><theme>pfsense_ng</theme>
    <sysctl>

    Why is that, and how can I correct this?

    Thanks so far!  :)


  • Rebel Alliance Developer Netgate

    The first one was upgraded to 2.2 somehow. You'll need to restore back a config from 2.1.x



  • @jimp:

    The first one was upgraded to 2.2 somehow. You'll need to restore back a config from 2.1.x

    Damn, that is a problem… and very strange that it went to 2.2 when it actually says 2.1.3 on the dashboard.
    Another alternative could be to upgrade my secondary to 2.2?


  • Rebel Alliance Developer Netgate

    Maybe you did a gitsync to master and then immediately did another gitsync to RELENG_2_1. Hard to say, but the "10.7" in the master's config.xml indicates that the configuration came from 2.2.



  • @jimp:

    Maybe you did a gitsync to master and then immediately did another gitsync to RELENG_2_1. Hard to say, but the "10.7" in the master's config.xml indicates that the configuration came from 2.2.

    All right, I upgraded the secondary to 2.2, and now CARP sync does work… but..

    there are so many errors on the secondary: git is not found, the version is wrong, and the inconsistency with the primary is also scary...

    I will do a fresh install, since the primary is running production... The funny part is that something on the primary is very strange: some parts are running 2.2, others are running 2.1.3. I can't figure out what went wrong...

    thanks so far!


  • Rebel Alliance Developer Netgate

    You'd be better off on 2.1.x for the moment (2.2 is still alpha)

    You just need to make sure that your config.xml version is right for that version (10.1, not 10.7)
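
    As a rough illustration of that check, the two config versions mentioned in this thread can be mapped to the release line they belong to. This is only a sketch: the function name is invented, and only the values 10.1 and 10.7 come from this thread, so anything else is reported as unknown.

```shell
# Hypothetical helper: map a config.xml version to the release line
# it indicates. Only the two values discussed in this thread are covered.
release_for_config_version() {
    case "$1" in
        10.1) echo "2.1.x" ;;    # the value expected on 2.1.x
        10.7) echo "2.2" ;;      # indicates the config came from 2.2
        *)    echo "unknown" ;;  # not covered in this thread
    esac
}
```

    With the backups from this thread, `release_for_config_version 10.7` prints `2.2` for the primary, which is the mismatch that makes the sync refuse to run.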

