High Avail. Sync broken



  • Hello everyone.

    I have two pfSense boxes that were running 2.3 until now.
    Each one has a LAN interface and a SYNC interface.
    I upgraded them to 2.4.
    Since then, the sync no longer works and I cannot find the reason.
    Maybe you can help me.
    Thank you very much.

    Here are the logs from the pfSense master when it tries to sync:

    01/10/1930 17:45 check_reload_status Syncing firewall
    01/10/1930 17:45 php-fpm 53724 /rc.filter_synchronize: Beginning XMLRPC sync data to https://10.88.88.2:443/xmlrpc.php.
    01/10/1930 17:45 php-fpm 53724 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://10.88.88.2:443. Error: Operation timed out
    01/10/1930 17:45 php-fpm 53724 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://10.88.88.2:443. Error: Operation timed out
    01/10/1930 17:45 php-fpm 53724 /rc.filter_synchronize: Beginning XMLRPC sync data to https://10.88.88.2:443/xmlrpc.php.
    01/10/1930 17:46 php-fpm 53724 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://10.88.88.2:443. Error: Operation timed out
    01/10/1930 17:46 php-fpm 53724 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://10.88.88.2:443. Error: Operation timed out
    01/10/1930 17:46 php-fpm 53724 /rc.filter_synchronize: XMLRPC versioncheck: – 17.3
    01/10/1930 17:46 php-fpm 53724 /rc.filter_synchronize: The pfSense software configuration version of the other member could not be determined. Skipping synchronization to avoid causing a problem!

    There are no log entries on the pfSense slave.



  • Same as mine? https://forum.pfsense.org/index.php?topic=139032.0

    Once the primary IP gets blocked, I get the timeouts too, so it might be the same issue. No solution for that yet, though.



  • I commented on the other thread, but you need to have both boxes on the same version: upgrade them both.



  • Both pfSense boxes are on:

    2.4.1-RELEASE (amd64)
    built on Sun Oct 22 17:26:33 CDT 2017
    FreeBSD 11.1-RELEASE-p2



  • @andipadi
    I don't think it is the same issue; I have no log entries on the secondary pfSense.

    This morning I tried installing both pfSense boxes from scratch with the 2.4.1-RELEASE version: same issue!



  • I restarted from scratch and sync worked perfectly. Then I restored my firewall from a backup, and sync broke.

    Importing the configuration breaks sync every time.

    Any ideas?



  • I made another attempt.

    I reinstalled both pfSense boxes with 2.4, then:

    Configured the interfaces
    Configured sync
    Configured the firewall => synced perfectly to the slave
    Configured the DNS forwarder => synced perfectly to the slave
    Configured DHCP => synced perfectly to the slave
    Configured the virtual IP => synced perfectly to the slave

    And so on.
    Everything works and everything is synced perfectly to the slave.

    I then made a backup of the pfSense master and restored that backup onto the same master to check whether it works. Everything is there, but sync is broken:

    A communications error occurred while attempting to call XMLRPC method host_firmware_version: @ 2017-11-03 16:58:33

    Does anyone have an idea?


  • Netgate

    What is in the system log? There should be an explicit version check line like this:

    Nov 3 18:49:01 php-fpm 536 /rc.filter_synchronize: Beginning XMLRPC sync data to https://172.25.254.2:443/xmlrpc.php.
    Nov 3 18:49:01 php-fpm 536 /rc.filter_synchronize: XMLRPC reload data success with https://172.25.254.2:443/xmlrpc.php (pfsense.host_firmware_version).
    Nov 3 18:49:01 php-fpm 536 /rc.filter_synchronize: XMLRPC versioncheck: 17.3 – 17.3
    Nov 3 18:49:01 php-fpm 536 /rc.filter_synchronize: Beginning XMLRPC sync data to https://172.25.254.2:443/xmlrpc.php.
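
The difference between a healthy and a failing version check line can be spotted mechanically. The following is only a minimal sketch: `parse_versioncheck` is a hypothetical helper, not part of pfSense, and it assumes that the value to the left of the separator is the version reported by the other member and the value to the right is the local one. That reading is inferred from the "could not be determined" message that follows an empty left-hand value in the logs in this thread.

```python
import re

# Hypothetical helper, not part of pfSense.
# Assumption: left field = version reported by the peer, right field = local.
# The separator appears in this thread both as "--" and as an en dash.
VERSIONCHECK = re.compile(
    r"XMLRPC versioncheck:\s*(\S*?)\s*(?:--|\u2013)\s*(\S*)\s*$"
)


def parse_versioncheck(line):
    """Return (peer_version, local_version), or None if the line does not match.

    An empty peer_version means the peer never answered the version query.
    """
    match = VERSIONCHECK.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def peer_version_known(line):
    """True only when the line is a version check with a non-empty peer field."""
    parsed = parse_versioncheck(line)
    return parsed is not None and parsed[0] != ""
```

An empty left-hand field, as in the failing logs earlier in the thread, would make `peer_version_known` return `False`, which matches the case where synchronization is skipped.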



  • Thank you for your reply, Derelict.

    I have this line on the pfSense master:

    01/10/1930 17:46  php-fpm  53724  /rc.filter_synchronize: XMLRPC versioncheck: – 17.3

    Both pfSense boxes are on:

    2.4.1-RELEASE (amd64)
    built on Sun Oct 22 17:26:33 CDT 2017
    FreeBSD 11.1-RELEASE-p2

    Both were installed from the same USB stick.


  • Netgate

    Sounds like your SYNC interface is not configured correctly. Can you ping across it? Check the firewall rules on it.



  • Yes, the master (10.88.88.1) can ping the slave on 10.88.88.2,
    and the slave can ping the master on 10.88.88.1.

    I now have this in the log on the master:

    Time Process PID Message
    Nov 7 12:40:52 php-fpm 48758 /rc.filter_synchronize: Beginning XMLRPC sync data to https://10.88.88.2:443/xmlrpc.php.
    Nov 7 12:40:52 php-fpm 48758 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 7 12:40:52 php-fpm 48758 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 7 12:40:43 php-fpm 38230 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method filter_configure:
    Nov 7 12:40:43 php-fpm 38230 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method filter_configure:
    Nov 7 12:40:38 check_reload_status Reloading filter
    Nov 7 12:40:37 php-fpm 51322 /rc.filter_synchronize: Beginning XMLRPC sync data to https://10.88.88.2:443/xmlrpc.php.
    Nov 7 12:40:36 check_reload_status Syncing firewall
    Nov 7 12:40:18 pfsense1 nginx: 2017/11/07 12:40:18 [error] 12688#100122: send() failed (54: Connection reset by peer)
    Nov 7 12:40:18 php-fpm 51646 /status_logs_settings.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1510054818] unbound[90624:0] error: bind: address already in use [1510054818] unbound[90624:0] fatal error: could not open ports'
    Nov 7 12:40:15 syslogd kernel boot file is /boot/kernel/kernel


  • Netgate

    Are you passing the traffic on the sync interface on the secondary?

    Are both nodes set to the same webgui settings (http/https/port) and have the same username and password set?



  • Sorry for the delayed response.

    Yes, both nodes are on HTTPS with the same port.

    Both nodes use a dedicated interface for the sync, with no VLAN.

    Here are the new logs:

    Nov 10 09:49:29 php-fpm 34743 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 10 09:49:10 check_reload_status Reloading filter
    Nov 10 09:49:10 php-fpm 57111 /rc.filter_synchronize: The pfSense software configuration version of the other member could not be determined. Skipping synchronization to avoid causing a problem!
    Nov 10 09:49:10 php-fpm 57111 /rc.filter_synchronize: XMLRPC versioncheck: – 17.3
    Nov 10 09:49:10 php-fpm 57111 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 10 09:49:10 php-fpm 57111 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 10 09:48:54 php-fpm 66735 /rc.filter_synchronize: Beginning XMLRPC sync data to https://10.88.88.2:443/xmlrpc.php.
    Nov 10 09:48:54 php-fpm 66735 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 10 09:48:54 php-fpm 66735 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version:
    Nov 10 09:48:44 php-fpm 56798 /rc.filter_synchronize: Beginning XMLRPC sync data to https://10.88.88.2:443/xmlrpc.php.

    Master and slave can ping each other.

    Each time I make a change on the master, it takes a very long time to apply.

    Thanks for your help, Derelict.


  • Netgate

    Can you bring up the webgui on the secondary at the time?

    Do firewall rules pass xmlrpc (webgui) traffic on the sync interface?

    It looks like the primary cannot connect to the secondary there. You need to isolate the reason why.
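
A successful ping only shows that ICMP passes on the sync interface; the logs above report a timeout on a TCP connection to port 443. One way to test that specific path from any machine with Python installed is a plain TCP connect. This is a minimal sketch under assumptions: `tcp_port_reachable` is a hypothetical helper, and the address and port in the usage comment are simply the sync target shown in the logs earlier in the thread.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    A timeout or a refused connection both return False; this does not
    perform a TLS handshake or log in to the webGUI.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, "connection refused" and "no route to host".
        return False


# Example usage (address taken from the logs in this thread):
#   tcp_port_reachable("10.88.88.2", 443)
```

If this returns `False` while ping succeeds, a firewall rule or addressing problem on the sync interface is a more likely cause than the XMLRPC credentials.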



  • Hello

    I have exactly the same issue.

    The master and slave pfSense are on the same version, 2.4.2-p1:

    2.4.2-RELEASE-p1 (amd64)
    built on Tue Dec 12 13:45:26 CST 2017
    FreeBSD 11.1-RELEASE-p6

    Same admin user and password, same webGUI HTTPS port.
    The master and slave can ping each other on the HA interfaces.

    Here are the logs on the master:

    Jan 1 17:42:09 check_reload_status Syncing firewall
    Jan 1 17:42:10 php-fpm 13794 /system_hasync.php: waiting for pfsync…
    Jan 1 17:42:10 php-fpm 21804 /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.100.2:443/xmlrpc.php.
    Jan 1 17:42:20 php-fpm 21804 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out
    Jan 1 17:42:20 php-fpm 21804 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out
    Jan 1 17:42:20 php-fpm 21804 /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.100.2:443/xmlrpc.php.
    Jan 1 17:42:30 php-fpm 21804 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out
    Jan 1 17:42:30 php-fpm 21804 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out
    Jan 1 17:42:30 php-fpm 21804 /rc.filter_synchronize: XMLRPC versioncheck: -- 17.3
    Jan 1 17:42:30 php-fpm 21804 /rc.filter_synchronize: The pfSense software configuration version of the other member could not be determined. Skipping synchronization to avoid causing a problem!
    Jan 1 17:42:42 php-fpm 13794 /system_hasync.php: pfsync done in 30 seconds.
    Jan 1 17:42:42 php-fpm 13794 /system_hasync.php: Configuring CARP settings finalize...
    Jan 1 17:47:00 check_reload_status Syncing firewall
    Jan 1 17:47:01 php-fpm 58070 /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.100.2:443/xmlrpc.php.
    Jan 1 17:47:03 check_reload_status Reloading filter
    Jan 1 17:47:11 php-fpm 58070 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out
    Jan 1 17:47:11 php-fpm 58070 /rc.filter_synchronize: New alert found: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out
    Jan 1 17:47:11 php-fpm 58070 /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.100.2:443/xmlrpc.php.
    Jan 1 17:47:21 php-fpm 58070 /rc.filter_synchronize: A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://192.168.100.2:443. Error: Operation timed out

    Have you already been able to solve the issue you had?

    Kind regards,

    and first of all:

    Happy New Year 2018!



  • Hi,

    After double checking the configuration, all is working fine!!!

    THX



  • @kikuyu:

    Hi,

    After double checking the configuration, all is working fine!!!

    THX

    And what was your configuration problem? Maybe I am still missing something.



  • Hello,

    I used 192.168.200.1 as the HA1 IP and 192.168.200.2 as the HA2 IP. The netmask should be /24, for the network 192.168.200.0/24.
    But after double-checking, I found that the netmask on the slave was /32. I don't know how it ended up that way.
    After correcting it, everything now works fine.

    Rgds.
    Kikuyu
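
The effect of the /32 mask described above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: `peer_on_link` is a hypothetical helper, and the addresses are the ones quoted in the post. With a /32 mask, the interface's network contains only its own address, so the other node is not on-link from that node's point of view.

```python
import ipaddress


def peer_on_link(local_with_prefix, peer):
    """Return True if `peer` lies inside the network implied by the local
    interface address and prefix length, e.g. "192.168.200.2/24"."""
    iface = ipaddress.ip_interface(local_with_prefix)
    return ipaddress.ip_address(peer) in iface.network


# Slave configured as intended: the master is on the same /24.
print(peer_on_link("192.168.200.2/24", "192.168.200.1"))  # True

# Slave configured with /32: the network holds only the slave itself.
print(peer_on_link("192.168.200.2/32", "192.168.200.1"))  # False
```

The same check can be run with the addresses and prefix lengths shown on each node's interface page to confirm that both sides agree on the sync subnet.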



  • Hello,
    Thanks for the tip. I had this configured correctly, but I found out that my sync does not work when I have the DNS Resolver turned on. Maybe this will also help somebody.



  • I simply wanted to add that I had this same problem and thankfully found your suggestion to turn off the DNS Resolver. It is unfortunate that this is required for HA sync to work. I'll continue to investigate.

    My error log also stated "The pfSense software configuration version of the other member could not be determined. Skipping synchronization to avoid causing a problem!"

    2.4.3-RELEASE-p1 (arm)
    built on Thu May 10 15:59:52 CDT 2018
    FreeBSD 11.1-RELEASE-p10

    The system is on the latest version.
    Version information updated at Sat Jul 14 1:35:11 UTC 2018


  • @vigorfac said in High Avail. Sync broken:

    Nov 7 12:40:18 php-fpm 51646 /status_logs_settings.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1510054818] unbound[90624:0] error: bind: address already in use [1510054818] unbound[90624:0] fatal error: could not open ports'

    The above error sounds similar to this bug in pfSense, which has since been resolved:
    https://redmine.pfsense.org/issues/7326#note-2 (the code didn't wait long enough for unbound to stop before trying to start it again; in our case the master server was unaffected, but the backup router would end up with unbound not running)

    Re: HA sync, we have "DNS Forwarder and DNS Resolver configurations" checked in our setup and have no sync issues, so I don't think that by itself is the problem.


 
