Persistent XMLRPC Sync Error



  • I have two SG-8860s configured with CARP, pfsync, and XMLRPC sync, connected to each other on Opt4 (which I've named the Sync interface) with a Cat6 cable.

    After configuring XMLRPC, pfsync, and CARP on both hosts, I'm seeing the following error on the primary pfSense host:

    A communications error occurred while attempting XMLRPC sync with username admin https://172.16.1.3:443. @ 2017-09-16 13:10:42
    

    On the primary I have rules set to allow configuration sync, state sync, and ICMP echo request/reply. On the secondary I have an any/any rule so that it can receive the config sync data from the primary. I'm pretty much following the naming and IP scheme from Jim Pingle's March Hangout, where he demonstrated an HA setup in 2.3.3/2.4.

    I cannot ping either host's Sync IP (172.16.1.2 for the primary, 172.16.1.3 for the secondary).
    tcpdump on the igb5 interface on both systems shows little other than ARP requests. By the way, the interface assignments on both systems match.

    tcpdump from the primary:

    10:06:06.982669 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28
    10:06:07.986666 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28
    10:06:08.990669 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28
    10:06:09.994671 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28
    10:06:10.998667 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28
    10:06:12.002667 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28
    
    

    tcpdump from the secondary:

    
    10:06:07.850632 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28
    10:06:08.854623 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28
    10:06:09.858622 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28
    10:06:10.861621 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28
    10:06:11.863622 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28
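
    A rough way to read the two captures above is to list which IPs were asked for but never answered, and which hosts were actually heard on the wire. This is only a sketch: the function name `summarize_arp` is made up for illustration, and it assumes tcpdump's usual one-line ARP output ("Request who-has X tell Y" and "Reply X is-at MAC").

```python
import re

# Patterns for tcpdump's default one-line ARP output.
REQUEST = re.compile(r"ARP, Request who-has (\S+).* tell (\S+),")
REPLY = re.compile(r"ARP, Reply (\S+) is-at")


def summarize_arp(lines):
    """Return unanswered ARP targets and every sender seen in a capture.

    Hypothetical helper for illustration only, not part of pfSense or tcpdump.
    """
    asked, answered, senders = set(), set(), set()
    for line in lines:
        req = REQUEST.search(line)
        if req:
            asked.add(req.group(1))    # IP being looked up
            senders.add(req.group(2))  # host that sent the request
            continue
        rep = REPLY.search(line)
        if rep:
            answered.add(rep.group(1))
            senders.add(rep.group(1))
    return {
        "unanswered": sorted(asked - answered),
        "senders": sorted(senders),
    }


# Two lines from each capture in this post.
primary_capture = [
    "10:06:06.982669 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28",
    "10:06:07.986666 ARP, Request who-has 172.16.1.3 tell 172.16.1.2, length 28",
]
secondary_capture = [
    "10:06:07.850632 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28",
    "10:06:08.854623 ARP, Request who-has 172.16.1.2 tell 172.16.1.3, length 28",
]

print("primary:", summarize_arp(primary_capture))
print("secondary:", summarize_arp(secondary_capture))
```

    On these captures each side lists only itself as a sender, so neither host ever sees the other's broadcast requests. That would be consistent with frames not crossing the link at all, rather than with a firewall rule dropping them, though the captures alone do not prove it.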
    
    

    Here's my configuration, which should allow syncing without any problems:

    • Using the same admin username and password on both systems

    • Firewall rules on primary sync interface:

    • Firewall rules on secondary sync interface:

    • Status -> CARP shows the primary as being in MASTER mode and the secondary as being in BACKUP mode

    Any time I make a change I get the error you see at the top of this post.

    What bothers me is that this was working up until about a week ago. It's a brand new deployment, and absolutely no changes have been made since then other than powering down both units to move them in the rack.

    What could possibly be preventing XMLRPC sync from occurring? I've also swapped out the Cat6 cable, so it's not that.



  • Well, go figure: after reconfiguring the sync interface to use igb4 instead of igb5 and moving the firewall rules over to the newly assigned interface, hey presto, a working XMLRPC setup. So, devs, is there a bug here?

    tcpdump -i igb4 results:

    08:19:47.313327 IP 172.16.0.3 > 172.16.0.2: PFSYNCv5 len 280
        update compressed count 3
        eof count 1
    08:19:47.758196 IP 172.16.0.2 > 172.16.0.3: PFSYNCv5 len 280
        update compressed count 3
        eof count 1
    08:19:48.377325 IP 172.16.0.3 > 172.16.0.2: PFSYNCv5 len 196
        update compressed count 2
        eof count 1
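
    To confirm from output like this that state sync is flowing in both directions, the pfsync packets can be counted per source/destination pair. This is a minimal sketch with a made-up helper name (`pfsync_directions`), and it assumes the line format shown in the capture above.

```python
import re
from collections import Counter

# Matches header lines such as:
# "08:19:47.313327 IP 172.16.0.3 > 172.16.0.2: PFSYNCv5 len 280"
PFSYNC = re.compile(r"IP (\S+) > (\S+): PFSYNCv\d+ len (\d+)")


def pfsync_directions(lines):
    """Count pfsync packets per (source, destination) pair.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in lines:
        match = PFSYNC.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Header lines copied from the igb4 capture in this post.
capture = [
    "08:19:47.313327 IP 172.16.0.3 > 172.16.0.2: PFSYNCv5 len 280",
    "08:19:47.758196 IP 172.16.0.2 > 172.16.0.3: PFSYNCv5 len 280",
    "08:19:48.377325 IP 172.16.0.3 > 172.16.0.2: PFSYNCv5 len 196",
]

for (src, dst), n in sorted(pfsync_directions(capture).items()):
    print(f"{src} -> {dst}: {n} packet(s)")
```

    A non-zero count in both directions means each node is sending state updates that appear on the sync interface where the capture was taken.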