Netgate Discussion Forum

XMLRPC Sync makes backup node's GUI unresponsive

HA/CARP/VIPs
6 Posts 5 Posters 3.2k Views
  • Gernupe
    May 16, 2016, 7:50 PM (last edited May 16, 2016, 8:04 PM)

    I'm setting up a simple cluster (1 LAN, 1 WAN, and a SYNC interface), both nodes running 2.3, but I'm having trouble with synchronization: if I force an update or make any change that requires a sync, the backup node's GUI stops responding.

    On the master node, I get a notification stating "A communications error occurred while attempting XMLRPC sync with username admin https://10.5.2.2:443"

    On the backup node I get a 504 from nginx, and to get it back up I just need to restart PHP-FPM. In that node's logs I get:

    "[error] 87989#0: *93 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.181.0.201, server: , request: "GET /getstats.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.181.0.52", referrer: "https://10.181.0.52/""

    I saw this issue in the 2.3.1 Redmine (https://redmine.pfsense.org/issues/6328), which seems to be related, and just in case upgraded to 2.3.1-DEVELOPMENT, but the problem remains the same.

    Now, after the restart, the logs show the message "rc.php-fpm_restart >>> Found XMLRPC lock. Removing" and the GUI starts responding again, just as it did on the release version.

    Sometimes after the php-fpm restart, some things sync. In the nginx log I see this:

    May 16 16:38:30 gateway-2 gateway-2.domain.com nginx: 10.5.2.1 - admin [16/May/2016:16:38:30 -0300] "POST /xmlrpc.php HTTP/1.0" 200 763 "-" "PEAR XML_RPC"
    May 16 16:38:49 gateway-2 gateway-2.domain.com nginx: 10.5.2.1 - admin [16/May/2016:16:38:49 -0300] "POST /xmlrpc.php HTTP/1.0" 200 145 "-" "PEAR XML_RPC"
    May 16 16:39:49 gateway-2 gateway-2.domain.com nginx: 10.5.2.1 - admin [16/May/2016:16:39:49 -0300] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "PEAR XML_RPC"

    I guess that the 499 is a timeout from the client, and that it triggers the message on the master (after that, the GUI on the backup kept working).
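The guess about the 499 can be checked against the log itself. nginx uses the non-standard status 499 when the client closes the connection before a response is sent, and the 499 entry above is logged exactly 60 seconds after the previous request. The following is a minimal sketch, not part of the original post, that scans access-log lines for such aborted `/xmlrpc.php` requests. The function name `client_aborted_requests` and the regular expression are illustrative assumptions; the sample lines are the three quoted above.

```python
import re

# Matches the timestamp, request line, status and size of an nginx
# access-log entry such as:
#   [16/May/2016:16:39:49 -0300] "POST /xmlrpc.php HTTP/1.0" 499 0
ACCESS_RE = re.compile(
    r'\[(?P<time>[^\]]+)\]\s+"(?P<method>\S+) (?P<path>\S+) [^"]*"\s+'
    r'(?P<status>\d{3})\s+(?P<size>\d+)'
)


def client_aborted_requests(lines, path="/xmlrpc.php"):
    """Return (timestamp, status) for requests to `path` logged with 499."""
    hits = []
    for line in lines:
        m = ACCESS_RE.search(line)
        if m and m.group("path") == path and m.group("status") == "499":
            hits.append((m.group("time"), int(m.group("status"))))
    return hits


# The three access-log lines quoted in the post.
SAMPLE = [
    'May 16 16:38:30 gateway-2 gateway-2.domain.com nginx: 10.5.2.1 - admin '
    '[16/May/2016:16:38:30 -0300] "POST /xmlrpc.php HTTP/1.0" 200 763 "-" "PEAR XML_RPC"',
    'May 16 16:38:49 gateway-2 gateway-2.domain.com nginx: 10.5.2.1 - admin '
    '[16/May/2016:16:38:49 -0300] "POST /xmlrpc.php HTTP/1.0" 200 145 "-" "PEAR XML_RPC"',
    'May 16 16:39:49 gateway-2 gateway-2.domain.com nginx: 10.5.2.1 - admin '
    '[16/May/2016:16:39:49 -0300] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "PEAR XML_RPC"',
]

if __name__ == "__main__":
    for when, status in client_aborted_requests(SAMPLE):
        print(f"{when}: client closed the connection (status {status})")
```

Only the last of the three sample lines is reported, which is consistent with the master giving up on that request rather than the backup returning an error.
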

    Could this be a bug or a misconfiguration on my side?

    Edit: Just in case, both hosts are running on top of two Proxmox 4.2 servers; LAN and SYNC are actually VLANs on one of the Proxmox interfaces.

    Also, the backup node works without problems when it is master.

    pfBlocker's sync does the same thing: "/usr/local/www/pfblockerng/pfblockerng.php: XML_RPC_Client: RPC server did not send response before timeout. 103"

    • EditioN
      May 20, 2016, 9:54 AM

      Having kinda the same issue here, but in my case it's physical hardware with a dedicated interface for sync. After some firewall rule changes the 2nd firewall stops responding and I get this error on the first one:

      A communications error occurred while attempting XMLRPC sync with username admin https://10.222.0.3:443. @ 2016-05-20 11:48:47

      I'm able to sync again only after a php-fpm restart. It has happened 3 times already in about 10 minutes…

      • EditioN
        May 25, 2016, 11:42 AM

        Still having the same issue; it's becoming really hard to manage since I need to restart php-fpm many times to remove the lock…

        Last logs from nginx-errors:

        2016/05/25 12:11:13 [error] 41431#0: *199 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.10.100.80, server: , request: "GET /widgets/widgets/system_information.widget.php?getupdatestatus=1 HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.10.115.238", referrer: "https://10.10.115.238/"
        2016/05/25 13:29:04 [error] 41431#0: *2335 upstream timed out (60: Operation timed out) while reading response header from upstream, client: 10.10.100.80, server: , request: "GET /widgets/widgets/system_information.widget.php?getupdatestatus=1 HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.10.115.238", referrer: "https://10.10.115.238/"
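
To see which GUI requests are the ones hanging in PHP-FPM, the error log can be grouped by request URI. This is a minimal sketch added for illustration, assuming the nginx error-log format shown above; the function name `timeouts_by_uri` and the regular expression are not from the original post. The sample lines are the two quoted above.

```python
import re
from collections import Counter

# Picks the request line out of an nginx "upstream timed out" error entry.
TIMEOUT_RE = re.compile(
    r'upstream timed out .*? request: "(?P<method>\S+) (?P<uri>\S+) [^"]*"'
)


def timeouts_by_uri(lines):
    """Count upstream timeouts per request path (query string dropped)."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            counts[m.group("uri").split("?", 1)[0]] += 1
    return counts


# The two error-log lines quoted in the post.
SAMPLE = [
    '2016/05/25 12:11:13 [error] 41431#0: *199 upstream timed out (60: Operation timed out) '
    'while reading response header from upstream, client: 10.10.100.80, server: , '
    'request: "GET /widgets/widgets/system_information.widget.php?getupdatestatus=1 HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.10.115.238", '
    'referrer: "https://10.10.115.238/"',
    '2016/05/25 13:29:04 [error] 41431#0: *2335 upstream timed out (60: Operation timed out) '
    'while reading response header from upstream, client: 10.10.100.80, server: , '
    'request: "GET /widgets/widgets/system_information.widget.php?getupdatestatus=1 HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "10.10.115.238", '
    'referrer: "https://10.10.115.238/"',
]

if __name__ == "__main__":
    for uri, n in timeouts_by_uri(SAMPLE).most_common():
        print(f"{n} timeout(s): {uri}")
```

Both timeouts in this sample come from the dashboard's `getupdatestatus` request, that is, the update check made by the System Information widget.
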
        

        And from system.log:

        May 24 11:25:59 fw2 xinetd[25532]: Starting reconfiguration
        May 24 11:25:59 fw2 xinetd[25532]: Swapping defaults
        May 24 11:25:59 fw2 xinetd[25532]: readjusting service 6969-udp
        May 24 11:25:59 fw2 xinetd[25532]: Reconfigured: new=0 old=1 dropped=0 (services)
        May 24 11:25:59 fw2 php-cgi: rc.banner: PHP ERROR: Type: 1, File: /etc/inc/rrd.inc, Line: 60, Message: Call to undefined function gettext()
        May 24 11:26:00 fw2 php-fpm[71032]: /xmlrpc.php: The command '/usr/local/sbin/dhcpd -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpd.conf -pf /var/run/dhcpd.pid lagg0_vlan102 lagg0_vlan103 lagg0_vlan104 lagg0_vlan105 lagg0_vlan106 lagg0_vlan107 lagg0_vlan108' returned exit code '1', the output was 'Internet Systems Consortium DHCP Server 4.3.3-P1 Copyright 2004-2016 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Config file: /etc/dhcpd.conf Database file: /var/db/dhcpd.leases PID file: /var/run/dhcpd.pid Wrote 6882 leases to leases file. Listening on BPF/lagg0_vlan108/a0:36:9f:91:3c:c9/10.108.0.0/16 Sending on   BPF/lagg0_vlan108/a0:36:9f:91:3c:c9/10.108.0.0/16 Listening on BPF/lagg0_vlan107/a0:36:9f:91:3c:c9/10.107.0.0/16 Sending on   BPF/lagg0_vlan107/a0:36:9f:91:3c:c9/10.107.0.0/16 Listening on BPF/lagg0_vlan106/a0:36:9f:91:3c:c9/10.106.0.0/16 Sending on   BPF/lagg0_vlan106/a0:36:9f:91:3c:c9/10.106.0.0/16 Listening on BPF/lagg0_vlan105/
        

        And not sure if it is related but fw2 is creating crash reports with the following:

        Crash report begins.  Anonymous machine information:
        
        amd64
        10.3-RELEASE-p3
        FreeBSD 10.3-RELEASE-p3 #1 3ef16fb(RELENG_2_3_1): Tue May 17 19:34:13 CDT 2016     root@ce23-amd64-builder:/builder/pfsense-231/tmp/obj/builder/pfsense-231/tmp/FreeBSD-src/sys/pfSense
        
        Crash report details:
        
        PHP Errors:
        [25-May-2016 13:31:09 Europe/Berlin] PHP Fatal error:  Call to undefined function gettext() in /etc/inc/rrd.inc on line 60
        [25-May-2016 13:31:09 Europe/Berlin] PHP Fatal error:  Call to undefined function gettext() in /etc/inc/rrd.inc on line 60
        
        

        Any idea?

        Thanks

        • byusinger84
          Sep 23, 2016, 2:35 PM

          I know this post is old, but I am having the same issues related to the backup firewall and the 502 bad gateway error. Has anyone found a solution?

          • chamm
            Oct 7, 2016, 8:13 PM

            I'm getting this exact same issue when trying to do an XMLRPC sync. In my specific case, it may or may not be related to HAProxy, which is the only package that I'm also syncing.

            Has anyone got any ideas on this? A whole lot of nothing in the last few months.

            Thanks!

            • Derelict LAYER 8 Netgate
              Oct 8, 2016, 5:56 AM

              The first thing to check is whether the secondary can resolve names, check for updates, etc. while in backup status. And if not, why not.
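
One way to run that check from the secondary is sketched below. It is an illustration only, assuming Python 3 is available on the node (the thread does not say so); the function name `check_egress` is hypothetical, and the hostnames to test are taken from the command line rather than hard-coded, since the thread names no specific server. It separates the two failure modes: the name does not resolve, or it resolves but an outbound TCP connection cannot be made.

```python
import socket
import sys


def check_egress(host, port=443, timeout=5.0,
                 resolve=socket.getaddrinfo,
                 connect=socket.create_connection):
    """Return (resolved, connected) for an outbound TCP check to host:port.

    `resolve` and `connect` default to the standard library calls and are
    parameters only so the function can be exercised without a network.
    """
    try:
        infos = resolve(host, port, proto=socket.IPPROTO_TCP)
    except OSError:  # socket.gaierror is a subclass of OSError
        return (False, False)
    for _family, _type, _proto, _canon, sockaddr in infos:
        try:
            # sockaddr[:2] is (address, port) for both IPv4 and IPv6 results.
            conn = connect(sockaddr[:2], timeout=timeout)
        except OSError:
            continue  # try the next resolved address
        conn.close()
        return (True, True)
    return (True, False)


if __name__ == "__main__":
    # Usage (hostnames are supplied by the operator):
    #   python3 check_egress.py <hostname> [<hostname> ...]
    for name in sys.argv[1:]:
        resolved, connected = check_egress(name)
        print(f"{name}: resolves={resolved} connects={connected}")
```

If a name resolves on the master but not on the secondary while it is in backup status, that points at DNS or outbound routing on the secondary rather than at the sync itself.
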

              Chattanooga, Tennessee, USA
              A comprehensive network diagram is worth 10,000 words and 15 conference calls.
              DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
              Do Not Chat For Help! NO_WAN_EGRESS(TM)

              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.