Netgate Discussion Forum

    XMLRPC sync errors since upgrade to 2.4.4

    HA/CARP/VIPs
    • DerelictD
      Derelict LAYER 8 Netgate
      last edited by

      Is the webgui healthy on the secondary at the time? Can you log in there and navigate?

      Are you trying to game things without the requisite 3 public IP addresses on WAN? Can the secondary get to the internet, resolve names, etc when it is not CARP master?

      • N
        Nima304 @Derelict
        last edited by

        @derelict said in XMLRPC sync errors since upgrade to 2.4.4:

        Is the webgui healthy on the secondary at the time? Can you log in there and navigate?

        Are you trying to game things without the requisite 3 public IP addresses on WAN? Can the secondary get to the internet, resolve names, etc when it is not CARP master?

        Yup, the webgui is just fine. I'm not trying to game anything: both firewalls have their own unique upstream address, and the CARP address is a third unique address. The secondary firewall can reach the Internet and resolve DNS names when it's not CARP master; I pinged google.com to check.

        • S
          SteveITS Galactic Empire @Nima304
          last edited by

          @nima304
          is 172.16.1.3 the sync IP or the LAN IP of the second router?

          @windiz
          same question for 10.51.0.2?

          The routers I upgraded last week aren't logging comm errors...

          A long time ago I did have sync issues. I seem to recall I tracked it down to Suricata and that we had selectively disabled many of the unneeded individual rules. Turns out all that had to sync and it was timing out. Solution: don't disable individual rules and it has less to process.

          Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
          When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
          Upvote 👍 helpful posts!

          • N
            Nima304 @SteveITS
            last edited by

            @teamits said in XMLRPC sync errors since upgrade to 2.4.4:

            @nima304
            is 172.16.1.3 the sync IP or the LAN IP of the second router?

            @windiz
            same question for 10.51.0.2?

            The routers I upgraded last week aren't logging comm errors...

            A long time ago I did have sync issues. I seem to recall I tracked it down to Suricata and that we had selectively disabled many of the unneeded individual rules. Turns out all that had to sync and it was timing out. Solution: don't disable individual rules and it has less to process.

            That's the sync IP for the second firewall. The primary's is 172.16.1.2.

              • B
                bbrendon @Nima304
                last edited by

                @nima304 Thanks for digging into your setup to get to the bottom of this. I just haven't had time on my end and since things more or less work, it hasn't been a priority.

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  Do you have a large number of users in the config?

                  Steve

                  • N
                    Nima304 @bbrendon
                    last edited by

                    @bbrendon said in XMLRPC sync errors since upgrade to 2.4.4:

                    @nima304 Thanks for digging into your setup to get to the bottom of this. I just haven't had time on my end and since things more or less work, it hasn't been a priority.

                    No problem, hopefully there's a resolution that solves it for all of us.

                    @stephenw10 said in XMLRPC sync errors since upgrade to 2.4.4:

                    Do you have a large number of users in the config?

                    Steve

                    No, literally just the admin user, but I also have LDAP auth configured.

                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      That should be no problem as long as the user accounts are not on pfSense. A large number of local users can introduce delays on the secondary when the sync'd config is applied, preventing it from responding in a reasonable time.

                      Hmm, I'd probably start a packet capture on the secondary sync interface. Set it for a large number and wait for it to fail. See what's actually happening there.

                      Steve

                      • S
                        SteveITS Galactic Empire
                        last edited by

                        In windiz's logs, it is exactly 60 seconds from the beginning of the sync to the error and that sounds like a timeout to me. Brainstorming, how large is your config export file? We have some decently complex ones for our data center that are about 180 KB, for reference...Suricata rules, pfBlockerNG, OpenVPN, etc.

                        Router2 isn't set to sync back to router1 is it? That would be a loop.
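One way to test the timeout theory is to measure the gap between the start of the sync and the error in the primary's log. Below is a minimal sketch, assuming log lines that begin with a `Mon DD HH:MM:SS` syslog-style timestamp. The two sample lines and the helper names are hypothetical and are not copied from anyone's logs in this thread.

```python
from datetime import datetime

SYNC_TIMEOUT_SECONDS = 60  # the gap observed in this thread


def parse_syslog_time(line, year=2018):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog-style line.

    Syslog stamps carry no year, so one is supplied (assumption).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def seconds_between(start_line, error_line):
    """Return the elapsed seconds between two log lines."""
    delta = parse_syslog_time(error_line) - parse_syslog_time(start_line)
    return delta.total_seconds()


# Hypothetical sample lines; substitute the real start and error entries.
start = "Dec 17 16:35:37 php-fpm[339]: hypothetical sync start entry"
error = "Dec 17 16:36:37 php-fpm[339]: hypothetical communications error entry"

gap = seconds_between(start, error)
looks_like_timeout = gap >= SYNC_TIMEOUT_SECONDS
print(gap, looks_like_timeout)
```

If the gap lands on the timeout value every time, the secondary is probably not answering at all rather than answering with an error.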

                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Yes, the timeout is 60s. With more than ~50 users on some hardware, it used to be possible for the secondary to take longer than that to load the config and respond. Improvements have gone in since then, though.

                          Steve
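To check both suspects raised so far (config size and local user count), a config export can be summarized offline. This is a rough sketch, assuming local users are stored as `<user>` elements under `<system>` in the exported XML. That layout, the sample document, and the function name are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

USER_WARN_THRESHOLD = 50  # rough figure mentioned in this thread


def summarize_config(xml_text):
    """Return the export size in KB and the number of local user entries."""
    root = ET.fromstring(xml_text)
    # Assumed layout: <pfsense><system><user>...</user></system></pfsense>
    users = root.findall("./system/user")
    size_kb = len(xml_text.encode("utf-8")) / 1024
    return {"size_kb": size_kb, "local_users": len(users)}


# Tiny hypothetical export; in practice, read the real exported file instead.
SAMPLE = """<pfsense>
  <system>
    <user><name>admin</name></user>
  </system>
</pfsense>"""

summary = summarize_config(SAMPLE)
print(summary)
if summary["local_users"] > USER_WARN_THRESHOLD:
    print("Many local users: the secondary may be slow to load the config")
```

Accounts that live only in LDAP would not show up in this count, which matches the point above that they should not slow the sync.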

                          • N
                            Nima304 @SteveITS
                            last edited by

                            @teamits said in XMLRPC sync errors since upgrade to 2.4.4:

                            In windiz's logs, it is exactly 60 seconds from the beginning of the sync to the error and that sounds like a timeout to me. Brainstorming, how large is your config export file? We have some decently complex ones for our data center that are about 180 KB, for reference...Suricata rules, pfBlockerNG, OpenVPN, etc.

                            Router2 isn't set to sync back to router1 is it? That would be a loop.

                            Good catch, my logs are showing the same thing. While config sync isn't set at all on the secondary, the primary is syncing states from the secondary, and the secondary from the primary, as per pfSense's documentation.

                            I'm going to try to blow the firewall rules open on the sync interface for both firewalls and see if that does anything.

                            • N
                              Nima304
                              last edited by

                              Blowing open the rules did nothing, unfortunately. I'm seeing data received on the secondary firewall, so it's not a cable issue. I'll do a packet capture and see if anything interesting turns up.

                              • N
                                Nima304
                                last edited by

                                The transmission is encrypted using TLS, so I can't actually see what's going on.

                                • stephenw10S
                                  stephenw10 Netgate Administrator
                                  last edited by

                                  You could set the GUI to HTTP just while you test. However, you should still be able to see the TCP sequence and the lack of responses even with TLS.
                                  Make sure both nodes are time sync'd and then compare the log entries. Does the secondary log anything during that 60s window?

                                  Steve
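Once both clocks agree, the comparison can be done mechanically: take the time the primary started the sync and pull every secondary log entry from the following 60 seconds. This is a minimal sketch, assuming syslog-style `Mon DD HH:MM:SS` stamps. The sample entries and function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year=2018):
    """Parse the leading 'Mon DD HH:MM:SS' stamp (year supplied, as syslog omits it)."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def entries_in_window(lines, window_start, seconds=60):
    """Return the log lines stamped within `seconds` after `window_start`."""
    window_end = window_start + timedelta(seconds=seconds)
    return [line for line in lines if window_start <= parse_stamp(line) <= window_end]


# Hypothetical secondary log excerpt.
secondary_log = [
    "Dec 17 16:35:10 hypothetical entry before the sync started",
    "Dec 17 16:35:40 hypothetical entry inside the window",
    "Dec 17 16:37:05 hypothetical entry after the timeout",
]

# Time the primary logged the start of the sync (hypothetical).
sync_started = datetime(2018, 12, 17, 16, 35, 37)

hits = entries_in_window(secondary_log, sync_started)
for line in hits:
    print(line)
```

An empty result would suggest the request never reached the secondary's web server; entries inside the window narrow down what it was doing while the primary waited.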

                                  • A
                                    AcaaliK
                                    last edited by

                                    Hello All,

                                    I am facing the same issue after an upgrade from 2.4.3 to 2.4.4. I have gone through all the checks suggested in this thread, and most are OK, with the exception of an entry in the secondary's system logs under the General tab. The error is XMLRPC unbound /var/unbound/root.key corrupt deleted and recreated each time a sync is performed.

                                    On the primary node I get the sporadic XMLRPC communication errors described in this thread. Please note that the sync is successful and the changes from the primary are reflected on the secondary with some delay. This only started after the upgrade.

                                    • N
                                      netblues
                                      last edited by netblues

                                      I'm facing exactly the same issue after upgrading from 2.4.3 to 2.4.4p1.
                                      Settings are replicated; however, I see this on the secondary:

                                      nginx: 2018/12/17 16:36:37 [crit] 79693#100242: *18691 SSL_write() failed (SSL:) (13: Permission denied) while sending to client, client: 192.168.50.3, server: , request: "POST /xmlrpc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "192.168.50.4"

                                      50.3 is the primary and 50.4 is the secondary
                                      Any ideas?

                                      It looks like the config is received but the ack is never sent back to the primary, hence the complaint.
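To see how often this happens and for which peers, the nginx error line above can be broken into fields. The sketch below runs a regular expression against the exact line quoted in this post. The pattern is an illustration written for that one line, not a general nginx log parser.

```python
import re

# Pattern tailored to the error line quoted above (assumption: other
# nginx errors of interest follow the same field order).
NGINX_ERROR = re.compile(
    r'(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) '
    r'\[(?P<level>\w+)\] .*?'
    r'\((?P<errno>\d+): (?P<reason>[^)]+)\).*?'
    r'client: (?P<client>[\d.]+),.*?'
    r'request: "(?P<request>[^"]+)".*?'
    r'host: "(?P<host>[^"]+)"'
)

LINE = (
    'nginx: 2018/12/17 16:36:37 [crit] 79693#100242: *18691 SSL_write() failed '
    '(SSL:) (13: Permission denied) while sending to client, client: 192.168.50.3, '
    'server: , request: "POST /xmlrpc.php HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php-fpm.socket", host: "192.168.50.4"'
)

match = NGINX_ERROR.search(LINE)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

Here the client is the primary (50.3) and the host is the secondary (50.4), so the failure is on the secondary's reply to the primary's POST to /xmlrpc.php.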

                                      • DerelictD
                                        Derelict LAYER 8 Netgate
                                        last edited by

                                        Permission denied is almost always something being blocked by policy.

                                        Are you running snort or suricata?

                                        Is it enabled on the sync interface?

                                        • N
                                          netblues @Derelict
                                          last edited by

                                          @derelict No snort or suricata ever installed. Not even pfblocker.

                                          • stephenw10S
                                            stephenw10 Netgate Administrator
                                            last edited by

                                            Do you see that same error if you just save the Unbound settings page on the secondary without making any changes?

                                            Does Unbound actually start on the secondary?

                                            Is the filesystem full?

                                            Steve
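For the last question, `df -h` from a shell is the usual check. The same idea as a small script, in case it is easier to run alongside other checks, is sketched below. It assumes a Python interpreter is available wherever it runs, and the 95% threshold and function names are arbitrary choices for illustration.

```python
import shutil


def percent_used(path="."):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path=".", threshold=95.0):
    """True when the filesystem holding `path` is at or above `threshold` percent."""
    return percent_used(path) >= threshold


# On the firewall, a path such as /var/unbound would be the one to check.
used = percent_used(".")
print(f"{used:.1f}% used, nearly full: {is_nearly_full('.')}")
```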

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.