HA Sync breaks after restoring configuration
-
Hmm, well I agree that 20 users is not that many and I wouldn't expect any issue there.
However, as a test, try disabling the user sync in the XMLRPC settings on the primary.
The actual issue there, though, is the time the secondary takes to rebuild the users file from the config, and I believe that still applies.
Steve
-
I only have Users checked for syncing. I disabled it, and I no longer see any XMLRPC errors, but that's because there is nothing left to sync. It does at least rule out authentication issues etc.
To test further, I checked only the Firewall Aliases, but I still get the "New alert found: A communications error occurred while attempting Filter sync with username admin" error.
I've also disabled the sync on both machines, changed the password for the admin account, and re-enabled the sync. It synced fine once and then failed again.
I'm out of ideas!
-
And you did not see 504/502 errors on the secondary GUI at that time?
Steve
-
The 504 error doesn't happen all the time. The sync fails even when the GUI is responding on the second firewall.
-
Hmm, from the initial logs it still looks like a timing issue to me, though it's unclear what the cause is. Do you still see that same 1m delay on the primary? Nothing obviously logged as an error on the secondary?
Steve
-
In the end, I restored most of the existing config apart from the users. That seemed to work ok.
I also restored the DHCP section, which contains a lot of static mappings for a few interfaces. Once I restored this, sync broke, which I guess is because it's taking too long to sync. I removed all static mappings and syncing worked again!
Can I increase this default timeout period to something higher than 60 seconds?
-
There is no easy way to increase it, though I believe it could be done. However, you should not normally need to.
How many static mappings do you have? What size is your config file?
Steve
-
There are 186 mappings. The config XML file is 1.8 MB.
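For anyone who wants to check the same numbers on their own config, here is a minimal Python sketch that counts DHCP static mappings per interface in a downloaded config backup. It assumes the mappings are stored as `staticmap` elements under `dhcpd/<interface>` in the config XML; the function names and the backup file name are hypothetical, so adjust them if your file is laid out differently.

```python
import os
import xml.etree.ElementTree as ET


def count_static_mappings(xml_text):
    """Return {interface: number of static mappings} for a config XML string.

    Assumes static mappings are <staticmap> children of each interface
    element under <dhcpd>; this layout is an assumption, not confirmed here.
    """
    root = ET.fromstring(xml_text)
    dhcpd = root.find("dhcpd")
    if dhcpd is None:
        return {}
    return {iface.tag: len(iface.findall("staticmap")) for iface in dhcpd}


def summarize(path):
    """Return (file size in bytes, per-interface counts) for a backup file."""
    with open(path, encoding="utf-8") as handle:
        counts = count_static_mappings(handle.read())
    return os.path.getsize(path), counts
```

Calling `summarize("config-backup.xml")` on a downloaded backup returns the file size in bytes and the per-interface mapping counts, which makes it easier to compare configs that sync against ones that do not.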
-
I restored the dhcp mappings again and the sync works.
Where it breaks is very inconsistent, which makes it hard to troubleshoot. As of now, the config is complete (except for users and certificates).
-
Syncing a number of users can slow it down drastically. This is known and something we plan to address shortly: https://redmine.pfsense.org/issues/7469