High Availability HA authentication failure
-
I had HA set up and settings synced for a while, but at some point this stopped working and I only noticed recently.
It might have been the 2.4.0 update, or me mucking around with firewall or other settings. From what I can see now, the firewall is not the problem, since everything is permitted on the local network.
Primary is on 2.4.0, secondary on 2.3.3. I keep the secondary for primary reboots, but also to keep a known version, since IPsec changes sometimes seem to break functionality and are a nightmare to debug. Primary is on 192.168.0.2, secondary on 192.168.0.3, and they share a CARP IP on 192.168.0.1, which should not matter in this scenario.
So.. what I get on primary is
An authentication failure occurred while trying to access https://192.168.0.3:443/xmlrpc.php (host_firmware_version). @ 2017-10-29 15:21:14
in notification bubble
and in system logs
Oct 29 15:21:13 php-fpm[74235]: /rc.filter_synchronize: New alert found: An authentication failure occurred while trying to access https://192.168.0.3:443/xmlrpc.php (host_firmware_version).
Oct 29 15:21:13 php-fpm[74235]: /rc.filter_synchronize: An authentication failure occurred while trying to access https://192.168.0.3:443/xmlrpc.php (host_firmware_version).
Oct 29 15:21:13 php-fpm[74235]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.0.3:443/xmlrpc.php.
On the secondary, I get in the system logs
Oct 29 15:21:15 sshlockout 71691 Locking out 192.168.0.2 after 15 invalid attempts
Oct 29 15:21:15 php-fpm 79982 /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2
Oct 29 15:21:15 php-fpm 79982 /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2 during sync settings.
Also, the primary's main IP gets locked out now.
I changed passwords on admin account a few times, kept it the same in user manager and in HA settings. Also, the communication seems to be working at least partially, otherwise I would not get the error and the secondary logs.
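For anyone comparing notes, here is a minimal Python sketch that tallies the failures in the secondary's log lines quoted above. The regular expressions are derived only from those pasted lines, so treat the format as an assumption; `summarize` and `SAMPLE` are hypothetical names used for illustration, not part of pfSense.

```python
import re
from collections import Counter

# Patterns derived only from the log lines quoted in this post (assumption:
# other pfSense versions may word or format these messages differently).
AUTH_ERR = re.compile(
    r"webConfigurator authentication error for '(?P<user>[^']+)' from (?P<ip>\S+)"
)
LOCKOUT = re.compile(
    r"sshlockout \d+ Locking out (?P<ip>\S+) after (?P<count>\d+) invalid attempts"
)


def summarize(lines):
    """Count auth failures per (user, source IP) and collect locked-out IPs."""
    failures = Counter()
    locked_out = set()
    for line in lines:
        match = AUTH_ERR.search(line)
        if match:
            failures[(match.group("user"), match.group("ip"))] += 1
            continue
        match = LOCKOUT.search(line)
        if match:
            locked_out.add(match.group("ip"))
    return failures, locked_out


# The three secondary-side lines from the post, copied as-is.
SAMPLE = [
    "Oct 29 15:21:15 sshlockout 71691 Locking out 192.168.0.2 after 15 invalid attempts",
    "Oct 29 15:21:15 php-fpm 79982 /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2",
    "Oct 29 15:21:15 php-fpm 79982 /xmlrpc.php: webConfigurator authentication error for 'admin' from 192.168.0.2 during sync settings.",
]

failures, locked_out = summarize(SAMPLE)
for (user, ip), count in failures.items():
    print(f"{count} failed login(s) for {user!r} from {ip}")
for ip in sorted(locked_out):
    print(f"locked out: {ip}")
```

Run against a longer log export, this makes it easier to see whether every failure comes from the primary's address and whether the lockout follows the failed sync attempts.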
-
Both boxes have to be running the same version to sync. Upgrade the secondary and it should work.
-
Thanks.
Seems there are other issues involved as well.
https://forum.pfsense.org/index.php?topic=137572.0
I regret doing the update… thought it would be more stable.
-
That is an edge case and probably unrelated to what you are seeing. It is plainly telling you the reason for the failure is the firmware version mismatch. You cannot XMLRPC sync from mismatched versions because the configuration file format might have changed.
Your HA nodes need to be on the same version after you have upgraded the secondary, failed over to it, and determined it worked. The mismatch period should not be long or extended, but just long enough to determine everything is fine.
If it does not work, you need to fail back and rebuild the secondary on the old version so they match again.
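The rule described in this reply (no sync unless both nodes report the same version) can be sketched in Python as a pre-flight check. This is only an illustration of the rule as stated here, not how pfSense implements its own check; `parse_version` and `sync_allowed` are hypothetical helper names, and the handling of suffixes such as `-RELEASE` is an assumption.

```python
def parse_version(text):
    """Turn '2.4.0' into (2, 4, 0).

    Assumption: anything after the first '-' (for example '-RELEASE')
    is a build suffix and is ignored for the comparison.
    """
    core = text.strip().split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def sync_allowed(primary_version, secondary_version):
    """Apply the rule from this reply: sync only when the versions match."""
    return parse_version(primary_version) == parse_version(secondary_version)


# Versions reported in this thread.
print(sync_allowed("2.4.0", "2.3.3"))   # the original poster's pair
print(sync_allowed("2.4.1", "2.4.1"))   # the later poster's pair
```

The check deliberately treats any difference as a mismatch, which is the conservative reading of the advice above; it says nothing about whether a matching pair will actually sync.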
-
Thank you!
Can I downgrade (you call it rebuild?) without issues? Is there a guide on how to do it?
I think I upgraded to 2.4 too early and now see some issues on a system I am using in production.
BTW, sync between different 2.3 subversions was working fine.
-
Since you did it backwards (upgrading the primary first) the easiest way to get it back is to fail over to the secondary, reinstall 2.3.3 on the primary, and restore the 2.3.3 configuration backup you took before you upgraded.
-
Hi,
I did it the other way. First upgraded secondary, tested a week and then upgraded primary. Both nodes have same firmware 2.4.1 now, but HA-Sync still fails.
Is there a chance to get them in sync again without a fresh install? I don't wanna lose my squid logs.
Thanks
-
> Both nodes have same firmware 2.4.1 now, but HA-Sync still fails.

Fails with what log messages?

> I don't wanna lose my squid logs.

That's why you don't use a firewall as a log-retention device.
-
> Fails with what log messages?

A communications error occurred while attempting to call XMLRPC method filter_configure:
A communications error occurred while attempting to call XMLRPC method restore_config_section:
A communications error occurred while attempting to call XMLRPC method exec_php:
A communications error occurred while attempting to call XMLRPC method host_firmware_version:
At the moment I consider 2.4.x firmware and HA as broken. I get more and more issues. The sync is just one issue. So I will have to downgrade to 2.3.x again.
- After the upgrade, I can only access the firewall after stopping pf with 'service pf onestop', but when I look in the web interface, the rules allowing me access to the machine are still present.
- The defined gateways are pingable from the firewall CLI, but the web interface displays them as down. Only the IPv6 gateway is shown as up.
The secondary firewall on 2.4.1 with the same synced configuration works, but since the primary has issues with gateways and access to the interface, I do not dare to leave maintenance mode. I am afraid of crashing the network and not sure that falling back to the secondary would work again. So I will rebuild the primary from the backup configuration.
> That's why you don't use a firewall as a log-retention device.

I just need them on machine for lightsquid reports.
Edit:
Made backups and did a fresh install of 2.4.1. At least I have access to the firewall via the web interface again, but sync is pretty special. When I force a sync via [Status -> Filter reload -> Force config sync], it works. But when I make changes such as rules, or other actions that are synced in the background, I get messages like the ones shown at the top.
At the same time, I get
```
/xmlrpc.php: ERROR! Either LDAP search failed, or multiple users were found.
```
I use LDAP for user authentication to the firewall itself and a local user for sync. It seems that the background sync uses the default server defined in [System -> User Manager -> Settings -> Authentication Server] to look up the sync user, while the forced sync uses the local database. Maybe I have to create an LDAP user account for the syncing? I would prefer a fallback to the local database instead.
-
Hmm. I have never done an HA pair with an LDAP-configured authentication backend for the webgui (which will also be xmlrpc sync.)
Later versions (including 2.4.X) fixed the long-standing issue of being unable to specify the xmlrpc username and password.
It might be worth creating a local user on the primary (which should sync to the secondary) that specifically includes the System - HA node sync permission, then specifying that user on the primary in the XMLRPC settings.
The secondary is the one that is controlling where things are authenticated. Are you certain the user being specified is present there? Does the XMLRPC sync user and password pass on the secondary in Diagnostics > Authentication? Is there any significant delay? Are the Authentication servers specified identical on the primary and the secondary? Do both nodes pass Diagnostics > Authentication?
![Screen Shot 2017-11-03 at 12.12.06 PM.png](/public/imported_attachments/1/Screen Shot 2017-11-03 at 12.12.06 PM.png)