Hi,
I finally made a patch for it!
I am now looking for testers (even if you are not using High Availability). Could anyone install this patch on a development server (2.5.0) and give me some feedback?
Here is how to install it:
Install the patch package.
Create a new patch. In "URL/Commit ID", enter https://patch-diff.githubusercontent.com/raw/pfsense/pfsense/pull/4150.diff and leave the default settings in the "Patch Application Behavior" section (Path Strip Count: 2, etc.).
Fetch and apply the new patch. After installing, reboot your pfSense.
After installing the patch, if you wish to use High Availability for the captive portal:
Configure High Availability normally using the System->High Avail. Sync menu.
Configure XMLRPC sync on the primary node only, as you would for a normal configuration.
On the secondary node, go to Services->Captive Portal->(your zone)->High Availability and configure backward synchronization.
How it works / Behavior
When using HA:
In a normal situation (both nodes up), captive portal users and vouchers are synchronized between nodes.
If the primary node becomes unreachable, the secondary node becomes master and continues to run the captive portal.
If the primary node switches back from backup to master, it tries to refresh connected users from the secondary (and now backup) node.
If the secondary node leaves and then re-joins the cluster, users will NOT be synchronized to the backup node. In that situation, users have to be manually synchronized from Captive Portal->Your CP zone->High Availability.
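To make these rules easier to reason about, here is a small toy model in Python. It is only a sketch of the behavior described above, not code from the patch: the Node and Cluster classes and all method names are made up for illustration, and vouchers are left out.

```python
class Node:
    """One firewall in the HA pair (hypothetical model, not pfSense code)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.users = set()  # connected captive portal users known to this node


class Cluster:
    """Toy model of the sync rules described in the post."""

    def __init__(self):
        self.primary = Node("primary")
        self.secondary = Node("secondary")

    def master(self):
        # The primary is master whenever it is up; otherwise the secondary takes over.
        return self.primary if self.primary.up else self.secondary

    def backup(self):
        return self.secondary if self.primary.up else self.primary

    def login(self, user):
        # A login lands on the current master and is pushed to the peer if reachable.
        self.master().users.add(user)
        peer = self.backup()
        if peer.up:
            peer.users.add(user)

    def fail(self, node):
        node.up = False

    def recover(self, node):
        node.up = True
        if node is self.primary and self.secondary.up:
            # Primary back as master: refresh connected users from the secondary.
            node.users = set(self.secondary.users)
        # Secondary re-joining: nothing happens automatically.

    def manual_sync(self):
        # Stands in for the manual sync done from the High Availability tab.
        self.secondary.users = set(self.primary.users)


cluster = Cluster()
cluster.login("alice")                 # both nodes up: user is synchronized
cluster.fail(cluster.primary)
cluster.login("bob")                   # secondary is master and keeps serving
cluster.recover(cluster.primary)       # primary refreshes users from secondary
cluster.fail(cluster.secondary)
cluster.login("carol")
cluster.recover(cluster.secondary)     # carol is missing on the secondary
missing_before_manual_sync = "carol" not in cluster.secondary.users
cluster.manual_sync()                  # manual sync closes the gap
```

The last scenario is the one to watch out for in testing: after the secondary re-joins, its user list stays stale until the manual synchronization is run.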
What this patch is NOT / Limitations
This patch aims to sync connected users and in-use/expired vouchers. Synchronization of allowed IP addresses/hostnames/MACs is out of scope.
This patch is designed to handle a failure of the primary node, not of the secondary one. Because of the very way HA is implemented on pfSense, a failure of the secondary node would have some bad effects on the cluster. In the case of the captive portal, the effect would be some slowness when a user connects or disconnects.
This issue is not specific to the captive portal, and is due to how XMLRPC sync works in pfSense.
The workaround is to manually un-check Captive Portal in the HA settings when the secondary node leaves the cluster.
RADIUS accounting also works fine with HA, but per-user data consumption is not synchronized between nodes.
Developer notes / technical info
This patch implements a new XMLRPC endpoint, pfsense.captive_portal_sync. It was necessary to implement this endpoint because of the bi-directional synchronization (using pfsense.restore_config_section causes many problems, such as triggering a DHCP server restart every time a user connects).
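For anyone curious what a call to such an endpoint looks like on the wire, here is a rough Python sketch that only builds and re-parses an XML-RPC request body using the standard xmlrpc.client module. Only the method name pfsense.captive_portal_sync comes from the patch; the argument layout (zone, action, payload) and the example values are assumptions made up for illustration, so check the pull request for the real parameters.

```python
import xmlrpc.client

# Endpoint name taken from the post; everything else below is hypothetical.
METHOD = "pfsense.captive_portal_sync"


def build_sync_request(zone, action, payload):
    """Serialize one XML-RPC call; the argument struct is an assumed layout."""
    args = {"zone": zone, "action": action, "payload": payload}
    return xmlrpc.client.dumps((args,), methodname=METHOD)


# Example: announce a newly connected user to the peer node (made-up values).
request_xml = build_sync_request(
    "guest",
    "connect_user",
    {"ip": "192.0.2.10", "username": "alice"},
)

# Round-trip the XML to show what the receiving side would decode.
params, method = xmlrpc.client.loads(request_xml)
```

A dedicated method like this carries only captive portal state, which is what avoids the side effects of restoring a whole config section.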
Please don't hesitate to comment if you have questions or feedback to share! ☺