Enabling MIM causes Authentication Error for voucher based logins in Captive Portal
-
On a system configured with multiple VLAN-based Captive Portals, those that are set to "use an authentication backend" with vouchers enabled can no longer log in to the Captive Portal; the login page gives a "could not connect to authentication server" error. Disable MIM on that system and they can log in again. Prior connections continue to work fine as long as a login/authentication is not required. pfSense and Captive Portal are using an SSL certificate, with Kea DHCP patched as per https://redmine.pfsense.org/issues/15321 to support RFC8910 (DHCP option 114).
The MIM and Captive Portal were both on the MIM controller in this configuration with 2 remote systems configured for MIM.
-
stephenw10 (Netgate Administrator), Nov 11, 2024, 2:12 PM (last edited by stephenw10 Nov 11, 2024, 2:14 PM)
Hmm, that patch should not be required for most devices to detect a CP.
I assume clients are still redirected to the login page, they just cannot log in?
Are you able to authenticate against the auth server for other use? In Diag > Auth?
Or are you just using local vouchers there?
Steve
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Hmm, that patch should not be required for most devices to detect a CP.
As of iOS14, Apple introduced DHCP option 114 as described in RFC8910. iOS will not permit http traffic at all; only https traffic is permitted, so the captive portal is not triggered and an error about https being required pops up instead. Quite simply put, iPhones cannot log into the captive portal without DHCP option 114. That option is now supported natively in Kea but not in the current pfSense GUI, thus the patch. This has NOTHING to do with the issue of the 24.11 Beta authentication error when MIM is on. I mention it only for the purpose of documenting the setup I tested it on.
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Are you able to authenticate against the auth server for other use? In Diag > Auth?
Or are you just using local vouchers there?
My setup uses the local user database, vouchers and FreeRadius authentication across 8 VLAN-based Captive Portals. I did not test local user authentication through Captive Portal, and FreeRadius authentication continues to work once MIM is on. The GUI uses the SSL certificate for webconfigurator and it too continues to work fine. To me this appears to be a direct relationship between voucher authentication and setting MIM on. It was 100% repeatable across 3 installs so far.
RE: Diagnostics, Authenticate
Once MIM is turned on, local user database and freeRadius both continue to authenticate under Diagnostics, Authenticate. Only voucher authentication fails and the only place to test that is at the login screen for the specific Captive Portal. No need to even reload the login page with the authentication error displayed, simply turn off MIM and it immediately accepts the voucher.
-
Hmm, so far unable to replicate.
Do you have MIM setup to a remote controller or just have the daemon enabled?
Do you see that error on the login page? In the logs? Both?
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Do you see that error on the login page? In the logs? Both?
On the login page; I will check the logs. I will also add a captive portal with voucher authentication on one of the guest/other 24.11 Beta installs to see if I can duplicate it on an absolutely base install. I have the setup running today on the lab bench and will do a minimal config test this afternoon and give you the results and log info from the current master system.
The setup is 3 XG-7100-1U systems. The master is restored with a copy of our production setup for an RV site, with 8 VLANs/Captive Portals. The error is present whether or not the two guest/slave systems are turned on; it only shows up when you toggle MIM on, on the master. We run a custom captiveportal.inc but I have duplicated this error with the default captiveportal.inc as well.
Nov 11 10:33:08 logportalauth 21857 Zone: vlan30 - Voucher login good for 2614 min.: 7DvMWuKpp2s, f8:95:ea:09:ba:59, 192.168.30.2
Nov 11 10:33:08 logportalauth 21857 Zone: vlan30 - CONCURRENT LOGIN - REUSING OLD SESSION: 7DvMWuKpp2s, f8:95:ea:09:ba:59, 192.168.30.2
MIM Log: Nov 11 10:30:09 pfnet-controller 32816 Done shutting down main
Nov 11 10:29:55 logportalauth 21857 Zone: vlan30 - FAILURE: , f8:95:ea:09:ba:59, 192.168.30.2, Error : could not connect to authentication server.
MIM Log: Nov 11 10:25:14 pfnet-controller 32816 Controller ready
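For anyone wanting to pull the failures out of a longer portal auth log, here is a minimal Python sketch. The regular expression is an assumption inferred only from the log lines quoted in this post, not from any documented pfSense log format, and the function name `failures` is hypothetical.

```python
import re

# Pattern inferred from the lines quoted in this thread (an assumption,
# not a documented format): "<date> logportalauth <pid> Zone: <zone> - <event>: ..."
LINE = re.compile(
    r"^(?P<when>\w{3} +\d+ [\d:]{8}) logportalauth \d+ "
    r"Zone: (?P<zone>\S+) - (?P<event>[^:]+):"
)


def failures(lines):
    """Return (timestamp, zone) for every line whose event is FAILURE."""
    hits = []
    for line in lines:
        match = LINE.match(line)
        if match and match.group("event") == "FAILURE":
            hits.append((match.group("when"), match.group("zone")))
    return hits


# Two of the lines quoted above, copied as-is.
sample = [
    "Nov 11 10:33:08 logportalauth 21857 Zone: vlan30 - Voucher login good for 2614 min.: 7DvMWuKpp2s, f8:95:ea:09:ba:59, 192.168.30.2",
    "Nov 11 10:29:55 logportalauth 21857 Zone: vlan30 - FAILURE: , f8:95:ea:09:ba:59, 192.168.30.2, Error : could not connect to authentication server.",
]

print(failures(sample))  # [('Nov 11 10:29:55', 'vlan30')]
```

Lines that do not match the pattern (such as the MIM log entries) are simply skipped.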
-
So you are testing with multiple CPs set up in each instance?
One potential issue here could be the ports in use. When you have multiple CP instances the ports used escalate. It's possible it's hitting the default MIM ports (8443/8080). Try enabling MIM on alternate ports.
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
It's possible it's hitting the default MIM ports (8443/8080).
8 portals, each using an even port for http and the next odd port for https, so they use ports :8002 through :8017. No conflict there.
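That port arithmetic can be checked with a short Python sketch. It assumes only what is stated here: each portal takes an even port for http and the following odd port for https, starting at 8002. The function name `portal_ports` and its `first_port` parameter are hypothetical and are not pfSense identifiers.

```python
# Sketch only: models the port layout described in this post, not pfSense code.
MIM_DEFAULT_PORTS = {8443, 8080}  # default MIM ports mentioned earlier in the thread


def portal_ports(count, first_port=8002):
    """Return (http, https) pairs: even port for http, next odd port for https."""
    return [(first_port + 2 * i, first_port + 2 * i + 1) for i in range(count)]


pairs = portal_ports(8)
used = {port for pair in pairs for port in pair}

print(min(used), max(used))      # 8002 8017
print(used & MIM_DEFAULT_PORTS)  # set()
```

An empty intersection means none of the 16 portal ports overlaps the default MIM ports under these assumptions.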
I took one of the minimal setups and reconfigured it as a controller and added only one captive portal to it. It continued to work with voucher authentication, independent of MIM being enabled or not. This suggests some other aspect of the more complex Production installation may be involved in addition to the MIM itself.
I then did a full reinstall from the bootable installer and restored our production setup onto it. I did not configure MIM but I simply enabled and disabled it. The behaviour was identical to what I experienced initially, with MIM enabled, I get an authentication error.
At this time I do not need MIM and the fact our Production setup does not work with it is valuable information but not something we need to fix. We simply need to not enable MIM and we are working perfectly under 24.11 Oct 31 Beta and that was the original intent of the lab test of 24.11 Beta in the first place. I am sorry but I can't explain why we are getting the error only that we are. I will continue to test this with future releases but can't afford the time to go any further at the moment.
-
Ok that's good info.
Are you running the radius accounting mods on that when it fails?
-
@stephenw10
I am not sure what you refer to as radius accounting mods, but we do use mods for captive portal (all in captiveportal.inc) that include freeRadius authentication and an override of the re-authenticate every minute setting so that it uses the accounting interval instead (10 minutes). It is easy to stop using them: simply put the original captiveportal.inc that came with 24.11 Beta back and reboot.
All of my testing has been done with the original captiveportal.inc that shipped with 24.11 Beta except for a couple of times where I put the modified captiveportal.inc in to see if anything changed. It did not make any difference.
All of the captive portal pages are also customized for login, error and logout. We use logout as an information dashboard that shows remaining time and data for that account, i.e. the venue url for RFC8910, DHCP Option 114. During my testing, I set the test portal back to the defaults for the full set of login, error, and logout as well. In every case it made no difference.
It will be very difficult to isolate this but now that I know the MIM controller doesn't have to be setup first, just turned on, I will toggle it on and off regularly and hopefully get more info for you.
-
Ok, we will continue trying to replicate it here. Let us know if you manage to narrow it down any further.
-
@stephenw10
One thing that was interesting, I switched the authentication server from freeRadius to Local user and the error changed from unable to reach ... to invalid. That setting should not matter for vouchers and although the error changed, the symptoms remained identical.
-
Mmm, initially it 'felt' to me like it's somehow treating the vouchers as a remote auth server. Hard to see how enabling MIM could make any difference though...
-
@stephenw10
Agreed, especially given the bare system controller test did not display the same symptoms. I have now rebuilt and reproduced the symptoms 4 times. I have a 4 VLAN Captive Portal installation at another site that I can throw onto the lab machine and do the same test. I will try to do that today. That setup has no customization.
-
@EDaleH do you need to have multiple VLANs with CPs or can you reproduce this with a single VLAN with a CP?
-
Mmm also when you hit this does it fail in the same way for all CP instances?
-
@rlinnemann said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
can you reproduce this with a single VLAN with a CP?
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
does it fail in the same way for all CP instances
OK, so far the 8 CP system restored to the lab server is the only setup that fails consistently. Both Local Users and vouchers fail on all Captive Portals that are configured for vouchers. FreeRadius authenticated CPs continue to work fine. I have eliminated the SSL certificate (https vs http), removed my custom CP code, and reset the CP login/error/logout code to defaults, and the failed authentication did not change. It fails on all voucher enabled CPs when MIM is enabled and works fine when it is disabled. Unfortunately I don't have a CP on that system that is just Local user authentication but I will try to isolate that when I get a chance.
I have been unable to reproduce the problem with a new install with single CP and I just bench tested a different site restore that had 4 CPs and VLans, it worked fine too.
I am running out of corners to look in but I will give more thought to what is different between my two restored site tests as they are reasonably similar.
-
Ah OK. Well the fact it only fails on systems with multiple CPs seems like a good clue.
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Ah OK. Well the fact it only fails on systems with multiple CPs seems like a good clue
Perhaps but the clue that worked for me was that the 4 portal restore test that worked fine was on a Plus 24.03 and the 8 portal restore that was causing all the trouble was on a CE 2.7.2 system. So.... I rebuilt the 8 portal lab test and instead of installing 24.11 Beta directly, I installed 24.03, restored the CE 2.7.2 8 portal backup onto it, tested it, then upgraded to 24.11 Beta and it worked just fine, no authentication errors.
Now I was happy but wanted to be sure I found a way to reproduce it as this was a brand new backup of that 8 portal production system. So... I did a fresh install of 24.11 Beta and restored the identical backup onto it and tested it. Voila!, authentication errors when MIM is enabled.
So Advice to everyone, go through 24.03 before you go to 24.11 Beta.
For you Stephen, the cause is hiding in the restore of the config file from a 2.7.2 directly to a 24.11 beta. I guess you can solve it with the traditional slap on the hand and a firm "so, don't do that"?
-
@EDaleH I'm glad it sounds like you've worked around it, but my spidey sense is still tingling here. Can you supply a redacted as necessary config that creates the problem on restore?
-
Just to confirm when you restored the config into 24.11 was that the full config via the webgui? In other words was the config upgrade script run against it?
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Just to confirm when you restored the config into 24.11 was that the full config via the webgui? In other words was the config upgrade script run against it?
The interfaces match on the production and lab units so it is a simple webgui restore that runs without any further intervention and provides a working unit (gateway for Wan has to be changed, which is simple to edit in the config file first, that's it).
To be honest, I don't know what you are referring to as an upgrade script. If that provides an output log, it would be excellent to run it and look over what it changes, not to mention if it fixes the symptoms.
This afternoon I built a 2.7.2 single portal and restored it to a 24.11 directly and did not reproduce the problem. Time permitting, I will make the installation multi-portal and try again.
-
If you import a config that has an older config version than whatever is current for the pfSense version, it gets run through a script to upgrade it to current. That includes code for each config version step.
However the config version is only held in the main <system> section of the config. If you import the full config file the version is seen and any required upgrades are run. But if you import only some section of the config (other than system) the version is unknown and no upgrades are run. That can result in an invalid config.
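The import behaviour described above can be illustrated with a rough Python sketch. This is not the pfSense upgrade code (which is not reproduced here); the sample XML is simplified and its element layout is an assumption, and `CURRENT_VERSION`, `as_tuple` and `upgrade_decision` are hypothetical names. The sketch only shows the decision: a version that is present and older triggers upgrade steps, a missing version triggers none.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = "23.6"  # config version 24.11 is said to use in this thread


def as_tuple(version):
    """Turn '23.3' into (23, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def upgrade_decision(xml_text, current=CURRENT_VERSION):
    """Describe what an importer could do with this XML (illustrative only)."""
    root = ET.fromstring(xml_text)
    node = next(root.iter("version"), None)  # first <version> element, if any
    if node is None or not (node.text or "").strip():
        return "version unknown: no upgrade steps run"
    found = node.text.strip()
    if as_tuple(found) < as_tuple(current):
        return f"upgrade from {found} to {current}"
    return "already current"


# Simplified, made-up fragments: one carries a version, one does not.
full = "<pfsense><version>23.3</version><captiveportal/></pfsense>"
partial = "<captiveportal><zone>vlan30</zone></captiveportal>"

print(upgrade_decision(full))     # upgrade from 23.3 to 23.6
print(upgrade_decision(partial))  # version unknown: no upgrade steps run
```

The second case is the risky one: the section is accepted as-is, so anything that depended on an upgrade step is left in its old form.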
The fact it worked for you importing into 24.03 first hints at a config version problem because it has the same config version as 2.7.2.
https://docs.netgate.com/pfsense/en/latest/releases/versions.html
24.11 actually uses 23.6.
-
@rlinnemann said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
glad it sounds like you've worked around it, but my spidey sense is still tingling here.
Having identified a config version as the cause is the conclusion from my perspective.
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
24.11 actually uses 23.6.
The chart says config ver 23.3 for Plus 24.11 but I am quite satisfied that all restores to 24.11 must be done by restoring from or through (if it is CE 2.7.2) Plus 24.03.
Attempting to answer your questions is what led to the final diagnosis here. It is comforting to know that existing installations have an upgrade path that includes MIM.
-
@EDaleH said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
The chart says config ver 23.3 for Plus 24.11
Yeah that page needs to be updated when 24.11 is released but currently it's using 23.6.
So it could be failing to upgrade the config at import....
-
Doesn't look like it is though. The search continues...
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
doesn't look like it is though. The search continues...
Well, your search may continue for the "fix" but the cause is clearly identified.
This morning I built a 24.03, restored the CE 2.7.2 backup onto it, tested it worked and then Backed it UP. I then upgraded it to 24.11 Beta and it does not display the authentication error when MIM is turned on.
Next, I built a new 24.11 Beta and restored that 24.03 backup onto it and voila! the authentication error is there every time you turn MIM on. Conclusive proof that the only way to get a stable 24.11 Beta in my case is to go through 24.03 and do a GUI upgrade.
-
Right, which really seems like a config upgrade issue at restore. It's not doing something that is done at system upgrade.
But it's more complex than that because I tried exactly that with a basic config and it still worked fine.
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
But it's more complex than that because I tried exactly that with a basic config and it still worked fine.
I have been unable to duplicate it with a fresh install either. This install is as complex as it gets for me and runs flawlessly. I am just trying to ensure it continues to do so under 24.11, Kea and MIM. Lots of lab testing left!
-
@stephenw10
As part of my testing of 24.11 Beta, I had a step to do a backup, fresh install and restore to confirm functionality. I moved that to the top of my list due to the restore issues I had encountered. I can confirm that a backup of a working 24.11 install (i.e. one that came through a 24.03 upgrade) will restore to a fresh 24.11 and work properly without displaying the authentication error.
That suggests the format/processing of the backup config file (V23.3) is by far the most likely cause.
-
Exactly it appears that when you import the 24.03 config into 24.11 it's not being upgraded correctly. But only when the config is sufficiently complex.
Are you able to compare a failing config in 24.11 with a working one?
That looks identical in my testing here but clearly something in your config is hitting an issue.
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Are you able to compare a failing config in 24.11 with a working one?
Well, comparing proved difficult as I had to be extremely careful to build exactly the same setup. When I finally succeeded and had one working (24.03->24.11 restore/upgrade) and one not working (24.11 direct restore) install that was backed up immediately BEFORE any testing, all I came up with was this missing line in the install that didn't work:
</notifications>
<qinqs></qinqs> <-- This line is not there in the "BAD" config backup
The comparison also differed on other items, such as the dhcp leases "db", one captive portal encrypted "db" section, the time of the last revision, and the pkg repo conf path. Other than that, they were identical.
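A comparison like this can be made less noisy by diffing the two backups section by section instead of line by line. The following Python sketch is one way to do that. The function names are hypothetical, and the two sample strings are made-up stand-ins, not the actual configs; only the `<qinqs>` element is taken from the finding above.

```python
import xml.etree.ElementTree as ET


def top_level_sections(xml_text):
    """Map each top-level tag to its serialized content."""
    root = ET.fromstring(xml_text)
    return {child.tag: ET.tostring(child, encoding="unicode").strip() for child in root}


def compare_configs(good_xml, bad_xml):
    """Return (only in good, only in bad, present in both but different)."""
    good, bad = top_level_sections(good_xml), top_level_sections(bad_xml)
    only_good = sorted(good.keys() - bad.keys())
    only_bad = sorted(bad.keys() - good.keys())
    changed = sorted(tag for tag in good.keys() & bad.keys() if good[tag] != bad[tag])
    return only_good, only_bad, changed


# Made-up miniature configs for illustration only.
good_cfg = "<pfsense><notifications/><qinqs></qinqs><revision><time>2</time></revision></pfsense>"
bad_cfg = "<pfsense><notifications/><revision><time>1</time></revision></pfsense>"

print(compare_configs(good_cfg, bad_cfg))  # (['qinqs'], [], ['revision'])
```

Sections that are expected to differ between any two backups (lease databases, revision time) show up in the third list and can be ignored, leaving structural differences such as a missing section in the first two. Note that repeated top-level tags would be collapsed by the dictionary, so this is only a first-pass check.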
-
Hmm, none of that should make any difference.
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Hmm, none of that should make any difference
OK, if it doesn't make any difference then, as I had two appliances, one with a good install and one with a faulty install, I simply took the good backup and restored it to the faulty install and took the faulty system backup and restored it to the good system install.
Well, good stayed good and faulty stayed faulty. The issue is not in the backup, it is in the 24.11 Beta install itself and once "broke", it stays broke.
I will follow up with the results of a restore of the original CE 2.7.2 backup to the good system when I have time. That restore in the past has always been to a fresh 24.11 Beta install, this time it will be to an existing, good 24.11 Beta install. Stay tuned.....
-
Hmm, to be clear, it now looks like a system with a clean 24.11 install fails when given the config from a system that was upgraded to 24.11?
-
@EDaleH said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
restore of the original CE 2.7.2 backup to the good system
OK, the original CE 2.7.2 Backup file repeatably results in a faulty 24.11 Beta (when that installation is fresh/new) resulting in a Voucher/local DB authorization error when MIM is on.
Restoring that file to a 24.03->24.11 upgraded installation, that does not display the authorization error, results in a good installation that does NOT display the authorization error either.
In other words, we have narrowed this down to occurring only when the V23.3 config file from CE 2.7.2 (or Plus 24.03) is restored to a brand new install of 24.11 Beta. If the install was a result of an upgrade from 24.03 (where the V23.3 config file from CE 2.7.2 / Plus 24.03 was already restored in advance) to 24.11 Beta, the authentication error does not occur when MIM is enabled.
Once a 24.11 Beta install is present that does not display the authentication error, an installation of a V23.3 config file does not "break" it.
Again, the only safe way to move CE 2.7.2 or Plus 24.03 to 24.11 Beta is to upgrade through the GUI. A fresh install of 24.11 Beta does not consistently result in MIM compatibility if restored from a prior version backup.
-
Ah, OK! So still looks like an upgrade issue in the config then. Even though the config itself does not look significantly different.
And to be clear still only happens when MIM is enabled?
-
Are you able to take status/diag files from the failing instance and upload them to us to analyze?
If so you can upload it here: https://nc.netgate.com/nextcloud/s/2poNFGxGJ7QZF8C
-
@stephenw10 said in Enabling MIM causes Authentication Error for voucher based logins in Captive Portal:
Are you able to take status/diag files from the failing instance and upload them to us to analyze?
Done, one successful login MIM disabled, one unsuccessful login MIM enabled. See captive portal authentication log.
FYI: Lab systems use XG-7100-1Us and Production systems are on XG-1541 Maxes. No internal switches are used, all are physical interfaces, specifically igc0, ix0 and ix1 for both Lab and Production installations.
I will leave this "broken" installation of 24.11 Beta running on a currently spare XG-7100-1U for a while in case you want more log info. I do have a separate, working installation too if you want logs from an MIM enabled successful login on a "good" 24.11 Beta install to compare to. That system is identical to the one that produced the logs I uploaded.
- 9 days later
-
This morning I updated the "faulty restored" 24.11 Nov 12 Beta to 24.11 Nov 21 Stable and the problem was corrected. However, I also did a fresh install of 24.11 Nov 21 Stable and restored the CE 2.7.2 backup to it, and the issue returned. So any GUI update corrects the problem, but a direct restore of a V23.3 config file (2.7.2 or 24.03) through Diagnostics, Backup & Restore still appears to create the Authentication Error.
My advice is to restore a CE 2.7.2 or a 24.03 Plus to a 24.03 Plus and then do the GUI Update to 24.11 Plus just in case. This appears to be an installation specific backup config versioning issue.
-
Hmm, we still haven't replicated it here. We may simply not have a config sufficiently complex to hit it. IIRC you also didn't hit it on a basic config?