Has anyone been able to get CP working flawlessly on iOS and Mac devices?



  • I need to create a captive portal so that users agree to terms.  I am currently just testing the default login page, but it seems that iOS devices don't respond to the first press of the login button.  I have to press it twice to get to the "success" page.

    Same with the Mac laptops: if I click it once I get a spinning wheel at the bottom of the CP assist page, and if I don't press "continue" a second time, it gives me the error "A problem occurred".

    After that I am able to get online.

    I feel this will give me much grief as the users will think the network is down.

    Any help is appreciated.

    I am currently using 2.3.1_1 Community Edition
    Interfaces: WAN/LAN/OPT1 (OPT1 for CP)



  • I have a lot of users with iPhones / Macs who seem happy with my CP.
    Have you tried replacing the default login page with something more 2016ish?


  • Netgate

    CP and flawless are antonyms. CP, by default, breaks the internet so it cannot be flawless.

    It really does sound like something wrong with your custom page.

    You can always put up a test "hidden" test SSID protected by WPA on a test VLAN with a test portal to test things on. Did I say test enough?



  • I am in the testing stage.  This whole setup is just for testing before we deploy.

    I am just trying it out with the default page, because I didn't want to add another layer of complexity to the service until I know what's broken and what works.

    Have you guys seen any of the issues I described?



  • @kabrutus:

    I need to create a captive portal so that users agree to terms.  I am currently just testing the default login page, but it seems that iOS devices don't respond to the first press of the login button.  I have to press it twice to get to the "success" page.

    Same with the Mac laptops: if I click it once I get a spinning wheel at the bottom of the CP assist page, and if I don't press "continue" a second time, it gives me the error "A problem occurred".

    After that I am able to get online.
    ….

    This is exactly what I have been experiencing over the last couple of days (weeks).
    As soon as I see this "can not connect to the network" error on my iPhone again, I'll take a screenshot.
    On the pfSense captive portal side, the session is opened and the connection exists: the iPhone HAS a connection.

    I tend to think this is related to a recent iOS update, in how the iDevice tests whether it has a connection (the test page at something like "portal.apple.com" that shows "Success" after logging in).
    No change or update on the pfSense side explains this behavior (no changes were made to the captive portal lately).

    This is new. The iPhone has always connected fine to our pfSense portal over the last x years.


  • Netgate

    I guess a static DHCP mapping for a device exhibiting the (hopefully reliably repeatable) behavior followed by a packet capture of that IP address is probably in order.



  • I checked the captive portal web server logs.
    Something is new.

    I connect to the portal, and the entire DHCP session goes well.
    My iPhone obtains an IPv4 address: 192.168.2.176 (not a static lease; I do not need one on my portal).

    Then the captive portal magic kicks in:
    My device (iPhone) has established the Wi-Fi connection and an IP has been obtained.
    iOS checks whether it has open access by requesting "http://captive.apple.com/hotspot-detect.html":
    05-29-2016 09:02:55 Local5.Info 192.168.1.1 May 29 09:03:01 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:01 +0200] "GET /hotspot-detect.html HTTP/1.0" 302 0 "-" "CaptiveNetworkSupport-325.10.1 wispr"

    It's redirected (302) to
    05-29-2016 09:02:56 Local5.Info 192.168.1.1 May 29 09:03:02 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:02 +0200] "GET /index.php?zone=cpzone1&redirurl=http%3A%2F%2Fcaptive.apple.com%2Fhotspot-detect.html HTTP/1.0" 200 1504 "-" "CaptiveNetworkSupport-325.10.1 wispr"
    [my pfSense captive portal]: /index.php?zone=cpzone1&redirurl=http://captive.apple.com/hotspot-detect.html

    Now the strange part, and this is new: my device launches another GET to
    http://captive.apple.com/hotspot-detect.html :
    05-29-2016 09:02:56 Local5.Info 192.168.1.1 May 29 09:03:03 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:03 +0200] "GET /hotspot-detect.html HTTP/1.1" 302 5 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_2 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13F69"

    and of course, pfSense replies with another instance of the login page :
    05-29-2016 09:02:57 Local5.Info 192.168.1.1 May 29 09:03:03 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:03 +0200] "GET /index.php?zone=cpzone1&redirurl=http%3A%2F%2Fcaptive.apple.com%2Fhotspot-detect.html HTTP/1.1" 200 844 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_2 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13F69"

    My style.css gets loaded:
    05-29-2016 09:02:57 Local5.Info 192.168.1.1 May 29 09:03:03 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:03 +0200] "GET /captiveportal-style.css HTTP/1.1" 200 836 "https://portal.brit-hotel-fumel.net:8003/index.php?zone=cpzone1&redirurl=http%3A%2F%2Fcaptive.apple.com%2Fhotspot-detect.html" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_2 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13F69"

    My device tries another test to http://captive.apple.com/hotspot-detect.html :
    05-29-2016 09:02:57 Local5.Info 192.168.1.1 May 29 09:03:03 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:03 +0200] "GET /hotspot-detect.html HTTP/1.0" 302 0 "-" "CaptiveNetworkSupport-325.10.1 wispr"

    pfSense replies to it with another instance / login session :
    05-29-2016 09:02:58 Local5.Info 192.168.1.1 May 29 09:03:04 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:04 +0200] "GET /index.php?zone=cpzone1&redirurl=http%3A%2F%2Fcaptive.apple.com%2Fhotspot-detect.html HTTP/1.0" 200 1504 "-" "CaptiveNetworkSupport-325.10.1 wispr"

    I enter my login credentials and submit (POST):
    05-29-2016 09:03:22 Local5.Info 192.168.1.1 May 29 09:03:28 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:28 +0200] "POST /index.php?zone=cpzone1 HTTP/1.1" 302 5 "https://portal.brit-hotel-fumel.net:8003/index.php?zone=cpzone1&redirurl=http%3A%2F%2Fcaptive.apple.com%2Fhotspot-detect.html" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_2 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13F69"

    and I click "Enter" again, which generates another POST:
    05-29-2016 09:03:33 Local5.Info 192.168.1.1 May 29 09:03:40 pfsense.brit-hotel-fumel.net nginx: 192.168.2.176 - - [29/May/2016:09:03:40 +0200] "POST /index.php?zone=cpzone1 HTTP/1.1" 302 5 "https://portal.brit-hotel-fumel.net:8003/index.php?zone=cpzone1&redirurl=http%3A%2F%2Fcaptive.apple.com%2Fhotspot-detect.html" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_2 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13F69"

    and I get past the login procedure.

    If I do not click the login button a second time on the captive portal login page, it times out with
    https://goo.gl/photos/vrhn6SU4xYf17tWn9
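For anyone who wants to repeat this kind of log walkthrough, here is a minimal sketch, assuming the portal's nginx access log is forwarded through syslog in the "combined" format shown in the lines above. The function name `summarize_portal_log` is made up for illustration; it prints one short line per request for a single client, so the repeated `hotspot-detect.html` probes and the double POST stand out.

```shell
# Summarize captive-portal nginx log lines for one client.
# Reads syslog-forwarded "combined" access-log lines on stdin.
# $1 = client IP. Prints: METHOD PATH STATUS probe|browser
summarize_portal_log() {
    awk -F'"' -v ip="$1" '
        index($1, "nginx: " ip " ") {
            split($2, req, " ")          # req[1]=method, req[2]=URL
            split($3, res, " ")          # res[1]=status, res[2]=bytes
            sub(/\?.*/, "", req[2])      # drop the query string
            # The iOS connectivity probe identifies itself in the user agent
            agent = ($6 ~ /^CaptiveNetworkSupport/) ? "probe" : "browser"
            print req[1], req[2], res[1], agent
        }'
}
```

Usage would look like `summarize_portal_log 192.168.2.176 < portal.log`, where `portal.log` is a hypothetical file holding the syslog lines.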

    I tracked down the first occurrence in my pfSense captive portal server logs: April 14, 2016.
    That was the day I installed 2.3-RELEASE, which came out two days earlier, on April 12, 2016. That was the release in which the web server (lighttpd) was replaced by nginx, right?


  • Netgate

    Still impossible to tell what is going on.

    Do a Diagnostics > Packet Capture while executing the exact same procedure. In the previous case you would filter on 192.168.2.176 in Host Address.



  • I have the same problem. On all devices, not only Apple devices, you need to press the "submit" button twice to get redirected. It wasn't a problem before version 2.3.


  • Netgate

    Still need a packet capture.



  • I can't upload .cap files. The results of "packets captured" I put into a txt file.

    packetcapture_device.txt
    packetcapture_pfsense.txt


  • Netgate

    Not enough detail and what are we looking at? What IP address is the specific client in question?

    Identifying the specific times you were taking the various steps would help too.

    Uploading pcaps would be better. If this site won't take them put them somewhere else.


  • Netgate

    @sbb:

    I can't upload .cap files. The results of "packets captured" I put into a txt file.

    Please at least set the level of detail to Full.





  • Have you tried replacing the default login page with something more 2016ish?

    Since version 2.3, the code of a custom login page must contain a string of code that is present in the
    default login page! Otherwise you will run into problems with:

    • pfSense updates/upgrades (particularly)
    • Login of some CP guests

    I read this in another German-language forum. I'll try to find it again for more details
    and/or the code string that is needed. I am not really sure whether this helps here, but it
    might, so please bear with me if it is not a 100% match.


  • Netgate

    Test it again.

    But this time, before pressing login a second time, check the captive portal sessions and see if the session was actually logged on. It looks to me like it will be. So instead of just pressing login a second time, bring up a web browser and try a site instead.

    This is not a fix, but will confirm you're not dealing with a captive portal login problem but a site redirection problem instead.

    The device capture you sent shows a redirect to htv-tennis.de.

    The pfSense capture you sent shows a redirect to spiegel.de

    Why are they different?

    This is sent to the client after each login page POST.

    HTTP/1.1 302 Moved Temporarily  (text/html)

    Looking at the client capture, after the first redirect is sent the client just sends a "GET / HTTP/1.1" to 212.223.29.33 without establishing the TCP connection first (packet has PSH,ACK flags) so it either thinks it already has a session established or the client is broken. As far as the firewall is concerned, this is out-of-state traffic, will be blocked, and will probably show up in the firewall logs with TCP:PA flags. This connection attempt is retransmitted three times then reset a couple seconds later.

    Then the client does the second login POST. The same HTTP 302 is sent by the portal. This time the client establishes a TCP session to the destination by issuing a SYN. This time it looks like it works.

    There is not enough detail to see exactly what is happening but it initially looks to be a client problem. Maybe nginx is doing something different enough to make it fail but the same redirect is being sent to the client both times - I wonder if there is some sort of HTTP streaming mode that should be disabled on the portal server instance that forces clients to issue new TCP SYNs for every page request or something. A more complete packet capture will show whether the initial HTTP GET by the client after login is part of the pre-login TCP session that was initially intercepted by the portal.

    I think the device capture is the only one necessary. Set the count to zero and capture the entire session. Turn off wi-fi on the device, delete the portal session if any, turn on capture, then turn on the device wi-fi. Don't stop the capture until the actual destination page at least starts to load.
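If you prefer the shell over Diagnostics > Packet Capture, the same full-session capture can be sketched with tcpdump. This is only a sketch under assumptions: the function name `capture_client`, the interface name `em2` and the output path are examples, not values from this thread. Snap length 0 keeps full packets, and leaving out `-c` means the capture runs until you stop it, which matches "set the count to zero".

```shell
# Capture one client's complete traffic to a pcap file.
# $1 = captive portal interface (hypothetical example: em2)
# $2 = client IP, $3 = output file
# TCPDUMP can be overridden, e.g. TCPDUMP=echo for a dry run.
capture_client() {
    # -s 0: full packets; no -c: run until interrupted with Ctrl-C
    ${TCPDUMP:-tcpdump} -i "$1" -s 0 -w "$3" host "$2"
}
```

For example, `capture_client em2 192.168.2.176 /tmp/cp.pcap`, started before the device's wi-fi is turned back on and stopped with Ctrl-C once the destination page starts to load.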


  • Netgate

    I wonder what keepalive_timeout 0 in the portal's nginx.conf file would do here.



  • @Derelict:

    But this time, before pressing login a second time, check the captive portal sessions and see if the session was actually logged on. It looks to me like it will be. So instead of just pressing login a second time, bring up a web browser and try a site instead.

    This, I checked - and you're right.

    After the first click (while capturing) I saw the POST to /index.php come by, and pfSense confirmed on the "Status -> Captive portal" page that the device WAS connected (that is, the needed firewall rules were injected, all communication to the net was possible, etc.).
    Still, the crippled browser that the iPhone uses (I don't recall the exact name of this kind of special login browser) keeps spinning.
    Finally it times out with a completely unrelated error about the AP.
    Before, it almost instantly showed the word "Success", which is the result of GETting a page at captive.apple.com.

    Btw: my APs (Linksys WRT54xx series, among others) have all been working flawlessly for the last 5, 6 or 7 years.

    Anyway, I now know how to capture. I'll make some captures tomorrow.

    I'll also bring my iPad (very old, so not on the latest iOS) to see whether it connects fine, which would point to a recent iOS issue.

    @BlueKobold:

    Have you tried replacing the default login page with something more 2016ish?

    Since version 2.3, the code of a custom login page must contain a string of code that is present in the
    default login page! Otherwise you will run into problems with:

    • pfSense updates/upgrades (particularly)
    • Login of some CP guests

    You mean this :

    I confirm this is what I use in my own page - but, I'll be using the default pages for testing.



  • @Derelict:

    Test it again.

    But this time, before pressing login a second time, check the captive portal sessions and see if the session was actually logged on. It looks to me like it will be. So instead of just pressing login a second time, bring up a web browser and try a site instead.

    This is not a fix, but will confirm you're not dealing with a captive portal login problem but a site redirection problem instead.

    Yeah, same as Gertjan said. I am logged in after clicking the submit button the first time. I can visit other sites, so it's a redirect problem.

    @Derelict:

    The device capture you sent shows a redirect to htv-tennis.de.

    The pfSense capture you sent shows a redirect to spiegel.de

    Why are they different?

    I visited different sites when capturing the IP address of the device and the IP address of the pfSense machine.

    @Derelict:

    I think the device capture is the only one necessary. Set the count to zero and capture the entire session. Turn off wi-fi on the device, delete the portal session if any, turn on capture, then turn on the device wi-fi. Don't stop the capture until the actual destination page at least starts to load.

    Here is a new capture: http://www.file-upload.net/download-11630451/packetcapture.cap.html



  • Please excuse if this is way off, but: do you see blocked packets in the firewall log (dropped by the default rule)?
    Then it might be somehow the same as this: https://forum.pfsense.org/index.php?topic=111964

    Because it would match the general setup - first opened page (that will get redirected) is same as after auth (that should go through), because it is both times the captive portal detection from Apple. If the connection is dropped, it seems like you have not authenticated - the second click on the login button builds a new connection, which will go through (as will any other new connection).



  • @skron:

    Please excuse if this is way off, but: do you see blocked packets in the firewall log (dropped by the default rule)?
    Then it might be somehow the same as this: https://forum.pfsense.org/index.php?topic=111964

    Because it would match the general setup - first opened page (that will get redirected) is same as after auth (that should go through), because it is both times the captive portal detection from Apple. If the connection is dropped, it seems like you have not authenticated - the second click on the login button builds a new connection, which will go through (as will any other new connection).

    Yeah, you could be right. I don't know how to export the firewall log. I tried waiting longer after clicking the submit button and I did get redirected, so it's exactly what you described in the other thread. The problem is that users don't wait that long. A workaround could be to also define an "After authentication Redirection URL".



  • Derelict nailed it.
    https://forum.pfsense.org/index.php?topic=112594.msg627568#msg627568 is the solution.
    No more "double click": after hitting the login button, the "Success" from captive.apple.com shows right away  :D

    Btw: it's NOT a firewall issue (mine hasn't changed in the last several years). It's a web server (nginx) setup issue, in combination with iDevices.

    Derelict will post more info soon.



  • @Gertjan:

    Derelict nailed it.
    https://forum.pfsense.org/index.php?topic=112594.msg627568#msg627568 is the solution.
    No more "double click": after hitting the login button, the "Success" from captive.apple.com shows right away  :D

    Btw: it's NOT a firewall issue (mine hasn't changed in the last several years). It's a web server (nginx) setup issue, in combination with iDevices.

    Derelict will post more info soon.

    I did change keepalive_timeout to 0 also but it didn't work for me. Maybe I am doing something wrong

    edit: works now! I killed the captiveportal.pid again and that did it. But I am afraid every time you edit the captive portal page keepalive_timeout resets to "65". I think Derelict will clarify and write an instruction how to solve it (he pmed me that instruction)


  • Netgate

    I'm pretty sure this will only happen if the initial URL attempted is the same as the redirect URL after authentication. Else a new TCP session will be established and the keepalive will be irrelevant.

    https://redmine.pfsense.org/issues/6421



  • @Derelict:

    I'm pretty sure this will only happen if the initial URL attempted is the same as the redirect URL after authentication….

    I do not redirect, leaving it up to the initial http request where the visitor goes after authentication.




  • @Derelict:

    I'm pretty sure this will only happen if the initial URL attempted is the same as the redirect URL after authentication. Else a new TCP session will be established and the keepalive will be irrelevant.

    https://redmine.pfsense.org/issues/6421

    Yes, that would explain both issues - the one here, and the one in the linked thread.

    In the case of the iOS/Mac devices, it is the device itself that opens the same URL before and after auth (for CP detection) with the initial request.

    Thanks for looking into and opening a new bug on this!



  • Wow, this thread has grown some legs since Friday.  Can someone send me the info on how to fix the issue?  Where would I make this edit?

    I wonder what keepalive_timeout 0 in the portal's nginx.conf file would do here.

    Thanks for all your help.  :)



  • @kabrutus:

    ….
    Can someone send me the info on how to fix the issue?  Where would I make this edit?

    I wonder what keepalive_timeout 0 in the portal's nginx.conf file would do here.

    It's not a secret, I can make it public:

    If you are in a position to change keepalive_timeout from 65 to 0 in the nginx config for the captive portal and kill -HUP that instance of nginx in the shell and test again I think that will fix this.

    It's in /var/etc/nginx-zone_name-CaptivePortal.conf

    ( and if you have a https portal running, change also /var/etc/nginx-zone_name-CaptivePortal-SSL.conf )

    zone_name is of course your captive portal zone name.
    More than ONE zone can exist!

    Open an SSH connection.
    Open the file(s).
    Look for keepalive_timeout 65
    Change it to keepalive_timeout 5
    Save the file(s).
    Now find the nginx processes with:
    ps ax | grep "/nginx"

    This will return something like:

    [2.3.1-RELEASE][admin@pfsense.brit-hotel-fumel.net]/root: ps ax | grep "/nginx"
     6018  -  Is       0:00.00 nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-webConfigurator.conf (nginx)
     7174  -  Is       0:00.01 nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-cpzone1-CaptivePortal.conf (nginx)
     8681  -  Is       0:00.01 nginx: master process /usr/local/sbin/nginx -c /var/etc/nginx-cpzone1-CaptivePortal-SSL.conf (nginx)
    29823  1  S+       0:00.00 grep /nginx
    

    (Note: again, I also have the https portal running, so I have a "/var/etc/nginx-zone_name-CaptivePortal-SSL.conf" instance.)
    In my case this tells me that processes 7174 and 8681 need to be restarted (from here, over SSH, NOT from the GUI!!)

    kill -HUP 7174
    kill -HUP 8681
    

    Done.
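The manual steps above can be sketched as two small shell functions. Treat this as an untested-on-pfSense sketch, not an official procedure: the function names are made up, and `reload_portal_nginx` assumes that `pkill -f` can see the config path in the nginx master's process title, as it appears in the `ps ax` output above. If in doubt, use `ps ax` and `kill -HUP` by hand.

```shell
# Set keepalive_timeout in one captive portal nginx config file.
# $1 = config file, e.g. /var/etc/nginx-cpzone1-CaptivePortal.conf
# $2 = new timeout in seconds (0 disables keep-alive)
set_portal_keepalive() {
    conf=$1
    value=$2
    cp "$conf" "$conf.bak" || return 1    # keep a backup of the original
    tmp=$(mktemp) || return 1
    # Rewrite into a temp file, then copy back so the file keeps its permissions
    sed -E "s/keepalive_timeout[[:space:]]+[0-9]+;/keepalive_timeout ${value};/" \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Send SIGHUP to the nginx master that was started with that config file.
# Assumption: the config path is visible in the master's process title.
reload_portal_nginx() {
    pkill -HUP -f "nginx: master process .*$1"
}
```

Usage, with the zone name from this thread: `set_portal_keepalive /var/etc/nginx-cpzone1-CaptivePortal.conf 0 && reload_portal_nginx /var/etc/nginx-cpzone1-CaptivePortal.conf`, and the same again for the `-SSL.conf` file if the https portal is running. As noted elsewhere in this thread, the config files are regenerated when the portal web server is restarted, so the edit does not survive that.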

    IF, and only IF, you want to make this more permanent (until the next update / upgrade)
    AND
    some PHP editing doesn't scare you
    AND
    you are able to help yourself and repair things if you mess up, and you know that this is heavy beta, so the side effects aren't known ….
    AND
    you accept that with modified pfSense core code you can't ask for help here (no need to explain, right?)

    THEN

    edit /etc/inc/system.inc
    and look for "keepalive_timeout"

    BE CAREFUL : this PHP code is used to start the GUI web server AND the Captive portal server(s).

    I edited mine like this:

    ....
    	sendfile        on;
    
    	access_log      syslog:server=unix:/var/run/log,facility=local5 combined;
    
    EOD;
    
    	if ($captive_portal !== false) {
    		$nginx_config .= "\tlimit_conn_zone \$binary_remote_addr zone=addr:10m;\n";
    		$nginx_config .= "\tkeepalive_timeout 5;\n";
    	}
    	else
    	{
    		$nginx_config .= "\tkeepalive_timeout 65;\n";
    	}
    
            if ($cert <> "" and $key <> "") {
    .....
    

    (NO, I'm not posting a 'diff' :) and yes, the code is very easy to understand for those with some PHP knowledge).

    I decided to edit /etc/inc/system.inc because the web server(s) config file(s) are regenerated every time they start or get restarted (for example when pfSense reboots).

    If in doubt, wait for the update ;)

    PS : Derelict proposed a
    keepalive_timeout 0
    but I used a
    keepalive_timeout 5

    Because "0" disables keep-alive entirely, while "5" is just a much shorter timeout than "65", and "5" worked for me :)



  • Awesome!  Thanks guys!


  • Netgate

    It will be keepalive_timeout 0 in 2.3.1_2.

    https://redmine.pfsense.org/issues/6421