Portal times out on some browsers



  • Perplexing problem: on pfSense 2.0 RC3, some browsers get through the captive portal just fine, some time out, and some hang for a while before being allowed through.

    I traced it down to a difference in how each browser uses TCP. I had assumed the OS handles TCP the same way for every application, but that does not appear to be the case.

    Test results follow (Windows 7 Enterprise SP1 x64). I list the TCP segments that occur after the client is redirected to the portal, logs in, and clicks Submit. The redirurl is google.com in each case. Note that clients have no problem reaching the portal page itself; the problems start after they click the login button. I verified that the client IP address is added correctly to ipfw tables 1 and 2.

    Firefox 3.6.18 (captive portal behaves as expected with low latency)
    ------------
    client -> portal [FIN, ACK]
    client -> google [SYN] (Note:  Firefox correctly sends a new SYN)
    portal -> client [ACK]
    portal -> client [PSH, ACK]
    client -> portal [RST, ACK]
    portal -> client [FIN, ACK]
    google -> client [SYN, ACK]
    client -> google [ACK]
    client -> google [HTTP GET]

    Internet Explorer 9 (hangs for 24 seconds before finally trying the page again)
    ----------
    client -> portal [FIN, ACK]
    client -> google [PSH, ACK] (Note:  IE doesn't even bother sending a new SYN!)
    portal -> client [ACK]
    portal -> client [PSH, ACK]
    client -> portal [RST, ACK]
    portal -> client [FIN, ACK]
    portal -> client [ICMP Host Unreachable]
    client -> google [TCP Retransmission - PSH, ACK] (Note:  a new SYN is still not sent)

    (24 seconds elapse with more ICMP Host Unreachables and TCP Retransmissions)
    ...
    client -> google [RST, ACK] (Note:  now IE gives up and tries to start a new session with google)
    client -> google [SYN]
    google -> client [SYN, ACK]
    client -> google [ACK]
    client -> google [HTTP GET]

    Chrome 12.0.742.122 (times out after 22 seconds, gives a page cannot be displayed error without trying to reload it)
    ----------
    client -> portal [FIN, ACK]
    portal -> client [ACK]
    portal -> client [PSH, ACK]
    client -> portal [RST, ACK]
    portal -> client [FIN, ACK]
    client -> google [PSH, ACK] (Note:  Like IE, Chrome doesn't bother sending a new SYN)
    portal -> client [ICMP Host Unreachable]
    client -> google [TCP Retransmission - PSH, ACK] (Note:  a new SYN is still not sent)

    (22 seconds elapse with more ICMP Host Unreachables and TCP Retransmissions)
    ...
    client -> google [RST, ACK] (Note:  now Chrome gives up; but unlike IE it doesn't even try to start a new session with google, it just says that it can't display the web page)

    I've only tested one other browser (Safari on my iPod touch), and that appears to work normally.

    The other thing to note is that if you navigate to any site other than the original redirurl, it works immediately, as expected. That is, the browser goes directly to the site, since the user's IP address has already been cleared through ipfw. But Chrome and IE will hang if you try going to the original redirurl, unless you close and reopen the browser window; this appears to be enough to convince them to send new SYNs.
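
    The pattern in the traces above can be checked mechanically. Here is a rough Python sketch that flags any destination the client sent data to without first sending a SYN. The function name and the tuple layout are made up for illustration, the trace lists are shortened copies of the Firefox and IE9 captures, and I'm assuming the HTTP GET segment carries PSH, ACK.

```python
def data_without_handshake(segments, client="client"):
    """Return destinations that got client data (PSH) with no prior client SYN.

    `segments` is a list of (src, dst, flags) tuples, flags being a set of
    TCP flag names. This is an illustrative helper, not a pfSense tool.
    """
    syn_sent = set()
    offenders = []
    for src, dst, flags in segments:
        if src != client:
            continue  # only the client's behaviour matters here
        if flags == {"SYN"}:
            syn_sent.add(dst)
        elif "PSH" in flags and dst not in syn_sent and dst not in offenders:
            offenders.append(dst)
    return offenders


# Shortened Firefox 3.6.18 trace: a fresh SYN goes to google before any data.
firefox = [
    ("client", "portal", {"FIN", "ACK"}),
    ("client", "google", {"SYN"}),
    ("portal", "client", {"ACK"}),
    ("google", "client", {"SYN", "ACK"}),
    ("client", "google", {"ACK"}),
    ("client", "google", {"PSH", "ACK"}),  # HTTP GET (assumed PSH, ACK)
]

# Shortened IE9 trace: data is pushed to google with no new SYN.
ie9 = [
    ("client", "portal", {"FIN", "ACK"}),
    ("client", "google", {"PSH", "ACK"}),
    ("portal", "client", {"ACK"}),
    ("client", "google", {"PSH", "ACK"}),  # TCP retransmission
]

print(data_without_handshake(firefox))
print(data_without_handshake(ie9))
```

    With these inputs the Firefox trace comes back clean and the IE9 trace flags google, which matches what I saw in the captures.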

    I'd be grateful for any help anyone can provide.

    Thanks,
    Mike



  • I think I figured this out. It appears that newer browsers open multiple embryonic connections to the user's homepage. Based on my testing, they try to open more than 16 connections, which is the maximum allowed to the portal at the same time.
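
    To make the arithmetic concrete, here is a toy Python model of a per-IP connection cap. It is only a sketch of my theory, not how pfSense actually enforces the limit; the function name and the IP addresses are hypothetical, and the limit of 16 is the figure from my testing.

```python
from collections import Counter


def excess_connections(attempts, per_ip_limit):
    """Map each client IP to the number of simultaneous attempts over the limit.

    `attempts` holds one client IP per concurrent connection attempt.
    Clients at or under the limit are left out of the result.
    """
    counts = Counter(attempts)
    return {ip: n - per_ip_limit for ip, n in counts.items() if n > per_ip_limit}


# Hypothetical example: one browser opens 20 connections at once, another opens 3.
attempts = ["192.168.1.50"] * 20 + ["192.168.1.51"] * 3
print(excess_connections(attempts, per_ip_limit=16))
```

    In this model the first client has 4 connections over the cap and the second has none, which would explain why only browsers that open many connections at once run into trouble.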

    Two questions:

    1. On the portal settings page (Services -> Captive portal -> Captive portal tab), if the maximum concurrent connections setting is blank, does that mean that the default of 4 per IP is used, or does it mean the same as setting it to zero?  When I set it to zero and click Save, the number disappears, so does a blank field mean 4 connections per IP, or the max of 16?

    2. Is there a way to increase the max beyond 16 to see if that resolves the problem with newer browsers attempting to make more than 16 connections?  I didn't see that as a system tunable.

    Update on more browser testing:  this doesn't appear to be a problem on Firefox 5.0 and IE8.  So far I've only seen the problem on Chrome 12.x and IE9, but those represent a significant portion of users, so it would be great to get the problem fixed.

    Thanks,
    Mike

