RADIUS response time - initial authentication sometimes times out on secondary



  • pfSense 2.0.3

    I have a dozen sites using Captive Portal (CP), with users authenticated against primary/secondary RADIUS servers running FreeRADIUS, which query AD. Overall it's a fairly complex setup, but it works very well. However, I've recently been testing RADIUS failover by shutting down the primary RADIUS server, and some (but not all) CP logins are failing.

    A typical syslog entry is…

    2013-08-08 08:35:21 Local4.Info <CP IP> 5 Aug 8 08:35:17 logportalauth[44889]: ERROR: <username>, <client MAC address>, <client IP>, Error sending request: sendto: Host is down
    

    … there are no such errors at all when both RADIUS servers are up, and no errors if I swap the primary and secondary servers.

    I assume that what's happening is this...

    1 CP starts a timer
    2 CP queries primary RADIUS
    3 after a while CP gives up on primary and queries secondary
    4 CP times out
    5 Secondary responds, but too late

    Does this sound reasonable?

    Is there a way to increase the overall amount of time that CP will wait for a response? Wouldn't it be better to apply the timeout on a per-server basis rather than overall (if that is the case)?

    Given that logins sometimes work, I suspect that the timing is very close. Response times of the servers vary between 100 ms and 1000 ms, perhaps longer if they need to reconnect to AD.
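
    To make the suspected sequence concrete, here is a minimal sketch of the timing, assuming CP really does apply one overall timer across both servers (which is only my guess, not confirmed behaviour). The function name and the 3.0 s / 3.5 s figures are hypothetical; only the 100 ms and 1000 ms response times come from what I've observed.

```python
def login_outcome(per_server_wait_s, overall_deadline_s, secondary_response_s):
    """Model the suspected sequence with the primary down.

    Assumption: the portal waits per_server_wait_s on the dead primary,
    then queries the secondary, and a single overall timer covers both.
    """
    # Moment the secondary's reply would arrive, measured from the
    # start of the (hypothetical) overall timer.
    reply_at_s = per_server_wait_s + secondary_response_s
    return "accepted" if reply_at_s <= overall_deadline_s else "timed out"


# Hypothetical waits, chosen only to show how a close margin flips the result.
fast = login_outcome(per_server_wait_s=3.0, overall_deadline_s=3.5,
                     secondary_response_s=0.1)   # secondary answers in 100 ms
slow = login_outcome(per_server_wait_s=3.0, overall_deadline_s=3.5,
                     secondary_response_s=1.0)   # secondary answers in 1000 ms
print(fast, slow)
```

    If the real behaviour is anything like this, a secondary that answers in 100 ms would get in under the wire while one that takes 1000 ms would not, which would match logins failing only some of the time.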



  • Ah, I've found the default values for the timeout and maxtries in /etc/inc/radius.inc, so I'll tinker with those and see how I get on…



  • Did you find a way to configure the RADIUS timeout on CP?

    My error message is:

    "logportalauth[15960]: ERROR: , EA:EA:EA:28:b5:64, 170.1.2.115"

    Auth always works for a domain "XXXX" and almost never for "XXXXXXXXXX" (the name lengths are as shown).

    The RADIUS server is the default one on Windows Server 2008 + AD.



  • Yes, you have to manually edit the values in /etc/inc/radius.inc. They're easy enough to spot. If you're not comfortable with the command line, just use Diagnostics > Edit file.

    Look for a line like the following…

    function addServer($servername = 'localhost', $port = 0, $sharedSecret = 'testing123', $timeout = 3, $maxtries = 2)
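
    As a rough guide to what those two defaults add up to, here is a small sketch of the arithmetic. It assumes every attempt waits the full timeout and every server gets maxtries attempts before the client gives up; that is my reading of the parameter names, not something I've verified in the code. The function name is made up for illustration.

```python
def worst_case_total_wait_s(timeout_s=3, maxtries=2, servers=2):
    """Upper bound on the wait when no RADIUS server answers at all.

    Assumption: each attempt blocks for the full timeout, and each
    configured server is attempted maxtries times in total.
    """
    return timeout_s * maxtries * servers


# Defaults from the addServer() signature, with a primary and a secondary.
print(worst_case_total_wait_s())  # 12
```

    So with two servers configured, a login could in principle sit for up to 12 seconds before failing, and raising either value stretches that further.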

    I've not yet got to the bottom of my problem: increasing the timeouts has not fixed it for me. Unfortunately I've had little time to look into it further, and the system is always in use, so getting access for testing is tricky.

