Freeradius stops and cannot be restarted
Users cannot authenticate at the captive portal login page (error: invalid radius credentials). This happens randomly every few days.
When I look at the system status in the web interface, I find freeradius is stopped. If I try to start the service, there is a notification that the service has started; however, it still displays as stopped and users still cannot log in.
If I reboot pfSense everything starts normally and users can log in ok.
pfSense 2.1 (i386)
freeradius2 (2.1.12_1/2.2.0 pkg v1.6.7_2)
squid (2.7.9 pkg v.4.3.3)
The last few lines of the log seem to indicate a problem connecting to the MySQL database, but why should this result in a crashed service that cannot be started?
$ radiusd -X
rlm_sql (sql): Driver rlm_sql_mysql (module rlm_sql_mysql) loaded and linked
rlm_sql (sql): Attempting to connect to firstname.lastname@example.org:3306/ab2597_radius
rlm_sql (sql): starting 0
rlm_sql (sql): Attempting to connect rlm_sql_mysql #0
rlm_sql_mysql: Starting connect to MySQL server for #0
rlm_sql_mysql: Couldn't connect socket to MySQL server email@example.com:ab2597_radius
rlm_sql_mysql: Mysql error 'Can't connect to MySQL server on '22.214.171.124' (51)'
rlm_sql (sql): Failed to connect DB handle #0
rlm_sql (sql): starting 1
rlm_sql (sql): starting 2
rlm_sql (sql): starting 3
rlm_sql (sql): starting 4
rlm_sql (sql): Failed to connect to any SQL server.
rlm_sql (sql): Processing generate_sql_clients
rlm_sql (sql) in generate_sql_clients: query is SELECT id, nasname, shortname, type, secret, server FROM nas
rlm_sql (sql): Ignoring unconnected handle 4..
rlm_sql (sql): Ignoring unconnected handle 3..
rlm_sql (sql): Ignoring unconnected handle 2..
rlm_sql (sql): Ignoring unconnected handle 1..
rlm_sql (sql): Ignoring unconnected handle 0..
rlm_sql (sql): There are no DB handles to use! skipped 5, tried to connect 0
Failed to load clients from SQL.
rlm_sql (sql): Closing sqlsocket 4
rlm_sql (sql): Closing sqlsocket 3
rlm_sql (sql): Closing sqlsocket 2
rlm_sql (sql): Closing sqlsocket 1
rlm_sql (sql): Closing sqlsocket 0
/usr/pbi/freeradius-i386/etc/raddb/sql.conf: Instantiation failed for module "sql"
/usr/pbi/freeradius-i386/etc/raddb/sites-enabled/default: Failed to find "sql" in the "modules" section.
/usr/pbi/freeradius-i386/etc/raddb/sites-enabled/default: Failed to parse "sql" entry.
/usr/pbi/freeradius-i386/etc/raddb/sites-enabled/default: Errors parsing authorize section.
Hehe. Welcome to the fun of configuring freeradius.
You have readnas = true somewhere in your config (in FreeRADIUS 2.x the option is spelled readclients and lives in sql.conf), which means the list of nas devices is loaded from the db. Since the db is not up, the startup of freeradius fails.
Thanks for your feedback. Any suggestions as to where I might find that setting?
I have looked at /usr/local/etc/raddb/radiusd.conf but can't find any reference to 'readnas'
Or perhaps I'm looking in the wrong place… now I look at the GUI and can see 'Read Clients from Database' is set to 'yes' and the 'RADIUS Client Table' is set to 'nas'
My 'nas' table is empty, so why does Freeradius start at all?
I will try setting 'Read Clients from Database' to 'no' for now, but might it be better to insert some values into table 'nas'?
Freeradius never started, because it was told to load the nas list from the sql db while the sql db was not reachable (or had not been started up yet, if it is on the same machine). It says so in the startup log you posted above.
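To make the point above concrete: with clients being read from SQL, radiusd can only start if the MySQL server accepts a TCP connection at that moment. A minimal sketch of such a reachability check in Python follows. The function name db_reachable is made up for illustration, the default port 3306 is taken from the log above, and it assumes a Python 3 interpreter is available wherever you run it (a stock pfSense install may not have one).

```python
import socket


def db_reachable(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This only checks that something accepts connections on the port;
    it does not log in to MySQL or verify the radius database.
    """
    try:
        # create_connection resolves the host and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable networks
        # and DNS failures (socket.gaierror is a subclass of OSError).
        return False
```

Calling db_reachable with the database host from sql.conf just before starting the service would show whether the "Can't connect to MySQL server" error is a network problem at that moment rather than a freeradius problem.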
Well that's odd, because under 'Status -> Services' it showed as running most of the time (until the random halts). According to the log it should never have started at all?
Anyhow, I have set 'Read Clients from Database' to 'no' and that particular issue appears to be solved, that is, freeradius always shows as running now.
However the original problem still manifests, i.e. users at some point cannot authenticate at the captive portal login page (error: invalid radius credentials). This happens randomly every few days.
The only fix is to reboot pfSense, and then things are fine again for a few hours or days. The interval is random as far as I can tell.
Starting to think about a re-installation.
I noticed these lines in the latest log dump. It sounds serious, but I wonder what the cause and the solution might be?
radiusd: #### Opening IP addresses and Ports ####
type = "auth"
ipaddr = 192.168.120.1
port = 1812
Failed binding to authentication address 192.168.120.1 port 1812: Address already in use
/usr/pbi/freeradius-i386/etc/raddb/radiusd.conf: Error binding to port for 192.168.120.1 port 1812
Something has gone wrong with your setup/install/config I assume.
Freeradius is being re/started (how?) but a previous instance is still running and is hogging the listening ports, hence the new instance cannot start.
Stop radius, then check whether port 1812 is still in use. It might be held by another service.
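One way to do that check, sketched in Python: try to bind a UDP socket (RADIUS authentication uses UDP) to the address and port from the log, and see whether the bind fails with "Address already in use". The function name udp_port_in_use is made up for illustration, and this again assumes a Python 3 interpreter is available on the box. On the pfSense shell itself, sockstat -4 -l lists the listening sockets and the processes that own them, which also tells you who is holding the port.

```python
import errno
import socket


def udp_port_in_use(ip, port):
    """Return True if binding a UDP socket to ip:port fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        # Anything else (for example, the address does not exist on
        # this host) is a different problem, so let it surface.
        raise
    finally:
        # Always release the socket so the check itself does not
        # keep the port busy.
        sock.close()
    return False
```

With the service stopped, udp_port_in_use("192.168.120.1", 1812) returning True would mean an old radiusd (or some other process) is still bound to the port.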