pfSense 1.2.2: "500 Internal Server Error" appears randomly on login page

inthane

Greetings,

I work at a college where we are using pfSense as the gateway for our wireless users. Upon connecting to the wireless network, they are directed to enter a username and password, which are then authenticated against an external RADIUS server. Once authenticated, they are allowed to access the internet.

The pfSense server is running in a virtual machine hosted on a VMware ESX 3 server. It currently has 512 MB of RAM and has been assigned to run on a single CPU, with 1 GHz of reserved CPU time. It is hosted on a 6 GB disk image, which shows plenty of free space. I started with the 1.2 image and updated it to the 1.2.2 image when I first encountered this problem. The CPU and memory logs show plenty of memory and CPU time available to this server at all times.

About once or twice a day, the login page stops working and is replaced with an "Error 500: Internal Server Error" message. People who are already connected do not seem to lose their connections, so it appears to affect only the web server, not the passthrough. Restarting the server fixes the problem for the time being, but I would like a more permanent fix. I'm not well versed in BSD, so I'm not sure what information I should post. I will provide any further information that people request.

Thanks,
Thane

jimp

You might want to check the contents of /var/log/lighttpd.error.log.

You should be able to do this from the WebGUI or the console. Going to Diagnostics > Command, and typing "cat /var/log/lighttpd.error.log" may be enough.
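
If the log gets long, a small filter can hide the routine startup notices so that only the unusual lines remain. This is a minimal sketch, assuming the log has been copied off the box and is read as plain text; the function name `interesting_lines` and the `SAMPLE` string are made up for illustration, with the sample lines taken from this thread.

```python
# Hypothetical helper: drop blank lines and routine "server started" notices.
SAMPLE = """\
2009-02-19 14:00:53: (log.c.97) server started
2009-02-19 14:00:54: (log.c.97) server started
2009-03-04 10:43:42: (mod_fastcgi.c.3568) all handlers for /index.php on .php are down.
"""


def interesting_lines(log_text):
    """Return the log lines that are not 'server started' notices."""
    kept = []
    for line in log_text.splitlines():
        line = line.strip()
        # Skip empty lines and the normal startup message.
        if not line or line.endswith("server started"):
            continue
        kept.append(line)
    return kept


for line in interesting_lines(SAMPLE):
    print(line)
```

To use it on a real file, read the file's contents into a string and pass that string to `interesting_lines`.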

inthane

I'll look at it again the next time the server falls over. Right now the contents of the file are:

2009-02-19 14:00:53: (log.c.97) server started
2009-02-19 14:00:54: (log.c.97) server started

Slam

I've recently been testing FreeRADIUS, and the only time the captive portal gave me this error was when I had forgotten to start RADIUS itself! Obviously that isn't the case for you, but it could be that your RADIUS server is being overloaded and your pfSense box isn't getting replies in time. Just a thought.

You could try adjusting "Maximum concurrent connections" on the captive portal page to reduce the number of connections a user makes.

Slam

inthane

This might be the case. I didn't set up the RADIUS server; someone else told me to point at it. It turns out it's a virtualized Windows Server 2003 box running on 256 MB of memory. I bumped it up to 512 MB, gave it some minimum CPU time and reserved memory, and we'll see if that helps.

inthane

It's been crashing again. I've attached the log file. The relevant chunks, as near as I can tell:

2009-03-04 08:47:53: (mod_fastcgi.c.2494) unexpected end-of-file (perhaps the fastcgi process died): pid: 0 socket: unix:/tmp/php-fastcgi.socket-0
2009-03-04 08:47:53: (mod_fastcgi.c.3325) response not received, request sent: 1222 on socket: unix:/tmp/php-fastcgi.socket-0 for /index.php , closing connection
2009-03-04 09:09:24: (request.c.1153) request-size too long: 2147479552 -> 413
2009-03-04 09:09:25: (request.c.1153) request-size too long: 2147479552 -> 413
2009-03-04 09:18:15: (network_writev.c.115) writev failed: Operation not permitted 18
2009-03-04 09:18:15: (connections.c.606) connection closed: write failed on fd 18

2009-03-04 10:30:03: (network_writev.c.115) writev failed: Operation not permitted 62
2009-03-04 10:30:03: (connections.c.606) connection closed: write failed on fd 62

2009-03-04 10:43:42: (mod_fastcgi.c.1768) connect failed: Connection refused on unix:/tmp/php-fastcgi.socket-0
2009-03-04 10:43:42: (mod_fastcgi.c.2956) backend died; we'll disable it for 5 seconds and send the request to another backend instead: reconnects: 0 load: 195
2009-03-04 10:43:42: (mod_fastcgi.c.3568) all handlers for /index.php on .php are down.
2009-03-04 10:43:48: (mod_fastcgi.c.2769) fcgi-server re-enabled: unix:/tmp/php-fastcgi.socket-0

Except for the first segment, the rest of it repeats over and over. It's all in the logs.
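
To see which messages repeat most often, the log lines can be tallied by the source tag in parentheses (for example `mod_fastcgi.c.3568`). The following is a rough sketch, assuming every line follows the `timestamp: (file.c.line) message` layout shown above; `count_by_source` and `SAMPLE` are illustrative names, not part of pfSense or lighttpd.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpts above:
#   2009-03-04 10:43:42: (mod_fastcgi.c.3568) all handlers for ...
LINE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}: \(([^)]+)\) (.*)$")

SAMPLE = """\
2009-03-04 10:30:03: (network_writev.c.115) writev failed: Operation not permitted 62
2009-03-04 10:30:03: (connections.c.606) connection closed: write failed on fd 62
2009-03-04 10:43:42: (mod_fastcgi.c.1768) connect failed: Connection refused on unix:/tmp/php-fastcgi.socket-0
2009-03-04 10:43:42: (mod_fastcgi.c.2956) backend died; we'll disable it for 5 seconds and send the request to another backend instead: reconnects: 0 load: 195
2009-03-04 10:43:42: (mod_fastcgi.c.3568) all handlers for /index.php on .php are down.
2009-03-04 10:43:48: (mod_fastcgi.c.2769) fcgi-server re-enabled: unix:/tmp/php-fastcgi.socket-0
"""


def count_by_source(log_text):
    """Count log lines per source tag; lines that do not match are ignored."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Print the tags from most to least frequent.
for tag, n in count_by_source(SAMPLE).most_common():
    print(n, tag)
```

Run against the full attached log, a tally like this would show whether the `mod_fastcgi` messages dominate, which would point at the PHP FastCGI backend rather than lighttpd itself.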

Any ideas?

Thanks,
Thane

lighttpd.error.log.txt

inthane

To keep anyone else experiencing this issue updated: I've upgraded to a beta of 1.2.3, and things seem to be working all right. The problem appears to be a PHP bug, as detailed at this link:

http://pecl.php.net/bugs/bug.php?id=12608

I'm not enough of a guru to do a spot PHP upgrade, so I figured moving to FreeBSD 7.1 (since 1.2.3 is moving to it) would help with the issue. Whaddya know, it seems to be working so far. ;D