pfSense 1.2.2: 500 Internal Server Error appears randomly on login page



  • Greetings,

    I work at a college where we use pfSense as the gateway for our wireless users.  Upon connecting to the wireless network, users are prompted for a username and password, which are then authenticated against an external RADIUS server.  Once authenticated, they are allowed to access the internet.

    The pfSense server is running in a virtual machine hosted on a VMware ESX 3 server.  It currently has 512 MB of RAM and is assigned a single CPU, with 1 GHz of reserved CPU time.  It is hosted on a 6 GB disk image, which shows plenty of free space.  I started with the 1.2 image and updated it to the 1.2.2 image when I first encountered this problem.  The CPU and memory logs show plenty of memory and CPU time available to this server at all times.

    About once or twice a day, the login page stops working and is replaced with an "Error 500: Internal Server Error" message.  People who are already connected do not seem to lose their connections, so it appears to affect only the web server, not pass-through traffic.  Restarting the server fixes the problem temporarily, but I would like a more permanent fix.  I'm not well versed in BSD, so I'm not sure what information I should post.  I will provide any further information that people request.

    Thanks,
    Thane
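
Since the failures happen only once or twice a day, it can help to record exactly when the login page starts returning 500 so the times can be matched against the server logs later. Below is a minimal Python sketch of such a probe, meant to run on any machine on the wireless network. The function names and the default 60-second interval are illustrative assumptions, not anything pfSense provides, and the URL passed in must be whatever address your portal login page actually uses.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime


def probe(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx responses as HTTPError; keep the status code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return None


def describe(status):
    """Map a status code (or None) to a short label."""
    if status is None:
        return "no response"
    if status >= 500:
        return "server error"
    if status >= 400:
        return "client error"
    return "ok"


def log_line(url, status, now=None):
    """Format one timestamped result line."""
    now = now or datetime.now()
    return f"{now:%Y-%m-%d %H:%M:%S} {url} {status} {describe(status)}"


def watch(url, interval=60.0):
    """Poll url forever, printing one line per check (hypothetical helper)."""
    while True:
        print(log_line(url, probe(url)), flush=True)
        time.sleep(interval)
```

Calling `watch()` with the login page address and redirecting the output to a file gives a simple timeline; the first "server error" line marks roughly when the web server stopped answering properly.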


  • Rebel Alliance Developer Netgate

    You might want to check the contents of /var/log/lighttpd.error.log

    You should be able to do this from the WebGUI or the console. Going to Diagnostics > Command, and typing "cat /var/log/lighttpd.error.log" may be enough.



  • I'll look at it again the next time the server falls over.  Right now the contents of the file are:

    2009-02-19 14:00:53: (log.c.97) server started
    2009-02-19 14:00:54: (log.c.97) server started
    


  • I've recently been testing FreeRADIUS, and the only time the captive portal gave me this error was when I had forgotten to start RADIUS itself!  Obviously this isn't the case for you, but it could be that your RADIUS server is being overloaded and your pfSense box isn't getting replies in time.  Just a thought.

    You could try playing around with "Maximum concurrent connections" on the CP page to reduce the number of connections a user makes.

    Slam



  • This might be the case - I didn't set up the RADIUS server; someone else told me to point at it.  It turns out it's a virtualized Windows 2003 Server box running on 256 MB of memory.  I bumped it up to 512 MB, gave it some minimum CPU time and reserved memory, and we'll see if that helps.



  • It's been crashing again.  I've attached the log file.  The relevant chunks, as near as I can tell:

    2009-03-04 08:47:53: (mod_fastcgi.c.2494) unexpected end-of-file (perhaps the fastcgi process died): pid: 0 socket: unix:/tmp/php-fastcgi.socket-0
    2009-03-04 08:47:53: (mod_fastcgi.c.3325) response not received, request sent: 1222 on socket: unix:/tmp/php-fastcgi.socket-0 for /index.php , closing connection
    2009-03-04 09:09:24: (request.c.1153) request-size too long: 2147479552 -> 413
    2009-03-04 09:09:25: (request.c.1153) request-size too long: 2147479552 -> 413
    2009-03-04 09:18:15: (network_writev.c.115) writev failed: Operation not permitted 18
    2009-03-04 09:18:15: (connections.c.606) connection closed: write failed on fd 18

    2009-03-04 10:30:03: (network_writev.c.115) writev failed: Operation not permitted 62
    2009-03-04 10:30:03: (connections.c.606) connection closed: write failed on fd 62

    2009-03-04 10:43:42: (mod_fastcgi.c.1768) connect failed: Connection refused on unix:/tmp/php-fastcgi.socket-0
    2009-03-04 10:43:42: (mod_fastcgi.c.2956) backend died; we'll disable it for 5 seconds and send the request to another backend instead: reconnects: 0 load: 195
    2009-03-04 10:43:42: (mod_fastcgi.c.3568) all handlers for  /index.php on .php are down.
    2009-03-04 10:43:48: (mod_fastcgi.c.2769) fcgi-server re-enabled: unix:/tmp/php-fastcgi.socket-0

    Except for the first segment, the rest of it repeats over and over.  It's all in the logs.

    Any ideas?

    Thanks,
    Thane

    lighttpd.error.log.txt
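
With a log that repeats like this, counting entries per source location makes it easier to see which errors dominate and when each one first appeared. The following is a rough Python sketch that does this for lines in the format quoted above. The names `parse_line` and `summarize` are made up for illustration, and the pattern is inferred only from the excerpts in this thread, so other lighttpd versions may log differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   2009-03-04 08:47:53: (mod_fastcgi.c.2494) unexpected end-of-file ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): \(([^)]+)\) (.*)$"
)


def parse_line(line):
    """Split one log line into (timestamp, location, message), or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groups()


def summarize(lines):
    """Count entries per source location and note when each first appeared."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip blank or unrecognized lines
        timestamp, location, _message = parsed
        counts[location] += 1
        first_seen.setdefault(location, timestamp)
    return counts, first_seen
```

To use it, open a downloaded copy of the log, pass the file object to `summarize()`, and print `counts.most_common()`. In the excerpts above, the `mod_fastcgi.c` entries are the ones that line up with the 500 errors, since they report the PHP FastCGI backend as dead or refusing connections.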



  • To keep anyone else experiencing this issue updated: I've upgraded to a beta of 1.2.3, and things seem to be working all right.  The problem appears to be a PHP bug, as detailed in this link:

    http://pecl.php.net/bugs/bug.php?id=12608

    I'm not enough of a guru to do a spot PHP upgrade, so I figured moving to FreeBSD 7.1 (as 1.2.3 is moving to it) would help with the issue.  Whaddya know, it seems to be working so far.  ;D

