pfSense crashes after 5-9 days of running time



  • Hi,

    Our pfSense 1.0beta2 on a WRAP board with 3 NICs crashes (no response on any interface) after 5-9 days of running time. Because I don't have a syslog server, I cannot look up
    the cause. The internal log isn't available because the system runs from a CF card.

    I haven't set the flag "Disable writing log files to the local disk".
    Should I set it? Perhaps the logs that are written during the running time get too big?
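Since the missing syslog server is what blocks the diagnosis here, a throwaway UDP listener on any spare machine may be enough to capture the last messages before a lockup. This is only a minimal sketch: it assumes the firewall is configured to send its logs to this machine over the standard syslog UDP port 514 (binding to it usually needs root), and the output file name `pfsense-remote.log` is made up for illustration.

```python
import socket
from datetime import datetime


def decode_pri(message: bytes):
    """Split a syslog datagram into (facility, severity, text).

    Returns (None, None, text) when there is no leading <PRI> field.
    """
    text = message.decode("utf-8", errors="replace").rstrip("\n")
    if text.startswith("<"):
        end = text.find(">")
        digits = text[1:end] if end > 1 else ""
        if digits.isascii() and digits.isdigit():
            pri = int(digits)
            # PRI encodes facility * 8 + severity.
            return pri // 8, pri % 8, text[end + 1:]
    return None, None, text


def serve(host="0.0.0.0", port=514, logfile="pfsense-remote.log"):
    """Append every received datagram to logfile (hypothetical name)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    with open(logfile, "a", encoding="utf-8") as out:
        while True:
            data, (sender, _) = sock.recvfrom(4096)
            facility, severity, text = decode_pri(data)
            stamp = datetime.now().isoformat(timespec="seconds")
            out.write(f"{stamp} {sender} fac={facility} sev={severity} {text}\n")
            out.flush()  # keep the file current in case the sender dies


if __name__ == "__main__":
    serve()
```

The file is flushed after every message so that whatever arrived just before the box went silent is already on disk.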



  • Please upgrade to the newer version that can be found here: http://pfsense.com/~sullrich/RELENG_1_SNAPSHOT_03-19-2006/ . I'm running several WRAPs, one of them in a production environment, and they are rock stable with uptimes longer than you mention. I didn't have issues with the original beta2 either, though most of them are now running the snapshot posted above.



  • Check the power supply and make sure you are getting enough power.



  • …especially if you run add-on cards in the mini-PCI slot.



  • What kind of CF card are you running and how much memory is on your WRAP board?

    The cheaper the CF card, the fewer read/write cycles you are going to get. Unless that card has been hammered for over a year, I can't see it running out of read/write cycles, but depending on the size of your CF card, it may be filling up. Look at your disk space usage if you can get it to boot.

    I have always had good luck with the cheaper brand PNY, but for critical things I love the SanDisk Extremes. I picked up my 256 MB CF cards for around $20 on eBay, and these were brand-new SanDisk Extremes.



  • I'm now having the same issue at one of my client-end buildings.

    I have a 400 watt power supply, 633 MHz Celeron, Tyan motherboard, 256 MB of RAM.

    As a precaution I increased the firewall states to 25000 and set it to aggressive.

    It seems that about every 5-9 days now it just locks up. Nothing found in the logs or anywhere else. It just stops running.



  • Make sure you have no shared IRQs and deactivate all unneeded items in the BIOS, such as the parallel port, serial ports, etc.



  • It did it again today… I haven't checked the IRQs yet, but it seems like it is still passing traffic. I haven't heard back from the clients yet, but when I saw it today the Ethernet LEDs were still blinking. Maybe it just stops responding to pings? I have seen many devices do this while working at an ISP.



  • Just checked some of my production systems. The longest uptime is a Nexcom CARP cluster with several IPsec tunnels to various other devices (Cisco, SonicWall, other pfSense boxes), handling a /30 public IP subnet, with the PPTP server enabled and several clients using it daily… 80+ days. A WRAP at our office has around 18 days of uptime and has transferred 35 GB in this period... You really have some other issue. Check your network for conflicting IPs, defective switches, cables, or other hardware... I don't know.



  • OK, it's not crashing, it just stops responding to pings. Apparently it is still passing traffic; I just can't ping anything. I didn't touch it after it locked up/stopped responding, and two hours later it is back to working and responding.

    Nothing significant in any of the logs. This is the log covering the time from when I SSH'd into it, after rebooting it following the earlier lockup, up to when it started responding again.

    Apr 13 17:17:30 kernel: arplookup 66.250.241.193 failed: host is not on local network
    Apr 13 17:17:30 kernel: arplookup 66.250.241.193 failed: host is not on local network
    Apr 13 16:55:46 last message repeated 2 times
    Apr 13 16:48:46 kernel: arplookup 66.250.241.193 failed: host is not on local network
    Apr 13 13:52:25 sshd[10490]: Accepted keyboard-interactive/pam for admin from 65.16.27.190 port 1142 ssh2

    So from about 14:00 till about 17:00 it didn't respond at all. Then magically, a few hours later, everything is responding.
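The quiet window can be read straight off the timestamps in the log excerpt above. Here is a small sketch that finds the largest gap between consecutive entries. It assumes classic BSD syslog timestamps (`Mon DD HH:MM:SS`, no year, English month names), so the year 2006 is only a guess based on the snapshot dates in this thread. A silent log does not prove the box was unreachable; it only shows where to look.

```python
from datetime import datetime

# Log lines copied from the post above (newest first, as pasted).
LOG = """\
Apr 13 17:17:30 kernel: arplookup 66.250.241.193 failed: host is not on local network
Apr 13 17:17:30 kernel: arplookup 66.250.241.193 failed: host is not on local network
Apr 13 16:55:46 last message repeated 2 times
Apr 13 16:48:46 kernel: arplookup 66.250.241.193 failed: host is not on local network
Apr 13 13:52:25 sshd[10490]: Accepted keyboard-interactive/pam for admin from 65.16.27.190 port 1142 ssh2
"""


def largest_gap(lines, year=2006):
    """Return (start, end, gap) for the longest silence between log entries.

    year is an assumption: BSD syslog timestamps do not carry one.
    """
    stamps = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The first 15 characters hold the timestamp, e.g. "Apr 13 17:17:30".
        stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
    stamps.sort()  # order does not matter on input; sort oldest to newest
    best = None
    for earlier, later in zip(stamps, stamps[1:]):
        if best is None or later - earlier > best[2]:
            best = (earlier, later, later - earlier)
    return best


if __name__ == "__main__":
    start, end, gap = largest_gap(LOG.splitlines())
    print(f"Longest silence: {start:%H:%M:%S} to {end:%H:%M:%S} ({gap})")
```

On this excerpt the longest silence runs from the SSH login at 13:52:25 to the first arplookup message at 16:48:46, which lines up with the roughly 14:00 to 17:00 window.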



  • I was getting the same behavior here (not embedded) from version SNAPSHOT_04-03-2006 through SNAPSHOT_04-08-2006.

    The webGUI interface just stops responding, and the same goes for pinging the box.

    The traffic is still passing in and out from the Internet, and the block and NAT rules are still working…

    I have no clue why this happened; the logs didn't show me any relevant information about errors or anything :(

    Anyway, it was solved with the last update, SNAPSHOT_04-12-06!!!

    Thanks for the help you guys!

    Emanuel Gonzalez
    Guatemala



  • I think my issue is that they keep this link hammered at 3 Mb almost 24/7. The WISP I work for is new to traffic shaping and such; originally the client was just given a backhaul link but was sold 3 Mb. So this is a great thing for both us and the client, as the client has not noticed any difference in speed.



  • @EmanuelG:

    Well, in my case I'm now using Beta3, and the problem is gone  :D

    Awesome job guys, I have no idea how I worked without pfSense all these years  :D  !!!

    Emanuel Gonzalez
    Guatemala



  • Upgraded to beta 3, but before I did, it happened two more times earlier that day. Sometimes it lasts up to two hours of not being pingable, reachable over SSH, etc. Other times it only lasts 20 min.

    So we will see how beta 3 does. So far beta 3 seems much more responsive, just going through the menus and such. Great work!

