Help troubleshooting looping crash



  • Hi all,

    I was running 2.4.4 as a Hyper-V VM. Today the pfSense VM crashed and did not recover automatically after the reboot; the instance ended up in a reboot loop. I went into single-user mode and ran /sbin/fsck -y -t ufs a number of times until I saw no more errors. After rebooting, the system was corrupted: no working console, no TCP/IP, etc., and lots of errors about missing files.

    So I applied a snapshot from two months ago, which was 2.4.1, and I'm back in business. However, is there anything I can do to try to learn what went wrong? I saved the FUBAR snapshot, so I should be able to get back into it. This happened once before, and my hunch is that it happens under very high outbound traffic. The pfSense instance has 2 vCPUs and 2GB memory. I'm happy to assign it more resources if necessary. The firewall handles lots of NAT traversal traffic for mail server, web servers, FTP, VPN, Owncloud, etc. Somewhere in the ballpark of 15 static IPs NATed through, 4 VLANs, and DHCP serving multiple subnets.

    Any ideas would be appreciated.

    Thanks,

    Matt



  • I did discover that the Windows 2012 R2 host server initiated an unexpected Windows Update reboot at that time, which most likely induced the corruption. Assuming (always a risky proposition) that the VM didn't complete its graceful shutdown before the host server rebooted, I can see how this might contribute to data corruption within the guest OS. It's a new(er) server, so I didn't catch that auto-restart was somehow still enabled.


  • Netgate Administrator

    Simply running at high load should not cause filesystem damage or trigger a reboot.

    It sounds like it won't happen again with auto-restart disabled, but in a situation where it might happen, you should consider running ZFS. It's more resilient to unexpected power loss in general.

    Steve



  • Hi Steve,

    Thanks for the advice. I'll read up on that during vacation this week.

    Greatly appreciated and Merry Christmas,

    Matt



  • @stephenw10

    Created a new instance of 2.4.4 using ZFS and restored my config. As expected, I'm using more system memory with ZFS. Without any IPS/IDS packages the system is using approximately 44% of the 2GB assigned. At some point I'm going to deploy either Snort or Suricata. Would pfSense benefit from additional system memory even without those packages or is 2GB sufficient with ZFS for a "basic" config? As it is, it's running just fine as far as I can tell.

    Thanks for the assistance,

    Matt


  • Netgate Administrator

    I would think that's fine. It really depends on how much disk space you've given it.

    If the memory use isn't rising unreasonably you're almost certainly OK. At least until you add Snort/Suricata.

    Steve


  • Rebel Alliance Developer Netgate

    Be aware that ZFS may behave unexpectedly in a virtual environment. ZFS is copy-on-write, so it doesn't play well with thin-provisioned storage: eventually it will take up the entire space allocated to its disk(s).
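
To illustrate the copy-on-write point, here is a toy Python model, not a description of how ZFS actually allocates space. It assumes a thin-provisioned disk has to back every block the guest has ever written, that freed blocks are never handed back to the host (no TRIM/UNMAP), and that the allocator reuses free blocks in simple first-in, first-out order. The function name `blocks_touched` and all the numbers are made up for the example.

```python
from collections import deque


def blocks_touched(total_blocks, live_blocks, rewrites, copy_on_write):
    """Count the distinct blocks the guest has ever written.

    In this simplified model, that count approximates how large a
    thin-provisioned virtual disk becomes on the host.
    """
    # Initial data occupies the first `live_blocks` physical blocks.
    location = list(range(live_blocks))   # logical block -> physical block
    touched = set(location)
    # Assumption: free blocks are handed out in FIFO order.
    free = deque(range(live_blocks, total_blocks))

    for i in range(rewrites):
        logical = i % live_blocks
        if copy_on_write:
            # Write the new version elsewhere, then free the old copy.
            new_block = free.popleft()
            free.append(location[logical])
            location[logical] = new_block
        # An in-place filesystem simply overwrites the same physical block.
        touched.add(location[logical])

    return len(touched)


if __name__ == "__main__":
    # 1000-block disk holding 100 blocks of data that get rewritten 5000 times.
    print(blocks_touched(1000, 100, 5000, copy_on_write=False))  # 100
    print(blocks_touched(1000, 100, 5000, copy_on_write=True))   # 1000
```

In this model the amount of live data never changes, yet the copy-on-write case ends up having written to every block on the disk, which is why the thin-provisioned file grows toward its full allocated size.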

