Random reboots pfSense 2.1.5 VM [SOLVED]
-
I'm experiencing random reboots on my pfSense firewall, which runs as a VM under vSphere 5.5.
Has anyone else experienced this problem?
Earlier this year I got random kernel panics, so there was at least a log (this happened about once a month on average).
Now there is nothing: it just dies and starts itself up again, and then everything is fine for anywhere between 8 hours and 1 week.
After the kernel panics I opened a support ticket. They couldn't see anything obvious at the time, but I was recommended to move pfSense from x86 to x64 to better match the real hardware.
I tried this but wasn't successful, so I went back to x86.
Some info:
pfSense 2.1.0 x86 (since updated to 2.1.5 x86)
vSphere 5.5.0, 1623387
HP ProLiant DL385 G7 (latest FW)
Broadcom NC382i + Intel 82580 (total 8 NICs)
VM Hardware version 8
^ everything is on the vSphere HCL
Thank you!
-
My config is similar and I've never seen that. I would recommend:
1. Make a backup of your config via Diagnostics - Backup/Restore
2. Install pfSense 2.1.5-i386
3. Restore your backup file
Make sure you snap your VM before doing any of this. Note that upgrading from i386 to x64 is not recommended, from what I recall. You can't take an i386 backup file and use it to restore on an x64 config.
-
there is no reason to run x64 of pfsense unless you are giving it more than 4GB of ram, etc.
You are on an old version of pfSense; 2.1.5 i386 is what I would be on. Your ESXi is OLD. The current build is 2143827, which came out 10/15. Your build is the original release of Update 1, and there have been about 6 patches since then.
Why do you run hardware version 8, and not 9 or 10? I can understand not going to 10 if you're using the free version and the client to manage your ESXi host. But it's easy enough to move to 9: just upgrade to 10 and then back it off to 9 in the vmx file.
-
Mysterious.
I don't see any reason to go to x64 either if memory isn't an issue. And my box uses ~150-200MB.
The ESXi version is only "a bit behind". Running 3.5 is OLD :) I will upgrade, but it takes some planning to do so when 250 users need connectivity.
The same goes for pfSense itself, of course. And I can't see anything in the release notes of either the newer ESXi releases or the newer pfSense versions that touches on random reboots.
HW version 8 or 9 is king of the hill since it works with everything! But upgrading wouldn't hurt, I guess.
So you say that the following steps would probably solve the random reboots:
Going to ESXi 5.5-2143827
Upgrading to pfSense 2.1.5
Updating vHW to 10
-
don't upgrade to HW10 if you run the FREE esxi hypervisor ( you won't be able to edit the VM after update to 10)
Do you have a virtual CD drive on the pfSense VM? If yes ==> remove it and restart the VM.
See the related post here: https://forum.pfsense.org/index.php?topic=82849.0
-
I know, but I'm not.
I was able to catch some interesting stuff today when this happened:
Nov 26 08:43:16 kernel: ZFS storage pool version 28
Nov 26 08:43:16 kernel: ZFS filesystem version 5
Nov 26 08:43:16 kernel: in /boot/loader.conf.
Nov 26 08:43:16 kernel: Consider tuning vm.kmem_size and vm.kmem_size_max
Nov 26 08:43:16 kernel: ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior.
Nov 26 08:43:16 kernel: add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
Nov 26 08:43:16 kernel: ZFS NOTICE: Prefetch is disabled by default on i386 – to enable,
Nov 26 08:43:16 kernel: WARNING: / was not properly dismounted
Nov 26 08:43:16 kernel: Trying to mount root from ufs:/dev/da0s1a
Nov 26 08:43:16 kernel: SMP: AP CPU #2 Launched!
Nov 26 08:43:16 kernel: SMP: AP CPU #4 Launched!
Nov 26 08:43:16 kernel: SMP: AP CPU #3 Launched!
Nov 26 08:43:16 kernel: SMP: AP CPU #5 Launched!
Nov 26 08:43:16 kernel: SMP: AP CPU #1 Launched!
Nov 26 08:43:16 kernel: da0: 51200MB (104857600 512 byte sectors: 255H 63S/T 6527C)
Nov 26 08:43:16 kernel: da0: Command Queueing enabled
Nov 26 08:43:16 kernel: da0: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit)
Nov 26 08:43:16 kernel: da0: <VMware Virtual disk 1.0> Fixed Direct Access SCSI-2 device
Nov 26 08:43:16 kernel: da0 at mpt0 bus 0 scbus0 target 0 lun 0
Could it be a lead? AFAIK, ZFS isn't used on pfSense.
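For anyone who wants to count how often this happens, here is a minimal Python sketch that scans syslog-style lines like the ones above for the "was not properly dismounted" warning, which marks a boot that followed an unclean shutdown. The function name `unclean_boots` is made up for illustration, and the year is an assumption because syslog lines do not carry one; how you export the log lines from pfSense is left open.

```python
import re
from datetime import datetime

# Matches BSD-style syslog lines such as
# "Nov 26 08:43:16 kernel: WARNING: / was not properly dismounted"
LINE_RE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) kernel: (.*)$")


def unclean_boots(lines, year):
    """Return the timestamps of boots that followed an unclean shutdown."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "was not properly dismounted" in m.group(2):
            # syslog omits the year, so the caller supplies it (assumption)
            stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            hits.append(stamp)
    return hits


# Three lines copied from the log excerpt above; 2014 is an assumed year.
sample = [
    "Nov 26 08:43:16 kernel: ZFS filesystem version 5",
    "Nov 26 08:43:16 kernel: WARNING: / was not properly dismounted",
    "Nov 26 08:43:16 kernel: Trying to mount root from ufs:/dev/da0s1a",
]
print(unclean_boots(sample, 2014))
```

Each returned timestamp is a boot time, not the crash time itself, so the crash happened somewhat earlier.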
-
that warning is "normal" … i see it on all my installs
check the virtual cd-drive ;)
-
that warning is "normal" … i see it on all my installs
check the virtual cd-drive ;)
Will do that at lunch time :) Thanks for the suggestion. Let's see what happens!
-
Upgraded pfSense to 2.1.5 and removed the virtual CD.
Just had another reboot, so that didn't help :( This is starting to be pretty critical.
I wonder if x64 will help. But I've never seen that it's "needed" on other systems (like Linux and Windows).
The ONLY time I've seen firewalls restart like this was on a virtual Clavister. That was due to a bug in the AES hardware encryption/decryption when VPN connections were made.
Can this be something related to that? The last few times, I've noticed that an OpenVPN connection was made just before the restart. Will keep an eye on the log.
Reboots have never happened at night/production time.
-
Things to consider with HW.
CPUs need microcode updates to fix bugs in the CPU, which are normally delivered via OS updates like Windows Update and some Linux updates.
If you don't have an OS VM running on your machine which can update the CPU, can you check to see if your CPU needs an update and, if so, whether it has been patched? https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=14303
Likewise, if you are running Intel chips capable of supporting AMT, can you be sure no one is messing with your machine via AMT? The OOB makes for a great back door into people's systems irrespective of their firewall and other security constraints, be it in a data centre or office block.
It could be a simple RAM chip failing if the system is old, or it could be sectors dropping out on the drive which might be causing problems. That's the problem with random reboots: it's not always obvious what's at fault.
There are also "magic" packets which can be sent out and which can mess with some machines, if someone was "playing" with your system. But in all honesty it's probably a simple bit of HW failure causing the random reboots: maybe a RAM chip not seated properly, or maybe some dust build-up shorting something (these things are like vacuum cleaners in the wrong places).
-
I run the latest firmwares from HP. And AMD doesn't have AMT.
The server is pretty well isolated, so I'm as sure as I can be that this isn't someone messing with it… it should produce some log indication of this as well.
I've handled loads of servers, and I've never seen one that has not been able to report errors in registered ECC memory.
The RAID0 should be able to handle disk failures.
I don't believe that this is something hardware related.
-
don't upgrade to HW10 if you run the FREE esxi hypervisor ( you won't be able to edit the VM after update to 10)
Actually, this is no longer true.
Current build of client/esxi allows to edit vmx-10
I'm not saying updating fixes the problem, but not working with current versions is a support issue. What is the first thing any support tells you when you call? ;) If you're wanting to track down a bug, why would you track it down on old versions?
-
Current build of client/esxi allows to edit vmx-10
Thanks for the tip. I had no idea. I was still on the original 5.5.0 release of vi-client. Thank the stars that I don't have to use their annoying web client. I know the web client is the future, but I find it a PITA to use.
-
there is no reason to run x64 of pfsense unless you are giving it more than 4GB of ram, etc.
Not true. There likely aren't any functional differences in that case, but 32 bit is a dying breed, every 64 bit capable system should be running 64 bit. FreeNAS and Dragonfly both just put out their last releases with 32 bit support. We'll stop putting out 32 bit releases before too long, maybe a year or two down the road. We do much more testing on 64 bit than 32, and 64 is more widely used, so less chance of issues there. I'm not aware of any architecture-specific issues in 2.2, but if there are any, they're likely 32 bit only.
@KOM:
2. Install pfSense 2.1.5-i386
No, don't do that, use 64 bit.
@KOM:
You can't take an i386 backup file and use it to restore on an x64 config.
Yes you can, there is nothing architecture-specific in most all configs. The only thing that can be architecture-specific is if you manually set your auto-update URL. Just going to System>Firmware, Updater Settings tab, and verifying you don't have "Use an unofficial server for firmware upgrades" checked will ensure that's not an issue.
-
Yes you can, there is nothing architecture-specific in most all configs.
Good news. I was repeating something I had heard from someone else here many months ago.
Yes you can, there is nothing architecture-specific in most all configs.
Oh? Then why is upgrading from 32 to 64 bit not supported?
-
Because that's a re-install anyway.
-
Yeah, I'll give x64 another try. But I don't know when I can have that service window.
Now it crashed again. And yet again this seems to be related to OpenVPN.
I had an OpenVPN connection @ 09.45, and just seconds later it crashed. So this must be OpenVPN related.
Should I open a ticket?
EDIT: And again - OpenVPN connection @ 10.12, crash right after.
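To check whether the pattern holds beyond a couple of observations, a rough Python sketch like the following could pair each reboot with an OpenVPN connect that happened shortly before it. The function name `reboots_after_vpn` and the 2-minute window are assumptions for illustration, and the timestamps would have to be collected by hand or from your own logs.

```python
from datetime import datetime, timedelta


def reboots_after_vpn(vpn_connects, reboots, window=timedelta(minutes=2)):
    """Pair each reboot with the latest VPN connect that preceded it within `window`."""
    pairs = []
    for boot in sorted(reboots):
        prior = [c for c in vpn_connects if timedelta(0) <= boot - c <= window]
        if prior:
            pairs.append((max(prior), boot))
    return pairs


# The connect times 09:45 and 10:12 are from the post above; the date and
# all reboot times are made-up placeholders for illustration only.
day = datetime(2014, 12, 1)
vpn = [day.replace(hour=9, minute=45), day.replace(hour=10, minute=12)]
boots = [
    day.replace(hour=9, minute=46),
    day.replace(hour=10, minute=13),
    day.replace(hour=15, minute=0),
]

matched = reboots_after_vpn(vpn, boots)
print(f"{len(matched)} of {len(boots)} reboots followed a VPN connect")
```

If nearly every reboot has a matching connect, that supports the OpenVPN theory; a few matches among many reboots could just be coincidence.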
-
did you enable hardware crypto by any chance?
i vaguely remember I once tried this setting on esxi and it resulted in "fatal trap xxx"
-
did you enable hardware crypto by any chance?
i vaguely remember I once tried this setting on esxi and it resulted in "fatal trap xxx"
Yep, HW crypto was enabled (BSD Cryptodev engine).
Disabled it now, hope it helps.
But I love HW decryption :(
Let's see the result.
-
I think we can mark this as solved for now ;D
Since HW crypto for OpenVPN was turned off, I've not had a single reboot.
I'd call this a bug.
Thanks all!