Upgrade to 2.1 and vmware 5.1 gone wrong

hypemedia

I upgraded to pfSense 2.1 on VMware 5.1, and now pfSense runs for a few minutes and then crashes.

This is the error:
mpt0: request 0xffffff80002daf80:313 timed out for ccb 0xffffff0002554000
mpt0: attempting to abort req 0xffffff80002daf80:313 function 0

The pfsense install is not usable.

I have tried these settings in loader.conf, but without any result:

# Limit queue size for ZFS
vfs.zfs.vdev.min_pending="1"
vfs.zfs.vdev.max_pending="1"

# Disable MSI as a potential workaround for MPT being a colossal jerk
hw.pci.enable_msix="0"
hw.pci.enable_msi="0"

I have the following packages installed in pfsense:

arpwatch Security 2.1.a15_5

Dashboard Widget: Snort System 0.3.4

iperf Network Management 2.0.5

Open-VM-Tools-8.8.1 Services 528969 VMware Tools

snort Security 2.9.4.6 pkg v. 2.5.9

Am I the only one facing these problems?

![Screen Shot 2013-09-16 at 17.53.10.png](/public/imported_attachments/1/Screen Shot 2013-09-16 at 17.53.10.png)

tenortim

What are you using as the underlying datastore for the VM?
I ask because the error looks like it's simply an issue with the underlying datastore itself. Any chance it's on an NFS server and there are problems?

hypemedia

It is a local RAID 1 disk set connected to an HP Smart Array controller. It worked before the update, and the other VMs are working.

itsJim

I've upgraded to v2.1 from v2.0.3 in vmware 5.1 without any issues.

blackbird

What are your virtual machine properties?

ESXI 5.1 update 1

Here are mine that might make the difference.

Guest Operating System: FreeBSD 32bit
SCSI Controller 0: LSI Logic Parallel
Network Adapter: E1000

I didn't have any problems.

hypemedia

VMware 5.1 update 1
Guest Operating System: FreeBSD 64bit
SCSI Controller 0: LSI Logic Parallel
Network Adapter: E1000

Now the pfSense appliance restarts after a few minutes of running, with some type of kernel panic.

Supermule

Download a new image and try a vanilla install.

hypemedia

More information from the system log. It seems to be a FreeBSD-related error that should no longer occur on VMware 5.1, because it was fixed.

Sep 17 20:36:28 kernel: calcru: runtime went backwards from 60382 usec to 21110 usec for pid 16281 (apinger)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 3113 usec to 921 usec for pid 13663 (sshlockout_pf)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 794 usec to 235 usec for pid 13261 (inetd)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 1319 usec to 390 usec for pid 13064 (sshd)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 25296 usec to 7489 usec for pid 13064 (sshd)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 93904 usec to 31950 usec for pid 12383 (choparp)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 174188 usec to 64017 usec for pid 11922 (logger)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 219612 usec to 70215 usec for pid 11656 (tcpdump)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 207 usec to 117 usec for pid 268 (devd)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 105 usec to 31 usec for pid 259 (check_reload_status)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 38045492 usec to 11264747 usec for pid 254 (check_reload_status)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 8674 usec to 2718 usec for pid 63 (md0)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 3820 usec to 1345 usec for pid 37 (zfskern)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 3953 usec to 1316 usec for pid 24 (softdepflush)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 10821 usec to 3674 usec for pid 23 (syncer)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 1733 usec to 609 usec for pid 22 (vnlru)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 1428 usec to 513 usec for pid 21 (bufdaemon)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 12 usec to 4 usec for pid 20 (pagezero)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 516 usec to 186 usec for pid 19 (idlepoll)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 507 usec to 178 usec for pid 17 (pagedaemon)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 5760 usec to 2556 usec for pid 15 (pfpurge)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 21 usec to 6 usec for pid 9 (sctp_iterator)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 1431 usec to 506 usec for pid 8 (fdc0)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 37841 usec to 12959 usec for pid 14 (yarrow)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 38870 usec to 11813 usec for pid 3 (g_up)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 9106 usec to 2696 usec for pid 2 (g_event)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 11 usec to 3 usec for pid 13 (ng_queue)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 1713154 usec to 575555 usec for pid 12 (intr)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 941565057 usec to 358711030 usec for pid 11 (idle)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 13970 usec to 4266 usec for pid 1 (init)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 131412019 usec to 38935778 usec for pid 1 (init)
Sep 17 20:36:28 kernel: calcru: runtime went backwards from 1551529 usec to 501300 usec for pid 0 (kernel)
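
For anyone wanting to compare logs like the one above, here is a minimal Python sketch that pulls the old and new runtime out of each `calcru` line and prints the ratio between them. The function name `summarize_calcru` and the `sample` list are made up for illustration; the regular expression only assumes the message format quoted in this post. If every process shows a jump at the same timestamp, that may point at the clock source rather than at any one process, but that is an interpretation, not something the script proves.

```python
import re

# Matches the kernel message format quoted in the log above.
CALCRU_RE = re.compile(
    r"calcru: runtime went backwards from (\d+) usec to (\d+) usec "
    r"for pid (\d+) \(([^)]+)\)"
)


def summarize_calcru(lines):
    """Return (pid, name, old_usec, new_usec, new/old ratio) for each matching line."""
    rows = []
    for line in lines:
        m = CALCRU_RE.search(line)
        if m:
            old, new, pid = int(m.group(1)), int(m.group(2)), int(m.group(3))
            rows.append((pid, m.group(4), old, new, new / old))
    return rows


# Three lines copied from the log above; feed the full log in the same way.
sample = [
    "Sep 17 20:36:28 kernel: calcru: runtime went backwards from 60382 usec to 21110 usec for pid 16281 (apinger)",
    "Sep 17 20:36:28 kernel: calcru: runtime went backwards from 3113 usec to 921 usec for pid 13663 (sshlockout_pf)",
    "Sep 17 20:36:28 kernel: calcru: runtime went backwards from 941565057 usec to 358711030 usec for pid 11 (idle)",
]

for pid, name, old, new, ratio in summarize_calcru(sample):
    print(f"pid {pid:>6} {name:<16} {old:>10} -> {new:>10} usec (ratio {ratio:.2f})")
```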

Supermule

That seems to be a mismatch between the server time in VMware Tools and the NTP used in pfSense.

hypemedia

There is no NTP client on the VMware host, and pfSense has the normal NTP client setup.
I have kern.timecounter.hardware=i8254
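
For reference, FreeBSD lists the available timecounters and their quality scores in `kern.timecounter.choice`, while `kern.timecounter.hardware` holds the one in use. Below is a small Python sketch that parses the output of `sysctl kern.timecounter.choice` into a dictionary so the options can be compared. The `sample` string is purely illustrative and was not taken from the machine in this thread, and `parse_timecounter_choice` is a hypothetical helper name.

```python
import re


def parse_timecounter_choice(text):
    """Parse 'kern.timecounter.choice: NAME(quality) ...' into {name: quality}."""
    _, _, values = text.partition(":")
    return {name: int(q) for name, q in re.findall(r"(\S+?)\((-?\d+)\)", values)}


# Illustrative sample only (an assumption, not output from this VM).
sample = "kern.timecounter.choice: TSC(-100) ACPI-fast(900) i8254(0) dummy(-1000000)"

choices = parse_timecounter_choice(sample)
best = max(choices, key=choices.get)
print(choices)
print("highest-quality counter in this sample:", best)
```

A higher quality score only reflects the kernel's own ranking; whether a given counter behaves well under a particular hypervisor still has to be tested.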

Supermule

Is the host configured to use NTP and is the client running on the host?


hypemedia

No, the host is not running NTP; NTP is stopped.

hypemedia

I cannot get more than 5-6 hours without a crash. I am already very annoyed with this update experience and the many problems with FreeBSD and VMware.

Filename: /var/crash/info.0
Dump header from device /dev/label/swap0
Architecture: amd64
Architecture Version: 1
Dump Length: 82432B (0 MB)
Blocksize: 512
Dumptime: Wed Sep 18 16:51:36 2013
Hostname: pfsense.localdomain
Magic: FreeBSD Text Dump
Version String: FreeBSD 8.3-RELEASE-p11 #1: Wed Sep 11 18:59:48 EDT 2013
root@snapshots-8_3-amd64.builders.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.8
Panic String: softdep_setup_freeblocks: inode busy
Dump Parity: 3431883880
Bounds: 0
Dump Status: good
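
When there are several crashes to compare, it can help to extract just the time and panic string from each `/var/crash/info.N` file. The following Python sketch does that for text in the format shown above. The `FIELDS` tuple and the `parse_crash_info` name are assumptions made for this example, and only lines whose key is in that tuple are kept, so the build path line that follows the version string is ignored.

```python
# Field names of interest, copied from the dump header shown above.
FIELDS = ("Dumptime", "Hostname", "Version String", "Panic String", "Dump Status")


def parse_crash_info(text):
    """Return a dict of the selected 'Key: value' fields from a crash info summary."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELDS:
            info[key.strip()] = value.strip()
    return info


# Lines copied from the dump header in this post.
sample = """\
Dumptime: Wed Sep 18 16:51:36 2013
Hostname: pfsense.localdomain
Version String: FreeBSD 8.3-RELEASE-p11 #1: Wed Sep 11 18:59:48 EDT 2013
Panic String: softdep_setup_freeblocks: inode busy
Dump Status: good
"""

info = parse_crash_info(sample)
print(info["Dumptime"], "|", info["Panic String"])
```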

doktornotor

@hypemedia:

No, the host is not running NTP; NTP is stopped.

So that obviously would be a problem, no? Why is it disabled when you clearly have issues with time going backwards?

hypemedia

Because if you use both NTP from the guest and NTP from the host, and they are not synchronized (usually something is slow on the guest), the guest can crash. At least that is what I gathered from the VMware forums, and somewhere here it says the same. If I am wrong please let me know, but I had the same problems with NTP enabled on the VMware host as well.

kejianshi

I'd have the host getting NTP updates and then have the guests sync (often) with the host.

hypemedia

OK, I will try this and see if it works, but I think the crash is not related to the clock going backwards.

kejianshi

Is it 64bit? Is it a VM?
I think I've become pretty convinced that the 64bit version has issues when run as a VM.

Try this. Back up all settings, do a clean install of the 32bit version, restore settings.

Then tell us the results. If it works, the devs need to know your hypervisor type, etc.

hypemedia

It is 64bit and it is on VMware 5.1.

kejianshi

(More info: In my first install of 2.1 on ESXi long ago, I had an issue with 2.1RC crashing NTP, but that was my smaller problem. When I added more than 4 interfaces, it always crashed. Recently, after much hair-pulling with others to figure out why their NTP was core-dumping, I recommended 32bit as an experiment to see if it was related to my old issue. It seems it was, because 32bit worked for them. So now I'm recommending that anyone getting flaky results with 64bit 2.1 pfSense running as a VM try the 32bit version right away, rather than pulling hair for days.)