Upgrade 2.1.5 to 2.2 failed
-
Hello all,
I switched to pfSense when version 2.1 was released, and before describing my problem I would like to thank everyone involved in this project.
I'm running two pfSense installations in a failover cluster, both on KVM/QEMU virtual machines. I haven't had any problems so far (fingers crossed) and everything runs smoothly.
Last weekend I ran the auto-update on the secondary node (as recommended in the upgrade guide), but after the upgrade the box would not boot; it went into a boot loop instead. I tried several times, resetting the machine each time, but nothing helped, regardless of the changes I made beforehand (uninstalling the packages in use, disabling CARP). Finally, I tried a clean installation of 2.2, which worked flawlessly. However, after I restored my configuration from the 2.1.5 machine the same thing happened: the box kept rebooting.
There is nothing special about the configuration: CARP, 4 interfaces, single WAN, OpenVPN, IPsec, Snort, pfBlocker, some limiters and of course a couple of rules and NAT.
Since I cannot tell from the console output what exactly went wrong, I don't know what to do next. Randomly turning features on and off before each upgrade attempt isn't a very good solution. The relevant lines in the console output look like this (full output can be provided):
Firmware upgrade in progress…
(...)
Booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2014 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.1-RELEASE-p4 #0 36d7dec(releng/10.1)-dirty: Thu Jan 22 15:12:38 CST 2015
(...)
Welcome to pfSense 2.2-RELEASE ...Creating symlinks......ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib
a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout
done.
External config loader 1.0 is now starting...
Launching the init system... done.
Initializing...................... done.
Starting device manager (devd)...done.
Loading configuration......done.
Updating configuration................................................done.
Cleaning backup cache...done.
Setting up extended sysctls...done.
Setting timezone...done.
Configuring loopback interface...done.
Starting syslog...done.
Starting Secure Shell Services...done.
Setting up polling defaults...done.
Setting up interfaces microcode...done.
Configuring loopback interface...done.
Creating wireless clone interfaces...done.
Configuring LAGG interfaces...done.
Configuring VLAN interfaces...done.
Configuring QinQ interfaces...done.
Configuring RED interface...done.
Configuring GREEN interface...done.
Configuring ORANGE interface...done.
Configuring BLUE interface...done.
Configuring CARP settings...done.
Configuring CARP settings...done.
Syncing OpenVPN settings...done.
Configuring firewall..DUMMYNET 0 with IPv6 initialized (100409)
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched QFQ loaded
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched PRIO loaded
Bump sched buckets to 256 (was 0)
Bump sched buckets to 256 (was 0)
Bump sched buckets to 256 (was 0)
Bump sched buckets to 256 (was 0)
Bump sched buckets to 256 (was 0)
Bump sched buckets to 256 (was 0)
....done.
Starting PFLOG...done.
Setting up gateway monitors...panic: pfsync_undefer_state: unable to find deferred state
cpuid = 1
KDB: enter: panic
[ thread pid 0 tid 100035 ]
Stopped at kdb_enter+0x3d: movl $0,kdb_why
db:0:kdb.enter.default> textdump set
textdump set
db:0:kdb.enter.default> capture on
db:0:kdb.enter.default> run lockinfo
db:1:lockinfo> show locks
No such command
db:1:locks> show alllocks
No such command
db:1:alllocks> show lockedvnods
Locked vnodes
db:0:kdb.enter.default> show pcpu
cpuid = 1
dynamic pcpu = 0x1e546500
curthread = 0xc686b000: pid 0 "em0 taskq"
curpcb = 0xe0ef5d60
fpcurthread = none
idlethread = 0xc66ffc40: tid 100004 "idle: cpu1"
APIC ID = 1
currentldt = 0x50
(…)
Can anyone make sense of this? Many thanks!
Florian
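For anyone trying to pin down which boot step triggers a panic like the one above, here is a minimal Python sketch. It is not part of pfSense: `find_panic` is a hypothetical helper name, and it assumes the serial or VM console output has been captured as plain text and that boot steps are printed in the "Step name...done." style shown in the log.

```python
import re

def find_panic(console_text):
    """Return (last_boot_step, panic_message), or None if no panic is found.

    Assumption: boot steps start with a capital letter and are followed by
    two or more dots, as in the console output quoted in this post.
    """
    last_step = None
    for line in console_text.splitlines():
        # The panic can be printed on the same line as the step that hit it.
        head, sep, tail = line.partition("panic:")
        step = re.match(r"\s*([A-Z][^.]*?)\.{2,}", head)
        if step:
            last_step = step.group(1).strip()
        if sep:
            return last_step, tail.strip()
    return None
```

Fed the console output from this post, it should point at the "Setting up gateway monitors" step together with the `pfsync_undefer_state` message, which at least narrows down where in the boot sequence the kernel panics.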
-
Answering my own post here…
It seems that the synchronization of limiter rules is causing the problem in version 2.2. I haven't tested this yet, but it sounds plausible. The topic has also been discussed here:
https://forum.pfsense.org/index.php?topic=87541.0
…and a bug report has been filed here: