[Solved] Boot fails after upgrade from 2.2 if not connected to internet
I'm testing out 2.3 using a test machine with the following process: install 2.2.6, restore my production config.xml (with interface names manually changed), and then perform an upgrade from the command-line option 13. I have been using the CE-Full-Update-2.3-BETA-amd64 build from earlier today (51f8df0, Mar 29 02:06:06).
After the upgrade and reboot, it nearly finishes booting, but ends up hanging during or right after the dummynet configuration. I let it sit for a while to see if it would time out, but it seemed to be hung. I tried booting in safe mode and in single user mode, neither of which bypassed the DN config. (This would make recovery from this in production quite a pain.)
If I edit the config.xml prior to the upgrade and remove the entire dnshaper block, everything appears to be OK.
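For anyone who wants to script that workaround instead of hand-editing, here is a minimal sketch in Python. It assumes the `dnshaper` element is a direct child of the config's root element. The function name `strip_dnshaper` is made up for this example and is not part of pfSense. Firewall rules that still reference the removed limiters are not touched.

```python
import xml.etree.ElementTree as ET


def strip_dnshaper(config_xml: str) -> str:
    """Return the config text with any top-level <dnshaper> block removed.

    Hypothetical helper for illustration; assumes <dnshaper> sits directly
    under the root element of config.xml.
    """
    root = ET.fromstring(config_xml)
    # findall() returns a list, so removing while looping over it is safe.
    for block in root.findall("dnshaper"):
        root.remove(block)
    # Note: the XML declaration is not re-emitted by tostring() here.
    return ET.tostring(root, encoding="unicode")
```

Work on a copy of the backup so the original config.xml stays available for comparison.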
Here is the console output in verbose mode from the failed boot. Note the system is on an isolated network with no other traffic. pfsense is configured to use 3 VLAN interfaces on em0 and is connected to a trunk port on a VLAN switch. Back in 2.2.6, the console will output all the same DN-related messages, but finishes booting immediately after them.
start_init: trying /sbin/init
padlock0: No ACE support.
aesni0: No AESNI support.
tun1: bpf attached
DUMMYNET 0 with IPv6 initialized (100409)
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched QFQ loaded
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched PRIO loaded
Bump flowset buckets to 256 (was 0)
Bump WF2Q+ weight to 1 (was 0)
Bump flowset buckets to 256 (was 0)
... [flowset messages repeat several more times, then nothing]
Here is the content of my 2.2.6 dnshaper config. It's a fair sharing limiter as described by foxale08 in this thread.
<dnshaper>
  <queue>
    <name>Download</name>
    <number>3</number>
    <qlimit/>
    <plr/>
    <description/>
    <bandwidth>
      <bw>145</bw>
      <burst/>
      <bwscale>Mb</bwscale>
      <bwsched>none</bwsched>
    </bandwidth>
    <enabled>on</enabled>
    <buckets/>
    <mask>none</mask>
    <maskbits/>
    <maskbitsv6/>
    <delay>0</delay>
    <queue>
      <name>Download_LAN</name>
      <number>1</number>
      <qlimit/>
      <description/>
      <weight/>
      <enabled>on</enabled>
      <buckets/>
      <mask>dstaddress</mask>
      <maskbits/>
      <maskbitsv6/>
    </queue>
  </queue>
  <queue>
    <name>Upload</name>
    <number>4</number>
    <qlimit/>
    <plr/>
    <description/>
    <bandwidth>
      <bw>140</bw>
      <burst/>
      <bwscale>Mb</bwscale>
      <bwsched>none</bwsched>
    </bandwidth>
    <enabled>on</enabled>
    <buckets/>
    <mask>none</mask>
    <maskbits/>
    <maskbitsv6/>
    <delay>0</delay>
    <queue>
      <name>Upload_LAN</name>
      <number>2</number>
      <qlimit/>
      <description/>
      <weight/>
      <enabled>on</enabled>
      <buckets/>
      <mask>srcaddress</mask>
      <maskbits/>
      <maskbitsv6/>
    </queue>
  </queue>
</dnshaper>
I threw that dnshaper part into a config, restored it as described, and it booted up with no problem. I tried associating those limiters with some firewall rules too, restored, and it rebooted fine. It seems there's something to it in combination with the rest of your config. Can you share a full config backup with me?
Thanks cmb, I've sent you my config via PM.
Thanks. I changed just the interfaces (s/igbX/vtnetX/ for a bhyve VM) and set <ipaddr> on WAN to DHCP so it had network connectivity to fetch the update. It upgraded fine, reinstalled packages, and was all working fine. It does log what you pasted, but no differently than 2.2.6 did before the upgrade.
The log output you posted makes me wonder if you're looking at the VGA monitor when you have the serial console enabled. What you describe might be all that will appear on the VGA output with your config having serial console enabled. Which console are you expecting?
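As a side note on the s/igbX/vtnetX/ substitution mentioned above, a rough Python equivalent might look like the sketch below. It is a plain text substitution over the whole config, so it assumes the old driver prefix does not appear anywhere it should be kept. The function name `rename_interfaces` is hypothetical.

```python
import re


def rename_interfaces(config_xml: str, old_prefix: str, new_prefix: str) -> str:
    """Rewrite interface names such as igb0 -> vtnet0 throughout the config text.

    Hypothetical helper for illustration; the unit number after the prefix
    is preserved, and anything following it (e.g. a VLAN suffix) is kept.
    """
    pattern = re.compile(r"\b" + re.escape(old_prefix) + r"(\d+)")
    # A function replacement avoids any escaping issues in new_prefix.
    return pattern.sub(lambda m: new_prefix + m.group(1), config_xml)
```

For example, `rename_interfaces(text, "igb", "vtnet")` turns `igb1_vlan10` into `vtnet1_vlan10`.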
Aha! That's exactly it, I think. My production system is an SG-2220, so it has the console set to serial mode. I am using another box for testing and it just has an ordinary monitor. There is still a lot of FreeBSD boot output even if you have the console set to serial, so I didn't think to check this. I also haven't seen the normal pfsense console boot output in probably years (it just works!), so it didn't occur to me that it was missing a lot of things.
I redid the process and changed the console type to 'video' before restoring, and now it's working fine. Well, almost… because I don't have it connected to the Internet, it's spinning in an infinite loop trying to contact the beta package update site to finish the upgrade. I expect it'll be fine once I swap it in for the production box.
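If you'd rather flip the console type in the backup file itself before restoring, something like this sketch could do it. It assumes the setting is stored as a `primaryconsole` element under `system` with the value `video` for the VGA console. That tag name and location are my assumption, so check them against your own config.xml first. The function name `use_video_console` is made up for this example.

```python
import xml.etree.ElementTree as ET


def use_video_console(config_xml: str) -> str:
    """Set the primary console to 'video' in a config.xml string.

    Hypothetical helper; assumes the setting lives at
    <system><primaryconsole>...</primaryconsole></system>.
    """
    root = ET.fromstring(config_xml)
    system = root.find("system")
    if system is None:
        # Nothing recognizable to edit; return the input unchanged.
        return config_xml
    primary = system.find("primaryconsole")
    if primary is None:
        primary = ET.SubElement(system, "primaryconsole")
    primary.text = "video"
    return ET.tostring(root, encoding="unicode")
```

Changing the setting in the GUI before taking the backup, as I did, gets you the same result without editing the file.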
Thanks for your help and sorry for the false alarm! I wonder if it might be useful to output some kind of message to the non-active console, so if you end up in this situation you will have a better idea about what's wrong.
So I just realized that the "hang" I reported was just this package upgrade loop with no console output. If connectivity is required during an upgrade, this should probably be explained somewhere during the process. I can't put my test system online without changing a lot of stuff (static IP, VLAN settings, etc).