Fresh install on Silicom IA3003 (Netgate 8200)
-
@svandive said in Fresh install on Silicom IA3003 (Netgate 8200):
hw.uart.console="io:1016,br:115200"
Did you set that ^?
We do not set it. The only other difference I see, apart from all the LED driver stuff, which is missing, is:
hint.uart.0.at=acpi
-
So I did not set the hw.uart.console. This must be being picked up from the UEFI.
I should make sure this is clear. The machine I am installing on is NOT a Netgate 8200 purchased from Netgate but rather the same machine (Silicom Cordoba) that I use and work with as part of my job. The only difference is that the SMBIOS on the Netgate 8200 is pre-programmed with these two fields:
smbios.system.maker="Netgate"
smbios.system.product="8200"
I was originally planning on simply installing pfSense Plus on here since I had the hardware and I need a firewall for a new stub in my lab at work. I figured since Netgate uses the Silicom box it would be a tested / known white-box to run pfSense on. Reality has been a bit more cumbersome.
I have tried setting the two fields above (Netgate and 8200), and when the box boots pfSense does load the loader.conf.lua environment settings for the LEDs. However, this has not helped me with booting. :-)
I have been able to resolve the initial boot-up issue. The final issue I seem to be having is that I am still getting stuck at the final stage of the boot-up process. The console stops at the line:
Performing automatic boot verification...done.
I'm not able to get the menu to show, but I am able to log in now via the Web GUI. Any thoughts on why the console won't reach the end stage and show the login?
-
Anything shown in the system log?
Any hung process shown in Diag > System Activity?
-
I corrected the issue that was causing the console to hang at the end of the boot cycle and not allowing the login screen / menu to appear.
What appears to be the final hurdle is getting past a hang that seems to occur while the kernel checks whether any serial interface has been configured for GDB kernel debugging. At least, that is the next message printed to the console when the unit boots properly (GDB: no debug ports present). Below is all I get on the console before it hangs.
[pfSense ASCII-art logo]
/---- Welcome to Netgate pfSense Plus ----\
1. Boot Multi user [Enter]
2. Boot Single user
3. Escape to loader prompt
4. Reboot
5. Cons: Serial
Options:
6. Kernel: default/kernel (1 of 1)
7. Boot Options
\-----------------------------------------/
Loading kernel...
/boot/kernel/kernel text=0x1a4c98 text=0xff3048 text=0x17ed568 data=0x180+0xe80 data=0x24c808+0x3b37f8 0x8+0x1d4108+0x8+0x1e9a19-
Loading configured modules...
/boot/kernel/zfs.ko size 0x619a40 at 0x37be000
/boot/entropy size=0x1000
/boot/kernel/opensolaris.ko size 0x1e2a8 at 0x3dd9000
/etc/hostid size=0x25
staging 0x67e00000-0x6c51a000 (not copying) tramp 0x6c51a000 PT4 0x6c51b000
Start @ 0xffffffff803a5000 ...
\
It is very odd behavior. If I stop the autoboot at this menu and interact with the menu in some way, (e.g. cycle through the Consoles (option 5) or access the loader prompt (option 3)) I am able to select option 1 and the machine will boot into pfSense properly. If I don't touch the menu and allow the autoboot to cycle down, the machine will hang just after printing:
Start @ 0xffffffff803a5000 ...
to the console. I have searched through all of the loader config files, but I'm not able to identify where the kernel flag for GDB is being set, or where I need to look to find answers to why the bootstrapping is getting hung up at this point. I suspect it might have something to do with the configuration of uart.0; however, I'm not really able to nail this down.
You had asked in an earlier post if I was setting the variable
hw.uart.console="io:1016,br:115200"
I don't have this configured in loader.conf.local, and I've not seen it set anywhere else. Also, when I stop at the initial boot menu and drop down to the loader prompt, the variable is not set at that stage and does not show up in the output from a "show" command. However, after booting up and looking at the output from "kenv" I do see it set.
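One way to confirm which variables only appear once the kernel is running is to diff the two outputs. Below is a minimal Python sketch, assuming the loader "show" output and the "kenv" output have each been pasted into a string. The sample captures are trimmed and hypothetical, and the helper names parse_env and added_after_loader are made up for illustration.

```python
def parse_env(text):
    """Parse lines of the form name=value or name="value" into a dict."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        env[name.strip()] = value.strip().strip('"')
    return env


def added_after_loader(loader_show, kenv_out):
    """Return variables present in the kenv output but absent from the loader output."""
    before = parse_env(loader_show)
    after = parse_env(kenv_out)
    return {k: v for k, v in after.items() if k not in before}


# Hypothetical captures, trimmed to a few lines for illustration.
loader_show = """
console=efi
smbios.system.maker=Netgate
smbios.system.product=8200
"""
kenv_out = """
console="efi"
hw.uart.console="io:1016,br:115200"
smbios.system.maker="Netgate"
smbios.system.product="8200"
"""

print(added_after_loader(loader_show, kenv_out))
# {'hw.uart.console': 'io:1016,br:115200'}
```

Anything this reports was set after the loader prompt stage, which narrows down where to look, although it does not say which component set it.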
The other issue I am seeing that is related to troubleshooting this problem is my "loader.conf.local" file keeps getting reverted to an earlier revision. When I first started working on this device and directly after the initial install of pfSense, the loader.conf.local file was empty. I added the following to it:
smbios.system.maker="Netgate"
smbios.system.product="8200"
legal.intel_ipw.license_ack="1"
legal.intel_iwi.license_ack="1"
console="efi"
I have since tried adding other various variables to this file, but anytime I reboot, the file always reverts back to just having the five (5) entries that I listed above. This is extremely odd, and I cannot for the life of me figure out how this file is being reverted back to a version that only has these five (5) entries. I would expect that if the OS was going to overwrite the file, it would be empty post-boot, not reverted to this already modified state sans my most recent updates to the file.
-
Well, I figured out why loader.conf.local appears to be reverted. It's due to the script:
/etc/inc/pfsense-utils.inc
The fields I had been adding are specifically ones that are removed by this script.
/* These values should be removed from loader.conf and loader.conf.local
 * As they will be replaced when necessary. */
$remove = array(
	"hint.cordbuc.0",
	"hint.e6000sw.0",
	"hint.gpioled",
	"hint.mdio.0.at",
	"hint-model.",
	"hw.e6000sw.default_disabled",
	"hw.hn.vf_transparent",
	"hw.hn.use_if_start",
	"hw.usb.no_pf",
	"net.pf.request_maxcount",
	"vm.pmap.pti",
);
if (!$local) {
	/* These values should only be filtered in loader.conf, not .local */
	$remove = array_merge($remove, array(
		"autoboot_delay",
		"boot_multicons",
		"boot_serial",
		"comconsole_speed",
		"comconsole_port",
		"console",
		"debug.ddb.capture.bufsize",
		"hint.uart.0.flags",
		"hint.uart.1.flags",
		"net.link.ifqmaxlen",
		"hint.hwpstate_intel.0.disabled",
		"loader_conf_files",
		"machdep.hwpstate_pkg_ctrl",
		"net.pf.states_hashsize"
	));
}
I had been adding the "hint.gpioled" fields that would normally be picked up by loading loader.conf.lua if the unit is detected as a Netgate 8200. However, I noticed that my configuration in loader.conf.local that configures the two "smbios" fields to trick the software into thinking it is truly installing on a Netgate 8200 wasn't always being picked up before loader.conf.lua was read and processed, so I thought adding it to loader.conf.local would be a safe bet. As it turns out, the script pfsense-utils.inc doesn't like me doing that and is removing these entries.
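To illustrate the effect, here is a minimal Python sketch of that filtering applied to a loader.conf.local. It assumes the script drops any line that starts with one of the listed keys. The actual matching logic in pfsense-utils.inc is not part of the excerpt above and may differ. The hint.gpioled line and the name surviving_lines are made up for illustration.

```python
# Keys copied from the first $remove array quoted above,
# i.e. the list applied to both loader.conf and loader.conf.local.
REMOVE_ALWAYS = [
    "hint.cordbuc.0", "hint.e6000sw.0", "hint.gpioled", "hint.mdio.0.at",
    "hint-model.", "hw.e6000sw.default_disabled", "hw.hn.vf_transparent",
    "hw.hn.use_if_start", "hw.usb.no_pf", "net.pf.request_maxcount",
    "vm.pmap.pti",
]


def surviving_lines(lines, remove=REMOVE_ALWAYS):
    """Keep only lines that do not start with a filtered key (assumed prefix match)."""
    return [ln for ln in lines
            if not any(ln.strip().startswith(key) for key in remove)]


# Hypothetical loader.conf.local contents; the last entry is an invented example.
local = [
    'smbios.system.maker="Netgate"',
    'smbios.system.product="8200"',
    'console="efi"',
    'hint.gpioled.0.at="gpiobus0"',
]

for line in surviving_lines(local):
    print(line)
```

Under that assumption, the smbios and console entries survive a reboot while the hint.gpioled entry is stripped, which matches what I was seeing.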
Mystery solved. Now if I can just figure out why the bootstrapping process is getting hung up at the start, I'll be golden.
-
Are you sure that the serial console is using COM1? If it is using COM2, interrupt the booting at the pfSense splash screen by pressing "esc" or "3" and at the OK prompt enter "set comconsole_port=0x2f8" (without the quotes). If that works, then add "comconsole_port=0x2f8" (without the quotes) into the loader.conf.local file. I was having the same issue but on different hardware. But, I also have another problem I am trying to work through.
-
It definitely uses com1:
uart0: <Non-standard ns8250 class UART with FIFOs> port 0x3f8-0x3ff irq 16 flags 0x10 on acpi0
...
uart0: console (115200,n,8,1)
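The io value in hw.uart.console is consistent with this: 1016 decimal is 0x3f8, the conventional COM1 base address. Here is a small Python sketch of the conversion, assuming the io: field is a plain decimal or hex port number and br: is the baud rate. The function decode_uart_console and the lookup table are illustrative and not part of pfSense or FreeBSD.

```python
# Conventional PC I/O base addresses for the legacy serial ports.
LEGACY_COM_PORTS = {0x3F8: "COM1", 0x2F8: "COM2", 0x3E8: "COM3", 0x2E8: "COM4"}


def decode_uart_console(value):
    """Split a string like 'io:1016,br:115200' into port, port name and baud rate."""
    fields = dict(item.split(":", 1) for item in value.split(","))
    port = int(fields["io"], 0)  # base 0 accepts decimal (1016) or hex (0x3f8)
    baud = int(fields["br"])
    return port, LEGACY_COM_PORTS.get(port, "unknown"), baud


port, name, baud = decode_uart_console("io:1016,br:115200")
print(hex(port), name, baud)  # 0x3f8 COM1 115200
```

A COM2 console would show up as io:760 (0x2f8) instead.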
-
Did you ever come to a resolution on this? I'm having the exact same issues as you using an OEM appliance.
-
So the problems I was having on this device were intermittent and very difficult to diagnose. I ended up switching hardware to a generic Whitebox machine. I wrongly assumed that using the same hardware as a supported Netgate appliance would provide an easy path, but in reality, it resulted in too many intermittent issues.
When you say you are having the same issue(s) I was having, which specifically? I believe I ended up having four repeating / intermittent issues. I just need a pointer from you for which one you are experiencing.
-
I foolishly also made this assumption thinking it should have just worked out of the box.
The issue I'm getting is that, after various permutations of CSM, secure boot, different USB sticks and other variables, I still cannot get past the point below (this is just past the splash page, before I can even install pfSense). The only thing left I haven't tried is a BIOS update.
Loading kernel...
/boot/kernel/kernel text=0x1a4c98 text=0xff3048 text=0x17ed568 data=0x180+0xe80 data=0x24c808+0x3b37f8 0x8+0x1d4108+0x8+0x1e9a19-
Loading configured modules...
/boot/kernel/zfs.ko size 0x619a40 at 0x37be000
/boot/entropy size=0x1000
/boot/kernel/opensolaris.ko size 0x1e2a8 at 0x3dd9000
/etc/hostid size=0x25
staging 0x67e00000-0x6c51a000 (not copying) tramp 0x6c51a000 PT4 0x6c51b000
Start @ 0xffffffff803a5000 ...
\