[resolved] pfsense-beta-102 netboot hangs @ "Lan configuration …"
-
I'm working on a headless install of pfsense (1.0.1-SNAPSHOT-03-15-2007 on the 'cdrom' platform) to soekris 4801 + hard drive.
I'm booting pfsense's bundled freebsd via pxeboot loaded from a tftpd server.
The subsequent pfsense boot picks up /root via nfs.
The process progresses through the FreeBSD load and into the start of the pfSense boot,

    ...
    Timecounters tick every 1.000 msec
    Fast IPsec: Initialized Security Association Processing.
    ad0: 38154MB <HTS541040G9AT00 MB2OA60A> at ata0-master UDMA33
    Trying to mount root from nfs:
    NFS ROOT: 10.0.0.10:/private/tftpboot
    sis0: link state changed to UP

         ___
     ___/ f \
    / p \___/ Sense
    \___/   \
        \___/

    Welcome to pfSense 1.0.1 on the 'cdrom' platform...
    ...
    md0.uzip: 1563 x 65536 blocks
    Generating MFS /var partition
    Generating MFS /etc partition
    Generating MFS /root partition
    Looking for pfi.conf on acd0c done.
    Looking for pfi.conf on done.
    Looking for config.xml on done.
    Generating a MFS /conf partition... done.
    Mounting filesystems... done.
    Creating symlinks......done.
    Launching PHP init system... done.
    Initializing................. done.
    Starting device manager (devd)...done.
    Loading configuration......done.
    Updating configuration...done.
    Cleaning backup cache...done.
    Setting up extended sysctls...done.
    Syncing user passwords...done.
    Starting Secure Shell Services...done.
    Setting timezone...done.
    Starting syslog...done.
    Configuring LAN interface...
but hangs here, and proceeds no further.
After a bunch of digging around in the code, I think I've tracked down where the problem lies.
For DEBUGGING, I've modified "/private/tftpboot/etc/inc/config.inc" at ~line #1711:
    ...
    function mute_kernel_msgs() {
        print "TEST1.\n";
        /* exec("/sbin/conscontrol mute on"); */
        print "TEST2.\n";
    }

    function unmute_kernel_msgs() {
        print "TEST3.\n";
        /* exec("/sbin/conscontrol mute off"); */
        print "TEST4.\n";
    }
    ...
and, "/private/tftpboot/etc/inc/interfaces.inc", at ~ line #764:
    ...
    function interfaces_wan_configure() {
        ...
            unlink_if_exists("{$g['varetc_path']}/mpd.links");
            unlink_if_exists("{$g['vardb_path']}/wanip");
            unlink_if_exists("{$g['varetc_path']}/nameservers.conf");
        }
    +++ print "WAN1.\n";
        /* remove all addresses first */
        while (mwexec("/sbin/ifconfig " . escapeshellarg($wancfg['if']) . " -alias") == 0);
    +++ print "WAN2.\n";
        mwexec("/sbin/ifconfig " . escapeshellarg($wancfg['if']) . " down");
    +++ print "WAN3.\n";
        /* wireless configuration? */
        if (is_array($wancfg['wireless']))
            interfaces_wireless_configure($wancfg['if'], $wancfg['wireless']);
and, in "/private/tftpboot/etc/inc/util.inc" @ ~ line#349
    /* wrapper for exec() */
    function mwexec($command) {
        print "MWEXEC1.\n";
        global $g;
        $oarr = "";
        $retval = "";
        if ($g['debug']) {
            print "MWEXEC2.\n";
            if (!$_SERVER['REMOTE_ADDR'])
                print "MWEXEC3.\n";
            echo "mwexec(): $command\n";
            print "MWEXEC4.\n";
            exec("$command > /dev/null 2>&1", $oarr, $retval);
            print "MWEXEC5.\n";
        } else {
            print "MWEXEC6.\n";
            exec("$command > /dev/null 2>&1", $oarr, $retval);
            print "MWEXEC7.\n";
        }
        print "MWEXEC8.\n";
        return $retval;
    }
On reboot, I now see:
    ...
    Starting syslog...MWEXEC1.
    MWEXEC6.
    MWEXEC7.
    MWEXEC8.
    done.
    Configuring LAN interface...TEST1.
    TEST2.
    MWEXEC1.
    MWEXEC6.
    MWEXEC7.
    MWEXEC8.
    TEST3.
    TEST4.
    done.
    Configuring WAN interface...TEST1.
    TEST2.
    WAN1.
    MWEXEC1.
    MWEXEC6.
    nfs server 10.0.0.10:/private/tftpboot: not responding
    nfs server 10.0.0.10:/private/tftpboot: not responding
    nfs server 10.0.0.10:/private/tftpboot: not responding
    nfs server 10.0.0.10:/private/tftpboot: not responding
    (repeats endlessly ...)
which points to:
while (mwexec("/sbin/ifconfig " . escapeshellarg($wancfg['if']) . " -alias") == 0);
as the point at which the nfs server error repeats/hangs.
I'm not clear yet as to why.
Is this a code or config issue?
Any ideas/suggestions would be helpful!
Thanks.
-
I've run into this before. The issue is that the WAN interface is likely where you are netbooting from and we just deleted the IP address on it. ;) I haven't yet figured out a fix for it (honestly, it just hasn't been top of my priority list) - I believe doing a netboot of the cdrom iso works fine though (I'm pretty sure I've done that).
–Bill
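A rough sketch of the kind of guard this diagnosis suggests, written in shell rather than the PHP of interfaces.inc: before stripping the addresses from an interface, check whether root is NFS-mounted and whether that interface is the one the box booted from. Both function names are made up for illustration, and the name of the boot interface is simply passed in by hand as an assumption. None of this is pfSense code.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of pfSense.

# Read `mount` output on stdin and print the server of an NFS-mounted "/".
# Prints nothing when root lives on a local disk.
nfs_root_server() {
    awk '$2 == "on" && $3 == "/" && $1 ~ /:/ { split($1, a, ":"); print a[1]; exit }'
}

# Succeed only when it looks safe to remove the addresses from interface $1.
#   $2 = interface the box netbooted from (assumed to be known, e.g. sis0)
#   $3 = NFS root server as printed by nfs_root_server (empty if root is local)
safe_to_strip() {
    [ -z "$3" ] || [ "$1" != "$2" ]
}
```

The idea would be to feed `mount` output into `nfs_root_server`, then call `safe_to_strip` with the interface about to be reconfigured and skip the `ifconfig ... -alias` loop when it fails.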
-
Hi Bill,
I've run into this before. The issue is that the WAN interface is likely where you are netbooting from and we just deleted the IP address on it.
I had not realized that deletion in slogging my way through the code. Yes, that would be a problem :-/
I haven't yet figured out a fix for it
The Soekris Net4801 does have 3 Ethernet interfaces. Perhaps somehow making use of a second one? So that the deletion occurs only on the 1st? Grasping at straws …
(honestly, it just hasn't been top of my priority list)
Clear. With v102 'imminent', that seems to be a common matter, at the moment. Getting this headless, network install working will make a huge difference for us. In time, I suppose.
- I believe doing a netboot of the cdrom iso works fine though (I'm pretty sure I've done that).
I think that's effectively what I'm doing. I've cp'd the FullInstall pfsense cdrom iso to a /dir, published it via TFTP & NFS for netboot-ing, and stumbled valiantly onward through the process. Getting, as above, past the pfSense launch, but hanging at that mwexec() …
Perhaps I don't understand what you're suggesting? Something different?
Also, simply mentioning it here, I've been trying a different approach.
I've successfully netbooted a 'pure' FreeBSD 6.2 RELEASE from the distro's bootonly.iso, and gotten into the installer.
I've ALSO NFS-exported a /dir containing the pfSense v102 beta /root.
In the FreeBSD installer, I've selected an "NFS install", and pointed to the pfsense-containing NFS dir at:
10.0.0.10:/private/pfSense_tftpboot
and hit [OK].
that install hangs at:
    +----------------------------------------------------------------+
    | Mounting 10.0.0.10:/private/pfSense_tftpboot over NFS on /dist |
    +----------------------------------------------------------------+
I'm not clear if that can even be done. I'd naively expect that it should.
Further ideas/comments?
Thanks!
-
Hi Bill,
@billm:I've run into this before. The issue is that the WAN interface is likely where you are netbooting from and we just deleted the IP address on it.
I had not realized that deletion in slogging my way through the code. Yes, that would be a problem :-/
I haven't yet figured out a fix for it
The Soekris Net4801 does have 3 Ethernet interfaces. Perhaps somehow making use of a second one? So that the deletion occurs only on the 1st? Grasping at straws …
Don't recall.
(honestly, it just hasn't been top of my priority list)
Clear. With v102 'imminent', that seems to be a common matter, at the moment. Getting this headless, network install working will make a huge difference for us. In time, I suppose.
Considering it's not a supported way to install, it should be no surprise that it barely works…if at all. I tried once and decided that I didn't have the patience to work on it, it wasn't terribly critical for me.
- I believe doing a netboot of the cdrom iso works fine though (I'm pretty sure I've done that).
I think that's effectively what I'm doing. I've cp'd the FullInstall pfsense cdrom iso to a /dir, published it via TFTP & NFS for netboot-ing, and stumbled valiantly onward through the process. Getting, as above, past the pfSense launch, but hanging at that mwexec() …
Perhaps I don't understand what you're suggesting? Something different?
The only hint I can come up with is, don't change your IP address. I don't recall what I had to do, it's been a few months since I tried this, but I believe I used a config file that agreed with the IP the DHCP server gave out. I.e., your issue is likely due to the IP changing: the kernel can no longer reach its root disk and hangs.
Also, simply mentioning it here, I've been trying a different approach.
I've successfully netbooted a 'pure' FreeBSD 6.2 RELEASE from the distro's bootonly.iso, and gotten into the installer.
I've ALSO NFS-exported a /dir containing the pfSense v102 beta /root.
In the FreeBSD installer, I've selected an "NFS install", and pointed to the pfsense-containing NFS dir at:
10.0.0.10:/private/pfSense_tftpboot
and hit [OK].
that install hangs at:
    +----------------------------------------------------------------+
    | Mounting 10.0.0.10:/private/pfSense_tftpboot over NFS on /dist |
    +----------------------------------------------------------------+
I'm not clear if that can even be done. I'd naively expect that it should.
Further ideas/comments?
Yep, that'd be kind of a naive expectation :) The answer here is that the FreeBSD boot process works differently than the pfSense one. You can probably get it to work, but it'll require understanding of what FreeBSD does and what pfSense does in the boot process and modifying our process such that it no longer breaks. I don't expect that this is a difficult task, I needed it for my lab environment, but stopped working on it due to some PXE boot issues with Parallels (it doesn't do it) and my lack of desire to spend my evenings trucking up and down two flights of stairs between the soekris used for testing and my comfy coding chair. If you've got an IP power switch (or a way to make Parallels PXE boot pfSense - hint etherboot.org/rom-o-matic.net doesn't work), I'd be more than willing to work on this again.
–Bill
-
If you've got an IP power switch (or a way to make Parallels PXE boot pfSense - hint etherboot.org/rom-o-matic.net doesn't work), I'd be more than willing to work on this again.
I certainly don't have either. Sorry. :-/
I've gotten other comments (roughly) in #irc considering getting netboot working a 'waste of time'. So, unless someone else chimes in who DOES have some input/interest, I gather that that moots any work on an 'official fix' for now.
Thanks for the comments!
-
If you've got an IP power switch (or a way to make Parallels PXE boot pfSense - hint etherboot.org/rom-o-matic.net doesn't work), I'd be more than willing to work on this again.
I certainly don't have either. Sorry. :-/
I've gotten other comments (roughly) in #irc considering getting netboot working a 'waste of time'. So, unless someone else chimes in who DOES have some input/interest, I gather that that moots any work on an 'official fix' for now.
Thanks for the comments!
–Tenzen
I wouldn't consider it a waste of time, but I do consider it really only useful for development. As it's more effort right now than it's really worth, given my current development environment and other projects I'm currently working on and I'm probably the only dev actually interested in it, it's unlikely to be "fixed" (defined as such only because it doesn't work, not because it's supposed to work) in the near future.
–Bill
-
Hi again,
I wouldn't consider it a waste of time, but I do consider it really only useful for development. As it's more effort right now than it's really worth, given my current development environment and other projects I'm currently working on and I'm probably the only dev actually interested in it, it's unlikely to be "fixed" (defined as such only because it doesn't work, not because it's supposed to work) in the near future.
I certainly understand your stance; and understand that you've other priorities.
For reference, personally, I have a very different view of "worth" in this case.
I consider headless, netboot install to a disk-based system as a business need, more than a technical one. Deployment/maintenance of systems across a "wide area" drives our interest/need for this solution. It already exists in retail solutions, e.g. the Linksys+CustomFirmware, as well as small Cisco, NetScreen, SonicWall, etc. (we use/deploy both).
pfSense-on-a-Soekris(or pcWrap) seems like an intermediate option – at the right price point -- with the stated goals of giving commercial solutions "a run for their money".
Functionally, from what I can glean from docs, reviews, etc, it certainly seems to do that already. But, atm, install/administration -- admittedly, on these platforms and in our case -- is a non-starter.
For now. And, if/until we're able to stumble upon the solution.
What we can't do, at this stage, is drive the solution through development by ourselves.
An alternative, of course, is 'just' FreeBSD on the box; we lose pfSense, but it works/installs right now. Catch-22.
Just thinking aloud :-)
Thanks again!
-
Hi again,
I wouldn't consider it a waste of time, but I do consider it really only useful for development. As it's more effort right now than it's really worth, given my current development environment and other projects I'm currently working on and I'm probably the only dev actually interested in it, it's unlikely to be "fixed" (defined as such only because it doesn't work, not because it's supposed to work) in the near future.
I certainly understand your stance; and understand that you've other priorities.
For reference, personally, I have a very different view of "worth" in this case.
I consider headless, netboot install to a disk-based system as a business need, more than a technical one. Deployment/maintenance of systems across a "wide area" drives our interest/need for this solution. It already exists in retail solutions, e.g. the Linksys+CustomFirmware, as well as small Cisco, NetScreen, SonicWall, etc. (we use/deploy both).
tftp/nfs is about the ugliest (and most insecure) way I've heard of to install or upgrade a security device. I'm sure we could argue this point all day long until we're both blue in the face - I just don't see the point of a PXE boot based installer for the embedded platform, pop it open and replace the flash card. There are devices out there that do a MUCH better job of giving you access to the flash card than the Soekris boxes do (one of them is on our recommended vendors page). We're also not really targeting the embedded space that hard, m0n0 does a great job there, we're considerably more CPU and memory intensive. 266MHz boxes aren't really a good match for pfsense (although we do run on them).
Again, this is something I'm interested in working on, but not for install, for netbooting a machine that mounts its filesystem remotely. This allows for a quicker dev cycle and an easier to fix box when I break it (which is certainly guaranteed). I don't expect pxe booted embedded installs to work any better than they do today; if they get magically fixed by any other work done to fix netboots, great. I have however done a full install via pxe, I just don't recall what magical incantations I had to speak before doing it - of note, this was a FULL install from CD over the wire, not an embedded install.
–Bill
-
I'm sure we could argue this point all day long until we're both blue in the face
Well, we can agree there :-)
I just don't see the point of a PXE boot based installer for the embedded platform, pop it open and replace the flash card.
Neither do I. I'm not doing that.
this was a FULL install from CD over the wire, not an embedded install.
Exactly my situation.
FULL install. Goal – over the wire.
I am NOT by any stretch wed to the idea of tftp, nfs, or any particular technology.
Frankly, It matters not one whit to me what technology is used :-)
What I am interested in doing is getting pfSense installed onto a Soekris Net4801 that has a 'real' HardDrive in it -- no CF, no Microdrive, but a 40GB IDE Drive -- without opening the box.
I can do that handily with 'full' FreeBSD. I can't with pfSense. And the detail provided above is my attempt at helping to identify what's causing the problem.
Sure, the FreeBSD install via tftp/pxeboot/nfs is 'messy' ... BUT, it fulfills the purpose, can be easily scripted, and requires nothing more than Power/Serial/Ethernet cables to be attached.
If the pfSense-install-over-the-wire is already possible, I've neither found the detail how to do it, nor have I come across anyone (yet) that can provide it.
But, again, to be clear, I am not currently using, nor do I intend to use, a CF-based/Embedded-pfSense install. The folks at #pfsense did too good of a job arguing that I "needed" packages -- and it was made clear that that requires a FullInstall. :-)
Regards.
-
What I would do in your situation is tar up the pfSense contents, then create a script that runs from a standard FreeBSD netboot. The script would partition/fdisk, install the MBR and then explode the tar gzipped contents on the new system and reboot.
This will be a LOT easier than trying to get pfSense to netboot I suspect.
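A minimal sketch of what such a script might look like, assuming the FreeBSD 6.x tools (fdisk, bsdlabel, newfs) and a single slice covering the whole disk. The function name `plan_install`, the tarball path, and the /mnt mount point are placeholders, not anything shipped with pfSense. It only prints the commands so they can be reviewed first, since the real thing wipes the target disk. Anything the installed system needs beyond the file contents (an /etc/fstab pointing at the new root, for instance) is not covered here.

```shell
#!/bin/sh
# Dry-run planner: prints the install commands, one per line, instead of
# running them. Untested on real hardware; the commands are destructive.
plan_install() {
    disk=$1          # target disk, e.g. ad0
    tarball=$2       # hypothetical path to the tar-gzipped pfSense tree
    mnt=${3:-/mnt}   # temporary mount point (assumption)

    echo "fdisk -BI $disk"                # one FreeBSD slice plus MBR boot code
    echo "bsdlabel -w -B ${disk}s1"       # standard label and boot blocks
    echo "newfs /dev/${disk}s1a"          # new UFS filesystem on the 'a' partition
    echo "mount /dev/${disk}s1a $mnt"
    echo "tar -xzpf $tarball -C $mnt"     # unpack the tree, preserving permissions
    echo "umount $mnt"
    echo "reboot"
}
```

Running `plan_install ad0 /path/to/pfsense.tgz` from the netbooted FreeBSD shell prints the seven commands. Once they look right for the box in question, the output could be saved to a file and run with sh.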
-
pfSense will netboot, I've done it. I may even still have working configs somewhere - it's how I did a full install on my hacom. I do remember having all sorts of issues making it work on the soekris, but I was trying to do something slightly different. At this point I don't recall what the workaround was, but it had something to do with the dhcp server AND the default config.xml agreeing - this is where the full install worked better as there is no default config.xml on the cdrom (if I remember correctly) so I was able to change which nic the WAN was on. Or it may have been the fact that our default config.xml uses sis0/1 which is there on a Soekris box and isn't there on the hacom unit (fxp's) so first time setup was triggered. Again, it's been a while, it does work though.
–Bill
-
Hi Scott,
What I would do in your situation is tar up the pfSense contents
Easy enough.
then create a script that runs from a standard FreeBSD netboot. The script would partition/fdisk, install the MBR and then explode the tar gzipped contents on the new system and reboot.
Clear, in principle. Have no idea how to do that, as yet. So, off to read FreeBSD 'stuff'.
If there's a pfsense wiki/doc/list/forum reference that someone knows about, a pointer would be appreciated.
Thanks for the suggestion
-
I believe the following worked for netbooting the cdrom…don't quote me on it though.
dhcpd.conf:
    # hacom
    host pxe2 {
        hardware ethernet 00:40:f4:47:e7:d5;
        fixed-address 192.168.69.102;
        next-server 192.168.69.80;
        filename "pfsense/boot/pxeboot";
        option root-path "/usr/local/tftpboot/pfsense/";
        option routers 192.168.69.1;
    }
in /usr/local/tftpboot/pfsense/cf/conf/config.xml I had
    <interfaces>
        <lan>
            <if>fxp0</if>
            <ipaddr>192.168.69.102</ipaddr>
            <subnet>24</subnet>
            <media></media>
            <mediaopt></mediaopt>
            <bandwidth>100</bandwidth>
            <bandwidthtype>Mb</bandwidthtype>
        </lan>
    </interfaces>
and pxe2 resolved to
maradns config:
    pxe2.% fqdn4 192.168.69.102
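The point of the pieces above is that they all name the same address. A small shell sketch of a sanity check for that follows. It assumes the simple layouts shown in this post (one host block per name, a single lan element), and the function names are invented for the example. It uses plain text matching rather than a real XML or dhcpd.conf parser, so treat it as a quick check only.

```shell
#!/bin/sh
# Hypothetical consistency check; text matching only, no real parsers.

# Print the <ipaddr> found inside the <lan> element of config.xml ($1).
lan_ipaddr() {
    tr -d '\n' < "$1" |
        sed -n 's:.*<lan>\(.*\)</lan>.*:\1:p' |
        sed -n 's:.*<ipaddr>\([^<]*\)</ipaddr>.*:\1:p'
}

# Print the fixed-address of host $2 in dhcpd.conf ($1).
dhcp_fixed_address() {
    tr '\n' ' ' < "$1" |
        sed -n "s:.*host  *$2  *{\([^}]*\)}.*:\1:p" |
        sed -n 's:.*fixed-address  *\([0-9.]*\) *;.*:\1:p'
}

# Compare the two; $1 = dhcpd.conf, $2 = host name, $3 = config.xml.
check_netboot_config() {
    want=$(dhcp_fixed_address "$1" "$2")
    have=$(lan_ipaddr "$3")
    if [ -n "$want" ] && [ "$want" = "$have" ]; then
        echo "ok: LAN address $have matches dhcpd.conf"
    else
        echo "mismatch: dhcpd.conf gives '$want', config.xml LAN is '$have'"
        return 1
    fi
}
```

Called as `check_netboot_config dhcpd.conf pxe2 config.xml`, it reports a mismatch (and returns non-zero) when the LAN address in config.xml differs from the one the DHCP server hands out, which is the situation that leaves the kernel unable to reach its NFS root.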
-
Hi Bill,
Bingo!
Once I figured out that I needed to make the change in:
/private/pfSense_tftpboot/conf.default/config.xml
rather than
/private/pfSense_tftpboot/cf/conf/config.xml
which, looking at the path, I suppose, makes sense …
Changing:
    <interfaces>
        <lan>
            <if>fxp0</if>
    ---     <ipaddr>192.168.1.1</ipaddr>
    +++     <ipaddr>10.0.0.10</ipaddr>
        </lan>
    </interfaces>
where, as in your example, "10.0.0.10" is the ip Addr assigned to the LAN port in dhcpd.conf, on reboot, I see:
    Starting syslog...done.
    ...
    Configuring LAN interface...done.    <----- WE'RE PAST THIS PROBLEM!
    Configuring WAN interface...done.
    Configuring OPT interfaces...done.
    Configuring CARP interfaces...done.
    Syncing system time before startup...done.
    Configuring firewall......done.
    Starting webConfigurator...done.
    Starting DNS forwarder...done.
    Starting DHCP service...done.
    Setting up microcode and tx/rx offloading...done.
    Starting FTP helpers...done.
    Generating RRD graphs...done.
    Starting DHCP service...done.
    Starting OpenNTP time client...done.
    Starting CRON... done.
    Bootup complete
So that issue seems to have gotten fixed (!?).
If I now open a browser, and nav to:
http://10.0.0.10
with credentials "admin/pfsense", I see:
http://img48.imageshack.us/img48/7258/untitledun9.jpg
Which, I gather, is what I should see! :-)
I'll try the install itself later today.
Thanks!
-
Maybe you can dump your setup into a vmware-preinstallation environment now so others can use it easily too ;D
-
Maybe you can dump your setup into a vmware-preinstallation environment now so others can use it easily too ;D
I'm doing "this" on/with a PowerBook G4. No VM* to speak of.
Assuming all goes well, I'll gladly post my step-by-step notes.
Once I 'bless' the setup, and delegate to the techie-types, perhaps they can cobble something up on/for VM*.
–Tenzen
-
Now that I've netbooted pfsense to the point I can see the pfSense web interface at http://10.0.0.10, how, exactly do I go about INSTALLING the system TO the Net4801's local HDD?
Is that done via the web interface? Poking around, I suspect, perhaps not.
At the serial console, however, the output currently 'sits' at:
    ...
    Starting DNS forwarder...done.
    Starting DHCP service...done.
    Setting up microcode and tx/rx offloading...done.
    Starting FTP helpers...done.
    Generating RRD graphs...done.
    Starting DHCP service...done.
    Starting OpenNTP time client...done.
    Starting CRON... done.
    Bootup complete
and goes no further. Is this as expected? Or have I stumbled on my 'next issue'?
Thanks.
–Tenzen
-
You have stumbled upon your next issue. The console should auto login and you should be presented with a menu.
-
You want to turn on the serial console at system>advanced. You will have an option 99 at the shellmenu to install it. This also can be done via ssh btw (enable it at system>advanced) and wait until the keygeneration has finished.
-
You have stumbled upon your next issue. The console should auto login and you should be presented with a menu.
Well, rats. I have discovered mention of the "shell menu" in the docs, which I guess is what you're referring to – and I'm not seeing.
You want to turn on the serial console at system>advanced. You will have an option 99 at the shellmenu to install it. This also can be done via ssh btw (enable it at system>advanced) and wait until the keygeneration has finished.
Here's an interface problem. I can't select any of the items from the System menu. If I hover over it, it drops down, but if I move cursor down to attempt to select any item, the menu vanishes, and a submenu of other items appears. This is only true of the System Menu. Other menus seem to be fine.
Instead, manually entering URL:
http://10.0.0.10/system_advanced.php
At the web interface, check/select/save to ENABLE both:
    Enable Serial Console
        This will enable the first serial port with 9600/8/N/1
        Note: This will disable the internal video card/keyboard
    Secure Shell
        Enable Secure Shell
        SSH port
Then, from shell @ pc,
    ssh -l admin 10.0.0.10
    Password: "pfsense"
Login is successful, and I now see the expected "shell menu".
    *** Welcome to pfSense 1.0.1-SNAPSHOT-03-23-2007-cdrom on pfSense ***

      LAN* -> sis0 -> 10.0.0.10
      WAN* -> sis1 -> 10.0.0.20(DHCP)

    pfSense console setup
    ***********************
    0) Logout (SSH only)
    1) Assign Interfaces
    ...
    99) Install pfSense to a hard drive/memory drive, etc.
Selecting Option==99 takes me into the pfSense installer …
Following the step-by-step instructions at:
http://doc.pfsense.org/index.php/Chapter_3:_Installing_pfSense#Installing_pfSense_to_harddrive
works without a hitch. Finally, selecting the "<reboot>" option, the Net4801 reboots.
Checking console output, I see:
    ...
    ad0: 38154MB <HTS541040G9AT00 MB2OA60A> at ata0-master UDMA33
    Trying to mount root from ufs:/dev/ad0s1a

         ___
     ___/ f \
    / p \___/ Sense
    \___/   \
        \___/

    Welcome to pfSense 1.0.1-SNAPSHOT-03-23-2007 on the 'pfSense' platform...
    ...
so it's booting from the HDD.
Output now successfully continues to:
    pfSense console setup
    ***********************
    0) Logout (SSH only)
    1) Assign Interfaces
    2) Set LAN IP address
    3) Reset webConfigurator password
    4) Reset to factory defaults
    5) Reboot system
    6) Halt system
    7) Ping host
    8) Shell
    9) PFtop
    10) Filter Logs
    11) Restart webConfigurator

    Enter an option:

and, checking in a browser, I do see the pfsense interface. For reference, the System menu is still acting up … but otherwise, I think I've managed to get it done!
Thanks.
--Tenzen