Drive read/boot errors w/2.0 RC3



  • I have a system (Jetway NF96 motherboard) which functions fine under 1.2.3 standard (see the attached dmesg output, "pfsense123-standard").
    Mainboard: Jetway NF96FL-525 Dual Core Atom D525, Fanless
    Memory: 240-pin DDR2 800 DIMM 1GB
    HDD: Emphase FDM44XDI1G Industrial Flash Module (44-pin) 1GB
    LAN Ports: 4x Gb LAN Ports (AD3RTLANG Daughterboard + 1x onboard Gb LAN)

    With 1.2.3 embedded I get occasional errors of the form:
    ad0: FAILURE - READ status=ff <busy,ready,dma_ready,dsc,drq,correctable,index,error>error=0 LBA=727255
    "pfsense123-embedded-1" shows a boot where this happens and the boot fails as a result.
    "pfsense123-embedded-2" is a successful boot.

    With 2.0 RC3 embedded it hangs every time after:
    SMP: AP CPU #3 (or 2 or 1) Launched!
    From what I can see, the next line should be:
    Root mount waiting for: usbus6 usbus5 usbus4 usbus3 usbus2 usbus1 usbus
    The log is "pfsense2rc3-embedded".

    With 2.0 RC3, installing from scratch returns an error 1 when trying to format the drive (partitioning and the bootblock go just fine, though).

    On an upgrade (upgrade logs attached as well; there is some complaining at the end about a missing PHP module), the first try took forever (I left it going for 24 hours). I reinstalled 1.2.3 and tried again, and it finished (those are the logs that are attached). However, on rebooting it shows many errors like the read error above (instead of just one occasionally on 1.2.3 embedded and none on 1.2.3 standard) and fails to boot.
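
For anyone trying to read the ad0 FAILURE line quoted earlier in this post, here is a minimal Python sketch that unpacks it. It is only an illustration, not part of pfSense: the bit names are taken in the order the log line itself prints them (highest bit first), the helper names are made up, and the 512-byte sector size is an assumption. A status of ff means every bit is set at once, including busy and ready together, which often points at the controller not getting a sane answer from the device rather than at one bad sector.

```python
import re

# Status bit names, in the order the log line prints them (bit 7 first).
STATUS_BITS = ["busy", "ready", "dma_ready", "dsc", "drq",
               "correctable", "index", "error"]

# Pulls the hex status, hex error and decimal LBA out of a FAILURE line.
LINE_RE = re.compile(
    r"status=(?P<status>[0-9a-f]+).*?error=(?P<error>[0-9a-f]+)\s+LBA=(?P<lba>\d+)"
)


def decode_status(value):
    """Return the names of the bits that are set in an 8-bit status value."""
    return [name for i, name in enumerate(STATUS_BITS) if value & (0x80 >> i)]


def parse_failure(line, sector_size=512):
    """Parse one FAILURE line; sector_size=512 is an assumption."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    status = int(m.group("status"), 16)
    lba = int(m.group("lba"))
    return {
        "status": status,
        "flags": decode_status(status),
        "error": int(m.group("error"), 16),
        "lba": lba,
        "byte_offset": lba * sector_size,
        "all_bits_set": status == 0xFF,
    }


if __name__ == "__main__":
    sample = ("ad0: FAILURE - READ status=ff <busy,ready,dma_ready,dsc,drq,"
              "correctable,index,error>error=0 LBA=727255")
    print(parse_failure(sample))
```

Running it over a whole dmesg capture and counting how many lines have all_bits_set would show whether every failure looks like this one or whether some report a more specific error.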

    It seems like something changed, maybe in the IDE drivers (it's an ICH8-based board), between 1.2.3 and 2.0 RC3 that makes it very unhappy. But it does seem to be an issue that was slightly present under 1.2.3 embedded too.

    P.S. I tried installing to a USB stick, several different IDE flash modules and a hard drive. All seem to do the same thing.

    The upgrade-log was too large to attach (476k) and it doesn't say it will allow zip files so…
    I can break it up and post in multiple pieces if that will help.
    fdisk_upgrade_log.txt
    firmware_update_misc.log.txt
    pfsense2RC3-embedded.txt
    pfsense123-embedded-1.txt
    pfsense123-embedded-2.txt
    pfsense123-standard.txt
    pfsenselog2-standard.txt



  • Just wanted to add that this happens on both i386 and x64 (amd64).



  • Hi,

    I'd like to add that I'm experiencing the same issue, also with the Jetway NF96 motherboard.  Same CPU - probably from MitxPC.  Tried using different USB drives, and it always stops after 'SMP: AP CPU #3 Launched!'

    Hope someone can help please.



  • The 2.0 startup output wasn't very informative.

    Here are some ideas:

    • On boot, select verbose boot, capture the output and post it here. Verbose boot is very chatty but sometimes gives more clues about the problem. To select verbose boot, wait on startup for the prompt "Hit [Enter] to boot immediately, or any other key for command prompt." Hit the space bar, then at the prompt type boot -v and hit the Enter key.

    • Try booting with only one CPU: in the BIOS there MIGHT be options to disable hyperthreading and/or to disable multiple CPUs.



  • Thanks for the suggestions.  I downloaded the file pfSense-memstick-2.0-RC3-i386-20110621-1650.img.gz and uncompressed it on my Mac.  Then I used dd if=pfSense…. of=/dev/disk5 bs=16k to copy the file to a USB drive.

    The system (Jetway) booted up, and I set the boot option to:

    set kern.cam.boot_delay=10000
    boot -v

    The final logs look something like this - I can't copy and paste since it's on a stand-alone machine.

    SMP: AP CPU #2 Launched!
    cpu2 AP:
      ID: 0x....
      lint0: 0x....
    timer:
    coapic0:
    2vector48
    ioapic0: routing int pin 7...
    ioapic0: ...
    ioapic0: ...
    ioapic0: ...
    msi: Assigning MSI IRQ 256 to local APIC 2 vector 53
    [cursor stops here]

    I think it's supposed to mount the root file system next, but the system just hangs.  If I unplug the USB keyboard, the console shows a message, but otherwise no further messages.

    Thanks very much for your help!



  • Think I found a workaround: http://forum.pfsense.org/index.php/topic,39068.0.html

    Looks like there is an issue with USB on Jetway motherboards.  Will use a CD or other device.

    Thanks!



  • @macross42:

    Think I found a workaround: http://forum.pfsense.org/index.php/topic,39068.0.html

    Looks like there is an issue with USB on Jetway motherboards.  Will use a CD or other device.

    The referenced note is for a different situation: attempting to do a full install from a USB CD. A full install needs a hard drive rather than a flash drive, because a flash drive supports only a limited number of writes. You have indicated you are doing an embedded install.

    That other post mentioned disabling unused devices: It should be pretty quick to enter the BIOS and disable unused devices such as parallel port, one or more COM ports, sound ports etc.



  • @macross42:

    Thanks for the suggestions.  I downloaded the file pfSense-memstick-2.0-RC3-i386-20110621-1650.img.gz and uncompressed it on my Mac.  Then I used dd if=pfSense…. of=/dev/disk5 bs=16k to copy the file to a USB drive.

    The system (Jetway) booted up, and I set the boot option to:

    set kern.cam.boot_delay=10000
    boot -v

    The final logs look something like this - I can't copy and paste since it's on a stand-alone machine.

    SMP: AP CPU #2 Launched!
    cpu2 AP:
      ID: 0x....
      lint0: 0x....
    timer:
    coapic0:
    2vector48
    ioapic0: routing int pin 7...
    ioapic0: ...
    ioapic0: ...
    ioapic0: ...
    msi: Assigning MSI IRQ 256 to local APIC 2 vector 53
    [cursor stops here]

    I think it's supposed to mount the root file system next, but the system just hangs.  If I unplug the USB keyboard, the console shows a message, but otherwise no further messages.

    Thanks very much for your help!

    Yeah, I wasn't trying to say the CPU was the problem. I had tried disabling hyperthreading, etc.; same story. I assume it was something between the CPU detection and the next step (which was mounting root in the non-verbose boot).
    But between the other post you reference and your messages, maybe it's an IRQ issue? They suggested disabling the parallel port…

    I did have more luck installing from a (SATA) CD, but only with 1.2.3.
    As I mentioned in the original post, there seems to be some sort of disk I/O issue on this board with 2.0 RC3 standard (maybe with embedded too; I never got far enough to see).



  • @palesius:

    Yeah I wasn't trying to say the CPU was the problem,

    Nor was I, but sometimes there can be unexpected interactions between CPUs that cause a system to hang.

    The original post referenced a startup log which included:

    acpi0: reservation of ffc00000, 300000 (3) failed
    acpi0: reservation of fee00000, 1000 (3) failed
    acpi0: reservation of 0, a0000 (3) failed
    acpi0: reservation of 100000, 3f600000 (3) failed

    which suggests the BIOS-provided ACPI data might not be good. Can you turn off ACPI in the BIOS, or boot so that ACPI is ignored? Maybe there is a BIOS update to fix this reservation failure. Maybe it is insignificant.

    I seem to recall there is a way in FreeBSD to disable MSI. An MSI assignment was the last thing reported in the verbose boot. Perhaps if MSI were disabled the startup might get a bit further.
    
    Have you tried booting another OS? Linux?
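
To make the acpi0 lines quoted above a little easier to interpret, here is a rough Python sketch that turns them into address ranges. It assumes the two hex fields are a start address and a length, and that the number in parentheses is a resource type; that reading is an assumption on my part, and the function name is made up for illustration.

```python
import re

# Matches lines such as: acpi0: reservation of ffc00000, 300000 (3) failed
RES_RE = re.compile(
    r"reservation of ([0-9a-f]+), ([0-9a-f]+) \((\d+)\) failed"
)


def parse_reservations(log_text):
    """Return (start, end, length) tuples for each failed reservation.

    Assumes the first hex field is the start address and the second
    is the length in bytes, so end = start + length - 1.
    """
    ranges = []
    for m in RES_RE.finditer(log_text):
        start = int(m.group(1), 16)
        length = int(m.group(2), 16)
        ranges.append((start, start + length - 1, length))
    return ranges


if __name__ == "__main__":
    log = """
    acpi0: reservation of ffc00000, 300000 (3) failed
    acpi0: reservation of fee00000, 1000 (3) failed
    acpi0: reservation of 0, a0000 (3) failed
    acpi0: reservation of 100000, 3f600000 (3) failed
    """
    for start, end, length in parse_reservations(log):
        print(f"{start:#010x}-{end:#010x}  ({length // 1024} KiB)")
```

Under that reading, the third range is the first 640 KiB of memory and the fourth runs from 1 MiB to just under 1 GiB, which would line up with the 1GB of RAM in this board. That would suggest these lines describe ordinary system memory rather than a device conflict, but treat that as a guess.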


  • Ah, ok found a way to make it boot using the Jetway boards.  As mentioned in the other thread, I just disabled the parallel port.  Apparently that conflicts with the 3xLAN daughterboard enough to cause pfSense to stop booting.  That's the only change I needed to make.  ACPI, etc are still on.  I'm doing an embedded install, but was also using a USB drive (just like the other thread, although they were ultimately trying to do a full install).  So I can get this to work using the embedded install now.

    Thanks for your help!



  • Great! I'll give it a try. But were you seeing any of the drive read errors I mention above?
    I don't have the system available at the moment, but when I do, I'll give it a try and see if I see the same problem with the embedded firmware.
    No reason for me to leave parallel enabled anyway.
    The drive errors were definitely not caused by the parallel thing, because I had disabled absolutely everything I could.

