2.1 on net5501 (fatal failure on boot)



  • First tried to use Firmware Upgrade through the GUI to move from 2.0.3-Release to 2.1-Release

    Got a fatal error on boot

    Then tried a fresh 2.1-Release install on the Compact Flash Card and got the same fatal error

    Had to move back to 2.0.3-Release to get working again

    Console Fatal Error Shown below …

    --snip
    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address  = 0x0
    fault code      = supervisor read, page not present
    instruction pointer = 0x20:0xc060349b
    stack pointer          = 0x28:0xe73e18b4
    frame pointer          = 0x28:0xe73e1908
    code segment        = base 0x0, limit 0xfffff, type 0x1b
                = DPL 0, pres 1, def32 1, gran 1
    processor eflags    = interrupt enabled, resume, IOPL = 0
    current process    = 5656 (dhcpd)
    trap number    = 12
    panic: page fault
    cpuid = 0
    Uptime: 4m0s
    Cannot dump. Device not defined or unavailable.
    Automatic reboot in 15 seconds - press a key on the console to abort
    Rebooting...
    --snip



  • Okay, so the error occurs on my Soekris net5501, and today I discovered that the same error occurs on my PCEngines Alix 2D13 when attempting to upgrade from 2.0.3-Release to 2.1-Release.  The panic is raised while dhcpd is the current process, and it was fatal again.  The system did boot up successfully to the console, but after I restored my configuration I got the error.  It seems to be related to dhcpd and my pfsense configuration.



  • I didn't realize that the 2.1 release is at RC0.  I'll try again when the release is a non release candidate.

    2.2 did come up on the console but after I restored my configuration then DHCP barfed.  Didn't sound like anyone had the same issue as I did but I'll just try again when the full 2.1 release comes along.



  • @atzouris:

    I didn't realize that the 2.1 release is at RC0.  I'll try again when the release is a non release candidate.

    2.2 did come up on the console but after I restored my configuration then DHCP barfed.  Didn't sound like anyone had the same issue as I did but I'll just try again when the full 2.1 release comes along.

    What?  2.1 was released 15-Sep-13, and AFAIK 2.2 snapshots are not yet available.  Please check your sources.



  • I don't have any sources to check as I just downloaded and installed the binary releases.

    I just tried again today December 19th with new downloads of pfsense 2.1 release on my net4801 and my alix2d13 and the pfsense-2.1 release i386 4G nanobsd still has the same problems.  Neither platform can boot to a prompt and just reboots endlessly.

    Note that these were upgrades and not fresh installs.  I started from pfsense 2.0.3 release i386 4G nanobsd.



  • I have 11 Alix2D13 happily running 2.1-RELEASE in production at various offices, and others I'm sure have 100's or 1000's of them.
    Post console output. And was there anything special/non-standard about your 2.0.3 system(s) that would mean the config file has extra/different contents?



  • Stock.  What I have found is that my configuration caused the upgrades to not work.

    So then I took my Soekris net4801 to do a fresh install and that seems to work okay if I put in a manual configuration.  My ufl wireless connection no longer works on my miniPCI wireless card so I removed the wireless card.  Used this system for a little bit and it works.

    Then I took my PCEngines Alix 2D13 and also did a fresh install and manual configuration, and it also seems to work okay.  Then I enabled wireless and got the first panic below during runtime, and the second one during the reboot that followed.  Hmm, I wonder if it's the wireless causing this issue.  There is too little information to draw any conclusions, so for now it is only a suspicion.

    --snip
    Press <enter> to continue.
    *** Welcome to pfSense 2.1-RELEASE-nanobsd (i386) on pfSense ***

    WAN (wan)      -> vr1        -> v4/DHCP4: 10.0.0.222/24
    LAN (lan)      -> vr0        -> v4: 10.0.1.1/24

    0) Logout (SSH only)                  8) Shell
    1) Assign Interfaces                  9) pfTop
    2) Set interface(s) IP address       10) Filter Logs
    3) Reset webConfigurator password    11) Restart webConfigurator
    4) Reset to factory defaults         12) pfSense Developer Shell
    5) Reboot system                     13) Upgrade from console
    6) Halt system                       14) Enable Secure Shell (sshd)
    7) Ping host                         15) Restore recent configuration

    Enter an option:
    Message from syslogd@pfSense at Dec 20 07:33:56 ...
    pfSense php: /index.php: Successful login for user 'admin' from: 10.0.1.200

    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address = 0x0
    fault code = supervisor read, page not present
    instruction pointer = 0x20:0xc060349b
    stack pointer         = 0x28:0xe3b0d8b4
    frame pointer         = 0x28:0xe3b0d908
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, def32 1, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 50955 (dhcpd)
    trap number = 12
    panic: page fault
    cpuid = 0
    Uptime: 22m14s
    Cannot dump. Device not defined or unavailable.
    Automatic reboot in 15 seconds - press a key on the console to abort
    Rebooting...

    --snip


    --snip

    Welcome to pfSense 2.1-RELEASE  ...

    Creating symlinks......done.

    Under 512 megabytes of ram detected.  Not enabling APC.
    External config loader 1.0 is now starting... ad0s3
    Launching the init system... done.
    Initializing............................. done.
    Starting device manager (devd)...done.
    Loading configuration......done.
    Updating configuration...done.
    Cleaning backup cache........done.
    Setting up extended sysctls...done.
    Setting timezone...done.
    Configuring loopback interface...done.
    Starting syslog...done.
    Starting Secure Shell Services...done.
    Setting up polling defaults...done.
    Setting up interfaces microcode...done.
    Configuring loopback interface...done.
    Creating wireless clone interfaces...done.
    Configuring LAGG interfaces...done.
    Configuring VLAN interfaces...done.
    Configuring QinQ interfaces...done.
    Configuring WAN interface...done.
    Configuring LAN interface...done.
    Configuring OPT1 interface...done.
    Configuring OPT2 interface...done.
    Syncing OpenVPN settings...done.
    Configuring firewall......done.
    Starting PFLOG...done.
    Setting up gateway monitors...done.
    Synchronizing user settings...done.
    Starting webConfigurator...done.
    Configuring CRON...done.
    Starting DNS forwarder...done.
    Starting NTP time client...done.
    Starting DHCP service...done.
    Starting DHCPv6 service...done.
    Configuring firewall......done.
    Generating RRD graphs...

    Fatal trap 12: page fault while in kernel mode
    cpuid = 0; apic id = 00
    fault virtual address = 0x0
    fault code = supervisor read, page not present
    instruction pointer = 0x20:0xc060349b
    stack pointer         = 0x28:0xe3ac08b4
    frame pointer         = 0x28:0xe3ac0908
    code segment = base 0x0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, def32 1, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 87366 (dhcpd)
    trap number = 12
    panic: page fault
    cpuid = 0
    Uptime: 3m12s
    Cannot dump. Device not defined or unavailable.
    Automatic reboot in 15 seconds - press a key on the console to abort
    Rebooting...
    --snip
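
All three panic traces posted in this thread so far show the same instruction pointer (0x20:0xc060349b), a fault address of 0x0, and dhcpd as the current process. A minimal Python sketch for pulling those fields out of a pasted trace and checking whether several panics share one fault site might look like this. The helper names `parse_panic` and `same_fault_site` are made up for illustration; they are not part of pfSense or FreeBSD.

```python
# Field labels as they appear in the "Fatal trap" console output above.
FIELDS = (
    "fault virtual address",
    "instruction pointer",
    "current process",
    "trap number",
)


def parse_panic(text):
    """Return a dict of selected fields from one 'Fatal trap' block."""
    result = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = " ".join(key.split())  # collapse the irregular padding
        if key in FIELDS:
            result[key] = value.strip()
    return result


def same_fault_site(panics):
    """True if every parsed panic reports the same instruction pointer."""
    return len({p.get("instruction pointer") for p in panics}) == 1


# Excerpts copied from two of the traces in this thread.
first = parse_panic("""
fault virtual address  = 0x0
instruction pointer = 0x20:0xc060349b
current process    = 5656 (dhcpd)
trap number    = 12
""")
second = parse_panic("""
fault virtual address = 0x0
instruction pointer = 0x20:0xc060349b
current process = 50955 (dhcpd)
trap number = 12
""")

print(first["current process"], "|", second["current process"])
print("same fault site:", same_fault_site([first, second]))
```

An identical instruction pointer across different boards suggests the same kernel code path is being hit each time, though it does not by itself say which driver or subsystem is at fault.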



  • Let me answer the question about special/different contents.  I don't think so; all of my configuration is for standard stuff like

    • dhcp static mappings
    • ntp
    • wireless
    • bridged interfaces
    • firewall rules

    I don't consider any of that special or different than typical pfsense configuration stuff.

    Now my Alix 2D13 has to be taken apart and I have to reflash the compact flash, as I can't get to a command interface to reset to factory defaults.  I'm in the endless reboot mode again.
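
One way to check whether a configuration backup contains anything unusual is to list its top-level sections before restoring it. The following is only a rough Python sketch: the function name `list_sections` is invented here, and the sample XML is a hypothetical stand-in, not the poster's actual config.xml.

```python
import xml.etree.ElementTree as ET


def list_sections(config_xml):
    """Return (tag, number of nested elements) for each top-level section."""
    root = ET.fromstring(config_xml)
    # child.iter() also yields the child itself, so subtract one.
    return [(child.tag, sum(1 for _ in child.iter()) - 1) for child in root]


# Hypothetical sample; element names are placeholders for illustration only.
sample = """<pfsense>
  <dhcpd><lan><staticmap/><staticmap/></lan></dhcpd>
  <bridges><bridged/></bridges>
  <filter><rule/><rule/><rule/></filter>
</pfsense>"""

for tag, count in list_sections(sample):
    print(tag, count)
```

Comparing this listing between a working manual configuration and the restored 2.0.3 backup could help narrow down which section is involved when the panic appears.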



  • I have this Atheros card in a couple of my Alix 2D13 with 2.1-RELEASE - they are working fine.
    http://store.netgate.com/KIT-ALIX-5004MP-DUAL-P190C34.aspx
    Maybe post exact details of your WiFi hardware and see if anyone else has the same?



  • I'm actually running 4 different versions of hardware with pfsense and all run wireless with Atheros chipsets.  All were successfully running 2.0.3 Release before I started on this upgrade path.  Platforms are net5501, net4801, alix 2d13, and it-100.

    I've been focused on getting just two platforms upgraded and working and they are net4801 and alix 2d13. The net4801 is running an Engenius EMP-8602 Plus-S and the alix 2d13 is running a DCMA-82.

    EMP-8602 PLUS-S: 802.11a/b/g 600mW High Power mini PCI Card (NMP-8602 PLUS-S)
    DCMA-82 Atheros 6G: 802.11a/b/g High Power mPCI Card



  • Well, I have had a hard time as well trying to make my Alix2D13 work with pfSense 2.1. The system runs fine with 2.0.3, but under 2.1 I am unable to bring up my WLAN card, which is an Atheros AR5413. The symptoms are very similar to yours. However, my Alix at least boots correctly even with the WLAN card installed. The WLAN card itself is not usable:
    http://forum.pfsense.org/index.php/topic,68531.0.html

    There is another interesting thread where the author observed that problems start as soon as authentication against a RADIUS server (IEEE802.1X) is activated. I am using IEEE802.1X as well. However, I have not yet had time to check whether my WLAN card will work without IEEE802.1X. See this thread for further details:
    http://forum.pfsense.org/index.php/topic,69312.0.html

    I've received feedback from several people who are successfully using WLAN with pfSense 2.1, in particular on their Alix boards. But obviously there are others having trouble, which currently forces them to stay with pfSense 2.0.3. Testing is very time consuming and took me at least two full days. As soon as I have some time left I will report my results after deactivating IEEE802.1X.

    By the way: Do you use your WLAN card with IEEE802.1X authentication?

    Regards,
    Peter



  • No I'm not using IEEE802.1X authentication.  I was thinking about using this in the future.

    Thanks for the links to the other reports about Atheros WLAN issues with pfsense 2.1.  I was beginning to feel like I was the only one with this issue starting with RC pfsense 2.1.

    At this point in time, IMHO it seems to me that this may be a FreeBSD issue and not a pfsense issue.  Apparently some Atheros cards are working with pfsense 2.1.  A lot of times problems appear differently on different hardware due to different processor configurations but it's for sure that this main issue is wireless.

    I'm probably going to park on upgrading to pfsense on my main net5501 but I'm going to run my net4801 and alix2d13 without wireless to get some time on pfsense 2.1 as wireless is not critical to me on these two systems.  However, on my net5501 it is critical to have wireless support on this system.  It's very inconvenient to have to crack open my boxes to remove the Compact Flash Card.

    Thanks to all for the feedback.



  • @atzouris:

    No I'm not using IEEE802.1X authentication.  I was thinking about using this in the future.

    So IEEE802.1X authentication is at least not the only cause, if it is a cause at all.

    @atzouris:

    Thanks for the links to the other reports about Atheros WLAN issues with pfsense 2.1.  I was beginning to feel like I was the only one with this issue starting with RC pfsense 2.1.

    Well, I think there are more pfSense 2.1 working WLAN cards out there than failing ones.

    @atzouris:

    At this point in time, IMHO it seems to me that this may be a FreeBSD issue and not a pfsense issue.

    Yeah, I have the same feeling but cannot prove it. Unfortunately, this would mean that the error will probably not be fixed in pfSense 2.1.x. I don't feel experienced enough to report this issue in the FreeBSD forum. As phil.davis reported the Wistron CM9 working correctly with the Alix board, I've recently bought one. As soon as I have some time for testing I will report my results with this card.

    @atzouris:

    I'm probably going to park on upgrading to pfsense on my main net5501 but I'm going to run my net4801 and alix2d13 without wireless to get some time on pfsense 2.1 as wireless is not critical to me on these two systems.  However, on my net5501 it is critical to have wireless support on this system.  It's very inconvenient to have to crack open my boxes to remove the Compact Flash Card.

    Could you please report on your net5501 experiences? Because of my bad experiences with my Alix board I have delayed the upgrade of my net6501.

    Regards,
    Peter



  • Very late reply here but want to report an update on my issue and reply to Peter.

    Peter … my net5501 has been working great for a couple of years now.  I have not had any issues with the hardware or, really, with the pfsense firmware that has been running on it.  I upgrade the pfsense firmware whenever I can.  Currently running version 2.0.3 (with FreeBSD 8.1).

    Now ... for my issue with "Fatal trap 12: page fault while in kernel mode"

    So far this issue has happened on pfsense 2.1.x where .x is .1, .2, and .3.  I had originally tried upgrading the net5501, but since that is my main router and it was work to upgrade and then revert ... I resorted to seeing if my Alix 2d13 works before I attempt the net5501 again.  What I just found today is that it works with pfsense 2.2 (Alpha) on FreeBSD 10.0 Stable.  I suspect that the issue is related to FreeBSD 8.3, or to some driver or software running on FreeBSD 8.3, as I have seen other reports on the net of this kernel mode failure.  Please take my suspicion loosely, as the error is vague and does not point at any particular cause.

    Good news is that for me I have an upgrade migration path from 2.0.3 to 2.2 and onward for both my net5501 and my 2d13.



  • Thank you for your late reply. It's good to hear about your good experiences with pfSense 2.2. My WLAN/IEEE802.1X issues could be fixed, so I am now happily running 2.1.3 on net6501 and ALIX.2D13. You can find details in this thread: https://forum.pfsense.org/index.php?topic=72483.0

    Regards,
    Peter



  • Success.  While I was not able to run pfsense 2.1.x on Soekris net5501 and Alix2d13 hardware platforms after restoring my configurations without kernel panics … Now that pfsense 2.2 Release is available ... I have successfully updated both hardware platforms from pfsense 2.0.3 Release to 2.2 Release and all is well after restoring my 2.0.3 configuration.  I'm happy.  I've learned to be patient in life as in the end it's very worthwhile.

    p.s. for OVPN I had to disable a deprecated option for tls-remote and optionally add an option for auth-nocache

    -- Anthony Tzouris
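
For anyone carrying an older OpenVPN configuration forward, the two options mentioned in the p.s. can be spotted with a quick scan of the config text. This is a minimal Python sketch under stated assumptions: the function name `check_openvpn_options` and the sample client config are hypothetical, and the scan only looks at the first word of each non-comment line. OpenVPN's own documentation describes `verify-x509-name` as the successor to `tls-remote`, but check the manual for the version you run before changing anything.

```python
def check_openvpn_options(config_text):
    """Return notes about tls-remote and auth-nocache in an OpenVPN config."""
    directives = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (# or ;).
        if not line or line.startswith(("#", ";")):
            continue
        directives.add(line.split()[0])

    notes = []
    if "tls-remote" in directives:
        notes.append("tls-remote is present: deprecated, disable or replace it")
    if "auth-nocache" not in directives:
        notes.append("auth-nocache is absent: optional, can be added")
    return notes


# Hypothetical client config used only to exercise the check.
sample_config = """client
remote vpn.example.com 1194
tls-remote server.example.com
"""

for note in check_openvpn_options(sample_config):
    print(note)
```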

