Netgate Discussion Forum

"Page fault while in kernel mode" on APU2 after bios/coreboot upgrade

Scheduled Pinned Locked Moved General pfSense Questions
41 Posts 8 Posters 3.3k Views
  • K
    kiokoman LAYER 8
    last edited by kiokoman Oct 13, 2020, 12:04 AM Oct 12, 2020, 10:52 PM

    This can be useful for others who hit this kind of error, but note that
    it's the MCi status register (MCi_STATUS), not the RAM bank.

    ECC error (ADDR valid) 0x9426c0010b000813
    ECC error overflow (ADDR valid) 0xd426c0010b000813
    ECC error (ADDR invalid) 0x9026c0010b000813
    ECC error overflow (ADDR invalid) 0xd026c0010b000813
    L1 Cache Data Store error (UE) 0xb600200000000145
    **L1 Instruction Cache (Instruction Fetch) error (ADDR valid) 0x9400000000000151**
    L1 Instruction Cache (Instruction Fetch) error overflow (ADDR valid) 0xd400000000000151
    Bus Unit (L2 Cache) error (UE) 0xb600000000020136
    L2 Data Cache (Line Fill) error (ADDR valid) 0x9400400000000136
    L2 Data Cache (Line Fill) error overflow (ADDR valid) 0xd400400000000136
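As a rough illustration of how the values above break down, the sketch below decodes the flag bits at the top of a 64-bit MCi_STATUS value. The bit positions (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57) follow the standard x86 machine-check layout. The names `STATUS_FLAGS` and `decode_mci_status` are made up for this example, and the model-specific fields in the middle of the register are not decoded.

```python
# Architectural x86 machine-check status flags (bit positions 63..57).
# Names below are illustrative, not taken from any tool in this thread.
STATUS_FLAGS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # a second error overwrote or was lost ("overflow")
    "UC": 61,     # uncorrected error ("UE" in the list above)
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MCi_MISC holds valid data
    "ADDRV": 58,  # MCi_ADDR holds a valid address ("ADDR valid")
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(status: int) -> dict:
    """Return the flag bits and the low 16-bit error code of a status value."""
    decoded = {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()}
    decoded["error_code"] = status & 0xFFFF
    return decoded


if __name__ == "__main__":
    # The highlighted L1 instruction cache entry from the list above.
    for name, value in decode_mci_status(0x9400000000000151).items():
        print(name, hex(value) if name == "error_code" else value)
```

Running the same function on the "overflow" variant (0xd400000000000151) differs only in the OVER bit, which matches how the list pairs each error with its overflow form.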
    

    This is specific to this CPU:

    The error-reporting machine check register banks supported in this processor are:
    • MC0: Data cache (DC).
    • MC1: Instruction cache (IC). <- "MCA bank 1"
    • MC2: Bus unit (BU), including L2 cache.
    • MC3: Reserved.
    • MC4: Northbridge (NB), including the IO link. These MSRs are also accessible from configuration
    space. There is only one NB error-reporting bank, independent of the number of cores.
    • MC5: Fixed-issue reorder buffer (FR) machine check registers.
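The bank list above can be turned into a small lookup for reading "MCA bank N" lines from a panic message. This is only a sketch: `MCA_BANKS` and `describe_bank` are hypothetical names, and the table covers just the banks listed for this CPU.

```python
# Bank numbers and descriptions copied from the list above (this CPU only).
MCA_BANKS = {
    0: "Data cache (DC)",
    1: "Instruction cache (IC)",
    2: "Bus unit (BU), including L2 cache",
    3: "Reserved",
    4: "Northbridge (NB), including the IO link",
    5: "Fixed-issue reorder buffer (FR)",
}


def describe_bank(bank: int) -> str:
    """Map an MCA bank number to the reporting unit, if it is in the table."""
    return MCA_BANKS.get(bank, "unknown bank for this CPU")


if __name__ == "__main__":
    print("MCA bank 1:", describe_bank(1))
```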
    

    ̿' ̿'\̵͇̿̿\з=(◕_◕)=ε/̵͇̿̿/'̿'̿ ̿
    Please do not use chat/PM to ask for help
    we must focus on silencing this @guest character. we must make up lies and alter the copyrights !
    Don't forget to Upvote with the 👍 button for any post you find to be helpful.

    • K
      kiokoman LAYER 8
      last edited by kiokoman Oct 13, 2020, 11:24 AM Oct 13, 2020, 11:14 AM

      @CS
      CPU ID 0 and CPU ID 1, so it's probably a dual-core CPU?
      "timeout stopping CPUs" means it was unable to communicate with the CPU.
      With "spin lock held too long", it's basically telling you: "I can't wait forever here, so I guess I'll stop and panic."
      Based on what you had before, I would check CPU settings like overclock / voltage / frequency, overheating, and dust on the fan if there is one.

      Does it seem to be a common problem for the APU2? https://forum.netgate.com/topic/156830/could-you-help-me-analyze-these-crashdumps?_=1602587866619

      • C
        CS @kiokoman
        last edited by Oct 13, 2020, 3:23 PM

        @kiokoman APU2 has a single AMD Embedded G series GX-412TC, 4 CPUs: 1 package x 4 cores.
        No overclocking and no active cooling in place for these boards.

        Reference: https://pcengines.ch/apu2.htm

        • K
          kiokoman LAYER 8
          last edited by kiokoman Oct 13, 2020, 4:45 PM Oct 13, 2020, 4:42 PM

          ah, I didn't realize the problem was solved
          so it was Core Performance Boost
          it was probably overclocking the CPU

          • C
            CS
            last edited by Oct 13, 2020, 6:42 PM

            @kiokoman correct, "Core Performance Boost" was causing it, and we were trying to find out why, considering that other folks have it enabled on APU2 without experiencing any issues.

            • K
              kiokoman LAYER 8
              last edited by Oct 13, 2020, 7:07 PM

              We have a saying in Italy, literally translated as ‘not all donuts come out with a hole’, meaning ‘not everything turns out as planned’ 😂
              It's called the "silicon lottery": not all CPUs are the same, and there is ample opportunity for some microscopic part of a CPU that works fine at a certain speed/voltage combination to stop working if the speed or voltage is increased.

              • F
                fireodo @CS
                last edited by fireodo Oct 14, 2020, 1:31 PM Oct 14, 2020, 1:09 PM

                @CS said in "Page fault while in kernel mode" on APU2 after bios/coreboot upgrade:

                @DaddyGo, @fireodo , @stephenw10

                Could anyone share their APU2 loader.conf.local file for reference? I'm wondering if I'm missing something obvious. I haven't done any tuning for years because it has been running smoothly with no issues.

                Hi, here is the content of my loader.conf.local:

                legal.intel_ipw.license_ack=1
                legal.intel_iwi.license_ack=1
                debug.acpi.avoid="_SB_.PCI0.GPIO" (necessary for loading apuled.ko)

                If you still have "hint.acpi_perf.0.disabled=1" in your loader.conf.local, you will see those increased frequencies in sysctl dev.cpu even when you have disabled CPB in the BIOS.
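To check quickly whether that hint is still set, a loader.conf.local file can be read as plain key=value lines. The sketch below is a simplified parser written for illustration: `parse_loader_conf` is a hypothetical helper, the sample text is a stand-in for a real file (with the note about apuled.ko written as a `#` comment), and the comment handling assumes no `#` appears inside a quoted value.

```python
def parse_loader_conf(text: str) -> dict:
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive: assumes no '#' inside quotes
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


# Sample content for illustration only; on a real system, read the file instead.
SAMPLE = """
legal.intel_ipw.license_ack=1
legal.intel_iwi.license_ack=1
debug.acpi.avoid="_SB_.PCI0.GPIO"  # necessary for loading apuled.ko
hint.acpi_perf.0.disabled=1
"""

conf = parse_loader_conf(SAMPLE)
acpi_perf_disabled = conf.get("hint.acpi_perf.0.disabled") == "1"

if __name__ == "__main__":
    print("hint.acpi_perf.0.disabled set:", acpi_perf_disabled)
```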

                Regards,
                fireodo

                Kettop Mi4300YL CPU: i5-4300Y @ 1.60GHz RAM: 8GB Ethernet Ports: 4
                SSD: SanDisk pSSD-S2 16GB (ZFS) WiFi: WLE200NX
                pfsense 2.7.2 CE
                Packages: Apcupsd Cron Iftop Iperf LCDproc Nmap pfBlockerNG RRD_Summary Shellcmd Snort Speedtest System_Patches.

                • D
                  DaddyGo @CS
                  last edited by Oct 15, 2020, 6:28 PM

                  @CS said in "Page fault while in kernel mode" on APU2 after bios/coreboot upgrade:

                  other folks have it enabled on APU2 without experiencing any issues.

                  I confirm this 😉

                  we have a lot of such units at end users, and they run with CPB without any problems
                  we basically configure these "routers / NGFWs" + pfSense with CPB

                  CPB, as I wrote above, is enabled in the coreboot BIOS, but it only takes effect on 1 core, which then runs at 1,400 MHz instead of 1,000 MHz. This is good for OpenVPN stuff, for example...

                  @CS I don't think you should look for the rabbit in the bush...
                  this is not an issue which is caused by CPB or pfSense

                  I think the APU2 MOBO is damaged somewhere, a cold solder joint or something like that,

                  which causes a malfunction in the bus or RAM operation due to the elevated clock....???

                  maybe try a CPU stress test under Linux and insulate the APU2 housing so it warms up... Voilà, maybe there will be results

                  @kiokoman anyway, this is an AMD embedded-series CPU designed for low-power devices, so it cannot really be overclocked
                  either it works or it doesn't; there is no overclocking it, only CPB allows for a little tuning...

                  Cats bury it so they can't see it!
                  (You know what I mean if you have a cat)

                  • C
                    CS
                    last edited by Oct 15, 2020, 7:39 PM

                    @DaddyGo @fireodo Honestly, I won't continue troubleshooting this. The board works fine for me with CPB disabled, and I still get the boosted CPU frequency by having the right settings in my loader.conf.local. I'm not even sure my performance would get any better! Actually, I'm now wondering, just out of curiosity, whether this happens when you have both the CPU boost settings in loader.conf.local and the BIOS setting enabled.

                    • AKEGECA
                      AKEGEC
                      last edited by Oct 16, 2020, 11:09 AM

                      @CS, every piece of hardware comes out of Q.C. differently, even with the same parts but from a different batch.
                      Rule of thumb: a 4-year lifespan (4 is death in Chinese). Nowadays you should be happy if your electronics work for more than 4 years.

                      I am not sure how handy you are, but you could try heating up the CPU (without thermal paste) with a heat gun on a flat surface. Keep a distance of around 10 cm and use a circular motion for around 10-15 minutes. But be warned, you could burn the CPU.

                      • D
                        DaddyGo @AKEGEC
                        last edited by Oct 16, 2020, 11:50 AM

                        @AKEGEC said in "Page fault while in kernel mode" on APU2 after bios/coreboot upgrade:

                        but you could try heating up the cpu (without thermal paste) with the heat gun on a flat surface.

                        It's a very bad idea.
                        This AMD CPU reaches its maximum TDP in about 40 seconds without a cooling surface (heat sink) and dies...
                        (moreover, as I wrote, it is an embedded CPU, soldered to the PCB)
                        (earlier than the said Chinese 4-year death)

                        The PC Engines stuff is stable, and we have several units that have been working for 6 years (from the ALIX and APU series).

                        The ALIX units work as WISP PtP and AP radios and are constantly exposed to the weather.
                        So these are not subject to your Chinese rule 😉

                        • AKEGECA
                          AKEGEC @DaddyGo
                          last edited by Oct 16, 2020, 2:05 PM

                          @DaddyGo said in "Page fault while in kernel mode" on APU2 after bios/coreboot upgrade:

                          It's a very bad idea.
                          This AMD CPU reaches its maximum TDP in about 40 seconds without a cooling surface (heat sink) and dies...
                          (moreover, as I wrote, it is an embedded CPU, soldered to the PCB)
                          (earlier than the said Chinese 4-year death)

                          The PC Engines stuff is stable, and we have several units that have been working for 6 years (from the ALIX and APU series).

                          The ALIX units work as WISP PtP and AP radios and are constantly exposed to the weather.
                          So these are not subject to your Chinese rule 😉

                          Well, to solder an embedded CPU you need a temperature between 200 and 400 °C.
                          Anyway, I was talking about heating it up a bit. As long as you don't reach 90 °C, you will be fine. But if it has already passed the 4-year mark, then I would leave it as it is. 😏
                          I don't know why manufacturers are shortening their products' lifespans. It used to be a 15-30 year quality guarantee.

                          • D
                            DaddyGo @AKEGEC
                            last edited by DaddyGo Oct 16, 2020, 2:26 PM Oct 16, 2020, 2:24 PM

                            @AKEGEC said in "Page fault while in kernel mode" on APU2 after bios/coreboot upgrade:

                            Well, to solder an embedded CPU you need a temperature between 200 and 400 °C.

                            Yes, this soldering temperature is a separate figure in the catalog, and the automatic production lines (soldering machines) only apply it for the permitted time (2-3 s).
                            The inner silicon layer of the CPU does not tolerate this temperature (approx. 150 °C max).

                            I don't know if you have ever had an APU MOBO in your hands?
                            So this is what it looks like...
                            (the metal router housing itself cools the CPU, which connects to the metal surface through a bit of heat-transfer material)

                            fc118a87-03a7-4d5d-9522-c1531c604811-image.png

                            https://www.pcengines.ch/apucool.htm

                            As I wrote above, one test method could be a CPU stress test under, e.g., Linux while covering the housing.

                            We once ran such a MOBO without its metal housing for testing, and it "boiled" quickly under load.

                            You're right, half of today's stuff won't last 25-30 years. 😞
                            Back in the day, I even repaired 30-year-old cathode-ray tubes and black-and-white televisions, and they continued to operate for another 10 years.

                            Welcome to today's money-hungry world. hahahahaha 😉

                            • AKEGECA
                              AKEGEC
                              last edited by Oct 16, 2020, 2:36 PM

                              @DaddyGo, did we just reveal our age? hahahahahaa
                              Oh well age is just a number. 😊

                              • D
                                DaddyGo @AKEGEC
                                last edited by Oct 16, 2020, 3:03 PM

                                @AKEGEC said in "Page fault while in kernel mode" on APU2 after bios/coreboot upgrade:

                                did we just reveal our age?

                                No shame in that...
                                age, I think, brings wisdom

                                yeah, and everyone is as old as he/she feels 😉

                                droll - quadragenarian, hihihihi

                                • G
                                  garywagstaff @CS
                                  last edited by Feb 27, 2021, 1:53 PM

                                  @cs Thank you CS, I found this post after a couple of weeks of trying to sort out the exact same issue. I can't count the different firmware versions I tried or the reinstalls of pfSense. I even tried OPNsense. Long story short: I updated to BIOS v4.13.0.3, disabled CPB, and my firewall has been stable since. Uptime is 18 Hours 56 Minutes 32 Seconds at present, but I am quietly confident that the issue is resolved.

                                  Once again thank you.

                                  • K
                                    Kuser @garywagstaff
                                    last edited by May 6, 2021, 7:21 AM

                                    @cs Thank you very much for this post, it seems I have the same issue... Is this due to a defective board/CPU? Has your firewall been running stable since disabling CPB?

                                    • C
                                      CS @Kuser
                                      last edited by May 11, 2021, 5:14 AM

                                      @kuser Super stable, I have had no issues at all since then!

                                      • K
                                        Kuser @CS
                                        last edited by Jun 1, 2021, 1:59 PM

                                        @cs Thanks for the feedback. Looks like they've found a bug.
                                        I reported this issue a couple of weeks ago:
                                        https://github.com/pcengines/coreboot/issues/469

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.