Netgate Discussion Forum

    Panic booting 2.6.0 on Jetway NF692G6-420

    • chrullrich

      "The NF692G6-420 motherboard features the Intel Apollo Lake Pentium N4200 quad core processor and six Gigabit Ethernet LAN Ports [...]".

      The upgrade from 2.5.2 to 2.6.0 went fine until it failed to return from the automatic reboot. When I restarted it by hand, it ran into a kernel panic:

      Configuring loopback interface...done.
      Configuring LAGG interfaces...done.
      Configuring VLAN interfaces...done.
      Configuring WAN interface...done.
      Configuring LAN interface...done.
      Configuring SYNC interface...done.
      Configuring CARP settings...done.
      MCA: Bank 0, Status 0xb200000010000400
      MCA: Global Cap 0x0000000000000c07, Status 0x0000000000000004
      MCA: Vendor "GenuineIntel", ID 0x506c9, APIC ID 4
      MCA: CPU 2 UNCOR PCC internal timer error
      timeout stopping cpus
      panic: Unrecoverable machine check exception
      cpuid = 2
      time = 1645961035
      KDB: enter: panic
      [ thread pid 24487 tid 100543 ]
      Stopped at      kdb_enter+0x37: movq    $0,0x28f4676(%rip)
      db:0:kdb.enter.default> textdump set
      textdump set
      db:0:kdb.enter.default>  capture on
      db:0:kdb.enter.default>  run lockinfo
      db:1:lockinfo> show locks
      No such command; use "help" to list available commands
      db:1:lockinfo>  show alllocks
      No such command; use "help" to list available commands
      db:1:lockinfo>  show lockedvnods
      Locked vnodes
      db:0:kdb.enter.default>  show pcpu
      cpuid        = 2
      dynamic pcpu = 0xfffffe007f0b9200
      curthread    = 0xfffff8000669a740: pid 24487 tid 100543 "rtsold"
      curpcb       = 0xfffff8000669ace0
      fpcurthread  = 0xfffff8000669a740: pid 24487 "rtsold"
      idlethread   = 0xfffff80005323000: tid 100005 "idle: cpu2"
      curpmap      = 0xfffff8004456f138
      tssp         = 0xffffffff83719870
      commontssp   = 0xffffffff83719870
      rsp0         = 0xfffffe004d993bc0
      kcr3         = 0xffffffffffffffff
      ucr3         = 0xffffffffffffffff
      scr3         = 0x0
      gs32p        = 0xffffffff83720088
      ldt          = 0xffffffff837200c8
      tss          = 0xffffffff837200b8
      tlb gen      = 1665
      curvnet      = 0
      db:0:kdb.enter.default>  bt
      Tracing pid 24487 tid 100543 td 0xfffff8000669a740
      kdb_enter() at kdb_enter+0x37/frame 0xfffffe0002420e50
      vpanic() at vpanic+0x197/frame 0xfffffe0002420ea0
      panic() at panic+0x43/frame 0xfffffe0002420f00
      mca_intr() at mca_intr+0x9b/frame 0xfffffe0002420f20
      mchk_calltrap() at mchk_calltrap+0x8/frame 0xfffffe0002420f20
      --- trap 0x1c, rip = 0xffffffff80ddc9a2, rsp = 0xfffffe004d993750, rbp = 0xfffffe004d993760 ---
      lock_delay() at lock_delay+0x32/frame 0xfffffe004d993760
      __rw_wlock_hard() at __rw_wlock_hard+0x188/frame 0xfffffe004d993810
      pmap_remove_pages() at pmap_remove_pages+0x676/frame 0xfffffe004d993910
      vmspace_exit() at vmspace_exit+0x9e/frame 0xfffffe004d993950
      exit1() at exit1+0x55b/frame 0xfffffe004d9939b0
      sys_sys_exit() at sys_sys_exit+0xd/frame 0xfffffe004d9939c0
      amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe004d993af0
      fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe004d993af0
      --- syscall (1, FreeBSD ELF64, sys_sys_exit), rip = 0x8003ac5fa, rsp = 0x7fffffffeba8, rbp = 0x7fffffffebc0 ---
      db:0:kdb.enter.default>  ps
      
      [... mountains of stack traces ...]
      
      Tracing command zpool-zroot pid 31 tid 100203 td 0xfffff8000625d000
      sched_switch() at sched_switch+0x630/frame 0xfffffe004d77b9a0
      mi_switch() at mi_switch+0xd4/frame 0xfffffe004d77b9d0
      sleepq_wait() at sleepq_wait+0x2c/frame 0xfffffe004d77ba00
      _sleep() at _sleep+0x253/frame 0xfffffe004d77ba80
      taskqueue_thread_loop() at taskqueue_thread_loop+0xe9/frame 0xfffffe004d77bab0
      fork_exit(
      

      This is where the output stopped.

      After the automatic restart there was no activity on the serial console before I pulled the plug. I reinstalled 2.5.2 and attempted the upgrade again; this time, the console output just stopped after "Configuring CARP settings...". After returning to 2.5.2 again, it runs fine.

      I can find no reports of panics on this hardware, whether mainboard or CPU, under either pfSense or FreeBSD.

      If I started on an extensive testing project (try upgrading an identical second system, try installing FreeBSD 12.3 instead of pfSense, try installing pfSense 2.6.0 instead of upgrading, ...) would the results be of any help to anyone to either tell me my hardware is bad, or find a way to work around the problem?

      Thanks for any suggestions.

      • stephenw10 (Netgate Administrator)

        An MCA error is almost always a hardware problem, so if you're seeing it consistently in 2.6 and not at all in 2.5.2, it's probably caused by some device that's not enabled in 2.5.2.

        Steve

        • jimp (Rebel Alliance Developer Netgate)

          $ cat mce.log 
          MCA: Bank 0, Status 0xb200000010000400
          MCA: Global Cap 0x0000000000000c07, Status 0x0000000000000004
          MCA: Vendor "GenuineIntel", ID 0x506c9, APIC ID 4
          MCA: CPU 2 UNCOR PCC internal timer error
          
          $ mcelog --no-dmi --ascii --file mce.log
          mcelog: Family 6 Model 92 CPU: only decoding architectural errors
          mcelog: Family 6 Model 92 CPU: only decoding architectural errors
          Hardware event. This is not a software error.
          CPU 2 BANK 0 
          MCG status:MCIP 
          STATUS b200000010000400 MCGSTATUS 4
          MCGCAP c07 APICID 4 SOCKETID 0 
          CPUID Vendor Intel Family 6 Model 92 Step 9
          

          tl;dr: Hardware issue.
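
The status words in the log can also be decoded by hand. The following is a minimal sketch, not a replacement for mcelog: it assumes the architectural bit layout of IA32_MCi_STATUS and IA32_MCG_STATUS and the CPUID signature layout as documented in Intel's Software Developer's Manual, and the function names are made up for illustration. It only shows why the kernel printed "UNCOR PCC internal timer error" and why mcelog reports Family 6 Model 92 Step 9.

```python
# Architectural flag bits in IA32_MCi_STATUS (assumed from the Intel SDM).
MCI_STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MCi_MISC holds extra information
    58: "ADDRV",  # MCi_ADDR holds an address
    57: "PCC",    # processor context corrupt
}

# Flag bits in IA32_MCG_STATUS.
MCG_STATUS_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}


def decode_mci_status(status: int) -> dict:
    """Split a bank status word into flags and error codes."""
    flags = [name for bit, name in MCI_STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,               # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


def decode_mcg_status(status: int) -> list:
    """Return the names of the flags set in the global status word."""
    return [name for bit, name in MCG_STATUS_FLAGS.items() if (status >> bit) & 1]


def decode_cpuid_signature(sig: int) -> dict:
    """Turn a CPUID leaf 1 EAX value into display family/model/stepping."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model only applies to families 6 and 15.
    model = (ext_model << 4) | base_model if base_family in (0x6, 0xF) else base_model
    return {"family": family, "model": model, "stepping": stepping}


if __name__ == "__main__":
    print(decode_mci_status(0xB200000010000400))
    print(decode_mcg_status(0x4))
    print(decode_cpuid_signature(0x506C9))
```

For the values in this thread, the bank status has VAL, UC, EN and PCC set with MCA error code 0x0400 (the "internal timer error" code), the global status has only MCIP set, and the CPUID signature decodes to family 6, model 92, stepping 9. PCC set together with RIPV clear is why the kernel treats the exception as unrecoverable.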

          Might be something in the EFI/BIOS, an EFI or BIOS update might help, maybe switching between EFI and legacy booting, but that's just a guess.

          Jetway is not known for quality hardware, though, so it's also possible it's an actual hardware problem with that CPU.

          Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

          Need help fast? Netgate Global Support!

          Do not Chat/PM for help!

          • chrullrich @jimp

            @jimp Actually, I cannot complain about the Jetway hardware I have in use (not a lot, but double digits). This is the first time I have had a significant issue with any of it.

            Even if the actual error is caused by something in the hardware, because 2.5.2 runs perfectly fine the suggestion upthread that it may be exposed by a kernel change makes sense to me.

            It looks like there is a BIOS update available, I will try that soon.

            I have not booted into the 2.6.0 installer at all yet; if that works, perhaps it will give me a clue. Most likely I will end up bisecting the kernel, which will be so much fun ...

            • jimp (Rebel Alliance Developer Netgate)

              On occasion a newer base OS will utilize some new feature of the hardware and uncover a latent problem as well. So even if it is related to the newer base OS that doesn't necessarily rule out a hardware problem, though it may be a specific hardware device or function that wasn't touched in the old version.

              • chrullrich

                There is more wrong here than just a "simple" hardware issue that is exposed by pfSense 2.6.

                • FreeBSD 12.3 and 13.0 install fine as well, and I can bring up the network and do some basic testing without any indication of trouble.
                • pfSense 2.6.0 installs (from scratch) without problems, and reboots and runs without the network connected, but when I plug in the WAN link it freezes after no more than ten seconds.
                • Same for the SYNC link while pfsync is enabled at the other end.

                However, I also cannot get it to send or receive anything on the LAN interface(s). The original configuration was with an LACP LAGG over igb0/igb1. I reduced it to a static LAGG, then to a single interface, and consistently only saw outgoing traffic on both the firewall and the switch respectively. Neither side received anything from the other, and I tried every combination and several cables, of course.

                I am now back on 2.5.2 with the original configuration, and everything is working just as before.

                A second NF692G6-420 behaves the same insofar as it panics on the first reboot after installation. Experimenting any further seems pointless.

                Conclusion: If I want to use any pfSense after 2.5.2, I need different hardware. How nice.

                • bmeeks @chrullrich

                  @chrullrich said in Panic booting 2.6.0 on Jetway NF692G6-420:

                  There is more wrong here than just a "simple" hardware issue that is exposed by pfSense 2.6.

                  • FreeBSD 12.3 and 13.0 install fine as well, and I can bring up the network and do some basic testing without any indication of trouble.

                  Just curious -- which FreeBSD? Did you try STABLE or RELEASE? pfSense is now using the STABLE branch, and it is different than the same version number in RELEASE. pfSense 2.5.2 was FreeBSD 12.2 STABLE. The 2.6.0 pfSense is based on FreeBSD 12.3 STABLE.

                  So a fair test would need to be done on the STABLE branch for FreeBSD. Just mentioning this because some folks grab RELEASE and don't realize that STABLE can be quite different when it comes to drivers (and bugs).

                  So with all that said, it is true that pfSense runs on a "customized" FreeBSD, so there are some changes. If you see different behavior between FreeBSD 12.3 STABLE and pfSense, then it might point to a pfSense issue (or still might be the particular patch level between the 12.3 STABLE you test on versus what pfSense 2.6.0 is built on).

                  • stephenw10 (Netgate Administrator)

                    Does it make any difference which NIC you have assigned as WAN?

                    Do all 6 NICs use the igb(4) driver?

                    Steve

                    • chrullrich @stephenw10

                      OK, I think I figured it out, and this is embarrassing. Short version: The ACPI OS selection was on Windows, and it works much better when set to Linux, although I'm not completely sure that fixed the panics. It fixed something, though.

                      Long version:

                      The BIOS on the NF692G6 has the usual ACPI OS selection, which (of course) defaults to Windows. The other options available are Linux and MSDOS, and since FreeBSD is neither Linux nor MSDOS, I figured I might as well leave it at the default. Big mistake.
                      I set up a test lab with a single WAN instead of two and a single LAN instead of ~10. From the start, I saw an entirely different problem than before: Rather than panicking or just freezing once they received CARP or pfsync traffic, each individual network interface stopped working when it saw the first TCP packet (or possibly anything but ICMP). I could literally ping forever without trouble, but as soon as I tried to get to the web configurator, the ping responses immediately stopped (and the browser timed out). As usual, this did not reproduce on vanilla FreeBSD (12.3, 13-RELEASE, 14-CURRENT). OPNsense 22.1 (with 13.0-RELEASE) did the same, and 21.7 (12.1) did not.
                      Then I noticed the OS option again, set it to Linux, and the new problem went away like the wind. If only I had not already replaced the hardware with nicer (read: much pricier) things.
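
The symptom described above (ICMP keeps working until the first TCP connection attempt) can be provoked on purpose while a ping runs in another terminal. This is only a sketch under assumptions: the helper names `tcp_probe` and `watch` are hypothetical, and the address and port in the usage comment are placeholders, not values from this thread.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def watch(host: str, port: int, attempts: int = 5, interval: float = 1.0) -> list:
    """Probe repeatedly and return one True/False result per attempt."""
    results = []
    for _ in range(attempts):
        results.append(tcp_probe(host, port))
        time.sleep(interval)
    return results


# Example (placeholder address and port for the firewall's web configurator):
#   print(watch("192.0.2.1", 443))
```

If the interface behaves as described, the very first probe should fail and the concurrent ping should stop answering at the same moment.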

                      • chrullrich @stephenw10

                        @stephenw10 To answer your questions: No, it makes no difference which is the WAN, and yes, the six interfaces on this board are igb0 through igb5.

                        • stephenw10 (Netgate Administrator)

                          Hmm, that sounds like it could be some hardware offloading in the NIC that isn't implemented the way the NIC reports it. You might try comparing the output between 2.5.2 and 2.6.0 of:

                          ifconfig -vvvm igb0
                          

                          TCP Segmentation Offloading should be disabled by default.

                          It's possible the BIOS reports different capabilities there to Windows. That's not something I've seen on any other hardware though.
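
Comparing long flag lists by eye is error-prone, so the comparison can be scripted. The following is a rough sketch that assumes the `options=` and `capabilities=` line format that `ifconfig -m` prints for igb(4); the function names and the embedded sample text are illustrative only.

```python
import re

# Sample lines in the format printed by `ifconfig -vvvm igb0`.
SAMPLE = """\
igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
        capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
"""


def parse_flag_sets(ifconfig_output: str) -> dict:
    """Extract the enabled options and supported capabilities as sets."""
    result = {}
    for key in ("options", "capabilities"):
        # Anchor at line start so the "nd6 options=" line is not matched.
        match = re.search(rf"^\s*{key}=[0-9a-f]+<([^>]*)>", ifconfig_output, re.MULTILINE)
        result[key] = set(match.group(1).split(",")) if match else set()
    return result


def supported_but_disabled(ifconfig_output: str) -> set:
    """Capabilities the driver advertises that are not currently enabled."""
    sets = parse_flag_sets(ifconfig_output)
    return sets["capabilities"] - sets["options"]


def changed_options(output_a: str, output_b: str) -> set:
    """Options enabled in exactly one of the two outputs."""
    return parse_flag_sets(output_a)["options"] ^ parse_flag_sets(output_b)["options"]


if __name__ == "__main__":
    print(sorted(supported_but_disabled(SAMPLE)))
```

Feeding the saved output from each installation to `changed_options` gives an empty set when the enabled offloads are identical, and `supported_but_disabled` shows whether TSO4, TSO6 and LRO are advertised but switched off.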

                          Steve

                          • chrullrich @stephenw10

                            @stephenw10 Looks identical to me. This is with the two versions on the two NF692G6s, and the one with 2.6.0 had link. I didn't notice until I started comparing them, by which time it was too late.

                            pfSense 2.6.0, ACPI OS = "Intel Linux":

                            [2.6.0-RELEASE][root@pfSense.home.arpa]/root: ifconfig -vvvm igb0
                            igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                    options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
                                    capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
                                    ether 00:30:18:09:13:75
                                    inet6 fe80::230:18ff:fe09:1375%igb0 prefixlen 64 scopeid 0x1
                                    inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
                                    media: Ethernet autoselect (1000baseT <full-duplex>)
                                    status: active
                                    supported media:
                                            media autoselect
                                            media 1000baseT
                                            media 1000baseT mediaopt full-duplex
                                            media 100baseTX mediaopt full-duplex
                                            media 100baseTX
                                            media 10baseT/UTP mediaopt full-duplex
                                            media 10baseT/UTP
                                    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
                            

                            pfSense 2.6.0, ACPI OS = "Windows":

                            [2.6.0-RELEASE][root@pfSense.home.arpa]/root: ifconfig -vvvm igb0
                            igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                    options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
                                    capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
                                    ether 00:30:18:09:13:75
                                    inet6 fe80::230:18ff:fe09:1375%igb0 prefixlen 64 scopeid 0x1
                                    inet 0.0.0.0 netmask 0xff000000 broadcast 255.255.255.255
                                    media: Ethernet autoselect (1000baseT <full-duplex>)
                                    status: active
                                    supported media:
                                            media autoselect
                                            media 1000baseT
                                            media 1000baseT mediaopt full-duplex
                                            media 100baseTX mediaopt full-duplex
                                            media 100baseTX
                                            media 10baseT/UTP mediaopt full-duplex
                                            media 10baseT/UTP
                                    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
                            

                            pfSense 2.5.2, ACPI OS = "Intel Linux":

                            [2.5.2-RELEASE][root@pfSense.home.arpa]/root: ifconfig -vvvm igb0
                            igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                    options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
                                    capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
                                    ether 00:30:18:09:12:df
                                    inet6 fe80::230:18ff:fe09:12df%igb0 prefixlen 64 scopeid 0x1
                                    media: Ethernet autoselect
                                    status: no carrier
                                    supported media:
                                            media autoselect
                                            media 1000baseT
                                            media 1000baseT mediaopt full-duplex
                                            media 100baseTX mediaopt full-duplex
                                            media 100baseTX
                                            media 10baseT/UTP mediaopt full-duplex
                                            media 10baseT/UTP
                                    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
                            

                            pfSense 2.5.2, ACPI OS = "Windows", after it spontaneously rebooted once at "Configuring LAN interface...", with no additional output on the serial console, on the first boot after changing the OS option:

                            [2.5.2-RELEASE][root@pfSense.home.arpa]/root: ifconfig -vvvm igb0
                            igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                    options=e100bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,RXCSUM_IPV6,TXCSUM_IPV6>
                                    capabilities=f53fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6>
                                    ether 00:30:18:09:12:df
                                    inet6 fe80::230:18ff:fe09:12df%igb0 prefixlen 64 scopeid 0x1
                                    media: Ethernet autoselect
                                    status: no carrier
                                    supported media:
                                            media autoselect
                                            media 1000baseT
                                            media 1000baseT mediaopt full-duplex
                                            media 100baseTX mediaopt full-duplex
                                            media 100baseTX
                                            media 10baseT/UTP mediaopt full-duplex
                                            media 10baseT/UTP
                                    nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
                            
                            • stephenw10 (Netgate Administrator)

                              Mmm, I agree, it looks to be configured the same in all cases. 😕

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.