Netgate Discussion Forum
2.4.5-p1 crash report

Development
  • T
    tomahhunt
    last edited by Jun 16, 2020, 10:51 AM

    Hi,

    I have been using the new 2.4.5-p1 and I just got a crash.
    It's running on Proxmox.

    Filename: /var/crash/info.0
    Dump header from device: /dev/vtbd0p2
    Architecture: amd64
    Architecture Version: 1
    Dump Length: 72192
    Blocksize: 512
    Dumptime: Tue Jun 16 11:40:35 2020
    Hostname: ********
    Magic: FreeBSD Text Dump
    Version String: FreeBSD 11.3-STABLE #243 abf8cba50ce(RELENG_2_4_5): Tue Jun 2 17:53:37 EDT 2020
    root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-245/obj/amd64/YNx4Qq3j/build/ce-crossbuild-245/source
    Panic String: double fault
    Dump Parity: 3471571240
    Bounds: 0
    Dump Status: good

    Is a "double fault" a memory/hardware error?
    Or do I need more context?

    Cheers,

    Tom

    • J
      jimp Rebel Alliance Developer Netgate
      last edited by Jun 16, 2020, 1:52 PM

      That doesn't have enough information to suggest what the problem might be. Both of you need to post the whole crash dump, or at least the ddb.txt and msgbuf.txt from inside the crash dump archive.
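For anyone unsure how to pull those two files out of the archive, here is a minimal Python sketch. It assumes the dump is an ordinary tar archive (for example /var/crash/textdump.tar.0) and that the members are named ddb.txt and msgbuf.txt as mentioned above. The function name and the output directory are illustrative only and are not part of pfSense.

```python
import tarfile
from pathlib import Path

# Assumption: these are the member names inside the textdump archive.
WANTED = ("ddb.txt", "msgbuf.txt")


def extract_crash_texts(archive_path, out_dir, wanted=WANTED):
    """Copy the wanted members of a textdump tar archive into out_dir.

    Returns the list of member names that were actually found.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    found = []
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # Compare on the base name so "./ddb.txt" also matches.
            name = Path(member.name).name
            if member.isfile() and name in wanted:
                data = tar.extractfile(member).read()
                (out / name).write_bytes(data)
                found.append(name)
    return found


# Hypothetical usage; adjust both paths to where the dump was saved:
# extract_crash_texts("/var/crash/textdump.tar.0", "/tmp/textdump")
```

Only the two requested files are written out, so nothing else from the archive ends up in the upload by accident.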

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • T
        tomahhunt
        last edited by Jun 16, 2020, 2:24 PM

        I did wonder. OK mine is attached.
        textdump.tar.0

        • J
          jimp Rebel Alliance Developer Netgate
          last edited by Jun 16, 2020, 2:33 PM

          Fatal double fault
          rip 0xffffffff80de1ba4 rsp 0 rbp 0xfffffe01ee369870
          rax 0xee369820 rdx 0x6 rbx 0xfffff801543cdc00
          rcx 0 rsi 0xfffff80006419b7c rdi 0xfffff80154619012
          r8 0xfffff801543cdc00 r9 0xd r10 0
          r11 0xfffffe01ee358000 r12 0xfffff800064edd00 r13 0x6488
          r14 0xfffff8015461900c r15 0xfffff80006661800 rflags 0x10246
          cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
          fsbase 0x800762f30 gsbase 0xffffffff83522e80 kgsbase 0
          cpuid = 0; apic id = 00
          panic: double fault
          cpuid = 0
          KDB: enter: panic
          
          db:0:kdb.enter.default>  show pcpu
          cpuid        = 0
          dynamic pcpu = 0xc02580
          curthread    = 0xfffff800065de620: pid 12 "irq259: virtio_pci2"
          curpcb       = 0xfffffe01ee369b80
          fpcurthread  = none
          idlethread   = 0xfffff80006241000: tid 100003 "idle: cpu0"
          curpmap      = 0xffffffff834f1c40
          tssp         = 0xffffffff835a32d0
          commontssp   = 0xffffffff835a32d0
          rsp0         = 0xfffffe01ee369b80
          gs32p        = 0xffffffff835a9f28
          ldt          = 0xffffffff835a9f68
          tss          = 0xffffffff835a9f58
          tlb gen      = 2674103
          db:0:kdb.enter.default>  bt
          Tracing pid 12 tid 100072 td 0xfffff800065de620
          kdb_enter() at kdb_enter+0x3b/frame 0xffffffff83484040
          vpanic() at vpanic+0x19b/frame 0xffffffff834840a0
          panic() at panic+0x43/frame 0xffffffff83484100
          dblfault_handler() at dblfault_handler+0x1de/frame 0xffffffff834841d0
          Xdblfault() at Xdblfault+0xbd/frame 0xffffffff834841d0
          --- trap 0x17, rip = 0xffffffff80de1ba4, rsp = 0, rbp = 0xfffffe01ee369870 ---
          ether_nh_input() at ether_nh_input+0x314/frame 0xfffffe01ee369870
          netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe01ee3698c0
          ether_input() at ether_input+0x26/frame 0xfffffe01ee3698e0
          vtnet_rxq_eof() at vtnet_rxq_eof+0x7ae/frame 0xfffffe01ee3699b0
          vtnet_rx_vq_intr() at vtnet_rx_vq_intr+0x71/frame 0xfffffe01ee3699e0
          intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe01ee369a20
          ithread_loop() at ithread_loop+0xe7/frame 0xfffffe01ee369a70
          fork_exit() at fork_exit+0x83/frame 0xfffffe01ee369ab0
          fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01ee369ab0
          

          Unfortunately no solid leads there that I recognize and nothing turned up in a search that exactly matched it. The closest I could find is that it could be a concurrency issue in the driver, though that particular result was 8 years old and unlikely to still be the case.
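When comparing a backtrace like this against search results or other reports, it can help to reduce it to a bare list of function names. The following is a rough Python sketch that assumes the frame lines have the `name() at name+0xNN/frame 0x...` shape shown above; the helper names are made up for illustration.

```python
import re

# Matches ddb frame lines such as:
#   ether_input() at ether_input+0x26/frame 0xfffffe01ee3698e0
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at \1\+0x[0-9a-f]+/frame 0x[0-9a-f]+\s*$")


def frame_functions(bt_text):
    """Return the function names from ddb `bt` output, in printed order."""
    names = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            names.append(m.group(1))
    return names


def interrupted_frames(bt_text):
    """Return only the frames listed after the '--- trap' marker, i.e. the
    code that was running when the fault hit, without the panic handlers."""
    _, sep, rest = bt_text.partition("--- trap")
    return frame_functions(rest if sep else bt_text)
```

Run against the trace above, `interrupted_frames` would start at `ether_nh_input` and end at `fork_trampoline`, which is the part worth searching for.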

          Have you changed any settings like net.isr.dispatch? What is the current value reported by sysctl net.isr.dispatch?

          Since that is vtnet, make sure all hardware offloading is disabled. Under System > Advanced, on the Networking tab, in the Network Interfaces section, make sure all three offloading boxes are checked (each checked box disables one type of offloading).
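If you would rather collect the sysctl value from a script than by hand, a small Python sketch follows. It assumes `sysctl` prints a single `name: value` line for the requested knob and that Python is available on the host; both helper names are hypothetical.

```python
import subprocess


def parse_sysctl_line(line):
    """Split one 'name: value' line of sysctl output into (name, value)."""
    name, _, value = line.partition(":")
    return name.strip(), value.strip()


def read_sysctl(name):
    """Run `sysctl <name>` and return its value as a string.

    Only meaningful on the firewall itself (or another FreeBSD host).
    """
    out = subprocess.run(
        ["sysctl", name], capture_output=True, text=True, check=True
    ).stdout
    return parse_sysctl_line(out.splitlines()[0])[1]


# Hypothetical usage on the firewall:
# print(read_sysctl("net.isr.dispatch"))
```

The parsing is kept separate from the subprocess call so it can be checked against pasted output without access to the firewall.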

          • T
            tomahhunt
            last edited by Jun 16, 2020, 2:50 PM

            All the offloading is disabled. Very slow without that!
            net.isr.dispatch is "direct" as reported on the command line.

            The only other manual setting I changed was net.inet.ip.fastforwarding, which I set to 1 when I was debugging the multi-CPU issues on 2.4.5, before people worked out the underlying cause. Should I undo that?

            • J
              jimp Rebel Alliance Developer Netgate
              last edited by Jun 16, 2020, 2:58 PM

              That knob doesn't exist in 2.4.x, so it wouldn't affect anything.

              • T
                tomahhunt
                last edited by Jun 16, 2020, 4:15 PM

                Aside from the upgrade to 2.4.5-p1 and adding more CPUs back to my VM, the only other recent change was a new internet connection.
                I switched to PPPoE for my WAN as I got fibre.

                • J
                  jimp Rebel Alliance Developer Netgate
                  last edited by Jun 16, 2020, 4:18 PM

                  If it was due to the PPPoE, I'd expect to see things like mpd and netgraph in the backtrace, and they aren't there, just vtnet. The debug output mentions virtio_pci2, which is this:

                  virtio_pci2: <VirtIO PCI Network adapter> port 0xe0c0-0xe0df mem 0xfea92000-0xfea92fff,0xfe408000-0xfe40bfff irq 10 at device 18.0 on pci0
                  vtnet0: <VirtIO Networking Adapter> on virtio_pci2
                  vtnet0: Ethernet address: 96:92:4d:86:43:0e
                  vtnet0: netmap queues/slots: TX 1/256, RX 1/128
                  000.002009 [ 503] vtnet_netmap_attach       vtnet attached txq=1, txd=256 rxq=1, rxd=128
                  

                  I see you are using netmap, so suricata? It could be related to that.

                  • J
                    jimp Rebel Alliance Developer Netgate
                    last edited by Jun 16, 2020, 4:19 PM

                    Ah, never mind on the netmap, that's just what the driver prints at boot regardless. I checked one of my Proxmox VMs and it's there, too.

                    • T
                      tomahhunt
                      last edited by Jun 16, 2020, 4:27 PM

                      I don't have many packages.
                      bandwidthd
                      iperf
                      openvpn-client-export
                      pfBlockerNG-devel

                      There is a new version of pfBlockerNG-devel, so I will update that just in case.

                      • T
                        tomahhunt
                        last edited by Jun 16, 2020, 5:06 PM

                        Interesting. It just crashed again.
                        I never had this before, and I have been running 2.4.5 for a long while.
                        I'm tempted to go back to 1 CPU for a bit and see if that helps.

                        • T
                          tomahhunt
                          last edited by Jun 17, 2020, 1:47 PM

                          No crash in 24 hours after going back to 1 CPU.
                          Will keep monitoring.

                          • T
                            tomahhunt
                            last edited by Jun 18, 2020, 10:44 PM

                            Crashed again.
                            Any idea what else I might do to aid debugging?

                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.