Debug Kernel Build with Witness

torred

While testing, I tried procstat, but it locked up my console. I'm assuming it tried to crawl the process structure and came across one of the dead locked processes?

Let me explain the dump a bit more. In the top image the rm_priotracker pointer (which is actually called rm_activeReaders) is at offset 0x02077e8c (which would be vmem address 0xc1c77e8c, or pf_rules_lock+0x14). It's value is 0xc7b86908 (virtual), which maps to a physical address of 0x07f86908, which I've shown in the last image.

The first value of the rm_priotracker is actually ~~a pointer to~~ a queue rm_queue rmp_cpuQueue, whose first value is a pointer, so this should have a pointer in it.

(kgdb) ptype struct rm_priotracker
type = struct rm_priotracker {
    rm_queue rmp_cpuQueue;
    rmlock *rmp_rmlock;
    thread *rmp_thread;
    int rmp_flags;
    struct {
        rm_priotracker *le_next;
        rm_priotracker **le_prev;
    } rmp_qentry;
}

(kgdb) ptype struct rm_queue
type = struct rm_queue {
    rm_queue * volatile rmq_next;
    rm_queue * volatile rmq_prev;
}

So the value in the dump at 0x07f86908 should be a pointer to an rm_queue struct (which is a queue/linked list), however, 0x00306f65 is not a valid memory location (not in kernel memory anyway). When I checked this value on a live system (reading /dev/kmem), it mapped to a valid 0xc* address.

markj

It's possible that one of the deadlocked threads held a lock required by procstat. If you're able to run kgdb against a vmcore you should be able to get backtraces with "thread apply all bt".

Why do you say that 0xc7b86908 maps linearly to physaddr 0x07f86908? ARM kernels do not use a direct map. Indeed, the base of the kernel map is 0xc0000000 but kernel virtual memory is allocated dynamically once the page table code is bootstrapped. Maybe I'm missing something - did you resolve the address using the kernel page tables?

torred

Sorry, guess I should also have posted that!

I got the physical offset using readelf on the vmcore. Considering the information below, and the fact the pf_rules_lock->lo_name actually pointed to the string "pf rulesets" (that middle image, physical address 0x00d2227a maps to virtual 0x0c92227a), I figured it was accurate.

readelf --segments vmcore.0

Elf file type is CORE (Core file)
Entry point 0x0
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001000 0x00001000 0x00001000 0x7ffff000 0x7ffff000 R   0x1000
  LOOS+0xfb5d000 0x000000 0xc0000000 0x00400000 0x00000 0x00000 R   0x1000

Edit, quick little python snippet I've been using (its not perfect, but should map the kernel offsets):

po = 0x00400000
vo = 0xc0000000
def p_to_v(paddr):
  return hex(paddr - po + vo)
def v_to_p(vaddr):
  return hex(po + (vaddr - vo))

>>> p_to_v(0x00d2227a)
'0xc092227a'

Edit2:
Lastly, kgdb could not read the vmcore. It kept saying: Failed to open vmcore: cannot read l1pt

My assumption there (since my expertise is not in ARM) is that the "l1pt" (table?) was corrupted.

Edit3:
Really lastly, while I'm not willing to share the full vmcore, if you (or someone else) is really interested in figuring this out, shoot me a private message and I can carve out data from the vmcore if you have some offsets you want to look at.

markj

What I mean is that in general you cannot translate kernel virtual addresses by subtracting 0xc0000000. Lock names are strings embedded into the kernel, and the kernel image itself is indeed mapped linearly. However, rm lock priotrackers are allocated from kernel stacks, which are allocated and mapped dynamically.

Indeed, it sounds like the vmcore is corrupted or somehow unrecognized. l1pt is the root of the kernel page tables, needed to translate virtual addresses.

Are you able to break into the in-kernel debugger from the console? That might be another way to get info about the offending code if you are able to reproduce the problem. I'm much more familiar with FreeBSD than pfSense, so I'm not sure whether a debugger is included in the kernel configuration.

torred

Right, which is why I used the PhysAddr/VirtAddr from the segments information in the vmcore to do the mapping (So <virtual address> - 0xc0000000 + 0x00400000 gets me the physical address).

I see what you are saying though about the rm_lock being allocated on the kernel stacks; however, everytime I run (kgdb) print &pf_rules_lock against that kernel, it always returns 0xc1c77e78. On my 2.4.5p1 build, when I run it I always get 0xc1a3e2d4.

As far as a debugger, I only just noticed (as I went to check) /usr/lib/debug/boot/kernel/kernel.debug, which I was originally using for debug symbols, but now I realize I wasn't booting from that one, heh. That one of course has witness support (like jimp told me, but I was too thick skulled to realize I wasn't booting the correct kernel, smh), and also appears to have remote kgdb support.

Alright, well If I get some more time, I'll try actually booting that kernel and see what I can find, but for now I need my SG-3100 to be stable.

luckman212

@torred Not sure how I stumbled onto this thread, but I was very impressed by your debugging skills. I see you used Ghidra to analyze the stack traces (is that right?). Just wondering how you got started learning how to do that. Any pointers or tips for getting started down that path?

torred

@luckman212 Yeah, I used Ghidra to analyze the vmcore (kernel dump). I'm more familiar with IDA, but I chose this project to try my hand at learning Ghidra. I'm rather impressed by its functionality, but the scripting interface leaves something to be desired; can't beat the price though.

As for where to start to learn...that's a large question, but like most things, I start with a problem. Find some project that requires reverse engineering or debugging, and learn Ghidra as you go. If you're having trouble finding something, head over to hackthebox and check out their RE challenges.

If you are interested primarily in reverse engineering (or learning the deep internals of operating systems), I would recommend "The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System (2nd Edition)" (ISBN-13: 978-1449626365). Chapters 3 and 4 offer an excellent overview of hardware and system design. It focuses mainly (exclusively?) on x86/x86_64 and Windows, but the foundational knowledge is pretty solid IMO. I've got a copy of "Reversing: Secrets of Reverse Engineering" (ISBN-13: 978-0764574818), but honestly I found "The Rootkit Arsenal" more useful...check them both out if you can and see which you like.

Now, I tackled this from an RE perspective because that's where my knowledge lies, but there are probably other ways that the more seasoned FreeBSD dev's have of doing this. (I have a hammer, so everything is a nail)

As a side note, I had to use Ghidra because the vmcore was corrupted by the memory leak...otherwise I would have just used kgdb to crawl the vmcore. Also, I'm rather annoyed that I was never able to find the source of the problem. It now appears that others may be experiencing something similar with the official release of 2.5.0.

bmeeks

I had to troubleshoot some SG-3100 issues a couple of years ago, and now I have to do it again, with the Snort and Suricata packages crashing only on ARM 32-bit hardware.

For what it's worth, I think the llvm compiler for the ARM hardware (and maybe just the 32-bit ARM hardware) has some issues of its own that lend to it generating faulty machine code. Note that this issue with freezing seems to impact the SG-3100 hardware, which is a 32-bit ARM CPU, but not x86-64 hardware. But the actual C source code for FreeBSD is the same.

torred

@bmeeks Yeah, its odd. It looks like FreeBSD devs are aware, if not quite how to fix it completely: https://reviews.freebsd.org/D28821

bmeeks

@torred said in Debug Kernel Build with Witness:

@bmeeks Yeah, its odd. It looks like FreeBSD devs are aware, if not quite how to fix it completely: https://reviews.freebsd.org/D28821

Yep. They had a long conversation about the solution. They settled on a memory barrier approach. I think the compiler is a little too aggressive with its optimizations. That was the issue with Snort a couple of years ago. The compiler was choosing the ONLY two instructions in the armv7 chip that did not do auto-fixup of non-aligned memory access. Snort's C source code has a number of those problems, but on Intel hardware the CPUs have always done an automatic fix-up at runtime and thus have hidden the bad code so that it persists.

luckman212

@torred Great stuff. Thank you for taking the time for such a detailed reply. I am going to see if I can wrap my head around some of this.

bmeeks

It can be fun if you are into such things. In my younger days I did this with embedded systems things. Many of those used Intel or Texas Instruments embedded microprocessors with the code in EPROM. Dump the EPROM using a reader, start at the power-on reset vector in the EPROM code for the CPU and start disassembling the binary. I often wrote my own disassemblers using the datasheets from the CPU manufacturers to get the binary opcodes and their assembler mnemonics so I could properly code the disassembler.

I did it to satisfy my curiosity and not for nefarious purposes. Mostly just to see if I could do it. My wife did question my sanity at times when she found me poring over a stack of paper from a dot-matrix printer composed of pages of assembly code. I would be writing in comments as I figured out what a section of code was likely doing.

Luckily I recovered ... , and no longer suffer from that yearning.