Crash with report (WAN link issues again)
-
A bit of background: I used to have lockups on 2.3.2 with the Realtek NICs in my ZOTAC Nano CI323. I would lose the WAN link and GUI access to pfSense (SSH still worked), and the only way to get it back was a hard reboot. I have since upgraded to the 2.4 snapshots and was still having the same issues. I then tried limiting my WAN link to 100Mbps; now it no longer locks up, but it crashes and reboots instead. Here's the crash log.
I am in the process of trying to compile the Realtek driver for my NICs to see if that helps; it seems to have helped others on FreeBSD who had issues similar to mine.
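For reference, forcing the link speed from a shell looks roughly like the following; this is a sketch, assuming `re0` is the WAN NIC (in the pfSense GUI the same thing is done with the interface's Speed and Duplex setting):

```shell
# Force the Realtek WAN NIC to 100Mbps full duplex instead of autoselect.
# (Assumes re0 is the WAN interface; adjust to match your assignment.)
ifconfig re0 media 100baseTX mediaopt full-duplex

# Check the active media setting afterwards.
ifconfig re0 | grep media
```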
Crash report begins. Anonymous machine information:
amd64
11.0-RELEASE-p3
FreeBSD 11.0-RELEASE-p3 #211 307498f(RELENG_2_4): Wed Nov 30 10:00:36 CST 2016 root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
Crash report details:
Filename: /var/crash/bounds
Filename: /var/crash/info.0
Dump header from device: /dev/gptid/dc0a6279-b76a-11e6-a6f0-00012e6fad50
Architecture: amd64
Architecture Version: 1
Dump Length: 78336
Blocksize: 512
Dumptime: Sat Dec 10 16:19:08 2016
Hostname: Roasted.localdomain
Magic: FreeBSD Text Dump
Version String: FreeBSD 11.0-RELEASE-p3 #211 307498f(RELENG_2_4): Wed Nov 30 10:00:36 CST 2016
root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
Panic String:
Dump Parity: 1030638445
Bounds: 0
Dump Status: good
Filename: /var/crash/info.last
Dump header from device: /dev/gptid/dc0a6279-b76a-11e6-a6f0-00012e6fad50
Architecture: amd64
Architecture Version: 1
Dump Length: 78336
Blocksize: 512
Dumptime: Sat Dec 10 16:19:08 2016
Hostname: Roasted.localdomain
Magic: FreeBSD Text Dump
Version String: FreeBSD 11.0-RELEASE-p3 #211 307498f(RELENG_2_4): Wed Nov 30 10:00:36 CST 2016
root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
Panic String:
Dump Parity: 1030638445
Bounds: 0
Dump Status: good
Filename: /var/crash/textdump.tar.0
db:0:kdb.enter.default> run lockinfo
db:1:lockinfo> show locks
No such command
db:1:locks> show alllocks
No such command
db:1:alllocks> show lockedvnods
Locked vnodes
db:0:kdb.enter.default> show pcpu
cpuid = 3
dynamic pcpu = 0xfffffe026553c900
curthread = 0xfffff8004002da00: pid 82640 "sh"
curpcb = 0xfffffe0230dd5cc0
fpcurthread = 0xfffff8004002da00: pid 82640 "sh"
idlethread = 0xfffff80005204000: tid 100006 "idle: cpu3"
curpmap = 0xfffff801be2e8138
tssp = 0xffffffff82a24e48
commontssp = 0xffffffff82a24e48
rsp0 = 0xfffffe0230dd5cc0
gs32p = 0xffffffff82a2b6a0
ldt = 0xffffffff82a2b6e0
tss = 0xffffffff82a2b6d0
db:0:kdb.enter.default> bt
Tracing pid 82640 tid 100111 td 0xfffff8004002da00
turnstile_broadcast() at turnstile_broadcast+0x9c/frame 0xfffffe0230dd5480
__rw_wunlock_hard() at _rw_wunlock_hard+0x8f/frame 0xfffffe0230dd54b0
vm_map_delete() at vm_map_delete+0x3dc/frame 0xfffffe0230dd5530
vm_map_remove() at vm_map_remove+0x47/frame 0xfffffe0230dd5560
exec_new_vmspace() at exec_new_vmspace+0x22f/frame 0xfffffe0230dd55e0
exec_elf64_imgact() at exec_elf64_imgact+0xa58/frame 0xfffffe0230dd56f0
kern_execve() at kern_execve+0x74d/frame 0xfffffe0230dd5a50
sys_execve() at sys_execve+0x4a/frame 0xfffffe0230dd5ad0
amd64_syscall() at amd64_syscall+0x4ce/frame 0xfffffe0230dd5bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0230dd5bf0
--- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800b40d3a, rsp = 0x7fffffffe728, rbp = 0x7fffffffe870 ---
db:0:kdb.enter.default> ps
pid ppid pgrp uid state wmesg wchan cmd
82640 81373 281 0 R CPU 3 sh
82535 81373 281 0 R CPU 1 grep
82230 81373 281 0 R CPU 2 grep
82004 81373 281 0 R CPU 0 grep
81699 81373 281 0 S pipdwt 0xfffff8000bb9f2f8 clog
81373 73438 281 0 R sh
73438 281 281 0 S piperd 0xfffff80079c1d5f0 php-fpm
21271 281 281 0 S accept 0xfffff8000bdf106c php-fpm
17756 43248 281 0 S nanslp 0xffffffff8286ccd0 sleep
76839 1 76839 0 Ss (threaded) ntopng
100411 S nanslp 0xffffffff8286ccd3 ntopng
100451 S select 0xfffff8000ba38040 ntopng
100452 S uwait 0xfffff800672a8a80 ntopng
100453 S uwait 0xfffff800404cfe00 ntopng
100454 S uwait 0xfffff80079648b00 ntopng
100455 S uwait 0xfffff80040506980 ntopng
100456 S uwait 0xfffff80040513880 ntopng
100457 S nanslp 0xffffffff8286ccd2 ntopng
100458 S nanslp 0xffffffff8286ccd0 ntopng
100459 S nanslp 0xffffffff8286ccd2 ntopng
100460 S nanslp 0xffffffff8286ccd0 ntopng
100461 S bpf 0xfffff80079fe7c00 ntopng
100462 S select 0xfffff80079ca7940 ntopng
100463 S select 0xfffff8000bf61440 ntopng
76587 1 281 0 S (threaded) redis-server
100438 S kqread 0xfffff800674ae000 redis-server
100449 S uwait 0xfffff8000bca2800 redis-server
100450 S uwait 0xfffff80079648a80 redis-server
43248 1 281 0 S wait 0xfffff8007931e000 sh
42383 1 42383 0 Ss select 0xfffff800405152c0 openvpn
37561 1 37561 136 Ss select 0xfffff80079be77c0 dhcpd
32867 1 32867 59 Ss (threaded) unbound
100200 S kqread 0xfffff8000b83fd00 unbound
100446 S kqread 0xfffff80079bac700 unbound
100447 S kqread 0xfffff80079baca00 unbound
100448 S kqread 0xfffff80040495000 unbound
30359 1 30359 0 Ss (threaded) dpinger
100201 S uwait 0xfffff80067301580 dpinger
100442 S sbwait 0xfffff80079c0a4a4 dpinger
100443 S nanslp 0xffffffff8286ccd0 dpinger
100444 S nanslp 0xffffffff8286ccd3 dpinger
100445 S accept 0xfffff80067248a8c dpinger
19837 1 19837 65 Ss select 0xfffff8007950f640 dhclient
15365 1 15365 0 Ss select 0xfffff800404a0940 dhclient
98145 98081 98145 0 S+ ttyin 0xfffff8000b87eca8 sh
98081 96620 98081 0 S+ wait 0xfffff800792d8a50 sh
97918 77570 97918 0 Ss (threaded) sshlockout_pf
100092 S piperd 0xfffff800797352f8 sshlockout_pf
100173 S nanslp 0xffffffff8286ccd1 sshlockout_pf
97641 1 97641 0 Ss+ ttyin 0xfffff8000b9348a8 getty
97421 1 97421 0 Ss+ ttyin 0xfffff8000b934ca8 getty
97411 1 97411 0 Ss+ ttyin 0xfffff8000b9350a8 getty
97342 1 97342 0 Ss+ ttyin 0xfffff8000b9354a8 getty
97209 1 97209 0 Ss+ ttyin 0xfffff8000b9358a8 getty
97125 1 97125 0 Ss+ ttyin 0xfffff8000b87e4a8 getty
96795 1 96795 0 Ss+ ttyin 0xfffff8000b87e8a8 getty
96620 1 96620 0 Ss+ wait 0xfffff8000bb8da50 login
83854 83360 83360 0 S nanslp 0xffffffff8286ccd2 minicron
83360 1 83360 0 Ss wait 0xfffff8004003ca50 minicron
83126 82896 82896 0 S nanslp 0xffffffff8286ccd0 minicron
82896 1 82896 0 Ss wait 0xfffff8000bb39528 minicron
82605 82420 82420 0 S nanslp 0xffffffff8286ccd2 minicron
82420 1 82420 0 Ss wait 0xfffff8000bb8ea50 minicron
77570 1 77570 0 Ss select 0xfffff8000bc88740 syslogd
33600 1 33600 0 Ss (threaded) ntpd
100133 S select 0xfffff800672a83c0 ntpd
30868 1 30868 0 Ss nanslp 0xffffffff8286ccd0 cron
30634 30287 30287 0 S kqread 0xfffff80040494c00 nginx
30500 30287 30287 0 S kqread 0xfffff800672c4000 nginx
30287 1 30287 0 Ss pause 0xfffff8000bc140a8 nginx
20878 1 20878 0 Ss select 0xfffff800404d29c0 xinetd
19056 1 19056 0 Ss bpf 0xfffff80040467000 filterlog
7750 1 7750 0 Ss (threaded) sshlockout_pf
100121 S uwait 0xfffff8000bca3380 sshlockout_pf
100122 S nanslp 0xffffffff8286ccd0 sshlockout_pf
7745 1 7745 0 Ss select 0xfffff8000ba39c40 sshd
311 1 311 0 Ss select 0xfffff8000b9a0140 devd
297 295 295 0 S kqread 0xfffff8000bb52c00 check_reload_status
295 1 295 0 Ss kqread 0xfffff8000b9fb700 check_reload_status
281 1 281 0 Ss kqread 0xfffff8000bb52600 php-fpm
58 0 0 0 DL mdwait 0xfffff8000b9f4800 [md0]
27 0 0 0 DL syncer 0xffffffff82966880 [syncer]
26 0 0 0 DL vlruwt 0xfffff8000b89a528 [vnlru]
25 0 0 0 DL (threaded) [bufdaemon]
100065 D psleep 0xffffffff82965104 [bufdaemon]
100078 D sdflush 0xfffff8000b8b7ae8 [/ worker]
24 0 0 0 DL - 0xffffffff82965db4 [bufspacedaemon]
23 0 0 0 DL pgzero 0xffffffff8297a4e4 [pagezero]
22 0 0 0 DL pollid 0xffffffff8286b680 [idlepoll]
21 0 0 0 DL psleep 0xffffffff8297698c [vmdaemon]
20 0 0 0 DL (threaded) [pagedaemon]
100060 D psleep 0xffffffff82a24105 [pagedaemon]
100069 D umarcl 0xffffffff829762b8 [uma]
19 0 0 0 DL - 0xffffffff82964914 [soaiod4]
18 0 0 0 DL - 0xffffffff82964914 [soaiod3]
9 0 0 0 DL - 0xffffffff82964914 [soaiod2]
8 0 0 0 DL - 0xffffffff82964914 [soaiod1]
17 0 0 0 DL cooling 0xfffff8000528b958 [acpi_cooling0]
16 0 0 0 DL tzpoll 0xffffffff82627838 [acpi_thermal]
7 0 0 0 DL - 0xffffffff82740bf0 [rand_harvestq]
6 0 0 0 DL pftm 0xffffffff80f513f0 [pf purge]
5 0 0 0 DL waiting 0xffffffff82a13740 [sctp_iterator]
15 0 0 0 DL (threaded) [usb]
100035 D - 0xfffffe000110e460 [usbus0]
100036 D - 0xfffffe000110e4b8 [usbus0]
100037 D - 0xfffffe000110e510 [usbus0]
100038 D - 0xfffffe000110e568 [usbus0]
100039 D - 0xfffffe000110e5c0 [usbus0]
4 0 0 0 DL (threaded) [cam]
100032 D - 0xffffffff82613a80 [doneq0]
100055 D - 0xffffffff826138c8 [scanner]
3 0 0 0 DL crypto_r 0xffffffff82974e70 [crypto returns]
2 0 0 0 DL crypto_w 0xffffffff82974d18 [crypto]
14 0 0 0 DL (threaded) [geom]
100023 D - 0xffffffff829eb940 [g_event]
100024 D - 0xffffffff829eb948 [g_up]
100025 D - 0xffffffff829eb950 [g_down]
13 0 0 0 DL (threaded) [ng_queue]
100019 D sleep 0xffffffff825d1810 [ng_queue0]
100020 D sleep 0xffffffff825d1810 [ng_queue1]
100021 D sleep 0xffffffff825d1810 [ng_queue2]
100022 D sleep 0xffffffff825d1810 [ng_queue3]
12 0 0 0 WL (threaded) [intr]
100007 I [swi3: vm]
100008 I [swi4: clock (0)]
100009 I [swi4: clock (1)]
100010 I [swi4: clock (2)]
100011 I [swi4: clock (3)]
100012 I [swi1: netisr 0]
100013 I [swi6: task queue]
100014 I [swi6: Giant taskq]
100017 I [swi5: fast taskq]
100033 I [irq256: ahci0]
100034 I [irq257: xhci0]
100040 I [irq258: hdac0]
100041 I [irq259: pcib1]
100042 I [irq260: re0]
100043 I [irq261: re1]
100044 I [swi0: uart uart]
100045 I [irq1: atkbd0]
100049 I [swi1: pf send]
100050 I [swi1: pfsync]
11 0 0 0 RL (threaded) [idle]
100003 CanRun [idle: cpu0]
100004 CanRun [idle: cpu1]
100005 CanRun [idle: cpu2]
100006 CanRun [idle: cpu3]
1 0 1 0 SLs wait 0xfffff80005202528 [init]
10 0 0 0 DL audit_wo 0xffffffff82a1a8c0 [audit]
0 0 0 0 DLs (threaded) [kernel]
100000 D swapin 0xffffffff829eb978 [swapper]
100015 D - 0xfffff80005218c00 [aiod_kick taskq]
100016 D - 0xfffff80005218a00 -
First, that is a really old snapshot. If you're going to be running 2.4, keep up and don't run one that is weeks old.
Second, that looks like it crashed in a memory operation. While it's possible for that to be an OS bug, it's unlikely.
-
Will update the snapshot tonight. Can you let me know what pointed towards a memory issue, so I can look for it next time instead of bothering you guys?
So maybe this was just an unrelated issue, and forcing my WAN link to 100Mbps fixes the WAN link going down when it's set to 1Gbps and I push more traffic through it.
-
The top few calls in the backtrace are in locking & memory management. Not a 100% definitive analysis, just an educated guess.
There was a lock problem fixed in routing last week, but it was with a fairly uncommon configuration (default gateway outside of the WAN subnet).
No matter what though, updating is the first step.
-
Just wanted to update for closure's sake.
After a few weeks of trying to fix it, I think I finally have my Zotac box stable: no "watchdog timeout" or loss of the WAN IP under heavy load for about 8 days. Longest run ever.
I need to test for a longer period to confirm stability, but what I did was compile the latest Realtek FreeBSD driver (1.92) and load it on startup using kldload. Seems OK for now.
Before all this, NOTHING was stable. It always crashed on the WAN IP (watchdog timeout) no matter which interface it was assigned to (re0 or re1).
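In case it helps anyone else trying the same fix, loading a self-built `if_re.ko` at boot looks roughly like this; the module path is an assumption, so adjust it to wherever your build put the driver:

```shell
# Copy the freshly built Realtek driver into the kernel module directory.
# (Path assumed; adjust to where your build produced if_re.ko.)
cp if_re.ko /boot/modules/

# Have the loader pick it up at boot instead of the in-kernel re(4) driver.
# On pfSense, loader.conf.local survives upgrades, unlike loader.conf.
echo 'if_re_load="YES"' >> /boot/loader.conf.local

# Or load it immediately without rebooting:
kldload /boot/modules/if_re.ko
```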