Reboot During Upload Speed Test



  • I just got 2.4 up and running and decided to run the DSLReports SpeedTest.  During the upload portion my router reboots after a couple of seconds.  This is the same behavior I reported here:

    https://redmine.pfsense.org/issues/5383

    The only difference from the bug I entered is that I am no longer using a traffic shaper.

    I tried it several times and it always rebooted.  I reverted to 2.3.2 with the same configuration and re-ran the speed test with no issues.  I had 2.4 installed on a ZFS mirror with mirrored swap, so I guess the 2.4 system was unable to save a crash dump.

    My system is:
    Supermicro A1SRI-2558F
    8GB ECC RAM
    120GB Intel 320 SSD
    Frontier FiOS 150/150


  • Rebel Alliance Developer Netgate

    Without details from the crash dump, we can't do anything about it.

    I can't replicate any crash doing a speed test.

    We'll at least need to see a crash dump/backtrace. If you've got a serial console on there, set your serial client to record the output and then replicate the crash conditions.



  • I will set 2.4 back up and replicate the crash.

    If I do not mirror swap, will the system generate a usable crash dump for you, or is the serial console the only way to get what you need?

    If serial console is needed I will need to find some directions on how to set this up.


  • Rebel Alliance Developer Netgate

    I am not sure about mirrored swap; ZFS may be more to blame than the mirrored swap. Try UFS with traditional swap and see what happens. I don't think we've run any tests yet to see what is needed for crash dumps on ZFS.



  • I set up the latest snapshot of 2.4 (11/16/16) on UFS.  I then ran the DSLReports Speedtest.  My box rebooted within a couple of seconds of starting the upload portion of the test.

    I also submitted the dump from IP 47.XXX.X.182.  I attached a screen grab of the speed test.  My upload speed is 150Mbps but the box crashes right around 15Mbps.  I can run this test on 2.3.2 without issue using the same configuration file.

    Crash report begins.  Anonymous machine information:
    
    amd64
    11.0-RELEASE-p3
    FreeBSD 11.0-RELEASE-p3 #185 dd6dc55(RELENG_2_4): Wed Nov 16 16:14:40 CST 2016     root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
    
    Crash report details:
    
    Filename: /var/crash/bounds
    1
    
    Filename: /var/crash/info.0
    Dump header from device: /dev/ada0p3
      Architecture: amd64
      Architecture Version: 2
      Dump Length: 581373952
      Blocksize: 512
      Dumptime: Wed Nov 16 23:23:04 2016
      Hostname: pfsense.wagsnet.lan
      Magic: FreeBSD Kernel Dump
      Version String: FreeBSD 11.0-RELEASE-p3 #185 dd6dc55(RELENG_2_4): Wed Nov 16 16:14:40 CST 2016
        root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
      Panic String: sbsndptr: sockbuf 0xfffff800774601b8 and mbuf 0xfffff801cd71d200 clashing
      Dump Parity: 2673792273
      Bounds: 0
      Dump Status: good
    
    Filename: /var/crash/info.last
    Dump header from device: /dev/ada0p3
      Architecture: amd64
      Architecture Version: 2
      Dump Length: 581373952
      Blocksize: 512
      Dumptime: Wed Nov 16 23:23:04 2016
      Hostname: pfsense.wagsnet.lan
      Magic: FreeBSD Kernel Dump
      Version String: FreeBSD 11.0-RELEASE-p3 #185 dd6dc55(RELENG_2_4): Wed Nov 16 16:14:40 CST 2016
        root@buildbot2.netgate.com:/builder/ce/tmp/obj/builder/ce/tmp/FreeBSD-src/sys/pfSense
      Panic String: sbsndptr: sockbuf 0xfffff800774601b8 and mbuf 0xfffff801cd71d200 clashing
      Dump Parity: 2673792273
      Bounds: 0
      Dump Status: good
    
    Filename: /var/crash/minfree
    2048
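For anyone comparing several of these reports, here is a minimal sketch of pulling the fields out of an info.N file like the one above. The function name `parse_dump_info` is made up for illustration, and it assumes the simple "Key: value" layout shown in this post; continuation lines (such as the second line of the Version String) are skipped.

```python
import re

# Matches lines such as "  Panic String: sbsndptr: sockbuf ..." where the
# key consists only of letters and spaces. Continuation lines like
# "root@host:/path" do not match because of the "@" in the key position.
_FIELD = re.compile(r"^\s*([A-Za-z][A-Za-z ]*): (.*)$")


def parse_dump_info(text):
    """Return a dict of the 'Key: value' fields found in info.N text."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def panic_function(fields):
    """Return the text before the first colon of the panic string, if any."""
    panic = fields.get("Panic String", "")
    return panic.split(":", 1)[0] if panic else None
```

With the report above, the panic string begins with `sbsndptr`, which is the name to search the FreeBSD and pfSense bug trackers for.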
    




  • Maybe this is interesting:

    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=148807#c32

    I added this to /boot/loader.conf.local

    hw.igb.num_queues="1"
    

    I can now complete the test without a panic.  The upload portion looks a little choppy but maybe that is just the DSLReports servers.
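If you want to script this workaround rather than edit the file by hand, the following is a rough sketch of setting a tunable exactly once in loader.conf-style text. The helper name `set_loader_tunable` is hypothetical, and the sketch only handles plain `name="value"` lines; reading and writing /boot/loader.conf.local itself is left out.

```python
def set_loader_tunable(text, name, value):
    """Return loader.conf text with name="value" present exactly once."""
    new_line = f'{name}="{value}"'
    out = []
    replaced = False
    for line in text.splitlines():
        # The key is everything before the first "=", ignoring whitespace.
        key = line.split("=", 1)[0].strip()
        if key == name:
            if not replaced:
                out.append(new_line)  # replace the first occurrence in place
                replaced = True
            continue  # drop any later duplicates of the same key
        out.append(line)
    if not replaced:
        out.append(new_line)  # key was absent, so append it
    return "\n".join(out) + "\n"
```

For example, `set_loader_tunable(existing_text, "hw.igb.num_queues", 1)` yields text containing the line shown above. Loader tunables are read at boot, so the change only applies after a reboot.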




  • I just reproduced this on my new apu2c4 as well.  With default settings the system panics during the upload portion of the DSLReports Speedtest.

    I can prevent this from happening by adding either of the following to /boot/loader.conf.local:

    hw.igb.num_queues="2"
    

    or

    hw.igb.num_queues="1"
    

    I have not tried other numbers of queues but both of these settings allow the test to complete without a reboot.  This may actually have the same root cause as an issue I was seeing when using traffic shaping.  See this bug report:

    https://redmine.pfsense.org/issues/5383#change-29447

    Should I enter a new bug for this?



  • I have had the same crashes on igb cards with 2.4, but with the shaper enabled and the speedtest.net test, so I think load does matter. On the latest versions, however, I cannot reproduce the crash.
    All I have in loader.conf.local regarding igb is:
    kern.ipc.nmbclusters="131072"
    kern.ipc.nmbjumbo9="20000"
    kern.ipc.nmbclusters="1000000"
    legal.intel_wpi.license_ack=1
    legal.intel_ipw.license_ack=1
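Note that kern.ipc.nmbclusters appears twice in the list above with different values. A small sketch for spotting repeated keys in loader.conf-style text follows; the function name `duplicate_tunables` is hypothetical. My understanding is that the later assignment is the one that normally takes effect, but that is an assumption worth confirming on the box with `sysctl kern.ipc.nmbclusters`.

```python
from collections import defaultdict


def duplicate_tunables(text):
    """Map each key that is assigned more than once to its values, in order."""
    seen = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        seen[key.strip()].append(value.strip().strip('"'))
    return {key: values for key, values in seen.items() if len(values) > 1}
```

Run against the five lines above, this reports only kern.ipc.nmbclusters, with the values 131072 and 1000000.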