Crash/freeze since 2.2



  • Hi,

    We are a university with a 10 Gbit/s internet connection.
    We use two Dell R630 servers with 2x10Gb bonding, VLANs and CARP to handle about 500 Mbit/s of traffic (filtering + NAT).
    Since we upgraded from 2.1.5 to 2.2.x we have had repeated crashes (the same crash in 2.2.2, 2.2.3 and 2.2.4).
    Version 2.1.x had been running for many years on other servers.
    We also have:

    • 2 pfSense boxes (filtering) on 2.2.4 for datacenter access, with no problems
    • 2 pfSense boxes (filtering + NAT + captive portal) on 2.2.4 for the student captive portal, with no problems
    • 2 pfSense boxes (filtering + NAT) on 2.1.5 for wireless access, with no problems

    We tried to gather information, but no crash dump is generated.
    When pfSense freezes, we can't ping it and the console is frozen (except for Caps Lock), so we have to do a hard reboot.

    We tried running pfSense from an external USB disk and it crashed again (so we conclude the RAID controller is not the cause).
    We tried replacing the X520 card with an Intel gigabit card and the crashes continued.
    We tried installing pfSense on another server (R715) and it crashed again.

    Is there a way to get more information about the freeze?

    Best regards



  • Logs? Crash dumps?



  • No crash dump is generated and no error message is printed on the console.

    If needed, we can post dmesg, system.log, …

    Thanks



  • Hummm.
    A BIG user !  :)

    What helps to see things going wrong is this: https://www.test-domaine.fr/munin/brit-hotel-fumel.net/pfsense.brit-hotel-fumel.net/index.html => a Munin collector on your pfSense!
    One of my web servers (on the net, somewhere) is collecting the info from the pfSense box.
    Note : I didn't even activate ALL the Munin plugins - so more info is possible.
    Note : I added a Captive Portal Users graph myself.

    What makes this difficult to 'debug' is the fact that pfSense recently moved to a more recent FreeBSD version.
    2.2.4 is using "FreeBSD 10.1-RELEASE-p15".



  • We already have remote monitoring using the Zabbix agent, and comparing counters between two crashes does not help (no increase in memory usage, CPU, pf tables, …).

    When pfSense freezes, all network connections stop and it is not possible to gather information over the network.

    So we have installed collectd on pfSense itself and we are waiting for a new freeze.

    Do you think forcing the generation of a memory dump could be helpful?
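
Because a freeze takes the network down, samples are only useful if they are already on the local disk when the hang occurs. The following is a minimal Python sketch of that idea, not a description of how collectd works internally: each sample is appended as one JSON line and forced to disk with fsync, so the last complete line approximates the state just before the freeze. The function names, the field names and the use of the load average are illustrative assumptions, and it assumes a Python interpreter is available on the box.

```python
import json
import os
import time


def append_sample(path, sample):
    """Append one sample as a JSON line and force it to disk."""
    record = dict(sample, ts=time.time())
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to write the data out now


def last_sample(path):
    """Return the most recent complete sample, or None if there is none."""
    last = None
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            try:
                last = json.loads(line)
            except ValueError:
                # A line cut short by the freeze is skipped.
                continue
    return last


def collect():
    """Hypothetical sample: 1-, 5- and 15-minute load averages."""
    load1, load5, load15 = os.getloadavg()
    return {"load1": load1, "load5": load5, "load15": load15}
```

Calling `append_sample(path, collect())` every few seconds and reading `last_sample(path)` after the hard reboot would show the last values recorded before the hang.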



  • Hi,

    We had a new freeze, but with the local collectd we have more information:

    http://www.unicaen.fr/pfsense/freeze-20150915.xhtml

    The system is not completely frozen, just the console and the network.

    After some searching on the web we modified the following values in the BIOS:

    • disable "logical processor"
    • disable "virtualization technology"
    • disable "SRIOV" on the Intel X520 network card

    We also tuned the following values in /boot/loader.conf.local:
    cc_htcp_load="YES"
    net.link.ifqmaxlen="4096"
    hw.igb.num_queues="1"
    hw.ix.num_queues="8"
    net.isr.maxthreads="1"
    net.isr.defaultqlimit="2048"
    hw.igb.max_interrupt_rate="32000"
    hw.ix.rx_process_limit="-1"
    net.inet.tcp.syncache.hashsize="1024"
    net.inet.tcp.syncache.bucketlimit="100"
    net.isr.bindthreads="0"
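
When several tunables change at once, it helps to record exactly which values differ between two hosts or between two versions of the file. The following is a minimal Python sketch under stated assumptions: it only handles simple `name="value"` lines and `#` comments like the ones above, not the full loader.conf syntax, and the function names are hypothetical.

```python
import re

# Matches simple lines such as: hw.ix.num_queues="8"
_LINE = re.compile(r'^\s*([A-Za-z0-9_.]+)\s*=\s*"([^"]*)"\s*(?:#.*)?$')


def parse_tunables(text):
    """Parse name="value" lines into a dict, skipping blanks and comments."""
    tunables = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            raise ValueError("unsupported line: %r" % line)
        tunables[match.group(1)] = match.group(2)
    return tunables


def diff_tunables(old, new):
    """Return {name: (old_value, new_value)} for every name that differs."""
    names = sorted(set(old) | set(new))
    return {
        name: (old.get(name), new.get(name))
        for name in names
        if old.get(name) != new.get(name)
    }
```

Feeding the old and new file contents through `parse_tunables` and then `diff_tunables` gives a short list of changes to keep next to the freeze timestamps.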

    Now we have to wait.