FRR zebra died after upgrade from 2.4.3 to 2.4.4



  • After upgrading from version 2.4.3 to the latest 2.4.4, FRR zebra died on startup. I don't know whether Redmine bug #7038 is related; the crash log is different.

    Shell Output - cat /var/tmp/quagga.zebra.crashlog
    ZEBRA: Received signal 11 at 1523178483 (si_addr 0x20); aborting…
    Backtrace for 5 stack frames:
    0x8008c9650 <zlog_backtrace_sigsafe+0x40>at /usr/local/lib/libfrr.so.0
    0x8008c8e98 <zlog_signal+0x558>at /usr/local/lib/libfrr.so.0
    0x8008dd344 <signal_init+0x244>at /usr/local/lib/libfrr.so.0
    0x801596904 <pthread_sigmask+0x544>at /lib/libthr.so.3
    0x801595e9f <pthread_getspecific+0xe2f>at /lib/libthr.so.3
    no thread information available
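
Since the question in this thread is whether one crash log matches another, here is a minimal sketch of pulling the signal number and the function names out of a log like the one above so two reports can be compared. It is not part of FRR or pfSense; the function name `parse_crashlog` and both regular expressions are assumptions derived only from the format of the log quoted in this post.

```python
import re

# Sample copied from the crash log quoted above.
SAMPLE = """\
ZEBRA: Received signal 11 at 1523178483 (si_addr 0x20); aborting…
Backtrace for 5 stack frames:
0x8008c9650 <zlog_backtrace_sigsafe+0x40>at /usr/local/lib/libfrr.so.0
0x8008c8e98 <zlog_signal+0x558>at /usr/local/lib/libfrr.so.0
0x8008dd344 <signal_init+0x244>at /usr/local/lib/libfrr.so.0
0x801596904 <pthread_sigmask+0x544>at /lib/libthr.so.3
0x801595e9f <pthread_getspecific+0xe2f>at /lib/libthr.so.3
no thread information available
"""

# Assumed patterns, based only on the lines shown in this thread.
HEADER_RE = re.compile(r"^(\w+): Received signal (\d+) at (\d+)")
FRAME_RE = re.compile(r"^(0x[0-9a-f]+) <([^+>]+)\+(0x[0-9a-f]+)>\s*at (\S+)$")


def parse_crashlog(text):
    """Return the daemon name, signal number, timestamp and frames of a crash log."""
    result = {"daemon": None, "signal": None, "timestamp": None, "frames": []}
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            result["daemon"] = header.group(1)
            result["signal"] = int(header.group(2))     # 11 is SIGSEGV on FreeBSD
            result["timestamp"] = int(header.group(3))  # Unix epoch seconds
            continue
        frame = FRAME_RE.match(line)
        if frame:
            # Keep (function name, library path); addresses vary between builds.
            result["frames"].append((frame.group(2), frame.group(4)))
    return result


def signature(parsed):
    """A comparable fingerprint: signal number plus the ordered function names."""
    return (parsed["signal"], tuple(name for name, _ in parsed["frames"]))


if __name__ == "__main__":
    print(signature(parse_crashlog(SAMPLE)))
```

Running the same function over a second log (for example the one attached to a Redmine ticket) and comparing the two `signature(...)` values gives a quick, rough check on whether the crashes share a backtrace. Differing signatures do not prove the root causes differ.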


  • Rebel Alliance Developer Netgate

    What hardware are you using?

    That bug you reference is only relevant to ARM devices (SG-1000, SG-3100), is that where you're using it?

    I am seeing zebra crash on SG-3100 running 2.4.4, but the crash log is different.

    EDIT: Meant to say that 2.4.4 is now running FRR 4.0, so it's possible the previous fix didn't carry over or something new is happening.


  • Rebel Alliance Developer Netgate

    After checking further, it doesn't appear specific to ARM. I have an OSPF instance running just fine on an SG-1000, but a BGP setup on SG-1000 and an amd64 VM both have a dead zebra daemon.

    Opened a ticket here: https://redmine.pfsense.org/issues/8449



  • It occurs on multiple VMs on VMware Workstation running on Intel architecture. OSPF only, no BGP.


  • Rebel Alliance Developer Netgate

    Curious, as my only instance with just OSPF is running fine, but the BGP instances all crash.


  • Rebel Alliance Developer Netgate

    Looks like there are multiple crashes and stability issues with FRR 4.0 on FreeBSD, so the FreeBSD port maintainer made a net/frr3 port to use in the meantime while those are investigated upstream.

    We'll get that merged into our tree and change the FRR package to use it instead for the time being, which should stabilize everything.

    More detail at https://redmine.pfsense.org/issues/8449



  • A probable fix is in version 5.0.1, released on July 6:

    Fix crash with gif/tun/gre interface.

    PR: 228643

    https://www.freshports.org/net/frr5/