Regular crash dumps



  • Hi everybody

    I've got a virtualized pfSense (on Proxmox 6) with 4 cores and 2 GB RAM.
    It runs pfSense 2.4.4 with hardware checksum offload disabled.
    For a couple of weeks now I've been getting crash dumps on a regular basis.

    Most of the time the system behaves normally, though. In some cases there were DNS issues.
    Could someone please take a look, thank you.
    textdump.tar.0 info.0



  • I just spotted that the following two processes were missing completely:

    /usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid vtnet1 vtnet2 vtnet3 vtnet5 v
    /usr/local/sbin/dhcpleases6 -c /usr/local/bin/php-cgi -f /usr/local/sbin/prefixes.php|/bin/sh -l /var/dhcpd/var/db/dhcpd6.leases
    

    After I restarted dhcpd from Services, both processes came back.
    But even with dhcpd -6 running again, it does not hand out DHCPv6 addresses on vtnet3 (subnet6 2000:1111:2222:5000::/64).
    Another DHCPv6 subnet that is configured very similarly does work, i.e. vtnet5 (subnet6 2000:1111:2222:6004::/64).
    This is my config:

    option domain-name "my.domain.name";
    option ldap-server code 95 = text;
    option domain-search-list code 119 = text;
    
    default-lease-time 7200;
    max-lease-time 86400;
    log-facility local7;
    one-lease-per-client true;
    deny duplicates;
    ping-check true;
    update-conflict-detection false;
    authoritative;
    subnet6 2000:1111:2222:1b::/64 {
            range6 2000:1111:2222:1b::1000 2000:1111:2222:1b::2000;
            option domain-name "my.domain.name";
            option domain-search "my.domain.name";
            ddns-domainname "my.domain.name";
            allow client-updates;
            option dhcp6.name-servers 2620:fe::fe,2620:fe::9;
            default-lease-time 84000;
            max-lease-time 164000;
    
    }
    key dns.server.com {
            algorithm hmac-md5;
            secret blablablablablablablablablabalblablablablablablablablablab;
    }
    zone my.domain.name. {
            primary 10.10.100.52;
            key srv201.my.domain.name;
    }
    subnet6 2000:1111:2222:6002::/64 {
            range6 2000:1111:2222:6002::1000 2000:1111:2222:6002::2000;
            do-forward-updates false;
            option dhcp6.name-servers 2000:1111:2222:6002:3cbe:80ff:fe25:e608;
            prefix6 2000:1111:2222:3000:: 2000:1111:2222:3ff0:: /60;
    
    }
    subnet6 2000:1111:2222:5000::/64 {
            range6 2000:1111:2222:5000::1000 2000:1111:2222:5000::2000;
            option domain-name "my.domain.name";
            option domain-search "my.domain.name";
            do-forward-updates false;
            option dhcp6.name-servers 2620:fe::9,2620:fe::fe;
    
    }
    subnet6 2000:1111:2222:6004::/64 {
            range6 2000:1111:2222:6004::1000 2000:1111:2222:6004::2000;
            option domain-name "another.domain";
            option domain-search "another.domain";
            ddns-domainname "another.domain";
            allow client-updates;
            option dhcp6.name-servers 2620:fe::fe,2620:fe::9;
            prefix6 2000:1111:2222:3000:: 2000:1111:2222:3ff0:: /60;
            default-lease-time 72000;
            max-lease-time 148000;
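
    For anyone comparing the working and non-working subnets, here is a minimal Python sketch that lists which statement types appear in one subnet6 block but not in the other. The function names (parse_subnet6_blocks, statement_key, diff_statement_keys) are made up for illustration, the parser only handles flat, non-nested blocks, and the closing brace of the last block is added as an assumption because the pasted config ends without one. It shows differences only; it does not say which of them, if any, matters.

```python
import re

# Two subnet6 blocks copied from the config in this post.
# Assumption: the final closing brace is added here for parsing.
SAMPLE_CONF = """
subnet6 2000:1111:2222:5000::/64 {
        range6 2000:1111:2222:5000::1000 2000:1111:2222:5000::2000;
        option domain-name "my.domain.name";
        option domain-search "my.domain.name";
        do-forward-updates false;
        option dhcp6.name-servers 2620:fe::9,2620:fe::fe;
}
subnet6 2000:1111:2222:6004::/64 {
        range6 2000:1111:2222:6004::1000 2000:1111:2222:6004::2000;
        option domain-name "another.domain";
        option domain-search "another.domain";
        ddns-domainname "another.domain";
        allow client-updates;
        option dhcp6.name-servers 2620:fe::fe,2620:fe::9;
        prefix6 2000:1111:2222:3000:: 2000:1111:2222:3ff0:: /60;
        default-lease-time 72000;
        max-lease-time 148000;
}
"""


def statement_key(statement):
    """Reduce a statement to its type, e.g. 'option domain-name' or 'range6'."""
    tokens = statement.split()
    if tokens[0] in {"option", "allow", "deny", "ignore"} and len(tokens) > 1:
        return " ".join(tokens[:2])
    return tokens[0]


def parse_subnet6_blocks(conf_text):
    """Map each subnet6 prefix to the set of statement types in its block.

    Only flat blocks are handled; nested braces and ';' inside quoted
    strings are not supported by this sketch.
    """
    blocks = {}
    for match in re.finditer(r"subnet6\s+(\S+)\s*\{([^{}]*)\}", conf_text):
        prefix, body = match.group(1), match.group(2)
        statements = [s.strip() for s in body.split(";") if s.strip()]
        blocks[prefix] = {statement_key(s) for s in statements}
    return blocks


def diff_statement_keys(blocks, a, b):
    """Return (types only in a, types only in b), each sorted."""
    return sorted(blocks[a] - blocks[b]), sorted(blocks[b] - blocks[a])


if __name__ == "__main__":
    blocks = parse_subnet6_blocks(SAMPLE_CONF)
    only_5000, only_6004 = diff_statement_keys(
        blocks, "2000:1111:2222:5000::/64", "2000:1111:2222:6004::/64"
    )
    print("only in 5000:", only_5000)
    print("only in 6004:", only_6004)
```

    To run it against the live file, read /etc/dhcpdv6.conf (the path from the dhcpd command line above) into a string and pass that instead of SAMPLE_CONF.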
    


  • I found a thread about a similar problem that occurs if you have CARP / limiters:
    https://redmine.pfsense.org/issues/4310



  • @kiokoman
    I don't think so. I don't use CARP, and I'm not seeing kernel panics or reboots.

    Edit:
    I filed a bug since I'm convinced this is one: https://redmine.pfsense.org/issues/9723


  • Netgate Administrator

    Key parts of that crash are:

    db:0:kdb.enter.default>  show pcpu
    cpuid        = 1
    dynamic pcpu = 0xfffffe00f8025380
    curthread    = 0xfffff80036d9f620: pid 12 "swi1: pfsync"
    curpcb       = 0xfffffe0079efdcc0
    fpcurthread  = none
    idlethread   = 0xfffff8000451a620: tid 100004 "idle: cpu1"
    curpmap      = 0xffffffff82b85998
    tssp         = 0xffffffff82bb6878
    commontssp   = 0xffffffff82bb6878
    rsp0         = 0xfffffe0079efdcc0
    gs32p        = 0xffffffff82bbd0d0
    ldt          = 0xffffffff82bbd110
    tss          = 0xffffffff82bbd100
    db:0:kdb.enter.default>  bt
    Tracing pid 12 tid 100130 td 0xfffff80036d9f620
    pfsync_state_export() at pfsync_state_export+0x1e/frame 0xfffffe0079efda60
    pfsync_sendout() at pfsync_sendout+0x19f/frame 0xfffffe0079efdaf0
    pfsyncintr() at pfsyncintr+0x59/frame 0xfffffe0079efdb20
    intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0079efdb60
    ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0079efdbb0
    fork_exit() at fork_exit+0x83/frame 0xfffffe0079efdbf0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0079efdbf0
    --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
    db:0:kdb.enter.default>  ps
    

    and

    Fatal trap 12: page fault while in kernel mode
    cpuid = 1; apic id = 01
    fault virtual address	= 0x0
    fault code		= supervisor read data, page not present
    instruction pointer	= 0x20:0xffffffff80f541de
    stack pointer	        = 0x28:0xfffffe0079efda50
    frame pointer	        = 0x28:0xfffffe0079efda60
    code segment		= base 0x0, limit 0xfffff, type 0x1b
    			= DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags	= interrupt enabled, resume, IOPL = 0
    current process		= 12 (swi1: pfsync)
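
    For anyone reading along, here is a rough Python sketch of how those key fields can be pulled out of the textdump text automatically. The function name summarize_panic and the regular expressions are my own assumptions based on the output format shown above, not an official tool. A fault virtual address of 0x0 on a kernel read usually points to a NULL pointer dereference, and the top backtrace frame shows where it happened.

```python
import re

# Excerpt of the ddb output quoted in this post.
SAMPLE_DUMP = """
Tracing pid 12 tid 100130 td 0xfffff80036d9f620
pfsync_state_export() at pfsync_state_export+0x1e/frame 0xfffffe0079efda60
pfsync_sendout() at pfsync_sendout+0x19f/frame 0xfffffe0079efdaf0
pfsyncintr() at pfsyncintr+0x59/frame 0xfffffe0079efdb20

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
current process         = 12 (swi1: pfsync)
"""


def summarize_panic(text):
    """Extract trap type, fault address, process and backtrace frames."""
    summary = {}

    trap = re.search(r"Fatal trap (\d+): (.+)", text)
    if trap:
        summary["trap"] = (int(trap.group(1)), trap.group(2).strip())

    addr = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if addr:
        summary["fault_address"] = int(addr.group(1), 16)

    proc = re.search(r"current process\s*=\s*(\d+) \((.+?)\)", text)
    if proc:
        summary["process"] = (int(proc.group(1)), proc.group(2))

    # Backtrace lines look like: 'name() at name+0x1e/frame 0x...'
    summary["frames"] = re.findall(r"^\s*(\w+)\(\) at ", text, re.MULTILINE)
    return summary


if __name__ == "__main__":
    info = summarize_panic(SAMPLE_DUMP)
    print(info)
    if info.get("fault_address") == 0 and info["frames"]:
        print("Likely NULL pointer dereference in", info["frames"][0])
```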
    

    I doubt it's 4310 directly (that was finally fixed a while back) but it does look similar.

    What are you using pfSync for if you don't have CARP? Do you have any HA features configured?

    Steve



  • Neither CARP nor pfsync is being used; it's just a single firewall.


  • Netgate Administrator

    It is crashing in the pfSync code, so it's hard to see how that could happen unless something is using it.

    The pfsync interface does get created on all installs but I'm not aware of ever seeing an issue like that previously without it actually syncing.

    Steve

