Regular crash dumps
-
Hi everybody
I've got a virtualized pfSense (Proxmox 6) running 4 cores and 2 GB RAM.
pfSense 2.4.4, hardware checksum offload disabled.
For a couple of weeks I've been getting crash dumps on a regular basis. Most of the time the system acts normally, though. In some cases there were DNS issues.
Could someone please take a look, thank you.
textdump.tar.0 info.0 -
I just spotted that the following two processes were missing completely:
/usr/local/sbin/dhcpd -6 -user dhcpd -group _dhcp -chroot /var/dhcpd -cf /etc/dhcpdv6.conf -pf /var/run/dhcpdv6.pid vtnet1 vtnet2 vtnet3 vtnet5 v
/usr/local/sbin/dhcpleases6 -c /usr/local/bin/php-cgi -f /usr/local/sbin/prefixes.php|/bin/sh -l /var/dhcpd/var/db/dhcpd6.leases
After I restarted dhcpd from Services, both processes came back.
But even with dhcpd -6 up again, it does not hand out DHCPv6 addresses on vtnet3 (subnet6 2000:1111:2222:5000::/64).
Another DHCPv6 domain, which is configured very similarly, is working: vtnet5 (subnet6 2000:1111:2222:6004::/64).
This is my config:
option domain-name "my.domain.name";
option ldap-server code 95 = text;
option domain-search-list code 119 = text;
default-lease-time 7200;
max-lease-time 86400;
log-facility local7;
one-lease-per-client true;
deny duplicates;
ping-check true;
update-conflict-detection false;
authoritative;
subnet6 2000:1111:2222:1b::/64 {
  range6 2000:1111:2222:1b::1000 2000:1111:2222:1b::2000;
  option domain-name "my.domain.name";
  option domain-search "my.domain.name";
  ddns-domainname "my.domain.name";
  allow client-updates;
  option dhcp6.name-servers 2620:fe::fe,2620:fe::9;
  default-lease-time 84000;
  max-lease-time 164000;
}
key dns.server.com {
  algorithm hmac-md5;
  secret blablablablablablablablablabalblablablablablablablablablab;
}
zone my.domain.name. {
  primary 10.10.100.52;
  key srv201.my.domain.name;
}
subnet6 2000:1111:2222:6002::/64 {
  range6 2000:1111:2222:6002::1000 2000:1111:2222:6002::2000;
  do-forward-updates false;
  option dhcp6.name-servers 2000:1111:2222:6002:3cbe:80ff:fe25:e608;
  prefix6 2000:1111:2222:3000:: 2000:1111:2222:3ff0:: /60;
}
subnet6 2000:1111:2222:5000::/64 {
  range6 2000:1111:2222:5000::1000 2000:1111:2222:5000::2000;
  option domain-name "my.domain.name";
  option domain-search "my.domain.name";
  do-forward-updates false;
  option dhcp6.name-servers 2620:fe::9,2620:fe::fe;
}
subnet6 2000:1111:2222:6004::/64 {
  range6 2000:1111:2222:6004::1000 2000:1111:2222:6004::2000;
  option domain-name "another.domain";
  option domain-search "another.domain";
  ddns-domainname "another.domain";
  allow client-updates;
  option dhcp6.name-servers 2620:fe::fe,2620:fe::9;
  prefix6 2000:1111:2222:3000:: 2000:1111:2222:3ff0:: /60;
  default-lease-time 72000;
  max-lease-time 148000;
-
I found a thread about a similar problem that occurs if you have CARP / limiters:
https://redmine.pfsense.org/issues/4310 -
@kiokoman
I don't think so. I neither use CARP nor do I get kernel panics or reboots.
Edit:
I filed a bug since I'm convinced this is one: https://redmine.pfsense.org/issues/9723 -
Key parts of that crash are:
db:0:kdb.enter.default> show pcpu
cpuid = 1
dynamic pcpu = 0xfffffe00f8025380
curthread = 0xfffff80036d9f620: pid 12 "swi1: pfsync"
curpcb = 0xfffffe0079efdcc0
fpcurthread = none
idlethread = 0xfffff8000451a620: tid 100004 "idle: cpu1"
curpmap = 0xffffffff82b85998
tssp = 0xffffffff82bb6878
commontssp = 0xffffffff82bb6878
rsp0 = 0xfffffe0079efdcc0
gs32p = 0xffffffff82bbd0d0
ldt = 0xffffffff82bbd110
tss = 0xffffffff82bbd100
db:0:kdb.enter.default> bt
Tracing pid 12 tid 100130 td 0xfffff80036d9f620
pfsync_state_export() at pfsync_state_export+0x1e/frame 0xfffffe0079efda60
pfsync_sendout() at pfsync_sendout+0x19f/frame 0xfffffe0079efdaf0
pfsyncintr() at pfsyncintr+0x59/frame 0xfffffe0079efdb20
intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0079efdb60
ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0079efdbb0
fork_exit() at fork_exit+0x83/frame 0xfffffe0079efdbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0079efdbf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db:0:kdb.enter.default> ps
and
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f541de
stack pointer = 0x28:0xfffffe0079efda50
frame pointer = 0x28:0xfffffe0079efda60
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi1: pfsync)
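For anyone who wants to pull the same fields out of their own dump, here is a minimal Python sketch. It assumes the panic message has one field per line, as it appears in the raw dump text; the function name parse_panic and the sample string are illustrative only and not part of any pfSense tooling.

```python
import re

# Sample text copied from the panic message quoted above (shortened).
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f541de
current process = 12 (swi1: pfsync)
"""


def parse_panic(text):
    """Return selected fields from a 'Fatal trap' message as a dict.

    Hypothetical helper for illustration; fields that are not found
    are simply left out of the result.
    """
    fields = {}

    # Trap number and the human-readable reason on the same line.
    m = re.search(r"Fatal trap (\d+): (.+)", text)
    if m:
        fields["trap"] = int(m.group(1))
        fields["reason"] = m.group(2).strip()

    # Address whose access caused the fault, parsed as a hex integer.
    m = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    if m:
        fields["fault_address"] = int(m.group(1), 16)

    # PID and name of the thread that was running at the time.
    m = re.search(r"current process\s*=\s*(\d+) \((.+?)\)", text)
    if m:
        fields["pid"] = int(m.group(1))
        fields["process"] = m.group(2)

    return fields


summary = parse_panic(PANIC_TEXT)
print(summary)
```

A fault address of 0 usually means a NULL pointer was dereferenced, and the process field shows which kernel thread was running when it happened.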
I doubt it's 4310 directly (that was finally fixed a while back) but it does look similar.
What are you using pfSync for if you don't have CARP? Do you have any HA features configured?
Steve
-
Neither CARP nor pfsync is being used; it's just a single firewall.
-
It is crashing in the pfSync code, so it's hard to see how that could happen unless something is using it.
The pfsync interface does get created on all installs but I'm not aware of ever seeing an issue like that previously without it actually syncing.
Steve