pfSense crashing randomly, pfSense Plus 24.03
-
@ssjucrono As we're experiencing the same (same signature, same Redmine bug to track, another forum topic though), I'd be interested: do you run any of the following packages?
- acme
- aws-wizard (pre-installed on pfsense+)
- frr
- ipsec-profile-wizard (pre-installed on pfsense+)
- netgate_firmware_upgrade (pre-installed on pfsense+)
- node_exporter
- openvpn-client-export (pre-installed, I think)
- zabbix-agent64
-
@cboenning said in pfSense crashing randomly, pfSense Plus 24.03:
Thank you! Yes, I run these two, though I can remove OpenVPN as I do not use it anymore. I have switched to Tailscale.
- acme
- openvpn-client-export
-
@ssjucrono No, no. Don't remove anything. I was just interested in whether there might be some similarities to our setup.
I think those two packages are pretty unspectacular, given they're not really doing "anything network".
-
@cboenning
Yeah, I don't need them. I removed acme and openvpn-client-export as I have never used them. Thank you.
-
@ssjucrono you may want to opt in to enabling „full core dumps“ as outlined here (https://forum.netgate.com/topic/188861/24-03-crashing-again/19) and provide them to @stephenw10 and/or Redmine to get this debugged eventually though.
-
Yup, that. If you're able to enable full core dumps, that will help a lot here. However, be aware that you need enough swap available for the dump file, which will be the size of the used RAM.
An alternative that may also help would be to run the debug kernel:
https://docs.netgate.com/pfsense/en/latest/troubleshooting/debug-kernel.html
That may show additional errors before the panic.
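
To sanity-check the swap requirement before enabling full core dumps, something like the following could be run from the pfSense shell. This is only a rough sketch: it assumes console or SSH access, and it uses total physical memory as a conservative upper bound for "used RAM", which is an assumption rather than anything stated in this thread.

```shell
# List configured swap devices with their size and usage (human-readable units)
swapinfo -h

# Total physical memory in bytes; used RAM can never exceed this,
# so swap at least this large should be enough for a full dump
sysctl hw.physmem
```

If the swap total is smaller than the memory actually in use, the full dump may not fit.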
-
@ssjucrono you may want to check the Redmine issue for a workaround (https://redmine.pfsense.org/issues/15684#note-14)
-
Yup, let us know if disabling
net.inet.tcp.sack.enable
works to prevent it. For reference, that looks like:
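
As a minimal sketch of the same setting from the command line (assuming shell access to the firewall), the value can be inspected and changed at runtime with `sysctl`. A change made this way is not expected to survive a reboot on its own. To make it persistent, add the tunable under System > Advanced > System Tunables, as discussed in this thread.

```shell
# Show the current value (1 means TCP selective ACK is enabled)
sysctl net.inet.tcp.sack.enable

# Disable selective ACK for the running system only
sysctl net.inet.tcp.sack.enable=0
```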
-
@stephenw10 Thank you for the update. I don't have net.inet.tcp.sack.enable in my system tunables. Should I add it, or just leave it as is?
-
Yes, you will need to add that. It's not a default tunable.
-
@stephenw10 I have not seen this crash in a while. I will set this though.
Maybe it was caused by my Unraid Docker containers being backed up each night. They are all stopped and then started again within about 12 minutes, and I get a flapping warning from arpwatch each night when this occurs. Perhaps that was the cause of the initial crash?
-
I doubt it. But it's unclear what actually triggers it since most users never hit it.
-
@stephenw10 said in pfSense crashing randomly, pfSense Plus 24.03:
Yup, let us know if disabling
net.inet.tcp.sack.enable
works to prevent it. For reference, that looks like:
It works. I had random crashes, but once I added "net.inet.tcp.sack.enable=0", I haven't experienced any crashes.
-
Great. That should be patched in the next release.
-
I wonder if this is what hit me the other day.
Will post the dumps to see if they are of any use.
Will upgrading to 24.11 fix this? I normally just update the system patches (currently 2.2.11_17)
Thanks
Rob
info.0
-
Upgrading to the latest version is always recommended.
Or you can try adding this entry in System Tunables: "net.inet.tcp.sack.enable=0"
I am running 24.11. Pretty solid.
-
@enthu19 Well, 24.03 was until it wasn't. Six months-ish of uptime from memory, and then I suspect the ISP changed my WAN IP (PPPoE) and I got a page fault. I thought updating the system packages was an alternative to doing a full upgrade.
Was this system tunable added to 24.11? Otherwise I don't see how upgrading will help with my problem.
-
@hulleyrob
No, I added the System Tunables entry in 24.03.
-
@hulleyrob The sysctl above avoids a bug with selective ACK that's been fixed in 24.11.
However, the backtrace in your dump does not match that problem. In fact, it doesn't match any known problem, nor does it have any useful hints about what the problem could be.
Does this problem happen regularly? Is it always the same backtrace?
In any event, upgrading to the latest version is always a good idea.
-
@kprovost Nope, I can't remember the last time I had a crash. I think none since I got the 6100.
I've preferred the system update route since it was added, to avoid the downtime of a full system update. I thought it would leave the system in the same state code-wise. I only upgraded to 24.03 for something specific but can't remember what it was, maybe a WireGuard update?
I did hold off on 24.11 on purpose due to the high CPU usage reports initially.