pfSense crashes every few days or hours - what happens to the crash reports?



  • We have a fairly lively pfSense installation (lots of IPsec and OpenVPN VPNs) which has been running on the same machine for several years.  Since updating to 2.3 we're finding it crashes regularly.  It crashed this morning at 6am, then again at about 11am, and now it has just crashed again.

    I have disabled the other three CPUs as per the instructions in the forum.  But that doesn't seem to have done it.

    I am faithfully submitting crash reports every time they appear in the web configurator, but I was just wondering if this is just an annoyance or whether anyone is actually looking at them?  Should I be reporting them here instead?

    I am perfectly happy to entertain the idea that the machine may have got "tired" (you know, RAM or disks) and needs rebuilding or replacing, but it seems too much of a coincidence that the instability has happened after upgrading to 2.3.

    Currently running 2.3.1-p1 but I guess my users are getting used to interruptions so I may update to the latest in a moment.  But the release notes don't mention anything in this area, as far as I can tell.

    I've attached the latest crash report, in case someone can hazard a guess at the problem we're seeing.  Thanks!
    pfs_crash_report.txt



  • Nothing happens to the crash reports (other than being used for general statistical "how many systems have reported X crash") unless you ask and provide the IP where the crash report was submitted.

    Disabling CPUs isn't necessary or desirable on 2.3.1 and newer; the root cause that led people to that workaround was only in 2.3.0 and was fixed in 2.3.1.

    It seems like the crash is somewhere in dummynet. Are you using limiters by chance? Could be the issue with limiters and pfsync combined causing crashes, if you upgraded from 2.1.x. If you upgraded from 2.2.x, that wouldn't be any different.

    You're running 32 bit on a 64 bit system. I'd reinstall using 2.3.1 64 bit, restore your config, and upgrade it to 2.3.1_5. It's possible that will fix the issue (especially if you ended up inadvertently switching architectures during upgrade by having the wrong auto-update URL hard-coded).



  • Thanks cmb.  No, we don't use limiters, and we didn't update directly from 2.1 but from 2.2.6.

    I'll try the 64-bit upgrade.  I've just submitted another crash report from very early this morning, when there should've been very little going through the machine.  It is also attached to this message.

    pfs_crash_report_post_smp_fix_early_morning.txt



  • The use of dummynet must be in captive portal then, since you're not using limiters. Do you have speed limits defined in captive portal? That's where dummynet would come into play.

    That crash looks similar to the previous one, and it also seems to be dummynet-related. This is the first I've heard of something like that going from 2.2.x to 2.3.

    If you can at least temporarily disable captive portal, or remove any speed limits in use, that would help confirm or deny whether that's the source of the issue.
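
Both crash reports in this thread are said to point at dummynet. For anyone wanting to do a rough first pass over their own saved report before posting, here is a minimal sketch that pulls out every line mentioning a keyword. It assumes only that the report is plain text; the function name `find_mentions` is made up for illustration, and a keyword match is a hint about where to look, not a diagnosis.

```python
import re


def find_mentions(report_text, keyword="dummynet"):
    """Return the stripped lines of report_text that mention keyword, ignoring case."""
    # re.escape keeps the keyword literal even if it contains regex characters.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [
        line.strip()
        for line in report_text.splitlines()
        if pattern.search(line)
    ]
```

To use it, read the saved report into a string (for example with `open(path, errors="replace").read()`) and pass it in. Running it over two reports and comparing the matching lines gives a quick way to see whether both mention the same component.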



  • We don't use the captive portal.  And no traffic shaping either.

    I've raised a case for this through paid-for support (well, I sent the dump file yesterday but have not yet had an acknowledgement).  Would it help if I sent our config through that route too?



  • Your logs show captive portal loading. The status tgz output would definitely help. Browse to status.php and download the resulting file and attach that.

    Your ticket was replied to less than an hour after you submitted it, with the main suggestion I offered here, switching to 64 bit. Might want to check your spam if you didn't get the email.

    I assigned your ticket to me, so it'll notify me when you attach the status output. I'll check that and reply back to you there.

