Web GUI crashes after upgrade from 22.05 to 23.01
-
@dyk-dike
Looking at those dumps, our machines do not have much in common.
We are both using an i5, of different generations but with similar features.
You have 4 GB of RAM and I have 8 GB.
Your NICs are two Intel(R) PRO/1000 PT 82571EB/82571GB (em driver) and a Realtek 8168/8111 (re driver), whereas mine are Intel I211 (igb driver).
The kernel trap, however, is the same: a memcpy to an invalid memory location, triggered by nginx sendfile through a tcp->ipsec call.
Why this is happening on FreeBSD 14 and not on FreeBSD 13 I am not 100% sure, but here is what I could find out:
The sendfile(2) system call, which sends a file directly to a socket, was modified by a change from Netflix to use unmapped mbufs instead of regular mbufs (revision 349529, since FreeBSD 12).
What are unmapped mbufs?
Quote from the above revision:
Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages. It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.
...
NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability. This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.
If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output. If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.
Unmapped mbufs are also used by KTLS, a facility that allows the kernel to perform Transport Layer Security (TLS) framing on TCP sockets. I am not sure whether KTLS is enabled in pfSense by default or whether it could be part of the issue.
What has changed from FreeBSD 13 to 14 is that the FreeBSD 14 NIC drivers support unmapped mbufs, as shown by the NOMAP flag in ifconfig, which was not there in FreeBSD 13 (pfSense 22.05).
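A quick way to see whether a given interface advertises the capability is to look for the flag in the options line that ifconfig prints. The following is only a minimal sketch: the helper name has_unmapped_cap is made up for illustration, and accepting MEXTPG as an alternative spelling of NOMAP is an assumption based on the rename mentioned later in this post.

```shell
# has_unmapped_cap: reads `ifconfig <iface>` output on stdin and succeeds
# if the options=...<...> line lists NOMAP (or MEXTPG, assumed to be the
# newer spelling of the same capability).
has_unmapped_cap() {
    grep -Eq 'options=[0-9a-f]+<([^>]*,)?(NOMAP|MEXTPG)(,[^>]*)?>'
}
```

Usage would be something like: ifconfig igb0 | has_unmapped_cap && echo "igb0 advertises unmapped mbufs" (igb0 is just an example interface name).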
Why some hardware crashes and other hardware doesn't, I am not sure, but it may have to do with a particular combination of hardware and NIC drivers, which is why I was interested in seeing your config.
This is the rationale behind the possible workarounds: disable sendfile in nginx; disable the use of unmapped mbufs altogether with kern.ipc.mb_use_ext_pgs=0 (but this will affect KTLS as well); or possibly disable unmapped mbufs in the NIC driver only, by adding the -mextpg flag (this is what -nomap was renamed to) to ifconfig. This last option, which I haven't tried yet, would be like going back to the FreeBSD 13 situation, without NOMAP on the NIC drivers.
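The two runtime options could be sketched as follows. This is not a tested fix: the function names and the DRY_RUN switch are made up for illustration, the interface name is whatever you pass in, and the ifconfig variant is the one described above as not yet tried. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# run: execute a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option A: stop the kernel from using unmapped mbufs at all.
# This also affects KTLS.
disable_unmapped_mbufs() {
    run sysctl kern.ipc.mb_use_ext_pgs=0
}

# Option B (untried in this thread): drop the capability on one NIC only.
# $1 is the interface name, e.g. igb0.
disable_mextpg_on() {
    run ifconfig "$1" -mextpg
}
```

For example, DRY_RUN=1 followed by disable_mextpg_on igb0 shows the command without changing anything; a setting applied this way does not persist across a reboot.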
-
The above system patch fixes the issue on both of my firewalls.
-
Are these installations using UFS or ZFS for the filesystem?
-
@jjstecchino said in Web GUI crashes after upgrade from 22.05 to 23.01:
That sysctl disables the use of unmapped mbufs. While setting that sysctl to 0 would solve the problem, disabling unmapped mbufs altogether MAY slow down your firewall as well. Not sure about this last statement, as I haven't done any speed testing with kern.ipc.mb_use_ext_pgs=0.
According to one of our developers, disabling sendfile is slower than disabling unmapped mbufs (kern.ipc.mb_use_ext_pgs=0). If both prevent the crash, the sysctl change is preferable as a workaround until we find a permanent fix.
-
@jimp
Mine is ZFS and EFI
-
Good to know as setting that sysctl will survive an update whereas editing /etc/inc/system.inc will not
-
@jjstecchino said in Web GUI crashes after upgrade from 22.05 to 23.01:
Good to know as setting that sysctl will survive an update whereas editing /etc/inc/system.inc will not
You can set up the diff to auto-apply in the System Patches package, which would let it persist between updates.
But you can also toss kern.ipc.mb_use_ext_pgs=0 into /boot/loader.conf.local if you want, which ensures it's set at boot time before anything else. Setting it in the GUI tunables is fine, though. It just gets set a bit later in the boot process, though it should still be early enough that the GUI would be OK.
-
@jjstecchino I started https://redmine.pfsense.org/issues/13938 to track this. Thanks for the very detailed analysis!
(Even if it did use svn instead of git)
-
ZFS
-
The fix below works on my 2 firewalls. I will test the second fix (kern.ipc.mb_use_ext_pgs=0) tonight on both my firewalls and report back.
===
And then you can use the System Patches package to disable sendfile:
diff --git a/src/etc/inc/system.inc b/src/etc/inc/system.inc
index d36efc2fca..b7cda99366 100644
--- a/src/etc/inc/system.inc
+++ b/src/etc/inc/system.inc
@@ -1380,7 +1380,7 @@ http {
 	add_header X-Frame-Options SAMEORIGIN;
 	server_tokens off;
 
-	sendfile on;
+	sendfile off;
 
 	access_log syslog:server=unix:/var/run/log,facility=local5 combined;
 
After applying that patch, use the console menu option to restart the GUI (11).
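To confirm the change took effect after the restart, one could check the generated nginx configuration for the directive. A minimal sketch: the helper name sendfile_is_off is made up, and the config path in the usage note is an assumption (it is not stated in this thread), so adjust it to wherever the GUI's nginx config lives on your install.

```shell
# sendfile_is_off FILE: succeed if FILE contains an active "sendfile off;"
# directive. Commented-out lines do not match, because only whitespace is
# allowed before the directive.
sendfile_is_off() {
    grep -Eq '^[[:space:]]*sendfile[[:space:]]+off[[:space:]]*;' "$1"
}
```

Usage would be something like: sendfile_is_off /var/etc/nginx-webConfigurator.conf && echo "sendfile is disabled" (path assumed).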
-