Web GUI crashes after upgrade from 22.05 to 23.01
-
@stephenw10
Yeah, but that disables a kernel-wide optimization that may be important for letting the firewall handle network traffic efficiently. Turning off the sendfile optimization in nginx may be a better option: sendfile moves file data directly to a TCP socket without first copying it to a memory buffer. That matters for a high-traffic web server but is largely irrelevant for pfSense. As more people start to use pfSense 23.xx with FreeBSD 14, this bug may start to affect others as well.
Setting sysctl kern.ipc.mb_use_ext_pgs=0 would allow seamless updates if sendfile remains set to on in the nginx config, but it would turn off an important kernel optimization.
I would respectfully suggest considering turning off sendfile in the nginx config instead.
-
I do have one box here with igb and NOMAP showing (a Netgate 7551), but so far I haven't been able to make it crash.
That said, the only IPsec tunnel I have on there that is testable without some work is VTI, not tunnel mode.
I'll see if I can rig up a tunnel mode test on there.
-
Set up a tunnel and still no crash. I can reach the GUI LAN to LAN with a full browser and it appears to be working fine.
Do you have something enabled on the dashboard that might be contributing? Maybe the picture widget with a large image?
Usually the web server wouldn't be using sendfile for much on pfSense since it doesn't have many static things to serve and typically that gets kicked in for stuff like large pictures.
-
@jimp No, this also happened with a bare-bones default config and no widgets: a clean install plus an IPsec tunnel VPN.
-
Curious. I even tried downloading a status output and some config backups with RRD (~4MB) but it keeps chugging along.
I tried with no crypto acceleration and also with QAT enabled.
There may be something specific to that exact igb card that is different from mine.
-
@jimp my nic is <Intel(R) I211 (Copper)> port 0xd000-0xd01f mem 0xf7200000-0xf721ffff,0xf7220000-0xf7223fff at device 0.0 on pci2
-
Yeah, that's quite a bit different from this one.
igb0@pci0:0:20:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x1f41 subvendor=0x8086 subdevice=0x1f41
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xdfe60000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xf0c0, size 32, enabled
    bar   [20] = type Memory, range 64, base 0xdff2c000, size 16384, enabled
I thought I had something around with an i211, but nope. I have some i210 devices but they aren't running pfSense.
-
@jimp On my setup, turning hardware acceleration (only AES-NI is available) on or off doesn't make a difference.
In my case nginx sendfile seems to be the culprit: setting it to off solves the problem.
-
Right, I just tested that in case it was relevant since it seems to be sendfile in some combination with IPsec and your hardware since it works locally.
-
-
It's possible, though hard to say for sure. It seems similar at least.
-
-
-
I am having this exact same issue... just upgraded both my home and remote firewalls to the RC and now when I try to access either web GUI over IPsec it immediately crashes the remote side.
-
@dyk-dike
Would you mind posting the output of dmesg on the crashing firewall? I would like to compare your hardware to mine to see if there are any common threads that may help in sorting out or reproducing the issue.
For the time being I patched the problem by disabling sendfile on nginx in the remote firewall.
-
Also, it would help to have the full textdump archive from any firewall that encounters this; that will make getting the details and comparing easier.
If you are on 23.01 and can easily reproduce it, you may also want to install and boot from the debug kernel and try to trigger the crash, which will include a lot more detail in the backtrace.
And then you can use the System Patches package to disable sendfile:
diff --git a/src/etc/inc/system.inc b/src/etc/inc/system.inc
index d36efc2fca..b7cda99366 100644
--- a/src/etc/inc/system.inc
+++ b/src/etc/inc/system.inc
@@ -1380,7 +1380,7 @@ http {
 	add_header X-Frame-Options SAMEORIGIN;
 	server_tokens off;
-	sendfile on;
+	sendfile off;
 	access_log syslog:server=unix:/var/run/log,facility=local5 combined;
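A quick way to confirm which value the generated nginx config actually carries, before or after patching, is to read the directive back out of the file. The following is only a rough sketch: the helper name sendfile_state is made up for illustration, and the config path in the usage comment is an assumption about where pfSense writes the GUI config, so adjust it for your system. The change should only show up once the GUI has been restarted and the config regenerated.

```shell
#!/bin/sh
# sendfile_state (hypothetical helper): print the value of every active
# "sendfile" directive in the nginx config file given as the first argument.
sendfile_state() {
    # Only lines whose first word is exactly "sendfile" match, so commented
    # lines such as "# sendfile on;" are skipped. The trailing ";" is dropped.
    awk '$1 == "sendfile" { sub(/;.*$/, "", $2); print $2 }' "$1"
}

# Usage on the firewall (path is an assumption, not confirmed in this thread):
#   sendfile_state /var/etc/nginx-webConfigurator.conf
```

If this prints on after the patch has been applied, the GUI most likely has not been restarted yet.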
After applying that patch, use the console menu option to restart the GUI (11).
-
@jjstecchino
Is this how I disable sendfile?
kern.ipc.mb_use_ext_pgs=0
-
@dyk-dike
No, to disable sendfile you do what jimp just said above, either by using the System Patches package with that diff or by manually editing /etc/inc/system.inc.
That sysctl disables the use of unmapped mbufs. Setting it to 0 would solve the problem, but because it disables unmapped mbufs altogether it MAY slow down your firewall as well. I am not sure about this last statement, as I haven't done any speed testing with kern.ipc.mb_use_ext_pgs=0.
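For anyone who wants to see where that sysctl currently stands before deciding, here is a small sketch. The helper name ext_pgs_status is made up for illustration, and it assumes the sysctl only ever reports 0 or 1; the actual sysctl commands are left as comments because they only make sense on the firewall itself.

```shell
#!/bin/sh
# ext_pgs_status (hypothetical helper): translate the value of
# kern.ipc.mb_use_ext_pgs, passed as the first argument, into plain words.
ext_pgs_status() {
    case "$1" in
        0) echo "unmapped mbufs disabled" ;;
        1) echo "unmapped mbufs enabled" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# On the firewall (FreeBSD), read the live value:
#   ext_pgs_status "$(sysctl -n kern.ipc.mb_use_ext_pgs)"
# To apply the workaround at runtime, as root:
#   sysctl kern.ipc.mb_use_ext_pgs=0
```

A value set this way does not survive a reboot, so it would also have to be added as a system tunable to persist.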
I would personally do what jimp suggested.
Please post the crash dump (textdump.zip) so that we can compare my hardware to yours.
Thanks
-
-
@dyk-dike
I realized I have never PM'd anybody in this forum. I may be missing something obvious, but I can't find a PM function. Here you go... my email is jjstecchino at yahoo.com.
I looked at my crash dump and I don't really see any sensitive info. I may have missed it, but I don't even see the IP addresses of my interfaces. There is a list of running processes though...
-
@dyk-dike
Looking at those dumps, our machines do not have much in common.
We are both using an i5, different generations but with similar features.
You have 4 GB and I have 8 GB of RAM.
Your NICs are two Intel(R) PRO/1000 PT 82571EB/82571GB (em driver) and a Realtek 8168/8111 (re driver), whereas mine are Intel I211 (igb driver).
The kernel trap, however, is the same: a memcpy to an invalid memory location, triggered by nginx sendfile through a tcp->ipsec call.
Why this happens on FreeBSD 14 and not on FreeBSD 13 I am not 100% sure, but here is what I could find out:
The sendfile(2) system call, which sends a file directly to a socket, was modified by a change from Netflix to use unmapped mbufs instead of regular mbufs (Revision 349529, since FreeBSD 12).
What are unmapped mbufs?
Quote from the above revision:
Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages. It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.
...
NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability. This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.
If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output. If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.Unmapped mbuf are also used by KTLS which is a facility that allows the kernel to perform Transport Layer Security (TLS) framing on TCP sockets. I am not sure if ktls is enabled in pfsense by default or if this could be part of the issue.
What has changed form FreeBSD 13 to 14 is that the 14 nic drivers support the use of unmapped mbuf as shown by the NOMAP flag in ifconfig which was not there in FreeBSD 13 (pfsense 22.05).
Why some hardware crash and some other don't, I am not sure but it may have to do with a particular combination of hardware and nic drivers that is why I was interested at seeing your config.
This is what is behind the rationale of disabling sendfile in nginx, disable use of unmapped mbuf altogether with kern.ipc.mb_use_ext_pgs=0 but this is going to affect ktls as well or possibly disable unmapped mbuf use in the nic driver by adding the -mextpg (this is what -nomap was renamed to) flag to ifconfig. This last option which I haven't tried yet, would be like going back to the FreeBSD 13 situation without the NOMAP on the nic drivers.
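For the untested per-interface option, a first step would be finding out which interfaces advertise the capability at all. The sketch below is only an illustration: the helper name nomap_ifaces is made up, and it assumes the capability appears as NOMAP or MEXTPG inside the options=...<...> line of ifconfig output, which may differ between versions.

```shell
#!/bin/sh
# nomap_ifaces (hypothetical helper): read ifconfig output on stdin and print
# the name of every interface whose options line lists NOMAP or MEXTPG.
nomap_ifaces() {
    awk '
        # An interface header starts in column 1, e.g. "igb0: flags=...".
        /^[^ \t]+: / { iface = $1; sub(/:$/, "", iface) }
        # Capabilities are listed between < and >, separated by commas.
        /^[ \t]+options=/ && /[<,](NOMAP|MEXTPG)[,>]/ { print iface }
    '
}

# Usage on the firewall:
#   ifconfig | nomap_ifaces
# Then, for each interface printed (untested, as noted above), as root:
#   ifconfig igb0 -mextpg
```

Like the sysctl, a flag changed with ifconfig at runtime would not persist across reboots or interface reconfiguration.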