Web GUI crashes after upgrade from 22.05 to 23.01
-
Do you have something behind it you could try accessing it via? Something you could remote desktop to maybe?
-
@stephenw10 Not at the moment. Maybe tomorrow I can remote into my son's MacBook and try local access.
I don't know if it matters, but the problem firewall is running both IPv4 and IPv6.
-
The IPSec tunnel is IPv4 only though?
-
@stephenw10 yes
-
@stephenw10 Ok, local login to the firewall works without crashes. So it is the combination of logging in to the GUI through the ipsec vpn that is causing the problem.
Weirdly enough, I can access webcams and ssh through the tunnel without issues.
The configuration is the same I was using on 22.05 without issues.
-
Ok, that's good info. Our devs are looking into this. I'll try to replicate it...
-
@stephenw10 Were you guys able to replicate the issue?
I may be able to go to the other house this weekend. Should I try a fresh reinstall?
-
I haven't replicated it yet.
Do you have SWAP enabled on that device? What size if so? Getting a full core dump from that would be useful.
Steve
-
@stephenw10
/root: swapinfo
Device 1K-blocks Used Avail Capacity
/dev/ada0p3 1048576 0 1048576 0%
How do I get you a full core dump?
-
If you install the debug kernel with:
[23.01-RC][root@6100.stevew.lan]/root: pkg install pfSense-kernel-debug-pfSense
Updating pfSense-core repository catalogue...
pfSense-core repository is up to date.
Updating pfSense repository catalogue...
pfSense repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):
New packages to be INSTALLED:
    pfSense-kernel-debug-pfSense: 23.01.b.20230106.0600 [pfSense-core]
Number of packages to be installed: 1
The process will require 709 MiB more space.
145 MiB to be downloaded.
Proceed with this action? [y/N]: y
[1/1] Fetching pfSense-kernel-debug-pfSense-23.01.b.20230106.0600.pkg: 100% 145 MiB 5.2MB/s 00:29
Checking integrity... done (0 conflicting)
[1/1] Installing pfSense-kernel-debug-pfSense-23.01.b.20230106.0600...
[1/1] Extracting pfSense-kernel-debug-pfSense-23.01.b.20230106.0600: 100%
Then when you reboot you can select that by hitting option 6 at the boot loader menu. However if you only have remote access that could be a problem.
Steve
-
Still failing to replicate this.
What IPSec config are you using?
What firewall rules do you have?
-
@stephenw10 I will be at the other house this weekend and I am going to try to get core dumps with the debug kernel.
As far as the IPsec config goes, it is a tunnel between the two firewalls.
Phase1: Key exchange is IKEv2, protocol is IPv4 only, auth is Mutual PSK, identifiers are the IP addresses. Encryption is AES 256, SHA256, DH 14, lifetime 31680 sec, rekey time at the default 90% of lifetime.
Phase2: mode is Tunnel IPv4, local network is my LAN subnet, remote network is set to the remote network ip/24, no NAT. Key exchange/SA mode is ESP, encryption is AES256-GCM 128 bits, PFS key group 14, lifetime 5400 sec, rekey time 4860 sec.
Once I get to the other house (the remote pfSense), the first thing I want to try is to access the firewall GUI of my primary pfSense through the IPsec VPN and see if it crashes or not. The two firewalls run on different hardware with different packages installed.
I will then try to get you core dumps. I will try to see if going back to a default config with just the ipsec configuration makes any difference.
Anything else I should try?
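As a quick sanity check on the numbers above, here is a minimal sketch of the rekey arithmetic. It assumes rekey time is simply 90% of the lifetime, as stated for Phase 1; the function name `rekey_time` is made up for illustration and is not part of pfSense.

```python
def rekey_time(lifetime_s: int, percent: int = 90) -> int:
    """Return the rekey time in seconds as a percentage of the SA lifetime.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    return lifetime_s * percent // 100


# Lifetimes taken from the Phase 1 / Phase 2 settings described above.
phase1_rekey = rekey_time(31680)  # 28512
phase2_rekey = rekey_time(5400)   # 4860, matches the configured Phase 2 rekey time
print(phase1_rekey, phase2_rekey)
```

So the Phase 2 rekey time of 4860 sec is the same 90% ratio as the Phase 1 default.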
-
Do you have any special gateways/route table entries setup which refer to the IPsec network(s)? Or any other config in other areas of the firewall itself that relates to the tunnel outside of the IPsec config you described?
Things like this sort of setup:
https://docs.netgate.com/pfsense/en/latest/vpn/ipsec/access-firewall-over-ipsec.html
-
@jimp yes I had to add gateways and static routes on both end points
-
Did it just fail to connect without those static routes? I assume those changes were added in an earlier version, before 22.05?
-
@stephenw10 as we discussed it is not a failed connection. The firewall reboots with a kernel fault. Yes those gateways and static routes were added before 22.05. Is that workaround not needed anymore?
-
@jjstecchino said in Web GUI crashes after upgrade from 22.05 to 23.01:
Is that workaround not needed anymore?
Maybe not. At least for incoming connections like this. In my testing here I found I did not need it to access the webgui on a remote firewall. I'm just trying to determine if that's specific to my setup. Though I doubt it is since my setup is very basic.
I haven't been able to replicate the crash with or without the static routes though.
-
Ok, this is puzzling to me. I am now at the remote location, and if I access the GUI of my primary home firewall through the IPsec tunnel it works just fine. The two firewalls have different hardware and different update paths. I am going to try a clean full install.
-
Here is what I did once I had physical access to my remote pfsense:
Reinstalled 2.6 CE from scratch, kept the default config but set LAN IPv4 to my network xxx.xxx.50.0/24. Let's call this my now-local LAN.
Updated to plus 22.01, kept default config
Updated to plus 22.05, default config
Updated to 23.01, default config.
Created an IPsec VPN tunnel to my primary firewall. Let's call this the now-remote LAN, with network xxx.xxx.100.0/24.
Created a domain override in unbound to forward DNS requests for the domain on the remote network.
I could ping remote hosts from my local clients but not from pfsense so ....
I needed to add a gateway to the remote network with a static route to xxx.xxx.100.0
IPSEC VPN working properly, can access remote network from any local clients and from pfsense.
Access to the local pfsense GUI gives no problems.
Connected to a windows pc on the xxx.xxx.100.0 network and tried to access the pfsense GUI on xxx.xxx.50.0 network and as before pfsense has a kernel fault.
restored my configuration file which adds a few DHCP static mapping
Installed debug kernel
rebooted to debug kernel
Retried connecting to the pfSense GUI from the remote PC.
Kernel fault. I see the kernel dump on the local console. In /var/crash there is a textdump.tar.0 but no vmcore dump.
Anything else I can do to help troubleshooting?
-
Looking at the debug trace
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff81334a0a
stack pointer = 0x28:0xfffffe00d2378560
frame pointer = 0x28:0xfffffe00d2378560
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 36274 (nginx)
rdi: fffff80106c34113 rsi: 0 rdx: 42c
rcx: 42c r8: 1 r9: fffff80106c61e00
rax: fffff80106c34113 rbx: fffff8005f313100 rbp: fffffe00d2378560
r10: 1 r11: 0 r12: fffff80106c61e00
r13: 0 r14: fffff8005f2a5e00 r15: 1
trap number = 12
panic: page fault
cpuid = 2
time = 1673708059
KDB: enter: panic

0xffffffff80fbe3c4 at tcp_default_output+0x2094
#10 0xffffffff80fd406e at tcp_usr_ready+0x11e
#11 0xffffffff80d7e39e at sendfile_iodone+0x23e
#12 0xffffffff80d7db68 at vn_sendfile+0x1868
#13 0xffffffff80d7e877 at sys_sendfile+0xf7
#14 0xffffffff813393be at amd64_syscall+0x12e
#15 0xffffffff8130c72b at fast_sy
It seems sendfile may be causing the fault.
Just to test, I disabled sendfile in the nginx configuration by editing the function system_generate_nginx_config in /etc/inc/system.inc and setting sendfile to off.
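For anyone reading a similar trace, a rough sketch of how the backtrace frames can be pulled out and filtered for sendfile. The sample frames are the ones from the trace above; the regex and the helper names `parse_frames` and `frames_matching` are my own illustration and assume the usual "#N 0xADDR at symbol+0xOFF" layout of FreeBSD panic backtraces.

```python
import re

# Matches lines such as "#11 0xffffffff80d7e39e at sendfile_iodone+0x23e".
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-f]+)\s+at\s+([^\s+]+)\+(0x[0-9a-f]+)")

# Frames copied from the panic output above.
SAMPLE = """\
#10 0xffffffff80fd406e at tcp_usr_ready+0x11e
#11 0xffffffff80d7e39e at sendfile_iodone+0x23e
#12 0xffffffff80d7db68 at vn_sendfile+0x1868
#14 0xffffffff813393be at amd64_syscall+0x12e
"""


def parse_frames(text):
    """Return (frame_number, symbol) pairs for every frame found in text."""
    return [(int(num), sym) for num, _addr, sym, _off in FRAME_RE.findall(text)]


def frames_matching(text, needle):
    """Return the symbols of all frames whose name contains needle."""
    return [sym for _num, sym in parse_frames(text) if needle in sym]


print(frames_matching(SAMPLE, "sendfile"))
```

The sendfile frames sit directly below the TCP output path, which is why sendfile looks like the place to start.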
This solves the issue.
Now I can access the GUI remotely without kernel faults
Interestingly, the other firewall (Xeon D-1518, 32 GB RAM) doesn't have a problem with sendfile on.
Another workaround is to set the sysctl kern.ipc.mb_use_ext_pgs = 0 and leave sendfile on in nginx config.
This solves the issue as well.
It seems to be related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254419 which was marked as fixed
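If someone wants to script the sysctl workaround, here is a minimal sketch. It only wraps the stock `sysctl` command; the helper names (`read_cmd`, `write_cmd`, `disable_ext_pgs`) are hypothetical, and the `run` parameter exists only so the logic can be exercised without touching a real system. A value set this way is presumably lost on reboot, so it would still need to be added as a system tunable to persist.

```python
import subprocess

OID = "kern.ipc.mb_use_ext_pgs"  # the sysctl named in the workaround above


def read_cmd(oid=OID):
    """Command that prints only the current value of the sysctl."""
    return ["sysctl", "-n", oid]


def write_cmd(value, oid=OID):
    """Command that sets the sysctl to the given value."""
    return ["sysctl", f"{oid}={value}"]


def disable_ext_pgs(run=subprocess.run):
    """Set the sysctl to 0 unless it already is 0.

    Returns True if a change was made, False if it was already 0.
    Needs root on the firewall when used with the real subprocess.run.
    """
    result = run(read_cmd(), capture_output=True, text=True, check=True)
    if result.stdout.strip() == "0":
        return False
    run(write_cmd(0), check=True)
    return True
```

Calling `disable_ext_pgs()` from a root shell on the firewall would apply the change; sendfile can then stay on in the nginx config.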