Crash after upgrade to 2.7
-
Hello dear community,
I have been having problems since updating to version 2.7 and I can't find a solution; it is driving me crazy.
Since I switched to the latest version, my CPU runs at 100% (IDLE), and as soon as I get a little traffic or several people initiate connections, the system saturates: VPN connections freeze, filtering slows down, the WebConfigurator no longer responds...
The only temporary workaround so far is to restart php-fpm from the console.
This brings filtering and the VPN connections back into production, but it does not last long. After a while (a few hours) the system freezes and it is impossible to do anything; even the console is no longer accessible, even if I try to open a TTY.
I tried uninstalling all my packages and restarting the system, but it changes nothing; the CPU is still at 100%.
I disabled all VPN servers, with the same result. I changed the advanced options to maximize compatibility as recommended here https://docs.netgate.com/pfsense/en/latest/troubleshooting/high-cpu-load.html but saw no change. I checked the logs but nothing looks suspicious.
My config:
Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
Current: 796 MHz, Max: 2100 MHz
48 CPUs : 2 package(s) x 12 core(s) x 2 hardware threads
AES-NI CPU Crypto: Yes (inactive)
IPsec-MB Crypto: Yes (inactive)
QAT Crypto: No
8 Broadcom interfaces (6 used for different LANs)
4 Intel interfaces (3 used for different WANs)
Installed packages:
- ACME
- pfBlockerNG
- Snort
- openvpn-client-export
Services in Use:
- bsnmpd
- captiveportal
- dhcpd
- dpinger
- ladvd
- ntpd
- openvpn (3 servers)
- pfb_dnsbl
- pfb_filter
- snort
- sshd
- syslogd
- unbound
top -aSH result:
last pid: 4283; load averages: 0.48, 0.47, 0.35 up 0+10:17:03 08:07:07
1412 threads: 49 running, 1253 sleeping, 110 waiting
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.9% idle
Mem: 806M Active, 439M Inact, 2413M Wired, 56K Buf, 89G Free
ARC: 323M Total, 119M MFU, 194M MRU, 294K Anon, 1479K Header, 7831K Other
     204M Compressed, 584M Uncompressed, 2.86:1 Ratio
Swap: 1024M Total, 1024M Free

  PID USERNAME PRI NICE   SIZE    RES STATE    C    TIME    WCPU COMMAND
   11 root     187 ki31     0B   768K CPU43   43  615:03 100.00% [idle{idle: cpu43}]
   11 root     187 ki31     0B   768K CPU9     9  614:57 100.00% [idle{idle: cpu9}]
   11 root     187 ki31     0B   768K CPU45   45  614:57 100.00% [idle{idle: cpu45}]
   11 root     187 ki31     0B   768K CPU6     6  614:54 100.00% [idle{idle: cpu6}]
   11 root     187 ki31     0B   768K CPU2     2  614:52 100.00% [idle{idle: cpu2}]
   11 root     187 ki31     0B   768K CPU20   20  614:49 100.00% [idle{idle: cpu20}]
   11 root     187 ki31     0B   768K CPU11   11  614:48 100.00% [idle{idle: cpu11}]
   11 root     187 ki31     0B   768K CPU1     1  614:45 100.00% [idle{idle: cpu1}]
   11 root     187 ki31     0B   768K CPU5     5  614:44 100.00% [idle{idle: cpu5}]
   11 root     187 ki31     0B   768K CPU4     4  614:43 100.00% [idle{idle: cpu4}]
   11 root     187 ki31     0B   768K CPU15   15  614:42 100.00% [idle{idle: cpu15}]
   11 root     187 ki31     0B   768K RUN     13  614:39 100.00% [idle{idle: cpu13}]
   11 root     187 ki31     0B   768K CPU22   22  614:36 100.00% [idle{idle: cpu22}]
   11 root     187 ki31     0B   768K CPU14   14  614:36 100.00% [idle{idle: cpu14}]
   11 root     187 ki31     0B   768K CPU12   12  614:33 100.00% [idle{idle: cpu12}]
   11 root     187 ki31     0B   768K CPU23   23  614:32 100.00% [idle{idle: cpu23}]
   11 root     187 ki31     0B   768K CPU8     8  614:31 100.00% [idle{idle: cpu8}]
   11 root     187 ki31     0B   768K CPU21   21  614:30 100.00% [idle{idle: cpu21}]
   11 root     187 ki31     0B   768K CPU28   28  614:15 100.00% [idle{idle: cpu28}]
   11 root     187 ki31     0B   768K CPU39   39  614:11 100.00% [idle{idle: cpu39}]
   11 root     187 ki31     0B   768K CPU25   25  614:08 100.00% [idle{idle: cpu25}]
   11 root     187 ki31     0B   768K CPU29   29  614:06 100.00% [idle{idle: cpu29}]
   11 root     187 ki31     0B   768K CPU26   26  614:06 100.00% [idle{idle: cpu26}]
   11 root     187 ki31     0B   768K CPU34   34  614:00 100.00% [idle{idle: cpu34}]
   11 root     187 ki31     0B   768K CPU30   30  613:59 100.00% [idle{idle: cpu30}]
   11 root     187 ki31     0B   768K CPU35   35  613:58 100.00% [idle{idle: cpu35}]
   11 root     187 ki31     0B   768K CPU37   37  613:53 100.00% [idle{idle: cpu37}]
   11 root     187 ki31     0B   768K CPU40   40  613:52 100.00% [idle{idle: cpu40}]
   11 root     187 ki31     0B   768K CPU32   32  613:50 100.00% [idle{idle: cpu32}]
   11 root     187 ki31     0B   768K CPU36   36  613:50 100.00% [idle{idle: cpu36}]
   11 root     187 ki31     0B   768K CPU38   38  613:42 100.00% [idle{idle: cpu38}]
   11 root     187 ki31     0B   768K CPU33   33  613:39 100.00% [idle{idle: cpu33}]
   11 root     187 ki31     0B   768K CPU42   42  613:07 100.00% [idle{idle: cpu42}]
   11 root     187 ki31     0B   768K CPU7     7  614:48  99.97% [idle{idle: cpu7}]
   11 root     187 ki31     0B   768K CPU31   31  614:02  99.96% [idle{idle: cpu31}]
   11 root     187 ki31     0B   768K CPU17   17  614:35  99.93% [idle{idle: cpu17}]
   11 root     187 ki31     0B   768K CPU44   44  613:06  99.81% [idle{idle: cpu44}]
   11 root     187 ki31     0B   768K CPU0     0  614:23  99.71% [idle{idle: cpu0}]
   11 root     187 ki31     0B   768K CPU16   16  614:42  99.05% [idle{idle: cpu16}]
   11 root     187 ki31     0B   768K CPU3     3  614:55  99.05% [idle{idle: cpu3}]
   11 root     187 ki31     0B   768K CPU41   41  613:52  99.04% [idle{idle: cpu41}]
   11 root     187 ki31     0B   768K CPU46   46  612:37  99.03% [idle{idle: cpu46}]
   11 root     187 ki31     0B   768K CPU27   27  614:02  99.03% [idle{idle: cpu27}]
   11 root     187 ki31     0B   768K CPU47   47  614:39  99.02% [idle{idle: cpu47}]
   11 root     187 ki31     0B   768K CPU19   19  614:41  98.93% [idle{idle: cpu19}]
   11 root     187 ki31     0B   768K CPU18   18  614:40  98.91% [idle{idle: cpu18}]
   11 root     187 ki31     0B   768K CPU24   24  613:57  98.65% [idle{idle: cpu24}]
   11 root     187 ki31     0B   768K CPU10   10  614:35  98.65% [idle{idle: cpu10}]
 4283 root      26    0    20M  7092K CPU13   13    0:00   0.45% top -aSH
Thank you in advance for your help.
-
What does it show when there is traffic passing? It's basically doing nothing at all there.
Steve
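One way to read a `top -aSH` capture like the one in the first post is to drop the kernel idle threads and look at what remains. Below is a minimal Python sketch, meant to be run on any machine against a saved copy of the output. The helper name `busy_threads` is hypothetical, and the column positions (WCPU tenth, COMMAND last) are assumed from the header row shown in that post.

```python
# Sketch: list non-idle threads from saved `top -aSH` output.
# Assumption: rows follow the header
#   PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
# so WCPU is the 10th field and COMMAND is everything after it.

# Three rows copied from the capture in the first post.
SAMPLE = """\
  PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
   11 root 187 ki31 0B 768K CPU43 43 615:03 100.00% [idle{idle: cpu43}]
   11 root 187 ki31 0B 768K CPU10 10 614:35 98.65% [idle{idle: cpu10}]
 4283 root 26 0 20M 7092K CPU13 13 0:00 0.45% top -aSH
"""


def busy_threads(top_text, min_wcpu=0.0):
    """Return (wcpu, pid, command) for every non-idle row, busiest first."""
    rows = []
    for line in top_text.splitlines():
        fields = line.split(None, 10)  # keep COMMAND (may contain spaces) whole
        if len(fields) < 11 or not fields[0].isdigit():
            continue  # summary lines, header row, blank lines
        wcpu, command = fields[9], fields[10]
        if not wcpu.endswith("%"):
            continue  # not a process row in the assumed layout
        if command.startswith("[idle"):
            continue  # kernel idle threads only show unused CPU time
        pct = float(wcpu.rstrip("%"))
        if pct >= min_wcpu:
            rows.append((pct, fields[0], command))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    for pct, pid, command in busy_threads(SAMPLE):
        print(f"{pct:6.2f}%  pid {pid}  {command}")
```

Applied to the capture in the first post, the only row left is `top -aSH` itself at 0.45%, which is why a capture taken while traffic is passing is needed.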
-
What model are the Intel NIC(s)?
-
@stephenw10 thank you for your response,
As traffic passes, the CPU interrupt load increases until saturation. As soon as it reaches about 50%, the first freezes begin and we must restart php-fpm to be able to access the WebConfigurator. The load keeps climbing to 100%; once there, nothing responds, not even the console, and the only way out is a forced shutdown.
Yet the traffic is not very heavy at the moment, since some people are still on vacation. I'm talking about twenty people working at the office, 5 VPN connections and some web services, which is about a third of the normal load...
-
@ElTigreVerde thank you for your interest,
The spec of my Intel NIC:
<Intel(R) I340 82580 (Copper)>
EEPROM V3.29-0 eTrack 0x8000027a
Using 1024 TX descriptors and 1024 RX descriptors
Using an MSI interrupt
netmap queues/slots: TX 1/1024, RX 1/1024
-
Hmm, so what's generating that load? What does top -HaSP show? Or ps -auxwwd?
How much traffic is passing? That system should be capable of passing a lot before it has much impact.
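With `-P`, top prints one CPU line per core, so interrupt load pinned to a single core stands out. Below is a rough Python sketch for ranking cores by interrupt share from a saved `top -HaSP` capture. The per-CPU line format and the sample values are assumptions for illustration, not data from this system, and the function names are hypothetical.

```python
import re

# Assumed per-CPU line format from `top -P`, e.g.
#   CPU 0:  0.0% user,  0.0% nice,  1.2% system, 48.5% interrupt, 50.3% idle
CPU_LINE = re.compile(r"^CPU\s+(\d+):\s*(.*)$")

# Invented sample values, NOT taken from the system in this thread.
SAMPLE = """\
CPU 0:  0.0% user,  0.0% nice,  1.2% system, 48.5% interrupt, 50.3% idle
CPU 1:  0.4% user,  0.0% nice,  0.8% system,  0.0% interrupt, 98.8% idle
CPU 2:  0.0% user,  0.0% nice,  0.5% system,  3.1% interrupt, 96.4% idle
"""


def per_cpu_interrupt(top_text):
    """Map CPU number -> interrupt percentage for each per-CPU line."""
    result = {}
    for line in top_text.splitlines():
        match = CPU_LINE.match(line.strip())
        if not match:
            continue  # the aggregate "CPU:" line and process rows are skipped
        cpu = int(match.group(1))
        for part in match.group(2).split(","):
            value, _, label = part.strip().partition("% ")
            if label == "interrupt":
                result[cpu] = float(value)
    return result


def hottest(interrupts, n=3):
    """Return the n CPUs with the highest interrupt share, highest first."""
    return sorted(interrupts.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    for cpu, pct in hottest(per_cpu_interrupt(SAMPLE)):
        print(f"cpu{cpu}: {pct:.1f}% interrupt")
```

If one core carries nearly all of the interrupt time while the rest stay idle, that would point at a single interrupt source rather than overall load; comparing the ranking against the thread list in the same capture shows which kernel thread is running there.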
-
That Intel NIC should be solid. Can you test with Snort and pfBlockerNG fully disabled to see if this persists? Make sure that running a Force Reload > All in pfBlockerNG returns no output.