pfSense constantly crashing
-
So as the title suggests, pfSense is constantly crashing. Temperatures and CPU/RAM usage all look pretty low. I have pfSense and ntopng running on an Intel NUC i5 machine.
I have attached a screenshot of the specs. I do get a lot of log entries and wonder if this could cause issues, but after a reboot everything returns to normal.
Could you please advise me on how I could set a cron task to automatically reboot pfSense every day at, say, 4am?
Also, could you please advise me on a cron task that I could run to reload all modules in ntopng? This may not be necessary if the cron reboot takes place, but it would be good to know a command that can reload the ntopng modules from a cron task.
Any advice appreciated.
Thank you! Also attached is a log file.
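For reference, a daily 4am reboot could be sketched as the crontab entry below. This assumes the FreeBSD system crontab format (with a user column) and that the entry is added through the pfSense Cron package rather than by editing /etc/crontab directly, since direct edits may be overwritten. The commented-out ntopng line is an assumption about where the package installs its rc script; check the actual path before using it.

```
# minute hour mday month wday who  command
0        4    *    *     *    root /sbin/shutdown -r now

# Assumed rc script path for the ntopng package (verify it exists first):
# 30     3    *    *     *    root /usr/local/etc/rc.d/ntopng.sh restart
```

A scheduled reboot only hides the symptom, so it is best treated as a stopgap while the cause is tracked down.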
-
Finish the setup of ntopng.
Right after the system boot, it complains with:
Oct 12 00:26:30 root 60608 Bootup complete
Oct 12 00:26:30 ntopng 62455 [HTTPserver.cpp:1254] ERROR: [HTTP] set_ports_option: cannot bind to 3000s: Address already in use
Oct 12 00:26:30 ntopng 62455 [mongoose.c:4642] ERROR: set_ports_option: cannot bind to 3000s: No error: 0
Oct 12 00:26:30 ntopng 62455 [HTTPserver.cpp:1544] ERROR: Unable to start HTTP server (IPv4) on ports 3000s
Oct 12 00:26:30 ntopng 62455 [HTTPserver.cpp:1550] ERROR: Either port in use or another ntopng instance is running (using the same port)
=> it wants to use ports on interfaces (addresses) that are already in use (by another web server ;) )
Solution: change them.
To test system stability: disable or remove ntopng first.
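Before changing the port, it can help to see which process is actually holding it. A minimal sketch, assuming shell access to the firewall (console shell or SSH) and FreeBSD's `sockstat`; the helper name `listeners_on_port` is purely illustrative and not part of pfSense:

```shell
# List listening IPv4/IPv6 sockets bound to the given port (FreeBSD sockstat).
# listeners_on_port is a hypothetical helper name, not a pfSense command.
listeners_on_port() {
    sockstat -4 -6 -l -p "$1"
}
```

Calling `listeners_on_port 3000` should show the owning user, command and PID for anything listening on port 3000.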
-
@gertjan I think this is an issue with ntopng, because it does successfully bind to 3000. I think it tries to bind again after it's already bound. Also, there is nothing else running on port 3000.
I seem to be getting issues where the pfSense web server becomes unreachable, but the internet is still up and all other services seem to work flawlessly.
Just no HTTP or HTTPS access
-
@deanfourie said in pfSense constantly crashing:
because it does successfully bind to 3000, I think it tries to bind again after it's already bound. Also there is nothing else running on port 3000.
Something like this: a first instance is launched and crashes, but keeps port '3000' bound. A second instance is launched and refuses to start because of the first instance.
Can you see the first instance under Diagnostics > System Activity?
-
@gertjan I get this multiple times, like almost 50 times. I should add that I am also having huge issues with ntopng plugins crashing.
94319 root 20 0 304M 250M uwait 2 0:00 0.10% /usr/local/bin/ntopng -d /var/db/ntopng -G /var/run/ntopng.pid -s -e -w 0 -W 3000 -i em0 -i ovpnc1 -i ue0 --dns-mode 0 --local-networks 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8{ntopng}
0 root -16 - 0B 704K swapin 0 0:16 0.00% [kernel{swapper}]
94319 root 20 0 304M 250M uwait 0 0:03 0.00% /usr/local/bin/ntopng -d /var/db/ntopng -G /var/run/ntopng.pid -s -e -w 0 -W 3000 -i em0 -i ovpnc1 -i ue0 --dns-mode 0 --local-networks 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8{ntopng}
94319 root 20 0 304M 250M uwait 1 0:02 0.00% /usr/local/bin/ntopng -d /var/db/ntopng -G /var/run/ntopng.pid -s -e -w 0 -W 3000 -i em0 -i ovpnc1 -i ue0 --dns-mode 0 --local-networks 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8{ntopng}
94319 root 23 0 304M 250M uwait 1 0:01 0.00% /usr/local/bin/ntopng -d /var/db/ntopng -G /var/run/ntopng.pid -s -e -w 0 -W 3000 -i em0 -i ovpnc1 -i ue0 --dns-mode 0 --local-networks 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8{ntopng}
94319 root 20 0 304M 250M nanslp 3 0:01 0.00% /usr/local/bin/ntopng -d /var/db/ntopng -G /var/run/ntopng.pid -s -e -w 0 -W 3000 -i em0 -i ovpnc1 -i ue0 --dns-mode 0 --local-networks 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8{ntopng}
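Every ntopng row above carries the same PID (94319), which suggests these are threads of one process rather than dozens of separate instances. A minimal sketch for counting distinct ntopng PIDs in saved process-listing text, assuming one row per line with the PID in the first column; `distinct_ntopng_pids` is a hypothetical helper name:

```shell
# Read process-listing rows on stdin and print each distinct PID
# whose row mentions ntopng. Assumes the PID is the first column.
distinct_ntopng_pids() {
    awk '$1 ~ /^[0-9]+$/ && /ntopng/ { print $1 }' | sort -u
}
```

With the listing saved to a file, `distinct_ntopng_pids < activity.txt | wc -l` gives the number of separate ntopng processes. More than one would point to a genuine second instance.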
-
I can't figure out why everything keeps crashing. I'm starting to lean toward a kernel panic. It's the only thing that makes sense at this point.
-
I assume you never see a crash report after it reboots?
Can you log the console output?
Nothing shown at all looks more like a hardware issue. ntopng trying to start twice is common and doesn't usually cause a problem. What do you mean by 'ntopng plugins'?
If I had to guess what the issue is I'd say it's because you're running a Realtek USB NIC.
Steve
-
@stephenw10 Correct, there appears to be no crash report.
@Gertjan and yes correct, a Realtek USB NIC. Could this really cause something so catastrophic like a kernel panic?
-
This morning it has crashed again. I still have internet but cannot reach any of the web services.
Not sure at this stage if it's a kernel panic or some kind of internal DDoS on the web services.
I can confirm there is an active listening socket on port 80; I just cannot connect.
Starting Nmap 7.92 ( https://nmap.org ) at 2021-10-13 09:10 New Zealand Daylight Time
NSE: Loaded 155 scripts for scanning.
NSE: Script Pre-scanning.
Initiating NSE at 09:10
Completed NSE at 09:10, 0.00s elapsed
Initiating NSE at 09:10
Completed NSE at 09:10, 0.00s elapsed
Initiating NSE at 09:10
Completed NSE at 09:10, 0.00s elapsed
Initiating ARP Ping Scan at 09:10
Scanning 172.16.101.1 [1 port]
Completed ARP Ping Scan at 09:10, 0.06s elapsed (1 total hosts)
Initiating Parallel DNS resolution of 1 host. at 09:10
Completed Parallel DNS resolution of 1 host. at 09:10, 0.05s elapsed
Initiating SYN Stealth Scan at 09:10
Scanning 172.16.101.1 [1000 ports]
Discovered open port 80/tcp on 172.16.101.1
Discovered open port 443/tcp on 172.16.101.1
Discovered open port 53/tcp on 172.16.101.1
Discovered open port 3000/tcp on 172.16.101.1
Completed SYN Stealth Scan at 09:10, 4.95s elapsed (1000 total ports)
Initiating Service scan at 09:10
-
If you have internet then pfSense has not crashed.
If there was a kernel panic it would almost always create a crash report. The only time it doesn't is when the panic is caused by a driver failure of some sort so it can't write out the report.
Usually if there's no crash report it's because the device hard rebooted which is probably hardware failure.
The only other thing is that if you have RAM disks enabled, they are lost on reboot, so it will lose any crash data.
Have you actually seen a kernel panic?
If you can't connect to the webgui try resetting php and the webconfigurator at the console, options 16 then 11 in the menu.
It looks like you don't have SSH enabled, so that's the first thing I would do. Then you can access it that way.
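Once SSH is available, the same two console menu actions can be run from the shell. A minimal sketch, assuming the standard pfSense scripts `/etc/rc.php-fpm_restart` and `/etc/rc.restart_webgui` are what menu options 16 and 11 invoke; the function name and the `RC_DIR` override are hypothetical and exist only to keep the sketch testable:

```shell
# Restart PHP-FPM, then the webConfigurator, mirroring console options 16 and 11.
# RC_DIR is a hypothetical override; it defaults to /etc on a real system.
restart_webgui_stack() {
    rc_dir="${RC_DIR:-/etc}"
    "$rc_dir/rc.php-fpm_restart" && "$rc_dir/rc.restart_webgui"
}
```

Running `restart_webgui_stack` as root should bring the GUI back if only PHP or the web server has hung. If it does not, the problem probably sits lower down.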
Steve
-
@stephenw10 Ok, I also have DHCP discovers sent out with no reply offers.
-
@stephenw10 What logs should I pull from pfSense?
-
The main system log would contain any errors if anything is logged.
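A quick way to pull the interesting lines out of that log from a shell is sketched below. It assumes the log is a plain-text file at `/var/log/system.log` (older pfSense releases used a circular log format that has to be read with `clog` instead), and the search patterns are only illustrative guesses at what might be relevant; `scan_system_log` is a hypothetical helper name:

```shell
# Print log lines mentioning panics, fatal errors, watchdog events or timeouts.
# Default path assumes a plain-text system log; pass another file as $1 if needed.
scan_system_log() {
    grep -iE 'panic|fatal|watchdog|timeout' "${1:-/var/log/system.log}"
}
```

Running `scan_system_log` shortly after the GUI becomes unreachable, and again after a reboot, makes it easier to compare what was logged around the failure.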
-
@deanfourie said in pfSense constantly crashing:
Could this really cause something so catastrophic like a kernel panic?
I advise you to make this a priority task:
Have a look at what's been said about 'realtek' for 'serious' applications like routers.
I've no solid proof, but there is this common knowledge that you should stay away from this brand, just to be on the safe side.
Realtek over USB? That's like playing Russian roulette with 5 bullets in the 6 chambers, instead of one bullet.
Ethernet over USB: that's just a big no-no in your situation. If it works, ok, good for you. But that kind of hardware should be removed if you suspect issues.
So: first go native, classic bare bones: a device with two (or more) real NICs. Test drive that. If there are still issues, then you know the device (drive or motherboard or power) has an issue.
Don't do tests with realtek or USB NICs nearby.