HP T730 help please
-
Hello all, I got a new T730 and installed an X710-T2L in it. It has a 64GB NVMe drive and 8GB of RAM.
My problem is that it regularly sits at 20% CPU usage, which is fine, but it also spikes to 100% CPU usage a lot. It seems to be Python 3.9 and idle (idle: cpu xyz) using a lot of CPU.
Is this normal?
UPDATE
The box is still behaving as I described above, but it is now also freezing after a few hours and I have to hard reset it. I read that the Realtek NIC might be the culprit. I installed the newer driver but don't know how to enable it.
-
What packages do you have enabled?
How much traffic is it filtering?
It's running an AMD RX-427BB?
Steve
-
All I have installed is ntopng and AdGuard Home, but AdGuard is disabled right now.
It's really weird: it'll be running fine and then just stop working, and I have to unplug the power and reboot for it to come back up. It seems to last 3-4 hours before crashing.
Yes, it has the AMD CPU.
WAN traffic is 500 Mbps/50 Mbps
Max 20 clients connected
It seems to crash when there isn't any traffic or much load
-
Check the system logs for watchdog errors from the Realtek driver.
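One way to run that check against a copy of the log, as a rough sketch only. It assumes the Realtek ports attach through the FreeBSD `re` driver, that the driver reports the problem with lines of the form `re0: watchdog timeout`, and that the log is plain text at `/var/log/system.log`. The function name `find_watchdog_lines` is made up for illustration.

```python
import re

# Assumption: Realtek ports show up as re0, re1, ... and log "watchdog timeout".
WATCHDOG = re.compile(r"\bre\d+: watchdog timeout", re.IGNORECASE)


def find_watchdog_lines(lines):
    """Return the log lines that look like Realtek watchdog timeouts."""
    return [line.rstrip("\n") for line in lines if WATCHDOG.search(line)]


if __name__ == "__main__":
    # Assumed log path; adjust it if the logs live somewhere else.
    try:
        with open("/var/log/system.log", errors="replace") as log:
            hits = find_watchdog_lines(log)
    except OSError as exc:
        print(f"could not read log: {exc}")
    else:
        print(f"{len(hits)} watchdog line(s) found")
        for hit in hits:
            print(hit)
```

No matches would fit the later reports in this thread that nothing shows up in the logs, and would point away from the Realtek port.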
-
@stephenw10
I've looked through the logs but can't find any that show errors. Is there a way to just disable the Realtek NIC to see if it is the culprit?
-
If you're not using it then it can't be the problem. Typically a Realtek NIC will stop responding but the rest of the firewall still functions.
Do you see any sort of crash report? Does the console still respond?
Steve
-
@stephenw10
No crash report or logs that I can find. When it crashes, the web GUI stops working and I can't SSH in anymore.
Temperatures never get over 40°C and RAM usage is never above 10%.
-
Have you tried connecting to the console when that happens?
Or logging the console output would be better if you can. There are some errors that only appear on the console.
-
-
Could be either (or both!). It depends how it was installed. If it has a serial port and you enable the serial console, you can log its output in a terminal client, which is useful.
It's probably using the video console though, so just connect a monitor/keyboard and see if it's still responding there.
Steve
-
@stephenw10
Nothing was responding, over either the serial cable or video. I have been trying all day to capture errors, and I finally got some from the console before the screen went black. It looks like there could be more, but I can't make them out before everything goes black. It's not much, but maybe it can help:
ixl1: Malicious Driver Detection event 1 on RX queue 771, pf number 0 (PF-1)
-
Hmm, do you have a bridge configured?
That looks like this known issue: https://redmine.pfsense.org/issues/13003
Though that doesn't normally crash the OS entirely, AFAIK.
You should be able to use the console when it's running normally; that is functional, I assume?
If it just stops responding when this happens, try pressing Ctrl+T. That can respond when nothing else will. If it does, it should show you what it's waiting for.
Steve
-
@stephenw10
No bridge that I am aware of, unless one was created automatically during the install. I am using one of the ports for the WAN and the other for the LAN. The Realtek is not being used.
The console works right up to the point where everything goes down, and then it just shows a black screen or loses connection.
The device still appears to be powered on, though, with the activity lights on the NIC blinking as if data were still going through.
I will try Ctrl+T next time this happens.
Everything seems to be working great while it's up, other than the 100% CPU spikes. I'd say the spikes are happening about every 2 minutes or so.
-
Anything in the system logs at around the same 2min interval?
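A quick way to test for that, as a sketch only. It assumes the log uses BSD-style syslog timestamps such as `Jan  1 10:00:00` with no year, and that you have already pulled out the timestamps of one repeating message. The helper names, the 120-second period, and the 15-second tolerance are illustrative choices, not anything pfSense provides.

```python
from datetime import datetime


def gaps_in_seconds(timestamps):
    """Return the gaps, in seconds, between consecutive syslog timestamps."""
    times = []
    for ts in timestamps:
        # Syslog pads single-digit days with an extra space; normalise that.
        # A fixed leap year is prepended because the format carries no year.
        cleaned = " ".join(ts.split())
        times.append(datetime.strptime(f"2000 {cleaned}", "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def looks_periodic(timestamps, period=120, tolerance=15):
    """True if every gap is within `tolerance` seconds of `period`."""
    gaps = gaps_in_seconds(timestamps)
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)
```

If a message's timestamps pass `looks_periodic`, whatever logs it is worth comparing against the CPU spikes.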
-
Where is the best place to access the system logs with WinSCP? The ones on the web GUI aren't giving me much
-
-
ixl0: <Intel(R) Ethernet Controller X710 for 10GBASE-T - 2.3.1-k> mem 0xe1000000-0xe1ffffff,0xe2008000-0xe200ffff at device 0.0 on pci1
ixl0: fw 7.2.60285 api 1.9 nvm 7.21 etid 80007a1a oem 1.266.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, MDIO shared
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 4 RX queues 4 TX queues
ixl0: Using MSI-X interrupts with 5 vectors
ixl0: Ethernet address: 68:05:ca:c1:8b:e0
ixl0: Allocating 4 queues for PF LAN VSI; 4 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x8
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 4/1024, RX 4/1024
ixl0: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: CL74 FC-FEC/BASE-R, Autoneg: True, Flow Control: No
ixl0: link state changed to UP
ixl0: link state changed to DOWN
ixl0: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: CL74 FC-FEC/BASE-R, Autoneg: True, Flow Control: No
ixl0: link state changed to UP
ixl0: promiscuous mode enabled
ixl0: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-0)
ixl0: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-0)
-
Here is another:
ixl1: <Intel(R) Ethernet Controller X710 for 10GBASE-T - 2.3.1-k> mem 0xe0000000-0xe0ffffff,0xe2000000-0xe2007fff at device 0.1 on pci1
ixl1: fw 7.2.60285 api 1.9 nvm 7.21 etid 80007a1a oem 1.266.0
ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, MDIO shared
ixl1: Using 1024 TX descriptors and 1024 RX descriptors
ixl1: Using 4 RX queues 4 TX queues
ixl1: Using MSI-X interrupts with 5 vectors
ixl1: Ethernet address: 68:05:ca:c1:8b:e1
ixl1: Allocating 4 queues for PF LAN VSI; 4 queues active
ixl1: PCI Express Bus: Speed 8.0GT/s Width x8
ixl1: SR-IOV ready
ixl1: netmap queues/slots: TX 4/1024, RX 4/1024
ixl1: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: CL74 FC-FEC/BASE-R, Autoneg: True, Flow Control: None
ixl1: link state changed to UP
ixl1: link state changed to DOWN
ixl1: Link is up, 1 Gbps Full Duplex, Requested FEC: None, Negotiated FEC: CL74 FC-FEC/BASE-R, Autoneg: True, Flow Control: No
ixl1: link state changed to UP
ixl1: promiscuous mode enabled
ixl1: Malicious Driver Detection event 1 on RX queue 771, pf number 0 (PF-1)
ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)
ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)
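To see at a glance which interface and queue the Malicious Driver Detection events land on, a small tally like the following may help. It is only a sketch: the pattern is modelled on the lines quoted in this thread, the `TX` alternative is an assumption since only `RX` events appear here, and `count_mdd_events` is a made-up name.

```python
import re
from collections import Counter

# Pattern modelled on the console lines quoted in this thread.
# Assumption: a TX variant of the message exists with the same layout.
MDD = re.compile(
    r"(ixl\d+): Malicious Driver Detection event (\d+) on (RX|TX) queue (\d+)"
)


def count_mdd_events(text):
    """Count MDD events per (interface, direction, queue)."""
    counts = Counter()
    for iface, _event, direction, queue in MDD.findall(text):
        counts[(iface, direction, int(queue))] += 1
    return counts


# Sample input copied from the ixl1 output above.
sample = (
    "ixl1: Malicious Driver Detection event 1 on RX queue 771, pf number 0 (PF-1)\n"
    "ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)\n"
    "ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)\n"
)
for (iface, direction, queue), n in sorted(count_mdd_events(sample).items()):
    print(f"{iface} {direction} queue {queue}: {n} event(s)")
```

For the three ixl1 lines this reports two events on RX queue 769 and one on RX queue 771. Feeding it a full dmesg capture instead of `sample` would show whether both ports are affected equally.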
-
I believe ntopng is causing the spikes. There are about 10 of these entries that hit at the same time, each at ~8-10% usage:
/usr/local/bin/ntopng -U ntopng -G /var/run/ntopng/ntopng.pid -1 /usr/local/share/ntopng/httpdocs -2 /usr/local/share/ntopng/scripts -3 /usr/local/share/ntopng/scripts/callbacks -e{ntopng}
-
ntopng is definitely the culprit for the spikes. With it disabled, CPU usage never goes above 40%, even with 5 OpenVPN users.
Is this normal behavior for ntopng?