HP T730 help please
-
I was able to ping internally but could not ping google or cloudflare.
Nothing was logged though, from what I can tell. To me it was like the WAN port stopped working.
I will dig around in WinSCP and see if I can find any more logs somewhere.
If it was having RAM errors or SSD errors would they get logged somewhere?
-
Bad RAM usually results in random crashes. A bad SSD can result in odd failures where services fail over some hours and logging obviously isn't possible. But you would see errors at the console.
When you tried to ping google.com what error was shown?
-
It just timed out when I pinged 8.8.8.8.
Is there a mem test built in?
Logging, I believe, is working. Well, it is for ntopng, netdata and adguard.
The only errors I am getting on the console are still:
ixl1: Malicious Driver Detection event 1 on RX queue 771, pf number 0 (PF-1)
ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)
ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)
Same for ixl0
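To see how often these events fire and on which queues, the console output can be tallied with a short script. This is only a minimal sketch: the pattern is based solely on the lines quoted above, the `TX` alternative is an assumption, and `count_mdd_events` is a hypothetical helper name, not part of pfSense or the ixl driver.

```python
import re
from collections import Counter

# Pattern derived from the console lines quoted in this thread.
# Matching "TX" as well as "RX" is an assumption; only RX appears above.
MDD_PATTERN = re.compile(
    r"(?P<nic>ixl\d+): Malicious Driver Detection event (?P<event>\d+) "
    r"on (?P<direction>RX|TX) queue (?P<queue>\d+), pf number (?P<pf>\d+)"
)


def count_mdd_events(log_text):
    """Count MDD events per (interface, direction, queue) in a block of log text."""
    counts = Counter()
    # finditer also copes with several events fused onto one line.
    for match in MDD_PATTERN.finditer(log_text):
        key = (match["nic"], match["direction"], int(match["queue"]))
        counts[key] += 1
    return counts


sample = """\
ixl1: Malicious Driver Detection event 1 on RX queue 771, pf number 0 (PF-1)
ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)
ixl1: Malicious Driver Detection event 1 on RX queue 769, pf number 0 (PF-1)
"""

for (nic, direction, queue), n in sorted(count_mdd_events(sample).items()):
    print(f"{nic} {direction} queue {queue}: {n}")
```

Feeding it saved console or system log text instead of `sample` would show whether the events cluster on particular queues or on one port.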
-
Ah, yes. Hitting that issue will prevent traffic but should not crash the OS which sounds like what you are now seeing?
Previously it looked like the entire firewall was crashing which would not have been that.
-
For the above error I read somewhere to try the newer drivers and/or increase the buffer size. But I am unsure how to do either.
-
You are running 2.6?
You could go to 22.05 or try a 2.7 snapshot. The drivers are slightly newer there though there are no changes I'm aware of that would affect this.
I'm not aware of any workaround for that currently. Which buffer was referenced?
Steve
-
Yes on the 2.6.
I think I might go to 22.05. Is there anything I need to be aware of when switching from CE to Plus?
I guess the buffer doesn't apply to me since I don't have a bridge:
"Most of the issue reports have been from those running a bridge interface with ixl0 and ixl1. However, there have been multiple reports without using bridges as well.
Increasing the buffer size on the bridge reduced the frequency of the events (went from once a day to taking 5 days before it reoccurred)."
I have ordered an Intel i350 to replace the x710; when the card comes in I'll swap out the RAM as well.
-
Nope the upgrade to 22.05 should be relatively painless.
Yes, with no bridges defined the bridge buffer does nothing. I'm not sure it actually affects this even when there are bridges in play.
Steve
-
Haven't upgraded yet. But maybe have some more useful info.
Today it has been "crashing" like before. But this time the console still works. When it "crashes" now I lose access to the web gui and all internet traffic goes down.
Only errors are the ones from before, nothing else on the console.
-
All your interfaces are ixl now though? If so that's expected. You might add the Realtek NIC as a management interface as that will remain up if/when you hit the ixl bug.
-
So going through my graphs, everything "crashes" when ixl0 and ixl1 start having huge packet loss.
Does this confirm it is an issue with the x710-t2l? Could this nic be getting hot and causing this? Or do you think it is more related to the bug?
This would suck because that thing was hella expensive.
-
How hot? Can you add a fan there as a test?
I'm not aware of heat being an issue with that bug but if it is that would be a very interesting discovery.
Steve
-
I haven't put my temp gun to it yet so not sure if it is getting hot. I was just wondering if that was a known problem. I think I'll redo the thermal compound as well.
I will take some readings here in a bit and install a noctua 40mm on it just in case.
Are the x550-t2 decent cards? Any known issues?
-
@cgi2099 said in HP T730 help please:
Are the x550-t2 decent cards?
Yes. The x500 series are what I would recommend currently. In terms of price, performance and stability they are in the sweet spot IMO. That may change.
Steve
-
Awesome, maybe I can find one of those reasonable somewhere.
I did see Intel released new firmware for the x710 cards on the 8th but idk if that will fix any issues.
-
Yup check the OEM equivalent list:
https://forums.servethehome.com/index.php?threads/list-of-nics-and-their-equivalent-oem-parts.20974/
Not seeing a firmware update newer than 8_40, does it have a change log?
Steve
-
9.0 is what I think I see, but could be the wrong card:
I believe this is the changelog:
I am not sure what version my card has on it at the moment.
-
Hmm, interesting. Not sure why I didn't find that. I can't see any specific mention of MDD though.
-
I am going to update in the morning and I'll report back if anything improves.
I did as you said and the last time it crashed today I plugged into the Realtek port and was still able to access the GUI. So it is the Intel card crashing.
Temps were fine, 40-50°C is what my temp gun was saying. I am still going to put a fan on it as well.
-
My card was on 7.3, but updating to 9.0 did not solve the issue; I am still getting the same error.
With that said, is there a package for Intel drivers 27.6? I believe that is the last thing I can do in hopes of getting this card to play nice.