Further update:
Perhaps this is off topic by now, but since it concerns the same machines I am posting it here.
I have finally replicated the crash under test conditions and found that the NIC has been causing the problem. All the problematic servers use the same NIC, which attaches to the dc driver: dev.dc.0.%desc: Macronix 98715AEC-C 10/100BaseTX
The crash occurs after the "TX underrun - using store and forward" message appears during a network stress test (downloading large files). So I started looking at the NIC, replaced it with another brand, and everything works like a charm; not even a TX underrun message pops up. Both RealTek and DLink NICs work happily.
Poor thing: I spent weeks torturing it with CPU and HDD stress test scripts, and it turns out the real culprit is the NIC :/
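
For anyone who wants to check their own boxes for the same symptom, here is a rough Python sketch that counts "TX underrun" lines per interface in the kernel message buffer. The function names, the regex, and the assumed "dc0: " style prefix in front of the message are illustrative guesses, not something taken from the tests above, so adjust them to whatever your dmesg output actually looks like.

```python
import re
import subprocess
from collections import Counter

# Assumption: the kernel prefixes the message with the interface name,
# e.g. "dc0: TX underrun ...". Lines without such a prefix are still counted.
UNDERRUN_RE = re.compile(r"(?:\b([a-z]+\d+): )?TX underrun")


def count_tx_underruns(lines):
    """Return a Counter mapping interface name to number of TX underrun lines."""
    counts = Counter()
    for line in lines:
        match = UNDERRUN_RE.search(line)
        if match:
            # group(1) is None when no interface prefix was found on the line
            counts[match.group(1) or "unknown"] += 1
    return counts


def read_dmesg():
    """Return the kernel message buffer as a list of lines (needs dmesg in PATH)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    return result.stdout.splitlines()
```

Calling count_tx_underruns(read_dmesg()) before and after a large download would show whether the counter moves under load. A NIC that never logs the message, like the replacements mentioned above, should simply not show up in the result.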