We have an HP ProLiant DL365 G5 that runs Solaris 10 and hosts two zones: an IMAP server and an LDAP server. Lately it has been crashing and rebooting in the middle of the night, or suddenly losing all network connectivity. When that happens I have to connect through the LOM and reboot it from there. The first thing we did was pop out the hard drive and put it in another DL365 G5. So far the "old" server is fine, and only the one that received the hard drive has the problem, so the fault followed the drive.
Looking at some of the crash dumps with SUNWscat (props to Ben at cuddletech for his article on it), the panic strings were along the lines of:
panic string: mutex_destroy: bad mutex, lp=ffffffff82404580 owner=fffffe80004dfc80 thread=ffffffffbf4d884
panic string: CPQary3: Activation of New DMA Window Failed
Since CPQary3 is the package name of the driver for the controller on this box, I figured updating it would help. The release notes for the newest version of this driver, 1.9.1, say:
Corrected a kernel mutex alignment issue that could cause a kernel panic.
So this looked like a likely fix for the problem! Too bad, though: it still had DMA errors and rebooted with:
panic string: BAD TRAP: type=e (#pf Page fault) rp=fffffe8001e95310 addr=0 occurred in module "rootnex" due to a NULL pointer dereference
These are the same errors we saw a couple of reboots earlier. So now, could it be a bad hard drive?
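To check whether a panic really is "the same as before", it can help to tally the panic strings with the per-crash addresses stripped out. Below is a minimal Python sketch of that idea; it is not part of SUNWscat, the function names `panic_signature` and `tally_panics` are made up for illustration, and it assumes the panic strings have already been copied out of the crash dumps by hand.

```python
import re
from collections import Counter

# Long hex values (addresses, thread pointers) differ on every crash,
# so replace them with a placeholder before comparing panic strings.
HEX_VALUE = re.compile(r"\b[0-9a-f]{8,}\b")


def panic_signature(panic_string):
    """Return the panic string with long hex values replaced by <addr>."""
    return HEX_VALUE.sub("<addr>", panic_string.strip())


def tally_panics(panic_strings):
    """Count how often each normalized panic signature occurs."""
    return Counter(panic_signature(s) for s in panic_strings)


if __name__ == "__main__":
    # Panic strings copied from the crash dumps quoted in this post.
    panics = [
        "mutex_destroy: bad mutex, lp=ffffffff82404580 "
        "owner=fffffe80004dfc80 thread=ffffffffbf4d884",
        "CPQary3: Activation of New DMA Window Failed",
        "BAD TRAP: type=e (#pf Page fault) rp=fffffe8001e95310 addr=0 "
        'occurred in module "rootnex" due to a NULL pointer dereference',
    ]
    for signature, count in tally_panics(panics).most_common():
        print(count, signature)
```

Feeding it every panic string collected so far shows at a glance which failure keeps recurring, regardless of which addresses were involved in each crash.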
Doing some more stress testing on this one.