== update ==
We ended up just upgrading the affected boxes to CentOS 5.5 when it came out, and so far it hasn’t happened again.
== end update ==
So I’m having a weird situation with our CentOS 5.4 AMD boxes that are open for general use. Occasionally, the lbcd process will go into a D state (uninterruptible sleep) and just chew up CPU doing absolutely nothing. With lbcd effectively stopped, users can’t use the load-balanced name to get into the machine. kill -9 does nothing, strace shows absolutely nothing, and lbcd can’t be restarted at all. Here’s the output from one box:
ps axl | grep D
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
1 0 2397 1 16 0 1672 8 sync_p D ? 2:07 /pkg/lbcd/current/sbin/lbcd
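A side note on the command above: a plain grep D matches any line containing a capital D anywhere (which is also why the header row shows up), so on a busy box it can pull in unrelated processes. A tighter filter checks only the STAT column. This is a minimal sketch, assuming the `ps axl` column layout shown above, where STAT is the tenth field; the function name filter_dstate is made up for illustration.

```shell
# Hypothetical helper: reads `ps axl`-style output on stdin.
# Keeps the header row (line 1) plus any row whose STAT field starts with D.
# Assumes STAT is the 10th whitespace-separated column, as in the output above.
filter_dstate() {
    awk 'NR == 1 || $10 ~ /^D/'
}
```

It would be used as `ps axl | filter_dstate`. Matching on the start of the field also catches variants such as D+ or D<.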
So far it doesn’t affect the Intel machines we have, but that could be because regular users can’t log into them. If I find the reason, I’ll be sure to post it up here.