SUMMARY Re: Machine hangs (delayed)

From: vispi@lgc.com
Date: Fri Dec 18 1992 - 15:50:46 CST


Hi,
Sorry for the long delay. The problem was fixed just last week.
My original mail follows, and the summary comes after it.

-------

==>
==>
==>
==>Hi
==>I have 2 questions: a) Sun-related and b) related to a computer
==>conference (probably a flammable issue).
==>
==>I have a Sparc 370 server with 128M swap and 32M memory, running SunOS 4.1.1.
==>
==>The machine suddenly freezes up when a user tries to bring up X,
==>but not necessarily just that. A simple du -s /export/home can do it
==>too.
==>
==>I ran vmstat 5 all the time. Even after the machine
==>froze up for all other processes, vmstat kept on running, and the
==>page attach column "at" consistently cycles through 32, then 9, then 2,
==>then 0, 0, 0, then 32, 9, 2 again.
==>You get the idea.
==>
==>Once I kill vmstat, I can't restart it, and now the machine is truly hung.
==>Any suggestions?
==>
==>and b) Does anyone have any information on CompDac (I may be spelling it
==>wrong)? It is some computer exhibition in the fall, apparently in Las Vegas, Nevada.
==>
==>Thanks. A summary will follow in a day or two.
==>
==> -Vispi Dumasia

-------

 chris@invmms.worldbank.org (Chris Bulle) suggested it might be a bad mount.
 I deleted all the NFS mount entries and disabled the automounter. It
 didn't help.

 Hal Stern sent me some very useful information. A lot of you might have it,
 but if you don't, IMHO it is something every sun-manager should have
 available. It's a paper on how to debug SunOS kernels. Please send me e-mail
 if you want a copy, and I'll forward it to you. (It's long, so I won't include it
 in my mail.) Hal's reply:

the next time the machine hangs, force a core dump by
halting it with L1-A, and then at the ">" prompt type
"g0". save the core dump and look at what processes
are "stuck" on.
 
you may be running into a problem running out of kernel
memory. it's best to check the data first before installing
patches, but if this is the case, patch 100330 fixes it.
 
to diagnose this one, look for processes stuck in
morecore() or getpages() in the kernel.

In my case the above procedure did not point to anything obvious, which led
me to believe it might be a hardware issue. I swapped in the motherboard from
another machine, and things worked just fine.

adam%bwnmr4@harvard.harvard.edu (Adam Shostack) said:

        Comdex? The largest PC show in the country. Lately, Apple and
        workstation vendors have shown up as well.
 
Perry_Hutchison.Portland@xerox.com also said it was Comdex, not CompDac.

Thanks to all respondents.

        -Vispi Dumasia


