SUMMARY

From: Andrew F. Mitchell (afm@ufnmr1.health.ufl.edu)
Date: Wed Feb 15 1995 - 14:06:45 CST


Hello,
        This is my long overdue summary of the responses to my request
for information on why one of my machines crashed. The original post:

> Can anyone tell me what caused my machine to crash?
> Here is the output of dmesg - identical to console output at
> the time of the crash:
>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
> panic: iinactive
> syncing file systems... panic: iinactive
> 00000 low-memory static kernel pages
> 01118 additional static and sysmap kernel pages
> 00000 dynamic kernel data pages
> 00230 additional user structure pages
> 00000 segmap kernel pages
> 00000 segvn kernel pages
> 00132 current user process pages
> 00128 user stack pages
> 01608 total pages (1608 chunks)
>
> sq0 is an optical disk device. We have had some read/write problems
> in the past with optical drives, specifically with certain disks. This
> may have been one of them. Does anyone think this crash was caused
> by anything other than an optical disk I/O problem? If so, any ideas
> about a cure?

The responses:
        Kevin.Sheehan@uniq.com.au suggested using the kadb kernel
debugger to trace the problem. Haven't really had time to learn
about kadb, although I suppose I should. Perhaps in my "free time"? ;-)

        martin@gea.hsr.it (Martin Achilli) recalled a similar problem
and suggested patches:

        100173-12 SunOS 4.1.3: NFS jumbo patch
        100342-03 SunOS 4.1, 4.1.1, 4.1.2, 4.1.3:
                NIS client needs long recovery time if server reboots
        100726-17 SunOS 4.1.3:
                sun4m jumbo patch for kernel performance and memory bugs
        100567-04 SunOS 4.1, 4.1.1, 4.1.2, 4.1.3:
                mfree and icmp redirect security patch

        and I think the relevant one:
        100623-03 SunOS 4.1.2, 4.1.3: UFS jumbo patch

These UFS and NFS patches were also suggested by

        Kimberley.Brown@UK.Sun.COM
        Danny.Mispelters@Belgium.Sun.COM

diekema@linus.si.com (Jon Diekema) sent me a Bourne shell script for
discovering info on crashes. I haven't had a crash since, so I haven't
used it yet, but I am sure that it will provide some interesting info.
Thanks Jon.
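
Jon's script is not reproduced here. As a rough illustration only (this
is not his script, and Python is just a stand-in), a crash log like the
dmesg output quoted above could be condensed by collapsing repeated
lines and pulling out the panic strings. The function name
summarize_crash_log and the SAMPLE text are hypothetical.

```python
import re
from itertools import groupby


def summarize_crash_log(text):
    """Collapse consecutive duplicate lines and collect panic strings.

    Returns (collapsed, panics): collapsed is a list of (line, count)
    pairs for runs of identical lines, and panics is the list of words
    that follow "panic:" anywhere in the log, in order of appearance.
    """
    # Drop blank lines and trailing whitespace before grouping.
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    collapsed = [(line, len(list(run))) for line, run in groupby(lines)]
    panics = []
    for ln in lines:
        match = re.search(r"panic: (\S+)", ln)
        if match:
            panics.append(match.group(1))
    return collapsed, panics


# Hypothetical sample, shortened from the console output in the original post.
SAMPLE = """\
sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
sq0: <Sony SMO-D501 cyl 18643 alt 2 hd 1 sec 31>
panic: iinactive
syncing file systems... panic: iinactive
"""

if __name__ == "__main__":
    collapsed, panics = summarize_crash_log(SAMPLE)
    for line, count in collapsed:
        print(f"{count:3d}x {line}")
    print("panic strings:", sorted(set(panics)))
```

Something like this only tidies up what the console already printed; it
says nothing about the cause, which is where kadb or a crash dump would
come in.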

japperso@homer.teledyne.com.teledyne.com (John Apperson) thought the
problem might be caused by a bad inode on the optical platter. He
suggested "carefully" writing the data on the disk and then mounting
it read-only. I'm not sure how to write data "carefully"? ;-) Thanks
John, for the advice (I am only ribbing you about the "carefully" thing,
but you *did* say that. ;-)

dougj@xray.ufl.edu (Doug Jones) identified this as a classic "REO drive
with corrupted filesystem" and said to try and fsck the disk. He also
advised me to check out the "convert from static fs to dynamic fs" flag
for fsck. Will do!

johnh@gerbil.umds.ac.uk (John Hearns - System Manager) said he had a
similar problem recently, contacted Sun support, and was told by
Kimberley Brown (hello again) to install patches: the same UFS and NFS
patches listed above.

John Hearns also sent me a second message saying,
        "don't you just hate those SMO drives...."
with which I concurred whole-heartedly.

That's my summary, and sorry for the delay!

/////////////////////////////////////////////////////////////////
Andrew F. Mitchell afm@ufnmr1.health.ufl.edu
Sys Admin UF MRI Lab phone 904.376.1611x5069
                http://ufnmr1.health.ufl.edu/~afm


