SUMMARY: Re: Sun3 hanging in init

From: paula@atc.boeing.com
Date: Thu Mar 14 1991 - 20:34:45 CST


A couple days ago I asked for help with a diskless Sun3 that was hanging
in init while booting. The machine's boot server lost a drive and a bunch
of stuff, including this client's swap file, was moved elsewhere. All of
the relevant databases were updated, but still the client refused to boot.
A network analyzer showed that the last thing it did was an NFS read of
/sbin/init.

Many wizards responded, but George Terrone (georget@sunesc.East.Sun.COM)
takes the prize for the one response that actually fixed it. He wrote:

>This happened to me last week and believe it or not
>after five hours of software de-bugging.....
>..... a power off reset fixed it. It was a CPU problem.
>Hope yours is too.

I turned the hapless client off and back on and it booted right up!
I wonder if a K2 at the monitor prompt would have worked as well?

The following list of suggestions from the wizards may make a nice
checklist for things to look at when clients don't boot:

- The client and server addresses in /etc/hosts on both the client and
the server (and the YP hosts map, if you're using it) must be correct.

- The ifconfig line in /etc/rc.boot should have the appropriate netmask and
broadcast address (all ones).

- The size of the swap file should be reasonable. (The 4.1 Install manual
says to use 16K. Should be 16M!)

- The client's root and swap should be exported with -access=client,root=client.

- You must run exportfs on the server if you move the client's root
or swap.

- Compare checksums of the binaries in /sbin with a working client.

- Make sure bootparams is correct. Watch the client boot with >b -v to
make sure it gets the right stuff.

- Try doing a rm_client followed by an add_client.

- Add_client puts unqualified names in /etc/bootparams and /etc/exports,
which can cause problems with DNS.

- Check that things in /etc like rc* and fstab aren't mangled.

- Has there been a change in the network recently?

- Check that the client's /dev has all the right entries, especially
/dev/console.

- Check /etc/ttytab.

- Check that the client's kernel has root and swap on nfs.

- Check the client's fstab for reasonableness.

- Power-cycle the sucker! (or maybe K2)

- Try restoring the client's root from tape.
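
For the checksum item above, here is a minimal sketch in Python of one way
to compare the binaries in two clients' /sbin directories from the server.
Nobody who responded sent this. The function names and the use of SHA-256
are my own assumptions, and any checksum tool you trust would do as well.

```python
import hashlib
from pathlib import Path


def file_digests(directory):
    """Map each regular file name in `directory` to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(Path(directory).iterdir()):
        # Skip symlinks and subdirectories; only real files are hashed.
        if path.is_file() and not path.is_symlink():
            digests[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_dirs(suspect_dir, good_dir):
    """Return sorted names that differ or exist on only one side."""
    suspect = file_digests(suspect_dir)
    good = file_digests(good_dir)
    return sorted(
        name
        for name in suspect.keys() | good.keys()
        if suspect.get(name) != good.get(name)
    )
```

Called as compare_dirs("<broken client's root>/sbin", "<working client's
root>/sbin"), it returns the names worth a closer look. An empty list means
the two directories hold identical files.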

These suggestions are in the order I received them, and some are clearly
more drastic than others. Use good judgement when applying this list to
your particular problem!
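
The netmask and broadcast item in the list is easy to get wrong by hand, so
here is a small sketch in Python that works out the all-ones broadcast
address for a given address and netmask, and checks that client and server
land on the same network. The function names are hypothetical, and the
addresses in the usage note are documentation examples, not the ones from
my site.

```python
import ipaddress


def expected_broadcast(address, netmask):
    """Return the all-ones broadcast address for `address` under `netmask`."""
    iface = ipaddress.IPv4Interface(f"{address}/{netmask}")
    return str(iface.network.broadcast_address)


def same_subnet(client, server, netmask):
    """True if client and server are in the same network under `netmask`."""
    network = ipaddress.IPv4Interface(f"{server}/{netmask}").network
    return ipaddress.IPv4Address(client) in network
```

For example, expected_broadcast("192.0.2.10", "255.255.255.0") gives
"192.0.2.255", which is what the ifconfig line on that network should carry.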

Thanks go to the following for their timely and helpful responses:

"Anthony A. Datri" <datri@lovecraft.convex.com>
mp@allegra.att.com (Mark Plotnick)
Jeff Nieusma <nieusma@cs.Colorado.EDU>
stern@East.Sun.COM (Hal Stern - Consultant)
mmikulska@UCSD.EDU (Margaret Mikulska)
4rst@turing.cs.unm.edu (Forrest Black)
zjat02@trc.amoco.com (Jon A. Tankersley)
trinkle@cs.purdue.edu (Daniel Trinkle)
georget@sunesc.East.Sun.COM (GEORGE TERRONE)
"D.Ballance" <gnma76@udcf.glasgow.ac.uk>
marantz@cs.rutgers.edu
Charles <mcgrew@porthos.rutgers.edu>
Michael P Lingk <mlingk@cs.uiuc.edu>

Thanks again, guys 'n' gals!

Paul Allen
pallen@atc.boeing.com
Boeing Advanced Technology Center for Computer Sciences


