Summary of advice: recovering from catastrophic disk failure

From: Joe Van Andel (vanandel@stout.atd.ucar.edu)
Date: Mon Aug 06 1990 - 16:40:59 CDT


The most popular ways to recover from a catastrophic disk failure without a
local high-capacity tape drive are:

A) Set up a server to boot the dead machine diskless, either before you need
it or after the failure occurs (referring to the notes you wrote yourself
BEFOREHAND on how to do it). This gives you a full Unix environment, with all
tools and utilities. You can even access the remote Exabyte drive across
gateways, because you are running a full version of Unix, including the route
command.

B) Use MUNIX (the miniroot) to restore / and /usr over the network as shown
below. (Note that MUNIX does not contain the route command, so the Exabyte
drive must be on the local network without any gateways in between. Now that
I think about it, you might be able to copy the route command from another
machine, so that you could issue the necessary route command.)

1. Boot MUNIX from the local tape drive.

2. Run 'newfs' and 'fsck' on the drive partition to be restored (these are
    included in MUNIX, in /etc, I believe).

3. Mount the partition to be restored onto /mnt (You may have to mkdir /mnt).

4. Append to /etc/hosts the IP addresses of the local machine and of the tape
    host, along with their full names and nicknames. You can use either
    'cat >> /etc/hosts' or 'ed', as MUNIX doesn't include 'vi'.

5. Set the host name of the local machine and configure the network interface
    by issuing the following:

        example% hostname <hostname>
        example% ifconfig <interface> <hostname> -trailers up
        example% ifconfig <interface> <hostname> netmask +

    where <hostname> is the nickname of the local host which you entered into
    /etc/hosts, and <interface> is the Ethernet interface of the local host
    (ie0 for a Sun3).

6. cd to /mnt (or wherever you mounted the partition to be restored) and
    restore the partition from the remote tape.

7. Remove the file restoresymtable which restore creates.

8. If you are restoring /, you need to create the boot block with:

        example% cd /usr/mdec
        example% installboot /mnt/boot bootxd /dev/rxd0a

    (This example is for the 'xd' controller. If your controller is different,
    you'll need to adjust the installboot arguments appropriately. For example,
    if you have an 'xy' controller, use

        example% installboot /mnt/boot bootxy /dev/rxy0a

    instead.)

9. Unmount the filesystem and run fsck on it.
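
The steps above can be sketched as a small dry-run script that only prints the
commands for steps 2 through 9, so you can review them before typing anything
on the dead machine. This is a sketch under assumptions, not a tested
procedure: the host names, the 192.0.2.x addresses, and the tape device
/dev/rst0 are hypothetical placeholders, and it assumes your restore accepts a
remote device written as host:device. If it does not, rrestore is the usual
substitute. Remote tape access also typically requires that the tape host
trust root on the machine being restored.

```shell
#!/bin/sh
# Dry run: prints the recovery commands, runs none of them.
# Every value below is a hypothetical placeholder; adjust for your site.
CTRL=xd                      # disk controller type ('xd' or 'xy', per step 8)
DISK=${CTRL}0a               # partition to restore (assumed: root on disk 0)
IFACE=ie0                    # Ethernet interface (ie0 for a Sun3, per step 5)
CLIENT=client                # nickname of the local host
CLIENT_FQDN=client.example.com
CLIENT_IP=192.0.2.10
TAPEHOST=tapehost            # nickname of the machine with the tape drive
TAPEHOST_FQDN=tapehost.example.com
TAPEHOST_IP=192.0.2.20
TAPEDEV=/dev/rst0            # assumed tape device name on the tape host

# Format one /etc/hosts line: address, full name, nickname.
hosts_entry() {
    printf '%s %s %s\n' "$1" "$2" "$3"
}

# Print the command sequence for steps 2-9 in order.
recovery_plan() {
    cat <<EOF
newfs /dev/r$DISK
fsck /dev/r$DISK
mkdir /mnt
mount /dev/$DISK /mnt
# append the next two lines to /etc/hosts with 'cat >> /etc/hosts'
$(hosts_entry "$CLIENT_IP" "$CLIENT_FQDN" "$CLIENT")
$(hosts_entry "$TAPEHOST_IP" "$TAPEHOST_FQDN" "$TAPEHOST")
hostname $CLIENT
ifconfig $IFACE $CLIENT -trailers up
ifconfig $IFACE $CLIENT netmask +
cd /mnt
restore rf $TAPEHOST:$TAPEDEV
rm restoresymtable
cd /usr/mdec
installboot /mnt/boot boot$CTRL /dev/r$DISK
cd /
umount /mnt
fsck /dev/r$DISK
EOF
}

recovery_plan
```

Setting CTRL=xy changes both the device names and the boot block name, matching
the 'xy' variant shown in step 8. The installboot line is only needed when the
partition being restored is /.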

Another option given was to have another bootable partition
on the same disk, or on another attached disk, for emergency use.

Thanks to all who responded. You've been very helpful!

(I wish that Sun would add this sort of information to their system
administration manuals. After all, not everyone runs only file servers with
attached tape drives, or simple disk-less nodes!)

        Joe VanAndel Internet:vanandel@ncar.ucar.edu
        NCAR - RSG
        P.O Box 3000 Fax: 303-497-2044
        Boulder, CO 80307-3000 Voice: 303-497-2071


