SUMMARY: shutdown problem

From: Robert McGraw (mcgraw@sunspot.sunspot.noao.edu)
Date: Fri Nov 15 1991 - 14:57:38 CST


I want to thank all who responded to my email about remote shutdown
problems. From the email I received, I was not the only person having
problems.

I see two areas that might be causing problems for me and for others:

1. Not redirecting shutdown output.

2. Shutdown likes to rwall to its friends. If the friends are down, then
rwall hangs for a while waiting for a timeout.

Since I want to shut down fast because of low UPS power, I used halt rather
than shutdown to solve my problem. Below is my script.

____________________________MY SCRIPT______________________
#! /bin/csh

##
#UPS SHUTDOWN SCRIPT
#
set clients = "aurora plage eclipse flare helios neutrino ra xrays"
cd /

mail -s "UPSSHUTDOWN" root << E_O_F
UPS directed shutdown
Starting shutdown process
E_O_F

##
# shutdown
#
foreach host ( $clients )
        set ifpinged = `ping $host 3 | grep -c " alive"`
        if ( $ifpinged != 0 ) then

                rsh -n $host touch /etc/nologin &

/usr/etc/rwall $host << E_O_F

*** LOW UPS POWER ***
*** SYSTEM SHUTDOWN IN 1 MINUTE ***
*** LOG OFF NOW!!!!!!!!!! ***
E_O_F

                rsh -n $host '(/usr/bin/sleep 60 ; /usr/etc/halt) >& /dev/null < /dev/null ' &
        endif
end

mail -s "UPSSHUTDOWN" root << E_O_F
UPS directed shutdown
Finished. Shutting down Sunspot.
E_O_F

shutdown -h +5 LOW UPS POWER &

______________________________________________________________

Thanks to:

From: ajs6143@eerpf001.boeing.com ( Andy Stefancik 234-3049 )
From: breimer@harley.dazixco.ingr.com (beverly reimer 1509)
From: mdl@cypress.com (J. Matt Landrum)
From: patp@juliet.ll.mit.edu ( Patrick Pawlak)
From: uunet!jtsv16.jts.com!gerry%arizona.UUCP@noao.edu (G. Roderick Singleton)
From: Ken Nawyn <ken@nynexst.com>
From: sjh@helicon.math.purdue.edu
From: kwthomas@nsslsun.nssl.uoknor.edu (Kevin W. Thomas)
From: keves@meaddata.com (Brian Keves - Consultant)
From: ray@isor.vuw.ac.nz
From: Kanthan Pillay <svpillay@Princeton.EDU>
From: lee@sqlee.sq.com (Liam R. E. Quin)
From: poffen@sj.ate.slb.com (Russ Poffenberger)
From: Mike Raffety <miker@sbcoc.com>
From: uunet!indetech.com!bobh%arizona.UUCP@noao.edu (Bob Haxo)
From: vincens%FRULM63.bitnet@VTVM2.CC.VT.EDU (Pierre VINCENS)
From: uunet!campus22!rpage%arizona.UUCP@noao.edu (Real Page)
From: leh@manatee.cis.ufl.edu
From: "John R Ruckstuhl Jr" <ruck@zeta.ee.ufl.edu>

Below is a summary of replies.

------------------------REPLIES-----------------------------------------------
On Nov 13, 10:48, Robert McGraw wrote:
} Subject: remote shutdown.
}
} I have several workstations that are on a big UPS system. I have one
} host that monitors the UPS and determines if running under battery
} power and how long is left on the battery. When the battery power
} is below a cutoff time I have the monitoring host send a rsh shutdown
} to all hosts.
}
} I am having problems. It seems that some of the remote machines do not get the message or do not shut down.
}
} Below is the script I use to shutdown.
} ----
} #! /bin/csh
} #
} #
} # Modified shutdown for use by UPS directed shutdown.
} # Version for Sun UNIX systems
}
} cd /
}
} /usr/etc/rwall -n allsuns <<!
} *** LOW UPS POWER
} *** SHUTDOWN PROCESS BEGINS IN 1 MINUTES
} *** LOG OFF NOW!!!!!!!!!!
} !
}
} mail -s "UPSSHUTDOWN" root << E_O_F
} UPS directed shutdown
} E_O_F
}
}
} foreach CLIENTS ( aurora plage eclipse flare helios neutrino ra xrays )
} switch ($CLIENTS)
} default:
} echo HOST $CLIENTS
} rsh -n $CLIENTS shutdown -h +2 LOW UPS POWER &
} breaksw
} endsw
} end
} shutdown -h +5 LOW UPS POWER &
} ---------------------
}
} Does anyone see a problem.
}
} Also if you have a script that does the job, could you email to me.
}
} Thanks
} Robert
} ------------------------------------------------------------------------
} Robert P. McGraw, Jr. (SysAdmin) National Solar Observatory/SP
} rmcgraw@sunspot.sunspot.noao.edu Box 62
} uunet!noao.edu!rmcgraw Sunspot, NM 88349 USA
} SPAN: 5355::RMCGRAW or NOAO::RMCGRAW (505) 434-7038
} UUCP: {arizona,ncar}!noao!sunspot!rmcgraw
}-- End of excerpt from Robert McGraw

++++++From: uunet!jtsv16.jts.com!gerry%arizona.UUCP@noao.edu (G. Roderick Singleton)

Yes, shutdown expects an attached tty. Use the PD program background
or trust nohup to do its thing.

ger

-- 
G. Roderick Singleton {gerry@jts.com}, System and Network Manager, JTS Computers
Man is the only animal that blushes -- or needs to.
		-- Mark Twain

++++++From kwthomas@nsslsun.nssl.uoknor.edu Wed Nov 13 13:34:22 1991

I see two problems. First, "shutdown" should be "/usr/etc/shutdown". You might be having a command path problem for remote commands. Second, you need to verify that you have remote root privileges on all of the systems. Try something harmless like "rsh -n client touch /.cshrc" and see if it gives you a "permission denied" message.
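
[Editor's sketch, not part of the original reply.] The privilege check suggested above could be run against every client before relying on the UPS script. The function name check_remote_root, the RSH override variable, and the marker string are assumptions made for illustration. The check echoes a marker because classic rsh does not hand back the remote command's exit status.

```shell
#!/bin/sh
# Sketch only: RSH, check_remote_root and the marker string are
# illustrative names, not taken from the original scripts.
RSH=${RSH:-rsh}

# Run a harmless command as root on one host. Succeed only if the
# remote side reports that the command actually ran.
check_remote_root() {
    host=$1
    # rsh does not return the remote exit status, so print a marker
    # on success and look for it in the output.
    out=`$RSH -n "$host" 'touch /.cshrc && echo REMOTE_ROOT_OK' 2>&1`
    case "$out" in
        *REMOTE_ROOT_OK*) return 0 ;;
        *) echo "$host: $out" >&2; return 1 ;;
    esac
}

# Example (host names from the script above):
#   for h in aurora plage eclipse; do
#       check_remote_root "$h" || echo "fix root access on $h"
#   done
```

A host that answers "Permission denied." or prints nothing at all is reported on stderr, so the list of problem machines can be collected ahead of a real power failure.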

++++++From ray@isor.vuw.ac.nz Wed Nov 13 14:05:06 1991

I have recently been doing shutdowns (reboots) using foreach, and have found that the only way to get it to work is to use the following construct:

foreach CLIENT ( <list of clients> )
    rsh $CLIENT "shutdown now Reason for shutdown < /dev/null & " < /dev/null &
end

Otherwise the rsh tends to hang, either until the machine actually does shut down (which is not necessarily immediately if it is active), or forever.

Hope this helps

Ray Brownrigg ray@isor.vuw.ac.nz ax: (609) 258-1735 uucp: princeton!svpillay

++++++From @yonge.csri.toronto.edu,@sq.sq.com:lee@sqlee Wed Nov 13 14:58:51 1991

One problem might be that if you use YP (NIS), your mail to root might be directed to a machine that's shutting down and might hang. The rwall might also cause problems.

I suggest:

* delete the rwall and the mail -- syslogd will tell everyone with wall anyway

* in rc.local, check to see if you rebooted because of low power, and *then* send the mail, when things are more likely to work, and do it in the background. You could do date > /LOWPOWER in your powerdown script, and check for that file in rc.local.

Don't know if this is the problem, but it might be more robust.

Lee lee@sq.com
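
[Editor's sketch, not part of the original reply.] The flag-file idea above could look roughly like this. The path /LOWPOWER comes from the suggestion itself. The function names and the FLAG and MAILER override variables are assumptions added so the sketch can be tried without sending real mail.

```shell
#!/bin/sh
# Sketch only: function names and the FLAG/MAILER variables are
# illustrative. /LOWPOWER is the flag path suggested in the reply.
FLAG=${FLAG:-/LOWPOWER}
MAILER=${MAILER:-"mail -s UPSSHUTDOWN root"}

# Call from the power-down script, just before halting.
mark_low_power() {
    date > "$FLAG"
}

# Call from rc.local. If the flag exists, mail its contents to root
# and then remove it so the report is sent only once.
report_low_power() {
    [ -f "$FLAG" ] || return 0
    {
        echo "System was halted on low UPS power at:"
        cat "$FLAG"
    } | $MAILER
    rm -f "$FLAG"
}

# In rc.local, run it in the background as suggested:
#   report_low_power &
```

Because the mail is sent after the reboot, it no longer depends on a mail host that may itself be going down.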

++++++From poffen@sj.ATE.SLB.COM Wed Nov 13 15:12:55 1991

Well, one undesirable thing this does is leave a bunch of "rsh" processes on the one host, since rsh won't return until the remote process has completed (at shutdown time), and then, I have found that the rsh process gets "stuck" when the remote machine goes down.

I don't know if this is the problem you are experiencing. If you do get replies about a good master/remote shutdown method, I would like to hear about it.

Russ Poffenberger               DOMAIN: poffen@sj.ate.slb.com
Schlumberger Technologies       UUCP:   {uunet,decwrl,amdahl}!sjsca4!poffen
1601 Technology Drive           CIS:    72401,276
San Jose, Ca. 95110             Voice:  (408)437-5254  FAX: (408)437-5246

++++++From miker@sbcoc.com Wed Nov 13 17:08:35 1991

Change this:

> rsh -n $CLIENTS shutdown -h +2 LOW UPS POWER &

To this:

> rsh -n $CLIENTS 'shutdown -h +2 LOW UPS POWER >& /dev/null < /dev/null'

Your script isn't getting past the first client, most likely.

Please be sure to summarize back to the list.

++++++From uunet!indetech!indy.indetech.com!bobh%arizona.UUCP@noao.edu Wed Nov 13 1

it is my experience that:

# rsh -n $CLIENTS shutdown -h +2 LOW UPS POWER &

does not work. (we recently had to shut down our 120+ systems for two scheduled power outages. 4AM. what a pain.)

try using:

# rsh $system 'shutdown -h now power outage < /dev/null >& /dev/console'

(or +2 if you want to be nice)

Also, the order of shutdown is VERY CRITICAL. with shutdown, each system tries to rwall its friends (see /etc/rmnt for list of friends). this list can get pretty long and as systems begin to go down, the timeouts waiting for a response get to be a real problem.

NIS servers MUST be last to go. don't assume that a NIS server is binding itself. beware of NFS filesystems which are mounted on the system which you are bringing down last. umount ALL of the NFS mounted filesystems from that system before bringing down any hosts.

also, since rsh takes its own sweet time about timing out if $system is down, recommend a ping (with a non-default timeout) to verify that the system is there. note however, that a system in single-user mode will respond to a ping.
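
[Editor's sketch, not part of the original reply.] The two points above, ping first and take the NIS servers down last, can be combined in one loop. This assumes the SunOS-style "ping host timeout" that prints "host is alive", as used in the script at the top of this summary. The function names, the RSH and PING override variables, and the host names nisslave and nismaster are hypothetical.

```shell
#!/bin/sh
# Sketch only: names and override variables are illustrative.
RSH=${RSH:-rsh}
PING=${PING:-ping}

# SunOS-style ping: "ping host 3" prints "host is alive" on success.
is_alive() {
    $PING "$1" 3 2>/dev/null | grep ' alive' > /dev/null
}

# Halt every reachable host in the order given on the command line.
# Put clients first and NIS servers last.
halt_in_order() {
    for host in "$@"; do
        if is_alive "$host"; then
            echo "halting $host"
            $RSH -n "$host" '(/usr/etc/halt) >& /dev/null < /dev/null &' < /dev/null
        else
            echo "skipping $host (no answer)"
        fi
    done
}

# Example (nisslave and nismaster are hypothetical host names):
#   halt_in_order aurora plage flare nisslave nismaster
```

As noted above, a machine in single-user mode still answers ping, so a successful ping does not guarantee that the rsh will get through.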

++++++From @VTVM2.CC.VT.EDU:vincens@FRULM63 Thu Nov 14 00:26:31 1991

To solve the same problem, I use the scripts:

---------- begin of /usr/local/rbin/qshutdown --------------------
#!/bin/sh
# This script is used on the remote machine
/etc/shutdown $* </dev/null >/dev/null
---------- end of /usr/local/rbin/qshutdown --------------------

---------- begin of /usr/local/rbin/shutall ----------------------
#! /bin/sh
#
# Diskless
#
for machine in anubis apis aton bastet hathor isis khnoum maat osiris sobek
do
    echo '<<<' $machine '>>>' 1>&2
    rsh $machine /usr/local/rbin/qshutdown -h now "Your message"
done
sleep 30
#
# Sun with disk
#
for machine in ares forseti hera kvasir maia oko pemba twaz
do
    echo '<<<' $machine '>>>' 1>&2
    rsh $machine /usr/local/rbin/qshutdown -h now "Your message"
done
sleep 30
#
# Secondary server
#
for machine in hermes horus
do
    echo '<<<' $machine '>>>' 1>&2
    rsh $machine /usr/local/rbin/qshutdown -h now "Your message"
done
sleep 30
echo "You can shutdown NIS server"
---------- end of /usr/local/rbin/shutall ----------------------

Sincerely,

Pierre VINCENS
Ecole Normale Superieure        E-mail: vincens@wotan.ens.fr
Groupe de bioinformatique               vincens@frulm63.bitnet
46, rue d'Ulm
75230 PARIS CEDEX 05
FRANCE

++++++From uunet!matrox!campus22!rpage%arizona.UUCP@noao.edu Thu Nov 14 01:35:37 1991

> Date: Wed, 13 Nov 91 08:48:56 MST
> From: mcgraw@sunspot.sunspot.noao.edu (Robert McGraw)
> Message-Id: <9111131548.AA09816@sunspot.sunspot.noao.edu>
> To: sun-managers@eecs.nwu.edu
> Subject: remote shutdown.
>
> I am having problems. It seems that some of the remote machines do not get the message or do not shut down.
>
> Below is the script I use to shutdown.
> ----
> #! /bin/csh
> #
> #
> # Modified shutdown for use by UPS directed shutdown.
> # Version for Sun UNIX systems
>
> cd /
>
> /usr/etc/rwall -n allsuns <<!
> *** LOW UPS POWER
> *** SHUTDOWN PROCESS BEGINS IN 1 MINUTES
> *** LOG OFF NOW!!!!!!!!!!
> !

You should probably put this one in the background so you won't waste time broadcasting the messages.

>
> mail -s "UPSSHUTDOWN" root << E_O_F
> UPS directed shutdown
> E_O_F
>
>
> foreach CLIENTS ( aurora plage eclipse flare helios neutrino ra xrays )
> switch ($CLIENTS)
> default:
> echo HOST $CLIENTS
> rsh -n $CLIENTS shutdown -h +2 LOW UPS POWER &

On my SUN IPC, it took more than 30 minutes for the shutdown to take effect. This is because shutdown does a rwall to all the machines that mount file systems from your server. With 30 PC-NFS users that don't need that warning, it takes a lot of time.

Instead I would do something like this (you could add a 2-minute delay if you want):

rsh -n $CLIENTS "shutdown -k now LOW UPS POWER &"
rsh -n $CLIENTS "halt"

> breaksw
> endsw
> end
> shutdown -h +5 LOW UPS POWER &

Same remarks apply here too.

--
+----------------------------+----------------+----------------------+
| Real Page                  | 1055 St-Regis  | (514) 685-7230 #2359 |
| Hardware Design Engineer   | Dorval, Canada | (514) 685-7030 Fax   |
| Matrox Electronics Systems | H9P 2T4        | Real.Page@matrox.com |
+----------------------------+----------------+----------------------+

++++++From leh@manatee.cis.ufl.edu Thu Nov 14 08:49:27 1991

A common problem that people have when rsh'ing is .cshrc (or .tcshrc, etc.) interaction, such as a confirmation of the TERM setting. You may want to check your dot files.

Les Hill

Extraordinary crimes against the people and the state have to be avenged by
agents extraordinary. Two such people are John Steed -- top professional, and
his partner, Emma Peel -- talented amateur; otherwise known as "The Avengers."
INTERNET: leh@ufl.edu  UUCP: ...!gatech!uflorida!leh  BITNET: vishnu@UFPINE

++++++From cypress!cypress.com!mdl@decwrl.dec.com Wed Nov 13 12:01:40 1991

This is what I use. It won't work if the order of my machines is not just right. I didn't research the problem, I just got it working by trial and error. It may have had something to do with NFS. How long did you give the machines to come down?

#!/bin/csh -f
/usr/etc/rwall -n csdc <<!
^G^GSYSTEM BEING BROUGHT DOWN NOW ! ! !^G^G
!
echo " " >> /ups/upsdown.log
echo "UPS directed shutdown:" >> /ups/upsdown.log
echo " " >> /ups/upsdown.log
date >> /ups/upsdown.log
# shutdown sequence
rsh -n nv "shutdown -f +2" &
rsh -n al "shutdown -f +5" &
echo " " >> /ups/upsdown.log
echo "All processes will now be killed." >> /ups/upsdown.log
echo " " >> /ups/upsdown.log
shutdown -f +2


