<exiting> problem - answer summary

From: Brian Inwood (binwood@adelphi.ua.oz.au)
Date: Thu Feb 01 1990 - 13:22:02 CST


>From binwood Thu Feb 1 13:14:41 1990
Return-Path: <binwood>
Received: by adelphi.ua.oz.au (4.0/3.3MX)
        id AA03225; Thu, 1 Feb 90 13:14:41 CST
From: Brian Inwood <binwood>
Message-Id: <9002010244.AA03225@adelphi.ua.oz.au>
Subject: no subject (file transmission)
To: binwood@adelphi (thats me)
Date: Thu, 1 Feb 90 13:14:40 CST
X-Mailer: ELM [version 2.2 PL10]
Status: O

******************************************************
THE PROBLEM:
******************************************************
A csh process is started when a user logs in via a serial line.
 Sometimes, if that process is stopped, either (say) by killing the shell or
  because rlogin terminates after a net failure (etc.), that csh has the status
        <exiting>
 as shown by
        ps -aux
 
 That user is shown to be still logged in.
 Sometimes (rarely) the problem fixes itself; otherwise the serial line
 is unusable until the host machine is rebooted.
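
 [editor's sketch, not part of the original post] The stuck processes can be
 picked out of the ps listing mechanically. The function name list_exiting is
 made up for illustration, and the code assumes the BSD-style "ps -aux" layout
 with the PID in column 2 and the command as the last field; check this
 against your own ps output before relying on it.

```shell
#!/bin/sh
# list_exiting (hypothetical helper): read "ps -aux"-style output on stdin
# and print the PID of every process whose command is shown as <exiting>.
# Assumption: PID is the second column and the command is the last field.
list_exiting() {
    awk 'NR > 1 && $NF == "<exiting>" { print $2 }'
}

# Typical use:
#   ps -aux | list_exiting
```

 Each PID printed is a candidate for the trace -p treatment described in the
 answers.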

******************************************************
THE ANSWERS:
******************************************************
thank you for the quick replies:

first off the rank was my own computing center with an approx 1 minute response
time (surely a record?), in a situation paradoxically called a 'communication
breakdown'. the answer was

       trace -p on the offending process, and then type control-c to
exit trace. The process then goes away.
(this fixed my problem)

thank you to
sun adelaide/my computing center
Tim Raymond raymond@uvm.edu
Chuck Foley
John R. Deuel <kink@rice.edu>

******************************************************
further illuminations were
******************************************************
      Try the following shell script

# Argument should be process number of exiting process.
trace -p $1
# After issuing this command, use ^C.
# To restart the modem afterwards, turn the port off in /etc/ttytab, do a
# kill -HUP 1, then turn it back on and do the kill -HUP 1 again.

The comment at the end may or may not be relevant since sometimes the
trace and Ctrl-C will solve the problem with init starting another
getty at the port. However, in case no getty is started, you can use
the following shell script afterwards. It takes as argument ($1) any
substring in the name of the port, e.g., ttya, ttyd0, etc.

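#!/bin/sh
# Argument ($1): any substring of the port name, e.g. ttya. Run as root.
TMP=/tmp/rg.$$
# Keep a copy of the original ttytab so it can be restored below.
cp /etc/ttytab /etc/ttytab.sv
# Turn the matching port(s) off and make init reread ttytab.
sed -e "/$1/s/on/off/" /etc/ttytab > $TMP
cp $TMP /etc/ttytab
rm $TMP
kill -HUP 1
# Restore the original (port on again) and signal init once more.
cp /etc/ttytab.sv /etc/ttytab
kill -HUP 1
# Check whether a getty is now running on the port.
ps uax | grep "$1"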

Make sure you check that this does something reasonable on your
system.
Needless to say, one has to be root to run these shell programs.

Leonard Evens len@math.nwu.edu
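
[editor's sketch, not part of the original reply] One thing worth checking
before running the script above: /$1/s/on/off/ changes the first "on" found on
a matching line, so a port substring such as console would turn "console" into
"coffsole". A more cautious filter, assuming the status word in ttytab is a
whitespace-delimited "on", might look like this. The name ttytab_off is made
up for illustration, and it only filters stdin to stdout; it does not touch
/etc/ttytab itself.

```shell
#!/bin/sh
# ttytab_off (hypothetical helper): read a ttytab on stdin and write it to
# stdout with the status word "on" changed to "off" on every line that
# mentions the port substring given as $1.
# Only an "on" delimited by whitespace (or by whitespace and end of line)
# is changed, so the "on" inside a word such as "console" is left alone.
ttytab_off() {
    ws=$(printf ' \t')          # a space and a tab, for the bracket expressions
    sed -e "/$1/s/\([$ws]\)on\([$ws]\)/\1off\2/" \
        -e "/$1/s/\([$ws]\)on\$/\1off/"
}

# Typical use (inspect the result before copying it anywhere):
#   ttytab_off ttya < /etc/ttytab > /tmp/ttytab.new
```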
******************************************************
You should try finding the parent process and killing that. If
it is an xterm process, try sending it a SIGCHLD; if it is rshd, just
send it a SIGTERM.

Daniel Trinkle trinkle@cs.purdue.edu
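
[editor's sketch, not part of the original reply] Finding the parent can be
done from a long ps listing. The function name parent_of is made up for
illustration, and the code assumes the listing has a header line containing
PID and PPID columns (as "ps -axl" prints on BSD-style systems) with no empty
columns before them.

```shell
#!/bin/sh
# parent_of (hypothetical helper): read a long-format ps listing on stdin
# and print the parent PID of the process whose PID is given as $1.
# The PID and PPID columns are located by name from the header line.
parent_of() {
    awk -v pid="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")  p = i
                if ($i == "PPID") pp = i
            }
            next
        }
        p && pp && $p == pid { print $pp; exit }
    '
}

# Typical use, for an exiting process 566:
#   ps -axl | parent_of 566
# and then, depending on what the parent turns out to be,
#   kill -TERM <parent pid>      (for rshd)
```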
******************************************************
You need to reconfigure the kernel so that it does not treat the serial line
as hard-wired.

Take a look at man page "zs", zs0 has the config for ttya and ttyb.
Resetting the bit value in flags for the corresponding serial line
should solve the problem.

Albert S. Kuo UUCP: yale!kuo-albert
******************************************************
Connect a terminal and feed in a control-Q. Or use TIOCSTI to
pretend to do that.

<matt@oddjob.uchicago.edu>
******************************************************
I have a problem like this when using tip.
Here's the relevant portion of my script that I use when I'm trying to
kill off errant tip processes.
In the bourne shell:

kill -9 $pid_to_kill
sleep 5
gcore $pid_to_kill &
sleep 10 ; kill -9 $!

[the $! is the pid of the gcore child]

andy beals bandy@capmkt.com
******************************************************
Well, if the state of the process is 'D' or 'Z' then rebooting is the
only method which will work every time. These processes are sometimes
referred to as 'zombies' (refers to the Z state). There was a recent
thread on this subject on the net in comp.unix.wizards.
The gist is that there is a way with trace and one other command
to glom onto the process and in very rare instances get it to terminate.

Mark Morrissey ARPA: bit!markm@cse.ogi.edu -or- bit!markm@sun.com
******************************************************
Try editing the tty out of /etc/ttys: change the first 1 on the
relevant line to a 0, save the file, kill -1 1, put the 1 back in and kill
-1 1.
If that fails, try it again but in between the first kill and re-edit,
unplug the serial line.
<andie%cstr.edinburgh.ac.uk@murtoa.cs.mu.oz>
******************************************************
(a) you should get the serial line patches for sunos 4.0.3 (called
        yapt 5.5)
(b) the process is exiting, which means that it is a zombie (it
        has called exit but nobody has picked up its exit status)

you can usually get these to go away by running trace on them,
say on process 566:
        trace -p 566

then control-c out of trace, and the process should go away. the
problem (i think) is due to the terminal line driver trying to
access a stream structure after the stream has been disposed of,
so it hangs waiting on a resource that never becomes free. when
you trace the process, you raise its priority so that the kill
signal can get through, and the exiting process can continue
cleaning up after itself.

--hal stern
  sun microsystems

******************************************************
and this guy would like more help...
******************************************************
2) trace -p pid
Note that this does not work for SunOS releases before 4.0. For one thing, there is no
trace command. Unfortunately for us we need to run our one server under 3.5.
I called the SUN hotline and they looked at the code for trace. It turns out
that the UNIX command trace calls ptrace(2), a system call. I tried to
approximate the same sequence in C but I could not get it to kill the process.

If you hear of a way that works under 3.5 *PLEASE* let me know. Good luck!

                                                Bill Krauss

---------------------------------------------------------------------------
| William F. Krauss III | Moravian College, Bethlehem PA 18018 |
| Computer Science Dept | CSNET / INTERNET -> kraussW@moravian.edu |
| System Administrator | UUCP -> ...!rutgers!liberty!batman!kraussW |
| Phone - 215-861-1441 | BITNET -> kraussW%moravian.edu@relay.cs.net |
---------------------------------------------------------------------------
******************************************************

******************************************************
and a postscript....
******************************************************
C.R.Ritson@newcastle.ac.uk pointed out that i probably had trouble with
mail because i was using ELM PL10. Chris, i will update soon!

 brian inwood
 computing officer
 department of physics and maths physics
 university of adelaide
 australia
 binwood@adelphi.ua.oz.au


