SUMMARY: Pseudo Terminal Weirdness

From: James Ashton (jaa101@gorton.anu.edu.au)
Date: Thu Sep 03 1992 - 07:09:09 CDT


Sorry about the delay in posting this summary but it was a fairly low
priority and it's taken a while to have time to try out some of the
options and play about.

The original problem:

>We have an SS2 running 4.1.1. It uses xdm from X11R5 to run the
>console (OW3.0 xnews - yuck) and one Xterminal. Today the users
>complained that they couldn't start a new xterminal. When started the
>csh in the term complained about reading EOF about 20 times (the `Use
>"logout" to logout.' message) before giving up and dying. Obviously it
>was reading EOF from the pty (/dev/?typ4). Starting 2 xterms at once
>allowed the second xterm to succeed on /dev/?typ5. I've even removed
>and re-mknoded /dev/?typ4 with no effect. It looks like the kernel's
>pty driver for this device has screwed up. I can't see any processes
>that have it open and the file permissions are `crw-rw-rw- 1 root'.
>I'd rather not reboot until I find out what's going on even though I'm
>90% certain that that will fix things up. Any clues?
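
The behaviour described above (every new xterm landing on the same
broken pty, and a second simultaneous xterm moving on to the next one)
is what a first-free search would produce. Here is a minimal sketch in
Python, assuming the traditional BSD naming scheme /dev/pty[p-s][0-9a-f]
and assuming the allocator stops at the first master that opens. The
function names are hypothetical and this is not xterm's actual code.

```python
import os
import string


def bsd_pty_pairs(banks="pqrs"):
    """Yield (master, slave) names in the traditional BSD search order."""
    for bank in banks:
        for unit in string.digits + "abcdef":
            yield f"/dev/pty{bank}{unit}", f"/dev/tty{bank}{unit}"


def first_free_master(pairs):
    """Open the first master that can be opened; skip busy or missing ones.

    Returns (fd, master_name, slave_name), or None if nothing opened.
    Note that nothing here checks whether the slave side is healthy.
    """
    for master, slave in pairs:
        try:
            fd = os.open(master, os.O_RDWR)
        except OSError:
            continue  # in use by someone else, or the node does not exist
        return fd, master, slave
    return None
```

Under these assumptions, a master that opens cleanly but whose slave
only ever returns EOF gets chosen by every new client until something
holds it open, at which point the next client moves on to the
following pair.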

There were various suggestions, but mostly people felt this was a
long-standing bug that pops up occasionally. There was a mention of a
patch that might work, but I haven't tried it out, mainly because we'll
soon be moving to 4.1.3. I have since confirmed that a reboot will fix
the problem. I did manage to bring one pty back to life on a running
system. I was trying to find out why rlogin failed to open the pty
that xterm succeeded with, and was also reading from and writing to
the tty and pty, when I noticed that a user had successfully opened
the pty for use! I've not yet managed to find a repeatable procedure.
One person suggested the use of fuser to find a process holding on to
the pty in some strange mode, but each time I've tried, fuser finds no
processes. The old `kill -1 1' was suggested but was ineffective.

And now the responses:

>Date: Wed, 12 Aug 92 02:40:53 PDT
>From: iapsd!seri!glenn@uunet.uu.net (Glenn Herteg)
>
>I have seen this problem for years, at least back to SunOS 4.0.1,
>running SunView. I have at various times thought it might be solved
>by SunOS patches, though I never got around to installing them. I
>have not seen the problem lately, though we're not exercising our
>systems the same way we used to. If you find out what the problem
>is, I'm intensely curious myself ...
>
>One way around the problem is "set ignoreeof" in your .cshrc file.
>You start a window, notice the problem, but it doesn't cause any
>significant problems per se to the rest of the system. Now you
>just close up the window and tuck it away in the corner of your
>screen, where it ties up the pseudo-terminal and you can create
>other windows at will.
>
>The only thing I've ever seen clear this problem is a reboot.
>
>Now here's an oddball thought. A few days ago I reset my system
>clock about a year back. Suddenly the csh in that window started
>scrolling prompts interminably, effectively locking up the workstation
>to local accesses. I cleared the problem with a remote login and
>reset the time forward by a few days (still nearly a year back
>from wall clock time). Since the symptom of apparently reading
>an incessant stream of useless data is so similar, it occurs to
>me that maybe the pty driver is somehow getting confused about
>the time... though why this should make any difference is unclear.
>
>Glenn Herteg
>glenn%iapsd@uunet.uu.net

------------------------------------------------------------------

>From: kalli!glenn@fourx.Aus.Sun.COM (Glenn Satchell)
>Date: Wed, 12 Aug 1992 12:30:59 EST
>
>Yes, rebooting will fix it for now... You should also try installing
>Patch 100188-02: One of the bugs fixed is "Process not letting go of a
>pty (bugID 1040722)". Note that this supersedes patch 100414-01.
>
>regards,
>
>Glenn Satchell
>Unix Professional Services (Sydney Australia)
>kalli!glenn@fourx.aus.sun.com

------------------------------------------------------------------

>From: Brent Alan Wiese <brent@crick.ssctr.bcm.tmc.edu>
>Date: Tue, 11 Aug 92 10:16:40 CDT
>
>... It is interesting to note that rlogin and telnet will not
>pick up these EOF-pty's, only xterm seems to.

------------------------------------------------------------------

>Date: Tue, 11 Aug 92 09:24:13 CDT
>From: Mike Raffety <miker@sbcoc.com>
>
>Try fuser on the master and slave pty devices; I'll bet SOMETHING's
>still got it open in a funny mode.
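
For readers without fuser, the check suggested above can be
approximated on a system with a Linux-style /proc. That is an
assumption: SunOS 4.1.1 has no such interface. The sketch below is in
Python, the function name holders is hypothetical, and it only sees
processes whose descriptor tables the caller may read, so it is best
run as root.

```python
import os
import stat


def holders(device_path, proc_root="/proc"):
    """Return sorted PIDs that have the given character device open.

    Devices are matched by device number (st_rdev), so any node naming
    the same device counts, whatever its path.
    """
    target = os.stat(device_path)
    if not stat.S_ISCHR(target.st_mode):
        raise ValueError(f"{device_path} is not a character device")
    found = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if stat.S_ISCHR(st.st_mode) and st.st_rdev == target.st_rdev:
                found.add(int(entry))
                break
    return sorted(found)
```

It would be called once for each side of the pair, for example
holders("/dev/ptyp4") and holders("/dev/ttyp4"). An empty list from
both matches the "fuser finds no processes" result.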

------------------------------------------------------------------

>From: Steve_Kilbane@gec-epl.co.uk
>Date: Tue, 11 Aug 92 08:35:36 BST
>
>In article <9208102345.AA23976@gorton.anu.edu.au> you write:
>> When started the
>>csh in the term complained about reading EOF about 20 times (the `Use
>>"logout" to logout.' message) before giving up and dying. Obviously it
>>was reading EOF from the pty (/dev/?typ4)
>
>Not necessarily. I've seen this behaviour on normal terminal lines, where a
>program has set NDELAY, then died. The csh then gets no bytes from the
>terminal, and treats it as EOF. In our case, it was cleared by logging out
>(in fact, the login csh bombs out, as you discovered), and init resets
>the terminal line. In the case of ptys, things tend to be a bit more screwed,
>because init isn't resetting them.
>
>I don't know if this is your problem, though, because NDELAY is a
>characteristic of the open file table entry, rather than of a device or an
>inode, so this should be cleared when the file is closed.
>
>> I can't see any processes
>>that have it open and the file permissions are `crw-rw-rw- 1 root'.
>>I'd rather not reboot until I find out what's going on even though I'm
>>90% certain that that will fix things up. Any clues?
>
>Should do. It'll certainly close the file:-).
>
>Hope this is of some help...
>
>Steve
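
The point above, that NDELAY belongs to the open file table entry
rather than to the device, can be checked directly. This is a minimal
sketch in Python, using O_NONBLOCK as the modern stand-in for NDELAY
(an assumption, since the two differ in what read() returns when no
data is ready). The function names are hypothetical.

```python
import fcntl
import os


def is_nonblocking(fd):
    """True if O_NONBLOCK is set on the open file entry behind fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def flag_scope(path):
    """Set O_NONBLOCK through one descriptor and report who else sees it.

    'dup' shares the open file entry with 'first'; 'second_open' comes
    from an independent open() of the same path.
    """
    first = os.open(path, os.O_RDONLY | os.O_NOCTTY)
    second = os.open(path, os.O_RDONLY | os.O_NOCTTY)
    shared = os.dup(first)
    try:
        flags = fcntl.fcntl(first, fcntl.F_GETFL)
        fcntl.fcntl(first, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        return {
            "first": is_nonblocking(first),
            "dup": is_nonblocking(shared),
            "second_open": is_nonblocking(second),
        }
    finally:
        for fd in (first, second, shared):
            os.close(fd)
```

The flag follows dup'ed (and hence inherited) descriptors but not a
fresh open of the same device, so a leftover NDELAY would need some
surviving descriptor to carry it. That is consistent with the doubt
expressed above about whether this explains the pty case.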

------------------------------------------------------------------

>Date: Tue, 11 Aug 92 16:55:10 +1000
>From: Chris Keane <chris@rufus.state.COM.AU>
>
>It's probable that some program has it open and the EXCL open performed
>by login isn't working. This happens sometimes.
>You can find out which program it is by using /etc/fuser /dev/ttyp4
>
>A quick dirty solution is to chmod 000 /dev/ttyp4 until the next time
>you reboot.
>
>Chris.

------------------------------------------------------------------

>Date: Tue, 11 Aug 92 13:30:34 EST
>From: ivan@fac.anu.edu.au (Ivan Dean)
>
>...
>You might look for processes that have '?' listed as their pty. In at least one
>case, a process like that was affecting someone else's pty. Otherwise, have you
>tried HUPing the init process, with kill -HUP 1 ??
>
>Ivan
______________________________________________________________________________
James Ashton                                              System Administrator
                                             Department of Systems Engineering
Voice +61 6 249 0681      Research School of Physical Sciences and Engineering
FAX +61 2 249 2698                              Australian National University
Email James.Ashton@syseng.anu.edu.au     GPO Box 4 Canberra ACT 2601 Australia


