solaris 2.0 bug/oddity list

From: Charles Hedrick (hedrick@farside.rutgers.edu)
Date: Sat Nov 28 1992 - 21:43:59 CST


The following is a list of the bugs or other oddities I've found in
the last couple of weeks of intensive work on Solaris 2.0. Since some
of them caused hours of debugging time, I'm posting them in hopes of
saving others from the same pitfalls. Although I assume many of them
are fixed in 2.1, I'm issuing bug reports for all of them that are
clearly bugs. (As you'll see, I haven't gotten around to issuing all
the reports yet.)

hostnames with . don't work: /sbin/bcheckrc needs "", and the
        menu installation system needs to install the initial
        part as a nickname in /etc/hosts. To hotline on 13-Nov-92

TCP ignores MSS on incoming telnet connection over SLIP. Also
        noted that the first telnet attempt resulted in the
        connection opening and immediately closing. To hotline on 13-Nov-92

xterm and other terminal emulators don't make entries in utmp
        To hotline on 13-Nov-92

telnetd does not implement the LFLOW option, and does not send sync
        for ^C. To hotline on 13-Nov-92.

Suddenly nn couldn't find /fac/u4, and /net/athos/u4 showed nothing.
An explicit mount worked, and after that the automount reference came
back to life as well. To hotline 18-nov-92. It happened again on
Nov 20. This time I didn't do anything to try to fix it, and it
came back later.

I'm getting consistent failures the first time I try to telnet to
farside from my home machine over slip. To hotline 18-nov-92

snoop seems to fail the first time I try it in a given session.
        Using device le0 (promiscuous mode)
        offset 0: totlen=537479672
        snoop: bad packet header in buffer
To hotline 18-nov-92

Job control doesn't work with X-based emacs or xterm. They seem
to be doing setsid, and thus putting themselves in a different
process group than the tty. It's unclear whether this is a problem
with the kernel, the program, or something in Xlib.
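
A minimal sketch of the setsid effect described above, not taken from
emacs or xterm. The function name setsid_leaves_process_group is
hypothetical. It only shows that a process calling setsid() becomes
the leader of a new process group (and a new session with no
controlling tty), which is why the shell's job control no longer
reaches it. It does not settle whether the kernel, the program, or
Xlib is at fault.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that calls setsid(), then report what happened.
 * Returns 1 if the child ended up leading a new process group that
 * differs from the parent's, 0 if it did not, -1 on error.
 */
int setsid_leaves_process_group(void)
{
    pid_t parent_pgrp = getpgrp();
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* A freshly forked child is never a group leader, so this
           call is allowed to succeed. */
        if (setsid() < 0)
            _exit(2);
        /* After setsid() the child is its own group leader. */
        _exit(getpgrp() == getpid() && getpgrp() != parent_pgrp ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```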

garbled screen with 4.1.1 executables that use curses, particularly
under the xterm supplied with openwin.

I have to set "stty -tabs" or I get strange results. Maybe only with
xterm.

The first time I telnetted in from home today, my tty type wasn't
known. Other times it's been fine.

a documentation bug: the initial state of /etc/netconfig is
switch.so,tcpip.so. The initial state of /etc/nsswitch.conf also
includes "file". One would think that tcpip.so is not needed, since
nsswitch also looks at /etc/hosts. But without tcpip.so, booting
fails. Documentation doesn't say how netconfig works, nor in general
does it give enough information to explain this. Indeed the NETNAME
manual in the 2.0 AnswerBook seems to be out of date. It assumes that
a dns library will be put in /etc/netconfig, and talks about using a
special dns version of the stream libraries which doesn't exist.
To hotline 18-nov-92

Sun has asked us to file a bug report that the system won't
boot with only switch.so in /etc/netconfig.

When mm calls emacs from send>, and emacs returns, mm claims that
the edit failed. Not sure if this is reproducible. Hold for
further investigation.

when postnews calls emacs, on one occasion emacs immediately exited.
This is not reproducible.

When perl is ported as a SVR4 program (our current port is BSD-based),
it fails lib/bnum test 186. A division should result in 6, but gives
NaN. (I forgot to write down the actual details.) For the moment,
use the BSD port.

GCC requires fixed include files. It automatically puts the
fixed version of /usr/include ahead of /usr/include when include
files are defaulted. However, when you have to specify things
explicitly, you don't get the fixed ones. And the install script
doesn't produce a fixed version of /usr/ucbinclude. I have fixed
/usr/ucbinclude, and made a symlink to the fixed /usr/include
so that people can refer to it. (The normal location is so
deep under /opt/cygnus that nobody could ever find it.) So
for the moment, instead of -I/usr/include, say
  -I/opt/cygnus/lib/include
and instead of -I/usr/ucbinclude, say
  -I/opt/cygnus/lib/ucbinclude

I'm having trouble building programs that use sharable libraries
outside of /usr/lib. ld.so can't find the sharable libraries. I've
tried all the documented features of ld for specifying the location of
libraries, with no change. Oddly enough, emacs can find /usr/ucblib
but not /usr/openwin/lib. But perl can't find /usr/ucblib. For the
moment I'm static linking the libraries that can't be found. This is
obviously NOT a good solution.

With the "out of the box" setup, man -k doesn't work, and man can't
find stuff from openwindows.

Weird termcap stuff for vnews and to some extent nn. See if
maybe /etc/termcap is bad.

login doesn't have -p or any other way to pass the environment

sh and ksh have SIGPWR. csh and tcsh have SIGLOST.

sccsdiff doesn't recognize -c

Rlogin (even using Sun's versions) does not work reliably within the
same system. The out of band messages describing changes in ixon
don't always get through. A single change will work OK, but when you
start emacs, the change doesn't always happen. My theory is that
there's a problem with out of band data via the loopback device.

In krlogin, I was having a problem with hanging. There may be
more than one problem, but I suspect that
        recv(rem, &mark, 1, MSG_OOB)
is blocking when it shouldn't. It appears that this never blocks
on 4.1.1; if it did, I don't see how the code could work. It does
block under Sol2.0. As a workaround I do
        ioctl(rem, FIONBIO, &on);
before the recv, and then turn non-blocking mode off again after it.
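
A minimal sketch of that workaround, assuming the socket was in
blocking mode beforehand. The wrapper name recv_oob_nonblocking is
hypothetical and is not part of krlogin. On some systems FIONBIO is
declared in <sys/filio.h> rather than <sys/ioctl.h>.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Read one out-of-band byte from rem without ever blocking.
 * Returns whatever recv() returned; errno is the one set by recv().
 * Assumption: the descriptor was blocking on entry, so it is put
 * back into blocking mode before returning.
 */
ssize_t recv_oob_nonblocking(int rem, char *mark)
{
    int on = 1, off = 0;
    ssize_t n;
    int saved_errno;

    if (ioctl(rem, FIONBIO, &on) < 0)
        return -1;

    n = recv(rem, mark, 1, MSG_OOB);
    saved_errno = errno;

    /* Restore blocking mode for the rest of the program. */
    (void)ioctl(rem, FIONBIO, &off);
    errno = saved_errno;
    return n;
}
```

With no urgent data pending, the call fails at once (the exact errno
varies by system) instead of hanging.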

In krlogind, I have the opposite problem. Krlogind starts a subfork,
and needs to wait until the subfork is started. Thus the subfork
outputs a character on the controlling terminal (a slave pty),
and the main program does the following on the master pty.
        if (read(p, &c, 1) != 1)
This is intended to block until the character is available.
However, it returns with the error ECHILD when the child hasn't
started yet. Note that the ptys are streams ptys, i.e. /dev/ptmx
and /dev/pts/N.
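
A minimal sketch of the one-byte startup handshake described above.
To stay self-contained it uses a pipe in place of the streams pty
pair, so it shows the intended blocking behavior and does not
reproduce the ECHILD failure. The names read_one_byte and
startup_handshake are hypothetical and are not taken from krlogind.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read exactly one byte, retrying if a signal interrupts the call. */
static int read_one_byte(int fd, char *c)
{
    ssize_t n;

    do {
        n = read(fd, c, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}

/*
 * Fork a child that announces itself by writing one byte.
 * The parent blocks in read() until that byte arrives.
 * Returns the byte received, or -1 on failure.
 */
int startup_handshake(void)
{
    int fds[2];
    char c = 0;
    pid_t pid;
    int ok;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        if (write(fds[1], "r", 1) != 1)   /* "r" for ready */
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    ok = read_one_byte(fds[0], &c);       /* blocks until the child writes */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return ok == 0 ? (unsigned char)c : -1;
}
```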

There is an incompatibility in recvfrom that killed Kerberos 5.
recvfrom reads a message and puts the source address into a buffer
supplied by the user. If the user supplies an unreasonable buffer
size, it may hang. In fact Kerberos was not initializing the size at
all. This appears to be a bug in Kerberos, since even under 4.1.1 the
man page instructs the caller to set it. But it's a subtle difference
between 4.1.1 and the socket emulator under Sol2.0 that it might be
prudent to fix.
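
A minimal sketch of the calling convention in question, not the
Kerberos code itself. The wrapper name recv_with_source is
hypothetical, and socklen_t and struct sockaddr_storage are the
modern types (the 4.1.1 interface used a plain int for the length).
The point is only that the length argument is value-result and has
to be set before every call.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Receive one message and its source address.
 * *fromlen is set to the size of the address buffer before the
 * call; recvfrom() overwrites it with the size of the address it
 * actually stored.
 */
ssize_t recv_with_source(int fd, void *buf, size_t len,
                         struct sockaddr_storage *from,
                         socklen_t *fromlen)
{
    memset(from, 0, sizeof(*from));
    *fromlen = sizeof(*from);   /* must be initialized by the caller */
    return recvfrom(fd, buf, len, 0, (struct sockaddr *)from, fromlen);
}
```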


