SUMMARY: URGENT help needed - ? RPC problems

From: Frank Allan (fallan@awadi.com.AU)
Date: Mon Sep 07 1992 - 16:37:09 CDT


Once again thanks to all who responded in our moment of need.
A number of people suggested using trace to try to pinpoint
the problems, and this helped to direct our attention towards
rpc.lockd.

Maria Barnum put us on the right track when she wrote:

        I had a problem with WingZ, and the version of patch
        100075-08 for 4.1.2 that I had was bad. I got a newer one
        *with April 17* as the date on the files rather than April 15.

        I got "version mismatch" messages in the messages file of the
        server -- It was looking for some "version 2" and they had
        mistakenly loaded "1" as the version number into some register.
        It either wouldn't appear as a window, or it would say "no licenses
        available".

Our local Sun Support guru (Christian Peter) found the problem
after dialling in and having a look around.

We had some time ago installed patch 100075-08 and he found that
our version of rpc.lockd had a different checksum to the version
he had.

We installed the new version of the patch (note that it was the SAME rev
level) and all our problems disappeared. It appears that there may be
two versions of 100075-08 floating around, so it may pay to check
with your local SUN office if you have this patch.

Doing a 'sum /usr/etc/rpc.lockd' on the good version yields '05166 240',
whereas the bad version gives '60356 240'.

As far as I can tell it only affects the executable for 4.1.1 for
sun4 and sun4c architectures. The executables for 4.1.2 give the same
results for both versions of the patch.
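
For anyone who wants to script this comparison across many machines, here
is a minimal sketch in Python. It assumes that the SunOS 4.x 'sum' command
uses the classic BSD 16-bit rotating checksum with 1024-byte blocks; that
fits the format of the figures above but is not confirmed anywhere in this
thread. The function names and the KNOWN_GOOD / KNOWN_BAD constants are
hypothetical. Only the two checksum pairs come from this summary, and as
noted above they apply to the 4.1.1 executable for sun4 and sun4c.

```python
# Sketch only: assumes SunOS 4.x `sum` is the BSD rotating checksum
# with the size reported in 1024-byte blocks.

KNOWN_GOOD = (5166, 240)    # '05166 240' from this summary
KNOWN_BAD = (60356, 240)    # '60356 240' from this summary


def bsd_sum(stream, block_size=1024):
    """Return (checksum, blocks) for a binary file-like object."""
    checksum = 0
    total = 0
    while True:
        chunk = stream.read(65536)
        if not chunk:
            break
        total += len(chunk)
        for byte in chunk:
            # Rotate the 16-bit checksum right by one bit, then add the byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    blocks = (total + block_size - 1) // block_size  # round up
    return checksum, blocks


def classify(result):
    """Map a (checksum, blocks) pair to 'good', 'bad' or 'unknown'."""
    if result == KNOWN_GOOD:
        return "good"
    if result == KNOWN_BAD:
        return "bad"
    return "unknown"


def check_file(path):
    """Checksum the file at `path` and classify it."""
    with open(path, "rb") as handle:
        return classify(bsd_sum(handle))
```

Calling check_file("/usr/etc/rpc.lockd") on a copy of the binary would
report "good", "bad" or "unknown". An "unknown" result only means the file
matches neither figure quoted here, which is expected for the 4.1.2
executables.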

Thanks for your help, especially to the following:

barnum@pluto.crd.ge.com (Maria A. Barnum)
Christian.Peter@uluru.Aus.Sun.COM (Christian Peter - SUN Sydney - Software Support)
judy@qucis.queensu.ca (Judy Russell)
pjy@merlin.anu.edu.au (Peter Young)
Chris Keane <chris@rufus.state.COM.AU>
ptek@thor.pwcm.com ( Paul Tekverk)
kwthomas@nsslsun.nssl.uoknor.edu (Kevin W. Thomas)
Geert Jan de Groot <geertj@ica.philips.nl>
ems@ccrl.nj.nec.com (Ed Strong)
Aydin Edguer <edguer@alpha.CES.CWRU.Edu>
Kerry Duke <kerry.duke@analog.com>
Bill Hart <Bill.Hart@ml.csiro.au>
Mike Raffety <miker@sbcoc.com>
kevins@kuma3.Japan.Sun.COM (Kevin Sheehan {Consulting Poster Child})
Rod Rebello -- CAD Development <titan!rrebello@enuucp.eas.asu.edu>
Sven Ole Skrivervik <svenole@sdata.no>
stiles@no2sun.cray.com (John Stiles)

I have included my original posting below:

We need your help real bad!!

We have 60 workstations which suddenly can't run the two main
applications they use, namely Wingz and WordPerfect 5.0. Below is the
text of the support call I have logged with SUN's answer centre in
Australia. They had a look tonight but couldn't come up with anything,
and as it is now 9:30 at night we need some answers pretty quickly or
we will have a lot of angry users in the morning. (We expect to still
be here in the morning unless we get lucky.)

----- Begin Included Message -----

Equipment Details
~~~~~~~~~~~~~~~~~

System Type           : two x SparcStation 2, 32Mb memory each
System Serial Number  :
Attached Equip.       : Disks  3 x 207 internal, 1 x 1.3Gb external,
                               3 x 1.0Gb external, 3 x 669Mb external
                      : Tapes  1 x Exabyte 2.3Gb external, 1 x 150Mb external
                      : CD-ROM external

Problem Details
~~~~~~~~~~~~~~~~

Synopsis              : Applications don't run
SunOS Release         : 4.1.1, 4.1.2
SunOS Patches         : 100075-08, 100482-02, 100224-03
Unbundled S/W         : Wingz, WordPerfect
Unbundled S/W Release : 1.1, 5.0
Unbundled Patches     : none
Bug Reference         : none known

Severity : 1

Problem Description : These two machines are servers for a total
of 58 diskless clients. mulga runs 4.1.1 and is the application server
and NIS master. geebung runs 4.1.2 and is a NIS slave. All machines at
our DSTO site are unable to run either of the above products. When the
products are started, either from the command line or from OW3 menus,
the appropriate processes are started on the user machine, but the
product does not run. Wingz does not bring up any windows, and
WordPerfect brings up the starting window but hangs with a "Please wait"
message.

The only change made was that on the weekend we ran ypinit -m on the
server mulga, which has always been our YP master, and added the server
geebung, which has not always been a YP slave, but was nominated as
such when the ypinit was run.

This configuration worked yesterday, but today we have had all sorts of
problems. We have tried rebooting individual machines and eventually
bringing down all machines, including the servers and rebooting
everything, but the problems persist.

We have since made the secondary server a NIS client rather than a
slave, but this has not changed the problems.

Please help.

**********************************************************************

Explanation of the Severity values:

    1 - bug prevents execution of critical function. no workaround
    2 - execution of critical function difficult; no workaround
    3 - critical function difficult, workaround available
    4 - execution of critical function inconvenient, workaround
    5 - execution of non-critical function inconvenient workaround

**********************************************************************
----- End Included Message -----

A couple of things we have found out since the call went in.

1. As root on the NIS master, I can run both apps fine.
2. As a normal user on the NIS master, Wingz works but WP doesn't.
3. On the other server, and on clients of both servers, neither app
   works, even as root.

No error messages are produced, but the process /usr/local/Wingz/Wingz
just sits there and the load figures sit around 1 with no other activity
or users on the machine. For example, from a machine where a user tried
4 times to use Wingz and then went home:

cedar# w
  9:00pm  up  2:24,  1 user,  load average: 3.95, 3.98, 3.99
User     tty       login@  idle   JCPU   PCPU  what
root     ttyp0     9:00pm                      w
cedar# ps -auxw
USER       PID %CPU %MEM   SZ  RSS TT STAT START  TIME COMMAND
root         0  0.0  0.0    0    0 ?  D    18:36  0:02 swapper
root         1  0.0  0.0   52    0 ?  IW   18:36  0:00 /sbin/init -
root         2  0.0  0.0    0    0 ?  D    18:36  0:00 pagedaemon
root        78  0.0  0.0  100    0 ?  IW   18:37  0:00 /usr/lib/sendmail -bd -q1h
root        44  0.0  0.9   68  140 ?  I    18:36  0:00 portmap
bin         47  0.0  0.0   36    0 ?  IW   18:36  0:00 ypbind
root        49  0.0  0.0   40    0 ?  IW   18:36  0:00 keyserv
root        64  0.0  0.0   16    0 ?  S    18:37  0:00 (biod)
root        65  0.0  0.0   16    0 ?  S    18:37  0:00 (biod)
root        66  0.0  0.0   16    0 ?  S    18:37  0:00 (biod)
root        67  0.0  0.0   16    0 ?  S    18:37  0:00 (biod)
root        70  0.0  0.4   60   60 ?  S    18:37  0:00 syslogd
root        84  0.0  2.1  112  320 ?  S    18:37  0:12 rpc.lockd
root        90  0.0  0.2   16   28 ?  S    18:37  0:00 screenblank
root        95  0.0  0.1   12    8 ?  S    18:37  0:09 update
root        83  0.0  0.0   68    0 ?  IW   18:37  0:00 rpc.statd
root        98  0.0  0.0   56    0 ?  IW   18:37  0:00 cron
root       103  0.0  0.6   48   96 ?  S    18:37  0:01 inetd
root       106  0.0  0.0   52    0 ?  IW   18:37  0:00 /usr/lib/lpd
root       207  0.0  2.5   68  380 p0 S    21:00  0:00 -csh (csh)
root       195  0.0  0.0   40    0 co IW   18:45  0:00 - std.9600 console (getty)
root       206  0.0  1.7   24  248 ?  S    21:00  0:00 in.rlogind
root       212  0.0  2.6  168  384 p0 R    21:00  0:00 ps -auxw
daemon     186  0.0  0.0   96    0 ?  IW   18:39  0:00 rpc.cmsd
rbabb      187  0.0  6.7  708 1008 ?  S    18:40  0:00 /usr/local/Wingz/Wingz /home/mulga/rbabb/MVS_HOT_ITEMS.WKZ
root       205  0.0  1.8   48  264 ?  S    21:00  0:00 rpc.rstatd
rbabb      159  0.0  6.6  708  984 ?  S    18:38  1:03 /usr/local/Wingz/Wingz /home/mulga/rbabb/us_drg_status
rbabb      188  0.0  6.7  708 1008 ?  S    18:44  0:00 /usr/local/Wingz/Wingz /home/mulga/rbabb/MVS_HOT_ITEMS.WKZ.SAFE
rbabb      189  0.0  7.3  708 1088 ?  S    18:44  0:00 /usr/local/Wingz/Wingz
cedar#
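
The listing above shows four Wingz processes left behind by the user's
four attempts. When the same check has to be repeated on many clients, a
saved copy of the 'ps -auxw' output can be filtered with something like
the following Python sketch. The function names are hypothetical. It
assumes whitespace-separated columns with the header as the first line,
exactly as in the listing above; real ps output can run adjacent columns
together or print a START value containing a space, and this sketch does
not handle either case.

```python
def parse_ps_aux(text):
    """Parse BSD-style `ps -auxw` output into a list of dicts.

    Assumes the first non-blank line is the header and that every
    column except COMMAND is a single whitespace-free token.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split into at most len(header) fields so COMMAND keeps its spaces.
        fields = line.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # skip lines that do not look like process rows
        rows.append(dict(zip(header, fields)))
    return rows


def matching(rows, needle):
    """Return the rows whose COMMAND column contains `needle`."""
    return [row for row in rows if needle in row["COMMAND"]]
```

Pass in only the ps output itself (not the shell prompt lines); for
example, matching(parse_ps_aux(text), "Wingz") returns one dict per
Wingz process, with the PID available as row["PID"].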

Frank Allan
Network Manager
AWA Defence Industries
Module 3 Endeavour House e-mail: fallan@awadi.com.au
Fourth Avenue Phone: 08 343 6357
Technology Park SA 5095 Fax: 08 260 8938


