SUMMARY: Problem: SparcStation 5 "Async memory fault"

From: Gal Shalif (gal@sd.co.il)
Date: Tue Aug 22 1995 - 12:27:30 CDT


Hello sun managers,
    (this is a repost. I am afraid that the original one never made it
     to the mailing list).

THE PROBLEM:

    SparcStation 5 machines issue an "Async memory fault" message
    and reboot.

THE SOLUTION:

    For SunOS 4.1.3_U1, the problem is fixed in jumbo patch #101508-XX.
    I obtained and installed the patch, and it solved the problem.

MORE DETAILS:

    The problem shows up on SparcStation 5 machines with certain memory
    configurations. It happens on machines running either SunOS 4.1.3
    or Solaris 2.x.

    I got a few answers that say:

>> This is a well known SS 5 problem. There is a system timing error that
>> causes problem. It is fixed in jumbo patch #101508-XX which modifies
>> kernel software.
>>

    and some that say:

>> The memory SIMMs are to blame, Replace it.

    and some that combined both:

>> probably it is a hardware problem, but sun has a software fix for it
>> in the jumbo patch for SUNOS 4.1.3_U1.

    Many reported that the patch fixed their problems, and others reported
    that replacing the memory fixed it.

    I obtained and installed the SunOS 4.1.3_U1 patch, and it fixed the problem.

    (I got the patch from:
        sunsite.unc.edu:/pub/sun-info/sun-patches/101508.readme
        sunsite.unc.edu:/pub/sun-info/sun-patches/101508-xx.tar.Z)

CREDITS:

    Ray Brownrigg ray.brownrigg@isor.vuw.ac.nz
    Andrew Watkins andrew@dcs.bbk.ac.uk
    Ashwin P. Rao ashwin@cadence.com
    Yehuda Bamnolker bamby@scorpio.com
    Bismark Espinoza bismark@alta.jpl.nasa.gov
    Lopez Eli celita@taux01.nsc.com
    Daniel Hurtubise daniel@sar3.canr.hydro.qc.ca
    Eric William Burger ericb@telecnnct.com
    Fletcher Mattox fletcher@cs.utexas.edu
    Gerry Dunnion gdunnion@nova.ucd.ie
    Jeff Marble jmarble@cambric.com
    Nicholas J Brealey nick.brealey@aea.orgn.uk
    Nelson Fernandez nlf@aluxpo.att.com
    Paul Chang paulc@sybase.com
    Frank Branham root@cpatl.com
    Cynthia Shang shang@geng.aer.loral.com
    Ron Hall thorn@cc.mcgill.ca

    Special thanks to Daniel Hurtubise (daniel@sar3.CANR.Hydro.Qc.CA)
    for his program that reproduces the problem.

-- Thanks a lot,

   Gal Shalif, R&D group

 /-----------------------------------------------------------------\
| Gal Shalif | Internet: gal@sd.co.il |
| Software Engineer | Voice: +972 9-507102, ext. 209 |
| Summit Design (EDA) Ltd | Fax: +972 9-509118 |
 \-----------------------------------------------------------------/
  \ In god we trust, everybody else must pay in cash /
   ---------------------------------------------------------------

ORIGINAL MAIL:

    From: gal (Gal Shalif)
    Subject: Problem: SparcStation 5 "Async memory fault"
    To: sun-managers@eecs.nwu.edu
    Cc: ant (Anat Cohen), regev (Ilan Regev), el (Elchanan Herzog),
            urif (Uri Farkash), gal (Gal Shalif)
    Date: Wed, 15 Feb 1995 16:00:27 +0200 (EET)
    Organization: Summit EDA Technologies Ltd.
    Reply-To: gal
    
    Hello sun managers,
    
    Thanks for any clue, help or suggestion.
    
    
    THE PROBLEM:
    
        SparcStation 5 machines issue an "Async memory fault" message
        and reboot.
    
    
    DETAILED DESCRIPTION:
    
        Our SparcStation 5 machines have a problem
        that causes occasional system crashes.
    
        At random, the SparcStation 5 machines issue an "Async memory fault"
        message and reboot.
    
        This problem is specific to our SparcStation 5 models.
        It never happens on the other SparcStation models we have (1, 2, 10).
    
        The problem has existed since the first day
        the machines arrived at our site, about 8 months ago.
    
    
    The O.S. version is:
    
        $ uname -a
        SunOS tokyo 4.1.3_U1 1 sun4m
    
    
    
    STATISTICS:
    
        The statistics count occurrences of the error message
        in the /var/adm/messages files over the last 29 days.
    
    
            ----------------------------------------------------------
             Machine Type | Number of Stations | Errors per Station
            ----------------------------------------------------------
             sparc 5      |          3         |          0
             sparc 5      |          6         |          1
             sparc 5      |          3         |          2
             sparc 5      |          1         |          3
             sparc 5      |          1         |          4
             sparc 5      |          1         |          5
             sparc 5      |          1         |         13
            ----------------------------------------------------------
             Total        |         16         |         37
            ----------------------------------------------------------
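
The totals in the table can be cross-checked with a short calculation.
The sketch below is only an illustration and is not part of the original
mail. It assumes that each row's error count is per station (which is the
reading under which the totals add up), and the variable names are made up
for the example.

```python
# Rows from the table: (number of stations, errors seen on each of them).
# Assumption: the error count in each row is per station, not per group.
rows = [(3, 0), (6, 1), (3, 2), (1, 3), (1, 4), (1, 5), (1, 13)]
DAYS = 29  # observation window stated in the mail

total_stations = sum(n for n, _ in rows)
total_errors = sum(n * errs for n, errs in rows)
affected = sum(n for n, errs in rows if errs > 0)

print("stations:", total_stations)
print("errors:", total_errors)
print("stations with at least one crash:", affected)
print("crashes per station per day: %.3f" % (total_errors / total_stations / DAYS))
```

Under that reading, 13 of the 16 machines crashed at least once in the
29 days, so the fault is spread across the machines rather than confined
to one bad unit.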
    
    
    
    DIAGNOSTICS:
    
        The following is the output that the crashing system
        wrote to its /var/adm/messages log file.
    
    
    
        Feb 14 13:38:12 tokyo vmunix: Async memory fault mfsr=0x818028a0 mfar=0xf67376
        Feb 14 13:38:12 tokyo vmunix: panic on cpu 0: async memory fault
        Feb 14 13:38:12 tokyo vmunix: zs3: silo overflow
        Feb 14 13:38:12 tokyo vmunix: syncing file systems... done
        Feb 14 13:38:12 tokyo vmunix: 01022 low-memory static kernel pages
        Feb 14 13:38:12 tokyo vmunix: 00409 additional static and sysmap kernel pages
        Feb 14 13:38:12 tokyo vmunix: 00000 dynamic kernel data pages
        Feb 14 13:38:12 tokyo vmunix: 00182 additional user structure pages
        Feb 14 13:38:12 tokyo vmunix: 00000 segmap kernel pages
        Feb 14 13:38:12 tokyo vmunix: 00000 segvn kernel pages
        Feb 14 13:38:12 tokyo vmunix: 00140 current user process pages
        Feb 14 13:38:12 tokyo vmunix: 00124 user stack pages
        Feb 14 13:38:12 tokyo vmunix: 01877 total pages (1877 chunks)
        Feb 14 13:38:12 tokyo vmunix:
        Feb 14 13:38:12 tokyo vmunix: dumping to vp fb004c04, offset 182080
        Feb 14 13:38:12 tokyo vmunix: 1877 total pages, dump succeeded
        Feb 14 13:38:12 tokyo vmunix: rebooting...
        Feb 14 13:38:12 tokyo vmunix: VAC ENABLED
        Feb 14 13:38:12 tokyo vmunix: SunOS Release 4.1.3_U1 (GENERIC_AUDIO) #1: Sun Aug 7 17:27:44 IDT 1994
        Feb 14 13:38:12 tokyo vmunix: Copyright (c) 1983-1993, Sun Microsystems, Inc.
        Feb 14 13:38:12 tokyo vmunix: cpu = SUNW,SPARCstation-5
        Feb 14 13:38:12 tokyo vmunix: mod0 = FMI,MB86904 (mid = 0)
        Feb 14 13:38:12 tokyo vmunix: mem = 32452K (0x1fb1000)
        Feb 14 13:38:12 tokyo vmunix: avail mem = 29028352
        Feb 14 13:38:12 tokyo vmunix: entering uniprocessor mode
        Feb 14 13:38:12 tokyo vmunix: Ethernet address = 8:0:20:21:45:10
        Feb 14 13:38:12 tokyo vmunix: espdma0 at SBus slot 5 0x8400000
        Feb 14 13:38:12 tokyo vmunix: esp0 at SBus slot 5 0x8800000 pri 4 (onboard)
        Feb 14 13:38:12 tokyo vmunix: sd0 at esp0 target 3 lun 0
        Feb 14 13:38:12 tokyo vmunix: sd0: <SUN0535 cyl 1866 alt 2 hd 7 sec 80>
        Feb 14 13:38:12 tokyo vmunix: sd1 at esp0 target 1 lun 0
        Feb 14 13:38:12 tokyo vmunix: sd1: <SUN0535 cyl 1866 alt 2 hd 7 sec 80>
        Feb 14 13:38:12 tokyo vmunix: SUNW,bpp0 at SBus slot 5 0xc800000 pri 3 (sbus level 2)
        Feb 14 13:38:12 tokyo vmunix: ledma0 at SBus slot 5 0x8400010
        Feb 14 13:38:12 tokyo vmunix: le0 at SBus slot 5 0x8c00000 pri 6 (onboard)
        Feb 14 13:38:12 tokyo vmunix: cgsix0 at SBus slot 3 0x0 pri 9 (sbus level 5)
        Feb 14 13:38:12 tokyo vmunix: cgsix0: screen 1152x900, single buffered, 1M mappable, rev 11
        Feb 14 13:38:12 tokyo vmunix: SUNW,CS42310 at SBus slot 4 0xc000000 pri 9 (sbus level 5)
        Feb 14 13:38:12 tokyo vmunix: zs0 at SBus slot 5 0x1100000 pri 12 (onboard)
        Feb 14 13:38:12 tokyo vmunix: zs1 at SBus slot 5 0x1000000 pri 12 (onboard)
        Feb 14 13:38:12 tokyo vmunix: SUNW,fdtwo0 at SBus slot 5 0x1400000 pri 11 (onboard)
        Feb 14 13:38:12 tokyo vmunix: root on sd0a fstype 4.2
        Feb 14 13:38:12 tokyo vmunix: swap on sd0b fstype spec size 98560K
        Feb 14 13:38:12 tokyo vmunix: dump on sd0b fstype spec size 98548K
        Feb 14 13:38:12 tokyo vmunix: le0: Twisted Pair Ethernet
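
Counting these faults per machine, as in the statistics above, could be
scripted along the following lines. This is a rough sketch and not the
method actually used for the mail. It assumes the log lines have exactly
the format shown in the excerpt; the function names are hypothetical. It
only extracts the two register values printed on the fault line and does
not try to interpret their bits.

```python
import re
from collections import Counter

# Matches only the "Async memory fault mfsr=... mfar=..." line, so the
# following "panic on cpu 0: async memory fault" line is not counted twice.
FAULT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+vmunix:\s+"
    r"Async memory fault\s+mfsr=(?P<mfsr>0x[0-9a-fA-F]+)"
    r"\s+mfar=(?P<mfar>0x[0-9a-fA-F]+)"
)

def find_faults(lines):
    """Yield (host, timestamp, mfsr, mfar) for each fault line."""
    for line in lines:
        m = FAULT_RE.match(line.strip())
        if m:
            yield (m.group("host"), m.group("stamp"),
                   int(m.group("mfsr"), 16), int(m.group("mfar"), 16))

def count_by_host(lines):
    """Return a Counter mapping host name to number of fault lines."""
    return Counter(host for host, _, _, _ in find_faults(lines))

# Three lines copied from the excerpt above.
sample = [
    "Feb 14 13:38:12 tokyo vmunix: Async memory fault mfsr=0x818028a0 mfar=0xf67376",
    "Feb 14 13:38:12 tokyo vmunix: panic on cpu 0: async memory fault",
    "Feb 14 13:38:12 tokyo vmunix: rebooting...",
]
faults = list(find_faults(sample))
for host, stamp, mfsr, mfar in faults:
    print(host, stamp, hex(mfsr), hex(mfar))
print(count_by_host(sample))
```

To run it against a real log, pass the open file instead of the sample
list, for example count_by_host(open("/var/adm/messages")).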
    
    
    
    -- Thanks,
    
       Gal Shalif, R&D group
    
     /-----------------------------------------------------------------\
    | Gal Shalif | Internet: gal@sd.co.il |
    | Software Engineer | Voice: +972 9-507102, ext. 209 |
    | Summit Design (EDA) Ltd | Fax: +972 9-509118 |
     \-----------------------------------------------------------------/
      \ In god we trust, everybody else must pay in cash /
       ---------------------------------------------------------------
    
    


