My original question was:
Okay, given a panic on an "unknown Memory error" with a final message:
MEMORY ERROR! Status D8, DVMA-BIT1, Context 0,
Vaddr: FFDFCA4, Paddr: 00001CA4, Type 0 at 0x0FEF4020
what algorithm do I follow to tell me which SIMM contains the flaky bit?
The two most informative answers were:
From: Eirik Fuller <ut-emx!elf.TN.Cornell.EDU!eirik>
It sounds like the fourth SIMM is the culprit. If you move it to the
first slot, you should see D1 in place of D8 in the error message.
This assumes your ROM revision is 1.6 or newer. If you are interested
in the details, I can elaborate; this information is in the manual
that comes with Clearpoint SIMMs.
This is section 2.8.4 of "Memory Modules for Sun Workstations",
Clearpoint's manual, Revision 2.01, November 1989. All of Chapter 2
is for the Sun 3/60, and 2.8 is "Troubleshooting - Locating Defective
2.8.4 Defective Memory - ROM Revision 1.6 and Later
Defective SIMMs in systems with ROM revision level 1.6 and later are
found using the physical address of the error. The error message
displayed after a system failure should contain a digit with the
physical address (PADDR) of the defective SIMM module. To determine
the bank with the defective SIMM, refer to the number on the error
screen. Ignore the 5 least significant digits of the number and use
the 3 remaining to locate the address. Refer to Table 2-5 below:
Table 2-5  Physical Address of Memory

PADDR                Bank   Location in memory
000xxxxx-003xxxxx    0      first 4 MB
004xxxxx-007xxxxx    1      between 4 MB and 8 MB
008xxxxx-00Bxxxxx    2      between 8 MB and 12 MB
00Cxxxxx-00Fxxxxx    3      between 12 MB and 16 MB
010xxxxx-013xxxxx    4      between 16 MB and 20 MB
014xxxxx-017xxxxx    5      between 20 MB and 24 MB
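The lookup in Table 2-5 amounts to simple arithmetic, since each bank covers four consecutive values of the leading three hex digits. Here is a minimal Python sketch of that reading of the table. The function name bank_from_paddr is made up for illustration and appears in neither the manual nor the ROM.

```python
def bank_from_paddr(paddr_hex):
    """Return the bank number (0-5) for a PADDR string such as '00D13B2C'.

    Hypothetical helper: it only restates Table 2-5 as arithmetic.
    """
    paddr = int(paddr_hex, 16)
    top3 = paddr >> 20        # ignore the 5 least significant hex digits
    bank = top3 // 4          # each bank spans four values, i.e. 4 MB
    if bank > 5:
        # Table 2-5 stops at 017xxxxx (24 MB); anything above is not covered
        raise ValueError("PADDR %s is outside Table 2-5" % paddr_hex)
    return bank


if __name__ == "__main__":
    for addr in ("00D13B2C", "00001CA4"):
        print(addr, "-> bank", bank_from_paddr(addr))
```

Both addresses come from the messages in this post: the manual's example address and the Paddr from my panic.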
Once the bank is located, find the defective SIMM within that bank.
To determine the defective SIMM, you must use the parity error
register, DX, where X is the hexadecimal value used to identify the
defective SIMM(s). The hexadecimal digit must be converted to its
binary equivalent. The high bit(s) point to the location of the
defective SIMM(s). Refer to the example on the following page.
Binary digit example:
High bits point to the defective SIMM. The location is relative to
the first SIMM in a bank:
Table 2-6  Binary Location of SIMMs

X    Binary   SIMM Location
1    0001     first
2    0010     second
4    0100     third
8    1000     last
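Decoding the digit is a bit test on the low hex digit of the register value. The Python sketch below is one way to write it, under two assumptions of mine: the register value is passed as a two-character string such as "D4", and more than one set bit means more than one suspect SIMM, which is how I read "SIMM(s)" above. The names POSITIONS and simm_positions are made up for illustration.

```python
# Bit 0 is the first SIMM in the bank, bit 3 the last (per Table 2-6).
POSITIONS = ["first", "second", "third", "last"]


def simm_positions(register):
    """Return the SIMM positions flagged by a register value such as 'D4'.

    Hypothetical helper: only the last hex digit (the X) is examined.
    """
    x = int(register[-1], 16)
    return [name for bit, name in enumerate(POSITIONS) if x & (1 << bit)]


if __name__ == "__main__":
    for reg in ("D1", "D4", "D8"):
        print(reg, "->", simm_positions(reg))
```

For the D8 in my panic message this gives the last (fourth) SIMM, which agrees with the diagnosis at the top of this reply.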
The error message contains:
Parity Error Register D4<intr,intena,check,err08>
Physical Address = 00D13B2C
When the least significant digits of the physical address are dropped
the "00D" remains. Refer to Table 2-5 and see that "00D" falls in
Bank 3 which is between 12MB and 16MB.
Find the value for X in Table 2-6 and convert the hexadecimal value
to its binary equivalent: 4=0100
Conclusion: The third SIMM in the bank #3 is defective.
From: John Valdes <ut-emx!geosun.uchicago.edu!valdes>
Shut down your 3/60 (if not already), attach a terminal to serial port A,
change the diag switch on the back of the CPU from "NORM" to "DIAG", and
reboot the machine. The machine will now report the progress of the
self tests to the terminal (oh, set the terminal to 9600 baud, 8 data bits,
1 stop bit, no parity). When it gets to the memory test and detects a
failure, I believe it will tell you the location of the bad SIMM (something
like Unnnnn). Note the position, power-down the machine, take out the
CPU board, reseat the SIMM in question and repeat the above. If the SIMM
still shows as bad, replace it with a fresh one.
Hope this helps.
Thanks also to:
Jack L. Bell email@example.com
Jay S. Rouman (firstname.lastname@example.org or email@example.com)
Ric Anderson <uunet!cs.arizona.edu!ric>
-- Henry Melton firstname.lastname@example.org || emx.utexas.edu!hutto!henry