SUMMARY: SCSI bus reset

From: <M.Russell_at_iaea.org>
Date: Mon Mar 14 2005 - 01:24:06 EST
Many thanks to Eugene, Chris Ruhnke, and Joe Fletcher.

The problem was probably caused by a bad connection, possibly at the
terminator.  I took the system down today and, since I had another tape device
that I wanted to connect, I tried connecting it and I also removed and
reattached the terminator of the device that had been causing problems.  It
seems there are no problems with the SCSI card as everything was visible in
probe-scsi-all, boot -r went smoothly and I am able to use the device
connected to the SCSI card I had been concerned about.
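
For anyone who wants to keep watching for a recurrence, here is a minimal sketch that counts the three warnings quoted in the original message below. The pattern labels and the function name are made up for illustration, and the exact wording of the log lines on a given system is an assumption.

```python
import re
from collections import Counter

# Substrings taken from the warnings quoted in the original message.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "reduced sync rate": re.compile(r"reducing sync transfer rate", re.I),
    "bus reset": re.compile(r"SCSI bus reset", re.I),
    "async fallback": re.compile(r"reverting to async mode", re.I),
}


def count_scsi_warnings(lines):
    """Count how many lines match each of the SCSI warning patterns."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Sample input built from the messages in the original post.
sample = [
    "Target 3 reducing sync transfer rate",
    "got SCSI bus reset",
    "Target 3 reverting to async mode",
    "some unrelated line",
]
print(dict(count_scsi_warnings(sample)))
```

The function accepts any iterable of lines, so an open log file (on Solaris, typically /var/adm/messages) could be passed in directly.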

Here is the advice I received:

From Joe:

Try just replacing the terminator first.


From Eugene:

Yes, a bad connection, or even a badly seated card can cause this. (Resolved
such an issue at a client this week).

It "could" be a card fault, but that needs verification. If it later does
prove to be the issue, I would not use it in a sensitive area (mission
critical).

What I suggest is to use another SCSI device and cable (and terminator) on this
card to verify, using a known good cable and device (and terminator if
applicable; remember, the Sun internal terminators are normally of a higher
quality than the external ones). Also ensure it is all the same SCSI protocol
(single-ended, high-voltage differential or LVD, as the case may be). Then
recheck the
	old cable by putting it back (check pins carefully beforehand)
	old terminator, if applicable, by putting it back (check pins carefully
beforehand)
      If all works, it was most likely a bad connection.

   End of process for above

If the problem reappears, I would start by reseating the card and rechecking.
Try another slot for the card or another card in same slot.

The idea is to get a known good working state, then replace the suspect
items one by one and recheck. This way, as soon as a problem reappears,
we know we have a faulty item and can zoom in. Also, check the other
components. Multiple failures, though uncommon, have unfortunately caused a
lot of false diagnoses. It is better to complete the process.
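
The procedure above can be sketched roughly as follows. This is only an illustration: the function name, the part names, and the simulated test are all hypothetical, and in practice the "test" is a reboot followed by a check for errors. The sketch puts each suspect back on its own against the known good setup, which is one variant of the one-by-one idea.

```python
from typing import Callable, Dict, List


def find_faulty_parts(
    known_good: Dict[str, str],
    suspects: Dict[str, str],
    works: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Put suspect parts back into a known good setup one at a time."""
    if not works(known_good):
        raise ValueError("baseline is not a known good state; fix that first")
    faulty = []
    for slot, suspect_part in suspects.items():
        trial = dict(known_good)      # start again from the working setup
        trial[slot] = suspect_part    # change exactly one item
        if not works(trial):
            faulty.append(slot)
    # The loop always runs to the end, since more than one item may be bad.
    return faulty


# Hypothetical simulation: pretend the old terminator is the bad part.
def simulated_test(setup: Dict[str, str]) -> bool:
    return setup["terminator"] != "old terminator"


known_good = {
    "cable": "spare cable",
    "terminator": "spare terminator",
    "device": "spare drive",
}
suspects = {"cable": "old cable", "terminator": "old terminator"}
print(find_faulty_parts(known_good, suspects, simulated_test))
```

Checking every suspect rather than stopping at the first failure is what guards against the multiple-failure case mentioned above.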

The process is not too long on a small system which boots fast and can be
done in an hour, as in your case. Big systems are normally what bites
:-)


From Chris:

YES, the problem could be a loose (or bad) cable; but it could also be the
card.

The only way I know of to tell for sure is to swap cables.
You have 2 cards -- call 'em A1 and A2 (adapter 1 and 2) -- original hunh?
Each card has two cables -- call 'em A1C1 and A1C2 for adapter A1 and
A2C1/A2C2 for adapter A2.

From your problem statement, your internal disk(s) are connected to A1C1 and
the failures/errors occurred on A1C2.
Your tape drive is now connected to A2C2 (I'm guessing at the cable, adjust as
appropriate).

Without changing anything else -- i.e. don't touch any other connections --
you need to:
- shut down your system
- open the chassis
- swap the cables A1C2 and A2C2 at the adapter cards
- reboot

If the errors return, your problem is with the adapter card and you should
consider replacing it.
If there are no errors, connect the tape drive back to A2C2 (the former A1C2)
and reboot.
If there are still no errors, your problem may have been a loose connection at
the adapter A1 -- but keep an eye on things for a while.
If the problems do return, you probably have a bad cable.
If different problems assert themselves -- welcome to the world of hardware
triage!
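
The outcomes above amount to a small decision table, which can be written down as a sketch like the one below. The function and parameter names are invented for illustration, and the "different problems" case is left out because it has no fixed answer.

```python
from typing import Optional


def diagnose(errors_after_swap: bool,
             errors_after_restore: Optional[bool] = None) -> str:
    """Map the results of the two reboots onto the likely culprit.

    errors_after_swap: did errors return after swapping A1C2 and A2C2?
    errors_after_restore: did errors return after connecting the tape
        drive back and rebooting? None means that step is not done yet.
    """
    if errors_after_swap:
        return "adapter card suspect: consider replacing it"
    if errors_after_restore is None:
        return "no errors so far: reconnect the tape drive and reboot"
    if errors_after_restore:
        return "probably a bad cable"
    return "possibly a loose connection at adapter A1: keep an eye on it"


print(diagnose(True))
print(diagnose(False))
print(diagnose(False, True))
print(diagnose(False, False))
```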

When you are done, put things back the way they were when you started to avoid
confusion later.


-----Original Message-----
From: RUSSELL, Marian
Sent: Wednesday, 09 March, 2005 08:25
Subject: SCSI bus reset

I have a SUNBlade 100 with two Ultra-SCSI cards.  The backups to the tape
drive (Exabyte 8505) on my system have been failing for the last couple of
days.  I had thought the tape drive was perhaps defective because it is old,
and since I have an extra tape drive, I tried this morning to replace the
current drive with the spare one.  When I rebooted after exchanging the
drives, I got error messages like :  Target 3 reducing sync transfer rate, got
SCSI bus reset, and Target 3 reverting to async mode.  I did a search and
found in SUN that perhaps the problem is with the SCSI card and not the tape
drive, so I tried connecting the drive to the other SCSI card and everything
was fine - no error messages and the drive is happily now tarring the latest
set of backup files to tape.

Since one of our disk packs is connected to the other half of this card, I am
concerned that these error messages could indicate looming failure of the
whole card.  If one half of the card is defective, will the other half soon
fail?  Could something like a loose cable have caused the error messages?  Is
there any possibility that this SCSI card can be saved, or do I need to
replace it ASAP?

I would be very grateful for any advice.

Thanks in advance.

Marian Russell

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Mon Mar 14 01:24:52 2005