SUMMARY: Experience with DiskSuite Software

From: Dirk Behrens (behrens@eliot.imes.uni-hannover.de)
Date: Tue Oct 13 1992 - 17:19:26 CDT


Some weeks ago I asked these questions:

We've just bought the Sun DiskSuite Software V. 1.0. We ordered it because we
would like to have partitions that span more than one hard drive.
Before installing the software I would like to ask whether anybody who has
already installed it has some experience to share, so that we can avoid a bad
installation. So my questions are:

1. Who has good or bad experiences with Sun's Online: DiskSuite 1.0, and why?

2. Are there any performance benchmarks for concatenating or striping
   components (not only over many HDs but also over more than one SCSI
   controller)?

3. Is there a "best" configuration for best performance ?

4. Are mirroring and hot spare pools really needed? If so, which filesystems
   need mirroring (sample in the Adm. Guide: swap ???)?

5. Are there other tools to increase the performance of HD reads and writes?
   (IPI is really expensive per GB for a university; Prestoserve is well known
   but speeds up only writes (by how much?); we already have a second Fast
   SCSI controller (PT).)

-------------------------------------------------------------------------------

I've got 8 answers from

RichardT <richardt@apple.com>
dan@bellcore.com (Daniel Strick)
jallen@nersc.gov (John Allen)
Mike Raffety <miker@sbcoc.com>
katzung@sbcoc.com (Brian Katzung)
ups!upstage!glenn@fourx.Aus.Sun.COM (Glenn Satchell)
reynolds@icgmfg.mke.ab.com (Michael D. Reynolds)
Christian Lawrence <cal@soac.bellcore.com>

Thanks for all the replies !!

They said: DiskSuite works fine, but it is not friendly to system upgrades and
recovery. Presto is one way to speed up NFS writes, and the NC400 network card
is another very good way to get higher NFS (only?) read performance. They said
buy fast disks (we already have FAST SCSI drives), they said buy fast
controllers (we have a PTI FAST SCSI controller; is there a faster one?), they
said buy lots of memory (all our SUN 2 have 64MB of memory). We also have the
/tmp filesystem in memory.
They said initial configuration of Online DiskSuite requires some effort. You
need to patch every new /vmunix anew! You get no big performance improvement
from Online DiskSuite (?). It has support for file systems bigger than 2 GB, but
nobody needs this (why?). Someone has his own "indirect device driver" for
concatenating and striping disks (Dan Strick, can I get this?). John Allen
writes that Presto is good, but the 1 Meg of memory is sometimes too little (in
which cases?).
Brian Katzung gives some hints on what not to do when configuring. When
concatenating or striping disks, most people mirror these disks for safety.
Prestoserve is not really compatible with DiskSuite; DiskSuite
mirroring is NOT supported to work with Prestoserve. DiskSuite doesn't
really help with write performance.

So, in summary: DiskSuite is good for mirroring and for getting filesystems
over 2 GB. For performance improvements take the NC400 network card and
combine it with Prestoserve. BUT !!! DiskSuite mirroring is NOT supported
to work with Prestoserve !!! That's a bottleneck.
Try fast controllers, fast drives and a lot of memory. When this performance is
not good enough, then ...?

Dirk

Now the replies:

-------------------------------------------------------------------------------
RichardT wrote:

We didn't buy DiskSuite because it made our disaster recovery and system
upgrade scenarios very, very scary (we thought about trying to recover
systems which were dependent on DiskSuite when some or all of the hardware
had been destroyed, and the scenario was *not* pretty).

Having noted that -
  Presto will improve NFS write performance but not normal disk performance.
  It will generally do something between halve the response time of the
  server and double the number of clients you can support with the same
  response time. Your mileage will vary (the factors are how much NFS write
  traffic you have and how much traffic you have overall).

  The other thing that's really helpful is the NC400 network card, which
  improves NFS read performance (again, does nothing for local disk
  traffic). The benchmarks for a Prestoserved machine with an NC400
  are very, very impressive (and reasonably accurate; I was working at
  Legato [the folks who make Presto] when the benchmarks were done; we usually
  saw better performance than the benchmarks we published, but we didn't
  think anyone would believe us when we said 'makes your server go six
  times as fast')

As far as local disk traffic, your options are limited. Buy fast disk.
Buy fast controllers. Buy lots of memory and try to minimize disk
access.

RichardT
-------------------------------------------------------------------------------

dan@bellcore.com (Daniel Strick) wrote:

> 1. Who has good or bad experiences with SUNs Online: DiskSuite 1.0 and why ?

   DiskSuite broke the file locking daemon by installing a new version
   which was incompatible with the standard SunOS version. The fix was
   to install a yet newer version on all of our systems (not just the
   one using DiskSuite).

   Otherwise, DiskSuite has not caused me severe problems, but whenever
   I change the kernel (e.g. to install a new sun security patch), I now
   have to spend half an hour rereading the DiskSuite installation manual
   because you have to run a special DiskSuite utility program that patches
   the kernel.

> 2. Are there some performance benchmarks about concatenating or striping of
>    components (not only over many HDs but also over more than one SCSI Contr.)?

   dunno

> 3. Is there a "best" configuration for best performance ?

   no

> 4. Is mirroring and hot spare pools really needed ? If so, which filesystems
>    need mirroring (Sample in the Adm. Guide: swap ???) ?

   If the possibility that you might have to restore a file system from
   backup tape does not cause you to lose sleep (due to fear of lost data
   or the required file system downtime), then you don't need to mirror
   that file system.

> 5. Are there other tools to increase the performance of HD reads and writes ?
>    ( IPI is for an university really expensive GB/$, prestoserve is well known,
>    but increases only the writes (how much?), we have already a second Fast
>    SCSI Controller (PT) ).

   dunno

Comment:

   Initial configuration of Online DiskSuite requires some effort.
   The need to patch every new /vmunix is a continuing maintenance
   "problem". Incompatibility with new versions of SunOS can be
   a real pain. (E.G. you can't upgrade to a new release of SunOS
   until you know that your version of Online DiskSuite is compatible.
   Sometimes unbundled software releases can make it impossible to
   install patches or make it necessary to wait many months for
   new releases of the unbundled software before upgrading the
   basic SunOS.)

   I find that Online DiskSuite is worthwhile for mirroring (if you
   believe your bytes are critically important). Otherwise it is a
   waste of time. If you are expecting a performance improvement
   from Online DiskSuite, I believe you are going to be *very*
   disappointed. You will probably get something for concatenating
   SCSI drives on different controllers and perhaps a very tiny
   improvement from concatenating SCSI drives on the same controller.
   Don't expect much when concatenating SMD or IPI-2 drives on the
   same controller. (Does Online DiskSuite support striping?
   If it does and the stripe size is small (e.g. a track or less),
   expect to lose.) There are situations in which you can lose big,
   particularly if you use mirroring.
   
   The support for file systems bigger than 2 GB is nice, but
   currently I can live without it. This should be a standard
   feature of SunOS, not part of an unbundled product.

   I wrote my own "indirect device driver" for concatenating and
   striping disks. I can't boot off a "pseudo" drive. I can't
   do "parity" drives (e.g. mirroring or RAID). But my driver
   doesn't cause maintenance "problems" and it can provide modest
   performance increases. Mainly I use it to make clusters of old
   small drives look like single drives so that I can have a smaller
   number of larger file systems.

Dan Strick, aka dan@bellcore.com or bellcore!dan, (201)829-4624

-------------------------------------------------------------------------------

jallen@nersc.gov (John Allen) writes

I Answer:

1) I have used it to concatenate disks and to mirror disks and it seems to
   work fine.

2) I have done no benchmarks

3) (see #2)

4) Mirroring is only needed if one wants to have a machine remain up even
   when a disk dies. I have only a couple of applications for this and
   80+ (going on 100 real soon) machines.

5) IPI is not worth it (Even Sun is leaving IPI in favor of faster SCSI)
   Prestoserve works not by increasing the total throughput to disk but
   by allowing the application to think the write completed to disk while
   it has really only gone to battery-backed RAM. It goes to disk later
   at normal disk speeds. This is fantastic depending on your application.
   If the Prestoserve cache of 1 megabyte is large enough for your machine
   the results are WONDERFUL. I have 3 SS2's with Prestoserve that are
   very noticeably "FASTER" than the ones without Presto. However my 4/490
   with 4 IPI controllers and lots of IPI disk might as well not have it as
   the 1 megabyte is much too small to be an effective cache except during
   off hours!
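
   The cache-size effect described above can be illustrated with a toy
   model. This is only a sketch under made-up assumptions, not a model of
   the real Prestoserve board: time is cut into ticks, the cache drains to
   disk at a fixed rate per tick, and any write that does not fit in the
   free cache space is counted as having to wait for the disk. The function
   name and all the numbers are hypothetical.

```python
MB = 1 << 20  # bytes in one megabyte


def absorbed_fraction(arrivals, cache_bytes=MB, drain_per_tick=MB // 2):
    """Return the fraction of written bytes that fit into the cache.

    arrivals       -- bytes written by clients in each tick (assumed workload)
    cache_bytes    -- size of the battery-backed cache (1 MB, as in the text)
    drain_per_tick -- bytes flushed to disk per tick (assumed disk speed)
    """
    used = 0       # bytes currently waiting in the cache
    absorbed = 0   # bytes that were acknowledged from the cache
    total = 0      # all bytes written
    for incoming in arrivals:
        # The cache empties to disk at normal disk speed.
        used = max(0, used - drain_per_tick)
        # Only what fits in the free space is acknowledged immediately.
        accepted = min(incoming, cache_bytes - used)
        used += accepted
        absorbed += accepted
        total += incoming
    return absorbed / total if total else 1.0


light = [MB // 4] * 10   # small bursts, e.g. a lightly loaded workstation
heavy = [4 * MB] * 10    # large bursts, e.g. a busy multi-controller server

print(absorbed_fraction(light))   # every write fits in the cache
print(absorbed_fraction(heavy))   # most writes still wait for the disk
```

   With the light workload every write is absorbed; with the heavy one the
   1 MB cache is full almost all the time, which matches the 4/490
   experience above.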

-------------------------------------------------------------------------------

Mike Raffety <miker@sbcoc.com> writes

Have you looked into the Interphase (?) NC400 network coprocessor?
It's a replacement Ethernet board, more or less, which handles all
your NFS directly.

-------------------------------------------------------------------------------

katzung@sbcoc.com (Brian Katzung) writes

>1. Who has good or bad experiences with SUNs Online: DiskSuite 1.0
>and why ?

We've had pretty good experiences with Disk Suite, but there are a couple of
gotchas:

1) NEVER EVER configure more than one submirror per metamirror in your
/etc/md.conf. Configure one in and metattach the rest, OTHERWISE YOU WILL
TRASH YOUR FILESYSTEM. This file determines the initial configuration. The
working configuration is stored in the state databases on disk. The problem
with specifying more than one submirror in /etc/md.conf is that disk suite
assumes that the contents of all submirrors are already identical. By using
metattach instead, the new submirrors will be completely reinitialized from
the first one.

2) Do not try to put state databases and filesystems on the same partition.
I have not been able to get this to work. (However, multiple state databases
per partition seems to work OK.)

>3. Is there a "best" configuration for best performance ?

This is discussed in the Disk Suite manual.

>4. Is mirroring and hot spare pools really needed ? If so, which filesystems
> need mirroring (Sample in the Adm. Guide: swap ???) ?

The answer to that question is application-specific. If you can't afford to
lose access to your data for the time it takes to replace and reload a failed
drive, then mirror it. If you can, then don't. Obviously, if you have
critical data that needs to be mirrored, you will want to mirror the basic
partitions (root, swap, /usr, etc) too so that the machine can keep running to
provide access to the other partitions.

  -- Brian Katzung katzung@{i88.isc,sbcoc}.com

-------------------------------------------------------------------------------

ups!upstage!glenn@fourx.Aus.Sun.COM (Glenn Satchell) writes

> 1. Who has good or bad experiences with SUNs Online: DiskSuite 1.0 and why ?

I have set this up on a number of systems (SS2, 4/690, 4/470) with SCSI
and IPI disks. We have had no problems with this software; it works
well and does everything it claims. All these setups have been
combinations of mirroring, with some striping and concatenation.

> 2. Are there some performance benchmarks about concatenating or striping of
> components (not only over many HDs but also over more than one SCSI Contr.)?

There are performance improvements if you can stripe over more than one
disk. Adding more disk controllers will also help.

> 3. Is there a "best" configuration for best performance ?

This depends upon your applications and your job mix (what % read and
% write), and on whether you're doing nice sequential reads or random
reads (such as on an NFS server).

> 4. Is mirroring and hot spare pools really needed ? If so, which filesystems
> need mirroring (Sample in the Adm. Guide: swap ???) ?

It is if you want high availability. Remember that if you stripe or
concatenate a partition across more than one disk that the failure of
any disk in that partition will result in the loss of the data on that
whole partition. Generally most people mirror the disks that are
concatenated or striped to prevent this. You need to evaluate the risk
for your situation.
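
The risk trade-off above can be put into rough numbers. The sketch below
assumes that disks fail independently with the same probability over some
period and that nothing is repaired in the meantime; both are
simplifications, and the 5% figure is invented purely for illustration.

```python
def p_stripe_loss(p_disk, ndisks):
    """Probability that a stripe/concatenation of ndisks loses its data.

    The data is lost as soon as ANY one member disk fails.
    """
    return 1 - (1 - p_disk) ** ndisks


def p_mirrored_stripe_loss(p_disk, ndisks):
    """Probability that a two-way mirror of two such stripes loses data.

    Data is lost only if BOTH submirrors have lost at least one disk.
    """
    return p_stripe_loss(p_disk, ndisks) ** 2


p = 0.05  # assumed per-disk failure probability (hypothetical)
print(p_stripe_loss(p, 1))           # single disk
print(p_stripe_loss(p, 3))           # 3-disk stripe, unmirrored
print(p_mirrored_stripe_loss(p, 3))  # 3-disk stripe, mirrored
```

Under these assumptions an unmirrored 3-disk stripe is nearly three times
as likely to be lost as a single disk, while the mirrored version is less
likely to be lost than the single disk.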

> 5. Are there other tools to increase the performance of HD reads and writes ?
> ( IPI is for an university really expensive GB/$, prestoserve is well known,
> but increases only the writes (how much?), we have already a second Fast
> SCSI Controller (PT) ).

With Disk Suite if you set up mirroring you can interleave the reads,
i.e. since you have two identical copies of the data it can read from one
copy and then read the next data from the other. Note that Prestoserve is
not really compatible with Disk Suite. Disk Suite doesn't really help
with write performance.
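
A toy model of why mirroring can help reads but not writes: reads may
alternate between the copies, whereas every write has to reach all of
them. The class below is a hypothetical illustration, not how Disk Suite
is implemented, and it only counts I/O operations.

```python
class ToyMirror:
    """Counts I/Os per submirror for round-robin reads and duplicated writes."""

    def __init__(self, nsubmirrors=2):
        self.reads = [0] * nsubmirrors
        self.writes = [0] * nsubmirrors
        self._next = 0  # submirror that will serve the next read

    def read(self):
        # Reads are interleaved: each one goes to the next copy in turn.
        i = self._next
        self.reads[i] += 1
        self._next = (i + 1) % len(self.reads)
        return i

    def write(self):
        # A write must go to every copy to keep them identical.
        for i in range(len(self.writes)):
            self.writes[i] += 1


m = ToyMirror()
for _ in range(10):
    m.read()
    m.write()
print(m.reads)   # read load is shared between the two copies
print(m.writes)  # each copy still sees every write
```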

-------------------------------------------------------------------------------
reynolds@icgmfg.mke.ab.com (Michael D. Reynolds) wrote

I have installed DiskSuite 1.0 here with no problems so far. We are using
Prestoserve in conjunction with DiskSuite disk concatenation. However, DiskSuite
mirroring is NOT supported to work with Prestoserve. I think that striping
concatenated disks will give you the best read/write performance but keep in
mind what would be needed to restore a concatenated/striped filesystem.
My concatenated filesystem is made of three 1.6 GB disks. So now there are
3 parts that can fail instead of 1. So I keep very recent full and incremental
backups of my concatenated filesystem in case of failure.

-------------------------------------------------------------------------------

Christian Lawrence <cal@soac.bellcore.com> writes

I've set up several 2x2 arrays of striped SCSI mirrors as follows :

[Diagram: two 2x2 arrays of disks, labelled fs1 and fs2. In each array,
ctlr 1 feeds the upper row of disks and ctlr 2 the lower row, and crossed
diagonal lines link the two columns.]

fs1 uses interleaving of 16k and fs2 uses 24k. I read one of these values
in some article somewhere and it was supposed to be good for "normal" UFS/NFS
operations. The mirroring uses parallel writes to be absolutely current. The
stripes use geometric reads to monopolize performance associated with arm/head
location. All the drives are 1.3 GB (formatted) Seagate Elites (like Sun sells)
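
For readers unfamiliar with interleaving, here is a simplified sketch of
how a stripe with a 16k interleave spreads consecutive data across two
drives. It is a generic round-robin mapping written for illustration, not
DiskSuite's actual code, and the function name is made up.

```python
def stripe_location(offset, interlace, ndisks):
    """Map a byte offset in the striped device to (disk index, offset on disk).

    Data is laid out in chunks of `interlace` bytes, handed out to the
    member disks in round-robin order.
    """
    chunk = offset // interlace        # which interlace-sized chunk
    disk = chunk % ndisks              # chunks rotate across the disks
    row = chunk // ndisks              # how many full rounds came before
    return disk, row * interlace + offset % interlace


K = 1024
for off in (0, 16 * K, 32 * K + 100):
    print(off, stripe_location(off, 16 * K, 2))
```

The first 16k lands on disk 0, the next 16k on disk 1, then back to disk 0
and so on, so a large sequential transfer keeps both drives busy.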

This gives each fs a 2.6 GB capacity and it seems to hum along pretty fast
with high availability since it can suffer a disk/ctlr loss and keep on going.
You could just as easily make this one big stripe OR concatenate the 2 stripes
depending on what you're looking for. The reason anybody should mirror is to
be able to deal with hardware failure on critical systems. Note that mirroring
becomes extremely advantageous with stripes or concatenations since restores
could conceivably take a VERY, VERY, VERY long time.

I'm in the process of mirroring the system disk (root, swap, usr) on a
different machine which has IPI drives. Having the hot spare is not magical.
Its primary use is to back fill for a failed mirror drive thus giving
temporary double protection against failure. Note that the mirroring above
is split across controllers to optimize bus/disk activity.

Hope this helps.

Again many thanks for all the answers

Dirk
________________________________________________________________________________

Dipl.-Ing. Dirk Behrens
University of Hannover
Institute for Microelectronic Systems
Callinstrasse 34
W-3000 Hannover 1
Germany

Phone  : +49 511 762-4986
Telefax: +49 511 762-4994
Email  : behrens@eliot.imes.uni-hannover.dbp.de
________________________________________________________________________________


