SUMMARY: 130gig file, poor backup performance & high IOWaits

From: Tim Chipman <chipman_at_ecopiabio.com>
Date: Wed Jul 09 2003 - 10:49:55 EDT
Hi All,

Many thanks for the responses I've gotten (in no particular order) from: 
Paul Roetman, Hichael Morton, Jay Lessert and also Vadim Carter (of 
AC&NC, HWVendorCo for the Jetstor disk array). Please see the end of 
this email for text of replies.

Bottom line // the range of suggestions includes:

-> there should *not* be an OS limitation / cache issue causing the 
observed problem; folks report manipulating large (90+ gig) files 
without seeing this type of problem.

-> In future, as a workaround, request that the DBA do dumps to "many 
small files" rather than "one big file". This is apparently possible in 
Oracle8 (although not as easy as it used to be in Ora7, I'm told?) and 
is a decent workaround.

-> Possibly, depending on the data types of the tables being dumped, 
subsequent (or inline... via named pipes) compression using gzip MAY 
result in oradmp files that are smaller / more manageable. [alas, in my 
case the large table being dumped holds very dense binary data that 
compresses poorly -- a quick way to check this is sketched after this 
list].

-> Confirm that the performance of the system for small-file backups is 
still OK NOW (yes, it was); that the filesystem isn't corrupt (it is 
"logging" and fsck'ed itself cleanly / quickly after the 
freeze-crash-reboot of yesterday AM); and that the large file isn't 
corrupt (believed to be OK since the fsck was OK).
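
As an aside, it is easy to sanity-check the "compresses poorly" claim on 
a sample of the dump before committing to a compressed-pipe setup. A 
rough sketch (the path is hypothetical; compare the byte count that wc 
reports against the ~105 MB read):

   # gzip the first ~100 MB of the dump and see how much it shrinks
   dd if=/path/to/bigdump.dmp bs=1024k count=100 | gzip -c | wc -c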


However, it gets "better". I did more extensive digging on google / 
sunsolve using "cadp160" as the search term, since this driver was cited 
in a message logged shortly before the system froze / hung yesterday AM 
(when load began to pick up Monday AM as users came in to work). What 
I've learned is **VERY GRIM**, assuming I can believe it all. i.e.:

-> The cadp160 driver [the Ultra160 SCSI kernel driver module] on 
Solaris x86 has a long history of being buggy & unreliable, especially 
under significant disk load. This can result in such a fun range of 
things as data corruption, terrible performance, freezes/reboots, etc 
etc.  There are entries in sunsolve dating back to 2000 and as recent as 
May/31/03 which are in keeping with these problems, including such 
things as:

BugID:		Description:
4481205 	cadp160 : performance of cadp160 is very poor
4379142 	cadp160: Solaris panics while running stress tests

-> there is a "rather interesting" posting I located via google which 
appears to have been made by someone who claims to be the original 
developer of a low-level SCSI driver module commonly used in Solaris, 
one which is the basis of many other such drivers developed since 
(Bruce Adler; the driver is GLM). If this posting is true, it suggests 
that Sun has known about this problem with cadp160 for quite a long 
time; that it came about for absurd reasons; and that it is quite 
disgusting that it remains unresolved. And IF this story is true, then 
it certainly suggests that the cadp160 driver needs to be rewritten from 
scratch, and that until this happens it should **NEVER** be anywhere 
near a production server. For anyone interested in the details, the 
(long) posting / sordid tale is available at this URL:

http://archives.neohapsis.com/archives/openbsd/2002-02/0598.html

So, as a temporary workaround, I believe I'll add an entry to 
/etc/system reading "exclude: drv/cadp160" -- which should force the use 
of the older (apparently more reliable) cadp driver -- at non-Ultra160 
performance, but hopefully far more stable / less buggy. After making 
this change I'll do some trivial tests (i.e., attempt to copy the 130gig 
file between slices; re-initiate the netbackup job) and observe the 
performance and iowait loading. My expected / hoped-for result is better 
performance / less iowait loading.
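
For the record, a rough sketch of the change and of how I intend to 
verify it afterwards (the reconfiguration boot and the check commands 
are my own assumptions about how to confirm the result, not something 
anyone specifically recommended):

   * in /etc/system: keep the ultra160 driver from loading at all
   exclude: drv/cadp160

   # then force a reconfiguration boot so the controller re-binds to cadp
   touch /reconfigure
   init 6

   # after the reboot, confirm which driver actually attached
   modinfo | grep cadp
   prtconf -D | grep -i cadp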

I hope this summary is of some use to others. In the unlikely event that 
anyone from Sun reads this, I would encourage you to inquire about when 
the cadp160 driver redevelopment will begin :-)

Thanks,



Tim Chipman



====paste====original text of replies======
...

have you thought about compressing the file on the fly - with database
exports, you generally get over 90% compression.

Create a bunch of named pipes (eg file[0-20].dmp) and a bunch of gzip
processes
   mkfifo file0.dmp
   gzip < file0.dmp > file0.dmp.gz &

then export the database to the pipes...
   exp ... file=(file0.dmp, file1.dmp, .. ,file20.dmp) \
        filesize=2147483648 ....

that way you end up with a bunch of 200 meg compressed files to backup, 
and even if you do uncompress them, they are smaller than two gig.

I have a script that generates all this on the fly, and cleans up after
itself if you are interested.

note: import can be done straight from the compressed files using the 
same pipe system!
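
[Presumably the import side of that pipe trick looks something like the 
following -- my own sketch, reusing the same hypothetical file names, 
not the poster's actual script:]

   # re-create the pipes and feed them from gunzip, then point imp at them
   mkfifo file0.dmp
   gunzip < file0.dmp.gz > file0.dmp &
   imp ... file=(file0.dmp, file1.dmp, .. ,file20.dmp) \
        filesize=2147483648 ....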

Cheers

----------------

Have you confirmed ~12MB/s *now* with a 10GB file in the same file
system as your 100+GB file?
...
Do you have any interesting non-default entries in /etc/system?

I've manipulated 90GB single files on SPARC Solaris 8 (on vxvm
RAID0+1 volumes) with no performance issues.

Are you positive the RAID5 volume is intact (no coincidental failed
subdisk)?

...
You *could* try bypassing the normal I/O buffering by backing it up
with ufsdump, which will happily do level 0's of subdirectories, if you
ask.  Not very portable, of course.
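
[To answer the first question above: a quick way to re-measure raw 
sequential throughput on that filesystem would be something like the 
sketch below -- the paths, test size and tape device are made up, and 
iostat is just run from a second window to watch the wait / %b columns. 
The last line is roughly what the ufsdump-of-a-subdirectory idea would 
look like.]

   # create a ~10 GB scratch file on the same filesystem as the big dump
   mkfile 10240m /bigfs/scratch/testfile

   # time a straight sequential read of it
   timex dd if=/bigfs/scratch/testfile of=/dev/null bs=1024k

   # watch per-device service times / busy% while the above runs
   iostat -xn 5

   # level 0 ufsdump of just the subdirectory holding the dump files
   ufsdump 0f /dev/rmt/0 /bigfs/oradump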

------------------

...

the dba should  be able to split the file into smaller files for backup.
...
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers