From lrandy.webb at gmail.com Mon Mar 8 23:29:52 2010
From: lrandy.webb at gmail.com (Randy Webb)
Date: Mon, 8 Mar 2010 22:29:52 -0600
Subject: SUMMARY: Unable to Create New Boot Environment
Message-ID:

The problem ultimately turned out to be a 'corruption' issue, it seems.
Turning on the debugging option in /etc/default/lu (which I did not know
how to do, but I will REMEMBER it) led right to the problem (creation of
inode failed). I did a newfs on the underlying slice and rebuilt the
metadevice, and the problem cleared up.

Thanks to Anthony, Darren and Espen for their answers and suggestions; all
were valuable.

> I am having some problems creating a new environment on 2 machines. See the
> details below. I have googled the error and found a 'solution' of do a
> devfsadm -Cv and then reboot. That did not work for me. I am still getting
> the error below. I need to patch these servers asap.
>
>
> Attempt #1
>
> bash-3.00# df -k /
> Filesystem            kbytes    used   avail capacity  Mounted on
> /dev/md/dsk/d20     10292260 5824774 4364564    58%    /
> bash-3.00# lustatus
> Boot Environment           Is       Active Active    Can    Copy
> Name                       Complete Now    On Reboot Delete Status
> -------------------------- -------- ------ --------- ------ ----------
> patched_20091115           yes      yes    yes       no     -
> bash-3.00# lufslist patched_20091115
>                boot environment name: patched_20091115
>                This boot environment is currently active.
>                This boot environment will be active on next system boot.
>
> Filesystem              fstype    device size Mounted on          Mount Options
> ----------------------- -------- ------------ ------------------- --------------
> /dev/md/dsk/d20         ufs       10701570048 /                   -
> /dev/md/dsk/d3          swap       8596684800 -                   -
> bash-3.00# metastat d0
> d0: Mirror
>     Submirror 0: d1
>       State: Okay
>     Submirror 1: d2
>       State: Okay
>     Pass: 1
>     Read option: roundrobin (default)
>     Write option: parallel (default)
>     Size: 20982912 blocks (10 GB)
>
> d1: Submirror of d0
>     State: Okay
>     Size: 20982912 blocks (10 GB)
>     Stripe 0:
>         Device     Start Block  Dbase        State Reloc Hot Spare
>         c0t0d0s0          0     No            Okay   Yes
>
>
> d2: Submirror of d0
>     State: Okay
>     Size: 20982912 blocks (10 GB)
>     Stripe 0:
>         Device     Start Block  Dbase        State Reloc Hot Spare
>         c0t1d0s0          0     No            Okay   Yes
>
>
> Device Relocation Information:
> Device   Reloc  Device ID
> c0t0d0   Yes    id1,sd@n500000e1113dfb40
> c0t1d0   Yes    id1,sd@n500000e1113dfb80
> bash-3.00# lucreate -m /:/dev/md/dsk/d0:ufs -n patched_20100210
> Discovering physical storage devices
> Discovering logical storage devices
> Cross referencing storage devices with boot environment configurations
> Determining types of file systems supported
> Validating file system requests
> Preparing logical storage devices
> Preparing physical storage devices
> Configuring physical storage devices
> Configuring logical storage devices
> Analyzing system configuration.
> Comparing source boot environment file systems with the
> file system(s) you specified for the new boot environment. Determining
> which file systems should be in the new boot environment.
> Updating boot environment description database on all BEs.
> Searching /dev for possible boot environment filesystem devices
>
> Updating system configuration files.
> The device is not a root device for any boot
> environment; cannot get BE ID.
> Creating configuration for boot environment .
> Source boot environment is .
> Creating boot environment .
> Creating file systems on boot environment .
> Creating file system for in zone on .
> Mounting file systems for boot environment .
> Calculating required sizes of file systems for boot environment .
> ****************************ERROR HERE****************************************
> ERROR: Cannot make file systems for boot environment .
> bash-3.00# ludelete patched_20100210
> /bin/nawk: can't open file /etc/lu/ICF.2
>  source line number 1
> Boot environment deleted.
> ****************************ERROR HERE*****************************************
>
>
> Attempt # 2
> bash-3.00# metastat d10
> d10: Mirror
>     Submirror 0: d11
>       State: Okay
>     Submirror 1: d12
>       State: Okay
>     Pass: 1
>     Read option: roundrobin (default)
>     Write option: parallel (default)
>     Size: 20901504 blocks (10.0 GB)
>
> d11: Submirror of d10
>     State: Okay
>     Size: 20901504 blocks (10.0 GB)
>     Stripe 0:
>         Device     Start Block  Dbase        State Reloc Hot Spare
>         c0t0d0s4          0     No            Okay   Yes
>
>
> d12: Submirror of d10
>     State: Okay
>     Size: 20901504 blocks (10.0 GB)
>     Stripe 0:
>         Device     Start Block  Dbase        State Reloc Hot Spare
>         c0t1d0s4          0     No            Okay   Yes
>
>
> Device Relocation Information:
> Device   Reloc  Device ID
> c0t0d0   Yes    id1,sd@n500000e1113dfb40
> c0t1d0   Yes    id1,sd@n500000e1113dfb80
> bash-3.00# lucreate -m /:/dev/md/dsk/d10:ufs -n patched
> Discovering physical storage devices
> Discovering logical storage devices
> Cross referencing storage devices with boot environment configurations
> Determining types of file systems supported
> Validating file system requests
> Preparing logical storage devices
> Preparing physical storage devices
> Configuring physical storage devices
> Configuring logical storage devices
> Analyzing system configuration.
> Comparing source boot environment file systems with the
> file system(s) you specified for the new boot environment. Determining
> which file systems should be in the new boot environment.
> Updating boot environment description database on all BEs.
> Searching /dev for possible boot environment filesystem devices
>
> Updating system configuration files.
> The device is not a root device for any boot
> environment; cannot get BE ID.
> Creating configuration for boot environment .
> Source boot environment is .
> Creating boot environment .
> Creating file systems on boot environment .
> Creating file system for in zone on .
> Mounting file systems for boot environment .
> Calculating required sizes of file systems for boot environment .
> ******Same ERROR*******
>
> ERROR: Cannot make file systems for boot environment .
> bash-3.00# ludelete patched
> /bin/nawk: can't open file /etc/lu/ICF.2
>  source line number 1
> Boot environment deleted.
> bash-3.00# cat /etc/*release*
>                       Solaris 10 5/09 s10s_u7wos_08 SPARC
>           Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
>                        Use is subject to license terms.
>                             Assembled 30 March 2009
_______________________________________________
sunmanagers mailing list
sunmanagers at sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

From warren.liang at cox.net Tue Mar 2 20:54:00 2010
From: warren.liang at cox.net (Warren Liang)
Date: Wed, 03 Mar 2010 01:54:00 -0000
Subject: SUMMARY:Last shutdown is later than time on thime-of-day chip
Message-ID: <20100302205235.VZR44.908285.imail@fed1rmwml38>

Hello:

It is a newly built T5140 server. Its system time was way off.

Thanks.

Original post:

What does "genunix: [ID 820358 kern.warning] WARNING: Last shutdown is
later than time on time-of-day chip; check date." mean?

Thanks.
Warren

From cbarnar1 at earthlink.net Tue Mar 2 22:13:22 2010
From: cbarnar1 at earthlink.net (Christopher Barnard)
Date: Wed, 03 Mar 2010 03:13:22 -0000
Subject: SUMMARY: A negative rule in find
In-Reply-To:
References:
Message-ID:

As several folks realized and pointed out, yes, I had two questions with
find in the same script. I separated them into two posts so that they could
have separate summaries.

I asked

> I want to delete all files over a certain age in a certain directory except
> for those owned by root and one other user. Unfortunately, find does not
> appear to honor regex. So !root (or !(root)) does not appear to work. What
> I want to do is
>
> /usr/bin/find /export/temp -user !(root || cacheadm) -exec /bin/rm {} \;
>
> unfortunately find is interpreting the ! as part of the username. Anyone
> have other suggestions on how else I would do this?

The answer

Find *does* honor regex, I just wasn't looking at it right. You cannot
negate the parameter to an expression; you negate the expression itself.
So it isn't

-user !root

it is

! -user root

The code snippet is

###
### Should files owned by certain users be excluded from the list?
### Uncomment exactly one of the ADD definitions below.
### No users to exclude? Uncomment this line.
### ADD=""
###
### Exclude any files owned by root? Uncomment this line.
### ADD=" ! -user root"
###
### Exclude any files owned by root or epicadm? Uncomment this line.
ADD="! -user root -a ! -user epicadm"
###
[...]
/usr/bin/find ${SOURCEDIR} ${ADD} -type f -mtime +${MAXAGE} -print

Other suggestions I received:

* use -not -user instead. That generated a syntax error.
* escape the ! (ie, \!). That works on the command line, but within a script
  you are escaping an escape, so it no longer behaves nicely.
* do an ls grepping out the user before feeding it to find.
* many many suggestions to get the negation to the -user expression, not the
  parameter to the expression (the username), which was the bulk of my
  problem.
* use gnu find to get regex. No, I did not have to go with gnu find, I just
  needed to get the regex syntax correct.
* make sure you ask find to do the regex, not the shell. The shell just does
  plain simple globbing.
* use the -prune directive (-user root -prune). No, that is not what -prune
  is for.
* explicitly list all of the users for which it is ok to delete their files.
  hmmm. maybe not.
* use perl. I'd rather not kill a mosquito with a bazooka.

This list is wonderful. Thanks to the 28 individuals (and counting) who gave
me suggestions.

Christopher L. Barnard
-------------------
comment your code as if the maintainer is a homicidal maniac who knows
where you live.

From cbarnar1 at earthlink.net Tue Mar 2 22:33:49 2010
From: cbarnar1 at earthlink.net (Christopher Barnard)
Date: Wed, 03 Mar 2010 03:33:49 -0000
Subject: SUMMARY: deleting old files but not directories with new files in them [weird find results.]
In-Reply-To: <30E64B10-0C5B-4B2E-AD52-4C269D141B39@earthlink.net>
References: <30E64B10-0C5B-4B2E-AD52-4C269D141B39@earthlink.net>
Message-ID: <7CAE26DB-01F0-4F3D-825D-304206B338AA@earthlink.net>

As several folks realized and pointed out, yes, I had two questions with
find in the same script. I separated them into two posts so that they could
have separate summaries.

I asked

> I am not sure if this is a bug in the sun-shipped find (Solaris 10) or not.
>
> If I delete the files over a certain age, it deletes subdirectories as well if
> they are over the age limit. The problem is that the files in that
> subdirectory can be under the age limit but still get deleted.
>
> ie., I find all files under /export/staff over 365 days old and delete them
> with find. The /export/staff/projects/stew/ directory contains files written
> yesterday should not be deleted. However the directory /export/staff/projects
> is over 365 days old. So find is dutifully doing a recursive delete on the
> projects/ subdirectory even though some of the directories under it have
> recent files. This appears only for a problem with 2nd generation or more of
> depth, btw ( a/ and a/b/ are ok but a/b/c/, a/b/c/d, etc are not). So I want
> to delete a directory only if every file and directory under it for an
> indefinite depth is over the age limit. I thought that was the way find
> worked, but evidently not...
>
> Thoughts? Is what I am describing doable?

Solution

No, it's not a bug. I'm trying to do too much at once. Do it in two passes.
First, using -type f, remove elderly files. Then do a second pass to remove
directories that are empty.

The code snippet

/usr/bin/find ${SOURCEDIR} ${ADD} -type f -mtime +${MAXAGE} -exec rm {} \;
###
### now go through ${SOURCEDIR} and look for directories over the specified
### age and are empty. They can be deleted. An rmdir will fail if there is
### anything in the directory. STDERR is directed to /dev/null to suppress
### the expected errors when a directory is not empty.
/usr/bin/find ${SOURCEDIR} ${ADD} -type d -mtime +${MAXAGE} -exec rmdir {} \; 2>/dev/null

Almost all of the suggestions centered around using -type and making two
passes.

Thanks to the 20-odd and counting folks who pointed out the wonders of the
-type flag and two passes...

Christopher L. Barnard
-------------------
comment your code as if the maintainer is a homicidal maniac who knows
where you live.
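
The two-pass approach above, together with the negated -user test from the
previous summary, can be sketched as a small shell function. This is a
minimal sketch under stated assumptions, not the script from the post:
purge_old and its arguments are hypothetical names. It also makes two
choices the post does not: the directory pass uses -depth so children are
visited before their parents (a whole chain of emptied directories can then
go in one run), and it drops the -mtime test on directories, since deleting
a file updates the mtime of the directory that held it.

```shell
#!/bin/sh
# Hypothetical helper: purge_old DIR MAXAGE [extra find tests...]
# Extra tests are handed to find unchanged, e.g. ! -user root ! -user epicadm
purge_old() {
    dir=$1
    maxage=$2
    shift 2

    # Pass 1: delete regular files last modified more than MAXAGE days ago.
    find "$dir" "$@" -type f -mtime +"$maxage" -exec rm -f {} \;

    # Pass 2: try rmdir on every directory, deepest first (-depth).
    # rmdir refuses non-empty directories, so anything still holding
    # files survives; those expected errors go to /dev/null.
    # Running from inside DIR means the top level shows up as ".",
    # which rmdir will not remove.
    ( cd "$dir" && find . -depth "$@" -type d -exec rmdir {} \; ) 2>/dev/null || true
}
```

A call along the lines of `purge_old /export/staff 365 ! -user root ! -user epicadm`
would mirror the ADD setting shown earlier; try it on a scratch directory
first.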

From John.Oxley at team.telstra.com Mon Mar 8 21:29:11 2010
From: John.Oxley at team.telstra.com (Oxley, John C)
Date: Tue, 9 Mar 2010 12:29:11 +1000
Subject: SUMMARY: Problems configuring MPxIO
Message-ID: <693CD04A19309243A42029FB8079E3F5169CF03F18@WSMSG3104V.srv.dir.telstra.com>

Thanks to everyone who replied to my question. Neil Martin and Rob De Langhe
solved the problem for me.

Neil suggested this wiki:
http://wikis.sun.com/display/StorageDev/Symmetric+Multipath+Support+for+Solaris

and Rob said:

likely the typical error in the "/kernel/drv/scsi_vhci.conf" file...

To enable on Solaris-10:

edit the file "/kernel/drv/fp.conf" and make sure the following entry is set:

mpxio-disable="no";

edit the file "/kernel/drv/scsi_vhci.conf" and make sure the following
entries are set:

load-balance="round-robin";
auto-failback="enable";
device-type-scsi-options-list =
"EMC     SYMMETRIX", "symmetric-option";
symmetric-option = 0x1000000;

Make sure you put exactly the correct number of spaces between "EMC" and
"SYMMETRIX" (the vendor ID field is padded to eight characters, so "EMC" is
followed by five spaces), or it does not recognize the disks as to be
considered by MPxIO

So it appears that just running the stmsboot command is not enough. It's
necessary to manually add the third-party device ID in the
/kernel/drv/scsi_vhci.conf file.

The original question:

I'm trying to configure multipathing on some SAN storage but it appears that
it's not being recognised as a dual path system. Devices 5, 6 & 7 below are
dual pathed with devices 9, 10 & 11, each about 67G. System is Solaris 10
10/08 (update 6). I was expecting to see the 6 devices appear as 3.

I run:
# stmsboot -D fp -e
then reboot
# stmsboot -L
stmsboot: MPxIO is not enabled

Can anyone suggest why stmsboot is not configuring the multipathing?

AVAILABLE DISK SELECTIONS:
       0. c1t0d0
          /pci@0/pci@0/pci@2/scsi@0/sd@0,0
       1. c1t1d0
          /pci@0/pci@0/pci@2/scsi@0/sd@1,0
       2. c1t2d0
          /pci@0/pci@0/pci@2/scsi@0/sd@2,0
       3. c1t3d0
          /pci@0/pci@0/pci@2/scsi@0/sd@3,0
       4. c2t5006048452A58408d0
          /pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58408,0
       5. c2t5006048452A58408d27
          /pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58408,1b
       6. c2t5006048452A58408d28
          /pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58408,1c
       7. c2t5006048452A58408d29
          /pci@0/pci@0/pci@8/pci@0/pci@8/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58408,1d
       8. c4t5006048452A58407d0
          /pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58407,0
       9. c4t5006048452A58407d27
          /pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58407,1b
      10. c4t5006048452A58407d28
          /pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58407,1c
      11. c4t5006048452A58407d29
          /pci@0/pci@0/pci@9/SUNW,emlxs@0/fp@0,0/ssd@w5006048452a58407,1d

regards,
John.

From DRoss-Smith at reviewjournal.com Wed Mar 10 18:03:27 2010
From: DRoss-Smith at reviewjournal.com (DRoss-Smith at reviewjournal.com)
Date: Wed, 10 Mar 2010 15:03:27 -0800
Subject: summary: solaris 8 disk replacement
In-Reply-To: <87a8b8a51003101256x1a1c73adob25c8b23140bc68a@mail.gmail.com>
References: <87a8b8a51003101256x1a1c73adob25c8b23140bc68a@mail.gmail.com>
Message-ID:

Thanks to the couple of folks who replied.
The jbod array is semi-intelligent and likes to assign its own scsi ids to
certain slots. This isn't working correctly. I rebooted the jbod. I now have
my new ultra-320 drive with an id of 15 where it should have an id of three.
I can label the disk but can't seem to get disksuite to write to it. I'm
going to chalk this up to an incompatible replacement drive and move on to
plan b... rooting around in my spares and finding a good used drive as a
replacement.

Michael Horton wrote on 03/10/2010 12:56:28 PM:

> you wrote, "killed the scsi bus"
> now the OS doesn't recognize a new disk.
>
> in troubleshooting, go from the easiest to the hardest.
>
> try a known good hard drive. (a new drive is not a known good drive.)
> try an actual reconfiguration reboot.
> check to see if the new hard drive is seen at the boot prom level
> (ok> prompt)?
> if it isn't you have a hardware problem and maybe a real failure.
> if it is and the reconfiguration doesn't fix it, you have an OS
> subsystem failure.
>
> On Wed, Mar 10, 2010 at 1:30 PM, wrote:
> Hi all-
> It's been a while since I've had to do this...
> I've had a 420r chugging along running solaris 8 and sybase 11 on a
> mirrored jbod for many years. It had uptime of 3.5 years until a couple
> of weeks ago when a disk died, killed the scsi bus and panicked the
> server. The disk was 9GB large and served a good life (from ~1999!). I
> ordered a new 73GB disk, slapped it into the jbod, ran devfsadm, went to
> label it and format doesn't see the disk.
>
> -snip---
> 4. c1t3d0
> /pci@1f,4000/scsi@4/sd@3,0
> ---snip---
>
> but selecting the disk, it asks me to choose the drive type and then lists
> the correct drive type
> AVAILABLE DRIVE TYPES:
> 0. Auto configure
> --snip--
> 23. FUJITSU-MBA3073NP-0102
> --snip--
> Specify disk type (enter its number): 23
>
> selecting c1t3d0
> [disk unformatted]
>
> so I try to format
> format> format
> Ready to format. Formatting cannot be interrupted.
> Continue? y
> Beginning format. The current time is Wed Mar 10 10:27:39 2010
>
> Inquiry failed
> failed
> format>
>
> and at this point I'm stumped.
> iostat -En looks ok....
> Vendor: FUJITSU Product: MBA3073NP Revision: 0102 Serial No:
> BBR0PA102CM7
> Size: 73.54GB <73543163904 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 0 Predictive Failure Analysis: 0
> c3t1d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
>
> Anyone have a fix?
> TIA
> Dean Ross-Smith

From everett.batey at navy.mil Fri Mar 12 13:38:50 2010
From: everett.batey at navy.mil (Batey, Everett II NAVSEA)
Date: Fri, 12 Mar 2010 10:38:50 -0800
Subject: SUMMARY, Answers found .. Help for "+lost system +control +key sunfire netra +v440 -PGP"
Message-ID:

SUMMARY, .. Help for "+lost system +control +key sunfire netra +v440 -PGP"

Looks like a lost key or keys may not be a big problem

-----Original Message-----
From: francisco

Do you mean these keys:
http://shop.ebay.com/?_from=R40&_trksid=p3907.m38.l1313&_nkw=v440+key&_sacat=See-All-Categories

I've a couple v440's, both use the same key. You might also search google
products for "v440 key", sort price lowest to highest, i think it's also
p/n 240-4341 but doublecheck with a vendor:
http://sunsolve.sun.com/handbook_pub/validateUser.do?target=Systems/SunFireV440/components&source

Just noticed Sun has a 'buy' link there too, $30.

-----Original Message-----
I can probably dig up a key for you.
Scott

-----Original Message-----
If I remember there are two keys one larger one where the handle is triangle
shaped and one smaller one.
The larger key controls system power up for the front panel; I believe the
smaller key controls disk access. I have the larger key, and maybe the
smaller one, it might be for an e240, that I kept as a memento that I could
send you if you can't find a replacement.
Jordan

-----Original Message-----
don't know about the V440, but with my E250's that is certainly the case.
I have a store room with several surplussed E250's that I harvest for spare
parts/bodies when emergencies come up. Many of them arrived without keys or
drives. Any one key will fit any E250.
Chris

-----Original Message-----
I Google'd "sun v440 key" and found reference to part# 240-4341 and further
searches suggest that it may also be part# 240-4429
Allan

-----Original Message-----
I've got a handful of V440's we're getting rid of. I think the netra version
is about the same hardware wise as the standard model. As I recall the same
key is used for everything. I probably have a few keys sitting around
somewhere, assuming I can find them I could drop you one or two in the mail.
I'm out of the office today but could look around on Monday and see about
digging them up if you want. All of mine use the same key, ...
Paul

----------
Thanks folks .. Again rescued by the net ..
--
R / Everett Batey
Cell & Days: (805) 616-2471

From lrhazi at gmail.com Tue Mar 16 07:59:05 2010
From: lrhazi at gmail.com (Mohamed Lrhazi)
Date: Tue, 16 Mar 2010 08:59:05 -0400
Subject: Summary: Can a user-process cause Solaris to boot-up, with having shutdown?
Message-ID:

Thanks a lot to all,

Most responses agreed this can be power failure, or much more likely Oracle
RAC evicting the node and forcing it to shutdown "improperly" to avoid having
it "save anything"...
We confirmed from Oracle RAC log files on other nodes that the host "elli"
was indeed evicted, and also found a bug report mentioning that:

6389053 Oracle RAC defines "uadmin 1 1" as reboot command, trips over boot
archive check after reset

So that would explain the observed system logs.

Thank you all very much.
Mohamed.

On Mon, Mar 15, 2010 at 11:57 PM, Mohamed Lrhazi wrote:
> My question in other words is: What can explain a system crash where
> the logs go from "usual log lines" to a sudden system boot messages
> like this:
>
> ================
> Mar 15 21:11:55 elli genunix: [ID 540533 kern.notice] ^MSunOS Release
> 5.10 Version Generic_138888-03 64-bit
>
> Mar 15 21:11:55 elli genunix: [ID 172908 kern.notice] Copyright
> 1983-2008 Sun Microsystems, Inc. All rights reserved.
>
> Mar 15 21:11:55 elli Use is subject to license terms.
>
> ================
>
> That is, no panic message, no "am going down..." or anything at all!
>
> Could it be power failure? Could there be other explanations?
>
> This system is part of an Oracle cluster, and I don't know the
> software at all, but from OS perspective, am tending to think that
> Oracle is just a user process and could not be doing anything behind
> the kernel's back, but maybe I am wrong.
>
> I also don't know what the linefeed character is doing there in the first
> boot line!
>
> Thanks a lot.
> Mohamed.

From Syed.Hosain at aeris.net Thu Mar 18 11:28:25 2010
From: Syed.Hosain at aeris.net (Syed Zaeem Hosain (Syed.Hosain@aeris.net))
Date: Thu, 18 Mar 2010 09:28:25 -0700
Subject: SUMMARY: Question about replacing CPU fan (or CPU heatsink) on a V240.
References:
Message-ID:

Hi, all.

Thanks to everybody who responded.

1. I had two people point me to the correct documentation at the Sun web
site: http://dlc.sun.com/pdf/817-4048-13/817-4048-13.pdf. Thanks!
My other info was not sufficient for sure.

2. Per the instructions, you have to be careful about which thermal material
is (a) currently on the existing CPU/heatsink and (b) what is provided with
older vs. newer 440 systems. The documentation is clear about how to proceed
... depending on whether you have the thermal paste or the glue pad. However,
the procedure involves having the processor freshly turned off (i.e., still
hot) in one situation - see the doc - so some experimentation is needed if
you do not know the answer.

3. My new heatsink came with a thermal glue pad, so I planned to use it as a
replacement after cleaning the old material off.

4. From the ten or eleven responses, only one person said that the "replace
the fan" idea would work, since he had recently done this about 4 months ago.
I took that as a good omen later - see next few lines. :)

5. I went looking for isopropyl alcohol around here to clean the surface of
the cpu, but could not find any handy, so I took the risk of removing the old
fan and installing a new one - *without* removing the existing heatsink at
all.

6. The fan is held in place with two screws (not four as I had incorrectly
mentioned earlier), and is fairly easy to remove. Just have to be careful to
move the fan wires out from under the two metal guides on the heat-sink
without moving it around. And you may have to remove some of the memory cards
that are in the way to reach the fan screws.

7. You have to be careful with the small Phillips screwdriver, and you have
to be careful not to drop the screws - our system is rack-mounted, so finding
a dropped screw could have been painful. Fortunately, my dropped screw landed
on top of the small green heatsink next to the fan, so it was an easy
retrieval. Whew!

In any case, the system powered up fine, both CPU fans are now working and
the system is back in operation.

Thanks again, everybody!

Z

-----Original Message-----
From: Syed Zaeem Hosain (Syed.Hosain at aeris.net)
Sent: Wednesday, March 17, 2010 11:37 AM
To: sunmanagers at sunmanagers.org
Subject: Question about replacing CPU fan (or CPU heatsink) on a V240.

Hi, all.

CONTEXT:

We received a CPU fan error message on one of our V240 systems. I powered
down and pulled the cover and sure enough, one of the fans (there are two
tiny ones on the CPU heatsink) is sticking and not spinning too well when I
move it by hand or an air can. The other one does fine.

I ordered a replacement, and instead of just a fan, I received an entire
heatsink with two fans, etc. More expensive, but ... okay!

FWIW, it appears that removing and changing the fan may be more pain than
needed - there are four tiny screws that could get stripped, torqued down too
tightly to spin, or come loose, etc. So, I decided to proceed with changing
the heatsink instead. (Although I am still open to just changing the fan if
people feel that is the better approach).

QUESTIONS:

My experience with changing heatsinks (on Intel windows systems for example)
required cleaning the heatsink and cpu surface carefully, using thermal paste
(Arctic Silver 7 for example) between the heatsink and the processor, etc.,
etc., etc.

Physically changing the entire heatsink appears to be very simple on the V240
from what I can see, *BUT* the instructions I have found do not state
anything about using thermal paste, etc.

What are the Sun requirements here? Will I mess things up by using thermal
paste? Do they require thermal material? Would it be better if I did so? Or
would be better to simply remove the old heatsink and install the new one
(without any paste, etc.)?

(Or should I simply attempt to unscrew the fan from the old heatsink and
replace it with a new one from this new heatsink)?

Thanks in advance, and I will summarize.
Z _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From Syed.Hosain at aeris.net Wed Mar 17 17:59:26 2010 From: Syed.Hosain at aeris.net (Syed Zaeem Hosain (Syed.Hosain@aeris.net)) Date: Wed, 17 Mar 2010 15:59:26 -0700 Subject: SUMMARY Re: Question about replacing CPU fan (or CPU heatsink) on a V240. References: Message-ID: Hi, all. Thanks to everybody who responded. 1. I had two people point me to the correct documentation at the Sun web site: http://dlc.sun.com/pdf/817-4048-13/817-4048-13.pdf. Thanks! My other info was not sufficient for sure. 2. Per the instructions, you have to be careful about which thermal material is (a) currently on the existing CPU/heatsink and (b) what is provided with older vs. newer 440 systems. The documentation is clear about how to proceed ... depending on whether you have the thermal paste or the glue pad. However, the procedure involves having the processor freshly turned off (i.e, still hot) in one situation - see the doc - so some experimentation is needed if you do not know the answer. 3. My new heatsink came with a thermal glue pad, so I planned to use it as a replacement after cleaning the old material off. 4. From the ten or eleven responses, only one person said that the "replace the fan" idea would work, since he had recently done this about 4 months ago. I took that as a good omen later - see next few lines. :) 5. I went looking for isopropyl alcohol around here to clean the surface of the cpu, but could not find any handy, so I took the risk of removing the old fan and installing a new one - *without* removing the existing heatsink at all. 6. The fan is held in place with two screws (not four as I had incorrectly mentioned earlier), and is fairly easy to remove. Just have to be careful to move the fan wires out from under the two metal guides on the heat-sink without moving it around. 
And you may have to remove some of the memory cards that are in the way to reach the fan screws. 7. You have to be careful with the small Phillips screwdriver, and you have to be careful not to drop the screws - our system is rack-mounted, so finding a dropped screw could have been painful. Fortunately, my dropped screw landed on top of the small green heatsink next to the fan, so it was an easy retrieval. Whew! In any case, the system powered up fine, both fans are now working and the system is back in operation. Thanks again, everybody! Z -----Original Message----- From: Syed Zaeem Hosain (Syed.Hosain at aeris.net) Sent: Wednesday, March 17, 2010 11:37 AM To: sunmanagers at sunmanagers.org Subject: Question about replacing CPU fan (or CPU heatsink) on a V240. Hi, all. CONTEXT: We received a CPU fan error message on one of our V240 systems. I powered down and pulled the cover and sure enough, one of the fans (there are two tiny ones on the CPU heatsink) is sticking and not spinning too well when I move it by hand or an air can. The other one does fine. I ordered a replacement, and instead of just a fan, I received an entire heatsink with two fans, etc. More expensive, but ... okay! FWIW, it appears that removing and changing the fan may be more pain than needed - there are four tiny screws that could get stripped, torqued down too tightly to spin, or come loose, etc. So, I decided to proceed with changing the heatsink instead. (Although I am still open to just changing the fan if people feel that is the better approach). QUESTIONS: My experience with changing heatsinks (on Intel windows systems for example) required cleaning the heatsink and cpu surface carefully, using thermal paste (Arctic Silver 7 for example) between the heatsink and the processor, etc., etc., etc. Physically changing the entire heatsink appears to be very simple on the V240 from what I can see, *BUT* the instructions I have found do not state anything about using thermal paste, etc. 
What are the Sun requirements here? Will I mess things up by using thermal paste? Do they require thermal material? Would it be better if I did so? Or would it be better to simply remove the old heatsink and install the new one (without any paste, etc.)? (Or should I simply attempt to unscrew the fan from the old heatsink and replace it with a new one from this new heatsink)?

Thanks in advance, and I will summarize.

Z
_______________________________________________
sunmanagers mailing list
sunmanagers at sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

From redtigra at gmail.com Mon Mar 22 09:23:22 2010
From: redtigra at gmail.com (Eugenia Shuiskaya)
Date: Mon, 22 Mar 2010 15:23:22 +0100
Subject: SUMMARY: heavy paging
Message-ID: <5c0b76411003220723p3971d0adv7e9bd85c9a1a56b8@mail.gmail.com>

Hi All,

I would like to thank all who answered and explained that sched is just waiting for the disk and should not be blamed. Special thanks to Gary Paveza, who drew my attention to the high fpi, which means active reading from the file system. iosnoop from the DTraceToolkit showed me a process that was reading a lot of small files. The files were moved to CFS and the problem has been solved.

Thanks again and best regards,
es

> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 16 Mar 2010 18:08:11 +0100
> From: Eugenia Shuiskaya
> Subject: heavy paging
> To: sunmanagers at sunmanagers.org
> Message-ID:
> <5c0b76411003161008w7637b6f8r3dbbcb681d26e5f2 at mail.gmail.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi All,
>
> We ran into heavy I/O on our zfs disk and could not find the reason.
> Solaris 10 10/09.
>
> Here is zfs info:
>
> pool: rpool
> state: ONLINE
> status: The pool is formatted using an older on-disk format. The pool can
> still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'.
Once this is done, the > pool will no longer be accessible on older software versions. > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > rpool ONLINE 0 0 0 > mirror ONLINE 0 0 0 > c1t0d0s0 ONLINE 0 0 0 > c1t1d0s0 ONLINE 0 0 0 > > > dtrace script iotop shows: > > # iotop d sd0 Co ** > > UID PID PPID CMD DEVICE MAJ MIN D DISKTIME > > 0 0 0 sd0 32 0 60104 > > 0 0 0 sched sd0 32 0 W 4554988 > > > 2010 Mar 16 17:51:16, load: 3.25, disk_r: 0 KB, disk_w: 233648 KB > UID PID PPID CMD DEVICE MAJ MIN D DISKTIME > 0 0 0 sched sd0 32 0 W 4864154 > > > I run whoispaging.d ang got the result: > > > # ./whoispaging.d > Who's waiting for pagein (milliseconds): > Who's on cpu (milliseconds): > ScriptAgent 20 > Supervisor 24 > dtrace 28 > psSTAT64_10 33 > data 51 > dmh 65 > fsflush 178 > rdh 1999 > rlh 6515 > sched 217344 > > > I assumed that the virtual memory is low, but prstat shows I'm wrong: > #prstat -Z... > ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE > 0 188 40G 38G 59% 4:11:50 3.4% global > > # vmstat -p 3 > memory page executable anonymous > filesystem > swap free re mf fr de sr epi epo epf api apo apf fpi fpo > fpf > 68783448 43892160 235 226 22 0 3 0 0 0 0 0 0 1531 23 > 22 > 41939512 15828032 298 871 0 0 0 0 0 0 0 0 0 868 0 > 0 > 41902168 15797808 197 123 0 0 0 0 0 0 0 0 0 1312 0 > 0 > 41911104 15782904 320 3 0 0 0 0 0 0 0 0 0 2859 0 > 0 > 41910920 15759696 232 177 0 0 0 0 0 0 0 0 0 1007 0 > 0 > 41823336 15734752 212 6 0 0 0 0 0 0 0 0 0 2132 0 > 0 > 41779384 15712160 177 4 0 0 0 0 0 0 0 0 0 1697 0 > 0 > > A lot of free memory - and high fpi. It's paging a lot. > > And I cannot understand, how I could find the reason of such heavy paging. > Any advise, thought, opinion would be very appreciated. 
> > Thanks a lot and best regards, > Evgenia _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From zhu_junca at yahoo.ca Mon Mar 22 21:35:41 2010 From: zhu_junca at yahoo.ca (Carl E. Ma) Date: Mon, 22 Mar 2010 19:35:41 -0700 (PDT) Subject: Summary: nfs questions In-Reply-To: <476660.91239.qm@web56705.mail.re3.yahoo.com> Message-ID: <552210.65346.qm@web56705.mail.re3.yahoo.com> Thanks for the response from Bismark Espinoza, Darren Dunham, Darren Brechman-Toussaint. There is no quicker way to dump quota information directly in order to find out files owned by particular user. ncheck and quota are the approach, although it is time consuming if you have a big filesystem. For nfs, showmount can map the shared directory to remote nfs client. With the aid of find, I can find out how many filles were not accessed in the last one year. Therefore, I can do the cleanup work. Darren Dunham also suggest using nfslog, which can outline nfs daemon activities. Thanks for all your help, Zhu --- On Tue, 3/2/10, Carl E. Ma wrote: From: Carl E. Ma Subject: nfs questions To: sunmanagers at sunmanagers.org Received: Tuesday, March 2, 2010, 1:08 PM Hello All, I have one solaris 9 NFS server with multi Terabyte nfs sharing. The filesystem is Veritas vxfs 4.1 with quota enabled. I need to find out where are the file/directory for particular user. Using find or ncheck is not efficient on such big filesystem. It took a hour to finish one run.:-) My understanding of quota filesystem is it maintains an internal structure to track inode usage information for every user. Is it possible I can access that information directly? I assume that could be very quick. Second question is whether it is possible to find out which nfs sharing is being used by which client? I have scratched my head for a hour. Could anyone share their thoughts? I will summarize. 
Thanks in advance, zhu _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From dmagda at ee.ryerson.ca Tue Mar 23 06:24:24 2010 From: dmagda at ee.ryerson.ca (David Magda) Date: Tue, 23 Mar 2010 07:24:24 -0400 Subject: SUMMARY: tool for managing, reporting, and auditing patches In-Reply-To: <530E5686-BAAB-4383-B8CD-5925057FB261@ee.ryerson.ca> References: <530E5686-BAAB-4383-B8CD-5925057FB261@ee.ryerson.ca> Message-ID: <12C9159E-379B-49E3-8AA9-6AA79937676F@ee.ryerson.ca> A few replies. Stuart Saxon mentioned Service'ability Reliability Acces'ability Support'ability (SRAS) and "Traffic Light Patching" (TLP). TLP however was EOSL in December 2007 from what I can tell: http://blogs.sun.com/patch/entry/patch_automation_tools Damir Delija brought up SANS' "User Vetted Tools" (see Critical Control 10), and some do mention Solaris support on their web page: http://www.sans.org/critical-security-controls/user-tools.php Rob De Langhe mentioned that the Explorer tools collect patch information, but that's not much different than PCA. Karl Vogel said that I should probably just go with PCA plus some home-grown scripting. So it seems that there's no consensus on this.
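For anyone going the home-grown route, here is a minimal sketch of the reporting side. It is not pca and not anything a responder posted; it only assumes the usual `showrev -p` line layout ("Patch: <id>-<rev> Obsoletes: ..."), and the function name latest_revs is made up for illustration. It reduces a patch listing to one line per patch ID at its highest installed revision, so the per-host lists can be compared with diff or comm.

```shell
#!/bin/sh
# Sketch only: summarize `showrev -p`-style output read from stdin.
# Assumption: each patch line starts with "Patch: <id>-<rev>".
# latest_revs is a hypothetical helper name, not part of any tool.
# On Solaris use nawk or /usr/xpg4/bin/awk; the old /usr/bin/awk may not
# accept the "in" test used below.

latest_revs() {
    awk '$1 == "Patch:" {
        split($2, p, "-")                 # p[1] = patch ID, p[2] = revision
        if (!(p[1] in rev) || p[2] + 0 > rev[p[1]] + 0)
            rev[p[1]] = p[2]              # keep the highest revision seen
    }
    END { for (id in rev) print id "-" rev[id] }' | sort
}

# Typical use on a Solaris host (not executed here):
#   showrev -p | latest_revs > /var/tmp/patches.`hostname`
```

Collecting one such file per host and running diff between them gives a crude audit of which machines have drifted apart.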
On Mar 16, 2010, at 07:21, David Magda wrote: > I recently posted a question on the SAGE list [1] asking about tools > that could help in managing, reporting, and auditing installed > patches on Solaris (and Linux) machines. In the past we haven't > worried too much about it at $WORK, mostly focusing on keeping up-to- > date on network-accessible stuff (SSH, Apache, FTP, BIND, etc.), but > it's been suggested the Unix sys admin team be a more stringent like > our Windows brethren. > > While PCA seems to be canonical way to install patches, there > doesn't seem to be a canonical way of auditing which patches are > installed. The main tools mentioned in the SAGE thread were: > > . pca, with home-grown scripting built around it for reporting > . Sun/Oracle Ops Center [2] > . BigFix [3] > . Bfg2 > . Lumension (formerly PatchLink, with one recommendation /against/ > it) [4] > . Nagios, with the "check_solaris_pca" plug-in > . GFI Languard (at least for Linux) [5] > . radmind [6] > . Tenable's Nessus 3: not open-source like Nessus 2 and its fork > OpenVAS > . RHN (for Linux), and it's open source cousin Spacewalk > > So are the people on SunManagers doing any kind of reporting and > auditing of patches? If so, can you recommend any FOSS or commercial > products for Solaris (and/or Linux)? Any stuff that should be avoided? > > Thanks for any info. 
> > > [1] http://mailman.sage.org/pipermail/sage-members/2010/thread.html#00345 > [2] http://www.oracle.com/us/products/enterprise-manager/opscenter/ > [3] http://www.bigfix.com/content/patch-management > [4] http://www.lumension.com/vulnerability-management/patch-management-software.aspx > [5] http://www.gfi.com/lannetscan/patch-management.htm > [6] http://rsug.itd.umich.edu/software/radmind/ > _______________________________________________ > sunmanagers mailing list > sunmanagers at sunmanagers.org > http://www.sunmanagers.org/mailman/listinfo/sunmanagers _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From victor.engle at gmail.com Tue Mar 23 08:16:37 2010 From: victor.engle at gmail.com (Victor Engle) Date: Tue, 23 Mar 2010 09:16:37 -0400 Subject: SUMMARY: ZFS in Production Message-ID: Sorry for the delay with this summary. I was surprised that production use of zfs pretty common. Unfortunately my client was not willing to switch from his tried and true ufs. The common thread in the responses was that zfs is particularly well suited as storage for file services. Less well suited for databases but still very good provided tuning best practices were followed. One responder pointed out that you can't take a lun back from zfs as you can with a veritas disk group for example. All the responses I received follow... ############################################################################# I wrote a perl script to take hourly snapshots and to remove old ones. The idea for it came from the Apple's TimeMachine concept. The thinking behind is that the users should be able to go back to a snapshot several months old and copy back a file they deleted. However, unlike the Apple TimeMachine, this script does NOT make snapshots in a second drive. So, if the drive dies, original data and the snapshots all go with it. 
I'll work on the next iteration of the script to move the snapshots to a second drive/array. ########################################################################### I've been using it for the past year, plus some. I've been using the compression feature to conserve disk space. I usually create one large pool, and create all the filesystems under it. (All of the free space is share by the pool, and not locked away in separate filesystems.) I've been using the snapshot feature to do point in time backups. It greatly simplified the scripts I had written to do UFS snapshots with 'fssnap'. ####################################################################### We've been using zfs for years. You don't describe your application, depending on the particulars, you may want to look into tuning: http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide We make some modifications for Oracle with SAN disk, but for most uses don't change anything. As far as non-evil tuning, depending on your app you may benefit from compression and other settings. Even parts of Oracle benefits from compression - we found that RMAN backups happen faster and save space when sent to zfs with compression enabled. Test out settings on your app. Also be sure to read and understand the Best Practices guide: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide We use snapshots for many different situations. On some servers we snap partitions on a daily basis and keep these for 2-7 days, providing us with instant recover capabilities. To do this be sure to understand how space is consumed with snapshots. We also snap before upgrades to provide rollback and comparison in case something doesn't work as expected. In some cases we snap for transmission to remote site, using zfs send/receive capabilities. Still not automated or regular, though. We're also investigating using snapshots for backup, e.g. snap partitions and backup the snaps instead of the actual partitions. 
When using snaps be sure to understand how they consume space, how your application reacts to the snap (e.g. does database need to be down for consistency) and be sure to clean up after yourself. We've had some admins create a snap, forget about it for a year, and then it causes problems consuming too much unused space. Good luck, -f http://www.blackant.net/ ################################################################################### The attached text file is the perl script. It has a "requre" line which I commented out. That's a localism to get some variables pre-loaded for our machines. If you find some variables (ie: $HOSTNAME) missing stuff you can figure out pretty early on. This particular script is being used for a RAID array that has large datasets. The lab tells me that the original data can be re-created so they don't want to back it up. This is a way to guard against accidental file deletions. This is NOT a backup strategy. HW = Sun Blade 2000 + Anacapa RAID arrays. Production and Dev. We use ZFS for SAN luns to aggregate them to a large filesystem and to lay OpenAFS on top. (See www.openafs.org to see an enterprise-class multi-platform filesystem) We also use ZFS to make filesystems out of Sun J4400 JBOD arrays. We have ~5-7 TB of production data that uses ZFS at the moment. All production. Does not use this script to make backup snaps as OpenAFS has its own backup snapshots. We run Solaris on Sun SPARC hardware, not x86, if that makes a difference. ##################################################################################### Awesome. Bloody easy and lots of value/features above/beyond ufs. I've not moved to zfs boot environment. But, everything else on my newer Solaris 10 systems is ZFS. I love it. I'm using it on a StorageTek 2510 iSCSI device and on a J4200 multi-homed SAS device. Also on the extra internal drives (beyond the hardware mirrored boot drive). You'll find recipes online for rotating snapshots. 
They take virtually no resources unless/until they are needed for data replication. I implemented a basically endless version of that that allows me to recover data files from any day in the semester for our bio-imaging class/laboratory.

#!/bin/ksh
# this should be run off cron before midnight every night.
# it will generate a date stamped snapshot of those zfs filesystems we deem important.
# `date +%Y%m%d%H%M` generates a date stamp for the snapshot name of the form 200909151431,
# which corresponds to Sept. 15, 2009 at 2:31pm. These can be viewed with `zfs list`.
# for a quick overview of recovery, roll back, etc., see
# https://www.sun.com/offers/details/zfs_snapshots.xml
/usr/sbin/zfs snapshot biopool/bioimaging@`date +%Y%m%d%H%M`;
/usr/sbin/zfs snapshot biopool/capstone@`date +%Y%m%d%H%M`;
/usr/sbin/zfs snapshot biopool/quantbiol@`date +%Y%m%d%H%M`;
/usr/sbin/zfs snapshot biopool/students@`date +%Y%m%d%H%M`;
/usr/sbin/zfs snapshot jpool/outreach@`date +%Y%m%d%H%M`;

of course, I have to clear those out manually, and/or create a script to clear them out. However, I still haven't gotten a clear policy statement from the faculty on what time frame they need; and, since it still isn't eating much space compared to what I have, it all still sits there.

The one complaint I have is that there is no replacement for ufsdump/ufsrestore. The zfs send and receive only does full file systems. You cannot recover just one file or directory using those tools. So, they aren't really functional for backups. Amanda can be configured to use them, but it turns out that gnutar is more functional in terms of the use cases that typically turn up.

That's one reason I haven't gone to zfs boot environments. I depend on fssnap and ufsdump/ufsrestore configured with a wrapper in Amanda to backup my primary partitions. It could be I just don't know enough yet or don't have enough experience to have confidence, and, perhaps, zfs might actually be better for boot environments.
Anyway, my setup works fine for me. Something I haven't seen a clear analysis of yet is higher level failure modes for zfs. Ok, so I have raid 6 with a hot spare configured using zfs raidz2. But, what if I was away on vacation, didn't pay attention, and I got 3 drives failed. Now, suppose I had two of those raidz2 (or even more for that matter) configured into one zpool. Do I lose the entire zpool? Or will it self heal and at least give me what data happened to be on the surviving raid components? Based on the lack of an answer for this, I don't use one giant zpool with multiple raidz components for large scale storage. That might be easier in terms of allocating space and the usage of space, but I don't want to risk it. So, I use multiple zpools rather than putting all my eggs in one basket. This is much better than the old situation of ufs where you have a gazillion drives with even more partitions, and you have to manage who gets what space and where directories are mounted and all that stuff. Some of my Solaris 9 systems with over 30 individual drives have become a bear in that respect. With zfs, I can ask a faculty member to buy a drive and I will allocate him a certain amount of space, but I don't have to just mount that drive and give him a directory on it. The disadvantage is that I can't just add a drive to a zpool and have any kind of raid protection. I have to wait until I have enough drives to make sense and set up a raidz. Then I can set reservations if they are appropriate. I would like a more incremental way of adding space to a zpool or even to a raidz component. Say, I have 3 drives in a raidz and want to expand it to 5 drives. I have to zfs send the data to something else, destroy the raidz and remake it, and then bring all the data back. I found one of the people who works on zfs posted a specification of how to do that and invited people to implement it (it's open source after all). But, as far as I know, it hasn't been done. 
Anyway, I've been using zfs for a couple of years and love it. If you happen to have a StorageTek 2510 iSCSI device or something comparable, I posted a very detailed summary on the sun managers list on how to set that up with zfs. However, if you haven't got one, I would recommend against it. The J series is just way easier to set up. The J4200 or J4400 with SAS. One thing I haven't done is to use the SSD drives to boost overall speed. That hasn't been an issue for me. But, if raw speed really is an issue, zfs can make use of an SSD for it's write intent log and end up running much faster. As far as I know this isn't possible with any other existing file system. It's part of what allows the Sun 7000 series storage systems to beat out NetApp in price/performance. -- --------------- Chris Hoogendyk ################################################################################################# We use ZFS extensively in all environments including production. We have been migrating systems from Veritas and SVM OS filesystems to ZFS with great success. Our newly jumpstarted systems are all using zfs root. The best advantage is the ability to create a new boot environment using snapshots, then patch that BE live and simply reboot into it. Turns a 3-hour downtime for patching into 10min. We use ZFS to create application filesystems using internal drives in a mirrored pool. I've personally used it to manage SAN LUNs, though this is not ready for prime-time since we absolutely need multipathing and MPXIO does not make it straightforward to test that both paths are actually active...it'll just tell you whether mpxio is turned on or not. We need explicit tests for when we do SAN maintenance. I've tested ZFS fault tolerance on internal drives (either system or for app filesystems) and it works as advertised, and it's real easy to replace failed disks. 
Dave Foster ################################################################################################## Hi Victor, I've been using zfs in a production oracle environment for about a year without any issues, although we have not used the snapshot technology. Pete ################################################################################################### Hi Vic I have implemented (an early version) of ZFS with 11 x T3 arrays with a V480 for around 2 years now in a production server giving both NFS and Samba file serving to a workgroup of 50 people and have approx 8Tb of data managed. So far it has held together really well without incident, very happy with it. Weekly backups are approx 3Tb to ZFS disk. We snapshot every night and keep 14 days worth of snapshots available. Also use snapshot for tape backups. A real winner with the Windows users to beable to get last weeks version..... Peter ###################################################################################################### ZFS is the best thing that has come out of the OS industry in the past years IMO. I've used it for years now, I use ZFS to create Sun virtual servers within each specified ZFS. I've created scripts that send incremental changes to off-site servers, perfect for DR scenarios. The list goes on. Can save a company hundreds of thousands of dollars by virtualizing and replicating data. I would not hesitate to use it in a production environment. ################################################################################################## Hi Victor, Victor Engle wrote: ZFS has been available for some time now and I'm currently working on a small project where ZFS would be perfect for the customer. Before deciding though I wanted to ask this group about any experiences you may have with deploying ZFS in production environments. 
no problems with 50+ servers with ZFS for root FS and no problems with 200+ servers with ZFS for data FS

I'm especially interested in ways you may have leveraged ZFS snapshots.

sorry, we don't use snapshots

HTH
Tobias
#####################################################################################################
My biggest problem is the way that I/O to a pool with a bad disk slows to a crawl until it's detached, despite remaining mirrors. SVM usually doesn't do that. I've seen ZFS resilvers saturate the disk subsystem too; a way to avoid starvation has been requested from Sun for over a year.
################################################################################
Just remember one thing, Vic: you currently cannot take away devices you add to a zpool, which would result in you recreating your zpool if one of the devices had to be removed. I have used zfs in a prod environment and it all depends on what you are trying to do with it. We are using one of Sun's open storage products that has zfs built in, which was purchased to replace 3 EMC CLARiiONs (note: not based on my recommendation), and to say the transition has been less than flawless is an understatement. We have nothing but problems problems problems. I am not trying to trash Sun or not recommend zfs for what you are doing, but I am just sharing my experience.

Thanks,
#############################################################################################
Hi Victor,

ZFS is certainly production ready. I have been using it for quite a few years now in various configurations and for me it has been very reliable. Primarily I am using it in both ZFS mirrors and RAIDZs. Biggest RAIDZ is a 35TB configuration on a J4500 used as a staging area in Disk->Disk->Tape for Veritas NetBackup. Handles around 3-4 TB a day of new data and data being flushed. This is on a 10 GigE attached T5220 server.
Another great use I have is for Squid web caches, where the ARC cache proves to be a great addition for speeding up web queries. My caches have between 1500 and 2000 simultaneous connections to them. Works a treat there for a 24/7 type workload. Also using it for compliance archiving for email on another J4500.

If you really want to get a lot of info on who is using it, try having a look through the zfs-discuss at opensolaris.org archives. This question gets asked regularly, with plenty of good use cases being reported. I have certainly done so there in much more detail than above.

Hope this helps.

/Scott.

PS. Further to my previous email. I have found snapshots to be fantastic in multi-stage upgrades of various software. I have used them to great effect with a Blackboard upgrade that involved multiple Oracle updates and schema changes. If there was a problem at any stage we just found the cause, rolled back, fixed it and then did the step over until we had a successful outcome. At that point we snapshotted and moved on to the next step. Very, very helpful with a 20 GB database!
##################################################################################################
On a Solaris x86 (actually a Nevada deployment) I've set up ZFS snapshots for a client to do daily snapshots of his accounting database that is presented as a Samba share to the clients. Haven't heard any complaints etc. for the past 3 years...
#########################################################################################################
First off, ZFS is production-ready; at this point it's not a concern, with it being out for so long now.

I am using ZFS snapshots for a group of users who need multiple copies of the same data from production, yet need to modify it for each test environment, which is an ideal use of ZFS snapshots + clones. I'm able to provide 8+ sets of test data, and no additional blocks are used until they change data within each test environment.
Furthermore, because it's test data, I've set ZFS compression on the filesystem to save disk space; with some types of data ZFS compression is actually faster, because if the data is highly compressed in memory, then there is less I/O. The whole process is very scriptable, and the only command used is different variations of 'zfs'.

RCA
###############################################################################################################
_______________________________________________
sunmanagers mailing list
sunmanagers at sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

From cbarnar1 at earthlink.net Tue Mar 23 21:43:09 2010
From: cbarnar1 at earthlink.net (Christopher Barnard)
Date: Tue, 23 Mar 2010 21:43:09 -0500
Subject: Summary: sftp login
In-Reply-To:
References:
Message-ID: <6F370639-50BB-4EBC-B2C3-3DA519233F2F@earthlink.net>

I asked:

> I've just about finished setting up an anonymous sftp site. The only issue I
> still have is that the login to sftp is like scp or ssh -- it assumes the
> username of the issuer of the command unless you specify otherwise. I'd like
> to get the prompt more like telnet or ftp -- no matter what user you may be
> logged in as locally, you are prompted for both a username and password when
> logging in.
>
> I have not seen any reference to being able to do this with sftp, but I could
> have missed it. Has anyone seen a way to get sftp (or ssh for that matter) to
> not assume the username to be used?

The answer: It cannot be done. The user is defined by the client side before connecting to any server. Several people suggested UseLogin in the sshd_config file, but that did nothing. I've put the proper usage in the /etc/issue file so that inbound connections will know that they need to specify

Thanks to:
Sebastian Muniz
Safdar Mirza (hi, Safdar!)
Michael Horton
Nick Hindley
Brian Dunbar
David Magda

Christopher L.
Barnard ------------------- comment your code as if the maintainer is a homicidal maniac who knows where you live. _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From rvandolson at esri.com Thu Mar 25 11:27:55 2010 From: rvandolson at esri.com (Ray Van Dolson) Date: Thu, 25 Mar 2010 09:27:55 -0700 Subject: SUMMARY: Source IP routing w/ ipfilter In-Reply-To: <20100323061154.GA21704@esri.com> References: <20100323061154.GA21704@esri.com> Message-ID: <20100325162755.GA23773@esri.com> Hi all, thanks for replies from: Michael Horton John Hallman Crist Clark It sounds like this is currently not really possible with ipfilter. Based on feedback I got on the ipf mailing list, it sounds like folks _expected_ that it work and were interested in using dtrace to figure out why it doesn't. However, the multihome setup I'm using is most typically done when the interfaces are on different subnets. The way we had things set up just wasn't really something planned for. Recommendations are to rework the infrastructure and use IPMP/LACP if using both physical links is truly required. The ipf syntax I was using is correct in theory, it just doesn't do what I'd expect. Ray On Mon, Mar 22, 2010 at 11:11:55PM -0700, Ray Van Dolson wrote: > I have a Solaris 10 machine with two interfaces, both with IP's on the > same subnet: > > igb0: 10.49.2.110/16 > igb2: 10.49.2.111/16 > > Routing Table: IPv4 > Destination Gateway Flags Ref Use Interface > -------------------- -------------------- ----- ----- ---------- --------- > default 10.49.254.254 UG 1 6120267 > 10.49.0.0 10.49.2.110 U 1 113322 igb0 > 10.49.0.0 10.49.2.111 U 1 2 igb2 > 127.0.0.1 127.0.0.1 UH 3 175197 lo0 > > Problem is that when traffic destined for 10.49.2.111 hits igb2, the > replies are sent out igb0. I want anything originating from > 10.49.2.111 to go out igb2. 
> > I thought source based routing with ipf might do the trick: > > pass out quick on igb0 to igb2 from 10.49.2.111 to any > > However, while this rule definitely is getting matched on, the packets > don't appear to actually go out the interface (or any interface for > that matter). > > This works: > > pass out quick on igb0 to igb2:10.49.254.254 from 10.49.2.111 to any > > 10.49.254.254 is the default gateway for the 10.49 network. > > However, this isn't ideal either. Now all the packets show up at their > destination with a src mac address of the default gateway instead of my > Solaris box (even though the destination was another 10.49/16 host). > > I've also tried "to igb2:10.49.2.111" to no avail. > > Any tips? > > Ray _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From Mark.E.Hargrave at maf.nasa.gov Thu Mar 25 13:23:08 2010 From: Mark.E.Hargrave at maf.nasa.gov (Hargrave, Mark E) Date: Thu, 25 Mar 2010 13:23:08 -0500 Subject: SUMMARY: PDF to Text Converter In-Reply-To: References: Message-ID: Thanks for all of the replies and suggestions. I did see the xpdf package on sunfreeware.com and was unaware that "pdftotext" was a part of it. I thank Francisco Roque for pointing this out to me. This is what I needed. Mark Mark E. Hargrave ITS Operating Systems - Team Lead Lockheed Martin Space Systems Co. New Orleans, LA Phone: 504-257-1242 -----Original Message----- From: sunmanagers-bounces at sunmanagers.org [mailto:sunmanagers-bounces at sunmanagers.org] On Behalf Of Hargrave, Mark E Sent: Thursday, March 25, 2010 12:37 PM To: sunmanagers at sunmanagers.org Subject: PDF to Text Converter I'm in need of a PDF to Text Converter that runs on Solaris 8 or 9. Does anyone know of a free source? Mark Mark E. Hargrave ITS Operating Systems - Team Lead Lockheed Martin Space Systems Co. 
New Orleans, LA Phone: 504-257-1242 _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From c.a.herriges at gmail.com Fri Mar 26 00:41:33 2010 From: c.a.herriges at gmail.com (Cody Herriges) Date: Thu, 25 Mar 2010 22:41:33 -0700 Subject: SUMMARY: Moving a multipathed SAN Array. In-Reply-To: References: Message-ID: <9C3A521A-1C0A-4A2A-89B3-230334DD085D@gmail.com> On Mar 25, 2010, at 5:55 PM, Cody Herriges wrote: > I inherited an FLX240 disk array connected to a V490 running Solaris 10 Update 2 that is multipathed through two Brocade switches. Each LUN is split up into two filesystems using SVM. This server has gone unpatched for some time now and it currently does not have a mirrored root filesystem, so it is just going to be easier/cleaner to reload the machine rather than patch. In order to reload I need to move the array to a different host, but I am having trouble determining the best way to go about it. > > The virtual device name that is generated for each LUN is going to be different when I move the array. Is there any predictable way to determine what LUN will be given what device name? I need the new device names in order to re-assemble all the SVM partitions using the md.tab file. These LUNs and SVM partitions were not set up using disk sets so a simple import is not going to be possible. > > Any suggestions on the best way to move these LUNs would be much appreciated. > > Thanks, > --Cody > I didn't realize that MPxIO names were based on the Volume ID of the LUN that is reported to the OS by the FLX240. This was made obvious once I opened SANtricity and looked at the LUN properties. This is a no-brainer now that the names generated are predictable. Thank you to Scott Lawson for tipping me off.
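Since the device names follow from the volume IDs, the matching can be sketched in a few lines. This is only a rough illustration, not what was actually run on the V490: it assumes MPxIO device names look like c<N>t<hex volume ID>d<N>[s<N>] and that the array tool shows the ID as colon-separated hex. The labels, IDs and device names used with it are hypothetical.

```python
import re

# Assumption: an MPxIO device name embeds the LUN's volume ID as a long
# hex string between "t" and "d", e.g. c4t<32 hex digits>d0s2.
MPXIO_NAME = re.compile(r"^c\d+t([0-9A-Fa-f]{16,})d\d+(?:s\d+)?$")


def volume_id_from_device(name):
    """Return the upper-cased hex ID embedded in a device name, or None."""
    match = MPXIO_NAME.match(name)
    return match.group(1).upper() if match else None


def match_devices(volume_ids, device_names):
    """Map each label in volume_ids to the device name carrying the same ID.

    volume_ids: dict of label -> volume ID as noted from the array tool
                (colons are stripped, case is ignored).
    device_names: iterable of names as found under /dev/dsk on the new host.
    """
    by_id = {}
    for dev in device_names:
        vid = volume_id_from_device(dev)
        if vid:
            by_id[vid] = dev
    result = {}
    for label, raw in volume_ids.items():
        key = raw.replace(":", "").upper()
        if key in by_id:
            result[label] = by_id[key]
    return result
```

With the labels matched to device names, the md.tab entries for the new host can be written against the right devices before the array is moved.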
_______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From glowe at sbcglobal.net Tue Mar 30 11:47:01 2010 From: glowe at sbcglobal.net (Grant Lowe) Date: Tue, 30 Mar 2010 09:47:01 -0700 (PDT) Subject: SUMMARY: V490 memory errors In-Reply-To: <214336.54507.qm@web81807.mail.mud.yahoo.com> References: <214336.54507.qm@web81807.mail.mud.yahoo.com> Message-ID: <591420.83633.qm@web81807.mail.mud.yahoo.com> I just got the answer I was looking for. Thanks to Andy Yother for such a quick and accurate response:

CPU to Memory Association

SOCKET    J2900         J2901         J3001         J3000
CPU       0 1 4 5       0 1 4 5       0 1 4 5       0 1 4 5
CPU       16 17 20 21   16 17 20 21   16 17 20 21   16 17 20 21
MEMORY    A0            A0            A0            A0
GROUP     1st Group     1st Group     1st Group     1st Group

SOCKET    J7900         J7901         J8001         J8000
CPU       2 3 6 7       2 3 6 7       2 3 6 7       2 3 6 7
CPU       18 19 22 23   18 19 22 23   18 19 22 23   18 19 22 23
MEMORY    B0            B0            B0            B0
GROUP     2nd Group     2nd Group     2nd Group     2nd Group

SOCKET    J3100         J3101         J3201         J3200
CPU       0 1 4 5       0 1 4 5       0 1 4 5       0 1 4 5
CPU       16 17 20 21   16 17 20 21   16 17 20 21   16 17 20 21
MEMORY    A1            A1            A1            A1
GROUP     3rd Group     3rd Group     3rd Group     3rd Group

SOCKET    J8100         J8101         J8201         J8200
CPU       2 3 6 7       2 3 6 7       2 3 6 7       2 3 6 7
CPU       18 19 22 23   18 19 22 23   18 19 22 23   18 19 22 23
MEMORY    B1            B1            B1            B1
GROUP     4th Group     4th Group     4th Group     4th Group

----- Original Message ---- From: Grant Lowe To: sunmanagers at sunmanagers.org Sent: Tue, March 30, 2010 9:14:02 AM Subject: V490 memory errors Hi Managers, I've got a V490 running Solaris 10 that has some memory errors:

# fmadm faulty -a
   STATE RESOURCE / UUID
-------- ----------------------------------------------------------------------
degraded mem:///unum=Slot,A:J2900,J2901,J3001,J3000
         ff99de46-3f03-e037-b52c-fdd201f240da
-------- ----------------------------------------------------------------------
 faulted mem:///unum=Slot,A:J2900,J2901,J3001,J3000/physaddr=a7cb534000
         c1e0f601-a90c-c291-e4cb-f3708503f14f
-------- ----------------------------------------------------------------------
 faulted mem:///unum=Slot,A:J2901/physaddr=a7cb536000
         a9d1c65b-91c1-c38b-8f61-f1322a7118cf
-------- ----------------------------------------------------------------------
 faulted mem:///unum=Slot,A:J2901/physaddr=a7eb534000
         fbdeb6ea-5fff-ec25-db44-b8137667f51c
-------- ----------------------------------------------------------------------
 faulted mem:///unum=Slot,A:J2901/physaddr=a7eb536000
         2823c96d-ebe2-c018-88ef-bd112e5632f6
-------- ----------------------------------------------------------------------
degraded mod:///mod-name=pcisch/mod-id=26
         b1a747aa-7079-e4f8-efe9-b69343a38b18
-------- ----------------------------------------------------------------------
#

Now I can see that it's supposed to be in slots J2900, J2901, J3001, and J3000. But according to the doc, they come in four groups, A0, A1, B0, and B1. My question is, how do I convert the J2900, J2901, J3001, and J3000 to one of the four groups? The docs don't say. Thanks. _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From romeotheriault at gmail.com Wed Mar 31 07:54:23 2010 From: romeotheriault at gmail.com (Romeo Theriault) Date: Wed, 31 Mar 2010 21:54:23 +0900 Subject: Summary: Large Oracle DB's in Zones? Message-ID: Many thanks to the following folks for taking the time to respond to my question: Adrian Koester Tim Bradshaw Joe Fletcher David Magda William Brown Antony Pavlenko Original question: On Thu, Mar 18, 2010 at 10:34 PM, Romeo Theriault wrote: > Hi Folks, I'm looking for some general advice on moving large Oracle (ERP) > databases into Solaris zones. The DB's currently run on Solaris 9 (E20K) but > we are planning on upgrading hardware and moving to Solaris 10.
Some > questions I have are: > > * Does it make sense to move large resource intensive DB's into containers? > Obviously we'd need to have the appropriate horsepower on the box to deal > with the load but I'm wondering what kind of hit in performance we are going > to see by putting the workload in a container. > > * All of our Oracle DB's currently reside on our Netapp SAN which we access > from the HBA's/FC using vxvm and vxfs. I've seen multiple ways to make the > disks/mount points available to the containers (e.g. lofs, assigning the > whole device to the zone), but which makes the most sense from a performance > perspective? > > * Any big gotchas I should know about? > > > One of the benefits we hope to be able to capitalize on by using zones > would be to have the ability to move the zone around amongst boxes, in case > of emergency or to reduce maintenance outages, etc... So any thoughts on how > well this works would be great too. > Summary: * General consensus is that running large Oracle DB's in zones should work just fine. * There is a bit of disagreement about what kind of performance impact running an Oracle DB in a container will have. Some folks said there should be none while others suggest there will be some. I did find this blog post though that shows there will likely be some performance impact: http://blogs.sun.com/JeffV/entry/virtual_overhead * Running Oracle RAC nodes in containers is a no go since the clusterware software requires lower level access than the container allows. * One suggestion to dump SPARC in favor of x64 Sun boxes. * While no-one disagreed that it is possible to move zones from machine to machine if needed, it was suggested that I look into Sun Cluster or Veritas Cluster to do a similar thing but in a much more automated way. * You can use Solaris zones resource controls to limit the licensing costs of Oracle by limiting the number of CPUs a zone has access to.
* Be aware of how you are going to back up the databases in the zones. As they are all sharing the host's network resources you can quickly run into throughput bottlenecks during backup windows. * One comment that the best way to divide the CPUs out is via CPU pools and that it doesn't really matter if you use dynamic pools or not. The recommendation was not to use the FSS scheduler. Also recommended not to split out the memory among the zones due to some issues with resource caps. (I don't have all the details on this yet.) * About mount points. > In fact, from a performance point of view any type looks the same. > But there are some other problems. For example, file systems which are added > to the zone config and mounted when the zone starts will skip the fsck procedure even if > they are corrupted (it is a bug). And so on. > So I think that the best way is to mount the vxfs file system in /etc/vfstab to > /zones//fs/ and then mount it like lofs in the zone > config. > Romeo _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From bencom500 at gmail.com Wed Mar 31 17:23:30 2010 From: bencom500 at gmail.com (Ben V) Date: Wed, 31 Mar 2010 15:23:30 -0700 Subject: SUMMARY: Re-partition / root file system question Message-ID: Thanks for all the responses. Basically, it can't be done with a UFS file system unless I have a second disk: use ufsdump to the second disk, reformat the 1st disk, then restore back to the 1st disk. Hello, > > > > My / root file system currently occupies all the space on the drive as > c1t0d0s0 (116.96GB). Is it possible to reduce the / root file system slice > 0 to 66.96GB and create a new slice 3 with 50GB for my live upgrade > space usage?
> > > > Here is the format of the disk:
> >
> > Part        Tag  Flag  Cylinders       Size       Blocks
> >    0       root   wm   9378 - 65532    116.96GB   (56155/0/0) 245285040
> >    1       swap   wu      0 -  9377     19.53GB   (9378/0/0)   40963104
> >    2     backup   wm      0 - 65532    136.49GB   (65533/0/0) 286248144
> >    3 unassigned   wu      0               0       (0/0/0)             0
> >    4 unassigned   wu      0               0       (0/0/0)             0
> >    5 unassigned   wu      0               0       (0/0/0)             0
> >    6 unassigned   wu      0               0       (0/0/0)             0
> >    7 unassigned   wu      0               0       (0/0/0)             0
> >
> > Please help me with the command syntax how to accomplish this. Thanks. _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers
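For the repartition step, it may help to work out the slice sizes in cylinders beforehand, since that is what format asks for. The following is a small planning sketch only: the cylinder and block counts come from the format output quoted above, the 512-byte block size is an assumption, and the function name is made up for illustration. It does not touch any disk.

```python
# Figures for slice 0 (root), copied from the quoted `format` output.
ROOT_CYLINDERS = 56155
ROOT_BLOCKS = 245285040
BLOCK_BYTES = 512  # assumption: standard 512-byte disk blocks
GB = 2**30         # format reports sizes in binary gigabytes


def cylinders_for_gb(size_gb, blocks_per_cyl, block_bytes=BLOCK_BYTES):
    """Smallest whole number of cylinders that holds size_gb gigabytes."""
    blocks_needed = size_gb * GB // block_bytes
    # Integer ceiling division, so a partial cylinder rounds up.
    return -(-blocks_needed // blocks_per_cyl)


# Disk geometry implied by the root slice.
blocks_per_cyl = ROOT_BLOCKS // ROOT_CYLINDERS

# Cylinders needed for the requested 50GB live upgrade slice.
new_slice_cyls = cylinders_for_gb(50, blocks_per_cyl)

# What is left for the root slice after carving out the new one.
root_cyls_left = ROOT_CYLINDERS - new_slice_cyls
root_gb_left = root_cyls_left * blocks_per_cyl * BLOCK_BYTES / GB

print(blocks_per_cyl, new_slice_cyls, root_cyls_left, round(root_gb_left, 2))
```

Rounding the new slice up to a whole cylinder leaves root at roughly the 66.96GB asked for in the original question. The dump, repartition and restore sequence from the summary is still required, because this arithmetic does not make UFS shrinkable in place.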