From Ugo.Balestrieri at alcatel-lucent.it Thu Jul 3 08:10:54 2008
From: Ugo.Balestrieri at alcatel-lucent.it (BALESTRIERI UGO)
Date: Thu, 3 Jul 2008 14:10:54 +0200
Subject: SUMMARY: Any idea about kernel/sparcv9/unix request
In-Reply-To: <486C9DEB.1010703@manukau.ac.nz>
References: <46806A6FB6918C49987A49A754C10AAE2BED6B@PEPSUREX02.sur.dpep.pep.pemex.com>
 <72B2DB3EB0EAE243B7613C6F00B7C16F76DEA0@FRVELSMBS21.ad2.ad.alcatel.com>
 <486C9DEB.1010703@manukau.ac.nz>
Message-ID: <72B2DB3EB0EAE243B7613C6F00B7C16F76E00D@FRVELSMBS21.ad2.ad.alcatel.com>

Hi managers

Thanks to Scott, Nelyubin and Richard.

Ugo,

It might be the version of Solaris 8 you are using. You need at least
Solaris 8 7/03 for this system. Refer to SunSolve for this info:

http://sunsolve.sun.com/handbook_pub/validateUser.do?target=Systems/SunFireV440/SunFireV440

BALESTRIERI UGO wrote:
> Hi managers
>
> Does anyone know what happens when I try to reinstall Solaris 8 on a Sun Fire
> V440 with Solaris 10 and OBP 4.18.10? I ran the following steps:
> Insert the Solaris 8 CD (I see all the CD directories with the 'mount' command)
> init 0
> OK> boot cdrom - install
> Then the system asks for the kernel file, after a default
> indication 'kernel/sparcv9/unix'.
> I enter /cdrom/sol_8_202_sparc/s0/Solaris_8/Tools/Boot/kernel/genunix
> but the message is: cannot open file
>
> Thanks in advance
>
> Bye
> Ugo
>
> _______________________________________________
> sunmanagers mailing list
> sunmanagers at sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers

--
_________________________________________________________________________

Scott Lawson
Systems Architect
Information Communication Technology Services

Manukau Institute of Technology
Private Bag 94006
South Auckland Mail Centre
Manukau 2240
Auckland
New Zealand

Phone : +64 09
968 7611
Fax : +64 09 968 7641
Mobile : +64 27 568 7611

mailto:scott at manukau.ac.nz

http://www.manukau.ac.nz

__________________________________________________________________________

perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'

__________________________________________________________________________

From john.horne at plymouth.ac.uk Mon Jul 7 08:44:30 2008
From: john.horne at plymouth.ac.uk (John Horne)
Date: Mon, 07 Jul 2008 13:44:30 +0100
Subject: [SUMMARY] Qlogic fibre-channel failover problem
In-Reply-To: <1214300936.2919.42.camel@jhorne.homelinux.net>
References: <1214300936.2919.42.camel@jhorne.homelinux.net>
Message-ID: <1215434670.29070.47.camel@jhorne.csd.plymouth.ac.uk>

Apologies for the late summary reply.

I received a variety of hints and suggestions from the following, for which many thanks:

Jim Musso
Markus Mayer
Dean Ross-Smith
JayJay Florendo
inemes
Chris Liles
Thomas Leyer
Andrey Borzenkov

There was no one specific 'answer' to the problem. Some people requested a bit more information, to which I did not reply. The reason being that the problem 'resolved' itself when three things occurred! These were:

1) The '/kernel/drv/fp.conf' file had 2 entries in it for fibre-channel - as if there was a dual-port card present. In our case we only had the one port, so I commented out one of the entries. (Suggested by Markus Mayer.)

2) The 'mpathadm show lu ...' command showed the 'Current Load Balance' as round-robin. This was changed to 'none'. (Suggested by Dean Ross-Smith.)

3) It seems that Sun recently released a patch fixing some problems with Qlogic cards. I tend to run 'pca' to patch my systems, and wasn't really paying too much attention to it I'm afraid! I think the patch was 113042.
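
For anyone who wants to check the load-balance setting from item 2 in a script, here is a minimal sketch. It assumes that the 'mpathadm show lu' output contains a line of the form "Current Load Balance:  <value>" (only the field name is taken from the summary above, the exact layout is a guess), and the function name current_load_balance is made up for illustration; it is not part of mpathadm.

```shell
#!/bin/sh
# Sketch only: extract the "Current Load Balance" value from the text that
# 'mpathadm show lu <device>' prints.
# ASSUMPTION: the field appears as "Current Load Balance:  <value>".
current_load_balance() {
    # Reads mpathadm output on stdin and prints the value after the colon.
    awk -F: '/Current Load Balance/ {
        value = $2
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
        print value
        exit
    }'
}

# Hypothetical usage on the Solaris host (the device name is a placeholder):
#   mpathadm show lu /dev/rdsk/<device>s2 | current_load_balance
```

The function only reads text, so it can be tried against saved output from a working host before it is used in any monitoring job.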
Rebooting and reconfiguring the system, the FC card then seemed to work correctly when one of the channels was disabled. Given that a few people (including myself!) asked why we hadn't bought 2 cards or at least a dual-port card if this was going to be a production server, we got approval to buy a second card. As far as I can tell running Solaris 10 with 2 FC cards should work pretty much out of the box with respect to failover. Because of this I did not analyse the initial problem any further to see if there was any one solution. (I'm still awaiting delivery of the second FC card, so this problem may yet come back and bite me again!) Regards, John. On Tue, 2008-06-24 at 10:48 +0100, John Horne wrote: > Hello, > > We have a T2000 running Solaris 10 5/08 with a single QLA2460 > fibre-channel card in it - so one card, one port. I have no control over > the fibre side of things, so am not completely sure what the > configuration is, but I gather it (the SAN) is provided by FalconStor. > The card/OS have been configured to see the switch the card is connected > to, and this seems to work fine. I am told that the switch provides 2 > routes from the actual SAN, hence Solaris initially sees 2 disks (when > using 'format'). I have configured multipathing (mpxio), and Solaris now > sees one disk. I have formatted/newfs'd the disk, and mounted it with no > problems. The disk provides user data, so it is not booted off. > > However, when I asked our Ops people to disable one of the fibre > 'channels' (on the fabric switch), to simulate a hardware fault, Solaris > detected the problem but disabled all access to the disk. Trying to > access the mounted disk gave an 'I/O error'; format showed the disk as > 'disk information unavailable', and 'mpathadm' likewise gave I/O errors > and stated that it could not get disk information. The only solution > seemed to be to unload the qlc module (using modunload), and reload it. > Then the system saw the disk again. 
Thinking this might just be a timer > issue, I left the system for a good 30 mins, but the disk never became > accessible again. The messages file showed errors such as: > > ================================================================== > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 16:58:53 lib-srvr7 Error for Command: read(10) > Error Level: Retryable > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] > Requested Block: 64 Error Block: 64 > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] Vendor: > FALCON Serial Number: OF1S3WS894OA > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] Sense > Key: Unit Attention > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] ASC: > 0x29 (power on occurred), ASCQ: 0x1, FRU: 0x0 > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 16:58:53 lib-srvr7 Error for Command: read(10) > Error Level: Retryable > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] > Requested Block: 64 Error Block: 64 > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] Vendor: > FALCON Serial Number: OF1S3WS894OA > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] Sense > Key: Unit Attention > Jun 20 16:58:53 lib-srvr7 scsi: [ID 107833 kern.notice] ASC: > 0x3f (reported LUNs data has changed), ASCQ: 0xe, FRU: 0x0 > Jun 20 16:59:03 lib-srvr7 scsi: [ID 243001 kern.warning] > WARNING: /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fcp1): > Jun 20 16:59:03 lib-srvr7 INQUIRY to D_ID=0xe30700 lun=0x0 failed: > sense key=IllegalRequest, ASC=24, ASCQ=0. 
Giving up > Jun 20 16:59:03 lib-srvr7 scsi: [ID 243001 > kern.info] /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fcp1): > Jun 20 16:59:03 lib-srvr7 offlining lun=0 (trace=0), target=e30700 > (trace=b10101) > Jun 20 16:59:03 lib-srvr7 genunix: [ID 834635 > kern.info] /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2) > multipath status: degraded, > path /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fp1) to target > address: w50060b00006441e2,0 is offline Load balancing: round-robin > Jun 20 17:01:13 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:01:13 lib-srvr7 Error for Command: read(10) > Error Level: Retryable > Jun 20 17:01:13 lib-srvr7 scsi: [ID 107833 kern.notice] > Requested Block: 1528 Error Block: 1528 > Jun 20 17:01:13 lib-srvr7 scsi: [ID 107833 kern.notice] Vendor: > FALCON Serial Number: OF1S3WS894OA > Jun 20 17:01:13 lib-srvr7 scsi: [ID 107833 kern.notice] Sense > Key: Unit Attention > Jun 20 17:01:13 lib-srvr7 scsi: [ID 107833 kern.notice] ASC: > 0x3f (reported LUNs data has changed), ASCQ: 0xe, FRU: 0x0 > Jun 20 17:01:23 lib-srvr7 scsi: [ID 243001 kern.warning] > WARNING: /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fcp1): > Jun 20 17:01:23 lib-srvr7 INQUIRY to D_ID=0xe30900 lun=0x0 failed: > sense key=IllegalRequest, ASC=24, ASCQ=0. 
Giving up > Jun 20 17:01:23 lib-srvr7 scsi: [ID 243001 > kern.info] /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fcp1): > Jun 20 17:01:23 lib-srvr7 offlining lun=0 (trace=0), target=e30900 > (trace=b10101) > Jun 20 17:01:23 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:01:23 lib-srvr7 transport rejected fatal error > Jun 20 17:01:48 lib-srvr7 ufs: [ID 702911 kern.warning] WARNING: Error > writing master during ufs log roll > Jun 20 17:01:48 lib-srvr7 ufs: [ID 127457 kern.warning] WARNING: ufs log > for /m1 changed state to Error > Jun 20 17:01:48 lib-srvr7 ufs: [ID 616219 kern.warning] WARNING: Please > umount(1M) /m1 andrun fsck(1M) > Jun 20 17:02:23 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:02:23 lib-srvr7 offline or reservation conflict > Jun 20 17:03:16 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:03:16 lib-srvr7 offline or reservation conflict > Jun 20 17:03:33 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:03:33 lib-srvr7 offline or reservation conflict > Jun 20 17:03:37 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:03:37 lib-srvr7 offline or reservation conflict > Jun 20 17:03:48 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:03:48 lib-srvr7 offline or reservation conflict > Jun 20 17:03:50 lib-srvr7 scsi: [ID 107833 kern.warning] > WARNING: /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2): > Jun 20 17:03:50 lib-srvr7 offline or reservation conflict > ================================================================== > > > when the disk becomes available again 
(after modunload/modload), we see: > > ================================================================== > Jun 20 17:23:56 lib-srvr7 genunix: [ID 408114 > kern.info] /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2) > offline > Jun 20 17:23:56 lib-srvr7 genunix: [ID 834635 > kern.info] /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2) > multipath status: failed, > path /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fp1) to target > address: w50060b00006441e2,0 is offline Load balancing: round-robin > Jun 20 17:23:56 lib-srvr7 genunix: [ID 408114 > kern.info] /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2) > offline > Jun 20 17:24:06 lib-srvr7 scsi: [ID 243001 kern.warning] > WARNING: /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fcp1): > Jun 20 17:24:06 lib-srvr7 ns_registry: failed name server > registration > Jun 20 17:24:06 lib-srvr7 scsi: [ID 799468 kern.info] ssd2 at > scsi_vhci0: name g6000d775000032d11ada4f3e5d6a37ea, bus address > g6000d775000032d11ada4f3e5d6a37ea > Jun 20 17:24:06 lib-srvr7 genunix: [ID 936769 kern.info] ssd2 > is /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea > Jun 20 17:24:06 lib-srvr7 genunix: [ID 936769 kern.info] fp1 > is /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 > Jun 20 17:24:06 lib-srvr7 genunix: [ID 408114 > kern.info] /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2) > online > Jun 20 17:24:06 lib-srvr7 genunix: [ID 834635 > kern.info] /scsi_vhci/ssd at g6000d775000032d11ada4f3e5d6a37ea (ssd2) > multipath status: degraded, > path /pci at 7c0/pci at 0/pci at 1/pci at 0,2/SUNW,qlc at 1/fp at 0,0 (fp1) to target > address: w50060b000064487a,0 is online Load balancing: round-robin > ================================================================== > > > Looking on the Internet, it seems that the 'cfgadm -c configure' command > may re-enable the disk as well. The problem seems to be that the QLA > card 'logs out' (?) 
from the switch, and cannot re-establish the disk > connection until it logs in again. The point is that we want the > failover to be automatic, and not to have to run commands should a fault > occur on the SAN side. > > Has anyone else had this problem, and if so was there a solution? > Obviously what we want is to not to have to run commands should a > problem occur on the SAN, we want automatic failover (albeit that having > 2 cards or 2 ports might have been better for resilience!). > > > > Thanks, > > John. > -- --------------------------------------------------------------- John Horne, University of Plymouth, UK Tel: +44 (0)1752 587287 E-mail: John.Horne at plymouth.ac.uk Fax: +44 (0)1752 587001 _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From jal at mcs.le.ac.uk Tue Jul 8 10:51:00 2008 From: jal at mcs.le.ac.uk (J. Landamore) Date: Tue, 8 Jul 2008 15:51:00 +0100 Subject: [SUMMARY] ethernet interface in local zone Message-ID: <20080708145059.GK673@mcs.le.ac.uk> Thanks to Dean Ross-Smith for the answer, which is: I believe that on our boxes that have zones, we touched the interface files (touch /etc/bge0 or touch /etc/e1000g0) in the global zone which is enough for solaris to activate the interface and the nic configuration is then setup in the zone. hth Dean Ross-Smith -------------- sunmanagers-bounces at sunmanagers.org wrote on 07/08/2008 02:05:19 AM: > I have a X4150 solaris10u5 A local zone is configured with an exclusive > network interface, however there is a small problem. The local zone > cannot see the interface to plumb it until the global zone has plumbed and > unplumbed the interface. This prevents me starting the local zone at > boot. 
I believe this is a "feature" (I seem to remember having read about
> this somewhere), has anyone a neat way round this or do I just tweak
> /lib/svc/method/physical in the global zone to plumb and unplumb the
> interface?

--
John Landamore

School of Mathematics & Computer Science
University of Leicester
University Road, LEICESTER, LE1 7RH
J.Landamore at mcs.le.ac.uk
Phone: +44 (0)116 2523410 Fax: +44 (0)116 2523604

From sunhux at gmail.com Thu Jul 10 05:33:10 2008
From: sunhux at gmail.com (sunhux G)
Date: Thu, 10 Jul 2008 17:33:10 +0800
Subject: Summary: script to ssh into remote box & issue remote box's commands
Message-ID: <60f08e700807100233g66f0b70we7f67c817e12e5f7@mail.gmail.com>

My favourite is reply "A" from Vitaly, as it specifically addresses the NetApp filer, but I could not scp or ftp the public key file onto it (the WinSCP session gets kicked after the password is entered, and there is no ftp client on the filer).

Another list member asked me about this too, so I'll just list the concise scripts/replies here:

Reply A: (my comments in brackets)
======
You can use ssh authorized keys from your host to get a passwordless ssh connection to your SAN NetApp:

1. on the monitoring host:
   - create a pair of ssh keys, private & public, with ssh-keygen from the SSH package, e.g.
     ssh-keygen -t dsa -b 1024 (with no passphrase)
   - save both keys in root's home folder; for Solaris it's /.ssh

2.
on the netapp:
   - mount /vol/vol0/etc ('mount' command not there; /vol/vol0 is already mounted)
   - cd to etc on the netapp, then cd to sshd/root/.ssh; if it does not exist, create it ('cd' not there)
   - in netapp/etc/sshd/root/.ssh, copy the public key generated on the monitoring host under the name authorized_keys; make sure it has 600 root permissions, as well as the .ssh directory (can't find a way to ftp/scp the public key into the filer)
   - make sure that the ssh option "ssh.pubkey_auth.enable" on the NetApp is on

You can now run ssh remotely from your host to the netapp to get info, e.g.:

ssh 10.51.1.2 -l root 'fcp show adapter -v; lun config_check; fcp status'

Reply B: (Perl script; needs Perl)
======
#!/usr/bin/perl
use strict;

# make shared keys first
# this is a security risk. this script could be easily modified to do
# serious damage
# (eg, `$sshcmd "rm -rf /"`; will blow away everything on your netapp.)
# be careful ;)

# update user and filer to your username/filer hostname
my $sshcmd = "/usr/local/bin/ssh user\@filer";

# backticks tell perl to drop to a shell and execute the command.
my $rv = `$sshcmd "fcp show adapter -v"`;
if ($rv eq "") {
    # if i don't get anything back, something's wrong.
    die "something's wrong\n";
}
# print the output.
print $rv;

# $rv is declared once above and reused for each command.
$rv = `$sshcmd "lun config_check"`;
unless ($rv =~ /No Problems Found/) {
    print "!!! lun config_check FAILED !!!\n";
    print "Error was $rv\n";
}
print $rv;

$rv = `$sshcmd "fcp status"`;
if ($rv =~ /FCP service is running/) {
    print $rv;
} else {
    print "!!!! FCP status FAILED !!!!\n";
    print "Error was $rv\n";
}

# etc.
exit;

Another Perl script:

use Net::SSH::Perl;

my $host = "remote hostname or ip";
my $user = "username";
my $pass = "password";
my $cmd  = "/fullpath/remote_script.pl";

my $ssh = Net::SSH::Perl->new($host);
$ssh->login($user, $pass);
my ($stdout, $stderr, $exit) = $ssh->cmd($cmd);

Reply C: (Expect script)
======
Expect script will look something like:

#!/usr/bin/env expect -f
set timeout -1
set stty_init -echo
spawn ssh 10.51.1.2 -l root
match_max 100000
expect "Are you sure you want to continue connecting"
send -- "yes\r"
expect "password:"
send -- "root-password-here\r"
stty echo
expect "sent unsupported channel request"
send -- "\r"
expect -exact "FILER1>"
send -- "fcp show adapter -v\r"
expect -exact "FILER1>"
send -- "lun config_check\r"
expect -exact "FILER1>"
send -- "fcp status\r"
expect -exact "SLAFILE1>"
send -- "logout telnet\r"
expect eof

thanks
U

On 7/4/08, sunhux G wrote:
>
> Hi,
>
> I'm looking for solution to capture our SAN filer's information/statistics
> to a file on a regular basis. The filer runs a customized Unix.
>
> It's possible to put ftp commands/parameters into a file (like password,
> "cd ...", "get..."). Is it possible to do this with openssh that comes
> with Solaris?
>
> Plan is to use following crontab script (call it capture.sh) so that the
> filer's commands are captured into output.txt :
> 00,15,30,45 /adm/script/capture.sh >> /var/tmp/output.txt 2>>
> /var/tmp/err.txt
>
> # ssh 10.51.1.2 -l root (don't find any "-p password" for ssh)
> The authenticity of host '10.51.1.2 (10.51.1.2)' can't be established.
> ...
> Are you sure you want to continue connecting (yes/no)? yes
> root at 10.51.1.2's password:
>
> FILER1> Fri Jul 4 13:27:35 SGT [SLAFILE1:
> openssh.invalid.channel.req:warning]: SSH client (SSH-2.0-OpenSSH_4.3) from
> 10.51.1.45 sent unsupported channel request (10, env).
>
> FILER1>
> FILER1> fcp show adapter -v
>
> ...........
>
> FILER1>lun config_check
> No Problems Found
> FILER1> fcp status
> FCP service is running.
> SLAFILE1> logout telnet > Connection to 10.51.1.2 closed. > > If expect/tcl script is expected, appreciate a more detailed codes > > as I'm not familiar with expect/tcl scripting. > > > > Thanks > > U _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From bgbeaird at sbcglobal.net Thu Jul 10 12:51:31 2008 From: bgbeaird at sbcglobal.net (Gene Beaird) Date: Thu, 10 Jul 2008 11:51:31 -0500 Subject: Summary: luxadm remove_device SCSI failed error on Sunfire 280R Message-ID: <001001c8e2ad$354b8360$0214a8c0@ibmt23> Thanks to all who replied, including: Scott Lawson, Tim Bradshaw, Stefan Varga, Sandesh Kubde, 'hike', Bryan Bahnmiller and Robert M. Martel We were successful in getting the disk swapped without having to reboot, or panicking the box. Some suggested I need a reboot to fix it, which the customer was having none of. Mr. Lawson suggested we use cfgadm. The drives WWN did show up in a 'cfgadm -al'. But nowhere in the server documentation did it say to use cfgadm on any of the FC-AL disks. One of my colleagues thought that yesterday when we were preparing for the change, and we considered it for a bit last night, but since this was such an important mission-critical and eggs-all-in-one-basket server, I opted to go strictly by the book, in case of catastrophe, where I could claim I was going by the book. A couple of you offered that since the disk is dead to luxadm, then you can just pull it. It would be interesting to try these things, though, just to see if it works. Unfortunately, my lab is customers' production boxes, so opportunity to experiment is limited. We determined that, as others suggested, the disk was too far gone for luxadm to communicate with it. 
When we executed 'luxadm remove_device' on the disk (here /dev/rdsk/c1t0d0s2), luxadm couldn't check the status of the drive, so the procedure failed: it issued the first prompt ('Make sure the filesystems were backed up....') and then failed out, posting a SCSI error.

We studied the steps of 'remove_device' and determined that luxadm roughly removes the device from the device tree, offlines it, and possibly even powers the device down. After executing 'luxadm -e offline' on the device, we verified the disk didn't show in 'luxadm inq c?t?d?s2' or format. We then executed devfsadm -C to clear the devices from the /dev device list. After that, I had the DC Engineer go check to see if the light on the drive was out. It wasn't, but it was burning solidly, whereas the light on the other disk was showing activity. Since the system otherwise didn't know about the disk, I crossed my fingers and had the Engineer swap the disk. I monitored the system via console and noted that picld saw the drive pulled and re-inserted into the system. I then verified the disk showed up in format, and executed devfsadm -C to rebuild the /dev device list. From then on, it was the usual Disk Suite disk replacement process.

Mr. Martel offered these steps for a failed disk on an A5200 array:

"I had this problem with a Sun A5200 array - disk too far gone for luxadm to talk to it. The procedure Sun gave me had me bypassing the ports on the failed disk using the front panel controls - I don't know the 280R, but I'd guess you don't have such controls available."

"What happened after I followed Sun's special procedure to replace the failed disk was that the new disk was not accessible. I then ran luxadm remove_device, popped the disk out when prompted, and ran luxadm insert_device and re-installed the replacement disk. From then on all was normal again."

Unfortunately, I couldn't talk to Sun, as the status of the maintenance contract on this system is being investigated.
Even then, most of the support we have is Gold, and this was way outside of Gold support time. This new disk may be T&M. Thanks to all who responded, it is nice to know at least people are out there listening and offering help when you are stressed out, sitting at the keyboard all alone in the middle of the night trying to keep the machine from falling over. Regards, Gene Beaird Pearland, Texas -----Original Message----- From: Gene Beaird [mailto:bgbeaird at sbcglobal.net] Sent: Wednesday, July 09, 2008 10:34 PM To: 'sunmanagers at sunmanagers.org' Subject: luxadm remove_device SCSI failed error on Sunfire 280R I have a failed disk0 on a SunFire 280R. It is part of a mirrored pair, mirrored with Disk Suite. I have broken the mirror, and metacleared the devices. According to the SunFire 280R Service manual and Owners manual, I am supposed to remove the bad disk from the system using luxadm remove_device command before I physically swap the drive out. When I execute luxadm remove_device /dev/rdsk/c1t0d0s2, I get: Error: SCSI failure. - /dev/rdsk/c1t0d0s2 Which is the same message I get for that disk when I execute luxadm inq /dev/rdsk/c?t?d?s2. I don't see a WWN in luxadm for that device. What's wrong and how do I get this fixed? Thank you all. Regards, Gene Beaird, CISSP, Unix Support Engineer, Pearland, Texas _______________________________________________ sunmanagers mailing list sunmanagers at sunmanagers.org http://www.sunmanagers.org/mailman/listinfo/sunmanagers From sean at fpp.nuclearsafetysolutions.com Thu Jul 10 21:45:02 2008 From: sean at fpp.nuclearsafetysolutions.com (Sean Walmsley) Date: Thu, 10 Jul 2008 21:45:02 -0400 (EDT) Subject: SUMMARY: X4500 (thumper) boot drives Message-ID: <200807110145.m6B1j2t9013448@merlin.fpp.nuclearsafetysolutions.com> The short answer is that it seems you can only boot from either slot 0 or slot 1 on the X4500 (thumper). 
Mike Brodbelt noted that the X4500 documents discuss using the eeprom command to set an alternate boot path, but notes (as I had) that on Sun's x86 servers this information is stored in /boot/solaris/bootenv.rc *ON THE FILESYSTEM*. Since the filesystem isn't available until after boot, it's unlikely that setting this would have the desired effect. Thanks to: Scott Lawson Mike Brodbelt for their responses. Sean Walmsley ORIGINAL QUESTION: > >Does anyone know if it's possible to boot an X4500 Thumper >from disks other than the ones in slots 0 and 1? > >As far as we can tell from looking at the BIOS interface, >only the drives in slots 0 and 1 can be selected to boot from. >On other Sun x86 boxes (e.g. our X4450s), the BIOS seems >to probe for available drives and list all that it finds. > >We've looked through the manuals, and although they make >specific mention of booting from slots 0 and 1, they don't >actually come out and say that you can't boot from other >drives. Similarly, the chassis has labels warning that >slots 0 and 1 *MAY* be boot drives which to our minds >suggests that there may be alternatives. > >The reason for this question is that we periodically run a >"copy boot disk" script which clones the boot drive to a >second drive and performs housekeeping to make the clone drive >bootable in situ. Since slots 0 and 1 both reside on the >same controller, we'd prefer to use one of the 40 other >drives in the chassis that reside on a different controller >for our boot clone to in order to improve redundancy. 
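
As a small illustration of Mike's point that on Sun's x86 servers the boot path lives in a file on the filesystem, the sketch below prints the bootpath entry from a bootenv.rc file. It assumes entries of the form setprop bootpath '<device path>'; the function name bootenv_bootpath is hypothetical, and nothing here changes which slots the X4500 BIOS will boot from.

```shell
#!/bin/sh
# Sketch only: print the bootpath property recorded in a bootenv.rc file.
# ASSUMPTION: entries look like   setprop bootpath '<device path>'
bootenv_bootpath() {
    # $1: file to read; defaults to the location named in the summary.
    awk '$1 == "setprop" && $2 == "bootpath" { print $3 }' \
        "${1:-/boot/solaris/bootenv.rc}" | tr -d "'"   # drop the quotes
}

# Hypothetical usage on a running x86 Solaris host:
#   bootenv_bootpath
#   bootenv_bootpath /a/boot/solaris/bootenv.rc   # e.g. a clone mounted on /a
```

Because the file is only readable once a filesystem is mounted, this shows what was recorded for the running or cloned system; it cannot influence the BIOS drive selection discussed above.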
From Laurence.Moughan at aerlingus.com Fri Jul 11 08:46:11 2008
From: Laurence.Moughan at aerlingus.com (Laurence Moughan)
Date: Fri, 11 Jul 2008 13:46:11 +0100
Subject: SUMMARY - VXFS disk replace - help
Message-ID:

OK, so: in vxdiskadm, remove the old bad disk from the diskgroup, then add the new disk to the diskgroup. That's it!

...................................

Hi All,

I have 2 x 3310 arrays, both RAID 5, then plexed via vxfs. I lost one RAID due to the loss of several disks; these have been replaced and the RAID rebuilt. However, vx still sees NODEVICE. Please advise how to get the once-failed LUN added back into vxfs. I assume I use vxdiskadm and the replace disk option, but do I specify the LUN or the sd names, e.g. oracled1 or c3t0d0s2?

Thanks
Laurence

e.g.

obeora1:/apps1 # vxdisk list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t0d0s2     sliced    rootdisk     rootdg       online
c1t1d0s2     sliced    rootdisk2    rootdg       online
c3t0d0s2     sliced    -            -            online
c5t0d0s2     sliced    oracled2     oracledg     online
-            -         oracled1     oracledg     failed was:c3t0d0s2

Disk group: oracledg

TY NAME         ASSOC        KSTATE    LENGTH     PLOFFS  STATE     TUTIL0  PUTIL0
dg oracledg     oracledg     -         -          -       -         -       -

dm oracled1     -            -         -          -       NODEVICE  -       -
dm oracled2     c5t0d0s2     -         282391104  -       -         -       -

v  apps1        fsgen        ENABLED   67108864   -       ACTIVE    -       -
pl apps1-01     apps1        DISABLED  67112896   -       NODEVICE  -       -
sd oracled1-04  apps1-01     DISABLED  67112896   0       NODEVICE  -       -
pl apps1-02     apps1        ENABLED   67112896   -       ACTIVE    -       -
sd oracled2-01  apps1-02     ENABLED   67112896   0       -         -       -

v  exports      fsgen        ENABLED   33554432   -       ACTIVE    -       -
pl exports-01   exports      DISABLED  33560512   -       NODEVICE  -       -
sd oracled1-03  exports-01   DISABLED  33560512   0       NODEVICE  -       -
pl exports-02   exports      ENABLED   33560512   -       ACTIVE    -       -
sd oracled2-02  exports-02   ENABLED   33560512   0       -         -       -

v  oradata      fsgen        ENABLED   83886080   -       ACTIVE    -       -
pl oradata-01   oradata      DISABLED  83889088   -       NODEVICE  -       -
sd oracled1-01  oradata-01   DISABLED  83889088   0       NODEVICE  -       -
pl oradata-02   oradata      ENABLED   83889088   -       ACTIVE    -       -
sd oracled2-03  oradata-02   ENABLED   83889088   0       -         -       -

v  oralogs      fsgen        ENABLED   33554432   -       ACTIVE    -       -
pl oralogs-01   oralogs      DISABLED  33560512   -       NODEVICE  -       -
sd oracled1-02  oralogs-01   DISABLED  33560512   0       NODEVICE  -       -
pl oralogs-02   oralogs      ENABLED   33560512   -       ACTIVE    -       -
sd oracled2-04  oralogs-02   ENABLED   33560512   0       -         -       -

From HammonNN at telkom.co.za Fri Jul 11 10:27:50 2008
From: HammonNN at telkom.co.za (Dean Hammond (NN))
Date: Fri, 11 Jul 2008 16:27:50 +0200
Subject: SUMMARY: scstat -i is hanging
Message-ID:

Yo

Eivind:
--------------------------------------------------------------------------------
Nobody replied to this message. I did not find a solution, so I reinstalled the cluster with SC3.0 5/02, which worked all right.

Regards,
Eivind Nordbye

Eivind Nordbye wrote:
> Hi
>
> My configuration:
> Recently installed two node cluster (V210 and V240) running SC3.1
> (from Java Enterprise System 2004Q2) and Solaris 8 2/04. bge
> interfaces used as interconnects. D2 as shared storage.
>
> My problem:
> scstat -i is hanging for several minutes before it lists up the ipmp
> group on the local node only. A message in the log says it cannot talk
> to pnmd on the other node. Problem is the same on both nodes. It does
> not hang when other node is down. I had the problem before I
> configured ipmp groups as well. The pnmd daemon is running on both
> nodes and I can telnet to its port number on the interconnect
> interfaces from one node to another. Latest patches have been installed.
>
> Please help. Will summarize.
>
> Regards,
> Eivind Nordbye
--------------------------------------------------------------------------------

I found the solution:

You need to set the following:

ndd -set /dev/ip ip_strict_dst_multihoming 0

If it's set to 1, the problem will pop up again.

From ahoesch at smartsoft.de Mon Jul 14 03:59:04 2008
From: ahoesch at smartsoft.de (Andreas Höschler)
Date: Mon, 14 Jul 2008 09:59:04 +0200
Subject: Summary: Kill process problem
In-Reply-To: <329389FB-4F93-11DD-AC3C-000393CA0072@smartsoft.de>
Message-ID:

Dear all,

The question was:

> I have a bad acroread process running in a zone.
>
> 11573 ahoesch 89M 14M cpu1 60 0 5:24:40 50% acroread/1
>
> I tried "kill 11573" and "kill -9 11573" in the zone. Nothing!
> I tried "kill 11573" and "kill -9 11573" in the global zone. Nothing!
> the process still runs.
> I then tried to reboot the zone hosting the
> process. The zone went down and never came up again. In the global zone
> prstat still shows the process. I am stuck! What is this? Looks like a
> serious bug in Solaris!? I don't dare to reboot the whole system since
> it probably won't go down cleanly anyway. What can I do?

The bottom line of your responses was that "kill -9 ..." won't kill a process that hangs within a system call (e.g. read()). The process blocked a complete CPU for 5 hours. Seconds before I was going to hit reboot, the process was suddenly gone and the zone shut down. So I did not have to do a complete system reboot this time.

I am attaching responses in no special order.

Thanks a lot!

Regards,

Andreas

************************************************************************

Most probably the process is stuck in BIOREAD state, you have triggered the bug with the zones.

Try to:

- Kill all processes within zone concerned
- Halt the zone
- Forcibly umount all mount points referenced by the zone (using umount -f)
- Boot the zone again.

This always helped me (for example if an NFS mount into the zone timed out.)

************************************************************************

This:

http://opensolaris.org/jive/thread.jspa?messageID=147538

is what you are triggering. On the other hand, sometimes you can use ``preap'' if the process is stuck in zombie state.

Just for my interest, what was the output of "zoneadm list -civ" after you issued "zoneadm halt"?

************************************************************************

Try truss'ing the process:

truss -p 11573

this should provide some clue as to what the process is doing, possibly why it cannot be killed. If this doesn't work, mdb might reveal more information.
You could be facing a bug similar to this:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6455727

Good luck,
-f
http://www.blackant.net/

************************************************************************

man preap

************************************************************************

Zone or no zone, processes can only be killed when they are not in the middle of a system call. If acroread is making a call (like read()) that doesn't return, it will not die. You might try to 'truss' it and see if it's making such a call.

If a process won't die, I don't know any method of disconnecting it from a zone so that the zone can be restarted.

************************************************************************

From sunhux at gmail.com Fri Jul 18 10:52:36 2008
From: sunhux at gmail.com (sunhux G)
Date: Fri, 18 Jul 2008 22:52:36 +0800
Subject: Summary: Killing a process with rapidly changing pid (without rebooting)
Message-ID: <60f08e700807180752q3fb2b232te5ae3899540488cd@mail.gmail.com>

Thanks to the numerous replies.

====================================
Below are 5 which are good enough:

Use ptree to find its (stable) parent and then kill that, or use pkill.

====================================
I don't think that this will solve your problem but:

kill -9 ` ps -elf | grep foo | grep -v "grep" | cut -d " " -f5`

====================================
If the name of the process is unique, you can use pkill:

pkill foobar
or
pkill -9 foobar

====================================
#ps -elf | grep nohup | grep -v grep
0 O root 16538 1579 0 99 20 ? 152 13:24:49 ? 0:00 nohup echo AA

In my output you see the parent process id is 1579.
Now I am killing this process:

#kill 1579

====================================
pgrep PARENT_PROCESS_NAME | xargs kill -9

Thanks
U

On Fri, Jul 18, 2008 at 2:55 PM, sunhux G wrote:
> Hi,
>
> I've started/run a "while ... loop" shell script by issuing
> "nohup script_name &".
>
> Rebooting the server is not an option.
>
> Problem is in a split of a second after issuing
> "ps -ef | grep script_name"
> the pid would change again & I'm not able to kill using pid.
>
> I wrote this script so that I can rapidly check a SAN LUN
> path status & email out when the status of the SAN LUN
> changed momentarily. We suspect there's transient/quick
> transitioning problem with certain SAN LUN & the SAN
> vendor told us to check it by issuing a specific command
> from the Solaris server.
>
> I did not insert a "sleep ..." line in the script in case during
> the short pause, we missed capturing the state of the SAN
> LUN.
>
> Perhaps someone has script that could quickly get the
> pid of the process, pipe it immediately to the kill command??
>
> Thanks
> U

From zhu_junca at yahoo.ca Sun Jul 20 11:40:07 2008
From: zhu_junca at yahoo.ca (Carl Ma)
Date: Sun, 20 Jul 2008 11:40:07 -0400
Subject: summary: solaris network port response time
Message-ID:

Hi,

Thanks for the responses from the following people.

Andrew Brennan
John.Hallman
Ddelija
JayJay Florendo
Sengor

Andrew suggested port mapping and scapy.py, which doesn't apply in our environment.

John Hallman suggested the IO::Socket module in a Perl script, to which I added a Time::HiRes function; it ended up as the Perl script below.

Ddelija also suggested switch port mapping, which I can't implement in our environment.

JayJay advised using "time" to count the round trip time. As this program forks lots of processes, port connection is only one of the components. I can't use this.
Sengor suggested sending snoop output to another host running Ethereal/Wireshark. It is doable although extra management approval is needed. :-) I would suggest chaosreader, which is a well written perl script. It can parse snoop/tcpdump output and create an html file with all kinds of statistics.

So far, I am using the script below for a quick snapshot of port response and waiting for Sun to release a new tcp provider. :-)

Thanks,

carl

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket;
use Time::HiRes qw( gettimeofday );

my ($socket, $start, $after, $delta);
my $status = '';

# One entry per service to probe: [ display name, IP address, port ]
my @services = (
    [ 'destination.net', 'IP', 'port' ],
);

for my $i ( 0 .. $#services ) {
    $status .= "Port ".$services[$i][2]." on ".$services[$i][0]." is ";
    # In scalar context gettimeofday() returns fractional seconds
    $start = gettimeofday();
    $socket = IO::Socket::INET->new(
        PeerAddr => $services[$i][1],
        PeerPort => $services[$i][2],
        Proto    => 'tcp',
        Timeout  => 2,
        Type     => SOCK_STREAM);
    if ($socket) {
        # Connect succeeded: elapsed time is the TCP connection time
        $after = gettimeofday();
        $delta = ($after - $start);
        $status .= "UP, port connection time is $delta ";
        close($socket);
    } else {
        $status .= "DOWN";
    }
    $status .= "\n";
}
print $status;
exit;

--original question--
On Mon, 30 Jun 2008, Jun Zhu wrote:
> Hello folks,
>
> I am looking for a way to monitor the network response time from local
> server(solaris 10 release 08/07) to a particular application port on remote
> machine, which I don't have access to and it could be a host or network
> device.
>
> As this is a production server, I can't install extra utilities. So nmap or
> ethereal is out of scope. The dtrace script - tcpsnoop doesn't work on the
> latest solaris release because of the incompatible fbt provider.
>
> I am planning to use a script to parse snoop output. By calculating the
> timestamp difference in "ether header", it should be able to show response
> time of remote port.
>
> Someone may have this done already and could you please share some ideas?
>
> Thanks & have a good day,
>
> carl

From Ronelle.vanNiekerk at intecbilling.com Mon Jul 21 13:10:23 2008
From: Ronelle.vanNiekerk at intecbilling.com (Ronelle van Niekerk)
Date: Mon, 21 Jul 2008 19:10:23 +0200
Subject: SUMMARY: Boot Solaris 10 X86 in 64-bit
In-Reply-To: <20080721163814.GB21744@SDF.LONESTAR.ORG>
References: <20A990C5AF9E6F46A338CF7FA4403346E46920@IBCPTEX01.intecbilling.com>
 <20080721163814.GB21744@SDF.LONESTAR.ORG>
Message-ID: <20A990C5AF9E6F46A338CF7FA4403346E7A1FD@IBCPTEX01.intecbilling.com>

Turns out it was the vmware servers - we can't get it to allow us to run in 64-bit mode.

Thanks to all who tried to help.

-Ronelle

-----Original Message-----
From: A Darren Dunham [mailto:ddunham at taos.com]
Sent: 21 July 2008 06:38 PM
To: Ronelle van Niekerk
Subject: Re: Boot Solaris 10 X86 in 64-bit

On Wed, Jul 16, 2008 at 11:08:06AM +0200, Ronelle van Niekerk wrote:
> Guys,
>
> We're trying to install Solaris 10 x86 on a vmware esx server.
>
> We've configured the virtual machine to 64-bit and to accept solaris as
> a bootable OS (I don't know much about this bit.)

I wonder if there was a problem here.

> Solaris won't boot in 64-bit mode and choosing the kernel/amd64/unix
> won't force it - it comes up with a processor error.

64-bit is the default after installation if the hardware supports it. During installation, only a 32-bit version is used.

Can you show the output of 'psrinfo -vp'?

--
Darren

From speedyourmind at yahoo.com Tue Jul 22 09:36:27 2008
From: speedyourmind at yahoo.com (Kiran Sharma)
Date: Tue, 22 Jul 2008 06:36:27 -0700 (PDT)
Subject: Summary: Solaris Volume Manager (SVM) issue with mirroring the disks.
In-Reply-To: <493206.13761.qm@web65405.mail.ac4.yahoo.com>
Message-ID: <958163.52655.qm@web65405.mail.ac4.yahoo.com>

The issue was with vfstab. I copied the entry of / (root) in the vfstab and forgot to change "no" to "yes" in the "mount at boot" column.

I would like to thank the following, who pointed out the issue with the mount. Thank you so much. Have a wonderful day.

Matthew Stier
Neil Calton
Paveza, Gary
Deiter, Scott
Andrew Williamson - Fujitsu
joe fletcher
Udo Grabowski
Ric Anderson
A Darren Dunham
Matthew Stier
Romeo Theriault
JayJay Florendo

Kiran Sharma wrote:

Gurus,

I just installed Solaris 10 05/08 on the server. I created state databases on slices 3 and 4 on two disks, c0t0d0 and c0t1d0. I created two slices, 6 and 7, with about 1gb and 500mb of space on each.

I used the following commands.

metainit d51 1 1 c0t0d0s3
metainit d52 1 1 c0t1d0s3
metainit d5 -m d51
metattach d5 d52

and the same process for the d6 mirror.

I tried to mount manually, and I can mount it and see the content on c0t0d0s3 with the following mount command

mount /dev/md/dsk/d6 /sam

but when I unmount, add an entry to vfstab and run mountall, it does not mount the mount point /sam. I added the entry to vfstab and rebooted the server without any problem, but it didn't mount the filesystem. Again, if I mount manually then I can mount it and see the content on the disk.

Here is the detail from the server. I would really appreciate it if someone could tell me whether I am missing something, whether there is a bug on the system, or whether I need to upgrade or apply any patches.
Thanks
- KS

# uname -a
SunOS samrat 5.10 Generic_127127-11 sun4u sparc SUNW,UltraSPARC-IIi-cEngine
#
# df -k
Filesystem            kbytes     used    avail capacity  Mounted on
/dev/dsk/c0t0d0s0   20169850 13690434  6277718    69%    /
/devices                   0        0        0     0%    /devices
ctfs                       0        0        0     0%    /system/contract
proc                       0        0        0     0%    /proc
mnttab                     0        0        0     0%    /etc/mnttab
swap                 2824016     1472  2822544     1%    /etc/svc/volatile
objfs                      0        0        0     0%    /system/object
fd                         0        0        0     0%    /dev/fd
swap                 2822544        0  2822544     0%    /tmp
swap                 2822584       40  2822544     1%    /var/run
#
------------------------------------------------------------------
# more /etc/vfstab
#device             device              mount             FS      fsck    mount    mount
#to mount           to fsck             point             type    pass    at boot  options
#
fd                  -                   /dev/fd           fd      -       no       -
/proc               -                   /proc             proc    -       no       -
/dev/dsk/c0t0d0s1   -                   -                 swap    -       no       -
/dev/dsk/c0t0d0s0   /dev/rdsk/c0t0d0s0  /                 ufs     1       no       -
/devices            -                   /devices          devfs   -       no       -
ctfs                -                   /system/contract  ctfs    -       no       -
objfs               -                   /system/object    objfs   -       no       -
/dev/md/dsk/d6      /dev/md/rdsk/d6     /sam              ufs     1       no       logging
/dev/md/dsk/d5      /dev/md/rdsk/d5     /cisco            ufs     1       no       logging
swap                -                   /tmp              tmpfs   -       yes      -

I even moved the swap entry above /sam and also tried moving it to the bottom, but that didn't work.
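
For anyone hitting the same symptom: the summary at the top of this thread traces it to the "mount at boot" column reading "no" for the new metadevice entries. A rough way to spot such entries is sketched below. It assumes the usual seven-field vfstab layout and that only / legitimately carries "no" among ufs entries, which is a simplification (other early-mounted filesystems, such as a separate /usr, may need the same exemption). The function name vfstab_not_at_boot is made up for illustration.

```shell
# Print the mount points of ufs entries (other than /) whose
# "mount at boot" field, the 6th column, is "no".
# Reads vfstab-formatted text on stdin.
# vfstab_not_at_boot is a hypothetical helper name, not a Solaris command.
vfstab_not_at_boot() {
    awk '
        $1 ~ /^#/ || NF < 7 { next }    # skip comments and short lines
        $4 == "ufs" && $3 != "/" && $6 == "no" { print $3 }
    '
}
```

Run as "vfstab_not_at_boot < /etc/vfstab". Against the vfstab listed above it would name /sam and /cisco, the two filesystems that mountall skipped.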
----------------------------------------------------
# mountall
mount: /tmp is already mounted or swap is busy
#
# df -k
Filesystem            kbytes     used    avail capacity  Mounted on
/dev/dsk/c0t0d0s0   20169850 13690434  6277718    69%    /
/devices                   0        0        0     0%    /devices
ctfs                       0        0        0     0%    /system/contract
proc                       0        0        0     0%    /proc
mnttab                     0        0        0     0%    /etc/mnttab
swap                 2824184     1472  2822712     1%    /etc/svc/volatile
objfs                      0        0        0     0%    /system/object
fd                         0        0        0     0%    /dev/fd
swap                 2822712        0  2822712     0%    /tmp
swap                 2822752       40  2822712     1%    /var/run
#
# metastat |more
d6: Mirror
    Submirror 0: d61
      State: Okay
    Submirror 1: d62
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 1025595 blocks (500 MB)

d61: Submirror of d6
    State: Okay
    Size: 1025595 blocks (500 MB)
    Stripe 0:
        Device     Start Block  Dbase  State  Reloc  Hot Spare
        c0t1d0s7   0            No     Okay   Yes

d62: Submirror of d6
    State: Okay
    Size: 1025595 blocks (500 MB)
    Stripe 0:
        Device     Start Block  Dbase  State  Reloc  Hot Spare
        c0t0d0s7   0            No     Okay   Yes

d5: Mirror
    Submirror 0: d51
      State: Okay
    Submirror 1: d52
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 2097414 blocks (1.0 GB)

d51: Submirror of d5
    State: Okay
    Size: 2097414 blocks (1.0 GB)
    Stripe 0:
        Device     Start Block  Dbase  State  Reloc  Hot Spare
        c0t0d0s6   0            No     Okay   Yes

d52: Submirror of d5
    State: Okay
    Size: 6110235 blocks (2.9 GB)
    Stripe 0:
        Device     Start Block  Dbase  State  Reloc  Hot Spare
        c0t1d0s6   0            No     Okay   Yes

Device Relocation Information:
Device   Reloc  Device ID
c0t0d0   Yes    id1,sd@SSEAGATE_ST336704LSUN36G_3CD1XFL600007141DJ82
c0t1d0   Yes    id1,sd@SSEAGATE_ST336704LSUN36G_3CD1AK9W00007125L64L
-------------------------------------------------
# metadb
        flags           first blk       block count
     a m  pc luo        16              8192            /dev/dsk/c0t0d0s3
     a    pc luo        16              8192            /dev/dsk/c0t0d0s4
     a    pc luo        16              8192            /dev/dsk/c0t1d0s3
     a    pc luo        16              8192            /dev/dsk/c0t1d0s4
#
# mount /dev/md/dsk/d6 /sam
# mount /dev/md/dsk/d5 /cisco
# df -k
Filesystem            kbytes     used    avail capacity  Mounted on
/dev/dsk/c0t0d0s0   20169850 13690434  6277718    69%    /
/devices                   0        0        0     0%    /devices
ctfs                       0        0        0     0%    /system/contract
proc                       0        0        0     0%    /proc
mnttab                     0        0        0     0%    /etc/mnttab
swap                 2823512     1472  2822040     1%    /etc/svc/volatile
objfs                      0        0        0     0%    /system/object
fd                         0        0        0     0%    /dev/fd
swap                 2822040        0  2822040     0%    /tmp
swap                 2822080       40  2822040     1%    /var/run
/dev/md/dsk/d6        482308     1252   432826     1%    /sam
/dev/md/dsk/d5       1016306     1252   954076     1%    /cisco
#

From sun at paynet.co.ke Tue Jul 22 11:30:27 2008
From: sun at paynet.co.ke (sun)
Date: Tue, 22 Jul 2008 18:30:27 +0300
Subject: SUMMARY: Error Installing Solaris 10 on T5220
In-Reply-To: <43F744E0C1D7334E8AC73B02B5697F750316A11F@exchange.paynet.co.ke>
References: <43F744E0C1D7334E8AC73B02B5697F750316A11F@exchange.paynet.co.ke>
Message-ID: <43F744E0C1D7334E8AC73B02B5697F75030F6FF6@exchange.paynet.co.ke>

Hi,

Thank you all for your comments. It appears that I may be using an older version of Solaris 10. I have been trying one I downloaded last year in August. I am already re-downloading the Solaris software and will retry the installation with the latest version I can download, which is 10 5/08.

All my responses seem to point to this as the problem, so rather than trouble you all before I have had a chance to try a newer version, I think I can summarize this for now.

Thank you all once again for your quick responses. Good work.

Thanks
Andrew Luande

________________________________
From: sun
Sent: Tuesday, July 22, 2008 5:50 PM
To: sunmanagers at sunmanagers.org
Subject: Error Installing Solaris 10 on T5220

Hi,

I am getting the following errors as I begin my Solaris 10 installation from a boot cdrom command. I can't seem to find much on the internet about these errors.
The installation doesn't terminate, as I get the "select your language" menu and I can begin the install, but I wonder why these errors are coming up. Will they cause me a problem later? What should I do?

I am trying to reinstall Solaris 10 on a T5220 server which is new. Your advice is appreciated.

Errors

Boot device: /pci@0/pci@0/pci@1/pci@0/pci@1/pci@0/usb@0,2/storage@2/disk@0:f  File and args:
SunOS Release 5.10 Version Generic_118833-33 64-bit
Copyright 1983-2006 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Configuring devices.
/platform/sun4u/kernel/drv/sparcv9/rmc_comm: undefined symbol 'watchdog_activated'
/platform/sun4u/kernel/drv/sparcv9/rmc_comm: undefined symbol 'tod_ops'
/platform/sun4u/kernel/drv/sparcv9/rmc_comm: undefined symbol 'watchdog_enable'
WARNING: mod_load: cannot load module 'rmc_comm'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'rmc_comm_unreg_intr'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'rmc_comm_unregister'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'rmc_comm_register'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'rmc_comm_reg_intr'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'rmc_comm_request_nowait'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'watchdog_activated'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'watchdog_available'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'tod_ops'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'rmc_comm_request_response'
/platform/sun4u/kernel/drv/sparcv9/rmclomv: undefined symbol 'watchdog_enable'
WARNING: mod_load: cannot load module 'rmclomv'
WARNING: rmclomv: unable to resolve dependency, module 'drv/rmc_comm' not found
/platform/sun4u/kernel/drv/sparcv9/rmc_comm: undefined symbol 'watchdog_activated'
/platform/sun4u/kernel/drv/sparcv9/rmc_comm: undefined symbol 'tod_ops'
/platform/sun4u/kernel/drv/sparcv9/rmc_comm: undefined symbol 'watchdog_enable'
WARNING: mod_load: cannot load module 'rmc_comm'
/platform/sun4u/kernel/drv/sparcv9/rmcadm: undefined symbol 'rmc_comm_unregister'
/platform/sun4u/kernel/drv/sparcv9/rmcadm: undefined symbol 'rmc_comm_request_response_bp'
/platform/sun4u/kernel/drv/sparcv9/rmcadm: undefined symbol 'rmc_comm_register'
/platform/sun4u/kernel/drv/sparcv9/rmcadm: undefined symbol 'rmc_comm_request_response'
/platform/sun4u/kernel/drv/sparcv9/rmcadm: undefined symbol 'rmc_comm_send_srecord_bp'
WARNING: mod_load: cannot load module 'rmcadm'
WARNING: rmcadm: unable to resolve dependency, module 'drv/rmc_comm' not found
Using RPC Bootparams for network configuration information.
Attempting to configure interface e1000g3...
NOTICE: pciex8086,105e - e1000g[0] : Adapter 100Mbps full duplex copper link is up.
Skipped interface e1000g3
Attempting to configure interface e1000g2...
Skipped interface e1000g2
Attempting to configure interface e1000g1...
Skipped interface e1000g1
Attempting to configure interface e1000g0...
Skipped interface e1000g0
internal error: Bad file number
svc:/system/filesystem/local:default: WARNING: /usr/sbin/zfs mount -a failed: exit status 134
Jul 22 00:22:40 svc.startd[7]: svc:/system/filesystem/local:default: Method "/lib/svc/method/fs-local" failed with exit status 95.
Jul 22 00:22:40 svc.startd[7]: system/filesystem/local:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Setting up Java. Please wait...
Extracting windowing system. Please wait...
Beginning system identification...
Searching for configuration file(s)...
Search complete.
Discovering additional network configuration...
Thanks
Andrew Luande

From Robert.Peebles at Williams.com Thu Jul 24 14:53:55 2008
From: Robert.Peebles at Williams.com (Peebles, Robert)
Date: Thu, 24 Jul 2008 13:53:55 -0500
Subject: SUMMARY: Hostflapping Issue
In-Reply-To: <74CD15E19CA2CC4AB3AB7ABEC63670E90197C5B9@wmstutmb04.WILLIAMS.COM>
References: <488710A8.2010905@fjserv.net>
 <74CD15E19CA2CC4AB3AB7ABEC63670E9018CF4F9@wmstutmb04.WILLIAMS.COM>
 <74CD15E19CA2CC4AB3AB7ABEC63670E90197C5B9@wmstutmb04.WILLIAMS.COM>
Message-ID: <74CD15E19CA2CC4AB3AB7ABEC63670E90197C5BC@wmstutmb04.WILLIAMS.COM>

Thanks to Francisco for letting me know about this issue:

http://blogs.sun.com/swas/entry/solaris_10_8_07_broadcom

However, this was not our problem. The ST2540 has an unresolved bug. Here's the description that the Sun support engineer sent us:

ST2540 could send packets coming from a host back to the SWITCH (it forwards the same packets on the network again). So the SWITCH stores the host's MAC address info on the port connected to the ST2540.

SWITCH
  PortA <------ Host
  PortB ------> ST2540
        <====== !! (forwards the same packet back to PortB)

The SWITCH wrongly concludes that the Host is connected to PortB, because the "forwarded" packet has the Host's MAC address as the "source MAC". So if the SWITCH receives packets for the Host, it forwards them to PortB, not PortA.

There are 2 workarounds:

1) Connect the NICs on the switch to a hub and connect a NIC on the server to a hub; OR

2) Make direct network connections between the switch and the server using crossover cables.
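
The MAC-learning behaviour the support engineer describes can be reproduced with a toy model. The sketch below is not switch firmware, only an illustration: it replays "port source-MAC" sightings in order and keeps the last port seen for each MAC, which is how a learning switch fills its forwarding table. The function name mac_table and the MAC address in the usage line are invented for the example.

```shell
# Toy learning switch: each input line is "<port> <source-mac>", in the
# order the frames arrive. The newest sighting of a MAC overwrites the
# older one, just as a real forwarding table is updated.
# mac_table is a hypothetical name used only for this sketch.
mac_table() {
    awk '
        NF == 2 {
            if (!($2 in port)) order[++n] = $2   # remember first-seen order
            port[$2] = $1                        # newest sighting wins
        }
        END { for (i = 1; i <= n; i++) print order[i], port[order[i]] }
    '
}
```

Feeding it the host's frame on PortA followed by the copy reflected by the array on PortB:

printf '%s\n' 'PortA 00:14:4f:aa:bb:cc' 'PortB 00:14:4f:aa:bb:cc' | mac_table

leaves the host's MAC recorded against PortB, so traffic for the host is sent to the array's port until the host transmits again. That matches the ping loss described in this thread.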
Regards,
Robert

-----Original Message-----
From: sunmanagers-bounces at sunmanagers.org [mailto:sunmanagers-bounces at sunmanagers.org] On Behalf Of Peebles, Robert
Sent: Wednesday, July 23, 2008 9:26 AM
To: sunmanagers
Subject: Hostflapping Issue

Greetings,

I have a Sun X4600 running Solaris 10 8/07 connected to an ST2540 array via FC HBAs. Both the server and the array are on the same subnet, connected to the same Cisco 4507 switch. Numerous times every day the server's IP address stops responding to pings. When this occurs, the MAC address of the Sun's NIC appears on the switch port that is connected to the NIC on the ST2540. I can't comprehend how the MAC address on the Sun can show up on a port connected to the array, unless it's a switch issue.

Has anyone else experienced this problem? I would greatly appreciate any light that could be shed on this issue.

Thanks!

Robert

From stepchung at gmail.com Mon Jul 28 12:34:05 2008
From: stepchung at gmail.com (Stephanie C)
Date: Mon, 28 Jul 2008 09:34:05 -0700
Subject: SUMMARY: ZFS vs UFS - Anyone runs big Oracle DB on ZFS?
Message-ID: <6bdfa8ce0807280934o12e5adc4l974957a59ccae3f7@mail.gmail.com>

Thank you for all the excellent responses. I really appreciate your help.

*From: francisco roque*

We run all of our production databases on zfs, but the largest is only about 70GB. There are a number of steps you need to take to get better performance for db.
Read through the entirety of the ZFS Best Practices Guide:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

and the related tuning guide:

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

The main changes we make are to limit the ZFS ARC and to disable cache flushing (we use SAN disk). We tried these changes and found that they gave us performance similar to our previous filesystems (ufs/vxfs). If you try zfs, be sure to test it out on a dev system and do some true perf testing to verify that any changes you make are really effective. Depending on the nature of your database, you may find a number of the tuneables worthwhile, but only testing will verify this.

*From: Jeff Marble*

We had one issue with ZFS and one with contiguous blocks of memory on our Oracle 10g server. We are running fine now after discovering the corrections. We do not typically have a heavy load though.

ZFS - Oracle performs a sync write, which requires that all buffers are written to disk. That means all cache must be flushed out to disk and a response must be received before proceeding. We got around this by putting the ctl files, redo, and arch files on a separate 5GB partition. Performance went way up.

Large Contiguous Blocks - Sun attempted to implement a new feature where Oracle is given large contiguous blocks of memory to work with. The problem is that at times the algorithm cannot find the space and the system comes to its knees. It does not happen if Oracle is brought up immediately after starting the computer, before any other app can fragment the RAM. We set pg_contig_disable=1 in /etc/system and rebooted.

Run this line to stop the hunt for contiguous pages. You should see results in a few minutes.

echo "pg_contig_disable/W 1" | mdb -kw

Or for a more permanent resolution, put this in /etc/system and reboot.
* Disable coalescing feature
set pg_contig_disable=1

*From: Martin Preßlaber*

perhaps that article will help you:

http://www.solarisinternals.com/wiki/index.php/ZFS_for_Databases

*From: Maciej Bliziński*

My company is running Oracle 10g on ZFS. Our database is comparable to 300GB, and we have no performance issues. I'm only administering the OS, so I don't know much more than that at the moment. If you have specific questions, I can ask around.

*From: Rajiv Gunja*

Some of the production DBs at our Org have about 1.5 TB to 3 TB of Oracle 9/10 on SUN Solaris with FS = vxfs. One way we take care of performance issues is to split the DB into smaller pieces across different FS. It's been many (11) years since I worked with Oracle, so I'm not sure what snapshot is. We use NetApps for SAN, so we have daily snapshots of the FS done on the SAN side and we also perform a cold backup weekly.

*From: Dr. Udo Grabowski*

You could also create vdevs on a zfs raidz filesystem, export them via iscsi, and then create a UFS filesystem in them with directio access from Oracle. On our MySQL database this gave a factor of 5 better performance than an adapted (recordsize, etc.) ZFS filesystem, and we have the checksumming and other features of ZFS still underneath.

On Fri, Jul 25, 2008 at 3:59 PM, Stephanie C wrote:
> We are moving from HP-UX to SUN Solaris 10. We would like to configure the
> Oracle 10g DB with the ZFS file system to take advantages of this ZFS cool
> features (snapshot...). But since we have some information about Oracle 10g
> DB is having some performance issue with ZFS file system (this information
> is not confirmed), we don't really want to take this chance. I hate to have
> two file systems (ZFS for applications data and UFS for DB) in one zone.
> When I do 'zfs snapshot', it will not snap the Oracle DB. Before making the
> decision to go or not to go with ZFS for DB, I would like to ask the experts
> on the list.
> Does anyone run Oracle DB (300GB data) on ZFS zone? Do you
> experience with the performance issue? Thank you.

From dkrause at optivus.com Mon Jul 28 13:15:49 2008
From: dkrause at optivus.com (Don Krause)
Date: Mon, 28 Jul 2008 10:15:49 -0700
Subject: SUMMARY: echod on Solaris 10
In-Reply-To: <2ACEEA94-A118-442B-A0DF-FC22A1B17670@optivus.com>
References: <2ACEEA94-A118-442B-A0DF-FC22A1B17670@optivus.com>
Message-ID:

Many thanks to: Tim Bradshaw, Aleks Feltin and Luc I. Suryo for getting me in the right direction.

For some reason, in.echod seems to be installed on their systems; it is not, however, installed on any of my Solaris 10 boxes.

Installing SUNWcns is only half the battle, however. After installing it, there is indeed now a /usr/lib/inet/in.echod, but running inetadm | grep echo still returns nothing.

Turns out I had to:

Install SUNWcns
svccfg import /var/svc/manifest/network/echo.xml
inetadm -e network/echo:stream
inetadm -e network/echo:dgram

Some of this was in the Solaris IP admin guide (816-4554.pdf). However, it made references to "service.dir", but never (that I could locate, anyway) alluded to the fact that the "service.dir" was indeed /var/svc/, so I spent too much time trying to locate the echo.xml file. (That's what I get for assuming that it would be called echod.xml.)

Thanks

On Jul 25, 2008, at 5:14 PM, Don Krause wrote:

> I'm trying to migrate some servers to Solaris 10, and I really need
> echod.
>
> According to the man pages, it's been pulled out of inetd and is now
> /usr/sbin/in.echod, but not on my installs. So I tried adding it to
> /etc/inetd.conf and running inetconv, it complains, telling me that I
> need to install SUNWcnsr and SUNWcnsu, which don't seem to exist.
>
> However, SUNWcns does exist, and it claims to offer echo, time, etc.
> So I successfully install that package, but there's still no in.echod
> anywhere that I can find, svcs doesn't list an echo service,
> neither does inetadm.
>
> Suggestions?

--
Don Krause
Optivus Proton Therapy, Inc.
P.O. Box 608
Loma Linda, California 92354
dkrause at optivus.com
www.optivus.com
"This message represents the official view of the voices in my head."

From rob.knighthood at gmail.com Mon Jul 28 18:15:17 2008
From: rob.knighthood at gmail.com (Robert Knight)
Date: Mon, 28 Jul 2008 14:15:17 -0800
Subject: SUMMARY: non-interactive jumpstart
Message-ID: <2b1484630807281515k292b481qce77610859bffe79@mail.gmail.com>

Thank you all for the rapid and useful replies. I received a number of potential solutions, and spent an hour or so testing them all out. The solutions I chose were:

1) Hardcode the NFSv4 domain to mydomain.com, instead of selecting "dynamic". We don't use NFSv4 anyway, so I disabled it in a post script by setting /etc/default/nfs NFS_*_VERSMAX=3.

2) I set timeserver=localhost, which tells jumpstart that the local clock is assumed to be correct. No one could provide information on other accepted values of the timeserver variable (IP addresses for ntp servers? something else?). If anyone happens to know, it would satisfy a curiosity :-)

Robert Knight

On Mon, Jul 28, 2008 at 10:02 AM, Robert Knight wrote:

> I'm having issues getting jumpstart to be fully automated.
> There are two issues:
>
> 1) Prompting for NFSv4 domain. I have the following option in my sysidcfg
> file, but it doesn't seem to be used: nfs4_domain=dynamic
> 2) Prompting for date/time. In the same sysidcfg file, I set
> timeserver=ntp.mydomain.com, but am still prompted during jumpstart.
>
> I am attempting to jumpstart Solaris 10u5 sparc using the DVD image. In
> case it matters, I am using, for now, the traditional RARP method, not the
> DHCP one.
>
> I understand that these questions might not seem time-sensitive in
> the traditional sense, but my boss is breathing down my neck to get this all
> set up correctly...
>
> Does anyone have any experience to lend on these matters?
>
> Robert Knight

From lally.singh at gmail.com Tue Jul 29 12:25:44 2008
From: lally.singh at gmail.com (Lally Singh)
Date: Tue, 29 Jul 2008 12:25:44 -0400
Subject: SUMMARY: Zones, Mac Addresses
Message-ID: <3b3449e00807290925u62685f2ev7c9793269faee54d@mail.gmail.com>

Thanks to:
Emmanuel Mejias
Dennis Clarke
D. Ratliff
Hendrik Visage
Francisco Roque
and Matthew Stier

Question: Can you run multiple zones off of a single NIC, all sharing the same MAC address?

Answer: YES. Set up virtual NICs with different IPs, and give them all the same MAC address. Solaris will do the right thing.

Documentation Links: (This is all quoted)

http://www.sun.com/software/solaris/howtoguides/containersLowRes.jsp

In particular:
http://docs.sun.com/app/docs/doc/817-1592/z.conf.start-29?a=view
contains a section on how to set up a zone.

The zonecfg(1M) man page has more info as well:
http://docs.sun.com/app/docs/doc/816-5166/zonecfg-1m?a=view
as does the zones(5) man page:
http://docs.sun.com/app/docs/doc/816-5175/zones-5?a=view

Configuration help:

I have some experience with this.
get a list of your zones with

zoneadm list -vc

get the config for a zone with

zonecfg -z zonename info

then set up exclusive mode ip if needed.

better yet .. run snoop to verify your MAC address data and you should see the same MAC for all your traffic from that one NIC

Thanks again everyone!

--
H. Lally Singh
Ph.D. Candidate, Computer Science
Virginia Tech

From jesse-carroll at usa.net Tue Jul 29 15:47:38 2008
From: jesse-carroll at usa.net (JESSE CARROLL)
Date: Tue, 29 Jul 2008 15:47:38 -0400
Subject: SUMMARY: java 1.2.2 sparc download
Message-ID: <595mgCTUM2230S13.1217360858@cmsweb13.cms.usa.net>

Got one of the local Sun guys to ftp me a copy, and the download site FINALLY came back. I love REALLY out of date applications.

------ Original Message ------
Received: Tue, 29 Jul 2008 11:11:06 AM EDT
From: "JESSE CARROLL"
To:
Subject: java 1.2.2 sparc download

We are attempting to move an OLD application from Solaris 8 to Solaris 10 and the app wants java 1.2.2 (at least 1.2.2_10). The archive section at http://java.sun.com/products/archive/j2se/1.2.2_017/ is having a bit of a problem ("General Error A technical error occured while processing your request. Please contact the system administrator. Thank you for your patience.").

Does anyone have a copy of the download? I'd be happy with anything from _10 to _17.

JC