SUMMARY: interpreting "vmstat -i" output

From: Craig D Rice (cdr@stolaf.edu)
Date: Mon Nov 29 1993 - 21:54:53 CST


First, thanks to the following people who responded to my request for
suggestions on interpreting the output from "vmstat -i":

        dsnmkey@guinness.ericsson.se (Martin Kelly - Unix Specialist)
        Casper Dik <casper@fwi.uva.nl>
        Ian Angles <ia@st-andrews.ac.uk>
        aw@bk.sel.de (Armin Weber)
        eckhard@ts.go.dlr.de (Eckhard Rueggeberg)

More than anything else, those who responded indicated that my
interrupt counts were not especially out of the ordinary for a news
server and a mail server.

Martin Kelly <martin@guinness.ericsson.se> noted that (when measuring
with vmstat 5) if the number of interrupts is consistently higher than
256, then it's time to explore where they're coming from. In our
case, our interrupt levels were in the 200-300 range.

The full text of the responses is included below.
Thanks for your suggestions,
Craig

--
Craig D. Rice		UNIX Systems Specialist/Network Analyst
cdr@stolaf.edu		Academic Computing Center, St. Olaf College
+1 507 646-3631		1510 St. Olaf Avenue
+1 507 646-3549 FAX	Northfield, MN  55057-1097   USA

-----

From: dsnmkey@guinness.ericsson.se (Martin Kelly - Unix Specialist)
Message-Id: <9311241043.AA26406@guinness.ericsson.se>
To: cdr@stolaf.edu
Subject: Re: interpreting "vmstat -i" output

Hello Craig,

The vmstat -i command displays information on the number of interrupts which are being received and by what.

However, it is much better to look at the interrupt rates first, using the vmstat 5 command. This will indicate whether your system is being interrupted excessively. Only then should you determine the source using vmstat -i. If the number of interrupts is consistently higher than 256, then you can check the source. If your ethernet shows a high number of interrupts, check for a faulty transceiver. This might indeed be the case with your system. Of course, you should run the vmstat command over a long period, e.g. vmstat 300 96 (every 5 minutes for 8 hours).
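[Editor's sketch, not part of Martin's reply.] The procedure described above could be automated roughly as follows: save the output of a long vmstat run (vmstat 300 96 gives 96 samples at 300-second intervals, i.e. 28800 seconds or 8 hours) and check whether the interrupt column stays above 256. The sketch assumes the interrupt column is labelled "in" in the header line, which should be verified against the local vmstat. The function names, the 0.9 cutoff for "consistently", and the sample figures are all hypothetical.

```python
THRESHOLD = 256  # interrupts per sample, the rule of thumb quoted above


def interrupt_rates(lines, skip_first=True):
    """Return the values of the 'in' column from saved vmstat output.

    Assumes the column-header line contains a field named 'in'.
    Header lines that vmstat repeats periodically are skipped.
    The first sample is dropped by default because vmstat's first
    line reports averages since boot, not the current interval.
    """
    rates = []
    col = None
    for line in lines:
        fields = line.split()
        if "in" in fields:              # column-header line
            col = fields.index("in")
            continue
        if col is None or len(fields) <= col or not fields[col].isdigit():
            continue                    # group-header or malformed line
        rates.append(int(fields[col]))
    return rates[1:] if skip_first else rates


def consistently_high(rates, threshold=THRESHOLD, fraction=0.9):
    """True if at least `fraction` of the samples exceed `threshold`.

    The 0.9 cutoff is an arbitrary assumption, not from the thread.
    """
    if not rates:
        return False
    high = sum(1 for r in rates if r > threshold)
    return high / len(rates) >= fraction


# Made-up sample in the style of SunOS vmstat output (illustration only).
sample = """\
 procs     memory            page            disk       faults     cpu
 r b w   avm   fre  re at  pi  po  fr  de  sr s0 s1 s2 s3  in  sy  cs us sy id
 0 0 0     0  1200   0  1   0   0   0   0   0  1  0  0  0 219 310  90  5  7 88
 1 0 0     0  1180   0  0   0   0   0   0   0  2  0  0  0 281 402 120  9 11 80
 0 0 0     0  1176   0  0   0   0   0   0   0  1  0  0  0 243 350 101  6  8 86
""".splitlines()

if __name__ == "__main__":
    rates = interrupt_rates(sample)
    print(rates, consistently_high(rates))
```

In practice the lines would come from a file written by the long vmstat run rather than from an embedded sample. Only if the check comes back true is it worth going on to vmstat -i to find the source.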

If you are having performance problems due to a heavily loaded system (this will be shown up by other events), then you should consider adding another ethernet interface on the mail machine and another SCSI controller on the News machine.

Best Regards,

/Martin.

============================================================================
Martin Kelly                                 Tel: +31 1612 29358
Unix Analyst/Specialist                      Fax: +31 1612 29071
Ericsson Data Services Nederland, BV. (DSN)
PO Box 209
5120 AE Rijen                                MEMO:   ERI.DSN.DSNMKEY
Netherlands                                  E-mail: martin@guinness.ericsson.se
                                             X/Open: m.kelly@xopen.co.uk
============================================================================

----- NEXT -----

From: Casper Dik <casper@fwi.uva.nl>

>
>Greetings, Sun Managers,
>
>I am wondering if y'all know of any guidelines for interpreting
>"vmstat -i" output.  For example, are there any rules of thumb for
>interpreting when a machine is being bogged down by le or scsi
>interrupts?  A scouring of sun-managers-summary.src reveals nothing
>helpful...
>
>For example, our mailhost handles all mail traffic (/var/spool/mail is
>mounted on all UNIX clients, and all SMTP traffic is routed through
>this machine's sendmail).  scsi interrupts (esp) *seems* low, but
>network interrupts (le) *seem* high.

I wouldn't worry too much about the numbers. We get similar numbers
(100/s SCSI on news hosts, 140+ le on busy servers).

None of the machines feels sluggish or slow.

Casper

----- NEXT -----

From: Ian Angles <ia@st-andrews.ac.uk>

>For example, our mailhost handles all mail traffic (/var/spool/mail is
>mounted on all UNIX clients, and all SMTP traffic is routed through
>this machine's sendmail).  scsi interrupts (esp) *seems* low, but
>network interrupts (le) *seem* high.
>
>
>MAILHOST.STOLAF.EDU
>
>interrupt          total      rate
>-----------------------------------  autovectored interrupts
>esp             55858529        22

The figure we get seems to fluctuate between about 20 to 60 with extremes in the 120's - it depends if you're serving NFS clients I suspect.

>fd                     0         0
>audio                  0         0
>le             241090870        96

At first glance it looks like something is hosed - either your net is saturated or the le device is shot - either of which would lead to more obvious symptoms.

But, on the other hand, if you've got a decent size RAM on this machine then the disk cache will be fairly useful, leading to less esp than le interrupts - and if it's a SS1+ class (or above) then it can probably pump out enough packets to get near this figure.

>zs                 10862         0
>clock          248704699       100
>-----------------------------------  vectored interrupts
>-----------------------------------
>Total          545664960       219

>Further, our news machine gives the following stats, that might
>indicate that it's trying to handle a lot of (too many?) scsi (esp,
>ptscII) interrupts:

If you're thrashing the disks - you gotta expect some esp interrupts. Unfortunately, our news machine is a sun4/370 so I can't give you esp stats on that.

I suspect it will also depend on what news transport you're using - C News goes to disk at every opportunity, whereas INN does as few writes and reads as it can.

Hope this is relevant,

ian "Guesswork?"
---
Ian Angles, St. Andrews University Computing Laboratory.
ia@st-andrews.ac.uk
"What's the point of having mastery over the cosmic balance and knowing
the secrets of fate if you can't blow something up?"
                                        - Terry Pratchett, Reaper Man

----- NEXT -----

From: aw@bk.sel.de (Armin Weber)

Hello,

This is no help with your current problem, but here is a pointer to a very good book on the subject: System Performance Tuning by Mike Loukides, published by O'Reilly & Associates.

Greetings

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Armin Weber                 Internet: aw@bk.sel.de
PS/ETD4                     DECNet:   61.137::aw
Motorstr. 55                Phone:    (49) 711 869 6037
70499 Stuttgart             Fax:      (49) 711 869 6049
Germany
ALCATEL SEL, Development Computer Center
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

----- NEXT -----

From: eckhard@ts.go.dlr.de (Eckhard Rueggeberg)

Our main file server (and gateway between two subnets) has le interrupts of 95+23 = 118 and esp of 23+8 = 31 on its two boards (and a grand total of 242). So your numbers seem reasonable and normal...

Eckhard Rüggeberg eckhard@ts.go.dlr.de

----- THE ORIGINAL MESSAGE FOLLOWS -----

> From sun-managers-relay@ra.mcs.anl.gov Wed Nov 24 11:29:26 1993
> Sender: sun-managers-relay@ra.mcs.anl.gov
> To: sun-managers@eecs.nwu.edu
> Cc: root@stolaf.edu
> Subject: interpreting "vmstat -i" output
> Date: Mon, 22 Nov 1993 14:00:10 -0600
> From: Craig D Rice <cdr@stolaf.edu>
> Reply-To: Craig D Rice <cdr@stolaf.edu>
> Followup-To: junk
> Content-Length: 1835
>
>
> Greetings, Sun Managers,
>
> I am wondering if y'all know of any guidelines for interpreting
> "vmstat -i" output.  For example, are there any rules of thumb for
> interpreting when a machine is being bogged down by le or scsi
> interrupts?  A scouring of sun-managers-summary.src reveals nothing
> helpful...
>
> For example, our mailhost handles all mail traffic (/var/spool/mail is
> mounted on all UNIX clients, and all SMTP traffic is routed through
> this machine's sendmail).  scsi interrupts (esp) *seems* low, but
> network interrupts (le) *seem* high.
>
>
> MAILHOST.STOLAF.EDU
>
> interrupt          total      rate
> -----------------------------------  autovectored interrupts
> esp             55858529        22
> fd                     0         0
> audio                  0         0
> le             241090870        96
> zs                 10862         0
> clock          248704699       100
> -----------------------------------  vectored interrupts
> -----------------------------------
> Total          545664960       219
>
>
> Further, our news machine gives the following stats, that might
> indicate that it's trying to handle a lot of (too many?) scsi (esp,
> ptscII) interrupts:
>
> NEWS.STOLAF.EDU
>
> interrupt          total      rate
> -----------------------------------  autovectored interrupts
> esp            159148834        64
> ptscII         126623395        50
> fd                     0         0
> audio                  0         0
> le              54686899        21
> zs                  3178         0
> clock          248677216       100
> -----------------------------------  vectored interrupts
> -----------------------------------
> Total          589139522       236
>
>
> Thanks for your input.  Will summarize....
> Craig
> --
> Craig D. Rice          UNIX Systems Specialist/Network Analyst
> cdr@stolaf.edu         Academic Computing Center, St. Olaf College
> +1 507 646-3631        1510 St. Olaf Avenue
> +1 507 646-3549 FAX    Northfield, MN 55057-1097 USA
>
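[Editor's sketch, not part of the original thread.] The MAILHOST figures above can be cross-checked with a little arithmetic. Assuming the clock row really ticks at the 100 per second shown, dividing its total by 100 gives the time over which the counters accumulated, and dividing each device total by that time should roughly reproduce the "rate" column. That would make the rates long-run averages, which is one reason interval sampling with vmstat 5 is more telling than vmstat -i alone. The Python below uses only the counts posted above. The names (MAILHOST, CLOCK_HZ, and the functions) are hypothetical, and the since-boot interpretation is an inference from the numbers, not a statement from the vmstat documentation.

```python
# (total, reported rate) per device, copied from the MAILHOST table.
MAILHOST = {
    "esp":   (55858529, 22),
    "fd":    (0, 0),
    "audio": (0, 0),
    "le":    (241090870, 96),
    "zs":    (10862, 0),
    "clock": (248704699, 100),
}

CLOCK_HZ = 100  # assumption: clock interrupts arrive at the 100/s shown


def uptime_seconds(table, hz=CLOCK_HZ):
    """Estimate the accumulation period from the clock interrupt count."""
    return table["clock"][0] / hz


def average_rates(table):
    """Average interrupts per second for each device over that period."""
    up = uptime_seconds(table)
    return {name: total / up for name, (total, _rate) in table.items()}


def share_of_total(table):
    """Fraction of all counted interrupts attributable to each device."""
    grand = sum(total for total, _rate in table.values())
    return {name: total / grand for name, (total, _rate) in table.items()}


if __name__ == "__main__":
    print("estimated days:", round(uptime_seconds(MAILHOST) / 86400, 1))
    rates = average_rates(MAILHOST)
    shares = share_of_total(MAILHOST)
    for name, (_total, reported) in MAILHOST.items():
        print(f"{name:<8}{rates[name]:8.1f}/s  reported {reported:>3}"
              f"  {shares[name]:6.1%}")
```

On these figures the counters cover roughly 28.8 days, the computed le and esp averages come out near 96.9/s and 22.5/s (matching the reported 96 and 22 if the tool truncates), and le and clock each account for a bit under half of all interrupts.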


