SUMMARY: interpreting "vmstat -i" output

From: Craig D Rice (cdr@stolaf.edu)
Date: Mon Nov 29 1993 - 21:54:53 CST

First, thanks to the following people who responded to my request for
suggestions on interpreting the output from "vmstat -i":

        dsnmkey@guinness.ericsson.se (Martin Kelly - Unix Specialist)
        Casper Dik <casper@fwi.uva.nl>
        Ian Angles <ia@st-andrews.ac.uk>
        aw@bk.sel.de (Armin Weber)
        eckhard@ts.go.dlr.de (Eckhard Rueggeberg)

More than anything else, those who responded indicated that my
interrupt counts were not especially out of the ordinary for a news
and a mail server.

Martin Kelly <martin@guinness.ericsson.se> noted that (when measuring
with vmstat 5) if the number of interrupts is consistently higher than
256, then it's time to explore where they're coming from. In our
case, our interrupt levels were in the 200-300 range.
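
As a rough illustration of that rule of thumb, here is a minimal Python
sketch that flags a run of interrupt-rate samples as "consistently
high". Only the threshold of 256 comes from Martin's reply; the sample
values, the function name, and the 80% cut-off used to define
"consistently" are assumptions made for illustration.

```python
def consistently_high(rates, threshold=256, fraction=0.8):
    """Return True if at least `fraction` of the samples exceed `threshold`.

    `rates` is a list of interrupts-per-second readings, e.g. the
    interrupt column copied from successive "vmstat 5" reports.
    The 0.8 default is an arbitrary choice, not a documented rule.
    """
    if not rates:
        return False
    above = sum(1 for r in rates if r > threshold)
    return above / len(rates) >= fraction


# Hypothetical readings in the 200-300 range, like the ones described above.
samples = [219, 236, 287, 301, 244, 262]
print(consistently_high(samples))
```

With readings that hover around the threshold rather than staying above
it, the check does not fire, which matches the conclusion that these
counts were not out of the ordinary.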

The full text of the responses is included below.
Thanks for your suggestions,
Craig

--
Craig D. Rice		UNIX Systems Specialist/Network Analyst
cdr@stolaf.edu		Academic Computing Center, St. Olaf College
+1 507 646-3631		1510 St. Olaf Avenue
+1 507 646-3549 FAX	Northfield, MN  55057-1097   USA
-----
From: dsnmkey@guinness.ericsson.se (Martin Kelly - Unix Specialist)
Message-Id: <9311241043.AA26406@guinness.ericsson.se>
To: cdr@stolaf.edu
Subject: Re: interpreting "vmstat -i" output
Hello Craig,
The vmstat -i command displays information on the number
of interrupts which are being received and by what.
However, it is much better to look at the interrupt rates
using the vmstat 5 command. This will indicate if your system
is being interrupted excessively. Only then should you
determine the source using vmstat -i. If the number of
interrupts is consistently higher than 256, then
you can check the source. You can check for a faulty
transceiver if your ethernet shows a high number of
interrupts. This might indeed be the case with your
system. Of course, you should run the vmstat command
over a long period, e.g. vmstat 300 96 (every 5 minutes
for 8 hours).
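
The interval and count in that example are simply the sampling period
in seconds and the number of reports: 300 s x 96 = 28800 s = 8 hours.
A small Python sketch of the same arithmetic, for picking a count for
other monitoring windows (the function name is hypothetical):

```python
def vmstat_count(hours, interval_seconds):
    """Number of reports needed to cover `hours` at the given interval."""
    total_seconds = hours * 3600
    count, remainder = divmod(total_seconds, interval_seconds)
    if remainder:
        # Round up so the whole window is covered.
        count += 1
    return count


interval = 300  # seconds, i.e. every 5 minutes
count = vmstat_count(8, interval)
print(f"vmstat {interval} {count}")
```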
If you are having problems in performance due to
a heavily loaded system (this will be shown by
other events), then you should consider adding
another ethernet interface on the mail machine
and another SCSI controller on the News machine.
Best Regards,
/Martin.
============================================================================
 Martin Kelly                                        Tel: +31 1612 29358
 Unix Analyst/Specialist                             Fax: +31 1612 29071
 Ericsson Data Services Nederland, BV. (DSN)
 PO Box 209
 5120 AE Rijen                      MEMO:   ERI.DSN.DSNMKEY
 Netherlands                        E-mail: martin@guinness.ericsson.se
                                    X/Open: m.kelly@xopen.co.uk
============================================================================
----- NEXT -----
From: Casper Dik <casper@fwi.uva.nl>
>
>Greetings, Sun Managers,
>
>I am wondering if y'all know of any guidelines for interpreting
>"vmstat -i" output.  For example, are there any rules of thumb for
>interpreting when a machine is being bogged down by le or scsi
>interrupts?  A scouring of sun-managers-summary.src reveals nothing
>helpful...
>
>For example, our mailhost handles all mail traffic (/var/spool/mail is
>mounted on all UNIX clients, and all SMTP traffic is routed through
>this machine's sendmail).  scsi interrupts (esp) *seems* low, but
>network interrupts (le) *seem* high.
I wouldn't worry too much about the numbers.  We get similar numbers
(100/s SCSI on news hosts, 140+ le on busy servers).
None of the machines feels sluggish or slow.
Casper
----- NEXT -----
From: Ian Angles <ia@st-andrews.ac.uk>
>For example, our mailhost handles all mail traffic (/var/spool/mail is
>mounted on all UNIX clients, and all SMTP traffic is routed through
>this machine's sendmail).  scsi interrupts (esp) *seems* low, but
>network interrupts (le) *seem* high.
>
>
>MAILHOST.STOLAF.EDU
>
>interrupt      total      rate
>----------------------------------- autovectored interrupts
>esp          55858529       22
The figure we get seems to fluctuate between about 20 and 60, with
extremes in the 120s; it depends on whether you're serving NFS clients,
I suspect.
>fd                  0        0
>audio               0        0
>le           241090870       96
At first glance it looks like something is hosed - either your net is
saturated or the le device is shot - either of which would lead to more
obvious symptoms.
But, on the other hand, if you've got a decent size RAM on this machine
then the disk cache will be fairly useful, leading to less esp than le
interrupts -  and if it's a SS1+ class (or above) then it can probably
pump out enough packets to get near this figure.
>zs              10862        0
>clock        248704699      100
>----------------------------------- vectored interrupts
>-----------------------------------
>Total        545664960      219
>Further, our news machine gives the following stats, that might
>indicate that it's trying to handle a lot of (too many?) scsi (esp,
>ptscII) interrupts:
If you're thrashing the disks, you have to expect some esp interrupts.
Unfortunately, our news machine is a sun4/370, so I can't give you esp
stats on that.
I suspect it will also depend on what news transport you're using - C
News goes to disk at every opportunity, whereas INN does as few writes
and reads as it can.
Hope this is relevant,
ian	"Guesswork?"
---
Ian Angles, St. Andrews University Computing Laboratory. ia@st-andrews.ac.uk
"What's the point of having mastery over the cosmic balance and knowing the
secrets of fate if you can't blow something up?" - Terry Pratchett, Reaper Man
----- NEXT -----
From: aw@bk.sel.de (Armin Weber)
Hello,
no help to your current problem, but a hint for a very good book
for this:
System Performance Tuning by Mike Loukides printed by O'Reilly & Associates.
Greetings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         V            |   Armin Weber              | Internet:
 +---------------+    |   PS/ETD4                  | aw@bk.sel.de
 | A L C A T E L |    |   Motorstr. 55             | DECNet:
 +---------------+    |   70499 Stuttgart          | 61.137::aw
        SEL           |   Germany                  | Phone:
                      |                            | (49) 711 869 6037
                      |   Development Computer     | Fax: 
                      |   Center                   | (49) 711 869 6049
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
----- NEXT -----
From: eckhard@ts.go.dlr.de (Eckhard Rueggeberg)
Our main file server (and gateway between two subnets) has le interrupts
of 95+23 = 118 and esp of 23+8 = 31 on its two boards (and a grand total
of 242). So your numbers seem reasonable and normal...
Eckhard Rueggeberg
eckhard@ts.go.dlr.de
 
----- THE ORIGINAL MESSAGE FOLLOWS -----
> From sun-managers-relay@ra.mcs.anl.gov Wed Nov 24 11:29:26 1993
> Sender: sun-managers-relay@ra.mcs.anl.gov
> To: sun-managers@eecs.nwu.edu
> Cc: root@stolaf.edu
> Subject: interpreting "vmstat -i" output
> Date: Mon, 22 Nov 1993 14:00:10 -0600
> From: Craig D Rice <cdr@stolaf.edu>
> Reply-To: Craig D Rice <cdr@stolaf.edu>
> Followup-To: junk
> Content-Length: 1835
> 
> 
> Greetings, Sun Managers,
> 
> I am wondering if y'all know of any guidelines for interpreting
> "vmstat -i" output.  For example, are there any rules of thumb for
> interpreting when a machine is being bogged down by le or scsi
> interrupts?  A scouring of sun-managers-summary.src reveals nothing
> helpful...
> 
> For example, our mailhost handles all mail traffic (/var/spool/mail is
> mounted on all UNIX clients, and all SMTP traffic is routed through
> this machine's sendmail).  scsi interrupts (esp) *seems* low, but
> network interrupts (le) *seem* high.
> 
> 
> MAILHOST.STOLAF.EDU
> 
> interrupt      total      rate
> ----------------------------------- autovectored interrupts
> esp          55858529       22
> fd                  0        0
> audio               0        0
> le           241090870       96
> zs              10862        0
> clock        248704699      100
> ----------------------------------- vectored interrupts
> -----------------------------------
> Total        545664960      219
> 
> 
> Further, our news machine gives the following stats, that might
> indicate that it's trying to handle a lot of (too many?) scsi (esp,
> ptscII) interrupts:
> 
> NEWS.STOLAF.EDU
> 
> interrupt      total      rate
> ----------------------------------- autovectored interrupts
> esp          159148834       64
> ptscII       126623395       50
> fd                  0        0
> audio               0        0
> le           54686899       21
> zs               3178        0
> clock        248677216      100
> ----------------------------------- vectored interrupts
> -----------------------------------
> Total        589139522      236
> 
> 
> Thanks for your input. Will summarize....
> Craig
> --
> Craig D. Rice		UNIX Systems Specialist/Network Analyst
> cdr@stolaf.edu		Academic Computing Center, St. Olaf College
> +1 507 646-3631		1510 St. Olaf Avenue
> +1 507 646-3549 FAX	Northfield, MN  55057-1097   USA
>
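
For anyone who wants to compare devices rather than eyeball the table,
the following Python sketch parses the MAILHOST listing from the
original message and prints each source's share of the total. It
assumes that the "rate" column is an average since boot and that the
clock interrupts at a steady 100 per second, so that the clock total
divided by its rate gives an approximate uptime; neither assumption is
confirmed anywhere in the thread. The function name is hypothetical.

```python
SAMPLE = """\
interrupt      total      rate
----------------------------------- autovectored interrupts
esp          55858529       22
fd                  0        0
audio               0        0
le           241090870       96
zs              10862        0
clock        248704699      100
----------------------------------- vectored interrupts
-----------------------------------
Total        545664960      219
"""


def parse_vmstat_i(text):
    """Map each interrupt source to a (total, rate) tuple.

    Header and separator lines are skipped because their second and
    third fields are not numeric.
    """
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            rows[parts[0]] = (int(parts[1]), int(parts[2]))
    return rows


rows = parse_vmstat_i(SAMPLE)
grand_total, grand_rate = rows.pop("Total")

# Assumption: clock total / clock rate approximates seconds since boot.
clock_total, clock_rate = rows["clock"]
uptime_days = clock_total / clock_rate / 86400

# List the sources from busiest to quietest.
for name, (count, rate) in sorted(rows.items(), key=lambda kv: -kv[1][0]):
    share = 100 * count / grand_total
    print(f"{name:8s} {rate:5d}/s {share:5.1f}%")
print(f"implied uptime: {uptime_days:.1f} days")
```

Read this way, the clock and le account for most of the interrupts on
the mailhost, and the per-device totals add up to the "Total" line.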

This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:08:30 CDT