SUMMARY: Distributed Processing

From: Dr. Dave Checketts (checkedg@eee.bham.ac.uk)
Date: Fri May 06 1994 - 03:06:04 CDT


I sent this Summary last week. It did not appear so here it is again.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Thank you everyone who helped out on this one. It would appear that there is
a considerable amount of software which exists to allow me to do what I want.

I have not been able to look at all of the suggested possibilities yet but,
based on the distribution of replies which arrived, we will probably take
a close look at 'condor' and 'pvm' from the PD world.

One company, Myrias (Alberta, Canada) also supplied me with documentation
within a few days about their product which looks quite promising.

In total, I received 30 answers. I have listed them below as a summary
and then appended the full responses at the end for those who may be
interested. The number following the product indicates the mentions given
in the responses.

Public Domain
-------------

Condor (6)      Extra library to be compiled in. Migrates jobs to a number
                of available machines. Will not use machines whose load
                exceeds some preset limit.

Net Linda (3)   Provides 6 calls to add net parallelism to C programs.

PVM (7)         Allows a collection of heterogeneous computers to be used
                as a coherent and flexible concurrent computational resource.
                Programs written in C, C++ or Fortran access PVM through
                library routines.

Parallaxis (1)
lb (1)
ISIS (1)
DQS (2)
Hence (1)
Express (3)

Commercial
----------

ConnectQ (1)
LSF (1)
Load Balancer (4)
Netmake (3)
NetShare (1)
NQS (1)
PAMS(Myrias) (1)

There is apparently also an article in the April 1994 edition of Open
Computing on this topic (guess who has not received one this month?)

Responses came from the following....thank you all

tommy@big.att.com
pjw@ccci.com
@nsfnet-relay.ac.uk:hasley@andy.bgsu.edu
ems@ccrl.nj.nec.com
@nsfnet-relay.ac.uk:jdd@db.toronto.edu
cecilp@vancouver.cantel.rogers.com
peter.allan@aea.orgn.uk
doug@perry.berkeley.edu
poffen@San-Jose.ate.slb.com
gta!paul@uunet.UU.NET
bshaw@bobasun.spdc.ti.com
sommer@vsun02.ag01.Kodak.com
bern@penthesilea.Uni-Trier.DE
shandelm@jpmorgan.com
fgreco@lehman.com
tkevans@eplrx7.es.duPont.com
feldt@phyast.nhn.uoknor.edu
jkays@msc.edu
gautam@salvador.speech.lsu.edu
@nsfnet-relay.ac.uk:tate_j@deboom.portal.com
stern@sunrise.East.Sun.COM
Ian_MacPhedran@engr.usask.ca
vasey@issi.com
jimm@csdc02.orl.mmc.com
@kakwa.ucs.ualberta.ca:myrias.ab.ca!wtk@myrias
nitkin@ptdcs2.intel.com
fetrow@biostat.washington.edu
rminnich@super.org
zika@trinity.tamu.edu

Thanks Again

Dave

***************************************************************************
Dr. Dave Checketts | JANET: d.g.checketts@uk.ac.bham
Computer Officer | INTERNET: checkedg@eee.bham.ac.uk
School of Elec. & Elec. Eng., |
University of Birmingham | Telephone: 021 414 4322
Birmingham, B15 2TT, | Fax: 021 414 4291
England
***************************************************************************

************************************************************************

COMPLETE LISTS OF RESPONSES FOLLOW.

*************************************

>From tommy@big.att.com Fri Apr 22 13:51:32 1994

Look on the net for a package called condor. You compile your program
with the condor library. When you run your program, it migrates across
your list of available machines. When the load average of the machine
reaches a certain level -- I think 0.13 -- it assumes that the owner
wants to use it and the process backs off and moves away. Owners of
workstations, therefore, have no reason to object to other people
running programs on their workstations this way. You can do this in
the day or night for obvious reasons.
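
A minimal sketch of the back-off rule described above, not Condor's actual
code. The 0.13 limit is only the figure recalled in this reply, and the
function names and host names are hypothetical.

```python
import os

# Figure recalled (with a hedge) in the reply above; not a verified Condor default.
LOAD_LIMIT = 0.13


def should_back_off(load_average, limit=LOAD_LIMIT):
    """Return True when the host's load exceeds the limit, so a guest job should leave."""
    return load_average > limit


def pick_idle_hosts(loads, limit=LOAD_LIMIT):
    """Given {host: load average}, return the hosts under the limit, least loaded first."""
    idle = [host for host, load in loads.items() if not should_back_off(load, limit)]
    return sorted(idle, key=lambda host: loads[host])


if __name__ == "__main__":
    one_minute, _, _ = os.getloadavg()  # Unix only
    print("local 1-minute load:", one_minute)
    print("back off:", should_back_off(one_minute))
```

A scheduler built on this idea would poll each machine's load periodically
and move work away from any host where should_back_off() becomes true.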

>From pjw@ccci.com Fri Apr 22 14:12:18 1994

Shell scripts are the first simple way you can parallel process
on idle Suns. Use cmd line args to drive multiple instances
of a program, NFS to collect results in some file system.
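
The script approach can be sketched as follows: split the work into one
chunk per machine and build one remote command line per host, each writing
to a shared NFS directory. The host names, program path, result directory
and the use of rsh are all assumptions made for illustration.

```python
import shlex


def split_range(total, parts):
    """Split range(total) into `parts` contiguous (start, stop) chunks of near-equal size."""
    base, extra = divmod(total, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def build_commands(hosts, program, total, result_dir):
    """Return one remote-shell command line per host, each covering its own chunk."""
    commands = []
    for host, (start, stop) in zip(hosts, split_range(total, len(hosts))):
        # The program receives its chunk as command line arguments and
        # writes its output to a per-host file on the shared file system.
        remote = "{} {} {} > {}/{}.out".format(
            shlex.quote(program), start, stop, shlex.quote(result_dir), host)
        commands.append("rsh {} {}".format(host, shlex.quote(remote)))
    return commands


if __name__ == "__main__":
    hosts = ["sun1", "sun2", "sun3"]  # hypothetical idle workstations
    for line in build_commands(hosts, "/nfs/bin/simulate", 10, "/nfs/results"):
        print(line)
```

The sketch only prints the command lines; running them in the background
and concatenating the .out files afterwards is left to the calling script.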

Net Linda (offshoot of Yale research) is a set of 6 calls
that add net parallelism to C programs, provided you don't
need to share a huge amount of data. Sample ray tracing
etc code is available. Fairly cheap to universities.
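
The reply does not name the six calls. Assuming they are the classic Linda
operations (out, in, rd, eval, inp, rdp), the idea of a shared tuple space
can be sketched with a toy, single-process version. This is not the Network
Linda API; the class, the None wildcard and the in_ spelling (in is a Python
keyword) are illustrative choices.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def _find(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                    p is None or p == v for p, v in zip(pattern, tup)):
                return tup
        return None

    def out(self, *tup):
        """Deposit a tuple and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def rdp(self, *pattern):
        """Non-blocking read: return a matching tuple or None."""
        with self._cond:
            return self._find(pattern)

    def inp(self, *pattern):
        """Non-blocking take: remove and return a matching tuple, or None."""
        with self._cond:
            tup = self._find(pattern)
            if tup is not None:
                self._tuples.remove(tup)
            return tup

    def rd(self, *pattern):
        """Blocking read: wait until a matching tuple exists."""
        with self._cond:
            while True:
                tup = self._find(pattern)
                if tup is not None:
                    return tup
                self._cond.wait()

    def in_(self, *pattern):
        """Blocking take: wait for a matching tuple, then remove it."""
        with self._cond:
            while True:
                tup = self._find(pattern)
                if tup is not None:
                    self._tuples.remove(tup)
                    return tup
                self._cond.wait()

    def eval(self, tag, func, *args):
        """Run func(*args) in a new thread and deposit (tag, result)."""
        worker = threading.Thread(target=lambda: self.out(tag, func(*args)))
        worker.start()
        return worker
```

A master would out() task tuples, workers would in_() a task and out() a
result, and the master would in_() the results; in a networked Linda the
tuple space is shared between machines rather than between threads.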

----------------------------------------------------------------------
  Dr. Peter J. Welcher EMAIL: pjw@ccci.com
  Chesapeake Computer Consultants, Inc. PHONE: (410) 266-5686
  2816 Southaven Drive or: (410) 573-1751
  Annapolis, MD 21401 FAX: (410) 573-1751
----------------------------------------------------------------------

>From @nsfnet-relay.ac.uk:hasley@andy.bgsu.edu Fri Apr 22 14:16:04 1994

There are a number of programs out there that can handle such
work. I looked at Condor from the U. of Wisconsin a while back,
but it got pushed away by official work before I could finish
the installation. It looked like a nice package, but I can't
give an adequate review.

(And Archie isn't responding so I can't tell you where it is...
poke, poke ... Try 'ftp.cs.wisc.edu', directory "condor".)

John Hasley

>From ems@ccrl.nj.nec.com Fri Apr 22 14:30:05 1994

One commercial package we looked into recently here is ConnectQ from
Sterling Software. It supports multiple architectures. We heard of it through
Silicon Graphics. Contact Mark Maxwell (PST) at 800-455-9273 or 415-390-3523

Ed Strong

>From @nsfnet-relay.ac.uk:jdd@db.toronto.edu Fri Apr 22 20:05:42 1994

Call Platform Computing at 416-978-0458, or call your local DEC or Convex
salesperson. Ask for LSF.

Regards,

John

--
John DiMarco                                              jdd@cdf.toronto.edu
Computing Disciplines Facility Systems Manager            jdd@cdf.utoronto.ca
University of Toronto                                     EA201B,(416)978-1928

***************************************************************************

>From cecilp@vancouver.cantel.rogers.com Sat Apr 23 16:44:02 1994

There is a product called

Load Balancer

by Unison Tymlabs.

in Texas U.S.A.

Tel: (512) 478-0611 Fax: (512) 479-0735

I am not their salesman and have no relationship with them.

Regards

Cecil

>From peter.allan@aea.orgn.uk Sat Apr 23 19:16:12 1994

There is such a thing but I don't know what it is. (The guy who half-told me about it was being unhelpful.)

Please post the summary to me, when solved.

Thanks.

--

Peter Allan
AEA Technology
Email : peter.allan@aea.orgn.uk
Phone : (44) 925 252684
Fax   : (44) 925 252390

>From doug@perry.berkeley.edu Sat Apr 23 20:27:00 1994

Check out PVM (Parallel Virtual Machine), available from: netlib2.cs.utk.edu

- Doug Neuhauser, doug@perry.berkeley.edu, 510-642-0931

>From poffen@San-Jose.ate.slb.com Sat Apr 23 23:08:03 1994

Depends on what the job entails. If it is "makes" of software, where there are multiple modules to compile in a single directory, a product called "netmake" can spawn the compiles on remote machines to do parallel builds.

The company name is called "Aggregate Computing". I don't have any other details offhand.

Russ Poffenberger               DOMAIN: poffen@San-Jose.ate.slb.com
Schlumberger Technologies ATE   UUCP:   {uunet,decwrl,amdahl}!sjsca4!poffen
1601 Technology Drive           CIS:    72401,276
San Jose, Ca. 95110             Voice: (408)437-5254   FAX: (408)437-5246

>From gta!paul@uunet.UU.NET Sun Apr 24 06:24:09 1994

Dave,

In regard to your question about distributed processing on a SPARC, there are
two solutions I know of:

1. PVM - A public domain system developed at Oak Ridge National Labs
2. Express from Parasoft Corp.

We will be using Express on a system we are currently configuring. Please
contact Arthur Hicken at Parasoft Corp. for more information. They have an
ftp server at parasoft.com that has many documents about the Express product.
If you contact Arthur, please tell him that I sent you.

Arthur Hicken
Parasoft Corporation
818-792-9941
ahicken@parasoft.com

Paul
-----
Paul Emerson          | Global Technology Associates, Inc.
President             | 7198 Harbor Heights Circle
Email: paul@gta.com   | Orlando, FL 32835
CIS: 72355,171        | Tel 407-296-3636 FAX 407-295-1954

<too much to leave in this summary about Express so I took the liberty of deleting it>

>From bshaw@bobasun.spdc.ti.com Sun Apr 24 16:32:22 1994

Hi Dave,

I've evaluated CONDOR, which is *VERY* good; the only disadvantage is that
you need the .o files of your application to relink. Excellent checkpointing
capability.

I'm presently looking at taskbroker.

Load Balancer also sounds interesting, but I have not played with it.

PLEASE SUMMARIZE !!!!!

Thanks
Bob

LAWYERS do it on a trial basis.
Bob Shaw
Texas Instruments Inc.
13536 North Central Expressway, Mail Station 461
Dallas, Tx 75243
bshaw@spdc.ti.com
------------------------------------

>From sommer@vsun02.ag01.Kodak.com Mon Apr 25 07:10:48 1994

Dave,

I would think you would even need a special language to do that. Using this
language, your program must be structured so that it can be divided into
different threads, where each thread runs on one machine. I've heard of a
program calculating fractals on different workstations.

language:       Parallaxis
package:        parallaxis
version:        2.0
parts:          ?, simulator, x-based profiler
author:         ?
how to get:     ftp pub/parallaxis from ftp.informatik.uni-stuttgart.de
description:    Parallaxis is a procedural programming language based on
                Modula-2, but extended for data parallel (SIMD) programming.
                The main approach for machine independent parallel
                programming is to include a description of the virtual
                parallel machine with each parallel algorithm.
ports:          MP-1, CM-2, Sun-3, Sun-4, DECstation, HP 700, RS/6000
contact:        ? Thomas Braunl <braunl@informatik.uni-stuttgart.de> ?
updated:        1992/10/23

This was taken from:

Return-Path: <avmech@clpd.kodak.com>
Xref: clpd.kodak.com comp.compilers:2161 comp.lang.misc:3496 comp.archives.admin:386 news.answers:7713
Newsgroups: comp.compilers,comp.lang.misc,comp.archives.admin,news.answers,comp.answers
Path: clpd.kodak.com!kodak!newsserver.pixel.kodak.com!psinntp!psinntp!uunet!world!iecc!compilers-sender
>From: David Muir Sharnoff <muir@idiom.berkeley.ca.us>
Subject: Catalog of compilers, interpreters, and other language tools [p2of3]
Followup-To: comp.archives.admin
Summary: Monthly posting of free language tools that include source code
Keywords: tools, FTP, administrivia
Sender: compilers-sender@iecc.cambridge.ma.us
Supersedes: <free2-May-93@comp.compilers>
Reply-To: muir@idiom.berkeley.ca.us
Organization: University of California, Berkeley
References: <free1-Jun-93@comp.compilers>
Date: Tue, 1 Jun 1993 11:00:35 GMT
Approved: compilers@iecc.cambridge.ma.us
Expires: Thu, 1 Jul 1993 23:59:00 GMT

If you cannot get hold of the complete list, I can send it to you as well (compressed with gzip). Just let me know.

Regards
Tilman

----------------------------------------------------------------------------
Tilman C. Sommer
OI-P+E, Software and Integration Group (SIG)
Kodak AG, Breitwiesen, D-73347 Muehlhausen/Gruibingen, Germany
Mailcode: 5023, Building 213/OG     KMX : 631-2986
Phone: ++49 (7335) 12-7677          Fax : ++49 (7335) 12-7766
KNET Phone/Fax: 631-7677/7766       PROFS: (974111) LOCKOVM1
Internet: sommer@vsun02.ag01.kodak.com
----------------------------------------------------------------------------

>From bern@penthesilea.Uni-Trier.DE Mon Apr 25 12:49:55 1994

There's NQS out there; we use it to distribute big batches of jobs across our
cluster, and it would support distributed printing, too. However, there's
nothing for having the jobs communicate.

Regards,
J. Bern
--
J. Bern
EMail: bern@[TI.]Uni-Trier.DE     ham: DD0KZ
X.400: <---- temporarily disabled ---->
P. O. Box 1203, 54202 Trier, Germany
More Infos on me from the X.500 Directory; Pub Keys via finger

>From shandelm@jpmorgan.com Mon Apr 25 13:08:14 1994

PVM for the Sun (written by Convex?) or Aggregate Computing Inc or Network Linda.

-- joel

>From fgreco@lehman.com Mon Apr 25 14:27:14 1994

Contact Aggregate Computing (US) for a commercial solution or Dikran Kassabian (deke@ee.rochester.edu) for "lb" a public domain solution. pvm or ISIS might also fit your needs.

Frank G.

>From tkevans@eplrx7.es.duPont.com Mon Apr 25 14:47:19 1994

Check out DQS, available via anonymous ftp from 'ftp.scri.fsu.edu' in /pub/DQS.

There's a good tutorial to DQS in the April Issue of _Unix World's Open Computing_.

>From feldt@phyast.nhn.uoknor.edu Mon Apr 25 15:33:09 1994

Dave,

Try DQS (Distributed Queueing System). ftp from:

ftp.psc.edu:pub/dqs/DQS-2.1.tar.Z

Good luck!

Andy Feldt
System Support Programmer
Department of Physics and Astronomy
The University of Oklahoma

>From jkays@msc.edu Mon Apr 25 15:40:32 1994

Dave - You should check into PVM, which is public domain. I have attached the README from the most current release, PVM 3.3. You can get the source from ORNL. I don't know the exact hostname, but it shouldn't be too hard to find. Good luck!

jeff

--

Jeff Kays                      Minnesota Supercomputer Center
E-Mail: jkays@msc.edu          1200 Washington Avenue South
Phone: (612) 337-3422          Minneapolis, Minnesota 55415
Fax: (612) 337-3400

"May fortune favor the foolish"

>From gautam@salvador.speech.lsu.edu Mon Apr 25 15:59:39 1994

Hi,

This site "cs.dal.ca" has a complete directory of info on distributed
processing. I think it is /pub/distributedProcessing.... You could also take
a look at arjuna....... I hope this helps.

gautam pardhy gautam@salvador.speech.lsu.edu

>From @nsfnet-relay.ac.uk:tate_j@deboom.portal.com Mon Apr 25 16:20:27 1994

Talk to aggregate computing about NetShare

>From stern@sunrise.East.Sun.COM Mon Apr 25 16:38:15 1994

check out:

Unison-Tymlabs (Load Balancer) [NOTE: sold by freedman-sharp]
675 Almanor Avenue
Sunnyvale, CA 94086
408 245 3000

Aggregate Computing (NetShare/NetMake)
300 South Highway 169 Suite 400
Minneapolis, MN 55426
800 966 1666
info@aggregate.com

network Linda (C-Linda), PVM, Express, and Hence, of which the last three are available for ftp.

load balancer and netshare are commercial products, understand multiple os/machine types, and are pretty rugged. pvm/express/hence are publicly available, and make your network look like a big compute cluster (with ethernet-type latency between nodes).

--hal

>From Ian_MacPhedran@engr.usask.ca Mon Apr 25 16:39:48 1994

The "Condor" package may be what you want - it is available from ftp.cs.wisc.edu via anonymous ftp. Yes, the application would have to be linked against special libraries.

Check out the latest UNIX World (now called Open something-or-other) for an article on distributed queuing systems.

Ian.
----------------------------------------------------------------------------
Ian MacPhedran, Engineering Computer Centre, University of Saskatchewan.
2B13 Engineering Building, U. of S. Campus, Saskatoon, Sask., CANADA S7N 0W0
macphed@dvinci.USask.CA   (306) 966-4832   Ian_MacPhedran@engr.USask.CA

>From vasey@issi.com Mon Apr 25 18:17:06 1994

Your request sounds almost verbatim like the script from a series of demos we
used to run at the MCC Experimental Systems Lab about 3 years ago, where I was
employed on a large-scale O-O parallel processing project. Using a small
network operating system that ran on top of SunOS (and other UNIXes), we
distributed large problems over several dozen workstations and cranked out
supercomputer-scale results quite handily.

I don't know what has happened to that particular project recently, but it did have some government support, as well as Motorola, Boeing, and several universities participating when I left. I suggest you contact the Director, Rob Smith (rob@mcc.com) for further information.

++ Ron Vasey    International Software Systems Inc.    Vox: 512+338-5724
@issi.com       9430 Research #250, Austin TX 78759    Fax: 512+338-5757

>From jimm@csdc02.orl.mmc.com Mon Apr 25 19:29:55 1994

Look at the April 1994 'Open Computing' magazine. On pages 97 - 100, they
present a number of different 'batching' or 'queuing' programs available both
in the public domain and commercially.

--------------------------------------------------------------------------------
James R. Miller  --  jimm@csdc02.orl.mmc.com  --  System Administrator
Martin Marietta Corporation -- Information Systems -- Orlando, Florida
Voice: (407) 826-1348  --  Fax: (407) 356-8944
--------------------------------------------------------------------------------

>From @kakwa.ucs.ualberta.ca:myrias.ab.ca!wtk@myrias Mon Apr 25 20:43:17 1994

I was forwarded the attached message.

Sun and Myrias have just (a week and a half ago) unveiled a new product
(based on 10 years of R&D done by Myrias in the MPP world) that exactly
matches what you are asking about. I will have some information mailed to
you today - you should get it in a few days.

-Wayne

--------------------------------------------------------
Wayne T. Karpoff, General Manager     wtk@myrias.ab.ca
Myrias Computer Technologies Inc.
8522 Davies Road                      (403) 463-1337 Fax (403) 465-0130
Edmonton Alberta
--------------------------------------------------------

>From nitkin@ptdcs2.intel.com Tue Apr 26 00:29:24 1994

One commercial product I'm aware of is "Load Balancer". You might want to look into it. The company is:

Unison-Tymlabs
675 Almanor Avenue
Sunnyvale, CA 94086
(408) 245-3000

European Headquarters
Harpenden, Herts
44 582 462424

I've not used the product, so I can't make any type of recommendation. Good luck.

--
- Nate Itkin - Portland Technology Development, Intel Corporation
Aloha, Oregon - E-mail: Nate-Itkin@ptdcs2.intel.com

>From fetrow@biostat.washington.edu Tue Apr 26 01:21:10 1994

Ask archie about "condor". Rather clever.

You have to recompile your code, and it has to be Fortran, and all it will do
is distribute jobs among the best (or near-best) SINGLE workstations, and you
need a common network file system, but you get "dump files" for free.

(as in: Stop the job and restart it from where you stopped). This is EXTREMELY
handy. You can bring things down for maintenance and restart them later.
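
The stop-and-restart idea can be illustrated with an application-level
sketch. Condor is described above as checkpointing the job for you once it
is relinked; the code below is a different, hand-rolled technique in which
the program saves its own loop state to a file. The file name, the state
layout and the stop_after parameter are hypothetical.

```python
import os
import pickle
import tempfile


def run_with_checkpoints(n_steps, path, stop_after=None):
    """Sum 0..n_steps-1, saving progress to `path` after every step.

    `stop_after` simulates an interruption after that many steps in this run.
    Returns the final total, or None if the run was interrupted.
    """
    state = {"next_step": 0, "total": 0}
    if os.path.exists(path):
        with open(path, "rb") as f:
            state = pickle.load(f)  # resume from the last saved step

    done_this_run = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated shutdown for maintenance
        state["total"] += state["next_step"]
        state["next_step"] += 1
        done_this_run += 1
        # Write to a temporary file first, then rename, so an interruption
        # during the write never leaves a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp, path)
    return state["total"]


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
    print(run_with_checkpoints(10, ckpt, stop_after=4))  # interrupted run
    print(run_with_checkpoints(10, ckpt))                # resumed run
```

If the checkpoint file lives on the common network file system, the second
run can be started on a different workstation from the first.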

>From rminnich@super.org Tue Apr 26 15:01:29 1994

Yes, condor might do the job for distributed processing at night. It's used at some sites on hundreds to thousands of nodes, but in fact grabs idle cycles 24 hours a day. At any given time at least 66% of the cycles on a network are there for the taking, for periods of 10 or 20 minutes at a time. Condor is good at this because it supports checkpoint/migration.

ron

rminnich@super.org     | It's amazing how much code depends on
(301)-805-7451 or 7312 | 0xdeadbeef not being a valid virtual address.

>From zika@trinity.tamu.edu Thu Apr 28 20:53:27 1994

An excellent PD package is PVM which we use at our site. Here's the README file that came with:

Hope this helps...

--Michael Zika
  Nuclear Engineering
  Texas A&M University
  (zika@trinity.tamu.edu)


