Herdtools Basic Information


Basic Information:
The Herdtools package provides a set of user-level tools with a simplified interface for controlling and running jobs across a cluster of similar machines. We haven't paid much attention to portability, so these scripts may only work on Linux-based clusters! However, they are fairly standard Perl scripts, so they should adapt to many other platforms.

Here is a screenshot showing the webwatch interface. Each machine has a row in the table. The two meters show CPU and memory usage; the remaining columns show user-defined statistics (provided by Rob Brown's procstatd). Of course, the title area, the message of the day, etc., are also customizable. You can even change the meter images if you'd like.

The project is hosted at SourceForge, which provides us with email lists, discussion forums, bug tracking, and download facilities; see the Herdtools SourceForge page for all of these.

The Herdtools are released under the GNU General Public License (GPL). This license gives you the right to use, copy, and distribute original and modified versions of the code, provided that all such modifications also adopt the GPL. We offer no warranty for the Herdtools programs; they should be considered "AS IS", without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for any particular purpose.

Ok, now that the legal stuff is out of the way...


Basic Configuration:
If you grabbed this code via an RPM package, then all the pieces should automatically go into the correct locations. If you just have source code (Perl code), then you should put the programs into an easily accessible directory, perhaps /usr/bin or /usr/local/bin. Some of the tools are probably best described as "sys admin" tools since they may have either security or CPU-load implications (i.e., having many users run them could overload the cluster); these tools should probably be put in something like /usr/sbin or /usr/local/sbin.

Some of the tools assume that each machine in the cluster has Rob Brown's procstatd daemon running on it. This daemon serves out basic information about the current status of the machine. We actually use a modified version of procstatd which returns additional information for a parallel ps-like utility. It is probably best to have procstatd run at start-up, although manually running the daemon will work fine too.

Some of the tools also use the webwatchd daemon to monitor the cluster. The webwatchd daemon contacts a set of remote procstatd daemons and routinely asks them for information. This information is then published as a web page (and as a raw-data page) that users can refer to. The idea here is that having dozens of users repeatedly ping every machine in the cluster for information may increase the network load dramatically. Instead, all users can view the single webwatchd-created page, reducing overall network traffic. It is probably best to have webwatchd run at start-up, although running it manually will work fine too.

The Herdtools generally all look to the same configuration file for basic information, called herdtools_site.pm. It is actually a Perl module file, so it is generally best to put it into /usr/lib/perl5/__version__ (there may be a site-specific subdirectory). In this file, you'll find a number of options for the configuration of the system itself as well as configuration for various helper programs.

Since it is a Perl module, you'll need to know just a hint of Perl syntax, most of which you can pick up from what is shown in the default configuration. Most configuration variables start with a dollar sign followed by the variable name; this is just basic Perl syntax for a scalar variable. Character strings can be enclosed with either double or single quotes. All lines end with a semicolon. The %allhostinfo variable is a "hash table" and has its own syntax, explained below.



Host Configuration:
The set of hosts in the cluster is defined in a Perl hash table called "%allhostinfo". The percent sign indicates to Perl that it is a hash; the rest of the syntax is plain Perl (although it may be a bit cryptic to non-Perl programmers). The basic syntax for this hash table looks like:

%allhostinfo = (
  "cow1" => [ "cow","mem1024","f90","gcc" ],
  "cow2" => [ "cow","mem1024","f90","gcc" ],
  "cow3" => [ "cow","mem1024","myprog" ],
  "cow4" => [ "cow","mem1024" ],
  "calf1" => [ "calf","mem512","f90","gcc" ],
  "calf2" => [ "calf","mem512","f90","gcc" ],
  "calf3" => [ "calf","mem512","myprog" ],
  "calf4" => [ "calf","mem512" ]
);
The "%allhostinfo" is the name of the hash variable. The first quoted string on each line is the name of a machine (cow1, cow2, etc.). The entries between the square brackets are the "tags" assigned to that host. One caveat to note: each host-definition line ends with a comma except the last one, which is followed by the closing parenthesis and semicolon.

Usually, the tags assigned to a machine indicate the type of system it is, what machines it is similar to, what machines it is different from, etc. These can be things like the amount of memory in the machine, licensed compilers, other special software, or other hardware that is unique to that machine. Or they could indicate that certain machines share an ethernet switch or UPS. These tags can later be used to select all hosts that have a Fortran 90 compiler, or all hosts that have large memory spaces. Note that at this point, tags are really just character strings -- so defining "mem1024" is useful to a human reader, but the system cannot determine that "mem1024" is bigger than "mem512". You can remove all of these if you wish, or you can get more and more detailed if you have many software packages that are installed on selected machines. See below for more details on why this might be useful.


Configuration File:
There are several basic system configuration options that you may want to set, some for security reasons, others for more robustness in connecting to remote machines. Note that the top part of the file should not be tampered with -- the "package", "use", "@ISA", and "@EXPORT" lines must be there in order for Perl to import this into the herdtools programs.

$rshcmd="/usr/bin/rsh";
sets the rsh command to use when connecting to remote machines; you may want to make this ssh for more security
$rcpcmd="/usr/bin/rcp";
sets the rcp command to use when copying files to remote machines; you may want to make this scp for more security
$pingcmd="/bin/ping";
sets the ping command to use when attempting to see if other machines are alive or dead
$pingtimeout=5;
sets the timeout for all ping commands (in seconds); if no response is obtained within this period, the machine is assumed dead
$timeout=5;
sets the timeout for all procstatd commands (in seconds); if no response is obtained within this period, the procstatd daemon on the remote machine is assumed dead
$retries=5;
sets the number of times to retry any remote connection; not used in many (any?) functions... yet!
$port=7885;
sets the port number to connect to for the remote procstatd daemons
$numchildren=16;
sets the default number of child processes to spawn when running multiple remote connections; setting this higher may provide better performance, setting it too high may degrade performance as it will increase the load on the local machine
$webhost="bull";
sets the location of the webwatchd daemon, if it is to be used; could (should?) be a FQDN; is really only used by the cl_prodweb command to force an update of the web page info... other herdtools use...
$webpageurl="http://cluster.ee.duke.edu";
sets the URL for the remote cluster information provided by webwatchd
$wgetcmd="/usr/bin/wget";
sets the command to be used when getting the remote webwatchd information; used by cl_motd and cl_stat
Note that the last line of the program should be "1;" since this is required for Perl modules (this forces the module to return a "true" value to Perl, indicating that it loaded without errors).


Commandline Options to all Herdtools Commands:
To make matters easier, we have included a set of basic options that all Herdtools commands will parse. These include mechanisms to select groups of machines out of the entire cluster.

Basic options:

-A
use all machines in the cluster
-F hostfile
read in a text host-file; one host per line
-N name
use all machines named name*; useful for hitting all machines called beowulf1 through beowulf16 at once
-n num_list
used in conjunction with -N above, this selects specific machine numbers in the named cluster; num_list is a comma separated list which is appended to the name given by -N; you can also use a colon to specify a range ("1:4,7:10" will give a list of machines: 1, 2, 3, 4, 7, 8, 9, 10)
-m machine_list
use the listed machines; machine_list is a comma separated list of machines
-- user_arg_list
used at the end of a command line to separate out the to-be-executed command's arguments from the cl_* command's arguments; any options after the -- will not be parsed by the cl_* commands
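To make the -N/-n expansion concrete, here is a sketch (in Python, purely for illustration -- the actual tools are Perl) of how a num_list such as "1:4,7:10" expands into host names when combined with a -N name:

```python
def expand_num_list(name, num_list):
    """Expand a comma-separated num_list (with optional colon ranges)
    into host names by appending each number to the -N name."""
    hosts = []
    for part in num_list.split(","):
        if ":" in part:
            lo, hi = part.split(":")
            hosts.extend(f"{name}{n}" for n in range(int(lo), int(hi) + 1))
        else:
            hosts.append(f"{name}{part}")
    return hosts

print(expand_num_list("cow", "1:4,7:10"))
# -> ['cow1', 'cow2', 'cow3', 'cow4', 'cow7', 'cow8', 'cow9', 'cow10']
```

So "-N cow -n 1:4,7:10" selects machines cow1 through cow4 and cow7 through cow10.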

Advanced options:

-i tag
include all machines which have the indicated tag
-e tag
exclude all machines which have the indicated tag; generally used with -A above
-c num
uses an alternate number of child processes to speed up the work on large host lists. The default comes from the $numchildren setting in herdtools_site.pm. Note that too many children can swamp the local machine.


Using Host Tags:
Host tags are simply an attempt to make life easier, to allow quick access to common sets of machines. If several machines have Fortran compilers and the rest do not, then it can provide a quick means to running compilation jobs on the compilation-capable machines. Similarly, you may have some machines with more memory than others. Rather than remembering what hosts those are, you can simply select all "mem1024" machines when you need to run a large memory job. If a UPS system fails, you may want to quickly shut down all machines that have tag "ups1". And of course, if software has only been licensed for N machines, you can quickly request that your job be run on one of those machines by selecting the appropriate tag.

When you use the "-i tag" option, your hostlist will contain those machines that are marked with the given tag (think "include machines with...").

When you use the "-e tag" option, your hostlist will be re-scanned and those machines that are marked with the given tag will be removed (think "exclude machines with..."). Note that since this removes hosts from an assumed previously defined hostlist, you may want to use "-e" with the "-A" option to start with ALL hosts, then whittle down the list.

When using multiple "-i" or "-e" options, we tried to program for the usual case. For multiple "-i" options, hosts are selected only if they match ALL "include" tags. For multiple "-e" options, hosts are removed if they match ANY "exclude" tag. So if you have a set of "big memory" machines that you need to replace a compiler library on, you can use "-i fatmem -i f90" -- this will include only those machines which have lots of memory and have the Fortran compiler installed. Since Matlab tends to hog CPU resources, you may want to select machines which have lots of memory but do not have Matlab installed -- "-i fatmem -e matlab".
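The selection rules above (ALL include tags must match, ANY exclude tag removes a host) can be sketched as follows -- in Python, purely for illustration, using the example %allhostinfo definition from earlier (the actual implementation is Perl):

```python
# Mirror of the example %allhostinfo hash from the Host Configuration section.
allhostinfo = {
    "cow1":  ["cow", "mem1024", "f90", "gcc"],
    "cow2":  ["cow", "mem1024", "f90", "gcc"],
    "cow3":  ["cow", "mem1024", "myprog"],
    "cow4":  ["cow", "mem1024"],
    "calf1": ["calf", "mem512", "f90", "gcc"],
    "calf2": ["calf", "mem512", "f90", "gcc"],
    "calf3": ["calf", "mem512", "myprog"],
    "calf4": ["calf", "mem512"],
}

def select_hosts(include, exclude):
    """Keep a host only if it carries ALL include tags and NONE of the
    exclude tags (the multiple -i / -e semantics described above)."""
    return [host for host, tags in allhostinfo.items()
            if all(t in tags for t in include)
            and not any(t in tags for t in exclude)]

print(select_hosts(["f90"], []))   # -> ['cow1', 'cow2', 'calf1', 'calf2']
print(select_hosts([], ["f90"]))   # -> ['cow3', 'cow4', 'calf3', 'calf4']
```

Note that tag matching is plain string equality, which is why "mem1024" and "mem512" are unrelated as far as the tools are concerned.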


Examples:

cl_ps -N cow
run the cl_ps command and collect data from all machines named cow*
cl_ls -N calf -n 2:4 /scratch/jim
run the cl_ls command and show the directory contents of user Jim's scratch directory for machines calf2, calf3, and calf4
cl_kill -N cow -n 1,2,4 foobar
send a kill signal (signal #9) to all programs named "foobar" on machines cow1, cow2, and cow4
cl_exec -m cow1,calf1 rm /tmp/foobar/*
remove the contents of the indicated directory on two machines: cow1 and calf1
cl_exec -i f90 "uname -a"
run the cl_exec command and collect data from all machines that have the f90 tag (e.g., all machines with a Fortran 90 compiler); given the above configuration, this would be machines cow1, cow2, calf1, and calf2
cl_hosts -A -e f90
show all hosts that do not have the f90 tag; given the above configuration, this would be machines cow3, cow4, calf3 and calf4
cl_exec -F myhosts.txt myprog -- -N foo -m bar
runs myprog on the hosts found in the text file myhosts.txt, sending "-N foo" and "-m bar" to myprog as command line arguments (this is important since -N and -m would otherwise be interpreted by cl_exec); note that given the above definitions and host tags, only cow3 and calf3 have the "myprog" tag, so the program may crash on any other machines

Parallel make:
We have included a command called "cl_run" which can be used to craft a parallel make. Instead of using the "gcc" command to compile, you tell make to use cl_run (and then tell cl_run to execute the gcc command).

     % make -j 8 CC="cl_run -- gcc"
The "-j 8" option tells make to use at most 8 child processes in parallel (you can set this higher, if you have more compilation hosts). The "CC=..." part tells make what command to run for the C compiler. Note that we are assuming here that gcc is installed everywhere (since it is free), thus no tags are needed to select a compilation host. Also note that the "--" argument is sent to cl_run to separate out cl_run-specific arguments from the user program (compiler) arguments.
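The document does not spell out how cl_run picks a host for each compile job, so the following is a purely hypothetical sketch (in Python, for illustration -- the real tool is Perl) of one way such a dispatcher could work: round-robin over the available hosts, wrapping each user command in the configured remote-shell command:

```python
import itertools

# Hypothetical list of compilation hosts; cl_run would derive this from
# %allhostinfo and any -i/-e tag options.
hosts = ["cow1", "cow2", "calf1", "calf2"]
next_host = itertools.cycle(hosts)

def remote_command(user_args, rshcmd="/usr/bin/rsh"):
    """Pick the next host round-robin and build the remote command line,
    mimicking the $rshcmd setting from herdtools_site.pm."""
    host = next(next_host)
    return [rshcmd, host] + list(user_args)

print(remote_command(["gcc", "-c", "foo.c"]))
# -> ['/usr/bin/rsh', 'cow1', 'gcc', '-c', 'foo.c']
```

With make running 8 such jobs in parallel, each compile lands on a different machine, which is the whole point of the parallel make trick.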


Why the name? Well, you have a "server farm" don't you? So these are tools to help tame your "herd"... ha ha? (I know, I know, don't quit the day job)




RCSID $Id: herdtools.html,v 1.4 2002/03/12 14:27:21 jbp4444 Exp jbp4444 $