Beginner's Guide to MPI
(MPICH-v1.0.12)
on the
University of Delaware DEC Alpha Cluster

Dixie Hisley and Lori Pollock
Department of Computer and Information Sciences
University of Delaware
(Revised October 15, 1999)



Introduction

MPI, the Message Passing Interface, is a library and software standard developed by the MPI Forum to draw on the most attractive features of existing message passing systems for parallel programming. Important contributions have come from the IBM T. J. Watson Research Center, Intel's NX/2, Express, nCUBE's Vertex, p4, PARMACS, Zipcode, Chimp, PVM, Chameleon, and PICL.

An MPI process consists of a C or Fortran 77 program which communicates with other MPI processes by calling MPI routines. The MPI routines provide the programmer with a consistent interface across a wide variety of different platforms. The MPI specification is based on a message passing model of computation where processes communicate and coordinate by sending and receiving messages. These messages can be of any size, up to the physical memory limit of the machine. MPI provides a variety of message passing options, offering maximal flexibility in message passing.

MPI is a specification (like C or Fortran) and there are a number of implementations. This guide describes the basic use of the MPICH implementation of MPI. Other implementations include LAM and CHIMP versions of MPI. These implementations are freely available by anonymous ftp:

1.
MPICH - info.mcs.anl.gov in the directory pub/mpi/mpich, file mpich.tar.Z
2.
LAM - tbag.osc.edu in the directory pub/lam
3.
CHIMP - ftp.epcc.ed.ac.uk in the directory pub/chimp/release

The MPICH implementation is a library of several hundred C and Fortran routines that will let you write programs that run in parallel and communicate with each other. Few completely understand all that any MPI implementation offers, but that's okay, because this class will only be using some ten (maybe a few more) routines out of the bunches available.

This guide is designed to give the user a brief overview of some of the basic and important routines of MPICH with emphasis on getting up and running on the DEC Alphas at the University of Delaware. The DEC Alphas include the following machines located on the eecis network: porsche, cobra, corvette, ferrari, jaguar, lamborghini, lotus, maserati, and viper. They can be accessed only through remote login to porsche, the server.

This guide is not meant to be a replacement for the official manual or user's guide. You should follow the various links on the course web page for the manual, user's guide, MPI Standard, and MPI FAQ.

Getting Started With MPI on the Cluster

This section contains the steps necessary to configure your MPI environment, and to compile and run an MPI program.

Editing Files on porsche

There are three different editors available on the Cluster: emacs, vi, and pico. Pick your favorite editor for your course projects.

Logging In & Setting Up Your Environment

It is highly recommended that you use secure shell (ssh) to log in to the Alpha cluster. Using ssh will make life easier by handling certain administrative aspects of your account for you. In particular, if you use ssh, you can skip section 2.2.3 (Setting Up to Display Graphics to Your Local Screen) of this manual.


Logging In

The main reason behind using ssh is to keep your login session private. Ssh creates an encrypted socket from the local machine to the remote machine so that no data (e.g. your user-id, password, etc.) is sent in clear text. Ssh also sets up various environment parameters automatically for you (like your display name), making account set up easier. For more detailed information about ssh, visit http://www.employees.org/~satch/ssh/faq/ssh-faq-2.html. You can also use ssh to log in from home.

To log in to porsche from any machine on UDelNet, type the following at the command prompt:

ssh porsche.cis.udel.edu
The first time you log in from your machine, you will be asked if you're sure that's what you want to do. Respond:
yes
Once logged in, ssh sets up your DISPLAY environment and uses X-Authentication to handle any X-Windows you might create during your session. If you are logged in from a machine not running some variant of X-Windows, you will want to type:
unsetenv DISPLAY
otherwise, you might have problems with windows that try to pop-up on your local machine but can't!

.cshrc/.tcshrc file modifications

The modifications below will allow you to execute mpirun and find other MPI related files, the dxbook browser for online information about the Alpha cluster, and man pages, without specifying their full path location. A sample .(t)cshrc file with these modifications can be copied from ~pollock/public/sample.cshrc on porsche. Do NOT copy my .cshrc file in ~pollock. Alternately, just add the following to your existing .(t)cshrc file:

setenv OPENWINHOME /usr/openwin
set MYHOME = "/usr/local/users/$USER"

#for using the dxbook browser for online information about the alphas
setenv DECW_BOOK "/usr/lib/dxbook /cdrom/bookreader/decw_book"

#MPI STUFF
set path = (/usr/local/mpi/mpe /usr/local/mpi/bin \
              /usr/local/mpi/lib/alpha/ch_p4 $path)
              
# set up MANPATH
#
setenv MANPATH ${MYHOME}/usr/man:/usr/man
# list of other places to look
set manpath = ( /usr/local/man /usr/local/X11R6/man /usr/local/X/man \
        /usr/local/gnu/man /usr/local/lang/man /usr/lang/man \
        /usr/lang/SC1.0/man ${OPENWINHOME}/man /usr/share/catman \
        /usr/local/uci/man /usr/local/pvm/man /usr/local/mpi/man)
# only include if it exists
foreach mandir ( ${manpath} )
  if ( -d ${mandir} ) then
      setenv MANPATH ${MANPATH}:${mandir}
  endif
end

Setting Up to Display Graphics to Your Local Screen

To execute MPI programs that display graphics to your screen (e.g., the Mandelbrot Renderer), you will need to perform the following two steps (Note: if you use ssh to log in, you don't need to do this. Note further: if you're logged in from a machine that is not running X-Windows, you will not be able to display graphics on your machine.):

1.
On your local machine at which you want to display the image from porsche, create a file, called addhosts, that contains the following lines. Make the file addhosts executable by typing
chmod 755 addhosts
at the UNIX prompt. Each time you login to your local machine, and want to display graphics from the cluster, execute the addhosts file by typing
addhosts
at the UNIX prompt on your local machine. The addhosts file adds each of the machines in the cluster to your local machine's list of machines that have permission to display to your local screen. You can copy this file from ~pollock/public/addhosts on porsche.

# sample addhosts file
xhost +porsche.cis.udel.edu 
xhost +cobra.cis.udel.edu 
xhost +corvette.cis.udel.edu 
xhost +ferrari.cis.udel.edu
xhost +jaguar.cis.udel.edu
xhost +lamborghini.cis.udel.edu 
xhost +lotus.cis.udel.edu
xhost +maserati.cis.udel.edu
xhost +viper.cis.udel.edu

2.
Each time you are logged onto porsche and want to run a parallel program that displays to your local screen, you need to type:

setenv DISPLAY localmachinename:0.0

where localmachinename is replaced by the full name of the machine or xterm at which you are sitting (e.g., quadriga.cis.udel.edu). This command only has to be executed once when you login. Afterward, any graphic images generated by the DEC Alpha client machines and sent to porsche will be passed on to your local machine for display.

To find out the name of the xterm or machine you are using, you can, for example, type hostname at the UNIX prompt on your local machine, or examine your local DISPLAY setting with echo $DISPLAY.

Compilation

MPI allows you to have your source code in any directory. For convenience, you should probably put your source files in subdirectories under ~yourusername/mpi. To compile your source code, you should first copy the makefile in Appendix A, available as ~pollock/public/Makefile on porsche, into the directory of your source code. This makefile will compile the sample programs, the hello world program and the Mandelbrot Renderer, and place the executables in the current directory. (It also shows how to compile another program that is NOT given to you. This is just to demonstrate different parts of a Makefile.) To compile your own programs, you will find it easiest to simply change the names in the Makefile.

As you can see from the Makefile, you can compile simple C programs that call MPI routines with:

mpicc -o program_name program_name.c -lm
(where -lm links in the math library)

For simple C programs that use MPE graphics, you can compile with:

mpicc -o program_name program_name.c -lmpe -lX11 -lm

Fortran compilation is performed similarly; exchange mpif77 for mpicc and program_name.f for program_name.c. Type man mpicc or man mpif77 for additional information.

Creating a machinefile for Running Your MPI Program

A machinefile is a file that contains a list of the possible machines on which you want your MPI program to run. This file is useful if one of the Alphas is heavily loaded or is having problems. The particular machine you want to avoid can be commented out of the list of possible machines for selection. For example, jaguar cannot be selected in the sample file below.

# sample machine file
porsche.cis.udel.edu 
cobra.cis.udel.edu
corvette.cis.udel.edu 
ferrari.cis.udel.edu 
# jaguar.cis.udel.edu
lamborghini.cis.udel.edu 
lotus.cis.udel.edu 
maserati.cis.udel.edu 
viper.cis.udel.edu
For convenience, your machinefile should be kept in the same directory as your executable MPI files and named something appropriate like machines. The name of your machinefile will be used as an argument to the mpirun option -machinefile (see the next section).

Running MPI

In order to run an MPI compiled program, you must type:

mpirun -np <number of processors> [mpirun_options] <program name and arguments>

where you specify the number of processors on which you want to run your parallel program, the mpirun options, and your program name and its expected arguments.

Some examples of mpirun's:

     mpirun -np 4 hello
     mpirun -np 6 -machinefile machines hello
     mpirun -np 5 integrate 0 1

Type man mpirun or mpirun -help for a complete list of mpirun options.

Checking and Killing Processes Using spy and shoot

Bugs? - Just don't write buggy programs! - Simple! Of course, it will clearly never happen that a program written in this class would ever have any sort of problems, but, if, for some reason, a program that you write were to crash unexpectedly, there's something to watch out for.

An MPI program that contains parallelism may start simultaneously on all (or at least, many) of the DEC Alpha machines. If one process crashes, and MPI dies, it is quite possible that some of the other processes might continue living - and, cut off from their MPI connection - may just sort of hang around and use up CPU time. This is a great way to lose friends!

In fact, sit back for a while and imagine the Alphas, filled to the brim with students, all of them running their programs together on all the machines. One student's program crashes, leaving nine other copies of his program treading water.

Then a second person's program crashes. And a third.

These people try to fix their bugs, recompile, and run their programs again. The twenty-seven floundering processes from their first attempts are still around.

Some other people's programs crash, adding more dead weight. After a second compile-and-run attempt, the Alphas are host to sixty-three floundering processes, each potentially using up a unit of CPU load.

Inexplicably, the Alphas start to feel sluggish.

Slow, even.

Tempers flare. People start getting out their knives.

Not a good scene!

Soooo, for just such an eventuality, we have provided the commands spy, spyall, and shoot.

When you type spy, spy will start a remote shell on each of the Alphas and issue a ps command that will display the current status of all processes on the Alphas associated with your username. spyall will do the same, but show the status of all processes owned by anyone on the Alphas. shoot will ensure that all your processes (except login shells on porsche) will die.

To access these programs, add these lines somewhere in your .cshrc/.tcshrc:

alias spy	/usa/pollock/public/spy 
alias shoot	/usa/pollock/public/shoot
alias spyall	/usa/pollock/public/spyall

Now when you type spy, spyall, or shoot, the alias will be found, and it will point to my copy of the program, which will then be executed.

It is suggested that whenever you run an MPI program on a large portion of the Alphas, and it crashes unexpectedly in a way that leads you to believe that there may be other, floundering processes left over, you should run spy to check out your suspicions and shoot to find and kill any processes you have hanging around.

It is strongly suggested that you issue a shoot command immediately before logging off porsche to help keep the peace. You should use spyall just before you run a program for performance numbers to be sure that no one else is running a job that will affect your performance numbers. You want to make sure that you are the only one using the cluster when you are doing performance runs.

Printing your Programs and Other Files

You can print directly from porsche onto any of the CIS Department printers by typing lpr -P<printer> <filename> directly from porsche. However, the CIS Department printers are located in rooms typically not accessible by undergraduates, and the main department printer in the CIS Department office (103 Smith) is NOT to be used for printing coursework. So, you should print your files by copying them to strauss and then printing from there to a printer on campus to which you have access. The best way to copy your files is via scp, or you can copy your files by ftp.

Copying Files via scp

Scp utilizes ssh to transmit your files, so you will not be sending your password in clear text for the world to see (as you would with ftp). To copy files from porsche to strauss, type the following while on porsche:
scp <path1><file> <username>@strauss.udel.edu:<path2><file>
where <path1><file> is the local machine path and filename and <path2><file> is the path and filename for the destination machine. When prompted for your password, enter it and press return. Your file will then be copied for you.
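For example, assuming (hypothetically) that your username is jdoe and hello.c is in your current directory on porsche, the following command copies it into your home directory on strauss:

scp hello.c jdoe@strauss.udel.edu:hello.c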

Copying files via ftp

To ftp from porsche to strauss, type the following while on porsche:

ftp strauss

The ftp session goes as follows. Your responses are indicated in all-caps after the prompts; however, you should type them in lowercase.

Connected to strauss.udel.edu.
220 strauss.udel.edu FTP server (Version wu-2.4.2-academ[BETA-15](1) Thu Dec 11 08:49:20 EST 1997) ready.
Name (strauss:pollock):  HIT CARRIAGE RETURN 
Password: TYPE YOUR PASSWORD HERE 
230 User pollock logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> PUT PROGRAM.C 
200 PORT command successful.
150 Opening BINARY mode data connection for program.c.
226 Transfer complete.
219 bytes sent in 0 seconds (0.21 Kbytes/s)
ftp> PUT RESULTS.FILE 
200 PORT command successful.
150 Opening BINARY mode data connection for results.file.
226 Transfer complete.
219 bytes sent in 0 seconds (0.21 Kbytes/s)
ftp> QUIT 
221 Goodbye.
% 

The Basics of Writing MPI Programs

You should now be able to successfully compile and execute MPI programs, check the status of your MPI processes, and halt MPI programs that have gone astray. This section gives an overview of the basics of parallel programming with MPI. For a more in-depth discussion of basic and advanced constructs, see the references or look at the man pages by typing man MPI or man MPI_Routinename. Note that you must capitalize MPI for the man pages to work. Also, the MPI routines must be capitalized exactly as they are used in your C programs.

Fortran versus C with MPI

MPI provides both Fortran and C routines. All names of MPI routines and constants in both C and Fortran begin with the prefix MPI_ to avoid name collisions. For the remainder of this guide, only the C versions of the MPI routines will be presented. However, the primary differences between the C and Fortran routines are:

In Fortran, the routines are subroutines invoked with CALL, and the error code is returned through an additional final IERROR argument; in C, the routines are functions that return the error code as their value.
In Fortran, handles (communicators, datatypes, etc.) are plain INTEGERs, while C uses special types such as MPI_Comm and MPI_Datatype.
Fortran routine names are not case sensitive, while the C names must be capitalized exactly as shown in this guide.

Initialization, Communicators, Handles, and Clean-Up

The first MPI routine called in any MPI program must be the initialization routine MPI_INIT. Every MPI program must call this routine once, before any other MPI routines. Making multiple calls to MPI_INIT is erroneous. The C version of the routine accepts pointers to the arguments to main, namely argc and argv.

MPI_INIT defines something called MPI_COMM_WORLD for each process that calls it.
MPI_COMM_WORLD is a communicator. All MPI communication calls require a communicator argument and MPI processes can only communicate if they share a communicator.

Every communicator contains a group, which is a list of processes. A group is in fact local to a particular process; the group contained within a communicator has been previously agreed upon across the processes at the time when the communicator was set up. The processes are ordered and numbered consecutively from zero, the number of each process being known as its rank. The rank identifies each process within the communicator. The group of MPI_COMM_WORLD is the set of all MPI processes.

MPI maintains internal data structures related to communications etc. and these are referenced by the user through handles. Handles are returned to the user from some MPI calls and can be used in other MPI calls.

An MPI program should call the MPI routine MPI_FINALIZE when all communications have completed. This routine cleans up all MPI data structures etc. It does NOT cancel outstanding communications, so it is the responsibility of the programmer to make sure all communications have completed. Once this routine is called, no other calls can be made to MPI routines, not even MPI_INIT, so a process cannot later re-enroll in MPI.

MPI Indispensable Functions

This section contains the basic functions needed to manipulate processes running under MPI. It is said that MPI is small and large. What is meant is that the MPI standard has many functions in it, approximately 125. However, many of the advanced routines represent functionality that can be ignored until one pursues added flexibility (data types), robustness (nonblocking send/receive), efficiency ("ready mode"), modularity (groups, communicators), or convenience (collective operations, topologies). MPI is said to be small because there are six indispensable functions from which many useful and efficient programs can be written.

The six functions are:

MPI_Init - Initialize MPI
MPI_Comm_size - Find out how many processes there are
MPI_Comm_rank - Find out which process I am
MPI_Send - Send a message
MPI_Recv - Receive a message
MPI_Finalize - Terminate MPI

You can add functions to your working knowledge incrementally without having to learn everything at once. For example, you can accomplish a lot by just adding the collective communication functions MPI_Bcast and MPI_Reduce to your repertoire. These functions will be detailed below in addition to the six indispensable functions.

MPI_Init

The call to MPI_Init is required in every MPI program and must be the first MPI call. It establishes the MPI execution environment.

	int MPI_Init(int *argc, char ***argv)

	Input:
   	   argc - Pointer to the number of arguments
   	   argv - Pointer to the argument vector

MPI_Comm_size

This routine determines the size (i.e., number of processes) of the group associated with the communicator given as an argument.

	int MPI_Comm_size(MPI_Comm comm, int *size)

	Input:
   	   comm - communicator (handle)
	Output:
   	   size - number of processes in the group of comm

MPI_Comm_rank

The routine determines the rank (i.e., which process number am I?) of the calling process in the communicator.

	int MPI_Comm_rank(MPI_Comm comm, int *rank)

	Input:
   	   comm - communicator (handle)
	Output:
   	   rank - rank of the calling process in the group of comm (integer)

MPI_Send

This routine performs a basic send; this routine may block until the message is received, depending on the specific implementation of MPI.

	int MPI_Send(void* buf, int count, MPI_Datatype datatype, int dest,
              int tag, MPI_Comm comm)

	Input:
  	   buf  - initial address of send buffer (choice)
	   count - number of elements in send buffer (nonnegative integer) 
	   datatype - datatype of each send buffer element (handle)
  	   dest - rank of destination (integer)
  	   tag  - message tag (integer)
  	   comm - communicator (handle)

MPI_Recv

This routine performs a basic receive.

	int MPI_Recv(void* buf, int count, MPI_Datatype datatype, int source,
              int tag, MPI_Comm comm, MPI_Status *status)

	Output:
  	   buf  - initial address of receive buffer 
	   status - status object, provides information about message received;
          status is a structure of type MPI_Status, the element
          status.MPI_SOURCE is the source of the message received, 
          and the element status.MPI_TAG is the tag value.
          
	Input:
	   count - maximum number of elements in receive buffer (integer)
	   datatype - datatype of each receive buffer element (handle)
	   source - rank of source (integer)
	   tag  - message tag (integer)
	   comm - communicator (handle)
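The sample programs later in this guide rely mainly on collective operations, so here is a minimal sketch (not from the MPICH distribution, and assuming the program is run with at least two processes) showing how MPI_Send and MPI_Recv fit together:

/* Hypothetical point-to-point example: node 0 sends 10 doubles to node 1 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
   int rank, i;
   double data[10];
   MPI_Status status;

   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   if (rank == 0) {
      for (i = 0; i < 10; i++)   /* fill the send buffer */
         data[i] = (double) i;
      MPI_Send(data, 10, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);  /* dest 1, tag 99 */
   } else if (rank == 1) {
      MPI_Recv(data, 10, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD, &status);
      printf("Node 1 received data from node %d\n", status.MPI_SOURCE);
   }

   MPI_Finalize();
   return 0;
}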

MPI_Finalize

This routine terminates the MPI execution environment; all processes must call this routine before exiting.

	int MPI_Finalize(void)

MPI_Bcast

This routine broadcasts data from the process with rank "root" to all other processes of the group.

	int MPI_Bcast(void* buffer, int count, MPI_Datatype datatype, int root,
               MPI_Comm comm)

	Input/Output:
	   buffer - starting address of buffer (choice)
	   count - number of entries in buffer (integer)
	   datatype - data type of buffer (handle)
	   root - rank of broadcast root (integer)
  	   comm - communicator (handle)

MPI_Reduce

This routine combines values on all processes into a single value using the operation defined by the parameter op.

	int MPI_Reduce(void* sendbuf, void* recvbuf, int count, MPI_Datatype
                datatype, MPI_Op op, int root, MPI_Comm comm)

	Input:
	   sendbuf - address of send buffer (choice)
	   count - number of elements in send buffer (integer)
	   datatype - data type of elements of send buffer (handle)
	   op - reduce operation (handle); the user can create one with MPI_Op_create,
          or use one of the predefined operations MPI_MAX, MPI_MIN, MPI_PROD, MPI_SUM,
          MPI_LAND, MPI_LOR, MPI_LXOR, MPI_BAND, MPI_BOR, MPI_BXOR,
          MPI_MAXLOC, MPI_MINLOC
	   root - rank of root process (integer)
	   comm - communicator (handle)

	Output:
	   recvbuf - address of receive buffer (choice, significant only at root )
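As a small, hypothetical illustration of these two collective routines, the following sketch broadcasts a value n from node 0 and then sums a per-process contribution back onto node 0:

/* Hypothetical collective example: broadcast n, then reduce a sum to node 0 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
   int rank, n, mysum, total;

   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   if (rank == 0)
      n = 100;                       /* only node 0 knows n initially */
   MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

   mysum = rank + n;                 /* each process computes its own contribution */
   MPI_Reduce(&mysum, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

   if (rank == 0)
      printf("Total over all nodes is %d\n", total);

   MPI_Finalize();
   return 0;
}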

A Simple MPI Program - hello.c

Consider this demo program:

/*The Parallel Hello World Program*/
#include <stdio.h>
#include <mpi.h>

main(int argc, char **argv)
{
   int node;
   
   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &node);
     
   printf("Hello World from Node %d\n",node);
            
   MPI_Finalize();
}

In a nutshell, this program sets up a communication group of processes, where each process gets its rank, prints it, and exits. It is important for you to understand that in MPI, this program will start simultaneously on all machines. For example, if we had ten machines, then running this program would mean that ten separate instances of this program would start running together on ten different machines. This is a fundamental difference from ordinary C programs, where, when someone said ``run the program", it was assumed that there was only one instance of the program running.

The first line,

#include <stdio.h>
should be familiar to all C programmers. It includes the standard input/output routines like printf. The second line,
#include <mpi.h>
includes the MPI functions. The file mpi.h contains prototypes for all the MPI routines in this program; this file is located in /usr/local/mpi/include/mpi.h in case you actually want to look at it.

The program starts with the main... line which takes the usual two arguments argc and argv, and the program declares one integer variable, node. The first step of the program,

      MPI_Init(&argc,&argv);

calls MPI_Init to initialize the MPI environment, and generally set up everything. This should be the first command executed in all programs. This routine takes pointers to argc and argv, looks at them, pulls out the purely MPI-relevant things, and generally fixes them so you can use command line arguments as normal.

Next, the program runs MPI_Comm_rank, passing it an address to node.

      MPI_Comm_rank(MPI_COMM_WORLD, &node);

MPI_Comm_rank will set node to the rank of the machine on which the program is running. Remember that in reality, several instances of this program start up on several different machines when this program is run. These processes will each receive a unique number from MPI_Comm_rank.

Because the program is running on multiple machines, each will execute not only all of the commands thus far explained, but also the hello world message printf, which includes their own rank.

      printf("Hello World from Node %d\n",node);

If the program is run on ten computers, printf is called ten times on ten different machines simultaneously. The order in which each process executes the message is undetermined, based on when they each reach that point in their execution of the program, and how they travel on the network. Your guess is as good as mine. So, the ten messages will get dumped to your screen in some undetermined order, such as:

Hello World from Node 2
Hello World from Node 0
Hello World from Node 4
Hello World from Node 9
Hello World from Node 3
Hello World from Node 8
Hello World from Node 7
Hello World from Node 1
Hello World from Node 6
Hello World from Node 5

Note that all the printf's, though they come from different machines, will send their output intact to your shell window; this is generally true of output commands. Input commands, like scanf, will only work on the process with rank zero. After doing everything else, the program calls MPI_Finalize, which generally terminates everything and shuts down MPI. This should be the last command executed in all programs.

MPE Graphics

Overview

In addition to the MPI functions, there is also a set of graphics routines located in a library called MPE. These routines are useful for MPI parallel programs that involve displaying graphical images. This library includes a simple interface to X. Of course, any program that uses MPE will have to include mpe.h by the line

#include "mpe.h"

You will also have to compile any program that uses MPE by including the -lmpe as part of the compilation line, like this:

mpicc lab.c -o lab -lmpe -lX11 -lm

Unfortunately, MPE does not supply the proper functions to create even a semi-reasonable interface. Thus, MPE_Get_mouse_status and MPE_Drag_square are provided for your use. These functions are in the file ~pollock/public/mouse_status.c on porsche, with prototypes in the file
~pollock/public/mouse_status.h on porsche. You should copy these files into your own directory.

There is a very terse summary of the MPE routines in Appendix C of the book Using MPI included in the reference list of this document. For a small example using the MPE graphics routines, see section 3.8 of Using MPI. Below is a list of the most frequently used MPE routines; the list does not contain all of the MPE graphics routines. For more, look at the file: /usr/local/mpi/mpe/mpe_graphics.h.

MPE_Open_graphics  - create a graphics window
MPE_Close_graphics  - destroy a graphics window
MPE_Draw_point  - draw a point in a window
MPE_Draw_points - draw a series of points in a window.
   (moderately faster than a series of MPE_Draw_point calls)
MPE_Draw_line  - draw a line in a window
MPE_Fill_rectangle  - draw a rectangle in a window
MPE_Update  - flush the buffer for a window
MPE_Get_mouse_press - wait until the user presses a mouse button
   and return the press point.
MPE_Get_mouse_status (in mouse_status.c) - get information about the mouse state
MPE_Drag_square (in mouse_status.c)  - let the user select a square on the screen
MPE_Make_color_array - create a nice spectrum of colors

Details on some MPE Graphics Routines

MPE_Open_graphics (MPE_XGraph *window, MPI_Comm comm, char *display, int x,int y,
int width, int height, int is_collective);

Open a window at x, y of size width, height. If you pass -1 for x and y, the user will be required to position the window. If NULL is passed for display, then the display will be configured automatically. Always pass 0 for is_collective. MPE_Open_graphics must be called on all nodes in comm. Don't forget to pass the address of your window!

MPE_Close_graphics (MPE_XGraph *window);

Close the window associated with window. All processes must call this routine. Once any process has called MPE_Close_graphics, no process can call any other MPE routine. Don't forget to pass the address of your window!

MPE_Draw_point (MPE_XGraph window, int x, int y, MPE_Color color);

Draw a pixel at (x, y). Initially, MPE_Color can be one of: MPE_WHITE, MPE_BLACK, MPE_RED, MPE_YELLOW, MPE_GREEN, MPE_CYAN, and MPE_BLUE. You may change the available colors using MPE_Make_color_array and MPE_Add_RGB_color (see these routines' man pages). Note that the point may not actually be drawn until you call MPE_Update.

MPE_Draw_points (MPE_XGraph window, const MPE_Point *points, int npoints);

Draws a series of points at once. points should point to an array of MPE_Point structures. Here's the form of an MPE_Point:

  typedef struct {     
    int x, y;
    MPE_Color c;   
  } MPE_Point;

npoints should contain the number of points that are in the array pointed to by points. Note that the points may not actually be drawn until you call MPE_Update.

MPE_Draw_line (MPE_XGraph window, int x1, int y1, int x2, int y2, MPE_Color
color);

Draw a line from (x1, y1) to (x2, y2).

MPE_Fill_rectangle (MPE_XGraph window, int x, int y, int width, int height,
MPE_Color color);

Fill a rectangle with upper-left corner at (x, y) of size width, height, in pixels.

MPE_Update (MPE_XGraph window);

The MPE graphics library buffers the drawing commands that you execute, so that they can be sent to the X server all at once. MPE_Update sends the contents of the buffer. You should call MPE_Update whenever your process may be idle for a while (so that the window is not partially drawn).

MPE_Get_mouse_press (MPE_XGraph window, int *x, int *y, int *button);

Blocks until a mouse button is pressed in window. Then, the mouse position (relative to the upper-left corner of the window) is returned in x and y. The number of the button that was pressed is returned in button.

MPE_Get_mouse_status (MPE_XGraph window, int *x, int *y, int *button, int
*wasPressed);
/* in mouse_status.h. */

Does exactly the same thing as MPE_Get_mouse_press, but returns immediately even if no button is pressed. wasPressed will be non-zero if any button was pressed at the time of the call.

MPE_Drag_square (MPE_XGraph window, int *startx, int *starty, int *endx,
int *endy);
/* In mouse_status.h. */

Wait for the user to drag out a square on the screen. It is OK if a button is already pressed when you call MPE_Drag_square; for instance, you might call MPE_Get_mouse_press to wait for a mouse press, and then call MPE_Drag_square only if a certain button was pressed. If the button is already pressed when you call MPE_Drag_square, *startx and *starty should contain the point at which the mouse was pressed. *endx and *endy will always be greater than *startx and *starty, even if the user drags the square from right to left.

MPE_Make_color_array (MPE_XGraph window, int ncolors, MPE_Color *colors);

This function creates a nice rainbow spectrum of ncolors colors. It places these colors into the array pointed to by colors; this array should have at least ncolors elements. If not enough colors are available, then some of the returned colors will be random. Mosaic and Netscape tend to hog the colormap, so you might want to quit them before running your program to get the correct colors. The maximum value for ncolors is 254. The new colors replace all the standard MPE colors except MPE_BLACK and MPE_WHITE. You should call MPE_Make_color_array from all the nodes that you plan to draw from.

Remember that most of these functions also have man pages.

Gathering Performance Data

Timing Programs

For timing parallel programs, MPI includes the routine MPI_Wtime() which returns elapsed wall clock time in seconds. The timer has no defined starting point, so in order to time something, two calls are needed and the difference should be taken between the returned times. As a simple example, we can time each of the processes in the hello world program as below:

#include <stdio.h>
#include <mpi.h>

/*NOTE: The MPI_Wtime calls can be placed anywhere between the MPI_Init
and MPI_Finalize calls.*/

main(int argc, char **argv)
{
   int node;
   double mytime;   /*declare a variable to hold the time returned*/

   MPI_Init(&argc,&argv);
   mytime = MPI_Wtime();  /*get the time just before work to be timed*/
   MPI_Comm_rank(MPI_COMM_WORLD, &node);

   printf("Hello World from Node %d\n",node);

   mytime = MPI_Wtime() - mytime; /*get the time just after work is done
                                    and take the difference */
   printf("Timing from node %d is %lf seconds.\n",node,mytime);
   MPI_Finalize();

 }

It might be nice to know the least/most execution time spent by any individual process, as well as the average time spent by all of the processes. This will give you a vague idea of the distribution of work among the processes. (A better idea can be gained by calculating the standard deviation of the run times.) To do this, in addition to a few calls to get the value of the system clock, you need to add a call to synchronize the processes and a few more calls to collect the results. For example, to time a function called work() which is executed by all of the processes, one would do the following:

    int myrank,
        numprocs;
    double mytime,   /*variables used for gathering timing statistics*/
           maxtime,
           mintime,
           avgtime;
  
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Barrier(MPI_COMM_WORLD);  /*synchronize all processes*/
    mytime = MPI_Wtime();  /*get time just before work section */
    work();
    mytime = MPI_Wtime() - mytime;  /*get time just after work section*/
/*compute max, min, and average timing statistics*/
    MPI_Reduce(&mytime, &maxtime, 1, MPI_DOUBLE,MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Reduce(&mytime, &mintime, 1, MPI_DOUBLE, MPI_MIN, 0,MPI_COMM_WORLD);
    MPI_Reduce(&mytime, &avgtime, 1, MPI_DOUBLE, MPI_SUM, 0,MPI_COMM_WORLD);
    if (myrank == 0) {
      avgtime /= numprocs;
      printf("Min: %lf  Max: %lf  Avg:  %lf\n", mintime, maxtime,avgtime);
    }

Be sure to execute spyall to ensure that no one else is using the cluster at the time you want to run a performance execution. Other processes in the system will affect your performance timings. On a network of workstations, the reported times may be off by a second or two due to network latencies as well as interference from the operating system and the jobs of other users. On a tightly coupled machine like the Paragon, these timings should be accurate to within a second. Thus, it's best in general to time things that run for long enough that the system noise isn't significant, and time them several times. If you get an anomalous timing, don't hesitate to run the code a few more times to see if it can be reproduced.

If you are comparing the execution times of a sequential program with a parallel program written in MPI, be sure to use the mpicc compiler for both programs with the same switches, to ensure that the same optimizations are performed. Then, run the sequential version via mpirun -np 1.
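For example, with a hypothetical sequential version seq_version.c and MPI version par_version.c, you might compile and run both as follows:

     mpicc -O -o seq_version seq_version.c -lm
     mpicc -O -o par_version par_version.c -lm
     mpirun -np 1 seq_version
     mpirun -np 8 par_version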

Profiling and Viewing Profile Information

Although timing can provide insight into the performance of a program, it is sometimes desirable to see in detail the sequence of communication and computational events that occurred in a program and the amount of time spent in each phase. This information is usually gained by tracing various events during execution, i.e., logging information as the parallel program runs. Files that contain time-stamped communication and computational events are called logfiles. The easiest way to understand this data at a glance is with a graphical tool. In the next two subsections, creation of a logfile using MPE logging routines and viewing of the logfile using the program upshot are described.

MPE Logging Routines

The logging routines in MPE are used to create logfiles of events that occur during the execution of a parallel program. These files can be studied after the program has ended successfully. The following routines allow the user to log events that are meaningful for specific applications rather than relying on automatic logging of MPI library calls. The basic routines are MPE_Init_log, MPE_Log_event, and MPE_Finish_log.

MPE_Init_log must be called by all processes to initialize MPE logging data structures. MPE_Finish_log collects the log data from all the processes, merges it, and aligns the timestamps with respect to the times at which MPE_Init_log and MPE_Finish_log were called. Then, the process with rank 0 in MPI_COMM_WORLD writes the log into the file whose name is given as an argument to MPE_Finish_log.

A single event is logged with the MPE_Log_event routine. The routines MPE_Describe_event and MPE_Describe_state allow one to add event and state descriptions and to define states by specifying a starting and ending event for each state. Finally, MPE_Start_log and MPE_Stop_log can be used to dynamically turn logging on and off, respectively. By default, logging is on after MPE_Init_log is called. For the specific syntax of these routines, you can also consult the man pages, e.g., man MPE_Describe_state. The following sample program demonstrates some of these logging routines. The program can be found on porsche in

        ~pollock/public/cpilog.c

/*		Sample Program with Logging Commands*/
#include "mpi.h"
#include "mpe.h"
#include <math.h>
#include <stdio.h>

double f(a)
double a;
{
    return (4.0 / (1.0 + a*a));
}

int main(argc,argv)
int argc;
char *argv[];
{
  int done = 0, n, myid, numprocs, i, rc, repeat;
  double PI25DT = 3.141592653589793238462643;
  double mypi, pi, h, sum, x, a;
  double startwtime, endwtime;

  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);

  MPE_Init_log();
  if (myid == 0) {
    MPE_Describe_state(1, 2, "Broadcast", "red:vlines3");
    MPE_Describe_state(3, 4, "Compute",   "blue:gray3");
    MPE_Describe_state(5, 6, "Reduce",    "green:light_gray");
    MPE_Describe_state(7, 8, "Sync",      "yellow:gray");
  }

  while (!done)
    {
      if (myid == 0) 
	{
	  printf("Enter the number of intervals: (0 quits) ");
	  scanf("%d",&n);
	  startwtime = MPI_Wtime();
	}
      MPI_Barrier(MPI_COMM_WORLD);
      MPE_Start_log();

      MPE_Log_event(1, 0, "start broadcast");
      MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
      MPE_Log_event(2, 0, "end broadcast");
    
      if (n == 0)
	done = 1;
      else
	{
	  for (repeat=5; repeat; repeat--) {
	    MPE_Log_event(7,0,"Start Sync");
	    MPI_Barrier(MPI_COMM_WORLD);
	    MPE_Log_event(8,0,"End Sync");
	    MPE_Log_event(3, 0, "start compute");
	    h   = 1.0 / (double) n;
	    sum = 0.0;
	    for (i = myid + 1; i <= n; i += numprocs)
	      {
		x = h * ((double)i - 0.5);
		sum += f(x);
	      }
	    mypi = h * sum;
	    MPE_Log_event(4, 0, "end compute");
	    fprintf( stderr, "[%d] mypi = %lf\n", myid, mypi );

	    MPE_Log_event(5, 0, "start reduce");
	    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
	    MPE_Log_event(6, 0, "end reduce");
	    if (myid == 0)
	      {
		printf("pi is approximately %.16f, Error is %.16f\n",
		       pi, fabs(pi - PI25DT));
		endwtime = MPI_Wtime();
		printf("wall clock time = %f\n", endwtime-startwtime);	       
	      }
	  }
        }
      MPE_Stop_log();
    }
  MPE_Finish_log("cpilog.log");
  MPI_Finalize();
}

Profile Visualization with Upshot

The logfile viewing program that is distributed with MPI is called upshot. It is a simple graphical display of parallel time lines and state durations. Upshot is a Tcl/Tk script, so it can be customized and extended. Once you have created a logfile by inserting the MPE logging routines into your program and compiling and executing the program, the logfile can be viewed by invoking upshot. Simply type
upshot filename.log
at the UNIX prompt on porsche. Try this on the sample logfile cpilog.log provided for you on porsche in
 
~pollock/public/cpilog.log
When the window titled Upshot appears, click on Setup. cpilog.log will be automatically read, and upshot will display parallel time lines for each process, with states indicated by colored bars. Timestamp values, adjusted to start at 0, are shown along the bottom of the time lines. Upshot provides zooming capability for magnified views of the time lines.

In the reference Using MPI, another sample of logging is presented as part of a Fortran code that performs matrix-matrix multiplication on pages 42-46.

Debugging MPI Programs

Parallel programs are much more difficult to debug than their serial counterparts. Not only do you have to worry about the things you worry about with serial programs, such as syntax and logic, but you also have to worry about parallel programming problems such as deadlock, nondeterminism, and synchronization.

Debugging Methods

The following method is suggested for debugging MPI programs. First, if possible, write the program as a serial program. This will allow you to debug most syntax, logic, and indexing errors.

Then, modify the program and run it with 2-4 processes on the same machine. This step will allow you to catch syntax and logic errors concerning intertask communication. A common error found at this point is the use of non-unique message tags. The final step in debugging your application is to run the same processes on different machines. This will check for synchronization errors caused by message delays on the network.

You should first try to find the bug by using a few printf statements. If this does not work, then you may want to try running the program under a debugger such as dbx or gdb which will start the first process under the debugger where possible. See man dbx or man gdb for information on these debuggers and man mpirun for information on adding them to your command line arguments. Note that the debuggers are not parallel versions, but are only being used by the first process. Also, their interaction with parallel programs was not tested, therefore use at your own risk.

   
Common Problems: Descriptions and Tips

This section contains descriptions and fixes for some common problems.

   
Lost Output

If some or all of your output does not appear, it is the result of one of two things (or a combination of both). Sometimes the output will have disappeared without a trace. The reason for this is that under UNIX, output is placed in a buffer before it is actually printed, to increase the efficiency of the system. Sometimes MPI marks a process as dead when it exits, so any output still in the buffer is never read. This is usually the case when some output appears, but not all of it. To correct this, add the statement:

fflush(stdout);
after each printf statement. This will flush the buffer after each printf so no output will be in the buffer when a process exits MPI.

If you attempt to write output to a single file from multiple processes, the file will be overwritten; there is no append capability. The file will contain only the output from the last process that writes to it.

Writing to standard output does appear to work okay, except that the information may not appear in the order your program would suggest. Again, adding the following:

fflush(stdout);
sleep(1);

after every printf statement will often force the output to arrive in the proper order, when the real problem is the network latency, and not nondeterminism in your program.

Error Messages

Many messages - When an MPI job crashes, you typically get more than one line of error messages. The FIRST line is the most important and contains the clue to your actual problem. The rest of the messages are usually the system's attempt to clean up the rest of the processes that have been left hanging!

Infinite messages - Occasionally, you will get runaway error messages that appear to be in an infinite loop... You will need to log onto porsche from another window and issue a shoot command.

Intermittent messages - During initial testing, it was discovered that the DEC Alphas sometimes issue intermittent error messages on programs that are correct. The errors may have something to do with a network or hardware problem. Gurus are looking at the problem but... unfortunately you may still have to deal with this! Our suggestion... build your program slowly, adding just a few lines at a time. If you do get an error that does not have obvious origins, run the code a couple of times to make sure it is your problem and not the system. Don't forget to do a shoot command between runs to clean up leftover jobs. Error messages that are suspicious for being system problems usually contain phrases like:

	net_send: could not write
	unidentified err handler
	bad file number
	interrupt SIGBUS: 10

Uninitialized variables - Another potential problem is uninitialized variables. MPI_Init in the main part of your program appears to set uninitialized variables to zero; however, uninitialized variables in subroutines appear to be set to the usual C compiler initialization; that is, garbage. Beware of subroutines bearing garbage! A clue to this problem is a SIGFPE error message.

A reminder of common signals and their explanation:

SIGABRT - Abnormal termination of the program (such as a call to abort).
SIGFPE  - An erroneous arithmetic operation, such as a divide-by-zero
                 or an operation resulting in overflow
SIGILL -  Detection of an illegal instruction
SIGINT - Receipt of an interactive attention signal
SIGSEGV - An invalid access to storage
SIGTERM - A termination request sent to the program

World Wide Web Resources

There are some valuable resources online regarding MPI. A good place to start is the Argonne National Labs site:

http://www.mcs.anl.gov/Projects/mpi/
Ian Foster's book on parallel programming which includes a chapter on MPI and links to other MPI resources can be found at:
http://www.mcs.anl.gov/dbpp/

References

1.
Gropp, Lusk, and Skjellum. Using MPI: Portable Parallel Programming with the Message-Passing Interface. The MIT Press: Cambridge, Mass., 1994.
2.
Snir, et al. MPI: The Complete Reference. The MIT Press: Cambridge, Mass., 1994.
3.
Kernighan and Ritchie. The C Programming Language, 2nd Edition. Prentice Hall: Englewood Cliffs, 1988.

   
Sample Makefile

# Generated automatically from Makefile.in by configure.
ALL: default
##### User configurable options #####

SHELL       = /bin/sh
ARCH        = alpha
COMM        = ch_p4
BOPT        = 
P4_DIR      = 
TOOLS_DIR   = 
MPIR_HOME   = /usr/local/mpi-v1.0.12
CC          = cc
CLINKER     = cc
CCC         = 
CCLINKER    = $(CCC)
F77         = f90
FLINKER     = f90
AR          = ar crl
RANLIB      = ranlib
PROFILING   = $(PMPILIB)
OPTFLAGS    = -O
MPE_LIBS    = -lmpe -lX11 -lm
MPE_DIR     = /usr/local/mpi-v1.0.12/mpe
LIB_PATH    = -L/usr/local/mpi-v1.0.12/lib/alpha/ch_p4 
FLIB_PATH   = -L/usr/local/mpi-v1.0.12/lib/alpha/ch_p4 
LIB_LIST    = -lmpi  
MPE_GRAPH   = -DMPE_GRAPHICS
#
INCLUDE_DIR =  -I$(MPIR_HOME)/include 
DEVICE      = ch_p4

### End User configurable options ###

CFLAGS	  =  -DFORTRANUNDERSCORE  -DMPE_USE_EXTENSIONS=1 -DHAS_XDR=1 \
-DSTDC_HEADERS=1  -DHAVE_STDLIB_H=1 -DMALLOC_RET_VOID=1 -DHAVE_SYSTEM=1 \
-DHAVE_NICE=1 -DPOINTER_64_BITS=1 -DINT_LT_POINTER=1 -DHAVE_LONG_DOUBLE=1 \
-DHAVE_LONG_LONG_INT=1 $(OPTFLAGS) $(INCLUDE_DIR) -DMPI_$(ARCH) 
CFLAGSMPE = $(CFLAGS) -I$(MPE_DIR) $(MPE_GRAPH)
CCFLAGS	  = $(CFLAGS)
#FFLAGS	  = '-qdpc=e' 
FFLAGSMPE = $(FFLAGS) -I$(MPE_DIR) $(MPE_GRAPH)
FFLAGS	  =  $(OPTFLAGS)
MPILIB	  = $(MPIR_HOME)/lib/$(ARCH)/$(COMM)/libmpi.a 
MPIPPLIB  = $(MPIR_HOME)/lib/$(ARCH)/$(COMM)/libmpi++.a
LIBS	  = $(LIB_PATH) $(LIB_LIST)
FLIBS	  = $(FLIB_PATH) $(LIB_LIST)
LIBSPP	  = $(MPIPPLIB) $(LIBS)
EXECS	  = mandel hello

default: $(EXECS)

all: default

clean:
	/bin/rm -f *.o *~ 

#SAMPLE FOR SIMPLE PROGRAM - NO GRAPHICS:
hello:	hello.o $(MPILIB)
	   $(CLINKER) $(OPTFLAGS) -o hello hello.o $(LIBS) -lm

#SAMPLE FOR SIMPLE PROGRAM WITH MPE GRAPHICS:
easy:	easy.o $(MPILIB)
	   $(CLINKER) $(OPTFLAGS) -o easy easy.o $(LIBS) -lmpe -lX11 -lm
	
#SAMPLE FOR COMPLEX PROGRAM WITH MPE GRAPHICS:
mandel: mandel.o manager.o worker.o mouse_status.o \
        $(MPIR_HOME)/include/mpir.h $(MPILIB)
	$(CLINKER) $(OPTFLAGS) -o mandel mandel.o manager.o \
                                         worker.o mouse_status.o\
	$(LIB_PATH) ${MPE_LIBS} $(LIB_LIST)

mandel.o:	mandel.c
		$(CC) $(CFLAGSMPE) -c mandel.c

mouse_status.o:	mouse_status.c
		$(CC) $(CFLAGSMPE) -c mouse_status.c

manager.o:	manager.c
		$(CC) $(CFLAGSMPE) -c manager.c

worker.o:	worker.c
		$(CC) $(CFLAGSMPE) -c worker.c

   
MPI Program with Graphics - Mandelbrot Rendering

This program generates a rendering of the Mandelbrot set in parallel.

What's the Mandelbrot set?

The Mandelbrot set is what is known as an iterative fractal. The idea is that every point (x, y) on the plane is mapped to a complex number c = x + yi. Using this value of c, we define a function f:

f(z) = z*z + c

We then define a series in which the nth element is f composed with itself n - 1 times applied to 0.

e.g., 0, f(0), f(f(0)), f(f(f(0))), f(f(f(f(0)))), ...

If this series is bounded, then point c is in the Mandelbrot set. If the series is not bounded, then c is not in the Mandelbrot set. It turns out that if the distance from any element in the series to the origin is greater than 2.0, then the series is not bounded. Thus, we can find out if a point is in the Mandelbrot set by iterating until the generated value is more than 2.0 away from the origin. Of course, if the point is in the set, then the iteration will never stop. Thus, we define an iteration limit N. Any point which can be iterated N times without escaping the radius 2.0 is probably in the Mandelbrot set.
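The RenderSection routine in worker.c later in this appendix implements essentially this test; as a standalone sketch (with hypothetical names), the escape-time computation for a single point c = cr + ci*i looks roughly like this:

/* Sketch of the escape-time test: returns the number of iterations before
   |z| exceeds 2.0, or 0 if the point survives all N iterations (and is
   therefore probably in the Mandelbrot set). */
int mandel_iterations(double cr, double ci, int N)
{
   double zr = 0.0, zi = 0.0;
   int iter;

   for (iter = 0; iter < N; iter++) {
      double nr = zr * zr - zi * zi + cr;   /* real part of z*z + c */
      double ni = 2.0 * zr * zi + ci;       /* imaginary part of z*z + c */
      zr = nr;
      zi = ni;
      if (zr * zr + zi * zi > 4.0)          /* |z| > 2.0: the series escapes */
         return iter + 1;
   }
   return 0;                                /* never escaped */
}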

Traditionally, images of the Mandelbrot set are drawn with the set in black. Areas outside the set are colored according to how many iterations there were before the value escaped radius 2.0. Such images are of course very pretty, and way overused.

Parallelizing Mandelbrot Rendering

How should you divide up the task of rendering the Mandelbrot set for N processors? The obvious method is to divide the image up into N sections, and have each processor render one section. This is inefficient, though, since different areas of the image take different amounts of time to compute, so some nodes will be done before others and sit idle. Even if the sections did require equal amounts of computation, some nodes may be faster than others.

The solution is to use manager-worker techniques. The idea is that one node (presumably node 0) is the manager, and farms out tasks to the rest of the nodes, which are the workers. The manager keeps a list of the workers, and whether each worker is busy. Whenever the manager needs a task done, it sends a message to an idle worker. When the worker completes, it sends a message back to the manager saying it is finished. Thus the manager can keep all the workers busy most of the time.

User Interface

When the program is started, it should bring up a graphics window containing a view of the entire Mandelbrot set. The user should be able to use the mouse to zoom in on a region. To quit the program, click the right mouse button inside the graphics window.

globals.h
#ifndef __globals_h__
#define __globals_h__

/* Includes: */
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>
#include "/usr/local/mpi/mpe/mpe.h"

/* Constants: */
enum {
  defaultWidth = 400,
  defaultHeight = 400
};

/* Variables: */
int rank, numnodes;
MPE_XGraph w;
int width, height;
int numcolors;
MPE_Color *colors;

/* Types: */
typedef double coordinate;

typedef struct {
  coordinate x1, y1, x2, y2;
  int left, top, right, bottom;
  int maxiter;
} MandelRender;

extern MPI_Datatype MandelRenderType;

/* Message tags: */
enum {
  tag_renderthis,
  tag_donerendering,
  tag_flush,
  tag_shutdown
};

#endif
mandel.c
#include "globals.h"
#include "manager.h"
#include "worker.h"

MPI_Datatype MandelRenderType;

void Setup_Types (void)
{
  int blockcounts[] = { 4, 5 };
  MPI_Aint offsets[] = { 0, 4 * sizeof(double) };
  MPI_Datatype types[2];
  types[0] = MPI_DOUBLE;
  types[1] = MPI_INT;
  MPI_Type_struct (2, blockcounts, offsets, types, &MandelRenderType);
  MPI_Type_commit (&MandelRenderType);
}

int main (int argc, char **argv)
{
  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);
  MPI_Comm_size (MPI_COMM_WORLD, &numnodes);
  Setup_Types ();

  /* Open the window: */
  width = defaultWidth;
  height = defaultHeight;
  MPE_Open_graphics (&w, MPI_COMM_WORLD, NULL, -1, -1, 
		     width, height, 0);
  numcolors = 32;
  colors = (MPE_Color *) malloc(numcolors * sizeof(MPE_Color));
  MPE_Make_color_array (w, numcolors, colors);
  
  /* Start the manager or worker, as appropriate: */
  if (!rank)
    Manager ();
  else
    Worker ();

  MPE_Close_graphics (&w);

  MPI_Finalize ();
  exit(0);
}
manager.h
#include "globals.h"
void Manager (void);
manager.c
#include "/usr/local/users/hisley/mpi/src/lab4/manager.h"
#include "/usr/local/users/hisley/mpi/src/lab4/mouse_status.h"

/* The busy array contains an element for every worker node.  
   The value of node[n] indicates how many tasks node n is doing.
   We try to keep each worker fed with 2 tasks at a time, to avoid message 
   latency.  Thus workersfree specifies how many more tasks could be fed to
   workers. */
char *workersbusy;
int workersfree;

static int ReceiveDonePacket (void)
{
  MPI_Status status;
  MPI_Recv (NULL, 0, MPI_INT, MPI_ANY_SOURCE, tag_donerendering, 
	    MPI_COMM_WORLD, &status);
  workersbusy[status.MPI_SOURCE]--;
  workersfree++;
  return status.MPI_SOURCE;
}

static void Manager_Draw (MandelRender *r)
{
  int sectHeight;
  coordinate sectDeltaY;
  MandelRender section;
  int who;

  /* Calculate height of the sections to render: */
  sectHeight = height / (numnodes - 1) / 10;	/* Let's go for 10 times as 
						   many sections as workers. */
  if (sectHeight <= 0)
    sectHeight = 1;
  sectDeltaY = (r->y2 - r->y1) / (r->bottom - r->top) * sectHeight;
  
  section = *r;
  while (1) {
    section.bottom = section.top + sectHeight;
    section.y2 = section.y1 + sectDeltaY;
    if (section.bottom > r->bottom) {
      section.bottom = r->bottom;
      section.y2 = r->y2;
    }
    
    if (workersfree)
      for (who = 1; who < numnodes && workersbusy[who] == 2; who++);
    else
      who = ReceiveDonePacket ();
      
    MPI_Send (&section, 1, MandelRenderType, who, tag_renderthis,
	      MPI_COMM_WORLD);
    workersbusy[who]++;
    workersfree--;

    if (section.bottom == r->bottom)
      break;
    
    section.top = section.bottom;
    section.y1 = section.y2;
  }

  for (who = 1; who < numnodes; who++)
    MPI_Send (NULL, 0, MPI_INT, who, tag_flush, MPI_COMM_WORLD);

  while (workersfree != (numnodes - 1) * 2)
    ReceiveDonePacket ();
}

void Manager (void)
{
  MandelRender whole = { -2.0, 1.5, 1.0, -1.5,
			 0, 0, -1, -1, 1000 },
	       r;
  char done = 0;

  puts ("Welcome to the lame Mandelbrot rendering demo.\n\n"
	"When I say I'm ready for mouse input you may:\n"
	"  - Drag out a square region to zoom in on with the left mouse "
	"button.\n"
	"  - Click the middle button to return to a view of the full "
	"Mandelbrot set.\n"
	"  - Click the right button to quit this program.\n");

  whole.right = width;
  whole.bottom = height;
  r = whole;
  workersbusy = (char *) calloc (sizeof(char), numnodes);
  workersfree = (numnodes - 1) * 2;

  while (!done) {
    printf ("\nDrawing image:\nReal: %12.10f - %12.10f; 
             Imaginary: %12.10f - %12.10f.\n", r.x1, r.x2, r.y1, r.y2);
    fflush (stdout);
    Manager_Draw (&r);
    while (1) {
      int button, x, y;
      printf("\nReady for mouse input.\n");
      fflush (stdout);
      MPE_Get_mouse_press (w, &x, &y, &button);
      if (button == 3) {
	done = 1;
	break;
      } else if (button == 2) {
	r = whole;
	break;
      } else if (button == 1) {
	int rx, ry;
	double ox1 = r.x1, oy1 = r.y1;
	MPE_Drag_square (w, &x, &y, &rx, &ry); 
	r.x1 = ox1 + (r.x2 - ox1) * x / width;
	r.x2 = ox1 + (r.x2 - ox1) * rx / width;
	r.y1 = oy1 + (r.y2 - oy1) * y / height;
	r.y2 = oy1 + (r.y2 - oy1) * ry / height;
	break;
      }
    }
  }

  { int who;
    for (who = 1; who < numnodes; who++)
      MPI_Send (NULL, 0, MPI_INT, who, tag_shutdown, MPI_COMM_WORLD);
  }
}
worker.h
#include "/usr/local/users/hisley/mpi/src/lab4/globals.h"
void Worker (void);
worker.c
#include "/usr/local/users/hisley/mpi/src/lab4/worker.h"

static void RenderSection (MandelRender *r)
{
  int width = r->right - r->left, height = r->bottom - r->top;
  int numpoints = width * height;
  int x, y;
  coordinate dr = (r->x2 - r->x1) / (r->right - r->left),
	     di = (r->y2 - r->y1) / (r->bottom - r->top),
	     cr, ci = r->y1;
  MPE_Point *points = (MPE_Point *) malloc (numpoints * sizeof(MPE_Point)),
	    *point = points;
  
  for (y = r->top; y < r->bottom; y++, (ci += di)) {
    cr = r->x1;
    for (x = r->left; x < r->right; x++, (cr += dr)) {
      int iter;
      register double zr = 0, zi = 0;
      point->x = x;
      point->y = y;
      for (iter = r->maxiter; iter && zr * zr + zi * zi < 4.0; iter--) {
	register double nr = zr * zr - zi * zi + cr;
	zi = zr * zi;
	zi += zi + ci;
	zr = nr;
      }
      point->c = iter ? colors[iter % numcolors] : MPE_BLACK;
      point++;
    }
  }

  MPE_Draw_points (w, points, numpoints);
  free (points);
}

void Worker (void)
{
  MandelRender r;
  MPI_Status status;
  while (1) {
    MPI_Probe (0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    if (status.MPI_TAG == tag_renderthis) {
      MPI_Recv (&r, 1, MandelRenderType, status.MPI_SOURCE, tag_renderthis, 
		MPI_COMM_WORLD, &status);
      RenderSection (&r);
      MPI_Send (NULL, 0, MPI_INT, status.MPI_SOURCE, tag_donerendering,
		MPI_COMM_WORLD);
    } else if (status.MPI_TAG == tag_flush) {
      MPI_Recv (NULL, 0, MPI_INT, status.MPI_SOURCE, tag_flush, 
               MPI_COMM_WORLD, &status);
      MPE_Update (w);
    } else if (status.MPI_TAG == tag_shutdown) {
      MPI_Recv (NULL, 0, MPI_INT, status.MPI_SOURCE, tag_shutdown, 
               MPI_COMM_WORLD, &status);
      return;
    } else {
      fprintf (stderr, "Hey!  Unknown tag %d on node %d.\n", 
	      status.MPI_TAG, rank);
      fflush (stderr);
    }
  }
}
mouse_status.h
#ifndef __mouse_status_h__
#define __mouse_status_h__

#include "/usr/local/mpi/mpe/mpe.h"

int MPE_Get_mouse_status( MPE_XGraph graph, int *x, int *y, 
			  int *button, int *wasPressed );

void MPE_Drag_square (MPE_XGraph w, int *startx, int *starty, 
		      int *endx, int *endy);

#endif
mouse_status.c
#include <math.h>
#include "/usr/local/mpi/mpe/basex11.h"

#define MPE_INTERNAL
#include "/usr/local/mpi/mpe/mpe.h"   /*I "mpe.h" I*/

#include "/usr/local/users/hisley/mpi/src/lab4/mouse_status.h"

#define DEBUG 0

/*@
  MPE_Get_mouse_status - Checks for mouse button location & stuff
  Checks if the mouse button has been pressed inside this MPE window.
  If pressed, returns the coordinate relative to the upper right of
  this MPE window and the button that was pressed.

  Input Parameter:
. graph - MPE graphics handle

  Output Parameters:
. x - horizontal coordinate of the point where the mouse button was pressed
. y - vertical coordinate of the point where the mouse button was pressed
. button - which button was pressed: MPE_BUTTON[1-5]
. wasPressed - 1 if the button was pressed, 0 if not

@*/
int MPE_Get_mouse_status( MPE_XGraph graph, int *x, int *y, 
			  int *button, int *wasPressed )
{
  Window root, child;
  int rx, ry;
  unsigned int mask;

  if (graph->Cookie != MPE_G_COOKIE) {
    fprintf( stderr, "Handle argument is incorrect or corrupted\n" );
    return MPE_ERR_BAD_ARGS;
  }

  XQueryPointer (graph->xwin->disp, graph->xwin->win, &root, &child, &rx, &ry,
		 x, y, &mask);
  if (mask & Button1Mask)
    *button = Button1;
  else if (mask & Button2Mask)
    *button = Button2;
  else if (mask & Button3Mask)
    *button = Button3;
  else
    *button = 0;
  *wasPressed = *button;

  return MPE_SUCCESS;
}


static void fill_ones_rect (MPE_XGraph handle, int x, int y, int w, int h)
{
  XBSetPixVal (handle->xwin, 0xFFFFFFFF);
  XFillRectangle (handle->xwin->disp, handle->xwin->win,
		  handle->xwin->gc.set, x, y, w, h);
}


void MPE_Drag_square (MPE_XGraph w, int *startx, int *starty, 
		      int *endx, int *endy)
{
  int button, pressed, x, y, ox = *startx, oy = *starty;

  MPE_Draw_logic (w, MPE_LOGIC_XOR);
  MPE_Get_mouse_status (w, &x, &y, &button, &pressed);
  if (!pressed)
    MPE_Get_mouse_press (w, &x, &y, &button);
  while (pressed) {
    MPE_Get_mouse_status (w, &x, &y, &button, &pressed);
    y = *starty + (x - *startx);
    if (x != ox) {
      if (ox > *startx)
	fill_ones_rect (w, *startx, *starty, ox - *startx, oy - *starty);
      else
	fill_ones_rect (w, ox, oy, *startx - ox, *starty - oy);
      ox = x; oy = y;
      if (pressed)
	if (ox > *startx)
	  fill_ones_rect (w, *startx, *starty, ox - *startx, oy - *starty);
	else
	  fill_ones_rect (w, ox, oy, *startx - ox, *starty - oy);
      MPE_Update (w);
    }
  }
  if (x > *startx) {
    *endx = x; *endy = y;
  } else {
    *endx = *startx; *endy = *starty;
    *startx = x; *starty = y;
  }
    
  MPE_Draw_logic (w, MPE_LOGIC_COPY);
}
