Prerequisites
This tutorial follows the talk MPI Collective Communication I. You should complete the Basics of MPI Programming module and lab exercise before starting this tutorial.
Overview
This is an example of a 2-D block decomposition program that follows the SPMD (single program, multiple data) paradigm. A complete description of the algorithm is found in Fox et al., "Solving Problems on Concurrent Processors, Volume 1: General Techniques and Regular Problems," Prentice Hall, Englewood Cliffs, New Jersey. Within certain limits (outlined below) it is scalable. The program includes examples of nodes exchanging edge values, convergence checking, and use of some of the MPI collective communication routines.
This program uses a finite difference scheme to solve Laplace's equation for a square matrix, which must be (4m+2) x (4m+2).
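For reference, the usual second-order central-difference form of Laplace's equation on a uniform grid is shown below. The text does not say which stencil the program uses, so treat this five-point average as an assumption; the indices i and j are illustrative grid coordinates.

```latex
\nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0
\quad\Longrightarrow\quad
u_{i,j} = \frac{1}{4}\left( u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1} \right)
```

Under this form, each interior point is repeatedly replaced by the average of its four nearest neighbors until the values stop changing.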
Parallel Implementation
The program is currently configured to solve a 48x48 matrix, divided over four processors.
Each worker decides for itself whether it is an edge, corner, or interior node, and which other workers it must communicate with. The edge nodes get their "local" boundary values from the "global" boundary values and by communicating with their neighboring interior nodes. The initial value of all points is set to the average of the global boundary values. The sequence for an iteration is as follows:
Each worker exchanges edge values with its four neighbors. Then new values are calculated for the upper left and lower right corners (the "red" corners) of each node's matrix. The workers exchange edge values again. The upper right and lower left corners (the "black" corners) are then calculated.
Every 20 iterations, the nodes calculate the average difference between each point's current value and its value 20 iterations earlier. Task 0 collects these local average differences and computes the global average difference. If this is less than some acceptable value, task 0 collects the pieces of the matrix. Otherwise, 20 more iterations are run.
Exercise
Before You Begin
You can work either in Microsoft Visual Studio or at the command line. Although the instructions that follow refer specifically to the command line, the file references are the same for Visual Studio. To review how to compile MPI programs, see Basics of MPI Programming.
Copy all lab files found in
H:\VWlabs\MPI\Collective1\Lab3\
to your home directory (or subdirectory) on H:, e.g.
copy H:\VWlabs\MPI\Collective1\Lab3\* H:\Users\your_userid\Lab3\
For the following exercises, compile the programs on a login node. Create a batch file and submit the job(s) to the batch nodes.
Solution
Instructions for Compiling and Running
C files (there is no Fortran version of this program):
- Compile using the makefile provided.
nmake /f parallel_laplace.mak
- Specify 4 processors to run the program (mpirun -np 4 ...)
- Execute the program parallel_laplace.exe. The code takes about 120 seconds to run. Results are automatically stored in parallel_laplace.out.
Cleanup
When you are done running programs, delete your subfolder on the T: drives. End your batch job with ccrm or ccrelease. You may also wish to delete any files you copied from the VWlabs folder into your space on H:.