Collective Communication I

Lab Exercise

Prerequisites

This tutorial follows the talk MPI Collective Communication I. You should complete the Basics of MPI Programming module and lab exercise before starting this tutorial.


Overview

There are two example programs for calculating pi:

  • dboard_pi:

    This is a very simple parallel program in which each task independently approximates pi. The amount of work done by each task stays the same as the number of tasks increases, but the averaged result becomes more accurate. There are two versions: one demonstrating sends and receives (pi_send) and the other using collective communication (pi_reduce).

  • int_pi:

    This program calculates pi using an integral approximation. This code is of particular interest because several programs are included that illustrate, step by step, the conversion of the serial code int_pi to a parallel program. The final version is int_pi2 (.f or .c).


     Description of Problem

  • dboard_pi:

    Picture a circle with radius 1 centered at the origin. The circle sits inside a square whose corners are at (-1,-1), (-1,1), (1,1) and (1,-1). The area of the circle divided by the area of the square is pi/4.

    Think of this as a dartboard. Each dart hits at x and y coordinates that are random numbers between -1 and 1, so every dart falls within the square and may also fall within the circle. The program approximates pi by dividing the number of darts that fall within the circle by the total number of darts thrown, which estimates pi/4, and multiplying by four.

    A full description of this problem is available in Fox et al., 1988, Solving Problems on Concurrent Processors, volume 1, page 207.

  • int_pi:

    int_pi uses a simple integral approximation to calculate pi.
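
As a rough illustration of the two approaches described above, here is a minimal serial sketch in C. It is not the lab's source code: the function names dboard_estimate and integral_estimate are hypothetical, and the integrand 4/(1+x^2) on [0,1] with the midpoint rule is an assumption, since the text only says that int_pi uses "a simple integral approximation".

```c
#include <stdlib.h>

/* Dartboard estimate: throw `darts` random darts at the square
   [-1,1] x [-1,1] and count those landing inside the unit circle.
   hits/darts approximates pi/4, so the result is scaled by four. */
double dboard_estimate(long darts, unsigned int seed)
{
    long hits = 0;
    srand(seed);
    for (long i = 0; i < darts; i++) {
        double x = 2.0 * rand() / RAND_MAX - 1.0;  /* random in [-1,1] */
        double y = 2.0 * rand() / RAND_MAX - 1.0;
        if (x * x + y * y <= 1.0)
            hits++;
    }
    return 4.0 * (double)hits / (double)darts;
}

/* Integral estimate (assumed form): midpoint rule applied to
   the integral of 4/(1+x^2) over [0,1], which equals pi. */
double integral_estimate(long n)
{
    double h = 1.0 / (double)n;   /* width of each subinterval */
    double sum = 0.0;
    for (long i = 0; i < n; i++) {
        double x = (i + 0.5) * h; /* midpoint of subinterval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}
```

The dartboard estimate converges slowly and varies with the seed, while the midpoint rule is deterministic and already close to pi for a thousand subintervals.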


     Parallel Implementation

  • dboard_pi:

    There are two ways to benefit from parallelism: you may run the same program in less time, or run a larger program in the same amount of time. This example takes the latter approach.

    The serial calculation of pi involves throwing 5000 darts for each of ten iterations, with the cumulative average reported at each iteration. For the parallel implementation, each task performs this process independently, reporting its calculated pi value to the master task (task ID 0), which prints the cumulative average. The more tasks that participate, the more accurate the calculated value of pi.

    The code is SPMD, i.e., each task runs the same executable. There are two versions. pi_send uses low-level sends and receives to collect the pi values. pi_reduce uses the collective communication reduction routine, with the predefined reduction operation for double-precision floating-point vector addition.


Exercise

     Before You Begin

You can work either in Microsoft Visual Studio or at the command line. Although the instructions that follow refer specifically to the command line, the file references are the same for Visual Studio. To review how to compile MPI programs, see Basics of MPI Programming.

Copy all lab files found in

H:\VWlabs\MPI\Collective1\Lab2\dboard\

and

H:\VWlabs\MPI\Collective1\Lab2\int\

to your home directory (or subdirectory) on H:, e.g.

copy H:\VWlabs\MPI\Collective1\Lab2\dboard\*   H:\Users\your_userid\dboard\
copy H:\VWlabs\MPI\Collective1\Lab2\int\*   H:\Users\your_userid\int\

For the following exercises, compile the programs on a login node. Create a batch file and submit the job(s) to the batch nodes.



Solution

     Instructions for Compiling and Running dboard_pi

Files:

Compile using the make file:

For Fortran:

nmake /f pi_f.mak

For C:

nmake /f pi_c.mak

Specify how many nodes to run on and execute one of the master programs:

pi_send.exe

or

pi_reduce.exe


     Instructions for Compiling and Running int_pi

Before proceeding, you might want to review the talks MPI Collective Communication I and Basics of MPI Programming.

Files:
  • int_pi.f or int_pi.c is the original serial version of the int_pi code.
  • int_pi1.f or int_pi1.c is the partially parallelized code converted to SPMD, but with no division of work and no message-passing.
  • int_pi2.f or int_pi2.c is the final parallelized version.
  • int_pi_f.mak or int_pi_c.mak is a makefile that invokes the necessary compiler options, include directories and library directories.
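
The step from int_pi1 to int_pi2 adds a division of work among the tasks. One common way to do this, shown here only as a sketch, is to deal the subintervals out cyclically so that task `rank` handles intervals rank, rank + ntasks, rank + 2*ntasks, and so on. The function name partial_pi is hypothetical, the integrand 4/(1+x^2) on [0,1] is an assumption, and the lab's int_pi2 may divide the work differently.

```c
/* Partial midpoint-rule sum for one task (illustrative sketch).
   rank   : this task's ID, 0 <= rank < ntasks
   ntasks : total number of tasks
   n      : total number of subintervals over [0,1]
   Adding the return values from all tasks gives the full estimate. */
double partial_pi(int rank, int ntasks, long n)
{
    double h = 1.0 / (double)n;    /* width of each subinterval */
    double sum = 0.0;
    for (long i = rank; i < n; i += ntasks) {
        double x = (i + 0.5) * h;  /* midpoint of subinterval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}
```

In an MPI program, each task would call this with its own rank, and the partial results could then be added on the master task with a reduction such as MPI_Reduce using MPI_SUM. With ntasks set to 1 the function reduces to the serial calculation.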

Compile the code:

mpif77 int_pi2.f -o int_pi2.exe

or

mpicc int_pi2.c -o int_pi2.exe

or, use the make file:

nmake /f int_pi_f.mak

or

nmake /f int_pi_c.mak

Specify how many nodes to run on and execute the master program:

int_pi2.exe


Cleanup

When you are done running programs, delete your subfolder on the T: drives. End your batch job with ccrm or ccrelease. You may also wish to delete any files you copied from the VWlabs folder into your space on H:.