Prerequisites
This lab should be done after the talk MPI Collective Communication II. You should complete the MPI Basics exercise before starting this lab.
For the syntax of MPI calls, consult the Message Passing Interface Standard at:
http://www-unix.mcs.anl.gov/mpi/ or the MPIPro routines
Overview
This lab gives you the opportunity to modify a code to exploit the advanced collective communication features of MPI. You will use the MPI_Scatterv routine to distribute initial data among processors when the number of data points is not evenly divisible by the number of processors. This is a typical case in which a "plain" MPI_Scatter call lacks the needed functionality, because it sends the same number of elements to every task. You will then modify the code again, using the MPI_Gatherv routine to collect the final output data before writing it to a file.
Exercise
Before You Begin
Copy all lab files found in
H:\VWlabs\MPI\Collective2\
to your home directory (or subdirectory) on H:, e.g.
copy H:\VWlabs\MPI\Collective2\* H:\Users\your_userid\
For the following exercises, compile the programs on a login node. Create a batch file and submit the job(s) to the batch nodes.
Exercise
Input file:
wave_inp.data
Fortran lab files:
wave_mw.f
parameters.h
fwave.mak
Fortran solution files:
wave_mw_vv.f
parameters.h
fwave_vv_out.data (will be obtained after run)
C lab files:
wave_mw.c
cwave.mak
C solution files:
wave_mw_vv.c
cwave_vv_out.data (will be obtained after run)
Output file (to compare C/Fortran results):
wave_vv_out.data
Program Description
Wave_mw implements the concurrent wave equation described in Chapter 5 of Fox et al. (1988) utilizing a master-worker model. It calculates the amplitude of points along a vibrating string for a specified number of time intervals. The equation it solves is:
newval[i] = (2.0 * values[i]) - oldval[i]
+ (sqtau * (values[i-1] - (2.0 * values[i]) + values[i+1]));
"i" indicates the point on the line. The values array holds the current amplitudes. Note that the new amplitude for the point will depend on the current values at neighboring points.
The decomposition can be viewed as:
Each task is assigned a contiguous block of points ("block" decomposition). Each task has all the data needed to update its interior points. To update its endpoints, the task must receive values for the points bordering the block (boundary values) from the tasks that "own" them. It must also send the values of its own endpoints to these tasks. Non-blocking sends (followed by a call to MPI_Wait) and blocking receives are used for this communication. Thus, the main computational loop exchanges endpoints and updates amplitudes.
Modifying the Code to Use Scatterv and Gatherv
In the present implementation of wave_mw, the master task reads the initial amplitudes from a file. Blocks of data must be distributed to workers. Similarly, at the end of the run, the updated data must be collected on the master for writing to a file. These steps are the focus of this lab.
The outline below shows the wave_mw program structure. The entries for distribute_data and collect_data identify the changes that must be made to those subroutines. The subroutines contain comments that explain how to proceed. You may also want to refer to the code fragment presented in the MPI Collective Communication II talk.
- begin program
- learn number of tasks and rank (main program)
- if master: get program parameters and initial data (get_data)
- broadcast program parameters (get_data)
- distribute initial data
- distribute_data: first, calculate for each task (a) the number of points assigned and (b) the displacement in the data array of the first point assigned; then call scatterv
- update values (update)
- collect results
- collect_data: call gatherv
- if master: store results (out_master)
- end
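The per-task counts and displacements needed by distribute_data and collect_data can be computed along the following lines. This is a sketch under stated assumptions, not the lab solution: the function name block_partition is hypothetical, and giving the extra points to the lowest-ranked tasks is only one of several reasonable choices.

```c
/* Split npoints into ntasks contiguous blocks.
 * The first (npoints % ntasks) tasks each receive one extra point.
 * counts[r] is the number of points assigned to task r;
 * displs[r] is the index in the full data array of its first point.
 * Both arrays must have room for ntasks ints. */
void block_partition(int npoints, int ntasks, int *counts, int *displs)
{
    int base   = npoints / ntasks;   /* points every task receives */
    int extra  = npoints % ntasks;   /* leftover points to spread out */
    int offset = 0;

    for (int r = 0; r < ntasks; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}
```

For example, 100 points on 8 tasks gives counts of 13 for ranks 0-3 and 12 for ranks 4-7. The same two arrays serve both collectives: MPI_Scatterv takes them as its sendcounts and displs arguments on the root, in MPI_Scatterv(sendbuf, sendcounts, displs, sendtype, recvbuf, recvcount, recvtype, root, comm), and MPI_Gatherv takes them as recvcounts and displs, in MPI_Gatherv(sendbuf, sendcount, sendtype, recvbuf, recvcounts, displs, recvtype, root, comm). Each task passes its own counts[rank] as the scalar count.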
Compile and run the modified program wave_mw_vv.exe on 2-8 processors. Remember, you must copy the input file wave_inp.data into your local directory space.
The code prints out several values for validation. The correct values are:
1: .00 11:-.20 21:-.32 31:-.32 41:-.19 51: .01
61: .21 71: .32 81: .31 91: .18 100: .00
The resulting output file, cwave_vv_out.data or fwave_vv_out.data, can be compared (using diff) to the generic output file (wave_vv_out.data) in the lab directory to give additional confirmation that the code is running correctly.
References
Fox, G. et al. (1988) Solving Problems on Concurrent Processors, vol. 1. Prentice Hall.
Solution
Cleanup
When you are done running programs, delete your subfolder on the T: drives. End your batch job with ccrm or ccrelease. You may also wish to delete any files you copied from the VWlabs folder into your space on H:.