Persistent Communication

Lab Exercise

Prerequisites

This lab should be done after the module MPI Persistent Communication. You should complete the MPI Basics exercise before starting this lab.

To learn or look up the syntax of MPI calls, consult the Message Passing Interface Standard at
http://www-unix.mcs.anl.gov/mpi/ or the MPIPro routines.


Overview

In this lab you will modify a program to use persistent communication. Persistent communication can improve program performance when point-to-point communication routines are called repeatedly with the same arguments.


Exercise

Before You Begin

Copy all lab files found in

H:\VWlabs\MPI\Persistent\

to your home directory (or subdirectory) on H:, e.g.

copy H:\VWlabs\MPI\Persistent\*   H:\Users\your_userid\

For the following exercises, compile the programs on a login node. Create a batch file and submit the job(s) to the batch nodes.


Exercise

C lab file: wave_send.c
C solution file: wave_pcomm.c

Fortran lab file: wave_send.f, parameters.h
Fortran solution file: wave_pcomm.f, parameters.h

Program Description

Wave_send implements the concurrent wave equation described in Chapter 5 of Fox et al. (1988). It calculates the amplitude of points along a vibrating string for a specified number of time intervals. The equation it solves is:

   newval[i] = (2.0 * values[i]) - oldval[i]
        + (sqtau * (values[i-1] - (2.0 * values[i]) + values[i+1]));

Here "i" indexes the point along the string. The values array holds the current amplitudes, and oldval holds the amplitudes from the previous time step. Note that the new amplitude at a point depends on the current values at its two neighboring points.

The decomposition can be viewed as:

[Figure: decomposition of the points on the string among processors]

Each task is assigned a contiguous block of points ("block" decomposition). Each task has all the data needed to update its interior points. To update its endpoints, the task must receive values for the points bordering the block (boundary values) from the tasks that "own" them. It must also send the values of its own endpoints to these tasks. Non-blocking sends and receives are used for this communication, followed by a call to MPI_Waitall.
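As a minimal sketch of the block decomposition described above, the function below computes which points a task owns. The name block_range, its arguments, and the choice to give any leftover points to the lowest-numbered tasks are illustrative assumptions; the lab files may divide the points differently.

```c
/* Hypothetical helper: split tpoints points over ntasks tasks in
 * contiguous blocks. The first (tpoints % ntasks) tasks get one extra
 * point (an assumption, not necessarily what wave_send does).
 * On return, *first is the 0-based global index of the task's first
 * point and *npoints is the number of points the task owns. */
void block_range(int rank, int ntasks, int tpoints, int *first, int *npoints)
{
    int base = tpoints / ntasks;   /* points that every task gets */
    int extra = tpoints % ntasks;  /* leftover points to spread out */

    *npoints = base + (rank < extra ? 1 : 0);
    *first = rank * base + (rank < extra ? rank : extra);
}
```

With 10 points on 4 tasks, this gives blocks of 3, 3, 2, and 2 points starting at global indices 0, 3, 6, and 8. The boundary values a task needs are then the last point of the block to its left and the first point of the block to its right.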

The main computational loop exchanges endpoints and then updates amplitudes. Since the endpoint exchanges use the same arguments for the message-passing calls for all the loop iterations, and the loop is repeated many times, they are good candidates for persistent communication.
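The update half of each loop iteration can be sketched in serial C as follows. This is only an illustration: the function name wave_step and the array layout (one extra "ghost" cell at each end of the local arrays, holding the boundary values received from the neighbors) are assumptions and may not match wave_send.c.

```c
#include <stddef.h>

/* Hypothetical helper: advance points 1..npoints by one time step.
 * values[0] and values[npoints + 1] are assumed to be ghost cells that
 * already hold the boundary values owned by the left and right
 * neighbors, so endpoints and interior points use the same formula. */
void wave_step(size_t npoints, double sqtau,
               const double *oldval, const double *values, double *newval)
{
    for (size_t i = 1; i <= npoints; i++) {
        newval[i] = (2.0 * values[i]) - oldval[i]
            + (sqtau * (values[i - 1] - (2.0 * values[i]) + values[i + 1]));
    }
}
```

Only the first and last iterations of this loop read a ghost cell, which is why a single value from each neighbor must be exchanged before every time step.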

Run wave_send

Compile wave_send and run it on four processors. Save the output to a file; you will need it to check the correctness of your results and to compare timings. Expect timings to vary widely between interactive runs. The wave codes include variables for the number of points along the vibrating string and the number of time steps. Try changing these variables and running again.

Convert the Code to Persistent Communication

Convert the code to use persistent communication for the endpoint exchange.

The outline below shows the wave_send program structure. The numbered steps identify the changes that must be made to specific subroutines. These subroutines also contain comments that provide additional information on the conversion process.

You will be replacing standard sends and receives with persistent communication routines. Since the routines that create persistent requests take the same arguments as the message-passing calls, you can complete the exercise without understanding how the arguments relate to the program.

begin program

learn number of tasks and rank (main program)

identify left and right neighbors (C:main program, F:neighbor)

get program parameters (get_data)

initialize wave values (init_line)

create persistent communication (pcomm_create)
    Step 1: Call MPI_Send_init and MPI_Recv_init. Compile and test the program.

update values (update)
    Step 2: Replace the send and receive calls with MPI_Start. Compile and test the program.

deallocate persistent communication (main program)
    Step 3: Call MPI_Request_free. Compile and test the program.

if master: collect results and print (C:output_master, F:out_master)

if worker: send results to master (C:output_workers, F:out_workers)

end

Run the Converted Code

Compile and run your converted code, again on four processors. Check that the results match the output you saved from the original wave_send code, then compare the timings.


References

Fox, G. et al. (1988) Solving Problems on Concurrent Processors, vol. 1. Prentice Hall.


Solution


Cleanup

When you are done running programs, delete your subfolder on the T: drives. End your batch job with ccrm or ccrelease. You may also wish to delete any files you copied from the VWlabs folder into your space on H:.