Derived Data Types

Lab Exercise

Prerequisites

This lab follows the material in Derived Datatypes. The material will familiarize you with three of the common derived datatypes: contiguous, vector, and struct.

Before running this lab, you should complete the MPI Basics exercise.

To learn or look up the syntax of MPI calls, consult the Message Passing Interface Standard at
http://www-unix.mcs.anl.gov/mpi/ or the MPIPro routines.


Overview

Derived datatypes are datatypes built from the basic MPI datatypes. They provide a portable and elegant way of communicating non-contiguous or mixed-type data in a single message. They should also be efficient, since the data can be moved from its location in one process's memory to a location in another process's memory without any intermediate buffering. (Note, however, that if you are using MPI-F, you may wish to compare the speed of using MPI_BYTE to that of using your derived datatype.)

A derived datatype is a template of the data to be sent. Every item in the datatype is identified by its offset from a base address, which is the address passed to the MPI routine that uses the derived datatype. This allows the same MPI datatype to be used for any number of variables of the same form.
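
The idea of a template of offsets from a base address can be modeled in plain C, without calling MPI at all. The sketch below is only an illustration: the Template type and the gather function are hypothetical names invented here, and the copy into a contiguous buffer merely stands in for the transfer that an MPI library would perform (possibly without any such buffer).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a derived datatype: a list of byte offsets,
   each naming one element relative to whatever base address is supplied. */
typedef struct {
    size_t count;        /* number of elements in the template */
    size_t elem_size;    /* size of each element in bytes */
    const size_t *disp;  /* byte offset of each element from the base */
} Template;

/* Copy the elements selected by the template, starting at base, into a
   contiguous output buffer. Returns the number of bytes written. */
size_t gather(const void *base, const Template *t, void *out)
{
    assert(base != NULL && t != NULL && out != NULL);
    const unsigned char *src = base;
    unsigned char *dst = out;
    for (size_t i = 0; i < t->count; i++) {
        memcpy(dst + i * t->elem_size, src + t->disp[i], t->elem_size);
    }
    return t->count * t->elem_size;
}
```

For a matrix declared as double a[3][4], a template with offsets 0, 4*sizeof(double), and 8*sizeof(double) selects column 1 when the base is &a[0][1] and column 2 when the base is &a[0][2]. The template itself never changes; only the base address does.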

MPI provides a number of different routines for creating derived datatypes, each aimed at certain types of data, i.e., contiguous data, non-contiguous data, and non-contiguous mixed data.


Exercise

General Lab Instructions

Copy all lab files found in

H:\VWlabs\MPI\Derived\

to your home directory (or subdirectory) on H:, e.g.

copy H:\VWlabs\MPI\Derived\*   H:\Users\your_userid\

For the following exercises, compile the programs on a login node. Create a batch file and submit the job(s) to the batch nodes.


Exercise 1: Choose a Derived Datatype

C lab file: choose.c
FORTRAN lab file: choose.f

C lab solution file: chosen.c
FORTRAN lab solution file: chosen.f

When programming with a matrix, you often want to send only a portion of it, or a sub-matrix. One way to do this is to call the MPI send and receive commands repeatedly, once for each element of the matrix you want to transmit. This exercise lets you try your hand at choosing a derived datatype that will simplify this process and make it more efficient.

The given C lab program sends a single column of a two-dimensional matrix from one process to another, one column element at a time. Your task is to modify the program so that it sends the entire column at once using a derived datatype. Similarly, the given FORTRAN program sends a single row of a two-dimensional matrix; modify it so that it sends the entire row at once.

Here are the steps you should follow:

  1. Read through the programs and review the MPI calls.

  2. To replace the loops around the send and receive commands with a single send and receive, you need to create a new MPI derived datatype representing the sub-matrix, then pass it to the send and receive commands. Something to consider: what is the layout of the matrix in memory? Does it matter to MPI that C stores matrices in row-major order while FORTRAN stores them in column-major order?

  3. Don't forget to commit the datatypes you construct to the system.

  4. Compile your modified program. Run the program, specifying two processes.

  5. Compare your choice of derived datatype to the one in the lab solution files. In MPI, there is often more than one way of doing things. Consider alternative derived datatypes you could have used. You should be able to think of at least two of them.

  6. Exercise left to the reader: modify the program so that it sends a sub-block of the original matrix. Can you do it without adding loops?
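
As a hint for the memory-layout question in step 2, the two storage orders can be compared with a small plain-C sketch. The function names are hypothetical, both use zero-based indices for ease of comparison, and nothing here calls MPI; it only shows how far apart neighboring matrix entries sit in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Linear element index of entry (i, j) in an nrows x ncols matrix stored
   row by row, as C does for a[nrows][ncols]. */
size_t row_major_index(size_t i, size_t j, size_t nrows, size_t ncols)
{
    assert(i < nrows && j < ncols);
    return i * ncols + j;
}

/* Linear element index of the same entry stored column by column, as
   FORTRAN does for A(nrows, ncols); indices are zero-based here. */
size_t col_major_index(size_t i, size_t j, size_t nrows, size_t ncols)
{
    assert(i < nrows && j < ncols);
    return j * nrows + i;
}
```

For a 4 x 5 matrix, consecutive entries of one column are 5 elements apart in the C layout, while consecutive entries of one row are 4 elements apart in the FORTRAN layout. In both cases the entries you want are evenly spaced but not adjacent, which is worth keeping in mind when choosing a datatype.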

Exercise 2: Complete a Derived Datatype

C lab file: to.complete.c
FORTRAN lab file: to.complete.f

C lab solution file: completed.c
FORTRAN lab solution file: completed.f

This exercise illustrates the use of the struct derived datatype.

  1. Read through the "to.complete" program, paying close attention to the FORTRAN COMMON block and the C struct that are defined. Get a clear picture in your mind of how these datatypes are laid out in memory. Your goal is to create the corresponding MPI derived datatype and pass it as a message.

  2. Decide the proper values for the arguments to the MPI_TYPE_STRUCT call. What are the blocklengths, the displacements, and the types? Do you have to do anything special to make sure you have the proper displacements?

  3. Fill in the parameters for the MPI_TYPE_STRUCT call in their proper order. Review the MPI documentation on this call, as necessary.

  4. Run the program using two processes.
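
The displacement question in step 2 can be explored in plain C before touching MPI. The sketch below uses a hypothetical struct Sample (not the struct defined in the lab files) and a hypothetical helper, sample_layout, to show that member displacements should be measured rather than summed from member sizes, because the compiler may insert padding. It stores displacements as size_t; an MPI program would use MPI_Aint, and could obtain the same values at run time with MPI_Get_address. Note also that later versions of the MPI standard name the struct constructor MPI_Type_create_struct; MPI_Type_struct was deprecated in MPI-2 and removed in MPI-3.0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for illustration, not the one in the lab files. */
struct Sample {
    char   tag;        /* one char, usually followed by padding */
    double value;      /* typically aligned more strictly than char */
    int    counts[3];  /* a block of three ints */
};

#define SAMPLE_NFIELDS 3

/* Fill the per-member block lengths and byte displacements that a struct
   datatype constructor would need. The displacements come from offsetof,
   so any compiler-inserted padding is accounted for automatically. */
void sample_layout(int blocklens[SAMPLE_NFIELDS], size_t disps[SAMPLE_NFIELDS])
{
    assert(blocklens != NULL && disps != NULL);

    blocklens[0] = 1;                          /* one char    */
    disps[0] = offsetof(struct Sample, tag);

    blocklens[1] = 1;                          /* one double  */
    disps[1] = offsetof(struct Sample, value);

    blocklens[2] = 3;                          /* three ints  */
    disps[2] = offsetof(struct Sample, counts);
}
```

On many platforms the displacement of value is 8 rather than 1, even though tag occupies a single byte. Hard-coding displacements from the member sizes would then describe the wrong memory locations.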

Acknowledgements

I wish to acknowledge the following source of ideas:

Writing Message-Passing Parallel Programs with MPI: A two-day course by Neil MacDonald, Elspeth Minty, Tim Harding, and Simon Brown.
Edinburgh Parallel Computing Centre
The University of Edinburgh
08/25/95


Solution


Cleanup

When you are done running programs, delete your subfolder on the T: drives. End your batch job with ccrm or ccrelease. You may also wish to delete any files you copied from the VWlabs folder into your space on H:.