Derived Data Types

4.2.1 Construct

As we said above, derived datatypes are datatypes that are built from the basic MPI datatypes. Typemaps are a completely general way of doing this, but they are inconvenient when there is a large number of entries. Fortunately, MPI provides a number of routines that create common datatypes from the basic datatypes without the need to construct a typemap explicitly. New datatype definitions are built up from existing datatypes (either derived or basic) using a call, or a recursive series of calls, to the routines described below:

Contiguous:

Calls to MPI_TYPE_CONTIGUOUS produce a new datatype by replicating the existing datatype into contiguous locations.

  • C
    int MPI_Type_contiguous(int count, MPI_Datatype oldtype,
        MPI_Datatype *newtype)
    
  • FORTRAN
    MPI_TYPE_CONTIGUOUS(COUNT, OLDTYPE, NEWTYPE, MPIERROR)
        INTEGER COUNT, OLDTYPE, NEWTYPE, MPIERROR
    
where
  • count is an input variable specifying the replication count
  • oldtype is an input variable specifying the handle of the old datatype
  • newtype is an output variable specifying the handle of the new datatype

Vector:

Calls to MPI_TYPE_VECTOR, like those to MPI_TYPE_CONTIGUOUS, produce a new datatype by replicating the existing one; however, MPI_TYPE_VECTOR allows for gaps in the displacement. Such gaps are multiples of the extent of the existing datatype.

  • C
    int MPI_Type_vector(int count, int blocklength, int stride,
        MPI_Datatype oldtype, MPI_Datatype *newtype)
    
  • FORTRAN
    MPI_TYPE_VECTOR(COUNT, BLOCKLENGTH, STRIDE, OLDTYPE, 
                    NEWTYPE, MPIERROR)
        INTEGER COUNT, BLOCKLENGTH, STRIDE, OLDTYPE, 
                NEWTYPE, MPIERROR
    
where
  • count is an input variable specifying the number of blocks
  • blocklength is an input variable specifying the number of elements in each block
  • stride is an input variable specifying the number of elements between the start of successive blocks

Example:

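[Figure: Vector]

This illustrates a call with count = 2, blocklength = 3, and stride = 5. The new datatype consists of two blocks of three elements each, and the blocks start five elements apart. Elements 0-2 and 5-7 of the old datatype are therefore selected, and elements 3 and 4 form the gap.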
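
The selection rule described above can be checked with a small plain-C model. This is only a sketch and makes no MPI calls: the function name vector_offsets is hypothetical, and it simply lists the element offsets (in multiples of the old datatype's extent) that a vector type with a given count, blocklength, and stride would cover.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C model (no MPI calls): list the element offsets, in units of the
 * old datatype's extent, selected by a vector type with the given count,
 * blocklength, and stride. Returns the number of offsets written to out,
 * which must have room for count * blocklength entries. */
size_t vector_offsets(int count, int blocklength, int stride, int *out)
{
    assert(count >= 0 && blocklength >= 0 && out != NULL);
    size_t n = 0;
    for (int block = 0; block < count; block++) {
        /* Each block starts stride elements after the previous one. */
        for (int elem = 0; elem < blocklength; elem++) {
            out[n++] = block * stride + elem;
        }
    }
    return n;
}
```

Calling vector_offsets(2, 3, 5, out) fills out with the offsets 0, 1, 2, 5, 6, 7, which matches the call illustrated above. The same helper can be used to check a paper-and-pencil answer for other argument values.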

Try this:

Using paper and pencil, illustrate the call:

MPI_TYPE_VECTOR(2, 4, 4, OLDTYPE, NEWTYPE, MPIERROR)

What does this say about the call to MPI_TYPE_CONTIGUOUS?

Caution

This datatype constructor and the ones described below can be used with allocatable objects (C or F90) provided the entire object is allocated at once. The stride in actual memory between pieces that were allocated at different times cannot be predicted. Thus, to allocate a C matrix for which MPI_TYPE_VECTOR could be used to define a datatype that represents a submatrix, one would allocate an object whose size is the number of rows times the number of columns times the size of a matrix element. An array of pointers to the rows can be set up afterward.
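
The allocation scheme described in this caution might look like the following in C. This is a minimal sketch, not the only way to do it: the names alloc_matrix and free_matrix are hypothetical, and the element type is assumed to be double.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an nrows x ncols matrix of doubles as ONE block, then set up an
 * array of row pointers into that block. Returns NULL on failure.
 * Release the matrix with free_matrix(). */
double **alloc_matrix(size_t nrows, size_t ncols)
{
    assert(nrows > 0 && ncols > 0);
    double *data = malloc(nrows * ncols * sizeof *data);
    double **rows = malloc(nrows * sizeof *rows);
    if (data == NULL || rows == NULL) {
        free(data);
        free(rows);
        return NULL;
    }
    /* Row i starts exactly ncols elements after row i - 1. */
    for (size_t i = 0; i < nrows; i++) {
        rows[i] = data + i * ncols;
    }
    return rows;
}

void free_matrix(double **rows)
{
    if (rows != NULL) {
        free(rows[0]);  /* the single data block */
        free(rows);
    }
}
```

Because all rows live in one block, the distance between the starts of successive rows is always ncols elements, which is the value one would pass as stride to MPI_TYPE_VECTOR when describing a submatrix. The matrix can still be indexed as m[i][j] through the row pointers.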

Hvector:

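This is like MPI_TYPE_VECTOR, except that the stride is specified in bytes rather than in multiples of the old datatype's extent. The C and FORTRAN routines, MPI_Type_hvector and MPI_TYPE_HVECTOR, respectively, take the same arguments as those for MPI_TYPE_VECTOR given above, except that stride is in bytes (and, in C, has type MPI_Aint rather than int). Note that MPI_Type_hvector was deprecated in MPI-2 and removed in MPI-3.0; later versions of the standard provide MPI_Type_create_hvector for the same purpose.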
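
A byte stride is convenient when the pieces to be sent are embedded in an array of C structs, because the spacing between them is then sizeof the struct. The following plain-C sketch makes no MPI calls; the struct particle and the function hvector_block_starts are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type used only for illustration. */
struct particle {
    double pos[3];
    int    id;
};

/* Plain-C model (no MPI calls): byte offset of the start of each block for
 * an hvector-style layout with the given count and byte stride. out must
 * have room for count entries. */
void hvector_block_starts(int count, ptrdiff_t stride_bytes, ptrdiff_t *out)
{
    assert(count >= 0 && out != NULL);
    for (int block = 0; block < count; block++) {
        out[block] = block * stride_bytes;
    }
}
```

To describe the pos field of every element of an array of four particles, one would use count = 4, blocklength = 3, an old datatype of MPI_DOUBLE, and a stride of sizeof(struct particle) bytes, with the address of the first pos field as the buffer. The helper above returns the byte offsets at which the four blocks start, relative to that address.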

Indexed:

Calls to MPI_TYPE_INDEXED replicate the existing datatype into a sequence of blocks, where each block is a concatenation of copies of the existing datatype. Each block can contain a different number of copies and have a different displacement; however, all block displacements are multiples of the existing datatype's extent.

  • C
    int MPI_Type_indexed(int count, int *array_of_blocklengths,
        int *array_of_displacements, MPI_Datatype oldtype,
        MPI_Datatype *newtype)
    
  • FORTRAN
    MPI_TYPE_INDEXED (COUNT, ARRAY_OF_BLOCKLENGTHS, 
                      ARRAY_OF_DISPLACEMENTS, OLDTYPE, 
                      NEWTYPE, MPIERROR)
        INTEGER COUNT, ARRAY_OF_BLOCKLENGTHS(*), 
                ARRAY_OF_DISPLACEMENTS(*), OLDTYPE, 
                NEWTYPE, MPIERROR
    
where
  • count is an input variable specifying the number of blocks
  • array_of_blocklengths is an input variable specifying the number of elements per block
  • array_of_displacements is an input variable specifying the displacement for each block, in multiples of old datatype extent

Hindexed:

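This is like MPI_TYPE_INDEXED, except that the block displacements are specified in bytes rather than in multiples of the old datatype's extent. The C and FORTRAN routines, MPI_Type_hindexed and MPI_TYPE_HINDEXED, respectively, take the same arguments as those for MPI_TYPE_INDEXED given above, except that array_of_displacements is in bytes (and, in C, is an array of MPI_Aint rather than int). Note that MPI_Type_hindexed was deprecated in MPI-2 and removed in MPI-3.0; later versions of the standard provide MPI_Type_create_hindexed for the same purpose.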
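
A common use of the indexed constructors is to describe the upper triangle of a square matrix, where every row contributes a block of a different length. The plain-C sketch below makes no MPI calls; the function name upper_triangle_blocks is hypothetical, and the matrix is assumed to be n x n, stored by rows in one contiguous block. It fills the block lengths together with the displacements in both forms: in elements, as MPI_TYPE_INDEXED expects, and in bytes, as MPI_TYPE_HINDEXED expects.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C model (no MPI calls): block lengths and displacements describing
 * the upper triangle (diagonal included) of an n x n row-major matrix.
 * disp_elems is in multiples of the element extent; disp_bytes holds the
 * same displacements in bytes. Each array must have room for n entries. */
void upper_triangle_blocks(int n, size_t elem_size, int *blocklengths,
                           int *disp_elems, ptrdiff_t *disp_bytes)
{
    assert(n > 0 && elem_size > 0);
    for (int row = 0; row < n; row++) {
        blocklengths[row] = n - row;       /* diagonal to end of row  */
        disp_elems[row] = row * n + row;   /* index of element (row,row) */
        disp_bytes[row] = (ptrdiff_t)disp_elems[row] * (ptrdiff_t)elem_size;
    }
}
```

For n = 3 this gives block lengths 3, 2, 1 and element displacements 0, 4, 8. The byte displacements are the same values multiplied by the element size, which shows that the two constructors describe the same layout whenever every displacement is a multiple of the old datatype's extent.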

Struct:

A call to MPI_TYPE_STRUCT gathers a mix of different datatypes, scattered at many locations in memory, into one datatype that can be used for sending messages. This is the most general datatype constructor and the only one that accepts more than one datatype as input. Note that if the input arguments are basic MPI datatypes, the input is just a typemap. MPI_Type_struct was deprecated in MPI-2 and removed in MPI-3.0; later versions of the standard provide MPI_Type_create_struct for the same purpose.

  • C
    int MPI_Type_struct(int count, int *array_of_blocklengths,
        MPI_Aint *array_of_displacements, MPI_Datatype *array_of_types,
        MPI_Datatype *newtype)
    
  • FORTRAN
    MPI_TYPE_STRUCT(COUNT, ARRAY_OF_BLOCKLENGTHS, ARRAY_OF_DISPLACEMENTS, 
                    ARRAY_OF_TYPES, NEWTYPE, MPIERROR)
        INTEGER COUNT, ARRAY_OF_BLOCKLENGTHS(*), ARRAY_OF_DISPLACEMENTS(*), 
                ARRAY_OF_TYPES(*), NEWTYPE, MPIERROR
    
    
where
  • count is an input variable specifying the number of blocks
  • array_of_blocklengths is an input variable specifying the number of elements in each block
  • array_of_displacements is an input variable specifying the byte displacement of each block
  • array_of_types is an input variable specifying the type of elements in each block

If the storage relationships among the elements are determined by the compiler (C struct, F90 sequence derived type, or FORTRAN common block), the byte values for the array_of_displacements can be calculated by the programmer. However, if the elements are independently declared variables or members of an F90 nonsequence derived type, the MPI_ADDRESS function (MPI_Get_address in MPI-2 and later) must be used to determine the absolute address of each element for use in the array_of_displacements. When using datatypes containing absolute addresses in the array_of_displacements, the buffer address must be specified as MPI_BOTTOM.

Example:

[Figure: Struct]

This illustrates a call with:

count = 2
array_of_blocklengths[0] = 1
array_of_types[0] = MPI_INT
array_of_blocklengths[1] = 3
array_of_types[1] = MPI_DOUBLE
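
When the layout comes from a C struct, the byte displacements for such a call can be obtained from the compiler with offsetof rather than worked out by hand. The sketch below is plain C and makes no MPI calls. The struct record and the function record_layout are hypothetical names; the struct is assumed to hold one int followed by three doubles, to match the block lengths and types listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C struct matching the example: one int, then three doubles. */
struct record {
    int    n;
    double v[3];
};

/* Fill the block lengths and byte displacements for struct record.
 * The matching array_of_types would be { MPI_INT, MPI_DOUBLE }. */
void record_layout(int blocklengths[2], ptrdiff_t displacements[2])
{
    assert(blocklengths != NULL && displacements != NULL);
    blocklengths[0]  = 1;
    displacements[0] = (ptrdiff_t)offsetof(struct record, n);
    blocklengths[1]  = 3;
    displacements[1] = (ptrdiff_t)offsetof(struct record, v);
}
```

The first displacement is always 0. The second depends on the compiler: on many platforms padding is inserted after n so that v is suitably aligned, in which case v does not start at sizeof(int). Using offsetof accounts for any such padding. These displacements are relative to the start of the struct, so the buffer argument would be the address of a struct record, not MPI_BOTTOM.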

Caution

Derived datatypes defined using absolute displacements should NOT contain variables that are not static (for example, variables that live on the stack) unless the datatype is defined and used within a single call to the subroutine where the variables are declared. The reason is that, unless the calling context is identical, the stack pointer will have a different value upon reentry into the subroutine, and the absolute addresses determined earlier will be invalid. At CTC, FORTRAN variables are now static by default unless they are declared AUTOMATIC, are allocatable arrays, or are in a recursive subroutine or function.

Try this:

Using paper and pencil, illustrate the call MPI_TYPE_STRUCT where:

count = 2,
array_of_blocklengths[0] = 1,
array_of_types[0] = the newtype illustrated above,
array_of_blocklengths[1] = 2, and
array_of_types[1] = MPI_INT.