Derived Data Types

2. Why?

Review, for a moment, MPI's basic datatypes, shown below:

  • MPI basic predefined datatypes for C

    MPI Datatype          C datatype
    MPI_CHAR              signed char
    MPI_DOUBLE            double
    MPI_FLOAT             float
    MPI_INT               signed int
    MPI_LONG              signed long int
    MPI_LONG_DOUBLE       long double
    MPI_LONG_LONG_INT     signed long long int
    MPI_SHORT             signed short int
    MPI_UNSIGNED          unsigned int
    MPI_UNSIGNED_CHAR     unsigned char
    MPI_UNSIGNED_LONG     unsigned long int
    MPI_UNSIGNED_SHORT    unsigned short int

    MPI_BYTE
    MPI_PACKED

  • MPI basic predefined datatypes for FORTRAN

    MPI Datatype            FORTRAN datatype
    MPI_CHARACTER           CHARACTER(1)
    MPI_COMPLEX             COMPLEX
    MPI_DOUBLE_COMPLEX      DOUBLE COMPLEX
    MPI_DOUBLE_PRECISION    DOUBLE PRECISION
    MPI_INTEGER             INTEGER
    MPI_INTEGER1            INTEGER*1
    MPI_INTEGER2            INTEGER*2
    MPI_INTEGER4            INTEGER*4
    MPI_LOGICAL             LOGICAL
    MPI_REAL                REAL
    MPI_REAL2               REAL*2
    MPI_REAL4               REAL*4
    MPI_REAL8               REAL*8
    MPI_BYTE
    MPI_PACKED

Given these datatypes and a count, you can handle messages of contiguous data of the same type.


Motivation

What if you want to specify:
  • non-contiguous data of a single type?
  • contiguous data of mixed types?
  • non-contiguous data of mixed types?

A few possible solutions suggest themselves:

  • You could make multiple MPI calls to send and receive each data element in turn.

  • You could copy the data into a contiguous buffer before sending it. One of the two MPI-specific basic datatypes, MPI_PACKED, is used to send data that has been explicitly packed with MPI_Pack, or to receive data that will be explicitly unpacked with MPI_Unpack.

  • You could use MPI_BYTE to get around the datatype-matching rules. MPI_BYTE, the other MPI-specific basic datatype, matches any byte of storage (on a byte-addressable machine), irrespective of the datatype of the variable that contains that byte.
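As a rough illustration of the copy-to-a-buffer option, the sketch below gathers one column of a row-major matrix into contiguous storage using plain C. The function name gather_column and the matrix layout are assumptions made for this example, not part of MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: copy column `col` of a row-major rows x cols
 * matrix into a newly allocated contiguous buffer.
 * Returns NULL if the allocation fails. The caller must free() the
 * returned buffer. */
double *gather_column(const double *matrix, size_t rows, size_t cols,
                      size_t col)
{
    assert(col < cols);

    double *buf = malloc(rows * sizeof *buf);
    if (buf == NULL)
        return NULL;

    /* Consecutive column elements sit `cols` doubles apart in memory. */
    for (size_t i = 0; i < rows; i++)
        buf[i] = matrix[i * cols + col];

    return buf;
}
```

The resulting buffer is contiguous and of a single type, so it could be sent with a count of rows and MPI_DOUBLE. The price is an extra allocation and an extra copy on the sender, and a matching scatter on the receiver if the data must go back into a column.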

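Generally, however, these solutions are slow, clumsy, and wasteful of memory: sending each element separately pays the per-message overhead on every call, and copying into a buffer costs extra memory and an extra pass over the data. Using MPI_BYTE or MPI_PACKED might also result in a program that isn't portable to a heterogeneous system of machines; MPI_BYTE in particular is transferred without any conversion of data representation.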
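One reason raw bytes travel badly can be seen without MPI at all. The sketch below uses a hypothetical struct particle (invented for this example) to compare the bytes its members need with the bytes the compiler actually reserves. The difference is alignment padding, which depends on the platform and compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of mixed types, for illustration only. */
struct particle {
    char   tag;
    double mass;
    int    id;
};

/* Bytes occupied by the members themselves, ignoring any padding. */
size_t particle_payload_bytes(void)
{
    return sizeof(char) + sizeof(double) + sizeof(int);
}

/* Bytes the compiler added for alignment. The value is
 * platform-dependent and may differ between sender and receiver. */
size_t particle_padding_bytes(void)
{
    assert(sizeof(struct particle) >= particle_payload_bytes());
    return sizeof(struct particle) - particle_payload_bytes();
}
```

Sending sizeof(struct particle) items of MPI_BYTE would transmit the padding along with the data and would assume that the receiver uses the same member sizes, offsets, and byte order. On a homogeneous cluster that assumption usually holds; on a heterogeneous one it may not.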

The idea of MPI derived datatypes is to provide a portable and efficient way of communicating non-contiguous data, or data of mixed types, in a single message. Because this kind of data is common, derived datatypes give you a simpler, cleaner, and more efficient way to handle it. You can get along without derived datatypes, but not easily.

[ Caution: some implementations of MPI derived datatypes are not as efficient as they should be, particularly in terms of I/O wait times. Once your code runs correctly, you may want to convert a few of the derived datatypes in its hotspots to MPI_BYTE or MPI_PACKED. Compare the timings of the converted code with those of the code that uses derived datatypes throughout, and keep whichever version performs better. ]