Derived Data Types

3.2 Extent of Datatype

The above is sufficient to answer the question "What are derived datatypes?" To use more complex derived datatypes successfully, however, you sometimes need to know a little more. In particular, it helps to understand the extent of a derived datatype. That requires a few more definitions. For a typemap with entries (type0, disp0), (type1, disp1), ..., (typeN, dispN):

lb(Typemap) = min(disp0, disp1, ..., dispN)

ub(Typemap) = max(disp0 + sizeof(type0), disp1 + sizeof(type1), ..., dispN + sizeof(typeN))

extent(Typemap) = ub(Typemap) - lb(Typemap) + pad

lb stands for lower bound. You can think of it as the location of the first byte described by the datatype. ub stands for upper bound. It is the location one byte past the last byte described by the datatype, since each term adds the full size of an entry to its displacement. sizeof is the size in bytes of a basic datatype. extent is the difference between the two bounds, possibly increased by pad to meet alignment requirements.

Some languages, like FORTRAN and C, require that their datatypes be aligned a particular way in memory. Commonly, they require the address of a variable (in bytes) to be a multiple of its length (in bytes). MPI uses pad to take this into account, so the extent of a datatype is the span from the first byte to the last byte occupied by entries in this datatype, rounded up to satisfy alignment requirements. For all the basic datatypes, like MPI_DOUBLE and MPI_INTEGER, the extent is simply the number of bytes in them.
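The alignment rule can be observed directly in C, without MPI. The sketch below is only an illustration: the record int_then_double and the three helper functions are hypothetical names, not part of MPI, and the values they return depend on the compiler and machine.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical record used only for illustration: an int followed by a double. */
struct int_then_double {
    int    i;
    double d;
};

/* Bytes from the start of the record to its double member. */
size_t offset_of_double(void)
{
    return offsetof(struct int_then_double, d);
}

/* Bytes of padding the compiler placed between i and d. */
size_t padding_before_double(void)
{
    size_t off = offsetof(struct int_then_double, d);
    assert(off >= sizeof(int));  /* d cannot start before i ends */
    return off - sizeof(int);
}

/* Total size of the record, including any trailing padding. */
size_t record_size(void)
{
    return sizeof(struct int_then_double);
}
```

On a machine with 4-byte ints and doubles aligned on 8-byte boundaries, one would expect the double to sit at offset 8 after 4 bytes of padding, for a record size of 16. Other platforms may lay the record out differently.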

Consider a derived datatype as an example. Suppose extent(double) = 8, extent(int) = 4, and that a machine requires doubles to be aligned on 8-byte boundaries. If a derived datatype has typemap = {(int,0) (double,4)}, it follows that lb = min(0,4) = 0 and ub = max(0+4,4+8) = 12. However, since doubles must be aligned on 8-byte boundaries, the span of 12 bytes is rounded up to the next multiple of 8. Here pad = 4, so the extent of this derived datatype is 16, not 12. MPI provides calls that return the lb, ub, and extent of a datatype, so you don't need to worry about how a particular machine or language aligns data in memory.
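The arithmetic in this example can be reproduced with a short C sketch. This is not how an MPI library computes extents internally. The typemap_entry type and the typemap_* functions are hypothetical helpers written for this illustration, and the sketch assumes that pad rounds the span up to a multiple of the largest alignment among the entries.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */

/* One typemap entry: displacement, size, and required alignment, in bytes.
   Hypothetical type for illustration; not an MPI structure. */
typedef struct {
    long   disp;
    size_t size;
    size_t align;
} typemap_entry;

/* lb = smallest displacement in the typemap. */
long typemap_lb(const typemap_entry *map, size_t n)
{
    assert(n > 0);
    long lb = map[0].disp;
    for (size_t i = 1; i < n; i++)
        if (map[i].disp < lb)
            lb = map[i].disp;
    return lb;
}

/* ub = largest value of displacement + size in the typemap. */
long typemap_ub(const typemap_entry *map, size_t n)
{
    assert(n > 0);
    long ub = map[0].disp + (long)map[0].size;
    for (size_t i = 1; i < n; i++) {
        long end = map[i].disp + (long)map[i].size;
        if (end > ub)
            ub = end;
    }
    return ub;
}

/* extent = ub - lb, rounded up to a multiple of the largest alignment
   (assumed padding rule for this sketch). */
long typemap_extent(const typemap_entry *map, size_t n)
{
    long span = typemap_ub(map, n) - typemap_lb(map, n);
    long max_align = 1;
    for (size_t i = 0; i < n; i++)
        if ((long)map[i].align > max_align)
            max_align = (long)map[i].align;
    return (span + max_align - 1) / max_align * max_align;
}
```

For the typemap {(int,0) (double,4)}, written as the entries {0, 4, 4} and {4, 8, 8}, these functions give lb = 0, ub = 12, and extent = 16, matching the values worked out by hand.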

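Here is the syntax for the extent routine as defined in MPI-1. (MPI-2 deprecated this routine in favor of MPI_Type_get_extent, which returns both the lower bound and the extent, and MPI-3.0 removed it.)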

  • C
    int MPI_Type_extent(MPI_Datatype datatype, MPI_Aint *extent)
    
  • FORTRAN
    MPI_TYPE_EXTENT(DATATYPE, EXTENT, MPIERROR)
        INTEGER DATATYPE, EXTENT, MPIERROR
    

where
  • datatype is an input datatype handle
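  • extent is the output: an INTEGER in FORTRAN, or in C a value of the special integer type MPI_Aint, which is large enough to hold any address
  • MPIERROR (FORTRAN only) returns the error code; in C, the error code is the function's return value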