Non-blocking standard send, message size <= threshold
Non-blocking receive
We have seen the blocking behavior for each of the communication modes. We will now discuss the non-blocking behavior for standard mode. The behaviors of the other modes can be inferred from this.
The following figure shows the use of a non-blocking standard send, MPI_Isend, together with a non-blocking receive, MPI_Irecv. As before, the standard mode send proceeds differently depending on the message size; the figure shows the behavior when the message size is less than or equal to the threshold.

The sending task posts the non-blocking standard send when the message buffer contents are ready to be transmitted. The call returns immediately, without waiting for the copy to the remote system buffer to complete. MPI_Wait is called just before the sending task needs to overwrite the message buffer.
The receiving task calls a non-blocking receive as soon as a message buffer is available to hold the message. The non-blocking receive returns without waiting for the message to arrive. The receiving task calls MPI_Wait when it needs to use the incoming message data (i.e., when it needs to be certain that the data has arrived).
No real difference in system overhead compared to blocking calls
In general, the system overhead will not differ substantially compared to blocking calls. On hardware where the CPU needs to be involved in the data transfer, computation will always be interrupted on both the sending and receiving nodes to pass the message; the point in time when the interruption occurs should not be of any particular consequence to the running program. And even when the architecture permits data transfer and computation to occur simultaneously, the fact that this behavior pertains only to small messages means that no great difference in performance would be expected.
More potential for reduction in synchronization overhead
- No blocking if system buffer is full on receiving side (until Wait)
- No synchronization penalty from posting MPI_Irecv as early as possible**
- Can prevent deadlock (processes waiting for each other)
**Early posting is desirable because the message goes directly into the receive buffer if Irecv (or Recv) is posted ahead of Isend (or Send).
The advantage of a non-blocking send appears when the remote system buffer is full. In this case, a blocking send would have to wait until the receiving task pulled some message data out of the buffer. If a non-blocking call is used instead, computation can be done during this interval.
The advantage of a non-blocking receive over a blocking one can be considerable if the receive is posted before the send. The task can continue computing until the Wait is posted, rather than sitting idle. (It is desirable to post any type of receive ahead of its matching send so that when the message arrives, it goes directly into the program buffer instead of the system buffer.) Posting MPI_Irecv as early as possible therefore reduces the amount of synchronization overhead.
Non-blocking calls can ensure that deadlock (a situation in which two processes are waiting for each other) will not occur. To achieve this, it is necessary only to make sure that all Waits are posted after all the send and receive calls needed to complete the communication.