8.4 Collective Operations
Collective means "everybody is known to take part", and there are a number of oft-encountered situations very nicely categorized as being collective in nature. Having been so identified, it follows that,
- it should be possible to figure out how to orchestrate them without having to say everything "long-hand", as it were, and
- there may be terribly clever and efficient ways of actually arranging to do them, given exact knowledge of the architecture they'll be operating on.
These are the two main benefits of collective operations: you don't have to write a lot of code to do them, and, wherever possible, they're done in as efficient a manner as the architecture permits.
- Communication
Even though communications in a distributed processing environment are, at base, all point-to-point (i.e., one processor sends a message to another processor), there are logical situations where you can very accurately describe what's going on as "and now processor-a communicates this to everybody", or "everybody sends their results to processor-b, who collects them and sums them up". Were you writing bare-bones code, you'd be doing lots of paired, structured sends and receives, but you can get your point across quite well by waving your hands and making sweeping generalizations.
After having done this enough, you wonder if there isn't some way to make this task easier:
- Broadcast (multicast)
We're still dealing with send/receive pairs here, but the call is no longer named "send" or "receive"; it is something like "broadcast". One node is identified as the broadcaster and all other nodes are listeners, yet every node has the same call in its code. The node whose id matches the broadcaster value does the actual sending, and for everyone else the call is translated into a receive. All of that happens under the covers: every module contains exactly the same piece of code, and at runtime the code implementing the broadcast function determines, for each node, which role it plays.

Multicast is actually a generalization of broadcast: multicast says "identify some subset of the nodes that will be participating in this collective communication, call that subset a multicast group", while broadcast simply says "all nodes are going to be a part of it".
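The "same call everywhere, role decided at runtime" idea can be sketched as follows. This is only a simulation under stated assumptions: threads stand in for nodes, a queue per node stands in for its incoming message channel, and the names `broadcast`, `run_broadcast`, and `mailboxes` are invented for illustration rather than taken from any real message-passing library.

```python
import queue
import threading

def broadcast(value, root, rank, mailboxes):
    """Same call on every node; the role is decided at runtime from rank."""
    if rank == root:
        # The broadcaster: one point-to-point send per listener.
        for peer, box in enumerate(mailboxes):
            if peer != root:
                box.put(value)
        return value
    # Every other node: the very same call becomes a receive.
    return mailboxes[rank].get()

def run_broadcast(num_nodes, root, payload):
    """Run num_nodes simulated nodes and return what each one ends up with."""
    mailboxes = [queue.Queue() for _ in range(num_nodes)]
    received = [None] * num_nodes

    def node(rank):
        # Only the root holds the data before the call.
        local = payload if rank == root else None
        received[rank] = broadcast(local, root, rank, mailboxes)

    threads = [threading.Thread(target=node, args=(r,)) for r in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `run_broadcast(4, 2, "hello")` leaves every simulated node holding the root's value. A multicast variant would presumably take the set of participating ranks as an extra argument and have the root loop over only those.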
- Scatter-gather

Scatter-gather operations entail handing out pieces of the work to different processors (scatter) and then collecting the results when they're finished (gather). This is a very typical situation in, e.g., matrix operations, where one processor has the whole thing, divides it up among a number of peers for, say, calculation of a sub-determinant, and then has their results fed back to it for the final calculation.
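A minimal sketch of the scatter-gather pattern, assuming a sum of squares as a stand-in for the per-processor calculation (a real sub-determinant would take more code than the pattern itself). The workers run one after another here purely for illustration, and the names `scatter`, `worker`, `gather`, and `sum_of_squares` are hypothetical.

```python
def scatter(data, num_workers):
    """Split data into num_workers contiguous chunks of nearly equal size."""
    base, extra = divmod(len(data), num_workers)
    chunks, start = [], 0
    for w in range(num_workers):
        size = base + (1 if w < extra else 0)  # first `extra` workers get one more
        chunks.append(data[start:start + size])
        start += size
    return chunks

def worker(chunk):
    """The piece of work each peer does on its own chunk."""
    return sum(x * x for x in chunk)

def gather(partials):
    """Collect the partial results back at the originating processor."""
    return list(partials)

def sum_of_squares(data, num_workers):
    chunks = scatter(data, num_workers)            # hand out the pieces
    partials = gather(worker(c) for c in chunks)   # collect the results
    return sum(partials)                           # final calculation
```

For example, `scatter(list(range(10)), 3)` yields chunks of sizes 4, 3, and 3, and `sum_of_squares(list(range(10)), 3)` returns 285, the same answer a single processor would compute.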
- Computation (generalized reduction operations)
Many kinds of computation have a particular characteristic, called commutativity, which makes them prime candidates for parallelization. Commutativity (the operation itself is said to be commutative) means that the result of the operation is the same regardless of the order in which the arguments are arranged: addition is commutative (a+b == b+a), but subtraction isn't (a-b != b-a whenever a != b). Commutativity is a very nice property for parallel operations to have, because it allows the processors to contribute in whatever order happens to occur. Strictly speaking, combining partial results in arbitrary groupings also relies on associativity, (a+b)+c == a+(b+c), which addition likewise satisfies.
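To see why arrival order matters, the sketch below folds the same three contributions together in every possible order, once with addition and once with subtraction. The helper name `results_over_all_orders` and the sample values are made up for illustration.

```python
from functools import reduce
from itertools import permutations
import operator

def results_over_all_orders(op, values):
    """Fold values with op in every possible arrival order; return the distinct results."""
    return {reduce(op, order) for order in permutations(values)}

contributions = [7, 3, 2]  # one value per processor

add_results = results_over_all_orders(operator.add, contributions)
sub_results = results_over_all_orders(operator.sub, contributions)

print(add_results)  # a single result: {12}
print(sub_results)  # three different results, depending on who arrived first
```

Addition gives one answer no matter who reports first, so the processors need not be coordinated; subtraction gives a different answer for different arrival orders, so it cannot be handed to a collective reduction as-is.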
The following operations are, for the most part, well-understood, so they'll simply be listed with little-if-any explanation:
- Arithmetic
- Search (value or location)
A little explanation may be needed here: a max-value is clear enough, but what's a max-location? It is not only the maximum value, but also the location of the processor that found it.
- Boolean (logical or bitwise)
- AND: all TRUE (or bitwise 1's)
- OR: at least one TRUE (or bitwise 1)
- XOR: an odd number of TRUEs (or, bitwise, an odd number of 1's in that bit position); with only two operands this is "exactly one TRUE"
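The listed reductions can be sketched with a single fold over one contribution per processor. This is an illustrative simulation, not a real library's API: `all_reduce` and `max_location` are hypothetical names, list position stands in for processor id, and breaking ties toward the lowest id is an assumption made here so that the result does not depend on arrival order.

```python
from functools import reduce
import operator

def all_reduce(op, contributions):
    """Combine one contribution per processor into a single result."""
    return reduce(op, contributions)

def max_location(values):
    """Return (maximum value, id of the processor holding it).

    Ties go to the lowest id (an assumption), which keeps the
    pairwise combine step independent of arrival order.
    """
    pairs = [(v, rank) for rank, v in enumerate(values)]
    return reduce(lambda a, b: max(a, b, key=lambda p: (p[0], -p[1])), pairs)

values = [4, 9, 2, 9]                 # one number per processor
flags = [True, True, True, False]     # one logical value per processor

total = all_reduce(operator.add, values)       # arithmetic: 24
where = max_location(values)                   # search: (9, 1)
all_true = all_reduce(operator.and_, flags)    # AND: False, not all TRUE
any_true = all_reduce(operator.or_, flags)     # OR: True, at least one TRUE
odd_true = all_reduce(operator.xor, flags)     # XOR: True, three TRUEs is odd
```

The same operators work bitwise on integers: `all_reduce(operator.xor, [0b1100, 0b1010])` gives `0b0110`, setting each bit position where an odd number of contributions had a 1.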
- Synchronization
The more processes a synchronization involves, the more collective it becomes.
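A barrier is a common collective synchronization: nobody proceeds past it until everybody has arrived. The sketch below uses Python's standard `threading.Barrier`, with threads standing in for processes; the function name `run_with_barrier` and the log format are invented for illustration.

```python
import threading

def run_with_barrier(num_workers):
    """Each worker logs 'before', waits at the barrier, then logs 'after'."""
    barrier = threading.Barrier(num_workers)
    log = []
    lock = threading.Lock()  # protects the shared log

    def worker(rank):
        with lock:
            log.append(("before", rank))
        barrier.wait()  # blocks until all num_workers have called wait()
        with lock:
            log.append(("after", rank))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Whatever order the workers happen to run in, every "before" entry appears in the log ahead of every "after" entry, because no worker can get past `barrier.wait()` until all of them have reached it.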
Try One: If you'd like some first-hand experience with barrier synchronization, try an exercise from another module that deals with a game you can play, requiring only a deck of cards and a few other people ...