Simple approach: One process scatters the N messages to the other processes, one at a time.

Smarter approach: Let other processes help propagate the message. In the first step, process 1 sends half of the data (size 4*p) to process 2. In the second step, both processes can now participate: process 1 sends one half of the data it received in step 1 (size 2*p) to process 3, while process 2 sends one half of the data it received in step 1 to process 4. At the third step, four processes (1, 3, 2, and 4) are sending the final messages (size p) to the remaining four processes (5, 6, 7, and 8).

Compare the number of steps in the two approaches:
Simple approach: N-1
Smarter approach: log2N
The latter scales better as N increases.