Understanding Broadcast and Gather in Parallel Computing
Chapter 1: Introduction to Parallel Communication
In the previous segment of this series, we explored how to establish point-to-point communication between processes using the Python mpi4py library. Today, we will delve into two collective communication methods, known as broadcast and gather.
Broadcast and gather are essential techniques in parallel computing, enabling data sharing among multiple processes or nodes.
Section 1.1: Understanding Broadcast
Broadcasting is a method whereby a single process (the root) transmits the same data to every other process in the communicator. This approach is particularly beneficial when common data must be shared across all nodes, such as a configuration file or a lookup table. Rather than sending the same information to each node individually, the data is dispatched once from the root, and every process receives a copy.
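As a minimal, self-contained sketch of a broadcast in mpi4py (the configuration values and filename here are invented for illustration):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Only the root process holds the configuration initially.
config = {"iterations": 100, "tolerance": 1e-6} if rank == 0 else None

# bcast sends the object from the root to every process in the communicator.
config = comm.bcast(config, root=0)
print(f"Rank {rank}: config = {config}")

Run with, for example, mpiexec -n 4 python bcast_demo.py, and every rank prints the same dictionary.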
The first video titled "Multiprocessing in Python | Parallel Programming in Python (Part-3)" explains the principles of broadcasting in detail.
Section 1.2: Exploring Gather
Conversely, gathering is a technique where data from all nodes is collected and sent to a designated process, typically referred to as the "root" process. This method is valuable when each node holds a fragment of data that needs to be amalgamated or analyzed collectively, such as when calculating a global sum or average. The gather function enables the root process to collect all the data from the nodes for joint processing.
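As a minimal sketch of this pattern, assuming each process contributes a single number, the root can compute a global sum from the gathered values:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

partial = rank + 1  # each process holds one fragment of the data

# gather collects one value per process into a list on the root.
values = comm.gather(partial, root=0)

if rank == 0:
    print(f"Global sum = {sum(values)}")
else:
    assert values is None  # gather returns None on non-root ranks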
The second video, titled "Parallel Programming with Python," further illustrates the gathering process and its applications.
Chapter 2: Practical Implementation of Broadcast and Gather
The following code snippet demonstrates how to implement the two key MPI functions for broadcasting and gathering: bcast and gather. The bcast function broadcasts data from a single process (the "root" process) to all other processes within the communicator, while the gather function collects data from all processes to a single process.
Here's the example code:
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    data = [i**2 for i in range(size)]  # create data to broadcast
else:
    data = None

local_data = comm.bcast(data, root=0)  # broadcast data from rank 0 to all other ranks
print(f"Rank {rank}: local data received = {local_data}")

local_data = [i * (rank + 1) for i in local_data]  # work on the local data
gathered_data = comm.gather(local_data, root=0)  # gather all local data to rank 0

if rank == 0:
    print(f"\nRank {rank}: gathered data = {gathered_data}")  # print gathered data on rank 0
When executing this code with four processes using the command:
mpiexec -n 4 python bcast_and_gather.py
the output will resemble the following (the relative order of the per-rank lines may vary between runs):
Rank 0: local data received = [0, 1, 4, 9]
Rank 1: local data received = [0, 1, 4, 9]
Rank 2: local data received = [0, 1, 4, 9]
Rank 3: local data received = [0, 1, 4, 9]
Rank 0: gathered data = [[0, 1, 4, 9], [0, 2, 8, 18], [0, 3, 12, 27], [0, 4, 16, 36]]
Let's break down the code step by step:
- We import the MPI module from the mpi4py package. The predefined communicator MPI.COMM_WORLD (comm) represents all processes involved in the computation. We retrieve the rank of the current process (rank) and the total number of processes (size) in the communicator.
- Data is prepared for broadcasting from the root process (rank 0). In this case, a list is created where each element is the square of its index. If the current process is not the root, data is set to None.
- The bcast function is used to broadcast data from the root process to all other processes. The broadcasted data is assigned to local_data, which is printed to the console for each process.
- Each process performs a trivial computation on its local data, and finally the root process gathers the local_data lists from all processes, including its own.
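One detail worth noting: on non-root ranks, comm.gather returns None, so any post-processing of the combined result belongs inside the rank == 0 branch. As a small illustration (not part of the original script), the root could flatten the gathered lists and compute the kind of global sum mentioned earlier; appended after the gather call in bcast_and_gather.py:

if rank == 0:
    flat = [x for sublist in gathered_data for x in sublist]  # flatten the list of lists
    print(f"Global sum of all elements = {sum(flat)}")
else:
    assert gathered_data is None  # gather returns None on non-root ranks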
Conclusion
To sum up, broadcasting and gathering are crucial communication techniques in parallel computing, enabling efficient data sharing among multiple processes or nodes in a distributed computing framework. The mpi4py library makes these techniques straightforward to implement, letting Python developers build parallel applications around shared and collected data.