    Parallel & Distributed Computing (DC-323), Topic 10 of 35

    Message passing interface (MPI)


    The Message Passing Interface (MPI) is a standardized, portable specification for a message-passing library, used for communication between processes in parallel and distributed computing environments. It is an interface definition rather than a network protocol or a single piece of software; implementations such as MPICH and Open MPI provide the actual library. MPI is commonly used in high-performance computing (HPC) applications, where multiple processes run concurrently on different nodes or cores of a system. It provides a rich set of communication primitives that allow processes to exchange data, coordinate tasks, and synchronize execution in a parallel program.

    Key Concepts of MPI

    1. Parallelism: MPI enables the development of parallel applications by allowing multiple processes to run concurrently, potentially on different machines or processors.
    2. Message Passing: MPI is based on the concept of sending and receiving messages between processes, which allows them to communicate and share data.
    3. Distributed Memory Model: MPI is designed for distributed memory architectures: each process has its own address space, and data moves between processes only through explicit messages. The same model also works on shared-memory machines, where each process still keeps its memory private.

    Why MPI?

    MPI was developed to address the challenges of parallel programming and distributed computing. Prior to MPI, various proprietary message-passing libraries existed, but MPI was designed as a standardized interface to make programs more portable across different systems, from clusters of workstations to supercomputers.

    Main Features of MPI

    1. Point-to-Point Communication: This involves direct communication between two processes. One process sends a message, and the other receives it. Examples of point-to-point communication include:

      • MPI_Send: Sends a message from one process to another.
      • MPI_Recv: Receives a message from another process.
    2. Collective Communication: This type of communication involves multiple processes participating in a communication operation. It is used when data needs to be shared or aggregated among many processes. Examples include:

      • MPI_Bcast: Broadcasts data from one process to all other processes in a communicator.
      • MPI_Reduce: Combines data from all processes in a communicator using a specified operation (e.g., sum, maximum) and delivers the result to one root process.
      • MPI_Gather: Collects data from all processes and gathers it at one root process.
      • MPI_Scatter: Splits data held by one root process into equal chunks and distributes one chunk to every process in the communicator, including the root itself.
    3. Synchronization and Coordination: MPI provides mechanisms for synchronizing and coordinating processes, so that no process moves past a given point until the others have reached it. This includes operations like barriers:

      • MPI_Barrier: Synchronizes all processes in a communicator, making sure that all processes reach the barrier before proceeding.
    4. Data Types and Structuring: MPI supports complex data types, allowing users to define custom structures for messages, facilitating the transmission of complex data across processes.

    5. Fault Tolerance: MPI offers only limited fault handling. By default, an error in an MPI call aborts the whole job; a program can instead install the MPI_ERRORS_RETURN error handler so that calls return error codes it can inspect and react to. MPI does not provide the automatic recovery from node failures that frameworks such as MapReduce or Spark offer.

    MPI Communication Models

    MPI supports two primary communication models, distinguished by whether a call blocks the calling process:

    1. Blocking (Synchronous) Communication: A blocking call returns only when the buffer passed to it is safe to reuse. MPI_Recv returns once the message has arrived. MPI_Send returns once the send buffer can be reused, which may happen before the receiver has posted a matching receive if the implementation buffers the message internally.

      • Example: MPI_Ssend (synchronous send) completes only after the matching receive has started, so the sender and receiver are synchronized. MPI_Send is blocking but not necessarily synchronous; whether it waits for the receiver depends on the implementation and the message size.
    2. Non-blocking (Asynchronous) Communication: A non-blocking call starts the operation and returns immediately, so the process can continue computing while the message is in transit. The buffer must not be reused or read until the operation has been completed with MPI_Wait or MPI_Test.

      • Example: MPI_Isend (non-blocking send) and MPI_Irecv (non-blocking receive) return a request handle that is later passed to MPI_Wait or MPI_Test to complete the operation.

    MPI Functions

    MPI provides a large set of functions for different operations. Here are some key categories:

    Initialization and Finalization

    • MPI_Init: Initializes the MPI environment.
    • MPI_Finalize: Terminates the MPI environment.

    Point-to-Point Communication

    • MPI_Send: Sends a message from one process to another.
    • MPI_Recv: Receives a message from another process.
    • MPI_Isend: Initiates a non-blocking send operation.
    • MPI_Irecv: Initiates a non-blocking receive operation.

    Collective Communication

    • MPI_Bcast: Broadcasts a message from one process to all other processes in a communicator.
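    • MPI_Reduce: Combines one value (or array) from every process with an operation such as MPI_SUM and places the result at the root process.
    • MPI_Gather: Gathers data from all processes and places it in the root process.
    • MPI_Scatter: Splits data at the root process into equal chunks and distributes one chunk to every process in the communicator, including the root.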

    Synchronization

    • MPI_Barrier: Synchronizes all processes in a communicator, ensuring that all processes reach the barrier before proceeding.

    Derived Data Types

    • MPI_Type_create_struct: Defines a new custom data type that can be sent/received by MPI processes.
    • MPI_Type_commit: Commits the custom data type so it can be used in communication.

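    Error Handling

    • MPI_Comm_set_errhandler: Chooses how errors on a communicator are handled, for example MPI_ERRORS_RETURN to return error codes instead of aborting.
    • MPI_Error_string: Converts an MPI error code into a readable message.

    Process Identification

    • MPI_Comm_rank: Gets the rank (ID) of the calling process within a communicator.
    • MPI_Comm_size: Gets the size of the communicator (total number of processes).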

    MPI Example Code

    Here’s an example of a basic MPI program that demonstrates point-to-point communication, where one process sends data to another:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, size;
        MPI_Init(&argc, &argv);  // Initialize MPI
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // Get this process's rank
        MPI_Comm_size(MPI_COMM_WORLD, &size);  // Get total number of processes

        // Rank 1 must exist, otherwise the send below targets an invalid rank
        if (size < 2) {
            if (rank == 0) fprintf(stderr, "This example needs at least 2 processes\n");
            MPI_Finalize();
            return 1;
        }

        int data = 0;
        if (rank == 0) {
            data = 100;  // Process 0 sets the data to 100
            MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  // Send data to process 1
            printf("Process 0 sent data: %d\n", data);
        } else if (rank == 1) {
            MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);  // Receive data from process 0
            printf("Process 1 received data: %d\n", data);
        }

        MPI_Finalize();  // Finalize MPI
        return 0;
    }
    

    In this example:

    • MPI_Init initializes the MPI environment.
    • MPI_Comm_rank retrieves the rank of the calling process, and MPI_Comm_size retrieves the number of processes in the communicator.
    • MPI_Send sends data from process 0 to process 1.
    • MPI_Recv receives the data at process 1.
    • MPI_Finalize cleans up the MPI environment after the program finishes.

    Advantages of MPI

    1. Portability: MPI is platform-independent and is supported on many different architectures, including clusters, supercomputers, and multi-core systems.
    2. Scalability: MPI can efficiently handle communication in both small-scale (multi-core) and large-scale (multi-node) distributed systems.
    3. Flexibility: MPI supports blocking and non-blocking communication, as well as synchronous and buffered send modes, allowing for fine-grained control over performance and synchronization.
    4. Rich Set of Communication Patterns: MPI provides many communication primitives to support point-to-point, collective, and one-sided communications.
    5. Performance: MPI is highly optimized for performance, offering low-latency communication and the ability to minimize the overhead of message-passing.

    Challenges of MPI

    1. Complexity: MPI programming can be more complex compared to higher-level parallel programming models, as the programmer is responsible for explicitly managing communication and synchronization.
    2. Portability Issues: Although MPI is portable, performance can vary across different systems and network architectures. Optimizing MPI programs often requires tuning for specific hardware.
    3. Debugging: Debugging parallel MPI programs can be difficult, especially when dealing with race conditions, deadlocks, or performance bottlenecks.
    4. Error Handling: While MPI provides some basic error-handling mechanisms, fault tolerance in MPI is not as advanced as in some higher-level frameworks like MapReduce.

    MPI in Modern Computing

    MPI remains a cornerstone of high-performance computing and is used in a wide variety of scientific, engineering, and industrial applications that require large-scale parallel processing. Other parallel programming models, such as OpenMP (for shared-memory systems) and CUDA (for GPUs), target parallelism within a single node or device and are often combined with MPI in hybrid programs. MPI remains a powerful and widely used tool for distributed memory systems where explicit message passing is necessary.


    Conclusion

    The Message Passing Interface (MPI) is a powerful, flexible, and portable standard for parallel and distributed programming. It provides a rich set of communication primitives that allow for efficient, fine-grained control over inter-process communication, synchronization, and data sharing in distributed memory systems. Although it can be challenging to use, its scalability and performance make it a common choice for high-performance applications, particularly in scientific and research computing environments.
