COMP3139 Parallel & Distributed Computing, Topic 4 of 33

Hardware Architectures: Multi-Processors (Shared Memory)

In computing, a multi-processor system is a system that uses multiple processors to perform computations. These processors can share memory, which means they can access and modify the same data. A shared memory system is a type of multi-processor architecture in which all processors can directly access the same memory space.

Let’s break down this concept into more detail and explore how it works.

1. What is a Multi-Processor System?

A multi-processor system is a computing system that uses more than one processor to execute instructions. These processors are often used to work on different parts of a problem simultaneously, making the system more powerful than a single processor machine.

Processors: These are the computing units that execute instructions. In a multi-processor system, each processor can work on a separate task, or multiple processors can work on the same task, but on different parts.
Shared Memory: In a shared memory system, multiple processors access a common pool of memory. This allows them to read from and write to the same memory locations. Shared memory allows processors to communicate and share data without needing an external communication network.
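
The idea of communicating through memory rather than over a network can be sketched as follows. This is a minimal illustration, assuming Python threads stand in for processors (CPython's global interpreter lock means they do not execute bytecode truly in parallel, but they do share one address space). The names mailbox, producer and consumer are hypothetical.

```python
import threading

# A location in the shared address space that both threads can see.
mailbox = {"message": None}
# Signals that the producer has finished writing to the mailbox.
ready = threading.Event()
received = []

def producer() -> None:
    # "Processor 0" communicates by writing to shared memory.
    mailbox["message"] = "hello from processor 0"
    ready.set()

def consumer() -> None:
    # "Processor 1" waits for the signal, then reads the same location.
    ready.wait()
    received.append(mailbox["message"])

threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(received)
```

No message is sent anywhere: the consumer simply reads the location the producer wrote to. The event is only there so that the read happens after the write.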

2. Types of Multi-Processor Architectures

There are two primary types of multi-processor architectures, distinguished by how the processors are organized and how work is assigned to them:

Symmetric Multi-Processing (SMP)
Asymmetric Multi-Processing (AMP)

Symmetric Multi-Processing (SMP)

In an SMP system, all processors have equal access to shared memory. There is no master processor, and each processor is independent but can communicate with others via shared memory.

All processors share the same memory space. Each processor can access any part of the memory directly.
Each processor can run its own thread of execution, and the system uses shared memory for synchronization and communication between processors.
Scalability: SMP systems scale well to a moderate number of processors. However, as more processors are added, the system may face bottlenecks due to the limitations of the memory bus and the need for synchronization.
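
The symmetric arrangement can be sketched as a pool of identical workers that all pull from one shared work queue, with no worker in charge. This is an illustrative sketch only, with Python threads standing in for processors; NUM_WORKERS and the squaring task are assumptions made for the example.

```python
import queue
import threading

NUM_WORKERS = 4  # hypothetical number of processors

# One shared pool of work that every worker can take from.
tasks = queue.Queue()
for n in range(20):
    tasks.put(n)

results = []
results_lock = threading.Lock()

def worker() -> None:
    # Every worker runs exactly the same code: none of them is a master.
    while True:
        try:
            n = tasks.get_nowait()
        except queue.Empty:
            return  # no work left
        with results_lock:  # protect the shared results list
            results.append(n * n)

workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(results))
```

Which worker handles which task is not fixed in advance, so the order of results can vary between runs. Only the sorted set of results is predictable.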

Example of SMP:

Modern multi-core processors (e.g., Intel Core i7/i9, AMD Ryzen) are built on the SMP architecture, where all cores on a chip can access the same memory.

Asymmetric Multi-Processing (AMP)

In an AMP system, there is a master processor that controls the system, while the other processors are subordinate and are used for specific tasks or workloads.

Master Processor: The main processor that manages the system, typically running the operating system and assigning work to the other processors.
Slave Processors: Additional processors that perform certain computations but rely on the master processor for coordination.
Shared Memory: While there is still a shared memory space, the master processor typically has more control over memory access and task distribution.

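AMP systems are less common than SMP systems in general-purpose computing because they are harder to scale and less flexible: the master processor can become a bottleneck, since every other processor depends on it for coordination.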
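
The master and subordinate roles can be sketched as follows. This is a simplified illustration, with Python threads standing in for processors; the function names, the round-robin assignment and the doubling task are assumptions made for the example, not a description of any particular AMP system.

```python
import queue
import threading

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    # A subordinate processor only does what the master sends it.
    while True:
        task = inbox.get()
        if task is None:  # sentinel from the master: shut down
            return
        outbox.put((task, task * 2))

def master(items, num_workers=3):
    # The master owns task distribution: one inbox per subordinate.
    inboxes = [queue.Queue() for _ in range(num_workers)]
    outbox = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for inbox in inboxes
    ]
    for w in workers:
        w.start()
    # The master decides which subordinate gets which item (round robin).
    for i, item in enumerate(items):
        inboxes[i % num_workers].put(item)
    for inbox in inboxes:
        inbox.put(None)
    for w in workers:
        w.join()
    # All workers have finished, so the outbox holds every result.
    results = {}
    while not outbox.empty():
        task, value = outbox.get()
        results[task] = value
    return results

doubled = master([1, 2, 3, 4, 5])
print(doubled)
```

Every task passes through the master, which is why the master limits how far such a design scales.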

Example of AMP:

Early mainframes and embedded systems (such as certain real-time systems or some early computer architectures) used AMP, where one processor handled the overall control and the others handled specific tasks.

3. Shared Memory in Multi-Processor Systems

In a shared memory system, multiple processors can read from and write to the same memory location, which is a key advantage when it comes to communication and data sharing.

Shared Memory Characteristics:

Unified Memory Access: All processors access the same memory space, meaning they can share data and communicate easily.
Coherence: All processors must have a consistent view of the shared memory. If one processor changes a memory value, the other processors must observe the change rather than continue using a stale copy.
Synchronization: Since multiple processors are accessing the same memory, synchronization mechanisms (like locks, semaphores, or barriers) are needed to avoid conflicts (e.g., two processors trying to modify the same memory location simultaneously).
Bus/Memory Access: There are mechanisms for connecting processors to memory, typically through a memory bus or interconnect.
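
The conflict described under Synchronization, where several processors modify the same location, can be illustrated with a shared counter. This is a minimal sketch with Python threads standing in for processors; the names counter and increment are illustrative.

```python
import threading

counter = 0  # shared memory location updated by every thread
counter_lock = threading.Lock()

def increment(times: int) -> None:
    global counter
    for _ in range(times):
        # "counter += 1" is a read, an add and a write. The lock makes the
        # three steps happen as one unit, so no update is lost.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, two threads could read the same old value and each write back old value + 1, losing one of the increments.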

4. How Shared Memory Works

In a shared memory system, all processors are connected to a central memory unit, either directly or through a communication bus. The key features of shared memory systems include:

Memory Bus: A system bus or interconnect connects the processors to memory. The bus handles the communication between the processors and the shared memory.
Cache: To speed up memory access, processors typically use caches (small, fast memory located closer to the processor). Since multiple processors share the same memory, cache coherency mechanisms are used to ensure that each processor sees a consistent view of the memory.
Cache Coherency: Since each processor has its own cache, there needs to be a system in place to make sure that if one processor changes a cached value, other processors do not keep using a stale copy. A widely used family of cache coherency protocols is MESI (Modified, Exclusive, Shared, Invalid) and its variants such as MOESI and MESIF, which track the state of each cache line (a small fixed-size block of memory, commonly 64 bytes).
Synchronization: With multiple processors reading and writing to the same memory, synchronization is essential to avoid data conflicts. Common synchronization techniques include:
- Locks (mutexes): Prevent multiple processors from accessing shared data simultaneously.
- Semaphores: Used to control access to resources when multiple processes are competing.
- Barriers: Ensure that all processors reach a certain point in execution before continuing.
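
Two of the techniques listed above, barriers and semaphores, can be sketched together. This is an illustrative example with Python threads standing in for processors. The two-phase computation, the limit of two concurrent holders, and all variable names are assumptions made for the example.

```python
import threading

NUM_WORKERS = 4
phase_one = [0] * NUM_WORKERS
phase_two = [0] * NUM_WORKERS

barrier = threading.Barrier(NUM_WORKERS)  # all workers must arrive
slots = threading.Semaphore(2)            # at most 2 workers inside at once
state_lock = threading.Lock()
active = 0
max_active = 0

def worker(i: int) -> None:
    global active, max_active
    # Phase 1: each worker writes its own slot.
    phase_one[i] = i * 10
    # Nobody starts phase 2 until every worker has finished phase 1.
    barrier.wait()
    neighbour = (i + 1) % NUM_WORKERS
    # Phase 2: the semaphore limits how many workers run this part together.
    with slots:
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        phase_two[i] = phase_one[i] + phase_one[neighbour]
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(phase_two, max_active)
```

The barrier is what makes reading a neighbour's slot safe. Without it, a fast worker could read phase_one[neighbour] before the neighbour had written it.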

5. Advantages of Shared Memory Multi-Processor Systems

Simpler Communication: Since processors share a common memory, they can communicate easily by writing to and reading from the shared memory space. This makes programming simpler compared to systems that require explicit message passing (like in distributed systems).
Efficient Data Sharing: Shared memory systems are ideal for applications where multiple processors need to work on the same dataset, for example parallel matrix operations in scientific computing, where each processor works on a part of the matrix and writes into the same result.
Reduced Communication Overhead: Unlike distributed systems, where processors need to send messages over a network, shared memory allows processors to share data directly in memory, which is much faster than communication over a network.
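
The matrix example can be sketched as a row-partitioned matrix-vector product. This is a minimal sketch with Python threads standing in for processors; the small matrix, the two-way split and the function name multiply_rows are assumptions made for the example.

```python
import threading

# Inputs and output all live in shared memory; nothing is copied or sent.
matrix = [[1, 2], [3, 4], [5, 6], [7, 8]]
vector = [10, 1]
result = [0] * len(matrix)

def multiply_rows(start: int, stop: int) -> None:
    # Each worker handles a block of rows and writes only its own
    # entries of the shared result, so no lock is needed here.
    for r in range(start, stop):
        result[r] = sum(a * b for a, b in zip(matrix[r], vector))

workers = [
    threading.Thread(target=multiply_rows, args=(0, 2)),
    threading.Thread(target=multiply_rows, args=(2, 4)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(result)  # [12, 34, 56, 78]
```

In a message-passing system, the row blocks and the vector would first have to be sent to each worker, and the partial results sent back.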

6. Challenges in Shared Memory Multi-Processor Systems

Scalability: As the number of processors increases, the performance gains diminish. This is due to the increased contention for memory resources (the bus, caches, and memory access) and the overhead of maintaining cache coherency.
Cache Coherency: Managing multiple caches in a shared memory system can be complex. Without proper cache coherency protocols, processors might work with outdated or inconsistent data. This can lead to bugs and incorrect results.
Synchronization Overhead: As more processors are added, managing synchronization between them becomes more difficult. Ensuring that data access is done safely without conflicts (race conditions) requires careful design, which can add overhead and reduce overall performance.
Memory Bottleneck: In large multi-processor systems, the memory bus can become a bottleneck. If many processors are trying to access the memory simultaneously, it can slow down performance because memory access becomes a limiting factor.
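
To make the cache coherency challenge concrete, the following is a deliberately simplified model of MESI-style state changes for a single cache line. It is a teaching sketch, not how any real processor implements the protocol: it ignores bus transactions, timing and evictions, and the class name MesiLine and its methods are hypothetical.

```python
class MesiLine:
    """One cache line as seen by several caches, plus main memory."""

    def __init__(self, num_caches: int, initial: int = 0) -> None:
        self.memory = initial
        self.states = ["I"] * num_caches    # every cache starts Invalid
        self.values = [None] * num_caches   # each cache's private copy

    def read(self, cpu: int):
        if self.states[cpu] == "I":  # read miss
            shared = False
            for other, state in enumerate(self.states):
                if other == cpu or state == "I":
                    continue
                if state == "M":
                    # The dirty copy is written back before being shared.
                    self.memory = self.values[other]
                self.states[other] = "S"
                shared = True
            self.values[cpu] = self.memory
            # Exclusive if no other cache holds the line, otherwise Shared.
            self.states[cpu] = "S" if shared else "E"
        return self.values[cpu]

    def write(self, cpu: int, value: int) -> None:
        # Before writing, every other valid copy is invalidated.
        for other, state in enumerate(self.states):
            if other != cpu and state != "I":
                if state == "M":
                    self.memory = self.values[other]
                self.states[other] = "I"
                self.values[other] = None
        self.values[cpu] = value
        self.states[cpu] = "M"  # this copy is now the only valid one

line = MesiLine(num_caches=2, initial=7)
first = line.read(0)
after_first = list(line.states)    # ['E', 'I']
line.read(1)
after_second = list(line.states)   # ['S', 'S']
line.write(0, 42)
after_write = list(line.states)    # ['M', 'I']
seen_by_1 = line.read(1)           # 42, not the stale 7
after_reread = list(line.states)   # ['S', 'S']
print(after_first, after_second, after_write, seen_by_1, after_reread)
```

The write by cache 0 invalidates cache 1's copy, so cache 1's next read misses and picks up the new value. Each invalidation and write-back is extra traffic, which is one reason coherency overhead grows with the number of processors.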

7. Examples of Shared Memory Multi-Processor Systems

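Supercomputers: High-performance computing (HPC) systems, such as IBM’s Blue Gene or Cray supercomputers, are distributed-memory machines at the scale of the whole system, but each compute node is typically a shared memory multi-processor. Programs for scientific simulations and large-scale calculations often combine shared memory within a node with message passing between nodes.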
Multi-Core CPUs: Most modern processors in personal computers (e.g., Intel Core or AMD Ryzen processors) use shared memory architectures, with multiple cores accessing the same memory pool.
Cloud Servers: In cloud computing, servers with multiple processors (e.g., Intel Xeon or AMD EPYC) often use shared memory for faster inter-processor communication and resource sharing.

8. Conclusion

Multi-processor systems with shared memory are widely used in modern computing to achieve high-performance parallel processing. The ability of processors to access and modify the same memory space enables efficient communication and data sharing. However, managing synchronization, cache coherency, and memory contention becomes more challenging as the number of processors grows.

These systems are particularly suited for applications where data needs to be shared frequently and computations can be divided into smaller parallel tasks. To fully utilize the power of multi-processor systems, developers must carefully manage synchronization and memory access.

