Parallel & Distributed Computing, Topic 20 of 33

Threads in Shared-Memory Programming

In shared-memory programming, threads are independent units of execution that run concurrently on a multi-core or multi-processor system, where they all share a common memory space. Threads enable parallelism by dividing a task into smaller sub-tasks that can be executed simultaneously, leading to faster execution times for large-scale computations.

Since multiple threads share access to the same memory, thread synchronization and coordination become critical to ensure that data is accessed and modified correctly without conflicts.

1. What is a Thread?

A thread is the smallest unit of execution that the operating system can schedule. A thread is sometimes referred to as a "lightweight process" because it runs independently but within the context of a larger process. A process may contain one or more threads, and each thread can execute code independently. Each thread has its own stack, program counter, and registers, but all threads within a process share the same address space and can access the same data.

In the context of shared-memory programming, threads communicate and share data by reading and writing to the same memory locations.

2. Threads in Shared-Memory Systems

In shared-memory systems, all threads within a process have access to a common region of memory. This means that when you create multiple threads, they can all read and write to the same variables, arrays, or data structures, without needing explicit communication through message-passing mechanisms (which is typical in distributed-memory systems).
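
As a minimal sketch of this idea, the C++ code below has several threads write into one shared `std::vector`. The function name `square_in_parallel` and the one-slot-per-thread layout are assumptions made for illustration. Because each thread touches only its own element, no locking is needed in this particular case.

```cpp
#include <thread>
#include <vector>

// Hypothetical helper: each of n threads writes i*i into its own slot of a
// vector that lives in memory shared by all threads of the process.
std::vector<int> square_in_parallel(int n) {
    std::vector<int> results(n, 0);      // shared by every thread
    std::vector<std::thread> workers;
    workers.reserve(n);

    for (int i = 0; i < n; ++i) {
        // results is captured by reference; i is copied so that each
        // thread keeps its own index.
        workers.emplace_back([&results, i] { results[i] = i * i; });
    }
    for (auto& w : workers) {
        w.join();                        // after join, the writes are visible here
    }
    return results;
}
```

Calling `square_in_parallel(4)` from `main` yields a vector holding 0, 1, 4, 9. No message is sent between threads; the result is simply read from the shared vector.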

Key Properties of Threads in Shared-Memory Systems:

Shared Global Memory: All threads have access to the same data structures in memory. This allows threads to share information quickly, but it also means careful management is needed to avoid conflicts.
Concurrency: Multiple threads can run at the same time, each working on different parts of the data or on different tasks.
Synchronization: Since threads share memory, synchronization mechanisms (such as locks, semaphores, or atomic operations) are used to avoid issues like race conditions (simultaneous data access conflicts).
Efficiency: Thread creation and management are typically less resource-intensive than processes, making them a lightweight way to achieve parallelism.

3. Thread Management

Managing threads in shared-memory programming involves creating, running, and synchronizing threads to avoid problems like race conditions, deadlocks, and excessive contention for resources.

a. Creating Threads

Threads are typically created through a threading library or API such as OpenMP, Pthreads, or C++ threads.

OpenMP: A high-level directive-based API where the programmer uses special pragmas to parallelize loops or sections of code.
Pthreads (POSIX threads): A low-level library that gives the programmer fine-grained control over thread creation, synchronization, and management.
C++ Threads: A standard C++ library introduced in C++11 that provides a simple API for creating and managing threads.

b. Starting Threads

Threads in shared-memory systems typically begin execution at a specific point in the program, often a function or a block of code, and run concurrently with other threads.

For example:

C++ Thread Example: Using std::thread in C++ to create and run a thread:

```
#include <iostream>
#include <thread>

void print_message() {
    std::cout << "Hello from thread!" << std::endl;
}

int main() {
    // Create a thread that starts executing print_message
    std::thread t(print_message);

    // Wait for the thread to complete
    t.join();

    return 0;
}
```

Pthreads Example: Using POSIX threads in C:

```
#include <pthread.h>
#include <stdio.h>

void* print_message(void* arg) {
    printf("Hello from thread!\n");
    return NULL;
}

int main() {
    pthread_t thread;
    pthread_create(&thread, NULL, print_message, NULL);
    pthread_join(thread, NULL);  // Wait for thread to finish
    return 0;
}
```
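
A thread's starting point can also take arguments, or be an inline block of code. The sketch below assumes two hypothetical names, `add_to` and `start_threads_with_arguments`, and shows both forms with `std::thread`.

```cpp
#include <functional>
#include <thread>

// Hypothetical worker: adds `amount` to the integer referred to by `total`.
void add_to(int& total, int amount) {
    total += amount;
}

int start_threads_with_arguments() {
    int total = 0;

    // Arguments after the function are copied into the new thread;
    // std::ref is required to pass `total` by reference.
    std::thread t1(add_to, std::ref(total), 5);
    t1.join();  // t1 has finished before t2 starts, so `total` is never
                // accessed by two threads at once

    // A lambda lets a thread start from a block of code written inline.
    std::thread t2([&total] { total *= 2; });
    t2.join();

    return total;
}
```

The two threads run one after the other here, which keeps the example free of synchronization; the function returns 10.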

c. Terminating Threads

A thread can finish its work and exit in two main ways:

Voluntarily: A thread can reach the end of its function, at which point it terminates.
Forced Termination: Sometimes a thread can be terminated from another thread (although this is generally discouraged due to potential issues like resource leaks or data corruption).

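A finished thread does not rejoin the main thread by itself; another thread, usually the one that created it, calls join() to wait for it. join() blocks the caller until the target thread has completed, so the main thread does not proceed (or exit) while its workers are still running. In C++, a std::thread that is still joinable when it is destroyed calls std::terminate, so each thread must be joined or detached.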
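
A common alternative to forced termination is to ask the thread to stop and let it exit voluntarily. The following is a minimal sketch of that pattern, assuming a hypothetical function `run_until_stopped` and a shared `std::atomic<bool>` flag; real code would do useful work in place of the counter.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: the worker polls a shared stop flag and exits by
// returning from its function, rather than being killed from outside.
long run_until_stopped(long min_iterations) {
    std::atomic<bool> stop_requested(false);
    std::atomic<long> iterations(0);

    std::thread worker([&] {
        while (!stop_requested.load()) {
            iterations.fetch_add(1);     // stand-in for one unit of real work
        }
        // Falling off the end of the function terminates the thread cleanly.
    });

    // Let the worker make some progress, then ask it to stop.
    while (iterations.load() < min_iterations) {
        std::this_thread::yield();
    }
    stop_requested.store(true);
    worker.join();                       // wait until the worker has exited

    return iterations.load();
}
```

Because the worker decides when to leave its loop, it can release any resources it holds before returning, which is what forced termination cannot guarantee.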

4. Thread Synchronization

In shared-memory programming, multiple threads can access the same memory locations. If two or more threads access the same data at the same time without synchronization, and at least one of them modifies it, the result is a data race: the final value of the shared data depends on the order of thread execution, which is unpredictable.

Synchronization Mechanisms:

To ensure correctness, synchronization mechanisms are used to control the access to shared resources. These mechanisms allow only one thread to access a resource at a time or control the order in which threads interact with shared data.

Locks (Mutexes): A mutex (short for mutual exclusion) is a synchronization primitive that allows only one thread to access a critical section of code (where shared data is accessed) at a time. When one thread locks a mutex, other threads that attempt to lock the same mutex will be blocked until the lock is released.

Example:
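```
#include <mutex>

std::mutex mtx;

void thread_func() {
    // lock_guard locks mtx here and unlocks it automatically when the
    // scope ends, even if the critical section throws.
    std::lock_guard<std::mutex> guard(mtx);
    // Critical section: access the shared resource
}
```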
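
To see a mutex protecting real shared data, consider the sketch below, in which several threads append to one vector. The names `collect_ids`, `shared_log`, and `log_mutex` are hypothetical and chosen only for this illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example: several threads append to one shared vector.
// push_back is not thread-safe, so every call is guarded by the same mutex.
std::vector<int> collect_ids(int num_threads, int items_per_thread) {
    std::vector<int> shared_log;
    std::mutex log_mutex;
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&shared_log, &log_mutex, t, items_per_thread] {
            for (int i = 0; i < items_per_thread; ++i) {
                std::lock_guard<std::mutex> guard(log_mutex);  // locks here
                shared_log.push_back(t);                       // critical section
            }                                                  // unlocks here
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return shared_log;
}
```

With the lock in place, `collect_ids(4, 1000)` always returns 4000 entries, although the order in which thread ids appear varies from run to run.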
Atomic Operations: Some operations, like incrementing a counter, can be performed atomically, meaning the whole read-modify-write happens as one indivisible step. No other thread can observe or interleave with a half-finished update, which prevents race conditions on that variable without a lock.

Example (C++11 atomic increment):
```
#include <atomic>
std::atomic<int> counter(0);
// Inside any thread function:
counter++;  // Atomic read-modify-write, same as counter.fetch_add(1)
```
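
A slightly fuller sketch of the same idea follows, with several threads incrementing one atomic counter. The function name `count_with_atomic` is an assumption for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical example: every thread increments the same counter.
// With a plain int this would be a data race; std::atomic makes each
// increment indivisible.
int count_with_atomic(int num_threads, int increments_per_thread) {
    std::atomic<int> counter(0);
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1);    // atomic read-modify-write
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

No increment is lost, so `count_with_atomic(4, 10000)` returns exactly 40000.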
Barriers: A barrier is a synchronization point where threads wait for each other to reach the same point before proceeding. Barriers are useful when all threads need to synchronize their progress after a certain phase of computation.
Condition Variables: These are used for waiting and signaling between threads. A thread can wait on a condition variable, and another thread can signal it when a certain condition is met.
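
Barriers and condition variables can be illustrated together, since a simple barrier can be built from a mutex and a condition variable. The class `SimpleBarrier` and the function `two_phase_sums` below are hypothetical names for this sketch; C++20 also offers a ready-made `std::barrier`, which is not used here.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier built from a mutex and a condition variable.
class SimpleBarrier {
public:
    explicit SimpleBarrier(int count)
        : threshold_(count), waiting_(0), generation_(0) {}

    void arrive_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const int my_generation = generation_;
        if (++waiting_ == threshold_) {
            // Last thread to arrive: start a new generation and wake everyone.
            ++generation_;
            waiting_ = 0;
            cv_.notify_all();
        } else {
            // Sleep until the generation changes; the predicate guards
            // against spurious wakeups.
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int threshold_;
    int waiting_;
    int generation_;
};

// Phase 1: each thread writes its own slot. Phase 2: each thread sums all slots.
std::vector<int> two_phase_sums(int num_threads) {
    std::vector<int> slots(num_threads, 0);
    std::vector<int> sums(num_threads, 0);
    SimpleBarrier barrier(num_threads);
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            slots[t] = t + 1;             // phase 1
            barrier.arrive_and_wait();    // nobody reads until all have written
            for (int v : slots) {         // phase 2
                sums[t] += v;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return sums;
}
```

With four threads, every entry of the returned vector is 1 + 2 + 3 + 4 = 10, because no thread starts phase 2 before all slots are filled.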

5. Thread Safety

Thread safety is the property that a program or function executes correctly when multiple threads access shared resources concurrently. In shared-memory programming, thread safety is crucial to avoid issues like race conditions, deadlocks, and inconsistent data states.

Some guidelines for ensuring thread safety:

Minimize Shared State: The less shared data there is, the less chance there is for conflicts. Try to minimize the amount of data that multiple threads need to access.
Use Synchronization Mechanisms Appropriately: Ensure that critical sections are properly protected using locks or other synchronization methods, but avoid excessive locking, which can degrade performance.
Avoid Deadlocks: A deadlock occurs when two or more threads are waiting for each other to release resources, causing all of them to freeze. Careful management of locks and resources can avoid deadlocks.
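
One way to apply the deadlock guideline above is to acquire all required locks in a single step. The sketch below assumes a hypothetical `Account` type and uses `std::lock`, which locks several mutexes with a deadlock-avoidance algorithm.

```cpp
#include <mutex>
#include <thread>

// Hypothetical account type used only for this illustration.
struct Account {
    std::mutex m;
    long balance;
    explicit Account(long initial) : balance(initial) {}
};

// std::lock acquires both mutexes without risking deadlock, so two transfers
// running in opposite directions cannot end up waiting on each other.
void transfer(Account& from, Account& to, long amount) {
    std::lock(from.m, to.m);
    // adopt_lock tells each guard that the mutex is already locked; the
    // guards only take care of unlocking at the end of the scope.
    std::lock_guard<std::mutex> lock_from(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> lock_to(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

long run_opposing_transfers(int rounds) {
    Account a(1000);
    Account b(1000);
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 2); });
    t1.join();
    t2.join();
    return a.balance + b.balance;  // money is only moved, so the total is unchanged
}
```

If each transfer instead locked `from.m` and then `to.m` separately, the two threads could each hold one mutex and wait forever for the other. In C++17 and later, `std::scoped_lock` expresses the same thing in one line.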

6. Performance Considerations in Threading

While using multiple threads can speed up computation, not all parallelization strategies are beneficial. The overhead of managing threads and synchronization can sometimes outweigh the performance gains, especially if the task being parallelized is too small or there are too many threads competing for limited resources (like memory or cache).

Thread Overhead: Creating and managing threads has an inherent cost. Running many more threads than there are cores forces the CPU to context-switch between them frequently, which slows down the overall execution.
Load Balancing: In some cases, threads may become unevenly distributed, where one thread ends up doing more work than others, leading to idle time for some threads. Efficient work distribution is key to achieving high performance.
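
The points above can be sketched with a parallel sum that gives each thread an equal-sized chunk and keeps shared writes to a minimum. The function name `parallel_sum` and the chunking scheme are assumptions for illustration, not a tuned implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical example: split the input into equal chunks, one per thread.
long long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    if (num_threads == 0) {
        num_threads = 1;                  // hardware_concurrency() may report 0
    }
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Round up so that every element belongs to some chunk.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            long long local = 0;          // thread-private accumulator
            for (std::size_t i = begin; i < end; ++i) {
                local += data[i];
            }
            partial[t] = local;           // one write per thread to its own slot
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

A typical call is `parallel_sum(data, std::thread::hardware_concurrency())`, which matches the thread count to the number of hardware threads. For small inputs the cost of creating threads is likely to exceed the time saved, so a serial loop may well be faster.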

7. Common Threading Libraries and Models

Several libraries and programming models are used to implement threads in shared-memory programming:

OpenMP: A high-level API for parallel programming that uses compiler directives to specify parallel sections. It abstracts much of the low-level thread management, making it easy to parallelize existing code with minimal changes.

Example OpenMP directive:
```
#pragma omp parallel for
for (int i = 0; i < 100; i++) {
    // Parallel code here
}
```
Pthreads: A low-level library that provides detailed control over thread creation, management, and synchronization. It requires more effort to use but offers fine-grained control over concurrency.
C++ Threads: A higher-level threading library introduced in C++11 that allows for easy creation and management of threads using the std::thread class.
Java Threads: Java also provides built-in support for multithreading using the Thread class and the Runnable interface, with synchronization mechanisms like synchronized blocks and ReentrantLock.
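
To make the OpenMP entry above more concrete, here is a minimal sketch of a parallel loop that accumulates into a shared variable. The function name `sum_of_squares` is hypothetical, and the example assumes the compiler is invoked with OpenMP enabled (for example `-fopenmp` on GCC or Clang); without it the pragma is ignored and the loop runs serially with the same result.

```cpp
// Hypothetical example of an OpenMP parallel loop with a reduction.
long sum_of_squares(int n) {
    long total = 0;
    // reduction(+:total) gives each thread a private copy of total and
    // combines the copies when the loop ends, which avoids a data race.
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += static_cast<long>(i) * i;
    }
    return total;
}
```

Compared with the Pthreads and `std::thread` approaches, thread creation, work division, and joining are all implied by the single directive.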

8. Conclusion

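Threads in shared-memory programming enable the parallel execution of tasks within a process, allowing for more efficient computation, especially on multi-core or multi-processor systems. By leveraging threads, developers can divide large tasks into smaller subtasks and run them concurrently, reducing execution time. However, managing threads correctly requires careful synchronization to avoid race conditions and deadlocks, and attention to thread overhead and load balancing so that the parallel version is actually faster.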