    Computer Architecture (COMP3147), Topic 19 of 24

    Thread-Level Parallelism: Cache coherency


    ⭐ Thread-Level Parallelism (TLP) and Cache Coherency


    1. Thread-Level Parallelism (TLP) – Definition

    Thread-Level Parallelism (TLP) refers to the ability of a computer system to execute multiple threads simultaneously, typically on multiple cores or processors, to improve overall throughput.

    While Instruction-Level Parallelism (ILP) exploits parallelism within a single thread, TLP exploits parallelism across multiple threads.


    Key Goals of TLP:

    1. Improve CPU utilization when some threads are stalled (e.g., memory access).
    2. Increase overall system throughput by running multiple threads concurrently.
    3. Exploit multi-core and multi-processor architectures.

    2. Forms of TLP

    1. Multithreading (MT)

      • Fine-grained multithreading: Switch threads every cycle, so the stalls of any one thread are hidden by instructions from the other threads.
      • Coarse-grained multithreading: Switch threads only on long-latency events (such as cache misses).
    2. Simultaneous Multithreading (SMT)

      • Multiple threads share the pipeline of a single core, and instructions from different threads can issue in the same cycle.
      • Example: Intel Hyper-Threading Technology.
    3. Multi-core Processing

      • Each core executes independent threads → true parallel execution.
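The difference between fine-grained and coarse-grained switching can be sketched with a toy single-issue core. This is only an illustrative model, not a description of any real processor: the `simulate` function, the latency lists, and the zero-cost thread switch are all assumptions made for the example. Each thread is a list of latencies, where 1 is an ordinary instruction and a larger number is a long-latency event such as a cache miss.

```python
def simulate(threads, policy):
    """Return the per-cycle issue trace of a toy single-issue core.

    threads: dict mapping a thread name to a list of latencies
             (1 = ordinary instruction, >1 = long-latency event).
    policy:  "fine" (rotate every cycle) or "coarse" (switch on stall).
    A '-' in the trace marks a cycle in which no thread could issue.
    """
    names = list(threads)
    pc = {t: 0 for t in names}          # next instruction index per thread
    ready_at = {t: 0 for t in names}    # cycle at which the thread may issue again
    start = 0                           # where the search for a thread begins
    trace, cycle = [], 0
    while any(pc[t] < len(threads[t]) for t in names):
        chosen = None
        for i in range(len(names)):
            idx = (start + i) % len(names)
            t = names[idx]
            if pc[t] < len(threads[t]) and ready_at[t] <= cycle:
                chosen = t
                # fine-grained: move on to the next thread every cycle;
                # coarse-grained: stay on this thread until it stalls.
                start = (idx + 1) % len(names) if policy == "fine" else idx
                break
        if chosen is None:
            trace.append("-")           # every unfinished thread is stalled
        else:
            ready_at[chosen] = cycle + threads[chosen][pc[chosen]]
            pc[chosen] += 1
            trace.append(chosen)
        cycle += 1
    return "".join(trace)


# Thread A has one 4-cycle miss; thread B has only ordinary instructions.
workload = {"A": [1, 4, 1], "B": [1, 1, 1]}
fine_trace = simulate(workload, "fine")      # "ABABB-A"
coarse_trace = simulate(workload, "coarse")  # "AABBBA"
print(fine_trace, coarse_trace)
```

In both traces, thread B fills cycles that thread A would otherwise spend waiting on its miss. A real coarse-grained design also pays a pipeline-refill penalty on each switch, which this sketch ignores.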

    3. Challenges in TLP

    • Shared-resource contention: Cores compete for shared caches and memory bandwidth; threads on the same SMT core also compete for functional units.
    • Synchronization: Threads often need to share data, which requires careful coordination (locks, barriers, atomic operations).
    • Cache Coherency: Copies of the same data held in the private caches of different cores must be kept consistent.

    4. Cache Coherency – Definition

    Cache coherency (also called cache coherence) is a mechanism, usually implemented in hardware, that ensures all processors or cores see a consistent view of each memory location when multiple caches store copies of the same data.

    Without coherency, different cores may read stale or inconsistent data, leading to incorrect program behavior.
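A stale read is easy to reproduce in a toy model. The sketch below assumes two private write-back caches over one shared memory with no coherence mechanism at all; the `NaiveCache` class and the variable names are hypothetical and exist only for this illustration.

```python
# Toy model: two private caches over one memory, with NO coherence mechanism.
memory = {"X": 0}


class NaiveCache:
    """Write-back cache that never hears about other caches' writes."""

    def __init__(self):
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:        # miss: fill the line from memory
            self.lines[addr] = memory[addr]
        return self.lines[addr]           # hit: return the local copy

    def write(self, addr, value):
        self.lines[addr] = value          # update the local copy only


core1, core2 = NaiveCache(), NaiveCache()
core2.read("X")            # Core 2 caches X = 0
core1.write("X", 5)        # Core 1 writes X = 5 into its own cache
stale = core2.read("X")    # Core 2 hits its old copy
print(stale)               # prints 0, not 5
```

Core 2 keeps returning 0 because nothing tells its cache that the line has changed elsewhere. Coherence protocols exist to close exactly this gap.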


    Key Problems Addressed

    1. Write Propagation

      • Changes made by one processor must be propagated to other caches.
    2. Transaction Ordering (Write Serialization)

      • All writes to the same memory location must be seen in the same order by every processor.

    5. Cache Coherency Protocols

    a) Write-Invalidate Protocol

    • When a processor writes to a cache line, all other caches with a copy invalidate it.
    • Example: MESI Protocol (Modified, Exclusive, Shared, Invalid)

    b) Write-Update (Write-Broadcast) Protocol

    • When a processor writes to a cache line, the new value is broadcast to other caches that have it.
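The two policies can be compared with a small snooping-bus model. This is a simplified sketch under stated assumptions: the `Bus` and `Cache` classes are hypothetical, memory is updated on every write (write-through) to keep the code short, and `messages` counts only the invalidations or updates delivered to other caches.

```python
class Bus:
    """Hypothetical snooping bus connecting all caches to memory."""

    def __init__(self, policy):
        self.policy = policy              # "invalidate" or "update"
        self.caches = []
        self.memory = {}
        self.messages = 0                 # invalidations/updates delivered

    def broadcast_write(self, writer, addr, value):
        self.memory[addr] = value         # write-through, for simplicity
        for cache in self.caches:
            if cache is writer or addr not in cache.lines:
                continue                  # only caches holding a copy react
            self.messages += 1
            if self.policy == "invalidate":
                del cache.lines[addr]     # drop the stale copy
            else:
                cache.lines[addr] = value # refresh the copy in place


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:        # miss: fetch from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)


def run(policy):
    """Core 2 caches X, Core 1 writes X three times, Core 2 reads X again."""
    bus = Bus(policy)
    c1, c2 = Cache(bus), Cache(bus)
    c2.read("X")
    for v in (1, 2, 3):
        c1.write("X", v)
    return c2.read("X"), bus.messages


print(run("invalidate"))   # (3, 1): one invalidation, then a miss on re-read
print(run("update"))       # (3, 3): one update per write, re-read hits
```

Both policies return the latest value. Write-invalidate sends less traffic when one core writes repeatedly, at the cost of a miss when the other core reads again; write-update keeps the other copy fresh but sends a message on every write.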

    c) MESI Protocol States

    State          Meaning
    M (Modified)   Only cached copy; dirty (memory is stale)
    E (Exclusive)  Only cached copy; clean (matches memory)
    S (Shared)     Clean copy that other caches may also hold
    I (Invalid)    Line holds no valid data
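The states in the table can be tied together with a transition function for a single cache line. This is a minimal sketch of the common textbook MESI transitions, not the behavior of any specific processor; the function name `next_state`, the event names, and the `others_have_copy` flag are assumptions made for the example.

```python
def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache line.

    'local_read' and 'local_write' come from the core that owns this cache.
    'bus_read' and 'bus_write' are requests snooped from another core.
    others_have_copy matters only for a local read that misses (state I).
    """
    if event == "local_read":
        if state == "I":                  # miss: fill the line
            return "S" if others_have_copy else "E"
        return state                      # hits in M, E, S keep their state
    if event == "local_write":
        return "M"                        # from S or I, other copies are invalidated first
    if event == "bus_read":
        # A line in M supplies or writes back its dirty data before sharing.
        return "S" if state in ("M", "E", "S") else "I"
    if event == "bus_write":
        return "I"                        # another core wants exclusive ownership
    raise ValueError(f"unknown event: {event}")


print(next_state("I", "local_read"))          # E
print(next_state("E", "local_write"))         # M
print(next_state("M", "bus_read"))            # S
print(next_state("S", "bus_write"))           # I
```

The E state is what makes the E to M transition silent: because no other cache holds the line, the write needs no bus transaction.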

    6. Example Scenario

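    • Core 1 writes X = 5: its cache line holding X moves to Modified, and the copy in Core 2’s cache is invalidated.
    • Core 2 reads X: the read misses, and the updated value 5 is supplied by Core 1’s cache (directly, or through a write-back to memory); both copies end up Shared.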

    This ensures that all threads see a consistent value for X.
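The scenario can be traced step by step in a small two-core model. This is a sketch under simplifying assumptions: each cache line holds a single value, the dirty line is written back to memory when another core reads it, and the `read` and `write` helpers and the `cores` dictionary are hypothetical names used only here.

```python
memory = {"X": 0}
cores = {1: {}, 2: {}}     # core id -> {addr: [state, value]}


def read(core, addr):
    line = cores[core].get(addr)
    if line and line[0] != "I":
        return line[1]                        # hit in M, E, or S
    shared = False
    for other, lines in cores.items():
        o = lines.get(addr)
        if other != core and o and o[0] != "I":
            if o[0] == "M":
                memory[addr] = o[1]           # owner writes back dirty data
            o[0] = "S"                        # the other copy becomes Shared
            shared = True
    cores[core][addr] = ["S" if shared else "E", memory[addr]]
    return memory[addr]


def write(core, addr, value):
    for other, lines in cores.items():
        if other != core and addr in lines:
            lines[addr][0] = "I"              # invalidate every other copy
    cores[core][addr] = ["M", value]          # writer now holds the only valid copy


read(2, "X")               # Core 2 caches X = 0 in state E
write(1, "X", 5)           # Core 1: M with value 5, Core 2: I
value = read(2, "X")       # Core 2 misses; Core 1 writes back; both end in S
print(value, cores[1]["X"], cores[2]["X"])   # 5 ['S', 5] ['S', 5]
```

After the last read, both caches and memory agree that X is 5, which is the consistent view the protocol is meant to guarantee.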


    7. Importance of Cache Coherency in TLP

    1. Correctness: Ensures threads see the latest memory updates.
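    2. Performance: Lets shared data stay in fast private caches without software-managed flushes or uncached accesses, at the cost of some coherence traffic (invalidations and the misses they cause).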
    3. Scalability: Essential for multi-core systems where each core has private caches.

    8. Exam-Friendly Summary

    Concept                         Definition / Purpose
    Thread-Level Parallelism (TLP)  Execution of multiple threads concurrently to increase throughput
    Forms of TLP                    Multithreading, Simultaneous Multithreading (SMT), Multi-core execution
    Cache Coherency                 Ensures consistent memory view across multiple caches
    Protocols                       MESI (Modified, Exclusive, Shared, Invalid), write-invalidate, write-update
    Challenges                      Synchronization, stale data, resource contention