COMP3147›Pipeline performance analysis

Computer Architecture, Topic 9 of 24

Pipeline performance analysis

⭐ Pipeline Performance Analysis

Pipeline performance analysis focuses on measuring how much improvement a pipelined processor achieves compared to a non-pipelined one, and how hazards, stalls, and pipeline depth affect performance.

⭐ 1. Key Terms

1. Pipeline

A pipeline breaks instruction execution into stages (IF, ID, EX, MEM, WB) to overlap execution of multiple instructions.

2. Throughput

The number of instructions completed per unit time.

3. Latency

Time taken for a single instruction to pass through the entire pipeline.

4. CPI (Cycles Per Instruction)

Average number of clock cycles required per instruction.

⭐ 2. Ideal Pipeline Performance

In an ideal pipeline with no stalls, no hazards, and perfectly balanced stages, throughput on a long instruction stream improves by a factor equal to the pipeline depth.

Speedup Formula (Ideal):

\text{Speedup} = \frac{\text{Execution time (non-pipelined)}}{\text{Execution time (pipelined)}}

For a pipeline with k stages:

\text{Speedup (ideal)} = k

Ideal CPI:

\text{CPI}_{\text{ideal}} = 1

Explanation:

Once the pipeline is full, the CPU completes one instruction per cycle.
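
The limit Speedup = k only holds for long instruction streams. As a minimal sketch, assuming both designs share the same clock cycle and that n instructions need k + (n - 1) cycles in a stall-free k-stage pipeline, the function below (the name `ideal_speedup` is illustrative, not from the course material) shows the speedup approaching k as n grows:

```python
def ideal_speedup(num_instructions: int, num_stages: int) -> float:
    """Speedup of a stall-free k-stage pipeline over an unpipelined design.

    Assumes the same clock cycle in both designs and exactly one cycle
    per stage for every instruction.
    """
    unpipelined_cycles = num_instructions * num_stages
    # k cycles for the first instruction, then one more per remaining instruction.
    pipelined_cycles = num_stages + (num_instructions - 1)
    return unpipelined_cycles / pipelined_cycles


for n in (1, 10, 100, 10_000):
    print(f"n = {n:>6}: speedup = {ideal_speedup(n, 5):.3f}")
```

For a single instruction the speedup is 1, and for 10,000 instructions it is already within 0.1% of the 5-stage limit.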

⭐ 3. Real Pipeline Performance

Real pipelines are affected by:

Data hazards
Control hazards
Structural hazards
Stalls
Branch mispredictions
Cache misses

These reduce performance.

Actual CPI:

\text{CPI}_{\text{actual}} = \text{CPI}_{\text{ideal}} + \text{Pipeline stall cycles per instruction}

Since ideal CPI = 1:

\text{CPI}_{\text{actual}} = 1 + \text{Stall cycles per instruction}

⭐ 4. Pipeline Speedup with Stalls

\text{Speedup} = \frac{\text{Non-pipelined execution time}}{\text{Pipelined execution time}}

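Assuming the same instruction count and the same clock cycle time for both designs:

Non-pipelined CPI = number of stages = k
Pipelined CPI = $1 + \text{stall cycles per instruction}$

\text{Speedup} = \frac{k}{1 + \text{stall cycles per instruction}}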
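
The speedup formula can be checked numerically. This is a small sketch under the same assumptions as the formula (equal clock cycle and instruction count in both designs); the function name `speedup_with_stalls` is made up for illustration:

```python
def speedup_with_stalls(num_stages: int, stall_cycles_per_instruction: float) -> float:
    """Speedup = k / (1 + average stall cycles per instruction)."""
    if stall_cycles_per_instruction < 0:
        raise ValueError("stall cycles per instruction cannot be negative")
    return num_stages / (1.0 + stall_cycles_per_instruction)


for stalls in (0.0, 0.3, 1.0):
    print(f"stalls = {stalls:.1f}: speedup = {speedup_with_stalls(5, stalls):.2f}")
```

With one stall cycle per instruction on average, a 5-stage pipeline keeps only half of its ideal speedup (2.5 instead of 5).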

⭐ 5. Pipeline Throughput

Definition

Throughput = number of instructions completed per second.

\text{Throughput} = \frac{1}{\text{Cycle Time} \times \text{CPI}}

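With more pipeline stages (greater k), cycle time decreases, which increases throughput as long as the extra stall cycles of the deeper pipeline do not raise CPI by more than the clock gain.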
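
To illustrate the trade-off between cycle time and CPI, here is a short sketch comparing two design points. The cycle times and CPI values are hypothetical numbers chosen for the example, not measurements of any real processor:

```python
def throughput_ips(cycle_time_seconds: float, cpi: float) -> float:
    """Instructions per second = 1 / (cycle time x CPI)."""
    return 1.0 / (cycle_time_seconds * cpi)


# Hypothetical design points (assumed values for illustration only).
shallow = throughput_ips(cycle_time_seconds=1.0e-9, cpi=1.2)  # e.g. 5 stages
deep = throughput_ips(cycle_time_seconds=0.6e-9, cpi=1.5)     # e.g. 10 stages

print(f"shallow pipeline: {shallow / 1e6:.0f} million instructions/s")
print(f"deep pipeline:    {deep / 1e6:.0f} million instructions/s")
```

In this made-up case the deeper pipeline still wins, because its cycle time shrinks by 40% while its CPI grows by only 25%.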

⭐ 6. Pipeline Latency

Definition

Latency is the time for one instruction to travel through all pipeline stages.

\text{Latency} = k \times \text{Cycle Time}

Note:

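Pipelining does not reduce the latency of an individual instruction; pipeline register overhead can even increase it slightly.
It reduces the total execution time of a sequence of instructions.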
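
The difference between latency and total execution time can be made concrete with a small sketch. The 5 ns instruction time and the 0.1 ns pipeline register overhead are hypothetical values chosen for illustration:

```python
NUM_STAGES = 5
UNPIPELINED_INSTRUCTION_NS = 5.0  # hypothetical time for one unpipelined instruction
REGISTER_OVERHEAD_NS = 0.1        # hypothetical per-stage pipeline register delay

# Each stage does 1/k of the work, plus the register overhead.
cycle_time_ns = UNPIPELINED_INSTRUCTION_NS / NUM_STAGES + REGISTER_OVERHEAD_NS
latency_ns = NUM_STAGES * cycle_time_ns  # Latency = k x cycle time


def unpipelined_total_ns(num_instructions: int) -> float:
    return num_instructions * UNPIPELINED_INSTRUCTION_NS


def pipelined_total_ns(num_instructions: int) -> float:
    # Stall-free pipeline: k cycles for the first instruction, one per later one.
    return (NUM_STAGES + num_instructions - 1) * cycle_time_ns


print(f"latency per instruction: {latency_ns:.1f} ns (unpipelined: 5.0 ns)")
print(f"1000 instructions, unpipelined: {unpipelined_total_ns(1000):.1f} ns")
print(f"1000 instructions, pipelined:   {pipelined_total_ns(1000):.1f} ns")
```

Under these assumptions each instruction takes slightly longer (5.5 ns instead of 5.0 ns), yet 1000 instructions finish more than four times sooner.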

⭐ 7. Example of Pipeline Performance

Consider a 5-stage pipeline:

Non-pipelined:

Each instruction = 5 cycles
10 instructions = 10 × 5 = 50 cycles

Pipelined:

Pipeline fill = 5 - 1 = 4 cycles before the first instruction reaches the last stage
Then 1 instruction completes per cycle:

(5 - 1) + 10 = 14 \text{ cycles}

Speedup:

\frac{50}{14} = 3.57\times

(Not the ideal 5× because the fill time is significant for only 10 instructions; the speedup approaches 5× as the instruction count grows.)
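
The 14-cycle figure can also be obtained by building the timing diagram instead of using the formula. The sketch below assumes a stall-free pipeline in which instruction i enters IF in cycle i and advances one stage per cycle; the helper names are illustrative:

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Map each instruction index to {cycle: stage} for a stall-free pipeline."""
    schedule = {}
    for i in range(num_instructions):
        # Instruction i (0-based) is fetched in cycle i + 1, then moves one stage per cycle.
        schedule[i] = {i + 1 + s: name for s, name in enumerate(stages)}
    return schedule


def total_cycles(schedule):
    """Cycle in which the last instruction leaves the last stage."""
    return max(max(by_cycle) for by_cycle in schedule.values())


def render(schedule):
    """Text timing diagram: one row per instruction, one column per cycle."""
    last = total_cycles(schedule)
    rows = []
    for i, by_cycle in schedule.items():
        cells = [by_cycle.get(c, ".").ljust(4) for c in range(1, last + 1)]
        rows.append(f"I{i + 1:<3}" + "".join(cells).rstrip())
    return "\n".join(rows)


schedule = pipeline_schedule(10)
print(render(schedule))
cycles = total_cycles(schedule)
print(f"total cycles = {cycles}")            # 14
print(f"speedup = {10 * 5 / cycles:.2f}")    # 3.57
```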

⭐ 8. Effect of Hazards on Pipeline Performance

Data Hazards (RAW)

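Cause stalls unless forwarding (bypassing) supplies the result in time. Even with forwarding, a load followed immediately by a dependent instruction still stalls in the classic 5-stage pipeline.

If load-use stalls average 0.3 cycles per instruction: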

\text{CPI} = 1 + 0.3 = 1.3

Control Hazards

Branch mispredictions cause penalty cycles.

If branch frequency = 20%, mispredict rate = 5%, penalty = 3 cycles:

\text{Average branch penalty per instruction} = 0.20 \times 0.05 \times 3 = 0.03 \text{ cycles}

\text{CPI} = 1 + 0.03 = 1.03
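
The branch penalty calculation above can be sketched as a small function, which makes it easy to see how sensitive CPI is to the misprediction rate. The function name and the second, less accurate predictor (20% mispredictions) are assumptions added for comparison:

```python
def branch_stall_cycles(branch_frequency: float,
                        mispredict_rate: float,
                        penalty_cycles: float) -> float:
    """Average control-hazard stall cycles per instruction."""
    return branch_frequency * mispredict_rate * penalty_cycles


# Values from the example: 20% branches, 5% mispredicted, 3-cycle penalty.
good_predictor = branch_stall_cycles(0.20, 0.05, 3)
# Hypothetical weaker predictor with a 20% misprediction rate.
weak_predictor = branch_stall_cycles(0.20, 0.20, 3)

print(f"CPI with  5% mispredictions: {1 + good_predictor:.2f}")
print(f"CPI with 20% mispredictions: {1 + weak_predictor:.2f}")
```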

Structural Hazards

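Occur when two instructions need the same hardware resource in the same cycle (for example, a single memory port shared by IF and MEM).

The penalty is added as additional stall cycles per instruction.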

⭐ 9. Overall CPI Formula (All Hazards)

\text{CPI}_{\text{actual}} = 1 + \text{Stalls}_{\text{data}} + \text{Stalls}_{\text{control}} + \text{Stalls}_{\text{structural}}
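
As a sketch of how the terms combine, the snippet below adds up the data and control stall figures from the hazard examples in section 8. The structural stall value of 0.0 is an assumption (for instance, separate instruction and data memories), and the class name `StallBreakdown` is illustrative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StallBreakdown:
    """Average stall cycles per instruction, split by hazard type."""
    data: float
    control: float
    structural: float

    def actual_cpi(self, ideal_cpi: float = 1.0) -> float:
        return ideal_cpi + self.data + self.control + self.structural


# data and control come from the section 8 examples; structural = 0.0 is assumed.
stalls = StallBreakdown(data=0.3, control=0.03, structural=0.0)
cpi = stalls.actual_cpi()
print(f"actual CPI = {cpi:.2f}")

# Share of the total stall cycles contributed by each hazard type.
for name in ("data", "control", "structural"):
    share = getattr(stalls, name) / (cpi - 1.0)
    print(f"{name:>10}: {share:.0%} of stall cycles")
```

Breaking CPI down this way shows which hazard type is worth attacking first; with these numbers, data hazards account for roughly nine tenths of the stall cycles.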

⭐ 10. Pipeline Efficiency

Definition

Pipeline efficiency = useful work / total cycles

\text{Efficiency} = \frac{\text{Number of instructions}}{\text{Pipeline cycles} \times \text{Pipeline width}}

For scalar (width = 1):

\text{Efficiency} = \frac{\text{Number of instructions}}{\text{Pipeline cycles}}
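
Applied to the 10-instruction, 14-cycle example from section 7, the efficiency formula can be sketched as follows. The 2-wide case (10 instructions in 8 cycles) is a hypothetical figure added only to show the effect of the width term:

```python
def pipeline_efficiency(num_instructions: int,
                        pipeline_cycles: int,
                        pipeline_width: int = 1) -> float:
    """Fraction of available issue slots that did useful work."""
    if pipeline_cycles <= 0 or pipeline_width <= 0:
        raise ValueError("cycles and width must be positive")
    return num_instructions / (pipeline_cycles * pipeline_width)


print(f"scalar, 10 instructions in 14 cycles: {pipeline_efficiency(10, 14):.3f}")     # 0.714
print(f"2-wide, 10 instructions in 8 cycles:  {pipeline_efficiency(10, 8, 2):.3f}")   # 0.625
```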

⭐ 11. Factors Affecting Pipeline Performance

Pipeline depth: More stages = faster clock but larger hazard penalties.
Hazard frequency: More RAW hazards (and WAR/WAW hazards in out-of-order designs) → more stalls.
Branch prediction accuracy: Better prediction → fewer control stalls.
Instruction mix: High load frequency → more load-use hazards.
Compiler optimizations: Static scheduling can reduce stalls.
Hardware support: Forwarding paths, hazard detection, branch prediction.

⭐ Exam-Focused Summary

Ideal speedup = number of pipeline stages (k)
Actual CPI = 1 + stall cycles per instruction
Hazards (RAW, control, structural) increase CPI
Pipelining increases throughput but does not reduce latency
Branch penalties and load-use stalls reduce efficiency
Speedup = $\frac{k}{1 + \text{stall cycles per instruction}}$

