⭐ Pipeline Performance Analysis
Pipeline performance analysis focuses on measuring how much improvement a pipelined processor achieves compared to a non-pipelined one, and how hazards, stalls, and pipeline depth affect performance.
⭐ 1. Key Terms
1. Pipeline
A pipeline breaks instruction execution into stages (for example, the classic five: IF, ID, EX, MEM, WB) so that the execution of multiple instructions overlaps.
2. Throughput
The number of instructions completed per unit time.
3. Latency
Time taken for a single instruction to pass through the entire pipeline.
4. CPI (Cycles Per Instruction)
Average number of clock cycles required per instruction.
⭐ 2. Ideal Pipeline Performance
In an ideal pipeline with no stalls, no hazards, and perfectly balanced stages, performance on a long instruction stream improves by a factor equal to the pipeline depth.
Speedup Formula (Ideal):
$$\text{Speedup} = \frac{\text{Execution time (non-pipelined)}}{\text{Execution time (pipelined)}}$$
For a pipeline with k stages:
$$\text{Speedup}_{\text{ideal}} = k$$
Ideal CPI:
$$\text{CPI}_{\text{ideal}} = 1$$
Explanation:
Once the pipeline is full, the CPU completes one instruction per cycle.
⭐ 3. Real Pipeline Performance
Real pipelines must contend with:
- Data hazards
- Control hazards
- Structural hazards
- Stalls
- Branch mispredictions
- Cache misses
These reduce performance.
Actual CPI:
$$\text{CPI}_{\text{actual}} = \text{CPI}_{\text{ideal}} + \text{Pipeline stall cycles per instruction}$$
Since ideal CPI = 1:
$$\text{CPI}_{\text{actual}} = 1 + \text{Stall cycles per instruction}$$
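As a small numeric illustration of the relation above, the following sketch derives the actual CPI from cycle counts. The function name `actual_cpi` and the sample counts are hypothetical, chosen only for illustration.

```python
def actual_cpi(instructions: int, stall_cycles: int, ideal_cpi: float = 1.0) -> float:
    """CPI_actual = CPI_ideal + stall cycles per instruction."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return ideal_cpi + stall_cycles / instructions


# Hypothetical run: 1,000 instructions that incur 250 stall cycles in total.
cpi = actual_cpi(instructions=1000, stall_cycles=250)
print(f"CPI = {cpi:.2f}")  # prints "CPI = 1.25"
```

Stall cycles are averaged over all instructions, so a few expensive stalls and many cheap ones can yield the same CPI.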
⭐ 4. Pipeline Speedup with Stalls
$$\text{Speedup} = \frac{\text{Non-pipelined execution time}}{\text{Pipelined execution time}}$$
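Since, assuming both designs run at the same clock cycle time:
- Non-pipelined CPI = number of stages = k
- Pipelined CPI = 1 + stall cycles per instruction
$$\text{Speedup} = \frac{k}{1 + \text{Stall cycles per instruction}}$$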
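To see how quickly stalls erode the ideal speedup, the sketch below evaluates k / (1 + stall cycles per instruction) for a few stall rates. The function name and the stall values are assumptions made for illustration, and both designs are assumed to share the same clock cycle time.

```python
def speedup_with_stalls(k: int, stall_cycles_per_instr: float) -> float:
    """Speedup of a k-stage pipeline over a non-pipelined design (same clock assumed)."""
    return k / (1.0 + stall_cycles_per_instr)


# Hypothetical stall rates for a 5-stage pipeline.
for stalls in (0.0, 0.25, 1.0):
    print(f"k=5, stalls={stalls:.2f} -> speedup {speedup_with_stalls(5, stalls):.2f}x")
# k=5, stalls=0.00 -> speedup 5.00x
# k=5, stalls=0.25 -> speedup 4.00x
# k=5, stalls=1.00 -> speedup 2.50x
```

One stall cycle per instruction on average halves the benefit of pipelining.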
⭐ 5. Pipeline Throughput
Definition
Throughput = number of instructions completed per second.
$$\text{Throughput} = \frac{1}{\text{CPI} \times \text{Cycle time}}$$
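With more pipeline stages (greater k), each stage does less work, so cycle time decreases and throughput increases, provided the added hazard penalties do not raise CPI enough to cancel the gain.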
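The throughput formula can be checked with a quick calculation. This is a minimal sketch: the 1 ns cycle time and the CPI of 1.25 are hypothetical values, not figures from these notes.

```python
def throughput_ips(cpi: float, cycle_time_s: float) -> float:
    """Instructions per second = 1 / (CPI x cycle time)."""
    return 1.0 / (cpi * cycle_time_s)


# Hypothetical 1 GHz clock (1 ns cycle) with CPI = 1.25.
ips = throughput_ips(cpi=1.25, cycle_time_s=1e-9)
print(f"{ips / 1e6:.0f} MIPS")  # prints "800 MIPS"
```

Halving the cycle time doubles throughput only if CPI stays the same, which is why deeper pipelines do not scale perfectly.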
⭐ 6. Pipeline Latency
Definition
Latency is the time for one instruction to travel through all pipeline stages.
$$\text{Latency} = k \times \text{Cycle time}$$
Note:
- Pipelining does not reduce the latency of an individual instruction.
- It reduces the total execution time for a sequence of instructions by overlapping them.
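The distinction between per-instruction latency and total execution time can be made concrete with a short sketch. The 1 ns stage time and the 100-instruction stream are hypothetical, and the pipeline is assumed to run without stalls.

```python
def latency(k: int, cycle_time: float) -> float:
    """Time for one instruction to traverse all k stages."""
    return k * cycle_time


def pipelined_total_time(n: int, k: int, cycle_time: float) -> float:
    """The first instruction needs k cycles; each later one finishes one cycle after its predecessor."""
    return (k + n - 1) * cycle_time


k, cycle_ns, n = 5, 1.0, 100  # hypothetical: 5 stages, 1 ns per stage, 100 instructions

print(latency(k, cycle_ns))                  # 5.0 ns per instruction, pipelined or not
print(n * latency(k, cycle_ns))              # 500.0 ns when instructions do not overlap
print(pipelined_total_time(n, k, cycle_ns))  # 104.0 ns when they overlap
```

Each instruction still takes 5 ns, but the stream as a whole finishes almost five times sooner.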
⭐ 7. Example of Pipeline Performance
Consider a 5-stage pipeline:
Non-pipelined:
Each instruction = 5 cycles
10 instructions = 10 × 5 = 50 cycles
Pipelined:
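The first instruction completes after 5 cycles (pipeline fill).
After that, one instruction completes per cycle, so the remaining 9 take 9 more cycles:
$$5 + 9 = (5 - 1) + 10 = 14 \text{ cycles}$$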
Speedup:
$$\frac{50}{14} \approx 3.57\times$$
(Not the ideal 5× because of pipeline fill time.)
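The same calculation generalizes to any instruction count n. The sketch below reproduces the 10-instruction case and shows the speedup approaching k as n grows. The function names are hypothetical, and no stalls are assumed.

```python
def pipelined_cycles(n: int, k: int) -> int:
    """Cycles to run n instructions on a stall-free k-stage pipeline: (k - 1) fill cycles plus n."""
    return (k - 1) + n


def fill_limited_speedup(n: int, k: int) -> float:
    """Non-pipelined cycles (n * k) divided by pipelined cycles."""
    return (n * k) / pipelined_cycles(n, k)


for n in (10, 100, 10_000):
    cycles = pipelined_cycles(n, 5)
    print(f"n={n}: {cycles} cycles, speedup {fill_limited_speedup(n, 5):.2f}x")
# The first line printed is "n=10: 14 cycles, speedup 3.57x", matching the example above.
```

The fill cost is paid once, so its share of the total shrinks as the instruction stream gets longer.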
⭐ 8. Effect of Hazards on Pipeline Performance
Data Hazards (RAW)
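Cause stalls unless forwarding is used; in the classic 5-stage pipeline, a load followed immediately by a dependent instruction still stalls for one cycle even with forwarding.
If load-use stalls average 0.3 cycles per instruction:
$$\text{CPI} = 1 + 0.3 = 1.3$$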
Control Hazards
Branch mispredictions cause penalty cycles.
If branch frequency = 20%, mispredict rate = 5%, penalty = 3 cycles:
$$\text{Penalty} = 0.20 \times 0.05 \times 3 = 0.03 \text{ stall cycles per instruction}$$
$$\text{CPI} = 1 + 0.03 = 1.03$$
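The branch penalty calculation above can be written as a small helper, which makes it easy to see how sensitive CPI is to prediction accuracy. The function name is hypothetical, and the 20% misprediction rate in the second call is an illustrative value, not one from these notes.

```python
def branch_stall_cycles(branch_freq: float, mispredict_rate: float, penalty_cycles: float) -> float:
    """Average control-hazard stall cycles per instruction."""
    return branch_freq * mispredict_rate * penalty_cycles


# Values from the example: 20% branches, 5% mispredicted, 3-cycle penalty.
stalls = branch_stall_cycles(0.20, 0.05, 3)
print(f"stalls = {stalls:.2f}, CPI = {1 + stalls:.2f}")  # prints "stalls = 0.03, CPI = 1.03"

# Hypothetical weaker predictor: 20% of branches mispredicted.
stalls_weak = branch_stall_cycles(0.20, 0.20, 3)
print(f"stalls = {stalls_weak:.2f}, CPI = {1 + stalls_weak:.2f}")  # prints "stalls = 0.12, CPI = 1.12"
```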
Structural Hazards
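Occur when two instructions need the same hardware resource in the same cycle (for example, a single memory port shared by instruction fetch and data access).
The penalty is added as additional stall cycles per instruction.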
⭐ 9. Overall CPI Formula (All Hazards)
$$\text{CPI}_{\text{actual}} = 1 + \text{Stalls}_{\text{data}} + \text{Stalls}_{\text{control}} + \text{Stalls}_{\text{structural}}$$
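As a sketch of how the terms combine, the code below plugs in the data and control stall figures from Section 8 (0.3 and 0.03 cycles per instruction). The structural stall value of 0 and the function name are assumptions made only for this illustration.

```python
def overall_cpi(data_stalls: float, control_stalls: float, structural_stalls: float) -> float:
    """CPI_actual = 1 + data + control + structural stall cycles per instruction."""
    return 1.0 + data_stalls + control_stalls + structural_stalls


# Data and control stalls from Section 8; structural stalls assumed to be 0.
cpi = overall_cpi(data_stalls=0.3, control_stalls=0.03, structural_stalls=0.0)
print(f"CPI = {cpi:.2f}")                 # prints "CPI = 1.33"
print(f"Speedup (k=5) = {5 / cpi:.2f}x")  # prints "Speedup (k=5) = 3.76x"
```

The stall terms are simply added here, which treats the hazards as independent; stalls that overlap in the same cycle would make the true CPI somewhat lower.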
⭐ 10. Pipeline Efficiency
Definition
Pipeline efficiency = useful work / total cycles
$$\text{Efficiency} = \frac{\text{Number of instructions}}{\text{Pipeline cycles} \times \text{Pipeline width}}$$
For scalar (width = 1):
$$\text{Efficiency} = \frac{\text{Number of instructions}}{\text{Pipeline cycles}}$$
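Applying the efficiency definition to the 10-instruction, 14-cycle example from Section 7 gives a concrete number. The function name is hypothetical, and the 2-wide case is an illustrative assumption rather than a configuration described in these notes.

```python
def pipeline_efficiency(instructions: int, cycles: int, width: int = 1) -> float:
    """Fraction of available issue slots that completed an instruction."""
    return instructions / (cycles * width)


# Scalar pipeline: 10 instructions in 14 cycles.
print(f"{pipeline_efficiency(10, 14):.1%}")           # prints "71.4%"

# Hypothetical 2-wide pipeline taking the same 14 cycles for the same work.
print(f"{pipeline_efficiency(10, 14, width=2):.1%}")  # prints "35.7%"
```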
⭐ 11. Factors Affecting Pipeline Performance
- Pipeline depth: More stages = faster clock but more hazard penalties.
- Hazard frequency: More RAW, WAR, WAW hazards → more stalls.
- Branch prediction accuracy: Better prediction → fewer control stalls.
- Instruction mix: High-load frequency → more load-use hazards.
- Compiler optimizations: Static scheduling can reduce stalls.
- Hardware support: Forwarding paths, hazard detection, branch prediction.
⭐ Exam-Focused Summary
- Ideal speedup = number of pipeline stages (k)
- Actual CPI = 1 + stall cycles per instruction
- Hazards (RAW, control, structural) increase CPI
- Pipelining increases throughput but does not reduce latency
- Branch penalties and load-use stalls reduce efficiency
- Speedup = $\frac{k}{1 + \text{stall cycles per instruction}}$