Page 105 - DCAP608_REAL TIME SYSTEMS
Detailed power analysis of the RTL shows that ~60% of the total power consumed in the
FPU can be attributed to the clocks, with a logic switching factor of ~16% for random data. A
fully active power consumption breakdown of the FPU per flip-flop (FF) and logic stage is shown in
Figure 1(b). The top-heavy distribution of power consumption can be attributed to highly
parallel structures (e.g., the Wallace tree) towards the front of the block. We will see later that
such attributes are desirable for clock scheduling and glitch mitigation.
Figure 1: FPU Pipeline Characteristics Assumed in the Case Study
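To put the ~60% clock share in perspective, the following is a minimal first-order sketch, not the model used in the case study. It assumes that clock power scales linearly with the fraction of cycles in which the clock is enabled, and it ignores any overhead of the gating logic itself. The function name max_total_savings and the parameter gated_fraction are hypothetical and introduced only for illustration.

```python
def max_total_savings(clock_share, gated_fraction):
    """Upper bound on the fraction of total FPU power that can be saved.

    clock_share:    fraction of total power attributed to the clocks
                    (about 0.60 according to the RTL power analysis).
    gated_fraction: fraction of cycles in which the clock could be
                    switched off (an assumed input, not a measured value).

    Assumption: clock power is proportional to the fraction of cycles in
    which the clock is enabled, and gating itself costs nothing.
    """
    if not (0.0 <= clock_share <= 1.0 and 0.0 <= gated_fraction <= 1.0):
        raise ValueError("both arguments must lie in [0, 1]")
    return clock_share * gated_fraction
```

For example, max_total_savings(0.60, 0.5) evaluates to roughly 0.30: under these assumptions, switching the clocks off for half of all cycles could save at most about 30% of total FPU power. The logic power is left untouched by this bound.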
To augment the FPU power dissipation breakdowns, we collect architectural utilization
statistics to determine the potential benefits of clock scheduling. We use the Turandot processor
simulator to model a POWER4-like processor with two parallel 6-stage FPU pipelines, and we
simulate 100M-instruction traces of the SPECfp benchmark suite. Figure 2 presents a stacked
bar graph showing the distribution of contiguous bubbles observed. The figure shows that
50% of consecutive FP instructions have one or more bubbles between them. Clock
scheduling can utilize these bubbles to reduce clock power.
Figure 2: Distribution of Contiguous Bubbles in Two Parallel 6-Stage
FPU Pipelines of a POWER4-Like Processor
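The kind of statistic plotted in Figure 2 can be illustrated with a short sketch. It is not the Turandot simulator's output format or interface; it assumes a hypothetical per-cycle trace for one pipeline in which True means an FP instruction entered the pipeline in that cycle and False means a bubble. The function names bubble_gap_histogram and fraction_with_bubbles are made up for this example.

```python
from collections import Counter


def bubble_gap_histogram(issue_trace):
    """Count the bubbles between each pair of consecutive FP instructions.

    issue_trace: iterable of booleans, one entry per cycle (assumed format).
                 True  = an FP instruction entered the pipeline,
                 False = a bubble (no instruction issued).
    Returns a Counter mapping gap length in cycles to the number of
    consecutive instruction pairs separated by that many bubbles.
    Bubbles before the first or after the last instruction are ignored.
    """
    histogram = Counter()
    gap = None  # stays None until the first instruction has been seen
    for issued in issue_trace:
        if issued:
            if gap is not None:
                histogram[gap] += 1
            gap = 0
        elif gap is not None:
            gap += 1
    return histogram


def fraction_with_bubbles(histogram):
    """Fraction of consecutive instruction pairs with at least one bubble."""
    total = sum(histogram.values())
    if total == 0:
        return 0.0
    return 1.0 - histogram[0] / total
```

As a usage example, the trace [True, True, False, True, False, False, True, True] contains four consecutive instruction pairs with gaps of 0, 1, 2, and 0 bubbles, so fraction_with_bubbles returns 0.5 for its histogram. A value of 0.5 on a real trace would correspond to the 50% figure quoted in the text.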
100 LOVELY PROFESSIONAL UNIVERSITY