Project Options

There are three project options:

  1. Propose your own topic. You will have to complete a proposal and submit it by e-mail by Nov 13th.
  2. Pick a recent paper from a top-tier architecture conference (ISCA, ASPLOS, MICRO, HPCA) and validate part of its results.
  3. Participate in the branch prediction competition (strongly encouraged).

Branch Prediction Competition Option

The goal is to develop an accurate branch direction predictor (taken or not taken). At the very least, your predictor should beat the base predictor, which is provided as part of the simulator source code. Ideally, your predictor will beat this base predictor by a significant margin and will perform the best among all predictors developed as part of the course. Predictor X is better than predictor Y on benchmark Z if the prediction accuracy of X is higher than the prediction accuracy of Y when running benchmark Z. Prediction accuracy is measured as the percentage of direction predictions that are correct.

For this project you will use sim-bpred from the SimpleScalar source code. Sim-bpred is the sim-safe simulator you used in the homework, extended with a branch predictor component. By default sim-bpred reports direction accuracy as bpred_bimod.bpred_dir_rate. When you modify sim-bpred.c, introduce a new statistic called mybpred.dir_rate and use it to report the accuracy of your predictor.

We will provide you with traces that can be used with sim-bpred.

You will have to work alone on this project.

Simulator and Traces

The simulator source code (modified version of Simplescalar) and the traces are available through the eecg.toronto.edu filesystem under ~moshovos/ACA08/. The simulator code is in the myss directory. The traces are in the spec2k.eio directory.

To use the traces simply append them as the last argument to sim-bpred. For example:

sim-bpred 164.gzip.trace.1Ba1B.eio

The trace contains a complete image of the initial state of memory and registers, plus all data exchanged through system calls. The traces are 1B instructions long.

One program that will definitely be challenging to predict is gcc. So, if you are trying things out, try it first, and maybe instruct sim-bpred to run only a few hundred million instructions (see sim-bpred -help).

Before you compile the simulator, configure it using “make config-pisa”. If you grabbed the simulation sources before Nov. 18 and are having problems compiling, please grab them again. I edited some of the files so that they compile without problems on a Linux box. The EIO traces are little-endian, so chances are you will not be able to complete this on a big-endian machine; a little-endian machine such as an x86 box should work. If you plan to use the original SimpleScalar source code, you'll have to manually uncompress the .eio files using gzip. The original SimpleScalar supports only uncompressed .eio files, while the one provided for the project has built-in support for compressed .eio files. This saves disk space and bandwidth and may make your simulations run faster, since chances are the bottleneck will be the disk.

The Base Predictor

The base predictor that you will have to beat is a combined predictor with:

  • A meta-predictor with 64K entries
  • A bimodal predictor with 64K entries
  • A GShare predictor with 64K entries using global history of 10 bits

Sim-bpred implements this directly. Use the following options:

-bpred comb -bpred:2lev 1 65536 10 1 -bpred:bimod 65536 -bpred:comb 65536
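To make the structure of the base predictor concrete, here is a minimal sketch of a combined bimodal/GShare predictor with a meta chooser. It is not the SimpleScalar implementation: the names (comb_pred_t, comb_init, comb_predict, comb_update), the initial counter values, the use of 2-bit counters in all three tables, and the 3-bit PC shift (assuming 8-byte instructions) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_ENTRIES 65536u  /* 64K entries per table, as in the base predictor */
#define HISTORY_BITS  10u     /* global history length */
#define HISTORY_MASK  ((1u << HISTORY_BITS) - 1u)

typedef struct {
    uint8_t  bimod[TABLE_ENTRIES];   /* 2-bit counters indexed by PC */
    uint8_t  gshare[TABLE_ENTRIES];  /* 2-bit counters indexed by PC xor history */
    uint8_t  meta[TABLE_ENTRIES];    /* 2-bit chooser: >= 2 selects gshare */
    uint32_t history;                /* recent outcomes, newest in bit 0 */
} comb_pred_t;

/* Assumption: drop the low 3 PC bits (8-byte instructions). */
static uint32_t pc_index(uint32_t pc)
{
    return (pc >> 3) & (TABLE_ENTRIES - 1u);
}

static uint32_t gshare_index(const comb_pred_t *p, uint32_t pc)
{
    return (pc_index(pc) ^ (p->history & HISTORY_MASK)) & (TABLE_ENTRIES - 1u);
}

/* Saturating 2-bit up/down counter update. */
static void counter_update(uint8_t *c, int up)
{
    if (up) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
}

/* Assumption: every counter starts at 1 (weakly not taken / weakly bimodal). */
void comb_init(comb_pred_t *p)
{
    uint32_t i;
    assert(p != 0);
    for (i = 0; i < TABLE_ENTRIES; i++)
        p->bimod[i] = p->gshare[i] = p->meta[i] = 1;
    p->history = 0;
}

/* Returns 1 for predicted taken, 0 for predicted not taken. */
int comb_predict(const comb_pred_t *p, uint32_t pc)
{
    int bimod_taken  = p->bimod[pc_index(pc)] >= 2;
    int gshare_taken = p->gshare[gshare_index(p, pc)] >= 2;
    return (p->meta[pc_index(pc)] >= 2) ? gshare_taken : bimod_taken;
}

/* Train all three tables with the resolved outcome, then shift the history. */
void comb_update(comb_pred_t *p, uint32_t pc, int taken)
{
    uint32_t bi = pc_index(pc);
    uint32_t gi = gshare_index(p, pc);
    int bimod_ok  = (p->bimod[bi] >= 2) == (taken != 0);
    int gshare_ok = (p->gshare[gi] >= 2) == (taken != 0);

    /* The chooser moves only when exactly one component was right. */
    if (bimod_ok != gshare_ok)
        counter_update(&p->meta[bi], gshare_ok);
    counter_update(&p->bimod[bi], taken);
    counter_update(&p->gshare[gi], taken);
    p->history = ((p->history << 1) | (taken ? 1u : 0u)) & HISTORY_MASK;
}
```

In a simulator loop one would call comb_predict before a branch resolves and comb_update with the actual outcome afterwards. Because the structure holds three 64K-entry arrays, it is best allocated statically or on the heap.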

Restricted and Unrestricted Predictor Storage

Please develop two predictors. The first may use at most 64KB of storage (512Kbits). For the second you can use as much storage as you like. In the storage calculations do not include the PC, but do include any other bits your predictor uses. Of course, we are referring to the storage used in an actual hardware implementation. So, a 64K-entry bimodal predictor would use 128Kbits (64K entries, each using 2 bits for the up-down counter).

Please only consider the bits you use for direction prediction in your storage calculations. Also, assume that there are up to 512 bits for other incidentals such as history registers, etc.
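As a worked example of the storage accounting, the sketch below adds up table sizes in bits and checks them against the 512Kbit restricted budget. The helper names (table_bits, fits_restricted_budget) are hypothetical. Applying it to the base predictor assumes that the meta and GShare tables also use 2-bit counters, which the text above states only for the bimodal table.

```c
#include <assert.h>

/* Restricted budget: 64KB = 64 * 1024 * 8 bits = 512Kbits. */
#define RESTRICTED_BUDGET_BITS (64UL * 1024UL * 8UL)

/* Bits used by a table of `entries` entries, each `bits_per_entry` wide. */
unsigned long table_bits(unsigned long entries, unsigned long bits_per_entry)
{
    assert(bits_per_entry > 0);
    return entries * bits_per_entry;
}

/* Returns 1 if the direction-prediction storage fits the restricted budget. */
int fits_restricted_budget(unsigned long total_bits)
{
    return total_bits <= RESTRICTED_BUDGET_BITS;
}
```

Under the 2-bit assumption, three 64K-entry tables come to 3 * table_bits(65536, 2) = 393216 bits (384Kbits), which fits under the 524288-bit limit. The 10-bit history register would fall under the separate 512-bit allowance for incidentals.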

Deliverables

You will have to deliver the following:

  1. Modified sim-bpred.c that includes your predictor. Introduce two options: -mybpred:r and -mybpred:u. The first should invoke your restricted storage predictor and the second the unrestricted predictor. No other options should be necessary for running the simulation with your chosen predictor sizes (the ones you selected to compete with).
  2. A report of at most four pages (10 pt size) with figures that explains your predictors and the intuition behind them.

E-mail your bpred.c and report in PDF to moshovos@eecg.toronto.edu by Dec. 4th using “MCA: BPRED” as your subject line.

Presentations

During the second week of December you will have to give a ten-minute presentation describing your predictor and results.

Metric

Thank you for submitting your suggestions on what metric to use to judge which predictor is better. As many of you pointed out the real problem is how to decide whether a predictor A is better than predictor B when A performs better than B for some benchmarks while B performs better than A for others. Ultimately, it is performance that matters, but even then the same problem can appear. There were several well justified suggestions. I hope it's clear that there is no single metric that is perfect. To make forward progress we will be using the Average MPKI.

MPKI = Mispredictions Per Kilo Instructions, that is, how many mispredictions occur every 1000 instructions. This is loosely connected to performance. If we know the average CPI, then we can roughly estimate performance with MPKI.

So, we will be using: X = [ MPKI(1) + … + MPKI(N) ] / N

Where MPKI(i) is the MPKI for benchmark i.
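The metric can be computed as in the following minimal sketch. The function names (mpki, average_mpki) and the idea of feeding in raw misprediction and instruction counts are assumptions for illustration; sim-bpred does not necessarily report these under those names.

```c
#include <assert.h>
#include <stddef.h>

/* Mispredictions per 1000 instructions for one benchmark. */
double mpki(unsigned long mispredictions, unsigned long instructions)
{
    assert(instructions > 0);
    return 1000.0 * (double)mispredictions / (double)instructions;
}

/* X = [ MPKI(1) + ... + MPKI(N) ] / N over N benchmarks. */
double average_mpki(const double *mpki_values, size_t n)
{
    double sum = 0.0;
    size_t i;
    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += mpki_values[i];
    return sum / (double)n;
}
```

For example, 5,000,000 mispredictions over a 1B-instruction trace give an MPKI of 5. Lower values of X are better, since X counts mispredictions rather than correct predictions.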

 
project.txt · Last modified: 2008/12/09 11:19 by instructor
 