ECE 1773 – Advanced Computer Architecture
HW #3 - Due Friday, November 3, 2006
Instructor: Andreas Moshovos
Study the effect of various parameters on the performance of dynamically scheduled processors. You will have to use SimpleScalar's sim-outorder simulator. This tool simulates a fairly aggressive and customizable out-of-order (OOO) execution core along with the supporting memory hierarchy. Use this simulator to measure the achieved CPI (or its reciprocal, IPC) for the gcc and fpppp benchmarks from the SPEC95 suite. Before you run your experiments, it is important to read section 4.4 of the SimpleScalar paper. This is necessary to get information on the underlying organization you will be simulating. Limit your runs to 100 million instructions.
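As a reminder of how the two metrics relate, here is a minimal sketch in Python. It assumes you have already read the total cycle count and the committed instruction count from the simulator's statistics output; the function and argument names are illustrative and not part of SimpleScalar.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction: total cycles divided by committed instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def ipc(cycles: int, instructions: int) -> float:
    """Instructions per cycle: the reciprocal of CPI."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(cpi(150, 100), ipc(150, 100))
```

Because IPC is 1/CPI, a lower CPI and a higher IPC describe the same improvement; report whichever you choose consistently across all experiments.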
Issue-width and Window-Size. Issue-width is the maximum number of instructions that can be issued (i.e., commence execution) during the same cycle. The issue-width is therefore an upper bound on the achievable IPC. The window size bounds how far ahead the processor can search for ILP. The goal of this assignment is to gain additional insight into how the observed ILP improves with changes to these critical parameters. Run nine simulations per benchmark, combining issue-widths of 4, 8 and 16 with window sizes of 32, 128 and 256. Note that it will not be adequate to change only the issue-width and window size parameters, as this will result in unbalanced organizations. For example, it makes little sense to have an issue-width of 16 while having only 1 integer ALU and fetching only 2 instructions per cycle. You have to carefully decide which other parameters should be changed, and by how much, in order to achieve a good balance and get some additional performance from the increased issue-width. In general, this is a fairly challenging task. To make things a little easier, assume that the default configuration (quad issue-width) represents a well-balanced core. Parameters you may have to change include the number of functional units, the fetch width, the decode width, etc. Choose 64 KB L1 instruction and data caches with 32-byte blocks and 4-way associativity. Also, use a combined predictor with 16K tables (each of the three predictors has 16K entries) and 8-bit global history.
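The experiment grid and the cache geometry above can be sketched as follows. This is only an illustration in Python: the dictionary keys are hypothetical names, not simulator options, and setting the fetch and decode widths equal to the issue-width is one possible balancing assumption, not a prescribed answer. Mapping these values onto the actual sim-outorder options, and deciding how to scale the functional units, is left to you.

```python
from itertools import product

ISSUE_WIDTHS = (4, 8, 16)
WINDOW_SIZES = (32, 128, 256)


def cache_sets(size_bytes: int, block_bytes: int, assoc: int) -> int:
    """Number of sets = capacity / (block size * associativity)."""
    return size_bytes // (block_bytes * assoc)


def sweep():
    """All (issue_width, window_size) pairs: 3 x 3 = 9 runs per benchmark."""
    return list(product(ISSUE_WIDTHS, WINDOW_SIZES))


def config(issue_width: int, window_size: int) -> dict:
    """Describe one experiment. Key names are hypothetical, not simulator flags."""
    return {
        "issue_width": issue_width,
        "window_size": window_size,
        # Assumption: fetch and decode keep pace with issue.
        "fetch_width": issue_width,
        "decode_width": issue_width,
        # Fixed by the assignment: 64 KB, 32-byte blocks, 4-way associative.
        "l1_sets": cache_sets(64 * 1024, 32, 4),
        "l1_block_bytes": 32,
        "l1_assoc": 4,
        # Fixed by the assignment: 16K-entry tables, 8-bit global history.
        "predictor_entries": 16 * 1024,
        "global_history_bits": 8,
    }


if __name__ == "__main__":
    for width, window in sweep():
        print(config(width, window))
```

Note that a 64 KB cache with 32-byte blocks and 4-way associativity has 65536 / (32 * 4) = 512 sets, which is the quantity a set-based cache specification typically needs.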
In your report, briefly explain which parameters you changed and why. Your final report should also include the measured CPI for each experiment and a short discussion of your results (e.g., why CPI changes as it does).
For the most resource-demanding configuration, change the simulator so that it outputs the IPC every 5 million instructions. Plot the IPC over time.
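One point worth care when plotting: if your modified simulator prints cumulative instruction and cycle counts at each 5-million-instruction boundary, the IPC of each interval is the ratio of the differences, not the running average. The Python sketch below assumes such a log has already been parsed into (instructions, cycles) pairs; the sample numbers are made up for illustration and the function name is hypothetical.

```python
def interval_ipc(samples):
    """Convert cumulative (instructions, cycles) samples into per-interval IPC.

    Returns a list of (instructions_at_end_of_interval, ipc_for_that_interval).
    """
    points = []
    prev_insts, prev_cycles = 0, 0
    for insts, cycles in samples:
        d_insts = insts - prev_insts
        d_cycles = cycles - prev_cycles
        if d_cycles <= 0:
            raise ValueError("cycle count must increase between samples")
        points.append((insts, d_insts / d_cycles))
        prev_insts, prev_cycles = insts, cycles
    return points


if __name__ == "__main__":
    # Made-up cumulative counts, sampled every 5 million instructions.
    samples = [(5_000_000, 4_000_000), (10_000_000, 6_500_000)]
    for insts, value in interval_ipc(samples):
        print(insts, value)
```

The resulting points can be passed to any plotting tool, with instructions executed on the horizontal axis and interval IPC on the vertical axis.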
Good luck.