ECE1773: Advanced Computer Architecture, Fall 2007
ECE,
Instructor: Andreas Moshovos, EA310, x6-7373, moshovos@eecg.toronto.edu
Lectures: Monday 12-2 BA4164 & Thursday 2-4 WB130
Communication:
Use e-mail as much as possible. The subject should start with “ACA:”.
Office hours: Stop by anytime (preferred method); if I’m available we can talk. Otherwise, make an appointment through e-mail.
If there is a conflict with the schedule, please e-mail me ASAP.
Final report due on December 21st, 11:59pm EST.
Please submit via e-mail with the subject line: “ACA: Final report”
Project Presentations:
We will meet on Thursday, December 13 and Friday, December 14, starting at 12:30pm on both days.
The presentations will be held in Pratt 266.
Please bring your own laptop; a projector will be provided.
Project Proposal and Report Requirements
How to use the EIO traces that were provided on the CD:
The EIO traces on the CD are compressed and are meant to be used with the simulator given in the myss directory. That simulator is a modified version of Simplescalar.
Before using it, edit the Makefile and remove all references to “condor_compile”. Then run “make config-pisa”.
If you compile on Cygwin, recent changes in the libraries mean you may need to add -lintl -liconv to the LIBS macro in the Makefile.
If you get a “config.h not found” error, run “ln -s target-pisa/config.h .” (the final “.” after config.h is part of the command).
Lecture Notes
1, What is this Course About - Technology - Course Outline - Expectations
Required:
(a) Read this before the next lecture: Micro-architectural Innovations: Boosting Processor Performance Beyond Technology Scaling, A. Moshovos and G. S. Sohi, IEEE Proceedings, Jan. 2001.
(b) Reference for the Simplescalar toolset: Simplescalar report. You are not expected to read this in one go. Use it as a reference.
(c) Read this before the end of the course: The Task of the Referee, Alan Jay Smith, IEEE Computer, 1990.
Optional:
(a) Preliminary Discussion of the Logical Design of an Electronic Computing Instrument, Arthur W. Burks, Herman H. Goldstine, and John von Neumann, Institute for Advanced Study, Princeton, N.J., 1946.
(b) Strong Inference, John R. Platt, Science, 1964.
2, Pipelining and Precise Interrupts
(a) Implementing Precise Interrupts in Pipelined Processors, J. E. Smith and A. Pleszkun, IEEE Transactions on Computers, May 1988. Required.
(b) Optimizing Pipelines for Power and Performance, V. Srinivasan, D. Brooks, M. Gschwind and P. Bose, in the Proceedings of the ACM/IEEE Annual Symposium on Microarchitecture, Nov. 2002. Optional.
4, Control Flow Prediction part #1
Part #2 is now included in the preceding link.
5, Introduction to OOO Execution and Register Renaming
Readings:
Complexity Effective Superscalar Processors, S. Palacharla, N. Jouppi and J. E. Smith, Proceedings of the Annual International Symposium on Computer Architecture, 1997.
A High-Speed Dynamic Instruction Scheduling Scheme for Superscalar Processors, Masahiro Goshima, Kengo Nishino, Yasuhiko Nakashima, Shin-ichiro Mori, Toshiaki Kitamura, and Shinji Tomita, MICRO 2001.
8, Simplescalar’s OOO Timing Simulator
9, Very Long Instruction Word Architectures
10, Instruction Supply and Load/Store Scheduling
Homeworks
You will need these files and the Simplescalar simulator source code.
Other relevant files:
a. Please install Cygwin on a Windows machine. Visit www.cygwin.com.
b. GCC port for Simplescalar. Installs under /usr/local.
c. MIPS ISA reference. Note that Simplescalar implements a modified MIPS-I instruction set architecture.
2. (a) Read and summarize in two pages at most the TAGE branch predictor paper: http://www.irisa.fr/caps/people/seznec/L-TAGE.pdf
(b) Using sim-safe.c, study the accuracy of a BTB. The BTB should be indexed using the PC of branches and should return the taken target address of the branch. Do not use tags. Report accuracy only for those branches that are taken. Vary the size from 1 to 1024 entries in power-of-two steps. Study only direct-mapped BTBs. Use cc1.lit.ss from hw1 for this study. Run it as follows: cc1.lit.ss -O2 gcc.i
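To make the BTB experiment in part (b) concrete, here is a minimal standalone sketch in Python of a direct-mapped, tagless BTB and the size sweep. It is not Simplescalar code: the trace format (pc, taken, target), the function names, and the index shift of 3 (which assumes 8-byte instructions) are all assumptions made for illustration. In the homework itself the same bookkeeping would be added inside sim-safe.c.

```python
def btb_accuracy(trace, num_entries, pc_shift=3):
    """Fraction of taken branches whose target a tagless BTB predicted.

    trace       -- iterable of (pc, is_taken, target) tuples (hypothetical format)
    num_entries -- BTB size, must be a power of two
    pc_shift    -- low PC bits dropped before indexing (assumed: 8-byte instructions)
    """
    assert num_entries > 0 and num_entries & (num_entries - 1) == 0
    btb = [None] * num_entries          # one stored target per entry, no tags
    taken = correct = 0
    for pc, is_taken, target in trace:
        if not is_taken:
            continue                    # accuracy is reported for taken branches only
        index = (pc >> pc_shift) & (num_entries - 1)
        taken += 1
        if btb[index] == target:        # predicted target matches the actual one
            correct += 1
        btb[index] = target             # always train with the latest taken target
    return correct / taken if taken else 0.0


def sweep(trace):
    """Accuracy for every power-of-two size from 1 to 1024 entries."""
    return {n: btb_accuracy(trace, n) for n in (1 << k for k in range(11))}
```

Because there are no tags, two branches that map to the same entry overwrite each other's targets, so accuracy on a real trace should generally improve as the number of entries grows and aliasing drops.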
3. Using sim-outorder, Simplescalar’s timing simulator, measure how many operands are ready for instructions as they enter the scheduler (done in ruu_dispatch). Collect the following statistics:
1. A graph where the Y axis is a percentage of all dynamic instructions. The graph should report the percentage of instructions that have no ready operand, 1 operand ready, or 2 operands ready.
Collect these statistics for the cc1.lit.ss benchmark, run as: cc1.lit.ss -O2 gcc.i
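As a rough illustration of the statistic requested above, the following Python sketch turns a list of per-instruction ready-operand counts into the three percentages to plot. The input list and the function name are hypothetical; the counts themselves would have to come from instrumentation added in ruu_dispatch, which is not shown here. Clamping counts above 2 into the "2 operands ready" bucket is also an assumption of this sketch.

```python
from collections import Counter


def ready_operand_percentages(ready_counts, max_operands=2):
    """Percentage of dynamic instructions with 0, 1, ..., max_operands ready operands.

    ready_counts -- one integer per dispatched instruction (hypothetical input)
    """
    total = len(ready_counts)
    # Assumption: any count above max_operands is folded into the top bucket.
    hist = Counter(min(count, max_operands) for count in ready_counts)
    return {
        k: (100.0 * hist[k] / total if total else 0.0)
        for k in range(max_operands + 1)
    }
```

For example, ready_operand_percentages([0, 1, 2, 2]) gives 25% with no ready operand, 25% with one, and 50% with two; these three values are the bars of the requested graph.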
PROJECT
Here’s a list of suggested papers: