Abstract
There is great interest in creating computational circuits to accelerate compute-intensive software algorithms such as molecular dynamics, logic simulation, video encoding, and financial modelling. However, circuit design is slow, tedious, and costly. To make it faster and more cost-effective, two changes are necessary: (1) datapath circuits must be compiled and simulated rapidly, and (2) FPGAs must be "virtualized" so that a large circuit can still run (more slowly) on a device that is too small for it. Unfortunately, modern FPGA design tools are too slow, and FPGAs have a fixed capacity with no ability to be virtualized.
In this talk, we motivate and propose a coarse-grained architecture, based on a massive array of 1000+ processors, together with a tool flow for rapidly compiling and simulating/executing computational circuits. As in a conventional CPU, time-multiplexing the hardware trades some speed for added capacity and bandwidth; however, the large degree of parallelism and high clock speeds will make up the difference. Using a motion estimation circuit from MPEG encoding, we will demonstrate "capacity virtualization". Mapping more sophisticated algorithms onto the array still requires completing a set of compiler-like software tools, but preliminary results are very promising.
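To make the time-multiplexing trade-off concrete, here is a minimal sketch, not the actual architecture or tool flow from the talk: a circuit needing V virtual processing elements (PEs) is emulated on P physical processors by running roughly ceil(V/P) PE contexts per virtual cycle, so capacity grows by that factor while per-cycle speed drops by about the same factor. The function names (`run_virtual_cycle`, `pe_step`) and the 4096/1024 sizing are illustrative assumptions.

```python
import math

def run_virtual_cycle(pe_states, pe_step, num_physical):
    """Advance all V virtual PEs by one cycle using num_physical processors.

    pe_states: list of per-PE state, one entry per virtual PE
    pe_step:   function mapping a PE's old state to its new state
    """
    V = len(pe_states)
    contexts = math.ceil(V / num_physical)  # contexts per physical processor
    # Sequentially emulate the 'contexts' passes; a real processor array
    # would execute each pass across all num_physical processors in parallel.
    for c in range(contexts):
        for p in range(num_physical):
            pe = c * num_physical + p
            if pe < V:
                pe_states[pe] = pe_step(pe_states[pe])
    return pe_states

# Example: 4096 virtual PEs on 1024 physical processors -> 4 contexts,
# i.e., roughly 4x slower per virtual cycle but 4x the logical capacity.
states = run_virtual_cycle([0] * 4096, lambda s: s + 1, 1024)
assert all(s == 1 for s in states)
```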
Biography
- University of Waterloo:
  - 2002: BASc, Computer Engineering
  - 2004: MASc, Computer Engineering
- Worked for 1 year at Slipstream Inc. on mobile browser acceleration
- Now at the University of British Columbia:
  - PhD candidate supervised by Prof. Guy Lemieux
  - Thesis: Massively Parallel Processing on a Chip