SPREE
Soft Processor Rapid Exploration Environment
NEWS
- Apr 2008 - Benchmarks released
- Jan 2008 - Compiler released
- Jan 2008 - Generated RTL for some processors released
- Feb 2007 - Instruction and data caches implemented
- Nov 2006 - Complete EEMBC benchmark suite running in hardware on a SPREE processor.
- Sep 2006 - A SPREE processor is ported to the TM4 multi-FPGA board using DDR SDRAM for instruction/data memory.
- Apr 2006 - Peter Yiannacouras leaves for internship at Intel Microarchitecture Research labs.
- Mar 2006 - File I/O established between SPREE processors on the TM4 and host Linux boxes.
- Dec 2005 - SPREE processors ported to TM4 multi-FPGA boards and verified running in actual hardware.
The Purpose of SPREE
Processors implemented on a programmable fabric are referred to as soft processors. Soft processors are already widely deployed (Altera’s Nios, Xilinx’s MicroBlaze), so their architectures have become important. Our goal is to investigate the architecture of soft processors and develop an FPGA-specific understanding of processor architecture. To do so, we have developed SPREE (Soft Processor Rapid Exploration Environment), which, through RTL generation, can produce accurate area, clock frequency, power, and cycle count measurements from a textual description of a processor.
SPREE Overview
The entire SPREE system consists of everything needed to extract measurements from the input processor architecture description. This includes the RTL Generator, benchmarks, RTL Simulator, RTL CAD system, and accompanying scripts for using each. These are shown in the block diagram overview of SPREE above. Also part of the system, but not shown, are a compiler infrastructure (GCC cross-compiled) for benchmarking and an instruction set simulator for verification.
The core of SPREE is the automatic RTL Generator, which produces synthesizable Verilog HDL code from the input processor architecture description. Using the generator, one can quickly transform an architectural idea into a real implementation. Because the implementation is intended to stay on an FPGA, all measurements can be made directly from the RTL description. Synthesis of the HDL produces accurate area, clock frequency, and power measurements, while RTL simulation yields the cycle count as well as cycle-by-cycle behaviour. Thus, one can quickly and fully understand the costs and benefits of any architectural modification.
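As a rough illustration of how these measurements can be combined, the sketch below computes wall-clock execution time (cycle count divided by clock frequency) and an area-delay product for two processor variants. It is not part of SPREE: the `Measurement` record, the function names, the area unit, and all of the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record of the metrics for one processor variant."""
    name: str
    area: int         # area from synthesis (unit assumed, e.g. logic elements)
    fmax_mhz: float   # maximum clock frequency from synthesis, in MHz
    cycles: int       # benchmark cycle count from RTL simulation


def runtime_us(m: Measurement) -> float:
    """Wall-clock time in microseconds: cycles divided by frequency in MHz."""
    return m.cycles / m.fmax_mhz


def area_delay(m: Measurement) -> float:
    """Area-delay product; a lower value means a more area-efficient design."""
    return m.area * runtime_us(m)


# Made-up numbers for illustration only, not measured SPREE results.
base = Measurement("variant A", area=1000, fmax_mhz=80.0, cycles=1_600_000)
deep = Measurement("variant B", area=1200, fmax_mhz=100.0, cycles=1_500_000)

for m in (base, deep):
    print(f"{m.name}: {runtime_us(m):.0f} us, area-delay {area_delay(m):.3g}")
```

With these invented inputs, variant B is larger but runs faster and needs fewer cycles, so it finishes sooner and also has the lower area-delay product. Real trade-offs depend on the measured values.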
With this ability, one can perform focussed studies on many architectural ideas, including different component implementations, resource sharing, pipeline depth, pipeline organization, forwarding/bypass logic, HW/SW codesign evaluation, ISA changes, and application-specific customizations.