A 64-processor prototype of the NUMAchine multiprocessor architecture (illustrated below) is under construction in the Dept. of Electrical and Computer Engineering at the Univ. of Toronto.
The implementation of each station is based on the FutureBus+ physical standard, but NUMAchine utilizes a custom synchronous bus protocol.
A number of printed circuit boards have been designed and fabricated:
Status: The I/O board has been fabricated and is working... pictures pending. Also, a number of circuit boards which implement the global ring for the top level of the interconnection network have been fabricated and are being tested.
All boards utilize field-programmable devices (FPDs) from the Altera
Corporation for much of the control circuitry, such as the system interface
for the MIPS R4400 microprocessor, the
directory controller on the memory board, and the ring controller on the
network interface board. Field-programmable devices provide shorter design
cycles and cost-effectiveness (although good performance requires careful
design). In addition, FPDs provide flexibility to implement new protocols
to support future research.
The NUMAchine Hardware Development Group
Major contributors who have moved on:
The physical bus backplane is at the bottom of the photograph. The boards plug vertically into the backplane.
From left to right:
The power supply is visible directly beneath the bus backplane. A clock generation and distribution board (not visible) is located underneath the backplane.
The Processor Board
At the top of the board are LED displays and connectors for diagnostics, EPROM to program the Altera FPDs, and EPROM with boot code for the R4400.
The MIPS R4400 microprocessor with heat sink is at the center of the board, surrounded by SRAM cache chips.
Directly below the R4400 is a row of Altera field-programmable devices which serve as the system interface for the R4400. Below these chips is a row of FIFO buffers to and from the NUMAchine station bus. Finally, below the FIFOs is a row of FutureBus+ BTL chips for listening to and driving the NUMAchine station bus.
Click on the picture to see the latest version of the processor board, revision 3, in detail.
The connector to the NUMAchine station bus is at the bottom of the board.
The Memory Board
DRAM SIMMs occupy the left side of the board. The top right-hand corner is occupied by a bank of SRAM chips used in maintaining the directory for the cache coherence protocol.
At the right-hand center of the board are the Altera FPDs which contain the control circuitry for the cache coherence protocol. There is also an Altera FPD at the top center of the board to control the DRAM array.
FIFO buffers and BTL interface chips connect the memory board to the NUMAchine station bus through the connector at the bottom of the board.
Click on the picture to see the latest version of the memory board, revision 2, in detail. Hardware monitoring, which was not present in the original revision, has been added in the Altera FLEX10K30 device. The patchwires were necessary to correct an FPGA programming problem, and have been eliminated with a final respin of the board.
The Network Interface Board
The ring connectors are visible in the top corners of the board. The buffers for the ring interconnect occupy the space between the connectors.
The DRAM chips for the remote data cache occupy a small area on the underside of the board.
The Altera FPDs containing the control circuitry for the cache coherence protocol, the rings, and the remote data cache are clearly visible in their sockets.
Pipelining for the wide data paths on this board requires the large number of buffer chips which occupy much of the board.
FIFO buffers and BTL chips are located at the bottom left and bottom right, as well as along the bottom edge of the board, directly above the connector to the NUMAchine station bus.
Click on the picture to see the latest version of the network interface board, revision 2, in detail. You will notice that many of the discrete buffers have been replaced with Altera FLEX6016 FPGAs. Also, the SDRAM has been moved to the top surface.
The Clock Generator Board
The clock generator board can be set to a wide range of frequencies using the red DIP switch block. The differential ECL master clock is generated by the chip at the top centre of the board and split 2:1 by the small chip in the centre. The left and right chips are 9:1 fanout replicators, giving a total of 18 ECL clock signals. We distribute the clocks to the NUMAchine backplane via twisted-pair cables. Of course, we must take care that the cables are all the same length to minimize skew between the signals.
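The arithmetic behind the clock tree, and the reason for matching cable lengths, can be sketched in a few lines of Python. This is only an illustrative model: the function names are hypothetical, and the propagation delay of roughly 5 ns per metre is an assumed round figure for twisted-pair cable, not a measured NUMAchine value.

```python
# Illustrative model of the clock tree described above: one master clock,
# a two-way split, then two nine-way fanout chips.
SPLITTER_OUTPUTS = 2   # from the text: master clock is split two ways
FANOUT_PER_CHIP = 9    # from the text: each replicator drives nine outputs

# ASSUMPTION: rough propagation delay for twisted-pair cable, in ns per metre.
ASSUMED_DELAY_NS_PER_M = 5.0


def total_clock_outputs(splitter_outputs=SPLITTER_OUTPUTS,
                        fanout_per_chip=FANOUT_PER_CHIP):
    """Number of clock signals leaving the board."""
    return splitter_outputs * fanout_per_chip


def cable_skew_ns(lengths_m, delay_ns_per_m=ASSUMED_DELAY_NS_PER_M):
    """Worst-case arrival-time difference caused by cable length alone."""
    return (max(lengths_m) - min(lengths_m)) * delay_ns_per_m


if __name__ == "__main__":
    print("clock outputs:", total_clock_outputs())
    # Equal-length cables contribute no skew; a 10 cm mismatch does.
    print("matched cables skew (ns):", cable_skew_ns([1.0, 1.0, 1.0]))
    print("10 cm mismatch skew (ns):", cable_skew_ns([1.0, 1.1]))
```

Under the assumed delay figure, every 10 cm of length mismatch adds about half a nanosecond of skew, which is why equal-length cables matter.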
The Bus Arbiter Board
The bus arbiter board is a centralized, synchronous arbiter that controls access to the NUMAchine bus. Since this was one of the first boards we made, a few miscellaneous test circuits were also added to experiment with high-speed signalling using Altera devices. These test circuits use the DIP switches to test different functions. Also, a NUMAchine station RESET switch is located on this board, just below the DIP switches.
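As a rough illustration of what a centralized, synchronous arbiter does, the Python sketch below samples all request lines once per clock cycle and grants the bus to at most one requester. The round-robin policy, the class name, and the four-requester example are assumptions made for illustration; the page does not describe the arbitration policy the NUMAchine arbiter actually implements.

```python
class RoundRobinArbiter:
    """Hypothetical model of a centralized synchronous bus arbiter.

    ASSUMPTION: round-robin priority. The actual NUMAchine policy is not
    stated in the text above.
    """

    def __init__(self, num_requesters):
        if num_requesters <= 0:
            raise ValueError("num_requesters must be positive")
        self.num_requesters = num_requesters
        # Start so that requester 0 has the highest priority on the first cycle.
        self._last_granted = num_requesters - 1

    def step(self, requests):
        """Model one clock cycle.

        requests: sequence of booleans, one per requester.
        Returns the index of the granted requester, or None if nobody asked.
        """
        if len(requests) != self.num_requesters:
            raise ValueError("one request line per requester is expected")
        # Search starting just after the most recent winner.
        for offset in range(1, self.num_requesters + 1):
            candidate = (self._last_granted + offset) % self.num_requesters
            if requests[candidate]:
                self._last_granted = candidate
                return candidate
        return None


if __name__ == "__main__":
    arbiter = RoundRobinArbiter(4)
    for cycle in range(3):
        # Requesters 0 and 2 both ask for the bus on every cycle.
        print(cycle, arbiter.step([True, False, True, False]))
```

With requesters 0 and 2 asserting their requests continuously, the grant alternates between them, so neither can starve the other.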
The bus arbiter function has been added to the latest version of the I/O Board. Unfortunately, we do not have scans of that board ready yet for display.