As FPGAs continue to increase in transistor density, designers are using them to build larger and more complex systems-on-chip that require frequent sharing, communication, queueing, and synchronization among distributed functional units and compute nodes. These functions boil down to FIFOs and register files, which can both be implemented using multi-ported memories.
In this work we propose a new design for true multi-ported memories that capitalizes on FPGA block RAMs while providing:
A Live Value Table (LVT) based design is more efficient, even though the LVT itself is implemented purely in logic elements, because the LVT is much narrower than the actual memory banks: it holds only bank numbers rather than full data values. The lines that are decoded and multiplexed are therefore also much narrower, and hence more efficiently placed and routed. An LVT-based design also leverages block RAMs, which implement bulk memory more efficiently, and has an operating frequency closer to that of the block RAMs themselves.
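The role of the LVT can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the hardware design itself: it assumes one bank per write port, one replica of each bank per read port, and no two ports writing the same address at the same time. The class and method names (`LVTMemory`, `lvt_entry_bits`) are hypothetical and chosen for illustration.

```python
class LVTMemory:
    """Behavioral sketch of an LVT-based multi-ported memory (illustrative only)."""

    def __init__(self, depth, num_write_ports, num_read_ports):
        self.depth = depth
        self.num_write_ports = num_write_ports
        self.num_read_ports = num_read_ports
        # Assumption: banks[w][r] is the replica of write port w's bank
        # that serves read port r (each replica models a simple 1W/1R RAM).
        self.banks = [
            [[0] * depth for _ in range(num_read_ports)]
            for _ in range(num_write_ports)
        ]
        # The LVT stores, per address, only the index of the bank
        # that was written most recently, never the data itself.
        self.lvt = [0] * depth

    def write(self, port, addr, value):
        # A write port updates every replica of its own bank...
        for replica in self.banks[port]:
            replica[addr] = value
        # ...and records itself in the LVT as the owner of the live value.
        self.lvt[addr] = port

    def read(self, port, addr):
        # The LVT selects which bank holds the live value; the read port
        # then reads its own replica of that bank.
        live_bank = self.lvt[addr]
        return self.banks[live_bank][port][addr]

    def lvt_entry_bits(self):
        # Bits needed to name one of num_write_ports banks: ceil(log2(n)).
        return (self.num_write_ports - 1).bit_length()


mem = LVTMemory(depth=256, num_write_ports=2, num_read_ports=2)
mem.write(0, 5, 111)   # write port 0 stores 111 at address 5
mem.write(1, 5, 222)   # write port 1 later overwrites address 5
latest = mem.read(0, 5)  # the LVT steers the read to bank 1
```

In this model a memory with two write ports needs only a 1-bit LVT entry per address, regardless of how wide the stored data words are, which is the narrowness the paragraph above refers to.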
Additionally, LVT-based design and multipumping are complementary: we show that multipumping can reduce the area of an LVT-based design at the cost of halving its maximum operating frequency. With these techniques we can support soft solutions for multi-ported memories without requiring expensive hardware block RAMs with more than two ports.
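The area/frequency trade-off can be made concrete with a rough back-of-the-envelope model. It is a simplification, not the authors' cost model: it assumes an unpumped design needs one 1W/1R bank per (write port, read port) pair, and that pumping each bank by a factor k lets it serve k writes and k reads per external cycle. The function names and the 400 MHz figure are hypothetical.

```python
from math import ceil


def lvt_bank_count(num_write_ports, num_read_ports, pump_factor=1):
    """Estimate the number of 1W/1R banks under the simplified model above."""
    # Each physical port is time-shared by pump_factor logical ports.
    physical_writes = ceil(num_write_ports / pump_factor)
    physical_reads = ceil(num_read_ports / pump_factor)
    return physical_writes * physical_reads


def external_fmax(internal_fmax_mhz, pump_factor=1):
    """External clock limit when the memory runs pump_factor internal cycles."""
    return internal_fmax_mhz / pump_factor


# Hypothetical 4-write, 8-read memory with a 400 MHz internal limit.
banks_plain = lvt_bank_count(4, 8)                  # no multipumping
banks_pumped = lvt_bank_count(4, 8, pump_factor=2)  # 2x multipumping
fmax_pumped = external_fmax(400.0, pump_factor=2)   # external speed is halved
```

Under these assumptions, 2x multipumping cuts the bank count from 32 to 8 while halving the external clock limit; real designs also pay for the multiplexing and control logic that this sketch ignores.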
For example, the charts below show the area and speed of three 32-bit-wide multi-ported memories on an Altera Stratix III FPGA: LVT-based using M9K block RAMs, LVT-based using MLABs, and a pure logic (Pure-ALM) approach. At a depth of 256 elements, our LVT-M9K solution has 84% less area and 43% less delay than a pure logic implementation: