ECE1388: VLSI Design Methodology
256kbit SDRAM Design
Rebecca Au & Keith Tang
In this project, a 256-kbit synchronous DRAM is designed in 0.35-µm CMOS technology. It consists of 4 memory banks, each with a size of 256 x 256 bits. Within a row, the DRAM is addressed by column in 16-bit words. During Read/Write operations, a 16-bit word is transferred to or from the output buffer in parallel. It has a total of 32 I/O pins: 8 for row/column address, 2 for bank address, 16 for data input/output, 3 for power and ground, and 3 others for row/column address select and refresh. It supports Read/Write, Burst, and Refresh operations. All commands and operations are executed on the falling edge of the master clock signal, CLK. The DRAM operates with a clock frequency of 50 MHz. The core area is 1950 µm x 1750 µm, with the memory banks taking 76 % of the area. The total area with I/O pads is 2362 µm x 2070 µm.
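The organization figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses Python only as a calculator; the variable names are illustrative and are not part of the design.

```python
# Organization figures taken from the text.
BANKS = 4
ROWS_PER_BANK = 256
BITS_PER_ROW = 256
WORD_BITS = 16

total_bits = BANKS * ROWS_PER_BANK * BITS_PER_ROW
words_per_row = BITS_PER_ROW // WORD_BITS               # column-selectable words per row
column_address_bits = (words_per_row - 1).bit_length()
row_address_bits = (ROWS_PER_BANK - 1).bit_length()

# Pin budget as listed in the text.
pins = {
    "row/column address": 8,
    "bank address": 2,
    "data input/output": 16,
    "power and ground": 3,
    "address select and refresh": 3,
}

print(total_bits, total_bits // 1024)   # 262144 256
print(words_per_row, column_address_bits, row_address_bits)   # 16 4 8
print(sum(pins.values()))               # 32
```

The 8-bit row address and the 4-bit column address both fit on the 8 shared address pins, which is consistent with the two-cycle RASn/CASn addressing scheme.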
The block diagram is shown in Fig. 1.
Fig. 1 SDRAM Block Diagram
The WriteEn signal selects either a Read or a Write operation: WriteEn HIGH for write and LOW for read. Data is read from or written to a memory location in 2 clock cycles. In the first cycle, the row address is latched into the decoder on RASn LOW. In the next cycle, the column address is latched into the decoder on CASn LOW. At the next falling edge of CLKn, the address is decoded, the memory cells are sensed by the sense amplifiers, and the data is read from or written to the output buffers. As shown in Fig. 2, the delay between the falling edge of CLKn and the rising edge of the selected word line is 3.8 ns. This large delay is needed because the ripple counter takes additional time to generate addresses incrementally during refresh or burst operation. The delay from the falling edge of CLKn to the rising edge of the selected column line is 4.5 ns. The delay to the column select line must be longer than that to the word line, because the selected bitline is turned on after the wordline.
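The access sequence described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the class name BankModel and its method are hypothetical, a single bank is modelled as 256 rows of 16 words, and the latched addresses are assumed to stay valid after an access. Circuit-level timing such as the 3.8 ns and 4.5 ns delays is not modelled.

```python
class BankModel:
    """Behavioural sketch of one bank: 256 rows x 16 words of 16 bits (hypothetical model)."""

    def __init__(self):
        self.cells = [[0] * 16 for _ in range(256)]
        self.row = None
        self.col = None

    def falling_edge(self, rasn=1, casn=1, addr=0, write_en=0, data_in=0):
        """Apply one falling clock edge; return a 16-bit word on a read, else None."""
        if rasn == 0:                       # first cycle: latch the 8-bit row address
            self.row = addr & 0xFF
            return None
        if casn == 0:                       # next cycle: latch the 4-bit column address
            self.col = addr & 0x0F
            return None
        if self.row is None or self.col is None:
            return None                     # nothing latched yet
        if write_en:                        # WriteEn HIGH selects a write
            self.cells[self.row][self.col] = data_in & 0xFFFF
            return None
        return self.cells[self.row][self.col]   # WriteEn LOW selects a read


bank = BankModel()
bank.falling_edge(rasn=0, addr=0x2A)              # latch row 0x2A
bank.falling_edge(casn=0, addr=0x3)               # latch column 0x3
bank.falling_edge(write_en=1, data_in=0xBEEF)     # write the addressed word
word = bank.falling_edge(write_en=0)              # read it back
print(hex(word))                                  # 0xbeef
```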
Fig. 2 Simulation Result showing the delay from CLK to selected wordline and column
The dynamic nature of DRAM requires that the memory be refreshed periodically so as not to lose the contents of the memory cells. This is accomplished internally by the refresh counter in the row address buffer. In refresh mode, the memory is accessed with every possible row address combination. The refresh operation requires 10.24 µs to refresh all memory locations. The memory cells should be refreshed every 28 ms. Therefore, the DRAM is unavailable about 0.036 % of the time for refresh. The Refresh mode is selected by setting REFRESH HIGH.
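The refresh overhead follows directly from the figures above. The short calculation below takes the 10.24 refresh time as microseconds, which is the unit consistent with the quoted 0.036 % figure; the variable names are illustrative.

```python
CLOCK_HZ = 50e6               # clock frequency from the text
REFRESH_TIME_S = 10.24e-6     # refresh time, assumed to be in microseconds
REFRESH_INTERVAL_S = 28e-3    # refresh interval from the text

clock_period_s = 1 / CLOCK_HZ
refresh_cycles = round(REFRESH_TIME_S / clock_period_s)
overhead_percent = 100 * REFRESH_TIME_S / REFRESH_INTERVAL_S

print(refresh_cycles)                  # 512
print(f"{overhead_percent:.4f} %")     # 0.0366 %
```

At 50 MHz, the refresh time corresponds to 512 clock cycles of 20 ns each.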
The burst mode can be used to access each word incrementally in the selected column. A Read/Write operation first accesses the desired memory location. Then, in burst mode, the word line is incremented and the data in the selected column is read or written word after word.
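As a rough illustration, the address sequence visited by a burst can be sketched as follows. The function name burst_addresses is hypothetical, and the wrap-around from row 255 to row 0 is an assumption based on the 8-bit row counter; the text does not state how a burst ends.

```python
def burst_addresses(start_row, column, length, rows=256):
    """Yield the (row, column) pairs visited by a burst that starts at start_row."""
    for i in range(length):
        # The column stays fixed while the wordline address increments.
        yield ((start_row + i) % rows, column)   # wrap-around is an assumption


print(list(burst_addresses(start_row=254, column=5, length=4)))
# [(254, 5), (255, 5), (0, 5), (1, 5)]
```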
Fig. 3 Simulation result showing wordline selected incrementally in burst and refresh operation
Blocks in the SDRAM
Memory Cells and Banks
The memory cell comprises two NMOS transistors, as shown in Fig. 4. One of the NMOS transistors has its source and drain connected together, and the pn-junction acts as a planar storage capacitor. The gate of the other transistor is controlled by the wordline for accessing the storage capacitor. The drain and source of the storage transistor are shared to minimize layout area. The cell is also shielded by a ground line to minimize signal coupling.
Because the wordline is connected to the gates of the memory cells, the cells present a large parasitic load on the wordline and degrade the circuit speed. The memory is therefore split into 4 banks at the cost of a larger layout area. Each bank consists of 256 x 256 bits, as shown in Fig. 5. A folded bitline structure is used to minimize noise. The wordline is laid out in polysilicon, which allows the NMOS to be formed by crossing the poly wordline over an n+ active area. To minimize the parasitics on the wordline, Metal 3 is connected in parallel with the polysilicon, with contacts every 16 cells.
Fig. 4a Schematic of Memory Cell
Fig. 4b Layout of Memory Cell
Fig. 5 Layout of Memory Bank
Each sense amplifier is shared by the top and bottom memory arrays. The sense amplifier consists of equilibration and bias circuits, isolation devices, input/output transistors, and Nsense- and Psense-amplifiers, as shown in Fig. 6. During precharge, the bitlines are precharged to Vcc/2 by the equilibration and bias circuits. Then, in active mode, the isolation devices isolate the non-selected array. When the column line is selected HIGH, the sense amplifier pulls the bitlines HIGH or LOW depending on the stored charge. The simulation plot is shown in Fig. 7.
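The small signal that the sense amplifier has to resolve can be estimated with the standard charge-sharing relation between the storage capacitor and a bitline precharged to Vcc/2. The sketch below is illustrative only: the function name is hypothetical, and the supply voltage and capacitance values are placeholders, not figures from this design.

```python
def bitline_swing(v_cell, vcc, c_cell, c_bitline):
    """Bitline voltage change after a cell shares charge with a Vcc/2-precharged bitline."""
    v_precharge = vcc / 2
    return (v_cell - v_precharge) * c_cell / (c_cell + c_bitline)


# Hypothetical values for illustration only, not taken from the design.
VCC = 3.3             # volts
C_CELL = 30e-15       # farads
C_BITLINE = 300e-15   # farads

swing_one = bitline_swing(VCC, VCC, C_CELL, C_BITLINE)    # cell storing a HIGH level
swing_zero = bitline_swing(0.0, VCC, C_CELL, C_BITLINE)   # cell storing a LOW level
print(swing_one, swing_zero)
```

The sign of the swing indicates whether the sense amplifier drives the bitline HIGH or LOW. A larger bitline capacitance reduces the swing.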
Fig. 6a Schematic of Sense Amplifier
Fig. 6b Layout of Sense Amplifier
Fig. 7 Simulation Plot of Sense Amplifier
The row/column decoders decode the input address bits to access one of the wordlines/bitlines. Dynamic logic is used to increase speed, lower power and minimize layout area.
The 8-to-256-bit row decoder is designed to access one of the 256 word lines in the memory bank, as shown in Fig. 8. The row decoder consists of 6 stages of NAND-INV-NAND-INV-NAND-INV cells. Predecoding is used for its advantages of lower power, higher efficiency and simplified layout. The 8-bit row address is first predecoded into 16 bits using NAND-INV. The 16 bits are further predecoded into 32 bits using NAND-INV. In the last stage, the predecoded signals are passed to the final NAND-INV and decoded to 256 bits for accessing one of the 256 wordlines.
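The three-level predecoding can be sketched at the logic level as follows. This is a functional illustration only: the text gives the line counts (16, then 32, then 256) but not which address bits are grouped together, so the pairing of adjacent bits below is an assumption, and the dynamic NAND-INV circuit implementation is not modelled. The function names are hypothetical.

```python
def one_hot(value, width):
    """Return a one-hot list of length 2**width with entry `value` set."""
    return [1 if i == value else 0 for i in range(2 ** width)]


def row_decode(addr):
    """Decode an 8-bit row address into 256 one-hot wordline signals."""
    # Level 1: four 2-to-4 predecoders, 4 groups x 4 lines = 16 lines.
    # Grouping adjacent address bits is an assumption.
    pairs = [(addr >> shift) & 0b11 for shift in (0, 2, 4, 6)]
    level1 = [one_hot(p, 2) for p in pairs]

    # Level 2: combine the groups pairwise, 2 groups x 16 lines = 32 lines.
    low = [level1[0][i & 3] & level1[1][i >> 2] for i in range(16)]
    high = [level1[2][i & 3] & level1[3][i >> 2] for i in range(16)]

    # Level 3: one line from each 16-line group selects 1 of 256 wordlines.
    return [low[w & 15] & high[w >> 4] for w in range(256)]


wordlines = row_decode(0xA7)
print(sum(wordlines), wordlines.index(1))   # 1 167
```

Each final gate needs only two inputs instead of eight, which is one reason predecoding simplifies the layout.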
Similarly, the 4-to-16-bit column decoder is designed to access one of the 16 column lines (each column line accesses a 16-bit word), as shown in Fig. 9. It consists of 4 stages of NAND-INV-NAND-INV. The 4-bit address is predecoded to 8 bits, which are then decoded to 16 bits.
The bank decoder is used to select one of the 4 memory banks. It consists of simple NAND-INV-INV-INV cells, as shown in Fig. 10.
Fig. 8a Schematic of 1 path of the 6 Stages Row Decoder
Fig. 8b Layout of Row Decoder
Fig. 9a Schematic of 1 path of the 4 Stages Column Decoder
Fig. 9b Layout of Column Decoder
Fig. 10 Layout of Bank Decoder
Row/Column Address Buffers
The address buffer consists of an input inverter, a latch and refresh circuitry (for the row address buffer only), as shown in Fig. 11. The input inverter drives through a mux, which is controlled by the clock. When the clock is low, the mux is enabled. The refresh counter consists of a single inverter and a pair of inverter latches coupled through a pair of complementary muxes to form a one-bit counter. For every HIGH-to-LOW transition of CLK, the register output toggles. All of the one-bit counters are cascaded together to form a ripple counter. The latch, which consists of two inverters and two input muxes, latches the row address after RAS falls. The feedback inverter has low drive capability, which allows the latch to be overwritten by either the address input buffer or the refresh counter.
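The cascading of one-bit counters into a ripple counter can be illustrated with a small behavioural sketch. The assumptions here are that each stage is clocked by the HIGH-to-LOW transition of the previous stage's output (which yields an up-counter) and that the counter is 8 bits wide to match the row address. The class name RippleCounter is hypothetical, and the inverter and mux circuitry is not modelled.

```python
class RippleCounter:
    """Behavioural sketch of cascaded one-bit toggle stages (hypothetical model)."""

    def __init__(self, bits=8):
        self.q = [0] * bits              # q[0] is the least significant stage

    def clk_falling_edge(self):
        """Apply one HIGH-to-LOW transition of CLK to the first stage."""
        for i in range(len(self.q)):
            previous = self.q[i]
            self.q[i] ^= 1               # this stage toggles
            if not (previous == 1 and self.q[i] == 0):
                break                    # no falling edge on this output, so the ripple stops
        return self.value()

    def value(self):
        """Return the counter state as an integer."""
        return sum(bit << i for i, bit in enumerate(self.q))


counter = RippleCounter()
sequence = [counter.clk_falling_edge() for _ in range(5)]
print(sequence)   # [1, 2, 3, 4, 5]
```

Because each stage waits for the previous one to switch, the settling time grows with the number of stages that toggle, which is consistent with the extra delay budgeted for refresh and burst operation.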
Fig. 11a Schematic of Row Address Buffer
Fig. 11b Layout of Row Address Buffer
Full Chip Layout
The full chip floorplan of the core area is shown in Fig. 12 and the layout is shown in Fig. 13. The core measures 1950 µm x 1750 µm, and the chip measures 2362 µm x 2070 µm with I/O pads. The 4 memory banks, which take up 76 % of the total core area, are placed slightly off-center, leaving space on the left for the row decoder and row address buffers. The sense amplifiers are placed in the middle between two memory banks. Space is left between the row decoder and the memory banks for routing the 256 x 2 wordline signals to the memory elements. The column decoder and column address buffers are relatively small in area and are placed at the bottom of the row decoder, next to the sense amplifiers. The I/O pins are placed near the edge of the core and can be easily routed to the I/O pads. The pad frame is shown in Fig. 14.
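The quoted dimensions translate into areas as follows. This calculation reads the dimensions as micrometres, the unit consistent with a 0.35-µm process; the variable names are illustrative.

```python
CORE_W_UM, CORE_H_UM = 1950, 1750    # core dimensions, read as micrometres
CHIP_W_UM, CHIP_H_UM = 2362, 2070    # dimensions including I/O pads
BANK_FRACTION = 0.76                 # share of the core taken by the memory banks

core_mm2 = CORE_W_UM * CORE_H_UM / 1e6
chip_mm2 = CHIP_W_UM * CHIP_H_UM / 1e6
bank_mm2 = BANK_FRACTION * core_mm2

print(f"core {core_mm2:.2f} mm^2, banks {bank_mm2:.2f} mm^2, chip {chip_mm2:.2f} mm^2")
print(f"core share of chip area: {100 * core_mm2 / chip_mm2:.0f} %")
```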
Fig. 12 Full Chip Floorplan
Fig. 13 Full Chip Layout
Fig. 14 Pad Frame Layout