## 14.1 Negative-Resistance Read and Write Schemes for STT-MRAM in 0.13µm CMOS

David Halupka<sup>1</sup>, Safeen Huda<sup>1</sup>, William Song<sup>1</sup>, Ali Sheikholeslami<sup>1</sup>, Koji Tsunoda<sup>2</sup>, Chikako Yoshida<sup>2</sup>, Masaki Aoki<sup>2</sup>

<sup>1</sup>University of Toronto, Toronto, Canada, <sup>2</sup>Fujitsu Laboratories, Atsugi, Japan

Spin-torque-transfer (STT) magnetoresistive random-access memory (MRAM) [1-3], a successor to field-induced magnetic switching MRAM [4,5], is an emerging non-volatile memory technology that is CMOS-compatible, scalable, and allows for high-speed access. However, two circuit-level challenges remain for STT-MRAM: potentially destructive read access due to device variation and a high-power write access. This paper presents two STT-MRAM access schemes: a negative-resistance read scheme (NRRS) that guarantees non-destructive read by design, and a negative-resistance write scheme (NRWS) that, on average, reduces the write power consumption by 10.5%. A fabricated and measured test-chip in 0.13µm CMOS confirms both properties.

A simplified block diagram of the STT-MRAM is shown in Fig. 14.1.1. The memory consists of storage cells and reference cells. A negative-resistance read/write scheme is used to access the storage cells. Two reference cells (per row) provide an on-chip reference for the read scheme. Due to area and I/O constraints, we access the 16kb array serially through shift registers. The cell, shown in the top inset, consists of a magnetic tunnel junction (MTJ) in series with an access transistor, i.e., a 1T1MTJ topology [1,2]. The bottom inset shows a cross-section of the MTJ: an oxide barrier (MgO) between a fixed and a free magnetic layer (CoFeB) [3]. Depending on the magnetic orientation of the free layer with respect to the fixed layer, the MTJ resistance ($R_{MTJ}$) can be in one of two states, a parallel (P) low-resistance state ($R_P$) or an anti-parallel (AP) high-resistance state ($R_{AP}$), which encode the stored data bit.

To read from a cell, the conventional scheme [1,2,4], shown in Fig. 14.1.2, supplies a constant current ($I_{READ}$) to the MTJ and detects the value of $R_{MTJ}$ ($R_P$ or $R_{AP}$) by monitoring the voltage across it ($V_{MTJ}$). A cell can also be read using a constant voltage source, in which case $I_{MTJ}$ is measured. $I_{READ}$ is chosen to maximize the sense voltage, which comes at the cost of reducing the non-destructive read margin as $I_{READ}$ approaches the MTJ switching current, $I_{C-}$, indicated in the current-based $R_{MTJ}$ hysteresis loop. Due to device variation, $I_{C-}$ might deviate from its nominal value and become smaller than $I_{READ}$ in magnitude, which would cause destructive reads. To avoid this trade-off between the sense voltage and the read margin, and to guarantee a non-destructive read, we shunt the MTJ with a negative resistance ($-R$) that dynamically allocates current to the MTJ depending on its state (bottom of Fig. 14.1.2). We choose $-R$ such that $-R||R_{AP}$ is negative, while $-R||R_P$ is positive. A negative net resistance in parallel with the sense-line (SL) capacitance, $C_{SL}$, makes an unstable system, while a positive net resistance makes a stable system. A small initial voltage, $V_{init}$, causes $V_{MTJ}$ to grow exponentially to $V_{DD}$ in the unstable system, and to decay to ground in the stable system, thus sensing the MTJ state and reading the stored bit. To illustrate the non-destructive nature of the NRRS, we annotate the developing $V_{MTJ}$ with successive dots on the voltage-based $R_{MTJ}$ hysteresis loop. We choose the orientation of the MTJ such that in the unstable system (shaded dots) $V_{MTJ}$ moves to the left along the hysteresis loop, while in the stable system (clear dots) $V_{MTJ}$ moves to the right. Since $V_{init}$ can be arbitrarily small, we set it well below $V_{C-}$, ensuring a non-destructive read regardless of device variation.
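The stable/unstable behaviour described above can be illustrated with a first-order small-signal sketch. It treats $-R$ as an ideal linear negative resistance in parallel with $R_{MTJ}$ and $C_{SL}$, so that $V_{MTJ}(t) = V_{init}\,e^{-t/(R_{net} C_{SL})}$ with $R_{net} = -R||R_{MTJ}$. All numerical values (`R_P`, `R_AP`, `R_NEG`, `C_SL`, `V_INIT`, `V_DD`, `V_TH`, `T_SENSE`) are illustrative assumptions, not values from the test-chip.

```python
import math

def parallel(r_a: float, r_b: float) -> float:
    """Parallel combination of two resistances; either may be negative."""
    return (r_a * r_b) / (r_a + r_b)

def sense_voltage(t: float, v_init: float, r_net: float, c_sl: float, v_dd: float) -> float:
    """Voltage on C_SL after time t for a net resistance r_net, clamped at V_DD.

    Solves C dV/dt = -V / r_net, so V(t) = v_init * exp(-t / (r_net * c_sl)).
    A negative r_net gives growth (unstable); a positive r_net gives decay (stable).
    """
    exponent = min(-t / (r_net * c_sl), 50.0)  # cap to avoid overflow; V_DD clamp dominates
    return min(v_init * math.exp(exponent), v_dd)

# Assumed, illustrative values (not from the paper).
R_P, R_AP = 1.0e3, 2.0e3   # ohms
R_NEG = 1.5e3              # magnitude of -R, chosen between R_P and R_AP
C_SL = 100e-15             # farads
V_INIT, V_DD, V_TH = 0.01, 1.2, 0.6   # volts
T_SENSE = 5e-9             # seconds

def sensed_state(r_mtj: float) -> str:
    """Return 'AP' if the sense voltage ends above V_TH, else 'P'."""
    r_net = parallel(-R_NEG, r_mtj)
    v = sense_voltage(T_SENSE, V_INIT, r_net, C_SL, V_DD)
    return "AP" if v > V_TH else "P"

for name, r in (("P", R_P), ("AP", R_AP)):
    print(name, "net resistance:", parallel(-R_NEG, r), "sensed as:", sensed_state(r))
```

In this sketch the starting point is always the small $V_{init}$, and only the unstable (AP) case moves away from it, which matches the qualitative argument for why the read is non-destructive.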

Figure 14.1.3 presents the circuit implementation of the NRRS. To read from the storage cell, we first connect the BL to GND and then enable the WL, which connects the cell to the $-R$ read circuit. A sense amplifier (SA) monitors the voltage developing on the sense node, S, and compares it against an inverter threshold, $V_{TH}$, to output the read result. The $-R$ circuit [6] generates a current proportional to the voltage at S with a transconductor $M_G$, and then reflects this current back into S with a current mirror $M_{P1}$-$M_{P2}$, thus appearing as a negative resistance at node S. We bias the $-R$ circuit such that it forms a meta-stable system when connected to $R_P||R_{AP}$, which is the equilibrium point between the stable $R_P$ case and the unstable $R_{AP}$ case. We add a bias current to the current mirror using $M_B$, which is controlled by a replica bias scheme. This bias scheme consists of two reference cells per row, a scaled $-R/2$ replica circuit, and an amplifier in negative feedback. The two reference cells in parallel, one in the P state and one in the AP state ($R_P||R_{AP}$), connect to a $-R/2$ replica circuit, which is two $-R$ circuits in parallel. The amplifier forces the voltage at the reference sense node, $S_{REF}$, to $V_{TH}$ by driving the gate of $M_{BREF}$ in the negative feedback loop, as well as the gate of $M_B$. As a result of this biasing, the SA effectively compares the voltage at S with the voltage at $S_{REF}$, which corresponds to comparing $R_{MTJ}$ in the storage cell with the reference resistance $R_P||R_{AP}$, maximizing the sense margin. Transistor $M_C$ is used in a cascode configuration to increase the $R_P$ to $R_{AP}$ difference seen by the read circuit.
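One idealized way to read the bias condition is the short derivation below. It assumes a purely linear $-R$, takes "meta-stable" to mean that the $-R/2$ replica exactly cancels the reference load $R_P||R_{AP}$, and ignores the access transistors, the cascode $M_C$, and the bias current. The symbol $G_{net}$ (net conductance at node S) is introduced here for illustration only.

$$
\frac{2}{R} = \frac{1}{R_P} + \frac{1}{R_{AP}}
\quad\Longleftrightarrow\quad
R = \frac{2 R_P R_{AP}}{R_P + R_{AP}}, \qquad R_P < R < R_{AP}
$$

$$
G_{net}(R_{MTJ}) = \frac{1}{R_{MTJ}} - \frac{1}{R}, \qquad
G_{net}(R_P) = \frac{1}{2}\left(\frac{1}{R_P} - \frac{1}{R_{AP}}\right) > 0, \qquad
G_{net}(R_{AP}) = -\,G_{net}(R_P) < 0
$$

Under these assumptions, the storage cell sees net conductances of equal magnitude and opposite sign in the two states, so the decay (P) and growth (AP) proceed with the same time-constant magnitude $C_{SL}/|G_{net}|$. This is consistent with the claim that the replica bias maximizes the sense margin, although the real circuit is nonlinear and set by the feedback loop rather than by a fixed resistor value.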

To write into a cell, the conventional scheme applies a current to the cell that is larger than the critical switching current. The negative resistance saves power during write by moderating the current through the MTJ, $I_{MTJ}$, as its resistance drops from the high $R_{AP}$ to the low $R_P$, i.e., when writing a '0'. In contrast to the read scheme, the write scheme uses the reversed orientation of the MTJ with respect to the $-R$ circuit, as we show in Fig. 14.1.4. While writing a '0', the $-R$ driver reflects a current into BL that is proportional to $R_{MTJ}$. Once the write is complete and $R_{MTJ}$ reaches the low $R_P$ value, the driver reduces the reflected current, thus saving power. We show this using the current-based $R_{MTJ}$ hysteresis loop. If a cell stores a '1' (shaded circles), the driver exponentially increases $I_{MTJ}$, as in an unstable system, until $I_{MTJ}$ reaches $I_{C+}$ and the MTJ switches its state. Then, $I_{MTJ}$ exponentially decays, as in a stable system, saving power. If a cell stores a '0' (clear circles), $I_{MTJ}$ decays right away. An externally controllable $V_B$ allows us to trade off write-0 access time against the amount of current saved. To write a '1', we use a conventional push-pull topology, since $R_{MTJ}$ increases to the high $R_{AP}$ value, which naturally moderates $I_{MTJ}$.
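The write-0 behaviour can be sketched with a simple behavioural model: $I_{MTJ}$ grows exponentially while the cell is in the AP state, the cell switches instantly once $I_{MTJ}$ reaches the critical current, and $I_{MTJ}$ then decays exponentially. The function name `simulate_write0`, the single time constant `TAU`, the instantaneous switching, and all numerical values are assumptions made for illustration; the model is not intended to reproduce the measured savings.

```python
import math

def simulate_write0(initial_state, i_start, i_crit, tau, t_pulse, dt):
    """Behavioural write-0 model.

    Returns (final_state, current_trace, delivered_charge).
    In the 'AP' state the current grows as exp(+t/tau); in 'P' it decays as exp(-t/tau).
    The cell flips from 'AP' to 'P' as soon as the current reaches i_crit.
    """
    state = initial_state
    i_mtj = i_start
    trace = [i_mtj]
    charge = 0.0
    for _ in range(round(t_pulse / dt)):
        charge += i_mtj * dt                  # accumulate charge drawn in this step
        if state == "AP" and i_mtj >= i_crit:
            state = "P"                       # assumed instantaneous switching
        sign = 1.0 if state == "AP" else -1.0
        i_mtj *= math.exp(sign * dt / tau)
        trace.append(i_mtj)
    return state, trace, charge

# Assumed, illustrative values (not from the paper).
I_START = 10e-6    # amperes, initial driver current
I_CRIT = 100e-6    # amperes, stands in for I_C+
TAU = 1e-9         # seconds
T_PULSE = 10e-9    # seconds
DT = 0.01e-9       # seconds

state_ap, trace_ap, q_ap = simulate_write0("AP", I_START, I_CRIT, TAU, T_PULSE, DT)
state_p, trace_p, q_p = simulate_write0("P", I_START, I_CRIT, TAU, T_PULSE, DT)

# Lower bound on a conventional driver, which holds more than I_CRIT for the whole pulse.
q_conventional_min = I_CRIT * T_PULSE

print("cell initially AP:", state_ap, "charge (C):", q_ap)
print("cell initially P: ", state_p, "charge (C):", q_p)
print("conventional lower bound (C):", q_conventional_min)
```

The comparison against `q_conventional_min` only shows the qualitative effect: the current stays high just long enough to switch the cell, and a cell that already stores a '0' draws very little.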

In Fig. 14.1.5 we present the measured read access time  $(t_{\rm R})$  of the NRRS. The simplified schematic of the measurement setup consists of the storage cell, the NRRS, a static buffer, and a shift-register flip-flop. We define  $t_{\rm R}$  as the time from the onset of column select (CS) to the onset of the shift-register clock (CLK). We varied  $t_{\rm R}$  from 1 to 15ns, and measured the yield of successful reads for all cells in the array. The read-access-time yield plot shown in Fig. 14.1.5 indicates that for  $t_{\rm R} \geq$  8ns, the yield levels off, which confirms that the NRRS has a minimum  $t_{\rm R}$  of 8ns. Read time yield varies below 8ns, as the feedback loop around the reference negative resistance requires a finite nonzero time to stabilize to a final value.

Figure 14.1.6 compares the measured write power consumption and write access time of the NRWS and the conventional scheme. The bar graph compares the power of the NRWS and the conventional write scheme for four write patterns. A repeating-0 pattern results in the highest saving of 58.5%, while a typical data pattern results in a 10.5% saving. We measured the write access time by measuring its yield, similar to the read access time. The measured minimum access time, shown in Fig. 14.1.6, is 9ns for the write-0 case and 10ns for the write-1 case. Our measurements indicate that, compared to the conventional scheme, the NRWS saves write power without compromising the write access time.

We implemented the test-chip, shown in Fig. 14.1.7, in Fujitsu's 1-poly 8-metal 0.13µm CMOS process with MTJ layers present between metals 7 and 8, as illustrated in the TEM images. We implemented the proposed and conventional schemes in the same test-chip in a 16kb array. The table in Fig. 14.1.7 compares our test-chip's performance with the most recent STT-MRAM implementation [1].

## Acknowledgements:

The authors acknowledge the contributions of Oleksiy Tyshchenko for his extensive editing of this manuscript, as well as his invaluable feedback during the design phase. This work was partially funded by the Natural Sciences and Engineering Research Council of Canada.

## References:

[1] T. Kawahara *et al.*, "2Mb Spin-Transfer Torque RAM (SPRAM) with Bit-by-Bit Bidirectional Current Write and Parallelizing-Direction Current Read," *ISSCC*, pp. 480-481, 2007.

[2] M. Hosomi *et al.*, "A Novel Nonvolatile Memory with Spin Torque Transfer Magnetization Switching: Spin-RAM," *IEDM*, pp. 459-462, 2005.

[3] C. Yoshida *et al.*, "A Study of Dielectric Breakdown Mechanism in CoFeB/MgO/CoFeB Magnetic Tunnel Junction," *IRPS*, pp. 139-142, 2009.

[4] J. Nahas *et al.*, "A 180 Kbit Embeddable MRAM Memory Module," *JSSC*, vol. 43, no. 8, pp. 1826-1834, 2008.

[5] R. Nebashi *et al.*, "A 90nm 12ns 32Mb 2T1MTJ MRAM," *ISSCC*, pp. 462-463, 2009.

[6] O. Tyshchenko and A. Sheikholeslami, "Match Sensing Using Match-Line Stability in Content-Addressable Memories (CAM)," *JSSC*, vol. 43, no. 9, pp. 1972-1981, 2008.



## **ISSCC 2010 PAPER CONTINUATIONS**
