## 18.8 A 72Gb/s 2<sup>31</sup>-1 PRBS Generator in SiGe BiCMOS Technology

Timothy Dickson<sup>1</sup>, Ekaterina Laskin<sup>1</sup>, Imran Khalid<sup>2</sup>, Rudy Beerkens<sup>2</sup>, Jingqiong Xie<sup>2</sup>, Boris Karajica<sup>2</sup>, Sorin Voinigescu<sup>1</sup>

<sup>1</sup>University of Toronto, Toronto, Canada <sup>2</sup>STMicroelectronics, Ottawa, Canada

The demand for higher data rates in communication networks raises the need for test equipment operating above 40Gb/s. PRBS generators are commonly used to provide test patterns for high-speed blocks such as MUX or CDR circuits. While generators with pattern lengths of 2<sup>7</sup>-1 have been reported at rates of 100Gb/s [1], a 2<sup>31</sup>-1 pattern is required for testing CDRs to ensure that phase lock is maintained for data streams containing many consecutive ones or zeros. However, the high level of single-chip integration required for such long patterns above 40Gb/s remains an obstacle. Careful attention must be paid to system-level design and layout to ensure synchronous clocking of the 31 flip-flops in the feedback shift register (FSR), even if the FSR is clocked with a half-rate clock or less. In this paper, a 2<sup>31</sup>-1 PRBS generator in a $0.13\mu m$ SiGe BiCMOS technology with 150GHz-$f_T$ HBTs [2] is reported. The circuit is capable of producing output data rates of up to 72Gb/s, with manageable power dissipation, by using a high-speed low-voltage BiCMOS ECL/CML logic family.

The block diagram of the PRBS generator is shown in Fig. 18.8.1. The circuit is implemented using an FSR PRBS architecture together with a 4:1 MUX to increase the output data rate. The 31-flip-flop FSR generates a 2<sup>31</sup>-1 pattern with an adjustable data rate of up to 25Gb/s depending on the input clock frequency. Upon reset, a pulse generator inserts a logic "1" into the 7<sup>th</sup> flip-flop of the shift register, thus avoiding the all-zero state. The outputs of the 28<sup>th</sup> and 31<sup>st</sup> flip-flops are added modulo 2 and the result is fed back into the chain. Rather than employing a separate XOR gate, the XOR function is incorporated into the master latch (Fig. 18.8.2) of the first flip-flop of the FSR to reduce delays in the chain. In order to properly multiplex 4 PRBS streams into one PRBS at 4 times the data rate, the incoming quarter-rate streams must be shifted apart by one quarter of the sequence length. Therefore, before the 4:1 MUX, a phase-shifting logic block is included. This makes use of the cycle-and-add property of pseudo-random sequences to produce the shifted outputs [3]. Mismatch in path lengths in this logic block can lead to the addition of improperly-delayed sequences and produce non-pseudo-random outputs. To avoid this, synchronous modulo-2 addition is performed by using the XOR flip-flop discussed earlier. Dual full- and half-rate PRBS outputs are obtained by tapping from appropriate points in the 4:1 MUX.
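The FSR and multiplexing scheme can be illustrated with a short behavioural sketch in Python. This is not a model of the authors' circuit: the function names `fsr_prbs` and `mux4` are hypothetical, ideal shifted copies of the sequence stand in for the on-chip cycle-and-add phase-shifting logic, and the quarter-length offset is assumed to be (N+1)/4, the integer closest to one quarter of N = 2<sup>n</sup>-1. To keep the check fast, the interleaving is exercised on an assumed 7-stage register (taps 6 and 7, period 127) rather than the 31-stage one.

```python
def fsr_prbs(n_bits, length=31, taps=(28, 31), seed_stage=7):
    """Return n_bits taken from the last stage of a feedback shift register."""
    reg = [0] * length
    reg[seed_stage - 1] = 1          # reset pulse: avoid the all-zero state
    out = []
    for _ in range(n_bits):
        out.append(reg[-1])          # output of the last flip-flop
        fb = 0
        for t in taps:               # modulo-2 sum of the tapped stages
            fb ^= reg[t - 1]
        reg = [fb] + reg[:-1]        # shift every bit by one stage
    return out


def mux4(period_bits):
    """Interleave four copies of one PRBS period, offset by (N + 1) / 4."""
    n = len(period_bits)
    shift = (n + 1) // 4             # assumed quarter-length offset
    out = []
    for k in range(n):
        for j in range(4):
            out.append(period_bits[(k + j * shift) % n])
    return out


# Illustrative short register (assumption): 7 stages, taps 6 and 7.
LENGTH, TAPS = 7, (6, 7)
N = 2 ** LENGTH - 1
period = fsr_prbs(N, LENGTH, TAPS, seed_stage=7)
fast = mux4(period)

# Check that the 4x-rate stream obeys a[m] = a[m-6] XOR a[m-7].
same_recurrence = all(
    fast[m] == fast[m - TAPS[0]] ^ fast[m - TAPS[1]]
    for m in range(LENGTH, len(fast))
)
print(same_recurrence)
```

With these offsets the interleaved stream satisfies the same feedback recurrence as the quarter-rate sequence, which is the condition the phase-shifting block is there to guarantee. For the 31-stage register the corresponding offset would be (2<sup>31</sup>-1+1)/4 = 2<sup>29</sup> bits.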

The generator makes extensive use of the BiCMOS logic topology illustrated in the 80Gb/s selector cell of Fig. 18.8.3, used in the final stage of multiplexing. This topology takes full advantage of the best features of both n-MOSFETs and SiGe HBTs to minimize building-block delays while allowing for a reduction in supply voltage over pure HBT ECL [4]. The gate resistance of an n-MOSFET can be made negligible through layout techniques, yielding a much lower input-time-constant than can be achieved with SiGe HBTs. This, coupled with double emitter-follower inputs, allows for fast switching of the clocking transistors. Furthermore, unlike bipolar E<sup>2</sup>CL which requires 5V supplies or higher, the use of the BiCMOS topology allows for reliable operation from a 3.3V supply without compromising speed. The corresponding power saving is critical for achieving the high level of integration required in a 2<sup>31</sup>-1 PRBS. Upper-level differential pairs used to switch data inputs are implemented with SiGe HBTs to capitalize on their high intrinsic slew rate and the low input-voltage swing required for complete switching. As the 80Gb/s selector directly drives an external load, resistors $R_{\rm L1}$ and $R_{\rm L2}$ are set to 50$\Omega$ for matching purposes. Shunt-peaking inductors $L_{\rm P1}$ and $L_{\rm P2}$ extend the bandwidth and are implemented as 3-dimensional inductors to minimize area while improving the quality factor and self-resonant frequency [5]. Microstrip inductors $L_{\rm P3}$ and $L_{\rm P4}$ resonate with the bond-pad capacitance $C_{\rm PAD}$ to further broaden the frequency response.
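The bandwidth extension from shunt peaking can be estimated with the textbook single-pole load model: a resistor R in series with an inductor L, shunted by a capacitance C. The sketch below is normalized, because the paper gives no inductance or load-capacitance values. The ratio k = L/(R²C), the helper names, and the value k ≈ 0.414 (the classic maximally-flat choice) are assumptions for illustration, not design values from the chip.

```python
def gain_sq(x, k):
    """|Z(jw)/R|^2 of a shunt-peaked load; x = w*R*C, k = L/(R^2*C)."""
    return (1 + (k * x) ** 2) / ((1 - k * x * x) ** 2 + x * x)


def bandwidth_ratio(k, step=1e-3):
    """-3dB bandwidth relative to the unpeaked RC load (k = 0)."""
    x = 0.0
    # Coarse scan upward until the response first drops below half power.
    while gain_sq(x + step, k) >= 0.5:
        x += step
    lo, hi = x, x + step
    # Bisection refines the crossing inside the bracketing interval.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gain_sq(mid, k) >= 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for k in (0.0, 0.414):   # hypothetical peaking ratios
    print(f"k = {k:.3f}: bandwidth x{bandwidth_ratio(k):.2f}")
```

In this idealized model the unpeaked load gives a ratio of 1 and the maximally-flat choice extends the bandwidth by roughly 1.7 times. The actual improvement on chip also depends on inductor parasitics and the pad network, which are not modelled here.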

At mm-wave clock frequencies, clock and data alignment becomes a significant aspect of system-level design. To improve timing margins in critical clock paths, a variable-delay cell with 50GHz bandwidth is employed. As seen in the schematic of Fig. 18.8.4, the clock signal and a delayed version of the clock are fed into HBT differential pairs Q7-Q8 and Q9-Q10. The transconductance of each differential pair is controlled by steering the total bias current through MOS differential pair M3-M4 and mirroring to current sources M7-M10. Summation of currents at the collectors of the HBT differential pairs results in the CML weighted addition of the fast and slow clock paths. To avoid output-amplitude distortion over the tuning range of the variable-delay cell, two design precautions are exercised. First, differential pairs Q7-Q8 and Q9-Q10 are degenerated to desensitize both their input capacitance and transconductance to variations in bias current. Additionally, the gain of the delay element  $\tau_{\text{DELAY}}$ , implemented by cascading two BiCMOS cascode inverters, is set to unity to equalize the clock amplitudes at the input of the weighted adder. The excellent frequency response of the BiCMOS cascode inverter [4], coupled with its inherent low gain, makes this topology well-suited for use as the delay element.
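The weighted addition of fast and slow clock paths can be sketched with an idealized phasor model. It assumes sinusoidal clocks of equal amplitude and a perfectly linear adder. The function name `interpolated_clock`, the weighting factor `alpha` (the fraction of bias current steered to the slow path), the 36GHz clock frequency, and the 5ps delay-element value are all hypothetical and chosen only for illustration.

```python
import cmath
import math


def interpolated_clock(alpha, f_clk, tau):
    """Delay and relative amplitude of (1-alpha)*clk(t) + alpha*clk(t-tau)."""
    w = 2 * math.pi * f_clk
    # Phasor sum of the direct path and the path delayed by tau.
    phasor = (1 - alpha) + alpha * cmath.exp(-1j * w * tau)
    delay = -cmath.phase(phasor) / w
    return delay, abs(phasor)


F_CLK = 36e9   # hypothetical clock frequency, Hz
TAU = 5e-12    # hypothetical delay-element value, s

for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    delay, amp = interpolated_clock(alpha, F_CLK, TAU)
    print(f"alpha = {alpha:.2f}: delay = {delay * 1e12:.2f} ps, "
          f"amplitude = {amp:.3f}")
```

In this model the delay sweeps from 0 to `TAU` as the current is steered from one differential pair to the other, and the two end points have equal amplitude only when both paths deliver the same clock swing, which is consistent with setting the gain of the delay element to unity.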

Fig. 18.8.5 demonstrates operation of the half-rate PRBS output up to 45Gb/s, as seen from a measured eye diagram and its corresponding power spectrum. While in theory the harmonic tones of the 45Gb/s PRBS occur every 45GHz/(2<sup>31</sup>-1)=20.95Hz, the resolution bandwidth of the spectrum analyzer limits observation of these tones. The measured eye diagram and power spectrum of the full-rate output at 72Gb/s are shown in Fig. 18.8.6. A single-ended swing of 300mV is obtained with an output rms jitter of 650fs. A separate test structure for the 40GHz clock tree and the 2:1 MUX is tested up to 80Gb/s; however, the output data rate of the PRBS generator is limited to 72Gb/s by the tuning range of the variable-delay cells. As seen in the die micrograph of Fig. 18.8.7, a significant portion of the layout is devoted to clock distribution. This ensures clock synchronicity in the FSR and improves timing margins while reducing duty-cycle distortion in the 4:1 MUX. The circuit occupies an area of 3.5×3.0mm<sup>2</sup> and consumes 9.28W from a 3.3V supply. A recently-fabricated revision of the PRBS includes an output buffer at 72Gb/s and retiming on the half-rate output.
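The tone-spacing figure quoted above follows directly from the bit rate and the pattern length, and the same arithmetic can be applied to the full-rate output. The short sketch below reproduces it; the helper name `tone_spacing_hz` is hypothetical, and the pattern-repetition time and supply current are derived quantities, not measurements from the paper.

```python
def tone_spacing_hz(bit_rate, n=31):
    """Spacing of PRBS spectral tones: bit rate / (2^n - 1)."""
    return bit_rate / (2 ** n - 1)


for rate in (45e9, 72e9):
    spacing = tone_spacing_hz(rate)
    repeat_ms = 1e3 / spacing        # time for one full pattern, in ms
    print(f"{rate / 1e9:.0f} Gb/s: tones every {spacing:.2f} Hz, "
          f"pattern repeats every {repeat_ms:.1f} ms")

# Supply current implied by the reported 9.28W at 3.3V.
supply_current_a = 9.28 / 3.3
print(f"supply current ~ {supply_current_a:.2f} A")
```

A tone spacing of a few tens of hertz is well below the resolution bandwidth typically used for spans of tens of gigahertz, which is why the individual tones are not resolved in the measured spectra.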

## References:

[1] H. Knapp et al, "100Gb/s 2<sup>7</sup>-1 and 54Gb/s 2<sup>11</sup>-1 PRBS generators in SiGe bipolar technology," *IEEE Comp. Semiconductor IC Symp.*, pp. 219-222, Oct. 2004.

[2] M. Laurens et al, "A 150GHz f<sub>T</sub>/f<sub>MAX</sub> 0.13µm SiGe:C BiCMOS technology," *Proc. IEEE BCTM*, pp. 199-202, Sept. 2003.

[3] A. Davies, "Delayed versions of maximal-length linear binary sequences," *Elec. Letters*, pp. 61-62, May 1965.

[4] T. Dickson et al, "A 2.5V, 40Gb/s decision circuit using SiGe BiCMOS logic," *Dig. Symp. VLSI Circuits*, pp. 206-209, June 2004.

[5] T. Dickson et al, "Si-based inductors and transformers for 30-100GHz applications," *IEEE MTT-S Dig.*, pp. 205-208, June 2004.




