A Novel 1-Gbps Clock and Data Recovery Architecture ... - IEEE Xplore

4 downloads 8740 Views 381KB Size Report
A Novel 1-Gbps Clock and Data Recovery Architecture using Synchronous. Oscillator in CMOS VLSI technology. C. Scarabello1, 2, J-B. Bégueret1, Y. Deval1 ...
ESSCIRC 2002

A Novel 1-Gbps Clock and Data Recovery Architecture using Synchronous Oscillator in CMOS VLSI technology C. Scarabello1, 2, J-B. Bégueret1, Y. Deval1, D. Deschans1, P. Fouillat1, M. Pignol2, J-Y. Le Gall3 1 Laboratoire IXL – University of Bordeaux - France 2 CNES – DTS/AE/SEA/IL - Toulouse - France 3 Alcatel Space Industries – DOB/TTN - Toulouse – France Email : [email protected] Abstract A new fully integrated clock and data recovery (CDR) topology based on a synchronous oscillator (SO) is presented in this paper. Implemented in CMOS VLSI 0.25 µm technology, the circuit is dedicated to 1 Gbps point-to-point networks. It presents very low jitter measurements : 9.8 and 11 ps rms for respectively PRBS7 and PRBS31 patterns. The circuit exhibits some major advantages versus classical CDR based on PLL and DLL : no external component needed, low consumption and low locking time.

1. Introduction The new generation of radar and optical satellite is data rate more and more increasing. For example in optical satellite, the digital data communications use 50 Mbps for SPOT4, 128 Mbps for SPOT5 and 400 Mbps for PLEIADES. The latter requires a very high resolution (1 meter), and therefore needs an inside pointto-point network greater than Gbps. These networks are used for interfacing sensors (camera, …) to mass memories or processing elements. In high-speed serial link, data are transmitted without the bit clock. It leads to decrease the number of wire and it prevents the stringent consideration related to the phase between clock and data signals. Then, the clock must be extracted from the input data rate. Usually, phase-locked loops (PLL) and delay-locked loop (DLL) are brought into play for clock extraction [1, 2], but these techniques are challenging at this high rate, especially from the phase detector point of view. Furthermore, so high a frequency of interest will doubtless induce large power consumption within the PLL. This is mostly due to the frequency dividers. In addition, for the dividers to be able to handle gigahertz signals, advanced technologies are mandated. Thus, making both low-cost and low-power systems will be difficult to achieve, if not impossible. Moreover, in order to achieve large time constants required to maintain the clock synchronized on the bit rate, even if long running length at the same value occurs, large capacitors are mandated. These passive

components cannot be integrated without a huge increase of silicon area, and subsequently the classical CDR imposes numerous off-chip components. The proposed solution in this paper is to use a Synchronous Oscillators (SO) which has been previously published as an alternative to Phase Locked Loop (PLL) [3, 4]. Indeed, the injection-locked phenomena SO takes advantage of leads to very compact PLL-like circuits. As no further frequency divider is needed in a SO, power consumption is dramatically reduced with regards to its PLL counterparts, and classical technologies such as VLSI CMOS can still be used. This paper proposes a new methodology based on a Synchronous Oscillator for clock extraction [5]. It is well known from the literature [6, 7] that an oscillator could be locked on an external signal which frequency is closed to its free running frequency. We have adapted this methodology to generate a clock recovery system. This paper describes this new integrated architecture of clock and data recovery based on a Synchronous Oscillator. The architecture is detailed in section 2 and the circuit design in section 3. Implemented in CMOS VLSI compatible 0.25µm technology, we present the results in section 4.

2. CDR Architecture Fig.1 depicts the block diagram of the CDR circuit. As conventional ones, two building blocks are implemented : one for the clock recovery and the second for the data recovery and synchronization versus the clock. The clock recovery is based on the use of a synchronous oscillator. The latter is built with a classical oscillator associated with a pulse generator. The principle of operation [4] is to stimulate an oscillator by an input signal where the input frequency is closed to the free running one. Then the oscillator will lock on this input reference. The pulse generator block is implemented in order to create sharp edges at the input of the oscillator to smooth the progress of synchronization.

779

Synchronous Oscillator

Pulse Oscillator Generator Clock buffered

Decision circuit Data in Data in

Clock

Delay Line

D-FFs

inductance parameters, power consumption and selfresonance effects leads to the component values shown in Fig. 2. Its yields a roughly 1.1 GHz free running frequency. M101

Clock Data out

Logic

Data out

L 5.6 nH

L 5.6 nH

C 4.2 pF

out1

Figure 1. Block diagram of the CDR circuit When no data is delivered to the input data lines, the output clock is given by the free running frequency of the oscillator. When the bits flow is set, and if the input data rate is closed to the free running frequency of the oscillator, the SO is synchronized on the input data rate. The bit clock is then recovered. Unlike RF applications where the carrier is a continuous waveform, the input reference is a pseudorandom signal. Indeed, the pulse generator does not deliver a periodic pulse but also a pseudo-random one. The oscillator is then synchronized by multiple harmonics of the input signal. All these transitions resynchronize the oscillator, so the average frequency is the same than the data rate. In order to have a good synchronization –i.e. to avoid the SO to slowly shift towards its natural oscillation frequency when a long running length at the same value occurs– the transition bit rate is an important parameter.

3. Design strategy 3.1. Synchronous Oscillator

(

Vpol

M2

M1

Figure 2. Negative resistance oscillator The output of the pulse generator is directly connected to the drain of the MOS transistors. In order to preserve sufficient voltage headroom to appropriately polarize the current sources of the pulse generator, PMOS transistor M101 has been implemented.

3.2. Pulse Generator The schematic of the pulse generator, based on a Delay Oriented Design (DOD) approach, is depicted in Fig.3. To synchronize the oscillator on the input data rate, this block has to generate a current pulse, which duration equal to half the period of the wanted frequency [4].

)

1 + 2 ⋅ gl − gm ⋅ r ' '' L ⋅  C + C   

M102

Synchronisation input

I1(t) I2(t) M5

(1)

C ' = C + 2 ⋅ C  gd with  ' ' where C C C = + +C +C +C  gs gb db ox int  Cgd, Cgs, Cgb, Cdb are respectively the gate-drain, gatesource, gate-bulk, and drain-bulk capacitors, gm is the transistor transconductance, gl-1 is the capacitor load and r, Cox, and Cint come from the integrated self-inductor model. These inductors are based on square hollow spiral structure on top metal layer with minimum space between two metal lumps. In order to perform differential operation, the capacitor C is split into two identical metal-metal lateral fringe ones, chosen to optimize the locking range. A trade-off between

780

M3

2,5 V

The synchronous oscillator, as shown in Fig.2, uses a 'negative resistance' configuration. The free running frequency is given by:

ω0 =

out2

Figure 3. Pulse generator schematic The synchronization range is done by equation 2: 1 ∆f = ⋅ 2 ⋅π

I sync V0



1 C

(2)

where V0 is the output voltage, Isync the synchronization current and C the overall tank capacitor, including parasitic. The pulse current has been evaluated for an optimized locking range. It delivers roughly 9 mA and 680 ps pulse width, which leads to about 200 MHz of synchronization range. This is enough to cover some process variations.

3.3. Decision Circuit

4. Experimental results

The decision circuit objective is to synchronize the data frame with the internal recovered clock. The classical architectures using D Flip-Flop [8] are not well suited for our system, so we have developed a more robust design based on a voting sequence. Fig.4 shows a circuit diagram of the decision circuit. It consists of a comparison between few D Flip-Flops.

Fig.5 depicts the chip microphotograph of the clock recovery. This chip has been manufactured in a 5-metals, 0.25 µm, low cost digital CMOS technology. The area is roughly 2.5 mm². Due to the parasitic components yielded by the package, a great number of power bondings were used in addition with on-chip decoupling capacitors. This circuit will demonstrate than the synchronous oscillator can be used as clock recovery without any external component.

D Flip-Flop

Delay Line Data in

D1

D2

Logic

D3 D-FF1 B

Clock

3

Data out

D-FF2 B

2

Clock Data in

D-FF3 B

Data out

1

D1

D2

D3

Figure 4. Decision circuit diagram The data frame goes through two delay lines. Each line is built with three delay elements which generate three identical delays D1, D2 and D3. The delay is chosen so that D2 frame is synchronized with the falling edge of the clock. In this application, the frequency is 1 GHz, so each delay element is optimized at 250 ps. D3 is then 750 ps behind the rising edge of the clock. Therefore, two of three delayed lines will always be in a bit period. The six lines are synchronized with the clock into three D FlipFlops, D-FF1, D-FF2, and D-FF3. We can observe four different events, depending on the synchronization process : - B1 swings earlier than B2 and B3. In fact, B1 arises during the previous bit, i.e. between 250 and 500 ps earlier. That represents an excessive amount of jitter on the data and/or the clock, or a clock frequency higher than the frame frequency (corresponding to the beginning of the synchronization process). - When B1, B2 and B3 are in phase, the three lines are phasing. Clock and data are synchronized and the jitter does not exceed one third of 1 GHz. - B3 swings earlier than B1 and B2. In fact, B3 arises on the following bit, i.e. between 250 and 500 ps later. Origins are the same as the first occurrence, except that in this case the clock frequency could be lower than the frame frequency. - If the swing is up than 500 ps, the comparison is not made on the appropriate bit. The data frame is synchronized on the falling edge of the clock. For this reason, another D Flip-Flop, added at the output of the decision circuit, will synchronize the data on the rising edge of the recovered clock.

Figure 5. Chip microphotograph A second chip, including the overall CDR system, has been fully implemented in a 6-metals, 0.18 µm, low cost CMOS technology. The clock recovery is the same architecture than previous one. The decision circuit includes 50 Ω output buffers in order to drive the classical instrumentation. The total current consumption is roughly 40 mA under 1.8 V supply voltage. The experimental results presented in this paper concern the CMOS clock recovery system. The free running frequency of the clock recovery is 1.15 GHz and the synchronization range is around 200 MHz. In Fig. 6, the extracted clock is shown. The current consumption is 47 mA under 2.5 V supply voltage. When a recurring bit pattern is applied at 1 Gbps, the rms measured jitter is 1.7 ps. In fact, this jitter is only representative of Anritsu 1632A generator random noise, which is 1.3 ps rms. It confirms that the synchronization process leads to minimum generated jitter.

781

5. Conclusion

Figure 6. Clock recovery signal When a PBRS7 pattern is used, 60 ps peak-to-peak and 9.8 ps rms jitter is measured. Using a PRBS31 pattern, the synchronization is still achieved with 94 pp p-p and 11 ps rms.

Figure 7. Jitter histogram Fig. 7 depicts this 74 ps data dependant jitter for a 12 bits pattern. All traces represent the data random pulse. The system is synchronized in less than one pattern, which is an inherent property of the SO, due to its large synchronization bandwidth. This is a major improvement versus classical CDR using DLL or PLL. The main features of this chip are summarized in Table 1. Table 1. Measurement results Supply voltage 2.5 V 100 mW Power dissipation (including 50 Ω buffers) Free running frequency 1.15 GHz Synchronization range 200 MHz PRBS7 rms jitter 9.8 ps PRBS31 rms jitter 11 ps Technology CMOS 5M1P 0.25 µm

782

In this paper, a fully integrated clock and data recovery has been presented. We have demonstrated that a synchronous oscillator is able to realize a clock recovery. This structure needs no external components, and the silicon area used is much lower than systems using PLL or DLL : only a classical oscillator associated with a simple logic is mandated. It leads to low cost applications. Among others major advantages versus classical CRD systems are a lower power consumption and a lower locking time.

6. References [1] H. Djahanshahi, C.A.T. Salama, “Differential CMOS Circuits for 622-MHz/933-MHz Clock and Data Recovery Application”, IEEE J. Solid-State Circuits, vol. 35, pp. 847-855, June 2000. [2] T. Lee, J.F. Bulzacchelli, “A 155 MHz clock recovery Delay Loop and Phase Locked Loop (D/PLL)”, IEEE J. Solid-State Circuits, vol. 35, pp. 948-955, June 1992. [3] F. Badets, Y. Deval, J-B Begueret, A. Spataro and P. Fouillat, “A Fully Integrated 3V 2.3GHz Synchronous Oscillator For WLAN Applications”, in Proc. Bipolar/BiCMOS Circuits and Technology Meeting, Minneapolis, US, Sept. 1999, pp. 145-149. [4] Y. Deval, J-B Bégueret, P. Fouillat and D. Belot, "Low-Power Synchronous Oscillator in SiGe Technology for UMTS 2000 Applications", in Proc. IEEE Asia-Pacific Microwave Conference, Sydney, Australia, Dec. 2000, pp.1580-1583. [5] Patent pending, J.B. Begueret, Y. Deval, P. Fouillat, J.Y. Le Gall, M. Pignol, C. Scarabello, “Clock and Data Recovery using the Synchronous Oscillator Concept”, April 2002. [6] R. Adler, “A Study of Locking Phenomena in Oscillators”, in Proc. I.R.E. and Waves and Electrons, vol. 34, pp. 351-357, June 1946. [7] V. Uzunoglu, M.H. White, “The Synchronous Oscillator : A synchronization and Tracking Network”, IEEE J. Solid-State Circuits, vol. 20, pp. 1214-1226, Dec. 1985. [8] M. Shikata, K. Tanaka, H.T. Yamada, H.I. Fujishiro, S. Nishi, C. Yamagishi and M. Akiyama, “A 20-Gb/s Flip-Flop Circuit Using Direct-Coupled FET Logic”, IEEE J. Solid-State Circuits, vol. 38, pp. 10461051, Oct. 1993.