
This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-*

A 200Mb/s ~ 3.2Gb/s Referenceless Clock and Data Recovery Circuit With Bidirectional Frequency Detector

Nguyen Huu Tho, Kyung-Sub Son, and Jin-Ku Kang a)

Dept. of Electronics Engineering, Inha University, 100 Inha-ro, Nam-gu, Incheon 402-751, Republic of Korea

a) [email protected]

Abstract: This paper presents a 200-Mb/s to 3.2-Gb/s half-rate referenceless clock and data recovery (CDR) circuit in a 180 nm CMOS process. A bidirectional frequency detector (FD) is proposed to eliminate harmonic locking and reduce the frequency acquisition time. A frequency band selector for the wide-range voltage-controlled oscillator (VCO) is also presented to select the appropriate frequency band of the VCO. Simulation shows that the CDR achieves 11-ps peak-to-peak jitter at 3 Gb/s and a frequency acquisition time of 11.8 µs.

Keywords: clock and data recovery, bidirectional frequency detector, referenceless, wide-band VCO

Classification: Integrated circuits

References

©IEICE 2017 DOI: 10.1587/elex.14.20161279 Received January 4, 2017 Accepted January 20, 2017 Publicized April 6, 2017

[1] B. Stilling: “Bit rate and protocol independent clock and data recovery,” Electron. Lett. (2000) 824 (DOI: 10.1049/el:20000603).

[2] R.J. Yang, S.P. Chen, and S.I. Liu: “A 3.125 Gbps clock and data recovery circuit for 10-Gbase-LX4 Ethernet,” IEEE J. Solid-State Circuits 39 (2004) 1356 (DOI: 10.1109/JSSC.2004.831809).

[3] R.J. Yang, et al.: “A 155.52 Mbps-3.125 Gbps Continuous-Rate Clock-and-Data-Recovery Circuit,” IEEE J. Solid-State Circuits 41 (2006) 1380 (DOI: 10.1109/JSSC.2006.874328).

[4] D. Dalton, K. Chai, E. Evans, et al.: “A 12.5-Mb/s to 2.7-Gb/s Continuous-Rate CDR with Automatic Frequency Acquisition and Data-Rate Readback,” IEEE J. Solid-State Circuits 40 (2005) 2713 (DOI: 10.1109/JSSC.2005.856577).

[5] M.S. Hwang, et al.: “A 180-Mb/s to 3.2-Gb/s, Continuous-Rate, Fast-Locking CDR Without Using External Reference Clock,” IEEE ASSCC (2007) 144 (DOI: 10.1109/ASSCC.2007.4425751).

[6] S.-K. Lee, et al.: “A 650Mb/s-to-8Gb/s referenceless CDR circuit with automatic acquisition of data rate,” IEEE ISSCC Dig. Tech. Papers (2009) 184 (DOI: 10.1109/ISSCC.2009.4977369).

[7] N. Kocaman, et al.: “An 8.5-11.5-Gbps SONET Transceiver With Referenceless Frequency Acquisition,” IEEE J. Solid-State Circuits 48 (2013) 1875 (DOI: 10.1109/JSSC.2013.2259033).

[8] C.L. Hsieh and S.I. Liu: “A 1–16-Gb/s Wide-Range Clock/Data Recovery Circuit With a Bidirectional Frequency Detector,” IEEE TCS-II 58 (2011) 487 (DOI: 10.1109/TCSII.2011.2158719).

[9] R. Inti, et al.: “A 0.5-to-2.5 Gb/s Reference-Less Half-Rate Digital CDR With Unlimited Frequency Acquisition Range and Improved Input Duty-Cycle Error Tolerance,” IEEE J. Solid-State Circuits 46 (2011) 3150 (DOI: 10.1109/JSSC.2011.2168872).

[10] G. Shu, et al.: “A 4-to-10.5Gb/s Continuous-Rate Digital Clock and Data Recovery with Automatic Frequency Acquisition,” IEEE J. Solid-State Circuits 51 (2016) 428 (DOI: 10.1109/JSSC.2015.2497963).

[11] B. Razavi: Design of Integrated Circuits for Optical Communication Systems (McGraw-Hill, New York, 2003) 310-313.

1 Introduction

Clock and data recovery (CDR) circuits are used extensively in high-speed interface systems to extract the data and clock signals from the received signal. Phase-locked-loop (PLL)-based CDRs are widely employed in monolithic implementations. Because the frequency acquisition range of a PLL is narrow, most CDR implementations need an additional frequency detector (FD). Based on the frequency acquisition method, two types of CDR are described in the literature: reference-based and referenceless. The first type uses an external reference clock for frequency acquisition. This method is simple, but it increases the design cost. Furthermore, the usable data rate is limited to one or a few discrete values, so it is not suitable for applications that require a wide range of data rates. The second type extracts the clock directly from the input data stream without an external reference clock, so it can be used in many different applications.

The most significant challenge of a referenceless CDR is harmonic locking. To solve this problem, several wide-range unilateral FDs have been presented in the literature. A unilateral FD always starts the frequency acquisition process from the minimum frequency [4, 6, 10] or the maximum frequency [5] of the voltage-controlled oscillator (VCO), which increases the frequency acquisition time. Although quadricorrelator frequency detectors (QFDs) can track frequency bidirectionally, the QFD has a limited locking range of ±25% in [1, 7] and ±13% in [2]. To overcome this limitation, bidirectional FDs are presented in [3, 8, 9]. Reference [3] assumes that the run length of the input data is fixed and, because it has no lock detector, relies on interaction between the phase detector (PD) and the FD. In [8], an additional quadrature divider is required for frequency decrement acquisition, which complicates the FD design. In [9], the accuracy of the FD depends strongly on the input transition density.

This paper proposes a simple bidirectional FD for wide-band referenceless CDR circuits. The proposed FD is free from harmonic locking and does not depend on the run length or transition density of the input data.


2 Circuit Description

The block diagram of the proposed half-rate CDR circuit is shown in Fig. 1. It consists of the proposed bidirectional FD, a frequency lock detector (FLD), a half-rate bang-bang phase detector [11], a two-band VCO, a frequency band selector (FBS), and two charge pump (CP) circuits. When the signal EN is activated, switch S1 turns on and S2 turns off. Based on the bit rate of the input data, the frequency band selector selects the appropriate band of the two-band VCO by updating the control bit D0. Then the proposed FD tracks the frequency error between the input data and the output clocks, CKI and CKQ. Once the frequency error is small enough, the frequency lock detector asserts the LOCK signal to turn off S1 and turn on S2. After that, the PD takes over the acquisition process.

Fig. 1. Block diagram of the proposed wide-band referenceless CDR (labeled blocks: half-rate PD, CP1, CP2, bidirectional FD, frequency lock detector, frequency band selector, VCO, switches S1 and S2, components Rp, C1 and Cp; signals: Din, UPPh, DNPh, UPFr, DNFr, Vc, EN, Lock, D0, CKI and CKQ)
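The hand-over between the loops described above can be sketched as a small state machine. This is only an illustrative behavioral model, not the authors' circuit: the phase names, the `next_phase` and `switches` helpers, the idle state while EN is low, and the assumption that S1 connects the FD path and S2 the PD path are all hypothetical.

```python
from enum import Enum, auto


class Phase(Enum):
    IDLE = auto()          # EN low (assumed idle state, not described in the text)
    BAND_SELECT = auto()   # frequency band selector updates D0
    FREQ_ACQ = auto()      # bidirectional FD drives the loop
    PHASE_TRACK = auto()   # PD has taken over after LOCK


def next_phase(phase, en, band_selected, lock):
    """Advance the hypothetical sequencer by one step."""
    if not en:
        return Phase.IDLE
    if phase is Phase.IDLE:
        return Phase.BAND_SELECT
    if phase is Phase.BAND_SELECT:
        return Phase.FREQ_ACQ if band_selected else Phase.BAND_SELECT
    if phase is Phase.FREQ_ACQ:
        return Phase.PHASE_TRACK if lock else Phase.FREQ_ACQ
    return Phase.PHASE_TRACK


def switches(phase):
    """Return (S1, S2): S1 on during frequency acquisition, S2 on after LOCK."""
    s1 = phase in (Phase.BAND_SELECT, Phase.FREQ_ACQ)
    s2 = phase is Phase.PHASE_TRACK
    return s1, s2


if __name__ == "__main__":
    phase = Phase.IDLE
    for en, band_selected, lock in [(True, False, False),
                                    (True, True, False),
                                    (True, True, True)]:
        phase = next_phase(phase, en, band_selected, lock)
        print(phase.name, switches(phase))
```

The model keeps the phase-tracking state once LOCK has been asserted; how the real circuit reacts to a later loss of lock is not covered by the text and is left out here.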

2.1 Bidirectional frequency detector

Fig. 2 shows the proposed bidirectional FD. It includes two unilateral FDs, a D-type flip-flop (D-FF) and a multiplexer.

Fig. 2. Block diagram of the proposed bidirectional FD (labeled blocks: FD_DATA SLOWER, FD_DATA FASTER, D-FF and multiplexer; signals: Din, CKI, CKQ, UP, DN, SW, EN)

The unilateral FD that detects whether the data rate is faster than the clock is shown in Fig. 3(a), and its timing diagram is shown in Fig. 3(b). To reduce the frequency acquisition time, instead of counting the rising edges of the input data during one clock period of CKI/CKQ [4], we propose to count the number of consecutive transition edges of the data within half a clock period. Moreover, the FD in [8] detects only the single data pattern "101" within half a clock period and uses only a single clock phase. Our proposed bidirectional FD detects both patterns, "101" and "010", and uses two clock phases, CKI and CKQ. Therefore, the proposed FD decreases the frequency acquisition time and also eliminates the delay block required in [8]. Whenever a pulse of the clock CKI/CKQ encloses a single-bit pulse of the data Din, the UP signal is generated.

Fig. 3. (a) Block diagram of the frequency increment acquisition FD (D-FF3 to D-FF6; signals: Din, CKI, CKQ, Q2, Q3, UP1, UP2, UP); (b) timing diagram of the frequency increment acquisition FD in Fig. 3(a)

The proposed unilateral FD that detects whether the data rate is slower than the clock is shown in Fig. 4(a), and its timing diagram is shown in Fig. 4(b). The frequency decrement acquisition circuit (FDAC) is implemented with only two D-FFs, which is simpler than the circuits in [3] and [8]. The FDAC in [8] uses a quadrature divider to generate the signal DN with a fixed pulse width. In our work, the proposed FDAC counts the number of consecutive transition edges of the clock within the data sequences "010" and "011". Consequently, the proposed FDAC generates the signal DN with a pulse width proportional to the frequency error, so the quadrature divider can be removed. In addition, the proposed FDAC works well for any pseudorandom bit sequence (PRBS). Whenever the data Din encloses a pulse of the clock CKI/CKQ, the signal DN1 is generated.

Fig. 4. (a) Block diagram of the frequency decrement acquisition FD (D-FF1 and D-FF2; signals: Din, CKI, CKQ, Q1, DN1, DN2); (b) timing diagram of the frequency decrement acquisition FD in Fig. 4(a)
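The two enclosure conditions described above (a clock pulse enclosing a data pulse for UP, a data run enclosing a clock pulse for DN) can be illustrated with a purely behavioral sketch on edge timestamps. It is not a gate-level model of the D-FF circuits in Fig. 3 and Fig. 4: the function names are hypothetical, only one clock phase is checked, data polarity is ignored, and times are in arbitrary units (read them as ns).

```python
def up_detected(data_edges, clk_rises, t_ck):
    """True if some clock-high window [r, r + t_ck/2) contains two or more
    data transitions, i.e. the clock pulse encloses a whole data pulse."""
    half = t_ck / 2
    return any(
        sum(1 for e in data_edges if r <= e < r + half) >= 2
        for r in clk_rises
    )


def dn_detected(data_edges, clk_rises, t_ck):
    """True if some clock-high pulse lies strictly between two adjacent
    data transitions, i.e. the data run encloses a whole clock pulse."""
    half = t_ck / 2
    runs = list(zip(data_edges, data_edges[1:]))
    return any(a < r and r + half < b for a, b in runs for r in clk_rises)


def clock_rises(t_ck, n=8):
    """Rising-edge times of an ideal clock with period t_ck starting at 0."""
    return [k * t_ck for k in range(n)]


if __name__ == "__main__":
    # Alternating data with a transition every 1 time unit (bit time = 1).
    data_edges = [0.5 + k for k in range(8)]
    for name, t_ck in [("slow clock", 4.0), ("fast clock", 0.8), ("half-rate", 2.0)]:
        rises = clock_rises(t_ck)
        print(name, up_detected(data_edges, rises, t_ck),
              dn_detected(data_edges, rises, t_ck))
```

With this alternating pattern, the slow clock triggers only the UP condition, the fast clock only the DN condition, and the half-rate clock (period of two bit times) triggers neither.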

Although the proposed FDAC generates the DN signal to discharge the loop filter, a false operation is possible. When the data rate is faster than the clock frequency, we expect the FD to produce only the UP signal at its output. However, because of long runs in the data, the FD produces both UP and DN, which leads to an FD error (Fig. 5). In other words, when the data rate is faster than the clock frequency, the FD sometimes generates UP and sometimes generates DN. To solve this problem, we added a D-FF and a multiplexer to the FD (Fig. 2). The signal SW at the output of the D-FF controls the DN output of the frequency decrement acquisition circuit. Initially, the external pulse EN is held low to reset SW. When the clock is faster than the data, DN goes high as long as SW stays low and discharges the loop filter to decrease the clock frequency. When the data rate is faster than the clock, UP goes high. SW is then activated to disable DN and stop the decrease of the clock frequency. After that, the frequency acquisition process is completed by the frequency increment acquisition FD, as shown in Fig. 6(b). The proposed FD uses two phases of the clock, CKI and CKQ.

Fig. 5. Error due to a long run length of the input data (signals: CKI, Din, UP1, DN2)

Fig. 6. Timing diagram of the proposed bidirectional FD (signals: EN, SW, VC): (a) data rate is faster than clock rate, (b) data rate is slower than clock rate
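The SW gating described above can be sketched as a tiny behavioral model. It assumes one reading of Fig. 2 that the text does not spell out: the D-FF latches a logic high on the rising edge of UP, is reset while EN is low, and its output SW selects between the raw DN signal and ground in the multiplexer. The class and method names are hypothetical.

```python
class BidirectionalFDGate:
    """Behavioral sketch of the D-FF plus multiplexer that gates DN."""

    def __init__(self):
        self.sw = False          # D-FF output SW
        self._prev_up = False    # previous UP sample, for edge detection

    def step(self, en, up, dn_raw):
        """Process one sample and return the (UP, DN) pair sent to the charge pump."""
        if not en:
            self.sw = False                  # EN low resets SW
        elif up and not self._prev_up:
            self.sw = True                   # rising edge of UP sets SW
        self._prev_up = up
        dn = dn_raw and not self.sw          # multiplexer: DN passes only while SW is low
        return up, dn


if __name__ == "__main__":
    gate = BidirectionalFDGate()
    gate.step(en=False, up=False, dn_raw=False)        # reset
    print(gate.step(en=True, up=False, dn_raw=True))   # DN passes
    print(gate.step(en=True, up=True, dn_raw=True))    # UP seen, DN blocked
    print(gate.step(en=True, up=False, dn_raw=True))   # DN stays blocked
```

Once UP has been seen, spurious DN pulses caused by long runs are ignored until EN resets the latch, which is the behavior the paragraph above attributes to SW.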


2.2 Frequency band selector

For wide-band CDR applications, the design of the wide-range VCO is also important. In this paper, we add a frequency band selector for the wide-range VCO. The VCO is a four-stage ring oscillator built from the delay cell in Fig. 7, in which VHP is the PMOS gate bias voltage generated from VHN through a current mirror. For more flexible control and better jitter performance of the CDR, the VCO range is divided into two bands controlled by the bit D0. The frequency band selection algorithm for the VCO is shown in Fig. 8. The algorithm always starts from the minimum frequency of band 2. Then, depending on the bit rate of the input data as indicated by the UP signal at the FD output, the control bit D0 is updated to select the appropriate frequency band of the VCO.

Fig. 7. Delay cell of the four-stage VCO (labels: VHP, VHN, IN, INb, OUT, OUTb, D00, D01)

Fig. 8. Frequency band selection algorithm for the wide-range VCO (flowchart boxes: "Initial VCO at minimum frequency of Band 2"; decision "UP?" with YES and NO branches; "Select Band 2"; "Select Band 1"; "Set VCO at maximum frequency of Band 1"; "Set VCO at maximum frequency of Band 2")
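One possible reading of the band selection flow is sketched below. The extracted flowchart does not show how its boxes are wired, so the branch structure is an assumption: the VCO is parked at the minimum frequency of band 2, an UP pulse from the FD is taken to mean the data is faster than that frequency and band 2 is selected, and otherwise band 1 is selected. Pairing each band with its "maximum frequency" preset is also assumed, and `select_band` is a hypothetical helper. The value of D0 for each band is not given in the text and is therefore not modeled.

```python
def select_band(up_seen_at_band2_min: bool) -> dict:
    """Pick the VCO band after parking the VCO at the minimum frequency of band 2.

    up_seen_at_band2_min: whether the FD asserted UP during this probe
    (assumed to mean the data rate is above the probe frequency).
    """
    if up_seen_at_band2_min:
        # Assumed YES branch of Fig. 8.
        return {"band": 2, "preset": "maximum frequency of band 2"}
    # Assumed NO branch of Fig. 8.
    return {"band": 1, "preset": "maximum frequency of band 1"}


if __name__ == "__main__":
    print(select_band(True))
    print(select_band(False))
```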

2.3 Frequency lock detector

The schematic of the frequency lock detector is shown in Fig. 9. The data signal is divided by 2 (using a D-FF) before it is fed into the N-bit counter, which relaxes the speed requirement of the counter. The N-bit counter counts the number of consecutive rising edges of Din/2. As soon as the counter reaches a threshold (8 bits), the signal LOCK goes high, which indicates that the FD is in the locked state. If UP or DN appears at the output of the FD, the counter is immediately reset to zero.


Fig. 9. Frequency lock detector (labeled blocks: two D-FFs, N-bit counter, CLA; signals: Din, UP, DN, EN, LOCK)
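The counting rule above can be modeled behaviorally as follows. This is a sketch under stated assumptions rather than the schematic of Fig. 9: the threshold is left as a parameter because the text gives it only as "8 bits", LOCK is assumed to stay latched until an explicit reset (standing in for EN), and the class and method names are hypothetical.

```python
class FrequencyLockDetector:
    """Counts rising edges of Din/2; any FD pulse clears the count."""

    def __init__(self, threshold):
        self.threshold = threshold   # count at which LOCK is asserted (assumed parameter)
        self.count = 0
        self.lock = False
        self._div2 = False           # state of the divide-by-2 D-FF

    def on_data_rising_edge(self):
        """Call once per rising edge of Din."""
        self._div2 = not self._div2
        if self._div2 and not self.lock:   # rising edge of Din/2
            self.count += 1
            if self.count >= self.threshold:
                self.lock = True

    def on_fd_pulse(self):
        """Call when UP or DN appears at the FD output."""
        self.count = 0

    def reset(self):
        """Assumed global reset (for example by EN)."""
        self.count = 0
        self.lock = False
        self._div2 = False


if __name__ == "__main__":
    fld = FrequencyLockDetector(threshold=3)
    for _ in range(4):
        fld.on_data_rising_edge()
    fld.on_fd_pulse()                      # an UP or DN pulse restarts the count
    for _ in range(5):
        fld.on_data_rising_edge()
    print(fld.count, fld.lock)
```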

2.4 Analysis of the proposed FD

A PRBS $2^7-1$ input data pattern is assumed for the analysis below. When the data rate is slower than the clock frequency, the timing diagram of the FD is as shown in Fig. 4(b). According to this timing diagram, two conditions must be satisfied to activate the signal DN. The first condition is that the input data contains the consecutive pattern "01x", where x can be "0" or "1"; the probability of this pattern is 1/4. The second condition is that a pulse of the clock CKI is enclosed by the data Din. We define the enclosure rate as the estimated probability that a pulse of the clock CKI is enclosed by the data Din in a given time $\Delta t$. Because of the long run lengths in the input data, the average enclosure rate is given as

$$\frac{4T_b - T_{CKI}/2}{4T_b} \qquad (1)$$

in which $T_{CKI}$ is the period of the clock CKI and $T_b$ is the bit time of the input data. As a result, the activation rate of DN in time $\Delta t$ is estimated as

$$\frac{\Delta t}{T_b}\cdot\frac{4T_b - T_{CKI}/2}{4T_b}\cdot\frac{1}{4} \qquad (2)$$

where $\Delta t / T_b$ is the number of input data bits in the given time $\Delta t$. Fig. 10 shows two cases in which the DN signal is activated in the proposed bidirectional FD. For frequently changing input data, the pulse width of DN is approximately $T_b$, as shown in Fig. 10(a). For a long run length of the input data, the pulse width of DN is approximately $7T_b$, as shown in Fig. 10(b). The average pulse width of DN is therefore approximately $4T_b$, and in a given time $\Delta t$ the control voltage of the VCO is decreased by

$$\Delta V_C = \frac{\Delta t}{T_b}\cdot\frac{4T_b - T_{CKI}/2}{4T_b}\cdot\frac{1}{4}\cdot 4T_b\cdot\frac{1}{C_p}\cdot I_{FD\_DN} \qquad (3)$$

in which $I_{FD\_DN}$ is the current of the CP connected to the FD. In our case, since we use two clock phases, CKI and CKQ, $\Delta V_C$ becomes

$$\Delta V_C = 2\cdot\frac{\Delta t}{T_b}\cdot\frac{4T_b - T_{CKI}/2}{4T_b}\cdot\frac{1}{4}\cdot 4T_b\cdot\frac{1}{C_p}\cdot I_{FD\_DN} \qquad (4)$$

Therefore, the clock frequency is decreased by the frequency deviation $\Delta f$ given by

$$\Delta f = K_{VCO}\cdot 2\Delta t\cdot\frac{4T_b - T_{CKI}/2}{4T_b}\cdot\frac{1}{C_p}\cdot I_{FD\_DN} \qquad (5)$$

Fig. 10. (a) Shortest ($T_b$) and (b) longest ($7T_b$) pulse width of the DN signal for PRBS $2^7-1$ input data (signals: Din, CKI, Q1, DN1, DN2)

When the data rate is faster than the clock frequency, the timing diagram of the FD is as shown in Fig. 3(b). This timing diagram shows that two conditions must be satisfied to activate the signal UP. The first condition is that the input data contains the consecutive pattern "010"; the probability of this pattern is 1/8. The second condition is that a pulse of CKI encloses one bit of the input data. For the second condition, the enclosure rate is given as [8]

$$\frac{T_{CKI}/2 - T_b}{T_{CKI}/2} \qquad (6)$$

Thus, the activation rate of the signal UP in time $\Delta t$ is estimated as [8]

$$\frac{\Delta t}{T_b}\cdot\frac{T_{CKI}/2 - T_b}{T_{CKI}/2}\cdot\frac{1}{8} \qquad (7)$$

Using a method similar to the frequency decrement analysis, Fig. 11 gives an average UP pulse width of $5T_b$, and the control voltage of the VCO is increased by [8]

$$\Delta V_C = \frac{\Delta t}{T_b}\cdot\frac{T_{CKI}/2 - T_b}{T_{CKI}/2}\cdot\frac{1}{8}\cdot 5T_b\cdot\frac{1}{C_p}\cdot I_{FD\_UP} \qquad (8)$$

In our work, we detect two sequences of the input data, "010" and "101", and use two clock phases, CKI and CKQ, so $\Delta V_C$ becomes

$$\Delta V_C = 4\cdot\frac{\Delta t}{T_b}\cdot\frac{T_{CKI}/2 - T_b}{T_{CKI}/2}\cdot\frac{1}{8}\cdot 5T_b\cdot\frac{1}{C_p}\cdot I_{FD\_UP} \qquad (9)$$

Hence, in a given time $\Delta t$ the clock frequency is increased by the frequency deviation

$$\Delta f = K_{VCO}\cdot\frac{20\Delta t}{8}\cdot\frac{T_{CKI}/2 - T_b}{T_{CKI}/2}\cdot\frac{1}{C_p}\cdot I_{FD\_UP} \qquad (10)$$

Thus, from (5) and (10), for a given frequency deviation $\Delta f$, frequency acquisition is achieved in a shorter time $\Delta t$ than in [8].


Fig. 11. (a) Shortest ($2T_b$) and (b) longest ($8T_b$) pulse width of the UP signal for PRBS $2^7-1$ input data (signals: CKI, Din, Q2, UP1)

3 Experimental results

The proposed half-rate CDR circuit is implemented in a 180 nm CMOS process. The simulation results show that the circuit successfully recovers $2^7-1$ PRBS data from 200 Mb/s to 3.2 Gb/s. The operating data range of the proposed FD is limited by the frequency range of the VCO. To demonstrate the wide tracking ability of the proposed FD, a high-gain, wide-range VCO is used. Fig. 12 shows the decremental frequency acquisition when the initial VCO frequency and the bit rate of the input data are 1.6 GHz and 1 Gb/s, respectively. Fig. 13 shows the incremental frequency acquisition when the initial VCO frequency and the bit rate of the input data are 0.4 GHz and 3 Gb/s, respectively. For KVCO = 2.5 GHz/V, IFD_DN = 120 µA, IFD_UP = 100 µA and Cp = 1.5 nF, the calculated decrement and increment acquisition times are 2.88 µs and 3.6 µs, respectively (11 µs and 14.4 µs in [8]). The simulated decrement and increment acquisition times are 2.7 µs and 3.7 µs, respectively. The small deviations between calculation and simulation arise because averages and approximations are used in estimating the frequency acquisition time. The simulation results in Fig. 12 and Fig. 13 show that the proposed FD can track the input data without harmonic locking. This means that, for frequency acquisition, the proposed FD does not need to reset the VCO frequency to the minimum or maximum of its frequency band.
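As a rough check of these calculated values, Eqs. (5) and (10) can be evaluated numerically. The sketch below makes two assumptions that are not stated in the text: the target clock frequency of the half-rate CDR is taken as half the bit rate, and the enclosure rate is evaluated once at the initial clock period and then held constant over the whole acquisition. The function names are hypothetical.

```python
def slew_down(t_b, t_cki, k_vco, i_dn, c_p):
    """Eq. (5) divided by dt: rate of frequency decrease in Hz/s."""
    enclosure = (4 * t_b - t_cki / 2) / (4 * t_b)
    return k_vco * 2 * enclosure * i_dn / c_p


def slew_up(t_b, t_cki, k_vco, i_up, c_p):
    """Eq. (10) divided by dt: rate of frequency increase in Hz/s."""
    enclosure = (t_cki / 2 - t_b) / (t_cki / 2)
    return k_vco * (20 / 8) * enclosure * i_up / c_p


K_VCO = 2.5e9   # Hz/V
C_P = 1.5e-9    # F

# Increment case: 3 Gb/s data, clock moves from 0.4 GHz to 1.5 GHz (assumed target).
t_up = (1.5e9 - 0.4e9) / slew_up(1 / 3e9, 1 / 0.4e9, K_VCO, 100e-6, C_P)

# Decrement case: 1 Gb/s data, clock moves from 1.6 GHz to 0.5 GHz (assumed target).
t_dn = (1.6e9 - 0.5e9) / slew_down(1 / 1e9, 1 / 1.6e9, K_VCO, 120e-6, C_P)

if __name__ == "__main__":
    print(f"increment acquisition time: {t_up * 1e6:.2f} us")
    print(f"decrement acquisition time: {t_dn * 1e6:.2f} us")
```

Under these assumptions the increment case evaluates to about 3.6 µs, in line with the calculated value quoted above, and the decrement case to about 2.98 µs, close to the quoted 2.88 µs.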

Fig. 12. Simulation result of the frequency decrement acquisition FD

Fig. 13. Simulation result of the frequency increment acquisition FD

Fig. 14 shows the frequency and phase acquisition process of the CDR when the data rate is 3 Gb/s. The acquisition process is divided into three phases. At the start, the frequency band selector searches for the appropriate frequency band of the VCO. Then the frequency acquisition process is carried out by the proposed FD. The frequency lock detector drives the LOCK signal high when the frequency error is small enough. After that, the CDR transfers control of the loop to the PD. The jitter of the recovered clock signal at 1.5 GHz is shown in Fig. 15. The CDR circuit achieves a peak-to-peak jitter of 11 ps. A performance comparison with other CDRs is given in Table I. As shown, the acquisition time of the proposed CDR is shorter than that of the other works.

Fig. 14. Simulation result of frequency and phase acquisition process

Fig. 15. Jitter performance of the recovered clock at 1.5 GHz

Table I. Performance comparison of wide-band referenceless CDR

| | [10] | [3] | [4] | [8] | This work (simulation) |
|---|---|---|---|---|---|
| Technology (nm) | 65 CMOS | 180 CMOS | 350 BiCMOS | 130 CMOS | 180 CMOS |
| Supply (V) | 1.2 | 1.8 | 3.3 | 1.5 | 1.8 |
| Data rate (Gb/s) | 4-10 | 0.15552-3.125 | 0.0125-2.7 | 1-16 | 0.2-3.2 |
| Architecture | Half-rate | Full-rate | Full-rate | Half-rate | Half-rate |
| FD type | Unilateral | Bidirectional | Unilateral | Bidirectional | Bidirectional |
| Divider in FD | Yes | Yes | Yes | Yes | No |
| Jitter p-p (ps) | 24 @ 10 Gb/s | 82.2 @ 1.244 Gb/s | N/A | 146 @ 1 Gb/s | 11 @ 3 Gb/s |
| Acquisition time (µs) | 230 | 100 | 1000 | 1000 | 11.8 |
| Area (mm²) | 1.63 | 0.88 (without FBS) | 9 | 0.134 (without FBS) | 0.319 (with FBS) |
| Power (mW) | 22.5 | 95 | 750 | 160 | 64.8 |

4 Conclusion

A referenceless 200 Mb/s to 3.2 Gb/s CDR with the proposed bidirectional FD is implemented in a 180 nm CMOS process. It reduces the frequency acquisition time to 11.8 µs with a power consumption of 64.8 mW from a 1.8 V supply.

Acknowledgments

This research was supported by Inha University. The authors also thank the IDEC for CAD tool support.
