2012 IEEE 42nd International Symposium on Multiple-Valued Logic

Multiple-Valued Time-Based Architecture for Serial Communication Links

Mostafa Rashdan, Member, IEEE, James Haslett, Life Fellow, IEEE and Brent Maundy, Member, IEEE
Department of Electrical and Computer Engineering
University of Calgary
Calgary, Canada
[email protected]

Abstract—A new multi-level differential-time-signaling (DTS) architecture for serial communication links is presented in this paper. The proposed system concentrates the transmitted signal energy in a smaller bandwidth than conventional architectures, allowing higher data rates for a given channel, and uses simple circuitry compared to other serial links, resulting in less power consumption and chip area. A 6-bit 3Gb/s three-level DTS link has been simulated using Cadence tools in a mixed-signal 90nm CMOS process. The eye diagrams of the transmitted signal and of the received signal at the end of a 40-inch FR-4 channel are presented. The spectral energy content in the transmitted signal is compared to our two-level DTS architecture and to the standard Serializer / Deserializer (SerDes) architecture to illustrate the advantages.

I. INTRODUCTION

With the rapid increase in multi-GHz serial communication links, the performance requirements of link design will continue to grow. Figure 1 shows the block diagram of the commonly used Serializer / Deserializer (SerDes) system. In the SerDes architecture, the input data is multiplexed at the transmitter side and demultiplexed at the receiver side. Pre-emphasis circuits and equalization circuits are used to compensate for the channel attenuation. At higher data rates, crosstalk, jitter, data skew and inter-symbol interference (ISI) become very important issues in the serial link design. Two main limitations restrict designers from achieving higher data rates. The first limitation is the circuit performance and the second limitation is the transmission channel bandwidth. At high data rates, the SerDes link requires a very high input clock frequency, which complicates the design of components such as the phase-locked loop (PLL) and the clock-and-data-recovery (CDR) circuit, resulting in more power consumption and chip area [1-3].

Several authors have published alternative serial-link architectures. Jeong and Burm [4] used pulse-amplitude modulation (PAM) in order to avoid multiplexing the input data as in the SerDes architectures. The transmitted signal amplitude was modulated according to the input data, as shown in Figure 2. Using the PAM technique reduces the transmitted signal bandwidth, but it complicates the CDR circuit design, which increases the power consumption and the area on chip. The decrease in the supply voltage with technology scaling makes it very difficult for PAM link designers to transmit more than 3 bits per link, which limits the applications of PAM serial links. Another issue is the signal-to-noise ratio (SNR), which is lower than in other serial links.


Figure 1. Block diagram of the conventional SerDes system.

Figure 2. The PAM transmitted signal timing diagram.

Another solution presented by the authors in [5] and [6] uses multi-valued time-based architectures, which take advantage of the time resolution that improves with technology scaling. This approach modulates the input clock signal in time according to the input data, rather than multiplexing the input data as in the SerDes architectures. The differential-time-signaling (DTS) data-link architecture in [5] uses a lower input clock frequency than the SerDes architecture for the same link rate, concentrating the transmitted signal energy in a lower bandwidth while reducing clock jitter effects. The simulated performance was verified using FPGAs. A further improvement in the DTS system can be made by adding multiple voltage levels to the time-based architecture. This new approach is presented in this paper, using three levels to illustrate the advantages. Unlike [4], the multi-voltage-level approach is not proposed to increase the number of transmitted bits. The proposed architecture relaxes the circuit design and reduces the transmitted signal bandwidth further compared to the DTS link and other serial links in order to achieve higher data rates.

The remainder of the paper is organized as follows. Section II gives a brief description of the differential-time-signaling (DTS) architecture. Section III presents the design details of the multi-level DTS architecture. The simulation results for a three-level example are shown in Section IV. Finally, Section V concludes the paper.

0195-623X/12 $26.00 © 2012 IEEE DOI 10.1109/ISMVL.2012.16

II. DTS ARCHITECTURE

The DTS architecture described by the authors in [5] consists of a Pulse-Position-Modulation (PPM)-based transmitter and a Time-to-Digital Converter (TDC)-based receiver, as shown in Figure 3. Figure 4 shows the timing diagram of the transmitted signal of a 4-bit DTS serial link. Each period consists of a reference clock pulse and a data pulse. The reference clock pulse has a fixed position in each period. The data-pulse edges are modulated independently according to the input code. The dotted lines show all possible positions of the data-pulse edges. The time difference between two consecutive dotted lines is defined as the link resolution. The smaller the link resolution, the larger the number of transmitted bits. At the receiver side, a separation circuit is used to separate the reference clock pulse and the data pulse from the received signal. The time difference between the data-pulse edges and the reference clock-pulse positive edge is converted into a binary code corresponding to the transmitted signal code.

Figure 3. The DTS architecture block diagram.

Figure 4. The DTS transmitted signal timing diagram.

The spectral content of the transmitted waveform can be shown to be concentrated in a narrower bandwidth than a conventional SerDes signal at the same data rate [5,6]. Reducing the transmitted signal bandwidth reduces ISI and relaxes the pre-emphasis and equalization circuit designs. A multi-level DTS architecture is presented in this paper, which further reduces the transmitted signal bandwidth compared to the DTS architecture presented by the authors in [5], and has additional advantages in terms of avoiding ISI. In addition, a comparator circuit separates the data pulse and the reference clock pulse from the received signal. As a result, no separation circuit is needed in the receiver side, which relaxes the circuit design further. In the remainder of this paper, the previously published DTS architecture in [5] will be referred to as the two-level DTS architecture to distinguish it from the proposed multi-level architecture.

III. MULTIPLE-VALUED TIME-BASED ARCHITECTURE

The proposed multi-level serial link consists of a transmitter circuit, a pre-emphasis circuit and a receiver circuit. The transmitted signal in the following example uses 3 voltage levels, V1, V2 and V3, in addition to the multi-valued time delays, to illustrate the advantages of multiple voltage levels. In this section the design details of the transmitter and the receiver circuits are provided.

A. The Transmitter circuit

The three-level DTS transmitter circuit is shown in Figure 5. The circuit consists of two PPM circuits. The first circuit modulates the positive edge of the input clock signal and the second circuit modulates the negative edge of the input clock signal. The D-type flip-flop shown in Figure 5 combines the output signals of the PPM circuits to generate a data signal that has both edges modulated independently, where (CP) is the rising-edge-triggered clock, (CD) is the active-high clear input and (D) is the input data. The lower part of the transmitter circuit diagram is the reference clock-pulse generation circuit. The pre-emphasis circuit combines the data-pulse signal and the reference-clock-pulse signal in order to generate the three-level transmitted signal.

Figure 6 shows the timing diagrams of the signals that have been marked in Figure 5, in a single-ended eye diagram representation. The figure shows the generation of the transmitted signal from the input clock signal. Figures 6(a) and 6(b) show the input and inverted clock signals respectively. Figures 6(c) and 6(d) indicate the PPM output signals and show that both edges have been modulated in the PPM circuits. Figures 6(e) and 6(f) show the PPM output signals after each AND gate. They indicate that each signal has one modulated edge while the other edge is not modulated. The generated data signal is shown in Figure 6(g) and the generated reference clock pulse signal is shown in Figure 6(h). The differential transmitted signal is shown in Figures 6(i) and 6(j). The multi-level signals are shown in the figures as V1, V2 and V3 volts. The PPM circuits used in the design have been presented by the authors in [7].

Figure 5. The multi-level DTS transmitter circuit.

Figure 6. The eye diagram of the signals indicated in Figure 5: (a) the input clock signal; (b) the inverted clock signal; (c) the PPM_1 output signal; (d) the PPM_2 output signal; (e) the PPM_1 output signal after the AND gate; (f) the PPM_2 output signal after the AND gate; (g) the data signal (0V to 1.2V); (h) the generated clock signal (0V to 1.2V); (i) the transmitted signal (levels V1, V2, V3); (j) the inverted transmitted signal (levels V1, V2, V3). Panels (a) to (j) correspond to the nodes marked 1 to 10 in Figure 5.

B. The pre-emphasis circuit

The pre-emphasis circuit used in the link design is shown in Figure 7. It consists of two circuits. The upper circuit is used to generate the transmitted signal and the lower circuit is used to generate the inverted transmitted signal. Each circuit consists of two stages, which are the driver stage and the tap stage.

Figure 7. The pre-emphasis circuit.

In the upper circuit, when the clock pulse signal is high and the data pulse signal is low, the transistors M2 and M4 are on and the transistors M1 and M3 are off. As a result, the output voltage at Vout is calculated as follows, assuming that R = Rchannel:

$$
V_{out} = \left[ V_{dd} - \left( I_{d,M1,sat} + I_{d,M3,sat} \right) R \right] / 2 \tag{1}
$$

When the clock pulse signal is low and the data pulse signal is low, the transistors M1 and M4 are on and the transistors M2 and M3 are off. As a result, the output voltage at Vout is calculated as follows:

$$
V_{out} = \left[ V_{dd} - I_{d,M1,sat} R \right] / 2 \tag{2}
$$

In the last case, when the clock pulse signal is low and the data pulse signal is high, the transistors M1 and M3 are on and the transistors M2 and M4 are off. As a result, the output voltage at Vout is calculated as follows:

$$
V_{out} = V_{dd} \, \frac{R_{channel}}{R_{channel} + R} \tag{3}
$$

The value of R as well as the channel characteristic impedance is 50 ohms. The Vout values in equations 1, 2 and 3 that correspond to V1, V2 and V3 are designed to be 0.2, 0.4 and 0.6 volts respectively.
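As a rough numerical check of the three output-level expressions (1) to (3) for the pre-emphasis circuit, the following Python sketch evaluates them with R = Rchannel = 50 ohms and the design targets of 0.2, 0.4 and 0.6 volts. The supply voltage of 1.2 V is an assumption (it is not stated in the text and is inferred from the 0V to 1.2V swing marked in Figure 6), and the saturation currents it prints are values implied by the equations under that assumption, not figures reported for the design. The function names are illustrative only.

```python
# Assumption: Vdd = 1.2 V (inferred from the 0V/1.2V labels in Figure 6).
VDD = 1.2
R = 50.0          # load resistance in ohms (from the text)
R_CHANNEL = 50.0  # channel characteristic impedance in ohms (from the text)


def vout_divider(vdd, r, r_channel):
    """Equation (3): resistive divider formed by R and the channel."""
    return vdd * r_channel / (r_channel + r)


def vout_with_current(vdd, i_total, r):
    """Equations (1) and (2) with R = Rchannel: Vout = (Vdd - I*R) / 2."""
    return (vdd - i_total * r) / 2.0


def current_for_level(vdd, v_target, r):
    """Invert equations (1)/(2): total drain current needed for a target Vout."""
    return (vdd - 2.0 * v_target) / r


v3 = vout_divider(VDD, R, R_CHANNEL)        # highest level, no current term
i_m1 = current_for_level(VDD, 0.4, R)       # equation (2): only I_d,M1
i_sum = current_for_level(VDD, 0.2, R)      # equation (1): I_d,M1 + I_d,M3
i_m3 = i_sum - i_m1

print(f"V3 = {v3:.3f} V")
print(f"implied I_d,M1 = {i_m1 * 1e3:.1f} mA, implied I_d,M3 = {i_m3 * 1e3:.1f} mA")
print(f"V2 = {vout_with_current(VDD, i_m1, R):.3f} V")
print(f"V1 = {vout_with_current(VDD, i_sum, R):.3f} V")
```

Under the assumed supply, equation (3) reproduces the 0.6 V level directly, and the two lower levels follow from equal driver and tap currents, which matches the evenly spaced 0.2 V steps quoted in the text.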

C. The receiver

The block diagram of the receiver used in the link design is shown in Figure 8. It consists of a comparator circuit and the receiver circuit. The comparator circuit, shown in Figure 9, consists of four differential amplifier stages. The inputs are terminated in 50 ohms in order to match the input impedance of the comparator circuit to the FR-4 channel impedance. R is set to 1.1 kilohms. The comparator circuit detects and amplifies the received signal and also separates the clock pulses from the data pulses. Figure 10 shows the circuit diagram of the first stage of the comparator circuit, together with the timing diagrams of the differential input signal and the output signals. When both inputs are at level V2, the circuit is designed so that both transistors are off and both outputs are at Vdd. When Vin1 is at the voltage level V3, Vin2 is at the voltage level V1. In that case the transistor on the left is on while the transistor on the right is off, resulting in a high state at the output Vout1 and a low state at Vout2. When Vin1 is at the voltage level V1, Vin2 is at the voltage level V3. Then the transistor on the left is off and the transistor on the right is on, resulting in a high state at the output Vout2 and a low state at Vout1. The clock and data are then separated as shown.
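The three input cases described for the first comparator stage can be summarized as a small behavioural model. The Python sketch below is only an idealized illustration: it uses the transmit-side levels V1, V2 and V3 (0.2, 0.4 and 0.6 volts) directly as the received levels, and the decision threshold of half the V3 to V1 difference is an assumption rather than a value taken from the design. The function name `comparator_stage1` is hypothetical.

```python
V1, V2, V3 = 0.2, 0.4, 0.6    # signal levels in volts (from the text)
THRESHOLD = (V3 - V1) / 2.0   # assumption: decision threshold on Vin1 - Vin2


def comparator_stage1(vin1, vin2, threshold=THRESHOLD):
    """Return (vout1_high, vout2_high) for an idealized first stage.

    Both inputs near V2  -> both outputs high (both transistors off).
    Vin1 well above Vin2 -> Vout1 high, Vout2 low.
    Vin1 well below Vin2 -> Vout2 high, Vout1 low.
    """
    diff = vin1 - vin2
    if diff > threshold:
        return (True, False)
    if diff < -threshold:
        return (False, True)
    return (True, True)


for pair in [(V2, V2), (V3, V1), (V1, V3)]:
    print(pair, "->", comparator_stage1(*pair))
```

Because each of the two active cases drives exactly one output low, the pulses at Vout1 and Vout2 never overlap, which is what allows the clock and data pulses to be separated without a dedicated separation circuit.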

A time-difference calibration circuit is used before each TDC circuit, as shown in Figure 11, in order to calibrate the time difference between the reference clock-pulse signal and the data signal. The TDC-1 circuit converts the time difference between the rising edge of the reference clock-pulse signal and the rising edge of the data signal into an N1-bit binary code corresponding to the transmitted code. The TDC-2 circuit converts the time difference between the rising edge of the reference clock-pulse signal and the rising edge of the inverted data signal into an N2-bit binary code corresponding to the transmitted code. Here N1 is the number of bits that have been transmitted by modulating the positive edge of the input clock signal and N2 is the number of bits that have been transmitted by modulating the negative edge of the input clock signal. The TDC circuit used in the receiver circuit design has been published by the authors in [8].
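The conversion performed by each TDC can be pictured as quantizing a measured time difference to the nearest multiple of the link resolution. The Python sketch below is a behavioural illustration only, not the TDC architecture of [8]. It assumes the 62.5 ps link resolution quoted for the simulated link, an even split of N1 = N2 = 3 bits per edge, a calibration offset parameter `offset_ps`, and a packing order with the rising-edge code in the upper bits; all of these names and choices are hypothetical.

```python
RESOLUTION_PS = 62.5   # link resolution of the simulated link, in picoseconds
BITS_PER_EDGE = 3      # assumption: 6 bits split evenly, N1 = N2 = 3


def tdc_decode(delta_t_ps, offset_ps=0.0,
               resolution_ps=RESOLUTION_PS, bits=BITS_PER_EDGE):
    """Quantize a clock-to-data-edge time difference into an integer code.

    offset_ps models the fixed delay removed by the calibration circuit.
    The result is clamped to the valid code range 0 .. 2**bits - 1.
    """
    steps = round((delta_t_ps - offset_ps) / resolution_ps)
    return max(0, min((1 << bits) - 1, steps))


def recover_word(dt_rising_ps, dt_falling_ps, offset_ps=0.0):
    """Combine the TDC-1 and TDC-2 codes into one word (assumed packing order)."""
    n1 = tdc_decode(dt_rising_ps, offset_ps)
    n2 = tdc_decode(dt_falling_ps, offset_ps)
    return (n1 << BITS_PER_EDGE) | n2


# An edge 10 ps late still decodes to the intended code 3.
print(tdc_decode(187.5), tdc_decode(197.5))
print(recover_word(312.5, 62.5))
```

Rounding to the nearest step means that edge jitter smaller than half the link resolution does not change the recovered code, which is why jitter much smaller than the link resolution is tolerable.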

Figure 8. The block diagram of the receiver side.

Figure 9. The comparator circuit.

Figure 11. The receiver circuit.

Figure 12. The eye diagrams of the signals indicated in Figure 8: (a) the differential received signal (levels V1, V2, V3); (b) the separated data signal (0.0 to Vdd); (c) the separated clock pulse signal (0.0 to Vdd).

Figure 12(a) shows the eye diagram of the differential received signal when a perfect channel is used. The comparator output signals are shown in Figures 12(b) and 12(c).

Figure 10. The circuit diagram of stage one in the comparator circuit indicating the timing diagrams of the signals at different nodes.

D. Channel modeling

A 40-inch FR-4 channel has been used as the transmission medium for the designed link. An S-parameter table has been generated using ADS tools and used in Cadence to simulate the channel. The wire-bond and bond-pad equivalent circuits have been taken into account in the simulation. Figure 13 shows the S11 and S21 curves of the channel using Cadence tools [5].

Figure 13. The S11 and S21 curves of the FR-4 channel used in the link design [5]. (Vertical axis: magnitude of S11 and S21 in dB; horizontal axis: frequency in Hz, 0.1G to 100G.)

IV. SIMULATION RESULTS

A 6-bit 3Gb/s multi-level DTS link has been designed and simulated using Cadence tools in a mixed-signal 90nm CMOS process. The input bit rate is 500 Mb/s for each of the six bits. The designed link uses an input clock frequency of 500 MHz and a link resolution of 62.5 ps. Figures 14 and 15 show the eye diagrams of the differential transmitted signal. Figure 16 shows the eye diagram of the received signal at the 40-inch FR-4 channel output. Figures 17 and 18 show the eye diagrams of the comparator output signals: Figure 17 shows the recovered reference clock signal and Figure 18 shows the data signal. Figure 17 shows a small amount of jitter on both edges of the reference clock signal. The jitter on the negative edge is not a concern since this edge is not used in the recovery process. The jitter on the positive edge is much smaller than the link resolution, and therefore does not affect the data recovery process. The transmitted codes are successfully recovered at the receiver side.

Figure 14. The eye diagram of the transmitted signal.

Figure 15. The eye diagram of the inverted transmitted signal.

Figure 16. The eye diagram of the three-level DTS received signal at the end of a 40-inch FR-4 channel.

Figure 17. The eye diagram of the reference clock pulse signal at the comparator output indicating the jitter effect on both edges.

Figure 18. The eye diagram of the data pulse signal at the comparator output.
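The link parameters quoted for the simulated design (6 bits per clock period, a 500 MHz input clock and a 62.5 ps link resolution) can be related with a few lines of arithmetic. The Python sketch below does this and also illustrates how a 6-bit word might map to two edge offsets. The even split of 3 bits per data-pulse edge and the choice of which bits drive which edge are assumptions made for illustration; the function name `edge_offsets_ps` is hypothetical.

```python
CLOCK_HZ = 500e6        # input clock frequency (from the text)
BITS_PER_PERIOD = 6     # bits carried per clock period (from the text)
RESOLUTION_PS = 62.5    # link resolution in picoseconds (from the text)
BITS_PER_EDGE = 3       # assumption: even split between the two edges

link_rate_bps = CLOCK_HZ * BITS_PER_PERIOD          # 6 bits every period
period_ps = 1e12 / CLOCK_HZ                         # clock period in ps
positions_per_edge = 2 ** BITS_PER_EDGE             # possible edge positions
max_offset_ps = (positions_per_edge - 1) * RESOLUTION_PS


def edge_offsets_ps(word):
    """Map a 6-bit word to (rising-edge, falling-edge) delays in picoseconds.

    Assumed convention: the upper 3 bits set the rising-edge position and
    the lower 3 bits set the falling-edge position.
    """
    if not 0 <= word < 2 ** BITS_PER_PERIOD:
        raise ValueError("word must fit in 6 bits")
    n1 = word >> BITS_PER_EDGE
    n2 = word & (positions_per_edge - 1)
    return (n1 * RESOLUTION_PS, n2 * RESOLUTION_PS)


print(f"link rate = {link_rate_bps / 1e9:.1f} Gb/s, period = {period_ps:.0f} ps")
print(f"edge modulation range = {max_offset_ps} ps")
print(edge_offsets_ps(0b101001))
```

Under these assumptions each edge moves over less than a quarter of the clock period, so the 3 Gb/s link rate is reached while the transmitted waveform still toggles at the 500 MHz clock rate.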

The three-level DTS system has an important advantage over the two-level DTS system published by the authors in [5]. Figure 19 shows the eye diagram of the two-level DTS signal received at the end of a 40-inch FR-4 channel.

When the time spacing between the negative edge of the reference clock pulse and the positive edge of the data pulse is decreased in order to increase the link speed, inter-symbol interference (ISI) might occur (indicated by the arrows and the circle shown in Figure 19), which will result in an error in the recovery process. This problem has disappeared in the three-level DTS received signal, as shown in Figure 16. The transitions are indicated by the arrows. As a result, the proposed architecture avoids having two closely spaced consecutive transitions, which would translate into high-frequency components in the transmitted signal spectrum, and thus reduces the transmitted signal bandwidth.

Figure 19. The eye diagram of the two-level DTS received signal at the end of a 40-inch FR-4 channel. The arrows and the circle indicate the problem area as the time difference between the two arrows decreases at higher data rates. (Horizontal axis: time in nsec.)

Figure 20 shows the transmitted signal spectrum of the multi-level DTS link. Table 1 shows a comparison of the SerDes link, the two-level DTS link, and the three-level DTS link in terms of the transmitted signal energy concentration in different bandwidths. The table indicates that the three-level DTS transmitted signal concentrates the signal energy in a lower bandwidth than the two-level DTS link and the SerDes link. As a result, the three-level DTS transmitted signal can be transmitted over longer distances compared to the NRZ SerDes link and the two-level DTS link. Also, the three-level DTS architecture can be used at higher data rates while relaxing the equalization circuit design for longer distances. It should be noted that increasing the number of voltage levels to more than 3 levels is similar to using a pre-emphasis circuit with more tap stages, so 3 levels is probably optimal.

Figure 20. The three-level DTS transmitted signal spectrum. (Vertical axis: magnitude in mV, 0 to 400, with marked values of 364 mV and 178 mV; horizontal axis: frequency in GHz, 0 to 10.)

Table 1. The percentage energy concentration of the transmitted signal in different bandwidths for each of the link methods.

| Bandwidth | 3Gb/s SerDes link | 3Gb/s Two-Level DTS link | 3Gb/s Three-Level DTS link |
|-----------|-------------------|--------------------------|----------------------------|
| 1GHz      | 52.5%             | 73.8%                    | 85.7%                      |
| 2GHz      | 88.7%             | 88.3%                    | 94.7%                      |
| 4GHz      | 93%               | 94.1%                    | 97.5%                      |

The power consumption of the 3Gb/s multiple-valued DTS link is 55 mW and the estimated area on chip is 0.18 mm² in a 90nm CMOS process.

V. CONCLUSION

A multi-level DTS architecture has been presented in this paper. The proposed link relaxes the circuit design by avoiding the separation circuit that the two-level DTS architecture requires in the receiver side. It also removes the possibility of error when transitions in the signals get too close to one another. A 6-bit 3Gb/s three-level DTS link has been designed and simulated in a commercial 90nm CMOS process. The simulated signals at the transmitter and the receiver sides have been presented using the eye diagram representation. The link uses a one-tap pre-emphasis circuit and no equalization circuit. The transmitted bits have been successfully recovered at the receiver side. A comparison in terms of the energy concentrated in the transmitted signals has been carried out among the proposed link, the two-level DTS link and the NRZ SerDes link at the same link rate. The comparison shows that the three-level DTS transmitted signal occupies a smaller bandwidth than the two-level DTS and the NRZ SerDes architectures. As time resolution improves and voltage resolution degrades with scaling, this technique should have several advantages over conventional systems.

ACKNOWLEDGMENT

This work was supported by the provincial iCORE program, by NSERC, and by the University of Calgary.

REFERENCES

[1] T. Beukema et al., “A 6.4-Gb/s CMOS SerDes core with feed-forward and decision-feedback equalization,” IEEE J. Solid-State Circuits (USA), vol. 40, pp. 2633-2645, Dec. 2005.
[2] Y. Nishi et al., “An ASIC-ready 1.25-6.25Gb/s SerDes in 90nm CMOS with multi-standard compatibility,” IEEE Asian Solid-State Circuits Conference, pp. 37-40, November 2008.
[3] K. Krishna et al., “A multigigabit backplane transceiver core in 0.13-μm CMOS with a power-efficient equalization architecture,” IEEE J. Solid-State Circuits (USA), vol. 40, pp. 2658-66, Dec. 2005.
[4] J. L. Jikyung Jeong and J. Burm, “A CMOS 3.2 Gb/s 4-PAM serial link transceiver,” International SoC Design Conference (ISOCC 2009), pp. 408-411, 2009.
[5] M. Rashdan, A. Yousif, J. Haslett, and B. Maundy, “Differential Time Signaling Data-Link Architecture,” Springer Journal of Signal Processing Systems, in press, 2011.
[6] M. Rashdan, A. Yousif, J. Haslett, and B. Maundy, “New time-based architecture for serial communication links,” IEEE 16th Int. Conf. on Electronics, Circuits and Systems (ICECS), pp. 531-534, 13-16 December 2009.
[7] A. Yousif, M. Rashdan, J. Haslett, and B. Maundy, “A low power and high speed PPM design for ultra wideband communications,” The Canadian Conf. on Electrical and Computer Engineering (CCECE), Niagara Falls, Canada, pp. 1055-1058, 4-7 May 2008.
[8] A. Yousif and J. Haslett, “A fine resolution TDC architecture for next generation PET imaging,” IEEE Transactions on Nuclear Science, vol. 54, no. 5, pp. 1574-1582, Oct. 2007.
