A Low-Overhead and Low-Power RF Transceiver

1 downloads 0 Views 637KB Size Report
In this paper, we present a new low-power, small-area RF-I transceiver that ... this RF-I transceiver is an ultra wide bandwidth system with a throughput of over.
IEICE TRANS. ELECTRON., VOL.E94–C, NO.5 MAY 2011

854

BRIEF PAPER

Special Section on Fundamentals and Applications of Advanced Semiconductor Devices

A Low-Overhead and Low-Power RF Transceiver for Short-Distance On- and Off-Chip Interconnects Jongsun KIM†a) , Member, Gyungsu BYUN†† , and M. Frank CHANG†† , Nonmembers

SUMMARY One of the most difficult problems that remains to be solved in wire interconnect architectures is the achievement of lower latency and higher concurrency on a shared bus or link without increasing the power and circuit overhead. Novel improvements in short distance onand off-chip interconnects can be provided by using a multi-band RF interconnect (RF-I) system. Unlike the conventional current- or voltage-mode square wave signaling transceivers that use binary or multilevel baseband signals, the proposed RF-I transceiver uses high-frequency modulated RF passband signals with binary phase-shift keying (BPSK) modulation. The proposed low-overhead RF-I transceiver using 0.18-µm CMOS technology achieves an aggregate data rate of 4 Gb/s/pin between four I/Os (2Tx-to2Rx) on a shared FR4 PCB line using two carriers of 6 GHz and 12 GHz. The two transceivers occupy an area of 0.077 mm2 and dissipate a power of about 25 mW with a power efficiency of 6.25 pJ/bit. key words: interconnect, communication, RF interconnect, wire line transceiver, VLSI

nary phase-shift keying (BPSK) [2] or amplitude-shift keying (ASK) [3], [6] modulation mixers with high-frequency carriers; this method achieves communication concurrency and re-configurability without interference. However, the size and power overhead of the previously used off-chip RFI transceiver [2] is an issue that needs to be resolved before it can be used in practical applications. In this paper, we present a new low-power, small-area RF-I transceiver that can achieve an aggregate data rate of 4 Gb/s/pin between four (2 Tx-to-2 Rx) off-chip I/Os on a shared 10-cm FR4 PCB transmission line using 0.18-µm CMOS technology.

1.

2.1 Architecture

Introduction

The performance of future VLSI systems is likely to be determined by improvements in the design efficiency of wire interconnect architectures. The challenge is to increase concurrency and reduce latency without increasing transceiver power and complexity. Conventional on-chip interconnect technologies that use repeaters or low-swing signaling, even with copper wires and low-k dielectrics, cannot keep up with the advances in transistor speeds or satisfy the future requirements of the signaling power budget. To resolve the problem of limited concurrency and increased latency in on- and off-chip wire interconnect architectures, novel RFinterconnect (RF-I) techniques based on CDMA and FDMA modulations [1]–[3], [8] have been applied to signaling technologies to achieve flexible and efficient communication networks while consuming less power. Unlike conventional current- or voltage-mode signaling transceivers that use binary or multilevel baseband signals, RF-I transceivers [2], [3], [8] use modulated radio-frequency electromagnetic wave signals (referred to as RF signals in this paper); this enables multiple concurrent data communications on a shared interconnect (e.g., between multiple short-range off-chip I/Os [2] or between multiple on-chip I/Os [3]). Compared with conventional baseband digital signaling, which uses time-division multiplexing (TDM), the RF-I transmitter transmits modulated RF carrier signals using either biManuscript received September 1, 2010. Manuscript revised February 1, 2011. † The author is with Hongik University, Seoul, 121-791, Korea. †† The authors are with UCLA, Los Angeles, 90095 USA. a) E-mail: [email protected] DOI: 10.1587/transele.E94.C.854

2.

RF-I Transceiver Architecture and Circuit Design

Figure 1(a) shows the proposed two-band RF-I system with two transmitters (TX1, TX2) and two receivers (RX1, RX2), which share a terminated off-chip FR4 PCB transmission line. In this system, when Tx1 and Rx1 communicate with each other by using a 6 GHz RF carrier LO1, the other transmitter and receiver (Tx2 and Rx2) can occupy the same channel simultaneously by using a 12 GHz RF carrier LO2; this results in the achievement of twice the original channel concurrency without any latency overhead. Figure 1(b) shows the frequency spectrum of scalable bandwidth allocation for multiple concurrent transactions by assigning two carrier frequency bands. Unlike the conventional wireless RF transceiver, which is a narrow-band system with a data bandwidth of less than tens of MHz, this RF-I transceiver is an ultra wide bandwidth system with a throughput of over a few gigabits per second per pin. The challenge in designing this high-data-rate RF-I transceiver lies in reducing the area and power overhead of the circuit, while simultane-

Fig. 1 (a) Schematic of the RF-I system architecture and (b) frequency spectrum for multi-band communication.

c 2011 The Institute of Electronics, Information and Communication Engineers Copyright 

BRIEF PAPER

855

Fig. 2 (a) proposed RF-I transmitter, (b) proposed RF-I receiver, and (c) PLL for LO generation.

ously achieving a wide signal bandwidth spanning several GHz with less noise production. This challenge is overcome by using transceiver architecture, which is designed with zero on-chip inductors, and by applying a simple BPSK (de)modulation algorithm. 2.2 Circuit Design The proposed RF-I transceiver implemented in 0.18-µm CMOS technology is shown in Fig. 2. The transmitter (Tx) circuit, shown in Fig. 2(a), consists of three parts. The first part is a data buffer, which converts the differential digital data (Din/Dinb) into analog signals whose signal swings can be adjusted. The signal swing is controlled by the tail current source. The second part of the Tx is a BPSK modulator. It functions as a mixer that performs up-conversion and modulation with a differential high frequency carrier signal LO/LOB. The two NMOS switches are voltage-controlled resistors whose resistances are controlled via the gate voltage LO and LOB. The third part of the Tx is a single-ended, open-drain, current-mode output driver, which transmits the modulated high frequency RF signal through the terminated transmission line with a signal swing of around 30 mV at the end of the FR4 channel. Compared with the typical off-chip I/O drivers with signal swings of a few hundred millivolts, the signal swing of the RF-I Tx is only one-tenth of con-

ventional drivers, ideally resulting in one-tenth the signaling power on the channel. The cascode driver stage provides good isolation between the Tx and the transmission line as well as relatively high output impedance. One advantage of this RF-I Tx is that it does not use LC-tuned circuits; therefore, it achieves ultra-wide bandwidth even with a small silicon area. The RF-I receiver (Rx) circuit is shown in Fig. 2(b). The Rx receives the modulated high-frequency RF signals from the shared transmission line through an on-chip capacitor Cc. This capacitor has a capacitance of approximately 100 fF, and it behaves as a high-pass filter, rejecting baseband noise signals. The Rx then demodulates the signal by using a two-stage differential amplifier, a BPSK demodulation mixer, and a low-pass filter (LPF). The two-stage differential amplifier has receiver sensitivity of larger than 30 mV at12 GHz carrier frequency. The simulated gain of this twostage amplifier is 16.4 dB for compensation of signal loss. The mixer down-converts the output signal of the differential amplifier; subsequently, the original data can be demodulated correctly, while the other interfering frequency band signals are rejected by the following LPF. Instead of increasing the speed of conventional signaling (mostly based on a single-tone in the baseband), the RF-I system can achieve a higher aggregate data rate per pin for a larger number of RF carriers [2], [4], [5]. To save power and area overhead for an N x N RF-I, only the receiver with the highest carrier frequency requires the phase-locked loop (PLL), as shown in Fig. 2(c), or frequency-locked loop (FLL) to generate LO signals, while the other N-1 carriers can be generated by dividing down the highest LO frequency. For proper BPSK demodulation, both frequency and phase must be synchronized for original baseband data recovery. In this prototype design, frequency synchronization should be performed by the on-chip PLL which generates the LO1 and LO2 reference clock for the BPSK modulator and the mixer. Phase control of each RX1 and RX2 was done externally in this prototype design, since we assumed that each transceiver chip has its own PLL and forwarded clocking scheme is not used for simplicity. In order to make a simple comparison for I/O operation only, we did not count the clocking power in this paper. It is necessary to analyze the frequency characteristic of the FR-4 PCB channel, since the transmission signals around 12 GHz and 6 GHz may be suffered from large loss in practical use. For accurate channel modeling including wire bonds and parasitic capacitance, the most accurate 3D EM solver tool was used to generate S parameters. The simulated signal loss of the 10-cm FR4 PCB transmission line is −2.2 dB and −5.4 dB at 6 GHz and 12 GHz, respectively. In this prototype test channel, the channel loss is not critical, since we aimed to implement point-to-point or 2point-to2point link for short distance mobile memory I/O interface. Since modulated high frequency small swing RF signal is transmitted through the channel, inter symbol interference (ISI) is much less at 12 GHz than a conventional baseband voltage-mode signaling.

IEICE TRANS. ELECTRON., VOL.E94–C, NO.5 MAY 2011

856

3.

Simulation Results

The proposed low-overhead RF-I transceiver has been designed in a 0.18-µm 1.2-V CMOS technology to demonstrate multi-band bi-directional signaling on a shared PCB bus. As shown in Fig. 1(a), the test board consists of two RXs and two TXs over a 10-cm FR4 PCB line. The input data patterns, recovered data waveforms, and superposed RF channel signal waveforms of both RF1 and RF2 are shown in Fig. 3. Simultaneous multiple access with two RF carriers can be observed from Fig. 3(a); it shows bi-directional signaling with a throughput of 2 Gb/s/pin in the 6 GHz RFband (RF1), and 2 Gb/s/pin in the 12 GHz RF-band (RF2). The modulated and superposed RF signal on the transmission line shows signal amplitude of a few tens of millivolts. The two-stage Therefore, the RF-I transceivers achieve an aggregate data rate of 4 Gb/s/pin, with a power dissipation of approximately 25 mW (5.4 mW for Tx1 and Tx2, 19.6 mW for Rx1 and Rx2), and a power efficiency of 6.25 pJ/bit. This power efficiency is only one-sixth that of the prior design [2], and roughly half of the conventional memory interface transceivers designed using 1.0-V CMOS [7]. Figure 3(b) shows the simulated frequency spectrum of the channel signal with two RF carriers (6 and 12 GHz). Figure 4 shows the simulated mixer output signals of

the RF transceivers TX1/RX1 and TX2/RX2 using 6 GHz and 12 GHz carrier frequencies, respectively. The modulated and superposed mixed signals on the transmission line are distorted due to inter-channel interference. However, after the LPF with a bandwidth of much less than 6 GHz, the modulated carrier signals of 6 GHz and 12 GHz carrier were recovered to original baseband data correctly. A bandselective band-pass filter (BPF) can be used to improve receiver performance at the cost of power and area overhead. The simulation is verified using post-extracted capacitive parasitic and ESD components. VTERM is set to 0.85 V for maximum gain (max Gm) of the first stage differential amplifier. The minimum receiver sensitivity and the overall gain of the two-stage differential amplifier is optimized to sense and amplify the 12 GHz carrier signal by considering all factors of signal integrity degradation such as channel loss, dispersion, and reflection due to impedance mismatch. Figure 5 shows the layout of the RF-I transceivers with two TXs and two RXs. The RF-I transceiver chip occupies an area of 0.077 mm2 , which is approximately one-eighth of

Fig. 4 Simulated mixer output signals of transceivers TX1/RX1 and TX2/RX2.

Fig. 3 (a) Simulated 4 Gb/s/pin simultaneous bi-directional signaling with two RF carriers of 6 and 12 GHz, (b) Simulated DFT Frequency Spectrum with two RF carriers (6 and 12 GHz).

Fig. 5 RXs.

Layout of the proposed RF-I transceivers with two TXs and two

BRIEF PAPER

857

the previously designed one [2]. 4.

Conclusion

On-chip and off-chip communication between multiple IPs for a future VLSI system, including network-on-chips (NOCs) and (system-on-chips) SOCs, would consume a significant amount of power and have increased system latency. To resolve the problems of excessive power, limited bandwidth, and concurrency of conventional chip-to-chip communication networks that use current- or voltage-mode baseband transceivers, a low-overhead RF-I transceiver has been proposed that enables simultaneous bi-directional signaling and increased concurrency on a shared wire interconnect. The proposed low-overhead RF-I transceiver uses simple BPSK modulation for RF-band signaling over a lowcost FR4 PCB transmission line. The test board system simultaneously achieved multiple access between four (2Txto-2Rx) I/Os with an aggregate data rate of 4.0 Gb/s/pin. The transceiver circuit designed in 0.18-µm 1.2-V CMOS technology occupies an area of 0.077 mm2 , has a very lowpower dissipation of approximately 25 mW, and has a power efficiency of 6.25 pJ/bit. As silicon CMOS technology continues to advance in terms of transistor speed and size, a higher device fT (e.g., 240 GHz for 45 nm and 490 GHz for 16 nm CMOS) will improve the aggregate data rate of the RF-I with higher carrier frequencies for a substantially smaller silicon area overhead and relatively low power. Acknowledgments This work (Grant No. 000419970110) was supported in

part by Business for Cooperative R&D between Industry, Academy and Research Institute funded Korea Small and Medium Business Administration in 2010. References [1] J. Kim, I. Verbauwhede, and M.F. Chang, “Design of an interconnect architecture and signaling technology for parallelism in communication,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol.15, no.8, pp.881–894, Aug. 2007. [2] J. Ko, J. Kim, Z. Xu, Q. Gu, C. Chien, and M.F. Chang, “An RF/Baseband FDMA-Interconnect transceiver for reconfigurable multiple access chip-to-chip communication,” IEEE International SolidState Circuits Conference (ISSCC) Dig. Tech. Papers, pp.338–339, Feb. 2005. [3] S. Tam, E. Socher, A. Wong, and M.F. Chang, “A simultaneous tri-band on-chip RF-interconnect for future network-on-chip,” 2009 Symposium on VLSI Circuits, pp.90–91, June 2009. [4] M.F. Chang, I. Verbauwhede, C. Chien, Z. Xu, J. Kim, J. Ko, Q. Gu, and B. Lai, “Advanced RF/Baseband interconnect schemes for Intra- and Inter-ULSI communications,” IEEE Trans. Electron Devices, vol.52, no.7, pp.1271–1285, July 2005. [5] E. Socher and M.-C.F. Chang, “Can RF help CMOS processors?,” IEEE Commun. Mag., vol.45, no.8, pp.104–111, Aug. 2008. [6] Q. Gu, Z. Xu, J. Ko, and M.F. Chang., “Two 10 Gb/s/pin low-power interconnect methods for 3D ICs,” IEEE International Solid-State Circuits Conference (ISSCC) Dig. Tech. Papers, pp.448–449, Feb. 2007. [7] E. Prete, D. Scheidler, and A. Sanders, “A 100 mW 9.6 Gb/s transceiver in 90 nm CMOS for next-generation memory interfaces,” IEEE Int. Solid-State Circuits Dig. Tech. Papers, pp.253–262, 2006. [8] J. Kim, G. Byun, and M.F. Chang, “RF interconnect technology for on-chip and off-chip communication,” 2010 Asia-Pacific Workshop on Fundamentals and Applications of Advanced Semiconductor, pp.105–108, July 2010.