FA 10.4: A 2.6GB/s Multi-Purpose Chip-to-Chip Interface

B. Lau, Y-F. Chan, A. Moncayo, J. Ho, M. Allen¹, J. Salmon¹, J. Liu¹, M. Muthal², C. Lee³, T. Nguyen³, B. Horine³, M. Leddige³, K. Huang, J. Wei, L. Yu, R. Tarver, Y. Hsia, R. Vu, E. Tsern, H-J. Liaw, J. Hudson, D. Nguyen, K. Donnelly, R. Crisp

Rambus, Mountain View, CA
¹Intel, Folsom, CA / ²Santa Clara, CA / ³Hillsboro, OR

A high-speed interface cell delivers an 800Mb/s/pin data transfer rate on a 26b-wide I/O interface consisting of a dual-byte data field and a byte-wide command field. To reach the 2.6GB/s data rate, a 400MHz clock recovery circuit guarantees the timing margin for transferring 800mV-swing data on both clock edges over the I/O interface [1]. Data from the high-speed interface is internally deserialized to provide a 100MHz (f/4) ASIC clock interface. A test chip contains three megacells and built-in clock synchronization circuits that ensure proper data transfer between the three megacells with minimal impact on latency.

Controlled-impedance buses, referred to as channels, with careful PCB layout ensure an 800Mb/s/pin on-board data rate for ASIC-to-ASIC or ASIC-to-DRAM system configurations [2].

A binary-weighted controlled-current output driver, using an impedance-tracking current control algorithm [3], is used to obtain the optimal current level. The circuit determines the level by sensing two I/O levels, one at logic 1 and the other at logic 0, and comparing them with the reference voltage.
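The current-control idea described above can be illustrated with a successive-approximation search over a binary-weighted driver code. This is only a sketch under stated assumptions, not the circuit from the paper: the open-drain line model, the values of `V_REF`, `R_LOAD`, `I_LSB` and `N_BITS`, and the criterion that the midpoint of the two sensed levels should sit at the reference voltage are all hypothetical choices made for illustration.

```python
# Hypothetical electrical model; apart from the 800 mV swing and the 1.8 V
# termination option, none of these values come from the paper.
V_TERM = 1.8      # termination voltage (V)
V_REF = 1.4       # assumed reference: midpoint of an 800 mV swing below V_TERM
R_LOAD = 25.0     # assumed effective load seen by the open-drain driver (ohms)
I_LSB = 0.9e-3    # assumed current of the least-significant driver leg (A)
N_BITS = 6        # assumed width of the binary-weighted current code


def line_levels(code):
    """Return (v_off, v_on): line voltage with the driver off and with it sinking current."""
    v_off = V_TERM                          # driver off: line rests at termination
    v_on = V_TERM - code * I_LSB * R_LOAD   # driver on: code * I_LSB flows in the load
    return v_off, v_on


def midpoint_not_below_ref(code):
    """Comparator model: is the midpoint of the two sensed levels still at or above V_REF?"""
    v_off, v_on = line_levels(code)
    return (v_off + v_on) / 2.0 >= V_REF


def calibrate():
    """Try each bit from MSB to LSB and keep it unless the swing becomes too large."""
    code = 0
    for bit in reversed(range(N_BITS)):
        trial = code | (1 << bit)
        if midpoint_not_below_ref(trial):
            code = trial
    return code


code = calibrate()
v_off, v_on = line_levels(code)
print(f"code={code}, swing={v_off - v_on:.4f} V")
```

Because the search settles on the largest code whose swing does not exceed the target, the same loop would land on a different code if `R_LOAD` changed, which is the sense in which such a scheme tracks the channel impedance.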
To optimize performance versus power consumption of the input sampler, a current switch compensates for the channel-length modulation of the current source. The controlled-current output is automatically calibrated to the optimal output current level. The result is an interface that consumes less I/O power per MB/s than other advanced I/O architectures, as shown in Figure 3.

To ensure robust operation at an 800Mb/s/pin signaling rate, the interconnect is designed by following a set of detailed rules that establish a uniform controlled-impedance signal path. Figure 4 shows the system board used for characterizing the interface chip. In addition to detailed PCB design, real system issues such as packaging and thermal dissipation for multiple chips on the channel are considered in the system board design.

The channel used to transport data between ASIC and DRAM uses microwave design methodologies for maximum interconnect bandwidth [5]. The channel consists of the megacell and potentially many DRAMs. It is best represented as a uniform array of cascaded reactive elements that model the parasitic inductance, capacitance, and resistive loss of the DRAMs. The electrical behavior of such a network is much like that of a microwave filter, exhibiting characteristic passbands and stopbands. The PCB design rules and DRAM packages minimize cost while maximizing overall channel bandwidth. The result is a transmission system with the controlled impedance and propagation characteristics (attenuation and phase) required to transport signals at 800Mb/s. Figure 5 shows a typical channel frequency response.
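The filter-like behavior of a loaded channel can be approximated with a lossless LC ladder, one section per device pitch. The sketch below uses the standard constant-k low-pass results (loaded impedance sqrt(L/C), cutoff 1/(pi*sqrt(L*C)) per section). Every numeric value in it (trace inductance and capacitance per metre, device pitch, device input capacitance) is a hypothetical placeholder rather than data from the paper, and device inductance and resistive loss are ignored.

```python
import math


def loaded_channel(l_per_m, c_per_m, pitch_m, c_device):
    """Return (loaded impedance in ohms, ladder cutoff in Hz) for one-section-per-device model."""
    l_sec = l_per_m * pitch_m                 # series inductance of one section (H)
    c_sec = c_per_m * pitch_m + c_device      # trace plus device shunt capacitance (F)
    z_loaded = math.sqrt(l_sec / c_sec)
    f_cutoff = 1.0 / (math.pi * math.sqrt(l_sec * c_sec))
    return z_loaded, f_cutoff


# Hypothetical values, chosen only to make the trend visible.
L_PER_M = 400e-9      # trace inductance (H/m)
C_PER_M = 100e-12     # trace capacitance (F/m)
PITCH_M = 10e-3       # spacing between devices on the channel (m)
C_DEVICE = 2.0e-12    # input capacitance of one device (F)

z_unloaded = math.sqrt(L_PER_M / C_PER_M)
z_loaded, f_cutoff = loaded_channel(L_PER_M, C_PER_M, PITCH_M, C_DEVICE)
print(f"unloaded Z0 = {z_unloaded:.1f} ohm, loaded Z = {z_loaded:.1f} ohm")
print(f"ladder cutoff = {f_cutoff / 1e9:.2f} GHz")
```

In this simplified model, adding device capacitance lowers both the effective impedance and the cutoff frequency, which is why device loading and spacing have to be controlled together with the PCB trace geometry.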
Previous interface circuits running at lower speeds did not require any slew-rate control mechanism [4]. As the data rate increases, inductive overshoot and undershoot caused by the I/O slew rate can significantly affect the I/O timing margin. Figure 1 shows a process detector and a three-state controlled pre-driver used to adjust the slew rate of the controlled-current output. The process effect is determined by sampling the pulse width of a one-shot with a fixed-frequency clock. To match the behavior of the open-drain nMOS output driver, the delay element of the one-shot is created by discharging through an nMOS chain, as shown in Figure 2. Two one-shot pulses are tapped off from two different positions of the nMOS delay chain to determine three process windows. Using information from the process detector, the three three-state pre-drivers adjust the output slew rate.
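The way two one-shot taps can define three process windows is sketched below as a behavioral model. The nominal pulse widths, the sampling instant, the `delay_scale` factor, and the mapping from process window to number of enabled pre-drivers are all hypothetical; the paper states only that two pulses determine three windows and that the pre-drivers use this information to adjust the slew rate.

```python
# Hypothetical timing values (ns); not taken from the paper.
EARLY_TAP_NS = 2.0     # nominal one-shot width at the earlier tap of the nMOS chain
LATE_TAP_NS = 3.0      # nominal one-shot width at the later tap
SAMPLE_TIME_NS = 2.5   # instant at which the fixed-frequency clock samples both pulses

# Assumed mapping: faster silicon needs less pre-drive to hold the same slew rate.
ENABLED_PREDRIVERS = {"fast": 1, "typical": 2, "slow": 3}


def sample_taps(delay_scale):
    """Return (early_high, late_high): whether each pulse is still high when sampled.

    delay_scale models process speed: below 1.0 is fast silicon (shorter pulses),
    above 1.0 is slow silicon (longer pulses).
    """
    early_high = EARLY_TAP_NS * delay_scale > SAMPLE_TIME_NS
    late_high = LATE_TAP_NS * delay_scale > SAMPLE_TIME_NS
    return early_high, late_high


def process_window(delay_scale):
    """Decode the two sampled bits into one of three process windows."""
    early_high, late_high = sample_taps(delay_scale)
    if early_high and late_high:
        return "slow"       # both pulses outlast the sample instant
    if late_high:
        return "typical"    # only the longer pulse is still high
    return "fast"           # both pulses have already ended


for scale in (0.7, 1.0, 1.4):
    window = process_window(scale)
    print(scale, window, ENABLED_PREDRIVERS[window])
```

Since the later tap always produces the longer pulse, only three of the four bit combinations can occur, which is how two comparisons yield exactly three windows.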
The I/O circuitry of the megacell takes into account the limitations of the physical packaging and interconnect of the target application or system, and vice versa. Overall channel voltage and timing margins are maximized by this combined approach, which yields the proper set of design trade-offs. Figure 6 shows the data waveform transmitted on the channel.
A three-level distributed clock tree minimizes clock skew between the twenty-six I/O blocks. Each buffer stage is matched to within 10ps of skew across variations in process, Vdd, and temperature. Precise parasitic extraction, which accounts for both lumped and coupling capacitance, is used to better match silicon performance to simulations. Measured results on silicon across various process, Vdd, and temperature corners demonstrate an I/O timing mismatch of less than 30ps.
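To put the skew figures above in proportion, the short calculation below relates them to the bit time at 800Mb/s/pin. The numbers are taken from the text; treating the per-stage bound as adding linearly over the three tree levels is an assumption made here for illustration, and the paper does not say the budget was derived this way.

```python
PIN_RATE_BPS = 800_000_000   # per-pin data rate from the text
TREE_LEVELS = 3              # three-level clock tree
STAGE_SKEW_PS = 10.0         # per-stage matching bound from the text
MEASURED_MISMATCH_PS = 30.0  # upper bound on measured I/O timing mismatch

bit_time_ps = 1e12 / PIN_RATE_BPS                  # duration of one bit on a pin
worst_case_sum_ps = TREE_LEVELS * STAGE_SKEW_PS    # assumed linear accumulation
mismatch_fraction = MEASURED_MISMATCH_PS / bit_time_ps

print(f"bit time = {bit_time_ps:.0f} ps")
print(f"linear worst-case tree skew = {worst_case_sum_ps:.0f} ps")
print(f"measured mismatch bound = {100 * mismatch_fraction:.1f}% of a bit time")
```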
The interface cell is implemented in various CMOS technologies from 0.35µm to 0.18µm. An automated analog and high-speed digital circuit design methodology facilitates porting the megacell to various process technologies. Each megacell spans 74 pad pitches in width. The I/O timing margin at 800+ Mb/s/pin is characterized using a 1.3GHz tester from 1.8V to 3.5V Vdd. Figure 7 is a micrograph of the three-megacell test chip, assembled in a low-cost PBGA package, that achieves >2.6GB/s bandwidth per megacell. The megacell is fully characterized, and with the complete system solution it can be incorporated into any generic digital chip-to-chip communication link. Multiple megacells can be used to build high-bandwidth interfaces for main memory, networking, graphics, consumer electronics, and DSP applications.
The megacell can interface with a single dual-byte-wide DRAM channel, two byte-wide DRAM channels, or an ASIC chip with the same megacell interface. The twenty-six I/Os can be configured differently for each application. For a single dual-byte DRAM, the interface is configured as an 18b data field with an 8b command field. When used with two parallel byte-wide DRAM channels, eleven I/Os are assigned to each of the two DRAM channels, which multiplex data and control information onto the same signal wires. In this configuration, the clock recovery circuit is shared between the two channels for power savings. The megacell can interface with several generations of DRAM and ASIC chips using different termination voltages. The input circuit supports either 1.8V or 2.5V termination voltages.
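The pin allocations and headline rates quoted in the text can be cross-checked with a few lines of arithmetic. The figures below come from the text; the dictionary layout and names such as `CONFIGS` are illustrative only, and the calculation counts raw bits on every pin without distinguishing payload from command or any other overhead.

```python
CLOCK_HZ = 400_000_000       # recovered clock frequency
EDGES_PER_CYCLE = 2          # data is transferred on both clock edges
ASIC_CLOCK_HZ = 100_000_000  # deserialized (f/4) ASIC-side clock
TOTAL_PINS = 26

# Pin allocations described in the text for the two DRAM configurations.
CONFIGS = {
    "single dual-byte DRAM channel": {"data": 18, "command": 8},
    "two byte-wide DRAM channels": {"channel A": 11, "channel B": 11},
}

pin_rate_bps = CLOCK_HZ * EDGES_PER_CYCLE            # 800 Mb/s per pin
raw_bytes_per_s = TOTAL_PINS * pin_rate_bps // 8     # all 26 pins, raw bits
bits_per_pin_per_asic_clock = pin_rate_bps // ASIC_CLOCK_HZ

print(f"per-pin rate: {pin_rate_bps / 1e6:.0f} Mb/s")
print(f"raw interface rate: {raw_bytes_per_s / 1e9:.1f} GB/s")
print(f"bits per pin per ASIC clock: {bits_per_pin_per_asic_clock}")
for name, fields in CONFIGS.items():
    print(f"{name}: {sum(fields.values())} of {TOTAL_PINS} I/Os used")
```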
© IEEE 1998
One example of the trade-offs between I/O circuitry and channel design is the slew rate control scheme described earlier. From a circuit performance standpoint, a maximum slew rate is desirable to guarantee optimal output timing. On the channel, the minimum possible slew rate is desirable so that reflections, signal coupling, and termination-network switching noise are minimized. The slew rate is thus bounded by conflicting requirements.
Acknowledgments: The authors thank T. Randolph, J. Cobrunson, Y-H. Shih, M. McGinty, and D. Olarte for CAD and layout support; H. Lau and G. Ikeda for verification support; and many engineers of Intel Corp., including T. Ryan, E. Huber, S. Hoyle, B. Mehr, J. Delino, and J. Ford, for fabrication and design support.
Figure 1: Data slew rate control schematic.
Figure 2: Process detector.
Figure 3: I/O power comparison chart.
Figure 4: System prototype board.
Figure 5: Channel frequency response.
Figure 6: Channel data waveform at 1.3Gb/s/pin.
References:
[1] Lee, T., et al., "A 2.5V CMOS Delay-Locked Loop for an 18Mbit 500Megabytes/s DRAM," IEEE Journal of Solid-State Circuits, vol. 29, no. 12, pp. 1491-1496, Dec. 1994.
[2] Crisp, R., et al., "Development of Single-Chip Multi-GB/s DRAMs," ISSCC Digest of Technical Papers, pp. 226-227, Feb. 1997.
[3] Griffin, M., et al., "A Process-Independent 800MB/s DRAM Bytewide Interface Featuring Command Interleaving and Concurrent Memory," ISSCC Digest of Technical Papers, Feb. 1998.
[4] Donnelly, K., "A 660MB/s Interface Megacell Portable Circuit in 0.3µm - 0.7µm CMOS ASIC," IEEE Journal of Solid-State Circuits, vol. 31, no. 12, pp. 1995-2003, Dec. 1996.
[5] Monocaya, A., et al., "Bus Design and Analysis at 500MHz and Beyond," Design SuperCon, 1995.
Figure 7: Micrograph.