THE VLSI IMPLEMENTATION OF A BASEBAND RECEIVER FOR DECT-BASED PORTABLE APPLICATIONS*
M. Perakis,1 A.E. Tzimas,2 E.G. Metaxakis,2 D. Soudris,3 G.A. Kalivas,2 C. Katis,4 C. Dre,4 C.E. Goutis,5 A. Thanailakis,3 and T. Stouraitis5
1 Dept. of Comp. Eng. and Informatics, Univ. of Patras, 26110, Greece
2 Applied Electronics Lab, Dept. of Elec. & Comp. Eng., Univ. of Patras, 26110, Greece
3 VLSI Design and Testing Center, Dept. of Elec. & Comp. Eng., Democritus Univ. of Thrace, 67100 Xanthi, Greece, [email protected]
4 INTRACOM S.A., 19,5 km Markopoulou Ave., 190 02, Peania, Greece
5 VLSI Design Lab, Dept. of Elec. & Comp. Eng., Univ. of Patras, 26110, Greece
ABSTRACT
An area- and power-efficient digital VLSI implementation of the baseband part of a DECT demodulator is introduced. Starting from the algorithm level, and after exhaustive architecture-level exploration employing low-power design techniques and transformations, we conclude with the hardware implementation of four optimized algorithms. The proposed DECT receiver will be integrated with the ASPIS processor [7,8], which implements the baseband signal processing of a multi-mode GSM/DECT/DCS-1800 terminal.
1. INTRODUCTION
In the last few years, demand has grown continuously for wireless terminals that integrate sophisticated multi-service applications. Besides their functional characteristics, these terminals should have small size and low power consumption. The latter is driven by the requirement for extended battery life. Since current advances in battery technology are not as promising as the applications require, low-power design of integrated circuits becomes a challenging problem [1,2]. One such wireless multi-service system is based on the DECT standard [3], a sophisticated platform able to support applications such as voice and data communications for geographically confined indoor and outdoor areas. DECT is specified to operate in the 1.9 GHz frequency band at a rate of 1.152 Mbits/s. The access scheme is TDMA with 24 slots, and the modulation technique is GFSK with modulation index h=0.5.
In this paper, a low-power VLSI implementation of a number of optimal wireless algorithms is introduced. The development, analysis, and simulation of these algorithms were reported in [4]. The direct conversion architecture is employed in this work. This approach is considered one of the most promising [5, 6], as it minimizes power consumption and reduces front-end complexity. With this technique, the Intermediate Frequency (IF) down-converting stages are eliminated, which favors digital detection schemes based on In-phase (I) and Quadrature (Q) channel demodulation. As a result, the baseband part of the receiver is processed and implemented digitally. For the sake of completeness, the architecture of the zero-IF receiver, from the front end to the baseband, is illustrated in Fig. 1.
[Figure 1 (block diagram). FRONT END: preselect filter, LNA, power splitter. ZERO-IF CONVERTER: synthesizer, LO with 0 and 90 degree outputs, two mixers, two LPFs, two A/D converters. BASEBAND RECEIVER FUNCTIONS: two FIR filters, differential detection, frequency offset correction, symbol detector, slot synchronisation and symbol timing estimation.]
Fig. 1. The complete zero-IF receiver architecture.
Compared with other DECT implementations, the system developed in this work combines a powerful detection algorithm with an optimized detection path in which symbol detection, frequency correction, and synchronization are all achieved with an efficient digital block design.
* This work is partially supported by the project LPGD 25256 ESPRIT IV.
The implemented DECT receiver will be incorporated into the ASPIS processor [7, 8], which will implement the entire baseband signal processing of a multi-mode GSM/DECT/DCS-1800 terminal.
2. GENERAL DESCRIPTION OF THE DIGITAL DEMODULATOR
The receiver uses the direct conversion architecture, which translates the RF signal directly to baseband in-phase and quadrature signals and thus favours digital processing. To reduce circuit complexity, the receiver exploits incoherent differential data detection. In more detail, the digital DECT receiver consists of four blocks, as shown in Fig. 2. The Phase Difference Detector (PDD) uses the In-phase (I) and Quadrature (Q) components of the received baseband signal to calculate the phase difference between two consecutive symbols. The Automatic Frequency Correction Block (AFC) uses a feed-forward technique to compensate for the frequency drifts between the local oscillators of the transmitter and receiver. The Symbol Detector Block (SDB) translates the corrected phase difference to a positive, negative, or zero transition, from which a Finite State Machine decides the transmitted symbol. The Slot Synchronization and Symbol Timing Estimation Block (SS&STE) achieves slot synchronization and proper timing, so that the signal is sampled at the best possible instant.
[Figure 2 (block diagram). Inputs I and Q; blocks: phase differential detector, automatic frequency correction, symbol detector (output: bit stream), symbol timing estimation and slot synchronization.]
Fig. 2. The functional structure of the proposed DECT receiver.
The proposed system accepts at its input a quantized, 4x oversampled IQ stream: on each clock cycle it receives a pair (I, Q) of 6-bit vectors in sign-magnitude form. Processing this stream yields, at the circuit output, the bit stream of the data section contained in a DECT slot. Every DECT slot has a 32-bit header, consisting of a 16-bit preamble and a 16-bit fixed sync word, followed by a data section of 392 bits and a guard space of 56 bits.
3. PHASE DIFFERENCE DETECTOR
The phase difference detector (PDD) calculates the phase difference between two consecutive symbols using a modified arc tangent function. The phase difference is represented as a fixed-point, 8-bit word in two's complement format, with values in the range [-pi, pi). The functional diagram of the PDD is shown in Fig. 3.
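As a behavioural reference for the quantity the PDD produces, the following minimal sketch computes the phase rotation between two (I, Q) samples and quantizes it to an 8-bit two's complement code covering [-pi, pi). It uses a floating-point arc tangent rather than the hardware look-up table, and the function names and the rounding rule are assumptions made for illustration only.

```python
import math

def quantize_phase(phi: float) -> int:
    """Map an angle in radians to an 8-bit signed code; 128 codes span pi."""
    code = round(phi / math.pi * 128)
    # Wrap into [-128, 127]: +pi aliases to -pi, matching the half-open range.
    return (code + 128) % 256 - 128

def phase_difference(prev_iq, curr_iq) -> int:
    """Phase rotation from prev_iq to curr_iq as an 8-bit signed code.

    Each argument is an (I, Q) pair of signed integers, e.g. in -31..31
    for 6-bit sign-magnitude samples.
    """
    i0, q0 = prev_iq
    i1, q1 = curr_iq
    delta = math.atan2(q1, i1) - math.atan2(q0, i0)
    return quantize_phase(delta)
```

With this scaling, a rotation of +pi/2 maps to code 64 and a rotation of -pi/2 to code -64.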
[Figure 3 (block diagram). Inputs I and Q; blocks: sign stripping, arctan() look-up table, modulo pi correction, sign mapper.]
Fig. 3. The block diagram of the Phase Differential Detector.
The most critical component is the implementation of the arctan function. Because the input streams have a small word length, a look-up table (LUT) implementation is chosen. A direct approach would give a LUT of (2^6 × 2^6) × 8 = 32 Kbits. The properties of the trigonometric functions are used to reduce the LUT size. By ignoring the signs of I and Q, only 64 of the 256 angles are stored in the LUT, i.e. the angles that belong to the first quadrant. Since the signs are no longer used to address the LUT, its size becomes (2^5 × 2^5) × 6 = 6 Kbits. The signs of I and Q are used after the LUT access to place the angle in the correct quadrant. Further reduction of the LUT size was achieved by exploiting the following property of the arctan function: If one stores only the arctan values for angles corresponding to Cartesian co-ordinates where Q
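The LUT sizes quoted above, and the idea of storing only first-quadrant angles and restoring the quadrant from the signs, can be illustrated with the following minimal sketch. It is not the circuit's sign mapper or modulo-pi correction: the quadrant mapping rule, the clamping of the code for pi/2 so that it fits in 6 bits, and all names are assumptions made for illustration.

```python
import math

def lut_bits(i_addr_bits: int, q_addr_bits: int, word_bits: int) -> int:
    """Storage, in bits, of a table addressed jointly by I and Q."""
    return (1 << i_addr_bits) * (1 << q_addr_bits) * word_bits

def build_first_quadrant_lut(mag_bits: int = 5):
    """table[|I|][|Q|] -> 6-bit code; 64 codes span the first quadrant."""
    size = 1 << mag_bits
    table = [[0] * size for _ in range(size)]
    for i_mag in range(size):
        for q_mag in range(size):
            code = round(math.atan2(q_mag, i_mag) / (math.pi / 2) * 64)
            # Assumption: the code for exactly pi/2 (64) is clamped to 63.
            table[i_mag][q_mag] = min(code, 63)
    return table

LUT = build_first_quadrant_lut()

def folded_phase(i: int, q: int) -> int:
    """Angle of one (I, Q) sample as an 8-bit signed code, via the folded LUT."""
    a = LUT[abs(i)][abs(q)]          # signs stripped before addressing
    if i >= 0 and q >= 0:
        theta = a                    # first quadrant
    elif i < 0 and q >= 0:
        theta = 128 - a              # second quadrant: pi - a
    elif i < 0:
        theta = a - 128              # third quadrant: a - pi
    else:
        theta = -a                   # fourth quadrant
    return (theta + 128) % 256 - 128 # wrap into [-128, 127]

print(lut_bits(6, 6, 8), lut_bits(5, 5, 6))
```

In this sketch the folded table differs from a directly quantized arc tangent by at most one code, the worst case being the clamped entries on the Q axis.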
[Figure 5 (block diagram). Recoverable labels: bound 1, phase to binary number mapping, FSM.]
Fig. 5. The architecture of the Symbol Detector Block.
6. SLOT SYNCHRONIZATION AND SYMBOL TIMING ESTIMATION (SS&STE)
The DECT standard specifies that each data packet starts with a synchronization field that should be used for clock and packet synchronization of the radio link [3]. Due to the lack of synchronization between the transmitter and receiver clocks, the I and Q streams are oversampled by a factor of 4, in order to eliminate the possibility of sampling between symbols. The receiver correlates the fixed synchronization field with samples of the 4 estimated sequences, spaced T seconds apart, to achieve synchronization. This is an attractive approach due to its inherent simplicity. The optimum sampling time is the one that corresponds to the sequence that minimizes the expression:
$$ C_i = \sum_{n=0}^{31} p_n \otimes b_{n,i}, \qquad 0 \le i \le 3 $$
where ⊗ denotes the XOR operation, $p_n$ is the n-th symbol of the synchronization field, and $b_{n,i}$ is the n-th symbol of the i-th sequence. The functionality of the SS&STE block can be outlined as the following two steps: i) in each of the four time-multiplexed bit streams (due to oversampling), which derive from the four corresponding IQ streams, detect the preamble and sync word (the start of a DECT slot), allowing a user-defined maximum number of errors, and ii) among the bit streams that meet the above criterion, select the one with the smallest number of errors. The architecture of the SS&STE block is shown in Fig. 6. For a bit stream to be eligible for selection, the 16-bit sync word should be detected with no more than T1 errors, and the last 4 bits of the preamble should be detected with no more than T2 errors. Once these restrictions are satisfied for one bit stream, the circuit stores the total number of errors in the whole 32-bit header. If, in the next 3 clock cycles, another bit stream satisfies these restrictions with a lower total error count, that bit stream is selected as the optimum. This bit stream is most probably the one sampled closest to the optimal time point, so correct symbol timing estimation is achieved along with correct slot detection.
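The eligibility test and the minimum-error selection described above can be sketched in software as follows. This is a behavioural illustration, not the pipelined circuit: the 32-bit reference header AAAAE98A is read off the register contents shown in Fig. 6, while the bit order (oldest received bit in the most significant position), the function names, and the representation of each stream as a 32-bit integer window are assumptions.

```python
HEADER = 0xAAAAE98A  # 16-bit preamble 0xAAAA followed by 16-bit sync word 0xE98A

def popcount(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")

def header_errors(window: int):
    """Return (total, sync, preamble_tail) error counts for a 32-bit window.

    Assumed bit order: the oldest received bit sits in the MSB, so the sync
    word occupies bits 15..0 and the last 4 preamble bits occupy bits 19..16.
    """
    diff = (window ^ HEADER) & 0xFFFFFFFF
    total = popcount(diff)                    # C_i: XOR sum over all 32 bits
    sync_err = popcount(diff & 0xFFFF)        # compared against T1
    tail_err = popcount((diff >> 16) & 0xF)   # compared against T2
    return total, sync_err, tail_err

def select_stream(windows, t1: int, t2: int):
    """Index of the eligible stream with the fewest header errors, or None.

    A later stream replaces an earlier one only if its total is strictly lower.
    """
    best = None
    for idx, window in enumerate(windows):
        total, sync_err, tail_err = header_errors(window)
        if sync_err <= t1 and tail_err <= t2:
            if best is None or total < best[0]:
                best = (total, idx)
    return None if best is None else best[1]
```

For example, with `windows = [HEADER ^ 0x3, HEADER ^ 0x1, HEADER, HEADER ^ 0x00100000]` and thresholds `t1=2`, `t2=1`, all four streams are eligible and the error-free third stream (index 2) is selected.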
[Figure 6 (block diagram). Recoverable labels: input stream; 128-bit shift register with taps at positions 0, 4, 8, 12, ..., 127; 32 XOR gates taking 32 bits from the 128-bit shift register and 32 bits from the 32-bit "preamble" shift register holding AAAAE98A; pipeline weight control; 32-bit, 16-bit, and 4-bit comparisons; comparators comp1 and comp2 with external threshold signals thr1 and thr2; comp min{}; latches; counter with enable; output to sample buffer and selector.]
Fig. 6. The architecture of the Slot Synchronization & Symbol Timing Estimation subsystem.
Since every incoming bit defines a new probable 32-bit slot header, the above computations must be executed at a rate equal to the clock frequency until the slot is successfully detected. To cope with this requirement, a 32-bit error counter, a 16-bit T1-order comparator, and a 4-bit T2-order comparator (T1 and T2 user-defined) were implemented together in a highly pipelined tree of addition and comparison circuitry, thus achieving the minimum latency of one symbol. Once the bit stream is chosen, the SS&STE block ceases operation, and all other computations (i.e. phase difference, frequency correction, etc.) are performed only on the corresponding IQ stream.
7. VLSI IMPLEMENTATION
The DECT receiver system was implemented in behavioral, synthesizable VHDL, and compiled and simulated using the Mentor Graphics tools. Logic synthesis using the ES2 0.7-micron technology library, with low effort on area optimization, yielded a circuit with fewer than 5000 equivalent gates. Since the transmission rate of the DECT standard is 1.152 Mbits/s and the selected oversampling factor is N=4, the clock frequency is 4 × 1.152 = 4.608 MHz.
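The clock figure above, together with the slot layout given in Section 2, fixes the timing budget of the circuit. The short calculation below is only a worked restatement of those numbers; the variable names are illustrative, and the frame length assumes that each of the 24 TDMA slots has the 480-bit layout described earlier.

```python
BIT_RATE_HZ = 1_152_000     # DECT transmission rate
OVERSAMPLING = 4            # samples per symbol (N)
SLOT_BITS = 32 + 392 + 56   # header + data section + guard space
SLOTS_PER_FRAME = 24        # TDMA slots

clock_hz = BIT_RATE_HZ * OVERSAMPLING
clocks_per_slot = SLOT_BITS * OVERSAMPLING
slot_duration_us = SLOT_BITS / BIT_RATE_HZ * 1e6
frame_duration_ms = SLOTS_PER_FRAME * SLOT_BITS / BIT_RATE_HZ * 1e3

print(f"clock: {clock_hz / 1e6:.3f} MHz")
print(f"clock cycles per slot: {clocks_per_slot}")
print(f"slot duration: {slot_duration_us:.1f} us")
print(f"frame duration: {frame_duration_ms:.1f} ms")
```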
References
[1] A. Chandrakasan, S. Sheng, and R.W. Brodersen, "Low Power Digital CMOS Design", IEEE Journal of Solid State Circuits, pp. 473-484, April 1992.
[2] M. Pedram, "Power Minimization in IC Design: Principles and Applications", ACM Trans. on Design Automation of Electronic Systems, pp. 3-56, Jan. 1996.
[3] ETSI, "DECT Specification Part 2: Physical Layer", ETS 300 175-2, July 1995.
[4] E. Metaxakis, A. Tzimas, G. Kalivas, "A Low Complexity Baseband Receiver for Direct Conversion DECT-based Portable Communication Systems", IEEE 1998 Int. Conf. in Univ. Personal Communications, October 1998.
[5] G. Schultes, P. Kreuzgruber, A. Scholtz, "DECT Transceiver Architectures: Superheterodyne or Direct Conversion?", in Proc. of the IEEE 43rd Vehicular Technology Conference, May 1993, pp. 953-956.
[6] J. Sevenhans, D. Haspeslagh, A. Delarbre, L. Kiss, J. Kukielka, "An Integrated Si Bipolar RF Transceiver for a Zero IF 900 MHz GSM Digital Mobile Radio Single Chip RF Up and RF Down Converter of a Hand Portable Phone", in Proc. of the IEEE Int. Conf. Solid-State Circuits, 1994.
[7] D. Moolenaar, "Building Blocks Preliminary Design Report (Processor Core)", ASPIS project deliverable, EP20287/IMEC/VSDM/DS/D4.2.R1/a 1, Version 2, 20 May 1997.
[8] H.C. Karathanasis, V. Bella, S. Blionas, D. Dervenis, C. Dre, "Market Input Report for Multimode Terminal Specifications", ASPIS project deliverable, P20287/ICOM/DPD/DS/D6.1.R1/a1, Version 4, 15 June 1996.