Rapid Prototyping of Wireless Communications Systems

RICE UNIVERSITY Rapid Prototyping of Wireless Communications Systems by Bryan A. Jones A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree MASTER OF SCIENCE Approved, Thesis Committee:

Joseph R. Cavallaro, Chair Associate Professor of Electrical and Computer Engineering

Behnaam Aazhang J. S. Abercrombie Professor of Electrical and Computer Engineering

Don H. Johnson J. S. Abercrombie Professor of Electrical and Computer Engineering and of Statistics

Houston, Texas May, 2002

ABSTRACT

Rapid Prototyping of Wireless Communications Systems by Bryan A. Jones

This thesis introduces a rapid prototyping methodology that overcomes important barriers in the design and implementation of digital signal processing (DSP) algorithms and systems on embedded hardware platforms, such as cellular phones. This thesis describes rapid prototyping in terms of a simulation/prototype bridge and in terms of appropriate language design. The simulation/prototype bridge combines the strengths of simulation and of prototyping, allowing the designer to develop and evaluate next-generation communications systems partly in simulation on a host computer, and partly as a prototype on embedded hardware. Appropriate language design allows engineers to express a communications system as a block diagram, in which each block represents an algorithm specified by a set of equations. Software tools developed for this thesis implement both concepts, and have been successfully used in the development of a type-based detector for a code-division multiple access (CDMA) cellular wireless communications system.

Acknowledgments

I am very thankful to Dr. Cavallaro for his encouragement, constructive comments, and support in pursuing this work. Working with Dr. Aazhang continues to be a pleasure; his support for prototyping the algorithms developed in his group enables and encourages the work in this thesis. I’m also thankful to Dr. Johnson for introducing me to type-based theory, and for the fun of talking with a fellow cyclist. The RENE team at Rice is excellent, and a joy to work with. Frank, Vikram, Sridhar, and many others have put up with many bugs and provided excellent feedback.

I thank Louis Belanger of Lyr Signal Processing for his generous donation of equipment to my research group, and for his company’s enthusiastic support of this work. Nokia, Texas Instruments, and the Texas Advanced Technology Program provide funding under grant 1999-003604-080 enabling this work, for which I am very grateful. I’m also thankful for NSF support under grant ANI-9979465.

I’d like to thank my parents for their encouragement to leave a comfortable job at Compaq and pursue graduate studies. I am blessed by their comfort in hard times, by their friendship, and by their wisdom in daily living. Finally, I thank the Lord for the joy in the journey he’s planned for my life. “Embracing what God does for you is the best thing you can do for him” (Romans 12:1, The Message).

Contents

Abstract
Acknowledgments
List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Introduction
  1.3 Contributions

2 Background
  2.1 Rapid Prototyping
    2.1.1 Current Design Methodology
    2.1.2 Relevant Work
    2.1.3 Applications
  2.2 Type-Based Detection
    2.2.1 Difficulties with Gaussian Approximation
    2.2.2 Related Work in Type-Based Detection
  2.3 Summary

3 Simulation/Prototype Bridge
  3.1 Methodology
    3.1.1 Label Determination in Hierarchical Block Diagrams
    3.1.2 Identifying Communication Edges
    3.1.3 Partitioning of the Labeled Block Diagram
    3.1.4 Implementation of the Simulation/Prototype Bridge
    3.1.5 Compilation and Execution of the Partitioned Block Diagram
  3.2 Conclusions

4 Appropriate Language Design
  4.1 Methodology
    4.1.1 Features Requiring Mapping
    4.1.2 Mapping Between Simulink and C/Matlab
    4.1.3 Translation between C or Matlab and Simulink
  4.2 Implementation
    4.2.1 Feature Specification Using a GUI
    4.2.2 Name-Based Mapping
    4.2.3 Insertion of Translation Code
  4.3 Conclusions

5 Type-Based Detection
  5.1 Methodology
    5.1.1 Channel Model
    5.1.2 Type Estimation
    5.1.3 Detection
    5.1.4 Variables Used
  5.2 Experimental Setup
  5.3 Results
    5.3.1 Gaussian noise model
    5.3.2 Laplacian noise model
  5.4 Conclusions and Future Work

6 Conclusions and Future work
  6.1 Conclusions
  6.2 Future work

A Switcher

B Wrapper

C Testbed
  C.1 First-time Use of the Testbed
  C.2 Detector Choice, Uplink and Downlink Support
  C.3 Additional Noise Models
  C.4 Source Code Control

References

List of Figures

1.1 A rough sketch of a communications system
1.2 A refined sketch of the system
1.3 A fully-developed system, ready for simulation or prototyping
1.4 A sketch of the system executing partly on the host, and partly on the prototype
2.1 The block diagram of a multiuser receiver
2.2 Multiuser receiver block diagram and its equivalent in Matlab
2.3 Control system block diagram with a feedback loop
2.4 FIR filter block diagram containing feedback loops
2.5 Block diagram of a multiuser receiver and its Simulink equivalent
2.6 Channel estimation equations and their Simulink equivalent
2.7 Rate of convergence of the Central Limit Theorem
3.1 Directed graph of a CDMA baseband receiver
3.2 Directed graph of a CDMA baseband receiver, after partitioning
3.3 Receiver graph, after partitioning and communications link insertion
3.4 Hierarchical block diagram
3.5 Host and prototype partitions of the original block diagram
3.6 Block diagram of the Lyr SignalMaster
3.7 Lyr SignalMaster’s connection to the host PC
4.1 FIR filter written in Matlab embedded in a Simulink block
5.1 Formation of the received waveform
5.2 Simulations for 100 preamble bits when corrupted by Gaussian noise
5.3 Simulations for 1000 preamble bits corrupted by Gaussian noise
5.4 Simulations for 1000 preamble bits corrupted by Laplacian noise
A.1 Switcher block
A.2 Switcher GUI
A.3 Switcher GUI with model
A.4 Switcher mask dialog box of a labeled block
A.5 The lms host Simulink model
A.6 The lms DSP Simulink model
B.1 Wrapper block
B.2 Wrapper block in new model
B.3 Wrapper mask
B.4 Type-based wrapper block in new model
B.5 Type-based wrapper dialog box
B.6 Type-based wrapper dialog box, external tab
B.7 Type-based wrapper dialog box, external tab with parameters
B.8 Input port dialog box
B.9 Output port dialog box
B.10 Wrapper internals panel
B.11 Wrapper variables panel
B.12 Wrapper results panel
B.13 Type-based wrapper block parameters dialog
C.1 The ubs demo CDMA uplink/downlink testbed
C.2 The testbed parameters dialog box
C.3 Detector selection

List of Tables

2.1 Equations for channel estimation and their equivalent in Matlab
2.2 A matrix multiplication algorithm in C and in Matlab
5.1 Definition of variables used in chapter 5

Chapter 1
Introduction

1.1 Motivation

Increasingly, highly-sophisticated digital signal processing applications fuel the information revolution. Space-time codes, channel equalization, and source coding are founded on complicated systems of equations and are frequently interconnected with additional signal processing algorithms. However, many of these concepts prove difficult to implement in products. For example, the 3rd generation (3G) standard for cell phones was developed in the mid-1990s, but still awaits deployment. This thesis provides digital signal processing (DSP) engineers with improved tools to implement these complex communications systems.

1.2 Introduction

The design cycle of a new DSP application begins at the napkin stage, as a rough sketch of a block diagram, as in Figure 1.1. Next, the design is refined by choosing algorithms that specify the functionality of each block. Each algorithm is further developed by deriving a set of equations to implement the algorithm. For example, choosing a finite impulse response (FIR) filter for the filter block in Figure 1.1 results in the equation $\mathrm{out} = \sum_i \mathrm{in}_i \cdot \mathrm{coeff}_i$. Figure 1.2 shows the communications system formed by labeling each block in the block diagram with the equations representing the chosen algorithms.

Figure 1.1: A rough sketch of the block diagram of a communications system. [Blocks: Acquire samples → Filter → Output data.]

Figure 1.2: A refined sketch of the system, in which equations specify the functionality of each block. [Blocks: Acquire samples → FIR Filter, annotated with $\mathrm{out} = \sum_i \mathrm{in}_i \cdot \mathrm{coeff}_i$ → Output data.]

Finally, the design can be simulated on a host workstation and prototyped on embedded hardware. Figure 1.3 illustrates these possibilities. Input data may be generated by simulation, or by acquiring the data from sensors on the embedded hardware prototype and digitizing it using an analog-to-digital (A/D) converter. The resulting data may be processed by the FIR filter on the host or on the embedded hardware prototype. Filtered data may be output using a digital-to-analog (D/A) converter on the hardware prototype connected to an output device, such as a speaker or radio frequency (RF) transmitter. Alternatively, it may be analyzed and performance characteristics plotted on the host workstation. When the bottom elements of all three blocks in Figure 1.3 are chosen and validated, the prototype is finished, and ready for encapsulation in a cellular phone, personal digital assistant (PDA), or other wireless device.

Figure 1.3: A fully-developed system, prepared for simulation on a host workstation to check the system’s performance and correctness, for execution on embedded hardware to validate the system’s real-world characteristics, or for a combination of both to better analyze the performance of the system. [Each block offers two alternatives. Acquire samples: simulate input data OR sensor and A/D converter. FIR Filter, $\mathrm{out} = \sum_i \mathrm{in}_i \cdot \mathrm{coeff}_i$: executing on host OR executing on embedded hardware. Output data: analyze and plot results OR D/A converter and output device.]
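
As a concrete illustration of the FIR equation used in Figures 1.2 and 1.3, the following Python sketch evaluates out = Σ in_i · coeff_i once per incoming sample. The function name `fir_filter`, the assumption that samples before the start of the stream are zero, and the example values are all illustrative choices and do not come from the software developed for this thesis.

```python
from collections import deque


def fir_filter(samples, coeffs):
    """Return one output per input sample: out = sum_i in_i * coeff_i.

    in_0 is the newest sample and in_i the sample received i steps earlier.
    Samples before the start of the stream are assumed to be zero.
    """
    history = deque([0.0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for sample in samples:
        history.appendleft(sample)  # newest sample becomes in_0; oldest is dropped
        outputs.append(sum(x * c for x, c in zip(history, coeffs)))
    return outputs


if __name__ == "__main__":
    # A two-tap moving average applied to a short ramp.
    print(fir_filter([1.0, 2.0, 3.0, 4.0], [0.5, 0.5]))
```

The same function could stand in for the "FIR Filter" block whether the samples are simulated or acquired from hardware, which is the flexibility Figure 1.3 describes.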

This thesis draws two important observations from the design cycle. First, the design takes place in two distinct locations. Because a cellular phone or PDA must be small and lightweight, its prototype by design contains minimal hardware: a power-efficient DSP and a small display. In contrast, the system used to design the prototype is usually a powerful workstation, with a mouse, keyboard, video display, and large amounts of storage. Second, the design is specified using several languages: an equation description language, a block diagram language, and code in the C language, running on the DSP in the prototype.

1.3 Contributions

This thesis discusses two contributions that enable and improve the rapid prototyping of communications systems and presents an improved type-based detection algorithm, developed using rapid prototyping techniques. First, the simulation/prototype bridge unites a simulation with a hardware prototype, providing communication system designers with the combined benefits of both approaches. Portions of the design used to generate data and analyze results can be executed on a host computer, while time-critical blocks execute on a hardware prototype. Figure 1.4 shows a block diagram in which the prototype samples and filters data, then sends it to the host for analysis. Note the addition of a communications link, automatically inserted by the simulation/prototype bridge, which connects the prototype to the host. Second, the use of appropriate language design allows the engineer to express each subsystem in a communications system using the language best suited for that subsystem. For example, the designer may use Simulink to draw block diagrams, and Matlab to implement equations for each block in the block diagram, as illustrated by the “FIR Filter” block in Figure 1.2. Finally, the type-based detector presented in this thesis extends previous work in the field of type-based detection, providing a detector suitable to a wide variety of noise environments.

Figure 1.4: A sketch of the system, showing some blocks executing on the prototype hardware, while others execute on the host workstation. Note the addition of a communications link, automatically inserted by the simulation/prototype bridge. [Blocks: Sensor, A/D converter, FIR Filter with $\mathrm{out} = \sum_i \mathrm{in}_i \cdot \mathrm{coeff}_i$, and Analyze and plot results, divided between the prototype hardware and the host workstation and joined by a communications link.]

Chapter 2 discusses previous work in rapid prototyping and in type-based detection. Chapter 3 presents the simulation/prototype bridge operation pictured in Figure 1.4, and Appendix A demonstrates the use of programs implementing the bridge. Chapter 4 details appropriate language design concepts illustrated by the “FIR Filter” block in Figure 1.2, while Appendix B demonstrates an implementation of these concepts. The type-based detector is presented in Chapter 5; Appendix C shows the simulation testbed used to evaluate the detector’s performance.
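
The idea behind the automatically inserted communications link can be sketched in Python as follows. The sketch assumes that every block carries a label naming where it executes; any connection whose two ends carry different labels crosses the host/prototype boundary, so a link block is spliced into it. The labels, the block names, and the helper `insert_links` are hypothetical and only illustrate the concept; the procedure actually used by the bridge is the subject of Chapter 3.

```python
def insert_links(labels, edges):
    """Splice a link block into every edge that crosses partitions.

    labels maps block name -> "host" or "prototype" (hypothetical labels).
    edges is a list of (source, destination) block-name pairs.
    Returns the new edge list; link blocks are named "link:<src>-><dst>".
    """
    result = []
    for src, dst in edges:
        if labels[src] == labels[dst]:
            result.append((src, dst))  # both ends run in the same place
        else:
            link = f"link:{src}->{dst}"
            result.append((src, link))  # sending half of the connection
            result.append((link, dst))  # receiving half of the connection
    return result


if __name__ == "__main__":
    # Placement follows the text: the prototype samples and filters,
    # and the host analyzes the results.
    labels = {
        "Sensor": "prototype",
        "A/D converter": "prototype",
        "FIR Filter": "prototype",
        "Analyze and plot results": "host",
    }
    edges = [
        ("Sensor", "A/D converter"),
        ("A/D converter", "FIR Filter"),
        ("FIR Filter", "Analyze and plot results"),
    ]
    for edge in insert_links(labels, edges):
        print(edge)
```

Changing a single label, for example moving "FIR Filter" to the host, moves the link to a different edge without any other change to the diagram description.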

Chapter 2
Background

This chapter reviews research relevant to the work presented in this thesis. First, Section 2.1 reviews varying approaches to the rapid prototyping of communications systems. Next, Section 2.2 discusses previous work in type-based detection. Finally, Section 2.3 summarizes and concludes the chapter.

2.1 Rapid Prototyping

Next-generation communications systems promise to deliver a wide variety of new features, such as improved battery life, smaller size, full-motion video, and high-bandwidth Internet connections. Inherent in the design of any such system is the development and integration of several computationally-intensive algorithms which enable these new features. Two problems hinder designers of these systems. First, block diagrams and equations compose typical communications systems; however, prototype hardware must be programmed in C or assembly, an awkward and error-prone means to implement block diagrams and equations. Second, designers develop simulations which execute on a host, while other engineers create hardware prototypes. However, the host and prototype platforms remain isolated from each other; the simulator’s power cannot be combined with the real-time constraints of the prototype. The following paragraphs detail the design of a next-generation communications system, then discuss the language/platform problem arising from this design process.

In the design process, the designer must integrate algorithms into a communications system executing on prototype hardware. Each algorithm, usually expressed as a set of equations, must be tested to ensure it operates as expected. The interconnection of each algorithm block in the block diagram of a communications system must then be checked to verify that each block is properly connected and operates correctly with its neighboring blocks. Finally, the resulting block diagram of the system must be translated into a program suitable for execution on the prototype hardware.

An examination of the design process reveals the three languages and two platforms inherent in any communications system design. For algorithm creation, the designer uses an equation-oriented language. In contrast, a language for block diagram description and simulation is preferable for system design. Finally, many prototyping systems utilize a Digital Signal Processor (DSP) to process digital data. DSPs require the use of either assembly language or the C programming language to generate an executable. The design process also reveals two platforms inherent in system design, because it takes place on both the host and the prototype. Initial design entry and simulation are done on the host; final testing of the design is performed on the prototype hardware.

For example, Figure 2.1 shows a block diagram for a multiuser receiver, which is part of a base station in a next-generation cellular phone network. Each block is
annotated with equations, which specify the algorithm implemented by that block. Simulation is first used to verify the correct operation of each block and then of the entire system. Finally, the system is translated into C or an HDL and compiled to run on the DSP or FPGA at the heart of the base station prototype.

Figure 2.1: A multiuser receiver, represented as a block diagram in which each block contains an algorithm specified by a set of equations. [Blocks: Antenna, Chip-matched filter, Channel estimator, Multiuser detector, Detected bits. Channel estimator: $A^{(i)} = A^{(i-1)} - \mu (A^{(i-1)} \ast R_{bb}^{(i)} - R_{br}^{(i)})$, $R_{bb}^{(i)} = R_{bb}^{(i-1)} + b_L \ast b_L^T - b_0 \ast b_0^T$, $R_{br}^{(i)} = R_{br}^{(i-1)} + b_L \ast r_L^H - b_0 \ast r_0^H$. Multiuser detector: $y = A^T r$, $d = \mathrm{sign}(y)$, $L = A_1^H A_0$, $C = A_0^H A_0 + A_1^H A_1 - \mathrm{diag}(A_0^H A_0 + A_1^H A_1)$, $y_i^{(l)} = y_i^{(0)} - L d_{i-1}^{(l)} - C d_i^{(l)} - L^H d_{i+1}^{(l)}$.]

Unfortunately, the languages and design tools available today are largely incompatible with each other and are usually unable to execute both on the host and on the DSP. Matlab, Simulink, C and hardware description languages interoperate poorly, and run either only on the host or only on the prototype. Algorithm designers prefer a powerful programming language such as Matlab which is tailored to the description of
equations. Algorithms written in Matlab, however, cannot directly execute on a DSP, though there are several promising papers in this area [1, 2]. Communications system designers prefer a block diagram entry and simulation package such as Simulink. Like Matlab, Simulink runs only on the host; its ability to integrate Matlab into algorithm blocks is very poor. C code written for the DSP typically uses DSP-only libraries, preventing it from executing on the host. Integrating C code with Matlab or with Simulink is a difficult task and requires knowledge of the Matlab C-MEX interface [3] or the Simulink S-function interface [4].

The monolingual and uni-location nature of today’s languages and tools limits the complexity of achievable designs. First, they restrict a designer to the use of only one language for the entire design, even when an alternate language would be preferable for parts of the design. For example, designing a communications system in Matlab requires that all block diagrams be expressed as text in Matlab’s programming language, rather than graphically. This text representation obscures the block diagram structure of the system, which also hinders the compiler’s ability to understand and optimize the design. Second, today’s languages and tools force the designer to rewrite the entire design when moving between languages or locations. Moving a system based on algorithms written in Matlab from the host to a prototype requires all Matlab code to be rewritten in C. Finally, modern languages and tools isolate the host-based simulation environment from the DSP-based
execution environment. Real-time data acquired by the prototype hardware cannot be easily passed back to the host for analysis; likewise, simulated data generated on the host cannot be processed on the prototype.

The following sections discuss the application of today’s languages and design tools to the design of communications systems. Matlab, the preferred language for algorithm design, will be evaluated as a tool for communications system design. Next, this thesis evaluates the use of Simulink, a block diagram editor, in communications system design. Following this, the use of C in system design is evaluated. Finally, this section discusses related work performed by other researchers in rapid prototyping.

2.1.1 Current Design Methodology

Each of the languages reviewed is suited to a different part of a communications system design. The C language or an HDL excels at implementation on hardware. Matlab is an excellent language for algorithms; Simulink’s block-oriented nature is well-suited for top-level system design. However, the strengths of these languages cannot be combined by writing different parts of the design in the most appropriate language. Therefore, each of the languages must suffice for describing every subsystem in the entire design. The following sections discuss the exclusive use of each of these languages in system design, and the difficulties of this approach.

Matlab

Matlab [5], a popular tool for DSP engineers, is an excellent candidate for designing algorithms based on a set of equations, which can be entered easily in its text-based language. However, as a text-only language, it lacks the ability to graphically represent block diagrams. In addition, it cannot yet produce efficient code for execution on a hardware prototype.

Matlab’s programming language contains several features making it uniquely suited for algorithm design, as illustrated in Table 2.1. Most importantly, it allows the user to easily enter complex equations, by defining common arithmetic operators such as addition and multiplication for scalars, vectors, and matrices over both real and complex numbers. In addition, Matlab supports a wide range of toolboxes for standard signal processing operations. Its powerful language, excellent debugger, and flexible plotting features allow the user to quickly develop and debug new algorithms. While other packages such as MathCAD [6], Mathematica [7], and Maple [8] offer a graphical equation editor, they lack the language and toolbox support provided by Matlab.

  Equation                                                                      Matlab
  $A^{(i)} = A^{(i-1)} - \mu (A^{(i-1)} \ast R_{bb}^{(i)} - R_{br}^{(i)})$        A = A_prev - mu*(A_prev*Rbb - Rbr);
  $R_{bb}^{(i)} = R_{bb}^{(i-1)} + b_L b_L^T - b_0 b_0^T$                        Rbb = Rbb_prev + b*b' - b0*b0';
  $R_{br}^{(i)} = R_{br}^{(i-1)} + b_L r_L^H - b_0 r_0^H$                        Rbr = Rbr_prev + b*r' - b0*r0';

Table 2.1: Equations for channel estimation (see Figure 2.1) on the left side of the table and their equivalent in Matlab on the right side of the table.

However, Matlab’s programming language is less suited for describing block diagrams. Unlike synchronous dataflow languages like Simulink, time is not defined in Matlab. In Simulink, each block specifies the rate at which input data arrives and output data is ready. The knowledge of these rates and the interconnections between blocks enable Simulink to compute an order in which each block must be run. In contrast, Matlab provides none of these services, which are restrictive when developing new algorithms. Instead, the Matlab programmer must manually schedule blocks, and write additional code which defines a variable to represent time. In addition, Matlab does not define a standardized interface between Matlab functions. In contrast, Simulink’s block interface is standardized, encouraging modularity. Figure 2.2 shows a hand-scheduled block diagram, written in Matlab. As demonstrated by the figure, reducing the block diagram to a Matlab program obscures the overall structure of the design.

Simulink

Block diagram languages such as Simulink [9], the Co-Centric System Studio [10], and the Signal Processing Worksystem (SPW) [11] provide a graphical interface for interconnecting blocks in a block diagram. This ability to interconnect blocks results from the use of a coordination language [12]. Further defining this language using synchronous dataflow semantics [12] brings several additional benefits. First, the resulting block diagrams can be analyzed to correctly support feedback loops.

[Block diagram: Antenna, Chip-matched filter, Channel estimator, Multiuser detector, Detected bits.]

    r = chip_match_filt(ant);
    A = channel_est(r);
    b = multiuser_det(A, r);

Figure 2.2: A block diagram of a multiuser receiver (see Figure 2.1) on top and its equivalent in Matlab on bottom.
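
The call order written by hand in Figure 2.2 is the kind of ordering a dataflow tool derives from the block interconnections. The Python sketch below orders the receiver's blocks with a standard topological sort (Kahn's algorithm). It only illustrates the general idea: it is not Simulink's scheduler, it ignores execution rates, and the function name `schedule` is hypothetical. The block names reuse the Matlab identifiers of Figure 2.2.

```python
def schedule(edges, blocks):
    """Return an order in which every block runs after the blocks feeding it.

    edges is a list of (producer, consumer) pairs; blocks lists every block.
    Raises ValueError if the connections contain a loop, since this simple
    sketch has no notion of delay elements.
    """
    pending = {block: 0 for block in blocks}  # unsatisfied inputs per block
    for _, consumer in edges:
        pending[consumer] += 1

    ready = [block for block in blocks if pending[block] == 0]
    order = []
    while ready:
        block = ready.pop(0)
        order.append(block)
        for producer, consumer in edges:
            if producer == block:
                pending[consumer] -= 1
                if pending[consumer] == 0:
                    ready.append(consumer)

    if len(order) != len(blocks):
        raise ValueError("the connections contain a loop; no order exists")
    return order


if __name__ == "__main__":
    # Blocks are deliberately listed out of order.
    blocks = ["multiuser_det", "channel_est", "chip_match_filt", "ant"]
    edges = [
        ("ant", "chip_match_filt"),
        ("chip_match_filt", "channel_est"),
        ("chip_match_filt", "multiuser_det"),
        ("channel_est", "multiuser_det"),
    ]
    print(schedule(edges, blocks))
```

For this graph the computed order matches the sequence of the three Matlab statements in the figure, with the antenna source first.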

In addition, analysis of these block diagrams enables the determination of a static schedule for the block diagram, which allows the automatic generation of a program implementing this schedule. Although the block diagram nature of these coordination languages makes them highly suitable for system-level design, they are poorly suited for algorithm design.

The following paragraphs first define a coordination language, then discuss the benefits of Simulink, a popular and widely-available coordination language used by the academic community. Next, this thesis discusses the weaknesses of Simulink, particularly when used for algorithm design.

Rather than provide a complete, general-purpose programming language, these coordination languages [12] precisely define the interaction between blocks, allowing them to be interconnected easily. Block diagram languages, for example, require each block to specify the number of inputs and outputs, and the type and amount
of data produced by each input and output. Given this information, coordination languages enforce correct block interconnections by verifying that each input expects the same type and amount of data produced by the output connected to it. However, these coordination languages do not have the ability to perform computations, relying instead on a host language [12] such as C or Matlab to calculate output values given input values.

The creation of feedback loops such as those shown in Figures 2.3 and 2.4 in coordination languages illustrates two related feedback loop problems faced by coordination languages. First, loops containing no time delays, such as the control system pictured in Figure 2.3, must produce algebraically correct outputs. Second, loops containing time delays, such as the small finite impulse response (FIR) filter drawn in Figure 2.4, must escape deadlock conditions to produce outputs. The following two paragraphs address these two problems in order.

Figure 2.3: A typical control system with feedback, which contains no time delays in the feedback loop. [Signals: $U(s)$, $E(s)$, $Y(s)$, $B(s)$. Elements: summing junction $\Sigma$ and the blocks $G(s)$ and $H(s)$.]

Loops containing no time delays represent a set of simultaneous equations to be solved. Following standard control system notation, the two blocks containing G(s)
and H(s) output the product of their input with their contents. Therefore, Figure 2.3 presents the following set of equations: $E(s) = U(s) - B(s)$, $Y(s) = G(s)E(s)$, and $B(s) = H(s)Y(s)$. Substituting the first and third equations into the second gives $Y(s) = G(s)\,(U(s) - H(s)Y(s))$, so that $Y(s) = \frac{G(s)}{1 + G(s)H(s)}\,U(s)$. Therefore, coordination languages must recognize and solve these equations analytically or numerically. Simulink implements both analytical and numerical solution methods, as discussed on pages 3-19 and following of Using Simulink [9].

Figure 2.4: An FIR filter, which contains a unit time delay in the feedback loop. [Signals: $x(k)$, $y(k)$. Elements: summing junction $\Sigma$, unit delay $z^{-1}$, and coefficient $c_0$.]

Loops containing time delays contain potential deadlock problems. In Figure 2.4, $z^{-1}$ represents a unit time delay. Therefore, the output $y(k)$ depends on $x(k)$, the current input, and on $y(k-1)$, the previous output. However, at the beginning of simulation the previous output is undefined, preventing the coordination language from computing the current output. To solve this deadlock, many coordination languages require all time-delay elements such as the $z^{-1}$ block to output an initial value during the first time step. In succeeding time steps, the block outputs a delayed input as usual. Simulink provides this solution, allowing the user to specify the initial value output by all time-delayed blocks.
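
The deadlock-breaking rule can be sketched in a few lines of Python. The sketch assumes that the loop of Figure 2.4 computes y(k) = x(k) + c0 · y(k−1); this reading of the figure's adder and coefficient is an assumption. The delay element outputs a user-chosen initial value on the first step and the stored previous output afterwards. The names `run_loop` and `initial_value` are illustrative and are not Simulink parameters.

```python
def run_loop(inputs, c0, initial_value=0.0):
    """Simulate y(k) = x(k) + c0 * y(k-1) with a unit delay in the feedback path.

    The delay block emits initial_value during the first time step, since no
    previous output exists yet; afterwards it emits the stored previous output.
    """
    delayed = initial_value  # state held by the z^-1 block
    outputs = []
    for x in inputs:
        y = x + c0 * delayed  # the adder reads the delay's current output
        outputs.append(y)
        delayed = y  # the delay stores y(k) for use at step k + 1
    return outputs


if __name__ == "__main__":
    # Unit impulse input: each output is c0 times the previous one.
    print(run_loop([1.0, 0.0, 0.0, 0.0], c0=0.5))
```

Because the delay always has a value to emit, the adder never waits on an output that has not been computed, which is the property the initial value is meant to guarantee.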

As discussed earlier, synchronous dataflow networks require each block to statically define the number of inputs it consumes and the number of outputs it produces. Synchronous dataflow networks also require each block to statically specify the rate at which that block executes. Because these values are static, they do not change during the execution of the block diagram. The static nature of these semantics ensures that a static schedule can be computed, rather than requiring a run-time scheduler to examine each block to determine what blocks are ready for execution. This approach also increases the likelihood that the design can be executed in a bounded amount of memory.

Simulink’s primary host language is C; most of the blocksets provided with Simulink are implemented in C. This allows Simulink’s code generator, the Real-Time Workshop (RTW) [13], to reduce the block diagram into a C program. Matlab can also be used as a host language; however, the connection language for Matlab blocks is much more limited than the connection language available to blocks written in C. A block diagram and the same diagram annotated with the required port and block information, shown in Figure 2.5, illustrate the suitability of Simulink for capturing system-level designs.

Although Simulink is well-suited to system-level design, it is poorly suited to algorithm design. As a coordination language, Simulink cannot describe computations such as algorithms. Instead, it must rely on blocks written in a general-purpose programming language such as C to perform algorithm-level
computations. For example, Figure 2.6 demonstrates the use of matrix-matrix adder blocks and multiplier blocks to define a channel estimation algorithm. These graphical equations bear little resemblance to their textual counterparts, obscuring the intent of the algorithm. In addition, coordination languages such as Simulink impose block-to-block communication overheads which reduce the efficiency of algorithms specified in Simulink.

Figure 2.5: The block diagram of a multiuser receiver (see Figure 2.1) above its equivalent in Simulink. [Blocks shown: antenna, chip-matched filter, multiuser detector, and channel estimator, producing detected bits. The Simulink version annotates the blocks with execution rates of 128 K/s, 4 K/s, and 1 K/s and with port sizes of 32 samples, 10 bits, and 1 estimate per execution.]

The C Programming Language

The C programming language produces the highest-performance executables on DSPs of all the languages reviewed in this thesis. Most modern DSPs provide compilers with excellent support for the C programming language. While writing in assembly

language produces the most efficient executables, C compilers provide most of the efficiency necessary for prototyping while significantly reducing the development time spent on implementing a given algorithm. However, unlike Matlab, the C language poorly supports equation entry. Table 2.2 illustrates the difficulty of performing a matrix-matrix multiply in C. More complex operations involving matrix-vector products, additions, and transposition are quite difficult to develop and debug. Likewise, C’s text-based language is not suitable for the graphical description of block diagrams. The same principle, illustrated using Matlab code in Table 2.1, applies to C when describing block diagrams.

Figure 2.6: One equation for channel estimation (see Figure 2.1), $A^{(i)} = A^{(i-1)} - \mu\left(A^{(i-1)} R_{bb}^{(i)} - R_{br}^{(i)}\right)$, above its equivalent as a block diagram. [The block diagram combines the inputs $R_{bb}$ and $R_{br}$ with a 1-cycle delay, two multipliers, two subtractors, and the gain $\mu$ to produce the output A.]

Matlab:

    A = A_prev - mu*(A_prev*Rbb - Rbr)

C:

    for (i = 0; i < 2*K; ++i)
        for (j = 0; j < N; ++j) {
            A_temp[i][j] = 0;  /* clear the accumulator before summing */
            for (k = 0; k < 2*K; ++k)
                A_temp[i][j] += Rbb[i][k] * A_prev[k][j];
            A_temp[i][j] -= Rbr[i][j];
            A[i][j] = A_prev[i][j] - mu * A_temp[i][j];
        }

Table 2.2: On the top, a single Matlab equation performs a series of matrix-matrix operations. On the bottom, the equivalent C code demonstrates the difficulty of writing these equations in C.
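As a point of comparison between the two columns of Table 2.2, the same update can be sketched in plain Python without a matrix library. The helper names `matmul` and `update_estimate` are illustrative choices for this sketch; it follows the Matlab form of the equation and is not code from the thesis.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [
        [sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def update_estimate(A_prev, Rbb, Rbr, mu):
    """Return A = A_prev - mu * (A_prev * Rbb - Rbr), element by element."""
    product = matmul(A_prev, Rbb)
    return [
        [A_prev[i][j] - mu * (product[i][j] - Rbr[i][j])
         for j in range(len(A_prev[0]))]
        for i in range(len(A_prev))
    ]
```

Even in a higher-level language, the explicit index handling hides the one-line equation, which is the point the table makes about C.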

2.1.2 Relevant Work

Like Simulink, the Ptolemy project [14] provides a coordination language which enables the simulation and prototyping of heterogeneous systems. A team of researchers at the University of California-Berkeley developed this system in the early 1990s; material in this section is based on their work [14]. Ptolemy supports heterogeneous systems by allowing blocks with differing computational models, or domains, to coexist in a single system. For example, a filtering block in a signal processing domain expects a single input, and calculates a single output at a constant rate. In contrast, a queuing block in a networking domain accepts a variable number of inputs and executes only when a downstream block pulls data from its queue.

The Ptolemy project primarily focuses on the development of a coordination language, which is implemented as a set of C++ classes. New computational models may be developed by inheriting from the appropriate base classes, then writing appropriate code for the new model. Unlike the work in this thesis, Ptolemy does not provide a bridge between a simulation executing on the host and a prototype executing on a DSP. In addition, Ptolemy exclusively relies on C++, rather than providing appropriate language design as discussed in this thesis.

2.1.3 Applications

The following concepts developed by this thesis improve the design process for communications systems. First, the specification of a system using languages appropriate for each subsystem of the design improves the robustness, modularity and abstraction of the design. These three attributes create opportunities for extensive optimization. Second, the use of a simulation/prototype bridge combines the real-time, real-world behavior of a prototype with the powerful analytical tools of a simulation environment.

Appropriate language design encourages robust design practices. First, concise descriptions of a concept are possible using a language designed to express the concept. For example, drawing a state machine diagram allows a clearer, more compact description than a large switch statement with many cases in a traditional programming language. Second, a concise description better illustrates the purpose of the design both to the designer, and to other designers planning to use or improve the design. Finally, appropriate language design shortens the development cycle by providing

debugging and analysis tools tailored for the design. The Matlab debugger, for example, allows the user to halt a program and perform complex analysis of the code. Displaying the norm of a matrix, or plotting the Fourier transform of an intermediate result, is simple. Performing the same analysis in C is difficult, if not impossible.

Appropriate language design also encourages modular design practices. Because the language is suited for the design, the designer is able to naturally divide the design into modules. The language’s calling conventions guarantee that each module will have a standardized interface, encouraging re-use. Simulink, for example, divides a design captured as a block diagram into a set of blocks. Designers can easily replace one block, such as a filter, with an improved filter.

Finally, appropriate language design encourages the designer to focus on the design through the use of abstraction. The languages free the designer from unnecessary complexity by providing high-level abstraction for complex operations. For example, the details of a matrix multiply or the mechanics of block scheduling are handled by Matlab and Simulink respectively. In addition, Matlab’s interpreter allows the user to call powerful analysis functions such as fast Fourier transforms (FFTs) during the debug process, while C’s compiled nature prevents such flexibility. As in Artemis [15], the use of Matlab’s high-level features allows design exploration at the algorithm level, before writing architecture-specific C code to efficiently implement each block.

Robust, modular, abstracted language design enables the use of many powerful optimization techniques. Applications of these optimizations to block-diagram languages and to equation-description languages such as Simulink and Matlab are reviewed below.

The separation of tasks into a series of interconnected blocks in a block diagram allows the designer to naturally express parallelism in a design. The design can then be scheduled on a heterogeneous multi-processor system using techniques detailed in [16, 17]. Alternatively, the design can be optimized for a VLIW architecture with performance approaching that of a highly complex superscalar processor using thread-parallel techniques in [18, 19].

By specifying each block as a set of linear equations, optimization techniques specific to linear algebra can be applied. Methods in [20, 21, 22] demonstrate significant performance improvements. In addition, the application of fixed-point techniques in [23] to the equations trades a small decrease in accuracy for a significant performance increase.

The goal of both appropriate language design and a simulation/prototype bridge is the development of next-generation communications systems. The following section demonstrates the utility of these concepts by example. It discusses type-based detection, a novel approach to the detection of received signals corrupted by unknown noise. Given this background, this thesis then develops an improved type-based detector in Chapter 5. The detector is then implemented and tested using concepts and software tools developed for this research.

2.2 Type-Based Detection

The growing interest in cellular communications demonstrates the importance of rapid prototyping in wireless communications systems. The continuing improvement in silicon process technology places increasing amounts of computational power at the disposal of designers. This increased power challenges designers to implement new features for cellular phones, such as increasing the data rate to support video over a wireless connection. However, cellular phones also continue to shrink in size and battery capacity, constraining the power and bandwidth available to transmit high data rate signals. New algorithms which build on previous work in type-based detection, such as those presented in this thesis, enable cellular base stations to correctly distinguish bits transmitted to them in spite of low signal-to-noise ratios.

However, correctly distinguishing data sent by cell phones to the base station is difficult. The base station receives both the signals transmitted by the cell phones, distorted by the channel, and noise from many sources. The signal transmitted by each phone reflects off nearby surfaces, such as buildings and the ground, before arriving at the base station. In addition, distance and intervening obstacles attenuate each echo produced by these reflections by unknown amounts, an effect termed multipath fading. Doppler shift, caused by user mobility, creates unpredictable changes in the frequency of the received signal. Noise sources also influence the signals received by the base

station. Cosmic noise produced in space is picked up by the base station’s antenna. Thermal noise generated in the base station’s receiver corrupts the incoming signal. Finally, interfering transmitters, such as out-of-cell users, add to the noise received.

These unknown multipath fading parameters, unknown Doppler shift, and unknown noise sources combine at the receiver, forming a complex signal distribution. Rather than analytically characterizing this distribution, researchers approach the problem in different ways. One approach is to observe that the noise is a summation of a number of random variables. The Central Limit Theorem states that such summations converge to the Gaussian (normal) distribution, assuming the variance of these random variables is finite. This thesis pursues a second approach, known as type-based detection [24]. First, the base station forms a type by measuring the probability distribution of the received signal and noise. Then, given these types, it classifies incoming signals based on which type they belong to.

2.2.1 Difficulties with Gaussian Approximation

In wireless communications systems, the use of the Central Limit Theorem (CLT) presents some difficulties. Although the Large Deviation Principle proves that the probability of a summation of random variables exceeding the mean of the series converges exponentially fast, communications systems depend on the “tail” behavior of the summation. This section presents Johnson’s analysis of the tail behavior of a summation of random variables [25], proving that these tails converge very slowly. This

slow convergence behavior proves that Gaussian approximations of these summations are inaccurate. Therefore, alternative approaches, such as type-based detection presented in Section 2.2.2, outperform Gaussian detectors in the detection of real-world signals received by wireless communications systems.

The Central Limit Theorem states that, given a sequence of independent, identically distributed, zero-mean random variables $X_i$ with a finite variance $\sigma^2$, the sum $\sum_{i=1}^{N} X_i / \sqrt{N}$ converges in distribution to a normally distributed random variable [26]. That is,

\[ \frac{1}{\sqrt{N}} \sum_{i=1}^{N} X_i \to \mathcal{N}(0, \sigma^2). \tag{2.1} \]

The Large Deviation Principle states that the probability of the sum sequence $X_i$ exceeding its mean value of zero goes to zero exponentially rapidly. Define the sum $S_N$ as

\[ S_N = \sum_{i=1}^{N} \frac{X_i}{N}. \tag{2.2} \]

Then, given a constant $q > 0$, $P[S_N > a]$, the probability that $S_N$ exceeds a given limit $a$, is bounded by a value that decreases exponentially rapidly with q. Specifically,

\[ P[S_N > a] \le e^{N \cdot \inf\left(\log E[e^{qX}] - qa\right)}. \tag{2.3} \]
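As an illustrative special case that is not taken from the thesis, assume the $X_i$ are Gaussian with zero mean and variance $\sigma^2$, that $a > 0$, and that the infimum in (2.3) is taken over $q > 0$. Under those assumptions the exponent can be evaluated in closed form:

```latex
\[
\log E[e^{qX}] = \frac{q^2 \sigma^2}{2},
\qquad
\inf_{q > 0} \left( \frac{q^2 \sigma^2}{2} - qa \right) = -\frac{a^2}{2\sigma^2}
\quad \text{at } q = \frac{a}{\sigma^2},
\]
\[
\text{so that} \quad P[S_N > a] \le e^{-N a^2 / (2 \sigma^2)} .
\]
```

In this special case the bound decays exponentially in the number of terms $N$.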

However, this does not imply that the distribution of the entire summation converges to the normal distribution at the same rate. In fact, just the opposite occurs. The absolute error in this approximation was bounded by Cramér [26]. Let the summation of the variables $F_N$ be

\[ F_N = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} X_i. \tag{2.4} \]

Then the absolute error between $F_N(x)$ and the cumulative distribution function (CDF) $\Phi(x)$ of the normal distribution $\mathcal{N}(0, 1)$ is bounded as

\[ |F_N(x) - \Phi(x)| < C \frac{\rho_X}{\sqrt{N}}. \tag{2.5} \]

The universal constant $C$ is bounded by $0.4097 < C < 0.7975$ [27]. The absolute third moment $\rho_X$ is defined as $\rho_X = E[|X|^3]/\sigma^3$ given X is zero-mean. This condition holds only if $\rho_X < \infty$; if not, the error in the approximation of the Central Limit Theorem is unbounded. Cramér proved that this is a tight bound. Dividing by $\Phi(x)$ gives the relative error for a given distribution as

\[ \frac{|F_N(x) - \Phi(x)|}{\Phi(x)} < C \frac{\rho_X}{\sqrt{N}\,\Phi(x)}. \tag{2.6} \]
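To give a feel for how slowly the tails converge, the bound in (2.6) can be rearranged numerically. The sketch below assumes a target relative error, called `eps` here (a symbol introduced only for this example), and computes the N at which the right-hand side of (2.6) falls to `eps`, namely N = (C ρ_X / (eps Φ(x)))². The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """CDF of the standard normal distribution, Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def required_terms(C, rho, eps, phi_x):
    """N at which the bound C * rho / (sqrt(N) * phi_x) equals eps."""
    return (C * rho / (eps * phi_x)) ** 2


if __name__ == "__main__":
    # Assumed example: Laplacian terms, for which E|X|^3 / sigma^3 = 3 / sqrt(2),
    # a tail point x = -3, a 10% relative error, and the upper limit on C.
    rho_laplace = 3.0 / math.sqrt(2.0)
    print(required_terms(0.7975, rho_laplace, 0.1, normal_cdf(-3.0)))
```

For these assumed values the result is on the order of 10^8 terms, because Φ(x) is small in the tail and appears squared in the denominator.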

Solving for N gives the number of random variables required in the summation to achieve a given relative error: N [...]

5.1.4 Variables Used

Table 5.1 lists the meaning of the variables used in this chapter.

5.2 Experimental Setup

This section presents the experimental setup used to evaluate the performance of the type-based detector developed in the previous section, and to compare its performance with the performance of other detectors. The Simulink-based CDMA uplink/downlink testbed, developed at Rice and extensively rewritten for this thesis, provides the ability to carry out these experiments. The testbed simulates a number of transmitters sending binary phase-shift keyed (BPSK) data over a channel in which noise is added to the transmission. After the channel, a chip-matched filter processes

the corrupted data, then outputs its results to a detector. The testbed then compares the detector’s decisions with the original transmitted data, computing the resulting bit error rate (BER). The testbed supports easily varying the number of users, signal-to-noise ratio (SNR), length of the training sequence, channel model, and type of detector.

This testbed produces +1 and −1 symbols with equal probability. Therefore, the type-based detector used assumes $\eta = 0$. In order to compare the performance of the detector presented in this chapter with other detectors, the testbed uses an additive, linear channel model. Therefore, $D(\cdot) = \sum_{i=1}^{I} u_i(t) + n(t)$ as stated in Section 5.1.1.

5.3 Results

Simulations of the type-based detector were run over a wide variety of conditions. The SNR was varied from 10 dB to 0 dB, in 1 dB increments. A spreading code of 31 chips, using Gold codes, was employed. With this spreading gain, the number of users was varied from 1 to 31. The discrete alphabet size used by the detector varied from 8 letters to 256 letters, in powers of 2. Training sequences of 100 bits and 1000 bits were tested. Gaussian and Laplacian noise models were employed to corrupt data sent over the channel. Simulations include results from a standard matched-filter detector, which assumes a Gaussian noise model. These results allow performance comparisons between the matched-filter detector and the type-based detector presented in this chapter.
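A much-reduced version of such an experiment can be sketched in Python. This is not the Simulink testbed: it assumes a single user, no spreading code, unit-amplitude BPSK symbols, a sign detector, and an SNR defined as symbol power over noise variance. The function name and parameters are hypothetical.

```python
import math
import random


def simulate_bpsk_ber(snr_db, num_bits=3000, noise="gaussian", seed=0):
    """Estimate the BER of a sign detector for BPSK in additive noise."""
    rng = random.Random(seed)
    # Assumed SNR definition: unit symbol power divided by noise variance.
    sigma = math.sqrt(1.0 / (10.0 ** (snr_db / 10.0)))
    errors = 0
    for _ in range(num_bits):
        bit = rng.choice((-1, 1))  # +1 and -1 with equal probability
        if noise == "gaussian":
            n = rng.gauss(0.0, sigma)
        elif noise == "laplacian":
            # A Laplacian with variance sigma^2 has scale b = sigma / sqrt(2);
            # the difference of two exponentials with mean b is Laplacian.
            b = sigma / math.sqrt(2.0)
            n = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
        else:
            raise ValueError("noise must be 'gaussian' or 'laplacian'")
        decision = 1 if bit + n >= 0.0 else -1
        if decision != bit:
            errors += 1
    return errors / num_bits
```

Calling `simulate_bpsk_ber(snr_db)` for each SNR from 0 to 10 dB gives one BER curve of the kind plotted in the figures of this section, although the values will differ from the thesis results because of the simplifications listed above.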

Figure 5.2: Simulations for 100 preamble bits when corrupted by Gaussian noise. The legend indicates simulation results for alphabets of 8 letters to 256 letters, in powers of 2. “MF” indicates the results of simulating a Gaussian-based matched filter detector. [Panels: (a) 1 user BER plot; (b) 6 users BER plot. Each panel plots BER on a logarithmic axis from 10^0 down to 10^-4 against SNR, in dB, from 0 to 10.]

5.3.1 Gaussian noise model

Figure 5.2 shows the results of a set of simulations run with a 100-bit preamble, followed by 3000 data bits, for 1 user and for 6 users. As indicated by the legend, alphabet sizes of 8, 16, 32, 64, 128, and 256 letters were simulated. For comparison purposes, the figure also includes simulation results from a Gaussian-based matched filter detector, notated in the legend as “MF.” The matched filter does not require training data, relying instead on a priori signal amplitude, symbol probability, and noise distribution. In these simulations, the signal is corrupted by zero-mean Gaussian noise. Therefore, the matched filter achieves optimal performance, while the type-based detector is asymptotically optimal. These Gaussian-based simulations, then, allow the comparison of the type-based detector to the optimal Gaussian detector. As shown in Figure 5.2(a), simulations of the single-user case demonstrate that

the 8-letter detector exhibits approximately 2 dB of loss in performance compared to the matched filter detector. The 16-letter detector loses 1 dB or less, while 32-letter and greater detectors meet the optimal matched filter detector’s performance. In the single-user case, therefore, a 32-letter detector trained with 100 bits gives nearly optimal performance. Because simulations were executed for 3000 bits, error rates at high SNRs, where few errors occur, become unreliable. For example, 10 errors in 3000 bits produce a BER of $3.33 \times 10^{-3}$, making data below a BER of approximately $10^{-3}$ unreliable. Therefore, the higher BER for the matched filter at an SNR of 6 dB in the single-user case represents a spurious data point.

Figure 5.2(b) illustrates simulation results of a 6-user system. All type-based detectors in this situation are unable to accurately estimate the channel with the shorter 100-bit preamble, due to a higher level of MAI. However, extending the preamble length to 1000 bits as shown in Figure 5.3(b) solves this problem. Unsurprisingly, the 8-letter detector is unable to accurately distinguish between ±1 bits for all 6 users, because the detector’s alphabet is too small to handle these combinations. The 16-letter detector suffers 1 dB of performance degradation, while 32-letter and greater detectors meet the matched filter’s performance.

In the 1000 bit preamble, single-user case, Figure 5.3(a) demonstrates that additional training does not improve the detector’s performance, even in the 8-letter

detector. The probability densities estimated by the 1000 bit preamble are no more accurate despite the increased data. Therefore, the lower performance of the 8- and 16-letter detectors results from a shortage of letters in which to place received data, rather than a lack of training.

Figure 5.3: Simulations for 1000 preamble bits corrupted by Gaussian noise, for 8 to 256 letter alphabets and for a matched filter. [Panels: 1 user BER plot and 6 users BER plot, with BER on a logarithmic axis from 10^0 down to 10^-4 and the same legend as Figure 5.2.]

5.3.2 Laplacian noise model

In a Laplacian noise environment, the matched-filter detector is no longer optimal. Therefore, its performance can be exceeded by other detection schemes, such as a type-based detector. Single-user simulations for a 1000 bit preamble, followed by 3000 data bits, under a Laplacian noise distribution are shown in Figure 5.4(a). This graph demonstrates that the 8-letter detector’s performance is much better than its performance in the Gaussian case in Figure 5.3(a). The 16-letter detector offers a small amount of performance improvement, while the 32-letter and higher detectors

give essentially the same results. However, none of the type-based detectors in this case exceed the performance of the matched-filter detector. One possible explanation is the difficulty of properly estimating the Laplacian pdf, which contains a sharp peak at its center. The Gaussian smoothing kernel used to improve the performance of the detector also reduces the detector’s ability to correctly estimate sharp, high-frequency areas in a pdf.

Figure 5.4(b) gives simulation results for the 6 user case. Again, an 8-letter detector does not possess the necessary resolution to distinguish ±1 bits for 6 users. The 16-letter detector offers significant performance improvement, while 32-letter and higher detectors share identical performance. As in the single-user case, these type-based detectors do not outperform the matched-filter detector.

Figure 5.4: Simulations for 1000 preamble bits corrupted by Laplacian noise, for 8 to 256 letter alphabets and for a matched filter. [Panels: 1 user BER plot and 6 users BER plot, with BER on a logarithmic axis from 10^0 down to 10^-4 and the same legend as Figure 5.2.]
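The effect of smoothing on a sharply peaked pdf can be illustrated with a short Python sketch. It discretizes an ideal Laplacian pdf onto a small alphabet and convolves it with a normalized Gaussian kernel. The bin range, kernel radius, kernel width, and function names are assumptions chosen for the illustration, not the values used by the detector in this chapter.

```python
import math


def laplacian_type(num_letters, scale=1.0, span=8.0):
    """Discretize a zero-mean Laplacian pdf onto equal-width bins, summing to 1."""
    width = span / num_letters
    centers = [-span / 2.0 + (i + 0.5) * width for i in range(num_letters)]
    weights = [math.exp(-abs(c) / scale) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_kernel(radius, sigma):
    """Normalized Gaussian taps at offsets -radius..radius (in bins)."""
    taps = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def smooth(type_estimate, kernel):
    """Convolve a type with a kernel, dropping out-of-range bins, then renormalize."""
    radius = len(kernel) // 2
    n = len(type_estimate)
    out = []
    for i in range(n):
        acc = 0.0
        for j, tap in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < n:
                acc += tap * type_estimate[idx]
        out.append(acc)
    total = sum(out)
    return [v / total for v in out]
```

Comparing `max(laplacian_type(32))` with the maximum of the smoothed version shows the central peak being flattened, which is consistent with the explanation offered above for the Laplacian results.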

5.4 Conclusions and Future Work

Type-based detection presents an interesting alternative to the traditional Gaussian-based methodology. The theory and simulations demonstrate the ability of type-based detectors to match the performance of the optimal detector in Gaussian noise environments. Further work, and experimentation with differing distributions, will enable these detectors to demonstrate improved performance in non-Gaussian environments.

The type-based detector presented in this chapter can be extended in several important areas. First, modifications to enable the use of a fading, multipath channel model are necessary for detection in a wireless communications system. Second, support for the complex data produced by the I and Q channels of a receiver would enable use with a receiver. Finally, allowing asynchronous transmission of user data provides the flexibility necessary in a typical communications system.

Symbol                                        Meaning
$T$                                           Duration of spreading code
$t$                                           Time
$s(t)$                                        Spreading waveform, 0 outside $(0 \ldots T)$
$k$                                           User of interest
$K$                                           Total number of users
$i$                                           Symbol of interest
$I$                                           Total number of symbols
$I_{+1}$                                      Number of +1 symbols in the training sequence
$I_{-1}$                                      Number of −1 symbols in the training sequence
$b_k[i]$                                      ith symbol sent by the kth user
$u_k(t)$                                      Transmitted waveform of the kth user
$\tau$                                        Delay between all synchronous transmitters and the receiver
$D(\cdot)$                                    Time-invariant function which describes the manner in which the channel combines users and noise
$n(t)$                                        Stationary, unknown noise
$L$                                           Number of letters in the discrete alphabet
$A = \{a_1, a_2, \ldots, a_L\}$               Discrete alphabet used for type-based detection
$c$                                           Chip of interest
$C$                                           Number of chips per bit
$r(t)$                                        Analog, continuous-time received signal
$r[c, i]$                                     $r(t)$, sampled at the cth chip of the ith bit of the transmission
$h^{c}_{+1,k}$                                Type estimate of the cth chip, kth user for a +1 symbol
$h^{c}_{-1,k}$                                Type estimate of the cth chip, kth user for a −1 symbol
$h^{c}_{r,k}$                                 Type estimate of the received signal for the kth user
$Q$                                           Gaussian smoothing kernel
$h^{c}_{+1s,k}$, $h^{c}_{-1s,k}$, $h^{c}_{rs,k}$   Smoothed type estimate: cth chip, kth user, +1 symbol/−1 symbol/received symbol
$\gamma_{+1}$, $\gamma_{-1}$                  Computed KL distance metrics for +1 and −1 symbols
$\gamma$                                      Sufficient statistic used by the type-based detector
$\eta$                                        Constant used to set the detection threshold based on a priori symbol probabilities

Table 5.1: Definition of variables used in chapter 5
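To show how the quantities in Table 5.1 might fit together, the following Python sketch implements a deliberately simplified, single-user type-based decision. It is not the detector derived in Section 5.1: it assumes uniform quantization over a fixed range, uses add-one counts instead of the Gaussian smoothing kernel Q, ignores per-chip types, and assumes the statistic is formed as γ = γ−1 − γ+1 and compared with η. All function names are hypothetical.

```python
import math


def quantize(sample, num_letters, lo=-4.0, hi=4.0):
    """Map a real sample to a letter index in 0..num_letters-1 (assumed uniform bins)."""
    if sample <= lo:
        return 0
    if sample >= hi:
        return num_letters - 1
    return min(int((sample - lo) / (hi - lo) * num_letters), num_letters - 1)


def estimate_type(samples, num_letters):
    """Empirical distribution over letters; add-one counts avoid zero probabilities."""
    counts = [1.0] * num_letters
    for s in samples:
        counts[quantize(s, num_letters)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_distance(p, q):
    """Kullback-Leibler distance between two types with nonzero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def detect(received_samples, type_plus, type_minus, eta=0.0):
    """Decide +1 or -1 by comparing KL distances to the two training types."""
    received_type = estimate_type(received_samples, len(type_plus))
    gamma_plus = kl_distance(received_type, type_plus)
    gamma_minus = kl_distance(received_type, type_minus)
    gamma = gamma_minus - gamma_plus  # assumed form of the statistic
    return 1 if gamma > eta else -1
```

In use, `type_plus` and `type_minus` would be built with `estimate_type` from the received samples of the +1 and −1 training symbols, and `detect` would then be called once per received bit.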

Chapter 6

Conclusions and Future work

6.1 Conclusions

This thesis presents two important concepts which enable the rapid prototyping of communications systems. The simulation/prototype bridge provides the ability to arbitrarily distribute the execution of a Simulink block diagram between the host and multiple DSPs in a hardware prototype. This flexibility joins the strengths of simulation with the strengths of prototyping, enabling designers to rapidly and smoothly transition from the simulation of a new communications system to a working prototype of the system. The use of appropriate language design by inserting blocks written in C or Matlab into a Simulink block diagram provides the engineer with the ability to develop new algorithms in a language best suited for the algorithm, then rapidly integrate these algorithms in a block diagram. In addition, appropriate language design encourages modularity by encapsulating new algorithms in blocks, which can then be easily re-used in a different block diagram. Appropriate language design encourages design clarity. The equations underlying algorithms written in Matlab can be simply expressed and well documented with Matlab’s rich set of mathematical operators. Finally, Simulink clearly captures the overall structure of a design in a simple block diagram. This thesis also develops a novel type-based detector, capable of robust operation

in the unknown noise environments encountered in wireless communications systems. Using an approach similar to a matched filter, this detector sacrifices asymptotic optimality and its associated exponential computational complexity to provide good performance with a linear growth in computational complexity. Simulations of this detector demonstrate competitive performance with the optimal Gaussian detector in a Gaussian noise environment.

6.2 Future work

The research presented in this thesis can be extended in a number of directions. One promising area for both the simulation/prototype bridge and appropriate language design is extension of these concepts and implementations to support FPGAs and ASICs. The ability to efficiently compile Matlab code for DSPs would significantly enhance the power of appropriate language design. For FPGAs, the Xilinx System Generator [42] supports synthesizing a Simulink block diagram composed of Xilinx blocks to a Xilinx FPGA. Finally, extension of the type-based detector to support fading, multipath channels producing complex data will enable the implementation of this detector in future communications systems.

Appendix A

Switcher

This appendix demonstrates the usage of the Switcher, a set of Matlab functions which implement the concepts discussed in chapter 3. The Simulink model file wrapper_test.mdl contains the Switcher block, as shown in Figure A.1. Double-clicking on the Switcher block reveals the Switcher GUI, shown in Figure A.2. For convenience, this block may be copied into a Simulink model in which the Switcher will be frequently used.

The GUI presents the user with two options: to separate a block diagram, or Simulink model, into multiple block diagrams, as discussed in Section 3.1; or to place a user-switchable label on a given block. Before the topmost button labeled “Separate system” can be used, a Simulink model should first be given labels using the lower ‘Add “Runs on” to’ button. To accomplish this, first open a Simulink model, then resize it so that both the model and the Switcher GUI are visible. See Figure A.3. Now, move the mouse to the Simulink model window and click on a block to which a label will be applied. Move the mouse from the Simulink model back to the Switcher GUI window, and note that the names displayed are updated. Figure A.3 shows that the “Adaptive Noise Cancellation” block is selected in the Simulink model; that choice is reflected in the Switcher GUI below it.

Figure A.1: This figure shows the Switcher block, a part of the wrapper_test.mdl Simulink model. The Switcher is the rightmost of the two blocks. [The model contains the SWrapper_test S-Function block and the Switcher GUI block, labeled “Double-click for Switcher GUI”, above the note: The Rapid Prototyping Tool Suite, www.ece.rice.edu/~bryan/academia/Rapid_prototyping/S-Function_wrapper.html, Copyright (c) 2002 Bryan Jones and Rice University, last revised 2-1-2002.]

Press the ‘Add “Runs on” to’ button in the Switcher GUI. This will add a label to the block selected in the Simulink model. Move back to the Simulink model, and double-click on the selected block to show the label, which appears as the block’s mask dialog box shown in Figure A.4. Repeat this procedure for each block in the diagram which needs a label. The Switcher assumes that all unlabeled blocks execute on the host. Note that the labeling obeys the model’s hierarchy. So, labeling a parent block such as the “Adaptive Noise Cancellation” applies that label to all the blocks contained in its subsystem. However, if any block in the subsystem is labeled, this label overrides the labeling of its parent, “Adaptive Noise Cancellation.”

After adding labels to a Simulink model, the model must then be separated before it can be executed. To separate a system, first display both the Simulink model to be

separated and the Switcher GUI, as illustrated in Figure A.3. Click anywhere in the Simulink model to select the current model. Then, move the mouse into the Switcher GUI. The GUI will update itself with the name of the currently selected Simulink model. Then, click the “Separate system” button. If the current model has not been saved, the program allows the user to optionally save changes before performing the separation.

After separation, the program creates one model for each execution location, such as “host” or “DSP.” It names each model by appending _host or _DSP to the original model’s name. Figures A.5 and A.6 show the resulting two models generated by the program. The original model from which these partitions are derived remains open, and is the active window after the separation process.

To compile and run the separated model, switch to the Simulink model containing the host partition, whose file name ends with _host. In this case, switch to the lms_host model. Then, simply press the play button in Simulink. The model will be compiled, downloaded to the Lyr SignalMaster, then executed.

Figure A.2: This figure shows the Switcher GUI, revealed by double-clicking on the Switcher block.

Figure A.3: This figure shows the Switcher GUI next to a Simulink model. Both should be visible for easiest use of the GUI.

Figure A.4: This figure shows the mask dialog box of a block, after clicking the ‘Add “Runs on” to’ button in the Switcher GUI.
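The labeling rules described in this appendix (unlabeled blocks run on the host, a parent's label applies to its subsystem, and a block's own label overrides its parent's) can be sketched as a small lookup. This Python sketch only illustrates those rules; the Switcher itself is a set of Matlab functions, and the dictionaries and function names below are hypothetical.

```python
def effective_location(block, labels, parents, default="host"):
    """Walk up the model hierarchy until a labeled block is found."""
    current = block
    while current is not None:
        if current in labels:
            return labels[current]  # the nearest label wins
        current = parents.get(current)
    return default  # unlabeled blocks are assumed to execute on the host


def partition(blocks, labels, parents):
    """Group blocks by the location on which they execute."""
    groups = {}
    for block in blocks:
        location = effective_location(block, labels, parents)
        groups.setdefault(location, []).append(block)
    return groups
```

For example, labeling only a parent subsystem as "DSP" places every block inside it in the DSP group, while adding a "host" label to one child moves just that child back to the host group.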

Figure A.5: This figure shows the lms_host Simulink model resulting from separation. [The model, titled “Noise Cancellation (Host or DSP Simulation)”, contains the Signal, Noise, Zero-Order Hold, FFT, User, Results, Filter Taps, and Freq Response blocks, connected through Goto and From tags to the LMS_DSP Gateway block.]

Figure A.6: This figure shows the lms_DSP Simulink model resulting from separation. [The model, titled “Noise Cancellation (Host or DSP Simulation)”, contains the Adaptive Noise Cancellation block, with its Input Signal, Signal + Noise, Error Signal, and Filter Taps ports connected through Goto and From tags to two input ports and five output ports.]

Appendix B

Wrapper

This appendix gives an overview of the Wrapper, a set of Matlab, Java, and Simulink functions which implement appropriate language design concepts discussed in chapter 4. Specifically, the Wrapper allows both C and Matlab to be placed inside a Simulink block. The Simulink model file wrapper_test.mdl contains the Wrapper block, as shown in Figure B.1.

To use the Wrapper, place it into a Simulink block diagram or library by dragging the Wrapper block from the wrapper_test.mdl model to a new model. See Figure B.2. Next, double-click on the Wrapper block in the new block diagram to bring up the block’s mask dialog, as shown in Figure B.3. Give the block a new, unique name. There cannot be any other Simulink model or m-file anywhere in the Matlab path with this new name. Press OK, and note that the labeling of the block changes. Figure B.4 shows a new type-based detection block named type_based.

Next, double-click on the renamed Wrapper block again, redisplaying the mask dialog box shown in Figure B.3. Now, check the “Check to edit

wrapper” checkbox. This closes the mask dialog box, then displays the wrapper dialog box shown in Figure B.5.

Figure B.1: This figure shows the Wrapper block, a part of the wrapper_test.mdl Simulink model. The Wrapper is the leftmost of the two blocks.

Figure B.2: This figure shows the Wrapper block placed in a new Simulink model. [The block appears as the SWrapper_test S-Function.]

When wrapping C code, enter the name of the C file to include, with the .c extension, in the “Include file” text box shown in Figure B.5. C++ files, with a .cpp or .cc extension, may also be used. Global Simulink variables, such as the current simulation time, may be added by single-clicking on the “Click to add / drag to delete” entry in the table display in Figure B.5. Note that Lyr SignalMaster v2.1

Figure B.3 This figure shows the mask dialog box displayed by double-clicking on the Wrapper block.

Figure B.4 This figure shows a renamed type_based wrapper block placed in a new Simulink model.

does not support compilation of C++ source files. Next, click on the Externals tab to display the input ports, output ports, and parameters this block will use to communicate with Simulink. See Figure B.6. The type-based detector requires four parameters: the bits sent during the preamble, or training sequence; the number of quantization bins to use; the number of users in the system; and a matrix of spreading codes for each user. In the bottom half of the Externals dialog shown in Figure B.7, the variable names preamble, numQuant,

Figure B.5 This figure shows the wrapper dialog box displayed after checking “Check to edit wrapper” in the mask dialog.

numUsers, and sprdCodes were chosen to represent these four parameters. Next, input ports and output ports should be added, as shown in the top half of Figure B.7. When adding an input port, the program displays the input port dialog box pictured in Figure B.8. In this dialog box, the dimensions and rate of the port are specified. The figure shows an input port named chips. This input port is a dynamically sized vector, meaning that this block accepts any vector given to it by

Figure B.6 This figure shows the wrapper dialog box displayed after choosing the “Externals” tab.

Simulink. The length of the resulting vector is stored in the variable chipsPerBit. This input port expects discrete-time inputs at a rate of 1 input every simulation time unit. As shown in the dialog, Simulink also supports continuous-time inputs. The output port dialog box in Figure B.9 includes many of the same parameters. Named decision, this output port is configured to output a numUsers-element vector in discrete time, at a rate of 1 output per simulation time unit. Note that, in the

Figure B.7 This figure shows the Externals dialog box after entering four parameters.

current version of the wrapper, numUsers must be dereferenced as numUsers[0]. Because numUsers is a parameter, it is internally stored and accessed as a Matlab matrix. Matlab matrices are accessed in C by dereferencing the pointer to the matrix, as shown above. Next, click on the Internals tab to display the internals dialog, shown in Figure B.10. In this dialog, C or Matlab code fragments may use the variables defined in

Figure B.8 This figure shows the input port dialog box.

the Externals dialog box, and in the input port and output port dialog boxes. First, select the language used in this dialog by clicking on the appropriate radio button (C, C++, or Matlab) at the bottom of the dialog. When using C or C++, the screen contains two text areas: one in which variable declarations such as int i may be entered, and one in which statements may be entered. Because Matlab does not need or support variable declarations, the variable declarations text box is hidden when the Matlab language is chosen. Note that algorithms written in Matlab cannot be executed on a DSP.
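The numUsers[0] dereference described above follows from how Matlab stores matrices. The following is a minimal C sketch of that idea, not the Wrapper's generated code: it assumes the parameter data is exposed as a pointer to double (Matlab's default numeric type) stored column by column, and the helper names matrix_at and scalar_param_as_int are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Wrapper): read element (row, col)
 * of a Matlab-style matrix stored column by column in a flat array. */
static double matrix_at(const double *data, size_t rows, size_t row, size_t col)
{
    assert(data != NULL);
    assert(row < rows);             /* guard against running past a column */
    return data[row + col * rows];  /* column-major layout, as Matlab uses */
}

/* A scalar parameter such as numUsers is a 1-by-1 matrix, so its value is
 * the first (and only) element of the array, i.e. numUsers[0]. */
static int scalar_param_as_int(const double *param)
{
    assert(param != NULL);
    return (int)param[0];
}
```

Under these assumptions, scalar_param_as_int(numUsers) reads the user count, and matrix_at(sprdCodes, rows, r, c) reads one element of a spreading-code matrix that has the given number of rows.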

Figure B.9 This figure shows the output port dialog box.

Next, select the Simulink function in which to add code. Three functions are of interest to most users. The mdlStart function is called once, as the simulation is starting. The mdlTerminate function is called once, when the simulation is ending. The mdlOutputs function is called every time new inputs are ready, or new output from the block is available. Therefore, the function which computes the block’s outputs given its inputs should be placed in mdlOutputs. A complete list of the functions Simulink calls is given in the “Writing S-Functions” manual [4]. Finally, add the appropriate code, using the variables defined earlier. In Figure

Figure B.10 This figure shows the internals panel of the wrapper.

B.10, the type-based detection function type_based_det is called with variables defined in the Externals dialog: chips, a vector of chips arriving from the input port; preamble, a matrix of training data passed as the block’s first parameter; time, a global Simulink variable containing the current simulation time; numQuant, the second parameter specifying the number of quantization bins to use; and sprdCodes, a matrix of spreading codes given as the fourth parameter to the block.
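The calling order of the three functions described above can be illustrated with a small C sketch. It deliberately avoids the real S-function API: block_start, block_outputs, and block_terminate are hypothetical stand-ins for mdlStart, mdlOutputs, and mdlTerminate, the BlockState structure is invented for illustration, and the block simply doubles its input.

```c
#include <assert.h>
#include <stddef.h>

/* State of a hypothetical block; these names are illustrative only and
 * are not the Simulink S-function API. */
typedef struct {
    int    start_calls;      /* times the start hook has run */
    int    terminate_calls;  /* times the terminate hook has run */
    size_t output_calls;     /* times the outputs hook has run */
    double last_output;      /* most recent output value */
} BlockState;

/* Plays the role of mdlStart: one-time setup as the simulation starts. */
static void block_start(BlockState *s)
{
    assert(s != NULL);
    s->start_calls++;
    s->output_calls = 0;
    s->last_output = 0.0;
}

/* Plays the role of mdlOutputs: compute the output from the new input. */
static void block_outputs(BlockState *s, double input)
{
    assert(s != NULL);
    s->last_output = 2.0 * input;
    s->output_calls++;
}

/* Plays the role of mdlTerminate: one-time cleanup as the simulation ends. */
static void block_terminate(BlockState *s)
{
    assert(s != NULL);
    s->terminate_calls++;
}

/* Drives the hooks in the order described in the text: start once,
 * outputs once per input sample, terminate once. */
static void run_simulation(BlockState *s, const double *inputs, size_t n)
{
    size_t i;
    block_start(s);
    for (i = 0; i < n; i++)
        block_outputs(s, inputs[i]);
    block_terminate(s);
}
```

In this sketch, the code that would go in the Wrapper's mdlOutputs text area corresponds to the body of block_outputs, which is the only hook that runs once per simulation step.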

Figure B.11 This figure shows the variables panel of the wrapper.

The Variables panel shown in Figure B.11 lists all the variables defined in the Externals panel, and also gives the location at which each variable was defined. This panel also allows the advanced user to change the C data type used to reference each variable. Clicking on any entry in the “Data type” column of the table reveals a drop-down list of alternative data types. Finally, the Results panel in Figure B.12 shows the C code generated by the wrapper, in response to the data entered in the other panels. Two buttons at the bottom of the panel cause the program to save the generated code. The “Save Code” button writes the resulting C program to disk, along with any supporting Matlab M-functions. The “Save All and Compile” button saves the code, and also saves the data entered in the wrapper in an .sfw file, which is an S-Function Wrapper-specific save file. Data entered in the wrapper can also be saved from the General panel shown in Figure B.5 by clicking on the “Save” button in that panel. The Results panel also displays any warnings or errors encountered while generating the code.

After clicking on the “Save All and Compile” button, the Matlab command prompt displays a message indicating that compilation is in progress, and stating the Matlab command used to compile the block. When compilation finishes, the message Done is printed to the Matlab command prompt.

The final step is to give Simulink the parameters from the Externals panel in the correct order. Right-click the block, then choose “Block parameters...”. Simulink displays the block parameters dialog box shown in Figure B.13. Always leave the top entry in Figure B.13, the S-function name, unchanged. Instead, change the S-function’s name when necessary by entering the new name in the mask dialog box shown in Figure B.3. The second entry in the block parameters dialog box shown in Figure B.13 should contain Matlab variables or expressions in the same order as those specified in the bottom half of the Externals panel (Figure B.7). That is, the

Figure B.12 This figure shows the results panel of the wrapper.

first (leftmost) parameter entered in the block parameters dialog box (Figure B.13) is passed to the Wrapper’s first (topmost) parameter entered in the Parameters list (the bottom half of Figure B.7). Likewise, the second (from the left) parameter in the block parameters dialog box is passed to the Wrapper’s second (from the top) parameter entered in the Parameters list, the third from the left is passed to the third from the top, and so on.

Figure B.13 This figure shows the block parameters dialog box of the type-based wrapper.

In these figures, therefore, the parameter “preamble” in Figure B.13 gets assigned to the variable “preamble” in Figure B.7. Similarly, “numQuant” gets assigned to the variable “numQuant,” “numUsers” to “numUsers,” and “sprd_codes” to “sprdCodes.” Note that expressions, not simply variables, may also be entered. For example, the variable “preamble” in Figure B.13 could be replaced with an expression to generate a random series of bits, such as “randint(numUsers, preambleLength)*2-1.” Upon execution of the model, Matlab would evaluate this expression, and assign the resulting matrix to the variable “preamble” when calling the Wrapper.
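For readers who want to produce a comparable preamble outside Matlab, the following C sketch generates a matrix of random antipodal bits. It assumes that the randint expression above yields binary digits 0 and 1 that are then mapped to -1 and +1; the function names bit_to_symbol and random_preamble, and the use of the C library's rand(), are illustrative choices and not part of the Wrapper or the testbed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Map a binary digit (0 or 1) to an antipodal symbol (-1 or +1). */
static int bit_to_symbol(int bit)
{
    assert(bit == 0 || bit == 1);
    return 2 * bit - 1;
}

/* Fill a rows-by-cols preamble (stored as one flat array) with random
 * +1/-1 symbols. The seed makes the sequence repeatable between runs. */
static void random_preamble(int *out, size_t rows, size_t cols, unsigned seed)
{
    size_t i;
    assert(out != NULL);
    srand(seed);
    for (i = 0; i < rows * cols; i++) {
        /* Use the high-order part of rand(); the result is 0 or 1. */
        int bit = rand() / (RAND_MAX / 2 + 1);
        out[i] = bit_to_symbol(bit);
    }
}
```

A call such as random_preamble(buffer, numUsers, preambleLength, seed) would fill a buffer holding numUsers times preambleLength symbols, where buffer and seed are supplied by the caller.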

Appendix C Testbed

This appendix reviews the CDMA testbed, first developed by Sundaramurthy [43]. As part of the work presented in this thesis, the testbed was significantly upgraded in three primary areas. First, the testbed was extended to support both uplink and downlink. Second, the method of changing detectors was improved, and the library of estimators, detectors, and support blocks was reorganized. Finally, several additional noise models were added to the testbed. The following sections discuss these three aspects and also demonstrate first-time use of the testbed.

C.1 First-time Use of the Testbed

To use the CDMA testbed, first open the ubs_demo Simulink model, as shown in Figure C.1. Before executing the model, a number of global variables must be set. To set these variables, double-click on the “Update Parameters” block in the ubs_demo model. This brings up the parameters dialog box illustrated in Figure C.2. Press OK to set the global variables. Simulations can now be run by selecting the play button on the Simulink model, subject to the caveats stated in Section C.2.

C.2 Detector Choice, Uplink and Downlink Support

Changing the detector used by the testbed is simple. First, right-click the current detector block to bring up a block-specific menu. Next, select the “Block choice”

item as shown in Figure C.3. Finally, pick the desired detector from the submenu. Not all detectors can be run: the DSP-specific detectors are designed for the Sundance DSP, and may only be used in a PC with a Sundance card. Three problems can arise when switching detectors. First, when switching from a base-station (uplink) algorithm to a handset (downlink) algorithm, the “Update Parameters” dialog box shown in Figure C.2 must be opened, and its last item, “Direction of link,” must be set correctly for the detector type. If this setting is incorrect, the testbed will give confusing error

Figure C.1 This figure shows the ubs_demo Simulink model, which performs CDMA uplink and downlink simulations.

Figure C.2 This figure shows the testbed parameters dialog box, displayed after double-clicking the “Update Parameters” block in the ubs_demo model.

messages in the block contained in the “BER Calculation” block. Second, some detector changes require the user to bring up the “Update Parameters” dialog box, then click OK. Clicking OK indirectly runs the buildsim Matlab

Figure C.3 This figure shows the selection of a new detector in the ubs_demo model.

script, which automatically modifies some important simulation internals based on the detector type and number of users. Try this if you get unexplained errors. See the buildsim script, particularly the end of the script, for more details. Third, different detectors have differing delays between the time when chips enter the detector and the time when the resulting symbol is output. For example, the block-based detectors produce decisions 12 bits after arrival of the original chips, while the matched filter detector imposes a 2-bit delay. To determine the detector’s delay, double-click the “Compute delay” block in the ubs_demo block diagram. The delay will be automatically determined. However, misbehaving detectors can confuse this algorithm, which is essentially a sliding correlator. If the plot that is displayed after computing the delay shows multiple peaks instead of a single peak, the detector is misbehaving.
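The sliding-correlator idea can be sketched in a few lines of C. This is a generic illustration under stated assumptions, not the code behind the “Compute delay” block: bits are assumed to be antipodal (+1 or -1), the search covers lags 0 through max_lag, and the function name estimate_delay is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the delay, in bits, between the original and detected bit
 * streams. For each candidate lag, the detected stream is shifted back by
 * that lag and correlated with the original; the lag with the largest
 * correlation is returned. Ties keep the smallest lag. */
static size_t estimate_delay(const int *sent, const int *detected,
                             size_t n, size_t max_lag)
{
    size_t lag, best_lag = 0;
    long best_corr = 0;
    int have_best = 0;

    assert(sent != NULL && detected != NULL);
    for (lag = 0; lag <= max_lag && lag < n; lag++) {
        long corr = 0;
        size_t i;
        for (i = 0; i + lag < n; i++)
            corr += (long)sent[i] * (long)detected[i + lag];
        if (!have_best || corr > best_corr) {
            best_corr = corr;
            best_lag = lag;
            have_best = 1;
        }
    }
    return best_lag;
}
```

A well-behaved detector gives one lag whose correlation stands clearly above the rest, which corresponds to the single peak mentioned above. Because the overlap shrinks as the lag grows, the sketch is only meaningful when max_lag is small compared with the number of bits.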

C.3 Additional Noise Models

To change the noise model in use, double-click on the “Baseband data generation” block, revealing its subsystem. Note that the “User data” block is no longer used; its functionality is replaced by the “datagen” Wrapper block. Currently, the “Asynchronous Delay” block is removed to allow testing of the type-based detector; it should be reinserted between the “Spreading” and “Wireless Channel” blocks. Next, double-click on the “Wireless Channel” block, revealing its subsystem. Right-click on the noise source block, then choose the “Block choice” option. Select a new noise source. Note that the exponential and Cauchy noise sources have not been tested to verify that they produce the expected distribution.
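One simple way to sanity-check an untested noise source is to compare an empirical fraction of samples against a known probability. For a zero-centered Cauchy distribution with scale g, half of the samples should fall between -g and +g; for an exponential distribution with mean m, half should fall between 0 and m times ln 2 (about 0.693 m). The C sketch below implements only this comparison; the function names are hypothetical, the samples are assumed to have been logged from the noise block by some other means, and a proper goodness-of-fit test would be needed for a rigorous verification.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of the n samples that satisfy lo <= x <= hi.
 * Returns 0.0 for an empty sample set. */
static double fraction_in_range(const double *x, size_t n,
                                double lo, double hi)
{
    size_t i, count = 0;
    assert(x != NULL || n == 0);
    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        if (x[i] >= lo && x[i] <= hi)
            count++;
    return (double)count / (double)n;
}

/* Returns 1 when the observed fraction lies within tol of the expected
 * fraction, and 0 otherwise. */
static int fraction_matches(double observed, double expected, double tol)
{
    double diff = observed - expected;
    if (diff < 0.0)
        diff = -diff;
    return diff <= tol;
}
```

For example, with Cauchy samples of scale g, fraction_matches(fraction_in_range(x, n, -g, g), 0.5, tol) should return 1 for a reasonable tolerance once n is large.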

C.4 Source Code Control

Note that the testbed, and most of its contents, are entered in a Visual SourceSafe database. This database opens upon execution of the Visual SourceSafe program on Visby.


References

1. P. Banerjee, N. Shenoy, A. Choudhary, S. Hauck, M. Haldar, P. Joisha, A. Jones, A. Kanhare, A. Nayak, S. Periyacheri, M. Walkden, and D. Zaretsky, “A MATLAB compiler for distributed heterogeneous reconfigurable computing systems,” in IEEE Symposium on FPGA Custom Computing Machines (FCCM-2000), (Napa Valley, CA), pp. 39–48, April 2000.
2. L. DeRose, K. Gallivan, E. Gallopoulos, B. Marsolf, and D. Padua, “FALCON: An environment for the development of scientific libraries and applications,” in Proceedings of the First International Workshop on Knowledge-Based System for the (re)Use of Program Libraries (KBUP), (Sophia Antipolis, France), November 1995.
3. The MathWorks, Inc., Natick, MA, Application Program Interface Reference, June 2001. Revised for MATLAB 6.1 (Release 12.1).
4. The MathWorks, Inc., Natick, MA, Writing S-Functions, June 2001. Revised for Simulink 4.1 (Release 12.1).
5. The MathWorks, Inc., Natick, MA, Using Matlab, June 2001. Revised for MATLAB 6.1 (Release 12.1).
6. R. W. Larsen, Introduction to MathCAD. Upper Saddle River, NJ: Prentice Hall, 1999.
7. S. Kaufmann, Mathematica as a Tool: an Introduction with Practical Examples. Boston, MA: Birkhauser, 1994.
8. D. Redfern, Maple Handbook. New York, NY, 1996. Maple V Release 4.
9. The MathWorks, Inc., Natick, MA, Using Simulink, June 2001. Revised for Simulink 4.1 (Release 12.1).
10. Synopsys, Inc., Mountain View, CA, Getting Started with COSSAP. v1998.08.
11. Cadence, Inc., Cadence Signal Processing Workshop (SPW). http://www.cadence.com/eda solutions/sld spdv l3 index.html.
12. E. A. Lee and T. M. Parks, “Dataflow process networks,” Proceedings of the IEEE, vol. 83, pp. 773–799, May 1995.
13. The MathWorks, Inc., Natick, MA, The Real-Time Workshop User’s Guide, June 2001. Revised for Simulink 4.1 (Release 12.1).
14. J. Buck, S. Ha, E. A. Lee, and D. G. Messerschmitt, “Ptolemy: a framework for simulating and prototyping heterogeneous systems,” International Journal of Computer Simulation, special issue on “Simulation Software Development”, vol. 4, pp. 155–182, April 1994.
15. A. D. Pimentel, P. Lieverse, P. van der Wolf, L. O. Hertzberger, and E. F. Deprettere, “Exploring embedded-systems architectures with Artemis,” IEEE Computer Magazine, vol. 34, pp. 57–63, November 2001.
16. S. S. Bhattacharyya, “Hardware/software co-synthesis of DSP systems,” in Programmable Digital Signal Processors: Architecture, Programming, and Applications (Y. H. Hu, ed.), pp. 333–378, Marcel Dekker, Inc., 2002.
17. B. P. Dave and N. K. Jha, “COHRA: hardware-software cosynthesis of hierarchical heterogeneous distributed embedded systems,” IEEE Transactions on Computer-Aided Design, vol. 17, pp. 900–919, October 1998.
18. D. M. Tullsen, S. J. Eggers, and H. M. Levy, “Simultaneous multithreading: Maximizing on-chip parallelism,” in Proceedings of the 22nd International Symposium on Computer Architecture (ISCA), (Santa Margherita Ligure, Italy), pp. 392–403, June 1995.
19. H. Akkary and M. A. Driscoll, “A dynamic multithreaded processor,” in Proceedings of the 31st ACM/IEEE International Symposium on Microarchitecture (MICRO-31), (Dallas, TX), pp. 226–236, November 1998.
20. T. L. Veldhuizen, “Arrays in Blitz++,” in Proceedings of the 2nd International Symposium on Computing in Object-Oriented Parallel Environments (ISCOPE), Lecture Notes in Computer Science, (Santa Fe, New Mexico), pp. 223–230, Springer-Verlag, December 1998.
21. S. Karmesin, J. Crotinger, J. Cummings, S. Haney, W. Humphrey, J. Reynders, S. Smith, and T. Williams, “Array design and expression evaluation in POOMA II,” in Proceedings of the 2nd International Symposium on Computing in Object-Oriented Parallel Environments (ISCOPE), Lecture Notes in Computer Science, (Santa Fe, New Mexico), pp. 231–238, Springer-Verlag, December 1998.
22. J. G. Siek and A. Lumsdaine, “A rational approach to portable high performance: The basic linear algebra instruction set (BLAIS) and the fixed algorithm size template (FAST) library,” in 2nd European Conference on Object-Oriented Programming (ECOOP), workshop on Parallel Object-Oriented Scientific Computing (POOSC), (Brussels, Belgium), pp. 468–469, July 1998.
23. F. Livingston, V. Chandrasekhar, M. Vaya, and J. R. Cavallaro, “Handset detector architectures for 3G wireless systems,” in IEEE International Symposium on Circuits and Systems (ISCAS), (Phoenix, AZ), May 2002. Accepted.
24. M. Gutman, “Asymptotically optimal classification for multiple tests with empirically observed statistics,” IEEE Transactions on Information Theory, vol. 35, pp. 401–408, 1989.
25. D. H. Johnson and D. E. Dudgeon, Array Signal Processing: Concepts and Techniques. Englewood Cliffs, NJ: P T R Prentice Hall, 1993.
26. H. Cramér, Random variables and probability distributions. London: Cambridge University Press, 1970.
27. T. S. Ferguson, A Course in Large Sample Theory. New York: Chapman & Hall, 1996.
28. E. Dahlman, B. Gudmundson, M. Nilsson, and J. Sköld, “UMTS/IMT-2000 based on W-CDMA,” IEEE Communications Magazine, vol. 36, pp. 70–80, September 1998.
29. S. Kullback, Information Theory and Statistics. New York: Wiley, 1959.
30. O. E. Kelly, Intersymbol interference equalization by universal likelihood. PhD thesis, Department of Electrical and Computer Engineering, Rice University, Houston, TX, October 1996.
31. L. Yue, Universal classification for wireless CDMA communications. PhD thesis, Department of Electrical and Computer Engineering, Rice University, Houston, TX, April 1998.
32. L. Belanger, J. Ahern, and P. Fortier, “Prototyping wireless base stations or edge devices on a DSP/FPGA architecture using high-level tools,” in International Conference on Signal Processing, Applications, and Technology (ICSPAT), (Dallas, TX), October 2000.
33. B. A. Jones, S. Rajagopal, and J. R. Cavallaro, “Real-time DSP multiprocessor implementation for future wireless base-stations,” in Texas Instruments DSPS Fest 2000, (Houston, TX), May 2000.
34. S. Rajagopal, B. A. Jones, and J. R. Cavallaro, “Task partitioning wireless basestation receiver algorithms on multiple DSPs and FPGAs,” in International Conference on Signal Processing, Applications, and Technology (ICSPAT), (Dallas, TX), October 2000.
35. LYR Signal Processing (LSP), a division of LYRTech, Québec, Québec, Canada, SignalMaster SM-C67X-Elan Users Manual and Installation Guide.
36. The MathWorks, Inc., Natick, MA, Target Language Compiler Reference Guide, April 2001. Revised for Simulink 4.1 (Release 12.1).
37. A. Chauhan and K. Kennedy, “Optimizing strategies for telescoping languages: Procedure strength reduction and procedure vectorization,” in Proceedings of the 15th ACM International Conference on Supercomputing, (Sorrento, Italy), pp. 92–101, June 2001.
38. The MathWorks, Inc., Natick, MA, External Interfaces, June 2001. Revised for MATLAB 6.1 (Release 12.1).
39. The MathWorks, Inc., Natick, MA, MATLAB Function Reference, Volume 2: F-O, June 2001. Revised for MATLAB 6.1 (Release 12.1).
40. S. Verdú, Multiuser Detection. Cambridge: Cambridge University Press, 1998.
41. D. H. Johnson, P. A. Gonçalvès, and R. G. Baraniuk, “Improved type-based detection of analog signals,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, (Munich, Germany), pp. 3717–3720, 1997.
42. The MathWorks, Inc., U.S.A., Xilinx System Generator v2.1 for Simulink: User’s Guide and Blockset Reference Manual.
43. V. Sundaramurthy and J. R. Cavallaro, “A software simulation testbed for third generation CDMA wireless systems,” in 33rd Asilomar Conference on Signal, Systems, and Computers, (Pacific Grove, CA), pp. 1680–1684, October 1999.