74 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-33, NO. 1, JANUARY 1985
Backward Adaptive Lattice and Transversal Predictors in ADPCM
Abstract-Four different backward adaptive predictors and a fixed predictor are compared for use in an adaptive differential pulse code modulation (ADPCM) system for coding speech at 16 kilobits/second (kbits/s). For noise-free channels, the four adaptive predictors, a least squares lattice, a least mean square lattice, a Kalman transversal form, and a gradient transversal form, all exceed the fixed predictor performance as well as the performance of a continuously variable slope delta (CVSD) modulation system. For bit error rates (BER's) of [value illegible] or greater, the transversal predictor performance falls below that of the fixed predictor and CVSD; however, the lattice structures maintain their performance advantage. The least squares lattice predictor has the best objective and subjective performance for both noiseless and noisy channels. All systems perform poorly for a BER of [value illegible]. To extend the performance of ADPCM with a least squares lattice predictor down to a BER of [value illegible], the sampling rate is reduced and a selective coding scheme is devised. The resulting ADPCM system maintains excellent performance through a BER of [value illegible] and outperforms CVSD for noise-free and noisy channels. The dynamic range, tandeming performance, and behavior for noisy inputs for the ADPCM system and CVSD are investigated.
Paper approved by the Editor for Data Communication Systems of the IEEE Communications Society for publication after presentation at the International Conference on Acoustics, Speech, and Signal Processing, San Diego, CA, March 1984. Manuscript received October 14, 1983; revised April 10, 1984. This work was supported in part by the National Science Foundation under Grant ECS-8020304.
R. C. Reininger is with the Department of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078.
J. D. Gibson is with the Department of Electrical Engineering, Texas A&M University, College Station, TX 77843.

I. INTRODUCTION

ADAPTIVE differential pulse code modulation (ADPCM) is an important system for speech coding at 16-32 kilobits/second (kbits/s) [1]-[4]. In this paper, a 16 kbit/s ADPCM system with backward adaptive quantization is investigated, and performance comparisons are made for the system with a fixed predictor and with four different backward adaptive predictors. The best ADPCM system is also compared to a 16 kbit/s continuously variable slope delta (CVSD) modulation system. All comparisons consider both noise-free channels and noisy channels with bit error rates (BER) as high as $10^{-2}$.

Backward adaptive predictors are considered exclusively in this work. There are two main reasons for this emphasis on backward adaptive algorithms. First, since backward adaptive predictors adapt based on the quantized prediction error and the reconstructed sequence, both of which are available at the receiver, no side information is needed and the full transmission bandwidth is allocated to the quantized prediction error. Second, backward adaptive predictors operate on a sample-by-sample basis and, therefore, insert minimal processing delay. Unfortunately, since they adapt off of the quantized prediction error, backward adaptive algorithms are very sensitive to channel errors.

Of the four adaptive predictors investigated, two have a transversal filter structure and two have a lattice structure. The two transversal predictor algorithms, namely, a least mean square (LMS) gradient algorithm and a Kalman algorithm, use well-known damping schemes to reduce transmission error effects, while new damping methods are proposed for the two lattice predictors, an LMS gradient algorithm and a least squares algorithm. In addition to using damping to dissipate transmission error effects, forward error correcting codes are also studied. The best channel coding method (of those investigated) is implemented by reducing the sampling rate of the system from 8 to 6.4 kHz and using a rate 4/5 channel code to protect only the magnitude bits of the quantizer output. The sign bits are left unprotected.

One of the motivations for this work was to design an ADPCM system which has better performance than a 16 kbit/s CVSD system for both noise-free and noisy channels. Although a 16 kbit/s ADPCM system can be designed which is better than the CVSD system for ideal channels, the CVSD system is much less sensitive to noise than an ADPCM system and, hence, normally has the better noisy channel performance. Techniques which are used to improve the noisy channel performance of the ADPCM system usually result in decreased performance for a noise-free channel and, thus, make the task of designing an ADPCM system that is uniformly better than CVSD more difficult. Comparisons between the best ADPCM system and CVSD for both noise-free and noisy channels are included in this paper.

This paper is organized as follows. The description of the ADPCM system and the adaptive predictor algorithms is in Section II, and the description of the channel coding scheme is given in Section III. The results of the Monte Carlo simulations of the system are presented in Section IV, and comparisons to a CVSD system are in Section V.
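As a quick check on the channel coding arrangement described in the Introduction, the bit budget can be worked out numerically. The sketch below assumes one sign bit and one magnitude bit per sample (consistent with a four-level quantizer at 2 bits/sample) and uses the rate 2/3 code on the magnitude bits that the Conclusions mention. The function name and its parameters are hypothetical.

```python
from fractions import Fraction


def coded_bit_rate(sample_rate_hz, sign_bits=1, magnitude_bits=1,
                   magnitude_code_rate=Fraction(2, 3)):
    """Return (source_rate, channel_rate) in bit/s.

    Only the magnitude bits pass through the channel code; the sign bits
    are transmitted unprotected.
    """
    sign_rate = Fraction(sample_rate_hz) * sign_bits
    magnitude_rate = Fraction(sample_rate_hz) * magnitude_bits
    source_rate = sign_rate + magnitude_rate
    channel_rate = sign_rate + magnitude_rate / magnitude_code_rate
    return source_rate, channel_rate


source, channel = coded_bit_rate(6400)
overall_rate = source / channel
print(source, channel, overall_rate)  # prints: 12800 16000 4/5
```

Under these assumptions, 6.4 kbit/s of sign bits plus 9.6 kbit/s of coded magnitude bits fill the 16 kbit/s channel, and the overall code rate is 4/5, which matches the figure quoted in the text.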
II. SYSTEM DESCRIPTION

A block diagram of the ADPCM system is shown in Fig. 1. In this system, the linear predictor P generates a predicted value $\hat{s}(i \mid i-1)$ of the current input sample $s(i)$, and the quantized difference between these two, $e_q(i)$, is transmitted to the receiver. The reconstructed sample $\hat{s}(i)$ is the sum of the quantized residual and the predicted value of the input. The predicted value consists of a linear combination of past reconstructed samples given by

$$\hat{s}(i \mid i-1) = \sum_{n=1}^{N} a_n(i-1)\,\hat{s}(i-n) \tag{1}$$

or in vector notation by

$$\hat{s}(i \mid i-1) = \hat{S}_N^{T}(i-1)\,A(i-1) \tag{2}$$

where $A(i)$ is the vector of predictor coefficients and $\hat{S}_N(i-1)$ is the vector of previous reconstructed values,

$$\hat{S}_N(i-1) = [\hat{s}(i-1),\ \hat{s}(i-2),\ \ldots,\ \hat{s}(i-N)]^{T}. \tag{3}$$
In the above equations, the term N is referred to as the predictor order. An adaptive predictor consists of an implementation of (2), plus an algorithm for updating the coefficients $A(i)$. The algorithms used in this work use the mean squared error criterion for determining the optimum set of predictor coefficients. For a fixed predictor, the coefficients are set equal to a constant set,

$$A(i) = A^{*}. \tag{4}$$

For the fixed predictor considered in this work, a second-order predictor was used with the coefficients computed from McDonald [5]. This choice of fixed predictor is in contrast to the work of Honig and Messerschmitt [6], who reoptimized the fixed predictor for each input data set in their comparisons.

0090-6778/85/0100-0074\$01.00 © 1985 IEEE

REININGER AND GIBSON: BACKWARD ADAPTIVE LATTICE AND TRANSVERSAL PREDICTORS 75

Fig. 1. ADPCM system.

... algorithm is given by [9]

[The update equations ending in (7) are garbled in the source; the only legible fragments are what appear to be the reconstructed-sample vector $\hat{S}_N(i-1)$ and a denominator of the form $10 + D(i-1)$.]

where [definition illegible] and $D(0) = 0$. This gradient algorithm will be referred to as the least mean square (LMS) transversal predictor. The term $D(i)$ in (7) is the exponentially weighted sum of the squared reconstructed sequence and acts as the automatic gain control in the algorithm. The second term in the denominator of (7) prevents a possible division by zero. The Kalman adaptive algorithm [10], [11] is given by [equations missing from source]
A. Quantizer

For all of the ADPCM systems discussed in this paper, a four-level robust Jayant quantizer [7], [8] was used, for which the quantizer output can be coded into 2 bits/sample without the use of entropy coding. In this quantizer, the step size $\Delta$ is updated at each sample, depending on the present step size and the quantizer level that was used for that sample,

$$\Delta(i+1) = \Delta^{\beta}(i)\,M[L(i)]. \tag{5}$$

The expansion/contraction factor $M$ depends only on the level of the quantizer, $L(i)$, with $M > 1$ for outer quantizer levels and $M < 1$ for inner levels. Thus, whenever an outer level is used, the quantizer expands, and when an inner level is used, the quantizer contracts. The exponential damping factor $\beta$ is included to allow the quantizer to "forget" its state in the distant past and, thus, recover from channel errors. As $\beta$ decreases, the time constant for recovery decreases, but so does the dynamic range of the quantizer. The value of $\beta$ must be a tradeoff between error-free performance and noisy-channel recovery.
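To make the interplay between the prediction (2), the fixed coefficient set (4), and the step-size recursion (5) concrete, here is a minimal Python sketch of a backward adaptive ADPCM loop. The numerical values are illustrative assumptions, not values from the paper: the multipliers (0.8 inner, 1.6 outer), the damping factor 0.98, the decision threshold at one step size, the reconstruction levels, and the predictor coefficients are all hypothetical stand-ins (the paper takes its fixed coefficients from McDonald [5]).

```python
import math

BETA = 0.98                  # assumed damping factor beta in (5)
M_INNER, M_OUTER = 0.8, 1.6  # assumed multipliers: M < 1 inner, M > 1 outer
COEFFS = (1.0, -0.3)         # hypothetical fixed second-order predictor A*


def quantize(e, delta):
    """Four-level quantizer: return (sign, is_outer) for residual e."""
    return (1 if e >= 0 else -1), abs(e) > delta


def dequantize(sign, is_outer, delta):
    """Map a (sign, magnitude) code back to a reconstruction level."""
    return sign * (1.5 if is_outer else 0.5) * delta


def step_update(delta, is_outer):
    """Step-size recursion (5): delta(i+1) = delta(i)**beta * M[L(i)]."""
    return (delta ** BETA) * (M_OUTER if is_outer else M_INNER)


def decode(codes, delta0=1.0):
    """Rebuild the signal from the 2-bit codes alone (no side information)."""
    delta, past, out = delta0, [0.0, 0.0], []
    for sign, is_outer in codes:
        pred = COEFFS[0] * past[0] + COEFFS[1] * past[1]  # equation (2)
        recon = pred + dequantize(sign, is_outer, delta)
        out.append(recon)
        past = [recon, past[0]]
        delta = step_update(delta, is_outer)
    return out


def encode(samples, delta0=1.0):
    """Encode samples while tracking the same state the decoder will hold."""
    delta, past, codes = delta0, [0.0, 0.0], []
    for s in samples:
        pred = COEFFS[0] * past[0] + COEFFS[1] * past[1]
        sign, is_outer = quantize(s - pred, delta)
        codes.append((sign, is_outer))
        recon = pred + dequantize(sign, is_outer, delta)
        past = [recon, past[0]]
        delta = step_update(delta, is_outer)
    return codes


samples = [100.0 * math.sin(2 * math.pi * 0.03 * i) for i in range(400)]
codes = encode(samples)
good = decode(codes, delta0=1.0)
bad = decode(codes, delta0=50.0)  # decoder started with the wrong step size
```

Because both decoders see the same codes, the logarithm of the ratio of their step sizes is multiplied by the damping factor at every sample, so the mismatched decoder drifts back toward the correct one. This is the recovery mechanism the text attributes to the damping factor, shown here only for a step-size mismatch rather than for actual channel errors.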
TABLE I
SENTENCES USED IN SIMULATIONS

1. The pipe began to rust while new. (Female)
2. Add the sum to the product of these three. (Female)
3. Oak is strong and also gives shade. (Male)
4. Thieves who rob friends deserve jail. (Male)
5. Cats and dogs each hate the other. (Male)
TABLE II
COMPARISON OF NOISE-FREE PERFORMANCE OF ADPCM SPEECH CODERS WITH DAMPING IN THE PREDICTOR ALGORITHMS: SEGSNR PERFORMANCE MEASURE, N = 4, W = 0.99

            Lattice           Transversal
Sentence    LS      LMS       Kalman  LMS       Fixed   CVSD
1           16.12   15.25     16.03   15.98     14.87   13.82
2           16.41   16.07     15.79   15.80     14.54   n/a
3           14.19   13.59     12.79   12.91     11.94   n/a
4           14.65   14.23     12.81   12.58     11.38   n/a
5           14.15   14.03     12.70   13.18     11.83   12.39
Fig. 4. Noisy channel performance of damped LMS lattice, LMS transversal, and fixed predictors. (a) Sentence 1. (b) Sentence 5. (Horizontal axis: bit error rate.)
TABLE III
RANKING OF NOISE-FREE SPEECH CODERS WITH DAMPING IN THE PREDICTOR ALGORITHM

SEGSNR                Subjective
LS Lattice            LS Lattice
LMS Lattice           Kalman Transversal
LMS Transversal       LMS Lattice
Kalman Transversal    LMS Transversal
CVSD                  CVSD
Fixed                 Fixed
In all of the systems, there is a knee in the performance curves at a bit error rate of [value illegible], and the effects of the channel errors begin to become noticeable at this rate. This rolloff point is primarily a function of the errors in the quantizer step size which result from transmission errors. Since the system performance at this error rate is limited by the quantizer, increasing the damping in the predictor algorithms results in only marginal improvement in system performance for the high error rates. The output speech is still intelligible at error rates as high as [value illegible], but the output is very noisy. In some applications, there is a need for systems that operate at error rates greater than [value illegible]. Although the above systems will operate at these high error rates, the output quality of the speech is very poor and they would not be suitable for high noise environments.
Fig. 3. Noisy channel performance of damped LS lattice, Kalman transversal, and fixed predictors. (a) Sentence 1. (b) Sentence 5. (Horizontal axis: bit error rate.)

B. Performance with Channel Coding
To verify that magnitude bits have a much greater impact on system performance, the system was tested by alternately allowing channel errors only in the sign bits and only in the magnitude bits. The results of the tests for an LS lattice system with a 6.4 kHz sampling rate are in Table IV. Included in the comparison are the results for an ADPCM system with no restrictions on the location of the channel errors. The bit error rate given in the table is the probability of error for the specified bit. For the cases where only one bit is allowed to have errors, the actual bit error rate for the system would be half of that given in the table. The results in Table IV clearly show that errors in the magnitude bits are the primary source of loss in the system. Even error rates as high as $10^{-2}$ in the sign bit cause minimal loss in system performance.

TABLE IV
EFFECT OF POSITION OF BIT ERRORS IN AN ADPCM SYSTEM: LS LATTICE PREDICTOR
(Column headings: Sentence; bit error rate; position of bit errors: Both, Magnitude, Sign. The bit error rate column did not survive extraction and the assignment of rows to columns is uncertain, so the values are listed in extraction order.)

14.98   14.98   14.98   14.75   14.35   12.81
13.62   10.86   5.21
13.17   13.01   12.59   11.26
13.17   12.09   10.00   6.06
13.18   9.98    5.62
13.17   11.70   9.28    6.28

The noisy channel performance of the partial coding scheme for an ADPCM system was determined by a Monte Carlo simulation of the system. Since the LS lattice predictor had the best performance of all the predictors, it is used in the system with selective coding. The results of the simulations for Sentences 1 and 5 are shown in Fig. 5. Included in this figure are the results for the 8 kHz ADPCM system without channel coding. Also, for Sentences 1 and 5, the performance of a 16 kHz CVSD system is included. It should be emphasized that the transmission rate of all of the systems in Fig. 5 is 16 kbits/s. The performance of the CVSD system was determined by low-pass filtering and down-sampling the input and calculating the SEGSNR for the down-sampled data. By doing this, the performance of the CVSD system can be compared to the 8 kHz ADPCM system. The performance of the 8 kHz ADPCM system is about 1.5 dB better than the 6.4 kHz ADPCM system, but there is very little subjective difference between the two. The performance of the 8 kHz system without coding starts to roll off at about 3 × [value illegible], while the 6.4 kHz system with coding starts to roll off at about $3 \times 10^{-3}$. Consequently, the 6.4 kHz system has better performance than the 8 kHz system for error rates of [value illegible] and greater. For Sentences 1 and 5, the performance of the 6.4 kHz ADPCM system is uniformly better than the CVSD system. The same cannot be said for the 8 kHz system. For Sentence 1, the 8 kHz ADPCM system is uniformly better than the CVSD system, but for Sentence 5, the CVSD system does better for high error rates.

Fig. 5. Noisy channel performance of 16 kbps speech coding systems. (a) Sentence 1. (b) Sentence 5. (Curve labels recovered: LS lattice; LS lattice, 6.4 kHz.)

There is a slight difference in the subjective quality of the two ADPCM systems, with the 8 kHz system sounding a little better than the 6.4 kHz system. Both ADPCM systems sounded better than the CVSD system. For noisy channels, the difference between the two ADPCM systems was not noticeable for error rates between [value illegible] and [value illegible]. For error rates of 3 × [value illegible] and [value illegible], there is a clear difference in the quality of the two systems. At a [value illegible] error rate, the output of the 8 kHz ADPCM system without channel coding was very noisy, but was still intelligible. The output of the 6.4 kHz system was only slightly degraded at the $10^{-2}$ error rate, and listening to it was not unpleasant or annoying.

The results of this section indicate that for applications where a very noisy channel is expected, a 6.4 kHz ADPCM system with an LS lattice predictor and selective channel coding would be a very good candidate. The quality of the 6.4 kHz system is only slightly less than the 8 kHz system for a noise-free channel, and significantly better for very noisy channels. Furthermore, the performance of the 6.4 kHz ADPCM system is uniformly better than a CVSD system which operates at the same transmission rate.

V. COMPARISON TO CVSD

In Section III the performance of ADPCM systems with different adaptive predictors was compared and discussed. It was shown that the ADPCM system with an adaptive LS lattice predictor was the best among the adaptive predictors considered. In Section IV, a partial channel coding scheme was introduced to improve the performance of the ADPCM system for very noisy channels, and the noisy channel performance of the LS lattice ADPCM system was compared to the CVSD system. In this section, further comparisons are made between the LS lattice ADPCM system and the CVSD system for the 16 kbit/s transmission rate. The comparison to the CVSD system is important because this system is the only 16 kbit/s waveform coder that has actually been manufactured and sold off the shelf, and thus it serves as a benchmark for comparison of other systems. The comparisons in this section include dynamic range performance, the effects of tandeming, and the effects of background noise.

CVSD is an adaptive delta modulation (DM) system that has proven to be robust to channel errors. Adaptive DM systems have the same structure as the ADPCM system shown in Fig. 2, but they have a 1 bit (two-level) adaptive quantizer, and a fixed first-order predictor. In a DM system, the sampling rate and transmission rate are the same, so a 16 kbit/s system has a 16 kHz sampling rate. For speech signals, this sampling rate is well above the Nyquist rate of 8 kHz. In a CVSD system, the quantizer step size is slowly varied with changes of the input sequence. The quantizer adaptation algorithm is

$$\Delta(i+1) = \beta[\Delta(i) - \Delta_{\min}] + \Delta_{\min} + g(i) \tag{31}$$

where $\Delta_{\min}$ is the minimum step size and $\beta$ is the syllabic decay constant. The step size increases only when the sign of the prediction residual is the same for the past three samples. The parameters used in the simulation were $\beta = 0.990$, $\Delta_{\min} = 10$, and $\varepsilon = 7.4$. The ADPCM system is clearly more complex than the CVSD system, and it is expected that there should be a corresponding increase in performance.
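The step-size recursion (31) can be explored numerically. The sketch below rests on an assumption: the surviving text does not define g(i), so it is taken here to equal a constant increment of 7.4 whenever the last three 1-bit quantizer outputs agree, and zero otherwise. The function and variable names are hypothetical.

```python
BETA = 0.990       # syllabic decay constant quoted in the text
DELTA_MIN = 10.0   # minimum step size quoted in the text
EPSILON = 7.4      # assumed to be the increment contributed by g(i)


def cvsd_step_update(delta, last_three_bits):
    """One step of (31) under the assumed three-bit coincidence rule."""
    coincidence = len(last_three_bits) == 3 and len(set(last_three_bits)) == 1
    g = EPSILON if coincidence else 0.0
    return BETA * (delta - DELTA_MIN) + DELTA_MIN + g


def run(bits, delta=DELTA_MIN):
    """Track the step size over a sequence of 1-bit quantizer outputs."""
    history = []
    for i in range(len(bits)):
        window = bits[max(0, i - 2): i + 1]  # current bit and the two before
        delta = cvsd_step_update(delta, window)
        history.append(delta)
    return history


overload = run([1] * 2000)                     # constant sign: step grows
idle = run([1, 0] * 1000, delta=overload[-1])  # alternating sign: step decays
```

Under these assumptions the step size stays between the minimum step size of 10 and a ceiling of 10 + 7.4/(1 - 0.990) = 750, and it moves between them with a time constant of roughly 100 samples, which is the slow, syllabic behavior the text describes.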
A. Dynamic Range

An important characteristic of a speech coding system is the dynamic range of the system, which is the range of input signal levels for which the system has relatively flat performance. The dynamic range of a system gives an indication of how accurately the system will reproduce both low and high level signals. Dynamic range is an important consideration in speech coding systems, since speech waveforms are nonstationary and have periods where the signal level is high or low.

A commonly used dynamic range test is performed by measuring the performance of the system for a band-limited noise signal (700-1100 Hz) at different amplitudes. The results of the dynamic range tests for an ADPCM system with an LS lattice predictor are shown in Fig. 6. This figure contains the results for a system with an LS lattice predictor for sampling rates of both 8 and 6.4 kHz. For the 8 kHz sampling rate, the dynamic range was found for the system with and without damping in the predictor algorithm. For the 6.4 kHz sampling rate, the dynamic range was found for the system with damping. Included in this figure is the dynamic range of the 16 kbit/s CVSD system. The performance of the CVSD system was determined by finding the SNR for the down-sampled (8 kHz) sequences as described in Section IV. The input signal level is the average power of the input sequence measured in decibels. The 0 dB reference for the curves is the power of a sine wave whose peak value is equal to the maximum allowable input to the ADPCM system.

Fig. 6. Dynamic range for LS lattice systems. (Horizontal axis: input signal power, dB. Curve label recovered: 8 kHz ADPCM.)

TABLE V
TANDEM PERFORMANCE OF ADPCM SYSTEM WITH LS LATTICE PREDICTOR: 8 kHz SAMPLING RATE, W = 0.99, V = 0.98

            Number of Systems in Series
Sentence    1       2       3
1           16.12   12.92   11.07
3           14.19   11.23   9.38
5           14.16   11.33   9.68

For all of the 16 kbit/s ADPCM systems, the performance is flat for about 25 dB of input signal power. The introduction of damping in the 8 kHz system leads to a loss in performance of about 2 dB. Although a comparison of the SNR's for 6.4 and 8 kHz is not strictly valid, the SNR results indicate that performance of the 6.4 kHz system is below the 8 kHz system. Notice that while the performance of the ADPCM systems is flat, the CVSD system peaks at an input level of about -10 dB, and falls off steadily for lower input levels. Even the best performance of the CVSD system is far below that of the ADPCM systems. These results indicate that the ADPCM system is superior to the CVSD system at 16 kbits/s.
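The input-level convention used in the dynamic range test (average power in decibels, with 0 dB set by a sine wave whose peak equals the maximum allowable input) can be written out as a short calculation. This is a sketch of that convention only; the function name, the full-scale value of 1.0, and the test signals are assumptions chosen for illustration.

```python
import math


def input_level_db(samples, max_input):
    """Average power of `samples` in dB relative to a full-scale sine wave."""
    power = sum(x * x for x in samples) / len(samples)
    reference = max_input ** 2 / 2.0  # power of a sine with peak max_input
    return 10.0 * math.log10(power / reference)


# Ten whole periods of a sine wave, at full scale and at half scale.
full_scale = [math.sin(2 * math.pi * k / 64) for k in range(640)]
half_scale = [0.5 * x for x in full_scale]

level_full = input_level_db(full_scale, max_input=1.0)
level_half = input_level_db(half_scale, max_input=1.0)
```

With this reference a full-scale sine sits at 0 dB and halving the amplitude lowers the level by about 6 dB, so an input level such as -10 dB corresponds to a signal whose average power is one tenth of that of the full-scale sine.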
B. Tandem

In the tandem simulation, each system was cascaded with itself, with the output of one stage becoming the input to the next stage. It was assumed that the tandem was asynchronous, that is, although the sampling rates of the systems were the same, the actual sampling instants for each stage were possibly different. This asynchronous tandem was simulated by transforming an output sequence to a new sequence by digital low-pass filtering and introducing a time delay by interpolating between the samples. For all of the systems, a time delay of one-half the sampling period was introduced, since this represents the worst possible delay. In the SNR calculation, an output sequence was always shifted back to the time frame of the sequence it was being compared to.

The results of tandeming tests for the 8 kHz ADPCM, 6.4 kHz ADPCM, and 16 kHz CVSD systems are shown in Tables V, VI, and VII, respectively. For the ADPCM systems, there is a 3 dB drop in performance at the first tandem and another 2 dB drop after the second tandem. On the other hand, the CVSD system shows a 1 dB drop at the first tandem and a small drop at the second tandem. These results indicate that for tandeming, the CVSD system has better performance than the ADPCM systems. Subjective listening tests are in agreement with the SEGSNR measure in Tables V-VII. The ADPCM systems had higher quality than the CVSD system at the output of the first system, and approximately the same performance at the output of the second system. After the third system, however, the output of the CVSD system had higher quality than either of the ADPCM systems.

TABLE VI
TANDEM PERFORMANCE OF ADPCM SYSTEM WITH LS LATTICE PREDICTOR: 6.4 kHz SAMPLING RATE, W = 0.99, V = 0.98

            Number of Systems in Series
Sentence    1       2       3
1           14.66   11.63   9.81
3           13.63   10.30   8.61
5           13.22   10.23   8.68

TABLE VII
TANDEM PERFORMANCE OF A 16 kbit/s CVSD SYSTEM

            Number of Systems in Series
Sentence    1       2       3
1           13.82   12.67   12.43
5           12.39   11.27   10.92

The better performance of the tandemed CVSD systems is probably due to the higher sampling rate for the system. Since the CVSD system is sampled at least twice as fast as the ADPCM system, the maximum time delay caused by asynchronous sampling is at most half the maximum delay for the ADPCM system. This shorter delay results in a smaller difference between the sample values of a waveform and the sample values of the time delayed waveform. Also, a critical factor in the performance of a CVSD system is the slope of the input waveform. For a given adaptive delta modulation (ADM) system, there is a maximum slope for which the system can reliably track the input. If a section of the input waveform has a slope greater than the maximum for the ADM system or the slope changes too quickly for the system to adapt, the system goes into slope overload. For tandemed CVSD systems, slope overload occurs primarily at the first system. There is little slope overload in the following systems, since the slope and rate of change of the slope of the input waveform has been limited by the first system.
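The half-sample offset used in the asynchronous tandem test, and the reason it is smaller in absolute time for the 16 kHz CVSD system, can be illustrated with a short sketch. The paper does not specify its interpolator, so linear interpolation between adjacent samples is an assumption here, as are the function names.

```python
def half_sample_delay(x):
    """Estimate x at half a sample earlier by linear interpolation.

    Linear interpolation is an assumed stand-in for the interpolator used
    in the paper; the sample before the start of x is taken to be zero.
    """
    previous = 0.0
    delayed = []
    for sample in x:
        delayed.append(0.5 * (previous + sample))
        previous = sample
    return delayed


def worst_case_delay_seconds(sample_rate_hz):
    """Half of one sampling period, the worst-case asynchronous offset."""
    return 0.5 / sample_rate_hz


ramp = [float(n) for n in range(10)]
delayed_ramp = half_sample_delay(ramp)        # n - 0.5 for every n >= 1
adpcm_delay = worst_case_delay_seconds(8000)  # 62.5 microseconds
cvsd_delay = worst_case_delay_seconds(16000)  # 31.25 microseconds
```

The ramp makes the delay easy to verify by eye, and the two delay values show the factor of two between the 8 kHz ADPCM and 16 kHz CVSD systems that the text offers as a probable explanation for the better tandem behavior of CVSD.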
C. Background Noise

Up to this point in the paper, only uncontaminated speech inputs have been considered as inputs for the speech coders. In real-life implementations, there is almost always some noise present in the input signal. It is useful to know how the system reacts when the input contains background noise, since sensitivity to noise can put restrictions on the usage of the system. The two types of noisy signals to be considered are additive white noise and multiple speakers. The additive white noise represents ambient noise picked up by a microphone in addition to the desired speech input.

The test for background noise was performed by adding an independent Gaussian noise sequence to a speech input sequence to form a contaminated input sequence. The corresponding reconstructed sequences for the ADPCM and CVSD speech coders were compared to the uncontaminated input sequence. The results of this test are shown in Fig. 7. The input noise power of the system is the variance of the noise sequence represented in decibels. The 0 dB reference is the same as that used in the dynamic range test in Section V-A. The system response to background noise is similar for both systems, with neither system showing a higher sensitivity to the noise than the other. The curves flatten out for input noise power less than -60 dB, with the performance of the ADPCM system about 2 dB higher than the CVSD system. The results of this test are consistent with previous results, which showed that the performance of the ADPCM system was better than the CVSD system. The performance falls steadily for input noise power greater than -60 dB, which can be expected due to the noisy input to the system.

Fig. 7. System performance as a function of background noise. (Horizontal axis: input noise power, dB. Curve label recovered: 8 kHz ADPCM.)

For the multiple speaker test, composite input sequences were generated by adding two sample speech sequences of different weight. The composite sequence $s_{MS}$ was formed by adding Sentence 1 to Sentence 5 according to

$$s_{MS} = s_5 + \varepsilon s_1 \tag{33}$$

where $\varepsilon$ can take on values from 0 to 1. The results of the multiple speaker test are shown in Table VIII. The values in the table are the SEGSNR between the composite signal and the reconstructed sequence. The results indicate that both the ADPCM and CVSD systems can reproduce multispeaker inputs as well as single speaker inputs, with the ADPCM system having better performance for both types of inputs.

TABLE VIII
PERFORMANCE OF CODING SYSTEMS FOR MULTIPLE SPEAKER INPUTS

                 System
Input            ADPCM   CVSD
s5 + 0 s1        14.16   12.39
s5 + 0.2 s1      14.72   13.51
s5 + 0.4 s1      15.00   14.14
s5 + 0.6 s1      15.15   14.00
s5 + 0.8 s1      15.13   14.00
s5 + 1 s1        15.52   13.85

VI. CONCLUSIONS

In this paper, the use of adaptive predictors in ADPCM systems was evaluated for coding speech at 16 kbits/s. Both the classical transversal predictor and the more recently developed adaptive lattice predictors were considered. Only backward adaptive predictors were considered so that the system would not have a significant coding delay. When transmitting over noisy channels, it is necessary to include damping in the predictor algorithms to ensure that the receiver tracks the signal at the transmitter. Damping in the predictor algorithm results in a significant gain for the noisy channel performance of the system with only a small loss for the noise-free channel. A simple damping scheme was introduced for the lattice predictor which preserves the stage-independent nature of the algorithm. The lattice predictors were shown to have better performance than the transversal forms for both noisy and noise-free channels. The system performance for all of the predictors fell off rapidly for error rates greater than [value illegible].

The performance of the system at higher error rates was improved by the addition of a channel code to the system. The 16 kbit/s transmission rate was kept constant by reducing the sampling rate of the system to 6.4 kHz, and using a rate 4/5 channel code. Since no acceptable rate 4/5 codes could be found that operated at the high error rates, a hybrid channel coding scheme was devised. This scheme relied on the property of the system in which errors in the sign bits of the quantizer output caused far less distortion than errors in the magnitude bits. Using a rate 2/3 channel code on the magnitude bits only resulted in good performance for error rates up to $10^{-2}$. Only a small loss in performance was incurred by reducing the sampling rate from 8 to 6.4 kHz. The noisy channel performance of the ADPCM system with the LS lattice predictor and channel coding proved to be uniformly better than a 16 kbit/s CVSD system. The ADPCM system has better dynamic range than the CVSD system, and better performance for noisy inputs, but the CVSD system proved to have better tandeming properties. It should be noted that since the ADPCM system has more design parameters than the CVSD system, there is more room for further optimization of the ADPCM system than in the CVSD system.

The results given in this paper indicate that the LS lattice predictor is the best candidate for the implementation of a 16 kbit/s ADPCM system with a backward adaptive predictor. If a less complex system is desired, then the LMS lattice would be a good second choice. If a low noise channel is expected, then the 8 kHz system would be the best choice, but if a high noise channel is expected, then using the 6.4 kHz system with channel coding would be indicated.

REFERENCES

[1] J. L. Flanagan, M. R. Schroeder, B. S. Atal, R. E. Crochiere, N. S. Jayant, and J. M. Tribolet, "Speech coding," IEEE Trans. Commun., vol. COM-27, pp. 710-737, Apr. 1979.
[2] J.-M. Raulin, G. Bonnerot, J.-L. Jeandot, and R. Lacroix, "A 60 channel PCM-ADPCM converter," IEEE Trans. Commun., vol. COM-30, pp. 567-573, Apr. 1982.
[3] D. W. Petr, "32 kbps ADPCM-DLQ coding for network applications," in Conf. Rec., IEEE Global Telecommun. Conf., Miami, FL, Nov. 29-Dec. 2, 1982, pp. A8.3.1-A8.3.5.
[4] T. Nishitani, S. Aikoh, T. Araseki, K. Ozawa, and R. Maruta, "A 32 kb/s toll quality ADPCM codec using a single chip signal processor," in Proc. Int. Conf. Acoust., Speech, Signal Processing, Paris, France, May 3-5, 1982, pp. 960-963.
[5] R. A. McDonald, "Signal-to-noise and idle channel performance of DPCM systems-Particular application to voice signals," Bell Syst. Tech. J., vol. 45, pp. 1123-1151, Sept. 1966.
[6] M. L. Honig and D. G. Messerschmitt, "Comparison of adaptive linear prediction algorithms in ADPCM," IEEE Trans. Commun., vol. COM-30, pp. 1775-1785, July 1982.
[7] N. S. Jayant, "Adaptive quantization with one-word memory," Bell Syst. Tech. J., pp. 1119-1144, Sept. 1973.
[8] D. J. Goodman and R. M. Wilkinson, "A robust adaptive quantizer," IEEE Trans. Commun., vol. COM-23, pp. 1362-1365, Nov. 1975.
[9] D. L. Cohn and J. L. Melsa, "The residual encoder-An improved ADPCM system for speech digitization," IEEE Trans. Commun., vol. COM-23, pp. 935-941, Sept. 1975.
[10] J. D. Gibson, S. K. Jones, and J. L. Melsa, "Sequentially adaptive prediction and coding of speech signals," IEEE Trans. Commun., vol. COM-22, pp. 1789-1796, Nov. 1974.
[11] J. D. Gibson, "Adaptive prediction in speech differential encoding systems," Proc. IEEE, vol. 68, pp. 488-525, Apr. 1980.
[12] J. Makhoul, "A class of all-zero lattice digital filters: Properties and applications," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 304-314, Aug. 1978.
[13] B. Friedlander, "Lattice filters for adaptive processing," Proc. IEEE, vol. 70, pp. 829-867, Aug. 1982.
[14] M. Morf and D. T. Lee, "Recursive least squares ladder forms for fast parameter tracking," Proc. IEEE Conf. Decision Contr., San Diego, CA, Jan. 12, 1979, pp. 1362-1367.
Jerry D. Gibson (S'73-M'73-SM'83) was born in Fort Worth, TX, on May 12, 1946. He received the B.S. degree in electrical engineering from the University of Texas at Austin in 1969 and the M.S. and Ph.D. degrees from Southern Methodist University, Dallas, TX, in 1971 and 1973, respectively.

From 1969 to 1972 he was a Member of the Technical Staff of General Dynamics Corporation, Fort Worth, where he was involved in the design and analysis of radar systems, electronic countermeasures, and control systems. He was a Graduate Research Assistant at Southern Methodist University from 1972 to 1973, where he conducted research on sequential estimation methods for speech data rate reduction. During the 1973-1974 academic year, he was a Postdoctoral Research Associate and then a Visiting Assistant Professor at the University of Notre Dame, Notre Dame, IN. He spent the summer of 1974 as a Summer Faculty Appointee at the Defense Communications Agency, Reston, VA. From 1974 to 1976 he was Assistant Professor of Electrical Engineering at the University of Nebraska-Lincoln. In 1976 he joined the Department of Electrical Engineering at Texas A&M University, College Station, where he currently holds the rank of Professor. He is principal author of the book Introduction to Nonparametric Detection with Applications (New York: Academic, 1975). His research interests include data compression, digital communications, and estimation theory.

Dr. Gibson is a member of Eta Kappa Nu and Sigma Xi. He is Associate Editor for Speech Processing for the IEEE TRANSACTIONS ON COMMUNICATIONS.