IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-28, NO. 4, AUGUST 1980

A Fast Method for Real-Time Median Filtering

E. ATAMAN, MEMBER, IEEE, V. K. AATRE, SENIOR MEMBER, IEEE, AND K. M. WONG

Abstract-A fast real-time algorithm is presented for median filtering of signals and images. The algorithm determines the kth bit of the median by inspecting the k most significant bits of the samples. The total number of full-word comparison steps is equal to the wordlength of the samples. Speed and hardware complexity of the algorithm are compared with two other fast methods for median filtering.

I. INTRODUCTION

The median filtering (MF) algorithm was first suggested by Tukey [1]. It has been applied to smoothing of statistical data by Beaton and Tukey [2], to speech processing by Rabiner et al. [3], Steele and Goodman [4], and Jayant [5], and to image processing by Pratt [6], Frieden [7], and Ataman and Alparslan [8]. Properties of MF were investigated by some of the above authors, and also by Justusson [9]-[11], Tyan [12], and Ataman et al. [13]. More recently, there have been some efforts toward the development of fast methods for application of MF in off-line processing of images by Huang et al. [14] and Garibotto and Lambarelli [15], and fast hardware implementation for real-time image processing by Shamos [16], Eversole et al. [17], and Narendra [18]. Most of the above papers and the state of research on MF are surveyed by Aatre et al. [19]. In this paper we are also concerned with computational aspects of a real-time algorithm and its hardware implementation for median filtering of signals and images.

The median of N elements $x_i$, $i = 1, \cdots, N$ is the tth largest element for N odd and $t = \lceil N/2 \rceil$.¹ In applications for median filtering of signals and images, a window is moved along the sampled values of the signal or the image and the median of the samples within the window is computed. The median value computed in this operation is termed the running (or moving) median. Let, at the nth position (or the nth time instant) of the window, the N elements within the window be denoted by

$$W_N(n) = \{x_1(n), x_2(n), \cdots, x_i(n), \cdots, x_N(n)\}$$

and the running median (RM) of $W_N(n)$ be

$$y_N(n) = \mathrm{RM}[W_N(n)].^2$$

(The indices i, n, and N will be dropped when there is no possibility of ambiguity.)

In the computation of the running medians, after each move (or step) of the window, some new elements enter the window and some old elements leave it. As the size of the window [number of elements in $W_N(n)$] is constant, the number of incoming and outgoing elements at any stage is the same. This number is denoted by q.

In Section II of the paper a fast method of finding the median of N elements is described, and in the following sections this method is applied to the determination of running medians.

II. A RADIX METHOD FOR FINDING THE MEDIAN

The algorithm described here is based on the binary representation of the elements in W. Let the radix-2 representations of an element $x_i$ and the median $y$ be $(b_1^i b_2^i \cdots b_L^i)$ and $(\beta_1 \beta_2 \cdots \beta_L)$, respectively. If the majority of $b_1^i$ $(i = 1, \cdots, N)$ are equal to 1 (0), then, by the definition of the median, $\beta_1 = 1$ (0); that is, we need to know only the first bits (most significant bits) of each of the elements to determine the first bit of the median. With some minor modifications we can extend this idea to the determination of the other bits of the median. In general, in order to determine the first k bits of the median, it would be sufficient to scan only the first k bits of the elements in W.

If $\beta_1 = 1$, then y will be an element of the set of those elements which have $b_1^i = 1$. Let the cardinality of this set be C. If some $b_1^i = 0$, then y is no longer the median of this set, which is the $\lceil C/2 \rceil$th largest element in it, but it is the tth largest element, where t is still $t = \lceil N/2 \rceil$ (if all $b_1^i = 1$, then $C = N$ and y is still the median of the set). If $\beta_1 = 0$, then y will be the $(t - C)$th largest element of the set of those elements which have $b_1^i = 0$. If $\beta_1 = 0$, then $C < \lceil N/2 \rceil$ because otherwise $\beta_1$ would be 1.

Before proceeding any further it will be convenient to introduce the following notation: Let $[r]_k$ be the k bit binary representation of r where $r \le 2^k - 1$. For example, $[3]_5 = 00011$. We define a partition $\Pi[r]_k$ of W as the multiset of all the elements in W whose leftmost k bits are $[r]_k$, where $r = 0, 1, \cdots, 2^k - 1$ and $k = 1, 2, \cdots, L$. When the binary representation of $[r]_k$ is used, it will be enclosed by round parentheses, e.g., $\Pi[2]_3 = \Pi(010)$. Each partition $\Pi[r]_k$ is a union of two other partitions and is defined by

$$\Pi[r]_k = \Pi[2r+1]_{k+1} \cup \Pi[2r]_{k+1} \qquad (1)$$

or

$$\Pi[r]_k = \Pi([r]_k \circ 1) \cup \Pi([r]_k \circ 0) \qquad (2)$$

where $\circ$ is the concatenation operator.³ Let $C(r, k)$ be the cardinality of $\Pi[r]_k$, i.e.,

$$C(r, k) = |\Pi[r]_k| \qquad (3)$$

and let $\alpha$ be the cardinality of $\Pi[r]_k$ when r is odd, i.e.,

$$\alpha(m, k) = |\Pi[2m+1]_k| = C(2m+1, k), \quad m = 0, 1, \cdots, 2^{k-1} - 1. \qquad (4)$$

By the recursion rule (1) we also have

$$C(r, k) = C(2r+1, k+1) + C(2r, k+1). \qquad (5)$$

The "tree of partitions" and the $C(r, k)$ and $\alpha(m, k)$ corresponding to each partition are shown in Fig. 1.

Fig. 1. The tree of partitions, L = 2.

If we assume that all the $\alpha(m, k)$, $m = 0, 1, \cdots, 2^{k-1} - 1$; $k = 1, 2, \cdots, L$ are computed and stored in the memory (we present two fast routines for computing $\alpha(m, k)$ in the next section), then the median can be found by determining its bits $\beta_k$ in succession, starting at $\beta_1$, as follows: Let $t = \lceil N/2 \rceil$. If $\alpha(0, 1) \ge t$, that is, if the majority of $b_1^i = 1$, then $y \in \Pi(1)$ and $\beta_1 = 1$. In this case, continue the search in $\Pi(1)$. If $\alpha(0, 1) < t$, that is, if the majority of $b_1^i = 0$ and $\beta_1 = 0$, then y will be the $(t - \alpha(0, 1))$th largest element in $\Pi(0)$. In this case, set $t \leftarrow t - \alpha(0, 1)$ and continue the search in $\Pi(0)$.

In general, for determining $\beta_k$, starting at $k = 1$, $t = \lceil N/2 \rceil$ and $m = 0$, check if $\alpha(m, k) \ge t$. If so, then set $\beta_k \leftarrow 1$. If $\alpha(m, k) < t$, then set $\beta_k \leftarrow 0$, $t \leftarrow t - \alpha(m, k)$. Repeat this procedure after setting $m \leftarrow 2m + \beta_k$ and $k \leftarrow k + 1$ until $k > L$. (If at any stage $\alpha(m, k) = 1$ and $t = 1$, then one need not continue until $k > L$ as y is the only element in the partition $\Pi[2m+1]_k$.) The decision tree for determination of $\beta_k$ is shown in Fig. 2.

Fig. 2. The binary decision tree for determining $\beta_1$ and $\beta_2$.

Once the elements in the $\alpha$ array are computed, it takes only L comparisons to determine the median. However, it is the computation of the $\alpha$ array that takes time, and this is discussed in the next section.

Example: Let $L = 3$, $N = 9$, and $W = \{0, 1, 6, 3, 7, 3, 2, 4, 1\}$. In binary notation $W = \{000, 001, 110, 011, 111, 011, 010, 100, 001\}$.

$k = 1$: $t = \lceil N/2 \rceil = 5$, $m = 0$
$\Pi[2m+1]_k = \Pi(1) = \{110, 111, 100\}$
$\alpha(m, k) = 3 < t \Rightarrow \beta_1 = 0$, $t \leftarrow t - \alpha(m, k) = 2$, $m \leftarrow 2m + \beta_1 = 0$.

$k = 2$: $t = 2$, $m = 0$
$\Pi[2m+1]_k = \Pi(01) = \{011, 011, 010\}$
$\alpha(m, k) = 3 > t \Rightarrow \beta_2 = 1$, $m \leftarrow 2m + \beta_2 = 1$.

$k = 3$: $t = 2$, $m = 1$
$\Pi[2m+1]_k = \Pi(011) = \{011, 011\}$
$\alpha(m, k) = 2 = t \Rightarrow \beta_3 = 1$.

The median is $y = (011) = 3$, which can be verified by sorting the elements in W: 0, 1, 1, 2, 3, 3, 4, 6, 7.

Manuscript received June 11, 1979; revised December 28, 1979. This work was supported in part by the National Research Council of Canada.
E. Ataman was with the Electronics Division, Marmara Research Institute, Gebze, Turkey, on leave at Nova Scotia Technical College, Halifax, N.S., Canada. He is now with NCR, Waterloo, Ont., Canada.
V. K. Aatre and K. M. Wong are with the Department of Electrical Engineering, Nova Scotia Technical College, Halifax, N.S., Canada.
¹ $\lceil x \rceil$ = the smallest integer greater than or equal to x.
² There might be a multiplicity of elements with identical values in W, that is, W is a multiset. By the definition of the tth largest element z in a multiset S, there are at most $t - 1$ elements in S greater than z, and at least t elements greater than or equal to z.
³ The union of two multisets ($\cup$) includes all the elements of both sets.

0096-3518/80/0800-0415$00.75 © 1980 IEEE

III. COMPUTATION OF THE $\alpha$ ARRAY

We suggest two routines for computation of the $\alpha$ array: the direct routine and a recursive routine.

A. The Direct Routine

We can compute the $\alpha$ array from its definition, that is,

$$\alpha(m, k) = |\Pi[2m+1]_k|, \quad m = 0, 1, \cdots, 2^{k-1} - 1;\ k = 1, 2, \cdots, L. \qquad (6)$$

The $\alpha$ array takes $2^L$ memory locations. However, storing $\alpha(m, k)$ as a two-dimensional $(2^{L-1}, L)$ array would take $L/2$ times more memory locations, so it is mapped onto a one-dimensional array defined by

$$\alpha(m, k) \rightarrow P(u) \qquad (7)$$

where

$$u = m + 2^{k-1}. \qquad (8)$$

(The flowchart for computing $P(u)$ is given in Fig. 3.)
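The direct routine and the bit-by-bit search of Section II can be sketched together in a few lines of Python. This is only an illustrative sketch, not the flowchart of Fig. 3: the function names `build_p_array` and `median_from_p` are hypothetical, samples are assumed to be non-negative integers below $2^L$, and N is assumed odd.

```python
def build_p_array(window, L):
    """Direct routine: P[u], with u = m + 2**(k-1), stores alpha(m, k)."""
    P = [0] * (1 << L)                   # index 0 is never used
    for x in window:
        for k in range(1, L + 1):
            prefix = x >> (L - k)        # leftmost k bits of x
            if prefix & 1:               # k-th bit is 1, so prefix = 2m + 1
                m = prefix >> 1
                P[m + (1 << (k - 1))] += 1
    return P


def median_from_p(P, N, L):
    """Determine the median bits in succession using L comparisons."""
    t = (N + 1) // 2                     # t = ceil(N/2)
    m = 0
    for k in range(1, L + 1):
        a = P[m + (1 << (k - 1))]        # alpha(m, k)
        if a >= t:
            bit = 1
        else:
            bit = 0
            t -= a
        m = 2 * m + bit                  # m accumulates the bits found so far
    return m                             # after k = L, m is the median itself


W = [0, 1, 6, 3, 7, 3, 2, 4, 1]          # the example of Section II
print(median_from_p(build_p_array(W, 3), len(W), 3))  # prints 3
```

The inner loop of `build_p_array` performs one bit check per element per bit position, and at most one increment.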

B. The Recursive Routine

Another routine for computing the $\alpha$ array is as follows. Compute all the $C(r, L)$ for $r = 0, 1, \cdots, 2^L - 1$. This can be done easily since $C(r, L)$ is the histogram of $x_i$ for $r = x_i$. Then compute the other values of $C(r, k)$ for $k < L$ using the recursive relation (5) or

$$C(r, k-1) = C(2r+1, k) + C(2r, k) \qquad (9)$$

where $r = 0, 1, \cdots, 2^{k-1} - 1$ and k is taken successively as $k = L, L-1, \cdots, 2$. The $\alpha$ array can be determined from the $C(r, k)$ array as $\alpha(m, k) = C(2m+1, k)$.
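As a rough illustration of the recursive routine, the following Python sketch builds the histogram $C(r, L)$, folds it downward with relation (9), and reads off $\alpha(m, k) = C(2m+1, k)$. The function name `alpha_recursive` and the dictionary layout are assumptions made for this sketch, not part of the paper.

```python
def alpha_recursive(window, L):
    """Return {(m, k): alpha(m, k)} computed via the C(r, k) recursion."""
    # C[k][r] = number of elements whose leftmost k bits equal r
    C = {L: [0] * (1 << L)}
    for x in window:                     # C(r, L) is the histogram of the samples
        C[L][x] += 1
    for k in range(L, 1, -1):            # k = L, L-1, ..., 2, as in relation (9)
        C[k - 1] = [C[k][2 * r + 1] + C[k][2 * r] for r in range(1 << (k - 1))]
    return {(m, k): C[k][2 * m + 1]
            for k in range(1, L + 1)
            for m in range(1 << (k - 1))}


alpha = alpha_recursive([0, 1, 6, 3, 7, 3, 2, 4, 1], 3)
print(alpha[(0, 1)], alpha[(0, 2)], alpha[(1, 3)])  # prints 3 3 2
```

For clarity the sketch keeps every level of $C(r, k)$ in its own list rather than overwriting storage in place.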

Fig. 3. The flowchart for computing $P(u)$.

Fig. 4. The flowchart for updating $P(u)$ for finding the running median.
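The update that Fig. 4 refers to can be sketched as follows, under the assumption that an incoming element increments, and an outgoing element decrements, every $P(u)$ on its path that corresponds to a 1 bit. The class `RadixRunningMedian` and the helper `running_median` are hypothetical names for this one-dimensional sketch, and N is assumed odd.

```python
class RadixRunningMedian:
    """Keeps the P array current as elements enter and leave the window."""

    def __init__(self, L):
        self.L = L
        self.P = [0] * (1 << L)          # P[u], u = m + 2**(k-1); P[0] unused
        self.N = 0

    def _touch(self, x, delta):
        for k in range(1, self.L + 1):   # L bit checks per element
            prefix = x >> (self.L - k)
            if prefix & 1:
                self.P[(prefix >> 1) + (1 << (k - 1))] += delta

    def enter(self, x):
        self._touch(x, +1)
        self.N += 1

    def leave(self, x):
        self._touch(x, -1)
        self.N -= 1

    def median(self):
        t, m = (self.N + 1) // 2, 0
        for k in range(1, self.L + 1):   # L comparisons
            a = self.P[m + (1 << (k - 1))]
            if a >= t:
                m = 2 * m + 1
            else:
                t -= a
                m = 2 * m
        return m


def running_median(signal, N, L):
    rm = RadixRunningMedian(L)
    for x in signal[:N]:
        rm.enter(x)
    out = [rm.median()]
    for n in range(N, len(signal)):      # here q = 1 element enters and leaves
        rm.leave(signal[n - N])
        rm.enter(signal[n])
        out.append(rm.median())
    return out
```

For a two-dimensional window the same `enter` and `leave` calls would simply be issued for each of the q elements of the incoming and outgoing column.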

Although it takes $2^{L+1}$ memory locations for storing the $C(r, k)$ array, this can be reduced by half if the computations for (9) are done in the locations of $C(r, k)$ where r is even.

Run-times of both routines can be analyzed by listing the number of operations involved. If we ignore any register level addition/subtraction/shift or index limit checking operations, the main operations for the direct routine are

1) $L \cdot N$ bit checking operations, and
2) $L \cdot N$ memory location increment operations (this is in the worst case).

The main operations for the recursive routine are

1) N memory location increment operations for computing $C(r, L)$, and
2) approximately $2^L$ addition operations for computing $C(r, k)$, $k < L$.

[...]

Updating the P array and computation of the running median involves

1) $2qL$ bit checking operations,
2) $qL$ memory location increments and $qL$ decrements, and
3) L comparison operations.

Therefore, if $q < \lceil N/2 \rceil$, computation of running medians is faster than computation of the median of N elements.

V. HARDWARE CONSIDERATIONS

The hardware required for computing the running medians is the following: $2^L$ words of memory and a logic unit for arithmetic and logic operations.

The algorithm is suitable for parallel computations, which would speed up the computation of the $\alpha$ array by a factor of L. However, in this case, L logic and memory units are [...]

[...] 5. Therefore, a direct application of the selection networks is impractical. In view of this, Shamos suggested generalization of an idea, due to Tukey [22], for computation of an approximation $A(u)$ to the exact median, where $u^2 = N$ (assuming $\sqrt{N}$ is an integer).


The algorithm is applied as follows: At first the N elements within the window are partitioned into u groups of u elements each. The exact median $B_j$, $j = 1, \cdots, u$ of each group is found by using u selection networks $\gamma_j(u)$, $j = 1, \cdots, u$. After all the $B_j$ are found, one of the selection networks, say $\gamma_1(u)$, is used for finding $A(u)$ as the median of the medians (of the partitions) $B_j$, $j = 1, 2, \cdots, u$. This entity, median of the medians, has been called "ninther" for $u^2 = 9$ by Tukey [22], and it is an approximation of the exact median y.

Let the tth largest element of N elements be $\theta(t)$; then the rank (or order) of $\theta(t)$ is t and the rank of y is $\lceil N/2 \rceil$. For $N = 25$, $y = \theta(13)$. By simulation, Shamos computed the probability density of the rank of $A(5)$. He found that

$$P(A(5) = \theta(13)) = 0.291 \qquad (11)$$

$$P(A(5) = \theta(12)) = P(A(5) = \theta(14)) = 0.2162. \qquad (12)$$

Therefore, the rank of $A(5)$ is within 1 rank of y with a probability of 0.72.

However, the difference between the value of $A(5)$ and y can be as large as $2^L - 1$. For example, consider the case

$$x_1 = x_2 = \cdots = x_9 = 0 \quad \text{and} \quad x_{10} = x_{11} = \cdots = x_{25} = 255, \qquad (13)$$

which means $y = 255$ and $\theta(t > 10) = 255$. However, it is possible to have $A(5) = 0$. A case like (13) can occur at the edges of the objects where there are sharp discontinuities in the gray levels. Therefore, this algorithm may cause large errors near the edges.

Shamos maintains that 35 comparators are needed for computation of $A(5)$, as $Z(5) = 7$, and there are five $\gamma(5)$ selection networks. Actually, fewer comparators would be needed for the computation of the running medians if one can partition the N elements such that some of the $B_j$, $j = 1, \cdots, u$ would remain unchanged after each move of the window. This can be done if $q = \sqrt{N}$ (as is the case in square windows). In this case, if all the incoming elements are put into the partition of all the outgoing elements, then the median of only this partition needs to be computed. Therefore, $Z(q)$ comparators would be sufficient after the first move of the window. If $T(u)$ is the number of comparison steps (time-delay) in $\gamma(u)$, then a total of $2T(u)$ steps would be required. For $N = 25$, $2T(5) = 12$.

The analog selection network method suggested by Narendra [18] is similar to the digital selection network method in principle, but makes use of analog selection networks. These networks were first proposed by Morgan [23] for selecting the tth largest element by using diode comparator networks. As the amount of hardware in these networks increases exponentially with N, they are feasible only for small values of N, e.g., $N < 9$. In order to circumvent the prohibitively complex hardware for large values of N and also to improve the speed of the algorithm, Narendra suggested a "separable median filter" for pictures. In this algorithm the picture is first filtered rowwise using a $1 \times \sqrt{N}$ (horizontal) line window and then filtered columnwise using a $\sqrt{N} \times 1$ (vertical) line window. This operation is the same as finding the median of the medians, but application of the algorithm first rowwise and then columnwise would make it less liable to large approximation errors (if this is considered as an approximation to the square window). However, the separable filter has merit in itself and should not always be considered as an approximation.

The method given by Huang et al. and Garibotto and Lambarelli is based on the histogram of elements in W. Let $h_n(\omega)$ be the number of elements $x_i(n) = \omega$ and let $\lambda_n(\omega)$ be the number of elements $x_i(n) \le \omega$. By definition of the median, $\lambda_n(y(n)) = \lceil N/2 \rceil$. The algorithm of the method is as follows.

1) After the window is moved to the position $n + 1$, $h_n(\omega)$ and $\lambda_n(y(n))$ are updated, that is, $h_{n+1}(\omega)$ and $\lambda_{n+1}(y(n))$ are computed. Computation of $\lambda_{n+1}(y(n))$ can be carried out easily by incrementing/decrementing $\lambda_n(y(n))$ for each incoming/outgoing element less than or equal to $y(n)$.
2) If $\lambda_{n+1}(y(n)) = \lceil N/2 \rceil$, then $y(n+1) = y(n)$. If $\lambda_{n+1}(y(n))$ is greater/less than $\lceil N/2 \rceil$, then decrement/increment $\omega$ by 1 until $\lambda_{n+1}(\omega) = \lceil N/2 \rceil$; then $y(n+1) = \omega$.

Huang et al. discovered that the average number of comparisons is $2q + 10$. This makes the histogram method a fast off-line algorithm. However, for on-line applications, one has to consider the number of comparisons needed in the worst case, which is $2q + 2^L - 1$. This would happen if $|y(n) - y(n+1)| = 2^L - 1$. Therefore, this method might be somewhat slow for real-time applications.

VII. CONCLUSIONS

We have described three fast methods for finding the running median: the digital and analog selection network, histogram, and radix methods. The three methods are compared in Table I. The analog selection network is not included in the table as none of the comparison criteria in the table are applicable to it.

It is not easy to suggest one method for all cases. Speed of the digital selection network (DSN) method is not dependent on L, although complexity of the hardware is. Therefore, this method would be relatively faster for small N. However, there is no general theory for constructing DSN for large N. Moreover, this is an approximate method.

Speed and hardware complexity of the radix and histogram methods do not depend on N, but on q and L. Therefore, these methods would be more suitable for cases when N is large and q and L are small. Worst case run-time of the histogram method grows exponentially with L, whereas for the radix method it is proportional to L.

The radix method can be used as an approximate method, as well as an exact method. In the approximate case, the approximate median $y_A$ may or may not be in the set W, as opposed to the DSN method, which always picks $y_A$ [or rather $A(u)$] from W. However, in the DSN method the error between the exact median y and the approximate median can be as large as $2^L - 1$, whereas in the radix method the error is guaranteed to be less than $2^{d-1}$. Also, if $L - d \ge \lceil \log_2 N \rceil$ and if all the $2^L$ possible elements are equally likely to be in W (that is, if they are uniformly distributed), then it is highly probable that


TABLE I
COMPARISON OF THE THREE METHODS FOR COMPUTING RUNNING MEDIANS (NUMBERS IN PARENTHESES CORRESPOND TO THE CASE N = 25, q = 5, L = 8, AND d = 4)

Column headings: METHODS; NUMBER OF BIT COMPARISON STEPS; NUMBER OF MEMORY INCREMENTS/DECREMENTS; NUMBER OF FULL-WORD COMPARISON STEPS; MEMORY REQUIREMENTS (WORDS)

Histogram (Exact): full-word comparison steps $2q + 10$ (20) average, $2q + 2^L - 1$ (265) worst case; memory $2^L$ (256)
Radix (Serial) (Exact)
Radix (Parallel) (Exact)
Histogram (Approximation): full-word comparison steps $2q + 10(L-d)/L$ (15) average, $2q + 2^{L-d} - 1$ (25) worst case; memory $2^{L-d}$ (16)
Radix (Serial) (Approximation)
Radix (Parallel) (Approximation)
Digital Selection Network (Approximation)

$y_A$ would be a "pseudomedian" $y_p$, that is, its rank in the set $W_p = W \cup \{y_p\} - \{y\}$ is $\lceil N/2 \rceil$.

The histogram method can also be adapted to computation of a pseudomedian. For this, all we need to do is to ignore the rightmost (least significant) d bits of the elements. This would reduce the number of histogram bins to $2^{L-d}$. Then the number of comparisons in the worst case would reduce to $2q + 2^{L-d} - 1$.
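A minimal Python sketch of the histogram method of Section VI, including the option of ignoring the d least significant bits, is given below. It is a variation rather than the published routine: the stopping tests use inequalities instead of the equality $\lambda = \lceil N/2 \rceil$ so that repeated values are handled, the ignored low bits are returned as zeros, and the name `histogram_running_median` is hypothetical.

```python
def histogram_running_median(signal, N, L, d=0):
    """Running median from a histogram with 2**(L-d) bins (N assumed odd)."""
    h = [0] * (1 << (L - d))             # h[w] = number of elements in bin w
    t = (N + 1) // 2                     # t = ceil(N/2)
    for x in signal[:N]:
        h[x >> d] += 1
    y, lam = 0, h[0]                     # lam = number of elements <= y
    while lam < t:
        y += 1
        lam += h[y]
    out = [y << d]
    for n in range(N, len(signal)):
        old, new = signal[n - N] >> d, signal[n] >> d
        h[old] -= 1                      # outgoing element
        if old <= y:
            lam -= 1
        h[new] += 1                      # incoming element
        if new <= y:
            lam += 1
        while lam - h[y] >= t:           # too many elements below: move y down
            lam -= h[y]
            y -= 1
        while lam < t:                   # too few elements: move y up
            y += 1
            lam += h[y]
        out.append(y << d)
    return out
```

The two `while` loops are where the worst-case cost arises: a jump of the median across the whole range walks through every bin, which is why fewer bins shorten the worst case.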

REFERENCES

[1] J. W. Tukey, "Nonlinear (nonsuperposable) methods for smoothing data," in Conf. Rec., 1974 EASCON, p. 673.
[2] A. E. Beaton and J. W. Tukey, "The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data," Technometrics, vol. 16, pp. 147-185, May 1974.
[3] L. R. Rabiner, M. R. Sambur, and C. E. Schmidt, "Applications of a nonlinear smoothing algorithm to speech processing," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-23, pp. 552-557, Dec. 1975.
[4] R. Steele and D. J. Goodman, "Detection and selective smoothing of transmission errors in linear PCM," Bell Syst. Tech. J., vol. 56, pp. 300-409, Mar. 1977.
[5] N. S. Jayant, "Average and median based smoothing techniques for improving digital speech quality in the presence of transmission errors," IEEE Trans. Commun., vol. COM-24, pp. 1043-1045, Sept. 1976.
[6] W. K. Pratt, Digital Image Processing. New York, NY: Wiley-Interscience, 1978.
[7] B. R. Frieden, "A new restoring algorithm for the preferential enhancement of edge gradients," J. Opt. Soc. Amer., vol. 66, pp. 280-283, 1976.
[8] E. Ataman and E. Alparslan, "Application of median filtering algorithm to images," Electronics Division, Marmara Research Institute, Gebze, Turkey, Tech. Rep. UI 78/10, Sept. 1978.
[9] B. Justusson, "Noise reduction by median filtering," Roy. Inst. Technol., Stockholm, Sweden, Tech. Rep. TRITA-MAT-1978-7, Feb. 1978.
[10] B. Justusson, "Order statistics on stationary random processes, with applications to moving medians," Roy. Inst. Technol., Stockholm, Sweden, Tech. Rep. TRITA-MAT-1979-1, Jan. 1979.
[11] B. Justusson, "Median filtering: Statistical properties," Roy. Inst. Technol., Stockholm, Sweden, Tech. Rep. TRITA-MAT-1979-11, Aug. 1979; to appear in T. S. Huang, Ed., "Two-dimensional transforms and filters," in Topics in Applied Physics. Berlin, Germany: Springer-Verlag, ch. 8.
[12] S. G. Tyan, "Fixed points of running medians" (unpublished report), Dep. Elec. Eng. Electrophys., Polytechnic Institute of New York, Brooklyn, NY, 1977.
[13] E. Ataman, V. K. Aatre, and K. M. Wong, "Some statistical properties of median filters," to be published.
[14] T. S. Huang, G. T. Yang, and G. Y. Tang, "A fast two-dimensional median filtering algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 13-18, Feb. 1979.
[15] G. Garibotto and L. Lambarelli, "Fast on-line implementation of two dimensional median filtering," Electron. Lett., vol. 15, pp. 24-25, Jan. 1979.
[16] M. I. Shamos, "Robust picture processing operators and their implementation as circuits," in Proc. Image Understanding Workshop, Pittsburgh, PA, Nov. 1978, pp. 127-129.
[17] W. L. Eversole, D. J. Mayer, F. B. Frazee, and T. F. Cheek, Jr., "Investigation of VLSI technologies for image processing," in Proc. Image Understanding Workshop, Pittsburgh, PA, Nov. 1978, pp. 191-195.
[18] P. M. Narendra, "A separable median filter for image noise smoothing," in Proc. 1978 Conf. Pattern Recognition and Image Processing, Chicago, IL, May 1978, pp. 137-141.
[19] V. K. Aatre, E. Ataman, and K. M. Wong, "Median filtering," presented at 17th Allerton Conf., Monticello, IL, Oct. 1979.
[20] D. E. Knuth, Sorting and Searching. Reading, MA: Addison-Wesley, 1975.
[21] V. E. Alekseyev, "Sorting algorithms with minimum memory," Kibernetika, vol. 5, pp. 99-103, 1969.
[22] J. W. Tukey, "The ninther, a technique for low-effect robust (resistant) location in large samples," in Contributions to Survey Sampling and Applied Statistics, H. A. David, Ed. New York: Academic, 1978, pp. 251-257.
[23] D. R. Morgan, "Analog sorting network ranks inputs by amplitude and allows selection," Electron. Design, pp. 72-73, Jan. 18, 1973; also "Correction," Electron. Design, p. 1, Aug. 16, 1973.


E. Ataman (S'68-M'75) was born in Malatya, Turkey. He received the B.S.E.E. and M.S.E.E. degrees from the Middle East Technical University, Ankara, Turkey, in 1967 and 1968, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Wisconsin, Madison, WI, in 1972.
From 1973-1979, he was with the Electronics Division of Marmara Research Institute, Gebze, Turkey. During 1979, he was on leave at Nova Scotia Technical College, Halifax, N.S., Canada. He is currently with NCR, Waterloo, Ont., Canada. His research interests are digital signal and image processing, character recognition, fast algorithms, and digital hardware structures.

V. K. Aatre (S'65-M'67-SM'78) was born in Bangalore, India. He received the B.E. degree from the University of Mysore, India, the M.E. degree from the Indian Institute of Science, Bangalore, India, and the Ph.D. degree from the University of Waterloo, Ont., Canada, all in electrical engineering, in 1961, 1963, and 1967, respectively.
From 1967-1968, he was a Postdoctoral Fellow in the Department of Electrical Engineering, University of Waterloo, Ont., Canada. He joined Nova Scotia Technical College, Halifax, N.S., Canada, in 1968 and is currently an Associate Professor. In 1977 he was a Visiting Scientist with the Department of Electrical Engineering, Indian Institute of Science. His teaching and research interests are in network theory, active and digital filtering, and digital signal processing.

K. M. Wong was born in Macao, China. He received the B.Sc. (Eng.) degree from the University of London, London, England, in 1969 and the D.I.C. and Ph.D. degrees from Imperial College, University of London, London, England, in 1972 and 1974, respectively.
He joined the Transmission Division of Plessey Telecommunications Research Ltd., England. In October 1970 he was on leave from Plessey pursuing further studies and research at Imperial College, London University, London, England. In 1972 he rejoined Plessey and was working on digital signal processing and signal transmission. Since 1976 he has been an Assistant Professor with the Department of Electrical Engineering, Nova Scotia Technical College, Halifax, N.S., Canada.
Dr. Wong is a member of the Institution of Electrical Engineers and the Institute of Physics, and a Fellow of the Royal Statistical Society, all of which are in England.

