Estimation of Wavefront Arrival Delay Using the Cross-Power Spectrum Phase Technique

Daniel V. Rabinkin, Richard J. Renomeron, Joseph C. French and James L. Flanagan
Center for Computer Aids for Industrial Productivity, Rutgers Univ.
P.O. Box 1390, Piscataway, NJ 08855-1390
World Wide Web: http://www.caip.rutgers.edu/multimedia/marrays

Presented at the 132nd Meeting of the Acoustical Society of America, Honolulu, HI, USA, December 4, 1996

Abstract

In order to provide spatially selective sound capture in a teleconferencing environment, microphone array systems require accurate determination of the location of the desired source. Sound source location in turn relies on estimating time delay of arrival (TDOA) of a sound wavefront across a given microphone sensor pair. The crosspower spectrum phase (CPSP) method [M. Omologo and P. Svaizer, ICASSP 94] may be used for TDOA estimation. It is desirable to operate microphone array systems in untreated acoustical environments. Such an environment may produce multipath sound propagation and may contain moderate sources of interfering noise. The TDOA estimates computed using CPSP may become unreliable in such an environment. A strategy is presented to extract reliable TDOA information from the CPSP algorithm. It includes: spectral weighting of the CPSP function based on knowledge of noise and sound source statistics; measuring the accuracy of the CPSP function by examining its shape; and evaluating the CPSP only when the captured signal has appropriate energy and spectral characteristics. This strategy enables improved performance, and allows more effective utilization of computing resources in a real-time system.

1 Introduction

Microphone array technology can potentially eliminate the need for close-talking sensors or body-worn sound equipment in high fidelity sound capture applications. In addition, array technology can be used to locate a sound source and obtain its coordinates. The location of the source can be used to automatically aim other devices in such applications as pointing a video-camera at a performer or directing machine gun fire at a sniper. An array can be "steered" via beamforming techniques to provide spatially selective sound capture. Beamforming attains signal gain through in-phase addition of the captured sound signals. The array can be "aimed" at a desired spatial location and provide selective gain to sound originating from that region. Examples of beamforming array systems are presented in [5, 6].

In order to aim the beam at a sound source, the coordinates of the sound source must be known to a certain degree of precision. The spatial resolution and SNR gain provided by a beamforming system grow as a function of the number of microphones in the array. Hence a system with a larger number of microphones requires more accurate location estimates. As advances in DSP and computing technology allow higher channel systems to be built [10], it becomes increasingly critical to derive precise source position estimates.

Time delay of arrival (TDOA) estimation based source location has been the technique of choice for source location in recent array systems [7]. A popular method for TDOA estimation is the Cross-Power Spectrum Phase (CPSP) method of Omologo and Svaizer [8]. A modified version of this algorithm has been used in systems implemented by the authors [9] and is similar in performance to other phase based methods such as [3]. The TDOA technique is based on a two step process as shown in Figure 1. The time differences of arrival of a sound wavefront across selected sensor pairs are computed in the first step. The set of differences is used to estimate the sound source location in the second step. A numerical method such as described in [9] may be used to calculate Cartesian coordinates from the TDOA estimates. The accuracy of the second step is dependent on the error incurred in obtaining a set of time delay estimates. Brandstein [2] derives a relationship between array geometry and error distribution for TDOA-based position estimates that are modeled as zero mean independent identically distributed (IID) Gaussian random variables. However, in a real environment, multipath sound propagation and room noise also affect the accuracy of the TDOA estimates, which in turn affect the accuracy of the position estimates. Two goals are addressed in this work: characterizing the error of TDOA estimation based on the CPSP algorithm as a function of array geometry and acoustic environment, and prescribing techniques to reduce the estimation error.

Figure 1: A two stage source TDOA based source location algorithm (microphone signals X0, X1, ..., XM-1 enter the first step, which produces the TDOAs {Di,j}; the second step, source location, produces the coordinates {X,Y,Z})

2 TDOA Estimation

Sound propagates in air with velocity $V_s$. This velocity is temperature and pressure dependent but under STP conditions is assumed to be 342 meters/second. Consider a microphone pair $M_1$ and $M_2$, and a stationary sound source $S$ as shown in Figure 2. It is assumed that the sound source acts as a perfect point source. Such a source could model the mouth of a talking person, a musical instrument, or any other localized sound generator. The sound source is distance $d_1$ from $M_1$ and distance $d_2$ from $M_2$. Correspondingly, sound will take time $t_1$ to travel from $S$ to $M_1$ and time $t_2$ to travel from $S$ to $M_2$, where $t_1 = d_1/V_s$ and $t_2 = d_2/V_s$. The goal of TDOA estimation is to determine the difference in propagation delay times:

$$D_{12} = t_1 - t_2 = \frac{d_1 - d_2}{V_s} \qquad (1)$$

Figure 2: Sound propagation from source to sensor pair (source $S$; sensors $M_1$ and $M_2$ separated by spread $D$; path lengths $d_1$ and $d_2$; distance $d$ from the source to the pair; arrival angle $\theta$; $t_1 = d_1/V_s$, $t_2 = d_2/V_s$)

If the sound source emits a signal $s(t)$, the microphones will capture the signals:

$$x_i(t) = \alpha_i s(t - t_i) + n_i(t), \qquad i = 1, 2 \qquad (2)$$

The sensor captures the source signal with delay $t_i$ and an attenuation constant $\alpha_i$, which varies with distance $d_i$ and is shown in [4] to be:

$$\alpha_i \propto \frac{1}{d_i} \qquad (3)$$

The noise $n_i(t)$ may be expressed as a sum of two components:

$$n_i(t) = n_{I,i}(t) + n_{R,i}(t) \qquad (4)$$

where $n_{I,i}(t)$ is interference noise due to competing sound sources, and $n_{R,i}(t)$ is the reverberation signal produced due to indirect path sound arrival. The former is generally uncorrelated with $s(t)$ and could be produced by ceiling fans, movement or whispering from other people in a room, or general air circulation, while the latter is due to source-produced sound waves reflecting from surrounding enclosure walls and arriving with delay different from $t_i$. Reflections are often modeled with the image method [1]. A signal received at a sensor due to the source may be written as:

$$M_i(t) = h_i(t) * s(t) \qquad (5)$$

where

$$h_i(t) = \sum_{j=0}^{\infty} c_j \, \delta(t - \tau_j) \qquad (6)$$

and $c_0 = \alpha_i$, $\tau_0 = t_i$ are due to the direct arrival path. Since the direct path is the shortest, $\tau_j > t_i$ for $j \geq 1$. Typically, the magnitude of the reflections will attenuate with time, hence $c_j < \alpha_i$ for all $j \geq 1$. In fact, the magnitude of $c_j$ is shown to decay exponentially with $\tau_j$ in an absorbing room with no obstructions between the source and sensor (i.e., the magnitude of reflections decays exponentially with time). Equation (5) may now be rewritten as:

$$M_i(t) = \left( \alpha_i \, \delta(t - t_i) + \sum_{j=1}^{\infty} c_j \, \delta(t - \tau_j) \right) * s(t) = \alpha_i s(t - t_i) + n_{R,i}(t) \qquad (7)$$

which indicates $n_{R,i}(t)$ is correlated with $s(t)$. It is desired to develop a method to estimate $D_{12}$ from the captured signals $x_1(t)$ and $x_2(t)$. Such a method will be derived in Section 3. It is also desired to gauge the accuracy of the estimation at varying levels of $n_{I,i}(t)$, $n_{R,i}(t)$ and as a function of array source geometry as defined by $\{d_1, d_2, D\}$. This, along with methods to improve the accuracy of the TDOA estimation, will be discussed in Section 4.

3 TDOA Estimation Algorithm

In this section, a method of computing a TDOA estimate from discretely sampled frames of captured microphone signal will be examined. The source produces a signal that can be sampled as

$$s = s_0, s_1, s_2, \ldots \qquad (8)$$

or equivalently:

$$s(i) = s_i \qquad (9)$$

The assumption will be made that all delays $\{t_i\}$ can be sufficiently resolved with an integer number of sampling periods. Then the delays $t_1$ and $t_2$ from equation (1) are equivalent to:

$$t_1 = M_1 T \qquad (10)$$
$$t_2 = M_2 T \qquad (11)$$

where $\{M_i\}$ are integers and where $T$ is the sampling period. The sequence captured at the microphones is:

$$x_1(i) = \alpha_1 s_1(i) + n_1(i) \qquad (12)$$
$$x_2(i) = \alpha_2 s_2(i) + n_2(i) \qquad (13)$$

where

$$s_1(i) = s(i - M_1) \qquad (14)$$
$$s_2(i) = s(i - M_2) \qquad (15)$$

and $\{n_i\}$ are the sampled noise processes described in equation (4). A vector notation is defined where a bold letter represents an $N$-length vector of sequence samples:

$$\mathbf{s}(i) = s(i), \qquad i = 0 \ldots (N-1) \qquad (16)$$

Equations (14) and (15) may be separated and rewritten in the notation of (16) as:

$$\mathbf{s}_1(i) = \hat{\mathbf{s}}_1(i) + \boldsymbol{\eta}_1(i) \qquad (17)$$
$$\mathbf{s}_2(i) = \hat{\mathbf{s}}_2(i) + \boldsymbol{\eta}_2(i) \qquad (18)$$

where:

$$\hat{\mathbf{s}}_j(i) = \begin{cases} s(i - M_j) & i = M_j \ldots (N-1) \\ s(i - M_j + N) & i = 0 \ldots (M_j - 1) \end{cases} \qquad j = 1, 2 \qquad (19)$$

and

$$\boldsymbol{\eta}_j(i) = \begin{cases} 0 & i = M_j \ldots (N-1) \\ s(i - M_j) - s(i - M_j + N) & i = 0 \ldots (M_j - 1) \end{cases} \qquad j = 1, 2 \qquad (20)$$

The $\boldsymbol{\eta}_j$ may be termed window distortion noise for reasons to be described shortly. Equations (12) and (13) now become:

$$\mathbf{x}_1(i) = \alpha_1 \hat{\mathbf{s}}_1(i) + \alpha_1 \boldsymbol{\eta}_1(i) + \mathbf{n}_1(i) \qquad (21)$$
$$\mathbf{x}_2(i) = \alpha_2 \hat{\mathbf{s}}_2(i) + \alpha_2 \boldsymbol{\eta}_2(i) + \mathbf{n}_2(i) \qquad (22)$$
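The split of a delayed frame into a circularly shifted part and window distortion noise, equations (17) through (20), can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `split_delayed_frame` and its parameters are hypothetical, and it assumes the caller supplies enough signal history before the frame so that s(i - M) exists for negative i - M.

```python
def split_delayed_frame(signal, start, N, M):
    """Split one N-sample delayed frame as in equations (17)-(20).

    signal : list of samples of s, covering indices start-M .. start+N-1
    start  : position in `signal` that corresponds to frame index i = 0
    N      : frame length
    M      : integer delay in samples, 0 <= M < N
    Returns (delayed, circular, distortion), each a list of length N.
    """
    def s(i):
        # s(i) in frame coordinates; valid for -M <= i < N
        return signal[start + i]

    # s_j(i) = s(i - M_j): the delayed frame actually seen by the sensor
    delayed = [s(i - M) for i in range(N)]
    # Equation (19): circular shift of the undelayed frame by M samples
    circular = [s(i - M) if i >= M else s(i - M + N) for i in range(N)]
    # Equation (20): window distortion noise, nonzero only for i < M
    distortion = [0.0 if i >= M else s(i - M) - s(i - M + N) for i in range(N)]
    return delayed, circular, distortion


if __name__ == "__main__":
    sig = [float(v) for v in range(20)]
    delayed, circular, distortion = split_delayed_frame(sig, start=4, N=8, M=3)
    print(delayed)
    print(circular)
    print(distortion)
```

Only the first M samples of the frame carry distortion, which is why the text later treats this term as minor when the delay is small relative to the frame length.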

It is easily shown that if:

$$s(i) \leftrightarrow S(k) \qquad (23)$$

then

$$\hat{s}_j(i) \leftrightarrow W_N^{kM_j} S(k) \qquad (24)$$

where $S$ and $s$ are a Discrete Fourier Transform (DFT) pair (see footnote 1) and $W_N = e^{-j2\pi/N}$. The expression for the CPSP TDOA estimator in the discrete finite frame domain is:

$$CPSP = IDFT\left( \frac{X_1 X_2^*}{|X_1| \, |X_2|} \right) \qquad (25)$$

$$D' = \arg\max_i \; CPSP(i) \qquad (26)$$

Footnote 1: The convention will be used that an upper case letter S represents a vector that is the DFT of lowercase vector s. IDFT defines the inverse DFT transform.

If the over-simplistic assumption is made that the terms $\boldsymbol{\eta}_j$ and $\mathbf{n}_j$ are sufficiently close to zero to be negligible, expressions (21) and (22) may be simplified to:

$$\mathbf{x}_1(i) = \alpha_1 \hat{\mathbf{s}}_1(i) \qquad (27)$$
$$\mathbf{x}_2(i) = \alpha_2 \hat{\mathbf{s}}_2(i) \qquad (28)$$

Substituting (27) and (28) into (25) produces (the positive gains $\alpha_1$ and $\alpha_2$ cancel in the normalization):

$$D' = \arg\max_i \; IDFT\left( \frac{\hat{S}_1 \hat{S}_2^*}{|\hat{S}_1| \, |\hat{S}_2|} \right) \qquad (29)$$

The argument of the IDFT operator above may be simplified using (24):

$$\mathrm{arg} = \frac{W_N^{kM_1} S \left( W_N^{kM_2} S \right)^*}{|W_N^{kM_1} S| \, |W_N^{kM_2} S|} = \frac{W_N^{k(M_1 - M_2)} S S^*}{|S|^2} = W_N^{k(M_1 - M_2)} = e^{-j2\pi k (M_1 - M_2)/N} \qquad (30)$$

To compute the TDOA of the arg:

$$D' = \arg\max_i \; IDFT\left( e^{-j2\pi k (M_1 - M_2)/N} \right) = \arg\max_i \; \delta(i - M_D) = M_D \qquad (31)$$

where

$$M_D = \begin{cases} M_1 - M_2 & M_1 \geq M_2 \\ M_1 - M_2 + N & M_1 < M_2 \end{cases} \qquad (32)$$

The $\delta$ function in (31) is the unit pulse. It is 1 at $i = M_D$ and zero at all other $i$; hence it has an unambiguous maximum at $M_D$. If it is assumed that $|D_{12}| < NT/2$, then (1), (10), (11), and (32) can be combined to express $D_{12}$ as:

$$D_{12} = \begin{cases} M_D T & M_D < N/2 \\ (M_D - N) T & M_D \geq N/2 \end{cases} \qquad (33)$$

Finally, if the time resolution of the TDOA estimate needs to be better than a single sampling period, interpolation of the IDFT result is possible in (25). Fitting the IDFT result using the $\sin(x)/x$ function is the simplest way to achieve a noninteger $i$ in (31).

Relaxing the zero noise assumption on the performance of the TDOA estimator allows equations (21) and (22) to be rewritten as:

$$\mathbf{x}_1(i) = \alpha_1 \hat{\mathbf{s}}_1(i) + \mathbf{e}_1(i) \qquad (34)$$
$$\mathbf{x}_2(i) = \alpha_2 \hat{\mathbf{s}}_2(i) + \mathbf{e}_2(i) \qquad (35)$$

where

$$\mathbf{e}_1 = \alpha_1 \boldsymbol{\eta}_1 + \mathbf{n}_1 \qquad (36)$$
$$\mathbf{e}_2 = \alpha_2 \boldsymbol{\eta}_2 + \mathbf{n}_2 \qquad (37)$$

The argument of the IDFT in (25) is now:

$$\mathrm{arg} = \frac{\left( \alpha_1 W_N^{kM_1} S + E_1 \right) \left( \alpha_2 W_N^{kM_2} S + E_2 \right)^*}{\left| \alpha_1 W_N^{kM_1} S + E_1 \right| \, \left| \alpha_2 W_N^{kM_2} S + E_2 \right|}$$

$$= \frac{\alpha_1 \alpha_2 W_N^{k(M_1 - M_2)} S S^*}{\left| \alpha_1 W_N^{kM_1} S + E_1 \right| \, \left| \alpha_2 W_N^{kM_2} S + E_2 \right|} + \frac{\alpha_1 W_N^{kM_1} S E_2^* + \alpha_2 W_N^{-kM_2} S^* E_1 + E_1 E_2^*}{\left| \alpha_1 W_N^{kM_1} S + E_1 \right| \, \left| \alpha_2 W_N^{kM_2} S + E_2 \right|}$$

$$= A + B \qquad (38)$$

The quantities $A$ and $B$ will be used as place holders for the two fractions in (38). Expression (25) can be split up as:

$$D' = \arg\max_i \left( IDFT(A) + IDFT(B) \right) \qquad (39)$$

Notice that $A$ may be rewritten as:

$$A = F(k) \, W_N^{k(M_1 - M_2)} \qquad (40)$$

where

$$F(k) = \frac{\alpha_1 \alpha_2 S S^*}{\left| \alpha_1 W_N^{kM_1} S + E_1 \right| \, \left| \alpha_2 W_N^{kM_2} S + E_2 \right|} \qquad (41)$$

is a real non-negative function of $k$. Let $a(i)$ be the IDFT of $A$:

$$a(i) = \frac{1}{N} \sum_{k=0}^{N-1} F(k) \, W_N^{k(M_1 - M_2)} \, W_N^{-ik} \qquad (42)$$

$$= \frac{1}{N} \sum_{k=0}^{N-1} F(k) \, W_N^{k(M_D - i)} \qquad (43)$$

Since $F(k)$ is real and non-negative:

$$a(M_D) \geq a(i) \quad \text{for all } i \neq M_D \qquad (44)$$

This indicates that $IDFT(A)$ is maximum at $M_D$. In order for the TDOA to be estimated incorrectly, the inequality

$$b(i) \geq a(M_D) - a(i) \quad \text{for some } i \neq M_D \qquad (45)$$

must hold, where $b(i)$ denotes the IDFT of $B$. The error in the TDOA estimate is therefore defined as:

$$E = |\hat{D}_{12} - D_{12}| \, T \qquad (46)$$

where

$$\hat{D}_{12} = \begin{cases} D' & D' < N/2 \\ N - D' & D' \geq N/2 \end{cases} \qquad (47)$$

Low levels of $\{E_1, E_2\}$ tend to drive $a(i)$ to resemble the unit pulse $\delta(i - M_D)$ and $b(i)$ to be small. Under such conditions, (45) will remain false. Larger values of $\{E_1, E_2\}$ may cause incorrect estimates of the TDOA to be produced. The likelihood of (45) is dependent on noise level and spectrum, as well as the spectrum of the source signal. It is also dependent on the geometry of the source relative to the array. The portion of noise $\{\boldsymbol{\eta}_1, \boldsymbol{\eta}_2\}$ in (36) diminishes with $|D_{12}|$ and is zero for $D_{12} = 0$. Thus a broadside orientation of the sensor pair to the source will diminish that portion of the noise. Generally, however, $M_D/N$ is quite small so the portion of noise $\{\boldsymbol{\eta}_1, \boldsymbol{\eta}_2\}$ due to windowing effects is non-dominant. The portion of noise due to interfering sources and reverberation poses a more serious problem. Generally, noise that is spatially distributed tends to cause little harm. Noise that is spatially localized, such as a competing source or a strong reflection, may cause (45) to be true. Data collected in various acoustical environments which characterizes the performance of the CPSP estimator is described in Section 4.

3.1 Peak to Floor Ratio

High signal to noise ratio at the microphone will produce a peaky CPSP vector, and will tend to cause (45) to be false. The peakedness of the CPSP function is therefore a good measure of the reliability of the TDOA estimate. A function is proposed to measure the peakedness of the CPSP:

$$PRR = \frac{CPSP(D')}{\sum_{i=0,\; i \neq D'}^{N-1} |CPSP(i)|} \qquad (48)$$

The performance of the PRR as an estimator of TDOA error is discussed in Section 4.

4 Data Analysis

Speech recordings produced by a source with known coordinates were collected in selected enclosures with an 8 sensor microphone array. The collection was performed using an 8 channel Sigma-Delta A/D system sampling at 16 kHz. Analysis of TDOA estimation results based on these samples is presented in this section. Estimation error is plotted with respect to distance, angle, and spread of the microphone array pair.

4.1 Acoustic Enclosures

Microphone sensors were placed at measured locations in four different chambers with varying acoustical properties. The following enclosures were used:

Anechoic Chamber: An anechoic enclosure was used as a baseline for comparison. The chamber is a 10 × 10 × 10 meter enclosure located at the Murray Hill Bell Laboratories facility and is shown in Figure 3. It has double wall insulation and is padded with fiberglass wedges. Its noise and reflection level is estimated to be at least 50 dB below the excitation speech level at the microphone sensors.

CAIP Laboratory Room: A hard walled room about 4 × 4 meters in dimension and a 2.5 meter ceiling as shown in Figure 4. The room has a fair amount of fan noise (noise level was measured to be 45 dBA) and a moderate reverberation time (approximately 200 ms).

Bellcore Auditorium: An irregularly shaped room about 10 × 8 meters in dimension with 2.5 meter ceiling and partially padded walls as shown in Figure 5. It has a moderate reverberation time and a moderate amount of background noise.

CAIP Auditorium: A sloping room about 10 × 15 meters in dimension and a 4 meter ceiling as shown in Figure 6. The auditorium has hard cement walls and a steel beam ceiling that produce an extremely high reverberation time (approximately 1.2 seconds). The background noise level is moderate (44 dBA).

Figure 3: Bell Labs Anechoic Chamber

Figure 4: CAIP Lab Room

4.2 Excitation Signals

The speech captured by the arrays in the CAIP lab room, CAIP Auditorium, and Bellcore auditorium was generated by live subjects. The mouths of two male talkers were placed at known positions, and the test phrase "I have a question! How much are tomatoes?" was uttered repeatedly. The speech captured in the anechoic chamber was generated using a loudspeaker placed at pre-measured locations. Four pre-recorded sentences from the TIMIT Database [11] were used as loudspeaker excitation signals: two from male talkers and two from female talkers.

Figure 5: Bellcore Auditorium

Figure 6: CAIP Auditorium

Figure 7: RMS TDOA error E vs. arrival angle θ (horizontal axis: angle of arrival off broadside in degrees; vertical axis: error in mean square time in mS; curves for the CAIP Auditorium, Bellcore Auditorium, CAIP Lab Room, and Anechoic Chamber)

4.3 Data Analysis

TDOA performance has been plotted as a function of the sensor geometry parameters D, d, and θ shown in Figure 2. Specifically, root mean square (RMS) TDOA estimation error E in expression (46), and RMS Peak to Average signal ratio PRR are plotted as functions of the mentioned variables for speech gathered in the rooms described in Section 4.1.

Figures 7 and 8 show E and PRR as functions of angle. The first striking feature of Figure 7 is the variation in performance observed between the four rooms. The anechoic chamber, which has virtually no interfering or reverberant noise, produces estimates that are flawless regardless of angle (see footnote 2). The next in quality is the small relatively quiet CAIP lab room followed by a more noisy Bellcore auditorium, and finally by the tremendously reverberant and fairly noisy CAIP auditorium. This pattern repeats itself as a function of D and d as well and indicates the relative overall severity of these enclosures as environments for source location. There is a tendency for E to increase with θ. This is especially pronounced in the larger rooms where reverberation dominates the ambient noise. The PRR in Figure 8 exhibits the reverse behavior of E, producing very low values for the CAIP auditorium and high values for the anechoic chamber and CAIP lab room. There is little apparent dependence of the PRR on the angle of arrival.

Footnote 2: The tiny bump on the curve that may be barely seen at the 50° mark is due to a string of 3 bad frames, which we suspect were caused by something falling on the floor during the recording sequence.

Figure 8: Normalized PRR vs. arrival angle θ (horizontal axis: angle of arrival off broadside in degrees; vertical axis: normalized ratio)

Figures 9 and 10 show E and PRR as functions of distance d. The error shows an increase with distance; this is especially clear in the case of the CAIP lab room environment where interference noise dominates the total noise. The PRR in Figure 10 shows a very clear decreasing behavior with distance d. In fact, d can be accurately estimated based on PRR for a given room.

Figure 9: RMS TDOA error E vs. source to pair distance d (horizontal axis: distance from source to midpoint in meters; vertical axis: error in mean square time in mS; curves for the CAIP Auditorium, Bellcore Auditorium, CAIP Lab Room, and Anechoic Chamber)

Figure 10: Normalized PRR vs. source to pair distance d (horizontal axis: distance from source to midpoint in meters; vertical axis: normalized ratio)

Figures 11 and 12 show E and PRR as functions of sensor spread D. In Figure 11, E indicates a minimum at about a 20-30 centimeter spread for the Bellcore auditorium and CAIP lab room. Insufficient variation in D exists in the CAIP Auditorium data to produce conclusions. The E measured in the anechoic chamber was too low to exhibit noticeable trends with D. Although there is insufficient data to provide clear plots of E as a function of spread for D > 0.5 meters in more severe acoustical environments, the authors observed a clear increase in E for larger spread values. The plot of PRR in Figure 12 displays a marked decrease with greater distance spread, indicating that unequal levels of $\alpha_1$ and $\alpha_2$ produce an overall decrease in PRR.

The distribution of TDOA error fits the Gaussian model very well in the observed data. Plots of TDOA histograms and their Gaussian fits are given in Figures 13 and 14 for the anechoic chamber and the CAIP lab room. There is a very strong connection between the PRR level and the TDOA estimator. Figure 15 plots TDOA error E vs. the PRR level for data collected in the three "real" rooms. Levels of PRR above a normalized 0.2 produce a TDOA error that is well inside the acceptable range. A simple strategy for location improvement suggests discarding, or weighting very lightly, those estimates whose PRRs are too low, when passing TDOA estimates to the second stage of the source location algorithm.
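The strategy of discarding low-PRR estimates can be illustrated with a small end-to-end sketch of the CPSP estimator of (25) and (26) and the peak ratio of (48). This is a minimal illustration under stated assumptions, not the authors' real-time implementation: the function names (`dft`, `cpsp`, `tdoa_and_prr`, `gate`) and the `threshold` and `eps` parameters are hypothetical, a naive O(N²) DFT from the Python standard library stands in for an FFT, the peak index is unwrapped to a signed lag using the convention of (33), and the PRR computed here is the raw ratio of (48), not the normalized value plotted in the figures.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT or inverse DFT of a sequence (O(N^2), for illustration only)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out


def cpsp(x1, x2, eps=1e-12):
    """Cross-power spectrum phase of two equal-length real frames, as in (25).

    eps (an assumption, not from the paper) guards against division by zero
    in bins where either spectrum vanishes.
    """
    X1, X2 = dft(x1), dft(x2)
    arg = [a * b.conjugate() / max(abs(a) * abs(b), eps) for a, b in zip(X1, X2)]
    return [v.real for v in dft(arg, inverse=True)]


def tdoa_and_prr(x1, x2):
    """Return (signed lag in samples, peak ratio) for one frame pair."""
    c = cpsp(x1, x2)
    N = len(c)
    peak = max(range(N), key=lambda i: c[i])        # D' in (26)
    lag = peak if peak < N // 2 else peak - N       # unwrap as in (33)
    floor = sum(abs(c[i]) for i in range(N) if i != peak)
    prr = c[peak] / floor if floor > 0 else float("inf")   # ratio in (48)
    return lag, prr


def gate(estimates, threshold):
    """Keep only (lag, prr) pairs whose peak ratio reaches the threshold."""
    return [(lag, prr) for lag, prr in estimates if prr >= threshold]


if __name__ == "__main__":
    N = 16
    s = [1.0, 0.5] + [0.0] * (N - 2)
    x1 = [s[(i - 5) % N] for i in range(N)]   # circular delay of 5 samples
    x2 = [s[(i - 2) % N] for i in range(N)]   # circular delay of 2 samples
    print(tdoa_and_prr(x1, x2))
```

A suitable threshold depends on the room and on how the PRR is normalized, so it would need to be chosen from data such as that in Figure 15; weighting estimates by PRR instead of discarding them is the other option the text mentions.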

Figure 11: RMS TDOA error E vs. sensor spread D (horizontal axis: sensor spread in meters; vertical axis: error in mean square time in mS; curves for the CAIP Auditorium, Bellcore Auditorium, CAIP Lab Room, and Anechoic Chamber)

Figure 12: Normalized PRR vs. sensor spread D (horizontal axis: sensor spread in meters; vertical axis: normalized ratio)

Figure 13: TDOA histogram for the speech frames collected in the anechoic chamber (histogram and Gaussian fit; horizontal axis: TDOA error in mS; vertical axis: frames)

Figure 14: TDOA histogram for the speech frames collected in the CAIP lab room (histogram and Gaussian fit; horizontal axis: TDOA error in mS; vertical axis: frames)

Figure 15: TDOA mean error E vs. Normalized PRR (horizontal axis: normalized PRR; vertical axis: mean TDOA error in mS; curves for the Bellcore Auditorium, CAIP Auditorium, and CAIP Lab Room)

5 Conclusion

In this paper, several issues involved in the estimation of Time Delays of Arrival (TDOA) have been explored. TDOA estimation is an integral part of the automatic location of sound sources using microphone arrays. Source location can be performed in two steps: the computation of the TDOA estimates and determination of the source coordinates from the TDOA estimates. It has been shown [2] that errors in the position estimates behave as a function of error in the TDOA estimates. Therefore it is critical to obtain accurate TDOA estimates. The accuracy of the TDOA estimation has been shown to deteriorate with distance from source to sensor, with angle measured from the normal to the array, and with increases in room reverberation and ambient noise. A reliability metric has been proposed which can be utilized to intelligently eliminate faulty TDOA estimates. This metric can be used to calculate coordinates using only reliable TDOA estimates, which would increase the accuracy of position estimates.

Future work will include implementing these improvements to the TDOA estimation and incorporating them into the real-time system described in [9] and adding improvements to the coordinate calculation step of the source location process. Furthermore, an improved sound capture system which uses the source location information to steer a beamformer towards the sound source is also being included.

Acknowledgments

This work has been supported by NSF Grant No. MIP-9314625 and by grants from Bellcore and the New Jersey Commission on Science and Technology.

References

[1] J. B. Allen and D. A. Berkley. "Image method for efficiently simulating small-room acoustics." J. Acoust. Soc. Am., 65:943-950, 1979.
[2] M. Brandstein. A framework for speech source localization using sensor arrays. Doctoral Dissertation, Brown University, May 1995.
[3] M. S. Brandstein, J. E. Adcock, J. H. DiBiase, and H. F. Silverman. "A frequency-domain delay estimator for talker location and beamforming with microphone arrays." In Proceedings of ICASSP-1995, pages 3019-3022, Detroit, Michigan, May 1995.
[4] W. Elmore and M. Heald. Physics of waves. Dover Publications, New York, 1969.
[5] J. L. Flanagan, J. D. Johnston, R. Zahn, and G. W. Elko. "Computer-steered microphone arrays for sound transduction in large rooms." Journal of the Acoustical Society of America, 78:1508-1518, November 1985.
[6] W. Kellermann. "A self-steered digital microphone array." In Proceedings of 1992 ICASSP, pages 3581-3584, Toronto, Canada, March 1991.
[7] H. Kim, T. Hughes, D. Mashao, J. Adcock and H. Silverman. "A Real-Time Desktop Microphone Array System for Speech Recognition." Submitted to ICASSP 97.
[8] M. Omologo and P. Svaizer. "Acoustic event localization using a crosspower-spectrum phase based technique." Proceedings of ICASSP-1994, Adelaide, Australia, 1994.
[9] D. Rabinkin, R. Renomeron, A. Dahl, J. French, J. Flanagan, and M. Bianchi. "A DSP Implementation of Source Location Using Microphone Arrays." Proceedings of the SPIE, Denver, August 1996.
[10] H. Silverman, W. Patterson, J. Flanagan, and D. Rabinkin. "The huge microphone array." Report on Grant #MIP-9314625 to the NSF, July 1996.
[11] The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT) Training and Test Data. NIST Speech Disc CD1-1.1.