Sound field capture with microphone arrays, proximity microphones, and optimal filters. P.-A. Gauthier, T. Padois, T. Ramanana, A. Bolduc, Y. Pasco, A. Berry.
Introduction and background: Applied industrial context Objective and methods Mic array and proximity mic Numerical simulations Conclusion Future works
Sound field capture with microphone arrays, proximity microphones, and optimal filters P.-A. Gauthier, T. Padois, T. Ramanana, A. Bolduc, Y. Pasco, A. Berry GAUS, Groupe d’Acoustique de l’Université de Sherbrooke, Québec, Canada CIRMMT, Centre for Interdisciplinary Research in Music, Media, and Technology, McGill University, Montréal, Canada
AES 55th International Conference: Spatial Audio, Helsinki, Finland, August 2014
P.-A. Gauthier et al.
Sound field capture: Mics array and proximity mics
Presentation Plan
1. Introduction and background: Applied industrial context
2. Objective and methods
3. Microphone array and proximity microphone processing
4. Numerical simulations
5. Conclusion
6. Future works and applications
Spatial hearing and injuries? Localizing alarms in noisy environment
Picture: Chesapeake Bay Program, flickr
Spatial hearing and injuries? Technical requirements
Previous works on spatial hearing and injuries
Example: Risks related to audibility in working environments
US Bureau of Labor Statistics, 2002 (Vaillancourt et al., 2012): 6% of fatal accidents (397) in the US construction sector are caused by reversing vehicles equipped with reversing alarms ... why?
Research on alarm audibility has been conducted
From these studies, one notes: Technical challenges and opportunities for spatial audio
Previous works on spatial hearing and injuries
Challenging and limited experimental methods
Their aim: Evaluating alarm localization and risk perception
They rely on simple background ambiance plus loudspeakers to position alarm sounds:
No precise consideration of the spatial sound environment or spatial release from masking
Impossible to conduct rigorous on-site perception evaluations with scene or alarm variations ...
Opportunities for spatial audio application
However, researchers and engineers in the field of workers’ health/security want to conduct extensive listening tests within more realistic sound environments and for parametric studies:
Change: background level, positions, indoor vs outdoor, etc.
Impossible to achieve on an industrial site (security/productivity)
Wave Field Synthesis can solve these issues
Technical requirements: Capture and reproduction
Physically-accurate reproduction
STRICT requirement for certification agencies and engineers
Impossible to rely on auditory illusion or stereo
Physical approaches: Wave Field Synthesis or HOA
Sound capture: Spot microphones vs arrays?
WFS of virtual sources driven by spot microphone signals: Includes near field: Not physically accurate at listening position
WFS using microphone array recording: Physically accurate but challenging: up-mixing, variations
Industry production constraint: Impossible to sample (stop/start) individual noise sources on site
Our proposal: Combining proximity microphones and arrays?
Spot mic + array to facilitate source separation without interrupting the industry production and workers
Objective and methods
General project aim: Reproduction of working and dangerous sound environments using WFS for perceptual studies in the field of workers’ health/security related to alarm audibility
Today’s specific objectives/methods for sound field capture
1. Separate referenced machine sound (with proximity microphones) from background ambiance in microphone array recordings and compensate for near-field effects
2. Test this method on the basis of numerical simulations
3. Evaluate the ability of the method to preserve spatial information at the microphone array using acoustic imaging
Parametric sound environment reproduction by WFS
Figure: Block diagram. Labels: foreground signals yi(n) (I channels), surrounding signals sm(n) (M), microphone array processing (beamforming), WFS loudspeaker array (L), foreground source, surrounding sound environment (as plane waves).
Background or surrounding sound environment
Should first be separated from foreground array signal
Capture can rely on beamforming at the microphone array
Reproduced as far-field sources (plane waves)
Enough plane waves to recreate diffuse/immersive impression
Foreground sound sources
Should first be separated from background array signal
Reproduced as spherical sources (focused or non-focused)
Allow for changes in position, etc.
Foreground sound sources
Challenge: A spherical source cannot be driven by proximity mic signals (as in usual spot mic techniques in music) because the proximity mic signal includes a near field that does not propagate to the listener position: not a physically-accurate approach
Optimal/Wiener filters Acoustic imaging
Foreground and surrounding signal separation
Figure: Foreground and surrounding signal separation at the microphone array for the i-th reference signal and M-microphone array. The reference signal xi(n) from the proximity microphone at foreground source i passes through the SIMO optimal filter Wi to give the foreground signals ymi(n) (M channels); these are multiplied by -1 and added to the array signals dm(n) (M channels) to give the surrounding signals smi(n) (M channels), which contain the surrounding sound environment (background sources).
Derivation of the optimal/Wiener filters ... in a few words
The aim of the mi-th optimal filter:
Perform blind system identification between proximity microphone i and microphone m in the array
Minimize the remaining signal smi
Any part of dm that is not picked up by the proximity microphone will stay in smi
smi is anything that does not correlate with the proximity microphone signal xi
Unconstrained Wiener filter in the frequency domain with averaging:
$\tilde{w}_{mi}(k) = \tilde{S}_{x_i d_m}(k) / \tilde{S}_{x_i x_i}(k), \quad 0 \le k \le L-1$
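As a rough illustration of the unconstrained frequency-domain Wiener filter with averaging, the following NumPy sketch estimates the averaged spectra from Hann-windowed overlapping frames, forms the ratio of cross-spectrum to auto-spectrum for one array microphone, and subtracts the predicted foreground from the array signal. The function names, the frame length `nfft`, the hop size, and the window are illustrative assumptions, not values taken from the presentation.

```python
import numpy as np

def averaged_spectra(x, d, nfft=256, hop=128):
    """Frame-averaged auto-spectrum S_xx(k) and cross-spectrum S_xd(k) (Hann window)."""
    win = np.hanning(nfft)
    starts = range(0, len(x) - nfft + 1, hop)
    X = np.array([np.fft.rfft(win * x[s:s + nfft]) for s in starts])
    D = np.array([np.fft.rfft(win * d[s:s + nfft]) for s in starts])
    s_xx = np.mean(np.abs(X) ** 2, axis=0)
    s_xd = np.mean(np.conj(X) * D, axis=0)
    return s_xx, s_xd

def wiener_separate(x, d, nfft=256, hop=128):
    """Split one array signal d into foreground y (predicted from x) and remainder s."""
    s_xx, s_xd = averaged_spectra(x, d, nfft, hop)
    w = s_xd / s_xx  # unconstrained Wiener filter, one complex value per bin
    # Non-causal FIR: after fftshift, lag 0 sits at index nfft // 2.
    h = np.fft.fftshift(np.fft.irfft(w, n=nfft))
    # Trim the full convolution so that y is time-aligned with x.
    y = np.convolve(x, h)[nfft // 2: nfft // 2 + len(x)]
    return y, d - y
```

For an M-microphone array, the same function would be called once per array channel dm with the same reference xi, giving ymi and smi for m = 1 ... M.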
Performance evaluation by acoustic imaging
Spatial evaluation of separated signals?
1. Foreground array signals ymi and surrounding signals smi should preserve spatial information at the microphone array since spatial post-processing is planned ...
2. To evaluate the spatial image of the separated source: horizontal beamforming maps
Beamforming algorithm used
1. Non-focused beamforming
2. Classical normalized delay-and-sum beamforming
3. Based on the array signal cross-spectral matrix (CSM)
4. Includes time-averaging in the CSM computation
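The four points above can be sketched for a single frequency bin as follows. This is a minimal NumPy sketch assuming plane-wave (non-focused) steering vectors in the horizontal plane; the array geometry, the speed of sound `c`, and all function names are assumptions made for illustration, since the slides do not specify them.

```python
import numpy as np

def csm(snapshots):
    """Time-averaged cross-spectral matrix E[p p^H].

    snapshots: complex array of shape (frames, M), one frequency bin per frame.
    """
    return snapshots.T @ snapshots.conj() / snapshots.shape[0]

def steering(positions, angles_deg, freq, c=343.0):
    """Plane-wave steering vectors, shape (n_angles, M).

    positions: microphone coordinates in metres, shape (M, 2).
    """
    k = 2 * np.pi * freq / c
    th = np.deg2rad(angles_deg)
    u = np.stack([np.cos(th), np.sin(th)], axis=1)  # unit vector per steering direction
    return np.exp(1j * k * (u @ positions.T))

def das_map(C, G):
    """Normalized delay-and-sum output g^H C g / M^2 for each steering direction."""
    M = G.shape[1]
    return np.real(np.einsum('am,mn,an->a', G.conj(), C, G)) / M ** 2
```

A horizontal map in decibels would then be `10 * np.log10(das_map(C, G))`, averaged over the bins of the band of interest (the slides use the 1 kHz octave band).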
Simulated architecture
Figure: Top view of the modeled environment with background sources and a foreground source. Labels: background sound source #1, background sound source #2, foreground sound source, 48-microphone array; marked directions 135° and 176.8°; marked dimensions 21 m and 76 m.
Audio-rate ray-tracing in Blender: E.A.R. plugin
E.A.R, http://www.explauralisation.org, accessed on 2014 January 22
Foreground and background source content
Figure: Power spectral densities of sources (power/frequency [dB/Hz] versus frequency [kHz], logarithmic frequency axis; curves: background source #1, background source #2, foreground source; the 1 kHz octave band used for the maps is marked). Listen to: Foreground, background #1, background #2
Acoustical maps of foreground and background signals
Averaging in the 1 kHz octave band
Figure: Horizontal beamforming maps, level [dB ref 1] versus steering direction [deg] from 0 to 360. (a) Original scene, with marked levels of −58.63 dB and −64.31 dB and the directions 135° and 176.8° indicated. (b) Original scene, original foreground, and extracted foreground. (c) Original scene, original background, and extracted background.
Example of Wiener FIR filter coefficients
Figure: Example of optimal FIR filter coefficients (w11) from proximity microphone to array microphone #1, plotted against time [s].
Extracted signal at microphone #1
Figure: Foreground sound at microphone #1 of the array, power/frequency [dB/Hz] versus frequency [kHz]; curves: original, extracted. Listen: Original, Extracted
Figure: Background sound at microphone #1 of the array, power/frequency [dB/Hz] versus frequency [kHz]; curves: original, extracted. Listen: Original, Extracted
Figure: Mixed sound at microphone #1 of the array, power/frequency [dB/Hz] versus frequency [kHz]; curves: original, reconstructed.
Investigation of near-field effect
Detailed near-field effects: not included in the simulations
The aim of the method is to compute the foreground signal at the microphone array using the proximity microphone ...
What happens if strong near-field sound is present at the proximity microphone but not at the array?
Examples: evanescent waves, low-frequency content that does not radiate to the far field
Does this influence the performance of the method?
Test: Simulations with an additional synthetic tone at 93 Hz in the proximity mic signal xi, much louder than the original reference signal at this frequency (synthetic tone: −14 dB, original peak: −40 dB)
Investigation of near-field effect
Figure: Foreground sound at microphone #1 of the array, power/frequency [dB/Hz] versus frequency [kHz] from 0.08 to 0.12 kHz; curves: original, extracted, extracted with the 93 Hz tone at −14 dB.
Investigation of near-field effect
Strong near-field signal in the reference can degrade the performance; check for coherence as a noise gate
Solution: Coherence filter
Coherence between proximity mic i and microphone m:
$\tilde{C}_{mi}(k) = |\tilde{S}_{x_i d_m}(k)|^2 / \big(\tilde{S}_{x_i x_i}(k)\, \tilde{S}_{d_m d_m}(k)\big)$
A gate using a sigmoid function with threshold a and slope c:
$\tilde{F}_{mi}(k) = 1 / \big(1 + e^{-c(\tilde{C}_{mi}(k) - a)}\big)$
Modified optimal/Wiener filter with coherence gate:
$\tilde{w}_{mi}(k) = \overbrace{\tilde{F}_{mi}(k)}^{\text{C gate}} \; \overbrace{\tilde{S}_{x_i d_m}(k) / \tilde{S}_{x_i x_i}(k)}^{\text{Wiener}}$
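The coherence gate can be sketched in a few lines of NumPy, given averaged spectra per frequency bin. The default threshold `a` and slope `c` below are placeholder values chosen for illustration only; the slides do not state the values used, and the function names are hypothetical.

```python
import numpy as np

def coherence_gate(s_xd, s_xx, s_dd, a=0.5, c=20.0):
    """Sigmoid gate F(k) = 1 / (1 + exp(-c (C(k) - a))) on the coherence C(k)."""
    coh = np.abs(s_xd) ** 2 / (s_xx * s_dd)  # magnitude-squared coherence, between 0 and 1
    return 1.0 / (1.0 + np.exp(-c * (coh - a)))

def gated_wiener(s_xd, s_xx, s_dd, a=0.5, c=20.0):
    """Wiener filter S_xd / S_xx multiplied by the coherence gate, per bin."""
    return coherence_gate(s_xd, s_xx, s_dd, a, c) * s_xd / s_xx
```

Bins where the reference and the array microphone are weakly coherent (for example a near-field tone present only at the proximity microphone) get a gain close to zero, while highly coherent bins keep the plain Wiener gain.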
Compensation of near-field effect
Figure: Foreground sound at microphone #1 of the array, power/frequency [dB/Hz] versus frequency [kHz] from 0.08 to 0.12 kHz; curves: original, extracted, extracted with the 93 Hz tone at −14 dB.
Conclusion (1/2)
1. Separate foreground/background signals at the microphone array using reference signals from proximity microphones at foreground sources
2. Rely on optimal/Wiener filters to perform blind system identification and prediction
3. Numerical simulations show satisfactory results:
Separated spectrum at microphone array
Horizontal acoustical map of separated signal at array
Conclusion (2/2)
1. Weak point: Some potential leakage between extracted foreground and background signals
2. However: Numerical simulations, not perfectly representative: Real life: Signal at proximity microphone can be loud in comparison with leaking background at the proximity mic
3. Real life solution: Directive proximity microphone
4. Real life solution: Replace proximity mic by vibration sensors
5. Use Wiener filters to identify propagation delays between proximity microphone and array microphones: Further isolate the foreground using focused beamforming with the corresponding delays while processing ymi
6. Future investigations: Enhance the performance of the coherence gate
Microphone array measurements in industrial contexts
Picture: P.-A. Gauthier at Graymont (Bedford, Canada)
Reproduction and objective evaluation in WFS room
Figure: WFS reproduction with 96-loudspeaker array by Sonic Emotion
Appendix
Wiener/optimal filters
References: Alarms
References: Paper
Questions
Derivation of the optimal/Wiener filters
Quadratic cost function to be minimized for reference i and microphone m:
$J_{mi} = E[s_{mi}(n)^2]$, with $s_{mi}$ the remaining surround signal
This gives the discrete form of the Wiener-Hopf equations:
$\sum_{l=0}^{L} w_{mil}\, R_{x_i x_i}(n-l) - R_{x_i d_m}(n) = 0$
Unconstrained case: $-\infty \le n \le \infty$ and the summation runs from $-\infty$ to $\infty$ (Elliott, 2001).
$R_{x_i x_i}$ and $R_{x_i d_m}$ are the auto-correlation of $x_i$ and the cross-correlation of $x_i$ and $d_m$.
The filters are given in the frequency domain with averaging:
$\tilde{w}_{mi}(k) = \tilde{S}_{x_i d_m}(k) / \tilde{S}_{x_i x_i}(k), \quad 0 \le k \le L-1$
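The step from the cost function to the frequency-domain filter can be written out as a short worked derivation. It assumes the remaining signal is the array signal minus the filtered reference, the correlation convention $R_{x_i d_m}(n) = E[x_i(p)\, d_m(p+n)]$, and that the averaged spectra $\tilde{S}$ estimate the Fourier transforms of the correlations; these conventions are assumptions made here for illustration, in line with the usual treatment (Elliott, 2001).

```latex
\begin{align}
s_{mi}(n) &= d_m(n) - \sum_{l} w_{mil}\, x_i(n-l), \\
\frac{\partial J_{mi}}{\partial w_{mil'}} = -2\,E\big[s_{mi}(n)\, x_i(n-l')\big] = 0
  \;&\Rightarrow\; \sum_{l} w_{mil}\, R_{x_i x_i}(l'-l) - R_{x_i d_m}(l') = 0, \\
\tilde{w}_{mi}(k)\, \tilde{S}_{x_i x_i}(k) - \tilde{S}_{x_i d_m}(k) = 0
  \;&\Rightarrow\; \tilde{w}_{mi}(k) = \frac{\tilde{S}_{x_i d_m}(k)}{\tilde{S}_{x_i x_i}(k)}.
\end{align}
```

In the unconstrained case the sum over $l$ is a full convolution, which is why the Fourier transform turns the second line into the simple product of the third line.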
References: Alarms
Blouin, S., 2005. Bilan de connaissances sur les dispositifs de détection de personnes lors des manoeuvres de recul des véhicules dans les chantiers de construction, Bilan des connaissances IRSST, B-067.
Laroche, C., Tran Quoc, H., Hétu, R., McDuff, S., 1991. "Detectsound": A computerized model for predicting the detectability of warning signals in noisy workplaces. Applied Acoustics 32, pp. 193-214.
Laroche, C., Ross, M.-J., Lefebvre, L., Larocque, R., 1995. Détermination des caractéristiques acoustiques optimales des alarmes de recul, rapport IRSST, R-117.
NIOSH, 2004. Worker Health Chartbook. Department of Health and Human Services, Centers for Disease Control and Prevention.
Pichette, L., 2013. Quelle technologie rend les alarmes de recul plus sécuritaires? Prévention au travail, pp. 24-26.
Vaillancourt, V., Nélisse, H., Laroche, C., Giguère, C., Boutin, J., Laferriére, P., 2012. Sécurité des travailleurs derrière les véhicules lourds - évaluation de trois types d’alarmes sonores de recul, IRSST, R-763.
Withington, D.J., 2004. Reversing goes broadband. Quarry Management, pp. 27-33. Available online: http://www.agg-net.com/files/qmjcorp/Reversing%20goes%20Broadband_0.pdf [accessed December 2, 2013].
References: Paper
Nicol R. and Emerit M., “3D-sound Reproduction Over an Extensive Listening Area: A Hybrid Method Derived from Holophony and Ambisonic,” presented at the AES 16th International Conference, Rovaniemi, Finland, 1999.
Ahrens J., Analytical Methods of Sound Field Synthesis, Springer, Berlin, 2012.
Hulsebos E., de Vries D., and Bourdillat E., “Improved Microphone Array Configurations for Auralization of Sound Fields by Wave-Field Synthesis,” J. Audio Eng. Soc., vol. 50, no. 10, pp. 779–790 (2002 October).
Elliott S., Signal Processing for Active Control, Academic Press, San Diego, 2001.
Hur Y., Abel J.S., Park Y.-C., and Youn D.H., “Techniques for Synthetic Reconfiguration of Microphone Arrays,” J. Audio Eng. Soc., vol. 59, no. 6, pp. 404–418 (2011 June).
Home of the Blender project, http://www.blender.org, accessed on 2014 January 22.
E.A.R, http://www.explauralisation.org, accessed on 2014 January 22.
Questions
Why did you use the unconstrained Wiener-Hopf equations?
Everything is done off-line, so non-causal filters are acceptable
After filtering, we can discard the filter build-up time in the separated sample
For better performance in terms of separation
But we could further investigate the causality constraint
Would that work for very non-stationary sounds?
We think so: since the optimal filters are derived by averaging over the entire sample, the resulting filters are simply a blind system identification with the proximity mic as input and the microphones in the array as outputs
Evidence: Some of the reported examples are not perfectly stationary
On-off sounds may cause problems since, on average, the background leak in the proximity mic would be stronger
Solutions: 1) Include a signal detection algorithm (at the proximity microphone) to switch between a) foreground-to-array prediction and b) background leak at proximity microphone to array for further cancellation, and 2) Include signal detection algorithms and a group-of-frames algorithm to limit the Wiener filter derivation to frames where the foreground source is active
How are the proximity microphone, array, and sources modeled in the ray-tracing simulation?
Sources: omnidirectional
Proximity mics: omnidirectional (50 cm from source plus natural background leak)
Array mics: omnidirectional
Material: Same material for all surfaces
Reverberation time: 1.59 s (Schroeder curve)
What will you do with the resulting separated foreground/background array signals?
Foreground sources: reproduced as point sources by WFS (physical positions measured on site)
Foreground signals to mono: Find propagation delays from max(Wiener), use focused beamforming for signal extraction and to further reduce the background (could also be combined with a generalized side-lobe canceler for further reduction of the background leaks)
Background signals: 1) processed through fixed-directivity beamforming (panning laws as directivities) and plane wave reproduction or 2) inverse problem and plane wave reproduction
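One possible reading of "propagation delays from max(Wiener)" is to take the lag of the largest coefficient of the time-domain Wiener filter. The sketch below follows that reading; it is an assumption about the intended procedure, and the function name and the one-sided spectrum layout (as returned by `np.fft.rfft`) are illustrative choices.

```python
import numpy as np

def delay_from_filter(w, fs):
    """Delay in seconds at the largest |h(n)|, where h is the inverse DFT of w.

    w: one-sided frequency response (length nfft // 2 + 1, nfft even).
    fs: sampling rate in Hz.
    """
    nfft = 2 * (len(w) - 1)
    h = np.fft.irfft(w, n=nfft)
    lag = int(np.argmax(np.abs(h)))
    if lag > nfft // 2:  # upper half of the circular buffer holds negative lags
        lag -= nfft
    return lag / fs
```

The delays obtained for the M array microphones could then serve as the focusing delays of a delay-and-sum beamformer applied to the foreground signals ymi.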