Circuits Syst Signal Process (2016) 35:573–601 DOI 10.1007/s00034-015-0077-6

A Novel Nested Circular Microphone Array and Subband Processing-Based System for Counting and DOA Estimation of Multiple Simultaneous Speakers

Ali Dehghan Firoozabadi1 · Hamid Reza Abutalebi1

Received: 12 December 2014 / Revised: 4 May 2015 / Accepted: 5 May 2015 / Published online: 19 May 2015 © Springer Science+Business Media New York 2015

Abstract This paper addresses the localization of simultaneous speakers. The work builds on generalized cross-correlation (GCC)-based methods for estimating the directions of multiple speakers. To address the shortcomings of GCC-based direction of arrival (DOA) estimation, we apply three modifications to our previous subband processing-based system for localizing simultaneous speakers. First, the DOA estimator is equipped with a front-end block that determines the number of speakers using K-means clustering and the silhouette criterion, and passes this number to the DOA estimator. Second, to eliminate spatial aliasing, we propose a novel nested circular microphone array in which each microphone pair is used only in the subband appropriate to its inter-microphone distance. Third, to overcome the weakness of the GCC-phase transform (GCC-PHAT) in noisy and noisy-reverberant conditions, we propose an SNR estimation block that separates noisy from reverberant conditions, so that the PHAT filter is used in reverberant conditions and the maximum likelihood filter in noisy ones. The proposed method is evaluated on both simulated and real multi-speaker speech data under various environmental conditions and with different numbers of speakers. In terms of DOA accuracy, it outperforms the fullband and baseline subband methods.
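The idea of restricting each microphone pair to a suitable subband can be illustrated with the standard half-wavelength condition: a pair with spacing d is free of spatial aliasing up to roughly f = c / (2d). The sketch below is only an illustration under stated assumptions, not the authors' array design: the speed of sound, the three spacings, the pair names, and the rule of listing the widest alias-free pairs first are all hypothetical.

```python
# Minimal sketch: which microphone pairs are alias-free in a given subband.
# Assumptions (not from the paper): c = 343 m/s, the example spacings below,
# and the ordering rule that prefers the widest alias-free pair.

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def max_unaliased_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Highest frequency (Hz) at which the spacing is at most half a wavelength."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * spacing_m)


def pairs_for_subband(pair_spacings, band_hi_hz, c=SPEED_OF_SOUND):
    """Return names of pairs that stay alias-free up to band_hi_hz.

    pair_spacings maps a pair name to its inter-microphone distance in metres.
    The result is sorted from widest to narrowest spacing.
    """
    usable = [
        (d, name)
        for name, d in pair_spacings.items()
        if max_unaliased_frequency(d, c) >= band_hi_hz
    ]
    usable.sort(reverse=True)  # widest spacing first
    return [name for _, name in usable]


if __name__ == "__main__":
    # Hypothetical spacings for three rings of a nested array.
    spacings = {"inner": 0.02, "middle": 0.05, "outer": 0.20}
    for band_hi in (800.0, 3400.0, 8000.0):
        print(band_hi, pairs_for_subband(spacings, band_hi))
```

With these assumed spacings, the widest pair is usable only in the lowest band, while the narrowest pair remains alias-free across all three bands.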

Hamid Reza Abutalebi [email protected] Ali Dehghan Firoozabadi [email protected]

1 Electrical and Computer Engineering Department, Yazd University, Pajuhesh St., Safaieh, Postal Box: 89195-741, Yazd, Iran
