Robust Estimation of Sound Source Direction with Deterministic Background Noise and Stochastic Source Dynamics Models

Mitsunori Mizumachi; Katsuyuki Niyada

doi:10.20965/jaciii.2010.p0208


JACIII Vol.14 No.2 pp. 208-213 (2010)


Department of Electrical Engineering, Kyushu Institute of Technology, 1-1 Sensui-cho, Tobata-ku, Kitakyushu-shi, Fukuoka 804-8550, Japan

Received: July 9, 2009

Accepted: December 15, 2009

Published: March 20, 2010

Keywords: sound source direction, noise robustness, robust feature extraction, frequency selectivity, particle filtering

Abstract

Direction of Arrival (DOA), a type of auxiliary information used in acoustic signal processing, is vulnerable to acoustical noise, so we aim to make DOA estimation robust in noisy environments by relying on spectral sparseness. The energy of acoustic signals such as speech is wide-band, but individual signals are localized in specific and mutually different frequency regions. Our proposal provisionally extracts spatial features from subband frequency components at the dominant frequency of the target signal and filters them using particle filtering with a sound source dynamics model. The feasibility of our proposal is confirmed by estimating a sound source direction in noisy conditions; the results also confirm that frequency selectivity and state estimation using particle filters help make DOA estimation more robust against noise.
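The two ideas named in the abstract, frequency selectivity and particle filtering with a source dynamics model, can be illustrated with a minimal sketch. This is not the authors' implementation: the random-walk dynamics, the Gaussian observation likelihood, multinomial resampling, all parameter values, and the function names `select_dominant_doa` and `particle_filter_doa` are assumptions chosen for illustration only.

```python
import math
import random


def select_dominant_doa(band_energies, band_doas):
    """Frequency selectivity (illustrative): keep the provisional DOA of
    the subband with the largest energy."""
    k = max(range(len(band_energies)), key=band_energies.__getitem__)
    return band_doas[k]


def particle_filter_doa(observations, n_particles=500, process_std=2.0,
                        obs_std=10.0, seed=0):
    """Bootstrap particle filter over a scalar DOA state in degrees.

    Assumptions (not from the paper): random-walk source dynamics,
    Gaussian observation likelihood, resampling at every frame.
    """
    rng = random.Random(seed)
    # Initialise particles uniformly over the frontal half-plane.
    particles = [rng.uniform(-90.0, 90.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle through the random-walk model.
        particles = [min(90.0, max(-90.0, p + rng.gauss(0.0, process_std)))
                     for p in particles]
        # Update: weight particles by the likelihood of the provisional DOA.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) + 1e-300
                   for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Estimate: posterior mean of the DOA for this frame.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    noise = random.Random(1)
    # Hypothetical provisional DOAs: a source at 30 degrees plus noise.
    noisy = [30.0 + noise.gauss(0.0, 10.0) for _ in range(50)]
    smoothed = particle_filter_doa(noisy)
    print(round(smoothed[-1], 1))
```

In this sketch the per-frame provisional DOA would come from `select_dominant_doa`, and the particle filter smooths the resulting sequence over time; a larger `process_std` lets the estimate follow a faster-moving source at the cost of less noise suppression.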

Cite this article as:

M. Mizumachi and K. Niyada, “Robust Estimation of Sound Source Direction with Deterministic Background Noise and Stochastic Source Dynamics Models,” J. Adv. Comput. Intell. Intell. Inform., Vol.14 No.2, pp. 208-213, 2010.


References

[1] M. S. Brandstein and D. B. Ward (eds.), “Microphone Arrays: Signal Processing Techniques and Applications,” Springer-Verlag, 2001.
[2] M. Brandstein, “On the use of explicit speech modeling in microphone array applications,” Proc. Int. Conf. on Acoust., Speech, and Signal Process. (ICASSP’98), pp. 613-616, 1998.
[3] S. Araki, H. Sawada, R. Mukai, and S. Makino, “DOA estimation for multiple sparse sources with normalized observation vector clustering,” Proc. Int. Conf. on Acoust., Speech, and Signal Process. (ICASSP’06), 2006.
[4] M. Mizumachi and K. Niyada, “DOA Estimation Based on Cross-Correlation with Frequency Selectivity,” RISP J. of Signal Processing, Vol.11, No.1, pp. 43-50, 2007.
[5] S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoust., Speech, and Signal Process., Vol.27, No.2, pp. 113-120, 1979.
[6] A. Doucet, J. F. G. Freitas, and N. J. Gordon, “Sequential Monte Carlo Methods in Practice,” Springer-Verlag, New York, 2001.
[7] J. Vermaak and A. Blake, “Nonlinear filtering for speaker tracking in noisy and reverberant environments,” Proc. Int. Conf. on Acoust., Speech, and Signal Process. (ICASSP’01), Vol.5, pp. 3021-3024, 2001.
[8] D. B. Ward, E. A. Lehmann, and R. C. Williamson, “Particle filtering algorithms for tracking an acoustic source in a reverberant environment,” IEEE Trans. Speech and Audio Process., Vol.11, No.6, pp. 826-836, 2003.
[9] M. Mizumachi and K. Niyada, “DOA estimation using crosscorrelation with particle filter,” Proc. Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA2005), CD-ROM, 2005.
[10] M. Mizumachi and K. Niyada, “DOA estimation based on crosscorrelation by two-step particle filtering,” Proc. 14th European Signal Process. Conf. (EUSIPCO2006), CD-ROM, 2006.
[11] C. H. Knapp and G. C. Carter, “The generalized correlation method for estimation of time delay,” IEEE Trans. Acoust., Speech, Signal Process., Vol.24, pp. 320-327, 1976.
[12] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Trans. Antennas Propagation, Vol.AP-34, No.3, pp. 276-280, 1986.
[13] R. G. Leonard, “A database for speaker independent digit recognition,” Proc. Int. Conf. on Acoust., Speech, and Signal Process. (ICASSP’84), Vol.9, pp. 328-331, 1984.
[14] A. Varga and H. J. M. Steeneken, “Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems,” Speech Comm., Vol.12, No.3, pp. 247-252, 1993.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
