JACIII Vol.14 No.2 pp. 208-213
doi: 10.20965/jaciii.2010.p0208


Robust Estimation of Sound Source Direction with Deterministic Background Noise and Stochastic Source Dynamics Models

Mitsunori Mizumachi and Katsuyuki Niyada

Department of Electrical Engineering, Kyushu Institute of Technology, 1-1 Sensui-cho, Tobata-ku, Kitakyushu-shi, Fukuoka 804-8550, Japan

Received: July 9, 2009
Accepted: December 15, 2009
Published: March 20, 2010
Keywords: sound source direction, noise robustness, robust feature extraction, frequency selectivity, particle filtering

Direction of Arrival (DOA), a type of auxiliary information used in acoustic signal processing, is vulnerable to acoustical noise, so we aim to make DOA estimation robust in noisy environments by relying on spectral sparseness. The energy of acoustic signals such as speech is spread over a wide band, yet each signal is concentrated in specific, and different, frequency regions. Our proposal provisionally extracts spatial features from subband frequency components at the dominant frequency of the target signal and filters them using particle filtering with a sound source dynamics model. The feasibility of our proposal is confirmed by estimating a sound source direction in noisy conditions; the results also confirm that frequency selectivity and state estimation using particle filters help improve the robustness of DOA estimation against noise.
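
The state-estimation step named in the abstract can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the authors' implementation: the function name `track_doa`, the random-walk source dynamics, the Gaussian likelihood, and all parameter values are hypothetical choices made for illustration. The per-frame DOA observations are assumed to come from some provisional estimator (for example, a subband cross-correlation), which is not shown here.

```python
import numpy as np

def track_doa(observations, n_particles=500, process_std=2.0, obs_std=10.0, seed=0):
    """Bootstrap particle filter over a scalar DOA in degrees.

    Assumptions (not taken from the paper): random-walk source dynamics
    and a Gaussian likelihood around each noisy per-frame DOA observation.
    """
    rng = np.random.default_rng(seed)
    # Initialise particles uniformly over the frontal half-plane.
    particles = rng.uniform(-90.0, 90.0, n_particles)
    estimates = []
    for z in observations:
        # Predict: diffuse particles with the random-walk dynamics model.
        particles = particles + rng.normal(0.0, process_std, n_particles)
        particles = np.clip(particles, -90.0, 90.0)
        # Update: weight each particle by the Gaussian likelihood of z.
        log_w = -0.5 * ((z - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Estimate: weighted mean of the particles.
        estimates.append(float(np.sum(w * particles)))
        # Resample (multinomial) to limit weight degeneracy.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)

# Usage: a fixed source at 30 degrees observed through heavy noise.
rng = np.random.default_rng(1)
noisy = 30.0 + rng.normal(0.0, 10.0, 200)
smoothed = track_doa(noisy)
```

In this toy setting the filtered track should fluctuate less around the true direction than the raw observations, because the dynamics model limits how far the estimate can move between frames.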

Cite this article as:
Mitsunori Mizumachi and Katsuyuki Niyada, “Robust Estimation of Sound Source Direction with Deterministic Background Noise and Stochastic Source Dynamics Models,” J. Adv. Comput. Intell. Intell. Inform., Vol.14, No.2, pp. 208-213, 2010.

