Spatial Localization of Concurrent Multiple Sound Sources Using Phase Candidate Histogram

Huakang Li; Jie Huang; Minyi Guo; Qunfei Zhao

doi:10.20965/jaciii.2011.p1277

JACIII Vol.15 No.9 pp. 1277-1286

(2011)

Paper:

Department of Computer Science and Engineering, Shanghai Jiao Tong University, No. 800 Dongchuan Rd., Minhang District, Shanghai 200240, China

Received:

April 28, 2011

Accepted:

August 24, 2011

Published:

November 20, 2011

Keywords:

acoustic signal processing, direction estimation, time-delay estimation, candidate histogram, precedence effect

Abstract

Mobile robots that communicate with people would benefit from detecting sound sources, which helps them localize interesting events in real-life settings. We propose a spherical robot with four microphones that determines the spatial locations of multiple sound sources in ordinary rooms. Arrival time differences between microphones are calculated from phase difference histograms. A precedence effect model suppresses the influence of echoes in reverberant environments. To integrate the spatial cues from different microphones, we map the correlation between microphone pairs onto a 3D map indexed by the azimuth and elevation of the sound source direction. Experimental results indicate that, with the Echo Avoidance (EA) model, the proposed system yields a clear and precise distribution of sound sources, even for concurrent sources in reverberant environments.
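The abstract does not spell out how the phase difference histogram is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that, for one microphone pair, a wrapped phase difference observed at frequency f is consistent with every delay (phase + 2πk) / (2πf) that fits within the pair's maximum physical delay, and that voting these candidates across frequencies makes the true delay stand out. The function names, the sign convention, the 0.3 m spacing, the 343 m/s speed of sound, and the test frequencies are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def candidate_delays(freq_hz, phase_diff, max_delay):
    """Return every delay within +/- max_delay that matches a wrapped phase difference."""
    period = 1.0 / freq_hz
    base = phase_diff / (2.0 * math.pi * freq_hz)  # candidate for k = 0
    k_min = math.ceil((-max_delay - base) / period)
    k_max = math.floor((max_delay - base) / period)
    return [base + k * period for k in range(k_min, k_max + 1)]


def histogram_delay(observations, max_delay, bin_width):
    """Vote all candidates into delay bins and return the center of the fullest bin."""
    votes = Counter()
    for freq_hz, phase_diff in observations:
        for tau in candidate_delays(freq_hz, phase_diff, max_delay):
            votes[round(tau / bin_width)] += 1
    best_bin, _ = votes.most_common(1)[0]
    return best_bin * bin_width


# Synthetic example: one source, one microphone pair (all values hypothetical).
mic_spacing = 0.3                         # meters between the two microphones
max_delay = mic_spacing / SPEED_OF_SOUND  # largest physically possible delay
true_delay = 0.25e-3                      # seconds
freqs = [300.0, 700.0, 1100.0, 1700.0, 2300.0, 3100.0]

# Wrapped phase differences, assuming phase = 2*pi*f*delay wrapped to [-pi, pi].
observations = [
    (f, math.remainder(2.0 * math.pi * f * true_delay, 2.0 * math.pi))
    for f in freqs
]

estimate = histogram_delay(observations, max_delay, bin_width=1e-5)
print(f"estimated delay: {estimate * 1e3:.3f} ms")
```

In this toy setup the spurious candidates land in different bins at different frequencies, while the true delay collects one vote per frequency. With several concurrent sources, one would expect several histogram peaks rather than a single maximum.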

Cite this article as:

H. Li, J. Huang, M. Guo, and Q. Zhao, “Spatial Localization of Concurrent Multiple Sound Sources Using Phase Candidate Histogram,” J. Adv. Comput. Intell. Intell. Inform., Vol.15 No.9, pp. 1277-1286, 2011.

References

[1] G. Medioni and S. B. Kang, “Emerging topics in computer vision,” Prentice Hall PTR Upper Saddle River, NJ, USA, 2004.
[2] J. Huang, C. Zhao, Y. Ohtake, H. Li, and Q. Zhao, “Robot Position Identification Using Specially Designed Landmarks,” In Instrumentation and Measurement Technology Conference, 2006, IMTC 2006, Proc. of the IEEE, pp. 2091-2094, 2006.
[3] R. S. Heffner and H. E. Heffner, “Evolution of sound localization in mammals,” The evolutionary biology of hearing, pp. 691-715, 1992.
[4] J. Huang, N. Ohnishi, and N. Sugie, “Building ears for robots: sound localization and separation,” Artificial Life and Robotics, Vol.1, No.4, pp. 157-163, 1997.
[5] P. Aarabi and S. Zaky, “Integrated vision and sound localization,” In Information Fusion, 2000, FUSION 2000, Proc. of the Third Int. Conf. on, Vol.2, 2000.
[6] H. Li, K. Thongam, A. Saji, K. Tanno, J. Huang, and Q. Zhao, “Robot Position Identification by Visual and Sound Beacons,” In FAN 2009, Proc. of The 19th Intelligent System Symposium, pp. 292-297, 2009.
[7] M. Wax and T. Kailath, “Optimum localization of multiple sources by passive arrays,” Acoustics, Speech and Signal Processing, IEEE Trans. on, Vol.31, No.5, pp. 1210-1217, 1983.
[8] K. C. Ho and M. Sun, “An accurate algebraic closed-form solution for energy-based source localization,” Audio, Speech, and Language Processing, IEEE Trans. on, Vol.15, No.8, pp. 2542-2550, 2007.
[9] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance techniques,” Acoustics, Speech and Signal Processing, IEEE Trans. on, Vol.37, No.7, pp. 984-995, 1989.
[10] R. Schmidt, “Multiple emitter location and signal parameter estimation,” Antennas and Propagation, IEEE Trans. on, Vol.34, No.3, pp. 276-280, 1986.
[11] J. M. Valin, F. Michaud, J. Rouat, and D. Létourneau, “Robust sound source localization using a microphone array on a mobile robot,” In Intelligent Robots and Systems, 2003 (IROS 2003), Proc. 2003 IEEE/RSJ Int. Conf. on, Vol.2, pp. 1228-1233, IEEE, 2003.
[12] M. Brandstein and D. Ward, “Microphone arrays: signal processing techniques and applications,” Springer Verlag, 2001.
[13] T. Gustafsson, B. D. Rao, and M. Trivedi, “Source localization in reverberant environments: modeling and statistical analysis,” Speech and Audio Processing, IEEE Trans. on, Vol.11, No.6, pp. 791-803, 2003.
[14] Y. Rui and D. Florencio, “Time delay estimation in the presence of correlated noise and reverberation,” In Acoustics, Speech, and Signal Processing, 2004, Proc. (ICASSP’04). IEEE Int. Conf. on, Vol.2, pp. ii-133, IEEE, 2004.
[15] X. Sheng and Y.-H. Hu, “Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks,” Signal Processing, IEEE Trans. on, Vol.53, No.1, pp. 44-53, 2005.
[16] M. Matsumoto and S. Hashimoto, “Multiple signal classification by aggregated microphones,” IEICE Trans. on Fundamentals of Electronics Communications and Computer Sciences E. Series A, Vol.88, No.7, p. 1701, 2005.
[17] T.-l. Ju, Q.-c. Peng, H.-z. Shao, and J.-r. Lin, “Speech source 2D DOA estimation algorithm based on random microphone array,” Propagation & EM Theory, 2006, ISAPE ’06, 7th Int. Symposium on, pp. 1-4, 2006.
[18] J. Huang, N. Ohnishi, and N. Sugie, “Sound localization in reverberant environment based on the model of the precedence effect,” IEEE Trans. on Instrumentation and Measurement, Vol.46, No.4, pp. 842-846, 1997.
[19] H. Li, T. Yosiara, Q. Zhao, T. Watanabe, and J. Huang, “A spatial sound localization system for mobile robots,” In Instrumentation and Measurement Technology Conf. Proc., 2007, IMTC 2007, IEEE, pp. 1-6, IEEE, 2007.
[20] H. Nakashima, N. Onishi, and T. Mukai, “A learning system for estimating the elevation angle of a sound source by using a feature map of spectrum,” IEICE Trans. on Information and Systems, Vol.87, No.11, pp. 2034-2044, 2004.
[21] O. Yılmaz and S. Rickard, “Blind separation of speech mixtures via time-frequency masking,” IEEE Trans. on Signal Processing, Vol.52, No.7, pp. 1830-1847, 2004.
[22] J. Huang, N. Ohnishi, X. Guo, and N. Sugie, “Echo avoidance in a computational model of the precedence effect,” Speech Communication, Vol.27, No.3-4, pp. 223-233, 1999.
[23] P. M. Zurek, “The precedence effect,” Directional hearing, pp. 85-105, 1987.
[24] B. Rakerd and W. M. Hartmann, “Precedence effect with and without interaural differences – Sound localization in three planes,” The J. of the Acoustical Society of America, Vol.92, p. 2296, 1992.
[25] S. W. Kuffler and J. G. Nicholls, “From neuron to brain: a cellular approach to the function of the nervous system,” 1977.
[26] H. L. Van Trees, “Optimum array processing,” Wiley Online Library, 2002.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
