Paper:
Development, Deployment and Applications of Robot Audition Open Source Software HARK
Kazuhiro Nakadai*,***, Hiroshi G. Okuno**, and Takeshi Mizumoto*
*Honda Research Institute Japan Co., Ltd.
8-1 Honcho, Wako-shi, Saitama 351-0114, Japan
**Graduate Program for Embodiment Informatics, Waseda University
2-4-12 Okubo, Shinjuku, Tokyo 169-0072, Japan
***Graduate School of Information Science and Engineering, Tokyo Institute of Technology
2-12-1 Ookayama, Meguro-ku, Tokyo 152-8552, Japan
- [1] K. Nakadai et al., “Active Audition for Humanoid,” AAAI-2000, pp. 832-839, 2000.
- [2] K. Nakadai et al., “Design and Implementation of Robot Audition System ‘HARK’,” Advanced Robotics, Vol.24, pp. 739-761, 2010.
- [3] C. Côté et al., “Code reusability tools for programming mobile robots,” IEEE/RSJ IROS 2004, pp. 1820-1825, 2004.
- [4] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Trans. on Antennas and Propagation, Vol.34, No.3, pp. 276-280, 1986.
- [5] F. Asano et al., “Localization and extraction of brain activity using generalized eigenvalue decomposition,” IEEE ICASSP 2008, pp. 565-568, 2008.
- [6] K. Nakamura et al., “A real-time super-resolution robot audition system that improves the robustness of simultaneous speech recognition,” Advanced Robotics, Vol.27, No.12, pp. 933-945, 2013.
- [7] T. Ohata et al., “Improvement in Outdoor Sound Source Detection Using a Quadrotor-Embedded Microphone Array,” IEEE/RSJ IROS, 2014.
- [8] H. Nakajima et al., “Blind Source Separation with Parameter-Free Adaptive Step-Size Method for Robot Audition,” IEEE Trans. ASLP, Vol.18, No.6, pp. 1476-1484, 2010.
- [9] H. Nakajima, N. Tanaka, and H. Tsuru, “Minimum sidelobe beamforming based on Mini-Max criterion,” Acoust. Sci. & Tech., Vol.25, No.6, pp. 486-488, 2004.
- [10] V. A. N. Barroso and J. M. F. Moura, “Maximum likelihood beamforming in the presence of outliers,” IEEE ICASSP-91, pp. 1409-1412, 1991.
- [11] M. L. Seltzer et al., “A Bayesian Framework for Spectrographic Mask Estimation for Missing Feature Speech Recognition,” Speech Communication, Vol.43, No.4, pp. 379-393, 2004.
- [12] R. A. Monzingo and T. W. Miller, “Introduction to adaptive arrays,” SciTech Publishing, 1980.
- [13] O. L. Frost, “An algorithm for linearly constrained adaptive array processing,” Proc. of the IEEE, Vol.60, No.8, pp. 926-935, 1972.
- [14] L. J. Griffiths and C. W. Jim, “An alternative approach to linearly constrained adaptive beamforming,” IEEE Trans. on Antennas and Propagation, Vol.30, No.1, pp. 27-34, 1982.
- [15] L. C. Parra and C. V. Alvino, “Geometric source separation: Merging convolutive source separation with geometric beamforming,” IEEE Trans. on Speech and Audio Processing, Vol.10, No.6, pp. 352-362, 2002.
- [16] M. Knaak et al., “Geometrically Constrained Independent Component Analysis,” IEEE Trans. on ASLP, Vol.15, No.2, pp. 715-726, 2007.
- [17] H. Nakajima et al., “An easily-configurable robot audition system using Histogram-based Recursive Level Estimation,” IEEE/RSJ IROS 2010, pp. 958-963, 2010.
- [18] R. Takeda et al., “Efficient Blind Dereverberation and Echo Cancellation Based on Independent Component Analysis for Actual Acoustic Signals,” Neural Computation, Vol.24, No.1, pp. 234-272, 2011.
- [19] G. Ince et al., “Whole Body Motion Noise Cancellation of a Robot for Improved Automatic Speech Recognition,” Advanced Robotics, Vol.25, No.11, pp. 1405-1426, 2011.
- [20] Y. Bando et al., “Human-voice enhancement based on online RPCA for a hose-shaped rescue robot with a microphone array,” 2015 IEEE Int. Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 1-6, 2015.
- [21] S. Yamamoto et al., “Enhanced robot speech recognition based on microphone array source separation and missing feature theory,” IEEE/RAS ICRA 2005, pp. 1427-1482, 2005.
- [22] K. Nakadai et al., “Robot-Audition-based Human-Machine Interface for a Car,” IEEE/RSJ IROS 2015, pp. 6129-6136, 2015.
- [23] T. Mizumoto et al., “Design and implementation of selectable sound separation on the Texai telepresence system using HARK,” IEEE/RAS ICRA-2011, pp. 2130-2137, 2011.
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.