VLSI Architecture for Robust Speech Recognition Systems and its Implementation on a Verification Platform
Shingo Yoshizawa, Noboru Hayasaka, Naoya Wada, and Yoshikazu Miyanaga
Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9 Kita-Ku, Sapporo 060-0814, Japan
This paper presents a VLSI architecture for a robust speech recognition system that enables high-speed, low-power operation. The proposed architecture improves recognition accuracy in noisy environments and achieves a short response time through parallel and pipelined processing. We evaluate circuit performance in 0.25-μm CMOS technology and demonstrate improvements in processing time and power consumption. We also describe a verification platform that helps users implement our hardware-based robust speech recognition system. The platform facilitates the conversion of software to hardware and quickly provides test environments on field-programmable gate arrays (FPGAs).
This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Copyright © 2005 by Fuji Technology Press Ltd. and Japan Society of Mechanical Engineers. All rights reserved.