
Paper:
Language: English

Printed Japanese Character Recognition Using Multiple Commercial OCRs


Hidetoshi Miyao*, Yasuaki Nakano**, Atsuhiko Tani***, Hirosato Tabaru****, and Toshihiro Hananoi**


*Shinshu University, 4-17-1, Wakasato, Nagano 380-8553, Japan
**Kyushu Sangyo University, 2-3-1 Matsukadai, Higashi-ku, Fukuoka 813-8503, Japan
***Hitachi Software Engineering Co., Ltd., 5030 Totsuka-cho, Totsuka-ku, Yokohama 244-8555, Japan
****Fuji Xerox Co., Ltd., 2-3-1 Matsukadai, Higashi-ku, Fukuoka 813-8503, Japan


Received: July 31, 2003

Accepted: December 1, 2003


Keywords: OCR, DP matching, printed Japanese character recognition, character extraction, document image analysis

Journal ref: Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol.8, No.2, pp. 200-207, 2004

Abstract



This paper proposes two algorithms for keeping lines and characters aligned across the text output of multiple commercial optical character readers (OCRs): (1) a line matching algorithm based on dynamic programming (DP) matching, and (2) a character matching algorithm based on character string division and standard character strings. It also proposes a character recognition method that introduces majority logic and reject processing. To demonstrate the feasibility of the method, we conducted line matching experiments on 127 document images using five commercial OCRs. The results showed that, with appropriate line matching, the method extracted character areas more accurately than a single OCR. In experiments using the character matching algorithm and character recognition, the proposed method raised the recognition rate from 97.61% for a single OCR to 98.83%. The method is expected to be highly useful for correcting locations where unwanted lines or characters appear or required lines or characters are missing.
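The two ideas named in the abstract, DP-based line matching and majority logic with rejection, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the cost function, the string division step, or the voting threshold. The names `line_cost`, `align_lines`, `vote`, the `GAP_COST` penalty, the `difflib`-based similarity, and the `min_votes=3` threshold (a strict majority of five OCRs) are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import Counter
from difflib import SequenceMatcher

GAP_COST = 1.0  # assumed penalty for a line that appears in only one OCR output


def line_cost(a: str, b: str) -> float:
    """Dissimilarity of two text lines in [0, 1]; 0 means identical (assumed measure)."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def align_lines(ref: list[str], other: list[str]) -> list[tuple[int | None, int | None]]:
    """Align two line sequences by DP; None marks a line missing on that side."""
    n, m = len(ref), len(other)
    # cost[i][j] = minimal cost of aligning ref[:i] with other[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * GAP_COST
    for j in range(1, m + 1):
        cost[0][j] = j * GAP_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + line_cost(ref[i - 1], other[j - 1]),  # match
                cost[i - 1][j] + GAP_COST,  # line only in ref
                cost[i][j - 1] + GAP_COST,  # line only in other
            )
    # Backtrack from the bottom-right corner to recover the alignment.
    pairs: list[tuple[int | None, int | None]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + line_cost(ref[i - 1], other[j - 1])):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + GAP_COST):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def vote(candidates: list[str], min_votes: int = 3, reject: str = "?") -> str:
    """Majority logic over one character position; reject when support is too weak."""
    char, count = Counter(candidates).most_common(1)[0]
    return char if count >= min_votes else reject
```

For example, `align_lines(["abc", "def", "ghi"], ["abc", "ghi"])` pairs the first and last lines and marks the middle line as missing from the second output, which is the kind of disappeared-line location the abstract mentions. `vote` then picks the character on which at least `min_votes` of the OCRs agree and emits the reject symbol otherwise.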