Auto-Selection of DPC Codes from Discharge Summaries by Text Mining in Several Hospitals and Analysis of Differences in Discharge Summaries
Shunsuke Doi*1, Takahiro Suzuki*2, Gen Shimada*3,
Mitsuhiro Takasaki*4, Shinsuke Fujita*5,
Toshiyo Tamura*1, and Katsuhiko Takabayashi*2
*1Graduate School of Engineering, Chiba University, 1-8-1 Inohana, Chuo-ku, Chiba 260-8677, Japan
*2Department of Medical Informatics and Management, Chiba University Hospital, Japan
*3Medical Information Center, St. Luke’s International Hospital, Japan
*4Division of Medical Informatics, Saga University Hospital, Japan
*5Department of Welfare and Medical Intelligence, Chiba University Hospital, Japan
Electronic Medical Record (EMR) systems have recently become widespread in Japan, and large numbers of discharge summaries are now stored electronically, although they have not yet been reused. We performed text mining on the discharge summaries from 3 hospitals (the Chiba University Hospital, St. Luke’s International Hospital, and the Saga University Hospital) by using the term frequency-inverse document frequency (TF-IDF) method together with morphological analysis. The style of the summaries differed between hospitals, while the rates of properly classified Diagnosis Procedure Combination (DPC) codes were almost the same. Despite the differing styles of the discharge summaries, the text mining method was able to extract the proper DPC codes. Classification improved when a model integrating the data from the hospitals was used. This suggests that a large database containing data from many hospitals could improve the precision of text mining.
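As a rough illustration of how TF-IDF weighting can be used to rank candidate codes for a summary, the following Python sketch pools the tokens of all summaries that share a code, weights them by TF-IDF, and ranks codes by cosine similarity to a new summary. This is a minimal sketch under stated assumptions, not the procedure used in the study: the summaries are assumed to be already split into tokens (the morphological analysis step is not shown), the smoothed IDF formula and the cosine ranking are illustrative choices, and the function names and the labels `CODE_A` and `CODE_B` are hypothetical placeholders rather than real DPC codes.

```python
import math
from collections import Counter


def build_tfidf_model(docs_by_code):
    """Build per-code TF-IDF vectors from {code: [token list, ...]}.

    Assumption: every summary is already tokenized; all summaries that
    share a code are pooled into one term-count vector for that code.
    """
    pooled = {
        code: Counter(tok for doc in docs for tok in doc)
        for code, docs in docs_by_code.items()
    }
    n_codes = len(pooled)
    # Document frequency: number of codes under which each term appears.
    df = Counter()
    for counts in pooled.values():
        df.update(counts.keys())
    # Smoothed IDF (an illustrative choice) keeps every weight positive.
    idf = {t: math.log((1 + n_codes) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {
        code: {t: c * idf[t] for t, c in counts.items()}
        for code, counts in pooled.items()
    }
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_codes(tokens, idf, vectors):
    """Return (similarity, code) pairs for a new summary, best match first."""
    counts = Counter(t for t in tokens if t in idf)  # ignore unseen terms
    query = {t: c * idf[t] for t, c in counts.items()}
    scored = [(cosine(query, vec), code) for code, vec in vectors.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Toy training data with hypothetical codes and English tokens.
    training = {
        "CODE_A": [["pneumonia", "fever", "antibiotic"], ["pneumonia", "cough"]],
        "CODE_B": [["fracture", "femur", "surgery"], ["fracture", "fever"]],
    }
    idf, vectors = build_tfidf_model(training)
    for score, code in rank_codes(["fever", "pneumonia"], idf, vectors):
        print(code, round(score, 3))
```

Pooling the training data of several hospitals into one `docs_by_code` dictionary before calling `build_tfidf_model` would correspond, in this sketch, to building an integrated model.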