Interactive Document Clustering System Based on Coordinated Multiple Views
Yasufumi Takama and Takuma Tonegawa
Graduate School of System Design, Tokyo Metropolitan University
6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan
This paper proposes an interactive document clustering system designed on the concept of coordinated multiple views (CMV). Interactive document clustering lets a user obtain a set of document groups from a document collection in an interactive manner, and is expected to be useful for tasks such as text mining and document retrieval. Because the result of document clustering consists of multiple kinds of objects, such as clusters (document groups), documents, and words, each kind should be presented to users in a different way. Based on this consideration, the proposed system employs multiple views, each designed for a specific kind of object, such as documents or keywords. A prototype system is implemented on TETDM (Total Environment for Text Data Mining), an environment for developing text data mining tools, which was chosen because it provides a mechanism for coordination between modules. The proposed system classifies the information to be presented into four levels: cluster, document, bag of words, and word, each of which is displayed in a different view. Experimental results with test participants show the effectiveness of the proposed system.
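The coordination idea behind the four presentation levels can be illustrated with a minimal Python sketch. It is not the TETDM API or the prototype's implementation: the class names (`View`, `DocumentView`, `BagOfWordsView`, `WordView`, `Coordinator`), the method `select_cluster`, and the toy data are all hypothetical and chosen only for illustration. The sketch assumes that selecting a cluster is the coordinating event, and that each view then redraws its own level from the same set of selected documents.

```python
from collections import Counter

# Toy data (hypothetical): document id -> tokens, cluster id -> document ids.
DOCS = {
    "d1": ["text", "mining", "cluster"],
    "d2": ["cluster", "view", "view"],
    "d3": ["retrieval", "text"],
}
CLUSTERS = {"c1": ["d1", "d2"], "c2": ["d3"]}


class View:
    """A view presents one level and reacts to a shared selection."""

    level = ""

    def __init__(self):
        self.shown = None  # what the view currently displays

    def update(self, doc_ids):
        raise NotImplementedError


class DocumentView(View):
    level = "document"

    def update(self, doc_ids):
        # Show the ids of the documents in the selected cluster.
        self.shown = sorted(doc_ids)


class BagOfWordsView(View):
    level = "bag of words"

    def update(self, doc_ids):
        # Show term frequencies aggregated over the selected documents.
        counts = Counter()
        for doc_id in doc_ids:
            counts.update(DOCS[doc_id])
        self.shown = dict(counts)


class WordView(View):
    level = "word"

    def update(self, doc_ids):
        # Show the distinct words that occur in the selected documents.
        self.shown = sorted({w for doc_id in doc_ids for w in DOCS[doc_id]})


class Coordinator:
    """Propagates a cluster-level selection to every registered view."""

    def __init__(self, views):
        self.views = list(views)

    def select_cluster(self, cluster_id):
        doc_ids = CLUSTERS[cluster_id]
        for view in self.views:
            view.update(doc_ids)


doc_view, bow_view, word_view = DocumentView(), BagOfWordsView(), WordView()
coordinator = Coordinator([doc_view, bow_view, word_view])
coordinator.select_cluster("c1")
print(doc_view.shown)
print(bow_view.shown)
print(word_view.shown)
```

Routing every selection through a single coordinator keeps the views independent of one another, which is the property that makes a module-coordination mechanism attractive for this kind of system.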