Interactive Document Clustering System Based on Coordinated Multiple Views

Yasufumi Takama; Takuma Tonegawa

doi:10.20965/jaciii.2016.p0139

JACIII Vol.20 No.1 pp. 139-145 (2016)

Paper:

Graduate School of System Design, Tokyo Metropolitan University
6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan

Received: November 10, 2015
Accepted: December 10, 2015
Online released: January 19, 2016
Published: January 20, 2016

Keywords: coordinated multiple views, interactive clustering, text mining

Abstract

This paper proposes an interactive document clustering system designed around the concept of coordinated multiple views (CMV). Interactive document clustering lets a user obtain a set of document groups from a document collection through interaction with the system, and is expected to be useful for tasks such as text mining and document retrieval. Because the result of document clustering consists of several kinds of objects, namely clusters (document groups), documents, and words, each kind should be presented to users in a different way. Based on this consideration, the proposed system employs multiple views, each designed for a specific kind of object, such as documents or keywords. A prototype system is implemented on TETDM (Total Environment for Text Data Mining), an environment for developing text data mining tools, which was chosen because it provides a mechanism for coordination between modules. The proposed system classifies the information to be presented into four levels (cluster, document, bag of words, and word), each of which is displayed in a different view. Experimental results with test participants show the effectiveness of the proposed system.
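The idea of coordination between views at the four levels can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the TETDM API; the class names (`Coordinator`, `ClusterView`, `DocumentView`, `BagOfWordsView`, `WordView`), the toy data in `CLUSTERS`, and the choice of what each view displays are all assumptions made for illustration. A selection made in one view is forwarded to the other views, which each update their own presentation.

```python
from collections import Counter

# Toy data (hypothetical): cluster id -> documents, each document a list of words.
CLUSTERS = {
    "c0": {"d1": ["text", "mining", "tool"], "d2": ["text", "clustering"]},
    "c1": {"d3": ["view", "coordination", "view"]},
}


class Coordinator:
    """Forwards a cluster selection made in one view to all other views."""

    def __init__(self):
        self.views = []

    def register(self, view):
        self.views.append(view)
        view.coordinator = self

    def select(self, cluster_id, source=None):
        # Skip the view that originated the selection; it has already updated.
        for view in self.views:
            if view is not source:
                view.update(cluster_id)


class View:
    """Base class: each view stores the content it currently displays."""

    def __init__(self):
        self.coordinator = None
        self.shown = None

    def update(self, cluster_id):
        raise NotImplementedError


class ClusterView(View):
    def update(self, cluster_id):
        # Cluster level: the selected cluster and its number of documents.
        self.shown = (cluster_id, len(CLUSTERS[cluster_id]))

    def click(self, cluster_id):
        # A user action updates this view, then notifies the others.
        self.update(cluster_id)
        self.coordinator.select(cluster_id, source=self)


class DocumentView(View):
    def update(self, cluster_id):
        # Document level: the documents belonging to the selected cluster.
        self.shown = sorted(CLUSTERS[cluster_id])


class BagOfWordsView(View):
    def update(self, cluster_id):
        # Bag-of-words level: word frequencies over the selected cluster.
        counts = Counter()
        for words in CLUSTERS[cluster_id].values():
            counts.update(words)
        self.shown = dict(counts)


class WordView(View):
    def update(self, cluster_id):
        # Word level: the most frequent word in the selected cluster.
        counts = Counter(w for ws in CLUSTERS[cluster_id].values() for w in ws)
        self.shown = counts.most_common(1)[0][0]


def build_views():
    """Create one view per level and register them with a shared coordinator."""
    coordinator = Coordinator()
    views = [ClusterView(), DocumentView(), BagOfWordsView(), WordView()]
    for view in views:
        coordinator.register(view)
    return views


if __name__ == "__main__":
    cluster_view, document_view, bow_view, word_view = build_views()
    cluster_view.click("c0")
    print(document_view.shown)
    print(bow_view.shown)
    print(word_view.shown)
```

Routing every selection through a single coordinator keeps the views independent of one another, so a view for another level could be added without changing the existing ones.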

Cite this article as:

Y. Takama and T. Tonegawa, “Interactive Document Clustering System Based on Coordinated Multiple Views,” J. Adv. Comput. Intell. Intell. Inform., Vol.20 No.1, pp. 139-145, 2016.

References

[1] Y. Takama and R. Miyake, “Proposal of Interactive Clustering System Employing Grouping-based Pairwise Constraint Generation,” PacificVis2014, Poster Session, pp. 11-12, 2014.
[2] J. C. Roberts, “State of the art: Coordinated & multiple views in exploratory visualization,” Int. Conf. on Coordinated and Multiple Views in Exploratory, pp. 61-71, 2007.
[3] M. Scherr, “Multiple and Coordinated Views in Information Visualization,” The Media Informatics Advanced Seminar on Information Visualization, 2008/2009.
[4] W. Sunayama, “Knowledge Emergence using Total Environment for Text Data Mining,” SCIS&ISIS2014, pp. 1506-1511, 2014.
[5] Y. Takama and T. Tonegawa, “Development of Interactive Document Clustering System based on Coordinated Multiple Views,” IWACIII Part 2 2015, S1-4, 2015.
[6] S. Ananiadou, D. B. Kell, and J. Tsujii, “Text Mining and Its Potential Applications in Systems Biology,” TRENDS in Biotechnology, Vol.24, No.12, pp. 571-579, 2006.
[7] A. Hotho, A. Nürnberger, and G. Paaß, “A Brief Survey of Text Mining,” LDV Forum - GLDV J. for Computational Linguistics and Language Technology, Vol.20, No.1, pp. 19-62, 2005.
[8] F. Ciravegna, J. Matiasek, L. Gilardoni, and W. J. Black, “Facile: Classifying Texts Integrating Pattern Matching and Information Extraction,” Int. Joint Conf. on Artificial Intelligence, pp. 890-895, 1999.
[9] T. Nasukawa, M. Morohashi, and T. Nagano, “Customer claim mining: Discovering knowledge in vast amounts of textual data,” Technical report, IBM Research, Japan, 1999.
[10] E. Tani and W. Sunayama, “Feature Interpretation Support for Comparing Newcomers and Veterans in Electronic Health Records,” JSAI SIG-AM-03-07, pp. 37-43, 2013 (in Japanese).
Y. Takama, M. Kushima, and W. Sunayama, “Development of EMR Analysis Support Tool based on TETDM and Its Evaluation through Actual EMR Analysis,” Trans. of the Japanese Society for Artificial Intelligence, Vol.30, No.1, pp. 372-382, 2015 (in Japanese).
[11] T. Kajinami, “Practical of Information Education using TETDM,” JSAI2013, 3B3-NFC-01a-5, 2013 (in Japanese).
[12] H. Tokunaga and T. Sugimura, “Development of TETDM Tools which Unified R and Weka,” JSAI TETDM-01-SIG-IC-06-07, 2011 (in Japanese).
[13] G. Andrienko and N. Andrienko, “Coordinated Multiple Views: a Critical View,” CMV’07, pp. 72-74, 2007.
[14] T. Masui, M. Minakuchi, I. George, R. Borden, and K. Kashiwagi, “Multiple-View approach for smooth information retrieval,” Proc. of the ACM Symp. on User Interface and Software Technology (UIST’95), pp. 199-206, 1995.
[15] M. Q. W. Baldonado, A. Woodruff, and A. Kuchinsky, “Guidelines for Using Multiple Views in Information Visualization,” Working Conf. on Advanced Visual Interface Conf., pp. 110-119, 2000.
[16] Y. Takama, T. Ishibashi, T. Okada, and Y. Horii, “M2VSM: Extended VSM based on Meta Keyword and Its Application to Text Mining,” Int. J. of Computer Science and System Analysis (IJCSSA), Vol.2, No.2, pp. 115-120, 2008.
[17] W. Sunayama and M. Yachida, “A Panoramic View System for Extracting Key Sentences with Discovering Keywords Featuring a Document,” The Trans. of the Institute of Electronics, Information and Communication Engineers, Vol.J84-D-I, No.2, pp. 146-154, 2001 (in Japanese).
[18] W. Sunayama and M. Yachida, “A Panoramic View System for Extracting Key Sentences based on Viewpoints and an Application to the Search Engine,” Trans. of the Japanese Society for Artificial Intelligence, Vol.17, No.1, pp. 14-22, 2002 (in Japanese).

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
