Visualizing Fuzzy Relationship in Bibliographic Big Data Using Hybrid Approach Combining Fuzzy c-Means and Newman-Girvan Algorithm
Maslina Zolkepli, Fangyan Dong, and Kaoru Hirota
Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, G3-49, 4259 Nagatsuta, Midori-ku, Yokohama 226-8502, Japan
A bibliographic big data visualization method is proposed that combines fuzzy c-means clustering with the Newman-Girvan clustering algorithm and displays the clustered results in a network view by grouping objects with similar cluster memberships. Because current bibliographic visualizations focus on crisp relationships among data, fuzzy analysis and visualization may offer additional insight into bibliographic big data and enable faster decision making by improving the precision of the displayed information. The proposed method is applied to the DBLP citation network dataset. Results show that merging the two clustering algorithms and visualizing the output with fuzzy techniques enables a user to converge on a few target papers within an average of 5 minutes from the 1.5 million papers stored in DBLP. Target users include researchers, educators, and students who hope to use real-world social and biological networks. The proposed method is planned to be made available to the public through the Internet.
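As a rough illustration of what "cluster memberships" means here, the following sketch implements the standard fuzzy c-means update rules in plain Python. It is not the authors' implementation: the function names, the toy 2-D points, the initial centers, and the fuzzifier m = 2 are all assumptions made for the example, and the Newman-Girvan step and the way the paper combines the two algorithms are not shown.

```python
import math

def memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.

    Standard fuzzy c-means rule: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
    Each row sums to 1.
    """
    exponent = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            u.append([h / total for h in hits])
            continue
        u.append([1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists])
    return u

def update_centers(points, u, m=2.0):
    """Recompute each center as the mean of all points weighted by u_ij^m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def fuzzy_c_means(points, init_centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return memberships(points, centers, m), centers

# Hypothetical toy data: two tight groups and one point halfway between them.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (6.0, 6.0), (5.0, 6.0), (6.0, 5.0),
          (3.0, 3.0)]
u, centers = fuzzy_c_means(points, init_centers=[(0.0, 0.0), (6.0, 6.0)])
for p, row in zip(points, u):
    print(p, [round(x, 3) for x in row])
```

In this toy setup the point between the two groups receives a split membership rather than a single hard label, which is the kind of fuzzy relationship a network view could encode, for example through edge weight or node color.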