Finding Communities Using User Preference in Web Structure Mining

Takeshi Yoshikawa; Hidetoshi Nonaka

doi:10.20965/jaciii.2011.p0377

JACIII Vol.15 No.3 pp. 377-382 (2011)

Paper:

Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo 060-0814, Japan

Received: October 30, 2010

Accepted: January 5, 2011

Published: May 20, 2011

Keywords: web community, web structure mining, user preference

Abstract

Web structure mining is a family of methods based on the graph structure of hyperlinks; it does not use the content of web pages. The HITS and PageRank algorithms are popular methods for web structure mining. In this study, we deal with an algorithm for finding web communities in web structure mining. The algorithm receives the URLs of some web pages known to the user and proposes candidate pages in the web community to the user by using the structure of a bipartite graph. We investigate the effect of introducing a user preference for each known page and discuss ways to improve the algorithm for finding web communities.
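The abstract does not give the details of the algorithm, so the following is only a minimal sketch of one way that preference-weighted community finding on a bipartite link structure could look, not the authors' method. It assumes a hypothetical in-memory link table `out_links`, a hypothetical preference scale in `seed_weights`, and a hypothetical threshold `min_shared`; pages that link to several known pages act as "fans", and pages co-cited by those fans become candidates.

```python
from collections import defaultdict


def rank_candidates(out_links, seed_weights, min_shared=2):
    """Rank pages that are co-cited with the user's known pages.

    out_links:    dict mapping a page URL to the set of URLs it links to
                  (hypothetical input; a real system would get this from a crawl).
    seed_weights: dict mapping each known page URL to a preference weight
                  (hypothetical scale; larger means more preferred).
    min_shared:   a fan must link to at least this many known pages
                  (assumed threshold, not taken from the paper).
    """
    seeds = set(seed_weights)
    scores = defaultdict(float)
    for fan, targets in out_links.items():
        shared = targets & seeds
        # Skip the known pages themselves and weakly related fans.
        if fan in seeds or len(shared) < min_shared:
            continue
        # A fan counts more when it links to strongly preferred known pages.
        fan_score = sum(seed_weights[s] for s in shared)
        # Every other page the fan links to becomes a candidate.
        for page in targets - seeds:
            scores[page] += fan_score
    # Highest score first; ties broken by URL for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: f1 and f2 link to both known pages, f3 to only one.
out_links = {
    "f1": {"a", "b", "x"},
    "f2": {"a", "b", "x", "y"},
    "f3": {"b", "z"},
}
seed_weights = {"a": 1.0, "b": 3.0}
print(rank_candidates(out_links, seed_weights))
```

In this toy example, page `x` is ranked above `y` because two qualifying fans link to it, and `z` is not proposed because its only fan links to a single known page. Raising the weight of a known page raises the score of every candidate co-cited with it, which is the kind of effect the study sets out to examine.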

Cite this article as:

T. Yoshikawa and H. Nonaka, “Finding Communities Using User Preference in Web Structure Mining,” J. Adv. Comput. Intell. Intell. Inform., Vol.15 No.3, pp. 377-382, 2011.

References

[1] J. M. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins, “The Web as a Graph: Measurements, Models, and Methods,” Proc. of the 5th Annual Int. Conf. on Computing and Combinatorics, Lecture Notes in Computer Science, Vol.1627, 1999.
[2] L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank Citation Ranking: Bringing Order to the Web,” Technical Report, Stanford University, 1998.
[3] Z. Gyongyi, H. Garcia-Molina, and J. Pedersen, “Combating Web Spam with TrustRank,” Technical Report, Stanford University, 2004.
[4] H. Shimizu, “A Study on Web Mining Based on Web Communities,” Graduation Thesis, Hokkaido University, 2006. (in Japanese)
[5] T. Murata, “Finding Related Web Pages Based on Connectivity Information from a Search Engine,” Poster Proc. of the Tenth Int. World Wide Web Conf. (WWW10), 2001.
[6] K. Eguchi, K. Oyama, A. Aizawa, and H. Ishikawa, “Overview of the Informational Retrieval Task at NTCIR-4 WEB,” Proc. of the Fourth NTCIR Workshop on Research in Information Access Technologies: Information Retrieval, Question Answering and Summarization, 2004.

This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.
