JACIII Vol.11 No.4 pp. 416-422
doi: 10.20965/jaciii.2007.p0416
(2007)

Paper:

Automatic Extraction of Key Sentences via Word Sense Identification for Chinese Text Summarization

Yau-Hwang Kuo and Hsun-Hui Huang

CREDIT, Department of Computer Science and Information Engineering, National Cheng Kung University, No.1, Ta-Hsueh Rd., Tainan, Taiwan

Received:
April 30, 2006
Accepted:
August 16, 2006
Published:
April 20, 2007
Keywords:
key sentences, text summarization, word sense disambiguation, sense representation, fuzzy transaction
Abstract

This paper proposes a novel method of key-sentence extraction for automatic Chinese text summarization. Its two main components are the discovery of key senses and sense patterns, and the extraction of key sentences. Since no Chinese lexical database comparable to WordNet was available to the authors, the method compromises by word-segmenting and POS-tagging the target Chinese text, then translating all nouns and verbs into English so that their senses can be disambiguated using WordNet. The distinguishing characteristic of the method is that each sentence is represented by senses, and the key senses in each sentence form a fuzzy transaction. Each entry of the fuzzy transaction is the maximum degree of similarity between the corresponding key sense and the senses in the sentence. A prototype of this summarization scheme was constructed, and summary quality is measured with an intrinsic method based on information-retrieval criteria. The results of applying the prototype to datasets with manually generated summaries are reported.
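
The fuzzy transaction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sense labels, the `TOY_SIMILARITY` table, and the function names are all hypothetical, and the toy lookup stands in for whatever WordNet-based similarity measure the paper actually uses. The sketch only shows the stated rule that each entry is the maximum similarity between a key sense and the senses of a sentence.

```python
from typing import Callable, List, Sequence

# Hypothetical pairwise similarity scores in [0, 1]. A real system would
# derive these from a WordNet-based measure; the values here are made up.
TOY_SIMILARITY = {
    frozenset({"bank.n.01", "money.n.01"}): 0.8,
    frozenset({"bank.n.01", "river.n.01"}): 0.2,
    frozenset({"loan.n.01", "money.n.01"}): 0.7,
}


def toy_similarity(a: str, b: str) -> float:
    """Symmetric lookup: identical senses score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset({a, b}), 0.0)


def fuzzy_transaction(
    key_senses: Sequence[str],
    sentence_senses: Sequence[str],
    similarity: Callable[[str, str], float],
) -> List[float]:
    """Return one membership degree per key sense.

    Each entry is the maximum similarity between that key sense and any
    sense in the sentence; a sentence with no senses yields 0.0 entries.
    """
    return [
        max((similarity(key, sense) for sense in sentence_senses), default=0.0)
        for key in key_senses
    ]


if __name__ == "__main__":
    key_senses = ["money.n.01", "river.n.01"]
    sentence_senses = ["bank.n.01", "loan.n.01"]
    print(fuzzy_transaction(key_senses, sentence_senses, toy_similarity))
```

With the toy table above, the sentence is strongly associated with the "money" key sense and only weakly with the "river" key sense, so its fuzzy transaction has a high first entry and a low second one.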

Cite this article as:
Y. Kuo and H. Huang, “Automatic Extraction of Key Sentences via Word Sense Identification for Chinese Text Summarization,” J. Adv. Comput. Intell. Intell. Inform., Vol.11, No.4, pp. 416-422, 2007.

