
JACIII Vol.16 No.5 pp. 603-610
doi: 10.20965/jaciii.2012.p0603


Latent Topic Estimation Based on Events in a Document

Risa Kitajima and Ichiro Kobayashi

Advanced Sciences, Graduate School of Humanities and Sciences, Ochanomizu University, 2-1-1 Ohtsuka, Bunkyo-ku, Tokyo 112-8610, Japan

Received: November 24, 2011
Accepted: May 8, 2012
Published: July 20, 2012
Keywords: latent Dirichlet allocation, event extraction, document retrieval, multi-document summarization
Several latent topic model-based methods, such as Latent Semantic Indexing (LSI), Probabilistic LSI (pLSI), and Latent Dirichlet Allocation (LDA), have been widely used for text analysis. These methods assign topics to individual words, however, so the relationships between words in a document are not considered. To address this, we propose a latent topic extraction method that assigns topics to events, which represent the relations between words in a document. Events can be expressed in several ways, and the accuracy of latent topic estimation depends on how an event is defined. We therefore propose five event types and examine, using a common document retrieval task, which event type works well for estimating latent topics in a document. As an application of the proposed method, we also demonstrate multi-document summarization based on latent topics. These experiments confirm that the proposed method achieves higher accuracy than the conventional method.
Cite this article as:
R. Kitajima and I. Kobayashi, “Latent Topic Estimation Based on Events in a Document,” J. Adv. Comput. Intell. Intell. Inform., Vol.16 No.5, pp. 603-610, 2012.
