Text-Data Reduction Method to Grasp the Sequence of a Disaster Situation: Case Study of Web News Analysis of the 2015 Typhoons 17 and 18
Shosuke Sato*,†, Toru Okamoto**, and Shunichi Koshimura*
*International Research Institute of Disaster Science, Tohoku University
Aoba 468-1, Aramaki, Aoba-ku, Sendai 980-0845, Japan
**Nippon Sogo Systems, Inc., Sendai, Japan
This study aims to compress the large volume of web news delivered after disasters, treating it as a big-data source. We design and apply an article-clustering method that combines conventional techniques with an algorithm that selects a representative article for each cluster. The method is evaluated in experiments against the judgments of evaluators. The proposed algorithm agrees with the evaluators at a rate in the 50% range for clustering and in the 30% to 40% range for representative-article selection.
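The abstract does not name the specific techniques, so the following is only a minimal sketch of the general idea of clustering articles and then picking one representative per cluster. It assumes TF-IDF term weighting, cosine similarity, a greedy single-pass clustering with a hypothetical `threshold` parameter, and selection of the most central article (a medoid) as the representative. None of these choices is confirmed as the authors' method. The sample articles are invented English keyword strings split on whitespace; Japanese news text would first need a morphological analyzer for tokenization.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # assumption: whitespace tokens
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_articles(vectors, threshold):
    """Greedy single-pass clustering: an article joins the first cluster
    whose seed (first member) is at least `threshold` similar to it."""
    clusters = []  # each cluster is a list of article indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def representative(members, vectors):
    """Return the member with the highest total similarity to its cluster."""
    return max(
        members,
        key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in members),
    )


# Invented sample data, for illustration only.
articles = [
    "typhoon flood river levee breach",
    "flood river levee breach evacuation rescue",
    "levee breach evacuation rescue helicopter",
    "train service suspended railway delay",
    "railway train service resumed delay",
]

vectors = tfidf_vectors(articles)
clusters = cluster_articles(vectors, threshold=0.2)  # hypothetical threshold
for members in clusters:
    print(members, "->", articles[representative(members, vectors)])
```

In this sketch the reader of the compressed output would see one article per cluster instead of all five. The threshold controls how aggressively articles are merged, and a seed-based greedy pass is order-dependent, so a production system would likely need a more robust clustering step.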