Common Sense from the Web? Naturalness of Everyday Knowledge Retrieved from WWW
Rafal Rzepka, Yali Ge, and Kenji Araki
Language Media Laboratory, Research Group of Information Media Science and Technology, Division of Media and Network Technologies, Graduate School of Information Science and Technology, Hokkaido University, Kita 14 Nishi 9, Kita-ku, Sapporo 060-0814, Japan
This paper points to opportunities for advanced systems that lie hidden in millions of WWW pages. While the Internet is usually used by humans to acquire knowledge, we present the opposite approach, in which a machine retrieves everyday knowledge about humans, their common behaviors, and their feelings. We claim that in the long run such a capability will be necessary for every machine that interacts with a human user. We concentrate on our theories and illustrate them with the results of a web-mining experiment.