Turing Test-Based Evaluation of an Experimental System for Generation of Casual English Sentences from Regular English Input
Eleanor Clark and Kenji Araki
Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 8, Kita-ku, Sapporo, Hokkaido 060-0814, Japan
This paper proposes an experimental system for generating slang-style casual English sentences from regular English input using a phonetic database approach, primarily as an AI task, with real-life applications such as social media marketing. An original database consisting of multiple candidates of casual English phonemes was constructed, and linguistic analysis of Twitter data was used to establish the optimum frequency of slang tokens per sentence. The human-likeness and legibility of the system's output sentences were evaluated in an experiment based on the classical definition of the Turing test, in which fifty human evaluators attempted to distinguish sentences produced by the system from genuine human-authored sentences. The experimental results demonstrated that the gap in human-likeness scores between the “human” and “machine” sentences was small, and that some “machine” sentences actually outperformed several of the “human” sentences. The “machine” sentences’ average score of 3.1 on a 5-point scale, where 3 indicated complete uncertainty as to whether a sentence was human-authored or machine-authored, can be considered a pass of the Turing test under the established definition. In this paper, we describe the potential approaches to the task, the construction of the phonetic database and the proposed system, and discuss the evaluation results.
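The idea of limiting how many slang tokens appear in a sentence can be illustrated with a minimal sketch. It is not the authors' system: it substitutes whole words rather than phonemes, and the table entries, the function name `casualize`, and the parameter `max_slang_ratio` are all hypothetical choices made for illustration.

```python
import random

# Hypothetical toy table: regular word -> candidate casual spellings.
# Illustrative entries only; not taken from the paper's phonetic database.
CASUAL_CANDIDATES = {
    "you": ["u", "ya"],
    "are": ["r"],
    "to": ["2"],
    "tonight": ["2nite", "tonite"],
    "see": ["c"],
}


def casualize(sentence, max_slang_ratio=0.5, seed=0):
    """Replace some words with casual variants, capping the share of slang tokens.

    max_slang_ratio is an assumed stand-in for the "optimum frequency of
    slang tokens per sentence" mentioned in the abstract.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    tokens = sentence.split()  # naive whitespace tokenization, no punctuation handling
    # Positions whose lowercase form has at least one casual candidate.
    replaceable = [
        i for i, tok in enumerate(tokens) if tok.lower() in CASUAL_CANDIDATES
    ]
    # Maximum number of tokens that may be turned into slang.
    budget = int(len(tokens) * max_slang_ratio)
    chosen = rng.sample(replaceable, min(budget, len(replaceable)))
    for i in chosen:
        tokens[i] = rng.choice(CASUAL_CANDIDATES[tokens[i].lower()])
    return " ".join(tokens)


if __name__ == "__main__":
    print(casualize("are you going to see them tonight", max_slang_ratio=0.5))
```

With `max_slang_ratio=0.0` the input is returned unchanged, and with `1.0` every word that has a candidate is replaced. A phoneme-level version would look up pronunciations first and substitute casual spellings for phoneme sequences instead of whole words.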