Supporting the Translation and Authoring of Test Items with Techniques of Natural Language Processing

Ming-Shin Lu; Yu-Chun Wang; Jen-Hsiang Lin; Chao-Lin Liu; Zhao-Ming Gao; Chun-Yen Chang

doi:10.20965/jaciii.2008.p0234


JACIII Vol.12 No.3 pp. 234-242 (2008)



Ming-Shin Lu*, Yu-Chun Wang**, Jen-Hsiang Lin*, Chao-Lin Liu*, Zhao-Ming Gao**, and Chun-Yen Chang***

*National Chengchi University

**National Taiwan University

***National Taiwan Normal University

Received: April 21, 2007

Accepted: September 15, 2007

Published: May 20, 2008

Keywords: natural language processing, computer assisted education, controlled languages, test item translation, test item writing

Abstract

Applying natural language processing techniques to the preparation of educational resources for language learning has become an important research area. We report two software systems designed to assist test item translation and test item authoring. The first is a software environment that helps experts translate test items for the Trends in International Mathematics and Science Study (TIMSS). TIMSS test items are prepared in American English and are to be translated into traditional Chinese. The second is a software environment for composing test items for introductory Chinese courses. It currently aids the preparation of four important categories of test items, and the resulting test items can be administered over the Internet.

Cite this article as:

M. Lu, Y. Wang, J. Lin, C. Liu, Z. Gao, and C. Chang, “Supporting the Translation and Authoring of Test Items with Techniques of Natural Language Processing,” J. Adv. Comput. Intell. Intell. Inform., Vol.12 No.3, pp. 234-242, 2008.



This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.



