JACIII Vol.20 No.6 pp. 893-901
doi: 10.20965/jaciii.2016.p0893


Template-Based Model for Mongolian-Chinese Machine Translation

Jing Wu, Hongxu Hou, Feilong Bao, and Yupeng Jiang

College of Computer Science, Inner Mongolia University
Hohhot, China

Received: February 21, 2016
Accepted: July 13, 2016
Online released: November 20, 2016
Published: November 20, 2016
Keywords: machine translation, template extraction, template translation, combined system, TBMT

Statistical machine translation (SMT) between Mongolian and Chinese is limited by the complex morphology of Mongolian, the scarcity of parallel corpora, and the significant syntactic differences between the two languages. To address these problems, we propose a template-based machine translation (TBMT) system and combine it with an SMT system to achieve better translation performance. The proposed TBMT model consists of a template extraction model and a template translation model. In the template extraction model, we present a novel method that aligns and abstracts static words from a bilingual parallel corpus to extract templates automatically. In the template translation model, a specially designed method for filtering out low-quality matches improves translation performance. Moreover, we apply lemmatization and Latinization to address data sparsity and to perform fuzzy matching. Experimentally, the coverage of the TBMT system is over 50%, and the combined SMT system translates all the remaining uncovered source sentences. The TBMT system outperforms the phrase-based and hierarchical phrase-based SMT baselines by 3.08 and 1.40 BLEU points, respectively. The combined TBMT and SMT system also outperforms these baselines, by 2.49 and 0.81 BLEU points, respectively.
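
The division of labor between the two systems can be illustrated with a minimal sketch. It is not the authors' implementation. The names `match_template`, `translate`, `slot_lexicon`, and `smt_fallback` are hypothetical, the tokens are English and Chinese placeholders rather than real Mongolian data, and the filtering rule (reject a match when any slot filler lacks a lexicon entry) is a simple stand-in for the paper's method of filtering low-quality matches.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A template pairs a source pattern with a target pattern.
# Tokens such as "X1" are slots; all other tokens are static words.
Template = Tuple[Sequence[str], Sequence[str]]


def is_slot(token: str) -> bool:
    """Slots are written X1, X2, ... in this toy notation (an assumption)."""
    return token.startswith("X") and token[1:].isdigit()


def match_template(tokens: Sequence[str], pattern: Sequence[str]) -> Optional[Dict[str, str]]:
    """Return slot bindings if tokens fit the pattern, else None.
    Each slot binds exactly one token in this simplified version."""
    if len(tokens) != len(pattern):
        return None
    bindings: Dict[str, str] = {}
    for tok, pat in zip(tokens, pattern):
        if is_slot(pat):
            bindings[pat] = tok
        elif tok != pat:
            return None
    return bindings


def translate(
    sentence: str,
    templates: List[Template],
    slot_lexicon: Dict[str, str],
    smt_fallback: Callable[[str], str],
) -> Tuple[str, str]:
    """Translate with the first acceptable template, else fall back to SMT.
    Returns (translation, name of the system that produced it)."""
    tokens = sentence.split()
    for src_pattern, tgt_pattern in templates:
        bindings = match_template(tokens, src_pattern)
        if bindings is None:
            continue
        # Stand-in quality filter: every slot filler must be translatable.
        if not all(word in slot_lexicon for word in bindings.values()):
            continue
        output = [slot_lexicon[bindings[t]] if t in bindings else t for t in tgt_pattern]
        return " ".join(output), "TBMT"
    return smt_fallback(sentence), "SMT"


def coverage(sentences: List[str], templates, slot_lexicon, smt_fallback) -> float:
    """Fraction of sentences handled by the template system."""
    handled = sum(
        translate(s, templates, slot_lexicon, smt_fallback)[1] == "TBMT" for s in sentences
    )
    return handled / len(sentences)
```

In this sketch a sentence counts as covered only when a template both matches and passes the filter, so the coverage figure and the SMT fallback rate always sum to one.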

