Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation

Yin-Lai Yeong (1), Tien-Ping Tan (2), Keng Hoon Gan (3), Siti Khaotijah Mohammad (4)
How to cite (IJASEIT):
Yeong, Yin-Lai, et al. “Hybrid Machine Translation With Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation”. International Journal on Advanced Science, Engineering and Information Technology, vol. 8, no. 4-2, Sept. 2018, pp. 1446-52, doi:10.18517/ijaseit.8.4-2.6816.
Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by an SMT system is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT have their own strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Besides that, studies have also shown that the translation produced by SMT is better than NMT in cases where there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by an NMT system is wrong, it is very difficult to locate the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English-to-Malay translation. The motivation for using a hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses the multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by the SMT system. The translation from the SMT system can be viewed as a “suggestion translation” for the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science domain and the news domain from 21.21 and 48.35 to 35.97 and 61.81, respectively.
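The two-encoder idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the embedding size, hidden size, and random inputs are hypothetical, and the final encoder states are merged with a linear combiner in the style of Zoph and Knight (2016), one common choice for multi-source models.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(D, H):
    """Weights for one LSTM cell: the four gates (i, f, o, g) stacked row-wise."""
    return (rng.normal(size=(4 * H, D)) * 0.1,
            rng.normal(size=(4 * H, H)) * 0.1,
            np.zeros(4 * H))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell update for input vector x and previous state (h, c)."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

def encode(seq, params, H):
    """Run an encoder over an embedded sequence; return its final (h, c) state."""
    h, c = np.zeros(H), np.zeros(H)
    for x in seq:
        h, c = lstm_step(x, h, c, *params)
    return h, c

D, H = 8, 16  # hypothetical embedding and hidden sizes
src = [rng.normal(size=D) for _ in range(5)]  # embedded source sentence
smt = [rng.normal(size=D) for _ in range(6)]  # embedded SMT "suggestion"

h1, c1 = encode(src, init_params(D, H), H)  # encoder 1: sentence to translate
h2, c2 = encode(smt, init_params(D, H), H)  # encoder 2: initial SMT translation

# Merge the two final encoder states to initialise the decoder
# (linear combiner as in Zoph & Knight, 2016):
Wc = rng.normal(size=(H, 2 * H)) * 0.1
h0 = np.tanh(Wc @ np.concatenate([h1, h2]))  # decoder initial hidden state
c0 = c1 + c2                                  # decoder initial cell state
```

A decoder LSTM would then generate the Malay output token by token starting from `(h0, c0)`; attention over both encoders, as in the cited attention-based NMT work, would be layered on top of this skeleton.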

D. Bahdanau, K. Cho and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014.

T. Luong, H. Pham and C. D. Manning, “Effective approaches to attention-based neural machine translation,” in Proc. Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, pp. 1412-1421, Sept. 2015.

P. Koehn and R. Knowles, “Six challenges for neural machine translation,” in Proc. Workshop on Neural Machine Translation, pp. 28-39, 2017.

K. Cho, B. van Merriënboer, C. Gulcehre, F. Bougares, H. Schwenk and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.

I. Sutskever, O. Vinyals and Q. V. Le, “Sequence to sequence learning with neural networks,” in Proc. International Conference on Neural Information Processing Systems, pp. 3104-3112, Dec. 2014.

L. Dahlmann, E. Matusov, P. Petrushkov and S. Khadivi, “Neural machine translation leveraging phrase-based models in a hybrid search,” in Proc. Conference on Empirical Methods in Natural Language Processing, Sept 2017.

F. Stahlberg, A. de Gispert, E. Hasler and B. Byrne, “Neural machine translation by minimising the Bayes-risk with respect to syntactic translation lattices,” in Proc. Conference of the European Chapter of the Association for Computational Linguistics, vol. 2, pp. 362-368, April 2017.

X. Wang, Z. Lu, Z. Tu, H. Li, D. Xiong, and M. Zhang, “Neural machine translation advised by statistical machine translation,” in Proc. AAAI Conference on Artificial Intelligence, 2017.

J. Du and A. Way, “Neural pre-translation for hybrid machine translation,” in Proc. MT Summit XVI, vol.1, pp. 27-40, Sept. 2017.

R. Dabre, F. Cromieres and S. Kurohashi, “Enabling multi-source neural machine translation by concatenating source sentences in multiple languages,” arXiv preprint arXiv:1702.06135, 2017.

B. Zoph and K. Knight, “Multi-source neural translation,” in Proc. NAACL-HLT, pp. 30-34, June 2016.

J. Zhang, Q. Liu and J. Zhou, “ME-MD: An effective framework for neural machine translation with multiple encoders and decoders,” in Proc. IJCAI, pp. 3392-3398, Aug. 2017.

P. Koehn, “Pharaoh: a beam search decoder for phrase-based statistical machine translation models,” in Proc. AMTA, pp. 115-124, Sept. 2004.

P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin and E. Herbst, “Moses: Open Source Toolkit for Statistical Machine Translation,” in Proc. ACL 2007, June 2007.

M. Olteanu, C. Davis, I. Volosen and D. Moldovan, “Phramer - an open source statistical phrase-based translator,” in Proc. Workshop on Statistical Machine Translation, pp. 146-149, June 2006.

F. J. Och and H. Ney, “A systematic comparison of various statistical alignment models,” Computational Linguistics, vol. 29, no. 1, pp. 19-51, 2003.

Y. Deng and W. Byrne, “MTTK: An alignment toolkit for statistical machine translation,” in Proc. Human Language Technology Conference of the NAACL, pp. 265-268, 2006.

I. Mohd Yassin, R. Jailani, M. S. A. Megat Ali, R. Baharom, A. H. Abu Hassan and Z. I. Rizman, “Comparison between Cascade Forward and Multi-Layer Perceptron Neural Networks for NARX Functional Electrical Stimulation (FES)-Based Muscle Model,” International Journal on Advanced Science, Engineering and Information Technology, vol. 7, no. 1, pp. 215-221, 2017.

M. A. Nielsen, Neural Networks and Deep Learning. Determination Press, 2015.

A. A. Amri, A. R. Ismail and A. Ahmad Zarir, “Convolutional neural networks and deep belief networks for analysing imbalanced class issue in handwritten dataset,” International Journal on Advanced Science, Engineering and Information Technology, vol. 7, no. 6, pp. 2302-2307, 2017.

I. Sutskever, O. Vinyals and Q. V. Le, “Sequence to sequence learning with neural networks,” Advances in Neural Information Processing Systems, pp. 3104-3112, 2014.

A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. California, US: O'Reilly Media, 2017.

R. Sennrich and M. Volk, “MT-based sentence alignment for OCR-generated parallel texts,” in Proc. AMTA 2010, 2010.

(2018) The MalaysiaKini website. [Online]. Available: https://www.malaysiakini.com/

T.-P. Tan, H. Li, E. K. Tang, X. Xiao and E. S. Chng, “MASS: A Malay Language LVCSR Corpus Resource,” in Proc. Oriental Cocosda, pp. 25-30, Aug. 2009.

A. Stolcke, “SRILM - an extensible language modeling toolkit,” in Proc. International Conference on Spoken Language Processing, pp. 901-904, 2002.

A. Bérard, O. Pietquin, L. Besacier and C. Servan, “Listen and translate: A proof of concept for end-to-end speech-to-text translation,” in Proc. NIPS, pp. 1-5, 2016.
