Cite Article

Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation


BibTeX

@article{IJASEIT6816,
   author = {Yin-Lai Yeong and Tien-Ping Tan and Keng Hoon Gan and Siti Khaotijah Mohammad},
   title = {Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation},
   journal = {International Journal on Advanced Science, Engineering and Information Technology},
   volume = {8},
   number = {4-2},
   year = {2018},
   pages = {1446--1452},
   keywords = {Hybrid Machine Translation; Statistical Machine Translation; Neural Machine Translation},
   abstract = {Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by an SMT system is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT each have their strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Besides that, studies have also shown that the translation produced by SMT is better than that of NMT in cases where there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by an NMT system is wrong, it is very difficult to find the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English-to-Malay translation. The motivation for using hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses a multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by the SMT. The translation from the SMT can be viewed as a “suggestion translation” for the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science and news domains from 21.21 and 48.35 to 35.97 and 61.81, respectively.},
   issn = {2088-5334},
   publisher = {INSIGHT - Indonesian Society for Knowledge and Human Development},
   url = {http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=6816},
   doi = {10.18517/ijaseit.8.4-2.6816}
}
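For readers who want a concrete picture of the architecture named in the title and abstract, the sketch below shows a multi-source encoder-decoder in miniature: two LSTM encoders, one over the source sentence and one over the SMT "suggestion translation", whose final hidden states are combined to initialise a decoder. This is an illustrative sketch only, not the authors' implementation; all dimensions, weight initialisations, and the combination rule are invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_params(in_dim, hid_dim):
    """Random weights for one LSTM cell; rows cover the i, f, g, o gates."""
    return {
        "W": rng.normal(0, 0.1, (4 * hid_dim, in_dim + hid_dim)),
        "b": np.zeros(4 * hid_dim),
    }

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_run(params, xs, hid_dim):
    """Run an LSTM over a sequence of input vectors; return the final hidden state."""
    h = np.zeros(hid_dim)
    c = np.zeros(hid_dim)
    for x in xs:
        z = params["W"] @ np.concatenate([x, h]) + params["b"]
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

EMB, HID = 8, 16
enc_src = lstm_params(EMB, HID)               # encoder 1: English source sentence
enc_smt = lstm_params(EMB, HID)               # encoder 2: SMT suggestion translation
combine = rng.normal(0, 0.1, (HID, 2 * HID))  # merges the two final states

def encode(src_embeds, smt_embeds):
    """Encode both inputs and combine them into a decoder initial state."""
    h_src = lstm_run(enc_src, src_embeds, HID)
    h_smt = lstm_run(enc_smt, smt_embeds, HID)
    return np.tanh(combine @ np.concatenate([h_src, h_smt]))

# Toy inputs: a 5-token source sentence and a 4-token SMT suggestion,
# each already mapped to embedding vectors.
src = [rng.normal(size=EMB) for _ in range(5)]
smt = [rng.normal(size=EMB) for _ in range(4)]
h0 = encode(src, smt)
print(h0.shape)  # decoder initial state: (16,)
```

A decoder LSTM would then generate the Malay output token by token from `h0`; in the paper's setting the point is that this state reflects both the raw source and the SMT's draft.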

EndNote

%A Yeong, Yin-Lai
%A Tan, Tien-Ping
%A Gan, Keng Hoon
%A Mohammad, Siti Khaotijah
%D 2018
%T Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation
%! Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation
%K Hybrid Machine Translation; Statistical Machine Translation; Neural Machine Translation
%X Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by an SMT system is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT each have their strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Besides that, studies have also shown that the translation produced by SMT is better than that of NMT in cases where there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by an NMT system is wrong, it is very difficult to find the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English-to-Malay translation. The motivation for using hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses a multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by the SMT. The translation from the SMT can be viewed as a “suggestion translation” for the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science and news domains from 21.21 and 48.35 to 35.97 and 61.81, respectively.

%U http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=6816
%R doi:10.18517/ijaseit.8.4-2.6816
%J International Journal on Advanced Science, Engineering and Information Technology
%V 8
%N 4-2
%@ 2088-5334

IEEE

Y.-L. Yeong, T.-P. Tan, K. H. Gan, and S. K. Mohammad, "Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation," International Journal on Advanced Science, Engineering and Information Technology, vol. 8, no. 4-2, pp. 1446-1452, 2018. [Online]. Available: http://dx.doi.org/10.18517/ijaseit.8.4-2.6816

RefMan/ProCite (RIS)

TY  - JOUR
AU  - Yeong, Yin-Lai
AU  - Tan, Tien-Ping
AU  - Gan, Keng Hoon
AU  - Mohammad, Siti Khaotijah
PY  - 2018
TI  - Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation
JF  - International Journal on Advanced Science, Engineering and Information Technology
VL  - 8
IS  - 4-2
Y2  - 2018
SP  - 1446
EP  - 1452
SN  - 2088-5334
PB  - INSIGHT - Indonesian Society for Knowledge and Human Development
KW  - Hybrid Machine Translation; Statistical Machine Translation; Neural Machine Translation
N2  - Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by an SMT system is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT each have their strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Besides that, studies have also shown that the translation produced by SMT is better than that of NMT in cases where there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by an NMT system is wrong, it is very difficult to find the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English-to-Malay translation. The motivation for using hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses a multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by the SMT. The translation from the SMT can be viewed as a “suggestion translation” for the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science and news domains from 21.21 and 48.35 to 35.97 and 61.81, respectively.

UR  - http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=6816
DO  - 10.18517/ijaseit.8.4-2.6816
ER  - 

RefWorks

RT Journal Article
ID 6816
A1 Yeong, Yin-Lai
A1 Tan, Tien-Ping
A1 Gan, Keng Hoon
A1 Mohammad, Siti Khaotijah
T1 Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation
JF International Journal on Advanced Science, Engineering and Information Technology
VO 8
IS 4-2
YR 2018
SP 1446
OP 1452
SN 2088-5334
PB INSIGHT - Indonesian Society for Knowledge and Human Development
K1 Hybrid Machine Translation; Statistical Machine Translation; Neural Machine Translation
AB Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by an SMT system is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT each have their strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Besides that, studies have also shown that the translation produced by SMT is better than that of NMT in cases where there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by an NMT system is wrong, it is very difficult to find the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English-to-Malay translation. The motivation for using hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses a multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by the SMT. The translation from the SMT can be viewed as a “suggestion translation” for the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science and news domains from 21.21 and 48.35 to 35.97 and 61.81, respectively.

LK http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=6816
DO 10.18517/ijaseit.8.4-2.6816