Syntax-enhanced neural machine translation with graph encoder

Subject heading index

20

Research field

Information science

Document type

Authors

Nguyễn Hồng Bửu Long⁽¹⁾, Phạm Hùng Việt

Title

Tăng cường tri thức cú pháp cho dịch máy mạng neural sử dụng bộ mã hóa đồ thị

English title

Syntax-enhanced neural machine translation with graph encoder

Source

Tạp chí Khoa học - Đại học Sư phạm TP Hồ Chí Minh

Publication year

2022

Issue

10

Pages

1725-1734

ISSN

1859-3100

Keywords

Neural machine translation (NMT), Graph neural networks, Syntax, Encoding

English keywords

Constituent tree, Graph neural networks, Neural machine translation, Syntax

Abstract

Neural machine translation (NMT) is a new paradigm in machine translation (MT) supported by recent advances in deep learning techniques. With neural networks, NMT has become a promising approach to automatic translation in recent years. Despite its clear successes, NMT has one important drawback: it cannot integrate syntactic knowledge into the translation model. This paper proposes extending the NMT model to incorporate additional syntactic information from constituency parse trees. We represent the constituency trees as graphs, which are encoded by a graph encoder to enhance the attention mechanism, allowing the decoder to attend to both the sequential and the graph representations at each decoding step. Experiments show promising results for the proposed method on an English-Vietnamese dataset, demonstrating the effectiveness of NMT when syntactic knowledge is integrated.

English abstract

Neural machine translation (NMT) is a new paradigm in machine translation (MT) powered by recent advances in sequence-to-sequence learning frameworks. With the advance of neural networks, NMT has become the most promising MT approach in recent years. Despite its apparent success, NMT still has one significant drawback: difficulty integrating syntactic knowledge into neural networks. This paper proposes an extension of the NMT model that incorporates additional syntactic information from constituency trees. We represent the constituency trees as graphs, encoded by a graph encoder, to enhance the attention layer, which allows the decoder to attend to both the sequential and the graph representations at each decoding step. The experiments show promising results for the proposed method on English-Vietnamese datasets, demonstrating the effectiveness of our syntax-enhanced NMT method.
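
The abstract describes representing constituency trees as graphs before they are passed to a graph encoder. This record does not give the paper's actual graph construction, so the following is only a minimal sketch of one plausible first step. It assumes Penn Treebank-style bracketed input and parent-to-child edges; the function name tree_to_graph and the example sentence are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def tree_to_graph(bracketed: str) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Turn a bracketed constituency tree into (node labels, parent->child edges).

    Assumption: input looks like "(S (NP (PRP I)) (VP (VBP like)))".
    Every constituent label and every word becomes one graph node.
    """
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    nodes: List[str] = []
    edges: List[Tuple[int, int]] = []
    stack: List[int] = []      # indices of constituents that are still open
    expect_label = False       # True right after "(": next token is a label

    for tok in tokens:
        if tok == "(":
            expect_label = True
        elif tok == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            stack.pop()
        else:
            nodes.append(tok)
            idx = len(nodes) - 1
            if stack:
                # Link the new node to the innermost open constituent.
                edges.append((stack[-1], idx))
            if expect_label:
                # Constituent labels stay open until their ")" arrives.
                stack.append(idx)
                expect_label = False

    if stack:
        raise ValueError("unbalanced opening bracket")
    return nodes, edges


if __name__ == "__main__":
    example = "(S (NP (PRP I)) (VP (VBP like) (NP (NN tea))))"
    labels, links = tree_to_graph(example)
    for parent, child in links:
        print(f"{labels[parent]} -> {labels[child]}")
```

For the example sentence this yields 10 nodes and 9 edges, since a tree always has one edge fewer than it has nodes. A graph encoder would typically consume such an edge list as an adjacency structure; whether the paper adds reverse edges or word-order edges is not stated in this record.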

Repository shelf mark

TTKHCNQG, CTv 138

References

[1] Xu, K., Wu, L., Wang, Z., Feng, Y., & Sheinin, V. (2018). Graph2seq: Graph to sequence learning with attention-based neural networks. CoRR, abs/1804.00823. Retrieved from http://arxiv.org/abs/1804.00823
[2] Wu, F., Fan, A., Baevski, A., Dauphin, Y. N., & Auli, M. (2019). Pay less attention with lightweight and dynamic convolutions. CoRR, abs/1901.10430. Retrieved from http://arxiv.org/abs/1901.10430
[3] Wang, Y., Wang, L., Zeng, X., Wong, D. F., Chao, L. S., & Lu, Y. (2014). Factored statistical machine translation for grammatical error correction. In Proceedings of the eighteenth conference on computational natural language learning: Shared task (pp. 83-90). Baltimore, Maryland: Association for Computational Linguistics. Retrieved from https://aclanthology.org/W14-1711 doi: https://doi.org/10.3115/v1/W14-1711
[4] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., . . . Polosukhin, I. (2017). Attention is all you need. CoRR, abs/1706.03762. Retrieved from http://arxiv.org/abs/1706.03762
[5] Sennrich, R., Haddow, B., & Birch, A. (2016). Neural machine translation of rare words with subword units. In Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: Long papers) (pp. 1715-1725). Berlin, Germany: Association for Computational Linguistics. Retrieved from https://www.aclweb.org/anthology/P16-1162 doi: https://doi.org/10.18653/v1/P16-1162
[6] Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002). Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting on association for computational linguistics (pp. 311-318). USA: Association for Computational Linguistics. doi: https://doi.org/10.3115/1073083.1073135
[7] Nădejde, M., Reddy, S., Sennrich, R., Dwojak, T., Junczys-Dowmunt, M., Koehn, P., & Birch, A. (2017). Predicting target language CCG supertags improves neural machine translation. In Proceedings of the second conference on machine translation (pp. 68-79). Copenhagen, Denmark: Association for Computational Linguistics. Retrieved from https://aclanthology.org/W17-4707 doi: https://doi.org/10.18653/v1/W17-4707
[8] Li, J., Xiong, D., Tu, Z., Zhu, M., Zhang, M., & Zhou, G. (2017). Modeling source syntax for neural machine translation. In Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: Long papers) (pp. 688-697). Vancouver, Canada: Association for Computational Linguistics. Retrieved from https://aclanthology.org/P17-1064 doi: https://doi.org/10.18653/v1/P17-1064
[9] Koehn, P., & Hoang, H. (2007). Factored translation models. In Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL) (pp. 868-876). Prague, Czech Republic: Association for Computational Linguistics. Retrieved from https://aclanthology.org/D07-1091
[10] Koehn, P. (2004). Statistical significance tests for machine translation evaluation. In Proceedings of the 2004 conference on empirical methods in natural language processing (pp. 388-395). Barcelona, Spain: Association for Computational Linguistics. Retrieved from https://aclanthology.org/W04-3250
[11] Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In Y. Bengio & Y. LeCun (Eds.), 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, conference track proceedings. Retrieved from http://arxiv.org/abs/1412.6980
[12] Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017). Convolutional sequence to sequence learning. CoRR, abs/1705.03122. Retrieved from http://arxiv.org/abs/1705.03122
[13] Eriguchi, A., Hashimoto, K., & Tsuruoka, Y. (2016). Tree-to-sequence attentional neural machine translation. In Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: Long papers) (pp. 823-833). Berlin, Germany: Association for Computational Linguistics. Retrieved from https://aclanthology.org/P16-1078 doi: https://doi.org/10.18653/v1/P16-1078
[14] Cettolo, M., Niehues, J., Stüker, S., Bentivogli, L., Cattoni, R., & Federico, M. (2015). The IWSLT 2015 evaluation campaign.
[15] Chen, H., Huang, S., Chiang, D., & Chen, J. (2017). Improved neural machine translation with a syntax-aware encoder and decoder. In Proceedings of the 55th annual meeting of the association for computational linguistics (volume 1: Long papers) (pp. 1936-1945). Vancouver, Canada: Association for Computational Linguistics. Retrieved from https://aclanthology.org/P17-1177 doi: https://doi.org/10.18653/v1/P17-1177
[16] Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Y. Bengio & Y. LeCun (Eds.), 3rd international conference on learning representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, conference track proceedings. Retrieved from http://arxiv.org/abs/1409.0473