Field: Computer science
Authors: Phạm Vĩnh Khang, Nguyễn Hồng Bửu Long
Title: Hướng đến tiền huấn luyện cross-attention trong dịch máy bằng nơ-ron
Translated title: Towards cross-attention pre-training in neural machine translation
Journal: Tạp chí Khoa học - Đại học Sư phạm TP Hồ Chí Minh
Year: 2022
Issue: 10
Pages: 1749-1755
ISSN: 1859-3100
Database: TTKHCNQG, CTv 138
- [1] Yang, J., Wang, M., Zhou, H., Zhao, C., Zhang, W., Yu, Y., & Li, L. (2020). Towards making the most of BERT in neural machine translation. Proceedings of the AAAI Conference on Artificial Intelligence, 9378-9385.
- [2] Weng, R., Yu, H., Huang, S., Cheng, S., & Luo, W. (2020). Acquiring Knowledge from Pre-trained Model to Neural Machine Translation. Proceedings of the AAAI Conference on Artificial Intelligence, 9266-9273.
- [3] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... Polosukhin, I. (2017). Attention is All you Need. NIPS.
- [4] Tran, N. L., Le, D. M., & Nguyen, D. Q. (2022). BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese. arXiv preprint arXiv:2109.09701.
- [5] Song, K., Tan, X., Qin, T., Lu, J., & Liu, T.-Y. (2019). MASS: Masked Sequence to Sequence Pre-training for Language Generation. arXiv preprint arXiv:1905.02450.
- [6] Ren, S., Zhou, L., Liu, S., Wei, F., Zhou, M., & Ma, S. (2021). SemFace: Pre-training Encoder and Decoder with a Semantic Interface for Neural Machine Translation. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 4518-4527.
- [7] Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI.
- [8] Ott, M., Edunov, S., Baevski, A., Fan, A., Gross, S., Ng, N., ... Auli, M. (2019). fairseq: A Fast, Extensible Toolkit for Sequence Modeling. Proceedings of NAACL-HLT 2019: Demonstrations.
- [9] Nguyen, D. Q., & Nguyen, A. T. (2020). PhoBERT: Pre-trained language models for Vietnamese. arXiv preprint arXiv:2003.00744.
- [10] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., ... Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
- [11] Liu, Y., Gu, J., Goyal, N., Li, X., Edunov, S., Ghazvininejad, M., ... Zettlemoyer, L. (2020). Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics, 8, 726-742.
- [12] Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., ... Zettlemoyer, L. (2019). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. arXiv preprint arXiv:1910.13461.
- [13] Lample, G., Conneau, A., Denoyer, L., & Ranzato, M. (2017). Unsupervised Machine Translation Using Monolingual Corpora Only.
- [14] Lample, G., & Conneau, A. (2019). Cross-lingual Language Model Pretraining. arXiv preprint arXiv:1901.07291.
- [15] Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805.
- [16] Artetxe, M., Labaka, G., & Agirre, E. (2018). A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 789-798.
- [17] Kingma, D. P., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
