A comparison of machine learning approaches to sentiment analysis in sentences

Subject index

Research field

10201 - Computer science

Document type

Authors

Ma Thị Hồng Thu, Phùng Thị Thu Trang

Title (original Vietnamese)

So sánh một số phương pháp học máy giải quyết bài toán phân tích cảm xúc trong câu

English title

A comparison of machine learning approaches to sentiment analysis in sentences

Source

Tạp chí Khoa học - Đại học Quảng Nam

Year of publication

2020

Issue

16

Pages

104-113

ISSN

0866-7586

Keywords (translated from Vietnamese)

Sentence sentiment, Machine learning, Artificial intelligence, Deep learning, Natural language processing

English keywords

Sentiment analysis, Machine learning, Artificial intelligence, Deep learning, Natural language processing

Abstract (translated from Vietnamese)

Sentence-level sentiment analysis is one of the important problems in natural language processing. Many machine learning methods have been proposed to solve it. However, those methods have been applied only to small datasets and have rarely been compared with one another. In this paper, we present 5 different machine learning methods and compare them on the same Foody.vn dataset. The 5 methods are given 1000, 1500 and 2000 features in turn. The comparison shows that the difference in accuracy between the methods is small (about 2%), while the difference in results between feature counts is about 4%. This suggests that choosing a complex or a simple machine learning method has little effect on the result, which depends more on the number of features used.

English abstract

Sentiment analysis is one of the most important problems in natural language processing. Many machine learning approaches have been proposed to solve this problem. However, these approaches have only been applied to small datasets and have rarely been compared with one another. This paper presents 5 different machine learning approaches and compares them on the same Foody.vn dataset. The five approaches are given 1000, 1500 and 2000 features in turn. The results show that the difference in accuracy between the approaches is small (about 2%) and that the difference in results between feature counts is about 4%. It can be seen that the outcome is affected little by the choice of a complex or a simple machine learning approach and depends more on the number of features used.
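
The experimental design in the abstract (one dataset, one classifier at a time, accuracy measured again for each feature budget) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the record does not name the five methods or the feature type, so the sketch uses tf-idf weights with a smoothed idf, a nearest-centroid classifier as a stand-in, made-up English toy reviews instead of the Foody.vn data, and tiny budgets in place of 1000, 1500 and 2000. The names `fit_tfidf`, `transform` and `nearest_centroid_accuracy` are hypothetical.

```python
import math
from collections import Counter

# Made-up toy reviews (NOT the Foody.vn data), labelled pos/neg.
TRAIN = [
    ("food was delicious and service was great", "pos"),
    ("great place delicious noodles", "pos"),
    ("friendly staff and tasty food", "pos"),
    ("food was cold and service was slow", "neg"),
    ("terrible place rude staff", "neg"),
    ("slow service and bland food", "neg"),
]
TEST = [
    ("delicious food great staff", "pos"),
    ("rude and slow service", "neg"),
]

def fit_tfidf(docs, max_features):
    """Pick the max_features most frequent terms and compute a smoothed idf."""
    tokenized = [doc.split() for doc in docs]
    totals = Counter(tok for toks in tokenized for tok in toks)
    # Most frequent first; ties broken alphabetically so the result is stable.
    vocab = sorted(totals, key=lambda t: (-totals[t], t))[:max_features]
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf

def transform(doc, vocab, idf):
    """Map one document to a tf-idf vector over the fitted vocabulary."""
    tf = Counter(doc.split())
    return [tf[t] * idf[t] for t in vocab]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_centroid_accuracy(train, test, max_features):
    """Fit on train with the given feature budget, return accuracy on test."""
    vocab, idf = fit_tfidf([text for text, _ in train], max_features)
    by_label = {}
    for text, label in train:
        by_label.setdefault(label, []).append(transform(text, vocab, idf))
    # One centroid (mean vector) per class.
    centroids = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }
    correct = 0
    for text, label in test:
        vec = transform(text, vocab, idf)
        predicted = max(centroids, key=lambda c: cosine(vec, centroids[c]))
        correct += predicted == label
    return correct / len(test)

if __name__ == "__main__":
    # Toy budgets; the paper's budgets are 1000, 1500 and 2000 features.
    for budget in (3, 8, 50):
        acc = nearest_centroid_accuracy(TRAIN, TEST, budget)
        print(f"max_features={budget}: accuracy={acc:.2f}")
```

Repeating the same loop for each classifier under comparison yields a methods-by-budgets accuracy table, which is the kind of comparison the abstract summarises.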

Shelf mark

TTKHCNQG, CVv 472

References

[1] Foody.vn dataset, https://github.com/congnghia0609/ntc-scv/tree/master/data
[2] Hochreiter and Schmidhuber (1997), Long Short Term Memory networks, Journal of Neural Computation, MIT Press, 9(8), pp. 1735-1780
[3] Yoon Kim (2014), Convolutional neural networks for sentence classification, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Qatar, pp. 1746-1751
[4] Christopher M. Bishop (2006), Pattern Recognition and Machine Learning
[5] Vũ Hữu Tiệp (2018), Machine Learning cơ bản
[6] The tf-idf feature extraction technique, https://en.wikipedia.org/wiki/Tf%E2%80%93idf
[7] Q. Vo, H. Nguyen, B. Le and M. Nguyen (2017), Multi-channel LSTM-CNN model for Vietnamese sentiment analysis, 9th International Conference on Knowledge and Systems Engineering (KSE), Hue, pp. 24-29
[8] Nguyen Thi Duyen, Ngo Xuan Bach, Tu Minh Phuong (2014), An empirical study on sentiment analysis for Vietnamese, International Conference on Advanced Technologies for Communications (ATC), IEEE
[9] Phan Dang-Hung, Cao Tuan-Dung (2014), Applying skip-gram word estimation and SVM-based classification for opinion mining Vietnamese food places text reviews, Proceedings of the Fifth Symposium on Information and Communication Technology (SoICT), ACM, Ha Noi, pp. 232-239
[10] P. D. Turney (2002), Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews, in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 417-424
[11] V. Hatzivassiloglou and J. M. Wiebe (2000), Effects of adjective orientation and gradability on sentence subjectivity, in Proceedings of the 18th Conference on Computational Linguistics, Volume 1, pp. 299-305