Análisis de sentimientos en Twitter: Un estudio comparativo

Fernando Andres Lovera; Yudith Cardinale

doi:10.51252/rcsi.v3i1.418

Authors

Fernando Andres Lovera Universidad Simón Bolívar https://orcid.org/0000-0002-3042-5953
Yudith Cardinale Universidad Simón Bolívar https://orcid.org/0000-0002-5966-0113

DOI:

https://doi.org/10.51252/rcsi.v3i1.418

Keywords:

sentiment analysis, learning, classification, twitter

Abstract

Sentiment analysis helps to determine the perception of users in different aspects of daily life, such as product preferences in the market, level of user confidence in work environments, or political preferences. The idea is to predict trends or preferences based on feelings. In this article we evaluate the most common techniques used for this type of analysis, considering machine learning and deep machine learning techniques. Our main contribution is based on a proposal for a methodological strategy that covers the phases of data preprocessing, construction of predictive models and their evaluation. From the results, the best classical model was SVM, with 78% accuracy, and 79% F1 metric (F1 score). For the Deep Learning models, the classical models had the best results. The model with the best performance was the Deep Learning Long Short Term Memory (LSTM), reaching 88% accuracy and 89% F1 metric. The worst of the Deep Learning models was the CNN, with 77% accuracy as an F1 metric. Concluding that the Long Short Term Memory (LSTM) algorithm proved to be the best performance, reaching up to 89% accuracy.

References

Abirami, A. M., & Gayathri, V. (2017). A survey on sentiment analysis methods and approach. 2016 Eighth International Conference on Advanced Computing (ICoAC), 72–76. https://doi.org/10.1109/ICoAC.2017.7951748

Al-Ayyoub, M., Khamaiseh, A. A., Jararweh, Y., & Al-Kabi, M. N. (2019). A comprehensive survey of arabic sentiment analysis. Information Processing & Management, 56(2), 320–342. https://doi.org/10.1016/j.ipm.2018.07.006

Angiani, G., Ferrari, L., Fontanini, T., Fornacciari, P., Iotti, E., Magliani, F., & Manicardi, S. (2016). A comparison between preprocessing techniques for sentiment analysis in Twitter. CEUR Workshop Proceedings, 1748, 1–11. https://ceur-ws.org/Vol-1748/paper-06.pdf

Baby, C. J., Khan, F. A., & Swathi, J. N. (2017). Home automation using IoT and a chatbot using natural language processing. 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), 1–6. https://doi.org/10.1109/IPACT.2017.8245185

Breck, E., & Cardie, C. (2017). Opinion Mining and Sentiment Analysis. In The Oxford Handbook of Computational Linguistics 2nd edition. Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199573691.013.43

Cambria, E. (2016). Affective Computing and Sentiment Analysis. IEEE Intelligent Systems, 31(2), 102–107. https://doi.org/10.1109/MIS.2016.31

Chaturvedi, I., Cambria, E., Welsch, R. E., & Herrera, F. (2018). Distinguishing between facts and opinions for sentiment analysis: Survey and challenges. Information Fusion, 44, 65–77. https://doi.org/10.1016/j.inffus.2017.12.006

Chu, E., & Roy, D. (2017). Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies. 2017 IEEE International Conference on Data Mining (ICDM), 829–834. https://doi.org/10.1109/ICDM.2017.100

De Albornoz, J. C., Plaza, L., Gervás, P., & Díaz, A. (2011). A Joint Model of Feature Mining and Sentiment Analysis for Product Review Rating. In Advances in Information Retrieval (pp. 55–66). https://doi.org/10.1007/978-3-642-20161-5_8

Desai, M., & Mehta, M. A. (2016). Techniques for sentiment analysis of Twitter data: A comprehensive survey. 2016 International Conference on Computing, Communication and Automation (ICCCA), 149–154. https://doi.org/10.1109/CCAA.2016.7813707

Dobbin, K. K., & Simon, R. M. (2011). Optimally splitting cases for training and testing high dimensional classifiers. BMC Medical Genomics, 4(1), 31. https://doi.org/10.1186/1755-8794-4-31

Giachanou, A., & Crestani, F. (2016). Like It or Not: A Survey of Twitter Sentiment Analysis Methods. ACM Computing Surveys, 49(2), 1–41. https://doi.org/10.1145/2938640

Go, A., Bhayani, R., & Huang, L. (2009). Twitter Sentiment Classification using Distant Supervision. Processing, 1–6. https://www-cs-faculty.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf

Grus, J. (2015). Data Sciemce from Scratch: first principles with python (1st ed.). O’Reilly Media.

Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’04, 168. https://doi.org/10.1145/1014052.1014073

Hussein, D. M. E.-D. M. (2018). A survey on sentiment analysis challenges. Journal of King Saud University - Engineering Sciences, 30(4), 330–338. https://doi.org/10.1016/j.jksues.2016.04.002

Jianqiang, Z., & Xiaolin, G. (2017). Comparison Research on Text Pre-processing Methods on Twitter Sentiment Analysis. IEEE Access, 5, 2870–2879. https://doi.org/10.1109/ACCESS.2017.2672677

Kao, A., & Poteet, S. R. (2007). Natural Language Processing and Text Mining (1st ed.). Springer London. https://doi.org/10.1007/978-1-84628-754-1

Kaur, H., Mangat, V., & Nidhi. (2017). A survey of sentiment analysis techniques. 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 921–925. https://doi.org/10.1109/I-SMAC.2017.8058315

Kharde, V. A., & Sonawane, S. S. (2016). Sentiment Analysis of Twitter Data: A Survey of Techniques. International Journal of Computer Applications, 139(11), 5–15. https://doi.org/10.5120/ijca2016908625

Li, W., Shao, W., Ji, S., & Cambria, E. (2022). BiERU: Bidirectional emotional recurrent unit for conversational sentiment analysis. Neurocomputing, 467, 73–82. https://doi.org/10.1016/j.neucom.2021.09.057

Liang, B., Su, H., Gui, L., Cambria, E., & Xu, R. (2022). Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowledge-Based Systems, 235, 107643. https://doi.org/10.1016/j.knosys.2021.107643

Lin, P., Luo, X., & Fan, Y. (2020). A Survey of Sentiment Analysis Based on Deep Learning. International Journal of Computer and Information Engineering, 14(12), 473–485. https://publications.waset.org/10011630/a-survey-of-sentiment-analysis-based-on-deep-learning

Liu, B. (2010). Sentiment analysis and subjectivity. Handbook of Natural Language Processing, Second Edition, 2, 627–666. https://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf

Lovera, F. A., Cardinale, Y. C., & Homsi, M. N. (2021). Sentiment Analysis in Twitter Based on Knowledge Graph and Deep Learning Classification. Electronics, 10(22), 2739. https://doi.org/10.3390/electronics10222739

Lunt, M. (2015). Introduction to statistical modelling: linear regression. Rheumatology, 54(7), 1137–1140. https://doi.org/10.1093/rheumatology/ket146

Martínez Cámara, E., Rodríguez Barroso, N., Moya, A. R., Fernández, J. A., Romero, E., & Herrera, F. (2019). Deep Learning Hyper-parameter Tuning for Sentiment Analysis in Twitter based on Evolutionary Algorithms. 255–264. https://doi.org/10.15439/2019F183

Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations ofwords and phrases and their compositionality. Advances in Neural Information Processing Systems, 1–10. https://proceedings.neurips.cc/paper/2013/file/9aa42b31882ec039965f3c4923ce901b-Paper.pdf

Mirtalaie, M. A., Hussain, O. K., Chang, E., & Hussain, F. K. (2018). Sentiment Analysis of Specific Product’s Features Using Product Tree for Application in New Product Development. In Advances in Intelligent Networking and Collaborative Systems (8th ed., pp. 82–95). Springer. https://doi.org/10.1007/978-3-319-65636-6_8

Mostafa, M. M. (2013). More than words: Social networks’ text mining for consumer brand sentiments. Expert Systems with Applications, 40(10), 4241–4251. https://doi.org/10.1016/j.eswa.2013.01.019

Oliveira, D. J. S., Bermejo, P. H. de S., & dos Santos, P. A. (2017). Can social media reveal the preferences of voters? A comparison between sentiment analysis and traditional opinion polls. Journal of Information Technology & Politics, 14(1), 34–45. https://doi.org/10.1080/19331681.2016.1214094

Plisson, J., Lavrac, N., & Mladenić, D. D. (2004). A rule based approach to word lemmatization. Proceedings of the 7th International Multiconference Information Society (IS’04), 83–86. http://eprints.pascal-network.org/archive/00000715/

Ribeiro, F. N., Araújo, M., Gonçalves, P., André Gonçalves, M., & Benevenuto, F. (2016). SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods. EPJ Data Science, 5(1), 23. https://doi.org/10.1140/epjds/s13688-016-0085-1

Tabinda Kokab, S., Asghar, S., & Naz, S. (2022). Transformer-based deep learning models for the sentiment analysis of social media data. Array, 14, 100157. https://doi.org/10.1016/j.array.2022.100157

Wu, Y., Zhang, Q., Huang, X., & Wu, L. (2011). Structural opinion mining for graph-based sentiment representation. Empirical Methods in Natural Language Processing, 1332–1341. https://aclanthology.org/D11-1123.pdf

Xu, S. (2018). Bayesian Naïve Bayes classifiers to text classification. Journal of Information Science, 44(1), 48–59. https://doi.org/10.1177/0165551516677946

Yujian, L., & Bo, L. (2007). A Normalized Levenshtein Distance Metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 1091–1095. https://doi.org/10.1109/TPAMI.2007.1078

Zhang, L., Wang, S., & Liu, B. (2018). Deep learning for sentiment analysis: A survey. WIREs Data Mining and Knowledge Discovery, 8(4). https://doi.org/10.1002/widm.1253