Web proceedings papers

Authors

Majlinda Axhiu and Azir Aliu

Abstract

Given the exponential growth of online data generated by social network users in every language, the need for sentiment analysis is growing as well. However, we have reached a point where the overall sentiment of an opinion is no longer enough, which is why Aspect-based Sentiment Analysis (ABSA) is so necessary. Our aim is to work on the first phase of the ABSA task, namely extracting aspect terms from reviews in the Albanian language. Considering the lack of research in this field for this language and the lack of resources, we chose the unsupervised approach rather than the supervised one. Two of the most widely used models in this approach, both considered state of the art for topic modeling, are Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF). We carried out a comparative analysis of these two models using a dataset that we created from Facebook reviews in the restaurant domain. Both models successfully extracted the aspects. As a sample of the results, we list the top 10 words extracted by each model, grouped into three different topics. According to the evaluation measures (Precision, Recall and F1-score), both models worked well for extracting the aspects, with NMF achieving higher accuracy than LDA. NMF was also more accurate in classifying the aspects into different topics.
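
The evaluation measures named above can be illustrated with a minimal sketch. It assumes that the aspect terms extracted by a model are compared, as a set, against a hand-annotated gold list; the abstract does not state the exact procedure, so this is only one plausible reading. The function name `evaluate_aspects` and the English sample terms are hypothetical and are not taken from the dataset.

```python
def evaluate_aspects(extracted, gold):
    """Set-based precision, recall and F1 for extracted aspect terms.

    Assumption: a term counts as correct if it appears in the gold list.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0

    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical placeholder data, not results from the paper.
gold_terms = ["food", "service", "price", "staff", "pizza"]
model_terms = ["food", "service", "pizza", "evening"]

precision, recall, f1 = evaluate_aspects(model_terms, gold_terms)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # P=0.75 R=0.60 F1=0.67
```

Running the same function on the top words produced by each model would give one (Precision, Recall, F1) triple per model, which is the kind of figure on which a comparison between LDA and NMF could be based.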

Keywords

Non-negative matrix factorization, Latent Dirichlet allocation, Aspect extraction, Aspect-based sentiment analysis