International Journal on Advanced Science, Engineering and Information Technology, Vol. 6 (2016) No. 6, pages: 1067-1073, DOI:10.18517/ijaseit.6.6.1456

Comparison of Machine Learning Approaches on Arabic Twitter Sentiment Analysis

Merfat M. Altawaier, Sabrina Tiun

Abstract

With the dramatic expansion of information on the internet, users around the world express their opinions daily on social networks such as Facebook and Twitter. Large corporations now invest in analyzing these opinions to assess their products and services through people's feedback. The process of determining whether users' opinions toward a particular product or service are positive or negative is called sentiment analysis. Arabic is one of the common languages that have been addressed in sentiment analysis. In the literature, several approaches have been proposed for Arabic sentiment analysis, and most of them use machine learning techniques. These techniques vary and perform differently. Therefore, this study tries to identify a simple but workable approach for Arabic sentiment analysis on Twitter by comparing machine learning techniques on this task. Three techniques are used: Naïve Bayes, Decision Tree (DT) and Support Vector Machine (SVM). In addition, two simple pre-processing sub-tasks are applied, Term Frequency-Inverse Document Frequency (TF-IDF) and Arabic stemming, to obtain the heaviest-weighted terms as features for tweet classification. TF-IDF weights a term by how often it occurs in a tweet relative to how common it is across the corpus, whereas stemming retrieves the stem of a word by removing its inflectional affixes. The dataset used is the Modern Arabic Corpus, which consists of Arabic tweets. Classification performance is evaluated using the information retrieval metrics precision, recall and f-measure. The experimental results show that DT outperformed the other techniques, obtaining an f-measure of 78%.

Keywords:

Arabic Sentiment Analysis; Opinion Mining; Twitter data.
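
The TF-IDF weighting and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `tf_idf` and `precision_recall_f1`, the toy English tokens, the label names, and the specific TF-IDF variant (length-normalized term frequency times log(N/df)) are all assumptions made for illustration, and the abstract does not state which variant was used.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def precision_recall_f1(gold, predicted, positive="pos"):
    """Precision, recall and F-measure (F1) for one target class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, already-tokenized "tweets" (hypothetical data, not from the corpus).
docs = [["good", "phone", "good"], ["bad", "phone"], ["good", "service"]]
weights = tf_idf(docs)
# Heaviest-weighted term per document, usable as a classification feature.
top_terms = [max(w, key=w.get) for w in weights]

# Hypothetical gold labels and classifier predictions.
gold = ["pos", "neg", "pos", "pos"]
pred = ["pos", "pos", "neg", "pos"]
precision, recall, f1 = precision_recall_f1(gold, pred)
```

In this toy run, "phone" appears in two of the three documents and so receives a lower weight than the rarer "bad" or "service", which is why the heaviest-weighted term is not simply the most frequent one.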
