International Journal on Advanced Science, Engineering and Information Technology, Vol. 7 (2017) No. 4, DOI:10.18517/ijaseit.7.4.2198

Comparative Analysis of Text Classification Algorithms for Automated Labelling of Quranic Verses

Abdullahi Oyekunle Adeleke, Noor Azah Samsudin, Aida Mustapha, Nazri Nawi

Abstract

The ultimate goal of labelling a Quranic verse is to determine its corresponding theme. However, the existing approach to labelling Quranic verses depends primarily on the availability of Quranic scholars who have expertise in the Arabic language and Tafseer. In this paper, we propose automating the labelling of Quranic verses using text classification algorithms. We applied three classification algorithms, namely k-Nearest Neighbour, Support Vector Machine, and Naïve Bayes, to automate the labelling procedure. In our experiments, the English translations of the verses are presented to the classification algorithms as features. The translated verses are then classified as "Shahadah" (the first pillar of Islam) or "Pray" (the second pillar of Islam). We found that all three text classification algorithms achieve more than 70% accuracy in labelling the Quranic verses.

Keywords:

Holy Quran; Feature Selection Techniques; k-Nearest Neighbour; Support Vector Machine; Naïve Bayes
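
The abstract does not give implementation details, so the following is only a minimal sketch of how one of the three named algorithms, Naïve Bayes, could label a translated verse as "Shahadah" or "Pray". The whitespace tokenizer, the Laplace (add-one) smoothing, the function names, and the training strings are all assumptions made for illustration; in particular, the strings are made-up placeholders and not actual verse translations or data from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lower-cased whitespace tokens serve as bag-of-words features.
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior under multinomial Naive Bayes."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            numerator = word_counts[label][word] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical placeholder strings, NOT actual verse translations.
TOY_SAMPLES = [
    ("bear witness that god is one", "Shahadah"),
    ("believe in god and his messenger", "Shahadah"),
    ("establish prayer at dawn and dusk", "Pray"),
    ("bow and prostrate in prayer", "Pray"),
]

model = train_naive_bayes(TOY_SAMPLES)
print(predict(model, "establish prayer"))
```

A k-Nearest Neighbour or Support Vector Machine classifier could be substituted for `predict` over the same bag-of-words features; which feature representation the authors actually used is not stated in the abstract.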
