Cite Article

Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature

BibTeX

@article{IJASEIT8858,
   author = {Mahendra Dwifebri Purbolaksono and Feddy Dea Reskyadita and {Adiwijaya} and Arie Ardiyanti Suryani and Arief Fatchul Huda},
   title = {Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature},
   journal = {International Journal on Advanced Science, Engineering and Information Technology},
   volume = {10},
   number = {1},
   year = {2020},
   pages = {234--238},
   keywords = {feature selection; information gain; text mining; neural network; classification},
   abstract = {

The second fundamental source of law for Moslems is the Hadith, which can be used to explain Quranic texts. However, the Hadith still needs to be translated into each national language so that its meaning can be easily understood [1]. In Indonesia, Hadith more usually refers to a special class of relevance to more particular religious concerns [1]. Based on that, this research classifies translated Hadith text into three classes: Obligation, Prohibition, and Information. In previous research, the Back Propagation Neural Network (BPNN) showed good performance in classifying Hadith text. Therefore, BPNN was used to solve the problem of Hadith text classification in this study. However, the dataset yields a huge number of varied bag-of-words features for the classification process. Hence, Information Gain (IG) was used to select influential features in a step preceding classification. To measure the performance of this system, the Macro F1-score was used. The F1-score enables one to observe exactness from precision and completeness from recall, and macro averaging is needed for the performance evaluation of more than two classes. Based on the experiments conducted, the system classified Hadith text using BPNN and IG without stemming with the highest F1-score of 84.63\%. In contrast, the system that included the stemming process yielded an F1-score of 80.92\%. This shows that the stemming process can decrease classification performance. The decrease is due to some influential words being merged with non-influential words.

},
   issn = {2088-5334},
   publisher = {INSIGHT - Indonesian Society for Knowledge and Human Development},
   url = {http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8858},
   doi = {10.18517/ijaseit.10.1.8858}
}

EndNote

%A Purbolaksono, Mahendra Dwifebri
%A Reskyadita, Feddy Dea
%A Adiwijaya
%A Suryani, Arie Ardiyanti
%A Huda, Arief Fatchul
%D 2020
%T Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature
%! Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature
%K feature selection; information gain; text mining; neural network; classification.
%X The second fundamental source of law for Moslems is the Hadith, which can be used to explain Quranic texts. However, the Hadith still needs to be translated into each national language so that its meaning can be easily understood [1]. In Indonesia, Hadith more usually refers to a special class of relevance to more particular religious concerns [1]. Based on that, this research classifies translated Hadith text into three classes: Obligation, Prohibition, and Information. In previous research, the Back Propagation Neural Network (BPNN) showed good performance in classifying Hadith text. Therefore, BPNN was used to solve the problem of Hadith text classification in this study. However, the dataset yields a huge number of varied bag-of-words features for the classification process. Hence, Information Gain (IG) was used to select influential features in a step preceding classification. To measure the performance of this system, the Macro F1-score was used. The F1-score enables one to observe exactness from precision and completeness from recall, and macro averaging is needed for the performance evaluation of more than two classes. Based on the experiments conducted, the system classified Hadith text using BPNN and IG without stemming with the highest F1-score of 84.63%. In contrast, the system that included the stemming process yielded an F1-score of 80.92%. This shows that the stemming process can decrease classification performance. The decrease is due to some influential words being merged with non-influential words.

%U http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8858
%R doi:10.18517/ijaseit.10.1.8858
%J International Journal on Advanced Science, Engineering and Information Technology
%V 10
%N 1
%@ 2088-5334

IEEE

Mahendra Dwifebri Purbolaksono, Feddy Dea Reskyadita, Adiwijaya, Arie Ardiyanti Suryani, and Arief Fatchul Huda, "Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature," International Journal on Advanced Science, Engineering and Information Technology, vol. 10, no. 1, pp. 234-238, 2020. [Online]. Available: http://dx.doi.org/10.18517/ijaseit.10.1.8858.

RefMan/ProCite (RIS)

TY  - JOUR
AU  - Purbolaksono, Mahendra Dwifebri
AU  - Reskyadita, Feddy Dea
AU  - Adiwijaya
AU  - Suryani, Arie Ardiyanti
AU  - Huda, Arief Fatchul
PY  - 2020
TI  - Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature
JF  - International Journal on Advanced Science, Engineering and Information Technology
VL  - 10
IS  - 1
Y2  - 2020
SP  - 234
EP  - 238
SN  - 2088-5334
PB  - INSIGHT - Indonesian Society for Knowledge and Human Development
KW  - feature selection; information gain; text mining; neural network; classification.
N2  - The second fundamental source of law for Moslems is the Hadith, which can be used to explain Quranic texts. However, the Hadith still needs to be translated into each national language so that its meaning can be easily understood [1]. In Indonesia, Hadith more usually refers to a special class of relevance to more particular religious concerns [1]. Based on that, this research classifies translated Hadith text into three classes: Obligation, Prohibition, and Information. In previous research, the Back Propagation Neural Network (BPNN) showed good performance in classifying Hadith text. Therefore, BPNN was used to solve the problem of Hadith text classification in this study. However, the dataset yields a huge number of varied bag-of-words features for the classification process. Hence, Information Gain (IG) was used to select influential features in a step preceding classification. To measure the performance of this system, the Macro F1-score was used. The F1-score enables one to observe exactness from precision and completeness from recall, and macro averaging is needed for the performance evaluation of more than two classes. Based on the experiments conducted, the system classified Hadith text using BPNN and IG without stemming with the highest F1-score of 84.63%. In contrast, the system that included the stemming process yielded an F1-score of 80.92%. This shows that the stemming process can decrease classification performance. The decrease is due to some influential words being merged with non-influential words.

UR  - http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8858
DO  - 10.18517/ijaseit.10.1.8858
ER  - 

RefWorks

RT Journal Article
ID 8858
A1 Purbolaksono, Mahendra Dwifebri
A1 Reskyadita, Feddy Dea
A1 Adiwijaya
A1 Suryani, Arie Ardiyanti
A1 Huda, Arief Fatchul
T1 Indonesian Text Classification using Back Propagation and Sastrawi Stemming Analysis with Information Gain for Selection Feature
JF International Journal on Advanced Science, Engineering and Information Technology
VO 10
IS 1
YR 2020
SP 234
OP 238
SN 2088-5334
PB INSIGHT - Indonesian Society for Knowledge and Human Development
K1 feature selection; information gain; text mining; neural network; classification.
AB The second fundamental source of law for Moslems is the Hadith, which can be used to explain Quranic texts. However, the Hadith still needs to be translated into each national language so that its meaning can be easily understood [1]. In Indonesia, Hadith more usually refers to a special class of relevance to more particular religious concerns [1]. Based on that, this research classifies translated Hadith text into three classes: Obligation, Prohibition, and Information. In previous research, the Back Propagation Neural Network (BPNN) showed good performance in classifying Hadith text. Therefore, BPNN was used to solve the problem of Hadith text classification in this study. However, the dataset yields a huge number of varied bag-of-words features for the classification process. Hence, Information Gain (IG) was used to select influential features in a step preceding classification. To measure the performance of this system, the Macro F1-score was used. The F1-score enables one to observe exactness from precision and completeness from recall, and macro averaging is needed for the performance evaluation of more than two classes. Based on the experiments conducted, the system classified Hadith text using BPNN and IG without stemming with the highest F1-score of 84.63%. In contrast, the system that included the stemming process yielded an F1-score of 80.92%. This shows that the stemming process can decrease classification performance. The decrease is due to some influential words being merged with non-influential words.

LK http://ijaseit.insightsociety.org/index.php?option=com_content&view=article&id=9&Itemid=1&article_id=8858
DO 10.18517/ijaseit.10.1.8858