International Journal on Advanced Science, Engineering and Information Technology, Vol. 6 (2016) No. 6, pages: 1148-1153, DOI:10.18517/ijaseit.6.6.1487

Comparative Analysis of Data Mining Techniques for Malaysian Rainfall Prediction

Suhaila Zainudin, Dalia Sami Jasim, Azuraliza Abu Bakar

Abstract

Climate change prediction analyses weather behaviour over a specific period. Rainfall forecasting is a climate change prediction task in which features such as humidity and wind are used to predict rainfall at specific locations. Rainfall prediction can be framed as a classification task in data mining. Different techniques perform differently depending on how the rainfall data are represented, including representations for long-term (monthly) and short-term (daily) patterns. Selecting an appropriate technique for a specific rainfall duration is therefore challenging. This study analyses multiple classifiers, namely Naïve Bayes, Support Vector Machine, Decision Tree, Neural Network and Random Forest, for rainfall prediction using Malaysian data. The dataset was collected from multiple stations in Selangor, Malaysia. Several pre-processing tasks were applied to resolve missing values and eliminate noise. The experimental results show that, with a small training set (10%) drawn from 1581 instances, Random Forest correctly classified 1043 instances. This reflects the strength of the ensemble of trees in Random Forest, where a group of classifiers can jointly outperform a single classifier.
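The closing claim of the abstract, that a group of classifiers can jointly outperform a single one, can be illustrated with a small calculation. The sketch below is not the authors' method and uses none of their data. It assumes an idealised ensemble of independent voters that are each correct with the same probability, and computes how often a strict majority is correct. The function name `majority_vote_accuracy` and the per-voter accuracy of 0.6 are hypothetical choices made only for illustration. Trees in a real Random Forest are correlated, so actual gains are typically smaller than this idealised figure.

```python
from math import comb


def majority_vote_accuracy(n_voters: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent voters is correct.

    Assumption (not from the paper): every voter is right with the same
    probability p_correct, and voters err independently of one another.
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1  # smallest number of correct votes that wins
    # Sum the binomial probabilities of k correct votes for k >= needed.
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


if __name__ == "__main__":
    # Hypothetical per-voter accuracy of 0.6; ensemble size grows per row.
    for n in (1, 5, 25, 101):
        print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the majority becomes more reliable as voters are added, provided each voter is better than chance. With a per-voter accuracy of exactly 0.5 the ensemble gains nothing.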

Keywords:

Rainfall prediction; data mining; classification; Random Forest; ensemble.
