Image Classification of Tourist Attractions with K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine

Herry Sujaini (1)
(1) Department of Informatics, University of Tanjungpura, Jl. Prof. Hadari Nawawi, Pontianak, 78124, Indonesia
How to cite (IJASEIT):
Sujaini, Herry. “Image Classification of Tourist Attractions With K-Nearest Neighbor, Logistic Regression, Random Forest, and Support Vector Machine”. International Journal on Advanced Science, Engineering and Information Technology, vol. 10, no. 6, Dec. 2020, pp. 2207-12, doi:10.18517/ijaseit.10.6.9098.
K-Nearest Neighbor (KNN), Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM) are four classification methods that have been widely used in data-mining research in recent years. In this study, we applied the four methods to classify images of five natural tourist attractions: Danau Toba (North Sumatra), Nusa Penida (Bali), Raja Ampat (West Papua), Tanah Lot (Bali), and Wakatobi (Southeast Sulawesi). The results show that Logistic Regression performs best on the natural images classified in this research: LR correctly classifies images that the other methods (kNN, SVM, and RF) misclassify. SVM also performs well, making only one classification error, and that error disappears when a linear kernel is used. Overall, LR achieves the highest precision at 100%, followed by kNN and SVM at 91.9%, and RF at 81.9%. Each method's precision also depends on the parameter settings used in the experiments: the Chebyshev metric yields the highest precision for kNN, ridge regularization yields the highest precision for LR, the best RF results are obtained with 11 trees, and the linear kernel yields the best precision for SVM.
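For concreteness, the sketch below shows one way such a comparison could be set up in Python with scikit-learn. It is a minimal, hypothetical illustration, not the authors' exact workflow or toolchain: the feature matrix X, the labels y, and the kNN neighbor count are placeholders, and the reported best RF setting of 11 is interpreted here as the number of trees. Only the Chebyshev metric, ridge (L2) regularization, 11-tree ensemble, and linear kernel mirror the settings reported in the abstract.

# Hypothetical comparison of the four classifiers with the parameter choices
# reported in the abstract; assumes image feature vectors have already been extracted.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Placeholder data: replace with real feature vectors and labels
# for the five attraction classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))      # e.g., 256-dimensional image feature vectors
y = rng.integers(0, 5, size=200)     # labels for the five attractions

models = {
    # Chebyshev distance gave the best kNN precision (neighbor count assumed).
    "kNN (Chebyshev)": KNeighborsClassifier(n_neighbors=5, metric="chebyshev"),
    # Ridge (L2) regularization gave the best LR precision.
    "LR (Ridge/L2)": LogisticRegression(penalty="l2", max_iter=1000),
    # 11 was reported as the best RF setting, interpreted here as the tree count.
    "RF (11 trees)": RandomForestClassifier(n_estimators=11, random_state=0),
    # A linear kernel gave the best SVM precision.
    "SVM (linear)": SVC(kernel="linear"),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="precision_macro")
    print(f"{name}: mean macro-precision = {scores.mean():.3f}")

With real features and labels in place of the random placeholders, the same loop reports a cross-validated macro-precision for each of the four classifiers, which is the kind of head-to-head precision comparison summarized above.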
