International Journal on Advanced Science, Engineering and Information Technology, Vol. 12 (2022) No. 6, pages: 2378-2385, DOI:10.18517/ijaseit.12.6.16462

Bayesian Model Averaging (BMA) Based on Logistic Regression for Gene Selection and Classification of Animal Tumor Disease on Microarray Data

Heri Kuswanto, Ika Nur Laily Fitriana

Abstract

Tumors are among the deadly diseases frequently found in animals. However, determining whether an animal has a tumor remains a major challenge. Tumors can be classified from gene expression data, which comprise hundreds of genes measured on only a small number of samples. Data with this structure, known as microarray data, are high-dimensional. Relying on a single model is problematic for high-dimensional data because it ignores model uncertainty. This research proposes Bayesian Model Averaging (BMA) to account for model uncertainty by averaging the posterior distributions of the best models, weighted by their posterior model probabilities. Because selecting the genes relevant to diagnosing animal tumors is essential, variable selection is carried out using the iterative BMA algorithm. The results show that, of 335 gene expressions, 12 genes were selected as relevant for classifying animals as tumor or normal. Moreover, of the 2^335 possible models, the 12 best were selected. The accuracy of the BMA results was assessed using the Brier score, whose value indicates that the BMA model is good enough to classify animals as tumor or normal. This research shows that BMA based on logistic regression has very good predictive performance; hence, the method can be applied to classify other diseases.
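The weighting and scoring steps described in the abstract can be illustrated with a minimal sketch. It assumes that posterior model probabilities are approximated from each model's BIC under equal prior model probabilities, a common approximation that the abstract does not state; the function names and the toy numbers below are hypothetical and are not taken from the paper.

```python
import numpy as np

def posterior_model_probs(bic):
    """Approximate posterior model probabilities from BIC values.

    Assumption (not from the paper): equal prior probability for every
    model, so P(M_k | D) is proportional to exp(-BIC_k / 2).
    """
    bic = np.asarray(bic, dtype=float)
    # Subtract the minimum BIC for numerical stability before exponentiating.
    w = np.exp(-0.5 * (bic - bic.min()))
    return w / w.sum()

def bma_predict(per_model_probs, weights):
    """Average per-model predicted probabilities with the model weights.

    per_model_probs has shape (n_models, n_samples); each row holds one
    logistic model's predicted probability of the tumor class.
    """
    per_model_probs = np.asarray(per_model_probs, dtype=float)
    return np.asarray(weights, dtype=float) @ per_model_probs

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and 0/1 label."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_pred - y_true) ** 2))

# Toy illustration with made-up values: three candidate models, two samples.
weights = posterior_model_probs([10.0, 12.0, 20.0])
p_bma = bma_predict([[0.9, 0.2], [0.8, 0.3], [0.5, 0.5]], weights)
score = brier_score([1, 0], p_bma)
```

A Brier score of 0 corresponds to perfect probabilistic predictions, and a constant prediction of 0.5 gives 0.25, so lower values indicate a better classifier.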

Keywords:

Animal tumor; BMA; gene expression; microarray.
