International Journal on Advanced Science, Engineering and Information Technology, Vol. 7 (2017) No. 4-2: Special Issue on the Emerging Trends in Software Engineering and Soft Computing Applications, pages: 1595-1600, Chief Editor: Shahreen Kasim Editorial Boards: Rohayanti Hassan, Hairulnizam Mahdin, Mohd Farhan Md Fudzee & Azizul Azhar Ramli, DOI:10.18517/ijaseit.7.4-2.3395

An Improved Parallelized mRMR for Gene Subset Selection in Cancer Classification

Rohani Mohammad Kusairi, Kohbalan Moorthy, Habibollah Haron, Mohd Saberi Mohamad, Suhaimi Napis, Shahreen Kasim

Abstract

DNA microarray technology has become an increasingly attractive tool for cancer classification in both scientific and industrial fields. According to previous studies, the conventional approach to cancer classification is based primarily on the morphological appearance of the tumor. This approach has two limitations: expert identification of tumors is subject to bias, and cancer subtypes are difficult to differentiate because most cancers are closely tied to specific biological characteristics. Thus, this study proposes an improved parallelized Minimum Redundancy Maximum Relevance (mRMR) method. mRMR is a particularly fast feature selection method for finding a set of features that are both relevant and complementary. It can identify genes that are more relevant to the biological context, which leads to richer biological interpretations. The proposed method is expected to achieve accurate classification performance with a small number of predictive genes when tested on two datasets from the Cancer Genome Project and compared with previous methods.
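To make the idea of "relevant and complementary" features concrete, the following is a minimal serial sketch of greedy mRMR selection. It is not the improved parallelized method proposed in this paper. It assumes discretized expression values, uses the common difference criterion (mutual information with the class label minus mean mutual information with the genes already selected), and the gene names and toy data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy."""
    # Relevance: mutual information between each feature and the class label.
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        # Redundancy: mean mutual information with already selected features.
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 samples, binary class label, 3 discretized genes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 0],  # strongly related to the label
    "g2": [0, 0, 0, 0, 1, 1, 1, 0],  # exact copy of g1, hence fully redundant
    "g3": [0, 1, 0, 0, 1, 1, 0, 1],  # weaker, but carries different information
}
top_genes = mrmr_select(features, labels, k=2)
```

In this toy case the duplicate gene is penalized for its redundancy with the first pick, so the second pick goes to the weaker but complementary gene. The relevance and redundancy terms are independent per gene, which is the kind of computation a parallel implementation could distribute across workers.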

Keywords:

feature selection; cancer classification; mRMR filter method; parallelized mRMR; random forest classifier
