Pathway-based Analysis with Support Vector Machine (SVM-LASSO) for Gene Selection and Classification

Nurul Athirah Nasrudin (1), Weng Howe Chan (2), Mohd Saberi Mohamad (3), Safaai Deris (4), Suhaimi Napis (5), Shahreen Kasim (6)
(1) Artificial Intelligence and Bioinformatics Research Group, Universiti Teknologi Malaysia, 81310, Johor Bahru, Johor, Malaysia.
(2) Artificial Intelligence and Bioinformatics Research Group, Universiti Teknologi Malaysia, 81310, Johor Bahru, Johor, Malaysia.
(3) Faculty of Creative Technology and Heritage, Universiti Malaysia Kelantan, Karung Berkunci 01, 16300, Bachok, Kelantan, Malaysia.
(4) Faculty of Creative Technology and Heritage, Universiti Malaysia Kelantan, Karung Berkunci 01, 16300, Bachok, Kelantan, Malaysia.
(5) Department of Cell and Molecular Biology, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, 43400, Serdang, Selangor, Malaysia.
(6) Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor, Malaysia.
How to cite (IJASEIT) :
Nasrudin, Nurul Athirah, et al. “Pathway-Based Analysis With Support Vector Machine (SVM-LASSO) for Gene Selection and Classification”. International Journal on Advanced Science, Engineering and Information Technology, vol. 7, no. 4-2, Sept. 2017, pp. 1609-14, doi:10.18517/ijaseit.7.4-2.3397.
Genomic knowledge has become a popular research area in bioinformatics because it provides further information about biological processes. Many methods have been developed to address the high data throughput that results from the increased use of microarray technology. However, these methods still cannot classify diseases accurately. This is because non-informative genes may be included in the analysis of context-specific data, such as cancer gene expression data, and these genes degrade classification performance. This study proposes a pathway-based analysis for gene classification. First, pathway-based analysis handles microarray data in a way that improves the biological interpretation of the analysis outcome. Second, a Support Vector Machine with the Least Absolute Shrinkage and Selection Operator (SVM-LASSO) is proposed to find informative genes within each pathway, ensuring efficient gene selection and classification in every pathway. Experiments are conducted on a lung cancer dataset and a breast cancer dataset that are widely used in cancer classification. Stratified 10-fold cross-validation is used to evaluate the performance of the proposed method in terms of accuracy, specificity, and sensitivity. Moreover, the selected genes are biologically validated against the biological literature and biological databases. The performance of the proposed method is then compared with that of previous work across all datasets. In conclusion, the findings of this research can contribute to biology, particularly to cancer classification.
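
The evaluation protocol named in the abstract (stratified 10-fold cross-validation scored by accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `stratified_folds` and `binary_metrics` are hypothetical, the round-robin fold assignment is a simplifying assumption, and the labels are assumed to be binary with `1` marking the positive (e.g. tumour) class.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds so each fold has similar class ratios.

    Hypothetical helper: samples of each class are dealt round-robin
    across the folds, without shuffling.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true positives that were detected.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Specificity: fraction of true negatives that were detected.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

In use, each fold in turn would serve as the test set while a classifier is trained on the remaining nine, and `binary_metrics` would be averaged over the ten test folds. The classifier itself (SVM-LASSO in the paper) is deliberately left out of this sketch.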

This work is licensed under a Creative Commons Attribution 4.0 International License.
