A Hybrid Gene Selection Strategy Based on Fisher and Ant Colony Optimization Algorithm for Breast Cancer Classification

Mohammed Hamim, Ismail El Moudden, Mohan D Pant, Hicham Moutachaouik, Mustapha Hain

Abstract


Breast cancer is among the greatest threats to human life, especially for women. Despite recent progress in data mining technology, the ability to predict and diagnose such fatal diseases from gene expression data remains limited, which is not surprising since most of the genes in expression data are believed to be irrelevant or redundant. Dimensionality reduction is therefore a crucial step in analyzing gene expression data: by reducing the high dimensionality of breast cancer datasets, it may lead to better prediction performance for such diseases. This paper proposes a new hybrid gene selection approach that combines a filter method (Fisher) with the Ant Colony Optimization algorithm to find the smallest subset of informative genes (gene markers) among 24,481 genes. The proposed approach uses four machine learning algorithms (C5.0 Decision Tree, Support Vector Machines, K-Nearest Neighbors, and Random Forest) to classify each sample (patient) into one of two classes: cancer or no cancer. Compared with existing methods in the literature, experimental results indicate that the proposed gene selection approach achieves overall higher classification accuracies with a relatively smaller number of genes.
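
To illustrate the filter stage only, here is a minimal sketch in Python. The abstract does not define the filter criterion, so the code assumes the commonly used two-class Fisher score, (mean_1 - mean_0)^2 / (var_1 + var_0), computed per gene. The function names `fisher_scores` and `top_k_genes`, the 0/1 label encoding, and the handling of zero-variance genes are all illustrative assumptions, not the authors' implementation. The Ant Colony Optimization stage, which would then search within the retained genes, is not shown.

```python
from statistics import mean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per gene for a two-class dataset.

    samples: list of rows, one per patient, each a list of expression values.
    labels:  list of 0/1 class labels aligned with samples (assumed encoding).
    """
    pos = [row for row, y in zip(samples, labels) if y == 1]
    neg = [row for row, y in zip(samples, labels) if y == 0]
    scores = []
    for j in range(len(samples[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        gap = (mean(a) - mean(b)) ** 2        # between-class separation
        spread = pvariance(a) + pvariance(b)  # within-class spread
        if spread > 0:
            scores.append(gap / spread)
        else:
            # Assumption: a gene that is constant within both classes is
            # ranked first if the class means differ, last otherwise.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


def top_k_genes(samples, labels, k):
    """Return the indices of the k genes with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data: gene 0 separates the classes well, gene 1 barely does.
toy_samples = [[1.0, 5.0], [2.0, 5.1], [8.0, 4.9], [9.0, 5.0]]
toy_labels = [0, 0, 1, 1]
print(top_k_genes(toy_samples, toy_labels, 1))
```

In a hybrid scheme of the kind the abstract describes, a ranking like this would typically shrink the candidate pool before the more expensive wrapper search runs; the cutoff `k` is a tuning choice that the abstract does not specify.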


Keywords


Breast cancer. Gene Selection. Ant Colony Optimization. Microarray technology




International Journal of Online and Biomedical Engineering (iJOE) – eISSN: 2626-8493