Gene Microarray Cancer Classification using Correlation Based Feature Selection Algorithm and Rules Classifiers

Mohammad Subhi Al-Batah, Belal Mohammad Zaqaibeh, Saleh Ali Alomari, Mowafaq Salem Alzboon

Abstract


Gene microarray classification is a challenging task because the datasets contain few samples but a large number of genes (features). Gene subset selection in microarray data plays an important role in minimizing the computational load and solving classification problems. In this paper, the Correlation-based Feature Selection (CFS) algorithm is used in the feature selection process to reduce the dimensionality of the data and to find a set of discriminatory genes. The Decision Table, JRip, and OneR classifiers are then employed for classification. The proposed approach to gene selection and classification is tested on 11 microarray datasets, and the performance on the filtered datasets is compared with that on the original datasets. The experimental results show that CFS can effectively screen out irrelevant, redundant, and noisy features. In addition, the results for all datasets show that the proposed approach can achieve high prediction accuracy and fast computation with a small number of genes. In terms of average accuracy across all the microarray datasets analyzed, JRip achieved the best result compared with the Decision Table and OneR classifiers. The proposed approach has a notable impact on classification accuracy, especially when the data are complex, with multiple classes and a large number of genes.
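
To illustrate how a correlation-based filter can favor genes that track the class label while penalizing genes that duplicate each other, the following is a minimal sketch of the standard CFS merit score, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), combined with a greedy forward search. It is not the authors' implementation. Two simplifications are assumptions made here: absolute Pearson correlation stands in for the correlation measure, and a plain greedy search stands in for whatever search strategy the paper used. The function names (`pearson`, `cfs_merit`, `greedy_cfs`) and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(features, labels, subset):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    features is a list of per-gene value lists; subset holds gene indices.
    Assumption: absolute Pearson correlation is used as the correlation measure.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-class correlation.
    r_cf = sum(abs(pearson(features[i], labels)) for i in subset) / k
    # Mean feature-feature correlation over all unordered pairs.
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, labels):
    """Greedy forward selection: add the gene that most improves merit, stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(features)))
    while remaining:
        merit, idx = max((cfs_merit(features, labels, selected + [i]), i) for i in remaining)
        if merit <= best:
            break
        selected.append(idx)
        remaining.remove(idx)
        best = merit
    return selected, best


# Hypothetical toy data: six samples, two classes, three genes.
labels = [0, 0, 0, 1, 1, 1]
features = [
    [1, 2, 1, 8, 9, 8],  # gene 0: strongly tracks the class
    [1, 2, 2, 8, 9, 7],  # gene 1: nearly redundant with gene 0
    [5, 1, 4, 2, 5, 3],  # gene 2: unrelated to the class
]
selected, best = greedy_cfs(features, labels)
print(selected, round(best, 3))
```

On this toy input the search keeps only gene 0: adding the near-duplicate gene 1 raises the redundancy term more than it raises the relevance term, and gene 2 contributes no class information, so both lower the merit.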

Keywords


Feature selection, gene expression data, Correlation-based Feature Selection algorithm, Decision Table, JRip, OneR


International Journal of Online and Biomedical Engineering (iJOE) – eISSN: 2626-8493