A New Gene Selection Method for Reduction of Redundant Genes in Cancer Microarray Classification
Microarray technologies have made it possible to measure the expression levels of thousands of genes simultaneously. Because accurate classification of tumors is necessary for successful cancer treatment, classification of cDNA microarray data has been widely used in the diagnosis of cancers and some other diseases. The most important problem in classifying microarray data, however, is the severe imbalance between the number of features (usually thousands or even tens of thousands of genes) and the number of tissue samples (a few hundred). This large number of features (genes) makes feature selection techniques more crucial than ever. From ranking-based filter procedures to classifier-based wrapper techniques, many studies have devised their own feature selection methods. This paper presents a new gene selection method that removes irrelevant and redundant genes in order to improve microarray classification results. The method is a multi-stage hybrid approach that exploits the advantages of both filter and wrapper approaches in its different stages. To reduce correlation-based redundancy, we use gene clustering. We then use an evolutionary algorithm to search for a small gene subset within the redundancy-free gene set. We tested the proposed method on four publicly available cancer microarray datasets. Experimental results demonstrate that the proposed model not only outperforms conventional filter feature selection methods but is also comparable with some recently introduced wrapper and hybrid approaches. We conclude that the resulting reduced gene subset can classify microarray gene expression data effectively.
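
For illustration only, the multi-stage idea summarized above (filter, redundancy reduction, evolutionary search) might be sketched as follows. The abstract does not specify the ranking criterion, the clustering algorithm, the evolutionary operators, or the classifier, so every concrete choice here is an assumption: a signal-to-noise filter score, a greedy correlation threshold standing in for gene clustering, a small genetic algorithm, and a leave-one-out nearest-centroid classifier as the wrapper. All function names and parameters (`filter_rank`, `drop_redundant`, `evolve_subset`, `max_corr`, `penalty`) are hypothetical, and the sketch handles two classes labeled 0 and 1 only.

```python
import numpy as np

def filter_rank(X, y, top_k):
    """Stage 1 (filter, assumed): rank genes by signal-to-noise ratio."""
    a, b = X[y == 0], X[y == 1]
    score = np.abs(a.mean(0) - b.mean(0)) / (a.std(0) + b.std(0) + 1e-12)
    return np.argsort(-score)[:top_k]          # indices of the top_k genes

def drop_redundant(X, ranked, max_corr=0.9):
    """Stage 2 (assumed stand-in for gene clustering): walk the genes in
    rank order and keep one only if it is weakly correlated with every
    gene already kept."""
    corr = np.abs(np.corrcoef(X[:, ranked], rowvar=False))
    kept = []
    for i in range(len(ranked)):
        if all(corr[i, j] < max_corr for j in kept):
            kept.append(i)
    return ranked[kept]

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    n, hits = len(y), 0
    for i in range(n):
        train = np.arange(n) != i
        c0 = X[train & (y == 0)].mean(0)
        c1 = X[train & (y == 1)].mean(0)
        pred = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
        hits += int(pred == y[i])
    return hits / n

def evolve_subset(X, y, genes, pop=20, gens=15, penalty=0.01, seed=0):
    """Stage 3 (wrapper, assumed): genetic search over binary gene masks.
    Fitness = leave-one-out accuracy minus a penalty per selected gene."""
    rng = np.random.default_rng(seed)
    m = len(genes)

    def fitness(mask):
        if not mask.any():
            return -1.0                        # empty subsets are invalid
        return loo_accuracy(X[:, genes[mask]], y) - penalty * mask.sum()

    population = rng.random((pop, m)) < 0.5    # random initial masks
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in population])
        parents = population[np.argsort(-scores)[: pop // 2]]
        children = []
        while len(children) < pop - len(parents):
            p, q = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, m) if m > 1 else 0
            child = np.concatenate([p[:cut], q[cut:]])   # one-point crossover
            children.append(child ^ (rng.random(m) < 1.0 / m))  # bit-flip mutation
        population = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind) for ind in population])
    return genes[population[np.argmax(scores)]]

if __name__ == "__main__":
    # Synthetic data: 40 samples, 200 genes; gene 1 nearly duplicates gene 0.
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 20)
    X = rng.normal(size=(40, 200))
    X[:, 0] += 3 * y
    X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=40)
    kept = drop_redundant(X, filter_rank(X, y, top_k=20))
    print("selected genes:", evolve_subset(X, y, kept))
```

The design point is the ordering: the cheap filter and redundancy steps shrink the search space so that the more expensive classifier-in-the-loop search runs over only a few dozen candidate genes. In practice the wrapper accuracy should be estimated with cross-validation that is kept separate from the final test data.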