Influential Gene Identification for Cancer Classification
Abstract: In cancer diagnosis, DNA microarray technology is used to measure the expression levels of thousands of genes, which can then be used to differentiate between cancer and control subjects. For diagnostic purposes, the number of genes often exceeds what is computationally practical to analyze. Selecting a small number of influential genes can also lead to better classification accuracy. A popular approach to meeting both needs is feature selection. If a limited number of influential genes is identified, drug design for cancer also becomes easier and more practical. Statistical approaches have previously been used for this purpose, but existing results remain insufficient and much more research is needed in this field. As a further contribution, we identify influential genes for colon cancer and leukemia and then select biomarker genes from them. In this paper, we use the Kruskal-Wallis test with Bonferroni correction to select features from microarray gene expression data. We also distinguish up-regulated from down-regulated genes using fold change (FC) values and a heatmap plot. Finally, using a support vector machine, we identify 11 influential genes with 93.33% classification accuracy on the colon cancer dataset and 4 influential genes with 91.67% classification accuracy on the leukemia dataset.
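The selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the pipeline used in this paper: the function names, the toy expression values, the significance level of 0.05, and the definition of FC as a ratio of group means are all assumptions made for illustration. The Kruskal-Wallis test is written out for the two-group case using only the Python standard library (in practice a library routine such as `scipy.stats.kruskal` would normally be used), and the support vector machine classification step is not shown.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two samples, using the chi-square (1 df) approximation."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    r_a = sum(ranks[:len(a)])
    r_b = sum(ranks[len(a):])
    h = 12.0 / (n * (n + 1)) * (r_a ** 2 / len(a) + r_b ** 2 / len(b)) - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:  # every value identical: no evidence of a difference
        return 0.0, 1.0
    h = max(h / correction, 0.0)
    p = math.erfc(math.sqrt(h / 2))  # survival function of chi-square with 1 df
    return h, p


def select_genes(expr, labels, alpha=0.05):
    """Keep genes whose p-value passes the Bonferroni threshold alpha / m.

    expr maps a gene name to its expression values, one per sample;
    labels holds 0 for control and 1 for cancer (an assumed encoding).
    """
    m = len(expr)  # number of tests performed
    selected = []
    for gene, values in expr.items():
        control = [v for v, y in zip(values, labels) if y == 0]
        cancer = [v for v, y in zip(values, labels) if y == 1]
        _, p = kruskal_wallis_two_groups(control, cancer)
        if p < alpha / m:
            # Assumed FC definition: ratio of mean cancer to mean control expression.
            fc = (sum(cancer) / len(cancer)) / (sum(control) / len(control))
            direction = "up" if math.log2(fc) > 0 else "down"
            selected.append((gene, p, fc, direction))
    return selected


# Toy data (invented for illustration): 6 control samples followed by 6 cancer samples.
labels = [0] * 6 + [1] * 6
expr = {
    "gene_up": [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 3.0, 3.4, 2.9, 3.1, 3.6, 2.8],
    "gene_down": [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 1.0, 1.2, 0.9, 1.1, 1.3, 0.8],
    "gene_flat": [2.0, 2.1, 1.9, 2.2, 1.8, 2.05, 2.02, 1.95, 2.15, 1.85, 2.08, 1.98],
}
selected = select_genes(expr, labels)
for gene, p, fc, direction in selected:
    print(f"{gene}: p={p:.4f}, FC={fc:.2f}, {direction}-regulated")
```

On this toy data the two clearly separated genes pass the corrected threshold and the overlapping one does not. The retained genes would then serve as the feature set for a classifier such as a support vector machine.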