Intelligent Engineering Systems through Artificial Neural Networks, Volume 20
41 Using Noise Perturbation along with GA-SVM to Overcome Over Fitting and Identify Biomarker Sets for Colorectal Cancer
This paper describes an ongoing research effort to identify gene sets that predict the survival of colorectal cancer patients from gene expression data. Because the dataset contains 395 genes (after initial feature reduction) but only 122 patients, overfitting must be addressed. A genetic algorithm (GA) designed specifically for feature set selection is used in combination with a support vector machine (SVM). Evaluating groups of genes rather than individual genes yields complementary sets. To combat overfitting, the original measurements are perturbed by noise using variances appropriate to each measurement and an overall gain...
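The perturbation idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract is truncated, so the sketch assumes zero-mean Gaussian noise whose standard deviation for each gene is the overall gain times the square root of that gene's variance. The names `perturb`, `noisy_fitness`, and `score_fn` are hypothetical; `score_fn` stands in for whatever classifier score is used (for example, the cross-validated accuracy of an SVM) and is supplied by the caller.

```python
import math
import random
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def perturb(X: Sequence[Sequence[float]], variances: Sequence[float],
            gain: float, rng: random.Random) -> Matrix:
    """Return a noisy copy of X (rows = patients, columns = genes).

    Assumption: each value receives Gaussian noise with standard
    deviation gain * sqrt(variance of that gene's measurement).
    """
    if gain < 0:
        raise ValueError("gain must be non-negative")
    sigmas = [gain * math.sqrt(v) for v in variances]  # rejects negative variances
    noisy = []
    for row in X:
        if len(row) != len(sigmas):
            raise ValueError("each row needs one variance per gene")
        noisy.append([x + s * rng.gauss(0.0, 1.0) for x, s in zip(row, sigmas)])
    return noisy


def noisy_fitness(X: Sequence[Sequence[float]], y: Sequence[int],
                  mask: Sequence[bool], variances: Sequence[float],
                  gain: float,
                  score_fn: Callable[[Matrix, Sequence[int]], float],
                  n_replicates: int = 10, seed: int = 0) -> float:
    """Average score of one candidate gene set over noisy replicates.

    mask[j] is True when gene j belongs to the candidate set, as a GA
    chromosome for feature selection might encode it.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty gene set cannot predict anything
    rng = random.Random(seed)
    sub_X = [[row[j] for j in cols] for row in X]
    sub_var = [variances[j] for j in cols]
    total = 0.0
    for _ in range(n_replicates):
        total += score_fn(perturb(sub_X, sub_var, gain, rng), y)
    return total / n_replicates
```

Averaging the score over several perturbed copies penalizes gene sets whose apparent accuracy depends on the exact measured values, which is one plausible way noise can discourage overfitting when genes outnumber patients.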