71 Boosting Classification Accuracy with Samples Chosen from a Validation Set
Published: 2007
A classifier is commonly trained on a training set and then evaluated on a testing set to measure its classification accuracy. In this paper, we show how to use a number of validation sets drawn from the original training data to improve the performance of a classifier. The proposed validation boosting algorithm is illustrated with a support vector machine (SVM) applied to lymphography classification. The algorithm is run a number of times, both to show its robustness and to generate consensus results. In each run, a number of validation datasets are generated by randomly picking a portion of the original training dataset. At each iteration within a run, the trained classifier is used to classify the current validation dataset, and the misclassified validation vectors are added to the training set for the next iteration. Every time the training set changes, the classifier is retrained and new classification borders are generated. Experimental results on a lymphography dataset show that the proposed method with validation boosting achieves much better generalization performance on a testing set than the same classifier without validation boosting.
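The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `validation_boost`, `consensus_predict`, `n_iters`, and `val_fraction` are hypothetical. A tiny nearest-centroid classifier stands in for the SVM so the example needs no external libraries, and any object with `fit` and `predict` methods could be substituted. Two details are assumed because the abstract does not specify them: misclassified validation vectors are appended to the training set as duplicates, which raises their weight, and consensus across runs is taken by majority vote.

```python
import random
from collections import Counter


class NearestCentroid:
    """Stand-in classifier (NOT the SVM used in the paper)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        # One centroid per class: the mean of that class's vectors.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, X):
        def nearest(x):
            return min(
                self.centroids,
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(x, self.centroids[c])),
            )
        return [nearest(x) for x in X]


def validation_boost(X, y, make_clf, n_iters=5, val_fraction=0.3, seed=0):
    """One run: returns the final classifier and the final training-set size."""
    rng = random.Random(seed)
    train_X, train_y = list(X), list(y)
    clf = make_clf().fit(train_X, train_y)
    for _ in range(n_iters):
        # Validation set: a random portion of the ORIGINAL training data.
        k = max(1, int(val_fraction * len(X)))
        idx = rng.sample(range(len(X)), k)
        preds = clf.predict([X[i] for i in idx])
        missed = [i for i, p in zip(idx, preds) if p != y[i]]
        if not missed:
            continue  # nothing to add, so the classifier is unchanged
        # Assumption: misclassified vectors are appended as duplicates.
        train_X.extend(X[i] for i in missed)
        train_y.extend(y[i] for i in missed)
        # Retrain so the decision borders reflect the changed training set.
        clf = make_clf().fit(train_X, train_y)
    return clf, len(train_X)


def consensus_predict(classifiers, X):
    """Assumed consensus rule: majority vote over the classifiers of all runs."""
    votes = zip(*(clf.predict(X) for clf in classifiers))
    return [Counter(v).most_common(1)[0][0] for v in votes]
```

Several runs would then be produced by calling `validation_boost` with different `seed` values and passing the resulting classifiers to `consensus_predict`. Passing a factory (`make_clf`) rather than a fitted model keeps each retraining step independent of the previous one.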