ASME Press Select Proceedings
International Conference on Information Technology and Computer Science, 3rd (ITCS 2011)
Editor
V. E. Muhin
National Technical University of Ukraine
W. B. Hu
Wuhan University
ISBN: 9780791859742
No. of Pages: 656
Publisher: ASME Press
Publication date: 2011

The K-Nearest Neighbor (KNN) algorithm is simple and stable, and it has therefore been widely used in text classification. However, high-dimensional document vectors and large training sets seriously degrade both the accuracy and the efficiency of classification. To address these shortcomings, a new sample selection method based on spanning tree document clustering is presented. Its basic idea is to divide the document samples in each category automatically into clusters by means of spanning tree document clustering. Within each category, sub-trees are generated, each grouping documents that share the same sub-category. Each sub-tree is then cut based on node density. Because typical samples are retained while the training set is reduced, the remaining training samples are well representative. Experimental results show that the method not only improves classification efficiency but also improves classification accuracy to some extent.
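
The selection idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the distance measure, the rule for splitting the spanning tree into sub-trees, or the definition of node density, so everything specific here is an assumption made for illustration: cosine distance, Prim's algorithm, a hypothetical `cut_factor` threshold on edge length, a density proxy based on mean distance to the other members of a sub-tree, and a hypothetical `keep_ratio`. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """1 - cosine similarity (assumed measure; the abstract names none)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def minimum_spanning_tree(vectors):
    """Prim's algorithm on the complete graph; returns (i, j, dist) edges."""
    n = len(vectors)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = cosine_distance(vectors[u], vectors[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def split_into_subtrees(n, edges, cut_factor=2.0):
    """Drop edges longer than cut_factor * mean edge length (assumed rule)
    and return the resulting connected components as lists of indices."""
    if not edges:
        return [[i] for i in range(n)]
    mean_len = sum(d for _, _, d in edges) / len(edges)
    adj = defaultdict(list)
    for i, j, d in edges:
        if d <= cut_factor * mean_len:
            adj[i].append(j)
            adj[j].append(i)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components


def reduce_samples(vectors, keep_ratio=0.5, cut_factor=2.0):
    """Return sorted indices of the samples kept for ONE category."""
    edges = minimum_spanning_tree(vectors)
    kept = []
    for comp in split_into_subtrees(len(vectors), edges, cut_factor):
        def spread(i):
            # Assumed density proxy: a smaller mean distance to the other
            # members of the sub-tree means a denser, more typical sample.
            others = [j for j in comp if j != i]
            if not others:
                return 0.0
            total = sum(cosine_distance(vectors[i], vectors[j]) for j in others)
            return total / len(others)

        n_keep = max(1, math.ceil(keep_ratio * len(comp)))
        kept.extend(sorted(comp, key=spread)[:n_keep])
    return sorted(kept)


# Toy category with two visible sub-groups of term-weight vectors.
category_docs = [[1, 0], [0.9, 0.1], [0.95, 0.05],
                 [0, 1], [0.1, 0.9], [0.05, 0.95]]
print(reduce_samples(category_docs, keep_ratio=0.5))
```

In this sketch the function would be called once per category, and the retained indices from all categories would form the reduced training set passed to an ordinary KNN classifier. Keeping at least one sample per sub-tree is a design choice that prevents a small sub-category from disappearing entirely.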

Abstract
Keywords
I. Introduction
II. Related Knowledge
III. A Reducing Method of Training Samples Based on Spanning Tree Documents Clustering
IV. Experiments and Analysis
V. Conclusions
VI. Acknowledgments
References