ASME Press Select Proceedings

International Conference on Information Technology and Computer Science, 3rd (ITCS 2011)

Editor
V. E. Muhin
National Technical University of Ukraine
W. B. Hu
Wuhan University
ISBN: 9780791859742
No. of Pages: 656
Publisher: ASME Press
Publication date: 2011

The K-Nearest Neighbor (KNN) algorithm is simple and stable, and it has therefore been widely used in text classification. However, high-dimensional document vectors and large training sets seriously degrade both the accuracy and the efficiency of classification. To address these shortcomings, a new sample selection method based on spanning tree document clustering is presented. Its basic idea is to divide the document samples in each category automatically into clusters by spanning tree document clustering. Within each category, sub-trees are generated, each containing documents of the same sub-category. Each sub-tree is then cut according to node density. Because typical samples are retained while the number of training samples is reduced, the remaining training samples represent their categories well. Experimental results show that the method not only improves classification efficiency but also improves classification accuracy to some extent.
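
The abstract gives no implementation details, so the following Python sketch is only one plausible reading of the pipeline it describes: build a spanning tree per category, split it into sub-trees, thin each sub-tree by density, then run KNN on the reduced set. Every concrete choice here is an assumption and not the authors' method: cosine distance, Prim's algorithm for a minimum spanning tree, cutting edges longer than the mean plus `cut_factor` standard deviations, defining density as low mean distance to the other members of a sub-tree, and the hypothetical parameters `cut_factor` and `keep_ratio`.

```python
import math
from collections import defaultdict

def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length dense vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

def minimum_spanning_tree(vectors, dist=cosine_distance):
    """Prim's algorithm on the complete graph; returns (weight, i, j) edges."""
    n = len(vectors)
    if n < 2:
        return []
    # best[j] = (cheapest known distance from the tree to j, tree node giving it)
    best = {j: (dist(vectors[0], vectors[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        w, i = best.pop(j)
        edges.append((w, i, j))
        for k in best:
            d = dist(vectors[j], vectors[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges

def split_into_subtrees(n, edges, cut_factor=1.0):
    """Drop edges longer than mean + cut_factor * std (assumed rule);
    return the connected components as lists of indices."""
    if not edges:
        return [list(range(n))] if n else []
    weights = [w for w, _, _ in edges]
    mean = sum(weights) / len(weights)
    std = math.sqrt(sum((w - mean) ** 2 for w in weights) / len(weights))
    threshold = mean + cut_factor * std
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for w, i, j in edges:
        if w <= threshold:
            parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())

def prune_by_density(vectors, cluster, keep_ratio=0.5, dist=cosine_distance):
    """Keep the densest members: lowest mean distance to the rest (assumed)."""
    if len(cluster) <= 1:
        return list(cluster)

    def mean_dist(i):
        others = [j for j in cluster if j != i]
        return sum(dist(vectors[i], vectors[j]) for j in others) / len(others)

    keep = max(1, math.ceil(keep_ratio * len(cluster)))
    return sorted(cluster, key=mean_dist)[:keep]

def select_samples(samples, keep_ratio=0.5, cut_factor=1.0):
    """samples: list of (vector, label). Returns a reduced training list."""
    by_label = defaultdict(list)
    for vec, label in samples:
        by_label[label].append(vec)
    reduced = []
    for label, vectors in by_label.items():
        edges = minimum_spanning_tree(vectors)
        for cluster in split_into_subtrees(len(vectors), edges, cut_factor):
            for i in prune_by_density(vectors, cluster, keep_ratio):
                reduced.append((vectors[i], label))
    return reduced

def knn_predict(train, query, k=3, dist=cosine_distance):
    """Plain majority-vote KNN over (vector, label) pairs."""
    nearest = sorted(train, key=lambda s: dist(s[0], query))[:k]
    votes = defaultdict(int)
    for _, label in nearest:
        votes[label] += 1
    return max(votes, key=votes.get)
```

In use, `select_samples(training_pairs)` would be called once to thin the training set, and `knn_predict(reduced, query_vector)` would then classify against the smaller set. The sketch computes all pairwise distances within a category, so it illustrates the idea and is not tuned for large corpora.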
