ASME Press Select Proceedings
International Conference on Instrumentation, Measurement, Circuits and Systems (ICIMCS 2011)
By
Chen Ming
ISBN:
9780791859902
No. of Pages:
1400
Publisher:
ASME Press
Publication date:
2011

This paper discusses how to select the number of clusters and the clustering variables, using user-behavior data from a large website in China. By comparing the traditional model-based approach using the Bayesian information criterion (BIC) with a method based on prediction strength, we try to construct a general framework for cluster analysis of web data built around prediction strength. We reach the following conclusions. In web usage analysis, the traditional parametric clustering method based on mixture models fails to solve the central problems of web usage cluster analysis. Prediction strength, which draws on nonparametric statistics and machine learning, is faster and more flexible than the model-based BIC criterion for selecting the number of clusters. A further advantage of prediction strength is that it makes it convenient to carry out both variable selection and cluster-number selection. Based on the special properties of web usage data, we present an applied clustering process that combines data cleaning with clustering techniques.
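For readers unfamiliar with prediction strength, the following is a minimal sketch of the general idea, not the paper's implementation. It assumes the usual formulation: cluster a training half and a test half separately, then, for each test cluster, measure the fraction of point pairs that the training centroids also place together, and take the minimum over clusters. The function names, the plain k-means with farthest-point seeding, the toy data, and the 0.8 threshold are all illustrative assumptions.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns the k centers.

    Seeding is farthest-point (an assumption that keeps the sketch
    deterministic), not the random restarts typically used in practice.
    """
    centers = [tuple(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(tuple(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Empty groups keep their previous center.
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return centers

def prediction_strength(train, test, k):
    """Minimum, over test clusters, of the share of co-clustered pairs
    that the training centroids also assign to a common cluster."""
    train_centers = kmeans(train, k)
    test_centers = kmeans(test, k)
    test_labels = [nearest(p, test_centers) for p in test]
    pred_labels = [nearest(p, train_centers) for p in test]
    scores = []
    for j in range(k):
        members = [i for i, lab in enumerate(test_labels) if lab == j]
        pairs = list(combinations(members, 2))
        if not pairs:
            continue  # a cluster with fewer than two points has no pairs
        agree = sum(pred_labels[a] == pred_labels[b] for a, b in pairs)
        scores.append(agree / len(pairs))
    return min(scores) if scores else 0.0

def choose_k(train, test, k_max, threshold=0.8):
    """Largest k whose prediction strength reaches the (assumed) threshold."""
    best = 1
    for k in range(1, k_max + 1):
        if prediction_strength(train, test, k) >= threshold:
            best = k
    return best

# Hypothetical toy data: two well-separated blobs on a small grid.
blob_a = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]
blob_b = [(10 + 0.1 * i, 10 + 0.1 * j) for i in range(4) for j in range(4)]
data = blob_a + blob_b
train, test = data[0::2], data[1::2]
for k in range(1, 5):
    print(k, prediction_strength(train, test, k))
```

Because the score needs only a clustering routine and a train/test split, the same loop can be rerun on different variable subsets, which is one way to read the abstract's remark about convenience for variable selection.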

Abstract
Keywords:
Introduction
Theories on Estimating the Number of Clusters
Empirical Study
Conclusion and Suggestions
Acknowledgments
References