ASME Press Select Proceedings

International Conference on Instrumentation, Measurement, Circuits and Systems (ICIMCS 2011)

By
Chen Ming
ISBN:
9780791859902
No. of Pages:
1400
Publisher:
ASME Press
Publication date:
2011

This paper discusses two problems in cluster analysis, selecting the number of clusters and selecting variables, using user behavior data from a large website in China. By comparing the traditional model-based approach using BIC with a method based on prediction strength, we attempt to construct a general framework for cluster analysis of web data built on prediction strength. We reach the following conclusions. In web usage analysis, the traditional parametric clustering method based on mixture models fails to solve the central problems of web usage cluster analysis. Prediction strength, which draws on nonparametric statistics and machine learning, is faster and more flexible than the model-based BIC criterion for selecting the number of clusters. A further advantage of prediction strength is that it makes it convenient to carry out variable selection and cluster number selection together. Based on the special properties of web usage data, we present a practical clustering process that combines data cleaning with clustering techniques.
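To make the idea of choosing the number of clusters by prediction strength concrete, here is a minimal sketch. It assumes the abstract refers to prediction strength in the sense of Tibshirani and Walther: cluster a training half and a test half separately, then check how often pairs of test points that share a test cluster are also assigned to the same training centroid. Everything else is an illustrative assumption and not taken from the paper: the k-means clusterer with farthest-point initialisation, the 0.8 threshold, the function names, and the synthetic three-blob data standing in for web usage features.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def init_centroids(X, k, rng):
    """Farthest-point initialisation (an assumption chosen for stability)."""
    centroids = [X[rng.integers(len(X))]]
    while len(centroids) < k:
        d = ((X[:, None, :] - np.array(centroids)[None, :, :]) ** 2).sum(axis=2)
        centroids.append(X[d.min(axis=1).argmax()])
    return np.array(centroids)

def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd iterations; returns (centroids, labels)."""
    centroids = init_centroids(X, k, rng)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign(X, centroids)

def prediction_strength(X_train, X_test, k, rng):
    """Worst-case share of test pairs kept together by the training centroids."""
    if k == 1:
        return 1.0
    train_centroids, _ = kmeans(X_train, k, rng)
    _, test_labels = kmeans(X_test, k, rng)
    predicted = assign(X_test, train_centroids)
    strengths = []
    for j in range(k):
        members = predicted[test_labels == j]
        n = len(members)
        if n < 2:
            continue  # no pairs to compare; skipping is an assumption of this sketch
        counts = np.bincount(members, minlength=k)
        strengths.append((counts * (counts - 1)).sum() / (n * (n - 1)))
    return float(min(strengths)) if strengths else 0.0

def choose_k(X, k_max=5, threshold=0.8, seed=0):
    """Largest k whose prediction strength reaches the (assumed) threshold."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    X_train, X_test = X[idx[:half]], X[idx[half:]]
    scores = {k: prediction_strength(X_train, X_test, k, rng)
              for k in range(1, k_max + 1)}
    return max(k for k, s in scores.items() if s >= threshold), scores

# Synthetic stand-in data: three well-separated 2-D blobs.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((60, 2)) for c in centers])
best_k, scores = choose_k(X)
print(best_k, scores)
```

With blobs this well separated, the score at k = 3 should come out at 1.0, since every test cluster maps onto a single training centroid. Scores for larger k depend on how the two halves happen to split a blob, so a single random split can occasionally overshoot; averaging over several splits is a common way to reduce that. Variable selection could be explored in the same spirit by rerunning `choose_k` on column subsets of `X`, although the abstract does not say how the paper combines the two steps.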
