ASME Press Select Proceedings

International Conference on Instrumentation, Measurement, Circuits and Systems (ICIMCS 2011)

Chen Ming
ASME Press

This paper discusses two problems in cluster analysis, selecting the number of clusters and selecting variables, using user behavior data from a very large website in China. By comparing the traditional model-based approach using BIC with a method based on prediction strength, we try to construct a general framework for web data cluster analysis built on prediction strength. We reach the following conclusions. In web usage analysis, the traditional parametric clustering method based on mixture models fails to solve the central problems of web usage cluster analysis. Prediction strength, which draws on nonparametric statistics and machine learning, is faster and more flexible than model-based BIC for selecting the number of clusters. A further advantage of prediction strength is that it is convenient for both variable selection and cluster number selection. Based on the special properties of web usage data, we present an applied clustering process that combines data cleaning with clustering techniques.
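To make the idea of prediction strength concrete, the following is a minimal sketch in Python and is not the implementation used in the paper. It assumes the common formulation: split the data into two halves, cluster each half, and for every cluster found in the test half measure the fraction of point pairs that the training-half centroids also place together; the score for k is the smallest such fraction. The k-means routine, the farthest-first initialisation, the 0.8 threshold, and all function names (`kmeans`, `prediction_strength`, `choose_k`, `two_blobs`) are illustrative assumptions.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    # Index of the center closest to the point.
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))

def kmeans(points, k, iters=50):
    # Plain Lloyd iterations with a deterministic farthest-first start
    # (an illustrative choice, not taken from the paper).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the old center if a cluster empties
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]

def prediction_strength(points, k, seed=0):
    # Split into training and test halves with a seeded shuffle.
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    train, test = shuffled[:half], shuffled[half:]
    train_centers, _ = kmeans(train, k)
    _, test_labels = kmeans(test, k)
    # Label each test point by its nearest training centroid.
    predicted = [nearest(p, train_centers) for p in test]
    strengths = []
    for j in range(k):
        idx = [i for i, lab in enumerate(test_labels) if lab == j]
        n = len(idx)
        if n < 2:
            continue  # no pairs to compare in this test cluster
        agree = sum(1 for a in range(n) for b in range(a + 1, n)
                    if predicted[idx[a]] == predicted[idx[b]])
        strengths.append(agree / (n * (n - 1) / 2))
    return min(strengths) if strengths else 0.0

def choose_k(points, max_k, threshold=0.8):
    # Largest k whose prediction strength reaches the (assumed) threshold.
    best = 1
    for k in range(1, max_k + 1):
        if prediction_strength(points, k) >= threshold:
            best = k
    return best

def two_blobs():
    # Hypothetical toy data: two tight, well-separated groups of 20 points.
    grid = [(i * 0.1, j * 0.1) for i in range(5) for j in range(4)]
    return grid + [(x + 10.0, y + 10.0) for x, y in grid]

if __name__ == "__main__":
    data = two_blobs()
    for k in (1, 2, 3):
        print(k, prediction_strength(data, k))
```

Because the score depends only on a clustering routine and a rule for assigning new points to clusters, the same loop can be rerun on different subsets of variables, which is one way to read the convenience for variable selection mentioned in the abstract.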
