Abstract
The exponentially growing volume of online reviews has become a rich source of information that many researchers have begun to tap. Using online reviews as a source of customer feedback, product designers can better understand customers’ preferences and improve product design accordingly. However, while predicting future product demand as a function of product attributes and customer heterogeneity has proved effective, few studies have used a large-scale dataset to examine the impact of non-product-related features, such as the number of reviews and the average rating, on product demand. This paper therefore proposes a data-driven methodology that uses discrete choice analysis to investigate the influence of online ratings and reviews on purchase behavior. In the absence of information about each customer’s true choice set, we generate an estimated choice set by probability sampling based on customer clustering and product clustering. To examine the effect of the number of reviews and the average rating, we compute, for every laptop in each customer’s choice set, the number of reviews and the average rating as of the date of that customer’s review. Using laptops as our case study, our experiment shows that the number of reviews and the average rating are statistically significant, and that including these features greatly improves the predictive ability of the model.