The authors of this work propose an algorithm that determines optimal search keyword combinations for querying online product data sources, in order to minimize identification errors during the product feature extraction process. Data-driven product design methodologies that acquire and mine online product-feature-related data face two fundamental challenges: (1) determining optimal search keywords that return relevant product-related data and (2) determining how many search keywords are sufficient to minimize identification errors during the product feature extraction process. These challenges exist because online data, which is primarily textual in nature, may violate statistical assumptions about the independence and identical distribution of the samples returned for a query. Existing design methodologies use predetermined search terms to acquire textual data online, which makes the acquired data a function of the quality of the search terms themselves. Furthermore, the lack of independence and identical distribution of text data from online sources impacts the quality of the acquired data. For example, a designer may search for a product feature using the term “screen,” which may return relevant results such as “the screen size is just perfect,” but may also return irrelevant noise such as “researchers should really screen for this type of error.” A text mining algorithm is introduced that determines, without labeled training data, the optimal search terms that maximize the veracity of the acquired data, so that valid conclusions can be drawn from it. A case study involving real-world smartphones is used to validate the proposed methodology.
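The “screen” example above can be made concrete with a small sketch. This is not the authors' algorithm; it only illustrates the two challenges the abstract names. The toy corpus, the `tokens` and `query` helpers, and the choice of “size” as a second keyword are all assumptions introduced for illustration; only the two quoted “screen” sentences come from the abstract.

```python
import re

# Hypothetical toy corpus. The first two sentences are the abstract's own
# examples; the last two are invented here for illustration.
posts = [
    "the screen size is just perfect",
    "researchers should really screen for this type of error",
    "the battery drains quickly but the screen resolution is sharp",
    "we screen every applicant before the interview",
]


def tokens(text):
    """Split a post into a set of lowercase alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def query(corpus, keywords):
    """Return the posts that contain every keyword (a conjunctive query)."""
    wanted = set(keywords)
    return [post for post in corpus if wanted <= tokens(post)]


# One keyword: every post matches, including the two irrelevant ones.
single = query(posts, ["screen"])

# Two keywords: the noise is gone, but so is a relevant post
# (the one about screen resolution).
combined = query(posts, ["screen", "size"])

print(len(single), "posts match 'screen'")
print(len(combined), "post(s) match 'screen' AND 'size'")
```

In this toy setting, adding a keyword removes the irrelevant posts but also drops a relevant one, which is the trade-off behind asking both which keywords to use and how many are sufficient.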
June 2016
Research-Article
A Bayesian Sampling Method for Product Feature Extraction From Large-Scale Textual Data
Sunghoon Lim
Industrial and Manufacturing Engineering,
The Pennsylvania State University,
University Park, PA 16802
e-mail: slim@psu.edu
Conrad S. Tucker
Mem. ASME
Engineering Design and Industrial
and Manufacturing Engineering,
The Pennsylvania State University,
University Park, PA 16802
e-mail: ctucker4@psu.edu
1Corresponding author.
Contributed by the Design Automation Committee of ASME for publication in the JOURNAL OF MECHANICAL DESIGN. Manuscript received June 29, 2015; final manuscript received March 24, 2016; published online April 20, 2016. Assoc. Editor: Gary Wang.
J. Mech. Des. Jun 2016, 138(6): 061403 (9 pages)
Published Online: April 20, 2016
Article history
Received: June 29, 2015
Revised: March 24, 2016
Citation
Lim, S., and Tucker, C. S. (April 20, 2016). "A Bayesian Sampling Method for Product Feature Extraction From Large-Scale Textual Data." ASME. J. Mech. Des. June 2016; 138(6): 061403. https://doi.org/10.1115/1.4033238