## Abstract

User-driven customization is a design paradigm in which customers act as co-designers and configure products based on their needs. However, due to insufficient product usage experience, customers may design a product that is incompatible with their environment and needs. Such incompatibility can degrade the performance of some customized features or even cause product failure. As a result, customers may hesitate to customize products because they perceive additional complexity and uncertainty. Product usage context (PUC), defined as all the environment and application factors that affect customer needs and product performance, can be used to facilitate customer co-design in user-driven customization. Identifying an individual customer's PUC can help customers foresee potential design failures, make more holistic design decisions, and be confident in their designs. Against this background, this paper proposes a PUC knowledge graph (PUCKG) construction method using user-generated content (UGC). The proposed method converts crowdsourced corner cases into a structured PUCKG to support personal PUC prediction, summarization, and reasoning. A case study of robot vacuum cleaners is conducted to validate the efficacy of the proposed method.

## 1 Introduction

Due to the increasingly fierce market competition, customization has become an essential design strategy to help companies increase sales. Customization is a particular engineering design paradigm that aims to increase product variety to fulfill individual customer needs [1]. Typically, customization involves two design operations: user-driven customization (i.e., customer configuration) at the front end, and product family design at the back end. In this article, customization refers to the former one. Unlike typical product design, user-driven customization shifts customers’ roles from passive product recipients to active co-designers. During the process, customers are required to express their needs explicitly and select feasible product features from a configuration list [2]. However, average customers, especially potential customers, may lack product knowledge and experience to configure a robust product. They may only have general ideas of what they want and are unaware of specific needs.

Two challenges are hindering the effectiveness of customer co-design in customization. First, customers cannot specify their needs holistically. Current methods primarily focus on the technical domain of customer needs, i.e., specifying the alternative of each feature in a configuration list [3]. However, this approach is ineffective in practice as customers can only specify the features they know and cannot imagine what they have not experienced. This is because customer needs are heavily context-dependent, and it is difficult to put forward customer needs without referring to their usage context [4,5]. Due to the lack of such methods in user-driven customization, many customer needs remain unexplored, which are potential risks affecting customization effectiveness. Second, since customers cannot assess customized features in advance, they may be uncertain whether a customized product can fit their usage context to complete tasks. Under this uncertainty, some customers may have unrealistic expectations. Failure to meet those expectations will lead to trust loss and further make potential customers unconfident about their designs or even hesitate to customize products.

To address the above-mentioned challenges, companies should go beyond the technical domain and contextualize customization based on personal product usage context (PUC). PUC refers to all the environmental and application factors that affect customer needs and product performance [4]. Analyzing PUC when customizing products will help customers formulate their needs holistically, choose more feasible configurations, and be more confident in their designs. Traditionally, PUC factors are assumed and measured by designers using surveys, and the results are incomplete, ambiguous, and imprecise. Today, due to the advancement of the Internet, user-generated content (UGC) is becoming a promising alternative for PUC identification. A large amount of PUC knowledge can be found in UGC, including how products are used, how they interact with personal environments, and how they affect customer needs. Much of the PUC knowledge in UGC concerns corner cases that designers rarely know. Therefore, analyzing PUC in UGC will help to understand the actual causes behind customer needs and design more effective products.

A knowledge graph (KG) is a graph that represents knowledge through entities and relations, which correspond to nodes and edges, and it is well suited to representing PUC from UGC. Compared with previous knowledge representation methods, KG is more flexible and extendable, and can therefore better accommodate crowdsourced data. In addition, with KG, data and knowledge can be navigated in hierarchical, textual, and graphical forms, which provides a more efficient means of knowledge search and discovery for both machines and humans. Besides, by applying KG to represent PUC knowledge, a common technical language is established to fill the semantic and knowledge gap between customers and designers. As a result, KG will support a more reliable, less ambiguous customization process without additional cognitive burdens.

Against this background, this paper proposes a systematic process framework to construct a product-specific usage context knowledge graph (PUCKG) using UGC. The remainder of this paper is organized as follows. Previous work on user-driven customization, PUC, and KG is summarized in Sec. 2. The detailed PUCKG construction process framework is presented in Sec. 3. A case study used to validate the efficacy of the proposed framework is shown in Sec. 4. The contributions and limitations of this paper are discussed in Sec. 5. The conclusion is given in Sec. 7.

## 2 Related Works

### 2.1 User-Driven Customization.

Customization is a design strategy that aims to fulfill individual customer needs while maintaining mass production efficiency [6]. A typical customization strategy involves two design operations: user-driven customization (i.e., customer configuration, or customer co-design) at the front end and product family design at the back end. User-driven customization is essentially a customer co-design activity where customers are invited to specify their needs, and designers act to map them into the product's physical domain [2]. Product family design is driven by designers, aiming to derive a set of similar products from a common product platform. In this article, customization refers to user-driven customization. Customer co-design plays a central role in user-driven customization. Previous research has shown that it can generate additional value for customers by improving process enjoyment, customer satisfaction, and user experience [7–9]. Nevertheless, despite its advantages, customer co-design also imposes significant cognitive burdens on customers. This drawback is termed mass confusion, which means that excess variety may increase the external complexity perceived by customers [10]. Three sources of mass confusion have been identified: the burden of choice, mutual trust with companies, and need-specification mapping [11]. First, due to limited cognitive capabilities, customers can easily become bored and impatient if they have to spend much effort on customization. Second, due to the asymmetrical distribution of information, customers may be reluctant to pay for a product they have never seen. Third, due to insufficient product knowledge, customers cannot translate their needs into product features.

Researchers have made several efforts to mitigate mass confusion issues. To reduce the burden of choice, Wang et al. proposed an adaptive configurator that uses game theory to reduce configuration procedures [12]. To enhance information transparency and mutual trust, Liu et al. proposed a blockchain-based customization paradigm [13], in which customers and designers can broadcast data on the blockchain for the decentralized validation of product price against requirement fulfillment. To address customers’ inability to map needs to specifications, a deep learning-based method was proposed to automatically translate vaguely expressed needs into product specifications to facilitate customer co-design [14]. Besides, some research suggested that offering customer co-design toolkits that integrate different visualization technologies (e.g., 3D modeling, virtual reality) can also reduce perceived complexity in the co-design process [15–17].

While previous research has devoted significant effort to resolving mass confusion, its focus was primarily on reducing perceived complexity, enhancing collaboration, and improving user experience. Customers’ uncertainty about product usability and functional performance remains unresolved. Previous research assumes that customers can specify their needs holistically, even if vaguely, and frames the difficulty in mapping customer needs to product specifications as a semantic gap issue (i.e., the discrepancy between customers’ and designers’ linguistic representations of the same object). For example, customers may say they want a “powerful” engine (i.e., a typical vaguely expressed customer need) instead of a “V-8” engine (i.e., a clear product specification). Such an assumption neglects that customers may be unaware of certain needs and often specify incomplete needs. As a result, designers can miss implicit or unarticulated needs that significantly affect a customized product's utility and functional performance. To address the above-mentioned challenges, companies should go beyond the technical domain and contextualize the customization process based on the personal usage context of customers [5].

### 2.2 Product Usage Context.

Product usage context (PUC) refers to all the environment and application factors that affect product performance and/or customer preference for product attributes [18]. Green et al., in a study of mobile lighting products, concluded that differences in customer needs could be convincingly explained by differences in PUC [4]. Considering that successful customization relies on improved customer needs solicitation, identifying PUC will help designers better understand the underlying causes of customer needs [19], leading to more insightful, explainable, and predictive customer needs solicitation results. However, due to time and budget constraints, state-of-the-art PUC identification methods are imprecise, inaccurate, and incomplete. They can only serve as background research prior to design and therefore cannot effectively support design customization [20]. Besides, PUC is constituted by a set of fleeting and dynamic factors that are constantly changing. For example, some social trends can significantly affect customer needs yet pass quickly [21]. In that case, inappropriately using PUC models could result in product failure and customer dissatisfaction. For example, smart products frequently report such failures when personalized features are recommended with inaccurate PUC models [22].

It is necessary to propose new methods of modeling, constructing, and updating PUC models to support design customization. A number of PUC models can be identified in previous research. Ram and Jung classified PUC into three categories: social interaction, experiential consumption, and functional utilization [23]. Green et al. divided PUC into application tasks, infrastructure, environment, and utility [18]. Belk classified PUC into five categories: physical surroundings, social surroundings, temporal perspective, task definition, and antecedent states [24]. Some quantitative abstractions of PUC have also been proposed, such as choice modeling [25] and multi-sensor modeling [26]. Those models were developed for traditional customer research activities and are therefore ill-suited to analyzing UGC.

With the rapid development of the Internet, large volumes of UGC containing PUC information are available online. Considering that customers perceive the utility and performance of the same product differently from designers, analyzing PUC from UGC may not only help reveal holistic customer needs but also reduce design uncertainty by avoiding potential product failures caused by contextual incompatibility. Also, using PUC to support design customization will help to mitigate mass confusion issues, as customers are more familiar with their environment and product applications. To date, a few studies have tried to solicit PUC using UGC. For instance, Suryadi and Kim defined three grammatical rules to automatically identify product application factors (i.e., a typical PUC category) from UGC [27]. Some research also used product application factors from UGC to support product-service system configuration, design concept evaluation, and design knowledge recommendation [28–30]. However, the full potential of PUC is yet to be explored for two reasons. First, PUC concerns not only product application factors but also environmental factors (e.g., human, artifact, and natural) that could be coupled with certain product functionalities in usage. Unlike product application factors, environmental factors draw on knowledge from domains other than the target product; therefore, even experienced designers have limited knowledge about them. Second, PUC is also constituted by a set of interactions that explain why specific PUC factors are contextually relevant. To best uncover PUC knowledge, it is necessary to explore unique PUC entities that lie across product domains and propose a more structured method to represent PUC.

### 2.3 Knowledge Graph.

Knowledge graph (KG) is a graph-based knowledge representation method that connects real-world entities by semantic relations [31]. Due to its underlying structure, data in KG are given formal semantics through data annotations and manipulation in a machine-readable format. As a result, KG can enable better comprehension, reasoning, and interpretation of knowledge for both humans and machines. The advantages of KG to product design cannot be overstated. On the one hand, KG provides manufacturing companies with a semantic-based and more in-depth knowledge management platform to facilitate knowledge retrieval during design. On the other hand, KG can predict new relationships based on stored domain knowledge to stimulate the designer's creativity during knowledge queries [32]. Recent advances in KG-based research focus on KG construction, embedding, and domain-specific applications [33].

KG construction is a two-step process. The first step is to construct a domain ontology that defines the knowledge structure. The second step is to process knowledge data into factual triples in the form of (head, relation, tail) or (subject, predicate, object) to extend the ontology [34]. With the continuous generation of new data, KG is updated by expanding the given domain ontology [31]. Researchers typically use four methods to construct KG: the dictionary-based method, the text clustering method, the association rule method, and the knowledge-based method [33]. The data used for KG construction could be various forms of unstructured text (e.g., UGC) and other structured (e.g., IoT data) or semi-structured data (e.g., patents).
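As a minimal illustration of the triple form described above, factual triples can be held as plain tuples and indexed by head entity. The entity and relation names below are hypothetical examples from the vacuum-cleaner domain, not data from this study, and `index_by_head` is an illustrative helper rather than part of any KG library.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples; names are illustrative only.
triples = [
    ("rug", "subclass_of", "floor covering"),
    ("carpet", "subclass_of", "floor covering"),
    ("robot vacuum", "operates_on", "floor covering"),
]

def index_by_head(triples):
    """Group (relation, tail) pairs under each head entity."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((relation, tail))
    return dict(index)

kg = index_by_head(triples)
print(kg["rug"])
```

Extending the graph with new data then amounts to appending further triples and rebuilding (or incrementally updating) the index.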

KG embedding is the activity of embedding the entities and relations of a KG into continuous vector spaces [35]. Through knowledge embedding, knowledge similarity can be quantified, as similar knowledge nodes are embedded closer to each other in the vector space. KG embedding contains three steps: knowledge encoding, similarity measurement, and knowledge decoding. Common knowledge encoding models include convolutional neural networks (CNN), graph convolutional networks (GCN), recurrent skipping networks (RSN), and CoKE [36].

Due to its capability to support knowledge organization, management, and retrieval in a more creative fashion, KG has attracted much attention from design researchers. Various research has used KG to resolve their specific design problems. Relevant research topics include concept generation and configuration of smart product-service systems [30,37], design knowledge recommendation [28], time-series industrial knowledge graph embedding [38], design rules construction for additive manufacturing [39], and new design knowledge prediction [40].

Nevertheless, using KG to model PUC in support of design customization has not yet been investigated. Constructing a KG of PUC using UGC has three advantages. First, KG makes knowledge adaptable and extendable. Since new UGC keeps accumulating and PUC evolves dynamically, KG can effectively keep PUC knowledge up to date. Second, KG can represent more comprehensive PUC knowledge. It represents PUC not only through PUC entities but also through relationships between entities. For example, KG can represent PUC at different levels of detail through particular relations (e.g., subclass of, part of, and instance of), which resolves a limitation highlighted in previous research [27]. Third, using KG, data and knowledge can be easily navigated in hierarchical, textual, and graphical forms. Through a variety of graph-based computing methods, KG can readily support customer queries and personal PUC identification for user-driven customization.

## 3 Methodology

Constructing a PUCKG using UGC is challenging for two reasons. First, UGC is a type of unstructured data containing considerable noise, so significant effort must be devoted to processing and analyzing the data. Second, PUC knowledge is highly abstract and complex, and it differs from product to product. Therefore, no “one-size-fits-all” ontology can be referenced to extract PUC entities and relations automatically. To overcome these two challenges, our proposed framework for PUCKG construction (Fig. 1) involves four interrelated modules: knowledge resources, product-specific PUC ontology modeling, PUCKG construction, and customization applications. The modules are elaborated in Secs. 3.1–3.4, respectively.

Fig. 1

### 3.1 Knowledge Resources.

In our proposed approach, PUC knowledge is acquired from three resources: design documents, generic KG, and UGC.

Design documents (e.g., logbooks, investigation reports, and questionnaire checklists) are semi-structured data that record designers' prior PUC knowledge. Design documents are acknowledged as a reliable knowledge resource for developing an initial PUC ontology. Generic KG (e.g., Google KG, Wikidata, and DBpedia) is a type of KG that accommodates open-world, cross-domain knowledge. Generic KG includes specific entity and relation types that can be referenced for PUCKG construction. For example, “floor covering” is a PUC factor for vacuum cleaners. Given a new entity “rug,” querying the generic KG directly links “rug” with “floor covering” through the relation “subclass of.” UGC refers to content created by web users on e-commerce and social media platforms. Customer complaints on e-commerce platforms contain enormous amounts of information about unexpected product failures in customers' personal environments, and much of that information is PUC knowledge unknown to designers. Therefore, UGC serves as the primary knowledge resource for PUCKG construction.

### 3.2 Product-Specific Product Usage Context Ontology Modeling

#### 3.2.1 Product Usage Context Classes.

In our proposed approach, there are four PUC classes: the surrounding environment (Env), the social networks (Soc), the interacting products (Pro), and the target product behaviors (Beh). Based on the scope of representation, the four PUC classes can be arranged in a pyramid with Beh at the base and Env at the top (Fig. 2).

Fig. 2

The surrounding environments (Env) refer to the structural boundaries or conditions in which the product operates. Env is a relatively general class that might be shared by different product categories, and it has the largest scope of representation. The Env can be classified into the natural environment (e.g., time, location, and weather) and built environment (e.g., building, infrastructure, and facility). However, in reality, designers can hardly find any absolute natural environment. To further clarify Env, a number of environment concepts (e.g., office environment, smart home environment, transportation environment) are raised by designers based on human life, work, and recreational activities. It should be noted that, for multi-environment products (e.g., vehicles, portable products, mobile devices), it is necessary to carefully formulate environment concepts to ensure all the scenarios are covered.

Social networks (Soc) refer to the set of PUC factors that arise from any kind of relationship between people. The presence of specific relationships in an Env can often result in unpredictable preference changes. According to relationship typologies, relationships can be classified as personal relationships (e.g., friends, partners, and immediate family) and social relationships (e.g., colleagues, distant relatives, and acquaintances). Among them, personal relationships are more intimate and interdependent, whereas social relationships lack closeness and only happen occasionally. According to the scope of representation, Soc lies under Env, which means Soc is partially determined by Env. For example, under the Env of “office environment,” Soc is likely to be “colleague.” However, the sweeping trend of online social networking has significantly increased the complexity of Soc and has made it necessary to identify more implicit and occasional Soc.

The interacting products (Pro) refer to other products coupled with the target product in the Env. Since an Env is shared by different categories of products, either functionally or physically, it is common for specific products to interact with the target product and affect its performance accordingly. In traditional design practice, designers have limited knowledge about Pro since they are typically identified in rare or special conditions based on customers’ self-reported behaviors. Such knowledge deficiencies add uncertainty to product performance and could lead to significant product failures, malfunctions, or system breakage. This problem has become more prominent in smart products. Many customized smart features are trained on structured PUC models and are often incompatible with customers’ unstructured environments. Therefore, it is important to investigate Pro retroactively in the product usage stage and to consider Pro when customizing new products, to help customers identify corner cases and unforeseen consequences.

Target product behaviors (Beh) describe the activities of the target product in reality. Considering customers often lack knowledge to express product applications technically, they typically refer to the “behaviors” of products instead. Generally, Beh is classified into normal and abnormal Beh. Normal Beh, according to the Function–Behavior–Structure ontology, refers to product activities and use applications that were intentionally designed to realize functionalities [41]. Abnormal Beh, in contrast, refers to undesirable and unexpected activities. As abnormal Beh often implies certain types of failures or errors a product undergoes, designers should pay more attention to abnormal Beh.

#### 3.2.2 Product Usage Context Ontology Modeling.

In a PUC ontology, PUC entities are represented by a set of nodes, and their relations are represented by a set of links between nodes. Based on a formal ontology definition, the PUC ontology ($O$) in our research is represented as $O = \langle N, H^{N}, R \rangle$, as shown in Fig. 3. $N$ refers to the node set and consists of four types of PUC entities, $N_{Env}$, $N_{Soc}$, $N_{Pro}$, and $N_{Beh}$, referring to Env, Soc, Pro, and Beh entities, respectively. The formal representation is $N = N_{Env} \cup N_{Soc} \cup N_{Pro} \cup N_{Beh}$.

Fig. 3

$H^{N}$ is the taxonomy of $N$, which consists of four types: $H^{N_{Env}}$ refers to the taxonomy of $N_{Env}$, $H^{N_{Soc}}$ refers to the taxonomy of $N_{Soc}$, $H^{N_{Pro}}$ refers to the taxonomy of $N_{Pro}$, and $H^{N_{Beh}}$ refers to the taxonomy of $N_{Beh}$. $H^{N}(N_1, N_2)$ means that $N_2$ is a subclass of $N_1$ in the taxonomy.

$R$ denotes a set of relations between nodes. $R$ relates nodes non-taxonomically. There are six types of relationships: $R = R_{EnvSoc} \cup R_{EnvPro} \cup R_{EnvBeh} \cup R_{SocPro} \cup R_{SocBeh} \cup R_{ProBeh}$. In each relationship type, the subscript names the two PUC classes being related. For example, $R_{EnvSoc}$ refers to the relationships between Env and Soc.
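The three components of the ontology can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the class name `PUCOntology`, its method names, and the example node names are hypothetical, and the checks encode one reading of the definitions above (taxonomy links stay within a PUC class, while the six relation types connect two different classes).

```python
from dataclasses import dataclass, field

# Pyramid order from top to base, used to name relation types consistently.
PUC_CLASSES = ("Env", "Soc", "Pro", "Beh")

@dataclass
class PUCOntology:
    nodes: dict = field(default_factory=dict)     # N: node name -> PUC class
    taxonomy: set = field(default_factory=set)    # H^N: (parent, child) pairs
    relations: set = field(default_factory=set)   # R: (source, label, target)

    def add_node(self, name, puc_class):
        if puc_class not in PUC_CLASSES:
            raise ValueError(f"unknown PUC class: {puc_class}")
        self.nodes[name] = puc_class

    def add_subclass(self, parent, child):
        # Assumption: a taxonomy link connects two nodes of the same class.
        if self.nodes[parent] != self.nodes[child]:
            raise ValueError("taxonomy links must stay within one PUC class")
        self.taxonomy.add((parent, child))

    def add_relation(self, source, label, target):
        # Assumption: non-taxonomic relations connect two different classes.
        if self.nodes[source] == self.nodes[target]:
            raise ValueError("relations must connect two different PUC classes")
        self.relations.add((source, label, target))

    def relation_type(self, a, b):
        """Return the relation type name, e.g. 'R_EnvPro', for two nodes."""
        ca, cb = sorted((self.nodes[a], self.nodes[b]), key=PUC_CLASSES.index)
        return f"R_{ca}{cb}"

onto = PUCOntology()
onto.add_node("location", "Env")
onto.add_node("room type", "Env")
onto.add_node("furniture", "Pro")
onto.add_subclass("location", "room type")
onto.add_relation("furniture", "located_in", "room type")
print(onto.relation_type("furniture", "room type"))
```

Storing the class of each node alongside its name keeps the relation type derivable, so it does not need to be recorded separately for each link.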

Design documents and generic KG are used as data sources to develop the product-specific PUC ontology. As PUC knowledge in design documents is typically documented with relatively fixed templates, domain experts can directly formalize data into an initial PUC ontology. For example, Fig. 4 is a design document template (i.e., context factor checklist) proposed by Green et al. [19]. The template consists of a list of general context factors, contextual questions, interviewers’ factor values, and combined context scenarios (i.e., aggregated factor values). In the figure, the colored blocks and lines represent different types of entities and relations that can be directly obtained or referenced for PUC ontology development. Specifically, in Fig. 4's case, different questions in the question list directly represent corresponding PUC classes. And interviewees’ answers to questions represent instances of specific PUC classes.

Fig. 4

As shown in Fig. 5, a flowchart is proposed to help designers model a product-specific PUC ontology by following the design document's template. Design documents contain a list of open questions (headlines) and corresponding answers (content), representing general PUC classes and corresponding PUC entities. Therefore, they are analyzed separately. Texts in design documents are split into individual sentences. Part-of-speech of each word is also tagged to determine word types.

Fig. 5

The first step is to identify N from elicitation questions. N in elicitation questions are typically expressed by nouns, for example, the “surrounding,” “space,” and “weather” in which a product is used. Therefore, N can be extracted by recognizing nouns. Besides, standard survey questions are formulated using the 5W1H (i.e., who, what, when, why, where, and how) method; therefore, interrogative pronouns are also used to identify N. For example, “how” typically refers to the applications and functions of the product and is therefore related to NBeh, whereas “where” typically refers to NEnv. The recognized interrogative pronouns should be paraphrased using formal design language to become N: for example, “how often” is replaced with “frequency,” “how long” is replaced with “duration,” and “where” is replaced with “location.”
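The paraphrasing of interrogative phrases can be sketched as a simple lookup. The table below contains only the three mappings given in the text; the function name `paraphrase_interrogative` and the sample questions are hypothetical, and a practical table would need to be extended by domain experts.

```python
import re

# The three mappings stated in the text; any further entries would be assumptions.
PARAPHRASES = {
    "how often": "frequency",
    "how long": "duration",
    "where": "location",
}

def paraphrase_interrogative(question):
    """Return the formal node name for the first matching phrase, or None."""
    text = question.lower()
    # Try longer phrases first so multi-word phrases take priority.
    for phrase in sorted(PARAPHRASES, key=len, reverse=True):
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return PARAPHRASES[phrase]
    return None

print(paraphrase_interrogative("How often do you use the product?"))
print(paraphrase_interrogative("Where do you use the product?"))
```

Matching on word boundaries avoids false hits inside longer words such as “somewhere.”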

The second step is to recognize N from the interviewees’ answers. This step is similar to the first step except for two differences. First, there are no interrogative sentences in the interviewees’ answers. Therefore, interrogative pronouns are not considered for N recognition. Second, recognized entities in this step should be replaced with corresponding ancestor nodes in a generic KG to become N. For example, “bedroom” is replaced with “room type,” and “desk” is replaced with “furniture.”

The third step is to classify N into corresponding PUC classes. In this step, N are classified by measuring their semantic similarities with each PUC class using a generic KG. Given two nodes $n_1$ and $n_2$ selected from N, their similarity $sim(n_1, n_2)$ can be computed via Eq. (1) [42]
$sim(n_1,n_2)=\beta^{2}\,\delta_{n_1=n_2}+\beta(1-\beta)\left(\frac{\delta_{n_1\in r(n_2)}}{|r(n_2)|}+\frac{\delta_{n_2\in r(n_1)}}{|r(n_1)|}\right)+(1-\beta)^{2}\,\frac{|r(n_1)\cap r(n_2)|}{|r(n_1)|\,|r(n_2)|}$
(1)

In Eq. (1), r(n) refers to the set of relations linked to the node n in the generic KG. Consider a one-step random walk starting from n1 that stays on n1 with probability β; the probability of reaching any particular linked neighbor node is then (1 − β)/|r(n1)|. Therefore, the similarity sim(n1, n2) can be regarded as the probability that two such one-step random walks, starting from n1 and n2, end up on the same node. Here, δP equals 1 when P is true and 0 otherwise. Through this approach, the identified N are classified into four classes.
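Eq. (1) can be sketched in a few lines. This is a minimal sketch, assuming that r(n) is available as the set of nodes directly linked to n in the generic KG; the toy neighbor sets and the default β = 0.5 are hypothetical choices for illustration, not values used in this study.

```python
def sim(n1, n2, neighbors, beta=0.5):
    """Probability that two one-step random walks from n1 and n2 meet."""
    r1 = neighbors.get(n1, set())
    r2 = neighbors.get(n2, set())

    # Both walks stay where they are: they meet only if n1 and n2 coincide.
    stay_stay = beta ** 2 * (1.0 if n1 == n2 else 0.0)

    # One walk stays, the other moves onto it.
    stay_move = 0.0
    if r2 and n1 in r2:
        stay_move += 1.0 / len(r2)
    if r1 and n2 in r1:
        stay_move += 1.0 / len(r1)
    stay_move *= beta * (1.0 - beta)

    # Both walks move and land on a shared neighbor.
    move_move = 0.0
    if r1 and r2:
        move_move = (1.0 - beta) ** 2 * len(r1 & r2) / (len(r1) * len(r2))

    return stay_stay + stay_move + move_move

# Hypothetical neighbor sets standing in for a generic KG.
neighbors = {
    "rug": {"floor covering", "textile"},
    "carpet": {"floor covering", "textile"},
    "floor covering": {"rug", "carpet"},
    "textile": {"rug", "carpet"},
}
print(sim("rug", "carpet", neighbors))
print(sim("rug", "floor covering", neighbors))
```

With these toy sets, “rug” and “carpet” score through their shared neighbors only, whereas “rug” and “floor covering” score through their direct link.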

The fourth step aims to assign relations to nodes based on question-answer pairs. First, the node $n$ obtained from the question is paired with each node $n_i'$ from the corresponding answers to generate a set of $(n, n_i')$ pairs. Then, for each $(n, n_i')$ pair, if $n$ and $n_i'$ lie in the same PUC class, $n_i'$ is regarded as a subclass of $n$, which is formalized as $H^{N}(n, n_i')$. For example, “room type” is a subclass of “location.” If $n$ and $n_i'$ do not lie in the same PUC class, this typically indicates an implicit relation between $n$ and $n_i'$ across PUC classes. In that case, experienced designers should investigate the particular $(n, n_i')$ pair and create $R(n, n_i')$ manually.

The fifth step aims to assign relations based on the co-occurrence of nodes in interviewees' answers. If two co-occurring nodes $n_i'$ and $n_j'$ belong to different PUC classes, this typically implies an implicit relationship. For example, if a customer mentions “bedroom” and “desk” together, then in this customer's personal PUC, the “desk” is located in a “bedroom.”
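The fourth and fifth steps can be sketched as two small functions. This is a minimal sketch under stated assumptions: the function names and the example class assignments are hypothetical, and cross-class pairs are only collected as candidates because the text leaves the final relation to the designer's judgment.

```python
from itertools import combinations

def relations_from_qa(question_node, answer_nodes, puc_class):
    """Step 4: same class gives a subclass link; otherwise flag for review."""
    taxonomy, to_review = [], []
    for answer_node in answer_nodes:
        if puc_class[answer_node] == puc_class[question_node]:
            taxonomy.append((question_node, answer_node))
        else:
            to_review.append((question_node, answer_node))
    return taxonomy, to_review

def cooccurrence_candidates(answer_nodes, puc_class):
    """Step 5: co-occurring answer nodes from different PUC classes."""
    return [
        (a, b)
        for a, b in combinations(answer_nodes, 2)
        if puc_class[a] != puc_class[b]
    ]

# Hypothetical class assignments produced by the third step.
puc_class = {"location": "Env", "room type": "Env", "furniture": "Pro"}
answers = ["room type", "furniture"]

taxonomy, to_review = relations_from_qa("location", answers, puc_class)
candidates = cooccurrence_candidates(answers, puc_class)
print(taxonomy, to_review, candidates)
```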

Through this method, a product-specific PUC ontology is developed for PUCKG construction.

### 3.3 Product-Specific Usage Context Knowledge Graph Construction.

This section proposes a systematic method to extract PUC knowledge from UGC for PUCKG construction. The method has two modules: (1) a PUC entity identification module and (2) a relation identification module.

#### 3.3.1 Product Usage Context Entities Identification.

First, a systematic data pre-processing method is followed to reduce data noise, improve data quality, and separate data into individual sentences [43]. A discourse marker set (e.g., {“and,” “or,” “then,” “if,” “while,” “when,” “cause,” and “make”}) is used to further segment long sentences into a sequence of elementary discourse units (EDUs). An illustration of EDU segmentation is shown in Fig. 6.
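The marker-based segmentation can be sketched with a regular expression. This is a minimal sketch, assuming that a sentence is split wherever one of the listed markers occurs as a whole word and that the marker itself is dropped; the function name `segment_edus` and the sample sentence are hypothetical, and a practical version would likely retain the marker as a causality cue for later relation identification.

```python
import re

# Discourse marker set quoted in the text.
DISCOURSE_MARKERS = ["and", "or", "then", "if", "while", "when", "cause", "make"]

# Match markers as whole words only, so "because" is not split on "cause".
_MARKER_PATTERN = re.compile(
    r"\b(?:" + "|".join(DISCOURSE_MARKERS) + r")\b", re.IGNORECASE
)

def segment_edus(sentence):
    """Split a sentence into elementary discourse units at discourse markers."""
    parts = _MARKER_PATTERN.split(sentence)
    cleaned = [part.strip(" ,.;") for part in parts]
    return [part for part in cleaned if part]

edus = segment_edus("It gets stuck when it reaches the rug and the brush stops.")
print(edus)
```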

Fig. 6

In the second step, named entity recognition (NER) is performed to extract potential PUC entities from each EDU. Three types of NER models are commonly adopted in previous research for KG construction: supervised learning-based, rule-based, and pre-trained NER models. Pre-trained NER models are used in our method for two main reasons. First, it is impractical to prepare a comprehensive and balanced dataset to train deep learning models (e.g., Bi-LSTM and BERT), because PUC is cross-domain knowledge that is often beyond the designer's scope and control. Moreover, different PUC entities have a significantly imbalanced distribution of observations, as many PUC factors come from corner cases that rarely happen. Second, rule-based models require text data to follow strict grammatical rules. Considering that UGC is a typical type of unstructured data that contains much noise, the identification efficiency is expected to be low. Therefore, NER application programming interfaces (e.g., TextRazor [44]) pre-trained on generic KGs are selected, as they outperform other methods for entity recognition across general human knowledge domains [45,46].

Based on word types, there are two types of potential PUC entities. The first type is represented by nouns or noun phrases, which applies to the PUC classes Env, Soc, and Pro. It also applies to the part of Beh that represents use applications and product tasks. The second type is a set of Beh represented by verbs, as average customers often describe specific product tasks using verbs. The first type of PUC entity can be identified and classified based on the generic KG. The second type of PUC entity can be identified by retrieving verbs from a subject–verb–object pattern in an EDU.

All the identified entities form a dataset of potential PUC entities, which will be further analyzed in the next module. The dataset is formulated as follows. For a specific product, a set of UGC from $K$ users is collected for PUC entity identification, denoted as $U = \{u_k \mid k = 1, 2, \ldots, K\}$. Each UGC $u_k$ in $U$ contains $J$ review sentences, collectively represented as $R = \{r_{j,k} \mid j = 1, 2, \ldots, J\}$. For each review sentence $r_{j,k}$, a set of $I$ potential PUC entities is recognized as $E = \{e_{i,j} \mid i = 0, 1, 2, \ldots, I\}$.

#### 3.3.2 Relation Identification and Product-Specific Usage Context Knowledge Graph Construction.

The second module aims to identify different kinds of relations by extracting causality patterns from sentences. A causality pattern means that one PUC entity can cause one or more PUC entities to occur as the effect. The causality patterns can be classified into consequence and concurrence patterns, which exist at the word, EDU, and sentence levels. Each UGC sentence goes through the process in Fig. 7 to identify relations between PUC entities. Data for which no relation can be identified are discarded.

Fig. 7

In the first step, relations are identified at the word level. Two operations are performed in this step. In the first operation, relations are identified by measuring the semantic similarity between potential PUC entities $e_{i,j}$ and PUC ontology entities $n_m$. As mentioned in the former section, the semantic similarity of two entities refers to the shortest path that connects the two entities in a generic KG. A high semantic similarity typically indicates that two nodes are directly related in the same knowledge domain, which corresponds to the explicit relationships in the taxonomy of the PUC, such as “part_of,” “subclass_of,” and “instance_of.” The second operation aims to find implicit relations (i.e., causality patterns) at the word level. In many cases, a noun phrase can be separated into a noun and a modifier, where the modifier causes the noun to occur as an effect. Therefore, given a noun phrase PUC entity $e_{i,j}$, if it is semantically similar to another noun PUC entity $e_{i,j'}$, the noun phrase is separated into two PUC entities that are assigned the relation “feature_of.” For example, by comparing the semantic similarity between the entities “dog hair” and “hair,” designers can obtain the relation that “hair” is a feature of “dog.”
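The noun-phrase splitting rule can be illustrated with a small sketch. It assumes, for simplicity, that an exact match between the head noun and a known noun entity stands in for the semantic similarity test; `split_noun_phrase` and `known_nouns` are hypothetical names.

```python
def split_noun_phrase(phrase, known_nouns):
    """Split a noun phrase into (noun, "feature_of", modifier) if the head noun is known.

    Exact matching of the last token is a simplifying assumption; the method
    described in the text relies on semantic similarity in a generic KG.
    """
    tokens = phrase.lower().split()
    if len(tokens) < 2:
        return None  # a single noun has no modifier to separate
    head, modifier = tokens[-1], " ".join(tokens[:-1])
    if head in known_nouns:
        return (head, "feature_of", modifier)
    return None
```

For the example in the text, `split_noun_phrase("dog hair", {"hair"})` yields the triple stating that "hair" is a feature of "dog".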

The second step of relation identification is performed at the EDU level. Through part-of-speech tagging, words in an EDU can be matched with subject–verb–object (SVO) patterns. Based on different SVO patterns, two types of relations can be identified. First, if both the subject and the object are PUC entities, the verb in the SVO pattern is defined as the relation between the two PUC entities. The verbs are manually paraphrased into formal relation types and can be further used to train automatic relation classification. Second, if only the subject or the object is a PUC entity, the verb is paraphrased into a noun using formal design language to become a PUC entity. The new PUC entity is directly linked to its subject through the relation “cause_of” (if the subject is a PUC entity) or to its object through the relation “effect_of” (if the object is a PUC entity).
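The two EDU-level rules can be sketched as below, assuming the SVO pattern has already been extracted by a parser. The lookup table `VERB_TO_ENTITY` is a hypothetical stand-in for the manual paraphrasing step, and the orientation of the returned triples (known entity first) is an assumption.

```python
# Hypothetical verb-to-entity paraphrases; in the method these are created manually.
VERB_TO_ENTITY = {"tangle": "Obstruction", "find": "Navigation"}


def edu_relations(subject, verb, obj, puc_entities):
    """Apply the two SVO rules to one EDU and return a list of triples."""
    subj_known = subject in puc_entities
    obj_known = obj in puc_entities
    if subj_known and obj_known:
        # Rule 1: the verb itself is the relation between two PUC entities.
        return [(subject, verb, obj)]
    new_entity = VERB_TO_ENTITY.get(verb)
    if new_entity is None:
        return []  # no paraphrase available, nothing to link
    if subj_known:
        # Rule 2a: the verb becomes an entity caused by the subject.
        return [(subject, "cause_of", new_entity)]
    if obj_known:
        # Rule 2b: the verb becomes an entity whose effect is the object.
        return [(obj, "effect_of", new_entity)]
    return []
```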

The third step aims to extract relations at the sentence level. Since a sentence has been segmented into a set of EDUs using discourse markers, designers can use the discourse markers to identify causality patterns in a sentence and match them with relations. There are two types of causality patterns in a sentence, as shown in Fig. 8: the concurrence pattern and the consequence pattern. In the concurrence pattern, the discourse markers are often {“and,” “or,” “while,” and “when”}; as a result, EDU1 is the cause, whereas EDU2 and EDU3 act as effects concurrently. In the consequence pattern, the discourse markers are often {“if,” “then,” “cause,” and “make”}. In that case, EDU5 is a result of EDU4 and is also a cause of EDU6. For concurrence patterns, the relation types of the concurrent effects ought to be identical. For consequence patterns, relation types should be manually created based on causality types.
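A minimal sketch of the two sentence-level patterns follows. It assumes each EDU is supplied as a `(marker, text)` pair with `None` as the marker of the first EDU, and it uses the placeholder labels "concurrence" and "consequence" where the method would assign concrete relation types.

```python
CONCURRENCE = {"and", "or", "while", "when"}
CONSEQUENCE = {"if", "then", "cause", "make"}


def link_edus(edus):
    """Link EDUs of one sentence according to their leading discourse markers.

    edus: list of (marker, text) pairs; the first EDU carries marker None.
    """
    links = []
    for idx in range(1, len(edus)):
        marker, text = edus[idx]
        if marker in CONCURRENCE:
            # Concurrent effects all point back to the first EDU as their cause.
            links.append((edus[0][1], "concurrence", text))
        elif marker in CONSEQUENCE:
            # Consequences form a chain: each EDU results from the previous one.
            links.append((edus[idx - 1][1], "consequence", text))
    return links
```

Treating every marker as belonging to exactly one pattern is a simplification; markers with several meanings would need additional disambiguation.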

Fig. 8

The $e_{i,j}$ that remain unlinked are discarded. Based on the relations among PUC entities identified at the word, EDU, and sentence levels, a PUCKG is established.

### 3.4 Product-Specific Usage Context Knowledge Graph-Based Personal Product Usage Context Inference, Summarization, and Reasoning.

The established PUCKG is used for personal PUC summarization and reasoning to support the customer co-design process. Figure 9 depicts how the PUCKG is used. Specifically, there are three steps involved in the application.

• Step 1. Customers sparsely query PUC information

Fig. 9

As shown in Step 1 of Fig. 9, at the beginning of the customer co-design process, a customer starts by expressing a set of ambiguous needs based on his or her own PUC. To avoid additional cognitive burden, the customer only needs to briefly describe the use intentions and the relevant environment. The PUC entity identification model maps the customer's descriptions to PUC entities in the PUCKG. At this step, the matched nodes on the PUCKG are relatively scattered, indicating that the PUC is incomplete. In that case, the PUCKG will try to connect the sparsely queried entities through semantic inference and reasoning.

• Step 2. Infer and recommend potentially relevant PUC entities

In the second step, the PUCKG aims to find and recommend the most relevant PUC information based on the scattered PUC entities identified in Step 1. The model used for KG summarization and recommendation in this step was proposed by Safavi et al. [47]. The basic concept behind the recommendation model is as follows. First, the PUCKG is defined as $G = (E, R, T)$, consisting of a set of PUC entities $E$, a set of relations $R$, and a set of triples $T \subseteq E \times R \times E$. Let $\Pr(e_i \mid Q_u)$ be a customer's perceived preference for the PUC entity $e_i \in E$, where $Q_u$ refers to the query log. Since the customer constantly queries information, the customer's historical preference for $e_i$ is captured in $\Pr(e_i \mid Q_u)$. On this basis, the local graph structure around $e_i$ is also taken into account. Because customer queries come in the form of connected graphs (i.e., triples $T$), answers to queries involving $e_i$ must also involve $e_j$ (i.e., neighbors of $e_i$); the authors therefore assume that an interest in a single PUC entity may signal interest in connected PUC entities in the KG.

Denoting the set of all neighbors of $e_i$ in $G$ as $N(e_i) = \{e_j \mid (e_i, r_k, e_j) \in T\}$, the user's preference for $e_i$ can be represented by Eq. (2)
$\Pr(e_i \mid Q_u) \propto \sum_{G_Q \in Q_u} \left[ \mathbb{1}_{E_Q}(e_i) + \gamma \sum_{e_j \in N(e_i)} \mathbb{1}_{E_Q}(e_j) \right]$
(2)
Here, $G_Q$ is a query graph in the query log and $E_Q$ is its set of entities. $\mathbb{1}_{E_Q}(e_i)$ refers to the historical preference of the customer, which is measured by query history. $\sum_{e_j \in N(e_i)} \mathbb{1}_{E_Q}(e_j)$ refers to the graph structure. $\gamma \in [0, 1]$ is the weighting factor that controls the influence of neighbors. $\mathbb{1}_X(x)$ is an indicator function that is equal to 1 if $x \in X$ and 0 otherwise.
Similar to $\Pr(e_i \mid Q_u)$, the customer's preference for a triple of entities and relation $x_{ijk} = (e_i, r_k, e_j) \in T$ is represented by Eq. (3)
$\Pr(x_{ijk} \mid Q_u) \propto \Pr(e_i \mid Q_u)\,\Pr(r_k \mid Q_u)\,\Pr(e_j \mid Q_u)$
(3)
$\Pr(r_k \mid Q_u)$ is computed as the proportion of queries in the query log $Q_u$ containing relation $r_k$. $\Pr(e_i \mid Q_u)$ and $\Pr(r_k \mid Q_u)$ together represent the customer's preference within the personal PUC.
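Equations (2) and (3) can be computed with a few lines of code. The sketch below returns unnormalized scores and assumes that the KG is a list of `(head, relation, tail)` triples and that each logged query is itself a small list of triples; all function names are hypothetical.

```python
from collections import defaultdict


def entity_scores(triples, queries, gamma=0.8):
    """Unnormalized Pr(e_i | Q_u) from Eq. (2) for every entity in the graph."""
    neighbors = defaultdict(set)  # N(e_i): tails of triples whose head is e_i
    entities = set()
    for head, _, tail in triples:
        neighbors[head].add(tail)
        entities.update((head, tail))
    scores = {e: 0.0 for e in entities}
    for query in queries:  # each query graph G_Q is a list of triples
        e_q = {x for head, _, tail in query for x in (head, tail)}  # E_Q
        for e in entities:
            own = 1.0 if e in e_q else 0.0
            nbr = sum(1 for n in neighbors[e] if n in e_q)
            scores[e] += own + gamma * nbr
    return scores


def relation_scores(triples, queries):
    """Pr(r_k | Q_u): share of logged queries that contain relation r_k."""
    relations = {rel for _, rel, _ in triples}
    if not queries:
        return {rel: 0.0 for rel in relations}
    return {
        rel: sum(any(q_rel == rel for _, q_rel, _ in q) for q in queries) / len(queries)
        for rel in relations
    }


def triple_scores(triples, queries, gamma=0.8):
    """Unnormalized Pr(x_ijk | Q_u) from Eq. (3)."""
    ent = entity_scores(triples, queries, gamma)
    rel = relation_scores(triples, queries)
    return {(h, r, t): ent[h] * rel[r] * ent[t] for h, r, t in triples}
```

Because the preferences are only defined up to proportionality, the raw scores are sufficient for ranking entities and triples.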

The potentially relevant PUC entities are recommended to the customer, and the customer decides whether the recommendations are relevant to his or her personal PUC. Based on the feedback, the model makes new inferences and recommendations. The process iterates until the customer's cognitive load limit is reached. As a result, a collection of PUC entities relevant to the personal PUC is identified.

• Step 3. Combine PUC entities into personal usage scenarios

In the third step, PUC entities are further reasoned and combined into personal usage scenarios. Given the user preference model described above, how well a constructed summary $S_u = (E_u, R_u, T_u)$ captures the user's inferred preference, conditioned on $Q_u$, is estimated by Eq. (4)
$\Pr(S_u \mid Q_u) \propto \underbrace{\prod_{e \in E_u} \Pr(e \mid Q_u)}_{\text{"topic" pref.}} \; \underbrace{\prod_{x_{ijk} \in T_u} \Pr(x_{ijk} \mid Q_u)}_{\text{fact pref.}}$
(4)
Therefore, the personal KG summary problem can be described as follows: given the PUCKG $G$, the initial queries $Q_u$ issued to $G$ by a user $u$, and a budget of $K$ triples, find the personal PUC summary $S_u = (E_u, R_u, T_u) \subseteq G$ of at most $K$ triples that maximizes the log-likelihood of $\Pr(S_u \mid Q_u)$:
$\underset{S_u \subseteq G}{\arg\max} \; \log \Pr(S_u \mid Q_u) \quad \text{s.t.} \quad |T_u| \le K$
(5)

K roughly corresponds to the customer's cognitive load for querying PUC. For example, if a customer is only willing to explore ten pieces of PUC information, then K ≈ 10.
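The role of the budget K can be illustrated with a simple ranking heuristic. This is only a sketch under stated assumptions: it keeps the K triples with the highest positive triple preference and ignores the entity ("topic") term of Eq. (4), so it is not the optimization procedure of the cited summarization model. The function name `summarize` and the input dictionary are hypothetical.

```python
def summarize(triple_score, k):
    """Pick at most k triples with the highest positive preference score.

    triple_score: dict mapping (head, relation, tail) to an unnormalized score.
    Returns (E_u, R_u, T_u) as two sets and a ranked list.
    """
    ranked = sorted(triple_score, key=triple_score.get, reverse=True)
    chosen = [t for t in ranked if triple_score[t] > 0][:k]
    entities = {x for head, _, tail in chosen for x in (head, tail)}
    relations = {rel for _, rel, _ in chosen}
    return entities, relations, chosen
```

With K = 10, for instance, a customer would be shown no more than ten triples, regardless of how large the PUCKG is.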

Through this process, a relatively holistic and comprehensive personal PUC is summarized. The nodes in the personal PUC summary are further reasoned to highlight the potential product behaviors that might be affected. First, the identified PUC entities are connected along the shortest path for semantic reasoning. This approach helps highlight a more comprehensive usage scenario. Second, the personal PUCKG is extended toward the nearest Beh entities through one-hop queries and path-based queries. As a result, the PUCKG summarizes a combined usage scenario that describes the individual's personal PUC and the potential product behaviors that will be affected by the surroundings. All of this information helps designers foresee potential customization failures and make more holistic customization decisions accordingly.
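The two reasoning operations can be sketched with a breadth-first search and a one-hop filter. The sketch assumes an in-memory list of triples and treats edges as undirected when searching for paths; in practice the same queries could be issued to the graph database. All names are hypothetical.

```python
from collections import deque


def build_adjacency(triples):
    """Undirected adjacency map built from (head, relation, tail) triples."""
    adj = {}
    for head, _, tail in triples:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the node list of a shortest path or None."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start node
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def one_hop_behaviors(triples, entities, beh_entities):
    """Triples that lead from a summarized entity directly to a Beh entity."""
    return [(h, r, t) for h, r, t in triples if h in entities and t in beh_entities]
```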

## 4 A Case Study of Robot Vacuum Cleaner

### 4.1 Background.

A case study of robot vacuum cleaners (RVC) is presented to showcase how the proposed method can be applied to a real-world problem and to validate its practical applicability and effectiveness. An RVC is a vacuum floor-cleaning product that can detect and clean dirt autonomously. It is also a typical smart product that carries highly customized features to fulfill different floor cleaning needs. However, while designers have devoted substantial effort to making RVCs reliable, many customers have reported negative experiences with frequent product failures. Such negative experiences reflect the mass confusion problem in customization. Customers choose RVCs over traditional floor vacuuming products because they perceive the customized features as more convenient. However, because an RVC is a domestic robot that requires specific knowledge to operate, improper design and use will result in product failure, which can significantly reduce customers’ trust and confidence in those customized features. Besides, given that RVCs are designed and trained with ideal PUC models, which cannot be perfectly consistent with customers’ unstructured environments, customers should be able to select appropriate RVCs based on their personal PUC. Developing a PUCKG for RVCs is therefore a promising way to improve the customization process. First, the PUCKG enables customers to foresee potential failures, choose the most appropriate RVC, and set realistic expectations of the RVC accordingly. Second, the PUCKG helps designers better understand the root causes of abnormal behaviors, providing more efficient diagnoses and improving product and service design.

### 4.2 Product-Specific Product Usage Context Ontology Modeling Result.

Troubleshooting guides are used to construct the PUC ontology. Twenty troubleshooting guides are retrieved from different brands’ customer support homepages. The troubleshooting guides for RVC contain a list of questions and answers that help customers resolve errors by themselves before contacting service teams. Three main topics are covered in the troubleshooting guides: software failures, hardware failures, and human errors. This research follows the structure of the troubleshooting guides (i.e., topics, questions, and corresponding answers) to construct the PUC ontology. Table 1 shows an illustrative example of abstracting a question and answer into nodes and relations in the PUC ontology. After abstracting all the questions and answers in the troubleshooting guides, the classes of the PUC ontology are established as illustrated in Fig. 10 and stored in Neo4j (i.e., an online graph database) [48]. The PUC ontology specifies 20 classes of PUC entities, including eight classes of target product behaviors, three classes of interacting products, seven classes of surrounding environments, and two classes of social networks. In addition, 11 types of relations are summarized.

Fig. 10
Table 1

An illustration of PUC ontology modeling result

| Data | PUC entities | Relations |
|---|---|---|
| Q: My vacuum has lost suction power | $N_{Beh1}$: Suction | $R_{BehBeh}(N_{Beh2}, N_{Beh1})$ negatively_affect; $R_{EnvBeh}(N_{Env1}, N_{Beh2})$ lead_to |
| A: Remove any debris or build-up from the brush head and make sure no debris (such as hair) is tangled or wrapped around the parts | $N_{Beh2}$: Obstruction; $N_{Env1}$: Debris (home waste) | |

### 4.3 Product-Specific Usage Context Knowledge Graph Construction Result.

The top 20 bestselling RVCs from the vacuum cleaners and floor care category on Amazon.com are selected as target products to identify PUC entities and construct the PUCKG. Two data selection rules are followed to screen for high-quality UGC. First, UGC with fewer than 50 words is discarded to ensure that the retained reviews carry sufficient insight into the PUC. Second, the numbers of positive, neutral, and negative customer ratings are kept evenly distributed to avoid sampling bias. In total, 2657 UGC items are selected for PUCKG construction. All the data are generated by verified purchasers, so they are expected to be authentic. Following the identification and classification process, the PUCKG is constructed step by step.

The efficacy of PUC entity identification is evaluated in this step. First, two criteria are used to select discourse markers for EDU segmentation. The first criterion is the frequency of occurrence of a discourse marker in the data set. A high frequency of occurrence indicates a higher possibility of segmenting long and complex sentences into EDUs. The second criterion is segmentation accuracy. Segmentation accuracy is considered low if the segmented EDUs are not comprehensible. Segmentation accuracies are evaluated manually by experienced designers. Based on the sample test, eight discourse markers are selected for EDU segmentation, as shown in Table 2. Among all the discourse markers, “and” and “but” reached both high frequencies of occurrence (over 10%) and high accuracies (over 90%). The markers “when,” “if,” and “so” are desirable markers as they achieved desirable frequencies of occurrence (5% or more) and accuracies (over 80%). The discourse markers “or,” “then,” and “because” occur less frequently but are acceptable due to their high segmentation accuracies.

Table 2

Accuracies of discourse markers

| Discourse marker | Frequency of occurrence | Accuracy |
|---|---|---|
| And | 47.6% | 94.0% |
| Or | 1.3% | 99.6% |
| But | 13.1% | 99.1% |
| If | 5.0% | 99.1% |
| Then | 1.2% | 85.8% |
| Because | 2.2% | 88.0% |
| So | 7.7% | 80.3% |
| When | 5.5% | 92.4% |

The main factor affecting segmentation accuracy is semantic diversity. In practice, many discourse markers have more than one meaning. For example, the discourse marker “so” frequently produces faulty results when segmenting sentences containing “so happy,” “so good,” “so far,” etc. After applying the discourse marker set to segment sentences, 20% of sentences remain unsegmented. Among those sentences, 83% are short enough to serve as individual EDUs, and only 3.3% of the data are complex sentences that cannot be segmented.

After EDU segmentation, EDUs are sent for PUC entity identification. The result is summarized in the pie chart in Fig. 11. Each fraction in Fig. 11 is labeled by the most frequent entities in the cluster. Generally, for RVC, Env entities are mainly built environment entities such as building elements, infrastructure, flooring material, and room. Some natural environment elements such as venue, sound, and light are also identified. Frequently mentioned Soc entities include family members and pets. Frequently mentioned Pro entities include floor coverings, furniture, home appliances, and virtual assistants. Frequently mentioned product behaviors include product features, product tasks, product failures, and product errors. Our proposed method generally shows desirable entity identification and classification efficiency. However, it is notable that using an NER Application Programming Interface (API) could cause identification errors, because many PUC entities are too specific to be identified by NER APIs. For example, in the RVC case, PUC entities such as cleaning patterns (parallel, random, and circular patterns) cannot be identified. To avoid this problem, it is necessary to conduct a manual review and summarize a list of product-specific entities for NER identification.

Fig. 11

The accuracy of relation identification and classification is shown in Table 3. First, for word-level relation identification and classification, the accuracy depends on the efficacy of NER models. Among the four categories of PUC entities, the accuracies for Soc, Pro, and Env entities are relatively high. However, there is still considerable room for improvement in exploring relations for Beh entities. Two factors affected the accuracy. First, the terminology used to describe Beh entities is often ambiguous. For example, customers use “find” and “walk around” to describe the Beh entity “navigation,” leading to relation identification errors. Second, the NER model could misclassify PUC entities that are homonyms. For example, in the context of RVC, “dock” typically refers to the charging station. However, the NER model mistakenly classifies “dock” as a subclass of architectural structure because “dock” also refers to a boat parking infrastructure. To disambiguate entities and avoid misclassification errors, design researchers should restrict the knowledge types in NER models for entity identification. EDU-level relation identification achieved high accuracy. However, the lack of a subject in an EDU could result in misclassification. For example, the EDU “mopped twice a day” could refer to either product behavior or user behavior. For sentence-level relation identification, the accuracies for identifying both concurrence and sequential (i.e., consequence) patterns are desirable. The accuracy is affected by two factors. First, some discourse markers, although infrequent, can indicate both concurrence and sequential patterns. For example, the sentence “It just spins around, turns off, and needs to be docked manually” shows a sequential pattern, whereas “and” is pre-determined as an indicator of a concurrence pattern. Second, some discourse markers also serve other semantic purposes. For example, the marker “so” could also be used as an intensifying adverb at the end of a sentence, which increases error rates.

Table 3

Accuracies of relation identification

| Step (first) | Step (secondary) | Accuracy |
|---|---|---|
| Word level | Env | 83.3% |
| | Soc | 93.3% |
| | Pro | 95.7% |
| | Beh | 66.7% |
| EDU level | | 83.6% |
| Sentence level | 2.2% | 75.0% |

As mentioned before, the PUC knowledge in the PUCKG is represented as a set of triples $T \subseteq E \times R \times E$, and part of the PUC knowledge is listed in Table 4. An illustration of the PUCKG is shown in Fig. 12. In the graph, brown nodes stand for Env entities, green nodes stand for Soc entities, blue nodes stand for Pro entities, and yellow nodes stand for Beh entities.

Fig. 12
Table 4

A partial list of PUC knowledge

| Node ID | Node name | Link ID | Link name | Node ID | Node name |
|---|---|---|---|---|---|
| 19 | Pet | 39 | Sensitive_to | 21 | Noise |
| 21 | Noise | 5 | Subclass_of | 6 | Sound |
| 30 | Sunlight | 30 | Negatively_affect | 3 | Navigation |
| 28 | Stair | 21 | Instance_of | 13 | Architecture_element |
| 17 | Residence_type | 28 | Locate_in | 16 | Outdoor_environment |

### 4.4 Personal Product Usage Context Inference, Summary, and Reasoning Results.

A latent semantic analysis (LSA)-based topic summarization model is used to cluster customers into four groups, and the four most representative PUC entities are summarized for each group. These four PUC entities are assumed to be the sparsely queried PUC mentioned by customers in the initial stage. Figure 13 shows the resulting PUC topics in the four customer groups. The keywords in the topics are mapped to the corresponding PUC nodes in the PUCKG. Given the initial PUC entities, the potentially relevant PUC entities that customers may be interested in are returned and ordered by preference score from high to low. In this case study, the weighting factor γ is set to 0.8.

Fig. 13

Table 5 shows a sample of summarized PUC entities with predicted preference values. First, given the initial PUC queries in each customer group, several potentially relevant PUC entities are recommended. For example, in customer group 1, three new PUC entities, “dog,” “toy,” and “suction,” are identified. The PUC entity “loud” reached the highest score because it is shared by most local knowledge triples. It should be noted that the ranking scores vary significantly across customer groups. For example, the highest score in customer group 3 is only 3.6, which is identical to the lowest score in customer group 1. This is because in customer group 3, the initial queries are relatively sparse in the PUCKG, and the connections between queries are relatively weak. In that case, it is difficult to summarize the personal PUC, and more queries are required.

Table 5

PUC summary for each customer group, ranked by preference score

| Group 1 PUC entity | Score | Group 2 PUC entity | Score | Group 3 PUC entity | Score | Group 4 PUC entity | Score |
|---|---|---|---|---|---|---|---|
| Loud | 10.4 | Stuck | 13.2 | Carpet | 3.6 | Suction | 8 |
| Pet | 8.8 | Furniture | 9.4 | Obstacle | 2 | Dust | 4.8 |
| Dog | 8.4 | Rug | 7.6 | Charging | 1.8 | Rug | 3.6 |
| Hair | 8 | Pelt | 5.8 | Hard floor | 1.8 | Stuck | 2.8 |
| Dust | 7.2 | Toy | 5.8 | | | Navigation | 1.8 |
| Toy | 4.2 | House | 5.4 | | | Hard floor | 1.8 |
| Suction | 3.6 | Window | 3.4 | | | | |

A path query is performed to identify the product behaviors directly linked with the summarized PUC entities to support basic reasoning [40]. The output of the PUCKG is a summary of the potential impacts of the PUC on product utility and performance (i.e., the features that customers might need to pay attention to, or the potential product failures that might happen). The result of basic PUC reasoning for each group is summarized in Table 6. In customer group 1, the PUC knowledge can be represented by six triples, and a comprehensive PUC can be expressed as “Pets are sensitive to noise; dogs generate additional hair, which may negatively affect the brush.” Similarly, other triples can be reasonably explained using natural language as well, helping customers foresee the PUC they need to consider when customizing products.
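Turning triples into readable statements can be done with simple templates. The sketch below is only an illustration; the template wording and the function name `verbalize` are assumptions and are not part of the case study.

```python
# Hypothetical sentence templates keyed by relation type.
TEMPLATES = {
    "sensitive_to": "{h} is sensitive to {t}",
    "generate": "{h} generates {t}",
    "negatively_affect": "{h} may negatively affect {t}",
    "lead_to": "{h} can lead to {t}",
    "require": "{h} requires {t}",
}


def verbalize(triples):
    """Render (head, relation, tail) triples as one plain-language sentence."""
    clauses = []
    for head, rel, tail in triples:
        # Fall back to the relation name itself when no template is defined.
        template = TEMPLATES.get(rel, "{h} " + rel.replace("_", " ") + " {t}")
        clauses.append(template.format(h=head.lower(), t=tail.lower()))
    text = "; ".join(clauses)
    return text[:1].upper() + text[1:] + "." if text else ""
```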

Table 6

Basic PUC reasoning of each customer group

| Group | Basic PUC reasoning |
|---|---|
| 1 | < Pet, sensitive_to, Noise > |
| | < Dog, generate, hair >, < hair, negatively_affect, Brush > |
| | < Dust, require, Suction > |
| | < Toy, used_by, Pet >, < Toy, lead_to, Stuck > |
| 2 | < Pelt, lead_to, Stuck >, < Rug, lead_to, Stuck > |
| | < Furniture, sensitive_to, Stuck > |
| | < Toy, lead_to, Stuck > |
| | < Stair, part_of, House >, < Stair, require, memory >, < Stair, lead_to, dropping > |
| 3 | < Obstacle, lead_to, Stuck > |
| | < Carpet, require, Suction > |
| | < Charging, require, Navigation > |
| 4 | < Carpet, require, Suction > |
| | < Toy, lead_to, Stuck > |
| | < Sunlight, affect, Navigation > |

### 4.5 Relationship Between Product Usage Context Entities and Customer Satisfaction.

Considering that PUC can significantly affect customer preference, customers are expected to show strong sentiment polarity (i.e., either satisfied or unsatisfied) toward different PUC. Therefore, sentiment analysis is also applied to quantify customer satisfaction in each customer group. According to the sentiment polarity (i.e., positive, negative, and neutral), customer satisfaction toward the PUC topics is mapped into different levels, as shown in Fig. 14. The x-axis of the figure represents the four PUC topics, which correspond to the four customer groups. The y-axis represents the customer sentiment score. Here, the neutral sentiment range (−0.2 to 0.2) is combined in one row. The result indicates that each customer group shows a very strong sentiment tendency. Therefore, this research assumes that the identified PUC entities can be used to support customization and increase customer satisfaction.

Fig. 14

## 5 Discussion

The case study of the RVC reveals that the PUCKG can be effectively constructed using the proposed methodology. In addition, the PUCKG shows its efficacy in supporting personal PUC inference, summarization, and reasoning. Compared with traditional approaches, this research could promote user-driven customization in two ways.

For customers, the traditional approach requires customers to specify their needs explicitly, whereas customers may only have general ideas of what they want and are unaware of specific needs. Our proposed approach will enable customers to foresee the potential implications and risks of selected product features in their PUC, help them express needs clearly, make holistic decisions, and be confident with customization.

For designers, the proposed method will help them better understand personal PUC to develop more robust customized products. Compared with traditional approaches, in which customized features are designed based on assumed PUC models, crowdsourced data could give designers a more holistic understanding of PUC and its consequences. Besides, since the KG can be continuously expanded and updated, the proposed method will provide designers with up-to-date knowledge to improve the robustness of customized features.

Two limitations should be noted in the proposed approach. First, the nodes and relations of PUCKG are identified using generic KG. Therefore, the efficacy of PUCKG might be affected by the quality of generic KG. Since much product-specific knowledge is not reflected in the generic KG, some domain-specific relations between PUC entities might be missing. For example, the PUC entity “dock” in generic KG is defined as a subclass of “architectural structure,” whereas to RVC, “dock” typically refers to the charging station of RVC.

Second, the accuracies of entity and relation classifications are subject to the semantic similarity measure. In practice, some PUC entities are semantically similar but contextually different. For example, in terms of semantics, both “chair” and “bench” are subclasses of “furniture” and function to facilitate “sitting.” However, in terms of context, “chair” is a part of the indoor environment, whereas “bench” is typically a part of the outdoor space.

Therefore, to further improve the effectiveness of the proposed method, designers could integrate a domain-specific KG as a reference for PUCKG construction. In addition, context-aware semantic similarity models could be integrated to improve the accuracy of entity and relation classification.

## 6 Conclusion

In conclusion, this paper presents a PUCKG construction process to support user-driven customization. Our proposed method is intended to help customers explore needs holistically, prevent potential customization failures, and be confident with their design decisions. Based on the theoretical investigation, PUCKG construction, and case study, three research findings are identified:

• Compared to the traditional user-driven customization, where customers cannot specify needs holistically, the new approach can summarize personal PUC to help customers foresee potentially relevant needs.

• Compared to traditional user-driven customization, where customers cannot foresee the utility and performance of customized features, our approach provides a more reliable way to avoid potential product failures, which mitigates the uncertainties perceived by customers.

• Compared with traditional methods, PUCKG provides a more efficient way to construct, update, and represent PUC knowledge. It is expected that PUCKG can result in an improved understanding of individual customer needs.

Future work will focus on incorporating different kinds of machine learning and artificial intelligence techniques to further automate the PUCKG construction process.

## Acknowledgment

This work is supported by an Australian Government Research Training Program (RTP) Scholarship.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The data sets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.

## References

1.
Tseng
,
M. M.
,
Wang
,
Y.
, and
Jiao
,
R. J.
,
2017
, “Mass Customization,”
CIRP Encyclopedia of Production Engineering
, The International Academy for Production,
L
.
Laperrière
, and
G
.
Reinhart
, eds.,
Springer
,
Berlin/Heidelberg
, pp.
1
8
.
2.
Tseng
,
M. M.
, and
Du
,
X.
,
1998
, “
Design by Customers for Mass Customization Products
,”
CIRP Ann.
,
47
(
1
), pp.
103
106
.
3.
Wang
,
Y.
, and
Tseng
,
M. M.
,
2011
, “
Integrating Comprehensive Customer Requirements Into Product Design
,”
CIRP Ann.
,
60
(
1
), pp.
175
178
.
4.
Green
,
M. G.
,
Palani Rajan
,
P. K.
, and
Wood
,
K. L.
,
2004
, “
Product Usage Context: Improving Customer Needs Gathering and Design Target Setting
,”
Proceedings of the ASME 2004 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Vol. 3a: 16th International Conference on Design Theory and Methodology
,
Salt Lake City, UT
,
Sept. 28–Oct. 2
, pp.
393
403
.
5.
Ulwick
,
A. W.
, and
Bettencourt
,
L. A.
,
2008
, “
“Giving Customers a Fair Hearing,” (in English)
,”
MIT Sloan Manage. Rev.
,
49
(
3
), pp.
62
68
.
6.
Tseng
,
M. M.
,
Jiao
,
R. J.
, and
Wang
,
C.
,
2010
, “
Design for Mass Personalization
,”
CIRP Ann.
,
59
(
1
), pp.
175
178
.
7.
Schnurr
,
B.
, and
Scholl-Grissemann
,
U.
,
2015
, “
Beauty or Function? How Different Mass Customization Toolkits Affect Customers’ Process Enjoyment
,”
J. Consum. Behav.
,
14
(
5
), pp.
335
343
.
8.
Franke
,
N.
,
Schreier
,
M.
, and
Kaiser
,
U.
,
2010
, “
The “I Designed It Myself” Effect in Mass Customization
,”
Manage. Sci.
,
56
(
1
), pp.
125
140
.
9.
Du
,
X.
,
Jiao
,
J.
, and
Tseng
,
M. M.
,
2006
, “
Understanding Customer Satisfaction in Product Customization
,”
Int. J. Adv. Manuf. Technol.
,
31
(
3
), pp.
396
406
.
10. Huffman, C., and Kahn, B. E., 1998, “Variety for Sale: Mass Customization or Mass Confusion?,” J. Retail., 74(4), pp. 491–513.
11. Piller, F., Schubert, P., Koch, M., and Möslein, K., 2017, “Overcoming Mass Confusion: Collaborative Customer Co-Design in Online Communities,” J. Comput.-Mediat. Commun., 10(4).
12. Wang, Y., and Tseng, M. M., 2011, “Adaptive Attribute Selection for Configurator Design via Shapley Value,” Artif. Intell. Eng. Des. Anal. Manuf., 25(2), pp. 185–195.
13. Liu, A., Zhang, D., Wang, X., and Xu, X., 2021, “Blockchain-Based Customization Towards Decentralized Consensus on Product Requirement, Quality, and Price,” Manuf. Lett., 27, pp. 18–25.
14. Wang, Y., Luo, L., and Liu, H., 2022, “Bridging the Semantic Gap Between Customer Needs and Design Specifications Using User-Generated Content,” IEEE Trans. Eng. Manage., 69(4), pp. 1622–1634.
15. Lin, Y., Yu, S., Zheng, P., Qiu, L., Wang, Y., and Xu, X., 2017, “VR-Based Product Personalization Process for Smart Products,” Procedia Manuf., 11, pp. 1568–1576.
16. Dellaert, B. G. C., and Dabholkar, P. A., 2009, “Increasing the Attractiveness of Mass Customization: The Role of Complementary On-line Services and Range of Options,” Int. J. Electron. Commer., 13(3), pp. 43–70.
17. Benade, M. S., 2018, “Essays on Smart Customization: Towards a Better Understanding of the Customer’s Perspective on Smart Customization Offers,” PhD dissertation, RWTH Aachen University.
18. Green, M. G., Tan, J., Linsey, J. S., Seepersad, C. C., and Wood, K. L., 2005, “Effects of Product Usage Context on Consumer Product Preferences,” Proceedings of the ASME 2005 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Volume 5a: 17th International Conference on Design Theory and Methodology, Long Beach, CA, Sept. 24–28, pp. 171–185.
19. Green, M. G., Linsey, J. S., Seepersad, C. C., Wood, K. L., and Jensen, D. J., 2006, “Frontier Design: A Product Usage Context Method,” Proceedings of the ASME 2006 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Volume 4a: 18th International Conference on Design Theory and Methodology, Philadelphia, PA, Sept. 10–13, pp. 99–113.
20. Wang, X., Liu, A., and Kara, S., 2022, “An Ontology-Based Product Usage Context Modeling Method for Smart Customization,” Procedia CIRP, 109, pp. 641–646.
21. Jin, J., Liu, Y., Ji, P., and Kwong, C. K., 2019, “Review on Recent Advances in Information Mining From Big Consumer Opinion Data for Product Design,” ASME J. Comput. Inf. Sci. Eng., 19(1), p. 010801.
22. He, W., Martinez, J., Padhi, R., Zhang, L., and Ur, B., 2019, “When Smart Devices Are Stupid: Negative Experiences Using Home Smart Devices,” Proceedings of the 2019 IEEE Security and Privacy Workshops (SPW), San Francisco, CA, May 19–23, pp. 150–155.
23. Ram, S., and Jung, H.-S., 1990, “The Conceptualization and Measurement of Product Usage,” J. Acad. Mark. Sci., 18(1), pp. 67–76.
24. Belk, R. W., 1975, “Situational Variables and Consumer Behavior,” J. Consum. Res., 2(3), pp. 157–164.
25. He, L., Chen, W., Hoyle, C., and Yannou, B., 2012, “Choice Modeling for Usage Context-Based Design,” ASME J. Mech. Des., 134(3), p. 031007.
26. Gellersen, H. W., Schmidt, A., and Beigl, M., 2002, “Multi-Sensor Context-Awareness in Mobile Devices and Smart Artifacts,” Mob. Netw. Appl., 7(5), pp. 341–351.
27. Suryadi, D., and Kim, H. M., 2019, “A Data-Driven Approach to Product Usage Context Identification From Online Customer Reviews,” ASME J. Mech. Des., 141(12), p. 121104.
28. Li, X., Chen, C.-H., Zheng, P., Jiang, Z., and Wang, L., 2021, “A Context-Aware Diversity-Oriented Knowledge Recommendation Approach for Smart Engineering Solution Design,” Knowl.-Based Syst., 215, p. 106739.
29. Wang, Z., Chen, C.-H., Li, X., Zheng, P., and Khoo, L. P., 2021, “A Context-Aware Concept Evaluation Approach Based on User Experiences for Smart Product-Service Systems Design Iteration,” Adv. Eng. Inform., 50, p. 101394.
30. Wang, Z., Chen, C.-H., Zheng, P., Li, X., and Song, W., 2022, “A Hypergraph-Based Approach for Context-Aware Smart Product-Service System Configuration,” Comput. Ind. Eng., 163, p. 107816.
31. Abu-Salih, B., 2021, “Domain-Specific Knowledge Graphs: A Survey,” J. Netw. Comput. Appl., 185, p. 103076.
32. Li, X., Lyu, M., Wang, Z., Chen, C.-H., and Zheng, P., 2021, “Exploiting Knowledge Graphs in Industrial Products and Services: A Survey of Key Aspects, Challenges, and Future Perspectives,” Comput. Ind., 129, p. 103449.
33. Weng, S.-S., Tsai, H.-J., Liu, S.-C., and Hsu, C.-H., 2006, “Ontology Construction for Information Classification,” Expert Syst. Appl., 31(1), pp. 1–12.
34. Ehrlinger, L., and Wöß, W., 2016, “Towards a Definition of Knowledge Graphs,” SEMANTiCS (Posters, Demos, SuCCESS), 48(1–4), p. 2.
35. Wang, Q., Mao, Z., Wang, B., and Guo, L., 2017, “Knowledge Graph Embedding: A Survey of Approaches and Applications,” IEEE Trans. Knowl. Data Eng., 29(12), pp. 2724–2743.
36. Ji, S., Pan, S., Cambria, E., Marttinen, P., and Yu, P. S., 2022, “A Survey on Knowledge Graphs: Representation, Acquisition, and Applications,” IEEE Trans. Neural Netw. Learn. Syst., 33(2), pp. 494–514.
37. Li, X., Chen, C.-H., Zheng, P., Wang, Z., Jiang, Z., and Jiang, Z., 2020, “A Knowledge Graph-Aided Concept–Knowledge Approach for Evolutionary Smart Product–Service System Development,” ASME J. Mech. Des., 142(10), p. 101403.
38. Zhou, B., Shen, X., Lu, Y., Li, X., Hua, B., Liu, T., and Bao, J., 2022, “Semantic-Aware Event Link Reasoning Over Industrial Knowledge Graph Embedding Time Series Data,” Int. J. Prod. Res., pp. 1–18.
39. Ko, H., Witherell, P., Lu, Y., Kim, S., and Rosen, D. W., 2021, “Machine Learning and Knowledge Graph Based Design Rule Construction for Additive Manufacturing,” Addit. Manuf., 37, p. 101620.
40. Liu, A., Zhang, D., Wang, Y., and Xu, X., 2022, “Knowledge Graph With Machine Learning for Product Design,” CIRP Ann., 71(1), pp. 117–120.
41. Gero, J., and Milovanovic, J., 2021, “The Situated Function-Behavior-Structure Co-Design Model,” CoDesign, 17(2), pp. 211–236.
42. Delpeuch, A., 2019, “OpenTapioca: Lightweight Entity Linking for Wikidata,” arXiv preprint arXiv:1904.09131.
43. Ireland, R., and Liu, A., 2018, “Application of Data Analytics for Product Design: Sentiment Analysis of Online Product Reviews,” CIRP J. Manuf. Sci. Technol., 23, pp. 128–144.
44. “TextRazor: Technology,” https://www.textrazor.com/technology
45. Dale, R., 2018, “Text Analytics APIs, Part 2: The Smaller Players,” Nat. Lang. Eng., 24(5), pp. 797–803.
46. Camburn, B., He, Y., Raviselvam, S., Luo, J., and Wood, K., 2020, “Machine Learning-Based Design Concept Evaluation,” ASME J. Mech. Des., 142(3), p. 031113.
47. Safavi, T., Belth, C., Faber, L., Mottin, D., Müller, E., and Koutra, D., 2019, “Personalized Knowledge Graph Summarization: From the Cloud to Your Pocket,” Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, Nov. 8–11, pp. 528–537.
48. Miller, J. J., 2013, “Graph Database Applications and Concepts With Neo4j,” Proceedings of the Southern Association for Information Systems Conference, Atlanta, GA, Mar. 8–9, Vol. 2324, No. 36.