Cyber-Empathic Design: A data-driven framework for product design

This item includes six files: two text files that contain datasets, two Microsoft Excel files with multiple worksheets, a "read me" file and an HTML file that redirects to the publisher page for the journal article. Please note that a program like Liquid Studio 2017 or similar is necessary to open the two large .txt files of data.


Introduction and Motivation
A fundamental task common to the design of consumer products is to translate consumer information to engineering requirements. In the case of product redesign, the typical process of mapping information from consumer space to design space can be abstracted to four basic steps [1][2][3]. First, raw information is collected directly from consumers using surveys and other instruments. Second, the collected information is processed using models of consumer perceptions, which try to capture the overall consumer utility. Third, the designer identifies and maps how consumer factors relate to product attributes that ultimately influence consumer behavior. Finally, new product alternatives are developed using the aggregation of consumer and product information and tools such as quality function deployment (QFD) [4][5][6] or discrete choice theory [7] to obtain a set of optimal design options.
Traditionally, consumer information is mapped onto a single construct, utility, using methods like discrete choice analysis (DCA) or choice analysis (CA) [7]. These methods have their origins in the marketing domain where they are used to understand the important design parameters/attributes affecting the fulfillment of consumer perceptions and ultimately their preferences. There are two challenges with this process [4][5][6][7][8]: (a) reliance on a single construct, utility, which is referred to indirectly by mathematical models; and (b) reliance on a designer to map insights obtained from consumer information to design parameters.
This mapping can be influenced by various forms of designer bias including prior experience, existing mental models, and industry/team/firm culture and practices [8].
In addition to utility, other constructs also affect consumer preferences. An alternative approach in understanding these preferences comes from behavioral research in consumer psychology where constructs measuring specific thoughts, perceptions, and attitudes are mapped onto a network of interconnected judgments that predict downstream consumer preferences. This network of interconnected judgments is a "causal path structure," which can be traced to understand the reason behind a particular perception or downstream use intention. This structural path analysis of user psychology, if obtained, can be rich information for the designer potentially reducing the designer's bias/preconceptions while mapping consumer information to attributes. By developing a structural path model of user psychology, the designer will not only be able to understand "what" the preferences of the consumers are, but also "why" consumers have a particular preference (i.e., they can map particular sentiments embodied in psychological constructs to engineered features of the product).
One of the strengths of marketing models, like DCA, is their quantitative nature and scalability [7]. On the other hand, consumer psychology-based methods (referred to as subjective methods) are largely not quantitative and struggle with scalability. Another factor that affects consumer perception is user-product interaction. Neither DCA nor CA incorporates user-product interaction data in a quantitative manner. The premise of this work is that the incorporation of user-product interaction data processed through a network of interconnected judgments will reduce the cognitive load and potential mental bias of designers by assisting them in mapping product features and attributes to consumer perceptions. The long-term vision for this research is the development of a data-driven design paradigm capable of modeling consumer perceptions and preferences on an individual basis for products in the field by incorporating user-product interaction data. Toward establishing the foundations of such a paradigm, there are three research challenges this work attempts to address: (i) How can quantitative and scalable user-product interaction data be obtained? (ii) How can the user-product interaction data be used to understand consumer perceptions and map them onto a network of psychological constructs? (iii) Is incorporating user-product interaction data effective for modeling user perceptions?
In addressing research challenge (i), we look to the emerging internet of things where increased integration of sensors in common consumer products is ongoing to make "smarter" products and environments (e.g., see Nest Thermostats [9]) and serves as critical cyberinfrastructure for data acquisition. In this work, we embed sensors in consumer products to collect user-product interaction data and extract information that will help in understanding consumer perceptions.
To address the other two research challenges, a customizable framework is developed where user-product interaction data are integrated with information obtained from traditional methods like surveys. The proposed framework represents an extension of empathic design [10][11][12], where we observe users nonintrusively using sensors. Thus, we refer to the proposed approach as cyber-empathic design.
In this work, we present a brief overview of the existing methods in design in Sec. 2. We then present the proposed framework and the analytical technique in Sec. 3, followed by a case study with results and discussion in Sec. 4. Finally, in Sec. 5, we present the conclusions and future work.

Current State of Design Methods
Many theories and techniques have been proposed by the design community to understand consumer requirements and then leverage them in a design process in order to make better products. A primary focus of these methods is to model the relationship between the consumer space and design space.

Mapping Consumer and Design Spaces.
Many quantitative methods are functionally based in the marketing domain, while other more subjective methods attempt to improve design by considering the user-product experiences using data sources such as surveys and focus groups. Consumer requirements are eventually mapped to technical specifications and product attributes with the objective of maximizing the value of the product. Recently, there have been attempts to extend and merge the quantitative and subjective information by using various techniques emerging in data analytics. In this section, we present a brief review of these methods, while also noting the need for new methods to complement the existing approaches.
Many of the roots of the methods to map consumer and design spaces lie in QFD [4][5][6]. The first step in QFD is to capture the "voice of the customer" using interviews, surveys, and focus groups resulting in a set of requirements along with their relative importance. The house of quality (HOQ) [4] is a tool used to map consumer requirements to technical specifications. The challenge with HOQ is that it relies on subjective assessments from the designers to identify the mapping and importance of the relationships between individual customer requirements and technical specifications. HOQ has evolved and can include uncertainty in consumers' and designers' information and can also classify requirements according to the Kano model [13]. Though QFD and HOQ have evolved since their inception, the primary focus and challenge remains the same, which is to map consumer requirements to technical specifications using designers' mental models [14].
The use of demand modeling as a basis for product development decisions is represented by the decision-based-design framework [15]. One of the most important steps in demand modeling is the development of an analytical model to link consumer perceptions and valuations of products along with their individual attributes, to specific product features and performance levels, which are under the control of the designer. This approach allows designers to optimize the product through tradeoffs in demand, price, and production cost to ultimately maximize profit or an equivalent surrogate such as value [15][16][17]. The fundamentals for the demand-based approach lie in multi-attribute utility theory [18] and aggregate demand models to model the preferences of a population [19].
In the design community, there has been an increased focus on integrating demand modeling and using it as a fundamental approach to inform engineering decisions. Much of the work has focused on integrating DCA [7] as the theoretical basis for this approach. For instance, in Ref. [20], DCA was integrated into Hazelrigg's decision-based-design framework. Research has also focused on developing models to represent market heterogeneity [21]. It has been shown in Ref. [20] that it is not sufficient for a designer to use a (normative) multi-attribute (utility) decision-making approach to represent the design preferences of an entire market population. For similar reasons, the aggregate demand models [19] used in engineering design are limited as they insufficiently capture the variability found among consumers.
On the other hand, disaggregate demand-modeling techniques use data of individuals instead of group averages and enable more accurate capture of the variation among individuals. Depending on the degree of heterogeneity and the specific design problem, different types of DCA models, such as multinomial logit [22], nested logit [23], and mixed logit [24], have been utilized in design to capture heterogeneity in consumer preferences. In Ref. [25], by allowing random taste variation across the population using a hierarchical Bayes mixed logit model, the heterogeneity in consumer preference is modeled using random parameters without including the customer profile into the choice modeling. In Ref. [26], continuous representations of consumer preference using hierarchical Bayes mixed logit models are compared with discrete representation using latent class mixed logit models where consumers are grouped into segments based on their preferences. Limitations of the latent class mixed logit models for fully capturing preference behavior are also identified. In addition to the random heterogeneity, in Ref. [27], systematic consumer heterogeneity using explicit terms of customer profile attributes is introduced.
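As a rough illustration of the simplest of these models, the sketch below computes multinomial logit choice probabilities for one consumer choosing among three alternatives. It is not taken from the cited works: the partworth values, the two attributes (a comfort score and a price), and the function names are hypothetical and chosen only to make the arithmetic visible.

```python
import math

def utility(partworths, attributes):
    """Linear-in-attributes utility: sum of partworth * attribute level."""
    return sum(b * x for b, x in zip(partworths, attributes))

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one choice set."""
    # Subtract the largest utility before exponentiating for numerical stability.
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical partworths for (comfort score, price in $100s).
partworths = [0.8, -1.2]
# Hypothetical attribute levels for three product alternatives.
alternatives = [[3.0, 1.0], [4.0, 1.5], [2.0, 0.6]]

utilities = [utility(partworths, a) for a in alternatives]
probs = mnl_probabilities(utilities)
for u, p in zip(utilities, probs):
    print(f"utility={u:.2f}  choice probability={p:.3f}")
```

In this sketch every consumer shares the same partworths. Nested and mixed logit models relax that assumption, for example by letting the partworths vary randomly across the population.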
In addition to these contributions, other work has focused on the area of efficient data collection for DCA [28]. Attempts have been made to integrate a DCA methodology with other traditional methodologies like HOQ [29] and the analytical hierarchy process [30]. The discrete choice methodology has also been extended to include uncertainty in the discrete choice parameters and to establish relationships with uncertainty in profitability [31].
Although these are effective methods and serve as good starting points in understanding consumer preferences, there are still challenges in implementation [32]. These methods are not effective in handling inconsistencies in consumer preferences that exist at psychological levels, like differences between stated and revealed preferences [33]. In addition, the introduction of designers' biases presents a significant challenge in these methods. In discrete choice-based methods, the mapping of consumer preferences and product features is indirectly carried out using mathematical models built on partworths. However, the designer is still responsible for determining which features to survey on, indirectly providing a cognitive bias toward the engineered attribute mapping process.
Another critical piece of information missing in the current quantitative methodologies is the user-product interaction data. Consumers develop an emotional connection with their products, and the manner in which consumers use the products affects their perceptions. The use of quantitative information captured from actual user-product interaction data is currently limited. As a result, current models fail to provide insights into why certain choices were made because of the limited scope of the psychological factors included in the model.

Methods to Capture User Experiences.
In addition to the previous methods, there are other methods used to improve design by considering the user-product experiences and interfaces. One of the critical differences between the methods of Sec. 2.1 and those discussed here is that these methods include various individual subjective, emotional, and psychological cues that are largely not included in the mapping methods. These methods include empathic design, universal design, and affordance-based design.
Empathic design provides insights leading to potential product innovations by focusing on specific cues that aid in identifying these opportunities. These cues include but are not limited to frustrations and confusions, fears and anxieties, wasted time, use triggers, etc. [10]. In Ref. [10], the focus is on expanding an empathic design methodology into the auto industry through the development of specific methods that leverage the Kano model to identify "delight" attributes of the products. Researchers have also attempted to develop approaches using "empathic lead users" in ways that differ greatly from the typical user in order to gain insights into how product developments could benefit a significant portion of the market [12].
Universal design is an ideology for product design that focuses on usability effectiveness for all types of users and across full user lifespans regardless of abilities [34]. Based on this perspective, seven design principles are utilized: equitable use, flexibility in use, simple and intuitive use, perceptible information, tolerance for error, low physical effort, and size and space for approach and use [34].
Affordance-based design relies on user-derived perspectives to develop a set of abilities afforded the user by the product. The concept of affordances comes from psychology and was first introduced by Gibson [35], and later referenced by Donald Norman in the context of product design [36,37]. This methodology focuses on shifting designer thinking away from product functions and toward the behavior that a product should afford the user [36][37][38].
Recently, attempts have been made to extract user experience information from nontraditional sources of information including online product reviews. This information is then translated using data analytics in order to extract product features and quantitatively investigate product feature preferences [39][40][41][42][43]. In preference modeling, the mathematical techniques traditionally rely on linear mapping. Using advanced nonlinear techniques in data analytics, the accuracy of preference prediction has been improved [44]. These extensions also demonstrate their effectiveness in handling large and varied datasets of information.
Social behavior and network analysis have also emerged as an effective technique in extending existing methods to model product preferences [45][46][47][48]. The improvements are definitely promising, but still lack incorporation of user-product interaction data in a quantitative manner. In addition, all these methods provide hypotheses about design and designers' perceptions about the mapping of product attributes to consumer perceptions but are unable to provide a basis for confirming or falsifying whether those perceptions are true. We submit that data generated directly during product use (from, for example, embedded sensors) can complement existing methods while also addressing many of the challenges raised in previous work.
The timing of this work aligns with the recent emergence of the internet of things where information can now be acquired and utilized directly from product operation [49][50][51]. However, acquiring and utilizing data generated by products is restricted to capital-intensive products such as automobiles, aircraft, etc. [51]. Also, use of such data is restricted to predictive maintenance [51][52][53][54], marketing [49][50][51], and environmental impact assessment [51,55]. In addition, there is a lack of research on utilizing data generated by products as feedback to a design process [51,56].
Based on the discussion in this section, there is a clear need and opportunity to obtain quantitative information regarding user-product interactions, to map these interactions to psychological constructs, and to develop an analytical technique to model and statistically confirm or deny the relationship. In Sec. 3, we present a flexible architecture to fill this current gap in design capability. The proposed architecture and method have their basis in utilizing embedded sensor data and structural equation modeling (SEM).

Proposed Method
In this section, the method to address the challenges discussed in Secs. 1, 2.1, and 2.2 is presented. Specifically, a flexible framework and an analytical technique for consumer perception modeling in design are presented.
Framework.
Fundamentally, this work is built upon the paradigm of empathic design as we attempt to make field observations about how a user interacts with a given product, although nonintrusively. We provide a framework to incorporate user-product interaction data along with a network of various psychological constructs that are of interest to designers in that they provide insight into user perceptions of the product. The network of psychological constructs can help a designer understand why a user has a particular perception about a product.
Currently, there is a surge in the development of smart products that provide real-time feedback to both users and product manufacturers. We imagine that in the near future, it will be possible to embed sensors in every product with which a user interacts. These sensors would collect information and provide feedback to the user directly, as well as provide useful and important user-product interaction data to the product designers. This work uses embedded sensors in products to collect user-product interaction data, to extract relevant features from raw data (e.g., use statistics), and to map those features onto a network of interconnected judgments, which are represented by psychological constructs of the users that are of interest to designers.
The long-term vision of this work is to provide a framework that leads designers to explore specific product manipulation studies, based upon product features that have significant influence on user perceptions. These types of "actionable insights" will help designers to develop products that are more aligned with individual user needs and experiences. To provide such insight the framework must prove capable of delivering more specific information by replacing formative survey measures with formative sensor measures. Demonstrating the effectiveness of the use of sensors through improved model fit is the focus of this study. In Sec. 3.2, the analytical technique and the psychological constructs used for this work are presented. The framework architecture is flexible meaning that the framework allows additional psychological constructs to be included.

Analytical Technique.
To develop the analytical framework for the cyber-empathic (CE) framework using psychological constructs, SEM is used as the modeling foundation. SEM is a multivariate statistical technique widely used in biology, economics, sociology, psychology, and consumer research [57][58][59]. The basic model underlying the SEM approach is reflected in the following equations:
$$\eta = B\eta + \Gamma\xi + \zeta \quad (1)$$
$$y = \Lambda_y \eta + \varepsilon \quad (2)$$
$$x = \Lambda_x \xi + \delta \quad (3)$$
where $\eta$ refers to the vector of endogenous random variables (variables influenced by other exogenous or endogenous variables), $\xi$ refers to the vector of exogenous random variables (variables influencing other endogenous variables), $B$ represents the coefficient matrix showing the influence of the endogenous variables on each other, $\Gamma$ represents the coefficient matrix showing the influence of the exogenous variables on the endogenous variables, $x$ and $y$ are the vectors of observed variables, $\Lambda_x$ and $\Lambda_y$ are the coefficient matrices showing the relationship of exogenous and endogenous variables on the observed variables, respectively, $\zeta$ represents the structural error, and $\varepsilon$ and $\delta$ are the measurement errors.
Equation (1) represents the structural model, and Eqs. (2) and (3) represent the measurement model. The structural model shows the relationships between the theoretical or hypothetical (latent) constructs, which are not directly measurable. In SEM, latent variables are referred to as hidden constructs that cannot be measured directly. In product design, latent variables could take form in "unknown user needs" or represent certain "psychological constructs" that are of interest to designers. In the context of the CE framework, latent variables in SEM are psychological constructs that are of interest to designers.
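To make the roles of the structural and measurement models concrete, the following sketch simulates data from a deliberately small SEM: one exogenous latent construct (written `xi`), one endogenous latent construct (`eta`), and two observed indicators for each. It is an illustration under stated assumptions, not the model estimated in this work. The path coefficient `gamma`, the loadings, the error standard deviations, and the function names are all hypothetical, and the endogenous constructs do not influence one another (the coefficient matrix among endogenous variables is zero).

```python
import random

def simulate_sem(n, gamma=0.6, lam_x=(1.0, 0.8), lam_y=(1.0, 0.7), seed=0):
    """Draw n observations of (x1, x2, y1, y2) from a two-construct SEM."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)                  # exogenous latent construct, Var = 1
        eta = gamma * xi + rng.gauss(0.0, 0.8)    # structural model with structural error
        x = [l * xi + rng.gauss(0.0, 0.5) for l in lam_x]    # measurement model for x
        y = [l * eta + rng.gauss(0.0, 0.5) for l in lam_y]   # measurement model for y
        rows.append(x + y)
    return rows

def covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

rows = simulate_sem(20000)
x1 = [r[0] for r in rows]
y1 = [r[2] for r in rows]

sample_cov = covariance(x1, y1)
# Model-implied covariance: lam_x[0] * gamma * lam_y[0] * Var(xi), with Var(xi) = 1.
implied_cov = 1.0 * 0.6 * 1.0
print(f"sample cov(x1, y1) = {sample_cov:.3f}, model-implied = {implied_cov:.3f}")
```

The latent constructs never appear in `rows`; only their indicators do. Covariance-based estimation works in the opposite direction from this simulation: it searches for the coefficients whose model-implied covariances best match the sample covariances of the observed variables.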
The SEM represents a set of linear structural equations where the parameters are not just descriptive measures of association but also reveal an invariant causal relation to the extent the basic assumptions about the data are met [57]. This general system of equations allows us to link measured variables coming from various sources to underlying psychological constructs, along with their interrelationship, all in a single model. In SEM, using confirmatory factor analysis, multiple hypotheses of the interrelationships of the psychological constructs (structural models) can be tested. The structural model of the psychological constructs is defined by the designer and is usually an iterative process. Testing multiple structural models allows testing of multiple mental models until there is convergence on a best (minimum error) model. Thus, for the cyber-empathic framework, confirmatory factor analysis can be used to test the designers' hypotheses regarding the structure of the interconnected psychological constructs.
A general framework for cyber-empathic design is shown in Fig. 1. The exogenous and endogenous latent variables are the psychological constructs which represent user perceptions that are of interest to the designer but are not directly measurable. Instead, these are quantified through reflective and/or formative measurements [60].
In a reflective model, a construct is posited as the common cause of the measurements/indicator variables and causal action flows from the construct to the measurement variables. The reflective measurement data are collected using self-report surveys, which help users "reflect" on their product usage and other psychological constructs such as their perceived comfort.
In some cases, user reflections are shaped by actual user-product interaction, which acts as a formative measure [60]. A formative model posits a construct (latent variable) that represents the common variation in a collection of measurement/indicators. The causal action flows from the measurement variable to the construct. The user-product interaction affects a user's reflections about a product and their associated psychological constructs. In the CE framework, since user-product interaction data are obtained using sensors, the sensor data represent the formative measures. Thus, the framework integrates multimodal data (represented by rectangles in Fig. 1), which represents an important contribution of this work.
A representative example of the CE framework is shown in Fig. 2. The model is dependent on the product and context in which the product is used. It is a nontrivial task to identify psychological constructs and establish relationships. As a starting point, we suggest that designers use psychological constructs that are of interest to them and develop a theory-driven model (as in the case of the representative example). Recommendations described in Ref. [61] should be followed to develop and assess the model appropriately.
The example of the CE framework includes a set of psychological constructs based on studies and measurement models from the marketing domain [59][62][63][64][65][66]. Based on past product usage research, it is expected that "prior experience"/"expertise" [62] and the users' "involvement" [63] with the product category could affect their perception about the product's "ease of use." Thus, these two psychological constructs are used as two exogenous constructs in the CE causal model. According to the technology adoption model [64], the ease of use [65] as well as "product capability perceptions" [63] drive "usage intentions" [65,66]. In this example, the product capability perception is considered as an exogenous variable. Usage intention can be dependent on and influenced by product capability perception and, therefore, is considered an endogenous variable in the example.
The psychological constructs considered so far in the literature do not leverage the user-product interaction data. For users, the "physical ease of use" of the product is also an important factor that affects their perceptions. If the user is not at ease interacting with or using a product, they will likely not purchase the product.
In this example, user-product interaction data from embedded sensors is leveraged to understand and model physical ease of use, which can represent more specific constructs like "comfort" depending on the scenario. This construct captured by the sensor data is another exogenous variable in the framework. Physical ease of use can affect "perceived ease of use" as well, which as stated earlier is an endogenous variable.
This set of psychological constructs in the representative example reflects the intended adoption or stated preferences of the user. However, previous work has illustrated that stated and revealed preferences may diverge, limiting the predictive value of preference models [33]. As a result, it is important to consider revealed preferences from a design point of view. Revealed preferences can be obtained by using information regarding the buying history of the user. However, owning a product and actually using or adopting it are different. A customer may buy a product but, if not satisfied with it postpurchase, may not use it, which affects the customer's future buying behavior. Thus, to further understand the user's perception of the product, their "actual adoption" [65] is an important factor to be considered, and embedded sensors can be used to model this adoption. In this example, actual adoption or "usage adoption" is considered as the final psychological construct, and since it is dependent on other constructs, it is an endogenous variable. To summarize, the psychological constructs considered in this example along with their classification and their measurement mode are shown in Table 1.
An important advantage of this framework lies in the use of sensor data to quantify the influence (via structural path analysis) of a particular design attribute or feature on a particular perception. A sensor-augmented path analysis will provide more precise (at the resolution of sensors) and objective information that will aid designers in defining future product manipulation studies (i.e., studies that systematically vary engineered features to quantify the changes in user perception). While product attribute manipulations are beyond the scope of this study, in Sec. 4, we present an application of the framework to develop a structural path model for a particular type of shoe.

Case Study, Results, and Discussion
To demonstrate the effectiveness of incorporating user-product interaction information using the CE framework, a case study of a sensor-integrated shoe is presented in this section. The psychological constructs along with the theory-based relationship used among these factors are also presented. To demonstrate the effectiveness of the sensor-enabled framework, two models are compared: one model is based only on survey data, and the second incorporates additional data from embedded sensors.

Sensor Data Acquisition System.
As part of the study, standard walking shoes were retrofitted with various sensors. The sensors include force sensitive resistors (FSR), accelerometers, flex sensors, and temperature sensors. The sensor suite consists of eight FSRs, one accelerometer, one flex sensor, and one temperature sensor. The FSRs target the forefoot, midfoot, and the hind-foot area as shown in Fig. 3. The accelerometer is used to understand the orientation of the foot and how a person is walking. The temperature sensor is used to understand how warm the shoe feels after a period of time and how it affects the comfort of users. The sensors were attached under the removable insert of the shoe and their placement is shown in Fig. 3. An Arduino Mega was used as the microcontroller to collect the data at a frequency of 22 Hz. It should be noted that for this case study, the product attributes of the shoes remained unchanged throughout the experiment. Section 4.2 presents the experimental protocol to capture the users' perceptions when they use the sensor-integrated device.

Experimental Protocol.
To collect data, student, staff, and faculty participants were recruited from the University at Buffalo-SUNY. The shoe sizes were limited to women's sizes 7, 8, and 9 (U.S.) and men's sizes 8, 9, and 10 (U.S.). Participants were paid $20 as compensation for their participation. In total, 151 users participated in the study; however, data from only 142 users could be used for analysis. The surveys were conducted using QUALTRICS software.
For the study, each participant completed three surveys using their smartphones and walked on a designated path across the campus for approximately 25 min. The path included tasks like walking on a flat surface, walking upstairs, walking downstairs, sitting, and standing. Each participant walked approximately 1 mile for the study.
The three surveys collected data specific to certain psychological constructs and were administered at specific intervals: one before starting the walking experiment, one during the experiment (after completing a set of tasks), and one after completing the experiment. The sensor data were collected only when the participants were performing the tasks (which included completing the second survey while sitting). Over 95 GB of sensor data were acquired and analyzed for this work. It should be noted that if data (sensor and survey) are being collected from n users over a period of time, it has the potential to fulfill the four V's criteria of big data: volume, velocity, variety, and veracity [56]. Therefore, the CE framework could accommodate big data. However, to explore and validate this capability, we will conduct more studies with more complex products in future work. The psychological constructs considered for this work and measured using the consumer surveys are presented in Sec. 4.3.

Psychological Constructs.
For this case study, the psychological constructs being considered are design appeal, perceived capability, perceived usability, perceived comfort, user evaluation, and usage intention. The first survey (i.e., the presession survey, which is completed prior to the walking tasks) measures design appeal, perceived capability, and perceived usability. The second survey (i.e., the in-situation survey completed at the halfway point of the walking circuit) measures perceived comfort, while the last survey (i.e., the postsession survey) measures all the constructs. Readers are referred to Ref. [67] for the survey questionnaire related to each construct. The additional sensor data are used to understand the perceived comfort construct.
Based on past research concerning the factors that impact perceived comfort [68,69], the assumed relationships among the psychological constructs, which represent our hypothesis, are shown in Fig. 4, and this hypothesis is tested using SEM. Of course, there is a possibility that other construct relationships exist, possibly producing different results than ours. Analysis and comparison of different construct relationships is a topic of future work and can readily be supported by the CE framework and corresponding fit indices. Covariance-based SEM (CB-SEM), which is based on the principle of maximum likelihood [57], is used as the parameter estimation algorithm and for hypothesis testing. For this work, results are obtained using SPSS-AMOS software. The analysis procedure to test our hypothesis is presented in Sec. 4.4.

Analytic Procedure.
The analysis procedure to estimate the parameters of the CE model using the CB-SEM method is shown in Fig. 5. The sensor signal data are first cleaned to eliminate noise, followed by a feature extraction procedure. The extracted features are a representation of the sensor data, which act as input to the SEM model. It should be noted that feature design and extraction from sensor data is a complex task and can vary based on the situation. For example, features designed for activity classification can be different from features designed to estimate comfort rating. Modern methods like machine learning can be used to automatically learn features [70,71]; however, the application and integration of these methods is beyond the scope of this work.
Once the features are defined and extracted, CB-SEM factor analysis is performed using the features. The factor scores act as the inputs to the measurement model for the perceived comfort construct. Similarly, the survey measures are used as inputs for the measurement model of the corresponding constructs. The measurement model is then used to develop the structural model and estimate the parameters.
CB-SEM assumes that the input data are normally distributed; hence, the data (survey and sensor features) are transformed to normal distributions using a Box-Cox [72] transformation technique. In the case of the CE design framework, the process of parameter estimation and obtaining a satisfactory model using CB-SEM depends on the quality of the extracted features. The process is iterated until the fit index criteria are met and the best model is obtained (based on the convergence of the fit indices). Because the uncertainty in this process lies mostly in feature engineering, new features have to be generated until satisfactory fit index measures are obtained. The results reported in this work are based on the best features extracted to obtain a satisfactory model. The results obtained for the case study using the analysis procedure and the related discussion are presented in Sec. 4.5.
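As a minimal sketch of the transformation step, the following pure-Python code applies the Box-Cox transform and picks its parameter on a coarse grid by maximizing the normal profile log-likelihood. The function names, the grid range, and the grid-search strategy are assumptions made for illustration; the source does not state how the parameter was selected.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        # The lambda -> 0 limit of (v**lam - 1) / lam is the natural log.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lambda under a normal model (constants dropped).

    Assumes the transformed values are not all identical (nonzero variance).
    """
    n = len(values)
    y = box_cox(values, lam)
    mean = sum(y) / n
    var = sum((t - mean) ** 2 for t in y) / n
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(var) + jacobian

def best_lambda(values, grid=None):
    """Pick lambda on a coarse grid (assumed range -2.0 to 2.0, step 0.1)."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]
    return max(grid, key=lambda lam: box_cox_log_likelihood(values, lam))
```

In practice each survey item or sensor feature would be transformed separately, and data containing zeros or negative values would need a shift before this transform applies.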

Results.
To test the effectiveness of incorporating user-product interaction data using the CE framework, two models are compared in this section. The first model is based only on survey data (survey-based model) and the second model integrates the user-product interaction data from the sensors (CE model). The only psychological construct that differs between the survey-based model and CE model is perceived comfort. In the survey-based model, localized comfort measures for specific areas of the foot are obtained using surveys, while in the CE model, localized measures are obtained from sensors. These localized measures are the formative indicators for the perceived comfort construct. The surveys are presented in Ref. [67].
For CB-SEM, based on recommendations in the SEM literature [57], the goodness-of-fit index (GFI), the comparative fit index (CFI), the non-normed fit index (NNFI), the root-mean-square error of approximation (RMSEA), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) are used as the fit criteria for model comparison. Higher GFI, CFI, and NNFI correspond to better models, while lower RMSEA, AIC, and BIC correspond to better models [73][74][75]. In addition, the Browne-Cudeck single-sample cross-validation (BCC) index is used to compare the generalizability and predictive capability of the models [76]. As per recommendations, a good model has BCC in the range 0-2.0, while models with BCC from 2.0 to 4.0 are considered weak models. Models with lower BCC are considered better.
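For readers who want to see how several of these indices relate to the model chi-square, the sketch below uses common textbook definitions. It is an illustration under stated assumptions, not the computation performed by SPSS-AMOS: software packages differ in details (for example, the sample-size term in BIC), and GFI and BCC are omitted because they require the fitted covariance matrices. The symbols `chi2_m`/`df_m` (fitted model), `chi2_b`/`df_b` (baseline independence model), `q` (number of free parameters), and `n` (sample size) are names chosen here.

```python
import math

def rmsea(chi2_m, df_m, n):
    """Root-mean-square error of approximation (lower is better)."""
    return math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))

def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative fit index relative to the baseline model (higher is better)."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m)
    return 1.0 if d_b == 0 else 1.0 - d_m / d_b

def nnfi(chi2_m, df_m, chi2_b, df_b):
    """Non-normed fit index, also known as the Tucker-Lewis index."""
    ratio_b = chi2_b / df_b
    return (ratio_b - chi2_m / df_m) / (ratio_b - 1.0)

def aic(chi2_m, q):
    """Chi-square form of the Akaike information criterion (lower is better)."""
    return chi2_m + 2 * q

def bic(chi2_m, q, n):
    """Chi-square form of the Bayesian information criterion (lower is better)."""
    return chi2_m + q * math.log(n)
```

Because AIC and BIC are only meaningful relative to another model fitted to the same data, they are used here for comparison rather than judged against a fixed threshold.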

Psychological Constructs Assessment (Measurement Model).
The first step in CB-SEM is to develop the measurement model and assess the psychological constructs. In this section, we present the results of confirmatory factor analysis (CFA) for both models considered in this work: the survey-based model and the CE model. For this work, we develop and assess two measurement models. As the first step for assessment, we develop a measurement model using only reflective indicators for all psychological constructs (Fig. 4). Based on the assessment, we then develop a measurement model only for perceived comfort, as it is a multiple indicator and multiple cause (MIMC) model, i.e., it has both formative and reflective [60] indicators. Such a scenario agrees with intuition because the localized comfort (formative) measures represented by survey questions about pressure felt in various regions of the foot (Fig. 3(a)) will drive the overall perceived comfort construct, which is reflected by the overall comfort perception measures (reflective) represented by survey questions about the comfort of the shoe.
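The structure of such a model can be written compactly in its generic textbook form. The symbols below are assumptions introduced for illustration and do not appear in the source: $\eta$ is the latent perceived comfort construct, $x_i$ are the $q$ formative (localized) measures, $y_j$ are the $p$ reflective (overall comfort) measures, and $\zeta$ and $\varepsilon_j$ are error terms.

```latex
\begin{aligned}
\eta &= \sum_{i=1}^{q} \gamma_i x_i + \zeta, \\
y_j  &= \lambda_j \eta + \varepsilon_j, \qquad j = 1, \dots, p.
\end{aligned}
```

The first equation expresses that the formative measures drive the construct, and the second that each reflective measure is an imperfect indicator of it, with $\lambda_j$ playing the role of the factor loading.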
The CFA results obtained using only the reflective indicators are presented in Table 2. The results in Table 2 do not include the factor scores corresponding to the localized formative comfort measures. CR is the composite reliability and AVE is the average variance extracted; satisfactory values are CR, GFI, CFI, NNFI ≥ 0.9 and AVE ≥ 0.5 [73][74][75]. We observe the following from Table 2:
- Factor loadings of (most) reflective measures of design appeal, perceived effectiveness, perceived usability, user evaluation, and adoption intention are greater than 0.7.
- Cronbach's alpha, CR, and AVE values of design appeal, perceived effectiveness, perceived usability, user evaluation, and adoption intention are satisfactory.
- Overall fit indices GFI, CFI, and NNFI are also satisfactory. Hence, the measurement models obtained for these factors are satisfactory.
- Based on the values for factor loadings, Cronbach's alpha, CR, and AVE, reflective measures are not adequate to model perceived comfort. Thus, there is a need to introduce formative measures as well and develop an MIMC model for perceived comfort.
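The reliability measures used in this assessment can be computed from standardized factor loadings, as in the following sketch. It uses the standard formulas for CR and AVE and the thresholds quoted above (CR ≥ 0.9, AVE ≥ 0.5, loadings > 0.7); the function names and the example loadings are hypothetical and are not values from Table 2.

```python
def composite_reliability(loadings):
    """CR from standardized loadings: (sum l)^2 / ((sum l)^2 + sum(1 - l^2))."""
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + error)

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def construct_is_satisfactory(loadings, cr_min=0.9, ave_min=0.5, loading_min=0.7):
    """Check one construct against the thresholds quoted in the text."""
    return (
        all(l > loading_min for l in loadings)
        and composite_reliability(loadings) >= cr_min
        and average_variance_extracted(loadings) >= ave_min
    )

# Hypothetical loadings for a four-item construct (placeholders, not paper data).
example_loadings = [0.9, 0.9, 0.9, 0.9]
print(round(composite_reliability(example_loadings), 3))
print(round(average_variance_extracted(example_loadings), 3))
```

A construct that fails these checks with reflective indicators alone, as perceived comfort does here, is a candidate for adding formative indicators.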
To build the MIMC model for perceived comfort, we require direct measures of the formative indicators, for example, pressure at specified locations in the shoe. In the survey-based model, these formative indicators are measured using survey questions regarding pressure felt in the fore-, mid-, and hind-foot as in Fig. 3(a). As the localized survey measures in the survey-based model focused only on perceived pressure, for model comparison purposes, only force sensor (FSR) data are utilized in the CE model. That is, for the CE model, formative survey measures are replaced by sensor measurements. This represents the key concept of the cyber-empathic framework.
In this work, only simple features which are representations of the raw signal data are used as input to the CFA model. Specifically, mean force is used as the feature in this work. Multiple iterations were conducted with different features according to the procedure presented in Sec. 4.4. However, only mean force resulted in satisfactory results. As stated previously, extracting features is a complex topic and is not the focus of this paper. A detailed analysis with other features is the focus of future work.
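A minimal sketch of the mean-force feature is given below. The source does not specify the noise-cleaning method, so the trailing moving average used here is only a stand-in, and the sensor names, window length, and function names are hypothetical.

```python
def moving_average(signal, window=5):
    """Smooth a signal with a trailing moving average (stand-in for noise cleaning).

    The first few outputs average over fewer samples than the full window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, sample in enumerate(signal):
        running += sample
        if i >= window:
            running -= signal[i - window]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed

def mean_force_features(readings, window=5):
    """Map each sensor's raw force samples to a single mean-force feature."""
    features = {}
    for sensor, samples in readings.items():
        if not samples:
            raise ValueError(f"no samples for sensor {sensor!r}")
        cleaned = moving_average(samples, window)
        features[sensor] = sum(cleaned) / len(cleaned)
    return features

# Hypothetical readings from two force sensors (placeholders, not paper data).
readings = {"fsr_1": [1.0, 2.0, 3.0], "fsr_2": [4.0, 4.0, 4.0]}
print(mean_force_features(readings, window=1))
```

One value per sensor and participant is then what enters the factor analysis as a formative indicator.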
The results of the MIMC model for perceived comfort (for the survey-based model and the CE model) are presented in Table 3. Only the factor loadings of the reflective measures are presented in Table 3. We observe the following from Table 3:
- The factor loadings of perceived comfort are satisfactory.
- The overall fit indices CFI, GFI, and NNFI are satisfactory. Thus, we can state that the MIMC models for perceived comfort for both the survey-based and CE models are satisfactory.
- While comparing models, the model with lower AIC and BIC is considered more effective. The values of AIC and BIC obtained for the CE model are better than those of the survey-based model, demonstrating that the measurement model of perceived comfort for the CE model is better than that of the survey-based model.
This is a significant finding as it confirms that sensor data can be used to quantitatively model and understand psychological constructs like perceived comfort. This result shows that objective user-product interaction data can be collected using sensors and incorporated in the model, reducing and potentially eliminating the dependency on consumers to provide survey feedback. In other words, there is a potential to replace or augment formative survey measures with formative sensor measures.
However, based on the fit indices of the CE model, it is clear that there is opportunity to improve the measurement model as not all fit indices are in the satisfactory range (CFI and NNFI are less than the desired level of 0.9). For the CE model, as shown in Fig. 5 and discussed in Sec. 4.4, the feature extraction from raw sensor data is a critical step and affects the measurement model quality. Research on improved feature extraction strategies and/or automatic feature extraction using machine learning methods and its effects on the model is a topic of future work [77].
Based on the CFA results, the SEM model is estimated and presented in Sec. 4.5.2 to illustrate the effectiveness of the cyber-empathic model compared to the pure survey-based model.

SEM Model Assessment and Comparison.
The factor scores obtained using the CFA presented in Sec. 4.5.1 are used to estimate the parameters of the overall structural model. The underlying hypothesis of the psychological construct relationship is shown in Fig. 4. Causal discovery (i.e., developing the underlying structural relationship of psychological constructs) is not in the scope of this work. Instead, a structural relationship is assumed for this case study based on past research [68,69]. The results discussed in this section assume that the overall structural relationship defined is accurate and includes all relevant confounds. We acknowledge that if one cannot verify these conditions during experimentation, the results obtained may be misleading.
Survey-based model assessment: For the survey-based model, the structural model obtained is presented in Fig. 6. The p-values obtained to test the significance of the structural relationships and the overall model fit indices are presented in Table 4. We observe from the p-values that the relationships-design appeal to perceived comfort and perceived usability to perceived comfort-are not significant. Thus, as per previous recommendations [61], these two relationships are eliminated and the structural model is re-evaluated.
After re-evaluation, the structural model along with the p-values of the relationships and overall fit indices are presented in Fig. 7 and Table 5, respectively. We observe that all relationships are now significant and from the overall fit indices, we can also state that a satisfactory model is obtained. In addition, by comparing the AIC and BIC values of Tables 4 and 5, we can also state that an improved survey-based model is obtained. A similar assessment for the CE model is also conducted and compared with the survey-based model to test the hypothesis that user-product interaction data are useful for improving user perception modeling.
CE model assessment: The structural model and the parameters obtained for the CE model are presented in Fig. 8 and Table 6. From the fit indices, it is also clear in this case that the model is satisfactory. However, similar to the survey-based model, the relationships from design appeal to perceived comfort and from perceived usability to perceived comfort are not significant. Thus, these two relationships are eliminated and the structural model is re-evaluated.
The re-evaluated structural model along with the p-values and overall fit indices are presented in Fig. 9 and Table 7, respectively. We observe that all relationships are significant and also the model is satisfactory. As a result, the survey-based and the CE model can now be compared.
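The elimination step applied to both models can be sketched as a simple filter over path p-values, after which the reduced model is re-estimated in the SEM software. The 0.05 significance level, the function name, and the p-values below are assumptions for illustration only; they are not the values reported in Tables 4 to 7.

```python
def prune_paths(p_values, alpha=0.05):
    """Split structural paths into those kept (p < alpha) and those dropped."""
    kept = {path: p for path, p in p_values.items() if p < alpha}
    dropped = sorted(path for path in p_values if path not in kept)
    return kept, dropped

# Placeholder p-values keyed by (source construct, target construct).
example_p_values = {
    ("design appeal", "perceived comfort"): 0.40,
    ("perceived usability", "perceived comfort"): 0.31,
    ("perceived capability", "perceived comfort"): 0.01,
}
kept, dropped = prune_paths(example_p_values)
print(dropped)
```

The filter only identifies candidate paths to remove; whether the reduced model is actually better is judged from the fit indices after re-estimation.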
To test the hypothesis that user-product interaction data are effective for modeling user perception and to address research challenge (iii), a comparison of the survey-based model and CE model is conducted based on the overall fit indices of Tables 5 and 7. We observe that the AIC and BIC values of the CE model are lower than those of the survey-based model. As per previous recommendations [57], we conclude that for this case study and the assumed structural relationship, the CE model is more effective and has better explanatory power than the survey-based model.
In addition to AIC and BIC, we observe that the BCC score, which is a single-sample cross-validation index, is greater than 2.0 for the survey-based model and less than 2.0 for the CE model. As per recommendations [76], since the BCC score of the CE model is less than 2.0 and is lower than the BCC score of the survey-based model, we conclude that the CE model has better generalizability relative to the survey-based model. Thus, for this case study, we can replace or augment formative survey measures with formative sensor measures to model user perception for the sensor-integrated shoe. Hence, the CE model is more effective for modeling user perceptions about a product, addressing research challenge (iii) presented in Sec. 1. Similar model testing using other structural model assumptions and other products is required to validate this claim further, which is a topic of future research.
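The comparison logic used in this section can be summarized in a short sketch: prefer the model that is lower on both AIC and BIC, and rate BCC using the ranges quoted earlier (0 to 2.0 good, 2.0 to 4.0 weak). The function names, the treatment of the boundary value 2.0 as "good", and the numbers in the example are assumptions; they are not the values from Tables 5 and 7.

```python
def bcc_rating(bcc):
    """Rate a BCC score using the ranges quoted in the text."""
    if bcc < 0:
        raise ValueError("BCC is expected to be non-negative here")
    if bcc <= 2.0:  # assumption: the boundary 2.0 counts as good
        return "good"
    if bcc <= 4.0:
        return "weak"
    return "outside the quoted ranges"

def preferred_model(fits):
    """Return the model lowest on both AIC and BIC, or None if they disagree."""
    best_aic = min(fits, key=lambda name: fits[name]["AIC"])
    best_bic = min(fits, key=lambda name: fits[name]["BIC"])
    return best_aic if best_aic == best_bic else None

# Placeholder fit values for illustration only.
fits = {
    "survey-based": {"AIC": 120.0, "BIC": 150.0, "BCC": 2.6},
    "CE": {"AIC": 100.0, "BIC": 128.0, "BCC": 1.4},
}
print(preferred_model(fits), {name: bcc_rating(f["BCC"]) for name, f in fits.items()})
```

Returning `None` when AIC and BIC disagree makes the ambiguous case explicit instead of silently favoring one criterion.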
To summarize, we have addressed the research challenges in the following ways: (i) we use product-embedded sensors to capture user-product interaction data (Sec. 3.1), (ii) we use SEM to establish a relationship between user-product interaction data and psychological constructs (Sec. 3.2), and (iii) using a sensor-integrated shoe case study, we demonstrate the effectiveness of incorporating user-product interaction data captured using sensors to model user perceptions (Sec. 4).
As a result of this enhanced model of user perceptions, engineers can be exposed to previously unknown relationships between product form and function. In Sec. 4.5.3, we highlight one such example.

Example of Possible Actionable Insight.
Based on the factor analysis and measurement model of the CE framework, an important insight is revealed. In the survey-based model, participants were asked to provide localized comfort measures specific to certain areas: fore-, mid-, and hind-foot (Fig. 3(a)). These specific locations and corresponding survey measures represent a designer's mental model, as we concentrated on these specific areas of the foot. However, the factor analysis revealed sensor groups that differ from the areas specified in the survey-based model. These groups are shown in Fig. 10, where each group is highlighted (black dotted, gray, or white). As the measurement model obtained for CE is better than that of the survey-based model, we can conclude that these sensor groups have better explanatory power to describe how comfort is experienced by the participants. Thus, for a future generation of this shoe, instead of focusing on the regions specified in Fig. 3, focus might be better directed toward the regions shown in Fig. 10. This observation and the aforementioned suggestion are a significant result for the experiment conducted because they underscore the argument presented in Secs. 1 and 2 that designers' mental models may include invalid assumptions about how products are experienced by end users. We consider this an actionable insight that would guide product manipulation studies in which we might vary the stiffness of material in the three regions of Fig. 10 to systematically look for better designs for individual users.
Surveys alone cannot reveal such bias, but the sensors can decouple user feedback from potential designer bias by replacing survey-based formative indicators with sensor-based measures. The objective of the CE framework is to discover and reveal these potential patterns and information, as demonstrated in this case study. However, this leads to another important question for the designer: whether the areas shown by the sensor grouping and the associated engineering properties are responsible for the user's perceived comfort. This can be studied and analyzed by conducting product manipulations, a topic of future work.

Current Limitations.
Although the results of this work are promising and a good starting point for further development, there are a few important caveats to consider.
In its current form and demonstration, the CE model obtained and assessed should not be considered a causal model. To develop a causal model, one should manipulate product attributes (with the assumption that this will change the manner in which users interact with the product), obtain sensor information, and assess the effect of the manipulation on user perception. The resultant mapping of product attributes to user perception through the sensor information can lead to causal model development. In this work, we have demonstrated that user-product interaction information obtained using sensors can be useful in modeling user perception. The structural relationship of the psychological constructs is theory driven and adopted from the well-established technology adoption model. This assumes that all relevant confounds have been taken into consideration. If the technology adoption model is not suitable for a particular product or scenario, then alternative models should be considered, explored, and tested. The model exploration and testing should follow the principles of structural equation modeling demonstrated in this work.
The CE framework is also dependent on the types of sensors used, their placement, and their relationship with the attributes. This represents another form of bias, and guidelines should be developed in the future to address it.
There is also a restriction on the manner in which the experiments are conducted and its relation to the CE framework. When users are interacting with the product, designers should be aware of the usage contexts in which the product is being used. For example, in the case study presented, we limited the usage contexts to walking, walking upstairs, walking downstairs, and sitting. If designers are not aware of the usage contexts, then the sensor feature extraction procedure can become complicated and can introduce other confounds that affect user perception. In that case, the CE model may not produce useful results or may fail altogether.

Conclusion and Future Work
Current product design approaches face significant challenges when trying to integrate user-product interaction data in a quantitative manner to support design decisions. In addition, these methods rely on designers' mental models to map important product attributes to consumer perceptions. In this paper, we present a framework demonstrating that user-product interaction can be used to model user perceptions and also provide actionable insights. This new cyber-empathic framework captures user-product interaction data using embedded sensors. Structural equation modeling is used as the analytical method for modeling and parameter estimation. This technique can be useful to reveal unknown patterns that can differ from preconceived mental models of the designer. There are multiple potential benefits of the proposed framework in semicontrolled scenarios: (a) It can be scaled to n users where designers can "observe" product use and extract information in real time. (b) The information obtained is semicontrolled data, revealing actual variability and more information from broad use scenarios. This can lead to identification of certain design cues that designers may not have considered. (c) As the sensor data can provide direct information regarding specific product attributes to the designers, this has the potential of reducing (and even eliminating) the designers' cognitive load and mental model bias, by providing actionable insights to designers. Direct information obtained about product attributes eliminates the sole reliance on a designer to map consumer information to the design space. (d) As the user-product interaction data obtained from sensors are mapped onto a network of interconnected judgments (psychological constructs), the designer obtains a hierarchical model. At the lowest level of this model is the user-product interaction data and at the highest level is user-product perception. The model is intended to help the designer in understanding the cause behind the perception and map it to the product attribute most significantly affecting consumer perceptions.
A shoe case study is used to demonstrate the framework effectiveness and test our hypotheses. Using the case study, we learned and demonstrated that sensor signals can be used to model user perceptions in a more effective manner than using survey data alone. Using the case study, we also demonstrate how various psychological constructs can be used to develop a model for user perceptions using the SEM technique. The cyber-empathic model revealed sensor-based groups that were different from those we identified in the shoe design literature, which influenced our survey questions. This leads to a need to study and analyze the causality of the new shoe regions identified using sensors and associated engineering attributes on user perception by product manipulation.
While the insights and results are very encouraging, there remains significant potential for further studies and advances. Currently, we have demonstrated the mapping of sensor information onto a network of psychological constructs and have shown model improvement over a standard survey-based model. This provides a foundation for future studies that explicitly test design inferences using the sensor information and psychological constructs. Specifically, there is potential to manipulate product features and then measure the changes in perception using sensors and psychological construct models.
In addition, the shoe case study used in this work has limited complexity, with a limited number of sensors integrated. A more complex case study is needed to study the effectiveness of the framework as product complexity scales. A more complex case study would also facilitate the integration of big data methods with the cyber-empathic framework.
Integration of big data techniques would allow the framework to incorporate data from a variety of heterogeneous sensors. Sensors that are not only embedded in the products but also attached to the users (e.g., a heart rate monitor) may lead to a better understanding of users' perceptions. Research will be conducted to incorporate different types of sensors capturing various modalities of interaction for use with the proposed model.
Also, this work is limited to only linear interactions among the psychological constructs. However, there is a possibility that these relationships are nonlinear in nature. Therefore, the framework presented here should be extended to incorporate nonlinear relationships among the constructs.
Finally, there are a number of advances to be made on the structural model itself. While MIMC is used to model the perceived comfort construct, a more theoretical analysis is needed to validate its broader usefulness. We also see an opportunity to integrate machine learning algorithms with the cyber-empathic framework to provide more effective feature extraction and integration of sensors as explored in Ref. [77].