Abstract
Engineering design involves information acquisition decisions such as selecting designs in the design space for testing, selecting information sources, and deciding when to stop design exploration. Existing literature has established normative models for these decisions, but there is a lack of knowledge about how human designers make these decisions and which strategies they use. This knowledge is important for accurately modeling design decisions, identifying sources of inefficiencies, and improving the design process. Therefore, the primary objective of this study is to identify models that provide the best description of a designer’s information acquisition decisions when multiple information sources are present and the total budget is limited. We conduct a controlled human subject experiment with two independent variables: the amount of fixed budget and the monetary incentive proportional to the saved budget. Using the experimental observations, we perform Bayesian model comparison on various simple heuristic models and expected utility (EU)-based models. As expected, the subjects’ decisions are better represented by the heuristic models than by the EU-based models. While the EU-based models result in a better net payoff, the heuristic models used by the subjects generate better design performance. The net payoff from the heuristic models is closer to that of the EU-based models in experimental treatments where the budget is low and there is an incentive for saving the budget. This indicates the potential for nudging designers’ decisions toward maximizing the net payoff by setting the fixed budget at low values and providing monetary incentives proportional to the saved budget.
1 Introduction
The engineering design process is recognized as an iterative decision-making process, which proceeds in stages, with each stage marked by decisions such as material selection, design evaluation, manufacturing process selection, etc. [1]. A particular class of design decisions is information acquisition decisions, which includes decisions such as whether to gain more information about a concept, whether to execute a simulation, how much to refine a model, and which model to choose. The understanding of how a designer makes information acquisition decisions is important for improving the outcomes of design processes, e.g., effectively managing the trade-off between the design performance and the cost of information gathering. This understanding can allow researchers to predict the outcomes of engineering design and systems engineering processes (e.g., Refs. [2,3]), to identify human-related sources of inefficiencies such as cognitive biases, and to find ways to reduce inefficiencies.
Despite extensive research on decision-making in design, there is a lack of quantitative descriptive models of information acquisition decisions that represent how humans actually make such decisions in engineering design and systems engineering. Existing studies in design decision-making have focused on normative frameworks (e.g., Refs. [4–7]), such as the expected utility (EU) theory [8], which assumes that designers are rational decision makers. However, it is well known that humans do not necessarily follow the normative models of decision-making [9–11]. Researchers in cognitive psychology and behavioral economics have developed various descriptive models of human decision makers [12,13]. Examples of these descriptive models include bounded rationality-based models [14], fast and frugal heuristics [13], models based on deviations from rationality [12], and cognitive architecture-based models [15]. Although existing descriptive models are alternatives to the normative models, they do not account for the nuances of information acquisition decisions in engineering design. For example, engineering design decisions require comparisons between multiple information sources (e.g., simulation and physical prototypes), and they are constrained by budget, time, and resources. Hence, the primary objective of this paper is to identify models that provide the best description of a designer’s information acquisition decisions when multiple information sources are present and the total budget is limited.
The approach followed in this paper consists of (i) designing a simple, but nontrivial, experimental task representative of the sequential information acquisition process, (ii) collecting experimental evidence on individuals’ decisions, (iii) formulating alternative models of decision strategies, and (iv) performing Bayesian model comparison for identifying the best-fit models and estimating the posterior distribution over the model parameters for quantification of treatment effects (see Fig. 1). A within-subject controlled experiment is useful for evidence collection as it mitigates the influence of external factors [16] and maintains a degree of realism by using humans as designers and real money as incentives [17]. A controlled experiment, as opposed to protocol analysis, elicits decision data in a nonintrusive manner. The decision models used in the paper incorporate strategies ranging from simple heuristics to expected utility-based judgments. This approach is adaptable to different design situations by changing the experiment task and associated candidate models.
To bound the scope of the paper, we make a few simplifying assumptions. The design task in the experiment consists of a one-dimensional parametric design problem with a continuous design space. A designer’s problem-specific domain knowledge does not have significant bearing on their sequential information acquisition decisions, and they adhere to the same decision strategy until stopping. The analysis approach and the models of the individual decisions are, however, general and can be applied to more complex design decision-making problems. Despite the simplifications, the experimental task represents a surrogate problem that embodies certain characteristics of real design problems. It requires evaluation of different designs using multiple information sources, assessing their values, and deciding the best design [1], all under a fixed budget. The best design is not known until information search, processing of the acquired information, and stopping are completed [18].
The primary contribution of this paper is an approach that combines computational models of decision-making with behavioral experiments to understand human decision-making behavior in design under uncertainty. The paper points to specific heuristic models that describe the subjects’ information acquisition decisions better than the counterpart expected utility-based models. Another contribution is the insights about nudging toward cost-effectiveness by fixing the budget at low values and/or using an incentive to reduce budget spending. Systems engineers can leverage these insights for balancing the trade-off between performance and design evaluation costs in their design processes.
The paper is organized as follows. Section 2 introduces the sequential information acquisition process, which we use as a basis for the design of the experiment task. Section 3 provides a rationale for various descriptive decision models. Section 4 presents the experimental treatments, subject population, and payment structure. Sections 5 and 6 provide the results and discuss their implications in the engineering design context.
2 Information Acquisition in Engineering Design
We begin with an abstraction of the design process called a sequential information acquisition and decision-making process [19], where design is considered a problem-solving activity with known design parameters and evaluation criteria but unknown mapping between the two.
2.1 Sequential Information Acquisition Decisions.
A designer’s objective is to find the design parameter values that maximize the performance (see Fig. 2). To achieve this objective, the designer performs iterative evaluations of the design performance. At each iteration, the designer makes the following three decisions: (1) a decision to choose the next design, (2) a decision to choose an information source for performance evaluation, and (3) a decision of whether to stop evaluations.
The process is constrained by a fixed budget, which limits the number of design evaluations. The budget type may be financial (e.g., fixed cash or capital) or technical (e.g., fixed computational resources or energy) [20]. There are many examples of such a design situation. For example, in the control problem for a room heating system, a designer finds the temperature set point that minimizes energy consumption while maintaining thermal comfort [21]. Another example is the design of superconducting materials such as Cu$_x$Bi$_2$Se$_3$, where a designer finds the dopant composition ($x$) that maximizes superconductivity through a series of magnetization experiments [22].
The process of iterative design evaluations, here referred to as information acquisition, is typically performed with the help of multiple prototypes that serve as information sources with different costs and uncertainty. Various front-end methods for quantifying the uncertainty associated with information sources are available, e.g., probability distribution fitting on performance data, Delphi approach to elicit expert knowledge, and evidence theory or information gap theory to model information deficit [23].
Based on the preceding specifications, we model the case of multiple information sources as follows. The set $\mathcal{X}$ denotes the design space, and $x \in \mathcal{X}$ denotes a point in the design space. The performance function $f(x)$ is a scalar function of the design, i.e., $f: \mathcal{X} \to \mathbb{R}$. However, the value $f(x)$ is not directly observable. A designer can obtain information about $f(x)$ through the query of a costly and uncertain information source. We assume that the designer has access to $M \ge 1$ such information sources. The information source labeled by $m \in \{1, \ldots, M\}$ has a cost $c_m \ge 0$. When this information source is evaluated at a point $x$, it reports a performance measurement $y = f(x) + \epsilon_m$, where $\epsilon_m$ is a random variable representing the measurement uncertainty.
Because the designer performs information acquisition sequentially, at each step of the process the designer chooses a design point and an information source to query based on their current state of knowledge about the performance function and the measurement uncertainties. Mathematically, at iteration $i$, the designer evaluates the information source $m_i$ at a point $x_i$ using the observed history of evaluations $H_{i-1}$, where $H_{i-1} = \{(x_j, m_j, y_j)\}_{j=1}^{i-1}$, along with any prior beliefs (see Fig. 2). For a parallel design procedure, in contrast, the designer would query multiple pairs of information sources and designs at each step without incorporating learning derived from the current knowledge of performance observations. Although the design process may comprise parallel and sequential queries [24,25], we focus on sequential queries as a first step toward modeling information acquisition in engineering design.
After observing the performance at the end of each step, the designer decides whether to continue or stop the evaluations. If the decision is to continue, the designer evaluates the performance function at a new point. The designer cannot perform an additional evaluation if the cost of querying the information source is larger than the available budget. If the decision is to stop, the outcome of the design process is the most recent state of knowledge about $f$, represented as a probability measure over $f$.
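The sequential process described above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not the experiment's implementation: the performance function `f`, the rule inside `toy_designer`, and all function names are hypothetical, while the costs (2 and 8), the measurement variances (10 and 0.0), and the budget of 20 are taken from the experimental settings reported in Sec. 4.1.

```python
import math
import random

def query(f, x, noise_sd, rng):
    """Return one noisy measurement y = f(x) + eps, with Gaussian eps."""
    return f(x) + rng.gauss(0.0, noise_sd)

def run_process(f, sources, budget, designer, rng):
    """Iterate until the designer stops or the next query is unaffordable.

    sources maps a label to (cost, noise_sd). designer(history, remaining)
    returns (x, label, stop_after) and stands in for the human subject.
    """
    history, remaining = [], budget
    while True:
        x, label, stop_after = designer(history, remaining)
        cost, noise_sd = sources[label]
        if cost > remaining:  # query cost exceeds the available budget
            break
        history.append((x, label, query(f, x, noise_sd, rng)))
        remaining -= cost
        if stop_after:
            break
    return history, remaining

def toy_designer(history, remaining):
    """Hypothetical rule: three low-fidelity probes, then one high-fidelity test."""
    probes = [-5.0, 0.0, 5.0]
    if len(history) < len(probes):
        return probes[len(history)], "L", False
    best_x = max(history, key=lambda h: h[2])[0]  # best observation so far
    return best_x, "H", True

f = lambda x: -(x - 2.0) ** 2                          # illustrative performance function
sources = {"L": (2, math.sqrt(10.0)), "H": (8, 0.0)}   # (cost, noise std dev)
history, remaining = run_process(f, sources, 20, toy_designer, random.Random(0))
```

Here the three decisions (next design, information source, stopping) are bundled into one callable so that alternative decision strategies can be swapped in without touching the budget bookkeeping.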
2.2 Design of the Experimental Task.
To operationalize the sequential information acquisition process, we designed an experimental task using specific assumptions about the nature of the design space, the information sources, and the fixed budget. The task involves a subject making design decisions and a user interface that processes the acquired information in the back end and displays the state of knowledge about the design performance in a visual format. The roles of the subject and the user interface are separated to maintain uniformity in how different subjects process the acquired information. The incentives are proportional to the outcomes of the process to motivate the subject to maximize the design performance.
Continuous design space
The design performance f(x) is a scalar continuous function of a single design parameter x.
Sequential evaluation process
Subjects evaluate multiple designs sequentially. Each evaluation takes one unit of time to run, during which the subject may not begin another evaluation.
Multiple uncertain information sources
Subjects evaluate the performance using either a low-fidelity or a high-fidelity information source (M = 2).
We denote the low-fidelity information source by $m = 1$ or “L” and the high-fidelity information source by $m = 2$ or “H.” We denote the total number of low (high)-fidelity observations at step $i$ by $n_{i,L}$ ($n_{i,H}$). It is $n_{i,m} = \sum_{j=1}^{i} 1_{\{m\}}(m_j)$ for $m \in \{L, H\}$, where $1_A(\cdot)$ is the indicator function of the set $A$. An example of a low-fidelity source is a stochastic computer-based simulation with large uncertainty due to approximations such as discretization of the design space, computational limitations, and errors from theoretical inadequacy. An example of a high-fidelity source is a physical prototype with relatively low uncertainty due to manufacturing defects or machining tolerances when preparing a test specimen. Simulations and prototype tests in this scenario assume aleatory uncertainty, in that they generate different observations from different evaluations of the same design point.
Gaussian measurement uncertainty
The measurement process is modeled as a Gaussian random variable centered at the true (but unknown) performance function.
Known costs
The evaluation of the performance at any design point costs a fixed amount, which is known a priori to the designer. The cost in this scenario is tied to the definition of budget, which may mean financial or technical resources.
If $c_L$ and $c_H$ are the costs of the low-fidelity and high-fidelity observations, respectively, then $c_L < c_H$.
Visualization of the state of knowledge
The user interface visualizes the state of knowledge by displaying the mean estimate of the true performance together with the 5th and 95th percentiles. This way, the acquired information is processed and visualized in the same manner for all subjects, and observing the subjects’ decisions remains the focus of the experimental task.
Fixed budget
Each subject has a total budget of $B$ for performance evaluations. A subject may stop before exhausting the entire budget $B$, so the total cost incurred, $C_i = c_L n_{i,L} + c_H n_{i,H}$, is less than or equal to $B$.
Performance-based payment
The subject’s payment includes a fixed payment, a bonus proportional to the best high-fidelity observation, and a bonus proportional to the budget saved.
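The cost and payment rules in the list above can be illustrated with a few lines of arithmetic. This is a sketch only: the default costs match the values reported in Sec. 4.1, but the fixed payment and the two proportionality rates (`perf_rate`, `save_rate`) are hypothetical placeholders, since their values are not stated here.

```python
def total_cost(n_low, n_high, c_low=2, c_high=8):
    """Cost incurred after n_low low-fidelity and n_high high-fidelity evaluations."""
    return c_low * n_low + c_high * n_high

def payment(fixed, best_high, saved, perf_rate, save_rate):
    """Fixed payment plus bonuses proportional to performance and to saved budget."""
    return fixed + perf_rate * best_high + save_rate * saved

budget = 20
cost = total_cost(n_low=3, n_high=1)   # 3 * 2 + 1 * 8
assert cost <= budget                  # the fixed-budget constraint
saved = budget - cost

# Hypothetical rates: no bonus for saved budget versus a small bonus.
pay_lose_it = payment(5.0, 10.0, saved, perf_rate=0.1, save_rate=0.0)
pay_save = payment(5.0, 10.0, saved, perf_rate=0.1, save_rate=0.05)
```

Setting `save_rate` to zero mimics a setting in which unspent budget is discarded, whereas a positive `save_rate` rewards stopping early.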
3 Formulation of Descriptive Decision Models
The descriptive models of information acquisition decisions fall on a spectrum with expected utility theory on one end and simple heuristics on the other end. Models closer to the EU theory embody rational judgments such as where the expected information gain is maximum and whether maximum performance has been achieved. They are based on criteria such as the probability of improvement (PI) [26], the expected improvement (EI) [27], the expected conditional improvement (ECI) [28], and the maximum-region entropy. Models closer to simple heuristics use cues from the environment (e.g., the user interface in the experiment). Examples of such cues are the predictive mean, the variance, the remaining budget, and the number of evaluations. Simple heuristic models of the information acquisition decisions include the upper confidence bound (UCB) [29], the conditional UCB (CUCB), the fixed sample number (FSN), the fixed remaining budget (FRB), and the dominant physical prototype (DPP). Table 1 lists the descriptive models used in this paper and their underlying type (expected utility-based or simple heuristic). These models are inspired by the literature, observations of past experiments [30], and survey responses in the experiment detailed in Ref. [31].
Decision model | Underlying strategy | Model type |
---|---|---|
1. Decision to choose next design | ||
a. UCB [29] | Explore the design space during initial iterations and exploit during later iterations. | Simple heuristic |
b. Probability of improvement (PI) [26] | Selection probability proportional to PI value. | EU based |
c. EI [27] | Selection probability proportional to EI value. | EU based |
d. ECI [28] | Selection probability inversely proportional to ECI value. | EU based |
2. Decision to choose information source | ||
a. FSN | Test high-fidelity source after a fixed number of samples. | Simple heuristic |
b. FRB | Test high-fidelity source if the remaining budget is smaller than a fixed value. | Simple heuristic |
c. Fixed maximum-region entropy (FME) | Test high-fidelity source if the information entropy of the location of maximum is smaller than a fixed value. | EU based |
d. Fixed expected conditional improvement (FECI) | Test high-fidelity source when the difference between EI from one step and EI from two steps is smaller than a fixed value. | EU based |
3. Decision to stop | ||
a. FSN | Stop after a fixed number of samples. | Simple heuristic |
b. FRB | Stop after a fixed amount of budget is remaining. | Simple heuristic |
c. DPP | Stop when the best high-fidelity measurement minus the largest predictive mean is smaller than a fixed value. | Simple heuristic |
d. FME | Stop after the entropy of the location of maximum is smaller than a fixed value. | EU based |
e. FEI [32] | Stop after EI is below a fixed value. | EU based |
The definition of a descriptive decision model involves two stages, (i) formulating a decision strategy as an acquisition function or a feature of observed history and (ii) modeling deviation from the strategy using a likelihood function. Acquisition functions and features are deterministic models that predict decisions for a given decision strategy, while likelihood functions, with their model parameters, impose a layer of uncertainty around those predictions. Such a construct assumes that designers are likely to make errors and deviate from predicted decisions, irrespective of whether their underlying strategies are EU based or heuristic based. For the EU-based models, the assumption of probabilistic decisions mirrors the limited cognitive ability of designers to make accurate decisions even though their judgments may be aligned with rational judgments.
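The two-stage construction can be illustrated for an EI-based strategy. In this sketch, stage (i) is the standard closed-form expected improvement of a Gaussian prediction, and stage (ii) is a softmax likelihood over a finite set of candidate designs. The softmax form, the discretization into candidates, and the names `expected_improvement`, `choice_probabilities`, and `rate` are assumptions made for illustration; the paper's own likelihood functions are not reproduced here.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf

def expected_improvement(mean, sd, best):
    """EI of a Gaussian prediction N(mean, sd^2) over the incumbent `best` (maximization)."""
    if sd <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / sd
    return (mean - best) * _STD_NORMAL.cdf(z) + sd * _STD_NORMAL.pdf(z)

def choice_probabilities(scores, rate):
    """Softmax over acquisition scores; `rate` sets how closely the strategy is followed."""
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(rate * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Three hypothetical candidate designs with predictive means and std devs.
means, sds, best = [1.0, 2.0, 1.5], [1.0, 0.2, 0.8], 2.0
scores = [expected_improvement(m, s, best) for m, s in zip(means, sds)]

probs_random = choice_probabilities(scores, rate=0.0)    # uniform: strategy ignored
probs_sharp = choice_probabilities(scores, rate=500.0)   # concentrates on the best score
```

With a rate of zero every candidate is equally likely, and as the rate grows the probability mass moves to the candidate with the largest acquisition value, so one parameter spans the range from random choice to strict adherence.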
3.1 Modeling the Decision to Choose the Next Design.
3.1.1 Upper Confidence Bound.
3.1.2 Probability of Improvement.
3.1.3 Expected Improvement.
3.1.4 Expected Conditional Improvement.
3.2 Models of the Decision to Choose an Information Source.
In a threshold-based decision model, we always include a constant, negative basis function because the difference between the weighted sum and a threshold determines the decision strategy. Furthermore, we assume that each designer’s strategy relies upon a single element of history, and the decision model uses two features (Rm = 2), one more in addition to the constant one. This assumption reflects that people’s cognitive ability is limited, and they do not consider all the relevant information while making decisions [33].
3.2.1 Fixed Sample Number.
3.2.2 Fixed Remaining Budget.
In this model, the remaining budget determines the choice between the two information sources. Low-fidelity observations, if any, are collected during initial iterations until a fixed amount of remaining budget is left, and high-fidelity observations are collected thereafter.
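A threshold rule of this kind can be sketched with a logistic likelihood over two basis functions, the constant one and the remaining budget. The logistic form, the sign convention, and the weight values below are assumptions for illustration only. With the constant basis function equal to -1, the ratio `w_const / w_feature` plays the role of the threshold on the remaining budget.

```python
import math

def prob_high_fidelity(remaining, w_const, w_feature):
    """Logistic probability of choosing the high-fidelity source.

    The linear score is w_const * (-1) + w_feature * remaining, which equals
    w_feature * (remaining - w_const / w_feature).
    """
    score = w_const * (-1.0) + w_feature * remaining
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical weights: a negative w_feature makes high fidelity more likely
# as the remaining budget shrinks; the implied threshold is (-5.0) / (-0.5) = 10.
w_const, w_feature = -5.0, -0.5
threshold = w_const / w_feature

p_early = prob_high_fidelity(18.0, w_const, w_feature)  # plenty of budget left
p_at = prob_high_fidelity(10.0, w_const, w_feature)     # exactly at the threshold
p_late = prob_high_fidelity(4.0, w_const, w_feature)    # little budget left
```

The magnitude of `w_feature` controls how sharp the switch is: larger magnitudes approach a deterministic threshold, while values near zero make the source choice nearly random.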
3.2.3 Fixed Maximum-Region Entropy.
This strategy is based on the judgment of whether the region of the function maximum has been sufficiently identified given the observed history. It is assumed that once the information entropy of the posterior probability density of the location of the performance maximum falls to a fixed value, the designer starts design evaluations using the high-fidelity information source.
3.2.4 Fixed Expected Conditional Improvement.
3.3 Models of the Decision to Stop.
3.3.1 Fixed Sample Number.
3.3.2 Fixed Remaining Budget.
3.3.3 Dominant Physical Prototype.
3.3.4 Fixed Expected Improvement.
3.4 Conditional Decisions to Choose the Next Design and to Choose the Information Source.
The decisions of choosing the next design and choosing the information source can be interdependent. For example, a designer may use the low-fidelity information source to evaluate design points with large uncertainty (for exploration) and the high-fidelity information source to evaluate design points closer to regions of large performance and relatively low uncertainty (for exploitation). We call this model the CUCB model.
4 A Controlled Experiment to Elicit Design Decisions
For parameter estimation and comparison of the decision models, we gathered decision data by conducting an experiment with the experimental task presented in Sec. 2.2.
4.1 Subjects, Treatments, and Payment.
A total of 63 student subjects were recruited from an introductory undergraduate-level machine design course. Participation was voluntary and did not count toward the students’ grades.
Each subject performed 18 runs of the experimental task, each with a distinct unknown performance function. A run of the experimental task is called a period. The objective in each period was to find the maximum of an unknown function. For every iteration in each period, subjects made three decisions: (i) the decision to choose $x$, (ii) the decision to choose an information source, and (iii) the decision about whether to stop. The 18 distinct functions were randomly generated prior to the experiment and were fixed for all subjects. The assignment of these functions to periods was randomized for each subject to minimize potential confounding between functions and treatments. Some parameters of the experimental task were fixed. In particular, the design evaluation costs were $c_L = 2$ and $c_H = 8$, the measurement variances were $v_L = 10$ and $v_H = 0.0$, and the design space was [−10, 10]; the fixed minimum payment was also held constant. The user interface is shown in Fig. 3. To make the task easier for the subjects to understand, we referred to the low-fidelity information source as a computer simulation and the high-fidelity information source as a physical prototype.
The experiment was divided into three parts:
Trial part (two periods): The first part involved two trial periods to help the subjects familiarize themselves with the user interface before starting the actual experiment. The outcomes of these trial periods did not count toward the subjects’ payment.
Use-it-or-lose-it part (nine periods): For this part, the subjects were allocated a fixed budget per period. Any remaining budget was discarded and not added to the subject’s payment. In this part, a subject evaluated nine unknown functions in nine periods, with three functions each for three treatments of fixed budget per period: (i) treatment T1 : B = 20, (ii) treatment T2 : B = 40, and (iii) treatment T3 : B = 60.
Save-remaining-budget part (nine periods): For this part, the subjects were allocated a fixed budget per period and any remaining budget at the end of every period was added to payment as a bonus Hb. Subjects evaluated nine unknown functions in nine periods, with three functions each for the three treatments of fixed budget per period: (i) treatment T4 : B = 20, (ii) treatment T5 : B = 40, and (iii) treatment T6 : B = 60.
At the end of the above three parts (six treatments), the subjects completed a survey on their computer screen where they responded to three questions asking them to list the strategies they used for the three decisions.
The order of treatments was varied across the subjects to control for order effects [35]. The four different orders of six treatments were as follows: (i) T1 − T2 − T3 − T4 − T5 − T6, (ii) T3 − T2 − T1 − T6 − T5 − T4, (iii) T4 − T5 − T6 − T1 − T2 − T3, and (iv) T6 − T5 − T4 − T3 − T2 − T1.
4.2 Data Acquisition.
We collected data on the choice of design point, $x_i$, the choice of information source, $m_i \in \{L, H\}$, and the choice of stopping, $s_i$, which is 0 if the subject stopped after the iteration and 1 otherwise. We also recorded related quantities such as the gross payoff, the fixed budget, the functional performances, and the iteration index $i$ associated with every evaluation. In addition, we recorded the text of the subjects’ survey responses.
The descriptive statistics for the different treatments are shown in Table 2. As the fixed budget increased, the subjects performed more iterations and made more high-fidelity evaluations. When the incentive to save budget was in place, the number of iterations decreased. A higher number of iterations resulted in better performance on average as well as higher costs, as shown in Fig. 4, but the net payoff (performance minus cost) was lower. Net payoff increased with the incentive to save budget. These observations highlight the usefulness of a high fixed budget for improving design performance and that of the incentive to save budget for reducing spending and improving net payoff.
Attribute | T1 | T2 | T3 | T4 | T5 | T6 |
---|---|---|---|---|---|---|
Number of subjects | 63 | 63 | 63 | 63 | 63 | 63 |
Total number of decisions | 1292 | 2100 | 2602 | 1155 | 1608 | 1743 |
Number of iterations per period | 6.8 (±1.7) | 11.1 (±3.5) | 13.8 (±4.7) | 6.1 (±1.6) | 8.5 (±2.9) | 9.2 (±3.0) |
Number of high-fidelity information sources per period | 1.0 (±0.6) | 2.7 (±1.2) | 4.3 (±1.8) | 1.0 (±0.5) | 1.8 (±1.1) | 2.3 (±1.7) |
Duration of iteration (s) | 11.5 (±8.7) | 11.1 (±6.6) | 11.6 (±7.6) | 11.5 (±8.1) | 10.6 (±6.1) | 11.1 (±9.3) |
Duration of period (s) | 76.7 (±30.8) | 123.6 (±54.2) | 159.3 (±70.0) | 68.9 (±28.4) | 89.4 (±40.6) | 101.9 (±50.3) |
5 Bayesian Model Comparison of the Descriptive Decision Models
In this section, we describe the variational Bayes approach used to approximate the posterior distributions of the model parameters and to estimate lower bounds on the marginal log-likelihoods of the decision models conditional on the experimental data [36]. The variational Bayes approach is useful in complex stochastic models where analytical forms of the posterior distributions are intractable. The model evidence lower bound (ELBO), that is, a lower bound on the log marginal likelihood of the data under a model, quantifies the support for a model, i.e., the accuracy with which a model represents the experimental data. We compute a separate ELBO for each candidate model $M_j$.
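For reference, the relation between the ELBO and the marginal log-likelihood can be written out as the standard variational identity. The symbols $D$ (experimental data), $\theta$ (model parameters), and $q$ (variational approximation to the posterior) are notation introduced here for illustration and may differ from the paper's own.

```latex
\log p(D \mid M_j)
  = \underbrace{\mathbb{E}_{q(\theta)}\big[\log p(D, \theta \mid M_j)\big]
      - \mathbb{E}_{q(\theta)}\big[\log q(\theta)\big]}_{\text{ELBO}}
    + \mathrm{KL}\big(q(\theta) \,\|\, p(\theta \mid D, M_j)\big)
  \;\ge\; \text{ELBO}
```

Because the Kullback-Leibler term is nonnegative, maximizing the ELBO over $q$ tightens the bound on the log evidence and simultaneously brings $q$ closer to the true posterior.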
It is assumed that all models are equally likely to represent the data a priori. To facilitate more intuitive explanation of the parameter estimates in the results, we transform the likelihood functions in Eqs. (15), (21), and (27) and take the weight parameters wm,2 and ws,2 out of the summation. This way the threshold parameters are given by wm,1/wm,2 and ws,1/ws,2.
A hierarchical form of a decision model is fitted to the experimental data for each treatment group separately, as described in Fig. 5. Hyperparameters are parameters of the prior distributions over model parameters. They characterize the group-level preferences of the subject population. The prior distributions of model parameters are functions of samples from hyperpriors, that is, priors over the hyperparameters. Subject-specific model parameters are independent samples from these prior distributions. With this setup, the group-level treatment effects and individualized treatment effects are inferred from the posterior distributions of the hyperparameters and the model parameters, respectively. See Table 3 for the hyperpriors and the prior distributions.
Decision to choose | Decision to choose | Decision to choose | |
---|---|---|---|
Hyperpriors over hyperparameters | for CUCB | for DPP | |
Priors over model parameters | for FME, FRB, FECI, CUCB for FSN for FME, FECI, CUCB for FSN, FRB | for FME, FRB, FECI FSN, DPP for FME, FEI for FSN, DPP, FRB |
The posterior distribution approximations and the ELBO for the decision models were estimated using the automatic differentiation variational inference algorithm [38] in the PyMC3 module for Python [39]. The algorithm was run for 50,000 iterations, and the last 5000 iterations were used to calculate the average ELBO.
5.1 Estimates of Model Evidence Lower Bound.
The positive values of the estimated model ELBOs relative to random sampling in Fig. 6 show that the predictions of the decision models are more accurate than random predictions. The ELBO of random sampling for the decision to choose the next design is $N \log(1/20)$, assuming a uniform distribution over the design space [−10, 10], where $N$ is the training data size. It is $N \log(0.5)$ for the decisions to choose an information source and to stop, assuming a probability of 0.5 for each of the two alternatives in both decisions. We can compare the support for any two models, say $j_1$ and $j_2$, by comparing their ELBOs relative to random sampling, because the ELBO of random sampling remains constant in a given treatment.
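The random-sampling baselines described above follow directly from the log-likelihood of independent draws and can be computed as below. The function names, the sample size of 100, and the model ELBO of -250 are hypothetical values chosen only to show the arithmetic.

```python
import math

def random_design_log_lik(n, lower=-10.0, upper=10.0):
    """Log-likelihood of n design choices under a uniform density on [lower, upper]."""
    return n * math.log(1.0 / (upper - lower))

def random_binary_log_lik(n):
    """Log-likelihood of n binary decisions when each alternative has probability 0.5."""
    return n * math.log(0.5)

def relative_elbo(model_elbo, baseline):
    """ELBO of a model measured relative to the random-sampling baseline."""
    return model_elbo - baseline

n = 100                                   # hypothetical training data size
baseline_design = random_design_log_lik(n)
baseline_binary = random_binary_log_lik(n)
gain = relative_elbo(-250.0, baseline_design)  # -250.0 is a made-up model ELBO
```

A positive relative value means the model assigns more probability to the observed decisions than uniform guessing does, and differences between two models' relative values equal the differences between their raw ELBOs.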
Result 1. For the decision to choose the next design, the UCB model and the CUCB model have the highest ELBOs.
From Result 1 and Fig. 6, we conclude that exploration during initial iterations followed by exploitation during later iterations, as captured by the UCB and CUCB models, is the most likely strategy for choosing the next design point. In the low-budget treatments T1 and T4, the ELBO of the CUCB model is the highest, suggesting that the selection of a design point and the selection of an information source are interdependent. The subjects use low-fidelity observations for exploration and high-fidelity observations for exploitation.
Result 2. For the decision to select an information source, the FSN model and the CUCB model have higher ELBOs than the alternative models. The FSN model has the highest ELBO at low budget, whereas the CUCB model has the highest ELBO at medium and high budgets.
Selecting the first high-fidelity observation after a fixed number of iterations is the most likely strategy at low budget. However, with a higher total budget, subjects also rely on the predictive mean to decide whether to choose the high-fidelity information source. Evaluating high-fidelity observations at locations closest to the highest predictive mean is the most likely strategy for medium and high budgets.
Result 3. For the decision to stop, the FRB model has the highest ELBO in all treatments, except in the medium- and high-budget treatments of the “save-remaining-budget” part, where the DPP model has an ELBO similar to that of the FRB model.
According to the results in Fig. 6, the subjects stopped after exhausting all or part of the available budget in treatments T1, T2, T3, and T4. However, at medium and high budget in the “save-remaining-budget” part (treatments T5 and T6), the subjects stopped when the existing best performance from high-fidelity observations was close to the highest mean prediction of the performance.
Results 1–3 also hold for the test dataset, based on the computed lppd metrics in Fig. 6 and the prediction accuracy scores in Fig. 7. Both a larger lppd and a larger accuracy score imply better support for a model. The differences in prediction accuracy between models are substantial given that lppd is defined on the logarithmic scale.
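For readers unfamiliar with the metric, the following sketch computes the log pointwise predictive density in its standard form: for each test observation, the likelihood is averaged over posterior draws, and the logarithms of these averages are summed. This is an illustrative sketch that may differ in detail from the computation used in the study; the function name and the toy input are hypothetical.

```python
import math


def lppd(log_lik):
    """Log pointwise predictive density.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and test point i.
    """
    n_draws = len(log_lik)
    n_points = len(log_lik[0])
    total = 0.0
    for i in range(n_points):
        column = [log_lik[s][i] for s in range(n_draws)]
        shift = max(column)  # log-sum-exp shift for numerical stability
        mean_density = sum(math.exp(v - shift) for v in column) / n_draws
        total += shift + math.log(mean_density)
    return total


# Toy input: two posterior draws, two test points.
draws = [
    [math.log(0.5), math.log(0.25)],
    [math.log(0.5), math.log(0.75)],
]
example_lppd = lppd(draws)
```

Since the metric is a sum of logarithms, a difference of d between two models corresponds to a factor of exp(d) in the joint predictive density of the test points.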
5.2 Effects of Fixed Budget and Payment Incentives.
The amount of fixed budget and the incentive-to-save-budget affect the posterior distributions of model parameters and, by implication, the subjects’ strategies for information acquisition decisions. The exception is the rate parameter for the EI, PI, and ECI models, which remains largely constant. A small rate parameter means that the decision is random according to the model, whereas a large rate parameter means that the information acquisition function is closely followed. The posterior distributions suggest that the UCB and CUCB models have high rate parameters, which is consistent with the ELBO results. Appendix A.2 presents the posterior distributions of the model parameters. Specific observations about the subjects’ behaviors are as follows.
Result 4. Exploration of the design space increases with the fixed budget.
As observed in Fig. 8, the mean posterior estimates of the exploration scale α increase with the fixed budget between treatments. Note that the mean posterior estimates of α are similar at medium and high budget (treatments T5 and T6) in the “save-remaining-budget” part of the experiment. This implies that there is a reduction in exploration when the subjects have an incentive to reduce budget spending and when large savings are possible.
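To make the roles of the exploration scale and the rate parameter concrete, the following sketch assumes a generic upper-confidence-bound score (predictive mean plus α times predictive standard deviation) and a softmax choice rule whose sharpness is set by a rate parameter. The UCB and CUCB models of this study are specified earlier in the paper and may differ in detail; the function names, the candidate designs, and all numerical values here are hypothetical.

```python
import math


def ucb_scores(means, sds, alpha):
    """Generic UCB score: predictive mean plus alpha times predictive std. dev."""
    return [m + alpha * s for m, s in zip(means, sds)]


def softmax_choice_probs(scores, rate):
    """Choice probabilities proportional to exp(rate * score)."""
    top = max(scores)
    weights = [math.exp(rate * (s - top)) for s in scores]  # shifted for stability
    total = sum(weights)
    return [w / total for w in weights]


# Three hypothetical candidate designs.
means = [40.0, 55.0, 50.0]  # predictive means
sds = [10.0, 1.0, 4.0]      # predictive standard deviations

exploit = ucb_scores(means, sds, alpha=0.5)  # favors the high-mean design
explore = ucb_scores(means, sds, alpha=2.0)  # favors the uncertain design

near_random = softmax_choice_probs(explore, rate=0.0)  # uniform choice
sharp = softmax_choice_probs(explore, rate=10.0)       # follows the score closely
```

With a small α the highest score sits at the design with the highest predictive mean, whereas a larger α moves it to the most uncertain design. A rate of zero yields uniform choice probabilities, and a large rate concentrates the probability on the highest-scoring design.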
Result 5. The probability of selecting the high-fidelity information source increases as the fixed budget increases, except for medium and high budgets in the “save-remaining-budget” part.
We observe that the posterior distribution of the threshold parameter wm,1/wm,2 for the FSN model increases with increasing fixed budget and decreases with the incentive-to-save-budget. These posterior values on average equal the number of high-fidelity information sources provided in Table 2. The subjects choose a single high-fidelity observation after six samples in the low-budget treatments (treatments T1 and T4). The average number of high-fidelity observations increases from 2.6 in treatment T2 to 4.2 in treatment T3.
Result 6. The probability of stopping early at high values of remaining budget increases with an increase in fixed budget and with the incentive-to-save-budget.
The result follows from the posterior distribution of model parameters in the FRB model. Figure 9 shows the posterior probability of stopping as a function of the remaining budget from the FRB model. The FRB model’s threshold parameter estimate for the “use-it-or-lose-it budget” part (mean posterior ws,1/ws,2 ≈ 1) is smaller than the cost of one high-fidelity observation (cL = 2), which implies that the subjects stop after exhausting almost the entire fixed budget. On the other hand, in the “save-remaining-budget” part, the mean estimates of ws,1/ws,2 increase as the fixed budget increases.
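One way to read the threshold ratio is through a logistic stopping rule, sketched below under the assumption that the probability of stopping is a sigmoid of the remaining budget with slope ws,2 and midpoint ws,1/ws,2. This functional form is an illustrative assumption and not necessarily the FRB model's definition, which is given earlier in the paper. The function name and the remaining-budget values are hypothetical; the parameter values are the FRB posterior means for treatments T1 and T6 listed in Appendix A.2.

```python
import math


def stop_probability(remaining_budget, slope, threshold):
    """Assumed logistic rule: stopping becomes likely once the remaining budget
    falls below `threshold`; `slope` sets how sharp the transition is."""
    return 1.0 / (1.0 + math.exp(slope * (remaining_budget - threshold)))


# Treatment T1: steep slope and low threshold, so stopping with budget left is unlikely.
p_t1 = stop_probability(6.0, slope=2.71, threshold=2.86)

# Treatment T6: shallow slope and high threshold, so early stopping is plausible.
p_t6 = stop_probability(20.0, slope=0.10, threshold=9.01)

# At the threshold itself, the assumed rule gives a stopping probability of 0.5.
p_mid = stop_probability(2.86, slope=2.71, threshold=2.86)
```

Under this reading, a larger threshold together with a shallower slope shifts stopping toward higher remaining budgets.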
6 Discussion
6.1 Accuracy of the Models of Designers’ Decisions.
The results indicate that the simple heuristic models represent designers’ decisions in the sequential information acquisition process more accurately than the expected utility-based models. No single model captures all strategies exactly; however, the heuristic models with the highest ELBOs provide the most accurate approximations to the subjects’ strategies. Because they predict the information acquisition decisions accurately, the heuristic models also predict the performance more accurately than the expected utility-based models, e.g., EI, FECI, and FEI. To verify the results, we performed 150 simulation runs of the sequential information acquisition process using both a triplet of the highest-ELBO heuristic models and a triplet of the expected utility-based models. At each iteration i of a run, we quantified the current belief about the design performance using the normalized highest predictive mean $\max_{1 \le j \le i} \mu_i(x_j)$. A comparison of the predictions of these quantities with their actual values in the test dataset in Fig. 10 confirms that the highest-ELBO heuristic triplet has better predictive strength than the expected utility-based triplet.
The heuristic models remain more likely to represent subjects’ decisions if the assumptions about the prior state of knowledge are changed. For example, when Gaussian priors with means 30 and 50 are implemented as the prior state of knowledge instead of the zero-mean Gaussian prior in Eq. (4), the ELBO of the CUCB model still remains higher than that of the EI-based model for choosing the next design. The ELBOs of the CUCB model in treatment T1 for means 0, 30, and 50, respectively, are 1783, 1722, and 1771, whereas those of the EI model are 1977, 1826, and 1850. A possible reason for the high ELBO of the CUCB model is the interdependence between the decisions of choosing the next design and choosing an information source. Such interdependence is inevitable as the subjects have few available cues for most of the decisions [33].
6.2 Implications for Engineering Design.
On the objective functions in the test dataset, the heuristic triplet model generates higher design performance (gross payoff) than the EU-based triplet model. Figure 11 plots the gross payoff (Eq. (5)) for both the heuristic triplet and the EU-based triplet models. We observe that the EU-based triplet has poorer gross performance, especially in treatment T3. The gross performance of the heuristic triplet model improves with a high fixed budget and without the incentive for saving budget. This is because, with a high fixed budget and without the incentive for saving budget, the heuristic triplet model completes a larger number of iterations, conducts more design exploration, and achieves better performance. Figure 12 provides evidence for this explanation. Therefore, if the goal is to maximize the gross design performance, system designers should allow designers to spend a large fixed budget without any incentive to reduce spending. Under this incentive structure, there is a greater chance of finding the design that maximizes the performance.
Despite the large gross payoff and the closeness to human design decisions, the heuristic models are less efficient in terms of the net payoff. More iterations of the heuristic triplet model result in a higher total cost and therefore reduce the net payoff, i.e., the difference between the achieved performance and the total cost incurred until stopping, as shown in Fig. 11. We observe that the EU-based triplet model provides a higher net payoff on average than the heuristic triplet model in all treatments. The difference in average net payoffs between the heuristic triplet model and the EU-based triplet model decreases with decreasing fixed budget and with the incentive to save budget.
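The trade-off can be illustrated with simple arithmetic. The snippet below assumes that the total cost is the sum of per-evaluation costs and that the net payoff is the gross payoff minus that cost. The unit costs, evaluation counts, and payoff values are hypothetical and were chosen only to show how a strategy with a higher gross payoff can end up with a lower net payoff.

```python
def total_cost(n_low, n_high, cost_low, cost_high):
    """Total cost of n_low low-fidelity and n_high high-fidelity evaluations."""
    return n_low * cost_low + n_high * cost_high


def net_payoff(gross_payoff, n_low, n_high, cost_low, cost_high):
    """Net payoff: achieved performance minus total cost incurred until stopping."""
    return gross_payoff - total_cost(n_low, n_high, cost_low, cost_high)


COST_LOW, COST_HIGH = 1.0, 5.0  # hypothetical unit costs

# A strategy that keeps exploring: higher gross payoff, many evaluations.
heuristic_net = net_payoff(90.0, n_low=10, n_high=4,
                           cost_low=COST_LOW, cost_high=COST_HIGH)

# A strategy that stops early: lower gross payoff, few evaluations.
eu_net = net_payoff(80.0, n_low=4, n_high=1,
                    cost_low=COST_LOW, cost_high=COST_HIGH)
```

In this toy case the first strategy gains 10 units of gross payoff but spends 21 more units on evaluations, so its net payoff is lower.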
If the goal is to maximize the net payoff, system designers should restrict the amount of fixed budget or implement monetary incentives proportional to the saved budget. The latter option is more viable than the former if the appropriate amount of the fixed budget cannot be determined. Under monetary incentives for reducing spending, not only are the designers more likely to maximize the net payoff but their decisions are also more likely to be aligned with the expected utility-based models. The prediction accuracy score of the FEI model on the test data is larger in treatment T6 with the incentive-to-save-budget than in treatment T3 without such an incentive. The same holds true for the training data, where the FEI model’s accuracy scores in treatments T3 and T6 are, respectively, 0.71 ± 0.02 and 0.76 ± 0.024 (p-value < 0.0001). Note that we implemented the incentive-to-save-budget by paying the subjects the entire remaining budget they saved. However, similar effects may be obtained by paying a smaller amount proportional to the remaining budget. This is because people take a comparative view of monetary benefits and prefer avoiding losses to acquiring equivalent gains; for them, failing to receive even a small potential benefit is a lost opportunity [9].
The implication for our understanding of human decision-making in engineering design is that designers may be more attentive to the design performance than to the cost of design evaluations or the difference between the two. Possible explanations of this observed gap include the high cognitive load associated with processing predictive uncertainty and with estimating the utility of the next design in relation to the cost of evaluation. It is also likely that the subjects are driven by intrinsic factors such as satisfaction from finding the best design and delivering the best outcomes for a given task. Further research is needed to determine the root causes of this observed trend.
7 Conclusion
In this paper, we present an approach that combines computational modeling and behavioral experiments to quantify designers’ decision strategies during the sequential information acquisition process. By using Bayesian inference, we observe that the heuristic models provide the best descriptions of subjects’ strategies for making information acquisition decisions. The subjects rely on simple cues accessible via graphical interfaces for making most of the information acquisition decisions. This reliance on simple cues may be attributed to the relatively small cognitive effort involved in using them. Moreover, the subjects’ decisions are affected by the amount of fixed budget and the incentives to save budget. For example, the subjects select design points close to the highest upper confidence bound (UCB and CUCB models) when seeking to maximize design performance. The subjects mostly select a fixed number of low-fidelity and a fixed number of high-fidelity observations (FSN model) at low budget. At larger budgets, they query the low-fidelity source for evaluating high-uncertainty regions (exploration in the CUCB model) and the high-fidelity source for low-uncertainty regions (exploitation in the CUCB model). For stopping evaluations, the subjects exhaust all or a fixed fraction of the fixed budget (FRB model), unless they are incentivized to save budget, in which case they stop if the current best performance is marginally better than the mean of the predicted performance (DPP model).
The insights from the analysis have implications for engineering design research and practice. With the models that incorporate simple heuristics, researchers can quantify design performance in terms of designers’ decision strategies, as illustrated in Fig. 10. Applications include design crowdsourcing, where game-theoretic models lack design process models [2,40], and agent-based models of engineering systems design, where characterization of quality as a function of designer effort is difficult to achieve [3]. Furthermore, system engineers and managers can set the fixed budget at low values or provide monetary incentives for reducing spending to nudge a designer’s decisions toward EU-based strategies, which are efficient for maximizing net payoff (design performance minus cost of evaluation).
The methodology used for eliciting decisions and estimating decision strategies is particularly suited for the embodiment phase of the design process. There is a need for further research to establish generalizability across contexts, problems, and populations. Some assumptions require further validation, e.g., Assumption 6, which states, based on the existing studies in Refs. [41–43], that Gaussian processes closely represent human information processing. Because this study utilizes a student population and a short-run decision-making process, more empirical evidence is required to establish the generalizability of the results to engineers as designers and to long-term design processes. Future descriptive modeling efforts need to account for context-dependent design situations, where decision strategies depend on the availability of problem-specific information or the lack thereof [44], an acceptable quantification of predictive uncertainty is absent [45], and the mapping between resources expended and the value of prototypes created varies across disciplines and knowledge domains [46]. Such design situations should include multiple objectives and/or multidimensional design parameters.
Acknowledgment
The authors gratefully acknowledge the financial support from the US National Science Foundation (NSF) CMMI (Grant No. 1662230).
Appendix
A.1 Design Performance Functions
Design performance functions are shown in Fig. 13.
A.2 Posterior Distributions of the Model Parameters
Posterior distributions of the model parameters are presented in Table 4.
Model | Parameter | T1 | T2 | T3 | T4 | T5 | T6 |
---|---|---|---|---|---|---|---|
Decision to choose design x | | | | | | | |
UCB | Rate parameter | 12.87 (1.37) | 12.08 (0.65) | 10.0 (0.46) | 11.18 (2.38) | 11.62 (0.96) | 10.87 (1.08) |
UCB | a or a′ | 5.37 (0.13) | 6.99 (0.14) | 7.12 (0.14) | 4.75 (0.12) | 6.24 (0.14) | 6.42 (0.15) |
UCB | b or b′ | 0.15 (0.03) | 0.11 (0.02) | 0.1 (0.03) | 0.17 (0.03) | 0.11 (0.03) | 0.11 (0.03) |
CUCB | Rate parameter | 12.81 (1.26) | 9.47 (0.75) | 6.96 (0.45) | 12.18 (1.11) | 9.58 (0.79) | 9.75 (0.84) |
CUCB | a or a′ | 5.72 (0.14) | 7.21 (0.15) | 7.8 (0.16) | 6.11 (0.14) | 7.14 (0.16) | 7.23 (0.16) |
CUCB | b or b′ | 0.13 (0.05) | 0.12 (0.05) | 0.1 (0.05) | 0.16 (0.04) | 0.12 (0.13) | 0.12 (0.05) |
PI | Rate parameter | 1.47 (0.15) | 1.58 (0.11) | 1.58 (0.1) | 1.56 (0.17) | 1.32 (0.14) | 1.3 (0.14) |
EI | Rate parameter | 1.92 (0.17) | 1.9 (0.11) | 1.95 (0.11) | 1.88 (0.16) | 1.72 (0.15) | 1.65 (0.13) |
ECI | Rate parameter | 1.4 (0.14) | 1.32 (0.1) | 1.08 (0.08) | 1.35 (0.14) | 1.17 (0.12) | 1.19 (0.12) |
Decision to choose an information source | | | | | | | |
FSN | wm,2 | 7.14 (1.26) | 0.58 (0.05) | 0.39 (0.04) | 2.82 (0.37) | 0.5 (0.04) | 0.59 (0.05) |
FSN | wm,1/wm,2 | 5.38 (0.08) | 10.06 (0.25) | 10.89 (0.3) | 5.13 (0.07) | 8.57 (0.2) | 8.65 (0.22) |
FRB | wm,2 | 0.2 (0.02) | 0.07 (0.0) | 0.03 (0.0) | 0.17 (0.01) | 0.06 (0.0) | 0.03 (0.0) |
FRB | wm,1/wm,2 | 2.79 (0.25) | 3.22 (0.27) | 4.03 (0.29) | 2.79 (0.25) | 2.66 (0.29) | 2.74 (0.3) |
FME | wm,2 | 1.17 (0.08) | 0.95 (0.06) | 0.73 (0.05) | 1.05 (0.07) | 1.03 (0.06) | 0.93 (0.06) |
FME | wm,1/wm,2 | 0.56 (0.08) | 0.83 (0.1) | 0.96 (0.11) | 0.52 (0.08) | 0.65 (0.09) | 0.56 (0.09) |
FECI | wm,2 | 1.93 (0.12) | 2.81 (0.13) | 3.09 (0.17) | 1.69 (0.11) | 1.94 (0.11) | 1.92 (0.11) |
FECI | wm,1/wm,2 | 0.34 (0.05) | 0.52 (0.03) | 0.58 (0.03) | 0.34 (0.05) | 0.41 (0.05) | 0.39 (0.04) |
CUCB | wm,2 | 0.57 (0.11) | 0.18 (0.02) | 0.16 (0.02) | 0.24 (0.03) | 0.23 (0.03) | 0.19 (0.02) |
CUCB | wm,1/wm,2 | 1.48 (0.15) | 1.15 (0.25) | 2.28 (0.29) | 6.48 (0.3) | 6.05 (0.34) | 5.99 (0.39) |
Decision to stop | | | | | | | |
FSN | ws,2 | 5.27 (0.6) | 1.4 (0.09) | 1.37 (0.1) | 2.31 (0.22) | 1.09 (0.07) | 1.15 (0.07) |
FSN | ws,1/ws,2 | 6.42 (0.04) | 11.45 (0.12) | 14.32 (0.14) | 6.08 (0.06) | 9.09 (0.13) | 9.85 (0.12) |
FRB | ws,2 | 2.71 (0.33) | 1.71 (0.18) | 1.06 (0.13) | 1.86 (0.2) | 0.33 (0.03) | 0.1 (0.01) |
FRB | ws,1/ws,2 | 2.86 (0.19) | 3.3 (0.2) | 3.29 (0.24) | 4.35 (0.19) | 9.59 (0.36) | 9.01 (0.5) |
DPP | ws,2 | 0.52 (0.12) | 0.25 (0.03) | 0.42 (0.03) | 0.27 (0.04) | 0.41 (0.06) | 0.41 (0.05) |
DPP | ws,1/ws,2 | −2.2 (0.19) | 1.35 (0.16) | 2.21 (0.11) | −2.61 (0.23) | −0.74 (0.19) | −0.16 (0.21) |
FME | ws,2 | 1.16 (0.07) | 1.6 (0.06) | 1.79 (0.06) | 1.03 (0.07) | 1.39 (0.07) | 1.45 (0.06) |
FME | ws,1/ws,2 | 0.41 (0.07) | 0.25 (0.04) | 0.2 (0.04) | 0.44 (0.07) | 0.32 (0.05) | 0.29 (0.05) |
FEI | ws,2 | 1.83 (0.17) | 1.42 (0.1) | 1.01 (0.08) | 0.92 (0.07) | 1.26 (0.1) | 1.52 (0.14) |
FEI | ws,1/ws,2 | 2.07 (0.11) | 1.97 (0.12) | 2.05 (0.19) | 1.82 (0.12) | 1.99 (0.11) | 1.93 (0.11) |
Note: Columns T1, T2, T3, T4, T5, and T6 denote different experiment treatments.