Abstract

We propose a nested weighted Tchebycheff multi-objective Bayesian optimization (WTB MOBO) framework in which a regression model selection procedure, built over an ensemble of models, improves the estimation of the uncertain parameters (utopia) of the weighted Tchebycheff expensive black-box multi-objective function. In our previous work, we demonstrated a weighted Tchebycheff MOBO approach that estimates the model parameters (utopia) used in formulating the acquisition function of the weighted Tchebycheff multi-objective black-box functions, through calibration with an a priori selected regression model. However, that MOBO model lacks flexibility in selecting an appropriate regression model for the guided sampled data and can therefore under-fit or over-fit as the MOBO iterations progress, ultimately degrading the overall MOBO performance. Since it is generally too complex to guarantee a best model a priori, we instead consider a portfolio of different families of predictive models (simple to complex) fitted with the current training data guided by the WTB MOBO, and the best model is selected following a user-defined prediction root-mean-square-error-based approach. The proposed approach is implemented in optimizing a thin tube design under constant loading of temperature and pressure, minimizing the risk of creep-fatigue failure and the design cost. Finally, the performance of the nested WTB MOBO model is compared with that of other MOBO frameworks with respect to accuracy in parameter estimation, Pareto-optimal solutions, and function evaluation cost.
The approach is general enough to consider different families of predictive models in the portfolio for best-model selection; the overall design architecture can solve high-dimensional (multiple-function) complex black-box problems and can be extended to any other global criterion multi-objective optimization method in which prior knowledge of the utopia is required.

1 Introduction

In the early design phase, it is important for designers to be able to identify potentially good design decisions in a large design space while the design cost is still low. In practice, most design problems are too complex to be handled by simple optimization frameworks because of constraints on cost, time, formulation, etc. This technical brief considers problems with black-box objective functions and high function evaluation costs. With no or limited knowledge of the expensive true objective function, we cannot guarantee maximal learning toward an optimal solution without proper guidance. For such black-box engineering design problems, Bayesian optimization (BO), which eliminates the need for a standard formulation of the objective functions [1–3], is widely applied as a sequential learning technique that guides design sampling so as to minimize the number of expensive function evaluations needed to find the optimal region of the unknown design space.

1.1 Research Motivation.

In our previous work [4,5], a design architecture to solve multi-objective black-box problems, the weighted Tchebycheff multi-objective Bayesian optimization (MOBO), is demonstrated, in which the unknown utopia values (model parameters) are estimated iteratively using an a priori selected predictive model. The utopia point is the optimum of each objective considered individually and is needed to formulate global criterion multi-objective methods such as the weighted Tchebycheff (WTB) method. The stated framework reduces model complexity by collapsing the high-dimensional function space to a one-dimensional (1D) multi-objective function space when formulating the acquisition function, and also increases the overall Pareto-optimal solution accuracy by estimating the utopia instead of relying on an educated guess [6,7]. However, this existing architecture lacks flexibility, as a simple linear regression model is predefined for utopia estimation at each iteration of the MOBO. The simple regression model can under-fit if the unknown objective function is very complex (nonlinear); conversely, predefining a complex predictive model can lead to over-fitting if the unknown objective function is truly simpler. In both cases, the error in the utopia estimation increases.

In the WTB, for a given vector of weighting factors w on two objectives f1 and f2, the method forms a rectangle and searches along its diagonal to find the Pareto-optimal solution on the Pareto frontier curve. Assume, in Fig. 1, that the true Pareto-optimal solution lies at C, where the true utopia u is shown. If the location of u is unknown and is estimated to be at position u′ (in X or Y), then with the same weighting factor w, the WTB projects the Pareto-optimal solution to C′ (the diagonal is shifted). Thus, we cannot find the true optimal solution C for weighting factor w if we use the incorrect utopia estimate u′. For expensive black-box problems, we cannot find the true utopia and must settle for a cheap surrogate model to estimate it; the goal, however, is to maximize the accuracy of that estimate. As the error in the utopia estimate grows, for instance due to inappropriate selection of the predictive model, the optimal solutions deviate further from the desired trade-off between the objectives, and the MOBO model may perform poorly. For obvious reasons, it is hard to know a priori the true nature of the black-box objective functions, and thus challenging to select an appropriate predictive model a priori. Moreover, in MOBO, since the data are sampled sequentially toward the multi-objective optimal region, the performance of different simple-to-complex regression models can vary from iteration to iteration depending on the data available at each iteration.

Fig. 1
Incorrect utopia values leading to deviate from true Pareto-optimal solution

1.2 Research Contribution.

Attempting to improve the model performance, we introduce a predictive model selection approach nested into the existing weighted Tchebycheff MOBO. With the iteratively augmented training data in the MOBO, we estimate the utopia values from iteratively selected cheap surrogate predictive models through this nested design; we call this the nested weighted Tchebycheff MOBO or, for simplicity, the nested MOBO. Our main goal is a design architecture for multi-objective black-box problems that properly utilizes the existing sampled training data during model calibration and improves the prediction of uncertain model parameters (e.g., utopia): rather than fixing a predefined predictive model, it adds the flexibility to choose the current best model from a portfolio (ensemble) of predefined models. Another goal is to generalize the model comparison and selection approach so that different families of predictive models can be considered when building the portfolio. To illustrate the approach, we consider a numerical test problem and an engineering problem, a cyclic pressure-temperature loaded thin tube design, for which we can compare the results of the proposed model with the true solutions from an exhaustive search. The roadmap of this technical brief is as follows: Sec. 2 provides a brief literature review. Section 3 provides the general description of the proposed nested weighted Tchebycheff MOBO design architecture. Section 4 briefly describes the case studies. Section 5 illustrates the comparative results of the nested weighted Tchebycheff MOBO against other design architectures in terms of different performance parameters. Section 6 concludes with final thoughts.

2 Literature Review

The Bayesian optimization approach [8–13] has two major components: a predictor or Gaussian process model (GPM) [14–16], and the acquisition function [8,17–24]. As shown in Fig. 2, starting with a Gaussian prior, we build a posterior Gaussian process model given the data currently available from the expensive function evaluations. The surrogate GPM then predicts the outputs of any unsampled designs within the feasible design space. The uncertainty of these predictions is small near the observed data and increases as the unsampled designs move farther away from it; this is the kriging property that the prediction errors are not independent. The acquisition function, defined from posterior simulations of the GP model, strategically guides the search, balancing exploration and exploitation, toward the best design locations for future evaluations.
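To make these two components concrete, the following self-contained Python sketch (illustrative only; the paper's implementation uses MATLAB and the DACE toolbox) builds a zero-mean GP posterior with a squared-exponential kernel and picks the next design by an expected-improvement acquisition for minimization. The 1D objective, length scale, and sample locations are hypothetical:

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(A, B, ls=0.2):
    # Squared-exponential (RBF) kernel between two 1D point sets
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(Xs, ys, Xq, noise=1e-6):
    # Posterior mean and std of a zero-mean GP at query points Xq
    K = rbf(Xs, Xs) + noise * np.eye(len(Xs))
    Kq = rbf(Xq, Xs)
    alpha = np.linalg.solve(K, ys)
    mu = Kq @ alpha
    v = np.linalg.solve(K, Kq.T)
    var = 1.0 - np.sum(Kq * v.T, axis=1)      # k(x,x) = 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sd, y_best):
    # EI for minimization: (y_best - mu) * Phi(z) + sd * phi(z)
    z = (y_best - mu) / sd
    Phi = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2)))
    phi = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)
    return (y_best - mu) * Phi + sd * phi

# One BO iteration on a hypothetical 1D objective
f = lambda x: np.sin(3 * x) + x
Xs = np.array([0.1, 0.5, 0.9]); ys = f(Xs)
Xq = np.linspace(0, 1, 101)
mu, sd = gp_posterior(Xs, ys, Xq)
x_next = Xq[np.argmax(expected_improvement(mu, sd, ys.min()))]
print(x_next)  # next design to evaluate
```

The posterior interpolates the observed designs almost exactly (small noise term), while the acquisition trades off the predicted mean against the growing uncertainty away from the samples.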

Fig. 2
Bayesian Optimization Framework
This technical brief builds upon our prior work on numerical optimization methods integrated into a multi-objective Bayesian optimization (MOBO) architecture. The most fundamental numerical optimization algorithm for solving a general multi-objective optimization (MOO) problem is the weighted sum (WS) method [25], in which all the objectives are transformed into a single weighted objective. The method, though simple and fast, is inefficient at finding the true Pareto-optimal points, especially in non-convex regions. To improve on this, a global criterion method, the WTB method [26–28], is introduced, in which the multiple objectives are combined into a weighted distance metric that is guaranteed to find all the Pareto-optimal solutions. While there are other methods, such as those of Refs. [29–36], it is challenging to pick the best MOO approach. Here, our research focuses on the weighted Tchebycheff method because our goals are to (a) convert the multi-dimensional function space to a single weighted multi-objective function space, which reduces the computational cost of the acquisition function of the MOBO, and (b) accurately represent an arbitrary non-convex Pareto frontier, which motivates the use of the WTB method. Readers are pointed to our previous work [5] for a more detailed review of BO, MOBO, and MOO approaches and of the desirable features of the WTB approach. The general formula of the WTB method, for objectives that are not black-box, is as follows:
\[
\min_{x}\;\max_{i=1,\dots,n}\; w_i\,\lvert Y_i(x)-u_i\rvert
\tag{1}
\]
where Yi is the ith objective function, ui is the utopia value of the ith objective, and wi is the trade-off weight associated with the ith objective.
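The sensitivity of Eq. (1) to the utopia location (cf. Fig. 1) can be illustrated with a minimal Python sketch (the paper's code is in MATLAB; the linear Pareto front below is a hypothetical example): for the same weights, shifting the utopia estimate moves the Pareto point that minimizes the weighted Tchebycheff metric.

```python
import numpy as np

def wtb(Y, u, w):
    # Weighted Tchebycheff metric of Eq. (1): max_i w_i |Y_i - u_i|
    return np.max(w * np.abs(Y - u), axis=-1)

# Hypothetical linear Pareto front f2 = 1 - f1 (both objectives minimized)
f1 = np.linspace(0.0, 1.0, 201)
front = np.stack([f1, 1.0 - f1], axis=1)
w = np.array([0.5, 0.5])

best_true = f1[np.argmin(wtb(front, np.array([0.0, 0.0]), w))]
best_shifted = f1[np.argmin(wtb(front, np.array([0.3, 0.0]), w))]
print(best_true, best_shifted)  # 0.5 vs 0.65: the selected Pareto point moves
```

With the true utopia (0, 0) the diagonal selects the balanced point f1 = 0.5; the mis-estimated utopia u′ = (0.3, 0) shifts the selected point to f1 = 0.65 even though the weights are unchanged, mirroring the C to C′ shift in Fig. 1.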

3 Design Methodology

In this section, we present the design methodology of the proposed nested weighted Tchebycheff Multi-objective Bayesian optimization (MOBO). The outer loop of the proposed design architecture defines the existing Bayesian optimization [5], solving for multiple objectives, using the weighted Tchebycheff-based acquisition function. We summarized the outer loop in the Supplementary material—Appendix A (available in the ASME Digital Collection). In this technical brief, we present the inner loop structure which is the proposed integration with the existing architecture.

Inner Loop of Nested WTB MOBO: Model Selection for unknown parameter estimation

Here, we broadly describe the model selection algorithm for estimating the utopia values following a prediction root-mean-square error approach, which is nested in the weighted Tchebycheff MOBO (Step 3, Algorithm 1a); we call this the inner loop of the weighted Tchebycheff MOBO. Figure 3 shows the detailed flowchart of the model selection procedure. The key point of the procedure is to compute two criteria that together define the model selection criterion. In Fig. 3, the steps (blocks) highlighted in blue, green, and red are involved in criterion 1, criterion 2, and both criteria, respectively. The selection procedure is driven by the research objective of answering how well a given model has estimated the uncertain parameters (utopia values); we do not consider the complexity of the model as part of the selection criteria. Our work aims to add flexibility to the model comparison across different families of models: in addition to classical linear regression models, the portfolio also contains nonlinear predictive models (such as support vector machine regression) and Bayesian linear regression models. The comparison ignores the trade-off between model accuracy and the computational cost of additional model parameters (e.g., the number of regression coefficients), because in the proposed BO framework the computational cost of regression model fitting is negligible compared to the function evaluations of an expensive black-box design problem. It is to be noted that although our case studies are not truly expensive but assumed so for the sake of proof of concept, the method presented in this technical brief is aimed at solving any expensive black-box design.

Fig. 3
Model selection algorithm to estimate utopia (inner loop of the MOBO)

3.1 Criterion 1: Global Improvement.

The first criterion in the model selection procedure is the overall global improvement of the model in estimating designs across the feasible design space. Thus, criterion 1 focuses on selecting, at iteration k of the nested MOBO, the model that gives the best fit in general. With a best fit in general, the model has a higher likelihood of providing a better estimate of the utopia values. Algorithm 1a describes the steps to follow for computing criterion 1:

Algorithm 1a Inner loop of nested weighted Tchebycheff MOBO: Computing Criterion 1

  • Step 1: Training Data: We define X as the design input variables and Y as the output functions. Create the feasible sampled data matrix at iteration k of the MOBO, Df,k = [Xk, Y(Xk)], where Df,k ⊆ Dk. Define the ensemble of models to estimate the utopia as M, which builds the model portfolio. The portfolio of models for each utopia can be different, thus adding flexibility to the choice of models for different functions.

  • Step 2: Conduct Monte-Carlo cross-validation:

    • Step 2a. Split Df,k into two subsets of training Df,kT and validation Df,kV data sets without replacement at the proportion of pT and pV, respectively, where pT + pV = 1.

    • Step 2b. Given Df,kT, fit all the predefined regression models to estimate objective i, Mr,i,k, where r is the defined regression model number.

    • Step 2c. Estimate the objectives and create the vector of predictions $\hat{\mu}_k(Y_i)$ for each input design in Df,kV, given the regression model Mr,i,k.

    • Step 2d. Validate the estimated objectives from Step 2c against the true objective values in Df,kV. Thus, calculate the mean-squared error ɛ1,i,l for objective i at iteration l of the cross-validation over all the input designs XkV in Df,kV:
\[
\varepsilon_{1,i,l}=\frac{1}{\lvert X_k^{V}\rvert}\sum_{x\in X_k^{V}}\bigl(\hat{\mu}_k\bigl(Y_i(x)\bigr)-Y_i(x)\bigr)^{2}
\tag{2}
\]
    • Step 2e. Repeat Step 2a.–2d. for L times. In this case study, L = 100.

  • Step 3: Define Criterion 1: Calculate the root-mean-square of the L Monte-Carlo cross-validated mean-squared errors of the ith objective, ɛ1,i,·. Thus, criterion 1 at iteration k of the MOBO for the ith objective can be stated as
\[
\hat{\varepsilon}_{1,i,k}=\sqrt{\frac{1}{L}\sum_{l=1}^{L}\varepsilon_{1,i,l}}
\tag{3}
\]
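The Monte-Carlo cross-validation of Algorithm 1a can be sketched compactly; this is an illustrative Python version (the study used MATLAB), with `fit`/`predict` as hypothetical stand-ins for any model in the portfolio:

```python
import numpy as np

rng = np.random.default_rng(0)

def criterion_1(X, y, fit, predict, L=100, p_train=0.8):
    """Monte-Carlo cross-validated RMSE of Eq. (3) for one regression model.

    fit(X, y) -> model parameters; predict(params, X) -> predictions.
    """
    n = len(X)
    n_train = int(p_train * n)
    mse = np.empty(L)
    for l in range(L):
        idx = rng.permutation(n)                # Step 2a: random split, no replacement
        tr, va = idx[:n_train], idx[n_train:]
        params = fit(X[tr], y[tr])              # Step 2b: fit on the training subset
        pred = predict(params, X[va])           # Step 2c: predict the validation designs
        mse[l] = np.mean((pred - y[va]) ** 2)   # Step 2d: MSE on the validation set
    return np.sqrt(np.mean(mse))                # Step 3: root-mean-square over L splits

# Toy usage: a linear model (np.polyfit) on synthetic, nearly linear 1D data
X = np.linspace(0, 1, 40)
y = 2.0 * X + 0.1 * rng.standard_normal(40)
fit = lambda X, y: np.polyfit(X, y, 1)
predict = lambda p, X: np.polyval(p, X)
score = criterion_1(X, y, fit, predict)
print(score)  # small RMSE, on the order of the noise level
```

Repeating the random split L times and averaging, as in Step 2e, smooths out the dependence on any single train/validation partition.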

3.2 Criterion 2: Local Improvement.

The second criterion is the local improvement of the model in our region of interest in the design space, which is the utopia region. Note that the purpose of the regression model is to estimate the utopia region well; a large error in any other region does not affect the MOBO model performance. Although a model with good overall prediction accuracy, as in criterion 1, is likely to estimate the utopia region well too, this is not guaranteed to be the best among the models. In other words, comparing fitted models 1 and 2 in Fig. 4(a): although model 1 has a higher error on the validation data, it has a lower error in predicting the utopia design. In this example, the overall better-fitting model 2 has a high error in the utopia region, the region of interest. Thus, along with the global improvement, we also use a second criterion to reduce the estimation error specifically in the utopia region. The challenge in doing so is that, since the utopia design is unknown (red dot in Figs. 4(a) and 4(b)), we do not know the true values of the objectives there, which precludes the straightforward MC cross-validation of Sec. 3.1.

Fig. 4
(a) Model Comparison: Model 1 has a higher error on predicting validation data, but a lower error on predicting utopia. Here, models 1 and 2 are different regression models. (b) Objective for Criterion 2: To minimize the error ξ^ of predicted utopia between model 2 (fitted with all available data) and model 1 (fitted with only subsampled training data). Here, models 1 and 2 are same regression model, but since model 1 is fitted with more data (more knowledge), it is assumed to be more likely to get closer toward true utopia.

To mitigate this issue, we adopt the assumption that a model fitted with more data has a higher likelihood of better estimation. Thus, the estimation error of a model fitted with the full feasible sampled data set Df,k is likely to be lower than that of the same model fitted with the feasible subsampled data set Df,kT. Accordingly, we take as reference utopia values for cross-validation the utopia values estimated from the fit to the full feasible sampled data set Df,k (denoted by the red dot on the model 2 regression line in Fig. 4(b)), and criterion 2 selects the model that minimizes the prediction error ξ̂ in Fig. 4(b). Algorithm 1b describes the steps to follow for computing criterion 2:

Algorithm 1b Inner loop of nested weighted Tchebycheff MOBO: Computing Criterion 2

  • Step 1: Training Data: Same as stated in Step 1, Algorithm 1a. In addition, let Xf be the feasible unsampled grid matrix for which the objective values are unknown.

  • Step 2: Estimate utopia with full data:

    • Step 2a. Given Df,k, fit all the predefined regression models to estimate objective i, denoted $\underline{M}_{r,i,k}$, where r is the regression model index and the underline marks quantities computed from the full feasible sampled data set.

    • Step 2b. Estimate the objectives and create the vector of predictions $\underline{\hat{\mu}}_k(Y_i)$ for each unsampled design in Xf, given the regression model $\underline{M}_{r,i,k}$.

    • Step 2c. Estimate the utopia of objective i as the minimum of the predictions $\underline{\hat{\mu}}_k(Y_i)$:
\[
\underline{\hat{\mu}}_k(u_i)=\min_{x\in X_f}\underline{\hat{\mu}}_k\bigl(Y_i(x)\bigr)
\tag{4}
\]

Store the estimated utopia $\underline{\hat{\mu}}_k(u_i)$ and the corresponding design values $\underline{x}_{f,i,k}\in X_f$.

  • Step 3: Conduct Monte-Carlo cross-validation:

    • Step 3a. Consider the same subsampled training data set Df,kT and the fitted model Mr,i,k as in Steps 2a and 2b, Algorithm 1a.

    • Step 3b. Estimate the objective (utopia) value $\hat{\mu}_k(u_i)$ at the stored design $\underline{x}_{f,i,k}$, given the regression model Mr,i,k.

    • Step 3c. Validate the utopia values estimated in Step 3b against the full-sample estimates from Step 2c. Thus, calculate the squared error ε2,i,l for the utopia of objective i at iteration l of the cross-validation:
\[
\varepsilon_{2,i,l}=\bigl(\hat{\mu}_k(u_i)-\underline{\hat{\mu}}_k(u_i)\bigr)^{2}
\tag{5}
\]
    • Step 3d. Repeat Steps 3a.–3c. for L times. In this case study, L = 100.

  • Step 4: Define Criterion 2: Calculate the root-mean-square of the L MC cross-validated squared errors of the ith objective, ɛ2,i,·. Thus, criterion 2 at iteration k of the MOBO for the ith objective can be stated as
\[
\hat{\varepsilon}_{2,i,k}=\sqrt{\frac{1}{L}\sum_{l=1}^{L}\varepsilon_{2,i,l}}
\tag{6}
\]

Finally, the combined model selection criterion (over the r models) for the ith objective is the minimum of the sum of the normalized values of Eqs. (3) and (6), where the superscript (r) denotes the value computed for model r:
\[
\underline{M}_{i,k,\mathrm{opt}}=\arg\min_{r}\Bigl[\operatorname{norm}\bigl(\hat{\varepsilon}^{(r)}_{1,i,k}\bigr)+\operatorname{norm}\bigl(\hat{\varepsilon}^{(r)}_{2,i,k}\bigr)\Bigr]
\tag{7}
\]
where the optimal estimated utopia is $\hat{\mu}_k(u_i)_{\mathrm{opt}}=\underline{\hat{\mu}}_k(u_i\mid \underline{M}_{i,k,\mathrm{opt}})$. This value is input into the weighted Tchebycheff black-box objective function (Eq. (9)) for MOBO model calibration.
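Putting the two criteria together, the inner loop can be sketched as follows. This is an illustrative Python version (the paper's implementation is in MATLAB; the `fit`/`predict` model interface is hypothetical, and since the normalization in Eq. (7) is not detailed here, min-max normalization over the portfolio is assumed):

```python
import numpy as np

def select_model(models, X_sampled, Y_sampled, X_grid, L=100, p_train=0.8, seed=0):
    """Pick the model minimizing normalized criterion 1 + criterion 2 (Eq. (7)).

    models: list of (fit, predict) pairs. Returns (best model index, utopia estimate).
    """
    rng = np.random.default_rng(seed)
    n = len(X_sampled)
    n_train = int(p_train * n)
    c1 = np.empty(len(models)); c2 = np.empty(len(models)); u_full = np.empty(len(models))
    for r, (fit, predict) in enumerate(models):
        # Criterion 2, Step 2: utopia estimated from the model fitted with ALL data
        params_full = fit(X_sampled, Y_sampled)
        pred_grid = predict(params_full, X_grid)
        j = np.argmin(pred_grid)
        u_full[r], x_u = pred_grid[j], X_grid[j]
        e1 = np.empty(L); e2 = np.empty(L)
        for l in range(L):
            idx = rng.permutation(n)
            tr, va = idx[:n_train], idx[n_train:]
            params = fit(X_sampled[tr], Y_sampled[tr])
            e1[l] = np.mean((predict(params, X_sampled[va]) - Y_sampled[va]) ** 2)
            e2[l] = (predict(params, np.array([x_u]))[0] - u_full[r]) ** 2
        c1[r] = np.sqrt(np.mean(e1))   # Eq. (3): global fit
        c2[r] = np.sqrt(np.mean(e2))   # Eq. (6): local fit at the utopia design
    norm = lambda c: (c - c.min()) / (c.max() - c.min() + 1e-12)
    best = int(np.argmin(norm(c1) + norm(c2)))
    return best, u_full[best]

# Toy portfolio: mean, linear, and quadratic models on noisy quadratic data
rng = np.random.default_rng(1)
X = np.linspace(-1, 1, 50)
Y = X ** 2 + 0.05 * rng.standard_normal(50)
poly = lambda d: (lambda X, y: np.polyfit(X, y, d), lambda p, X: np.polyval(p, X))
models = [poly(0), poly(1), poly(2)]
best, u_hat = select_model(models, X, Y, np.linspace(-1, 1, 201))
print(best, u_hat)  # expect the quadratic model (index 2), utopia near 0
```

Because the data are quadratic, criterion 1 strongly favors the quadratic model, and criterion 2 confirms that its utopia estimate is stable under subsampling; the mean and linear models are rejected.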

4 Case Studies

In this section, we describe one benchmark problem and one engineering design problem:

  • 2D Six-hump camel back function and Inversed-Ackley's Path function [37].

  • Thin tube design problem.

We present our implementation and the results of the proposed nested weighted Tchebycheff MOBO on these problems, to aid understanding of the predictive model selection approach toward better estimation of the unknown parameters in the proposed surrogate (meta-model) based design architecture for solving expensive black-box multi-objective problems. We consider the same engineering problem as in our earlier paper [5], in order to compare the existing weighted Tchebycheff MOBO with the proposed nested weighted Tchebycheff MOBO. Owing to the limited scope of the technical brief, the descriptions of the test case problems are provided in the Supplementary Material—Appendix B (available in the ASME Digital Collection). The detailed computation of the whole process for the thin tube engineering problem can be found in Ref. [38].

Description of the Model Portfolio

In this technical brief, we focus on an ensemble of simple-to-complex regression models as an example of a model portfolio. Since the true nature of the design space and the objectives is assumed unknown (black-box), we consider different flavors of regression models to understand how their selection evolves as the iterations of the MOBO progress. In total, we consider seven models, used in regression analysis for estimation and prediction of outputs given the independent variables, to build the portfolio:

  1. Mean model (MM),

  2. Multiple linear regression model (MLR),

  3. Log-transform of multiple linear regression model (log-MLR),

  4. Bayesian multiple linear regression model (BMLR) [39,40],

  5. Second-order polynomial model (SOP) (or quadratic model),

  6. Support Vector Machine regression model (SVMR) [41,42] and

  7. GPM [43,44].

As previously stated, the design architecture is not constrained to these seven models: any regression model can be removed from or introduced into the nested MOBO framework at the start of the optimization (with an educated guess from existing similar problems when knowledge is very limited) or at a mid or later stage of the optimization (when we have better knowledge). However, for simplicity of model comparison, we keep these seven models throughout the optimization process, assuming limited knowledge of the nature of the objectives to start with. Next, we present the formulation of fitting these models using the sampled data in our case study. As our goal is to predict the utopia, which is the optimal solution of an objective function independent of the other objectives, we select n = 2 regression models from the ensemble, one for each of the n = 2 objectives learned independently. Given the limited scope of the technical brief, the detailed formulation of each regression model is provided in the Supplementary Material—Appendix C (available in the ASME Digital Collection).

5 Results

In this section, we present and discuss the results of comparing the proposed nested weighted Tchebycheff MOBO model with other design architectures at convergence, with respect to three major performance criteria: (1) maximize the overall prediction accuracy of the utopia values, (2) minimize the number of expensive function evaluations until convergence of the MOBO, and (3) maximize the overall accuracy of the converged Pareto-optimal solutions relative to the true ones. We consider weighting factors on the distance and cost objective functions of w1 = [0, 0.1, …, 1] and w2 = 1 − w1, respectively. We used the DACE package [45] in MATLAB for the regression and surrogate GP models. For fitting the other regression models, we used the MATLAB functions fitlm (for MM, MLR, log-MLR, and SOP or quadratic), bayeslm (for BMLR), and fitrsvm (for SVMR). The full nested weighted Tchebycheff MOBO model with an expected-improvement-type acquisition function was coded in MATLAB 2018 and run on a Windows 10 machine with a 3.4 GHz Intel processor and 16 GB RAM.

In this technical brief, we present a detailed analysis and comparison to quantify the improvement of the modification of our existing architecture, in terms of the trade-off between accuracy and cost of function evaluations. However, in order to showcase a simple analysis of the stated benchmark problem, we provided the respective result in the Supplementary Material—Appendix D (available in the ASME Digital Collection).

Case study: Cyclic pressure-temperature loaded thin tube design problem

Here, we present the results for the multi-objective thin tube design problem described in the Supplementary Material (available in the ASME Digital Collection). We start by investigating the proportion with which each regression model is selected to capture the nature of the objectives for utopia estimation.

5.1 Discussion on the Proportion of Models Selected by Nested MOBO.

Figure 5 shows the proportion with which each of the seven predefined models is selected for model calibration to estimate the model parameters (utopia) in the multi-objective optimization of the thin tube design, over all iterations until convergence of the MOBO. To estimate the utopia value for objective Eq. (B.3) (available in the Supplemental Materials), the model selection varies among log-MLR, BMLR, SOP, and GPM, with higher percentages of log-MLR and SOP. As the weight on objective 1 increases, the model selected for utopia 1 shifts from log-MLR to SOP, thus trading off toward linear models of higher complexity. Furthermore, we do not see any selection of MM, MLR, or SVMR (with a linear kernel); the choice of kernel function thus significantly affects the selection of the SVMR model, since with a linear kernel it cannot capture any nonlinearity of the objectives. This shows that the nature of the objective is not fully linear, and therefore the relatively simple linear models considered in this case study are not appropriate here. We do, however, see a small proportion of the Bayesian linear regression model, owing to its advantage over its frequentist version in carrying prior information on the regression coefficients. This result agrees with our linearity validation of the objectives when the MOBO was calibrated with only MLR models, where the linearity assumption was not perfectly met. One interesting observation is that when we optimize with full preference on objective 1, w1 = 1 (bottom middle panel), almost the whole proportion shifts to the GP model. This is a special case in which the multi-objective acquisition function of the MOBO framework also guides the sampling into the utopia region of objective 1, since we give full importance to minimizing objective 1.
Thus, eventually, with more sequential sampling in the utopia region, the architecture is able to exploit the error dependency of the Gaussian process, as the prediction error at the utopia becomes much lower. The same reason explains the high percentage of GPM selections for estimating the utopia of objective Eq. (B.4) (available in the Supplemental Materials). We started this MOBO with the sampling done during the pre-optimization stage, whose objective is to locate the unknown creep-fatigue failure constraint (Eq. (B.5), available in the Supplemental Materials). This region lies at the utopia of objective 2, since minimizing the cost of the tube maximizes the risk, which eventually converges toward the creep-fatigue failure constraint. Thus, the architecture has the flexibility to choose regression models for estimation and calibration of the MOBO based on the starting samples, the weighting preferences over the multiple objectives, and the sequential sampling guided by the acquisition function.

Fig. 5
Thin tube design problem: Proportion of regression models selected till nested MOBO convergence at various weighting factors for thin tube design problem. Here, the model portfolio includes the following: (1) MM, (2) MLR, (3) log-MLR, (4) BMLR, (5) SOP (or quadratic model), (6) SVMR with linear kernel, and (7) GPM.

5.2 Comparison of Different MOBO Architectures.

Table 1 shows the overall quantitative performance of the different MOBO design architectures across the weighting factors w1, in terms of the stated performance criteria, for the thin tube problem. Figure 6 visualizes the performance of the different architectures at each weighting factor w1. The true maximum values of the two objectives of the thin tube problem are 0.1764 and 0.3545, respectively. Similarly, the true Pareto-optimal solutions are obtained from an exhaustive search with the true utopia values. The different architectures compared in Table 1 are listed below; in the existing architectures D and E, the utopia (unknown parameter) estimation is done with an a priori selected regression model. Also, the line connecting the black stars * in Figs. 6(a) and 6(c) is the performance of the MOBO model without any calibration or iterative estimation of the utopia, using a fixed value of (0, 0); this assumption leads to the worst estimation (as addressed in our earlier paper). Here, we draw comparisons among architectures that all estimate the utopia, but with different procedures.

  • Architecture A: nested MOBO with model selection criterion 1.

  • Architecture B: nested MOBO with model selection criterion 2.

  • Architecture C: nested MOBO with model selection criteria 1 and 2 (proposed).

  • Architecture D: MOBO model integrated with MLR (without an ensemble of models).

  • Architecture E: MOBO model integrated with BMLR (without an ensemble of models).

Fig. 6
Thin tube design problem: (a) Euclidean norms between predicted and true utopia values for w1 = [0, 0.1, …, 1]. (b) Total MOBO guided function evaluation till convergence for w1 = [0, 0.1, …, 1]. (c) Euclidean norms between predicted and true pareto-optimal values for w1 = [0, 0.1, …, 1].
Table 1

MOBO design architectures performance comparison for thin tube design

Architecture                              A       B       C       D       E
Euclidean norm (utopia)        Mean     0.100   0.145   0.083   0.103   0.057
                               Std.     0.034   0.115   0.047   0.106   0.021
Func. eval.                    Mean     492     487     508     596     690
Euclidean norm (Pareto-opt.)   Mean     0.068   0.091   0.070   0.104   0.078
                               Std.     0.030   0.033   0.030   0.038   0.039

As we investigate Table 1, architectures A, C (proposed), and E are the competitive ones. Comparing the proposed architecture C (with model selection) against E (with a fixed BMLR), the overall accuracy in utopia estimation is essentially equivalent: although E has the better estimate, the difference lies within the margin of error. However, E requires almost 200 more function evaluations than C, which is a significant increase in cost when solving an expensive black-box design problem. Next, comparing A and C (model selection with different selection criteria), we get very close results on all three performance criteria. Thus, no architecture in Table 1 clearly supersedes all the others. To measure performance across the three criteria, Table 1 is therefore converted into the scores of Table 2. The scores not only rank the design architectures but also measure how close or far the performance of an architecture is from the best among them. Note that we compute scores only for the mean Euclidean norms, not for their standard deviations, because we give first preference to minimizing the mean norms: a design architecture with a lower mean norm and a higher standard deviation is preferable to one with a higher mean norm and a lower standard deviation. If two architectures have the same mean Euclidean norm, the scores for the standard deviation of the norms come into play to break the tie.

Table 2

Design architecture performance metric (scores)

Architecture                                                           A       B       C       D       E
Criterion 1: Utopia prediction accuracy (mean Euclidean norm)       51.1     0.0    70.5    47.7   100.0
Criterion 2: Function evaluation                                    97.5   100.0    89.7    46.3     0.0
Criterion 3: Pareto-optimal solution accuracy (mean Euclidean norm) 100.0    36.1    94.4     0.0    72.2
Total score (out of 300)                                           248.6   136.1   254.6    94.0   172.2

Note: Scores range from 0 to 100, with 100 being the best, and are calculated from the mean rows (1, 3, and 4) of Table 1 using the equation s_x = 100 − ((x − x_min)/(x_max − x_min)) × 100, where x is an architecture's raw value for a criterion and x_min, x_max are the minimum and maximum of that row.
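The score normalization can be checked directly against Table 1. A minimal sketch (using the mean utopia norms from Table 1 as input):

```python
def scores(values, lower_is_better=True):
    """Min-max scores on a 0-100 scale, with 100 the best (Table 2 note)."""
    lo, hi = min(values), max(values)
    s = [(v - lo) / (hi - lo) * 100 for v in values]
    # All three criteria here are lower-is-better (norms, evaluations),
    # so the normalized value is inverted: s_x = 100 - (...) * 100.
    return [100 - x for x in s] if lower_is_better else s

# Mean Euclidean norms to the utopia point (Table 1, architectures A-E).
utopia_norms = [0.100, 0.145, 0.083, 0.103, 0.057]
print([round(x, 1) for x in scores(utopia_norms)])
# → [51.1, 0.0, 70.5, 47.7, 100.0], matching criterion 1 in Table 2
```

The same function applied to the mean function-evaluation and Pareto-accuracy rows reproduces criteria 2 and 3.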

From Table 2, the existing architecture D has among the lowest scores in every criterion and the worst overall performance (lowest total score), as the linearity assumption was found not to be met. Architecture E has the best score in predicting utopia (criterion 1); the next best is the proposed architecture C with a score of 70.5, and architecture A follows with roughly half the score of E. In reducing the expensive function evaluation cost (criterion 2), architectures A, B, and C clearly outperform D and E, which demonstrates the value of flexibility in selecting regression models through much faster convergence to optimal solutions. Although architecture B has the best score on this criterion, the proposed architecture C is close behind.

With respect to the accuracy of Pareto-optimal solutions (criterion 3), architectures B and D lag by significant margins. Both also had the worst scores in predicting the utopia, which intuitively makes sense: an incorrect prediction of utopia is more likely to yield incorrect Pareto-optimal solutions. Architecture B, as in the benchmark problem, turned out to be the worst among A, B, and C. Thus, for both case studies we see that, although the region of interest is the utopia region, giving some priority to overall goodness of fit in the regression-model selection criteria is also necessary. Another interesting comparison between A and C is that, although architecture C scored much higher in predicting the utopia, it attains a slightly lower score on the accuracy of Pareto-optimal solutions than A; both architectures, however, have the same absolute error of 0.076. This could be due to a scenario where the utopia prediction is incorrect but lies along the direction of the weight combination, as shown in Appendix Fig. E.2 (see Supplemental Material). An infinite number of such values is possible and difficult to identify, and while an incorrect utopia prediction may by chance lead to accurate Pareto solutions, the focus of this work is to estimate the true utopia closely so that the design architecture generalizes to similar problems beyond this case study. The proposed architecture C is second best in criterion 3, only slightly behind the best architecture A. Finally, although the proposed architecture C does not have the top score in any single criterion, it has the best all-around performance with the highest total score of 254.6; the next best is architecture A. Comparing standard deviations, A and C have the same standard deviation in the accuracy of Pareto-optimal solutions.
Thus, in a general sense, integrating the flexibility of a model selection strategy from a portfolio to build a "nested" WTB MOBO surpasses the performance of the existing rigid WTB MOBO architecture.

6 Conclusion

In this technical brief, we presented a nested weighted Tchebycheff multi-objective Bayesian optimization framework in which the parameters (utopia values) of the acquisition function of the weighted Tchebycheff multi-objective function are estimated through cheap regression analysis, calibrating the MOBO for better performance using only the sampled data iteratively guided by the acquisition function. The utopia estimate comes from a regression model chosen, via the proposed selection criteria, from a predefined portfolio of models of varying complexity. The complete model selection procedure is nested within the MOBO and runs iteratively as part of the model calibration. The results from the case study, a thin tube design under cyclic pressure-temperature loading with the two objectives of minimizing risk and cost, show that introducing flexibility in model selection from a portfolio, when finding a single best model a priori is infeasible or too complex, yields a much better all-around performance: better estimation of utopia, faster convergence toward Pareto-optimal solutions, and better accuracy in locating them. The proposed nested MOBO architecture is applicable to any black-box multi-objective optimization problem, with minimal or no increase in model complexity as the number of objectives grows. It imposes no restriction on the number or families of regression models in the selection procedure; it allows comparison between entirely different model types for utopia estimation (here, MLR vs. SVMR or GPM); and regression models other than those defined for this case study can be added, removed, or changed either at the start (based on prior expert opinion or information gained from historical analysis) or at any iteration of the MOBO (with sequential learning about the nature of the black-box objectives).
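The nested selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a scikit-learn portfolio (MLR, SVMR, GPM) and uses a single cross-validated RMSE criterion in place of the paper's two coupled selection criteria; the data and function names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import cross_val_score

def select_model(X, y, portfolio):
    """Fit each portfolio model and return the one with the lowest
    cross-validated prediction RMSE on the current MOBO-guided samples."""
    best_name, best_model, best_rmse = None, None, np.inf
    for name, model in portfolio:
        # sklearn's scorer is negated ("larger is better"), so flip the sign.
        rmse = -cross_val_score(model, X, y, cv=3,
                                scoring="neg_root_mean_squared_error").mean()
        if rmse < best_rmse:
            best_name, best_model, best_rmse = name, model, rmse
    return best_name, best_model.fit(X, y)

# Toy data standing in for the samples guided by the WTB MOBO iterations.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, (30, 2))
y = X[:, 0] ** 2 + np.sin(3.0 * X[:, 1])  # nonlinear black-box stand-in

portfolio = [("MLR", LinearRegression()),
             ("SVMR", SVR()),
             ("GPM", GaussianProcessRegressor())]
name, fitted = select_model(X, y, portfolio)
```

In the nested architecture, this selection would be rerun at each MOBO iteration as the guided sample set grows, and the winning model would supply the utopia estimate for the weighted Tchebycheff acquisition function.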
The two selection criteria for choosing the regression model worked efficiently when coupled together in the nested MOBO design architecture. Although using selection criterion 1 alone competes well with using both criteria, selection criterion 2 remains important for this problem because the ultimate goal is focused on efficient prediction of the utopia point.

Based on the analysis of the results discussed, the proposed design architecture can be improved further by introducing bias in the weighting combination between model selection criteria 1 and 2 (the current default is equal weights), which will be addressed in future work. Aside from this, the architecture can be improved by integrating an automated, predefined sampling scheme for the weighting factors of the objectives when finding the respective Pareto solutions. Although the focus of this technical brief is the weighted Tchebycheff method, the architecture can be easily extended to any other global criterion multi-objective optimization method where prior knowledge of utopia is required. Finally, the full framework will be implemented in the complex high-dimensional design of a diffusion-bonded heat exchanger.
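The weighting combination mentioned above can be read as a convex combination of the two selection-criterion scores. The sketch below is hypothetical (the paper does not give this exact form): it assumes criterion 1 is an overall prediction error and criterion 2 an error localized to the utopia region, with lower combined values preferred.

```python
def combined_selection_score(c1, c2, w1=0.5):
    """Weighted combination of selection criterion 1 (overall fit error)
    and criterion 2 (fit error near the utopia region). The current
    default setting is equal weights (w1 = 0.5); lowering w1 biases the
    model selection toward accuracy in the utopia region."""
    return w1 * c1 + (1.0 - w1) * c2

# Equal weights vs. a criterion-1-biased weighting (hypothetical errors).
equal = combined_selection_score(0.2, 0.1)
biased = combined_selection_score(0.2, 0.1, w1=0.8)
```

Tuning w1 away from 0.5 is the future improvement proposed here: the best bias is problem dependent and would itself be studied empirically.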

Acknowledgment

This research was funded in part by DOE NEUP DE-NE0008533. The opinions, findings, conclusions, and recommendations expressed are those of the authors and do not necessarily reflect the views of the sponsor.

Conflict of Interest

There are no conflicts of interest.

Data Availability Statement

The data sets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.


Supplementary data