Abstract

In this paper, we propose a sparse modeling method for automatically creating a surrogate model of a nonlinear time-variant system from a very small number of time series data with nonconstant time steps. To apply sparse identification of nonlinear dynamical systems, first proposed in 2016, to temperature prediction simulations, we developed three machine learning methods: (1) a data preprocessing method that considers the correlation between errors, (2) a sequential thresholded non-negative least-squares method based on term size criteria, and (3) a solution space search method involving similarity model classification. The proposed method has the potential for wide application to fields where the concept of equivalent circuits can be applied. Its effectiveness was verified using time series data obtained by thermofluid analysis of a power module. Two types of cooling systems were verified: forced air cooling and natural air cooling. The model, created from fewer thermofluid analysis results than the number of input parameters, predicted multiple test datasets, including extrapolation, with a mean error of less than 1 K. Because the proposed method can be applied using a very small number of data, has high extrapolation accuracy, and is easy to interpret, it is expected not only to enable fine-tuning of design parameters and consideration of actual loads, but also to realize condition-based maintenance through real-time simulation.

1 Introduction

Methods have been developed for creating models that can be easily used in simulations by applying machine learning to data obtained from experiments or detailed numerical analyses. Such models are called surrogate models because they can substitute for detailed numerical analyses, which are computationally expensive. Surrogate models represented in a lower dimensional space are also called reduced-order models (ROMs). ROMs describe the system of interest as a parametric dynamical system. This approach can be applied in various fields, including the mechanics of materials and electrical engineering, as well as heat transfer engineering, which is the subject of this research. In general, surrogate models are created using a data-driven approach, and they can handle complex shapes directly. With advantages in terms of accuracy and computational efficiency, surrogate models can be used to improve the sophistication and efficiency of the detailed design and operation of products and systems. Accordingly, these models have attracted increasing attention as a candidate for digital twin technology for realizing cyber-physical systems.

Previous studies have used machine learning to create surrogate models that take physical information into account, including devising inductive bias, with the aim of improving generalizability, reducing the amount of learning data, and improving interpretability [1–3]. Taking neural networks as an example, methods for devising loss functions [4–15] and model structures [16–24] have been reported.

As for devising loss functions, Raissi et al. [4] considered physical properties such as partial differential equations, initial conditions, and boundary conditions in the loss function by using automatic differentiation. This method has been widely applied [5–10]. Greydanus et al. [11] defined a loss function that satisfies the canonical equation used in Hamiltonian mechanics; this method was later extended to Lagrangian mechanics [12,13].

As for devising model structures, Chen et al. [21] proposed a method for treating the neural network as an ordinary differential equation by making the layers continuous, applying the idea of the residual network [25]. That method has been widely applied because ordinary differential equations are often used as mathematical models of physical phenomena. In addition, the application of graph and neural network theory [26–28] has been attracting attention as a general approach for handling complex shapes. For example, Sanchez-Gonzalez et al. [17] applied graph and neural network theory to particle-based simulation, while Pfaff et al. [18] and Horie et al. [19,20] applied it to the finite element method. A framework incorporating the method proposed by Raissi et al. [4] was recently proposed for solving partial differential equation-governed forward and inverse problems [29].

These methods are superior to neural networks that do not consider physical information in terms of generalization ability, number of learning data, and interpretability. However, the surrogate model remains insufficient for solving social problems. For example, when targeting actual complex products or systems, there are often challenges such as memory usage and the large amount of time required for learning, even with high-performance computers.

In this paper, we propose a nonlinear time-variant reduced-order model, a sparse modeling method for system identification that is superior to conventional methods in terms of generalization ability, number of learning data, and interpretability. The proposed method improves the applicability to engineering simulations of sparse identification of nonlinear dynamics (SINDy), which was proposed by Brunton et al. [30] as an extension of symbolic regression [31]. Symbolic regression has evolved in the framework of genetic programming [32–36]. Recently, neural network approaches [37–39] and sparse modeling approaches [30,40] have been increasingly introduced to improve search efficiency. Among them, SINDy is a heuristic method based on simple sparse linear regression that is highly extensible and has been widely applied [41]. In SINDy, the coefficients of a library composed of basis function candidates are estimated using a newly developed sparse regression method, under the assumption that the true model can be represented by a linear combination of basis functions, including nonlinear functions. Using basis function candidates to place constraints on the form of the model has both advantages and disadvantages, and the proposed method maximizes the advantages of this constraint.

Specifically, we constrain the form of the output model by using the thermal network method, which applies the framework of equivalent circuits to predict temperature. The mathematical model of the thermal network method is a simultaneous ordinary differential equation, and it is highly compatible because it can be denoted in the form of a linear regression equation, which is the framework of SINDy. The thermal network model can be interpreted as a graph because the nodes, which are collocation points, are connected by thermal resistance. Each node stores a state quantity (i.e., temperature), and its changeability is represented by its heat capacity. The number of nodes is very small compared with the detailed numerical analysis performed using the finite volume method or the finite element method. In other words, dimensionality reduction methods such as proper orthogonal decomposition and convolutional neural network-based auto-encoder (CNN-AE) are not necessary. Therefore, the library can be configured in a form that follows physical laws, including theoretical and empirical formulas developed over the long history of heat transfer engineering, and the created model has superior generalization ability and interpretability.

The proposed nonlinear time-variant reduced-order model, based on a sparse modeling method, comprises the following three novel machine learning methods. (1) A data preprocessing method that considers the correlation between errors; it enables accurate prediction not only a few steps ahead but also very many steps ahead. (2) A sequential thresholded non-negative least-squares method based on term size criteria; it efficiently performs both the selection of basis functions and coefficient estimation by suppressing the effect of multicollinearity among basis function candidates, improving upon the sequential thresholded least-squares (STLS) sparse regression method proposed in the SINDy framework. (3) A space search method based on similar basis function classification; it assigns the basis function candidates to groups and controls each group individually to achieve a physically valid combination of basis functions.

The proposed sparse modeling method has the following features. (1) The created ROM satisfies the law of conservation of energy, has good generalization ability, and can be extrapolated to some extent. (2) It is capable of modeling nonlinear time-variant systems. It can handle a wider range of phenomena than methods that create ROMs under time-invariant assumptions [42,43]. In addition, it is widely applicable to fields where the concept of equivalent circuits can be applied. (3) It can be applied to systems in which the parameter values used in the governing equations differ according to the area. (4) There are few constraints on the time series data to be learned. It can handle cases in which the time-step Δt of the time series data is not fixed. (5) The content of the ROM is easy to interpret. (6) ROM creation does not require a large amount of resources. It is also possible to create a ROM from fewer detailed numerical analysis trials than the number of input parameters. Therefore, less effort is required for data collection. In addition, little knowledge of heat transfer engineering is required for ROM creation. This is because the data reveal whether forced convective, natural convective, or radiative heat transfer is occurring.

In addition, the following should be noted. (1) Unknown phenomena cannot be treated with the proposed method. This is because we cannot deal with phenomena when the form of their basis functions cannot be conceptualized. (2) Temperature nodes are selected in advance, although strictness is not required. (3) The proposed method does not derive a governing equation but rather an approximate model that is consistent with physical laws and is easy to interpret.

2 Background

2.1 Thermal Network Method.

Physical phenomena can be simulated in a simplified manner by dividing a complex product or system into multiple elements and defining the relationships between the elements. This method applies the theory of equivalent circuits, developed for solving electrical circuit problems, to mechanical engineering and related fields. In the field of heat transfer engineering it is known as the thermal network method: the current, electrical resistance, and capacitance of an electrical circuit correspond to the heat flow, thermal resistance, and heat capacity, respectively. There are two types, depending on how the heat capacity is handled: the Cauer type and the Foster type. In this research, we used the Cauer type, whose mathematical model is shown in the following equation:
Cn dTn/dt = Σi=1..N (Ti − Tn)/Rn,i + Qn,  n = 1, …, N    (1)

Here T is the temperature, C is the heat capacity, R is the thermal resistance, Q is the heat generation rate, and N is the number of nodes. Temperatures at the nodes are calculated by solving the simultaneous ordinary differential equations in Eq. (1). Because no discretization error occurs in the thermal network method, it is possible to solve the problem with a dimensionality a few orders of magnitude smaller than that in detailed thermofluid analysis. The thermal network model can be interpreted as a graph, with directed edges for air flow and undirected edges for other variables.
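
As a concrete illustration, a Cauer-type network of the Eq. (1) form can be integrated directly once its parameters are known. The sketch below (Python, with hypothetical names; not the authors' code) advances a two-node network, a heated chip coupled to an ambient node, by explicit Euler steps:

```python
import numpy as np

def thermal_network_step(T, C, R, Q, dt):
    """One explicit-Euler step of a Cauer-type thermal network,
    Cn * dTn/dt = sum_i (Ti - Tn) / Rn,i + Qn  (cf. Eq. (1)).
    T, C, Q: (N,) arrays; R: (N, N) thermal resistances, with np.inf
    marking node pairs that are not connected."""
    flows = (T[None, :] - T[:, None]) / R   # heat flow into node n from node i
    Tdot = (flows.sum(axis=1) + Q) / C
    return T + dt * Tdot

# Two nodes: a heated chip coupled to a (near-fixed) ambient node.
C = np.array([50.0, 1.0e9])        # J/K; huge capacity approximates a fixed ambient
R = np.full((2, 2), np.inf)
R[0, 1] = R[1, 0] = 2.0            # K/W
Q = np.array([10.0, 0.0])          # W
T = np.array([300.0, 300.0])       # K
for _ in range(20000):
    T = thermal_network_step(T, C, R, Q, dt=0.1)
# Steady state: chip temperature -> 300 + Q*R = 320 K
```

With the time constant R·C = 100 s, 2000 s of simulated time is ample for the chip node to settle at the ambient temperature plus Q·R.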

Constructing Eq. (1) deductively requires a deep understanding of physical phenomena and a large amount of time. The engineer must simplify the actual phenomena and structures from a physical point of view and set parameters such as the thermal resistance and heat capacity. The heat capacity is determined mainly by the shape and physical properties. The thermal resistance is determined not only by the shape and physical properties but also by state quantities such as temperature and velocity. In cases where the thermal resistance cannot be set theoretically, it is necessary to select an empirical formula suitable for the target object from among many candidates. Therefore, the quality of the model depends highly on the ability of the engineer.

2.2 Sparse Identification of Nonlinear Dynamical Systems.

In SINDy, which was first proposed by Brunton et al. in 2016 [30], the time derivative X˙ of a variable X is given in the form of Eq. (2) under the assumption that the true model can be represented by a linear combination of basis functions including nonlinear functions
X˙ = Θ(X)Ξ    (2)

Here X is an m×n matrix, m is the number of time samples, n is the dimension of X, Θ(X) is a library consisting of basis function candidates, and Ξ is a sparse vector of coefficients, which can be estimated using STLS, a novel sparse regression method. An example STLS flowchart is shown in Fig. 1.

Fig. 1
Example STLS flowchart

In STLS, after the least-squares solution for Ξ is estimated, the coefficients ξ (elements of Ξ) that are smaller than the cutoff are repeatedly set to zero and the remaining terms are refit. The model selection criteria represent the accuracy and simplicity of the model. The cutoff is a hyperparameter that greatly affects the quality of the model and is modified on each pass of the outer loop. In the later part of the inner loop, the number of coefficients to be modified can reach zero, which is an advantage over some other methods, such as the least absolute shrinkage and selection operator, in which the absolute values of the coefficients are shrunk by the thresholding operation. X˙ is contaminated with noise due to its numerical approximation; STLS is a robust approach for such noisy data, and it creates equations that balance accuracy and complexity.
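
The STLS loop can be sketched in a few lines. The following is an illustrative Python implementation of the generic SINDy regression on a toy one-dimensional system (not the method proposed later in this paper); function names are assumptions:

```python
import numpy as np

def stls(Theta, Xdot, cutoff, n_iter=10):
    """Sequential thresholded least squares, the sparse regression at
    the core of SINDy: alternate a least-squares fit with zeroing of
    coefficients whose magnitude falls below the cutoff."""
    Xi = np.linalg.lstsq(Theta, Xdot, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < cutoff
        Xi[small] = 0.0
        for k in range(Xdot.shape[1]):        # refit only the kept terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(
                    Theta[:, big], Xdot[:, k], rcond=None)[0]
    return Xi

# Toy check: xdot = -2 x, with library [x, x^2, x^3].
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 1))
Theta = np.hstack([x, x**2, x**3])
xdot = -2.0 * x + 1e-3 * rng.standard_normal((200, 1))
Xi = stls(Theta, xdot, cutoff=0.1)
# Xi is approximately [[-2], [0], [0]]: only the true term survives
```

Despite the correlation between the x and x³ columns, the thresholding step discards the spurious candidates and the refit recovers the true coefficient.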

SINDy has been applied in various ways [44–51]. For example, Rudy et al. [44] applied ridge regression as the regression method for STLS in order to identify partial differential equations. In addition, Fukami et al. [47] used thresholded least squares and the adaptive least absolute shrinkage and selection operator as the regression methods for STLS and applied them to a flow field with reduced dimensionality by CNN-AE. Furthermore, Mangan et al. [48] proposed a method to find a parsimonious model for a system by combining information criteria such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) with SINDy. Improvements in noise robustness have also been studied. For example, Cortiella et al. [49] proposed an approach that selects regularization parameters based on the corner point criterion of Pareto curves; Fasel et al. [50] proposed an approach that involves bootstrap aggregation; and Schaeffer et al. [51] proposed an approach based on the integral formulation.

3 Proposed Method of Sparse Modeling: Nonlinear Time-Variant Reduced-Order Model for Temperature Prediction

3.1 Overview of the Sparse Modeling Method.

Figure 2 shows a schematic flowchart of the proposed new sparse modeling method.

Fig. 2
Schematic flowchart of the proposed new sparse modeling method
We propose a method of outputting the ROM for temperature prediction in the form of Eq. (1), which is also used in the thermal network method. To this end, we constructed a library consisting of basis function candidates that are proportional to any of the terms on the right-hand side of Eq. (1) and used sparse regression to select the basis functions and estimate the coefficients. Equation (3) is an expression of Eq. (2) applied to temperature prediction
T˙ = Θ(T, X)Ξ    (3)

To create a ROM in the form of Eq. (1), rather than reducing the variable to a lower dimension by using proper orthogonal decomposition or CNN-AE, we treated the temperature Tn itself, which is represented by the collocation points. Note that the proposed method includes data-driven aspects, so it is not necessary to perform this task strictly. In other words, the temperature node does not have to be the mean temperature in the control volume; local values at arbitrary points are acceptable. Furthermore, the state quantity (e.g., temperature) is sufficient information to be stored at the nodes, and coordinate data are not required.

The vector X of factors that influence the temperature is composed of factors that vary significantly in the time series data. For example, if the size or material of the modeling target does not change, it is removed from X. Except in cases where the temperature range in the time series data is very wide, the effects of the temperature dependence on physical properties can be learned properly.

Next, we describe the basis function candidates that compose the library. We consider the forced convective heat transfer of a horizontal plate with a smooth surface. Equation (4) [52] is an equation for calculating the mean value of the Nusselt number Nu when laminar flow with a Prandtl number Pr greater than 0.5 flows along the constant-temperature wall
Nu¯ = 0.664Re^(1/2)Pr^(1/3)    (4)
Here Re is the Reynolds number. The thermal resistance R can be calculated based on Nu¯. However, when laminar flow with a Pr greater than 0.5 flows along the constant-heat-flux wall, Nu¯ can be calculated using the following equation [52]:
Nu¯ = 0.906Re^(1/2)Pr^(1/3)    (5)
Furthermore, when most of the region is turbulent, Nu¯ can be calculated using the following equation [52]:
Nu¯ = 0.037Re^(4/5)Pr^(1/3)    (6)
In addition to Eqs. (4)–(6), many other formulas have been proposed. There are also many formulas consisting of complex nonlinear functions. These equations differ greatly in the amount of heat transfer depending on subtle differences in conditions, as in Eqs. (4) and (5), but they have a commonality in that they can be organized using a small number of dimensionless numbers. By extracting such commonalities, it is possible to define basis function candidates. For example, when incompressible flow is assumed, Nu can often be approximated by less than the first power of Re. Because the calculation range required for a ROM is generally narrow, this approximation is sufficient even for equations represented as complex nonlinear functions. Therefore, the thermal resistance R can be set as shown in Eq. (7) when the velocity v is a variable
R = 1/(hS),  h ∝ v^α (0 < α ≤ 1)    (7)
Here h is the heat transfer coefficient and S is the surface area. The point where the velocity v is measured does not have to be strict. It is also possible to substitute a parameter that is correlated with the velocity v, such as the speed of a fan. Basis function candidates can be set up in the form of Eq. (8), which is proportional to any of the terms on the right-hand side of Eq. (1), by using Eq. (7)
θ = v^α(Tj − Ti)    (8)
Natural convective heat transfer, radiative heat transfer, and thermal contact resistance can be considered in the same way. For example, natural convective heat transfer can be summarized by the Grashof number Gr and the Prandtl number Pr in many cases, and the thermal resistance R can be set in the same way as for forced convective heat transfer, as shown in the following equation:
(9)

Equation (9) shows the setting that takes into account the effect of the volumetric thermal expansion assuming an ideal gas. As in the case of forced convective heat transfer, a range can be specified for the exponent β. Thus, by utilizing domain knowledge, it is possible to appropriately set basis function candidates for known phenomena.
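
To make the library construction concrete, the sketch below builds candidate columns of the Eq. (8) form v^α(Tj − Ti) for a few exponents α. The function name and the exponent grid are illustrative assumptions, not the authors' implementation; in practice each heat transfer mode would contribute its own such family of columns:

```python
import numpy as np

def convection_candidates(T, v, alphas=(0.5, 0.8, 1.0)):
    """Build basis-function candidates of the form v**alpha * (Tj - Ti)
    (cf. Eq. (8)): one column per ordered node pair (i, j) and exponent.
    T: (m, N) node temperatures over m time samples; v: (m,) velocity."""
    m, N = T.shape
    cols, names = [], []
    for a in alphas:
        va = v ** a
        for i in range(N):
            for j in range(N):
                if i == j:
                    continue
                cols.append(va * (T[:, j] - T[:, i]))
                names.append(f"v^{a}*(T{j}-T{i})")
    return np.column_stack(cols), names

# 100 samples of 3 node temperatures and one velocity signal.
rng = np.random.default_rng(1)
T = 300.0 + rng.uniform(0.0, 50.0, size=(100, 3))
v = rng.uniform(0.5, 3.0, size=100)
Theta, names = convection_candidates(T, v)
# 3 exponents x 6 ordered pairs = 18 candidate columns
```

Keeping the names alongside the columns makes the final ROM directly readable: each surviving coefficient maps back to a labeled physical term.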

In this research, we first constructed a library for forced convective heat transfer, natural convective heat transfer, radiative heat transfer, and thermal conduction (including thermal contact resistance) as well as a source term to deal with forced and natural air cooling. Then, we performed sparse regression for the selection of basis functions and for coefficient estimation. However, because of the various problems described below, conventional machine learning methods did not work well. Therefore, we developed three new machine learning methods, which are described in Sec. 3.2.

3.2 Three Proposed Methods for Machine Learning

3.2.1 Data Preprocessing Method for Considering the Correlation Between Errors.

In the preprocessing step of calculating T˙ and Θ(T,X) in Eq. (3) from the time series data, the time information is lost and the samples become independent. The first term of the objective function, which can be the root-mean-squared error (RMSE) or the mean absolute error (MAE), does not consider whether the error is positive or negative. In addition, it is difficult to apply seasonal adjustment and high-order autocorrelation in our approach. Therefore, a novel method is needed to create a ROM that can predict very many steps ahead with high accuracy.

Therefore, we propose a method to control the variance–covariance matrix Ω of the error term in the generalized least-squares method by devising a data preprocessing method. Specifically, the time series data are transformed not only into the form of Eq. (3) but also into another form that considers the correlation between errors. To explain the reason for this additional transformation, Eq. (10) shows the temperature prediction equation based on the first-order accuracy forward difference
Tn^(m+1) = Tn^(m) + T˙n^(m)Δt^(m)    (10)
Here, the superscript numbers in parentheses are the time sample numbers. In thermofluid analysis, the time-step Δt is often not fixed, so a correlation between errors defined on the time derivative of temperature cannot correctly evaluate the amount by which the temperature is updated.
Therefore, as the "another form" described above, the proposed method applies Eq. (11), which is defined over a fixed interval and can be derived by substituting Eq. (3) for T˙n(m) in Eq. (10)
(11)
Here FI is the fixed interval, Ml is the number of data in the dataset l, L is the number of datasets, and tl,j denotes the time sample number j in the dataset l. Equation (12) shows the definition of ΔTFI
(12)
It is worth noting that the sparse vectors of coefficients Ξ in Eqs. (3) and (11) are identical. Equation (11) is the long-term component representing the amount of change in the fixed interval, which can be mixed with the short-term component representing the time derivative in Eq. (3) and used as learning data, as shown in the following equation:
(13)
Here, ξ is an element of the sparse vector of coefficients Ξ, θ is an element of the library Θ(T,X), and γn is a correction factor for normalization that is set so that the RMSE of the short-term component dominates. Equation (13) shows the equation for an arbitrary node n, giving priority to understandability. The upper row is the short-term component, and the lower row is the long-term component.
The least-squares solution for node n in Eq. (11) is shown in the following equation:
(14)
Here ξn=[ξn,1 ξn,2 … ξn,P]T is the element for node n in Ξ, and εTn(tl,m) is the error between two consecutive points in time related to the temperature Tn at node n, as shown in the following equation:
(15)
It can be seen from Eq. (14) that the least-squares solution of Eq. (11) can be expressed in terms of the error between two consecutive points in time. The second term in the second equation in Eq. (14) is the correlation between errors. In other words, the first and second terms are, respectively, the diagonal and off-diagonal elements of the variance–covariance matrix Ω of the error.
Equation (16) shows the least-squares solution for node n in Eq. (3), which represents the short-term component
(16)

Equation (14), which represents the long-term component, includes terms for the correlation between errors that are not included in Eq. (16), which represents the short-term component. To take advantage of this feature, the fixed interval FI should be wide. The number of possible combinations of the first term and the second term in the second equation in Eq. (14) is FI(FI − 1)/2 (i.e., FI choose 2), so the second term dominates when FI is wide. However, a very wide FI is problematic because it allows correlation of errors between samples that are far apart. Hence, we propose combining multiple long-term components with different FI. Figure 3 shows the variance–covariance matrix Ω of the error when the number of datasets L is 3. The same procedure can be used for difference operators other than the forward difference.

Fig. 3
The variance–covariance matrix Ω of the error when the number of datasets L is 3
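
For a single node, the preprocessing can be sketched as follows: short-term rows pair the forward-difference temperature derivative with the library values (cf. Eq. (3)), while long-term rows pair the temperature change over a fixed interval with the time-step-weighted sums of library values (cf. Eq. (11)). This is an illustrative Python sketch with assumed names, under the assumption that FI counts time samples; the normalization factor γn is omitted for brevity:

```python
import numpy as np

def short_and_long_rows(t, T, theta, FI):
    """Stack short-term rows (forward-difference temperature derivative
    vs. library values) with long-term rows (temperature change over a
    fixed interval of FI samples vs. time-step-weighted sums of library
    values). t: (m,) sample times (possibly nonuniform); T: (m,) one
    node's temperature; theta: (m, P) library values at each sample."""
    dt = np.diff(t)
    A_short = theta[:-1]
    b_short = np.diff(T) / dt                    # forward difference
    A_long, b_long = [], []
    for j in range(len(t) - FI):
        w = dt[j:j + FI, None]                   # per-step Δt weights
        A_long.append((theta[j:j + FI] * w).sum(axis=0))
        b_long.append(T[j + FI] - T[j])          # ΔT over the interval
    A = np.vstack([A_short, np.array(A_long)])
    b = np.concatenate([b_short, np.array(b_long)])
    return A, b

# Toy check: dT/dt = 0.1 * (300 - T), sampled with nonuniform time steps.
rng = np.random.default_rng(0)
t = np.cumsum(rng.uniform(0.2, 0.8, size=50))
T = 300.0 + 50.0 * np.exp(-0.1 * t)
theta = (300.0 - T)[:, None]                     # single library column
A, b = short_and_long_rows(t, T, theta, FI=10)
xi = np.linalg.lstsq(A, b, rcond=None)[0]
# xi[0] is close to the true rate 0.1 (slightly below, from the
# forward-difference bias)
```

Note that the same coefficient vector fits both row types, mirroring the statement that Ξ in Eqs. (3) and (11) is identical.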

3.2.2 Sequential Thresholded Non-Negative Least-Squares Method Based on Term Size Criteria.

Figure 4 is a flowchart of the novel sparse regression method, which is a sequential thresholded non-negative least-squares method based on term size criteria. This method improves upon the STLS. Steps 1 and 2 in Fig. 4, which are improvements, are described below.

Fig. 4
Flowchart of the sequential thresholded non-negative least-squares method based on term size criteria

In step 1, coefficient estimation is performed by suppressing the effect of multicollinearity among basis function candidates. In cases with more than a few dozen temperature nodes, there are naturally combinations of highly correlated temperature nodes. In these cases, estimates for coefficients with large absolute values are scattered when the least-squares method is used. Terms having large absolute values of the coefficients are not eliminated in step 2, which selects the basis functions, and thus, an unstable ROM is output. Hence, bearing in mind that the sign of the coefficients can be controlled by devising the setting of the basis function candidates, we applied a regression method with the weak constraint that the coefficients be non-negative [53]. The implementation of this method is simple, and it is also advantageous in that the computational efficiency is not significantly reduced compared with the least-squares method.

In step 2, the selection of basis functions is performed based on term size. In this method, the basis functions are selected based on terms having a strong influence on the time derivative of the temperature (i.e., the left-hand side). Equation (17) shows a representative value of the term size for the basis function candidate p at node n
max_m |ξn,p θp^(m)|    (17)
Equation (18) shows an example of the threshold used for the selection of basis functions
λ max_m |T˙n^(m)|    (18)

Here, λ is a hyperparameter that takes a value of less than 1. If the value calculated in Eq. (17) is smaller than some threshold value (e.g., Eq. (18)), the corresponding coefficient is modified to 0, similar to a hard-thresholding operator. The max function in Eqs. (17) and (18) can be changed to a form corresponding to normalization methods such as the min–max or Z-score.
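
One possible reading of steps 1 and 2 in code form is sketched below. A crude column-pruning heuristic stands in for a proper non-negative least-squares solver (such as scipy.optimize.nnls), and the term-size criterion compares max|ξp·θp| against λ·max|T˙n| in the spirit of Eqs. (17) and (18); for brevity the max is taken over all rows, whereas the paper restricts it to the short-term component. All names are illustrative:

```python
import numpy as np

def nnls_by_pruning(A, b):
    """Crude non-negative least squares: repeatedly drop columns whose
    unconstrained coefficient comes out negative, then refit. A stand-in
    for a proper solver such as scipy.optimize.nnls."""
    active = np.ones(A.shape[1], dtype=bool)
    x = np.zeros(A.shape[1])
    while active.any():
        x[:] = 0.0
        x[active] = np.linalg.lstsq(A[:, active], b, rcond=None)[0]
        neg = x < 0.0
        if not neg.any():
            break
        active &= ~neg
    return np.clip(x, 0.0, None)

def st_nnls_term_size(Theta, Tdot, lam, n_iter=10):
    """Sequential thresholded non-negative least squares with a
    term-size criterion: after each non-negative fit, candidate p is
    kept only if max_m |xi_p * theta_p^(m)| reaches lam * max_m |Tdot|."""
    keep = np.ones(Theta.shape[1], dtype=bool)
    xi = np.zeros(Theta.shape[1])
    for _ in range(n_iter):
        xi[:] = 0.0
        xi[keep] = nnls_by_pruning(Theta[:, keep], Tdot)
        term_size = np.abs(Theta * xi).max(axis=0)
        thresh = lam * np.abs(Tdot).max()
        new_keep = term_size >= thresh
        if not new_keep.any() or (new_keep == keep).all():
            break
        keep = new_keep
    return xi

# Toy check: only the first of three candidates actually drives Tdot.
rng = np.random.default_rng(2)
Theta = rng.uniform(0.0, 1.0, size=(200, 3))
Tdot = 1.5 * Theta[:, 0] + 0.02 * rng.standard_normal(200)
xi = st_nnls_term_size(Theta, Tdot, lam=0.1)
# xi is approximately [1.5, 0, 0]
```

Because the threshold is relative to the magnitude of the node's own time derivative, the same λ works across nodes with very different time constants.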

It can be seen from Eqs. (17) and (18) that the coefficients and threshold are corrected based on the data of the short-term component. Figure 5 shows the results when the data preprocessing method described in Sec. 3.2.1 is applied to time series data obtained from thermofluid analysis. Figure 5(a) shows the values on the left-hand side of the equation, and Fig. 5(b) shows the values of a basis function candidate on the right-hand side of the equation. In both graphs, the left side illustrates the values of the short-term component, and the right side illustrates the values of the long-term component. It can be seen that the means and standard deviations of the short-term component and the long-term component are significantly different. Because the long-term component is normalized such that the RMSE of the short-term component dominates, the long-term component does not significantly disturb the mean and standard deviation of the short-term component on the left-hand side of the equation. However, as for the right-hand side of the equation, the long-term component significantly disturbs the mean and standard deviation of the short-term component. The strong effect of the long-term component on the selection of basis functions should be avoided. However, it is difficult to solve this problem by using the normalization of basis function candidates and the concept of hypothesis testing.

Fig. 5
Results of applying the data preprocessing method described in Sec. 3.2.1 to time series data obtained from thermofluid analysis: (a) values on the left-hand side and (b) values of a basis function candidate on the right-hand side
For example, we consider the application of a t-test where the null hypothesis is ξn,p=0. In this case, the t-value can be calculated using the following equation:
t = ξn,p / (SSii Ve)^(1/2)    (19)

Here SSii represents the diagonal elements of the inverse of the matrix of the sums of squares and cross-products of deviations, and Ve is the error variance. The data preprocessing method described in Sec. 3.2.1 mixes data extracted from different populations, so SSii is meaningless. Therefore, selection of basis functions by using a t-test does not work effectively.

The proposed novel sparse regression method can easily remove the effect of the long-term component. In addition, because the proposed method selects basis functions according to a relative criterion, it is particularly effective for thermal problems in which the time constants differ significantly for each temperature node.

3.2.3 Space Search Method Based on Similar Basis Function Classification.

As mentioned earlier, the library is composed of various physical models. Because of the wide variety of heat transfer phenomena, there are many candidate combinations of basis functions that are highly correlated in the library. However, some combinations of physical models are not physically valid. Therefore, it is difficult to obtain the expected ROM only by devising a sparse regression method. For this reason, we developed a method to probabilistically extract basis function candidates and construct a candidate set of simultaneous ordinary differential equations, as shown in Fig. 6, which illustrates a detail of the upper right part of the novel sparse regression method shown in Fig. 2.

Fig. 6
Probabilistic extraction of basis function candidates and construction of a candidate set of simultaneous ordinary differential equations

Inside the library, we defined sets for each heat transfer mode, which we named sublibraries, such that the library comprises a family of sets. For sublibraries containing multiple physical models, an extraction probability was assigned to each physical model. To illustrate this idea using Eq. (8), v^α1(Tj − Ti) is physical model 1 (θA1) and v^α2(Tj − Ti) is physical model 2 (θA2). Because there are multiple temperature nodes, each physical model consists of multiple basis function candidates. The classification of basis function candidates is similar to group lasso [54]. The purpose of the proposed method is to limit the combination of physical models.

First, Eq. (13) is constructed using the basis function candidates represented by the physical models extracted according to the extraction probabilities, and then the sparse regression is performed. This process is iterated several times to create several ROM candidates. After that, the model selection criteria, including the information criteria for each ROM candidate, are calculated and the extraction probabilities of the physical models in the next iteration are determined based on the values of these criteria. The model selection criteria can simply be the sum of the error term (e.g., RMSE) and the L0 norm. Knowledge of the thermal network method can also be applied to create better ROMs. The number of basis functions is not linearly related to the quality of the ROM, so it is important to devise a way to keep them within an appropriate range, for example, using a nonlinear function with the L0 norm as a parameter. Methods such as AIC and BIC, which involve the logarithm of the residual sum of squares, did not work well with the proposed method.

We also modify the hyperparameter λ of the threshold shown in Eq. (18) along with the extraction probabilities. Because λ does not suffer from complex multimodality, no special method is required to determine the magnitude and sign of its modification (e.g., comparing the value of the model selection criterion in the current iteration with that of the previous iteration). As the iterations proceed, the probability of extracting a physical model that fits the data increases, and thus more of the simultaneous equations within an iteration overlap; that is, they have the same form. These overlapping simultaneous equations can be combined into a single equation, and thus the time required for each iteration becomes progressively shorter.
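The outer loop, which updates both the extraction probabilities and the threshold hyperparameter λ, can be sketched as follows. All update rules and names here are our own illustrative choices, not the paper's exact formulas:

```python
import random

random.seed(0)

def sample_models(probs):
    """Draw a subset of physical models for one candidate ROM:
    one Bernoulli trial per model, using its extraction probability."""
    return frozenset(m for m, p in probs.items() if random.random() < p)

def update_probs(probs, best, lr=0.2, floor=0.05):
    """Nudge each extraction probability toward 1 if the model appears
    in this iteration's best candidate, toward a small floor otherwise
    (illustrative rule)."""
    return {m: max(floor, min(1.0, p + lr * ((1.0 if m in best else floor) - p)))
            for m, p in probs.items()}

def update_lambda(lam, direction, improved, factor=1.2):
    """Adjust the threshold by comparing the model selection criterion
    with the previous iteration: keep moving in the same direction while
    it improves, otherwise reverse."""
    if not improved:
        direction = -direction
    return lam * (factor if direction > 0 else 1.0 / factor), direction

probs = {"conduction": 0.5, "forced_convection": 0.5, "radiation": 0.5}
probs = update_probs(probs, best=frozenset({"conduction", "forced_convection"}))
lam, direction = update_lambda(0.1, direction=+1, improved=True)
```

Candidates sampled in the same iteration that share the same set of physical models collapse to one regression problem, which is why later iterations become progressively cheaper.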

4 Effectiveness of the Proposed Method

The effectiveness of the proposed sparse modeling method was verified using time series data obtained by thermofluid analysis of a power module mounted on a comb-shaped heat sink.

4.1 Power Module Mounted on a Forced Air Cooling Heat Sink

4.1.1 Thermofluid Analysis.

Figure 7 shows the target of the analysis. The heat-generating components were 12 chips. Tables 1 and 2 show the dimensions and physical properties of the analysis target, Table 3 shows the thermofluid analysis conditions for the learning and validation data, and Tables 4 and 5 show the analysis conditions for the test data.

Fig. 7  Analysis target: power module with forced air cooling
Table 1

Dimensions and physical properties of the power module

          Thickness (mm)   Width × depth (mm)   Thermal conductivity (W/m/K)   Specific heat capacity (J/kg/K)   Density (kg/m3)
Chip      0.25             8 × 8                400                            656                               3210
Part A    0.1              8 × 8                73.2                           226                               7300
Part B    0.3              94 × 25              390                            380                               8960
Part C    0.3              94 × 40              70                             680                               3200
Part D    0.3              94 × 40              390                            380                               8960
Part E    0.3              94 × 40              73.2                           226                               7300
Part F    3.0              120 × 60             390                            380                               8960
Table 2

Dimensions and physical properties of the heat sink

Height (mm)   Length (mm)   Width (mm)   Fin thickness (mm)   Number of fins
80            80            163          2                    24

Base thickness (mm)   Thermal conductivity (W/m/K)   Specific heat capacity (J/kg/K)   Density (kg/m3)
3.0                   225                            880                               2,700
Table 3

Thermofluid analysis conditions for the learning and validation data

Dataset no.                           L1     L2     L3     L4     L5     L11    L12    L13    L14
Ambient temperature Ta (K)            283    293    303    313    278    333    303    313    293
Velocity v (m/s)                      4      4      8      8      6      5.6    7.2    7      5
Thermal contact resistance Rc (K/W)   0.11   0.16   0.04   0.1    0.15   0.18   0.13   0.12   0.17
Caloric value (W)   Q1    10060121542653055
                    Q2    2045000351407
                    Q3    06408252855500
                    Q4    4083004182609
                    Q5    50102043594505
                    Q6    601210065502011
                    Q7    1210060454035930
                    Q8    102045003733040
                    Q9    830640552025150
                    Q10   64080101801160
                    Q11   450102065815510
                    Q12   0601210125830020
Table 4

Thermofluid analysis conditions for the test data

Dataset no.                           T1     T2     T3     T4     T5     T6     T7
Ambient temperature Ta (K)            293    283    303    323    293    313    293
Velocity v (m/s)                      6      10     10     6      5.6    Condition A
Thermal contact resistance Rc (K/W)   0.10   0.05   0.10   0.20   0.25   0.05   0.20
Caloric value (W)   Q1                80     50     50     25     18     Condition A (described in Table 5)
                    Q2                0      0      0      25     59
                    Q3                0      0      0      25     55
                    Q4                0      0      0      25     1
                    Q5                0      0      0      25     55
                    Q6                80     50     50     25     41
                    Q7                0      0      0      25     13
                    Q8                0      0      0      25     42
                    Q9                0      0      0      25     65
                    Q10               0      0      0      25     23
                    Q11               0      0      0      25     69
                    Q12               0      0      0      25     50
Table 5

Condition A for the test data in Table 4

Time t (s)            0–30   30–60   60–90   90–120   120–150   150–180   180–210   210–240   240–270   270–300
Velocity v (m/s)      8      8       10      10       4         4         8         8         6         6
Caloric value (W)
  Q1                  40     0       30      30       30        0         0         50        50        0
  Q2                  10     50      0       0        50        50        50        0         0         50
  Q3, Q5, Q8, Q9      10     10      10      10       10        10        10        10        10        10
  Q4                  0      50      0       50       0         50        0         50        0         0
  Q6                  15     25      35      45       35        25        0         0         0         0
  Q7                  10     50      0       0        50        50        50        0         0         0
  Q10                 0      0       0       0        0         0         0         0         0         0
  Q11                 5      50      50      0        40        40        0         0         60        60
  Q12                 20     0       40      0        50        50        0         40        0         50

The numbers for the caloric values Q in Tables 3 and 4 correspond to the chip numbers in Fig. 7. The velocity is the bulk value upstream of the heat sink and was set such that the Reynolds number Re corresponds to turbulent flow. Note that there is no bypass, that is, no flow avoids the heat sink. The thermal contact resistance Rc is the value between part F and the heat sink. Datasets L1 to L14 in Table 3 and datasets T1 to T5 in Table 4 are initial value problems with an analysis time of 1,000 s. The number of input parameters is 15, which is more than the number of learning datasets (14). The learning and validation data are in the range of 4–8 m/s in velocity, 0.04–0.18 K/W in thermal contact resistance, and 0–65 W in caloric value. The test data are in the range of 4–10 m/s in velocity, 0.05–0.25 K/W in thermal contact resistance, and 0–80 W in caloric value, which includes values outside the domain of definition of the learning and validation data. We did not use any special method to determine the analysis conditions; a design-of-experiments approach might therefore provide better-quality data.
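As a rough order-of-magnitude check of the turbulence claim (the air properties and characteristic length below are our own assumptions, not values from the paper):

```python
# Reynolds number Re = v * L / nu for the tested velocity range,
# assuming air near 300 K and the 80 mm heat sink length as the
# characteristic length (both assumptions, not from the paper).
nu = 1.6e-5                 # kinematic viscosity of air (m^2/s)
L = 0.080                   # characteristic length (m)
Re_low = 4.0 * L / nu       # lowest learning velocity, 4 m/s
Re_high = 10.0 * L / nu     # highest test velocity, 10 m/s
```

Under these assumptions Re is of order 10^4 to 10^5 across the whole dataset.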

The thermofluid analysis was performed by computational fluid dynamics software (Icepak; ANSYS, Canonsburg, PA) using the finite volume method. The number of nodes was several million. For the turbulence model, we selected one of the Reynolds-averaged Navier–Stokes models, which uses linear and logarithmic laws depending on the dimensionless wall distance. In addition, natural convective heat transfer and radiative heat transfer were set to not occur, and the time-step Δt was not fixed. Figure 8 shows some of the temperature measurement points. The output time series data included the temperature at 63 points: each of the chips and parts B, C, and E at the xy coordinates of the chip centers (12 locations each); three locations each of part F and the heat sink base; and nine locations on the heat sink fins.

Fig. 8  Some of the temperature measurement points

4.1.2 Created Model.

The output is a ROM represented by 63 simultaneous ordinary differential equations. Some of these are shown in Eq. (20). The meanings of the subscripts in Eq. (20) are shown in Fig. 8 
(20)
Because the proposed method creates the ROM in the form of the equations used in the thermal network method, it is easy to interpret, as shown in Eq. (20). The number of nonzero coefficients in the ROM we created was 244, which is about 3% of the 9,513 basis function candidates. On average, about four basis functions made up each equation, which was considered a reasonable number from the viewpoint of physics. Equation (20) indicates that the connection relationship between the temperature nodes is reasonable. The basis functions for natural convective and radiative heat transfer were not chosen, whereas the basis functions for forced convective heat transfer were. From these facts, we concluded that the heat transfer phenomena could be understood to some extent from the time series data.
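Although Eq. (20) itself is not reproduced here, every equation in the identified ROM has the general thermal-network form below (symbols as in the Nomenclature; the forced-convection term with exponent α and coefficient ξi is illustrative of the kind of basis function that can be selected):

```latex
C_i \frac{dT_i}{dt} = \sum_{j \in \mathcal{N}(i)} \frac{T_j - T_i}{R_{ij}} + \xi_i\, v^{\alpha} \left( T_a - T_i \right) + Q_i
```

Here N(i) denotes the set of temperature nodes connected to node i; the sparse regression determines which connections and heat transfer modes survive.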

When the MSE for the learning and validation data was decomposed, the squared bias accounted for almost all of it, and the variance was only approximately 4%. From this, the created ROM is unlikely to be overfitted. The bias can be reduced by increasing the number of nodes, that is, by increasing the locations where heat capacity is assigned, or by modifying the information criteria. However, the tradeoff between small bias and model simplicity should be kept in mind.
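The decomposition referred to here can be illustrated on synthetic data, where an ensemble of deliberately biased, low-variance predictors mimics the situation described (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def bias_variance(y_true, y_preds):
    """Decompose the MSE of an ensemble of predictions into squared bias
    and variance; for this decomposition MSE = bias^2 + variance holds
    exactly (population variance, equal weights)."""
    y_preds = np.asarray(y_preds)            # shape: (n_models, n_samples)
    mean_pred = y_preds.mean(axis=0)
    bias_sq = np.mean((mean_pred - y_true) ** 2)
    variance = np.mean(y_preds.var(axis=0))
    mse = np.mean((y_preds - y_true) ** 2)
    return mse, bias_sq, variance

y_true = np.zeros(100)
y_preds = 1.0 + 0.1 * rng.standard_normal((20, 100))   # biased, low-variance
mse, bias_sq, var = bias_variance(y_true, y_preds)
```

A dominant bias term with a small variance, as in this toy case, is the signature of an underfit-but-not-overfit model described in the text.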

4.1.3 Results.

We used the created ROM to predict the test data. The input was the initial temperature Tn(0) at the 63 nodes and the time series information of the factors X that influence the temperature. The output was the predicted temperature at the 63 nodes at times other than t = 0. We evaluated the effectiveness of the proposed sparse modeling method by comparing the predicted temperature calculated by the ROM with the results of the thermofluid analysis. Table 6 shows the mean error. The mean error for the seven types of test data was 0.8 K, and that for the 14 types of learning data was 0.6 K. Furthermore, as mentioned above, the variance was small for the learning and validation data represented by Eq. (13), so we concluded that there was no overfitting. The created ROM performed, in only a few seconds on an ordinary notebook computer, a surrogate calculation for the thermofluid analysis that normally takes between half a day and a day on a high-performance computer.
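The kind of calculation the ROM performs can be sketched with a toy two-node thermal network and illustrative coefficients (not the identified ones), integrated here with a simple explicit Euler scheme:

```python
# Toy two-node ROM in thermal-network form: node 1 is a heated chip-like
# node; node 2 is a sink-like node coupled to ambient. All coefficients
# are illustrative, not the identified ROM's.
C1, C2 = 5.0, 50.0      # heat capacities (J/K)
R12, R2a = 0.5, 0.2     # thermal resistances (K/W)
Ta, Q1 = 300.0, 20.0    # ambient temperature (K), heat input (W)

def step(T1, T2, dt):
    q12 = (T1 - T2) / R12                      # node-to-node heat flow (W)
    T1 += dt * (Q1 - q12) / C1
    T2 += dt * (q12 - (T2 - Ta) / R2a) / C2
    return T1, T2

T1, T2 = 300.0, 300.0                          # initial temperatures (K)
dt, t_end = 0.05, 2000.0
for _ in range(int(t_end / dt)):
    T1, T2 = step(T1, T2, dt)
# Analytic steady state: T2 = Ta + Q1*R2a = 304 K, T1 = T2 + Q1*R12 = 314 K.
```

Even with millions of such steps, integrating a few dozen coupled ordinary differential equations is orders of magnitude cheaper than a full thermofluid analysis, which is the source of the reported speedup.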

Table 6

Mean error between the predicted temperature calculated by the ROM and the results of the thermofluid analysis

Dataset no.      T1    T2    T3    T4    T5    T6    T7
Mean error (K)   0.7   0.3   0.3   1.0   0.9   1.4   1.2

The prediction results for dataset T3 are shown in Fig. 9. We plotted seven representative nodes out of the 63 predicted temperatures; the horizontal axis is time and the vertical axis is temperature. The steady-state temperature was predicted with high accuracy. The rapid temperature increases of the chips, which have small time constants, were captured with high accuracy, and the temperature of nodes such as the heat sink, whose time constant is much larger than that of the chips, could also be predicted with high accuracy. The temperature difference TChip1 − TChip6 between chips 1 and 6, which generate the same calorific value, was predicted with high accuracy; the main reason for this small difference is the thermal spreading resistance. Furthermore, the large temperature difference TF(C) − THSb(C) between part F and the heat sink base due to thermal contact resistance could also be predicted with high accuracy. In addition, the temperatures of heat sink fins 1–3 indicate that the nonlinear temperature distribution in the fin height direction could also be predicted with high accuracy. This can be explained by the theory of fin efficiency, but it is noteworthy that these physical phenomena, including the above-mentioned thermal spreading resistance, were predicted with high accuracy from the data.

Fig. 9  Prediction results for dataset T3. Chips 1 and 6, the center of part F, the center of the heat sink base, and the center of heat sink fins 1–3 are plotted.

The prediction results for dataset T7 are shown in Fig. 10. The created ROM was able to predict temperature with high accuracy even when the input varied with time. We also confirmed that errors did not accumulate.

Fig. 10  Prediction results for dataset T7. Chips 1 and 6, the center of part F, the center of the heat sink base, and the center of heat sink fin 3 are plotted.

Figure 11 shows the results of predicting dataset T3 by the ROM created without the data preprocessing method described in Sec. 3.2.1, that is, with only the short-term component Ṫ. Machine learning considering only the short-term component Ṫ did not provide a highly accurate prediction for the system studied in this research. Consequently, we confirmed the importance of applying the long-term component ΔTFI.
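A minimal sketch of the two kinds of regression target compared here, assuming the long-term component is a fixed-interval temperature difference (the function names and exact construction are our own; Sec. 3.2.1 defines the actual method):

```python
import numpy as np

def targets(t, T, fi=10.0):
    """Build two targets from a nonuniformly sampled temperature series:
    a short-term component dT/dt (finite differences on the raw grid) and
    a long-term component, the temperature change over a fixed interval
    `fi` (piecewise-linear interpolation; clamped at the series end)."""
    dTdt = np.gradient(T, t)            # short-term component
    T_fi = np.interp(t + fi, t, T)      # value `fi` seconds later
    dT_fi = T_fi - T                    # long-term component
    return dTdt, dT_fi

t = np.array([0.0, 0.5, 1.2, 2.0, 3.5, 5.0, 7.0, 10.0, 14.0, 20.0])
T = 300.0 + 10.0 * (1.0 - np.exp(-t / 5.0))   # first-order step response
dTdt, dT_fi = targets(t, T, fi=10.0)
```

The short-term component alone is dominated by the fast initial transient, whereas the fixed-interval difference retains information about slow, correlated drift, which is why fitting both matters for long-horizon prediction.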

Fig. 11  Results of predicting dataset T3 using ROMs created without using the proposed data preprocessing method described in Sec. 3.2.1

4.2 Power Modules Mounted on a Natural Air Cooling Heat Sink

4.2.1 Thermofluid Analysis.

Figure 12 shows the target of the analysis. Two power modules aligned in the direction of gravity are cooled by a heat sink. The heat-generating components were 24 chips. Tables 7 and 8 show the dimensions and physical properties of the analysis target. The chips and parts A to F are shown in Table 1. Table 9 shows the thermofluid analysis conditions for the learning and validation data, and Table 10 shows the analysis conditions for the test data.

Fig. 12  Analysis target: power modules with natural air cooling
Table 7

Dimensions and physical properties of the power module. Chips and parts A to F are shown in Table 1

                   Thickness (mm)   Thermal conductivity (W/m/K)   Specific heat capacity (J/kg/K)   Density (kg/m3)   Emissivity
Case               1.0              0.27                           1500                              1310              0.8
Insulating resin   0.3011001850
Table 8

Dimensions and physical properties of the heat sink

Height (mm)   Length (mm)   Width (mm)   Fin thickness (mm)   Number of fins
25            160           166          2.0                  21

Base thickness (mm)   Thermal conductivity (W/m/K)   Specific heat capacity (J/kg/K)   Density (kg/m3)
5.0                   225                            880                               2,700
Table 9

Thermofluid analysis conditions for the learning and validation data

Dataset no.                                  L1   L2   L3   L7   L8   L13   L14   L15
Ambient temperature Ta (K)                   5    5    5    45   45   35    35    35
Initial temperature of the parts Tn(0) (K)   5    5    5    45   45   35    35    35
Caloric value (W)
Module 1   Q1,1    1013.7961113.518
           Q1,2    1124.495137.56.0
           Q1,3    1231.89415.17.08.5
           Q1,4    1341.9931.01615.5
           Q1,5    1452.5926.21.24.0
           Q1,6    1563.19112.314.12.2
           Q1,7    1152.001011.14.013.4
           Q1,8    2143.10117.716.22.5
           Q1,9    3133.701216.81.16.8
           Q1,10   4124.201311.26.06.9
           Q1,11   5112.40143.616.213.2
           Q1,12   6104.501512.65.02.1
Module 2   Q2,1    9150.61002.48.013.2
           Q2,2    9142.11101012.615.2
           Q2,3    9132.31207.21.611.1
           Q2,4    9121.313014.85.517.2
           Q2,5    9113.414017.48.010.1
           Q2,6    9101.31503.02.112
           Q2,7    063.61913.82.04.2
           Q2,8    052.0297.713.37.0
           Q2,9    043.7395.51.80.9
           Q2,10   032.7496.115.11.2
           Q2,11   021.7593.013.18.0
           Q2,12   010.6691213.62.5
Table 10

Thermofluid analysis conditions for the test data

Dataset no.                                  T1   T2   T3   T4   T5   T6   T7
Time t (s)                                   0–100   100–200   200–300   300–400   400–500   500–600
Ambient temperature Ta (K)                   20   30   60   25   20   30   20
Initial temperature of the parts Tn(0) (K)   20   30   60   25   20   30   45
Caloric value (W)
Module 1   Q1,1    304.01212200701015020
           Q1,2    09.411000700000
           Q1,3    02.3100007010000
           Q1,4    04.58000700000
           Q1,5    02.670007010000
           Q1,6    07.86800700000
           Q1,7    06.650007010000
           Q1,8    01.641200700000
           Q1,9    07.53000700000
           Q1,10   09.82000700000
           Q1,11   00.81000700000
           Q1,12   07.508150700000
Module 2   Q2,1    09.91212201007020015
           Q2,2    04.51100100710000
           Q2,3    06.5100010070000
           Q2,4    206.8800100710000
           Q2,5    01.270010070000
           Q2,6    00.5680100710000
           Q2,7    06.050010070000
           Q2,8    02.9412010070000
           Q2,9    09.530010070000
           Q2,10   208.520010070000
           Q2,11   05.710010070000
           Q2,12   06.1081510070000

The two numbers in the subscript of the caloric value Q in Tables 9 and 10 denote the module identification number (first) and the chip identification number (second). Datasets L1 to L15 in Table 9 and datasets T1 to T6 in Table 10 are initial value problems with an analysis time of 10,000 s. The number of input parameters is 26, which is more than the number of learning datasets (15). The learning and validation data are in the range of 278–318 K for ambient temperature and 0–18 W for caloric value. The test data are in the range of 293–333 K for ambient temperature and 0–30 W for caloric value, which includes values outside the domain of definition of the learning and validation data.

The thermofluid analysis was performed by the same computational fluid dynamics software (Icepak; ANSYS) as in Sec. 4.1. The number of nodes was several million. Natural convection was modeled using the Boussinesq approximation, and radiation was modeled by calculating the view factors. The time-step Δt was not fixed. Figure 13 shows some of the temperature measurement points of the heat sink, for which time series temperatures were output at the base (THSb), the fin base (TFin1), and the fin tip, at nine locations each, for a total of 27 points. The temperature measurement points of the power modules were the same as in the case of forced air cooling shown in Fig. 8, except for part F and the case. For each power module, the output time series data included the temperature at 51 points: each of the chips and parts B, C, and E at the xy coordinates of the chip centers (12 locations each) and three locations on the case. There were 102 temperature measurement points for the two power modules and 27 points for the heat sink, for a total of 129 points.

Fig. 13  Some of the temperature measurement points of the heat sink

4.2.2 Created Model.

The output is a ROM represented by 129 simultaneous ordinary differential equations, some of which are shown in Eq. (21). The meanings of the subscripts in Eq. (21) are explained in Figs. 8 and 13. In the first subscript of the temperature, “1” denotes module 1, “2” denotes module 2, and “C” denotes the center point between modules 1 and 2
(21)

The number of nonzero coefficients in the ROM we created was 586, i.e., five per equation, which is about 3% of the 21,414 basis function candidates. The basis functions for natural convective heat transfer and radiative heat transfer were properly chosen. It was also properly derived that there are two thermal paths: one through the case via the insulating resin, and the other through the heat sink. As in Sec. 4.1, in a comparison with the MSE of the learning and validation data, the square of the bias was almost the same value, and the variance was approximately 2%.
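The sublibrary idea for this case can be sketched as follows. The functional forms (a laminar Nu ~ (Gr Pr)^(1/4) correlation for natural convection, a fourth-power temperature difference for radiation, and v^0.8 for forced convection) are standard textbook forms used purely for illustration, not the paper's exact candidates:

```python
import numpy as np

def sublibrary(Ti, Tj, v=None):
    """Basis-function candidates for one node pair, grouped by heat
    transfer mode: conduction ~ (Tj - Ti); natural convection
    ~ |Tj - Ti|^(1/4) * (Tj - Ti), the sign-preserving form of a
    Delta-T^(5/4) laminar correlation; radiation ~ (Tj^4 - Ti^4);
    and, when a velocity is given, forced convection ~ v^0.8 * (Tj - Ti)."""
    dT = Tj - Ti
    lib = {
        "conduction": dT,
        "natural_convection": np.abs(dT) ** 0.25 * dT,
        "radiation": Tj**4 - Ti**4,
    }
    if v is not None:
        lib["forced_convection"] = v**0.8 * dT
    return lib

lib = sublibrary(Ti=310.0, Tj=300.0, v=4.0)
```

Because all candidates preserve the sign of the temperature difference, non-negative coefficients suffice to keep the identified heat flows physically directed.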

4.2.3 Results.

As in Sec. 4.1.3, the effectiveness of the proposed method was verified using the created ROM to predict the test data. Table 11 shows the mean error. The mean error for the seven types of test data was 0.6 K. The mean error for the 15 types of learning data was 0.4 K. Furthermore, as mentioned above, the variance was small for the learning and validation data, so we concluded that there was no overfitting.

Table 11

Mean error between the predicted temperature calculated by the ROM and the results of the thermofluid analysis

Dataset no.      T1    T2    T3    T4    T5    T6    T7
Mean error (K)   0.4   0.8   0.9   0.8   0.4   0.5   0.6

The prediction results for dataset T3 are shown in Fig. 14. We plotted five representative nodes out of the 129 predicted temperatures. One of the input parameters, the ambient temperature, was 15 K outside the domain of definition of the learning data. The phenomenon that the temperature T1,Chip12 of chip 12 in module 1, whose caloric value is 0 W, increases due to the heat generated by the surrounding chips was predicted with high accuracy. Furthermore, the physical relationship between the nodes was properly captured: the temperature T1,Chip1 of chip 1 in module 1, which exchanges heat with air already warmed by natural convection, is approximately 2.1 K higher than the temperature T2,Chip1 of chip 1 in module 2.

Fig. 14  Prediction results for dataset T3. Chips 1 and 12 in module 1, chip 1 in module 2, the center of the case surface, and the fin tip in the center of the heat sink are plotted.

The prediction results for dataset T7 are shown in Fig. 15. The prediction was highly accurate even when the initial temperature of the parts differed from the ambient temperature.

Fig. 15  Prediction results for dataset T7. Chip 7 in module 1, chip 1 in module 2, and the fin tip in the center of the heat sink are plotted.

5 Discussion

We proposed a sparse modeling method for creating an interpretable ROM from the time series data of multiple temperature nodes and the factors influencing temperature, and we showed that this method achieves a balance between accuracy and computational efficiency. The proposed sparse modeling method comprises three novel machine-learning methods. The most important of these is the sequential thresholded non-negative least-squares method based on term size criteria, described in Sec. 3.2.2; it reduces the effects of multicollinearity and enables the creation of robust ROMs. However, the other two methods are also important. The data preprocessing method for considering the correlation between errors, described in Sec. 3.2.1, enables accurate prediction very many steps ahead, while the solution space search method based on classification of similar basis functions, described in Sec. 3.2.3, realizes physically valid combinations of physical models. Before the effectiveness verification described in this paper, various test trials on a simple problem led to the conclusion that all three methods are necessary. Because the proposed method can also model nonlinear time-variant systems, it can be applied to various products and systems. For example, it could enable the following applications.

  1. A surrogate model for data assimilation.

  2. Product-lifetime estimation based on long-term temperature prediction results assuming actual loads. Design is expected to be advanced by combining this with the surrogate model for life prediction by Hirohata et al. [55,56].

  3. Assistance for human understanding of various phenomena. The proposed method is a batch learning framework, and thus, it is recommended to remove noise from original time-series data by using general smoothing methods and to create ROMs using those data.

  4. Rapid understanding of the influence of design parameters, even during a detailed design process.

  5. A model for condition monitoring to realize condition-based maintenance. The proposed method is also compatible with causal inference, and thus the relationship between real sensor signals and virtual simulation parameters can be efficiently modeled. It is also possible to combine this with the method of Suzuki et al. [57], in which parameters that degrade over time are modeled and the ROM is used as a system model for data assimilation.
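The sequential thresholded non-negative least-squares method named above can be sketched as follows (our own simplified implementation, not the authors' code): coefficients are constrained to be non-negative, and after each fit, terms whose size falls below the threshold λ are removed and the fit is repeated.

```python
import numpy as np

def nnls_cd(A, b, sweeps=500):
    """Non-negative least squares by cyclic coordinate descent
    (a simple stand-in for a library NNLS solver)."""
    AtA, Atb = A.T @ A, A.T @ b
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for i in range(x.size):
            r = Atb[i] - AtA[i] @ x + AtA[i, i] * x[i]
            x[i] = max(0.0, r / AtA[i, i])
    return x

def stnls(Theta, dT, lam, iters=10):
    """Sequential thresholded non-negative least squares. The 'term size'
    is approximated here as the coefficient times the norm of its
    regressor column, so terms are pruned by their contribution to the
    fit rather than by raw coefficient magnitude."""
    scale = np.linalg.norm(Theta, axis=0)
    active = np.ones(Theta.shape[1], dtype=bool)
    xi = np.zeros(Theta.shape[1])
    for _ in range(iters):
        xi[:] = 0.0
        xi[active] = nnls_cd(Theta[:, active], dT)
        active = np.abs(xi) * scale >= lam      # term size criterion
        if not active.any():
            break
    return xi

rng = np.random.default_rng(1)
Theta = rng.uniform(0.0, 1.0, (200, 4))          # basis function candidates
dT = Theta @ np.array([3.0, 0.0, 0.0, 1e-3])     # one large, one tiny term
xi = stnls(Theta, dT, lam=0.5)
```

On this synthetic example the dominant term is recovered and the negligible one is pruned, leaving a single nonzero, non-negative coefficient.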

6 Conclusion

In this paper, we proposed a sparse modeling method for automatically creating a thermal model for temperature prediction in the form of simultaneous ordinary differential equations from a very small number of time series data with a nonconstant time-step. The form of the thermal model is constrained by the physical models, and the model parameters are efficiently estimated using three novel machine-learning methods. The method shows promise for wide application in fields where the concept of equivalent circuits can be applied. Furthermore, it might also be applicable to various other products and systems because the proposed method can model nonlinear time-variant systems. The effectiveness of the proposed method was verified using time-series data obtained by thermofluid analysis of a power module mounted on a comb-shaped heat sink. Two types of cooling systems were targeted: forced air cooling with variable thermal contact resistance and natural air cooling. The computational efficiency of the created ROM was very high, and unknown data, including extrapolation, were predicted with a mean error of less than 1 K.

Data Availability Statement

The authors attest that all data for this study are included in the paper.

Nomenclature

C = heat capacity
FI = fixed interval
Gr = Grashof number
h = heat transfer coefficient
L = number of datasets
Ml = number of data in the dataset l
N = number of temperature nodes
Nu = Nusselt number
P = number of basis function candidates
Pr = Prandtl number
Q = caloric value
R = thermal resistance
Rc = thermal contact resistance
Re = Reynolds number
S = surface area
T = temperature
tl,j = time sample number j in the dataset l
Ta = ambient temperature
Tn = temperature at the node n
Tn(0) = initial temperature at the node n
v = velocity
X = factors that influence the temperature
γ = correction factor for normalization
Δt = time step
θ = element of the library
Θ(T,X) = library consisting of basis function candidates
ξ = element of the sparse vector of coefficients
Ξ = sparse vector of coefficients
Ω = variance–covariance matrix of the error

References

1. Brunton, S. L., Noack, B. R., and Koumoutsakos, P., 2020, “Machine Learning for Fluid Mechanics,” Annu. Rev. Fluid Mech., 52(1), pp. 477–508. 10.1146/annurev-fluid-010719-060214
2. Rasheed, A., San, O., and Kvamsdal, T., 2020, “Digital Twin: Values, Challenges and Enablers From a Modeling Perspective,” IEEE Access, 8, pp. 21980–22012. 10.1109/ACCESS.2020.2970143
3. Hughes, M. T., Kini, G., and Garimella, S., 2021, “Status, Challenges, and Potential for Machine Learning in Understanding and Applying Heat Transfer Phenomena,” ASME J. Heat Mass Transfer-Trans. ASME, 143(12), p. 120802. 10.1115/1.4052510
4. Raissi, M., Perdikaris, P., and Karniadakis, G. E., 2019, “Physics-Informed Neural Networks: A Deep Learning Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations,” J. Comput. Phys., 378, pp. 686–707. 10.1016/j.jcp.2018.10.045
5. Pang, G., Lu, L., and Karniadakis, G. E., 2019, “fPINNs: Fractional Physics-Informed Neural Networks,” SIAM J. Sci. Comput., 41(4), pp. A2603–A2626. 10.1137/18M1229845
6. Kharazmi, E., Zhang, Z., and Karniadakis, G. E., 2019, “Variational Physics-Informed Neural Networks for Solving Partial Differential Equations,” arXiv preprint arXiv:1912.00873.
7. Jagtap, A. D., Kharazmi, E., and Karniadakis, G. E., 2020, “Conservative Physics-Informed Neural Networks on Discrete Domains for Conservation Laws: Applications to Forward and Inverse Problems,” Comput. Methods Appl. Mech. Eng., 365, p. 113028. 10.1016/j.cma.2020.113028
8. Lu, L., Meng, X., Mao, Z., and Karniadakis, G. E., 2021, “DeepXDE: A Deep Learning Library for Solving Differential Equations,” SIAM Rev., 63(1), pp. 208–228. 10.1137/19M1274067
9. Masi, F., Stefanou, I., Vannucci, P., and Maffi-Berthier, V., 2021, “Thermodynamics-Based Artificial Neural Networks for Constitutive Modeling,” J. Mech. Phys. Solids, 147, p. 104277. 10.1016/j.jmps.2020.104277
10. Zobeiry, N., and Humfeld, K. D., 2021, “A Physics-Informed Machine Learning Approach for Solving Heat Transfer Equation in Advanced Manufacturing and Engineering Applications,” Eng. Appl. Artif. Intell., 101, p. 104232. 10.1016/j.engappai.2021.104232
11. Greydanus, S., Dzamba, M., and Yosinski, J., 2019, “Hamiltonian Neural Networks,” Adv. Neural Inf. Process. Syst., 32, pp. 15379–15389.
12. Lutter, M., Ritter, C., and Peters, J., 2019, “Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning,” arXiv preprint arXiv:1907.04490.
13. Cranmer, M., Greydanus, S., Hoyer, S., Battaglia, P., Spergel, D., and Ho, S., 2020, “Lagrangian Neural Networks,” arXiv preprint arXiv:2003.04630.
14. Kim, B., Azevedo, V. C., Thuerey, N., Kim, T., Gross, M., and Solenthaler, B., 2019, “Deep Fluids: A Generative Network for Parameterized Fluid Simulations,” Comput. Graphics Forum, 38(2), pp. 59–70. 10.1111/cgf.13619
15. Zhu, Y., Zabaras, N., Koutsourelakis, P. S., and Perdikaris, P., 2019, “Physics-Constrained Deep Learning for High-Dimensional Surrogate Modeling and Uncertainty Quantification Without Labeled Data,” J. Comput. Phys., 394, pp. 56–81. 10.1016/j.jcp.2019.05.024
16. Sanchez-Gonzalez, A., Bapst, V., Cranmer, K., and Battaglia, P., 2019, “Hamiltonian Graph Networks With ODE Integrators,” arXiv preprint arXiv:1909.12790.
17. Sanchez-Gonzalez, A., Godwin, J., Pfaff, T., Ying, R., Leskovec, J., and Battaglia, P., 2020, “Learning to Simulate Complex Physics With Graph Networks,” International Conference on Machine Learning, July 13–18, pp. 8459–8468. http://proceedings.mlr.press/v119/sanchezgonzalez20a/sanchez-gonzalez20a.pdf
18. Pfaff, T., Fortunato, M., Sanchez-Gonzalez, A., and Battaglia, P. W., 2020, “Learning Mesh-Based Simulation With Graph Networks,” arXiv preprint arXiv:2010.03409.
19. Horie, M., Morita, N., Hishinuma, T., Ihara, Y., and Mitsume, N., 2020, “Isometric Transformation Invariant and Equivariant Graph Convolutional Networks,” arXiv preprint arXiv:2005.06316.
20. Horie, M., and Mitsume, N., 2022, “Physics-Embedded Neural Networks: Equivariant Graph Neural PDE Solvers With Mixed Boundary Conditions,” Adv. Neural Inf. Process. Syst., 35, pp. 23218–23229.
21. Chen, R. T., Rubanova, Y., Bettencourt, J., and Duvenaud, D. K., 2018, “Neural Ordinary Differential Equations,” Adv. Neural Inf. Process. Syst., 31, pp. 6571–6583.
22. Long, Z., Lu, Y., Ma, X., and Dong, B., 2018, “PDE-Net: Learning PDEs From Data,” International Conference on Machine Learning, Stockholm, Sweden, July 10–15, pp. 3208–3216.
23. Long, Z., Lu, Y., and Dong, B., 2019, “PDE-Net 2.0: Learning PDEs From Data With a Numeric-Symbolic Hybrid Deep Network,” J. Comput. Phys., 399, p. 108925. 10.1016/j.jcp.2019.108925
24. Kirchgässner, W., Wallscheid, O., and Böcker, J., 2023, “Thermal Neural Networks: Lumped-Parameter Thermal Modeling With State-Space Machine Learning,” Eng. Appl. Artif. Intell., 117, p. 105537. 10.1016/j.engappai.2022.105537
25. He, K., Zhang, X., Ren, S., and Sun, J., 2016, “Deep Residual Learning for Image Recognition,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, June 27–30, pp. 770–778. 10.1109/CVPR.2016.90
26. Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G., 2009, “The Graph Neural Network Model,” IEEE Trans. Neural Networks, 20(1), pp. 61–80. 10.1109/TNN.2008.2005605
27. Battaglia, P., Pascanu, R., Lai, M., and Jimenez Rezende, D., 2016, “Interaction Networks for Learning About Objects, Relations and Physics,” Adv. Neural Inf. Process. Syst., 29, pp. 4509–4517.
28. Battaglia, P. W., Hamrick, J. B., Bapst, V., Sanchez-Gonzalez, A., Zambaldi, V., Malinowski, M., et al., 2018, “Relational Inductive Biases, Deep Learning, and Graph Networks,” arXiv preprint arXiv:1806.01261.
29. Gao, H., Zahr, M. J., and Wang, J. X., 2022, “Physics-Informed Graph Neural Galerkin Networks: A Unified Framework for Solving PDE-Governed Forward and Inverse Problems,” Comput. Methods Appl. Mech. Eng., 390, p. 114502. 10.1016/j.cma.2021.114502
30. Brunton, S. L., Proctor, J. L., and Kutz, J. N., 2016, “Discovering Governing Equations From Data by Sparse Identification of Nonlinear Dynamical Systems,” Proc. Natl. Acad. Sci., 113(15), pp. 3932–3937. 10.1073/pnas.1517384113
31. Schmidt, M., and Lipson, H., 2009, “Distilling Free-Form Natural Laws From Experimental Data,” Science, 324(5923), pp. 81–85. 10.1126/science.1165893
32. Bomarito, G. F., Townsend, T. S., Stewart, K. M., Esham, K. V., Emery, J. M., and Hochhalter, J. D., 2021, “Development of Interpretable, Data-Driven Plasticity Models With Symbolic Regression,” Comput. Struct., 252, p. 106557. 10.1016/j.compstruc.2021.106557
33. Wang, Y., Wagner, N., and Rondinelli, J. M., 2019, “Symbolic Regression in Materials Science
,”
MRS Commun.
,
9
(
3
), pp.
793
805
.10.1557/mrc.2019.85
34.
Arnaldo
,
I.
,
Krawiec
,
K.
, and
O'Reilly
,
U. M.
,
2014
, “
Multiple Regression Genetic Programming
,”
Proceedings of the Annual Conference on Genetic and Evolutionary Computation
, Vancouver, BC, Canada, July 12–16, pp.
879
886
.
35.
Castelli
,
M.
,
Silva
,
S.
, and
Vanneschi
,
L.
,
2015
, “
A C++ Framework for Geometric Semantic Genetic Programming
,”
Genet. Program. Evolvable Mach.
,
16
(
1
), pp.
73
81
.10.1007/s10710-014-9218-0
36.
Rad
,
H. I.
,
Feng
,
J.
, and
Iba
,
H.
,
2018
, “
GP-RVM: Genetic Programing-Based Symbolic Regression Using Relevance Vector Machine
,” arXiv preprint arXiv:1806.02502.
37.
Udrescu
,
S. M.
, and
Tegmark
,
M.
,
2020
, “
AI Feynman: A Physics-Inspired Method for Symbolic Regression
,”
Sci. Adv.
,
6
(
16
), p.
eaay2631
.10.1126/sciadv.aay2631
38.
Udrescu
,
S. M.
,
Tan
,
A.
,
Feng
,
J.
,
Neto
,
O.
,
Wu
,
T.
, and
Tegmark
,
M.
,
2020
, “
AI Feynman 2.0: Pareto-Optimal Symbolic Regression Exploiting Graph Modularity
,”
Adv. Neural Inf. Process. Syst.
,
33
, pp.
4860
4871
.
39.
Petersen
,
B. K.
,
Landajuela
,
M.
,
Mundhenk
,
T. N.
,
Santiago
,
C. P.
,
Kim
,
S. K.
, and
Kim
,
J. T.
,
2019
, “
Deep Symbolic Regression: Recovering Mathematical Expressions From Data Via Risk-Seeking Policy Gradients
,” arXiv preprint arXiv:1912.04871.
40.
Wang
,
M.
,
Chen
,
C.
, and
Liu
,
W.
,
2022
, “
Establish Algebraic Data-Driven Constitutive Models for Elastic Solids With a Tensorial Sparse Symbolic Regression Method and a Hybrid Feature Selection Technique
,”
J. Mech. Phys. Solids
,
159
, p.
104742
.10.1016/j.jmps.2021.104742
41.
Kaptanoglu
,
A. A.
,
Zhang
,
L.
,
Nicolaou
,
Z. G.
,
Fasel
,
U.
, and
Brunton
,
S. L.
,
2023
, “
Benchmarking Sparse System Identification With Low-Dimensional Chaos
,”
Nonlinear Dyn.
,
111
(
14
), pp.
13143
13164
.10.1007/s11071-023-08525-4
42.
JEDEC STANDARD
,
2010
, “
Transient Dual Interface Test Method for the Measurement of the Thermal Resistance Junction-to-Case of Semiconductor Devices With Heat Flow Through a Single Path(JESD51-14)
,”
JEDEC
, Arlington, VA, Standard No. JESD51-14.
43.
Hu
,
X.
,
Asgari
,
S.
,
Yavuz
,
I.
,
Stanton
,
S.
,
Hsu
,
C. C.
,
Shi
,
Z.
,
Wang
,
B.
, and
Chu
,
H. K.
,
2014
, “
A Transient Reduced Order Model for Battery Thermal Management Based on Singular Value Decomposition
,” IEEE Energy Conversion Congress and Exposition (
ECCE
), Pittsburgh, PA, Sept. 14–18, pp.
3971
3976
.10.1109/ECCE.2014.6953941
44.
Rudy
,
S. H.
,
Brunton
,
S. L.
,
Proctor
,
J. L.
, and
Kutz
,
J. N.
,
2017
, “
Data-Driven Discovery of Partial Differential Equations
,”
Sci. Adv.
,
3
(
4
), p.
e1602614
.10.1126/sciadv.1602614
45.
Kaiser
,
E.
,
Kutz
,
J. N.
, and
Brunton
,
S. L.
,
2018
, “
Sparse Identification of Nonlinear Dynamics for Model Predictive Control in the Low-Data Limit
,”
Proc. R. Soc. A
,
474
(
2219
), p.
20180335
.10.1098/rspa.2018.0335
46.
Loiseau
,
J. C.
, and
Brunton
,
S. L.
,
2018
, “
Constrained Sparse Galerkin Regression
,”
J. Fluid Mech.
,
838
, pp.
42
67
.10.1017/jfm.2017.823
47.
Fukami
,
K.
,
Murata
,
T.
,
Zhang
,
K.
, and
Fukagata
,
K.
,
2021
, “
Sparse Identification of Nonlinear Dynamics With Low-Dimensionalized Flow Representations
,”
J. Fluid Mech.
,
926
, p.
A10
.10.1017/jfm.2021.697
48.
Mangan
,
N. M.
,
Kutz
,
J. N.
,
Brunton
,
S. L.
, and
Proctor
,
J. L.
,
2017
, “
Model Selection for Dynamical Systems Via Sparse Regression and Information Criteria
,”
Proc. R. Soc. A Math., Phys. Eng. Sci.
,
473
(
2204
), p.
20170009
.
49.
Cortiella
,
A.
,
Park
,
K.-C.
, and
Doostan
,
A.
,
2021
, “
Sparse Identification of Nonlinear Dynamical Systems Via Reweighted ℓ1-Regularized Least Squares
,”
Comput. Methods Appl. Mech. Eng.
,
376
, p.
113620
.10.1016/j.cma.2020.113620
50.
Fasel
,
U
, et al
2022
, “
Ensemble-SINDy: Robust Sparse Model Discovery in the Low-Data, High-Noise Limit, With Active Learning and Control
,”
Proc. R. Soc. A
,
478
(
2260)
, p.
20210904
.
51.
Schaeffer
,
H.
, and
McCalla
,
S. G.
,
2017
, “
Sparse Model Selection Via Integral Terms
,”
Phys. Rev. E
,
96
(
2
), p.
023302
.10.1103/PhysRevE.96.023302
52.
The Japan Society of Mechanical Engineers
,
2009
,
JSME Data Book: Heat Transfer
,
5
th ed.,
Maruzen Publishing
, Tokyo, Japan.
53.
Lawson
,
C. L.
, and
Hanson
,
R. J.
,
1995
, “
Solving Least Squares Problems
,” Classics in Applied Mathematics, Vol. 15,
SIAM
, Philadelphia, PA.
54.
Yuan
,
M.
, and
Lin
,
Y.
,
2006
, “
Model Selection and Estimation in Regression With Grouped Variables
,”
J. R. Stat. Soc. Ser. B (Stat. Methodol.)
,
68
(
1
), pp.
49
67
.10.1111/j.1467-9868.2005.00532.x
55.
Hirohata
,
K.
,
Hisano
,
K.
, and
Mukai
,
M.
,
2011
, “
Health-Monitoring Method of Note PC for Cooling Performance Degradation and Load Assessment
,”
Microelectron. Reliab.
,
51
(
2
), pp.
255
262
.10.1016/j.microrel.2010.09.010
56.
Hirohata
,
K.
,
Hisakuni
,
Y.
, and
Omori
,
T.
,
2018
, “
Prognostic Health Monitoring Method for Fatigue Failure of Solder Joints on Printed Circuit Boards Based on a Canary Circuit
,”
J. Nondestr. Eval. Diagn. Progn. Eng. Syst.
,
1
(
3
), p.
031004
.
57.
Suzuki
,
T.
, and
Takamatsu
,
T.
,
2017
, “
Diagnosis of Computer Cooling Performance Based on Multipoint Temperature Measurements
,”
J. Therm. Sci. Technol.
,
12
(
1
), p.
JTST0012
.10.1299/jtst.2017jtst0012