Abstract
In the framework of solid mechanics, the task of deriving material parameters from experimental data has recently reemerged with the progress in full-field measurement capabilities and the renewed advances of machine learning. In this context, new methods such as the virtual fields method and physics-informed neural networks have been developed as alternatives to the already established least-squares and finite element-based approaches. Moreover, model discovery problems are emerging and can be addressed in a parameter estimation framework. These developments call for a new unified perspective, which is able to cover both traditional parameter estimation methods and novel approaches in which the state variables or the model structure itself are inferred as well. Adopting concepts discussed in the inverse problems community, we distinguish between all-at-once and reduced approaches. With this general framework, we are able to structure a large portion of the literature on parameter estimation in computational mechanics—and we can identify combinations that have not yet been addressed, two of which are proposed in this paper. We also discuss statistical approaches to quantify the uncertainty related to the estimated parameters, and we propose a novel two-step procedure for identification of complex material models based on both frequentist and Bayesian principles. Finally, we illustrate and compare several of the aforementioned methods with mechanical benchmarks based on synthetic and experimental data.
1 Introduction
Within the framework of continuum solid mechanics, initial boundary value problems involving partial differential equations (PDEs) and appropriate initial/boundary conditions are solved to determine the fields of interest, including mechanical, thermal, magnetic, and/or electric fields. To be solvable, the PDEs need to be completed by closure models known as material models (or constitutive equations) that describe the relationships between kinematic and kinetic quantities (and possibly additional variables), i.e., strain and stress, or temperature gradient and heat flux, etc., for the materials under consideration. Constitutive equations are often of algebraic, ordinary differential (ODE), or differential-algebraic (DAE) form—or they can even be defined by PDEs, whereby the space-discretized form of the PDEs leads to similar mathematical structures as the aforementioned forms. Models of hyperelasticity lead to algebraic equations, whereas models of viscoelasticity or viscoplasticity often imply ODEs—although they can also be written as integro-differential equations. Rate-independent plasticity, which is based on a yield condition, often delivers a DAE-system, but, in the variational context, can also be formulated as a differential inclusion problem. Nonlocal models such as gradient-based elasticity, plasticity, or damage are described by PDEs. Once again, in the space discretized form the main structure of the equations resulting from PDEs is similar to that stemming from ODEs or DAEs; hence, for simplicity, in this paper, we limit ourselves to local constitutive laws.
Since constitutive models depend on various parameters, one of the fundamental issues in the theory of materials is the identification (or calibration) of these parameters on the basis of given experimental data. A number of identification approaches have been proposed, among which those based on least squares (LS) and the finite element method (FEM) have been especially prominent due to their high flexibility. Further methods have been proposed in the more recent past, including so-called full-field approaches such as the virtual fields method (VFM), and stochastic and machine learning approaches, including those based on physics-informed neural networks (PINNs). A detailed compilation of the related references is provided in the following sections.
Even more recently, approaches that bring identification one step further have been proposed and are drawing significant attention. They advocate a new paradigm, denoted as (material) model discovery, in which the structure of the material model itself is estimated at the same time as its unknown parameters, again based on the given experimental data. A possible strategy consists of forming a library of candidate basis functions, which are used in a model to approximate unknown functions describing the material behavior. In this way, the model discovery problem is reformulated as a parameter estimation problem, but with a typically much higher dimension of the parameter space. It is also possible to apply VFM, stochastic methods, and machine learning to model discovery.
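To make the library idea concrete, the following sketch discovers a hypothetical one-dimensional stress–strain law from synthetic data by sequential thresholded least squares over a library of monomials; all data, basis functions, and the threshold value are assumptions for illustration, not taken from the references above.

```python
import numpy as np

# Synthetic uniaxial data generated from a "true" law sigma = 2.0*eps + 0.5*eps**3
eps = np.linspace(0.0, 1.0, 50)
sigma = 2.0 * eps + 0.5 * eps**3

# Library of candidate basis functions of the strain (monomials here)
library = np.column_stack([eps, eps**2, eps**3, eps**4])

# Sequential thresholded least squares: fit, prune small coefficients, refit
theta = np.linalg.lstsq(library, sigma, rcond=None)[0]
for _ in range(10):
    small = np.abs(theta) < 0.1        # sparsity threshold (an assumption)
    theta[small] = 0.0
    theta[~small] = np.linalg.lstsq(library[:, ~small], sigma, rcond=None)[0]

print(theta)  # only the eps and eps**3 coefficients should survive
```

The result is a sparse coefficient vector: model discovery is recast as parameter estimation in a high-dimensional, but mostly inactive, parameter space.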
These developments call for a new unified perspective that is able to cover both traditional and novel parameter estimation and model discovery approaches. The purpose of this paper is, on the one hand, to compile an overview of the available methods along with their formalization and, on the other hand, to provide a unified perspective. In contrast to existing attempts to unify parameter estimation methods in mechanics [1,2], our perspective is based on the all-at-once approach [3], which so far has been mainly applied in the inverse problems community. Adopting concepts discussed in this community, we distinguish between all-at-once and reduced approaches. The all-at-once approach employs a weighted combination of a model-based and a data-based objective function, and we demonstrate that both the reduced approach and the class of VFMs can be recovered as limit cases. With this general framework, we are able to structure a large portion of the literature on parameter estimation in computational mechanics, and we can identify combinations and settings that have not yet been addressed. Moreover, by including stochastic methods, the quality of the identified parameters can be addressed as well. Among the various available methods for parameter inference, we compare Bayesian and frequentist approaches and derive new results for two-step identification methods, which are needed to calibrate complex models.
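The interplay between the two approaches can be illustrated with a deliberately simple toy problem (a scalar "stiffness" k with state equation k u = f for two load cases and noisy observations of u; all numbers hypothetical): the all-at-once formulation minimizes a weighted sum of model and data residuals over parameter and state simultaneously, and for a large model weight it approaches the reduced approach, in which the state is eliminated.

```python
import numpy as np
from scipy.optimize import least_squares

f = np.array([1.0, 2.0])            # two load cases
u_obs = np.array([0.52, 0.95])      # noisy displacement observations

def aao_residuals(x, w):
    """All-at-once residual: weighted model part plus data part."""
    k, u = x[0], x[1:]
    return np.concatenate([np.sqrt(w) * (k * u - f), u - u_obs])

# Increasing the model weight w enforces the state equation ever more
# strictly; in the limit we recover the reduced approach below
for w in (1.0, 1e2, 1e6):
    sol = least_squares(aao_residuals, x0=[1.0, 0.5, 1.0], args=(w,))
    print(f"w={w:g}: k={sol.x[0]:.4f}")

# Reduced approach: eliminate the state via u(k) = f/k and minimize
# only the data misfit
red = least_squares(lambda k: f / k[0] - u_obs, x0=[1.0])
print("reduced: k =", red.x[0])
```

The weighting thus interpolates between trusting the model (large w, reduced limit) and trusting the data, which is exactly the trade-off formalized in Sec. 4.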
The remainder of this paper is structured as follows. In Sec. 2, we start with an overview of the basic equations of the initial boundary value problems, as well as the structure of the material models. We then discuss a few aspects of the experimental possibilities, followed by a brief review of parameter identification in elasticity, viscoelasticity, and viscoplasticity.
Section 3 treats the numerical approaches to identify the material parameters. We formulate the LS method based on the FEM, the equilibrium gap, and the VFM, as well as more recent methods such as PINNs and the combination of model selection and parameter identification (i.e., model discovery).
As stated earlier, one of the major aims of this paper is to formulate a unified treatment for most of the available parameter identification procedures, which is the subject of Sec. 4. Section 5 focuses on the quality of the identified material parameters—which is addressed in a statistical setting, including the topics of identifiability and uncertainty quantification. In Sec. 6, examples are provided to compare the performance of the various schemes for selected applications.
2 Parameter Identification in Solid Mechanics
This section addresses the basic problem of parameter identification. First, the fundamental equations of solid mechanics are summarized. Then, we provide an overview of the basics of experimental observations and constitutive modeling using elastic (isotropic or anisotropic) and inelastic material models. The notation in use is defined in the following manner: geometric vectors and second-order tensors are symbolized by bold-faced letters, whereas matrices and column vectors are introduced by bold-faced italic letters.
2.1 Fundamental Equations.
In this paper, we do not treat parameter estimation in dynamical systems, where properties of wave propagation in solid media are used for identification purposes; see Ref. [4], for example.
where defines the divergence operator with respect to . Moreover, is the density in the reference configuration, and symbolizes the first Piola–Kirchhoff stress tensor, where represents the deformation gradient. Here, denotes the gradient operator with respect to .
The balance of angular momentum implies the symmetry of the Cauchy stress tensor. Thus, six independent stress components have to be determined, whereas only three scalar PDEs are available. Hence, the system of Eqs. (2), or equivalently Eq. (3), is closed by formulating
the kinematics, i.e., a relation between the displacement vector (or the motion ) and the deformation gradient , and
a constitutive model describing the stress state in dependence of the deformation and possibly of additional, so-called internal variables describing the process history.
For isotropic materials, is an isotropic tensor function [6–8]. In the case of anisotropy, special invariants define the stress state, see Ref. [7]. For a so-called hyperelastic constitutive model, a strain energy density function defines the stress state as . To limit the scope of the presentation, only the concise notation in Eq. (4) will be used here. Of course, other representations are possible, e.g., it is also common to express an elastic material law as with the left Cauchy–Green tensor ; however, this choice does not essentially influence the following considerations of material parameter identification.
if the last equation defines the yield condition, see Ref. [11]. In this case, Eq. (5) represents a DAE-system, see Refs. [11] and [12]. Moreover, in the field of yield-function-based, rate-independent modeling, the right-hand side contains case distinctions (loading-unloading or Karush–Kuhn–Tucker conditions) to be able to reproduce a nonsmooth material behavior. It should also be noted here that we are using variables defined in the reference configuration. When using internal variables with respect to the current configuration or models based on the multiplicative decomposition of the deformation gradient, if the evolution equations of the internal variables depend on relative derivatives with respect to some intermediate configuration, special attention must be paid to time integration in order to obtain objective quantities. To avoid a cumbersome discussion, we assume here that the relative derivatives can be expressed by the material time derivative of a tensorial quantity relative to the reference configuration with a transformation usually referred to as a pull-back operation. This implies a consistent notation in view of the interpretation of considering DAE-systems later on, for details see Refs. [13] and [14].
Different approaches in constitutive modeling derive systems of the form (5) from two scalar potentials, namely, the strain energy density and the dissipation rate potential, so that (under certain conditions on the properties of such potentials) thermodynamic consistency is automatically fulfilled, see Refs. [15–18], for example. Alternative approaches evaluate only the strain energy density function within the Clausius–Duhem inequality and motivate the evolution Eq. (5)2 in a different manner [7].
Of course, there are also other constitutive models that do not fit into the aforementioned structure—such as models described by integro-differential equations, fractional derivatives [19,20], or the endochronic plasticity formulation in Ref. [21]. Alternatively to the original formulation in Ref. [21], the stresses can also be given by a rate equation using a particular kernel function of the endochronic formulation, so that the integral formulation can be transformed into a rate formulation [22]. Models described by variational formulations, see, for example, Ref. [23], do not fit directly into the explained structure as well. However, very similar equations are obtained after their discretization, so that the schemes discussed later on are directly transferable.
is the given traction vector and the prescribed displacement, whereas and are the known initial conditions. Mixed boundary conditions, where , are allowed as well. Consistent initial conditions imply that Eq. (7)1 is also fulfilled for the initial conditions. Formulation (7) covers a large number of models.
In the following, we discuss special cases of material models and issues when determining material parameters from experimental data. We start with experimental possibilities and outline measurement techniques in Sec. 2.2, while parameter identification for different classes of constitutive models is discussed in Sec. 2.3.
2.2 Experimental Observations.
The constitutive models in Eqs. (7) contain unknown parameters, which have to be determined in such a manner that they reflect experimental observations. Since the models contain stresses and strains, the question of how to deduce these quantities from experimental measurements is a crucial one. The principal difficulty is to find experiments where the stress state under given (i.e., controlled) or measured external loads is known. Thus, before addressing material parameter identification, we briefly discuss the possible options for mechanical testing and the related detectable experimental data.
First, we consider experiments where the stress tensor is uniform in a certain area of the specimen, and we start with the uniaxial tension/compression test as the simplest example. In this case, only the remaining equations of system (7) have to be solved, since Eq. (7)1 is automatically fulfilled (provided that is negligible and the material properties are homogeneously distributed). In the data analysis, it is necessary to consider the specific structure of the tensorial quantities due to the assumed boundary conditions, such as the absence of stress components in the lateral direction for the present example of a uniaxial test. It should also be noted that, at any time, testing machines such as the ones depicted in Fig. 1 can either control the displacement and measure the resulting force—or vice versa. Accordingly, in order to obtain information on strains and stresses, it is necessary to know the relationship between the displacements and the strains as well as the forces and the stresses. For a uniaxial test, both relationships are known, but care must be taken to account for rigid body displacements (which occur in a specimen due to the finite machine stiffness) as well as to perform a sufficiently accurate measurement of the cross-sectional area of the specimen. A further experimental choice could be to directly control the strain as a function of time rather than the displacement. This choice would require additional technical equipment and call for a synchronization of the strain measurement device with the testing machine. Of course, a quasi-static process can be divided into two (or more) stages with different controls, e.g., with displacement control during loading and force control during unloading.

Fig. 1 Top: universal testing machine for tension, compression, and torsion tests; bottom: biaxial testing machine
Torsion tests on thin-walled tubes are an attractive alternative to tensile tests, since a uniform stress state over the wall thickness can be assumed if the wall thickness is small in comparison to the tube radius. During testing, either the torsion angle or the torsional moment is controlled. Once again, in order to compute strains and stresses, we need the relations between the torsion angle applied by the testing machine and the shear strain, as well as between the torsional moment and the shear stress. For the latter, the uniform stress assumption can be exploited.
Three-point or four-point bending tests are another simple test alternative. However, these experiments generate nonuniform stress fields, so that Eq. (7)1 needs to be solved. One option is to directly solve Eq. (7)1 as a local equilibrium equation (viewing the specimen as a three-dimensional solid), and another one is to recast it as a global equilibrium equation (adopting the approximations of beam theory). Respectively, the material parameters are calibrated based on full-field measurements or on discrete deflection information, see Refs. [24] and [25], for example.
For large deformation cases, there exist deformations—called universal deformations—which fulfill the local balance of linear momentum under the assumptions of isotropy and incompressibility, see, e.g., Refs. [8] and [9]. In all other cases, the entire problem (7) has to be solved.
Apart from the experiments we just discussed, there are only very few other alternatives, such as a specific type of shear test, see Refs. [26] and [27] for an overview, or experiments on specific membranes, see Ref. [28], from which the stress state can be extracted, for example. Even a biaxial tensile test, see Fig. 1, does not provide a uniform stress state, see Ref. [29] for a discussion on uniformity and a measure of deviation from it. In general cases, only the forces or torques of the testing device are directly known, and assumptions on the stress state must be provided.
On the other hand, if optical access to the sample surface and the ability to identify surface patterns are given, displacements (and strains) can be determined locally on the surface using, e.g., digital image correlation (DIC, see Refs. [30] and [31]), as shown in Fig. 2. With this technique, out-of-plane strains cannot be obtained without introducing further assumptions. In some situations, strain determination is only possible in an integral sense and in one direction, employing strain gauges, clip-on strain transducers, or optical methods. Fiber–Bragg grating sensors, see Refs. [32–34], are also used to determine strains in one direction inside a component, although care must be taken to ensure that these measurement systems do not influence the local strain state. It is also possible to apply microcomputed tomography (μ-CT) in combination with digital volume correlation (DVC) for volumetric determination of the strains, but the recording times are currently so long that long-term changes in the material, such as creep or relaxation effects, can only be ruled out by estimation [35,36]. Additional information can be obtained by the “single-valued” displacement or angle of the specimen holder.
In some situations, there is no optical access to the specimen, e.g., in triaxial tension/compression testing machines [37], indentation tests, Refs. [38–40], or some metal-forming processes where the specimens are not visible in the forming press. In these cases, only a single displacement component and a single force component, both as functions of time, can be evaluated by interpolating discrete data. In some cases, of course, a combination of full-field information and single-valued data can be extracted and considered in the material parameter identification concept [41].
2.3 Parameter Identification for Various Classes of Constitutive Models.
The parameter identification procedure depends on the particular constitutive model class at hand. Thus, the choice of the constitutive model class, which is based on the observed mechanical response of a specimen, e.g., the presence of plastic deformations or viscous effects, directly influences the approach to identifying the sought parameters. In the following, we provide an overview of material parameter identification for a few important classes of constitutive models—elasticity, viscoelasticity, elastoplasticity, and viscoplasticity.
2.3.1 Linear Elasticity.
For linear elasticity at small strains, we distinguish between isotropic and anisotropic cases.
Isotropy.
so that the identification of tensorial relations reduces to that of scalar-valued functions. The axial stress is given by σ = Eε with Young's modulus E, and the lateral strains by ε_lat = −νε, i.e., Poisson's ratio ν can be determined if the lateral strain is measured. Thus, we obtain a decoupling of the identification of Young's modulus E and Poisson's ratio ν, whereby E can be determined from the stress–strain information in the axial direction and ν can be obtained by measuring the transverse strain. In both cases, furthermore, the material parameters can be determined from linear expressions.
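A minimal numerical sketch of this decoupled, linear identification, using synthetic uniaxial data with hypothetical values (steel-like moduli, artificial measurement noise):

```python
import numpy as np

# Synthetic uniaxial test data: axial strain, axial stress [MPa], and
# lateral strain, each with a little artificial measurement noise
rng = np.random.default_rng(0)
E_true, nu_true = 210e3, 0.3
eps_ax = np.linspace(0.0, 1e-3, 20)
sig_ax = E_true * eps_ax + rng.normal(0.0, 5.0, eps_ax.size)
eps_lat = -nu_true * eps_ax + rng.normal(0.0, 1e-6, eps_ax.size)

# Decoupled linear least squares through the origin:
# E from sigma = E*eps, nu from eps_lat = -nu*eps
E_hat = eps_ax @ sig_ax / (eps_ax @ eps_ax)
nu_hat = -(eps_ax @ eps_lat) / (eps_ax @ eps_ax)
print(E_hat, nu_hat)
```

Each estimate follows from a one-dimensional linear LS problem, reflecting the decoupling discussed above.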
Then, the identification of the parameters has to be carried out, for example, by a nonlinear LS approach, see Sec. 3.3.1. Moreover, the lateral strain data are of importance since the axial information alone is insufficient to determine K and G uniquely. A detailed discussion of the reduction of the general class of material models (7) from the three-dimensional (3D) case to the uniaxial tensile case, which involves incorporating the boundary conditions into the strain and stress state, is provided in Ref. [42].
Whether material parameters, such as K and G in this case, can be uniquely identified should be assessed by a criterion. A first attempt to investigate whether material parameters can be uniquely identified is provided in Ref. [43]. There are situations where infinitely many parameter combinations yield a very accurate reproduction of the experiments, which raises the question of the physical meaning of the parameters. This is discussed in the general context of optimization by Ref. [44] and later on by Ref. [45], where the concept is referred to as local identifiability. This concept is transferred to problems in solid mechanics by Ref. [43], and further discussion is provided by Ref. [46]. A similar approach using the terminology of local stability is discussed by Refs. [47–49]. In a more general context, a first attempt to consider quality measures of the parameter identification in solid mechanics can be found in Ref. [50], where the correlation between the parameters is investigated. Here, the local identifiability concept, which can be extended to general models, is discussed in Sec. 5.1.1.
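Numerically, local identifiability can be checked via the singular values of the sensitivity (Jacobian) matrix of the model response with respect to the parameters. In the following toy sketch (two hypothetical one-dimensional models, not drawn from the cited references), a model in which only the product of two parameters enters is contrasted with one in which both parameters are identifiable:

```python
import numpy as np

eps = np.linspace(0.1, 1.0, 20)
a, b = 2.0, 3.0

# Model 1: sigma = (a*b)*eps -- only the product a*b is identifiable
J1 = np.column_stack([b * eps, a * eps])     # d(sigma)/da, d(sigma)/db

# Model 2: sigma = a*eps + b*eps**3 -- both parameters identifiable
J2 = np.column_stack([eps, eps**3])

for name, J in (("product model", J1), ("two-term model", J2)):
    s = np.linalg.svd(J, compute_uv=False)
    print(f"{name}: singular values {s}, "
          f"locally identifiable: {s[-1] / s[0] > 1e-10}")
```

A (numerically) vanishing smallest singular value signals a rank-deficient sensitivity matrix and hence a locally non-identifiable parameter set.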
Anisotropy.
An extension of parameter identification for linear elastic materials to the case of anisotropy, and particularly of transverse isotropy and orthotropy (requiring five and nine material parameters, respectively, for a 3D model), is discussed in Ref. [51], unfortunately only at the theoretical level and with no supporting experiments. In fact, it is not possible to devise an experiment that realizes the boundary conditions associated with a purely volumetric deformation. If a hydrostatic pressure is applied using a fluid, it is very difficult to measure the strains in the three different directions. If the specimen is inserted into a chamber whose walls are force gauges, the sensitivity to friction may become a dominating factor. In a 3D-testing machine, a homogeneous stress/strain state becomes questionable. Thus, not all parameters can be uniquely inferred by the procedure in Ref. [51]. This issue is addressed in Ref. [52] for transverse isotropy. It is shown by analytical considerations that one volumetric deformation process is required—apart from tensile and shear deformations—to uniquely obtain all parameters. To implement this deformation process, a compression tool is developed [52]. However, the evaluation of the data shows that the measured lateral stresses are very sensitive to the geometric accuracy of the specimen and of the tool itself, and also to the friction conditions within the testing equipment. Consequently, the measurement results are not reliable and show large fluctuations. For the case of transverse isotropy with a specific preferred direction (perpendicular to the isotropy plane) in axisymmetric problems, the identification is discussed in Ref. [53].
The case of orthotropy is addressed in Ref. [54], where μ-CT data are evaluated to obtain the material parameters appearing in the analytical results obtained by assuming a homogenized material response. The difficulty is again to perform a reliable compression test to identify one of the parameters. For plane problems, the number of parameters can be reduced, see Ref. [55], for example. Again, the main difficulty is the need for special compression tests and their evaluation, since these tests provide only a very small amount of measurement data, which is also uncertain. Even DIC, which delivers only the surface deformation, does not circumvent the problem of making all material parameters “uniquely” identifiable. The problem is partly solved by using representative volume elements with individual material properties of the constituents to numerically estimate the homogenized material behavior.
2.3.2 Hyperelasticity.
The identification of the material parameters in hyperelasticity (elasticity at large strains, whereby the existence of a strain energy density function is postulated) has to be discussed in more detail. First, hyperelasticity relations are chosen either as part of the models of viscoelasticity, rate-independent plasticity, and viscoplasticity, or they are chosen to model purely elastic material behavior. The latter is typically the case for rubber and some biological materials, which are commonly modeled as weakly compressible. Thus, we discuss the case of isotropy and incompressibility first.
Incompressibility is modeled either by an undetermined pressure, which is calculated by the geometrical constraint equation of no volume change [6,7], or by introducing a bulk modulus that is infinitely large (in practice, very large) in comparison to the shear modulus. In the latter case, the bulk modulus can be interpreted as a penalty factor that is defined by a large number—and not determined by parameter identification. This choice of the penalty factor, i.e., the bulk modulus, is made by the user and can lead to lateral deformations that do not precisely represent incompressibility. Further, the resulting lateral stretches might even become nonphysical at very large strains [56]. For a possible modification requiring appropriate forms of the strain energy density function for the compressible part, see Ref. [57]. With the choice of incompressibility, the experimental data of the lateral stretch from a uniaxial tension/compression test are not necessary.
In the most common hyperelastic material models—apart from the micromechanically motivated approaches, e.g., Ref. [58]—the strain energy density is expressed by a polynomial function of the first and second invariants of the right Cauchy–Green tensor [57,59], or by a polynomial in the eigenvalues of the right stretch tensor (eigenstretches) [60]. The latter are called Ogden-type models. The main problem in identifying the parameters from uniaxial tensile tests, torsion tests, or combined tension-torsion tests is that some parameter values might lead to highly oscillatory tension-compression results or to highly sensitive results, which is discussed in Ref. [61] for a number of submodels of Rivlin–Saunders-type. These phenomena occur especially outside the range of the measurement data used for calibration.
One possible approach to solve this issue is to assume a priori that all polynomial coefficients are positive [62]. This leads to a linear LS identification problem with inequality constraints for uniaxial tension and torsion (universal deformations). The inequality constraints (enforcing positivity of the parameters) can be connected to stability requirements [63], and they raise the general question of the existence of a solution in relation to the constraints on the material parameters. This is discussed in Ref. [57] for a new class of polynomial models in the framework of polyconvex strain energy functions. Here, the assumption of non-negative polynomial coefficients (material parameters) is even necessary.
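For illustration, such a positivity-constrained linear LS fit can be set up for an incompressible Mooney–Rivlin model in uniaxial tension, where the nominal stress is linear in the two coefficients; the data below are synthetic, and the use of scipy's non-negative LS solver is an implementation choice, not prescribed by the cited references.

```python
import numpy as np
from scipy.optimize import nnls

# Incompressible Mooney-Rivlin model in uniaxial tension: the nominal stress
#   P = 2*(lam - lam**-2) * (C10 + C01/lam)
# is linear in the parameters C10, C01 (hypothetical values below, in MPa)
lam = np.linspace(1.05, 2.0, 30)
C10_true, C01_true = 0.4, 0.1
A = np.column_stack([2 * (lam - lam**-2),
                     2 * (lam - lam**-2) / lam])
P = A @ np.array([C10_true, C01_true])

# Linear least squares with positivity constraints on the coefficients
coeffs, residual = nnls(A, P)
print(coeffs)
```

With noiseless synthetic data, the non-negative solution coincides with the unconstrained optimum; with real data, the constraints become active whenever the unconstrained fit would produce negative coefficients.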
Further models permit arbitrary signs for the parameters, see Refs. [64] or [65]. So far, it has not been proven that negative material parameters necessarily lead to nonphysical behavior at some deformation stage. However, special attention is suggested when applying such models with negative parameters.
As mentioned earlier, in Ogden-type models, the strain energy density function depends nonlinearly on the eigenstretches. At first sight, this has advantages for simple tests where the loading directions coincide with the eigendirections of the stretch tensor. However, the material parameters then occur nonlinearly in the stress–strain relation, necessitating the solution of a nonlinear optimization problem even for simple cases [66–68]. It is hardly possible to guarantee the insensitivity of the identification result to the choice of the initial guess of the parameters in an iterative approach or, consequently, the uniqueness of the estimated parameters.
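A sketch of such a nonlinear fit for a one-term incompressible Ogden model in uniaxial tension (synthetic data, hypothetical parameter values) shows the need for an iterative solver and an initial guess:

```python
import numpy as np
from scipy.optimize import least_squares

# One-term incompressible Ogden model in uniaxial tension:
#   P(lam) = mu * (lam**(alpha - 1) - lam**(-alpha/2 - 1))
def P_model(theta, lam):
    mu, alpha = theta
    return mu * (lam**(alpha - 1.0) - lam**(-alpha / 2.0 - 1.0))

lam = np.linspace(1.1, 2.5, 40)
theta_true = np.array([0.6, 2.8])          # hypothetical mu [MPa] and alpha
P_data = P_model(theta_true, lam)

# The parameters enter nonlinearly: an iterative solver and an initial
# guess are required, and the result can depend on that guess
for x0 in ([1.0, 2.0], [0.1, 8.0]):
    sol = least_squares(lambda th: P_model(th, lam) - P_data, x0)
    print(f"x0={x0}: mu={sol.x[0]:.4f}, alpha={sol.x[1]:.4f}, cost={sol.cost:.2e}")
```

Running the fit from several starting points and comparing the final residuals is a simple, if heuristic, safeguard against such initial-guess dependence.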
For a recent overview of identification in hyperelasticity, see Ref. [69], where appropriate weighting functions are chosen to identify the parameters. A comparison of hyperelasticity models can be found in Refs. [70–72]. An overview of analytical expressions for determining the material parameters for specific isotropic hyperelasticity models is given in Ref. [73].
The treatment of material parameter identification at large strains in connection with anisotropic materials is similar to that at small strains. The applications here primarily concern biological tissues, as these have preferred directions due to the presence of collagen fibers. In Ref. [74], the anisotropic material behavior of a soft tissue is experimentally investigated and calibrated using a finite strain, anisotropic hyperelasticity relation. Even the quality of the parameters is studied, which is very difficult to guarantee in layered tissues such as arteries, see Ref. [75]. Reference [76] considers experiments on various tissues and calibrates the material parameters based on data given in the literature. Unfortunately, the quality of the parameters is not studied. In Ref. [77], a polyconvex model for transverse isotropy is calibrated based on virtual experiments generated with a different constitutive model. In this respect, we refer to Ref. [78] as well. Alternative applications of anisotropic hyperelasticity are woven fabrics, see Ref. [79]. There, however, the simpler case of a plane stress problem is considered, which essentially reduces the number of unknown parameters.
2.3.3 Viscoelasticity.
In the case of viscoelasticity, we have to treat either integral equations, ODEs, or DAEs to consider fading memory properties of the material. For this type of model, after infinitely long holding times of the applied load, the stress state coincides with the equilibrium stress state (which has no hysteresis). One common model structure goes back to overstress-type models, i.e., the stress state is decomposed into an equilibrium stress state, formulated by an elasticity relation, and an overstress part. This model type is sometimes also called hyper-viscoelasticity. The overstress part can be formulated as a sum of Maxwell models, i.e., ODEs of first order. Each Maxwell element contains two material parameters. Since all the parameters are highly correlated, a suitable procedure must be developed to determine the parameters in a reproducible way. Another approach takes a fractional derivative model as a surrogate model to determine the individual parameters for given relaxation spectra [80]. Unfortunately, the consistent reduction from the three-dimensional model to the uniaxial tension case is not addressed there. An alternative scheme is proposed in Ref. [81], where inequality constraints are used to successively determine the parameters. For a recent investigation and literature survey, see Ref. [82].
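A common practical remedy for this strong correlation, sketched below under the assumption of a uniaxial relaxation test with synthetic data, is to fix the relaxation times on a logarithmic grid so that the remaining moduli follow from a linear LS problem with positivity constraints (in the spirit of the inequality-constrained scheme mentioned above):

```python
import numpy as np
from scipy.optimize import nnls

# Relaxation test: sigma(t)/eps0 = E_inf + sum_i E_i * exp(-t/tau_i).
# Fixing the relaxation times tau_i on a logarithmic grid turns the strongly
# correlated nonlinear fit into a linear problem with positivity constraints.
t = np.logspace(-2, 3, 60)                                        # seconds
E_relax = 1.0 + 2.0 * np.exp(-t / 0.1) + 1.5 * np.exp(-t / 10.0)  # synthetic

tau = np.logspace(-3, 4, 8)                   # fixed, log-spaced times
A = np.column_stack([np.ones_like(t)] + [np.exp(-t / ti) for ti in tau])
E_coeffs, _ = nnls(A, E_relax)
print("E_inf =", E_coeffs[0], "Prony moduli =", E_coeffs[1:])
```

Since the relaxation times are prescribed rather than fitted, the procedure is reproducible, at the price of a (usually mild) bias introduced by the chosen time grid.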
Since the model has a modular structure, the parameters related to the equilibrium stress and to the overstress can be determined successively, i.e., one arrives at the identification of the parameters of an elasticity relation (discussed in the previous subsections) and of the parameters of an ODE system. In Ref. [9], the successive identification is addressed under the assumption of a universal deformation (tension–torsion). However, no quality measures for the identification are discussed. Even the specification of the identification procedure is missing. Because of problems in identification, singular value decompositions are employed in Ref. [83].
Two questions arise here: First, how can the boundary conditions for homogeneous deformations be included in three-dimensional material models (7)? Second, which methods are available to determine the parameters in ODEs and DAEs in the context of LS approaches? The first issue is treated in Ref. [42], where a procedure for arbitrary models is proposed. The aspect of determining the parameters in transient problems is summarized in Ref. [84], where three types of schemes are proposed, namely, simultaneous simulation of sensitivities [85–87], internal numerical differentiation [88], and external numerical differentiation—see Ref. [89] for an application in nonlinear FEM. For the case of viscoelasticity using full-field measurements, which is discussed in Sec. 3.1, the reader is referred to Refs. [90] and [91].
2.3.4 Elastoplasticity and Viscoplasticity.
There is a large number of articles on parameter identification in the context of plasticity. However, the quality of the estimated parameters is not frequently addressed. Among the first papers to investigate the correlation between parameters are Refs. [92] and [93]. Models based on yield functions contain elastic parameters, whose identification is discussed in Sec. 2.3.1. The determination of the yield stress is much more difficult for most metals. Either Lüders bands occur (resulting from the spatial motion of dislocations), so that the stress–strain response is not deterministically predictable, or the yield point is not very pronounced. Thus, the initial yield stress is a highly uncertain quantity. In the plastic region, a number of material parameters can describe the nonlinear hardening behavior (isotropic and kinematic hardening, for example), and these parameters can be strongly correlated. A linear correlation between the driving and the saturation term of the Armstrong & Frederick kinematic hardening model, see Ref. [94], can be deduced from analytical considerations of this model [42,54]. For pressure-dependent yield functions in soil mechanics, the identification using a triaxial compression testing device is discussed in Ref. [95]. The quality of the estimated parameters is also examined. The contribution [96] discusses the identification of material parameters for a complex viscoplasticity model using a gradient-based method. A detailed explanation of the calculation of the sensitivities is provided, which fits into the internal numerical differentiation procedure [88].
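The correlation between the two Armstrong & Frederick parameters can be illustrated numerically. In monotonic uniaxial loading, the backstress of the simplified one-dimensional evolution equation dX/dε_p = C − γX saturates at C/γ, so parameter sets sharing this ratio are nearly indistinguishable from monotonic data. A minimal sketch with illustrative values:

```python
def af_backstress(C, gamma, eps_p_max=0.2, n=20000):
    """Explicit Euler integration of dX/d(eps_p) = C - gamma*X (monotonic loading)."""
    deps = eps_p_max / n
    X = 0.0
    for _ in range(n):
        X += (C - gamma * X) * deps
    return X

# two parameter sets with identical ratio C/gamma = 100 (stress units)
X_a = af_backstress(C=20000.0, gamma=200.0)
X_b = af_backstress(C=40000.0, gamma=400.0)
```

Both runs saturate at essentially the same backstress of C/γ = 100, despite the parameters differing by a factor of two; distinguishing them requires cyclic or non-monotonic experiments.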
The more complex the models are, the more difficult it is to determine the parameters from simple experiments. Therefore, material parameter identification can serve as a means of model improvement and possibly model reduction (model identification). However, this field is often not documented, as only the final result of a model is published. When devising new material models, attention must be paid to their identifiability from experiments already during the development phase. This is discussed in detail in Ref. [54]. The authors of Ref. [97] discuss the practical identifiability of the parameters for a more complex yield function, whereas Ref. [98] focuses on a nonlinear kinematic hardening model that is calibrated from uniaxial tensile tests on the basis of one-dimensional considerations by applying an evolutionary algorithm. A further discussion on parameter identification using stress data from tensile experiments is provided by Ref. [99]. This is extended by investigations on noisy data by Ref. [100]. A review regarding the calibration of various plasticity models is also provided by Ref. [101].
3 Computational Approaches for Parameter Identification
As mentioned earlier, only very few experiments lead to uniform stress and strain states. If nonuniform stress/strain distributions occur within a specimen, problem (7)—consisting of (a) the balance equations (PDEs), (b) the constitutive equations, and (c) the kinematic relations, in addition to the initial and boundary conditions—has to be evaluated to obtain the parameters. Since the resulting systems of equations are similar for uniform deformations and spatially discretized nonuniform deformation problems, we will now discuss the latter, more challenging case. Also, the emphasis here is on the identification procedure and not on the numerical optimization schemes.
This section is structured as follows. First, details on the space and time discretization using finite elements are given, together with a detailed description of the three representative problems of linear elasticity for small strains, hyperelasticity, and inelasticity (these problems are later discussed from the perspective of identification). Then, a broad classification of the different identification methods is provided in Sec. 3.2, while an in-depth comparative study is deferred to Sec. 4. We start our review of identification methods with the nonlinear LS method using the FEM (Sec. 3.3) and continue with the equilibrium gap method and the VFM (Sec. 3.4). Surrogate models and PINNs are the topic of Sec. 3.5. Finally, an overview of model discovery and Bayesian approaches is provided in Secs. 3.6 and 3.7, respectively.
3.1 Finite Element Method.
where the matrices are Boolean data-management matrices containing only zeros and ones. Also, is the number of integration points (commonly Gauss points) within element e, and defines the total number of internal variables of a mesh with elements.
which represent the nodal internal forces, whereas defines the given equivalent nodal force vector. Here, and symbolize incidence matrices assembling all element contributions into a large system of equations (representing the assembly procedure). Moreover, denotes the column vector representation of the element strain tensor (Voigt notation), which depends linearly on the element nodal displacement vector , with . The number of element nodal displacement degrees-of-freedom is denoted with , and denote the weighting factors of the spatial integration in an element. Furthermore, is the strain-displacement matrix of element e evaluated at the jth Gauss point, . symbolizes the Jacobian matrix of the coordinate transformation between reference element coordinates and global coordinates. The symmetric stress tensor (7)2 is recast into vector , which is evaluated at the jth Gauss point. It depends, via the strain vector, on the displacements and the internal variables q.
is commonly solved, whereas Eq. (28)2 is used to compute the nodal reaction forces during postprocessing. Although it is common to state that a Newton–Raphson method is chosen to solve problem (29), it was shown in Refs. [11] and [104] that the Multilevel Newton algorithm proposed in Ref. [105] is applied (if an iterative stress algorithm is chosen at the local level, i.e., at each Gauss point). This method leads to the classical structure of local iterations, often called stress algorithm (although it is the internal variable computation), and to global iterations, where the increments of the displacement vector are obtained (on the basis of the consistent linearization stemming from the usage of the implicit function theorem).
Clearly, for rate-independent material models such as elasticity or rate-independent plasticity, time is only used to parameterize the external load and has no physical meaning (for this reason, it is sometimes referred to as “pseudo-time”). However, in experimental investigations time is also supplied, as the responses (such as displacements and forces) measured at different times are incorporated in the parameter identification. Therefore, the time dependence is also included here. Frequently, the load is increased stepwise so that each increment starts close to the solution, which favors the convergence of the Newton–Raphson method applied to Eq. (30)1, see Ref. [91] for details. In this sense, time is considered at least in the time-dependent boundary conditions.
Here, the explicit time (or load) dependence is omitted since proportionality is given. The stiffness matrices are defined by
and the elasticity matrix .
In the following, we formally restructure the equations discretized above with the aim of material parameter identification. For this purpose, we redefine as the state vector and introduce the vector of material parameters . The structure of the state vector depends on the problem under study.
3.1.1 Problem Class I: Linear Elasticity.
In the forward problem, is given, i.e., and has to hold. However, in parameter identification, has to be determined. Usually, no time dependence is assumed within the framework of statics. To indicate that different load values are used for parameter identification (and going back to the time-continuous setting for notational simplicity), Eq. (38) should actually be . While in Eq. (40) is time-independent, since the stiffness matrices do not depend on time, is time-dependent since it contains the time-dependent prescribed displacements and equivalent nodal force vector .
In the time-discrete setting, Eq. (46) has to be solved for each load or time-step, yielding different parameters . With this approach, further considerations have to be made to finally determine a unique set of parameters (e.g., by averaging, which is the simplest possibility).
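The step-wise procedure for problem class I can be sketched as follows: since the stiffness matrix depends linearly on the parameters, each load step yields a linear least-squares problem, and the step-wise estimates are merged afterwards, here by simple averaging. The toy system below (with hypothetical unit stiffness matrices K1, K2 and random loads) only illustrates the structure:

```python
import numpy as np

rng = np.random.default_rng(0)

# two fixed "unit" stiffness matrices; K(theta) = theta_1*K1 + theta_2*K2
n = 6
M1, M2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
K1, K2 = M1 @ M1.T, M2 @ M2.T            # symmetric positive (semi-)definite
theta_true = np.array([210.0, 80.0])

thetas = []
for _ in range(5):                        # five load/time steps
    f = rng.standard_normal(n)            # prescribed equivalent nodal load
    u = np.linalg.solve(theta_true[0]*K1 + theta_true[1]*K2, f)  # "measured" u
    A = np.column_stack([K1 @ u, K2 @ u])  # equilibrium is linear in theta
    theta_t, *_ = np.linalg.lstsq(A, f, rcond=None)
    thetas.append(theta_t)

theta_avg = np.mean(thetas, axis=0)       # simplest merge of the step-wise estimates
```

With noise-free data every step reproduces the same parameters; with noisy measurements the step-wise estimates scatter, and averaging (or a joint fit over all steps) becomes meaningful.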
3.1.2 Problem Class II: Hyperelasticity.
There are hyperelastic material models that are linear in the material parameters, such as the class of Rivlin–Saunders or Hartmann–Neff models, see Refs. [57] and [59]. This leads to the same representation as in Eq. (42). Unfortunately, this is not the case for Ogden-type models [8,60], where the fully general case (47) has to be considered.
3.1.3 Problem Class III: Inelasticity.
is given, see Eq. (15). If an implicit time discretization is applied, disappears from the equation. However, y contains the unknowns at all discrete times, and F consists of all nonlinear systems to be solved at each of those time steps. Now, as introduced in Eq. (17), y contains the unknown nodal displacements u, the nodal reaction forces p, and the internal variables q evaluated at the Gauss points. In this case, it is also possible to compute the reaction forces, an exercise which is left to the reader.
3.1.4 Using the Implicit Function Theorem.
3.2 Classification of Different Approaches.
Calibration is used to determine model parameters with which the model can best approximate available measurement data. A criterion for the approximation accuracy is provided in terms of a closed-form mathematical expression, i.e., the objective or loss function. The latter is a norm of the difference between the available measurement data and the model predictions, obtained from the solution of the governing equations via a numerical (discretization) method or a surrogate model. One important characteristic of the different numerical methods presented here is the type of spatial (and/or temporal) discretization. After formulating an objective function, the optimal model parameters that minimize this function are identified using optimization algorithms, often referred to as optimizers. Alternatively, in a statistical setting, a single best guess of the unknown material parameters is formulated as a point estimator, i.e., a function of the observed data [106]. Here, the likelihood function plays a central role, and Bayesian approaches additionally consider a prior density. With statistical models, the uncertainty of an estimate can be assessed through confidence/credible intervals, which are typically computed with sampling approaches.
In Table 1, the methods presented in the remainder of this paper are classified by the aforementioned features, namely,
Classification of different methods for parameter identification as outlined in Sec. 3.2
Method | Discretization | Parametrization | Optimization/sampling |
---|---|---|---|
Nonlinear least-squares using finite elements (Sec. 3.3) | Galerkin; local ansatz for virtual and real displacements | Material parameters | Gradient-based (e.g., trust region) or gradient-free (e.g., Nelder-Mead simplex) |
Equilibrium gap method (Sec. 3.4) | Galerkin; local ansatz for virtual fields | Material parameters | Linear system for linear problems; trust region for nonlinear problems |
Virtual fields method (Sec. 3.4) | Galerkin; global ansatz for virtual fields | Material parameters | Linear system for linear problems; trust region for nonlinear problems |
Physics-informed neural networks (Sec. 3.5) | Collocation; global ansatz parametrized in | Material parameters , parametrization of PDE solution | Gradient-based (e.g., ADAM, BFGS, L-BFGS-B) |
Surrogate models (Sec. 3.5) | Collocation/regression with polynomial ansatz, neural network, Gaussian process, and many more | Material parameters , surrogate parameters | Any (gradient-based, gradient-free, or sampling) |
Model discovery (Sec. 3.6) | Any | Material parameters ; nonzero κi from a large library are selected via sparse regression | Coordinate descent for linear problems; trust region for nonlinear problems |
Frequentist inference (Sec. 5.1) | Any | Any parameter combination | Optimization or sampling |
Bayesian inference (Sec. 3.7) | Any | Any parameter combination | Sampling or variational inference |
Discretization: Galerkin or collocation methods, as well as local or global ansatz functions.
Parametrization: The objective or likelihood function depends on parameters of the material model, and possibly also on the PDE solution or on a parametrization of the latter.
Optimization/sampling: While deterministic calibration approaches mostly make use of different gradient-based optimization schemes, both optimization and sampling are common choices in a statistical setting.
3.3 Nonlinear Least-Squares Method Using Finite Elements.
If the material parameters cannot be identified on the basis of simple experiments, where the stress and the strain are known, the entire boundary-value problem (7) of the experiment must be solved. This can be done by numerical approximation methods such as finite differences, boundary elements, finite volumes, or the FEM. In the following, we assume that the latter is the method of choice. The nonlinear LS (NLS) method is applied to minimize the difference between the results of the finite element simulation of the experimental boundary conditions and both pointwise and full-field measurement data. First, we discuss proposals for parameter identification of phenomenological constitutive models (macroscale), followed by schemes addressing models with scale separation, specifically materials with a microstructure.
3.3.1 Macroscopic Constitutive Models.
The combination of the FEM and a LS approach to determine parameters in the finite element model goes back to Ref. [107]. Later, Ref. [108] linked LS and FEM with the goal to detect the location and the Young's modulus of an inclusion in a matrix material. Discrete data in combination with finite elements were also used in Refs. [40] and [109–113]. This approach can be followed for experiments with nonuniform stress states when access is limited to, e.g., resultant forces, such as in the case of indentation tests or metal forming processes, see Refs. [114] and [115].
A conceptual and practical step beyond the use of pointwise data has emerged with the availability of optical methods. Here, a special focus lies on DIC methods, see, e.g., Refs. [30] and [116], where the surface displacements of the specimen are measured during loading. Approaches taking advantage of full-field displacement data, pioneered by Mahnken [47,117,118], were further extended by Refs. [29], [50], and [119–126], see also Ref. [127] for biaxial tensile tests and Refs. [128] and [129] for more complex deformation cases. In Ref. [1], an approach of this type was denoted finite element model updating (FEMU), a terminology that has been used continuously since then. Unfortunately, this terminology can be misleading since the same name is well-established in structural dynamics, see Ref. [130] for a review. Following Refs. [130] and [131], the FEMU approach is applied when the mathematical structure of the problem, here a finite element equation (e.g., the entries in the stiffness or mass matrix), is changed during the system identification process, see Ref. [132], for example. In contrast, when calibrating constitutive models, the finite element equation is specified once—and only the material parameters are updated. As a result, the calibration of constitutive models has to be clearly classified as parameter identification rather than FEMU. In Ref. [91], the combination of NLS with DIC data, where the boundary-value problem is discretized using the FEM, is denoted as NLS-FEM-DIC—a denomination that includes the objective function, the boundary-value problem solution technique, and the experimental measurement technique.
Alternative approaches to account for optical information are considered in Ref. [133], where gratings on the specimen surface are incorporated, in Ref. [134] using Moiré-patterns, or in Refs. [135] and [136] using contour data and discrete points on the surface. Reference [1] provides an overview of other schemes, such as various “gap methods” mainly applied in model updating using vibrational data, together with a comparison of these schemes for determining the elastic parameters. Further contributions propose the “constitutive relation error” and “modified constitutive relation error” approaches, see Refs. [137] and [138]. A study of the integrated DIC method and its references is provided by Ref. [139]. Overall, the testing and identification paradigm—which takes advantage of full-field measurement data rather than simple strain transducers or strain gauges—in conjunction with the FEM is referred to as “Material Testing 2.0” in Ref. [31].
The previously mentioned approaches bear a strong relation to the mathematical literature on the least-squares method applied to the solution of ODE- or DAE-systems [84]. This is explained in detail in Ref. [89] and will be briefly summarized in the following.
Since the number of individual entries, their magnitude, and their physical units vary within the residual r, it is common and reasonable to introduce a diagonal weighting matrix and to consider the weighted residual . For an approach to account for different amounts of data and different magnitudes of the physical quantities within the data, see Ref. [61]. The steps to preprocess the experimental full-field data as well as the data obtained from the numerical simulation are explained in detail in Appendix A.
represents the sensitivity matrix (also denoted as functional matrix or Jacobian matrix). The solution of problem (56) can be computed by various methods, see Refs. [140–144]. In the case of gradient-based algorithms, the derivatives (57) are required to compute . Parameter identification with sensitivity assessment is also covered by Ref. [145]. The computation of the sensitivities is compiled in Appendix D.
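A gradient-based NLS scheme of this kind can be sketched as follows: a Gauss–Newton iteration on the weighted residual, with the sensitivities obtained by external numerical differentiation (forward differences). The function simulate() below is merely a stand-in for the finite element forward solve, and all names and values are illustrative:

```python
import numpy as np

def simulate(theta, x):
    """Stand-in for the FEM forward solve: a saturating response (illustrative)."""
    return theta[0] * (1.0 - np.exp(-theta[1] * x))

def gauss_newton(theta0, x, d, W, n_iter=50, h=1e-6):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        r = W @ (simulate(theta, x) - d)               # weighted residual
        # external numerical differentiation: forward differences per parameter
        J = np.column_stack([
            (W @ (simulate(theta + h*e, x) - simulate(theta, x))) / h
            for e in np.eye(len(theta))
        ])
        dtheta = np.linalg.lstsq(J, -r, rcond=None)[0]  # Gauss-Newton step
        theta = theta + dtheta
    return theta

x = np.linspace(0.0, 2.0, 50)
theta_true = np.array([3.0, 1.5])
d = simulate(theta_true, x)                             # noise-free synthetic data
W = np.diag(np.full(x.size, 1.0))                       # diagonal weighting matrix
theta_est = gauss_newton([1.0, 1.0], x, d, W)
```

In practice, each column of the sensitivity matrix costs one additional forward solve, which is why internal numerical differentiation or simultaneous sensitivity computation (Appendix D) is preferred for expensive FEM models.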
3.3.2 Representative Volume Element Approaches.
In some cases, materials possess a heterogeneous microstructure (an example being fiber-reinforced composites), and the goal of the identification is to determine their macroscopic properties, i.e., the parameters describing a homogenized material. For unidirectional fabrics or for orthogonal fabrics within the linear elastic regime, for example, appropriate macroscopic models are transversely isotropic elasticity (with five parameters) and orthotropic elasticity (with nine parameters), respectively. The difficulties that arise here lie in the experiments required to determine these parameters uniquely (see Sec. 2.3.1). One alternative is to use a representative volume element (RVE) and, assuming that the material parameters of the constituents are known, to determine the homogenized material parameters with analytical mixing rules, as discussed in Refs. [147] and [148], or with computational homogenization. However, care must be taken to ensure that the appropriate deformation modes are excited in order for the related parameters to be determined; see Ref. [54] for a specific application of an RVE to determine orthotropic material parameters. Further questions are treated in Refs. [83] and [149], discussing the issue of identifying the inelastic properties of the matrix and the interphase parameters for composite materials.
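The analytical mixing rules mentioned above can be illustrated by the classical Voigt (iso-strain) and Reuss (iso-stress) estimates for a unidirectional composite; the fiber and matrix moduli below are illustrative carbon/epoxy-like values, not data from the cited references:

```python
def voigt_modulus(E_f, E_m, v_f):
    """Rule of mixtures (Voigt, iso-strain): longitudinal modulus of a UD ply."""
    return v_f * E_f + (1.0 - v_f) * E_m

def reuss_modulus(E_f, E_m, v_f):
    """Inverse rule of mixtures (Reuss, iso-stress): transverse modulus estimate."""
    return 1.0 / (v_f / E_f + (1.0 - v_f) / E_m)

# illustrative carbon/epoxy values in MPa, fiber volume fraction 0.6
E1 = voigt_modulus(E_f=230e3, E_m=3.5e3, v_f=0.6)
E2 = reuss_modulus(E_f=230e3, E_m=3.5e3, v_f=0.6)
```

The two estimates bound the true homogenized moduli; the large gap between them for stiff fibers in a compliant matrix is exactly why computational homogenization on an RVE is often preferred.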
Another task might be to identify the material parameters of the microscopic constituents of a heterogeneous material. This can be done either by considering only the RVE and applying to it appropriate homogeneous loading conditions, or by relying on macroscopic experiments. The latter option is discussed by Ref. [150] in a FE2 context on the basis of the LS approach. Unfortunately, local identifiability of the parameters is not discussed. Since FE2 computations are very expensive, an iterative identification procedure using standard FE2 codes is very time-consuming. This problem is pointed out in Ref. [151]; however, no connection to an identifiability assessment is made. The use of integrated DIC for parameter identification within a multiscale approach is discussed in the recent contribution [152].
An example of an RVE-based identification approach for a damaging material is discussed in Ref. [153]. In order to incorporate the microvoid behavior under loading into the model, a unit cell is considered and the material parameters controlling damage evolution are identified. For that application, it is important to emphasize that considering damage at the RVE scale must be handled with care during homogenization. In addition to ensuring that appropriate deformation modes are excited for the parameter identification, it is important to recognize that the assumption of separation of scales is no longer applicable. Consequently, higher-order homogenization procedures should be used.
3.4 Equilibrium Gap and Virtual Fields Method.
As discussed in Sec. 3.3, the NLS method with DIC data seeks to minimize the square of the weighted residuals for the unknown parameters , thus minimizing the mismatch between FEM predictions and experimental observations d.
which implies that, in general, is not fulfilled (“there is a gap to the equilibrium conditions”). Moreover, d must be known. It has to be remarked that (where are the experimental displacement data) is very restrictive, as it only allows the use of the region of the DIC measurements (which is a subset of the domain) and limits the analysis to plane problems if no three-dimensional data are available. While the scheme is applicable for problem classes I and II, it requires additional considerations for problem class III, since the internal variables are not measurable. Thus, in the proposed form (59), the scheme is only applicable to very specific problems. Theoretically, the scheme is extendable to force data as well. However, this may be difficult when using DIC, as the forces corresponding to the imaged regions may be unknown.
Often, the number of virtual fields in is chosen such that it coincides with the number of unknown material parameters. These functions must be linearly independent.
which is a linear system with unknowns. The same property also holds for problem class II (hyperelasticity) if the material parameters are linearly embedded, e.g., with Rivlin–Saunders and Hartmann–Neff models [57,59], see Ref. [159]. Thus, material parameter identification with the VFM requires only the solution of one linear system of equations, whereas the NLS–FEM approach requires running a forward FEM simulation at each iteration of the optimization algorithm. Note, however, that the full displacement field is here assumed to be available from experimental measurements. A discussion on exceptions is provided in Ref. [160].
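The structure of this linear system can be shown with a deliberately simple example: a clamped two-segment bar under a known end force, one unknown modulus per segment, and two linearly independent virtual fields (one per parameter). This toy setup is for illustration only and is not taken from the cited references:

```python
import numpy as np

# Two-segment bar on [0,2] under end force F; segment i has unknown modulus E_i.
# "Measured" strains come from the exact solution eps_i = F / E_i (unit area).
F = 10.0
E_true = np.array([200.0, 50.0])
eps = F / E_true                          # strains in segments [0,1] and [1,2]

# Virtual fields (kinematically admissible: zero at the clamped end x = 0):
#   w1(x) = x         -> virtual strain 1 on both segments, w1(2) = 2
#   w2(x) = min(x, 1) -> virtual strain 1 on [0,1], 0 on [1,2], w2(2) = 1
# VFM: sum_i E_i * eps_i * (virtual strain on segment i) = F * w_k(2)
A = np.array([[eps[0], eps[1]],
              [eps[0], 0.0]])
b = np.array([F * 2.0, F * 1.0])

E_identified = np.linalg.solve(A, b)      # one linear solve, no FEM iterations
```

The key point is that the measured strains enter the coefficient matrix, so no forward boundary-value problem has to be solved; with noisy full-field data, more virtual fields than parameters are typically used and the system is solved in a least-squares sense.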
For problem class III (models with internal variables), the scheme explained above is not applicable. Details regarding the application of the VFM to some specific models with internal variables are provided by Refs. [157] and [161]. For general considerations, we refer to Sec. 4.4.1. Please note that, given the conceptual similarities between equilibrium gap and VFM, we will only refer to the VFM in the following.
3.5 Full-Field Surrogate Models and Physics-Informed Neural Networks.
The evaluation of using, e.g., a FEM simulation may be costly, especially if many iterations are required and the model to be calibrated is computationally demanding. Therefore, it can be efficient to replace the simulation with a so-called surrogate or metamodel. The vector of the corresponding results is denoted as .
Surrogate models require a flexible parametric regression ansatz, such as, e.g., multivariate polynomials. Recently, machine learning approaches such as artificial neural networks (ANNs) and Gaussian processes, often referred to as Kriging, have gained increasing attention. A brief introduction to ANNs and the notation used in the paper is given in Appendix C. One of the first attempts to apply ANNs as surrogates for the identification of material parameters goes back to Refs. [38,39], and [162]. Recent work in this direction can be found in Refs. [163] and [164]. When using surrogate models in the context of parameter identification, the overall procedure is the following:
To train a surrogate model as a regression function on data sampled from a given number of evaluations of the physical model . In the context of Gaussian processes, are the kernel parameters that can be fitted to the data using Bayesian model selection or by minimizing the negative log-likelihood. After fixing the kernel parameters, the GP surrogate is conditioned on the available data to obtain the regression function. The training of ANNs, on the other hand, requires the identification of trainable parameters , i.e., the weights and biases, via optimization.
To identify the physical parameters by solving the minimization problem (63), whereby is replaced with its surrogate . Herein, denotes the nonphysical parameter set of the surrogate identified in step 1. Note that surrogates can also be used in the context of statistical inference in combination with sampling approaches, see Secs. 3.7 and 5.
Steps 1 and 2 can be interconnected, which is then referred to as adaptive sampling, see Ref. [165] for an extensive review in the context of Kriging.
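The two-step procedure above can be sketched with a simple polynomial surrogate; the forward model, sampling range, and polynomial degree are illustrative choices, not a specific method from the cited references:

```python
import numpy as np
from scipy.optimize import minimize_scalar

x = np.linspace(0.0, 1.0, 10)

def forward(theta):
    """Stand-in for an expensive FEM solve: response at 10 sensor points."""
    return np.exp(-theta * x)

# Step 1: build the surrogate from a handful of forward-model evaluations
theta_samples = np.linspace(0.5, 2.0, 8)
Y = np.array([forward(t) for t in theta_samples])     # (8 samples, 10 outputs)
deg = 4
coeffs = np.polyfit(theta_samples, Y, deg)            # one polynomial per output

def surrogate(theta):
    powers = np.array([theta**(deg - k) for k in range(deg + 1)])
    return powers @ coeffs                            # cheap replacement for forward()

# Step 2: calibrate against (synthetic) data using only the surrogate
d = forward(1.3)                                      # "measurement"
res = minimize_scalar(lambda t: np.sum((surrogate(t) - d)**2),
                      bounds=(0.5, 2.0), method="bounded")
theta_est = res.x
```

After step 1, each evaluation of the objective costs only a polynomial evaluation instead of a forward solve, which is what makes sampling-based inference (Secs. 3.7 and 5) affordable.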
the ANN acts as a function approximator (or ansatz function) of the PDE solution. The above ansatz is parameterized in the ANN weights and biases . Figure 3(a) shows a schematic representation of ansatz (64).
Spatial (and, in the case of dynamic problems, temporal) derivatives of the ansatz (64) can be calculated using automatic differentiation, see Refs. [168,170], and [171]. For dynamic problems, this approach is referred to as continuous-time PINN [172]. Alternatively, temporal derivatives can be treated by means of discrete time-integration schemes, see Refs. [172] and [174]. As mentioned earlier, dynamic problems are beyond the scope of the present treatment. Instead, the time dependence in Eq. (64) expresses that different load (time) steps within one experiment or various experiments are used to identify the set of material parameters .
3.5.1 Parametric PINN as Surrogate Model.
Figure 3(b) shows a schematic representation of a parametric PINN ansatz in comparison to the one in Eq. (64). The training of parametric PINNs, i.e., the identification of the ANN parameters , is discussed in detail in Sec. 4.3.2. Parametric PINNs have been deployed for thermal analysis [175], magnetostatics [176], or for the optimization of an airfoil geometry [177]. To the best of our knowledge, they have not yet been used for the calibration of material models in solid mechanics.
3.5.2 Inverse or All-at-Once PINNs.
An alternative to the parametric PINN approach discussed above is given by so-called inverse PINNs, which are related to the all-at-once formulation introduced in Sec. 4. In inverse PINNs, the material and network parameters and are identified simultaneously, and the optimization problem (63) does not hold anymore. Details are provided in Sec. 4.5.2.
The following contributions deal with material model calibration from full-field displacement and force data using PINNs. In Ref. [178], the authors propose a multinetwork model for the identification of material parameters from displacement and stress data for linear elasticity and von Mises plasticity. For the analysis of internal structures and defects, the authors in Ref. [179] present a general framework for identifying unknown geometry and material parameters. In Ref. [180], a framework is developed for the calibration of hyperelastic material models from full-field displacement and global force–displacement data. Unlike in the conventional PINN approach [172], the physical constraints are imposed by using the weak form of the PDE. In Ref. [181], variational forms of the residual for the identification of homogeneous and heterogeneous material properties are formulated. The method is demonstrated using the example of linear elasticity. The study in Ref. [182] considers the identification of heterogeneous, incompressible, hyperelastic materials from full-field displacement data. It uses two independent ANNs, one for the approximation of the displacement field, and another one for approximating the spatially dependent material parameters. A comparison of different data sampling strategies as well as soft and hard boundary constraints is reported in Ref. [183]. In Ref. [184], PINNs are further enhanced toward the calibration of linear elastic materials from full-field displacement and global force data in a realistic regime. The realistic regime here refers to the order of magnitude of material parameters and displacements as well as noise levels.
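A minimal all-at-once sketch in the spirit of inverse PINNs, for a one-dimensional bar with constant modulus E and unit body force: a small network with a hard-constrained Dirichlet boundary condition approximates the displacement, its derivatives are formed analytically (in place of automatic differentiation, for brevity), and the network parameters and E are optimized simultaneously against displacement data and the PDE residual. All settings are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

# 1D bar on [0,1]: E*u'' + 1 = 0 with u(0) = 0; exact: u(x) = (x - x**2/2)/E.
E_true = 2.0
x_d = np.linspace(0.0, 1.0, 20)                  # "measurement" points
u_d = (x_d - 0.5 * x_d**2) / E_true              # synthetic displacement data
x_c = np.linspace(0.0, 1.0, 20)                  # collocation points

m = 5                                             # hidden neurons

def unpack(p):
    return p[:m], p[m:2*m], p[2*m:3*m], p[-1]     # w1, b1, w2, E

def fields(p, x):
    """Hard-constrained ansatz u(x) = x * s(x) and its analytic derivatives."""
    w1, b1, w2, _ = unpack(p)
    t = np.tanh(np.outer(x, w1) + b1)
    s = t @ w2
    ds = (w1 * (1 - t**2)) @ w2
    dds = (w1**2 * (-2 * t * (1 - t**2))) @ w2
    return x * s, 2 * ds + x * dds                # u and u''

def loss(p):
    E = p[-1]
    u, _ = fields(p, x_d)
    _, ddu = fields(p, x_c)
    return np.mean((u - u_d)**2) + np.mean((E * ddu + 1.0)**2)

rng = np.random.default_rng(1)
p0 = np.concatenate([0.1 * rng.standard_normal(3*m), [1.0]])  # E started at 1.0
res = minimize(loss, p0, method="L-BFGS-B", options={"maxiter": 2000})
E_est = res.x[-1]
```

Note that material and network parameters sit in one optimization vector, so problem (63) with a separate forward solve never appears; this is the all-at-once structure discussed in Sec. 4.5.2.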
3.6 Model Discovery.
The previously discussed methods for material characterization focus on the calibration of material parameters in an a priori known material model. The a priori choice of the material model, i.e., the combination of mathematical operations that describes the material behavior, typically relies on the intuition and modeling experience of the user. Inappropriate assumptions in the material model are reflected in a poor fitting accuracy of the model even after parameter calibration. To avoid such modeling errors, strategies have been proposed recently to automatically discover a suitable symbolic representation of the material model, while simultaneously calibrating the corresponding material parameters. Thus, the problem of material parameter calibration is generalized to the more intricate problem of material model discovery.
Algorithms for discovering symbolic models from data can be broadly classified into two types: symbolic regression methods based on genetic programming [185], which repeatedly mutate a model until its agreement with the data is satisfactory, and sparse regression methods [186], which select a model from a potentially large set (also called library or catalogue) of candidate models based on data. Such methods were first used in the physical sciences to deduce symbolic expressions of the governing equations of dynamical systems [187,188]. In the field of material modeling, symbolic regression based on genetic programming has been used since the early work of Ref. [189], see also the more recent works in Refs. [190–194]. Sparse regression for the discovery of material models has been much less explored and has only recently gained attention [159,195–197]. See also the review in Ref. [198], where a novel symbolic regression approach based on formal grammars is proposed for hyperelasticity.
In Ref. [159], the authors propose EUCLID (efficient unsupervised constitutive law identification and discovery), a method for discovering symbolic expressions for hyperelastic strain energy density functions using sparse regression starting from displacement and force data. The method was later extended to elastoplasticity [199], viscoelasticity [200], and generalized standard materials [201], see Ref. [202] for an overview. A supervised version (i.e., a version based on stress data) of the EUCLID approach is experimentally validated in Ref. [203] using data stemming from simple mechanical tests on human brain tissue.
The idea behind EUCLID is to construct a set of candidate material models (i.e., a material model library) by introducing a general parametric ansatz for the material behavior, which depends on a large number of parameters comprising all material parameters of all candidate material models. Calibrating these parameters using one of the previously discussed calibration methods would result in a highly ill-posed inverse problem. Even assuming that such a problem is solvable, the solution would deliver a parameter vector with many nonzero entries and, thus, a highly complex symbolic expression for the material model. Therefore, a regularization term is introduced which penalizes the number of nonzero entries in the parameter vector and, consequently, promotes simplicity of the symbolic material model expression while alleviating the ill-posedness of the inverse problem. The so-called Lp-regularization term sums the p-th powers of the parameter magnitudes with 0 < p ≤ 1, assuming high values for dense parameter vectors and smaller values otherwise. In the limit p → 0, the Lp-regularization term converges to an operator that counts the number of nonzero entries in the parameter vector. For p = 1, the approach is called least absolute shrinkage and selection operator (Lasso)-regularization.
where the first term in the objective function quantifies the mismatch between the model and the data. Here, similar to the VFM, see Eq. (60), the model-data mismatch is defined indirectly by the sum of squared residuals of the discretized weak form of the balance of linear momentum. We note, however, that other measures for the model-data mismatch, see Sec. 3.3, could be chosen likewise. The matrix and the load vector in Eq. (67) are constructed as described in Appendix B for linear elasticity at small strains. If more sophisticated material behavior like hyperelasticity is considered, the matrix changes accordingly, see Ref. [159].
The effect of the sparsity-promoting regularization term on the minimization problem is controlled by a weighting factor, which can be chosen to strike a balance between the fitting accuracy and the complexity of the discovered material model, see Refs. [199–203]. Abandoning the deterministic setting, the EUCLID method is investigated from a Bayesian perspective in Ref. [160] to quantify the uncertainty in the discovered models.
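To make the interplay between the data mismatch and the sparsity-promoting penalty concrete, the following minimal sketch solves a Lasso-regularized least-squares problem on a synthetic linear "library" by iterative soft-thresholding (ISTA). The library matrix, the two active terms, and the weighting value `lam` are purely illustrative and not part of EUCLID itself.

```python
import numpy as np

def ista_lasso(A, d, lam, iters=1000):
    """Minimize ||A th - d||^2 + lam * ||th||_1 by iterative soft-thresholding."""
    step = 0.5 / np.linalg.norm(A, 2) ** 2     # 1/L, L = Lipschitz constant of the gradient
    th = np.zeros(A.shape[1])
    for _ in range(iters):
        z = th - step * 2.0 * A.T @ (A @ th - d)                    # gradient step on mismatch
        th = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft threshold (prox of L1)
    return th

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 20))             # toy "library" of 20 candidate terms
th_true = np.zeros(20)
th_true[[2, 7]] = [1.5, -2.0]                  # only two terms are truly active
d = A @ th_true + 0.01 * rng.standard_normal(200)

th_hat = ista_lasso(A, d, lam=5.0)
print(np.flatnonzero(np.abs(th_hat) > 0.1))    # indices of the discovered terms
```

Increasing `lam` prunes more candidate terms; for `lam` approaching zero, the dense and ill-posed least-squares solution is recovered.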
In the literature, two different lines of research on model discovery can be distinguished. The first aims to discover symbolic and thus interpretable expressions for the material models, as in the approaches we just described. The second is concerned with the training of noninterpretable black-box machine learning models for encoding the material response. Examples of the latter include machine learning models like splines [204], Gaussian processes [205], neural ordinary differential equations [206], and—the arguably most popular choice—ANNs [207]. Although purely data-driven constitutive models have shown some success [207,208], much effort has been devoted to constraining ANNs such that fundamental physical requirements (such as objectivity, material stability, and thermodynamic consistency) are respected, while maintaining their expressivity. In the context of hyperelasticity, for example, Refs. [209–212] choose specifically designed ANN architectures to ensure the (poly-)convexity of the strain energy density function. In the context of dissipative materials, Refs. [197,213–215] consider ANNs that do not violate thermodynamic requirements (see Ref. [216] for a review). An often disregarded drawback of ANNs is their hunger for data. Supervised training frameworks for ANNs require a large amount of labeled stress–strain data pairs, which are not experimentally accessible and are therefore typically acquired through computationally expensive microstructure simulations, which require the microstructural properties of the material to be known. To counteract the data scarcity, unsupervised training frameworks that solely rely on experimentally available displacement and reaction force data, like the NN-EUCLID proposed by Ref. [211] or the approach presented in Ref. [217], are indispensable. For further details on data-driven material modeling and discovery, the reader is referred to the recent review on the topic in Ref. [218].
While the subsequent analysis focuses on EUCLID, note that in principle any data-driven constitutive model can be implemented within a FEM or VFM code for model discovery, see, for instance, Ref. [217].
3.7 Bayesian Approaches.
where refers to the parameter prior and represents the likelihood function, which can be evaluated by solving the mechanical model.
with the weighted norm for any positive definite matrix A, and recover the NLS approach with Tikhonov regularization, see Ref. [220] for details and further links between Bayesian approaches and regularization. One can even obtain information about the posterior based on this optimization-based approach, by perturbing the data d with a randomly drawn noise vector according to the density . This so-called randomize-then-optimize approach is investigated in a number of papers, see Ref. [221], for instance.
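For a linear forward model with Gaussian noise and a Gaussian prior, the randomize-then-optimize idea admits a compact sketch: each sample is the solution of a Tikhonov-regularized least-squares problem with randomly perturbed data and prior mean. The forward matrix `G`, noise level, and prior width below are arbitrary illustrative choices; in this linear-Gaussian setting, the procedure happens to sample the posterior exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
G = np.array([[1.0, 0.5], [0.2, 1.0], [0.8, 0.3]])  # linear forward model d = G theta
theta_true = np.array([2.0, -1.0])
sigma = 0.05                                        # noise standard deviation
d = G @ theta_true + sigma * rng.standard_normal(3)

tau = 10.0                                          # prior std (weak Tikhonov regularization)

def rto_sample():
    """One randomize-then-optimize sample: Tikhonov LS with perturbed data and prior mean."""
    d_pert = d + sigma * rng.standard_normal(3)
    m_pert = tau * rng.standard_normal(2)
    # min ||G th - d_pert||^2 / sigma^2 + ||th - m_pert||^2 / tau^2  via normal equations
    A = G.T @ G / sigma**2 + np.eye(2) / tau**2
    b = G.T @ d_pert / sigma**2 + m_pert / tau**2
    return np.linalg.solve(A, b)

samples = np.array([rto_sample() for _ in range(2000)])
print(samples.mean(axis=0))                         # approximates the posterior mean
```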
When dealing with model discovery in the context of a possibly large library of candidate basis functions, Bayesian priors are also frequently used to enforce sparsity. For instance, maximum a posteriori estimates yield, together with the Bayesian Lasso and spike-and-slab priors, Bayesian equivalents of the L1- and L0-regularization, respectively. A unifying framework for Bayesian sparsity priors has recently been proposed in Ref. [222].
Fully Bayesian approaches, however, go beyond point estimation and determine the posterior distribution, most often through sampling-based Markov chain Monte Carlo analysis. Here, a major bottleneck is the large computational cost, which is often alleviated by replacing the simulation model with a surrogate. A popular choice is Gaussian process modeling with adaptive sampling, see Ref. [223], for example. A sampling-free approach is put forth in Ref. [224], where polynomial chaos filtering is used to avoid costly sampling. An alternative consists of building a surrogate model for the likelihood function or the posterior directly, which in turn gives easy access to posterior moments and the marginal likelihood function [225]. The drawback here is the concentration of the likelihood and posterior in the large-data/small-noise limit, which requires dedicated adaptive surrogate modeling approaches.
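As a minimal illustration of sampling-based posterior exploration, the following random-walk Metropolis sketch infers a Young's modulus from noisy synthetic stress–strain data. The forward model is deliberately cheap, so no surrogate is needed; all numerical values are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy model: uniaxial bar, stress = E * strain; infer E from noisy stresses
strains = np.linspace(0.001, 0.01, 10)
E_true, sigma = 210e3, 50.0                    # MPa; noise std of the stress data
stresses = E_true * strains + sigma * rng.standard_normal(10)

def log_post(E):
    if E <= 0:                                 # flat prior on E > 0
        return -np.inf
    r = stresses - E * strains
    return -0.5 * np.sum(r**2) / sigma**2      # Gaussian log-likelihood (up to a constant)

# Random-walk Metropolis sampling of the posterior
E, chain = 100e3, []
logp = log_post(E)
for _ in range(20000):
    E_prop = E + 2e3 * rng.standard_normal()   # symmetric random-walk proposal
    logp_prop = log_post(E_prop)
    if np.log(rng.random()) < logp_prop - logp:
        E, logp = E_prop, logp_prop
    chain.append(E)
posterior = np.array(chain[5000:])             # discard burn-in
print(posterior.mean(), posterior.std())
```

The posterior spread quantifies the uncertainty in the identified modulus, which a pure point estimate would not provide.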
Bayesian parameter estimates have been employed in the context of biological hyperelastic models [226], but also for phase-field fracture modeling [227,228], composite modeling [229], viscoelasticity [230,231], and plasticity [232,233], to name a few. See Ref. [234] for various applications in the context of computational mechanics and Ref. [235] for a comparison of Bayesian and deterministic parameter estimation approaches.
Another feature that makes Bayesian parameter estimation interesting for mechanics is the ability to compare competing models based on model selection criteria, such as Bayes factors. This is proposed, e.g., in Ref. [236] for comparing hyperelastic material models and in Ref. [237] for selecting continuum damage mechanics models.
The Bayesian reasoning also underlies machine learning approaches, in particular Gaussian process models and Bayesian ANNs. A recent survey covering Bayesian methods in the context of PINNs is given in Ref. [238].
see Ref. [239]. Recently, the role of model error has been emphasized in the statistical finite element framework [240], which is mainly designed to update an uncertain state from noisy data. The reader is referred to Ref. [241] for a reaction-diffusion model and Ref. [242] for the case of hyperelastic material models. The importance of accounting for errors in the state representation for parameter identification is highlighted, e.g., in Ref. [243]. Therein, the authors show that a statistical representation of the discretization error leads to improved parameter estimates for several inverse problems. A similar approach is adopted for the calibration of hyperelastic materials in Ref. [226]. In the most basic case, one could simply inflate the covariance matrix to account for measurement and model error at the same time.
Note that there exist many other approaches to quantify uncertainties—employing, for instance, frequentist or interval-based methods. These methods have been applied in solid mechanics to some extent, but are not the focus of this paper. Various frequentist approaches and a comparison to the Bayesian approach are included and cited in Sec. 5. Since publications in this direction typically do not employ the frequentist label, a systematic review of these approaches is, to the best of our knowledge, not yet available.
Remark 1. There is a research direction of data-driven computational mechanics that entirely avoids the specification of material constitutive models [244]. This research is not included in the review because it does not feature material parameters, or parameters that define a model structure, that can be estimated or inferred, which is the main theme of this paper. Indeed, the approach in Ref. [245] is closer to nonparametric methods for constitutive modeling, see, e.g., Ref. [246], which is not in the scope of this review. In the case of model-free and prior-free data-driven inference [247], only a loss function, not a likelihood function, can be defined. Therefore, this method does not fit our setting, where the likelihood function plays a central role. Moreover, the prior plays an important role in our paper as well, because it may either incorporate prior knowledge on the ranges of physical parameters or enforce sparsity in the model discovery family of methods.
4 Unified Framework for Parameter Estimation
This section presents an abstract and unifying view on parameter identification. In particular, we introduce a parameter estimation framework which covers most available approaches in the computational mechanics literature. In the inverse problems community, parameter estimation problems are often classified into “reduced” and “all-at-once” approaches, see Refs. [248] and [249]. In an optimization context, the authors in Ref. [250] distinguish between “black-box” and “all-at-once” approaches, where the term “black-box” emphasizes that the reduced setting typically relies on an existing simulation code for the parameter estimation task. Moreover, Ref. [251] discusses all-at-once formulations for PDE-constrained optimization.
The reduced approach for parameter estimation is mainly centered around the NLS method and aims at identifying the unknown model parameters. The all-at-once approach additionally incorporates the state equation, e.g., Eq. (38) for problem class I, directly in the identification procedure. Thus, it allows the mechanical state and the model parameters to be identified simultaneously. This approach has also been discussed in the context of the equation error method, see Ref. [252] for a combination of the NLS and equation error method. In computational mechanics, there exists a third important approach, the VFM. In the following, these three important cases are discussed and the links between them are established.
In the solid mechanics literature, the all-at-once concept is not widely discussed. As we point out later, the all-at-once setting is less restrictive regarding the model assumptions than the reduced approach, and it provides a suitable framework for the recently introduced PINN-based identification approaches. Moreover, it may result in quite diverse iterative parameter estimation methods regarding implementation and cost. In the following, we will first briefly introduce the basic concepts of reduced and all-at-once approaches. Comparative aspects regarding first-order optimality conditions are the topic of Sec. 4.2. Then, different variants of reduced (Sec. 4.3) and all-at-once approaches (Sec. 4.5) are discussed. VFM and EUCLID are the subject of Sec. 4.4. As novel contributions, we propose both FEM- and VFM-based all-at-once formulations in Sec. 4.5.1.
4.1 Introducing Reduced and All-At-Once Approaches.
The state and parameter vector y and take values in the sets . The so-called observation operator O relates the model state y to the observational data d, which may be the raw data or some postprocessed version of it.
with the definition of a distance function, see Eq. (54). Approaches of this type correspond to the so-called reduced formulation. The main advantage of this approach is that the minimization is carried out for the parameter only, and Eq. (71)1 is fulfilled exactly. It must be mentioned that the observation operator can take many forms and may also account for integrated and/or indirectly measured quantities. For instance, O may be chosen to select a part of the full displacement vector in the case of DIC data. If, instead, the data d consist of strains, the observation operator computes the strains from the displacements first. Also, if measurements are available at a set of points in the computational domain, O can be defined to interpolate the discrete solution at the sensor locations. Here, “sensor locations” are the experimental evaluation positions, which, in the case of optical measurement data, may be very numerous.
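To make the different forms of the observation operator O concrete, the sketch below builds two linear operators acting on a discrete 1D displacement vector: a Boolean selection of measured DOFs (as for pointwise DIC data) and a linear interpolation to an off-node sensor position. Grid size, sensor DOFs, and sensor location are made-up values.

```python
import numpy as np

n_dof = 8
y = np.linspace(0.0, 1.0, n_dof)         # discrete displacement state on a uniform grid
sensor_dofs = [2, 5]                     # DOFs where data are available

# Selection operator: picks the observed DOFs out of the full state vector
O_sel = np.zeros((len(sensor_dofs), n_dof))
for i, j in enumerate(sensor_dofs):
    O_sel[i, j] = 1.0
d_pred = O_sel @ y                       # predicted data at the sensor DOFs

# Interpolating operator: a sensor at x = 0.37, between nodes 2 and 3
x, h = 0.37, 1.0 / (n_dof - 1)
j = int(x // h)
w = (x - j * h) / h                      # local interpolation weight
O_int = np.zeros(n_dof)
O_int[j], O_int[j + 1] = 1.0 - w, w
print(d_pred, O_int @ y)                 # selected values and interpolated sensor value
```

Both operators are linear here; for strain data, O would additionally apply a discrete differentiation to the displacements.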
where both y and are determined at the same time (hence the name of all-at-once approaches). In the case of elastic materials and displacement-independent loads for finite element equations, the function represents the distance between internal and external forces, which is a helpful interpretation for the later discussion.
which contains both physics- and data-based loss terms. The constants and can be used to weigh the importance of the individual contributions or to simply rescale the different loss terms. Such a conditioning of the objective function is often crucial to achieve numerical convergence, see Ref. [184], for example. In the setting introduced so far, the residual F can be discretized with any numerical method.
4.2 First-Order Optimality Conditions.
In the following, we will discuss connections and differences between reduced and all-at-once approaches as well as the VFM by inspecting necessary first-order conditions (FOCs). Exemplarily, we thereby consider elastic problems, i.e., problem classes I and II. The main lines of this section follow [3], which reports a comparison of reduced and all-at-once approaches in an abstract Banach space setting.
In general, there may be multiple parameters for which the minimum is attained, which would require to write in Eq. (76). In this case, the equal sign simply chooses a single element of the set.
Next, we derive the FOCs for both approaches.
With Eq. (85), we arrive at Eq. (93)1. Hence, we have shown the equivalence of Eq. (93) and Eq. (89).
where a strong connection to the reduced FOCs (93) becomes apparent. In the limit , we recover the reduced optimality conditions from those of the all-at-once formulation.
which is precisely the FOC of the VFM. We have summarized the connections and relations for reduced, all-at-once approaches as well as the VFM in Table 2.
Unified view on parameter identification
| | All-at-once approach | Reduced approach | Virtual fields method |
|---|---|---|---|
| Objective function | | | |
| Parameters | Joint parameter vector | Material parameters | Material parameters |
| First-order optimality conditions | | | |
4.3 Reduced Approach.
Following this approach, the parameter-to-observable map predicts the data d for each , see also Table 3.
Solution and parameter-to-observable maps that enable the reduced formulation
| Solution map | |
|---|---|
| Parameter-to-observable map | |
The reduced approach has already been anticipated in Eqs. (51) or (52), and is in particular associated with the NLS-FEM approach, cf. the minimum problem (55) and Sec. 3.3.1. The reduced approach for the calibration of mechanical models is reviewed, e.g., in Ref. [47] with a focus on inelastic models.
Based on Eq. (104), the parameter vector can now be tuned to minimize the prediction-data mismatch . The main advantage of this approach is that only an optimization problem in the parameter domain needs to be solved. Moreover, the solver may be employed as a black box, providing a data prediction for each parameter value. The drawback is the restriction (102), which may not hold in practice. In the following, we proceed by discussing several specific problems.
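A minimal sketch of the reduced approach with a black-box solver: a two-DOF spring chain acts as a stand-in forward model, and a standard least-squares routine tunes the stiffness parameters to match noisy synthetic measurements. Model, loads, starting values, and noise level are invented for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

# Black-box "solver": 2-DOF spring chain with stiffnesses theta = (k1, k2);
# returns the nodal displacements (the state y) for fixed loads f
f = np.array([1.0, 2.0])
def solve(theta):
    k1, k2 = theta
    K = np.array([[k1 + k2, -k2], [-k2, k2]])   # stiffness matrix
    return np.linalg.solve(K, f)

rng = np.random.default_rng(3)
theta_true = np.array([10.0, 5.0])
d = solve(theta_true) + 1e-4 * rng.standard_normal(2)   # noisy synthetic data

# Reduced approach: optimize over the parameters only; the state equation is
# satisfied exactly inside the solver, which is called as a black box
res = least_squares(lambda th: solve(th) - d, x0=[5.0, 2.0], bounds=(1e-3, np.inf))
print(res.x)
```

Each residual evaluation requires one full forward solve, which is the characteristic cost of the reduced formulation.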
4.3.1 NLS-FEM-Based Reduced Approach.
In the following, the reduced approach using the NLS-FEM method is discussed.
Problem Classes I and II: Elasticity.
The NLS-FEM approach of Sec. 3.3 can be related to the reduced formulation. For models in which the material parameters or the state quantities are included linearly, see, for example, Eqs. (39) or (46), there is no significant advantage with regard to the optimization problem. Thus, the iterative computation of Eq. (89) has to be carried out.
Problem Class III: Inelasticity.
Here, the last equation results from Eq. (24)2. The solution of this DAE-system yields the sensitivities . In Refs. [84–86], the system (108) is denoted as “simultaneous sensitivity equations”, see also Ref. [89] in the context of the FEM.
Since direct access to the finite element code is required in the internal numerical differentiation version, an alternative for the case of black box finite element programs has to be considered. This is called “external numerical differentiation”, see Refs. [84] and [89], where finite differences are chosen to obtain the derivatives in Eq. (114).
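The idea of external numerical differentiation can be sketched generically: the sensitivities are approximated by forward finite differences around a black-box solver, at the cost of one additional solver call per parameter. The toy solver below is hypothetical.

```python
import numpy as np

def fd_jacobian(solve, theta, h=1e-6):
    """External numerical differentiation: finite-difference sensitivities
    dy/dtheta_k obtained by repeated calls to a black-box solver."""
    y0 = solve(theta)
    J = np.zeros((y0.size, theta.size))
    for k in range(theta.size):
        tp = theta.copy()
        tp[k] += h
        J[:, k] = (solve(tp) - y0) / h   # forward difference: one extra solve per parameter
    return J

# Hypothetical black-box solver: y(theta) = (theta1 * theta2, theta1**2)
solve = lambda th: np.array([th[0] * th[1], th[0] ** 2])
J = fd_jacobian(solve, np.array([2.0, 3.0]))
print(J)   # analytic Jacobian at (2, 3) is [[3, 2], [4, 0]]
```

The step size h trades off truncation against round-off error; central differences halve the truncation error at twice the number of solver calls.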
4.3.2 Parametric PINN-Based Reduced Approach.
In the following, the general procedure for parameter identification with parametric PINNs is presented for the different problem classes. We thereby consider PINN formulations based on the strong form of the balance of linear momentum (7), which can be regarded as a meshfree collocation method. Note that it is equally possible to approximate other physical principles using ANNs, e.g., the minimum of potential energy. This approach is known as the deep energy method, see Refs. [254] and [255] for a mixed formulation.
From here on, the superscript denotes quantities defined at collocation points . For an introduction to the notation of ANNs, the reader is referred to Appendix C.
Problem Classes I and II: Elasticity.
Herein, reaction forces at Dirichlet boundaries have been neglected. Since these are relevant in the context of displacement-controlled experiments, they could be accounted for by an additional loss term or by using the method of Lagrange multipliers, see Sec. 3.1 in the context of the FEM.
Spatial derivatives occurring in the semidiscrete model (120) are computed using automatic differentiation. Alternatively, the automatic differentiation can be enriched or even replaced by incorporating discrete derivative operators known from other numerical methods, e.g., peridynamics [256].
Note that the mean squared error is often minimized in machine learning, see Ref. [257]. Here, we keep the squared error for the sake of consistency with the rest of the paper. After training, the material parameter vector can be identified using a reduced approach.
where denotes the parametric PINN surrogate model, see also Sec. 3.5. The partial derivatives can be computed using automatic differentiation, see also Appendix C.
It becomes apparent that Eq. (130) is analogous to the reduced formulation of the FEM-based reduced approach (87). However, it is important to note that the state Eq. (71)1 only holds in the least-squares sense, as expressed in Eq. (124), for reduced PINNs. This also explains the classification of parametric PINNs as surrogate models, see Sec. 3.5.
Problem Class III: Inelasticity.
whereby the full state vector and its rate are obtained through the assembling step (122).
Next, the full inelastic discrete model is assembled as in Eq. (123) and the training follows in analogy to Eq. (124). Parameter identification once again requires minimizing (127) by solving the minimum problem (128). Here, we restrict ourselves to cases where the observation operator depends on the state y only, and not on its rate . As a consequence, the same necessary FOC (129) as in the elastic case can be applied.
4.4 Virtual Fields Method and EUCLID.
There are methods that do not fit exactly into either the reduced formulation or the all-at-once approach. The VFM, for instance, is based on an optimization problem over the parameter domain only, while replacing the unknown state with experimental data directly. Hence, it shares similarities with the reduced approach. In the following, we discuss the material parameter identification process for the VFM in Sec. 4.4.1. A generalization of the VFM to model discovery is EUCLID, presented in Sec. 4.4.2. An alternative approach to model discovery that also relies on VFM can be found in Ref. [217], but is not discussed in the present paper.
4.4.1 Virtual Fields Method.
see Eq. (101) as well.
Problem Class I: Linear Elasticity.
In other words, a very small system of linear equations has to be solved, which makes the method attractive for the problems under consideration.
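For a 1D linear elastic bar, this reduces to a single equation: choosing the virtual displacement u*(x) = x (unit virtual strain), the principle of virtual work gives E * A * int(eps dx) = Fext * L, so the modulus follows directly from the measured strain field and end force. Geometry, load, and noise level below are assumed values.

```python
import numpy as np

# 1D bar with measured axial strain field eps(x) and measured end force Fext.
# Virtual field u*(x) = x  =>  unit virtual strain, and virtual work yields
#   E * A * int(eps dx) = Fext * L   (one linear equation in E)
L, A, Fext = 1.0, 1e-4, 2.1e3                   # m, m^2, N (illustrative)
E_true = 210.0e9                                # Pa
rng = np.random.default_rng(4)
x = np.linspace(0.0, L, 101)
eps = Fext / (E_true * A) * np.ones_like(x)     # synthetic constant strain "data"
eps += 1e-7 * rng.standard_normal(x.size)       # DIC-like measurement noise

# Trapezoidal rule for the internal virtual work coefficient of E
coeff = A * np.sum(0.5 * (eps[1:] + eps[:-1]) * np.diff(x))
E_hat = Fext * L / coeff
print(E_hat / 1e9)                              # identified modulus in GPa
```

No boundary value problem is solved at any point: the measured state replaces the unknown state, which is the defining feature of the VFM.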
To further reduce the effort of the matrix–matrix multiplication in Eq. (140), a different implementation can be considered, see Appendix B, Eq. (B12), where only one resulting force is extracted.
We further note that the VFM is closely related to the FEM-based reduced approach for problem class I. This is shown in Ref. [2], where it is observed that the objective functions of the minimization problems of the VFM and the FEM-based reduced approach differ only by the choice of the norm that measures the difference between the measured and the simulated displacements.
Problem Class II: Hyperelasticity.
Hence, a system of nonlinear equations must be solved to determine .
Problem Class III: Inelasticity.
and we obtain a coupled system of nonlinear equations. For stiff problems or unstable evolution equations, as discussed in Ref. [259], the fact that the integration step for the evolution equations is not exactly satisfied may lead to problems. Thus, a different approach may be needed, which is discussed in the following.
This can be interpreted as the simultaneous sensitivity equation for the VFM.
vanishes due to condition (153)2. In other words, Eq. (153)1 represents a small system of nonlinear equations, which is coupled to the integration step of the internal variables (153)2. Various special cases can be treated—such as linearity in the material parameters, precalculation of the internal variables, and others, depending on the structure of the constitutive equations. This is not discussed in detail here.
4.4.2 EUCLID.
In EUCLID (see also Sec. 3.6), instead of a specific material model, a so-called model library with a large number of material parameters (e.g., on the order of hundreds [200] or thousands [203] of parameters) is employed in the formulation of the model F. Applying the previously discussed identification methods to such a model library would result in a highly ill-posed minimization problem and in uninterpretable and impractical models with a large number of nonzero parameters. In order to obtain interpretable models, the number of nonzero parameters in the identified parameter vector must be as small as possible, which is ensured using Lasso regularization by modifying the objective functions (136) and (144).
Problem Class I: Linear Elasticity.
We note that, for the linear elastic case, there is not much freedom in how to choose the material model. The material characterization problem essentially boils down to identifying the elasticity tensor. Thus, formulating a model library for linear elasticity is not meaningful, and applying EUCLID is unnecessary.
Problem Class II: Hyperelasticity.
Due to the nonsmooth regularization term, deriving a linear system of equations via the FOC as for the VFM is not possible and the solver has to be chosen thoughtfully. Efficient solvers for problems of this type have been proposed, for instance, by Refs. [186] and [260]. The assumption that the strain energy density depends linearly on the material parameters covers well-known hyperelastic material models like Rivlin-Saunders or Hartmann-Neff-type models. Other models, e.g., those of Ogden type, do not fall into this category. In Refs. [200] and [203], a discretization of the parameter space is used to recast the latter models in the form (156). So far, EUCLID has been proposed in conjunction with the VFM, which is also the basis for the following presentation. Nevertheless, the concept can be transferred to other identification approaches as well.
Problem Class III: Inelasticity.
The choice of the model library, which determines the characteristics of and the ODE constraint, is not as straightforward as for elastic problems. In Ref. [199], a Fourier expansion of the yield surface is proposed to formulate a general library for plasticity models, and in Ref. [201], the concept of generalized standard materials [18] is leveraged to construct a general model library containing elastic, viscoelastic, plastic, and viscoplastic material models.
4.5 All-At-Once Approach.
Here, and denote nominal values, which are either zero or express prior knowledge about the expected state/parameter values, and are penalty parameters. However, we can also consider cases where only contains a penalty regularization on the state, as discussed in Ref. [3], because the high-dimensional state vector is the main source of ill-posedness in the inverse problem. An alternative approach to solve ill-posed inverse problems without explicit regularization is the Landweber iteration, a particular form of gradient descent. Interestingly, the Landweber iteration results in quite different implementations for the reduced and all-at-once approaches, as noted in Ref. [3]. In particular, the reduced Landweber iteration requires the repetitive solution of systems of equations, whereas the all-at-once Landweber iteration only requires matrix–vector multiplications. A short description of the method can be found in Appendix F.
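The matrix-vector-only character of the all-at-once Landweber iteration can be seen in a minimal sketch for a linear, noiseless toy problem (matrix and data are synthetic; for noisy ill-posed problems, the iteration would have to be stopped early as a form of regularization).

```python
import numpy as np

def landweber(A, d, omega, iters):
    """Landweber iteration x_{k+1} = x_k + omega * A^T (d - A x_k):
    only matrix-vector products, no linear solves."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + omega * A.T @ (d - A @ x)
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((50, 10))
x_true = rng.standard_normal(10)
d = A @ x_true                              # noiseless synthetic data

omega = 1.0 / np.linalg.norm(A, 2) ** 2     # step size below 2/||A||^2 ensures convergence
x_hat = landweber(A, d, omega, iters=2000)
print(np.linalg.norm(x_hat - x_true))       # reconstruction error
```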
Despite the challenges of a high-dimensional parameter space, the all-at-once approach is appealing because of its flexibility, but also because it may result in efficient iterative methods. A Bayesian all-at-once approach is outlined, for instance, in Ref. [263]. In Ref. [261], the all-at-once approach is presented in combination with an ensemble Kalman inversion method. Therein, a similarity to the PINN setting is indicated [261], and in Sec. 4.5.2, we elaborate on this connection in detail.
4.5.1 FE- and VFM-Based All-At-Once Approaches.
Consisting of and equations, respectively. Note that the discretized equilibrium conditions are no longer exactly fulfilled, see Eq. (163)2. If a Newton-like scheme is applied, the second derivatives of F with respect to y and are required, making the scheme difficult from an implementation point of view. Thus, a Gauss–Newton scheme that circumvents this drawback is of higher interest, or the Landweber iteration may be applied, see Appendix F.
Problem Class I: Linear Elasticity.
where we also used Eq. (44). Since depends on and depends on u, the equations are certainly nonlinear (in the displacements u). For example, an iterative block Gauss–Seidel method could be considered to obtain an efficient procedure, where stepwise systems of linear equations are solved to determine u, p, and . An alternative is provided in Appendix F. An open question is the influence of uncertain measurement data in Eq. (164)1, which affect the violation of the equilibrium conditions due to the coupling to Eq. (164)2.
Problem Class II: Hyperelasticity.
In the case of hyperelasticity with constitutive models that depend linearly on the material parameters, we have defined in Eq. (48), which is coupled with Eq. (164)2 (of course, with a matrix written for large strains). Thus, we obtain a very similar structure of equations. If this is not the case, the more general form (163) has to be evaluated.
Problem Class III: Inelasticity.
where Eq. (167) represents the differential part of the DAE-system.
An alternative approach might be to apply the implicit function theorem and insert the result into Eq. (165), which reduces the number of equations. Alternatively, the time discretization of the DAE-system can be performed first, resulting in a similar approach as for problem class II.
Further Considerations.
Here, describes the measurement data of the reaction forces. In the following, the above method is referred to as all-at-once-FEM (AAO-FEM). The matrices, e.g., , are defined in Eq. (B15).
Since in Ref. [2] the Euclidean norm is related to the FEM-based reduced approach and the -seminorm is related to the VFM, we denote the method (170) as AAO-VFM. As changing the norm to a seminorm has an influence on the solution uniqueness, a proper regularization might be necessary to treat the problem, which is beyond the scope of this work.
4.5.2 PINN-Based All-At-Once Approach.
The training of an inverse PINN can be classified as an all-at-once approach: Both network parameters and material parameters are identified simultaneously, expressed by the vector of trainable parameters , see also Table 4. In comparison to the AAO-FEM approach, the state vector y is not identified directly, but through its parameterization in network parameters , as outlined below for the different problem classes.
Problem Classes I and II: Elasticity.
As for parametric PINNs, the full state vector and the full elastic discrete model are assembled in analogy to Eqs. (122) and (123), respectively.
To avoid trivial solutions and ensure the identifiability of material parameters , it is important to incorporate force information into the loss function. This can be achieved either via (i) volumetric forces, (ii) Neumann boundary conditions, or (iii) the balance of internal and external work, see Refs. [257] and [264], for example. For the aspect of identifiability, see also Sec. 5.1.1.
Since the network parameters and the material parameters differ considerably in their respective orders of magnitude, the formulation of the loss function is crucial for the convergence of (173), as discussed in detail in Ref. [264]. This might be achieved, for example, by adaptive weighting of the loss terms [265,266].
Problem Class III: Inelasticity.
5 Statistical Parameter Inference
i.e., we model the probability density of the data conditional on a specific value of the parameters. Here, represents the number of data variables. The conditional density can be linked to the likelihood function as . The likelihood function plays a crucial role for both frequentist and Bayesian approaches.
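For an additive Gaussian noise model, the likelihood function takes a simple closed form. The following sketch (with a hypothetical one-parameter linear model; all names and values are illustrative, not from the text) shows how the log-likelihood scores parameter candidates against data:

```python
import numpy as np

def gaussian_log_likelihood(theta, data, model, sigma):
    """Log-likelihood of the data under an additive Gaussian noise model.

    theta : material parameter vector (illustrative)
    data  : observed measurements d
    model : callable returning the simulated response s(theta)
    sigma : standard deviation of the measurement noise
    """
    residual = data - model(theta)
    n = residual.size
    return (-0.5 * n * np.log(2.0 * np.pi * sigma**2)
            - 0.5 * np.sum(residual**2) / sigma**2)

# toy linear model: one stiffness-like parameter scales a known strain vector
strain = np.array([0.001, 0.002, 0.003])
model = lambda theta: theta[0] * strain
data = 210_000.0 * strain

ll_true = gaussian_log_likelihood(np.array([210_000.0]), data, model, 1.0)
ll_off = gaussian_log_likelihood(np.array([200_000.0]), data, model, 1.0)
```

The likelihood is maximal at the true parameter value, which is the basis of the frequentist maximum likelihood estimate discussed below.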
Here, ns represents the number of state variables. It should be noted that the high-dimensional parameter vector poses severe problems for the inference approach. Bayesian approaches are particularly appealing in this case because the prior enables regularization, which is needed when inferring a high-dimensional state variable. A Bayesian all-at-once approach was outlined in Ref. [263].
5.1 Frequentist Approach.
in the infinite-data limit , where denotes the true but unknown parameter value and where the matrices Q and Z are detailed in Secs. 5.1.2 and 5.3.1, in the context of uncertainty quantification. Before that, the important topic of parameter identifiability is discussed in Sec. 5.1.1.
Note that the all-at-once approach can be handled along the same lines by using a for the quantities s, d, and e and by replacing with , see also Eq. (183).
5.1.1 Identifiability.
The evaluation of the Hessian matrix H provides information on whether a unique solution exists locally, at the obtained solution. If the determinant of the Hessian matrix or any of its subdeterminants vanishes, there is no unique solution, see Refs. [44] and [45]. This concept is called local identifiability and has been investigated for common material models of solid mechanics in Ref. [43]. This approach can be adopted if there are no constraints in the optimization problem (or if none of the constraints are active at the solution). Moreover, it is difficult to decide whether the determinant is really zero, especially since very small values can occur because of large amounts of data and/or the assigned units. To circumvent this issue, Ref. [45] describes a measure based on the ratio of the eigenvalues of the Hessian matrix. In Refs. [48] and [49], this is linked to stability investigations, where the eigenvalues are also evaluated. Here, stability implies that small perturbations of the experimental data do not lead to significant changes in the parameters [69]. The evaluation of local identifiability is useful for comparing different parameter identification procedures; such studies are usually performed as re-identifications of given parameters from synthetic data.
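The eigenvalue-based identifiability check described above can be sketched as follows, using the Gauss-Newton approximation H = JᵀJ and a scale-robust eigenvalue ratio in place of the raw determinant; the Jacobians and the tolerance are illustrative assumptions, not values from the text:

```python
import numpy as np

def local_identifiability(J, rel_tol=1e-8):
    """Check local identifiability via the Gauss-Newton Hessian H = J^T J.

    J : Jacobian of the residual vector w.r.t. the parameters.
    Instead of testing det(H) == 0 directly (which is scale dependent),
    compare the smallest to the largest eigenvalue, following the
    eigenvalue-ratio idea discussed in the text.
    """
    H = J.T @ J
    eig = np.linalg.eigvalsh(H)          # ascending order, H is symmetric
    ratio = eig[0] / eig[-1]
    return ratio, ratio > rel_tol

# identifiable case: two independent sensitivity directions
J_good = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
# non-identifiable case: second column is a multiple of the first
J_bad = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

r_good, ok_good = local_identifiability(J_good)
r_bad, ok_bad = local_identifiability(J_bad)
```

In a re-identification study, J would be the numerically computed sensitivity of the simulated response with respect to the material parameters, evaluated at the obtained solution.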
The concept of local identifiability has been applied in several works, see, for instance, Refs. [43], [46], and [52]. The case of finite strain viscoelasticity using DIC data is addressed in Ref. [91], and finite strain viscoplasticity in Ref. [54]. With regard to the identifiability of parameters, see also Ref. [268].
Again, the same conditions on identifiability apply to . Note that in the context of ANNs, identifiability of is usually not considered.
5.1.2 Uncertainty Analysis.
Here, we have used the Hessian approximation derived in the preceding subsection and the subscript 0 refers to evaluation at the true value . Note that corresponds to the inverse Fisher information matrix, whereas represent the factors appearing in the Huber sandwich [269].
5.2 Bayesian Approach.
The following considerations are derived for the reduced approach. In analogy to Sec. 5.1, the all-at-once approach can be handled along the same lines by using a for the quantities s, d, and e and by replacing with , see also Eq. (183).
In the Bayesian approach, a prior density is formulated in addition to the likelihood function. Hence, a Bayesian procedure posits a model for the data (likelihood) and a distribution over the parameter space (prior), which expresses plausible ranges or other physical constraints without any recourse to an observation. The result is a distribution over the parameter domain, conditional on the observations (posterior), which expresses our uncertainty about the unknown parameters and from which an estimate of the parameters themselves can be extracted. Bayesian inference can be carried out by treating the model as a black box, similar to LS-based approaches; however, the model structure can also be used explicitly to improve the numerical efficiency.
Hence, the prior naturally leads to a regularized LS problem, and the frequentist estimate (185) is recovered for a uniform (noninformative) prior.
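For a linear model with Gaussian likelihood and Gaussian prior, this correspondence can be made explicit: the maximum a posteriori (MAP) estimate solves a Tikhonov-regularized normal equation. A minimal sketch with assumed toy dimensions and noise levels (not values from the text):

```python
import numpy as np

# toy linear forward model s(theta) = A @ theta with Gaussian noise
rng = np.random.default_rng(3)
A = rng.normal(size=(30, 2))             # placeholder sensitivity matrix
theta_true = np.array([2.0, -1.0])
sigma = 0.1                              # assumed data noise std
d = A @ theta_true + rng.normal(0.0, sigma, 30)

theta_prior = np.array([0.0, 0.0])       # prior mean
tau = 10.0                               # prior std (deliberately weak)

# MAP estimate: minimize ||A th - d||^2/sigma^2 + ||th - th_prior||^2/tau^2,
# i.e., a regularized LS problem with closed-form solution
H = A.T @ A / sigma**2 + np.eye(2) / tau**2
theta_map = np.linalg.solve(H, A.T @ d / sigma**2 + theta_prior / tau**2)
```

As tau grows, the prior term vanishes and the MAP estimate approaches the plain LS solution, mirroring the uniform-prior limit stated above.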
But even in the case of normally distributed data and a normally distributed prior, the posterior is not normally distributed for finite sample sizes. This is because of the nonlinear dependence between the parameter vector and simulated data . The posterior density therefore needs to be approximated numerically, where a large variety of algorithms are now in use, see, e.g., Ref. [271] for an overview of Markov chain Monte Carlo approaches and Ref. [272] for variational approaches to Bayesian parameter estimation.
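As a minimal illustration of such sampling schemes, the following sketch implements a random-walk Metropolis sampler (far simpler than the state-of-the-art algorithms cited above) for a toy one-dimensional posterior; all values are illustrative:

```python
import numpy as np

def metropolis(log_post, theta0, n_steps, step, rng):
    """Random-walk Metropolis sampler, the simplest MCMC scheme.

    log_post : unnormalized log-posterior (log-likelihood + log-prior)
    """
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.size)
        lp_prop = log_post(prop)
        # accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

# toy posterior: Gaussian centered at 2.0 with standard deviation 0.5
log_post = lambda th: -0.5 * ((th[0] - 2.0) / 0.5) ** 2
rng = np.random.default_rng(0)
chain = metropolis(log_post, [0.0], 20_000, 0.8, rng)
burned = chain[10_000:]                  # discard burn-in samples
```

The retained samples approximate the posterior; their mean and standard deviation serve as the parameter estimate and its uncertainty.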
Once approximated, the posterior distribution provides insights on the parameter dependence structure and on the parameter uncertainty. For instance, we can extract the posterior covariance matrix, which is appealing if the posterior is unimodal and close to a normal distribution. In the case of a multimodal non-Gaussian posterior, empirical credible intervals can be derived directly from the Markov chain.
5.2.1 Linkage to Frequentist Approach.
for or in the distribution, where denotes the total variation distance – a distance measure for probability distributions. The limit statement (205) is known as the Bernstein-von-Mises theorem and implies that, asymptotically, the posterior distribution contracts to the true value and that frequentist confidence intervals are asymptotically equivalent to their Bayesian counterparts derived from . Note that the theorem requires the data to be generated from the model with the true parameter , see Ref. [273], and it is known that Bayesian methods perform suboptimally under model misspecification. Bernstein-von-Mises theorems for the case of a model misspecification are reviewed in Ref. [274].
5.2.2 Identifiability in the Bayesian Approach.
Regarding the concept of solution uniqueness, the recently published article [275] states (relatively mild) conditions under which a unique posterior measure exists, which depends continuously on the observed data in appropriate distances between probability measures. This underlines the global viewpoint of Bayesian methods, contrary to the local identifiability analysis outlined for frequentist approaches in the previous subsection. Please note that a precise definition of identifiability in a Bayesian context is still a topic of discussion [276].
5.3 Inference for Complex Materials With Two-Step Approach.
When parameter identification is carried out for inelastic materials, it is common to identify the elastic parameters first before estimating the remaining parameters characterizing the inelastic behavior. The approach can be formalized in a statistical way with the concept of two-step inference methods [277]. A Bayesian counterpart is obtained by hierarchical modeling, as we show in this section. In the following, we focus on the reduced approach.
The underlying assumption is that , i.e., that the elastic data can be explained with the elastic material parameters only. Hence, can be estimated first, independently of the remaining plastic parameters. The uncertainty analysis is more involved in this case because the uncertainty of the initial elasticity inference step needs to be taken into account when inferring the plasticity parameters.
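The two-step idea can be illustrated with a toy one-dimensional tensile curve (linear elasticity followed by linear hardening; all parameter values are made up for this sketch): the elastic modulus is fitted from the elastic region first and then held fixed while the plastic parameters are estimated:

```python
import numpy as np
from scipy.optimize import least_squares

# synthetic tensile data: linear elasticity followed by linear hardening
E_true, sy_true, h_true = 210_000.0, 275.0, 1_500.0
eps = np.linspace(0.0, 0.05, 60)
eps_y = sy_true / E_true
stress = np.where(eps <= eps_y, E_true * eps,
                  sy_true + h_true * (eps - eps_y))

# step 1: elastic parameter from the elastic region only (linear LS)
elastic = eps <= 0.001
E_hat = np.sum(stress[elastic] * eps[elastic]) / np.sum(eps[elastic] ** 2)

# step 2: plastic parameters with the elastic estimate held fixed
def residual(p):
    sy, h = p
    model = np.where(eps <= sy / E_hat, E_hat * eps,
                     sy + h * (eps - sy / E_hat))
    return model - stress

sy_hat, h_hat = least_squares(residual, x0=[200.0, 1_000.0]).x
```

In the uncertainty analysis, the estimate E_hat entering the second step carries its own uncertainty, which is exactly what the two-step inference framework propagates.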
5.3.1 Two-Step Frequentist Approach.
This section covers a derivation of uncertainty estimates through asymptotic normality considerations and Gaussian error propagation, which, to the best of our knowledge, is new in the context of a two-stage parameter identification framework in solid mechanics.
Uncertainty Via Asymptotic Normality.
and their solutions are denoted as . Please note the introduction of scaling factors to simplify the derivations.
holds, where denotes the unknown true value of the elasticity parameter vector. It is, however, more difficult to obtain an accurate uncertainty estimate for the plasticity parameters because the additional uncertainty in needs to be considered as well.
where the quantities and , which are defined in the Appendix, reflect sensitivities of the plasticity parameter estimate to changes in the elasticity parameters. A computable estimate is obtained by replacing all true parameters in the previous equations with their parameter estimates. Hence, all terms appearing in the definition of the covariance matrix (208) have been specified.
Uncertainty Via Gaussian Error Propagation.
i.e., the confidence interval reads . In Ref. [279], this is interpreted as the norm of the Gateaux derivative in the direction of the uncertainties of the individual parameters, where mixed partial derivatives are neglected (i.e., the parameters are assumed to be uncorrelated). Moreover, the same concept is chosen in Refs. [54], [278], and [280] to also estimate the uncertainty in FEM simulations caused by uncertainties in the material parameters. The covariance matrix (198) and confidence interval (200) are often denoted as quality measures in the frequentist approach.
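A generic sketch of this Gaussian error propagation, with partial derivatives approximated by forward differences and correlations neglected as described above (the target function and uncertainty values are illustrative assumptions):

```python
import numpy as np

def gaussian_error_propagation(f, theta, sigma_theta, h=1e-6):
    """First-order Gaussian error propagation, neglecting correlations.

    sigma_f^2 = sum_i (df/dtheta_i * sigma_theta_i)^2,
    with the partial derivatives approximated by forward differences.
    """
    theta = np.asarray(theta, dtype=float)
    f0 = f(theta)
    var = 0.0
    for i in range(theta.size):
        step = h * max(abs(theta[i]), 1.0)
        tp = theta.copy()
        tp[i] += step
        grad_i = (f(tp) - f0) / step
        var += (grad_i * sigma_theta[i]) ** 2
    return f0, np.sqrt(var)

# example: shear modulus G = E / (2 (1 + nu)) from uncertain E and nu
G = lambda th: th[0] / (2.0 * (1.0 + th[1]))
G0, sigma_G = gaussian_error_propagation(G, [210_000.0, 0.3],
                                         [2_000.0, 0.005])
```

The same routine can propagate parameter uncertainties through an FEM simulation by taking f as the simulation output of interest.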
5.3.2 Hierarchical Bayesian Approach.
Equation (216) represents a hierarchical model for the posterior of the two-step problem. The posterior distribution of the plasticity calibration step contains as a hyperparameter, which can be inferred at the next level. The final level, i.e., the prior densities, is not explicitly restated here.
6 Examples
This section presents illustrative examples for the theoretical considerations of Secs. 4 and 5. The numerical setups for the different reduced approaches, the VFM and the all-at-once approaches, outlined in Sec. 4, are the topic of Sec. 6.1. For the reduced approaches, results are presented both in a deterministic and in a stochastic setting based on Bayesian inference. Section 6.2 covers a comparison between the deterministic reduced approach using LS-FEM and model discovery with EUCLID for finite-strain hyperelasticity.
In Sec. 6.3, we present results for the novel two-step inference procedure with inelastic materials, as discussed in Sec. 5.3. In this context, the focus lies on the uncertainty quantification of a small-strain elasto-plasticity model, where full-field data are not considered. Instead, stress–strain data obtained from real-world experiments are used.
Note that the main aim of this section is to illustrate how the various methods reviewed and introduced in this paper can be used in practice for parameter identification. We thereby identify the main capabilities and requirements of each approach. A detailed investigation of the cost-accuracy tradeoff of the individual methods is beyond the scope of this paper.
6.1 Comparison of Reduced and All-At-Once Approaches—Linear Elastic Plate With a Hole.
As a first example, we consider a plate with a hole under tensile load and assume linear elastic isotropic material behavior. The corresponding material parameters under consideration are , where we choose the true values . To compare the results obtained by the different calibration methods, we draw on so-called re-identifications, as follows. The boundary-value problem with prescribed material parameters is first solved with the FEM. The computed displacement data are subsequently polluted with artificially generated noise and used as synthetic full-field data for the model calibration in order to re-identify the previously prescribed material parameters.
The geometry of the domain is shown in Fig. 4. The plate is subjected to a tensile load on the left edge, which is applied as equivalent nodal force on the corresponding nodes in the FEM. Here, symmetry is employed, i.e., only a quarter of the geometry is used for the two-dimensional spatial discretization with four-noded quadrilateral elements (bilinear Q4-elements). Further, a plane stress state is assumed. To minimize the influence of the spatial discretization error, a high-fidelity solution is computed with elements. To mimic DIC data, a linear interpolation of the high-fidelity displacement data to a coarser spatial discretization () is performed. The distance of the finite element nodes in this coarse discretization is approximately —a spatial resolution which can be obtained by real DIC measurements. These artificial full-field displacements are treated as synthetic experimental data (with experimental data points, see App. A) for calibration, and they are studied as clean data and with different levels of Gaussian noise . Following Refs. [161] and [279], the standard deviation of the displacements obtained from DIC measurements is . To investigate the effect of the applied noise on the calibration results, we choose a second noise level as well. The generated displacement data with artificial additive Gaussian noise are shown in Fig. 5.

Generated displacement data with artificial noise for re-identification of elasticity parameters: (a) generated axial displacement data with artificial noise and (b) generated lateral displacement data with artificial noise
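The data-generation procedure, i.e., interpolating a high-fidelity solution onto a coarser measurement grid and adding Gaussian noise, can be sketched as follows; a 1D placeholder field stands in for the actual FEM solution, and all values are illustrative:

```python
import numpy as np

# fine-grid "high-fidelity" displacement field (placeholder 1D profile)
x_fine = np.linspace(0.0, 20.0, 2_001)
u_fine = 1e-2 * x_fine / 20.0            # linear ramp, max 0.01 mm

# step 1: linear interpolation onto the coarser measurement grid
x_coarse = np.linspace(0.0, 20.0, 201)
u_coarse = np.interp(x_coarse, x_fine, u_fine)

# step 2: additive Gaussian noise mimicking DIC measurement uncertainty
rng = np.random.default_rng(42)
sigma = 1e-4                             # noise std in mm (example value)
u_noisy = u_coarse + rng.normal(0.0, sigma, size=u_coarse.shape)
```

The noisy coarse-grid field then plays the role of the synthetic experimental data used by all calibration methods in this example.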
6.1.1 FEM-Based Reduced Approach.
In the following, we first discuss the deterministic parameter identification, followed by a stochastic approach based on Bayesian inference with the FEM.
Deterministic Identification.
A weighted NLS scheme according to Sec. 3.3 or 4.3.1 is applied to re-identify both parameters based on the artificially generated displacement data. The corresponding weighting factors are chosen as the maximum displacement values in each direction to account for the order of magnitude of the displacements. A trust-region reflective algorithm, implemented in the Matlab routine lsqnonlin.m, is applied to find the solution . The termination criteria of the optimizer, namely the change in the objective function value and the change in the arguments, are set to . The re-identified parameters in Table 5 show good agreement with the true values, with slight deviations stemming from the interpolation of the high-fidelity data to the coarser grid. The node-wise relative error between the computed displacements in the solution and the generated displacements is given in Fig. 6. It is noteworthy that the applied noise level does not significantly influence the identification results. Although the determinant of the approximated Hessian (192) is small , the parameters are uniquely identifiable from different initial values. Further, the trust-region algorithm requires the computation of the Jacobian (57), which is done here by means of numerical differentiation employing a forward difference quotient. As a result, 12 calls of the FEM program are required, as the optimizer stops after four iterations (including the initial evaluation of the start values).

Absolute error between generated displacements with and computed displacements with the identified parameters in the NLS scheme with FEM (a) absolute error in the axial displacements with artificial noise and (b) absolute error in the lateral displacements with artificial noise
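A weighted NLS re-identification of this kind can be sketched with SciPy's trust-region reflective solver, which mirrors the Matlab lsqnonlin.m setup described above; a toy uniaxial response replaces the FEM model, and all values are illustrative:

```python
import numpy as np
from scipy.optimize import least_squares

# toy uniaxial surrogate: axial and lateral strains under a set of stresses
stress = np.linspace(10.0, 100.0, 10)    # applied stresses in N/mm^2
E_true, nu_true = 210_000.0, 0.3
eps_ax = stress / E_true                 # "measured" axial strains
eps_lat = -nu_true * stress / E_true     # "measured" lateral strains

def weighted_residual(p):
    E, nu = p
    # weight each component by its maximum value, as described in the text
    r_ax = (stress / E - eps_ax) / np.max(np.abs(eps_ax))
    r_lat = (-nu * stress / E - eps_lat) / np.max(np.abs(eps_lat))
    return np.concatenate([r_ax, r_lat])

# trust-region reflective method with forward-difference Jacobian,
# analogous to lsqnonlin.m; x_scale accounts for parameter magnitudes
sol = least_squares(weighted_residual, x0=[150_000.0, 0.2], method='trf',
                    jac='2-point', x_scale=[1e5, 0.1],
                    ftol=1e-12, xtol=1e-12)
E_hat, nu_hat = sol.x
```

In the actual study, the residual function would call the FEM program instead of the closed-form bar response, which is why each finite-difference Jacobian evaluation costs additional FEM runs.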
Results of re-identification of material parameters for linear elasticity from clean and noisy data
Method | σ (mm) | E (N mm−2) | ν
True values | 210,000 | 0.3000 | ||
Reduced approaches [4.3] | ||||
LS (FEM) [3.3, 4.3.1] | 0 | 205,410 | 0.3007 | |
205,963 | 0.3025 | |||
205,855 | 0.3034 | |||
Bayesian inference (FEM) [5.2, 4.3.1] | 0 | 207,757 | 0.3051 | |
207,857 | 0.3061 |||
207,415 | 0.3064 | |||
LS (parametric PINN) [4.3.2] | 0 | 210,105 | 0.3000 | |
210,184 | 0.3009 | |||
209,779 | 0.3013 | |||
Bayesian inference (parametric PINN) [5.2, 4.3.2] | 0 | 210,671 | 0.3011 | |
210,755 | 0.3019 | |||
210,357 | 0.3025 | |||
VFM [4.4] | ||||
VFM [4.4.1] | 0 | 209,294 | 0.2993 | |
All-at-once approaches [4.5] | ||||
AAO-FEM [4.5.1] | 0 | 210,009 | 0.2581 | |
212,698 | 0.2338 | |||
206,222 | 0.2889 | |||
AAO-VFM [4.5.1] | 0 | 210,018 | 0.2581 | |
212,728 | 0.2336 | |||
206,242 | 0.2857 | |||
AAO-PINN [4.5.2] | 0 | 203,919 | 0.3000 | |
202,441 | 0.3705 | |||
196,018 | 0.4191 |
Bayesian Inference with FEM.
Next, we apply a sampling-based approach according to Sec. 5.2 to identify the material parameters. In particular, we employ an in-house variant of the affine-invariant ensemble sampler [281], which features only one free parameter, the step size, and shows very robust performance for a wide range of densities. Compared to the LS approach, sampling approaches to Bayesian inference require a large number of model evaluations, particularly for high-dimensional problems. Hence, tractable approaches often need to rely on surrogate modeling. Here, however, we sample the original FEM model directly, since the simulation times are manageable, and the posterior density is then free of surrogate modeling errors. We employ uniform priors, covering 10% variation around the nominal parameter values. The Bayesian approach also permits estimating the measurement noise level together with the unknown parameters. In our experiments, however, we prescribe this value, as is done for all other methods discussed herein. For all test cases, we employ an ensemble consisting of 50 chains, each of size 100. The step size is varied to achieve sound acceptance rates. In all cases, the first 50% of each ensemble chain is removed to account for burn-in, i.e., the phase in which the Markov chain explores the parameter space and the samples are not yet representative of the posterior distribution. All chains are merged afterwards to form a sample from the posterior density. As an example, Fig. 7 depicts the posterior histograms for noise level and step size 5.
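The affine-invariant ensemble sampler can be sketched in a few lines via the Goodman-Weare stretch move; the toy posterior, walker counts, and burn-in fraction below are illustrative and not those of the actual study:

```python
import numpy as np

def stretch_move_sampler(log_post, walkers0, n_steps, a=2.0, rng=None):
    """Minimal affine-invariant ensemble sampler (Goodman-Weare stretch
    move); `a` is the single free step-size parameter mentioned in the
    text. Returns chains of shape (n_steps, n_walkers, ndim)."""
    rng = rng or np.random.default_rng()
    walkers = np.array(walkers0, dtype=float)
    n_walkers, ndim = walkers.shape
    lp = np.array([log_post(w) for w in walkers])
    chains = np.empty((n_steps, n_walkers, ndim))
    for t in range(n_steps):
        for k in range(n_walkers):
            j = rng.integers(n_walkers - 1)
            j = j if j < k else j + 1                # pick another walker
            z = (1.0 + (a - 1.0) * rng.uniform()) ** 2 / a  # stretch factor
            prop = walkers[j] + z * (walkers[k] - walkers[j])
            lp_prop = log_post(prop)
            log_acc = (ndim - 1) * np.log(z) + lp_prop - lp[k]
            if np.log(rng.uniform()) < log_acc:
                walkers[k], lp[k] = prop, lp_prop
        chains[t] = walkers
    return chains

# toy posterior: independent Gaussians for two (E, nu)-like parameters
log_post = lambda th: -0.5 * (((th[0] - 2.1) / 0.1) ** 2
                              + ((th[1] - 0.3) / 0.01) ** 2)
rng = np.random.default_rng(1)
w0 = np.column_stack([rng.normal(2.1, 0.05, 50), rng.normal(0.3, 0.005, 50)])
chains = stretch_move_sampler(log_post, w0, 100, rng=rng)
posterior = chains[50:].reshape(-1, 2)   # 50% burn-in, merge all chains
```

The merged post-burn-in samples approximate the posterior, from which histograms and credible intervals such as those in Fig. 7 can be computed.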
We observe a good concentration of the posterior with a small uncertainty. The expected values (solid lines) are slightly off the true parameter values; however, the performance is comparable to the other methods and the slight offset can again be attributed to the coarse grid interpolation of the data. The trace plots further underline the good stationary behavior of the chains. The determined parameters are reported in Table 5. Note that for clean data, a step size of 4 was found to give better results.
6.1.2 Parametric PINN-Based Reduced Approach.
In this section, we use a parametric PINN surrogate for calibration, and the material parameters E and ν become additional inputs to the ANN alongside the coordinates x. The displacements in both the x1- and x2-directions are the outputs of the ANN (see Sec. 3.5 for more details). For this example, we choose an ANN with four hidden layers of 32 neurons each and the hyperbolic tangent as activation function. The imposed displacements are enforced by hard boundary conditions [258]. In an offline phase, the parametric PINN is first trained to learn a parameterized solution of the displacement field for a range of and for E and ν, respectively. For the training, we sampled 32 uniformly distributed values each of E and ν in the specified ranges, resulting in 1024 different combinations. For each combination of material parameters, in turn, we sampled 64 collocation points in the domain and 32 points on each of the five boundaries.
In a subsequent online phase, we use the parametric PINN for the re-identification of the material parameters with both the NLS method (similar to Secs. 3.3 and 4.3.1) and Bayesian inference (Sec. 5.2), where the parametric PINN acts as a surrogate of the solution map. Both for the optimization problems resulting from PINN training and for solving the NLS problem, we use the L-BFGS algorithm [141]. For Bayesian inference, we employ the same algorithm and hyperparameters as for Bayesian inference with FEM in Sec. 6.1.1. The results of the re-identification are reported in Table 5. The results of both the NLS method and Bayesian inference are in good agreement with the true material parameters. For Bayesian inference, we achieve similarly small uncertainties as with the FEM.
6.1.3 PINN-Based All-At-Once Approach.
Next, an inverse PINN is used to calibrate the linear-elastic material model from the artificially generated displacement data following the all-at-once approach. The PINN formulation used here is identical to that in Ref. [184] and is described in an abstract manner in Sec. 4.5.2. Equivalent to the mean squared error, for the weights and in Eq. (173), we choose and , where 3097 is the number of observation points. To take the force information into account, the balance between internal and external energy is used as a further loss term. While the integral of the strain energy is approximated via the observation points, the external energy is approximated at 64 points distributed uniformly on the Neumann boundary. As initial guesses for E and ν, we choose and 0.3, respectively. Please note that a sensible choice of the initial guess is important for solving the optimization problem. The displacement field in both the x1- and x2-directions is approximated by two independent, fully connected feed-forward ANNs with two hidden layers of 16 neurons each, and we choose the hyperbolic tangent as activation function. The resulting optimization problem is solved using the BFGS optimizer [282–285]. The re-identification yields the material parameters given in Table 5. For the clean displacement data, the re-identified parameters are in good agreement with the true values. As already mentioned for the other approaches, at least for the clean data, a significant cause of the slight deviation in the re-identified Young's modulus E lies in the linear interpolation of the high-fidelity data to the coarser grid. For increasing noise levels, however, increasingly significant deviations of the re-identified material parameters can be observed, especially for the Poisson's ratio. One possible reason for the large deviation is overfitting of the ANN to the noisy displacement data, see also the discussion in Ref. [184].
Although the PDE acts as regularization, additional regularization might be necessary for further improvement. Further investigation of this method for noisy data is the subject of current research.
6.1.4 Virtual Fields Method.
We now apply the VFM to identify the elastic parameters from the given displacement and force data. We interpolate the given displacement data with local finite element ansatz functions and choose the same local finite element ansatz functions for the virtual fields to assemble the overdetermined system of Eq. (B12) (see Appendix B for details). As more information is provided in the interior of the domain than at the boundary, we choose the weighting factor (see Appendix B) to increase the influence of the boundary data in the regression problem. The overdetermined system of equations is solved in a LS sense to arrive at the desired material parameters, which are reported in Table 5.
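Since the elasticity parameters enter linearly, the VFM leads to an overdetermined linear system that is solved in a LS sense; the following sketch uses a placeholder sensitivity matrix and an assumed boundary-row weighting (the actual matrix assembly from virtual fields is in Appendix B):

```python
import numpy as np

# sketch: overdetermined linear system A @ theta = b from the VFM, with
# rows from interior virtual fields and rows carrying boundary force data
rng = np.random.default_rng(0)
theta_true = np.array([210_000.0, 0.3])  # (E, nu)-like parameter vector
A = rng.normal(size=(200, 2))            # placeholder VFM system matrix
b = A @ theta_true                       # noise-free right-hand side

# weighting to increase the influence of the (here: last 20) boundary rows
w = np.ones(200)
w[-20:] = 10.0

# weighted LS solution of the overdetermined system
theta_hat, *_ = np.linalg.lstsq(A * w[:, None], b * w, rcond=None)
```

With noise-free data the weighting is immaterial and the true parameters are recovered exactly; for noisy data the weights shift the compromise between interior and boundary information.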
The computation of the system of Eq. (B12) requires computing the strain field from the given displacement data. As the displacement data are interpolated with local finite element ansatz functions, adding artificial noise to the displacement data has a significant effect on the computed strain field. Therefore, the results of the VFM become unreliable for the noisy data cases. This could be counteracted by denoising the displacement data before using them as input to the VFM. For the sake of brevity, and because the VFM has been validated for noisy data in previous studies [1], we omit this denoising step here and present only results of the VFM for the noiseless case.
6.1.5 FEM- and VFM-Based All-At-Once Approaches.
Finally, we apply the all-at-once approach, see Sec. 4.5.1. As for the reduced VFM, we interpolate the given displacement data with local finite element ansatz functions, and we choose the same local finite element ansatz functions for the virtual fields. Then, the system matrices are assembled as described in Appendix B. We choose the weights of the optimization problem as (see Appendix B), for the AAO-FEM and for the AAO-VFM. Further, we choose the initial elasticity parameters (corresponding to and ) and the noisy artificial data for the initial displacement field. To solve the optimization problems (169) and (170), we choose the trust-region reflective method implemented in the Matlab function lsqnonlin.m. The all-at-once approach identifies the displacement field and the material parameters simultaneously. Hence, an advantage of the all-at-once approach over, for example, the VFM is that a smooth displacement field is obtained even for noisy measurement data, while the material parameters are identified at the same time. The identified parameters are reported for different noise levels in Table 5.
6.1.6 Discussion.
The material parameters E and ν identified with the different schemes are provided in Table 5. All deterministic and stochastic reduced approaches yield satisfactory results, independently of the noise level. While a noisy case is not considered for the VFM here, the VFM yields satisfactory results in the noiseless case. The AAO-PINN exhibits quite a strong dependence on the noise level, see also Ref. [264]. Also, the accuracy of the results obtained with AAO-FEM and AAO-VFM is more affected by noise than that of their reduced counterparts. This can be explained by the first-order optimality conditions (FOCs) of the AAO formulation: In any AAO approach, is not strictly fulfilled anymore. The data and the physics contributions to the loss function need to be balanced during optimization, which leads to errors if the data are noisy.
Further, during the numerical studies on the all-at-once approaches (AAO-PINN, AAO-FEM, AAO-VFM), a relatively high sensitivity of the identified parameters with respect to the initial guesses was observed (see also the related discussion for AAO-PINNs in Ref. [184]). In addition to the numerical examples shown in this section, a proper mathematical analysis of the well-posedness and in particular the sensitivity with respect to the initial guess of the methods would be highly interesting—but this goes beyond the scope of this article. As outlined in Sec. 4.5.1, regularization, e.g., using a prior, could help to alleviate this issue.
6.2 Deterministic Calibration and Model Discovery for Finite-Strain Hyperelasticity.
In the following, we compare the identified parameters of various hyperelasticity models from both deterministic model calibration with LS-FEM and deterministic model discovery using EUCLID. For this purpose, we utilize data sourced from Refs. [159] and [286], where a hyperelastic square specimen with a hole undergoes displacement-controlled asymmetric biaxial tension, see Fig. 8. Utilizing twofold symmetry, only a quarter of the plate is modeled.
here adapted to near incompressibility. For the present example, we consider the volumetric part (218), and three different approaches for the isochoric part of the strain energy density function
For the Neo-Hookean model, four load steps are considered, whereas eight load steps are employed for the Isihara and Haines-Wilson models. Detailed descriptions of the data generation, the application of noise, and the subsequent denoising using kernel ridge regression are provided in Ref. [159]. The deterministic model calibration is performed using the weighted NLS-FEM. From the full-field data, we perform a linear interpolation onto the finite element node positions depicted in Fig. 8. The displacement weighting factors in the x1- and x2-directions are set to and , respectively. Similar to the model discovery approach in Ref. [159], the deterministic model calibration considers both full-field data and reaction force quantities, where we use the total horizontal and vertical reaction forces, analogous to those available in real experiments. Accordingly, the reaction force residuals between data and model response are weighted with and . Note that the model discovery approach in Ref. [159] employs different weighting factors for the full-field data and the reaction forces.
Table 6 summarizes the calibration results for the three hyperelasticity models from deterministic NLS-FEM calibration and from deterministic model discovery using EUCLID. The NLS-FEM results for full-field data without noise (σ = 0) align closely with the true parameter values across all three hyperelasticity models. A minor deviation between the ground truth and the identified parameters arises from interpolating the generated full-field data onto the finite element node positions of the spatial discretization in Fig. 8. EUCLID exactly discovers both the hidden material models and the material parameters in the noiseless case. For noisy full-field displacements, parameters calibrated using NLS-FEM correlate well with the ground truth, displaying only a small sensitivity with respect to the applied noise levels. This robustness is due to the fact that the model features are predefined, leaving the material parameters as the only variables within the optimization procedure. Conversely, EUCLID is designed to identify both model features and material parameters concurrently. At the low noise level (), EUCLID accurately discovers the correct model features across all three hyperelasticity models. At the high noise level (), EUCLID still captures the appropriate model features and material parameters for the Neo-Hookean model, but does not correctly discover the model features for the Isihara and Haines-Wilson models. Nevertheless, as discussed in Ref. [159], the model features determined by EUCLID in the high-noise case still show reasonable agreement with the true model under moderate deformations, such as those featured in the present dataset.
Results of deterministic model calibration and discovery for finite-strain hyperelasticity using NLS-FEM and EUCLID [159]
Neo-Hooke | Truth | σ (mm) | |
NLS-FEM | σ = 0 | ||
EUCLID | σ = 0 | ||
Isihara | Truth | ||
NLS-FEM | σ = 0 | ||
EUCLID | σ = 0 | ||
Haines-Wilson | Truth | ||
NLS-FEM | σ = 0 | ||
EUCLID | σ = 0 | ||
6.3 Two-Step Inference of Small-Strain von Mises Plasticity.
In this example, two-step inference is investigated for a small-strain plasticity model with a von Mises yield function and nonlinear kinematic hardening of Armstrong and Frederick type. The numerical treatment of the constitutive model was originally developed in Refs. [290] and [291], and the material model is summarized in Appendix H. The experimental data are obtained from tensile tests on specimens according to DIN EN ISO 6892-1 and simultaneous measurements with a DIC system. The information from the DIC system makes it possible to determine the axial and lateral strains in the parallel region of the specimen as explained in Ref. [52], where an additional compensation of rigid-body motions is applied here. The specimens are made of the steel alloy TS275 and are loaded in a displacement-controlled process with . The experimental data of five specimens are employed up to a maximum axial strain of 5%. The experimental stress–strain data are shown in Fig. 9. The material parameters of the constitutive model can be identified in a sequential manner. First, the Young's modulus E is determined from the stress–strain response of the specimens in the elastic region, which here comprises the data up to a maximum axial strain of 0.1%. Then, the Poisson's ratio ν is identified from the lateral strain data in the elastic region. The constitutive model, see App. H, requires a different set of elastic parameters , namely, the bulk modulus K and the shear modulus G, instead of the elastic parameters . This has to be considered during the model calibration, as will be explained later. Finally, the plastic parameters , i.e., the yield stress k and the parameters b and c of the Armstrong and Frederick ansatz of kinematic hardening, are identified. The experimental data obtained from the five tensile tests are reduced to one dataset by computing the mean values of the stresses and lateral strains.
To account for uncertainties, the covariance matrix of the observations d is considered during calibration. In this example, the calibration is carried out with a frequentist approach and Bayesian inference using a similar setup to obtain comparable results, especially regarding the uncertainties of the parameters. For the theoretical considerations, see also Sec. 5.3.
6.3.1 Two-Step Frequentist Approach.
Herein, a formulation corresponding to Secs. 3.3.1 and 4.3 is applied, where the covariance matrix of the observations is chosen in the weighted LS approach, i.e., in Eq. (54). Further, the settings of the optimization algorithm are the same as in the previous example. All confidence intervals in the following are provided for a confidence level of 68%.
Elasticity parameters: Young's modulus E is identified from the axial stress–strain data in the elastic region using a weighted linear LS scheme as . The Hessian indicates local uniqueness of the solution . The result of the calibration is visualized in Fig. 10, where the shaded area corresponds to the experimental data. Five data points, i.e., , are used for the identification.
Moreover, Poisson's ratio ν is estimated from the lateral strain data in the elastic domain, where a weighted linear LS scheme is applied again.
The LS method yields . As before, the obtained solution is locally unique, although the determinant of the Hessian is small . The experimental and calibrated lateral strain data are shown in Fig. 11.
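For a proportional model through the origin, the weighted linear LS fit reduces to a closed-form expression. The following sketch illustrates this for Young's modulus with hypothetical elastic-region data (the actual values stem from the DIC and load-cell measurements); the weights play the role of the inverse variances from the covariance matrix of the observations.

```python
import numpy as np

# Hypothetical axial strain/stress data in the elastic region
# (the real data come from the DIC system and the load cell).
eps = np.array([0.0002, 0.0004, 0.0006, 0.0008, 0.0010])   # axial strain
sig = np.array([41.2, 82.7, 123.5, 165.1, 206.4])          # stress, N/mm^2
w = np.full_like(sig, 1.0)  # weights ~ inverse observation variances

# Weighted linear LS for the slope E of the model sig = E * eps
E_hat = np.sum(w * eps * sig) / np.sum(w * eps * eps)

# Scalar Hessian of the LS objective; positivity indicates
# local uniqueness of the minimizer.
hessian = 2.0 * np.sum(w * eps * eps)
```

Poisson's ratio follows analogously from the lateral strain data, with the lateral strain modeled as proportional to the axial strain.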
The elasticity parameters K and G required by the constitutive model are then computed from E and ν using a Monte Carlo approach to account for the corresponding uncertainties. To this end, we assume normally distributed E and ν and randomly draw 4000 samples, which are used to evaluate Eq. (224). The computed distributions for K and G yield the mean value as parameter estimate and the standard deviation as uncertainty under consideration of error propagation effects. As a result, the elasticity parameters and are obtained. It is worth mentioning that the Gaussian error propagation (214), as an alternative approach for the uncertainty quantification, yields values very similar to the Monte Carlo approach, namely, and .
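A minimal sketch of such a Monte Carlo propagation, assuming hypothetical mean values and standard deviations for E and ν, uses the standard isotropic conversions K = E/(3(1 − 2ν)) and G = E/(2(1 + ν)):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
# Assumed (hypothetical) means and standard deviations for E and nu
E = rng.normal(206_000.0, 2_000.0, n)   # N/mm^2
nu = rng.normal(0.30, 0.005, n)

# Standard conversion from (E, nu) to bulk and shear moduli
K = E / (3.0 * (1.0 - 2.0 * nu))
G = E / (2.0 * (1.0 + nu))

# Sample mean as parameter estimate, sample standard deviation
# as uncertainty including error propagation effects
K_mean, K_std = K.mean(), K.std(ddof=1)
G_mean, G_std = G.mean(), G.std(ddof=1)
```

Because the conversion is only mildly nonlinear near the estimate, the result is close to the first-order Gaussian error propagation, consistent with the observation in the text.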
Plasticity Parameters.
As previously mentioned, the plasticity parameters can be identified from the stress–strain response of the specimens. For this purpose, we use only data that have not already been considered in the calibration of the elasticity parameters. The weighted NLS method employs data points and yields , and . The Hessian indicates local uniqueness of the minimum found. The calibrated stress–strain curve, including elastic and plastic stages, is shown in Fig. 12 together with the experimental data.
The aforementioned uncertainties of the plasticity parameters stem from the mismatch between experimental data and model response in the obtained solution . However, it also has to be considered that the plasticity parameters are identified with uncertain elasticity parameters . To account for this, we proceed according to Sec. 5.3.1. Then, the uncertainties , and of the plasticity parameters are obtained.
6.3.2 Hierarchical Bayesian Inference.
In the following, we compare the frequentist approach of Sec. 5.1 with a sampling-based Bayesian approach to parameter estimation and uncertainty quantification. In particular, we employ a nested Markov chain Monte Carlo approach for the hierarchical Bayesian model Eq. (216). The same experimental data as in the NLS calibration are employed, and a hierarchical model is used for the two-step calibration approach, following Sec. 5.3.2. We directly sample the finite element model with the affine-invariant ensemble sampler [281].
Elasticity Parameters.
The elasticity parameters are determined using uniform priors with 20% variation around the estimates , a step size of 12, and 100 chains of chain length 100. Afterward, the chains are merged to represent a sample of the posterior density. The calibration yields and , which is in good agreement with the experimental data, see Figs. 10 and 11, and with the previously reported NLS results. The posterior histograms are depicted in Fig. 13, and the trace plots are shown in Fig. 14, where the stationary behavior of the chains is evident.
The obtained Markov chain can be directly used for the computation of the elasticity parameters ; hence, contrary to the NLS approach, no assumption regarding the distribution of the elasticity parameters E and ν is required. In this way, we obtain and . Note that reporting the mean value and standard deviation here does not imply a Gaussian distribution for the parameters. Indeed, the Bayesian sampling approach allows complex distributions to be estimated from the posterior sample.
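The sampling logic can be sketched with a simple random-walk Metropolis algorithm; the paper instead uses the affine-invariant ensemble sampler [281], but the accept/reject step is analogous. The posterior below (a uniform prior of ±20% around a hypothetical estimate times a Gaussian stand-in likelihood) only substitutes for the actual finite-element-based likelihood.

```python
import numpy as np

def metropolis(log_post, x0, step, n_steps, rng):
    """Random-walk Metropolis sampler (sketch; the paper uses the
    affine-invariant ensemble sampler, whose accept/reject logic
    is analogous)."""
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    chain = [x.copy()]
    for _ in range(n_steps):
        prop = x + step * rng.standard_normal(x.size)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # Metropolis acceptance
            x, lp = prop, lp_prop
        chain.append(x.copy())
    return np.array(chain)

# Hypothetical (E, nu) estimate; uniform prior of +-20% around it
guess = np.array([206_000.0, 0.3])
lo, hi = 0.8 * guess, 1.2 * guess

def log_post(x):
    if np.any(x < lo) or np.any(x > hi):
        return -np.inf                 # outside the uniform prior
    # Gaussian stand-in for the finite element likelihood
    return -0.5 * np.sum(((x - guess) / (0.01 * guess)) ** 2)

rng = np.random.default_rng(1)
chain = metropolis(log_post, guess, 0.005 * guess, 2000, rng)
```

After a burn-in phase, the chain (or merged chains, in the ensemble case) represents a sample of the posterior density from which histograms and summary statistics are computed.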
Plastic Parameters.
Analogously to the NLS approach, the plastic material parameters are inferred from the elasto-plastic stress response of the material. Again, uniform priors are used, with 20% variation around for the yield stress k and 30% variation around and for the hardening parameters. A step size of 4 and 80 chains of chain length 100 are selected to obtain sufficient acceptance rates. As examples, we show the calibration results for two different sets of elasticity parameters in Figs. 15 and 16.

Fig. 15: Corner plot for calibration of plasticity parameters using Bayesian inference with elasticity parameters
Fig. 16: Corner plot for calibration of plasticity parameters using Bayesian inference with elasticity parameters
The parameter estimation and uncertainty quantification are performed according to Sec. 5.3.2 using 1000 samples of the elasticity parameters . Depending on the elasticity parameters, different plasticity parameters and uncertainties are determined. The distributions are shown in Figs. 17 and 18, where the calibrated parameters and uncertainties under consideration of the elasticity parameter uncertainty are indicated with solid lines. The model response with the calibrated parameters is shown in Fig. 12 together with the experimental data.

Fig. 17: Distributions of calibrated plasticity parameters when using different sets of elasticity parameters
Fig. 18: Distributions of calibrated parameter uncertainties when using different sets of elasticity parameters
6.3.3 Discussion.
The identified material parameters are compiled in Table 7, where the notation is used for uncertainties computed by the model calibration scheme and for uncertainties obtained when considering uncertain elasticity parameters. The parameters determined by both methods are in good agreement, and both methods yield reasonable parameter uncertainties. However, the Bayesian approach yields larger uncertainties than the NLS method for the plasticity parameters. This can be attributed to the fact that the LS uncertainties are based on linearization and Gaussian assumptions, which hold exactly only in the asymptotic regime of very large datasets. The posterior histograms for the plasticity parameters are also clearly non-Gaussian. Although the sampling-based Bayesian approach can handle these types of uncertainties, the associated sampling cost can be very high. Instead of the hierarchical Bayesian approach outlined in Sec. 5.3.2, one could also determine the full posterior directly. The hierarchical setting was chosen here, since it closely resembles the frequentist two-step procedure.
Results from calibration of small strain elasto-plasticity constitutive model from experimental data
Parameter | K (N mm−2) | G (N mm−2) | k (N mm−2) | b | c (N mm−2)
---|---|---|---|---|---
Nonlinear least-squares | | | | |
Estimate | 150,991 | 79,321 | 282.6 | 41.04 | 3499.8
Uncertainty from calibration | – | – | 1.12 | 1.76 | 116.8
Uncertainty incl. uncertain elasticity parameters | 2951 | 628 | 1.07 | 1.68 | 111.75
Bayesian inference | | | | |
Estimate | 152,306 | 79,392 | 283.0 | 40.31 | 3453.7
Uncertainty | 11,803 | 3526 | 1.63 | 2.59 | 162.7
It should further be mentioned that—in contrast to Refs. [54] and [278], for example, where the Gaussian error propagation is employed—a different approach is followed here to compute the uncertainties of subsequently identified parameters. The main advantage of the procedure, which is explained in Sec. 5.3.1 and in more detail in Appendix G, is that the majority of the required quantities, such as the Jacobians, are already known from the model calibration. Thus, the computation of the partial derivatives required for the Gaussian error propagation, see Eq. (214), is circumvented, and the uncertainty estimation can be easily performed in a postprocessing step.
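For comparison, the Gaussian error propagation mentioned above amounts to the first-order rule Cov_out = J Cov_in Jᵀ, where J is the Jacobian of the parameter transformation. A sketch for propagating assumed (hypothetical) variances of E and ν to K and G:

```python
import numpy as np

def propagate(jac, cov_in):
    """First-order (Gaussian) error propagation: Cov_out = J Cov_in J^T."""
    return jac @ cov_in @ jac.T

# Analytic Jacobian of (K, G) w.r.t. (E, nu) at a hypothetical estimate,
# with K = E / (3 (1 - 2 nu)) and G = E / (2 (1 + nu))
E, nu = 206_000.0, 0.30
J = np.array([
    [1.0 / (3.0 * (1.0 - 2.0 * nu)), 2.0 * E / (3.0 * (1.0 - 2.0 * nu) ** 2)],
    [1.0 / (2.0 * (1.0 + nu)),      -E / (2.0 * (1.0 + nu) ** 2)],
])

# Assumed (hypothetical) variances of E and nu, here uncorrelated
cov_Enu = np.diag([2_000.0 ** 2, 0.005 ** 2])
cov_KG = propagate(J, cov_Enu)
std_K, std_G = np.sqrt(np.diag(cov_KG))
```

When the Jacobians are already available from the model calibration, as in the procedure of Sec. 5.3.1, this explicit differentiation step is circumvented.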
7 Conclusions
While increasingly large amounts of data can be obtained from experimentation and measurement techniques, these data are usually sparse, i.e., they provide a reduced amount of information compared to a full-field simulation, and tend to be noisy. A question that often arises in this context is how such data can be related to physical models and to the high-fidelity 3D fields that can be computed from such models. Thus, apart from model calibration using real experiments, there is an increasing need for methods to interpret the experimental data. As a consequence, parameter identification, which in the past played a role mainly in constitutive modeling combined with experimental mechanics, is increasingly gaining attention across disciplines.
The present paper provides a short discussion of well-established experiments and measurement possibilities as well as an overview of known and newly developed computational approaches for parameter identification in the context of solid mechanics. By following the classification into reduced and all-at-once approaches from the inverse problems community, most of the methods available in the literature can be categorized. The all-at-once formulation combines the model residual and the difference between simulated and observed data into a single objective function, and the reduced formulation is obtained as the limit when the discrete model equation is enforced as a strong constraint. Interestingly, the virtual fields method is recovered at the other end of the spectrum, when the data-related objective is enforced strictly. This holistic view, which is one of the main contributions of this work, highlights the connections between the different methods and thereby enables a better understanding of identifiability and of the need for regularization, which is often crucial in parameter estimation problems. Moreover, the structured framework makes it straightforward to formulate new approaches to parameter identification, two of which are put forth in this work. Since data are usually noisy, an uncertainty quantification perspective is included in the considerations as well. Bayesian uncertainty quantification, in particular, provides a parallel derivation for well-established regularization approaches. Our main contribution in this regard is the formulation of a two-step parameter estimation and uncertainty quantification method for material models, for which we compare frequentist and Bayesian approaches.
The review is accompanied by three numerical examples. The first is a linear elastic problem with artificially generated data, for which the different reduced and all-at-once approaches as well as virtual fields methods are applied and compared. The second example illustrates model discovery, and the third example focuses on two-step inference in a frequentist setting and on hierarchical Bayesian inference for an inelastic material using real-world experimental data within the classical finite element least-squares approach.
From an application point of view, not all methods have the same level of maturity. Whereas reduced approaches are already widely used in practice, the all-at-once paradigm still needs further development to be applicable to large-scale problems. The main difficulty is the large parameter space, consisting of both the material parameters and the state vector. Additionally, the sensitivity to noise, which we observed in our numerical examples, needs to be reduced. Still, the all-at-once approach is of interest not only because of the unified perspective but also because efficient iterative solutions seem possible. While the paper mainly compares and discusses the first-order necessary conditions of the different approaches, black-box and mildly intrusive adaptations are possible in many cases, which is often desirable in practice. As a minimum implementation requirement, access to the parametric model residuals and the state quantities is needed to compute the objective function value.
We hope that the present study might help researchers and engineers working on different aspects of parameter identification to gain a better understanding of the potentials and pitfalls in parameter identification. In this sense, the unified framework may serve as an anchor point for the development of new methods by leveraging knowledge from statistics, optimization, machine learning, and finite elements.
Funding Data
Swiss National Science Foundation (SNF) (Project No. 200021_204316; Funder ID: 10.13039/501100001711).
German Research Foundation (Grant Nos. DFG GRK2075-2 and DFG 501798687; Funder ID: 10.13039/501100001659).
Data Availability Statement
All the codes and experimental data are released as open-source in the GitHub-repository.3
Appendix A: Processing of Experimental and Numerical Full-Field Data
The identification of material parameters from full-field data requires particular attention to the extraction of the data from both the experiment and the numerical model. Therefore, the procedure is explained in detail with a focus on the nonlinear least-squares method using finite elements, see Sec. 3.3.
For full-field measurement data, one obtains from experiment a data vector . The data are displacements, in-plane stretches/strains, or in-plane strain components at spatial positions on the surface of the specimen at particular times. In addition, the vector can be extended by discrete force information from a load cell. In other words, in the case of (temporal) load steps, i.e., time steps, each time-step consists of entries (discrete spatially distributed displacements, in-plane principal stretches, in-plane strains, the forces concerned, …), i.e., . In experiment , there is a sampling rate that provides the time points tm, , at which we need to evaluate. The data of each experiment are compiled in the vector , where symbolizes the data from one experiment at time tm. If all data from all considered experiments are assembled, the entire data vector with is obtained. This includes either experiments that are repeated several times, or completely different loading paths, boundary conditions, or geometries.
On the other hand, the finite element model provides for each recomputed experiment displacements ( and ) – or strains/stretches, … – and reaction forces for the time points . Obviously, the temporal and spatial points of the experiment and of the finite element model differ. Consequently, the finite element results are temporally and spatially interpolated to match the time points and measurement points of the experiment. In the time domain, a linear interpolation is chosen to align the experimental data with the model time data because sensitivities (particular derivatives), which should not be interpolated, are required later on. Accordingly, the dimension of the data vector d is determined by the finite element simulations.
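The temporal alignment described above can be sketched with a piecewise-linear interpolation of the finite element response onto the experimental sampling times (all numbers below are placeholders):

```python
import numpy as np

# Finite element response at the model time points (placeholder values)
t_fem = np.array([0.0, 0.5, 1.0, 1.5, 2.0])       # model time points
u_fem = np.array([0.0, 0.12, 0.25, 0.36, 0.50])   # FE displacement values

# Experimental sampling times provided by the DIC system
t_exp = np.array([0.2, 0.9, 1.7])

# Linear interpolation of the FE results onto the experimental time grid
u_at_exp = np.interp(t_exp, t_fem, u_fem)
```

The spatial interpolation proceeds analogously with the finite element shape functions, so that simulated and measured quantities are compared at identical space–time points.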
Some remarks concerning the digital image correlation data need to be added. DIC data can only provide information about the deformation of the surface of a subarea of a specimen or component. These displacements partly contain rigid body motions (caused by insufficient machine stiffness or by the test equipment), which are difficult to consider in a finite element program. Some commercial DIC suppliers provide a rigid-body compensation (unfortunately, without specifying the mathematical background and procedures). It thus often makes more sense not to use the surface displacements, but the surface strains. This is done on the level of displacements in Ref. [125] or on the level of strains/stretches in Ref. [91].
Commercial finite element programs have a similar problem: only a few programs allow a comparison with DIC data of the surface strains, i.e., programs in which it is possible to evaluate the strains, or measures determined from the strains (principal strains, shear angles, …), at arbitrary points of the surface. Here, it can help to employ evaluation tools that are based either on triangulation, Refs. [292–294], or, for example, on global strain determination tools [295]. These concepts can be applied both to the DIC data and to the finite element nodal displacement data. This has the additional advantage that the same interpolation scheme and strain evaluation are applied to both systems, i.e., DIC and FEM. Furthermore, the sensitivity analysis, i.e., the derivatives, can be computed analytically.
Usually, the displacements—or resulting quantities such as the principal strains or stretches—are only compared with the assigned quantities of the experiment in a subarea of the finite element (surface) domain, i.e., only the subset of the finite element nodal displacements are included. For this purpose, the assignment matrix is introduced. In addition, single forces should be included, which are determined from the reaction forces . In a tensile experiment , whereas in a biaxial tensile test . For tension-torsion problems , namely the axial force and the torque. The matrix is exploited to extract the reaction force(s)/moment(s) from the Lagrange multipliers. The consideration of reaction forces in the NLS-FEM-DIC approach can be found in Refs. [91] and [296] for inelastic materials.
Appendix B: System Matrices for Linear Expressions in the Parameters
Of course, by exploiting symmetry conditions (fewer entries in ) or the specific structure of the elasticity matrix C, the matrix has to be adapted. The same holds when zero multiplications in the matrix–matrix product are omitted.
i.e., nodes where no equivalent nodal forces act are explicitly set to zero.
A node-wise implementation for plane problems can be found in Ref. [159].
Appendix C: Artificial Neural Networks
The definition in Eq. (C6) demonstrates that FFNNs are highly parameterized, nonlinearly composed functions.
It is generally possible to choose other metrics, depending on the problem.
For the optimization of parameters, it is common to use gradient-based optimization algorithms. The gradient of the loss function with respect to the FFNN parameters can be calculated using automatic differentiation [168].
It is well known that FFNNs are universal function approximators [297–299]. Provided that an FFNN has a sufficient number of parameters, it can, according to the universal approximation theorem, theoretically approximate any continuous function and its derivatives to an arbitrarily small error. It should be noted, however, that the question of how to optimally train FFNNs – so that they reach their full potential – has not yet been solved. For a more in-depth introduction to deep learning, we refer to standard textbooks, for example, Ref. [300].
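The nonlinear composition in Eq. (C6) can be illustrated with a minimal forward pass; the layer sizes, weights, and activation below are arbitrary placeholders:

```python
import numpy as np

def ffnn(x, weights, biases, act=np.tanh):
    """Minimal feed-forward pass: alternate affine maps and activations,
    with a linear output layer (a sketch of the composition in Eq. (C6))."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = act(W @ x + b)
    return weights[-1] @ x + biases[-1]

rng = np.random.default_rng(0)
# One hidden layer with 8 units mapping R^2 -> R (sizes are arbitrary)
weights = [rng.standard_normal((8, 2)), rng.standard_normal((1, 8))]
biases = [rng.standard_normal(8), rng.standard_normal(1)]
y = ffnn(np.array([0.1, -0.2]), weights, biases)
```

In practice, the weights and biases are the trainable parameters that a gradient-based optimizer adjusts via automatic differentiation of the loss.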
Appendix D: Sensitivities in Least-Squares Method Using Finite Elements
If the experimental surface data are mapped to the finite element mesh, the derivative of the interpolation scheme projecting onto the displacements within an element has to be evaluated as well. Thus, it is recommended to project the nodal displacement information onto the DIC data in order to circumvent this step.
In Ref. [91], it is also explained which quantities can be calculated on Gauss-point level, where either automatic differentiation schemes, Refs. [301] and [302], or the code generation approach of Refs. [303–306] can be applied.
If principal strains are compared within the NLS-method, the evaluation tool of the interpolation scheme to calculate the strains has to be considered as well, but this once again requires the derivative (D3), see Ref. [91].
are required (here, the postprocessing step (28)₂ is omitted for brevity). In this sense, the same time steps are required for both calculations; otherwise, additional interpolations are necessary. An application using the commercial finite element program Abaqus within a gradient-based optimization scheme is given by Ref. [308] (among many others), and a comparison of the required computational times with respect to internal numerical differentiation is provided in Ref. [91].
Appendix E: Derivation of Finite Element All-at-Once Approach
Appendix F: Iterative Methods
The step sizes μk are chosen to ensure convergence. Note that, here, regularization is applied implicitly during iteration as discussed in Ref. [3].
which does not involve any solution of a linear or nonlinear model.
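A classical example of such an implicitly regularizing iteration is the Landweber method, sketched here for a linear residual; early stopping controls the regularization, and the step size must satisfy 0 < μ < 2/‖A‖² for convergence. The example is a generic sketch, not the specific iteration of Appendix F.

```python
import numpy as np

def landweber(A, d, n_iter, mu=None):
    """Landweber iteration x_{k+1} = x_k + mu * A^T (d - A x_k).
    The iteration count acts as an implicit regularization parameter;
    mu must satisfy 0 < mu < 2 / ||A||_2^2 for convergence."""
    if mu is None:
        mu = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step on 1/2 ||A x - d||^2, no linear solve required
        x = x + mu * A.T @ (d - A @ x)
    return x
```

Each step involves only matrix–vector products, i.e., no solution of a linear or nonlinear system, which is the property emphasized above.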
Appendix G: Details on the Two-Step Inference Procedure
where tensor notation is omitted for simplicity.
and that holds.
Assuming uncorrelated plastic residuals with common variance results in Eq. (213).
Appendix H: Applied Viscoplasticity Model
Here, the constitutive model of a small-strain Perzyna-type viscoplasticity model with von Mises yield function is summarized. The constitutive model fits into the structure of Eq. (5). The stress algorithm developed in Refs. [290] and [291] integrates the evolution equations with an elastic-predictor/inelastic-corrector method. Further, nonlinear kinematic hardening of Armstrong and Frederick type is considered, Ref. [94], while no isotropic hardening is taken into account here, for the sake of brevity. defines the kinematic hardening variable, whereas symbolize the viscous strains. represents the rate of the viscous arc length, and symbolizes the “plastic” multiplier, which is given by a constitutive model in the case of viscoplasticity. Taking the limit yields the rate-independent small-strain elasto-plasticity model, which is applied in Sec. 6.3. The constitutive model is compiled in Table 8. The case distinction between elastic and viscoplastic deformations is enabled with Macauley brackets, for x > 0 and for . The parameter is introduced to obtain a dimensionless quantity in the Macauley brackets; thus, σ0 is not regarded as a material parameter. The material parameters shown in Table 8 are: bulk modulus K, shear modulus G, yield stress k, the parameters b and c describing the nonlinear kinematic hardening, the viscosity η, and the exponent r of the viscosity function.
Summary of constitutive equations for a model of viscoplasticity (von Mises-type viscoplasticity with nonlinear kinematic hardening)
Elasticity | Viscoplasticity | |
---|---|---|
Yield function | (H1) | |
Loading condition | f < 0 | |
Elasticity relation | (H2) | |
Flow rule | (H3) | |
Kinematic hardening | (H4) | |
Abbreviations | (H5) | |
Material parameters | (H6) |
In the context of the general representation (7), the viscous strains and the back stress tensor define the internal variables, . The evolution equations are given by Eqs. (H3) and (H4), which contain a case distinction.
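The case distinction via Macauley brackets can be sketched as follows. The power-law viscosity function below is a common Perzyna-type ansatz and may differ in detail from the one in Table 8; σ0 only serves to render the bracket argument dimensionless.

```python
import numpy as np

def macauley(x):
    """Macauley brackets: <x> = x for x > 0, and 0 otherwise."""
    return np.maximum(x, 0.0)

def perzyna_multiplier(f, eta, r, sigma0=1.0):
    """A common Perzyna-type ansatz for the viscoplastic multiplier,
    lambda = (1/eta) * <f/sigma0>**r, where f is the yield function
    value, eta the viscosity, and r the exponent of the viscosity
    function (the exact form in Table 8 may differ)."""
    return macauley(f / sigma0) ** r / eta
```

For f ≤ 0 (elastic range), the multiplier vanishes and no viscous strains evolve; in the limit of vanishing viscosity, the rate-independent model of Sec. 6.3 is recovered.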
Footnotes
We do not explicitly discuss problems in gradient elasticity and plasticity, where the symmetry of the stress tensor is not assumed anymore, see Ref. [5]. However, these models can also be embedded after discretization into the structure of the material parameter identification problem, discussed later on.