Abstract

Accurately modeling nonlinear dynamic systems remains a significant challenge across numerous scientific fields, from physics to finance. Existing methods require prior knowledge of the system or are sensitive to noise, making them impractical to handle real-world tasks. Based on the sparse identification of nonlinear dynamics (SINDy) algorithm, our previous research proposed a variant of the SINDy algorithm. With the Lagrangian integrated, we named our method extended Lagrangian-SINDy (xL-SINDy). The xL-SINDy shows higher robustness than the competing method with a factor of 100, but it sacrifices the ability to deal with a nonconservative system. This paper introduces a new method that extends xL-SINDy's capabilities. We address its primary limitation by integrating the Rayleigh dissipation function into the xL-SINDy framework. Tests on four different pendulum systems show that our method maintains the same level of robustness as the xL-SINDy, but can identify nonconservative terms in the system. Making it better describe the real-world systems. While the proposed method is limited to specific types of dissipative forces that the Rayleigh dissipation can describe, the provided framework can be extended to include any form of the nonconservative terms.

1 Introduction

More than three hundred years ago, Isaac Newton tried to figure out what made an apple fall off the tree; behind the phenomenon, he found gravity. That is one of our first attempts to find the system dynamics through the phenomenon. Since then, modeling has become one of the eternal topics in science research; even now, scientists are still trying to model the complex relationships between stars. However, there are too many dynamic systems in daily life, and it is impossible to model all of them manually. Still, knowing the dynamics might be important since it helps us to understand how the system's parts contribute to the movements. That makes the autonomous modeling method helpful for saving efforts. However, the noisy, complex environment makes real-world autonomous modeling more challenging than simulation, which is still an open topic [14].

There is a debate between complexity and parsimony in the quest for precision in modeling. Complex models offer detailed descriptions through complex equations but are often hard to interpret as explainable knowledge. A representation is the neural network model, which can fit the data with high accuracy, but it is hard to find analytical meaning from the network. Conversely, parsimonious models aim for simplicity and capture the system's essence with minimal components. As a representation, the equation of motion describes the system's movement and provides insights into the system's structure. Despite the difficulty of deriving such models, contemporary research prefers parsimonious models for their analytical elegance [511].

Though studies in this area started more than a hundred years ago, applicable methods only began to appear after machine learning tools were developed [1215]. In 2007, Lipson and co-worker explored merging machine learning into the autonomous modeling process [6]. In 2016, Brunton et al. proposed the game-changing sparse identification of nonlinear dynamics (SINDy) algorithm [7]. The machine learning method that removes the irrelevant terms and gives a sparse result, commonly called sparse regression, is the key that helps SINDy to lower the computation cost. That made SINDy to be the first applicable modeling algorithm. It was evaluated in many areas, from chemical kinetics to chaos theory [7,1624].

With its proven performance, we choose it as the basement of this research. However, vanilla SINDy is sensitive to noise, so applying SINDy in the real world is challenging. To mitigate this issue, Brunton and co-workers proposed a more robust variant called SINDy with parallel and implicit expression (SINDy-PI) [25]. However, with the maximum capable Gaussian noise level as 103, it was still not robust enough for real-world tasks. To make the algorithm even more robust, our group proposed the extended Lagrangian-SINDy (xL-SINDy) method, which changes the learning target from the equations of motion to Lagrangian, which describes the system dynamics in a simplified way [26,27]. Thanks to the simplicity of Lagrangian, the xL-SINDy can manage noise magnitude up to 101, outperforming SINDy-PI. However, as a tradeoff, xL-SINDy cannot manage those systems with energy exchange with the outer environment, commonly called nonconservative systems. Since a typical energy exchange is the dissipative force and it is everywhere in the real world, most dynamical systems in the real world are nonconservative. We need to extend the xL-SINDy to manage the nonconservative case before we apply it in the real world to solve the autonomous modeling challenge.

To better understand the relationship between SINDy, SINDy-PI, xL-SINDy, and our method, we can disassemble the whole recognition process into three parts:

  1. The base algorithm: This determines the computational efficiency of the process.

  2. The identification object: The object represents the system dynamics and works as the identification object; it affects the process robustness.

  3. Formation of the regression loss: This affects the optimization criteria and brings the nonconservative capability of the process.

All four methods use the same base algorithm—the SINDy algorithm. For SINDy-PI, the identification object is replaced, providing an enhanced ability to handle greater robustness. xL-SINDy improved with the identification object and loss formation, with robustness surpassing SINDy-PI; however, it cannot handle nonconservative cases. In our research, we utilize the xL-SINDy framework but modify the loss to improve its capability, thereby retaining xL-SINDy's superior robustness while overcoming the capability limitations. Figure 1 illustrates the comparison between SINDy and its variants.

Fig. 1
Comparison between SINDy and its variants
Fig. 1
Comparison between SINDy and its variants
Close modal

In this paper, we combine a nonconservative expression called the Rayleigh dissipation function and the xL-SINDy, providing a highly robust and nonconservative capable solution for system identification and one step closer to solving real-world problems.

2 Methodology

This section introduces the vanilla SINDy and xL-SINDy and then discusses the modification to adopt a nonconservative system. A training efficiency test helps select the appropriate algorithm for the sparse regression task. Finally, the workflow of the whole algorithm is introduced, and a figure demonstration is given.

2.1 Vanilla Sparse Identification of Nonlinear Dynamics.

As the basement of all other SINDy variants, vanilla SINDy enables efficient system dynamics modeling with a sparse regression procedure. Figure 2 explains the algorithm step-by-step. The simplest form separates the equations into the coefficients and system states' functions. Then, the functions are stored in a vector, and the coefficients are stored in a matrix; the multiplication of these two can regenerate the equations. The next step assumes that the coefficients are unknown but that data from the system are given. In this case, the coefficient matrix would be initialized with random values, and a linear regression could help find the coefficients from the data. The regression loss is the error between the acceleration prediction and the true value. Finally, for the complete SINDy case, the coefficients and system states' functions are unknown. The initial function vector, also called the candidate library in SINDy, collects all possible combinations of the system states. The coefficient matrix would get randomly initialized, similar to the second step. The algorithm uses sparse regression to discover the relevant terms inside the huge function vector. It promotes most of the coefficients to zero, finding the dynamics from the data.

Fig. 2
A step-by-step demonstration of SINDy
Fig. 2
A step-by-step demonstration of SINDy
Close modal

2.2 Extended Lagrangian Sparse Identification of Nonlinear Dynamics.

Before introducing our modification to the xL-SINDy, it is necessary to introduce how xL-SINDy formulates the problem first. For vanilla SINDy, the equations of motion are disassembled into candidate functions and coefficients. In xL-SINDy, the Lagrangian is disassembled instead. Since Lagrangian does not give acceleration, it is necessary to use the Euler–Lagrange equation to obtain another regression loss: the system's energy flow. The Euler–Lagrange equation describes a conservative situation whose dynamics can be fully described by the Lagrangian. The Euler–Lagrange equation can predict the energy flow in or out of the system; its error with the true value is the regression loss. With the regression loss obtained, we can perform the SINDy algorithm on the Lagrangian and fit the coefficients to the data. As the scalar describes the system dynamics, the Lagrangian is defined as the difference between the system's kinetic and potential energy
(1)
Using SINDy's disassemble format, the Lagrangian can be separated into a candidate function vector Φ and a coefficients vector C. Note that vanilla SINDy has a separate expression for every system state, so a matrix is needed to store the coefficients. However, in xL-SINDy, the single expression of Lagrangian is enough to describe the dynamics. Thus, the candidate functions and coefficients are stored at a one-dimensional vector
(2)
The Euler–Lagrange equation is defined as follows, and the q refers to the system states vector:
(3)
Notice that each system state would have its partial derivative, if we expand the state vector, we would have individual equations for every state. In this paper, we would like to keep the system state in vector form for the simplicity of the expression. By substituting Eq. (2) into this equation, we can have
(4)
The equation stands when the vector C is all zero, but that does not give any meaningful information about the system. And this zero-solution trap should be avoided during regression training. Back to the derivation, Eq. (4) can be extended using the time derivative chain rule
(5)

The regression process is to find a nonzero solution C to make this equation stand, and its multiplication with the Φ gives the final learned Lagrangian. The defect of xL-SINDy comes from the zero on the right-hand side of Eq. (5), limiting it to only describe conservative systems.

2.3 Modified Euler–Lagrange Equation for Nonconservative Systems.

To address the limitations of the previously described xL-SINDy method, which pertains specifically to conservative systems, we extend the capability of the Euler–Lagrange equation by adding a nonconservative term on the right-hand side. Specifically, we introduce the Rayleigh dissipation function model, which provides a universal solution for modeling various common types of dissipative force. It can generate different types of dissipative force by changing only one hyperparameter, being able to produce a large number of candidate functions in advance; it aligns with SINDy's need for the candidate function vector construction. By adding the Rayleigh dissipation function terms into the candidate function vector, our algorithm can learn the system's Lagrangian and its nonconservative parts at the same time. Although the Rayleigh dissipation function already contains the common types of dissipative force, it is necessary to admit that the Rayleigh dissipation function cannot cover all situations. Adding additional terms to the candidate function vector can address this issue.

To tackle different types of dissipation function, we incorporate the Rayleigh dissipation function directly into the Euler–Lagrange equation. Here, D stands for the Rayleigh dissipation function, and qj is the jth generalized coordinate of the system
(6)

The left side of this equation is the same as Eq. (3), with the Rayleigh dissipation function term added on the right-hand side. With a minus term as the system energy flow, the system loses energy due to dissipative force, which is the nonconservative case we want to describe.

Assuming we are working with a double pendulum system, q˙j refers to the velocity of the jth joint, and cj refers to the coefficient of the jth joint's dissipation function. The n refers to the order of the dissipation function. With such a function, we can implement different kinds of dissipative force into the system by changing n. For example, when n=0, the function refers to a constant dissipation function like constant dissipative force, and when n=1, it refers to a dissipation function that is linear to one of the system states like linear dissipative force. All different components can be easily added up to compose the whole dissipation function. In the experiment, the candidate function vector includes dissipative force created by the Rayleigh dissipation function for every system state with n ranging from 0 to 3, while a constant or linear type dissipative force is applied in the system.

To help understand how Rayleigh dissipation function works, here is a demonstration of how to use it to simulate linear dissipative force. Assume there is a system with the following Lagrangian and dissipative force. It describes a 1DOF object moving in a single dimension and experiencing linear dissipative force proportional to its velocity. Here, m is the object mass, x˙ is the object's velocity, f is the dissipative force applied to the object, and b is the dissipative coefficient
(7)
According to the definition of Rayleigh dissipation function, the dissipation function should be set to n=1
(8)
We can verify this by substituting it into Eq. (6), which states that acceleration comes from the only external force: dissipative force. The substitution meets the assumption in Eq. (7) and proves the capability of the Rayleigh dissipation function
(9)

After understanding how the Rayleigh dissipation function works, the next step is to add it to the xL-SINDy framework. For convenience in sparse regression, we need to divide the Lagrangian and the Rayleigh dissipation function into the candidates part and coefficient part as follows: L=k=1pckϕk,D=j=1hejψj.

Here, the ϕk and ψj refer to the Lagrangian and dissipation function candidates, respectively, and ck and ej refer to the coefficients of these two candidates' libraries.

With such assignments, Eq. (6) can be extended as follows:
(10)
Then, the time derivative part can be further extended inside the candidate parts using the chain rule, and q˙q˙ϕk refers to two consecutive partial derivations on the ϕk with respect to the system states
(11)
To simplify the expression, we assign
(12)
Then, the equation can be simplified as follows:
(13)

This demonstration mainly focuses on the passive scenario in which no active force is applied to the system, and no prior knowledge is assumed. With some modification, this method can also be applied to active systems and systems with partial prior knowledge. An active model can be obtained by including the dissipation function as part of the external force. The model with partial prior knowledge can be obtained by separating the dissipation function on the left-hand side of the equation.

Active case
(14)

The τext refers to the known external force input to the system.

Partial prior knowledge case
(15)

Having some prior knowledge means knowing certain candidates in the expression and their coefficients. Here, Ml,Nl,Ol refers to the components formed by the known candidates.

2.4 Sparse Regression Algorithm.

The sparse regression training in the SINDy algorithm is challenging for its high sparsity compared to other common applications of sparse regression algorithms. For standard sparse regression cases, the ratio of irrelevant functions inside the whole vector, commonly named sparsity, is around 80% to 50%. In SINDy tasks, the sparsity is usually higher than 90%; such a sparse dataset is challenging to fit. To address this challenge, a robust, efficient, sparse regression algorithm is needed. In experiments, the proximal gradient descent (PGD) [28] methods perform well in high-sparsity sparse regression thanks to their robustness brought by the proximal operator. To find the optimal option, we tested three variants of PGD and vanilla PGD; the result is shown in Fig. 3. Although the PGD with line-search (PGD with Ls) [29] seems to converge with the least iterations, it takes longer for each iteration than all other methods. As a result, its actual time consumption sits between the accelerated proximal gradient descent (APGD) [30] and fast iterative shrinkage threshold algorithm (FISTA) [31], making the APGD the best choice for our task.

Fig. 3
Training efficiency comparison, the error is the mean square error between the angle prediction from the obtained model and the true angle value
Fig. 3
Training efficiency comparison, the error is the mean square error between the angle prediction from the obtained model and the true angle value
Close modal

3 Results

In this section, we tested the algorithm's robustness on four different systems and then verified its generalizability with two extra tests. With a maximum capable noise level as 101, the proposed method maintains the same robustness level as the xL-SINDy. In the generalizability test, the algorithm shows an accuracy of over 90% across different situations, proving its potential as a universal solution.

For ease of reading, the paper only compares the simulation curve between the true and obtained models. A table of all obtained expressions is in Table 1. In the single pendulum, θ refers to the angle of the pendulum; in cart pendulum, x refers to the displacement of the cart and θ refers to the angle of the pendulum; in the double pendulum, θ1 refers to the angle of the first joint which is the joint that got fixed with the environment, θ2 refers to the angle of the second joint which is the joint that connects the two bars; in the spherical pendulum, θ refers to the polar angle, and ϕ refers to the azimuthal angle.

Table 1

Obtained Lagrangian and Rayleigh dissipation functions for different systems under various noise levels

SystemNoiseObtained LagrangianObtained dissipation function
Single pendulumTrue0.500θ˙2+9.81cosθ0.090θ˙2
00.500θ˙2+9.81cosθ0.090θ˙2
1030.500θ˙2+9.61cosθ0.090θ˙2
2×1020.500θ˙2+9.87cosθ0.090θ˙2
6×1020.500θ˙2+9.49cosθ0.090θ˙2
1010.500θ˙2+9.79cosθ0.090θ˙2
Cart pendulumTrue0.250θ˙2+0.750x˙2+0.500x˙θ˙cosθ+4.905cosθ0.250θ˙2+0.250x˙2
00.250θ˙2+0.740x˙2+0.495x˙θ˙cosθ+4.905cosθ0.250θ˙2+0.250x˙2
1040.250θ˙2+0.736x˙2+0.493x˙θ˙cosθ+4.905cosθ0.250θ˙2+0.251x˙2
1030.245θ˙2+0.690x˙2+0.461x˙θ˙cosθ+4.905cosθ0.249θ˙2+0.251x˙2
2×1020.245θ˙2+0.678x˙2+0.453x˙θ˙cosθ+4.905cosθ0.252θ˙2+0.235x˙2
6×1020.245θ˙2+0.649x˙2+0.436x˙θ˙cosθ+4.905cosθ0.241θ˙2+0.192x˙2
1010.240θ˙2+0.661x˙2+0.445x˙θ˙cosθ+4.905cosθ0.216θ˙2+0.142x˙2
Double pendulumTrueθ˙12+0.5θ˙22+19.62cosθ1+9.81cosθ2+θ˙1θ˙2sinθ1sinθ2+θ˙1θ˙2cosθ1cosθ20.250θ˙12+0.250θ˙22
00.999θ˙12+0.500θ˙22+19.620cosθ1+9.794cosθ2+0.999θ˙1θ˙2sinθ1sinθ2+1.000θ˙1θ˙2cosθ1cosθ20.241θ˙12+0.244θ˙22
1030.998θ˙12+0.498θ˙22+19.620cosθ1+9.752cosθ2+0.998θ˙1θ˙2sinθ1sinθ2+0.998θ˙1θ˙2cosθ1cosθ20.251θ˙12+0.249θ˙22
2×1020.998θ˙12+0.498θ˙22+19.620cosθ1+9.764cosθ2+0.998θ˙1θ˙2sinθ1sinθ2+0.998θ˙1θ˙2cosθ1cosθ20.234θ˙12+0.240θ˙22
6×1020.999θ˙12+0.494θ˙22+19.620cosθ1+9.755cosθ2+0.995θ˙1θ˙2sinθ1sinθ2+0.994θ˙1θ˙2cosθ1cosθ20.219θ˙12+0.223θ˙22
1010.988θ˙12+0.488θ˙22+19.620cosθ1+9.608cosθ2+0.973θ˙1θ˙2sinθ1sinθ2+0.983θ˙1θ˙2cosθ1cosθ20.198θ˙12+0.211θ˙22
Spherical pendulumTrue0.50θ˙2+0.50ϕ˙2sin2θ+9.81cosθ0.050θ˙2+0.050ϕ˙2
00.50θ˙2+0.50ϕ˙2sin2θ+9.81cosθ0.050θ˙2+0.050ϕ˙2
1030.51θ˙2+0.511ϕ˙2sin2θ+9.79cosθ0.053θ˙2+0.049ϕ˙2
2×1020.54θ˙2+0.44ϕ˙2sin2θ+9.81cosθ0.046θ˙2+0.048ϕ˙2
6×1020.48θ˙2+0.48ϕ˙2sin2θ+9.81cosθ0.028θ˙2+0.042ϕ˙2
SystemNoiseObtained LagrangianObtained dissipation function
Single pendulumTrue0.500θ˙2+9.81cosθ0.090θ˙2
00.500θ˙2+9.81cosθ0.090θ˙2
1030.500θ˙2+9.61cosθ0.090θ˙2
2×1020.500θ˙2+9.87cosθ0.090θ˙2
6×1020.500θ˙2+9.49cosθ0.090θ˙2
1010.500θ˙2+9.79cosθ0.090θ˙2
Cart pendulumTrue0.250θ˙2+0.750x˙2+0.500x˙θ˙cosθ+4.905cosθ0.250θ˙2+0.250x˙2
00.250θ˙2+0.740x˙2+0.495x˙θ˙cosθ+4.905cosθ0.250θ˙2+0.250x˙2
1040.250θ˙2+0.736x˙2+0.493x˙θ˙cosθ+4.905cosθ0.250θ˙2+0.251x˙2
1030.245θ˙2+0.690x˙2+0.461x˙θ˙cosθ+4.905cosθ0.249θ˙2+0.251x˙2
2×1020.245θ˙2+0.678x˙2+0.453x˙θ˙cosθ+4.905cosθ0.252θ˙2+0.235x˙2
6×1020.245θ˙2+0.649x˙2+0.436x˙θ˙cosθ+4.905cosθ0.241θ˙2+0.192x˙2
1010.240θ˙2+0.661x˙2+0.445x˙θ˙cosθ+4.905cosθ0.216θ˙2+0.142x˙2
Double pendulumTrueθ˙12+0.5θ˙22+19.62cosθ1+9.81cosθ2+θ˙1θ˙2sinθ1sinθ2+θ˙1θ˙2cosθ1cosθ20.250θ˙12+0.250θ˙22
00.999θ˙12+0.500θ˙22+19.620cosθ1+9.794cosθ2+0.999θ˙1θ˙2sinθ1sinθ2+1.000θ˙1θ˙2cosθ1cosθ20.241θ˙12+0.244θ˙22
1030.998θ˙12+0.498θ˙22+19.620cosθ1+9.752cosθ2+0.998θ˙1θ˙2sinθ1sinθ2+0.998θ˙1θ˙2cosθ1cosθ20.251θ˙12+0.249θ˙22
2×1020.998θ˙12+0.498θ˙22+19.620cosθ1+9.764cosθ2+0.998θ˙1θ˙2sinθ1sinθ2+0.998θ˙1θ˙2cosθ1cosθ20.234θ˙12+0.240θ˙22
6×1020.999θ˙12+0.494θ˙22+19.620cosθ1+9.755cosθ2+0.995θ˙1θ˙2sinθ1sinθ2+0.994θ˙1θ˙2cosθ1cosθ20.219θ˙12+0.223θ˙22
1010.988θ˙12+0.488θ˙22+19.620cosθ1+9.608cosθ2+0.973θ˙1θ˙2sinθ1sinθ2+0.983θ˙1θ˙2cosθ1cosθ20.198θ˙12+0.211θ˙22
Spherical pendulumTrue0.50θ˙2+0.50ϕ˙2sin2θ+9.81cosθ0.050θ˙2+0.050ϕ˙2
00.50θ˙2+0.50ϕ˙2sin2θ+9.81cosθ0.050θ˙2+0.050ϕ˙2
1030.51θ˙2+0.511ϕ˙2sin2θ+9.79cosθ0.053θ˙2+0.049ϕ˙2
2×1020.54θ˙2+0.44ϕ˙2sin2θ+9.81cosθ0.046θ˙2+0.048ϕ˙2
6×1020.48θ˙2+0.48ϕ˙2sin2θ+9.81cosθ0.028θ˙2+0.042ϕ˙2

We modified the training process with two special mechanisms besides the mentioned measures: two-phase training and scaling [32]. This accelerates and stabilizes the training process and does not affect robustness. With a weak connection with the main algorithm and page limitation, we skip the details of these mechanisms.

With Fig. 4 summarizing the whole algorithm, here is the training flow: First, select the Lagrangian and Rayleigh candidate functions, then randomly initialize the corresponding coefficients. By applying them to the Euler–Lagrange equation, we obtain the system's energy flow expression. Next, substitute the system states into Eq. (13), (14), or (15) to predict the system's energy flow. Using the error between the prediction and the true value, update the coefficients and fit them to the data, which gives the final Lagrangian and Rayleigh dissipation functions. We also provide some visualizations of the libraries in Figs. 1013; these figures are provided in the  Appendix.

Fig. 4
The workflow of the proposed algorithm
Fig. 4
The workflow of the proposed algorithm
Close modal

In the results, we noticed that the coefficients for the dissipative function tend to decline with the growing noise magnitude, a possible explanation would be the dissipative force' effect is gradually being covered with the growing noise, making it harder to recognize the dissipative part under high noise level.

3.1 Experiment Setup.

To evaluate the effectiveness of our method, we selected single pendulum, cart pendulum, double pendulum, and spherical pendulum systems as subjects for our experimentation; the detailed configurations of these systems are defined in our previous work [27] and shown in Fig. 5. These choices were made due to their relatively simple but chaotic behavior, making them ideal candidates for comparison with SINDy-PI. The dynamics governing these systems are defined by their respective Lagrangian, as presented in Table 1. To include dissipative part in the system, a linear dissipative force is applied to all four systems, as the dissipation function shows in Table 1. The whole process is based on simulation, and the data sample frequency is 50 Hz.

Fig. 5
Overview of different pendulum structures used in experiment. (a) The structure of the single pendulum. In the experiment, l = 1 m and m = 1 kg. (b) The structure of the cart pendulum. In the experiment, l = 1 m and m = 1 kg. (c) The structure of the double pendulum. In the experiment, l1 = l2 = 1 m and m1 = m2 = 1 kg. (d) The structure of the spherical pendulum. In the experiment, l = 1 m and m = 1 kg.
Fig. 5
Overview of different pendulum structures used in experiment. (a) The structure of the single pendulum. In the experiment, l = 1 m and m = 1 kg. (b) The structure of the cart pendulum. In the experiment, l = 1 m and m = 1 kg. (c) The structure of the double pendulum. In the experiment, l1 = l2 = 1 m and m1 = m2 = 1 kg. (d) The structure of the spherical pendulum. In the experiment, l = 1 m and m = 1 kg.
Close modal

To evaluate the robustness of the proposed method, we added Gaussian noise into the generated data. We aligned our noise magnitude settings with those of xL-SINDy for comparison convenience, selecting values of 104, 103, 2×102, 6×102, and 101. The tests reveal that the proposed method maintains the same robustness level as the xL-SINDy and outperforms SINDy-PI but with the extra capability to model nonconservative systems.

To show that the algorithm can perform stably with different pendulum systems, we also tested the algorithm under the same model type but with different physical parameters and dissipative force types.

For the sake of readability and to aid in interpreting results, all Lagrangians in this study have been scaled relative to the first coefficient. This approach is valid, as scaling the Lagrangian by an arbitrary constant does not alter the system dynamics. For the same reason, systems under different noise levels are started at the same initial position in the figures below, though in real experiments the position is randomly initialized. In the figures, the first 5 s would be the training data with noise and the obtained model's data, while the rest would be the validation data that started from a different initial position and the obtained model's data. By showing both the training and validation comparison to the obtained model, it is proved that it is indeed a universal model that fully describes the system's dynamics.

3.2 Single Pendulum.

For the single pendulum system, we attempt to recover the system dynamics with the Rayleigh dissipation function model. In sample generation, the pendulum is randomly initialized in the range [π,π] for the angle and zero for the velocity. The candidate function library includes polynomials and trigonometric functions up to second order for the Lagrangian, and dissipation functions for n ranging from 0 to 3, eventually building a library containing 17 functions, with three true functions inside the library (two from the Lagrangian and one from the dissipation function); the task sparsity is 17.64%. Figure 6 gives an intuitive simulation based on the result of the Rayleigh dissipation function incorporated method. The robustness of the algorithm is also proven. Although not shown here, for the conservative system, the coefficient shifting is less prominent along with the growing noise; the same magnitude of noise has a greater influence on nonconservative system. This effect stems from a larger relative error when the oscillation magnitude decreases at the late stage of nonconservative systems. Overall, the positive result in this simple model proves the algorithm's viability, giving confidence for tests on complex systems.

Fig. 6
Single pendulum simulation results (Rayleigh model)
Fig. 6
Single pendulum simulation results (Rayleigh model)
Close modal

3.3 Cart Pendulum.

The angle θ of the pendulum is randomly initialized at the range [π,π], and the cart position is zero, both with zero velocity. The candidate function library includes polynomials and trigonometric functions up to fourth order for the Lagrangian, and dissipation functions with n ranging from 0 to 3, eventually building a library containing 61 functions, with six true functions inside the library (four from the Lagrangian and two from the dissipation function); the task sparsity is 9.84%. The simulation result is shown in Fig. 7. As the results show, our method can find the correct candidates for noise levels up to 101, although with some coefficients shifting, the simulation curve of our obtained can still follow the true model.

Fig. 7
Cart pendulum simulation result based on the model obtained by our method and SINDy-PI, the training data consists of 20 initial conditions, each lasting for 5 s. The validation set is the data for 5 s with a different initial condition.
Fig. 7
Cart pendulum simulation result based on the model obtained by our method and SINDy-PI, the training data consists of 20 initial conditions, each lasting for 5 s. The validation set is the data for 5 s with a different initial condition.
Close modal

We also performed a comparative analysis using the well-known SINDy-PI method to validate our proposed algorithm further. In Fig. 7, the dotted lines give the results of SINDy-PI. As the results indicate, SINDy-PI method is more sensitive to noise. Even though it is claimed that SINDy-PI can handle noise magnitudes up to 103, the actual robustness of the algorithm is also affected by the system complexity. The involvement of dissipative force significantly adds complexity, causing SINDy-PI's performance degradation. In our cart pendulum test, SINDy-PI can handle only up to 104 noise magnitude. In contrast, our proposed method can accommodate noise magnitudes up to 101, matching the robustness of the original xL-SINDy algorithm.

3.4 Double Pendulum.

The pendulum is randomly initialized for sampling, both joints' angles are in the range [π,π], and velocities are zero. The candidate function library has 95 candidates, with eight true functions in the library (six for the Lagrangian and two for the dissipation function); the task sparsity is 8.42%. Testing with the double pendulum system in Table 1, Fig. 8 offers an intuitive simulation-based comparison between the model generated by our algorithm, the model generated by SINDy-PI, and the true model. Our algorithm effectively recovers the correct candidate terms up to a noise magnitude of 101, though there is a noticeable shift in coefficients when the noise level reaches 6×102. This could be attributed to the relative noise level issue discussed in Sec. 3.2. Still, the algorithm exhibits 103 times higher robustness than SINDy-PI.

Fig. 8
Double pendulum simulation result based on the model obtained by our method and SINDy-PI, the training data consists of 20 initial conditions, each lasting for 5 s. The validation set is the data for 5 s with a different initial condition.
Fig. 8
Double pendulum simulation result based on the model obtained by our method and SINDy-PI, the training data consists of 20 initial conditions, each lasting for 5 s. The validation set is the data for 5 s with a different initial condition.
Close modal

Like the cart pendulum case, SINDy-PI could only handle noise level up to 104. For higher noise level, the SINDy-PI can only discover part of the true model or even fail to capture any essence of it. In the meantime, our method can still find the correct candidates and follow the true model's curve at the noise level of 101, further proving the robustness of our method.

3.5 Spherical Pendulum.

Finally, we also tested our method using a spherical pendulum mentioned in Table 1. With a polar coordinate definition, the pendulum's polar angle θ is initialized in the range [π/3,π], while the initial azimuthal angle is zero, and the angle velocities are both zero. The candidate function library has 65, with five true functions in the library (three for the Lagrangian and two for the dissipation function); the task sparsity is 7.69%.

The test results are shown in Fig. 9; compared with the other three cases, the spherical pendulum turns out to be more challenging to recognize. Our method can only handle noise levels up to 6×102, as SINDy-PI could not give any valid result even for cases without noise.

Fig. 9
Simulation result of the spherical pendulum
Fig. 9
Simulation result of the spherical pendulum
Close modal
Fig. 10
Candidate library visualization of the single pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Fig. 10
Candidate library visualization of the single pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Close modal
Fig. 11
Candidate library visualization of the cart pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Fig. 11
Candidate library visualization of the cart pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Close modal
Fig. 12
Candidate library visualization of the double pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Fig. 12
Candidate library visualization of the double pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Close modal
Fig. 13
Candidate library visualization of the spherical pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Fig. 13
Candidate library visualization of the spherical pendulum: (a) the Lagrangian library and (b) the Rayleigh dissipation function library
Close modal

3.6 Generalizability Test.

To evaluate the algorithm's generalizability, the algorithm is also tested under varying physical parameters. To be more specific, we utilize the double pendulum system and change its mass, length, and damping coefficients. Table 2 provides a visual representation of the model's accuracy through a heatmap under a noise level of 103; the deeper color refers to a higher accuracy. The accuracy is calculated using the mean square error of the coefficients between the identified model and the true model. Here, l1,l2 refer to the length of the pendulums, m1,m2 refer to the masses, and b1,b2 refer to the dissipative coefficients. Across various physical parameters, our algorithm consistently recovered system dynamics with an accuracy exceeding 90%, affirming its promising applicability in diverse settings. Besides, model performance under different dissipative force types and magnitudes is also tested. The test contains constant and viscous dissipative force, with coefficients ranging from 0.05 to 0.5. With results shown in Table 3, the overall accuracy stays above 90%, showing the algorithm's capability. The result also shows that the algorithm performs better under constant dissipative force. Overall, the proposed method generalizes different system settings and dissipative force, proving its potential as a universal solution.

Table 2

Accuracy on the double pendulum with different physical conditions

b1=0.2,b2=0.2
l1=0.5,l2=0.5l1=1,l2=1l1=1.5,l2=1.5
m1=1.5,m2=1.594.57%93.12%96.72%
m1=1,m2=193.74%93.97%93.38%
m1=0.5,m2=0.594.19%94.53%94.32%
b1=0.2,b2=0.2
l1=0.5,l2=0.5l1=1,l2=1l1=1.5,l2=1.5
m1=1.5,m2=1.594.57%93.12%96.72%
m1=1,m2=193.74%93.97%93.38%
m1=0.5,m2=0.594.19%94.53%94.32%
b1=0.05,b2=0.05
l1=0.5,l2=0.5l1=1,l2=1l1=1.5,l2=1.5
m1=1.5,m2=1.593.64%91.03%92.41%
m1=1,m2=194.37%92.35%94.06%
m1=0.5,m2=0.594.82%93.85%92.44%
b1=0.05,b2=0.05
l1=0.5,l2=0.5l1=1,l2=1l1=1.5,l2=1.5
m1=1.5,m2=1.593.64%91.03%92.41%
m1=1,m2=194.37%92.35%94.06%
m1=0.5,m2=0.594.82%93.85%92.44%
Table 3

Model accuracy under different dissipative coefficients

(a) Viscous dissipative force
b2accb10.050.10.5
0.0592.35%95.57%92.53%
0.195.70%95.57%93.57%
0.596.20%94.52%94.20%
(a) Viscous dissipative force
b2accb10.050.10.5
0.0592.35%95.57%92.53%
0.195.70%95.57%93.57%
0.596.20%94.52%94.20%
(b) Constant dissipative force
b2accb10.050.10.5
0.0598.91%95.92%98.59%
0.198.12%98.33%98.18%
0.596.02%95.16%96.58%
(b) Constant dissipative force
b2accb10.050.10.5
0.0598.91%95.92%98.59%
0.198.12%98.33%98.18%
0.596.02%95.16%96.58%

4 Conclusion and Discussion

This paper presents an algorithm for discovering the Lagrangian models of nonconservative systems using noisy data. Extending the existing algorithm to the on-conservative domain is critical for future real-world applications. Our method improves noise robustness compared with the existing SINDy-PI method.

While the xL-SINDy offered a robust approach to identifying the conservative systems, its inability to account for nonconservative systems was a notable drawback. The proposed method fills this gap effectively, offering a more generalized modeling approach. The Rayleigh model we proposed has excellent capabilities for different situations of dissipative force. It is necessary to admit that the proposed method can only handle dissipative forces included by the Rayleigh dissipation function. However, such a capability can be extended by manually adding additional terms to the dissipation function library. With these extra terms, other nonconservative nonlinear forces like Coulomb friction or viscous friction can also be detected in the model.

A current limitation of our approach is the recognition efficiency and accuracy when the system becomes complicated. A robustness degradation already appears in the spherical pendulum test; such degradation would be more evident for more complex systems. Some pre/postdata processing techniques are under consideration to further improve the robustness. In light of recent advancements, it is crucial to acknowledge the Ensemble-SINDy methodology introduced by Fasel et al. [33], which has exhibited significant gains in robustness and accuracy by incorporating ensemble techniques. This is a postprocess of the results, and it is particularly appealing for our work, as the ensemble method is independent of the core SINDy algorithm, enabling seamless integration with any of the variants of SINDy. Moreover, approaches like derivative-based SINDy [34], which employ denoising techniques before regression as a data preprocessing, offer another compelling avenue for enhancing the robustness of our model. Unlike our method and SINDy-PI, these two methods did not modify the SINDy process. Therefore, they are excluded from the comparison in the experiment part. Still, their performance is appealing, and integrating them might be our next step.

Our method offers a robust approach for the sparse identification of nonlinear dynamic systems. Its ability to incorporate dissipation functions makes it closer to real-world applications, setting new stage for future research in this area.

Funding Data

  • JSPS Grant-in-Aid for Scientific Research (Grant No. 24K00841; Funder ID: 10.13039/501100001691).

Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.

Nomenclature

b =

damping coefficient

b1,b2 =

the damping coefficient of the first and second joint of double pendulum

C =

coefficient vector

ck,cj =

coefficients of the system state functions

D =

Rayleigh dissipation function

f =

damping force

L =

Lagrangian

l1,l2 =

the length of the first and second bar of double pendulum

m =

object mass

m1,m2 =

the mass of the first and second end of double pendulum

Mk,Nk,Ok,Pj =

simplification signs for the equation

Ml,Nl,Ol =

simplification signs for the known part of the equation

n =

the hyperparameter of Rayleigh dissipation function

q =

system state

T =

kinetic energy

V =

potential energy

x,x˙,x¨ =

object position, velocity, and acceleration

=

differential operator

θ,θ˙,θ¨ =

object angle, angular velocity, and angular acceleration

θ1,θ1˙ =

the angle and angular velocity of double pendulum's first joint

θ2,θ2˙ =

the angle and angular velocity of double pendulum's second joint

σ =

Gaussian noise magnitude

τext =

external force to the system

ϕ˙ =

the azimuthal angular velocity of spherical pendulum

ϕk =

system state functions

Φ =

function vector

ψj =

Rayleigh dissipation functions

Appendix: Candidate Library Visualization

In this appendix, we add visualization plots for the candidate library used in the experiment for ease of understanding, the highlighted expressions are the true terms for the system dynamics.

References

1.
Schmid
,
P.
,
2010
, “
Dynamic Mode Decomposition of Numerical and Experimental Data
,”
J. Fluid Mech.
,
656
(
7
), pp.
5
28
.10.1017/S0022112010001217
2.
Benner
,
P.
,
Gugercin
,
S.
, and
Willcox
,
K.
,
2015
, “
A Survey of Projection-Based Model Reduction Methods for Parametric Dynamical Systems
,”
SIAM Rev.
,
57
(
4
), pp.
483
531
.10.1137/130932715
3.
Peherstorfer
,
B.
, and
Willcox
,
K.
,
2016
, “
Data-Driven Operator Inference for Nonintrusive Projection-Based Model Reduction
,”
Comput. Methods Appl. Mech. Eng.
,
306
, pp.
196
215
.10.1016/j.cma.2016.03.025
4.
Kutz
,
J.
,
Brunton
,
S.
,
Brunton
,
B.
, and
Proctor
,
J.
,
2016
,
Dynamic Mode Decomposition
, Society for Industrial and Applied Mathematics, Philadelphia, PA.
5.
Bongard
,
J.
, and
Lipson
,
H.
,
2007
, “
Automated Reverse Engineering of Nonlinear Dynamical Systems
,”
Proc. Natl. Acad. Sci.
,
104
(
24
), pp.
9943
9948
.10.1073/pnas.0609476104
6.
Schmidt
,
M.
, and
Lipson
,
H.
,
2009
, “
Distilling Free-Form Natural Laws From Experimental Data
,”
Science
,
324
(
5923
), pp.
81
85
.10.1126/science.1165893
7.
Brunton
,
S.
,
Proctor
,
J.
, and
Kutz
,
J.
,
2016
, “
Discovering Governing Equations From Data by Sparse Identification of Nonlinear Dynamical Systems
,”
Proc. Natl. Acad. Sci.
,
113
(
15
), pp.
3932
3937
.10.1073/pnas.1517384113
8.
Rudy
,
S.
,
Brunton
,
S.
,
Proctor
,
J.
, and
Kutz
,
J.
,
2017
, “
Data-Driven Discovery of Partial Differential Equations
,”
Sci. Adv.
,
3
(
4
), p.
e1602614
.10.1126/sciadv.1602614
9.
Mallat
,
R.
,
Bonnet
,
V.
,
Huo
,
W.
,
Karasinski
,
P.
,
Amirat
,
Y.
,
Khalil
,
M.
, and
Mohammed
,
S.
,
2018
, “
Human-Exoskeleton System Dynamics Identification Using Affordable Sensors
,” 2018 IEEE International Conference on Robotics and Automation (
ICRA
), Brisbane, Australia, May 21–25, pp.
6759
6765
.10.1109/ICRA.2018.8463178
10.
Bruder
,
D.
,
Remy
,
C.
, and
Vasudevan
,
R.
,
2019
, “
Nonlinear System Identification of Soft Robot Dynamics Using Koopman Operator Theory
,” 2019 International Conference on Robotics and Automation (
ICRA
), Montreal, QC, Canada, May 21–24, pp.
6244
6250
.10.1109/ICRA.2019.8793766
11.
Fontanelli
,
G.
,
Ficuciello
,
F.
,
Villani
,
L.
, and
Siciliano
,
B.
,
2017
, “
Modelling and Identification of the da Vinci Research Kit Robotic Arms
,” 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (
IROS
), Vancouver, BC, Canada, Sept. 24–28, pp.
1464
1469
.10.1109/IROS.2017.8205948
12.
Lusch
,
B.
,
Kutz
,
J.
, and
Brunton
,
S.
,
2018
, “
Deep Learning for Universal Linear Embeddings of Nonlinear Dynamics
,”
Nat. Commun.
,
9
(
1
), p.
4950
.10.1038/s41467-018-07210-0
13.
Wehmeyer
,
C.
, and
Noé
,
F.
,
2018
, “
Time-Lagged Autoencoders: Deep Learning of Slow Collective Variables for Molecular Kinetics
,”
J. Chem. Phys.
,
148
(
24
), p.
241703
.10.1063/1.5011399
14.
Yilmaz
,
N.
,
Wu
,
J.
,
Kazanzides
,
P.
, and
Tumerdem
,
U.
,
2020
, “
Neural Network Based Inverse Dynamics Identification and External Force Estimation on the da Vinci Research Kit
,” 2020 IEEE International Conference on Robotics and Automation (
ICRA
), Paris, France, May 31–Aug. 31, pp.
1387
1393
.10.1109/ICRA40945.2020.9197445
15.
Sanyal
,
S.
, and
Roy
,
K.
,
2023
, “
RAMP-Net: A Robust Adaptive MPC for Quadrotors Via Physics-Informed Neural Network
,” 2023 IEEE International Conference on Robotics and Automation (
ICRA
), London, UK, May 29–June 2, pp.
1019
1025
.10.1109/ICRA48891.2023.10161410
16.
Fasel
,
U.
,
Kaiser
,
E.
,
Kutz
,
J.
,
Brunton
,
B.
, and
Brunton
,
S.
,
2021
, “
SINDy With Control: A Tutorial
,” 2021 60th IEEE Conference on Decision and Control (
CDC
), Austin, TX, Dec. 14–17, pp.
16
21
.10.1109/CDC45484.2021.9683120
17.
Dam
,
M.
,
Brøns
,
M.
,
Juul Rasmussen
,
J.
,
Naulin
,
V.
, and
Hesthaven
,
J.
,
2017
, “
Sparse Identification of a Predator-Prey System From Simulation Data of a Convection Model
,”
Phys. Plasmas
,
24
(
2
), p.
022310
.10.1063/1.4977057
18.
De Silva
,
B.
,
Higdon
,
D.
,
Brunton
,
S.
, and
Kutz
,
J.
,
2020
, “
Discovery of Physics From Data: Universal Laws and Discrepancies
,”
Front. Artif. Intell.
,
3
, p.
25
.10.3389/frai.2020.00025
19.
Hoffmann
,
M.
,
Fröhner
,
C.
, and
Noé
,
F.
,
2019
, “
Reactive SINDy: Discovering Governing Reactions From Concentration Data
,”
J. Chem. Phys.
,
150
(
2
), p.
025101
.10.1063/1.5066099
20.
Chen
,
J.
,
Zhang
,
M.
,
Yang
,
Z.
, and
Xia
,
L.
,
2021
, “
A Robust Data-Driven Approach for Dynamics Model Identification in Trajectory Planning
,” 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (
IROS
), Prague, Czech Republic, Sept. 27–Oct. 1, pp.
7104
7111
.10.1109/IROS51168.2021.9635979
21.
Sorokina
,
M.
,
Sygletos
,
S.
, and
Turitsyn
,
S.
,
2016
, “
Sparse Identification for Nonlinear Optical Communication Systems: SINO Method
,”
Opt. Express
,
24
(
26
), p.
30433
.10.1364/OE.24.030433
22.
Cai
,
Y.
,
Wang
,
X.
,
Joós
,
G.
, and
Kamwa
,
I.
,
2023
, “
An Online Data-Driven Method to Locate Forced Oscillation Sources From Power Plants Based on Sparse Identification of Nonlinear Dynamics (SINDy)
,”
IEEE Trans. Power Syst.
,
38
(
3
), pp.
2085
2099
.10.1109/TPWRS.2022.3189602
23.
Huang
,
S.-C.
,
Chen
,
H.-W.
, and
Tien
,
M.-H.
,
2023
, “
Predicting Nonlinear Modal Properties by Measuring Free Vibration Responses
,”
ASME J. Comput. Nonlinear Dyn.
,
18
(
4
), p.
041005
.10.1115/1.4056949
24.
Kanki
,
R.
, and
Saito
,
A.
,
2024
, “
Data-Driven Initial Gap Identification of Piecewise-Linear Systems Using Sparse Regression and Universal Approximation Theorem
,”
ASME J. Comput. Nonlinear Dyn.
,
19
(
6
), p.
061003
.10.1115/1.4065440
25.
Kaheman
,
K.
,
Kutz
,
J.
, and
Brunton
,
S.
,
2020
, “
SINDy-PI: A Robust Algorithm for Parallel Implicit Sparse Identification of Nonlinear Dynamics
,”
Proc. Math. Phys. Eng. Sci.
,
476
(
2242
), p.
20200279
.10.1098/rspa.2020.0279
26.
Chu
,
H.
, and
Hayashibe
,
M.
,
2020
, “
Discovering Interpretable Dynamics by Sparsity Promotion on Energy and the Lagrangian
,”
IEEE Rob. Autom. Lett.
,
5
(
2
), pp.
2154
2160
.10.1109/LRA.2020.2970626
27.
Purnomo
,
A.
, and
Hayashibe
,
M.
,
2023
, “
Sparse Identification of Lagrangian for Nonlinear Dynamical Systems Via Proximal Gradient Method
,”
Sci. Rep.
,
13
(
1
), p.
7919
.10.1038/s41598-023-34931-0
28.
Nesterov
,
Y.
,
2013
, “
Gradient Methods for Minimizing Composite Functions
,”
Math. Program.
,
140
(
1
), pp.
125
161
.10.1007/s10107-012-0629-5
29.
Latafat
,
P.
,
Themelis
,
A.
,
Stella
,
L.
, and
Patrinos
,
P.
,
2024
, “
Adaptive Proximal Algorithms for Convex Optimization Under Local Lipschitz Continuity of the Gradient
,”
Math. Program.
, pp.
1
39
.10.1007/s10107-024-02143-7
30.
Li
,
H.
, and
Lin
,
Z.
,
2015
, “
Accelerated Proximal Gradient Methods for Nonconvex Programming
,”
Advances in Neural Information Processing Systems
, Vol.
28
,
C.
Cortes
,
N.
Lawrence
,
D.
Lee
,
M.
Sugiyama
, and
R.
Garnett
, eds.,
Curran Associates
, Montreal, QC, Canada.
31.
Beck
,
A.
, and
Teboulle
,
M.
,
2009
, “
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
,”
SIAM J. Imaging Sci.
,
2
(
1
), pp.
183
202
.10.1137/080716542
32.
Sun
,
T.
, and
Zhang
,
C.-H.
,
2012
, “
Scaled Sparse Linear Regression
,”
Biometrika
,
99
(
4
), pp.
879
898
.10.1093/biomet/ass043
33.
Fasel
,
U.
,
Kutz
,
J.
,
Brunton
,
B.
, and
Brunton
,
S.
,
2022
, “
Ensemble-SINDy: Robust Sparse Model Discovery in the Low-Data, High-Noise Limit, With Active Learning and Control
,”
Proc. R. Soc. A: Math., Phys. Eng. Sci.
,
478
(
2260
), p.
20210904
.10.1098/rspa.2021.0904
34.
Wentz
,
J.
, and
Doostan
,
A.
,
2023
, “
Derivative-Based SINDy (DSINDy): Addressing the Challenge of Discovering Governing Equations From Noisy Data
,”
Comput. Methods Appl. Mech. Eng.
,
413
, p.
116096
.10.1016/j.cma.2023.116096