Estimation and filtering are important tasks in most modern control systems. These methods rely on accurate discrete-time approximations of the system dynamics. We present filtering algorithms that are based on discrete mechanics techniques (variational integrators), which are known to preserve system structures (momentum, symplecticity, and constraints, for instance) and to have stable long-term energy behavior. These filtering methods show increased performance in simulations and experiments on a real digital control system. Both the particle filter and the extended Kalman filter benefit from the statistics-preserving properties of a variational integrator discretization, especially in low-bandwidth applications. Moreover, it is shown how the optimality of the Kalman filter can be preserved through discretization by means of modified discrete-time Riccati equations for the covariance updates. This leads to further improvement in filter accuracy, even in a simple test example.

## Introduction

Estimation and filtering techniques for unknown system states based on noisy sensor data are crucial in many control systems. These methods are implemented in discrete time on digital embedded systems. Thus, they rely on accurate approximations of continuous-time system behavior; in low-bandwidth applications, predictions over longer time-horizons are required. Traditionally, explicit Euler discretizations of the system dynamics or high-order Runge-Kutta methods are chosen. Due to computational restrictions and real-time constraints, the first-order explicit Euler method is often the standard choice. However, low-order variational integration methods provide an alternative that allows efficient and accurate predictions. This paper studies the benefits of structure-preserving integration in two commonly used filtering approaches: (extended) Kalman filters and particle filters.

### Kalman Filters.

The Kalman filter is likely the most well-known and widely used estimator [1]. When the Kalman filter is applied to a linear system with additive Gaussian noise, it provides the optimal, maximum-likelihood (minimum-variance) state estimator [2]. The Kalman filter has been successfully applied to a wide variety of problems in image processing, wireless communication, aerospace, robotics, and so on. While the Kalman filter is simple to compute, the requirement of linear dynamics limits its applicability to nontrivial systems. The extended Kalman filter (EKF) relaxes the linearity requirement by utilizing local linearizations to approximate posterior distributions as Gaussian. It is only slightly more complex than the Kalman filter, and it is applicable to a larger class of systems. One drawback of the EKF is that its performance is strongly influenced by the severity of any nonlinearities in the system. Nevertheless, the EKF is commonly employed and has demonstrated reasonable performance in a wide range of applications.

### Particle Filters.

In recent years, particle filters have become a popular technique for estimation [3,4], likely due to their generality. In a particle filter, the uncertainty posterior distribution is represented by a finite collection of samples referred to as *particles*. These particles are drawn from a distribution representing the current belief of the system's state, and then each particle is mapped forward in time using a model of the system. As the number of particles approaches infinity, the distribution of the particles approaches the solution to the Fokker-Planck equation associated with the system [5]. A primary feature of the particle filter is its wide applicability. Particle filters work with arbitrary noise models and have no issues with nonlinear system dynamics. A strong disadvantage of particle filters lies in their computational complexity. Their reliability is primarily governed by the number of particles used, and in some problems, the number of particles required to approximate the true posterior distribution may be quite large (10^{4}–10^{6}).

### Discrete Mechanics.

Discrete mechanics derives models for mechanical systems via discrete variational principles [6]. These models define numerical integrators (*variational integrators* (VIs), also called *symplectic integrators*) which possess beneficial properties: They conserve symplecticity and symmetries (momenta) of the continuous-time system, and they have stable energy behavior even in long-term simulations, which is important in astrodynamics or molecular dynamics simulations [7]. Furthermore, VIs exactly satisfy holonomic constraints common in multibody systems. VIs can be used to design an indirect or a direct optimal control method (cf., e.g., Ref. [8]) that inherits the advantages of forward integration. Additionally, the problem formulation requires fewer optimization variables and leverages structured linearizations for efficient optimization algorithm design (cf. Ref. [9]). Recently, the discrete mechanics concept has been extended to stochastic integrators (see Refs. [10,11]) which almost surely preserve the symplectic structure. In Ref. [12], stochastic variational integrators have been successfully applied for estimation problems on Lie groups using a method based on uncertainty ellipsoids. This paper extends results of the authors' previous works [13,14].

### Embedded Systems.

One of the primary motivations for this work is to develop and explore filter implementations that preserve geometric structure in order to improve performance while also facilitating implementation on an experimental embedded system. In recent years, much work has been done on leveraging a variety of numerical methods for improving the performance of estimation routines [15–17], but these methods typically achieve increased performance through higher-order, computationally expensive methods that may not be appropriate for embedded estimation. For an integrator to be suitable for use in an embedded system, one would hope that its computational cost is low enough to cope with limited computational resources, and that the formulation can be cast as a one-step map that is easily linearized to fit the form of standard discrete control and estimation routines. One may also hope that the integration method satisfies holonomic constraints, as often encountered in mechanical systems. Moreover, the satisfaction of holonomic constraints should not result in loss of the ability to linearize the discrete map across a time-step. The variational integrator (VI) presented in Ref. [9] satisfies all of these requirements while also providing a symplectic integrator with stable long-term energy behavior. The linearizations of that VI are explicit and come with fixed computational cost, further aiding implementation on an embedded system. While the VI from Ref. [9] is not the only integrator used herein, it provides motivation for much of this work. This integrator is a drop-in replacement for linearizable one-step maps, such as the explicit Euler scheme, in standard forms of particle filters and extended Kalman filters. While there is a multitude of other integration schemes available, it is difficult to find another technique that does not compromise at least one of these requirements. For example, many integrators that satisfy holonomic constraints are not easily linearizable.
Higher-order integrators often result in multistep methods, and the increased order results in increased computational cost. For several experimental examples presented in this paper, we compare VI-based estimation routines to corresponding routines based on explicit Euler integration. While other choices could be made for this comparison, due to our interest in estimation for embedded systems, we constrain our choice to low-order, one-step maps, and in this context, the explicit Euler integrator is a standard choice.^{1}

### Contributions.

To begin, we illustrate the advantageous statistical properties of first- and second-order variational integrators for simulating a stochastic harmonic oscillator. We then present particle filters and extended Kalman filters based on VI discretization and demonstrate their performance in simulated and experimental examples. It is demonstrated in Sec. 5 that both particle filters and extended Kalman filters may be highly sensitive to the choice of integration scheme and that VI-based methods outperform the discrete-time filter obtained from zero-order-hold and explicit Euler discretization—a combination that fits well within the embedded systems requirements discussed in the previous section. A second important contribution of this work is the derivation of structure-preserving covariance updates for a Kalman filter. We prove that the modified update equations additionally capture the Hamiltonian/symplectic structure resulting from the optimality property of the Kalman filter. We then apply a traditional Kalman filter and a Kalman filter with the structure-preserving covariance update equations to a simple example. It is shown that a Kalman filter with the structure-preserving update equations outperforms the filter with standard covariance updates.

In Sec. 2, we give an introduction to variational/symplectic integrators for deterministic and stochastic systems. Section 3 provides an introduction to particle filters and illustrates how symplectic integration can be used in particle filtering in order to reduce the common issue of particle deprivation. In Sec. 4, we recall the definition of the discrete-time Kalman filter and show how the one-step maps can be derived from variational integrators. The increased performance of these filters is experimentally demonstrated for a planar crane system in Sec. 5. We extend our approach by deriving structure-preserving covariance updates in Sec. 6. Finally, we discuss the results and provide concluding remarks in Sec. 7.

## Variational Integrators

### Variational Mechanics.

The motion of a mechanical system is characterized by its *Lagrangian* $L$, a real-valued function that depends on the system's configuration $q(t)$ and velocity $\dot{q}(t)$ at any time $t$. The Lagrangian is typically the difference of kinetic and potential energy, whereas the system's *Hamiltonian* $H$ is typically the sum of all energies. In addition, there may be forces $f(q,\dot{q},u)$, such as friction or a control input force, that influence the system's dynamics. The system dynamics are described by the Lagrange-d'Alembert principle, such that they satisfy the variational equation (cf. Ref. [6])

$$\delta \int_{0}^{T} L(q,\dot{q})\,\mathrm{d}t + \int_{0}^{T} f(q,\dot{q},u)\cdot \delta q\,\mathrm{d}t = 0 \qquad (1)$$

for all variations $\delta q$ vanishing at the endpoints, which is equivalent to the forced Euler–Lagrange equations

$$\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{q}}(q,\dot{q}) - \frac{\partial L}{\partial q}(q,\dot{q}) = f(q,\dot{q},u) \qquad (2)$$

Equivalently, the dynamics can be written in terms of the configuration $q$ and momentum variables $p=(\partial L/\partial \dot{q})(q,\dot{q})$ with Hamiltonian $H(q,p)=p\cdot \dot{q}-L(q,\dot{q})$ and Hamiltonian forcing $f_H$, leading to the forced Hamilton equations

$$\dot{q} = \frac{\partial H}{\partial p}(q,p), \qquad \dot{p} = -\frac{\partial H}{\partial q}(q,p) + f_H \qquad (3)$$

In the absence of forces, the flow of a Hamiltonian system (and thus, of an unforced Lagrangian system) is *symplectic*.^{2}

### Discrete Mechanics.

In order to simulate the system dynamics, most numerical integration schemes, e.g., Runge-Kutta methods, would be applied to either the Euler–Lagrange equations (2) or the Hamilton equations (3). This differs from *variational integrators*, which instead discretize the variational equation (1). This process generates an iterative numerical integration scheme, and the approach guarantees that structures of the original continuous system are preserved, e.g., symplecticity and symmetries in terms of system invariances (cf. Ref. [6]). Furthermore, variational integrators nearly conserve the system's energy over exponentially long time horizons (cf. Ref. [7]).

Discrete mechanics replaces the continuous trajectory by a discrete one, $\{q_k\}_{k=0}^{N}$ with $q_k \approx q(t_k)$ on a time grid $t_k = kh$, and approximates the action integral on each interval by a *discrete Lagrangian* $L_d(q_k, q_{k+1})$ together with left and right discrete forces $f_k^{-}, f_k^{+}$. Requiring stationarity of the discrete action sum with respect to variations in $q_k$ for $k=1,\dots,N-1$ leads to the *discrete forced Euler–Lagrange equations* (cf. Refs. [6,8]) for $k=1,\dots,N-1$

$$D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_{k-1}^{+} + f_k^{-} = 0$$

where $D_n L_d$ is the *slot derivative*^{3} of $L_d$. Discrete momenta are given by the relationships $p_k = -D_1 L_d(q_k, q_{k+1}) - f_k^{-}$ and $p_{k+1} = D_2 L_d(q_k, q_{k+1}) + f_k^{+}$, which implicitly define a *symplectic one-step integration map* $\Psi_{k+1}\colon (q_k, p_k) \mapsto (q_{k+1}, p_{k+1})$. It is shown in Ref. [7] that (in the unforced case) a large class of variational integrators can be written as symplectic partitioned Runge-Kutta methods. In Ref. [8], the definition of symplecticity in VI/symplectic partitioned Runge-Kutta methods is extended to external forces. Thus, we will use both VI and symplectic partitioned Runge-Kutta schemes throughout this paper. In particular, we will use the VI midpoint rule for the planar crane system (Sec. 5) and the symplectic Euler method for Kalman filtering in Sec. 6.
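The discrete momenta relations can be turned into a concrete one-step map. The following minimal sketch (our own illustration, not code from the paper) implements the midpoint variational integrator for a scalar system with Lagrangian $L = \frac{1}{2}\dot{q}^2 - V(q)$, solving the implicit step equation for $q_{k+1}$ by Newton's method; the pendulum potential $V(q) = 1 - \cos q$ is an assumed example.

```python
import math

def midpoint_vi_step(q, p, h, Vp, Vpp, tol=1e-12):
    """One step of the midpoint variational integrator for L = 0.5*qdot^2 - V(q).

    Solves p = (q1 - q)/h + (h/2) V'((q + q1)/2) for q1 by Newton's method,
    then evaluates the discrete momentum p1 = (q1 - q)/h - (h/2) V'((q + q1)/2).
    """
    q1 = q + h * p  # explicit Euler initial guess
    for _ in range(50):
        mid = 0.5 * (q + q1)
        f = (q1 - q) / h + 0.5 * h * Vp(mid) - p
        if abs(f) < tol:
            break
        q1 -= f / (1.0 / h + 0.25 * h * Vpp(mid))
    p1 = (q1 - q) / h - 0.5 * h * Vp(0.5 * (q + q1))
    return q1, p1

# Assumed pendulum potential V(q) = 1 - cos(q), so V' = sin, V'' = cos
def energy(q, p):
    return 0.5 * p * p + 1.0 - math.cos(q)

q, p = 1.0, 0.0
E0 = energy(q, p)
for _ in range(10000):
    q, p = midpoint_vi_step(q, p, 0.1, math.sin, math.cos)
drift = abs(energy(q, p) - E0)  # stays bounded: no secular energy drift
```

Even after 10,000 steps at a fairly coarse step size, the energy error remains a small bounded oscillation rather than a drift, which is the long-term behavior discussed above.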

### Stochastic Symplectic Integrators.

In Ref. [11], the Hamilton–Pontryagin variational principle is extended to stochastic dynamical systems. It is shown that a stochastic mechanical system possesses an almost surely variational structure, and thus is almost surely symplectic and momentum-map preserving. These properties carry over to the stochastic variational integrator that Ref. [11] derives from analogous discrete variations. While that first-order integrator is derived from the Hamilton–Pontryagin principle in configuration, momentum, and velocity coordinates, the stochastic symplectic Euler method used in this work is based on the Hamilton equations in (*q*, *p*) coordinates.

Applying a symplectic Euler discretization with step size $h$ to the stochastic Hamiltonian system in $(q,p)$ coordinates yields

$$q_{k+1} = q_k + h\,\frac{\partial H}{\partial p}(q_k, p_{k+1}) + \sigma\,\Delta_k\omega_1, \qquad p_{k+1} = p_k - h\,\frac{\partial H}{\partial q}(q_k, p_{k+1}) + \gamma\,\Delta_k\omega_2$$

for $k=0,1,\dots,N-1$, where $\Delta_k\omega_{1,2}$ are increments of discretized independent Wiener processes and $\sigma, \gamma \in \mathbb{R}$. In the following, we refer to this method as the *symplectic Euler scheme*. Note that this method simplifies to an explicit symplectic method for separable Hamiltonian systems ($H = T(p) + V(q)$).
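For separable Hamiltonians $H = T(p) + V(q)$, the symplectic Euler scheme can be implemented explicitly: the momentum update uses only $q_k$, and the position update then uses the new momentum. A minimal sketch (our own illustration, with $\sigma$, $\gamma$ scaling the Wiener increments as in the text):

```python
import math
import random

def sym_euler_step(q, p, h, dVdq, dTdp, sigma, gamma, rng):
    """Symplectic Euler step for a separable Hamiltonian H = T(p) + V(q),
    with additive noise on each equation (set sigma = gamma = 0 for the
    deterministic scheme). The momentum update is explicit because dV/dq
    depends on q only; the position update uses the *new* momentum."""
    dw1 = rng.gauss(0.0, math.sqrt(h))
    dw2 = rng.gauss(0.0, math.sqrt(h))
    p1 = p - h * dVdq(q) + gamma * dw2
    q1 = q + h * dTdp(p1) + sigma * dw1
    return q1, p1

# Deterministic harmonic oscillator run: energy stays bounded (no drift)
rng = random.Random(0)
q, p = 1.0, 0.0
for _ in range(5000):
    q, p = sym_euler_step(q, p, 0.05, lambda x: x, lambda y: y, 0.0, 0.0, rng)
energy = 0.5 * (q * q + p * p)  # oscillates near the initial value 0.5
```

The updated-momentum ordering is exactly what distinguishes this scheme from the explicit Euler method, which uses the old state in both updates.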

The second method considered in this work is the *variational midpoint integrator*, which, applied to stochastic mechanical systems, leads to

$$q_{k+1} = q_k + h\,\frac{\partial H}{\partial p}\!\left(\tfrac{q_k+q_{k+1}}{2}, \tfrac{p_k+p_{k+1}}{2}\right) + \sigma\,\Delta_k\omega_1, \qquad p_{k+1} = p_k - h\,\frac{\partial H}{\partial q}\!\left(\tfrac{q_k+q_{k+1}}{2}, \tfrac{p_k+p_{k+1}}{2}\right) + \gamma\,\Delta_k\omega_2$$

Alternatively, the midpoint integrator can be derived variationally from $L_d(q_k, q_{k+1}, h) = h\cdot L\big((q_{k+1}+q_k)/2,\,(q_{k+1}-q_k)/h\big)$, which we will use in the examples presented in Sec. 5. The stochastic processes are then included as shown in Ref. [11].

The following example of a stochastic harmonic oscillator is adapted from Ref. [10], and we use it to illustrate the crucial role of the integration scheme in stochastic simulations.

*Example 2.1.* (Stochastic Harmonic Oscillator). The stochastic differential equations for the harmonic oscillator with configuration $q\in\mathbb{R}$, momentum $p\in\mathbb{R}$, Hamiltonian $H(q,p)=\frac{1}{2}(q^2+p^2)$, and Wiener processes $w_{1,2}$ are given by

$$\mathrm{d}q = p\,\mathrm{d}t + \sigma\,\mathrm{d}w_1, \qquad \mathrm{d}p = -q\,\mathrm{d}t + \gamma\,\mathrm{d}w_2 \qquad (7)$$

We use the analytic solution presented in Ref. [10] to investigate the flow from a set of initial conditions (circle about (0, 0) in Fig. 1) and to compare the approximation quality of the different integrators. The first integration method is the standard Euler–Maruyama integrator (see, e.g., Ref. [5]). As can be seen in Fig. 1, its flow artificially expands the radius of the circle (blue dots), and the mean location of the circle drifts significantly from the true solution. This is unlike the behavior of the two symplectic methods (second-order midpoint in red; points mostly hidden by first-order symplectic Euler results depicted in green), which both follow the exact flow very closely and retain the circle area. In particular, we cannot observe a clear superiority of the second-order method over the symplectic Euler scheme in this example. Both symplectic integrators have statistics-preserving^{4} properties that are crucially important for estimator performance, as we illustrate in the remainder of this paper.
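The effect in Fig. 1 can be reproduced qualitatively in a few lines, assuming additive-noise oscillator dynamics $\mathrm{d}q = p\,\mathrm{d}t + \sigma\,\mathrm{d}w_1$, $\mathrm{d}p = -q\,\mathrm{d}t + \gamma\,\mathrm{d}w_2$ (our reading of the model; ensemble size, step size, and noise levels below are illustrative):

```python
import math
import random

def em_step(q, p, h, s1, s2):
    # Euler-Maruyama: both updates use the old state (q, p)
    return q + h * p + s1, p - h * q + s2

def se_step(q, p, h, s1, s2):
    # symplectic Euler: momentum first, then position with the new momentum
    p1 = p - h * q + s2
    return q + h * p1 + s1, p1

def mean_energy(step, n=200, h=0.1, steps=300, sigma=0.1, gamma=0.1, seed=0):
    """Average energy of an ensemble of sample paths after `steps` steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        q, p = 1.0, 0.0
        for _ in range(steps):
            s1 = sigma * rng.gauss(0.0, math.sqrt(h))
            s2 = gamma * rng.gauss(0.0, math.sqrt(h))
            q, p = step(q, p, h, s1, s2)
        total += 0.5 * (q * q + p * p)
    return total / n

e_em = mean_energy(em_step)   # inflated by the nonsymplectic map every step
e_se = mean_energy(se_step)   # stays close to the exact linear growth in t
```

The Euler–Maruyama ensemble energy grows geometrically (its deterministic part expands phase-space area by a factor $1+h^2$ per step), whereas the symplectic Euler ensemble shows only the linear growth driven by the injected noise.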

## Particle Filters

This section provides a brief introduction to particle filters, together with a numerical example illustrating how a symplectic integrator can be leveraged to alleviate a common issue with particle filters.

### Basic Particle Filtering.

The key concept behind a particle filter is that uncertainty distributions of the system's state will be represented by a finite collection of particles. No analytical expression for the probability density function associated with these distributions is required; rather, we assume that the particles are drawn from this distribution. In this way, the collection of particles itself represents an approximation of the distribution. This is opposed to other filters, e.g., the Kalman filter, where the uncertainty distribution is parameterized and the filter predicts the evolution of the parameters. We will now sketch the steps required for a particle filter following the terminology and syntax of Ref. [3].

#### Initialization.

A particle filter is initialized by generating a set of *M* particles $X0$ where each particle $x0m$ with $1\u2264m\u2264M$ represents a possible initial state of the system. This initial particle collection may be drawn from a known distribution or even assumed to all have the same value.

#### Proposal Distribution.

In each step of the particle filter, each particle from the previous time-step is mapped forward to the current time using a noisy system model and knowledge of the controls used during the previous time-step. This new collection of particles is referred to as a *proposal distribution*. This is also the step where the integration techniques of this work are implemented.

#### Importance Weights.

In order to incorporate measurements, an *importance weight* for each particle is calculated. The importance weight uses a noisy measurement model to evaluate the probability of a measurement $z_t$ given a particle $x_t^m$, i.e., the weight of particle $m$ at time $t$ is $w_t^m = p(z_t\,|\,x_t^m)$.

#### Resampling.

In most implementations of the particle filter, the next step is to incorporate the measurement $z_t$ into the system's uncertainty distribution through a resampling process. In this process, the particles representing a sampling of the posterior distribution for the current time-step are generated by randomly drawing, with replacement, a new set of $M$ particles from the proposal distribution. This random drawing is biased by the importance weights calculated in the previous step: particles from the proposal distribution with higher measurement probability have a higher probability of being drawn for inclusion in the posterior distribution. In this way, particles with low probability are culled from the collection, and the total particle distribution is *drawn* to the measurement. This process allows the statistics of the posterior distribution to include knowledge of the measurements, and it helps to ensure that the process of generating the proposal distribution does not waste computational effort on integrating particles that have very low posterior probability.

#### Iterate.

To continue the particle filter, the posterior particle set from time-step *t* is used as the initial set for the Proposal Distribution step of time-step *t* + 1.
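The steps above can be condensed into a minimal bootstrap particle filter. The sketch below uses a hypothetical one-dimensional random-walk model with Gaussian noise; the model, noise levels, and names are illustrative, not taken from the paper's experiments.

```python
import math
import random

def particle_filter_step(particles, z, f, q_std, r_std, rng):
    """One cycle of a bootstrap particle filter:
    propagate (proposal) -> weight by measurement likelihood -> resample."""
    # Proposal distribution: push each particle through the noisy model
    proposal = [f(x) + rng.gauss(0.0, q_std) for x in particles]
    # Importance weights w^m = p(z | x^m) for a Gaussian measurement model
    weights = [math.exp(-0.5 * ((z - x) / r_std) ** 2) for x in proposal]
    # Resample M particles with replacement, biased by the weights
    return rng.choices(proposal, weights=weights, k=len(proposal))

rng = random.Random(1)
true_x = 0.0
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]  # initialization
for _ in range(50):  # iterate
    true_x += rng.gauss(0.0, 0.05)              # true process
    z = true_x + rng.gauss(0.0, 0.1)            # noisy measurement
    particles = particle_filter_step(particles, z, lambda x: x, 0.05, 0.1, rng)
estimate = sum(particles) / len(particles)      # posterior mean estimate
```

For a mechanical system, `f` would be replaced by the one-step integration map of Sec. 2, which is exactly where the choice of integrator enters the filter.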

### Particle Filters and Choice of Integrator.

A particle filter inherits the statistical properties of the integration scheme that is used for propagating the collection of particles. We demonstrate this by revisiting the stochastic harmonic oscillator example. In the applications presented in Sec. 5, we show an example of how a common issue with particle filters (particle deprivation) can be significantly reduced by using variational integrators.

In particle filtering algorithms, a collection of particles is mapped forward a single time-step using a noisy process model in order to predict a proposal distribution. Then, a measurement of the system is taken, and a new set of particles is produced by resampling the proposal distribution to produce an approximation of the posterior distribution. Particles with higher probability given the measurement value (and its corresponding probability density function) have a higher probability of being selected for the posterior distribution. This resampling process incorporates the measurement in the filter state by removing particles with low probability.

Typically, if measurements are collected regularly from a reliable sensor, the incorporation of the measurements will help to regulate errors caused by inaccurate integration of the particle states. However, there are many situations where one may encounter unreliable measurements, for example, in high-dimensional data association problems, systems with unreliable communication networks, or tracking systems suffering from occlusions. In these situations, the particle filter must predict the state of the system over long time horizons without measurements. As a result, symplectic integrators may significantly increase the performance of the particle filter. An illustration of this can be seen in the following example.

*Example 3.1*. (Stochastic Harmonic Oscillator). In Fig. 2, the statistical properties of the harmonic oscillator of Eq. (7) are simulated by integrating a collection of 1000 particles for 60 s with a time-step of 0.0625 s using a midpoint VI and an explicit Euler integrator (RK1). The plot shows the eigenvalues of the covariance of the particle distribution as a function of time. It can be seen that the RK1 scheme adds significantly more noise to the system than the variational integrator. This is true even though the two collections of particles were driven by the same set of sample paths of the Wiener process. Even in this simple system, with no measurements to regulate the integration errors, the nonsymplectic RK1 integrator significantly overpredicts the uncertainty in the system state.

### Particle Deprivation.

A common issue encountered in particle filters is that of *particle deprivation* [3]. In a filter suffering from particle deprivation, there are too few particles in the vicinity of the true state of the system. When this occurs, the resampling process may eventually drive the number of unique particles down to a single particle. While there are known heuristics for preventing this situation [3], we point out that an integrator adding artificial diffusion to the collection of particles will increase the filter's probability of suffering from particle deprivation when measurements are incorporated. Results illustrating this point are in Sec. 5.
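A small numerical illustration of this mechanism (our own construction, not the crane experiment of Sec. 5): resampling the same number of particles against the same measurement model, a proposal cloud that has been artificially spread retains far fewer unique particles.

```python
import math
import random

def resample_unique(spread, m=1000, r_std=0.05, seed=0):
    """Resample a particle cloud against a measurement z = 0 and count the
    distinct survivors. A cloud artificially spread by a diffusive integrator
    concentrates nearly all importance weight on a few particles."""
    rng = random.Random(seed)
    proposal = [rng.gauss(0.0, spread) for _ in range(m)]
    weights = [math.exp(-0.5 * (x / r_std) ** 2) for x in proposal]
    posterior = rng.choices(proposal, weights=weights, k=m)
    return len(set(posterior))

tight = resample_unique(0.05)  # accurate integrator: moderate spread
wide = resample_unique(2.0)    # diffusive integrator: inflated spread
```

The widely spread cloud loses most of its unique particles in a single resampling step, which is precisely the onset of particle deprivation.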

## Kalman Filter

As a starting point, we consider the following formulation of a Kalman filter, originally developed in Ref. [1]. For a textbook reference, we refer to Ref. [20].

*Proposition 4.1* *(Discrete-time Kalman filter, [1]). Let a model of the system and the measurement be given by*

$$x_{k+1} = A_k x_k + B_k u_k + w_k, \qquad z_k = C_k x_k + v_k$$

*with $R_k > 0$ and $\{w_k\}_{k=0}^{N}, \{v_k\}_{k=0}^{N}$ uncorrelated Gaussian white noise processes with covariances $Q_k$ and $R_k$, respectively. The initial state $x_0$ is assumed to be drawn from a Gaussian centered around a nominal initial state $x^0$ with covariance the same as the filter's initial covariance $P_0$. A measurement at time-step $k$ is denoted $z_k$. Then, the optimal estimate of the state, $\hat{x}_{k+1}$, given measurements up to step $k+1$, is given by the following update equations:*

$$\hat{x}_{k+1}^{-} = A_k \hat{x}_k + B_k u_k, \qquad P_{k+1}^{-} = A_k P_k A_k^T + Q_k$$

$$K_{k+1} = P_{k+1}^{-} C_{k+1}^T \left(C_{k+1} P_{k+1}^{-} C_{k+1}^T + R_{k+1}\right)^{-1}$$

$$\hat{x}_{k+1} = \hat{x}_{k+1}^{-} + K_{k+1}\left(z_{k+1} - C_{k+1}\hat{x}_{k+1}^{-}\right), \qquad P_{k+1} = \left(I - K_{k+1} C_{k+1}\right) P_{k+1}^{-}$$

In Kalman's original work [1], the state transition matrix of the continuous-time system is used for *A _{k}* in Proposition 4.1. In general, this transition matrix cannot be computed analytically, but it can be approximated by an integration scheme applied to the continuous differential equation.
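To see why the choice of approximation matters, consider the harmonic oscillator with $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$: the exact transition matrix $e^{hA}$ is a rotation with unit determinant, while the explicit Euler approximation $I + hA$ has determinant $1 + h^2$ and therefore inflates phase-space area (and hence predicted covariances) every step. A quick check:

```python
import math

def expm_rotation(h):
    """Exact transition matrix exp(hA) for A = [[0, 1], [-1, 0]]
    (harmonic oscillator): a rotation by angle h."""
    c, s = math.cos(h), math.sin(h)
    return [[c, s], [-s, c]]

def euler_map(h):
    """Explicit Euler approximation I + hA of the same transition matrix."""
    return [[1.0, h], [-h, 1.0]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

h = 0.1
exact_det = det2(expm_rotation(h))   # = 1: the exact flow preserves area
euler_det = det2(euler_map(h))       # = 1 + h^2: Euler expands area each step
growth_100 = euler_det ** 100        # area inflation after 100 steps
```

After only 100 steps at $h = 0.1$, the Euler map has inflated phase-space area by roughly a factor of 2.7, which directly contaminates the covariance prediction $A_k P_k A_k^T$.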

### Linearizations for Extended Kalman Filters.

A key feature of the variational integrator used in this work is that it admits a *structured linearization* [21,22]. Many estimation algorithms, such as the EKF, rely on local linearizations to adapt linear system tools to nonlinear systems. Typically, in the continuous-time setting, to obtain a local linearization at a point, one would use a Taylor expansion of $\dot{x} = f(x,u)$ about that point. This linearization is an infinitesimal linearization, but the discrete-time linearization is what is required. In other words, one needs to linearize the discrete map $x_{k+1} = f_k(x_k, u_k)$. In a standard implementation of an EKF [3], the discrete-time Kalman filter from Proposition 4.1 is adapted to a nonlinear system using the following linearizations:

$$A_k = \frac{\partial f_k}{\partial x}\big(\hat{x}_k, u_k\big), \qquad C_k = \frac{\partial h}{\partial x}\big(\hat{x}_k\big)$$

where the measurement model is assumed to be of the form $zk=h(xk)+vk$.
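A compact sketch of the resulting EKF cycle follows; the explicit-Euler pendulum model, noise levels, and helper names are our illustrative assumptions, standing in for the maps $f_k$ and $h$:

```python
import math
import numpy as np

def ekf_step(x, P, z, f, h_meas, A_jac, C_jac, Q, R):
    """One predict/update cycle of a discrete-time extended Kalman filter,
    using local linearizations A_k, C_k of the one-step map and measurement."""
    x_pred = f(x)                      # nonlinear prediction
    F = A_jac(x)
    P_pred = F @ P @ F.T + Q           # covariance prediction
    H = C_jac(x_pred)
    S = H @ P_pred @ H.T + R           # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - h_meas(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical test system: explicit-Euler pendulum, position-only measurement
dt = 0.01
f = lambda x: np.array([x[0] + dt * x[1], x[1] - dt * math.sin(x[0])])
A_jac = lambda x: np.array([[1.0, dt], [-dt * math.cos(x[0]), 1.0]])
h_meas = lambda x: np.array([x[0]])
C_jac = lambda x: np.array([[1.0, 0.0]])
Q, R = 1e-5 * np.eye(2), np.array([[0.0025]])

rng = np.random.default_rng(0)
x_true, x_est, P = np.array([0.5, 0.0]), np.array([0.0, 0.0]), np.eye(2)
for _ in range(1000):
    x_true = f(x_true)
    z = h_meas(x_true) + rng.normal(0.0, 0.05, size=1)
    x_est, P = ekf_step(x_est, P, z, f, h_meas, A_jac, C_jac, Q, R)
err = np.linalg.norm(x_est - x_true)
```

Swapping `f` and `A_jac` for a variational-integrator one-step map and its structured linearization leaves the filter code unchanged, which is the drop-in property emphasized in the Introduction.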

As the presented VI is a one-step method, it admits a simple linearization, and even though the method is implicit, one can calculate an explicit linearization of the map [22]. Furthermore, since the VI is *exactly* sampling a nearby mechanical system, the VI linearization is an *exact* linearization of the nearby mechanical system. It is thus called a *structured linearization* [21,22]. This restriction on the behavior of the linearization greatly improves its local accuracy. In Sec. 5, we show an example of how the improved accuracy of this linearization leads to better EKF estimator performance.
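As an illustration of such a structured linearization, the sketch below differentiates the scalar midpoint-VI step from Sec. 2 through its implicit step equation using the implicit function theorem. The resulting Jacobian is explicit (no finite differencing of the solver), and its determinant is exactly 1, reflecting symplecticity. The pendulum potential and names are our illustrative assumptions.

```python
import math

def vi_step_and_jacobian(q, p, h, Vp, Vpp):
    """Midpoint-VI step for L = 0.5*qdot^2 - V(q) together with the exact
    Jacobian of the one-step map, obtained via the implicit function theorem."""
    q1 = q + h * p
    for _ in range(50):  # Newton solve of the implicit step equation
        mid = 0.5 * (q + q1)
        f = (q1 - q) / h + 0.5 * h * Vp(mid) - p
        if abs(f) < 1e-12:
            break
        q1 -= f / (1.0 / h + 0.25 * h * Vpp(mid))
    mid = 0.5 * (q + q1)
    p1 = (q1 - q) / h - 0.5 * h * Vp(mid)
    a = 1.0 / h + 0.25 * h * Vpp(mid)   # d(implicit equation)/d(q1)
    dq1_dq = (1.0 / h - 0.25 * h * Vpp(mid)) / a
    dq1_dp = 1.0 / a
    # p1 = (q1 - q)/h - (h/2) V'(mid): chain rule through q1
    dp1_dq = (dq1_dq - 1.0) / h - 0.25 * h * Vpp(mid) * (1.0 + dq1_dq)
    dp1_dp = dq1_dp / h - 0.25 * h * Vpp(mid) * dq1_dp
    return (q1, p1), [[dq1_dq, dq1_dp], [dp1_dq, dp1_dp]]

(q1, p1), Jm = vi_step_and_jacobian(0.3, 0.2, 0.1, math.sin, math.cos)
det = Jm[0][0] * Jm[1][1] - Jm[0][1] * Jm[1][0]  # = 1 for a symplectic map
```

A finite-difference check of the implicit map reproduces the same Jacobian, confirming that the closed-form expressions are the exact derivatives of the solver's output.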

In principle, a one-step map can be derived and linearized for implicitly defined RK schemes in the same manner, but higher order RK methods lead to complex Taylor series expansions due to the intermediate evaluation points. For that reason, implementations are often restricted to first-order methods. While this paper does not explicitly consider results related to estimation for constrained systems, we note that an additional interesting feature of the midpoint VI integrator is that it exactly satisfies holonomic constraints at each time-step while still admitting the explicit, structured linearization.

## Application to Planar Crane System

This section presents experimental and simulated examples demonstrating how the choice of discretization scheme may dramatically affect the performance of standard particle filters and extended Kalman filters. Further, we demonstrate improved performance of both filters when leveraging the structure-preserving properties of variational integrators.

In Secs. 5.1 and 5.2, we apply particle filters and EKFs to the planar crane system shown in Fig. 3. This system and the corresponding modeling strategy have been discussed by the authors in Refs. [21] and [23]. In those works, a modeling strategy was presented where the position of the winch system $x_r$ and the length of the string $r$ were treated as kinematic inputs, i.e., inputs where the system has sufficient control authority to perfectly track any desired trajectory [24]. With this modeling strategy, the Lagrangian for the system is only a function of the dynamic configuration variables $(x, y)$.

As a result of the kinematic input assumption, all damping in the crane portion of the system is assumed to be negligible. Moreover, through experimentation, we found that the damping in the swing dynamics of the payload was low enough to also be considered negligible. Thus, the deterministic models of the dynamics of this system have no damping.

For all results in this section, the noise models are assumed to be additive Gaussian noise with zero mean. The covariances parameterizing these distributions are all assumed to be diagonal, and the experimental system only measures relevant configuration variables, i.e., there are no measurements of velocity or discrete generalized momentum.

### Numerical Results.

Figure 4 shows simulated effects of the choice of integration scheme on performance of the EKF algorithm at a range of time-steps. Each point on the plot was generated by running 1000 trials of the filter on a nominal feasible trajectory for the system described in Fig. 3. In the "prediction" step of the EKF, actual samples from the nominal trajectory were used, i.e., the predictions without additive noise corresponded exactly with the nominal trajectory. Measurements were simulated by adding Gaussian noise to the nominal trajectory. For each trial, the *L*_{2} error between the filtered signal and the reference signal was determined at each time-step. These errors were then averaged to produce an "error norm" for each trial. The error norms were then averaged, and their standard deviations calculated to produce the points and error bars in Fig. 4. For the upper, solid curve, the local linearization was performed by evaluating the infinitesimal derivatives of the continuous dynamics about the current best estimate, and then using an explicit Euler approximation to convert this into a discrete linearization. For the bottom curve, the structured linearization of the midpoint variational integrator discussed in Sec. 4.1 was used. It is easily seen that, as the filter frequency decreases, the discrete-mechanics-based variational integrator representation significantly outperforms the explicit Euler approximation.

### Experimental Results.

In this section, we use experimental data from the robotic planar crane system described in Refs. [21] and [23]. This system utilizes digital encoders to close control loops around $x_r$ and $r$. Thus, the kinematic-input modeling strategy mentioned in Sec. 5 is used again. Since these inputs are assumed to be perfectly controlled, the dynamic configuration variables $(x, y)$ are the only state variables estimated.

Figure 5 shows the parametric evolution of estimates of the dynamic configuration variables using particle filters and EKFs at two different frequencies, using both a midpoint VI and an RK1 integrator to represent the system. The strength of the VI can be seen by noting that the particle filter and the EKF estimates of the system's uncertainty are nearly identical even at frequencies as low as 6 Hz. The two filter estimates are not only in agreement with each other, but they are also in agreement between the frequencies. We emphasize that, at 6 Hz, the time-step used for integration, measuring, linearization, and estimation is approximately 0.167 s. With this large time-step, the RK1 particle filter is useless.

Figure 6 shows the time evolution of the eigenvalues of the covariances from each of the estimators in Fig. 5. This figure further demonstrates that the VI covariance predictions are stable even at large time-steps. Additionally, it demonstrates that, in this particular system, the structured linearization allows the EKF to perform nearly identically to the particle filter. In this case, the increased accuracy of the structured linearization avoids the increased computational expense of the particle filter while achieving similar performance.

In Sec. 3.3, the issue of particle deprivation in particle filters was discussed. It was explained that the VI reduces the likelihood of this issue by removing artificial noise injected by a traditional integrator; this is especially true at low frequencies. Figure 7 shows the number of unique particles as a function of time for the 6 Hz particle filters using the VI and RK1 integrator. In several cases, the particle distribution from the RK1 integrator is artificially spread to the point that resampling produces only two unique particles (out of 1000) which implies that uncertainty estimates from the filter are essentially useless.

## Kalman Filter With Structure-Preserving Covariance Updates

In Sec. 4, we discussed the difference between symplectic and nonsymplectic discrete system matrices *A _{k}* in the Kalman filter and showed benefits of using symplectic discretization schemes for filtering applications in Sec. 5. However, the Kalman filter provides further structure due to its optimality property. In this section, we derive a modification to the previously used Kalman filter that preserves this optimality structure through discretization and thereby further improves the filter performance.

To this aim, we start with the *continuous-time* Kalman filter, as introduced in Ref. [25] (see Proposition 6.1 below). The continuous-time Kalman filtering problem is dual to a linear quadratic regulator (LQR) problem: While a linear quadratic regulator controller minimizes a quadratic cost functional, the Kalman filter determines state estimates that minimize the expected squared error to the process' true state (cf. Ref. [25]). In particular, both problems exhibit an additional *Hamiltonian structure* (independent of the mechanical Hamiltonian structure). This can be seen by examining the state-adjoint system that belongs to the respective problem's optimal solutions.

Our approach aims at additionally preserving this property through numerical discretization. Therefore, we set up a symplectic discretization of the state-adjoint system for the Kalman filtering problem. It is not necessary to explicitly solve for the adjoint equations, though. Instead, we directly derive modified discrete-time Riccati equations which are independent of the states and adjoints (as previously presented in Ref. [14]). Thus, our Kalman filter algorithm only consists of a single covariance update step and an all-in-one state update afterward (see Algorithm 6.1). The performance of this algorithm is studied in Example 6.1 and compared to the discrete-time Kalman filter methods with standard covariance updates.

Proposition 6.1 *(From Ref. [25]). Let the model of the system and the measurement be given by*

$$\dot{x}(t) = A\,x(t) + G\,w(t), \qquad z(t) = C\,x(t) + v(t)$$

*where* $x(t_0) \sim N(x_0, P_0)$, $w \sim N(0, Q)$, $v \sim N(0, R)$, *with white noise processes* $\{w(t)\}$, $\{v(t)\}$ *being uncorrelated, also with* $x_0$. *Then, the optimal filter is given by*

$$\dot{\hat{x}}(t) = A\,\hat{x}(t) + P(t)\,C^T R^{-1}\big(z(t) - C\,\hat{x}(t)\big), \qquad \hat{x}(t_0) = x_0$$

*where* $P(t)$ *relates the solutions of the state-adjoint system*

$$\frac{d}{dt}\begin{pmatrix} x \\ w \end{pmatrix} = \begin{pmatrix} -A^T & C^T R^{-1} C \\ G Q G^T & A \end{pmatrix}\begin{pmatrix} x \\ w \end{pmatrix} \qquad (11)$$

*via* $w(t) = P(t)\,x(t)$, from which the covariance equation

$$\dot{P} = A P + P A^T + G Q G^T - P C^T R^{-1} C P, \qquad P(t_0) = P_0 \qquad (12)$$

can be derived.

Lemma 6.1. *The state-adjoint system (11) is Hamiltonian and the transition matrix* $\Theta(t, t_0)$ *is symplectic.*

*Proof*. The system matrix of the combined state-adjoint equations is

$$S := \begin{pmatrix} -A^T & C^T R^{-1} C \\ G Q G^T & A \end{pmatrix}$$

and, since $C^T R^{-1} C$ and $GQG^T$ are symmetric, it satisfies $(JS)^T = JS$ with $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$. So we have a (linear) Hamiltonian system, for which the flow is defined by the transition matrix. Thus, it follows that $\Theta(t, t_0)$ is a symplectic matrix for all $t \geq t_0$. ◻
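Lemma 6.1 can also be checked numerically. The sketch below uses illustrative oscillator matrices and noise levels (assumptions, not values from this work), and a plain truncated-Taylor `expm`, which is adequate for this small example; it verifies that $(JS)^T = JS$ and that the resulting transition matrix is symplectic:

```python
import numpy as np

def expm(M, terms=40):
    """Matrix exponential via truncated Taylor series (fine for small ||M||)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# Illustrative data: harmonic oscillator with position measurements
A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # mechanical system matrix
C = np.array([[1.0, 0.0]])                # measures q only
G, Q, R = np.eye(2), 0.05 * np.eye(2), np.array([[0.1]])

m = A.shape[0]
J = np.block([[np.zeros((m, m)), np.eye(m)], [-np.eye(m), np.zeros((m, m))]])
S = np.block([[-A.T, C.T @ np.linalg.inv(R) @ C],
              [G @ Q @ G.T, A]])          # combined state-adjoint matrix

assert np.allclose((J @ S).T, J @ S)      # (JS)^T = JS: linear Hamiltonian system
Theta = expm(0.2 * S)                     # transition matrix over t - t0 = 0.2
assert np.allclose(Theta.T @ J @ Theta, J)  # symplectic, as the lemma claims
print("Hamiltonian structure and symplectic transition matrix verified")
```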

In our discrete-time approximation of the Riccati equation for the covariance matrix, we derive discrete-time covariance updates that exactly preserve the following two properties: the symplectic structure of the state-adjoint differential equations and their linear relationship defined by $P(t)$. As in the continuous-time Kalman filter, the update equation for the covariance matrix $P_k$ does not depend on the state or on the adjoint. For known system and error covariance matrices at all time nodes, $\{P_k\}_{k=1}^{N}$ can be computed beforehand.

Theorem 6.1. *Let a system and measurement model be given as in Proposition 6.1 for a 2n-dimensional mechanical system with state* $(q, p)^T$*. Let* $t_d = \{t_k\}_{k=0}^{N}$ *denote a set of discrete time points, with* $t_k = kh$ *for step size h, and assume* $P_0$ *symmetric is given. We denote by* $\Psi_k$ *the one-step map of the symplectic Euler*^{5} *discretization of the 4n-dimensional state-adjoint system (11) with coordinates* $x_k = (v_k, y_k)^T$ *and* $w_k = (\lambda_k, \mu_k)^T$*. Note that* $v_k, y_k, \lambda_k, \mu_k \in \mathbb{R}^n$*. Then,* $\Psi_k$ *is symplectic by construction. Further, if the discrete covariance updates for* $k = 0, 1, \ldots, N-1$ *are computed by*

$$P_{k+1} = \left(\Psi_k^{21} + \Psi_k^{22} P_k\right)\left(\Psi_k^{11} + \Psi_k^{12} P_k\right)^{-1}$$

*with the linear one-step map* $\Psi_k$ *being identified with its representing matrix and* $\Psi_k = \begin{pmatrix} \Psi_k^{11} & \Psi_k^{12} \\ \Psi_k^{21} & \Psi_k^{22} \end{pmatrix}$*, then we have that* $w_k = P_k x_k$ *for all* $k = 0, 1, \ldots, N$.

By construction, $\Psi_k$ is symplectic (the symplectic Euler scheme indeed leads to a symplectic one-step map; see Ref. [7] for a proof).

Note that as long as we have a time-invariant estimation problem, $\Psi_k$ is a constant matrix that can be obtained from solving the set of Eq. (15) for $(v_{k+1}, y_{k+1}, \lambda_{k+1}, \mu_{k+1})$ and generating the matrix entries just once. If the system and measurement model matrices are time-varying, though, this step would have to be performed in every iteration.
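As an illustration, the sketch below builds a symplectic Euler one-step map $\Psi$ for a time-invariant problem once and then iterates the covariance update of Theorem 6.1. The system matrices and the particular implicit/explicit split are illustrative assumptions (any splitting along conjugate pairs yields a first-order symplectic scheme):

```python
import numpy as np

def sympl_euler_map(S, h):
    """One-step map Psi of symplectic Euler for u' = S u with u = (x, w):
    x_{k+1} = x_k + h (S11 x_{k+1} + S12 w_k)   (x treated implicitly)
    w_{k+1} = w_k + h (S21 x_{k+1} + S22 w_k)   (w treated explicitly)"""
    m = S.shape[0] // 2
    S11, S12, S21, S22 = S[:m, :m], S[:m, m:], S[m:, :m], S[m:, m:]
    inv = np.linalg.inv(np.eye(m) - h * S11)
    return np.block([[inv, h * inv @ S12],
                     [h * S21 @ inv,
                      np.eye(m) + h * S22 + h * h * S21 @ inv @ S12]])

# Illustrative stochastic harmonic oscillator data
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
C = np.array([[1.0, 0.0]])
GQGt, Rinv = 0.05 * np.eye(2), np.linalg.inv(np.array([[0.1]]))
S = np.block([[-A.T, C.T @ Rinv @ C], [GQGt, A]])

h = 0.03125
Psi = sympl_euler_map(S, h)          # computed once: the problem is time-invariant
P11, P12 = Psi[:2, :2], Psi[:2, 2:]
P21, P22 = Psi[2:, :2], Psi[2:, 2:]

P = np.eye(2)                        # P_0 symmetric
for _ in range(200):                 # {P_k} can be precomputed offline
    P = (P21 + P22 @ P) @ np.linalg.inv(P11 + P12 @ P)
print(np.round(P, 4))
```

Because $\Psi$ is symplectic, each update is a graph transform of a Lagrangian subspace, so the symmetry of $P_k$ is preserved up to roundoff.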

For estimation problems of mechanical systems, it is desirable to also preserve the mechanical structure (i.e., Hamiltonian matrix, *A*, or Hamiltonian plus forcing), as demonstrated in Sec. 4. Therefore, we combine the structure-preserving covariance update of Theorem 6.1 with discrete symplectic state updates. This leads to the following algorithm that resembles the continuous-time filter (see Proposition 6.1) in that it does not split into prediction and measurement update steps as the discrete Kalman filter does (cf. Proposition 4.1).

**Algorithm 6.1**. For the implementation of the structure-preserving Kalman filter with symplectic state and covariance updates using symplectic Euler integration, we modify the continuous-time Kalman filter algorithm of Ref. [20] to

1. take measurement^{6} $z_{k+1}$,
2. update covariance matrix by $P_{k+1} = \left(\Psi_k^{21} + \Psi_k^{22} P_k\right)\left(\Psi_k^{11} + \Psi_k^{12} P_k\right)^{-1}$,
3. compute Kalman gain as $K_{k+1} = P_{k+1} C^T R^{-1}$,
4. update state by $\begin{pmatrix} q_{k+1} \\ p_{k+1} \end{pmatrix} = \begin{pmatrix} q_k \\ p_k \end{pmatrix} + h\left[(A - K_{k+1} C)\begin{pmatrix} q_{k+1} \\ p_k \end{pmatrix} + K_{k+1} z_{k+1}\right]$.


Step 4 is again an implicit, but linear, update, so it can be analytically inverted beforehand and then directly evaluated during filtering. While Algorithm 6.1 may not initially appear to follow the same procedure as the standard discrete Kalman filter (Proposition 4.1), in that there is no *prediction* or *update* step, we point out that one could rewrite the standard discrete Kalman filter to give a single covariance update equation analogous to the update rule in Step 2 of Algorithm 6.1. In general, the covariance updates from Algorithm 6.1 and from Proposition 4.1 will not result in the same covariance evolution.
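A compact sketch of one iteration of this scheme for a stochastic harmonic oscillator is given below. All matrices, the step count, the noise-free measurement sequence, and the helper names (`sympl_euler_map`, `filter_step`) are illustrative assumptions; the point is that the implicit Step 4 is linear, so its $q$-block can be resolved with a single linear solve:

```python
import numpy as np

def sympl_euler_map(S, h):
    # Symplectic Euler one-step map for the state-adjoint system (x implicit)
    m = S.shape[0] // 2
    S11, S12, S21, S22 = S[:m, :m], S[:m, m:], S[m:, :m], S[m:, m:]
    inv = np.linalg.inv(np.eye(m) - h * S11)
    return np.block([[inv, h * inv @ S12],
                     [h * S21 @ inv,
                      np.eye(m) + h * S22 + h * h * S21 @ inv @ S12]])

def filter_step(qp, P, z, A, C, Rinv, Psi, h, n):
    """One pass through Steps 1-4 (hypothetical helper, not the paper's code)."""
    m = 2 * n
    # Step 2: structure-preserving covariance update
    P = (Psi[m:, :m] + Psi[m:, m:] @ P) @ np.linalg.inv(Psi[:m, :m] + Psi[:m, m:] @ P)
    # Step 3: Kalman gain
    K = P @ C.T @ Rinv
    # Step 4: implicit-but-linear state update, solved for q_{k+1}
    M, b = A - K @ C, h * (K @ z)
    q, p = qp[:n], qp[n:]
    q = np.linalg.solve(np.eye(n) - h * M[:n, :n], q + h * M[:n, n:] @ p + b[:n])
    p = p + h * (M[n:, :n] @ q + M[n:, n:] @ p + b[n:])
    return np.concatenate([q, p]), P

# Illustrative setup: unit harmonic oscillator, noise-free position measurements
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
C = np.array([[1.0, 0.0]])
Rinv = np.linalg.inv(np.array([[0.1]]))
S = np.block([[-A.T, C.T @ Rinv @ C], [0.05 * np.eye(2), A]])
h, n = 0.03125, 1
Psi = sympl_euler_map(S, h)

qp, P = np.array([0.8, 0.2]), np.eye(2)   # deliberately wrong initial guess
for k in range(200):
    z = np.array([np.cos((k + 1) * h)])   # Step 1: take measurement z_{k+1}
    qp, P = filter_step(qp, P, z, A, C, Rinv, Psi, h, n)
print(qp)                                 # estimate near the truth (cos t, -sin t)
```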

The previous results specifically use a first-order symplectic scheme for two reasons: We want to give practitioners a good idea of how to implement the filter, and we want a fair comparison to the standard approach that is based on nonsymplectic, first-order, explicit Euler discretizations (see Example 6.1). However, if higher-order symplectic discretization schemes are used, e.g., a second-order midpoint rule, only the calculation of the one-step map based on a modified Eq. (15) and the state update in Step 4 of the algorithm need to be adjusted.

*Example 6.1.* (Stochastic Harmonic Oscillator). We revisit the example of a stochastic harmonic oscillator as introduced in Example 2.1. The structure-preserving Kalman filter method as presented in Algorithm 6.1 is compared to the discrete-time variants introduced in Sec. 4: the standard covariance updates with either an explicit Euler discretization for the discrete filter state update or a symplectic discretization by midpoint rule. Figure 8 shows the resulting filter behavior in state space (top row), as well as the covariance matrix eigenvalues (bottom left) and the mechanical energy of the oscillator (bottom right). All integrators use a step size of *h* = 0.03125 and identical measurements generated from one sample path. Unlike the examples presented in Sec. 4, the symplectic state updates alone do not improve the filter, likely due to the simplistic nature of the harmonic oscillator. However, the novel variant of a structure-preserving filter has smaller covariances, as can be seen from the ellipses in the top plot and the covariance eigenvalues. In particular, the mechanical energy of the filtered state, which would ideally—in the absence of noise and without numerical errors—be a constant, is much more accurately preserved than in the other two cases.
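The energy behavior in Example 6.1 has a simple mechanism behind it, sketched here for an undriven, noise-free unit oscillator (step size and horizon are illustrative): explicit Euler multiplies the energy $E = (q^2 + p^2)/2$ by exactly $1 + h^2$ every step, while symplectic Euler exactly conserves the nearby modified energy $\tfrac{1}{2}(q^2 + p^2 + h\,q p)$, so its energy error stays bounded by $O(h)$:

```python
h, N = 0.03125, 2000
qe, pe = 1.0, 0.0          # explicit (RK1) Euler state
qs, ps = 1.0, 0.0          # symplectic Euler state

for _ in range(N):
    qe, pe = qe + h * pe, pe - h * qe   # both updates use the old values
    qs = qs + h * ps                    # q first with old p ...
    ps = ps - h * qs                    # ... then p with the new q

energy = lambda q, p: 0.5 * (q * q + p * p)
print(energy(qe, pe), energy(qs, ps))   # drifts to ~3.5 vs. stays near 0.5
```

The artificial energy growth of the explicit scheme is precisely the "injected noise" that spreads the predicted distribution in the nonsymplectic filters.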

## Conclusion

This paper presents three structure-preserving filtering methods for mechanical control systems: (1) a particle filter with symplectic state predictions, (2) an extended Kalman filter with symplectic state predictions and structured linearizations of variational integrators, and (3) a Kalman filter with symplectic state update and additional symplectic covariance update. The filtering techniques have been tested in simulated examples and in a real-time experiment with a planar crane system. Symplectic discrete-time predictions of the states increased the performance of both the particle and the extended Kalman filter compared to the standard choice of explicit Euler updates. Replacing the covariance updates in the linear Kalman filter can improve its performance even further. We derive a modified discrete-time Riccati equation for the covariance matrices that preserves the Hamiltonian/symplectic structure of the system of optimality conditions that belongs to the continuous-time Kalman filter. This novel filter variant can be extended to nonlinear systems by linearizing the discretized state-adjoint system of the optimality conditions. The one-step map of the symplectic state-adjoint scheme, which is used for the covariance Riccati equations, will then become time dependent. Thus, one would expect an increased computational effort. However, this will be balanced by the observed higher accuracy of the method, which allows a coarser time-grid than the nonsymplectic first-order methods. Further theoretical studies of the symplectic filtering approach can be performed by backward error analysis [7] applied to the state-adjoint system. We also note that recent work [18] has shown that it may be possible to construct arbitrarily high-order variational integrators that are still linearizable one-step maps. These integrators are developed through backward error analysis [7] by constructing *surrogate Lagrangians*.
High-order VI-based filtering techniques are another area for future investigation. In many real-world applications, the filtering methods presented herein would be combined with optimal control techniques. First results on combining the presented filters with optimal control are given in Ref. [13]; a detailed study is left for future work.

## Acknowledgment

This work was supported by Army Research Office grant W911NF-14-1-0461, and by the National Science Foundation under Grant CMMI 1334609. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

In the conclusion of this document, we mention some recent work [18] that describes the generation of arbitrary-order VIs that are still one-step maps. These VIs may provide another choice of integrator that satisfies the aforementioned requirements while also yielding a higher-order method.

In general, a map is symplectic if it preserves a symplectic form [6,7]. To get an intuition for symplecticity, it is helpful to consider the fact that symplectic transformations are area-preserving. A linear map $A: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is symplectic if $A^T J A = J$ with $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$.
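For a concrete instance of this definition (a sketch; the two maps are the standard one-step matrices of symplectic and explicit Euler for a unit harmonic oscillator, written here for illustration):

```python
import numpy as np

h = 0.1
J = np.array([[0.0, 1.0], [-1.0, 0.0]])    # n = 1, so J is 2x2

# Symplectic Euler map: q' = q + h p, p' = p - h q'
M = np.array([[1.0, h], [-h, 1.0 - h * h]])
print(np.allclose(M.T @ J @ M, J))         # prints True: M^T J M = J

# Explicit Euler map: q' = q + h p, p' = p - h q
E = np.array([[1.0, h], [-h, 1.0]])
print(np.allclose(E.T @ J @ E, J))         # prints False: E^T J E = (1 + h^2) J
```

In two dimensions, $A^T J A = \det(A)\,J$, so the check reduces to area preservation: $\det M = 1$ while $\det E = 1 + h^2 > 1$.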

The slot derivative $D_n L(A_1, A_2, \ldots)$ represents the derivative of the function $L$ with respect to the $n$th argument, $A_n$. At times, the arguments of the function $L$ may be dropped for compactness; in that case, the slot derivative still applies to the argument order from the function definition.

Note that the states of the Hamiltonian system do not coincide with the filter state. Since the Hamiltonian structure of the mechanical system matrix $A$ suggests a splitting of the state and adjoint variables, though, we *choose* a symplectic Euler scheme that treats half of the states and adjoints as implicit and the other half as explicit. However, we point out that other splittings of the coordinates define first-order symplectic integration schemes as well.

In simulated examples, we generate measurements by $z_{k+1} = C\,x((k+1)h) + v_{k+1}$, where $x(t)$ is some ground truth trajectory, e.g., obtained from a high-order integration with small step sizes, and $v_{k+1}$ is a noise vector drawn from the assumed Gaussian distribution $N(0, R)$.
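As a minimal sketch of this convention (the ground-truth trajectory, step size, and noise covariance below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
h = 0.03125
C = np.array([[1.0, 0.0]])     # position measurement
R = np.array([[0.1]])          # measurement noise covariance

def x_true(t):
    # Ground truth for the unit harmonic oscillator (closed form here; in
    # general obtained from a high-order integration with small step sizes)
    return np.array([np.cos(t), -np.sin(t)])

def measurement(k):
    v = rng.multivariate_normal(np.zeros(1), R)   # noise drawn from N(0, R)
    return C @ x_true((k + 1) * h) + v            # z_{k+1}

z = [measurement(k) for k in range(5)]
print(z)
```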