## Abstract

In this article, we discuss a special class of time-optimal control problems for dynamic systems, where the final state of a system lies on a hyper-surface. In time domain, this endpoint constraint may be given by a scalar equation, which we call transversality condition. It is well known that such problems can be transformed to a two-point boundary value problem, which is usually hard to solve, and requires an initial guess close to the optimal solution. Hence, we propose a new gradient-based iterative solution strategy instead, where the gradient of the cost functional, i.e., of the final time, is computed with the adjoint method. Two formulations of the adjoint method are presented in order to solve such control problems. First, we consider a hybrid approach, where the state equations and the adjoint equations are formulated in time domain but the controls and the gradient formula are transformed to a spatial variable with fixed boundaries. Second, we introduce an alternative approach, in which we carry out a complete elimination of the time coordinate and utilize a formulation in the space domain. Both approaches are robust with respect to poor initial controls and yield a shorter final time and, hence, an improved control after every iteration. The presented method is tested with two classical examples from satellite and vehicle dynamics. However, it can also be extended to more complex systems, which are used in industrial applications.

## 1 Introduction

Among all optimal control problems, the solution of time-optimal problems is probably the most challenging ones. In this paper, we consider a dynamic system described by a set of minimal coordinates and given initial values. We are looking for the optimal control of the system such that the time for a prescribed process or maneuver becomes a minimum.

In order to solve such problems, fundamental books on optimal control as, e.g., Refs. [1] and [2] discuss direct and indirect methods. In summary, direct methods, also addressed as nonlinear programing, transform the dynamic problem to a static problem by a parameterization of the control variables, see Ref. [3], for example. Numerous methods follow a direct approach, e.g., interior point algorithms, active set, sequential quadratic programing, pseudo-spectral method, to name but a few. Indirect methods are based on the solution of the optimality conditions stated by Pontryagin resulting in a two-point boundary value problem. Solution strategies for the boundary value problem are discussed by various authors [47] and are already well established in a wide range in engineering sciences. The boundary value problem can, e.g., be solved with shooting methods or with discretization techniques in which both states and controls are discretized. For more details to direct and indirect methods, a good overview of both is given by Bianco et al. [8]. In a previous work, Bottasso et al. [9] presented a combined indirect approach for solving inverse and trajectory optimization problems in multibody dynamics, which also corresponds to the ideas presented in Ref. [5]. The latter mentioned work deals with the time-optimal control of a vehicle using an indirect approach to solve the boundary value problem.

A first significant formulation of the time-optimal control problem of a general vehicle maneuver has been stated by Hendrikx et al. [10]. This was followed by further articles concerning motorcycle lap time simulations in Ref. [4] and car lap time simulations discussed in Refs. [6] and [7]. Basically, all the mentioned authors have solved the boundary value problems associated with the optimal control problems. Time-optimal control problems play a significant role as well in robotic applications, see, e.g., Ref. [11] for the time-optimal control problem with specified trajectory. More recently, in Ref. [12] time-optimal motions following a predefined end-effector path are considered for kinematically redundant robots as well as for nonredundant robots. Here, a multiple shooting approach is used to solve the boundary value problem.

An indirect alternative to solving the underlying boundary value problem is given by gradient methods, independently developed by Kelley [13] and by Bryson and Denham [14] already in the sixties of the past century.

Various authors have already utilized this method in the field of inverse dynamics of multibody systems more than 20 years ago, as, e.g., Refs. [17] and [18], but nevertheless, the topic still provides open research potential.

More recently, Refs. [19] and [20] show how the adjoint method can be applied efficiently to a multibody system described by differential-algebraic equations of index three. A piecewise adjoint method is presented in Ref. [21], in which the coordinate partitioning underlying ordinary differential equations are formulated as a boundary value problem, and are solved by multiple shooting methods. It has to be emphasized that the adjoint method can be utilized to solve even highly nonlinear systems with a vast number of degrees-of-freedom in a straight forward way, as, e.g., in the application of second-order adjoint sensitivity analysis of large multibody systems in Ref. [22], in which a comparison between the adjoint method and direct differentiation shows a significant reduction in computational costs.

The basic idea of the adjoint method tracing back to Kelley [13] and Bryson and Denham [14], has been applied by various authors, as, e.g., for parameter identification and inverse dynamic problems [20]. Here, the idea of the adjoint method from the latter mentioned reference, has been used for the application for time-optimal control problems. Considering a dynamic system described by a minimal set of independent coordinates, the regarding initial value problem can be stated in the form
$ẋ=f(x(t),u(t))$
(1a)
$x(t0)=x0$
(1b)
where $x(t)∈ℝn$ denotes the vector of state variables and $u(t)∈ℝm$ the vector of control inputs. The final time $tf>t0$ of a maneuver or an operation period is implicitly defined by a scalar terminal condition for the state variables of the form
$Φ(x(tf))=sf$
(2)
where $Φ(x(tf))$ is a function $ℝn→ℝ$ and sf is a given number. For example, if we consider a car driving one lap on a race track, the operation period terminates, when the center of mass of the vehicle crosses the finish line. Now, we ask for the control which minimizes the operation period $tf−t0$ without violating inequality constraints for the state and control variables. For that purpose, we introduce a cost functional of the form
$J=∫t0tf[1+P(x(t),u(t))]dt$
(3)

where $P(x(t),u(t))$ is a proper penalty function which is zero if the states and control variables satisfy the constraint conditions, and increases rapidly if they are not satisfied. Moreover, P should be continuously differentiable at least once. Unless violating the constraints, the cost functional J is simply the length of the time interval $tf−t0$. The time-optimal control problem is then formulated as the problem of finding controls $u(t)$ in the time interval $t∈[t0,tf]$ which minimizes the functional J.

One strategy for finding these controls, which is pursued in this paper, is to compute the gradient of the cost functional, i.e., the direction of the steepest descent, and to walk along this direction until a (local) minimum of J is reached. That means we are asking for the variation $δu(t)$ of a given control $u(t)$ resulting in the largest local decrease of J. In order to compute $δu(t)$ we first augment the cost functional by the state equations in Eq. (1a) yielding
$J¯=∫t0tf[1+P(x,u)+pT(f(x,u)−ẋ)]dt$
(4)
where $p(t)∈ℝn$ denotes the vector of adjoint variables. If the state equations are satisfied, the term multiplied with $p(t)$ is zero and $J¯$ is identical with J for any choice of $p(t)$. Let us now consider an infinitesimal variation of the controls $δu(t)$. This results in an infinitesimal change of the states $δx(t)$ and, hence, of the final time $δtf$. The first-order variation of $J¯$ is given by
$δJ¯=∫t0tf[Pxδx+Puδu+pT(fxδx+fuδu−δẋ)]dt+[1+P(x(tf),u(tf))+pT(tf)(f(x(tf),u(tf))−ẋ(tf))]δtf$
(5)
where $fx$ and $fu$ denote the partial derivatives of $f(x(t),u(t))$ with respect to $x(t)$ and $u(t)$, analogously, the same abbreviations apply to $Px$ and $Pu$. Note, that a variation of the final time also contributes to the change of the integral but since the equations of motion are fulfilled at t = tf, the variation of the cost functional reduces to
$δJ¯=∫t0tf[Pxδx+Puδu+pT(fxδx+fuδu−δẋ)]dt+[1+P(x(tf),u(tf))]δtf$
(6)
Performing an integration by parts of the last term of the integrand $pT(t)δẋ(t)$ yields
$δJ¯=∫t0tf[(Pu+pTfu)δu+(Px+fx+ṗT)δx] dt−pT(tf)δx(tf)+[1+P(x(tf),u(tf))]δtf$
(7)
where $δx(t0)=0$ as the initial conditions are fixed. Now we have to pay special attention to the boundary terms in $δJ¯$. Since the final states must satisfy Eq. (2), the variation of the final time and of the final states are not independent [1,23]. The relation between $δtf$ and $δx(tf)$ is illustrated in Fig. 1. The solid line shows the states terminating at tf. The dashed line shows a modified motion, caused by a variation of the controls, terminating at $tf+δtf$. Up to first order, the variation of the final states where the operation period ends is given by
$δxf=x*(tf+δtf)−x(tf)=x(tf+δtf)+δx(tf+δtf)−x(tf)=ẋ(tf)δtf+δx(tf)$
(8)
The operation period with modified controls terminates when the condition $Φ(x(tf)+δxf)=Φ(x(tf))+Φx(x(tf))δxf=sf$ is met. Since $Φ(x(tf))=sf$ and after inserting $δxf$ from Eq. (8), we obtain
$Φx(x(tf))δxf=Φx(x(tf))(ẋ(tf)δtf+δx(tf))=0$
(9)
and after solving for $δtf$, we end up with the relation
$δtf=−Φx(x(tf))Φx(x(tf))ẋ(tf)δx(tf)$
(10)
where $ẋ(tf)=f(x(tf),u(tf))$. Substituting $δtf$ in Eq. (7) and collecting all variations results in
$δJ¯=∫t0tf[(Pu+pTfu)δu(t)+(Px+pTfx+ṗT)δx] dt−[pT(tf)+Φx(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))]δx(tf)$
(11)

So far, we have not made any restrictions to the adjoint variables, now we shall find it convenient to choose $p(t)$ such that all terms multiplied with $δx$ vanish. Thus, the adjoint variables are defined by the linear final value problem

$ṗ(t)=−PxT−fxTp(t)$
(12a)
$p(tf)=−ΦxT(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))$
(12b)

which can be solved backward in time after the system equations have been solved for $x(t)$ yielding the time variant Jacobi matrices $fx(x(t),u(t))$ and $Px(x(t),u(t))$. Proceeding in this way, the first variation of the cost functional reduces to
$δJ¯=∫t0tf(Pu+pTfu)δu(t)dt$
(13)
Hence, we observe that the largest local decrease of $δJ¯$ is obtained if $δu(t)$ is selected in the opposite direction of the term in the bracket, i.e., if we set
$δu(t)=−ε[PuT(x(t),u(t))+fuT(x(t),u(t))p(t)]$
(14)

where ε must be a sufficiently small positive number.

## 3 Transformation to a Coordinate With Fixed Boundaries

A problem with the presented adjoint method arises because the modified control $u*(t)=u(t)+δu(t)$ also causes a change of the time interval $[t0,tf]$ to $[t0,tf*]$. If $tf*, which should be the case for a control variation of the form like in Eq. (14), the improved control $u*(t)$ can be truncated at $t=tf*$ although the information about $u*(t)$ is available also for $tf*. However, if ε was selected too large, the terminal condition may be eventually met at a later time instance $tf*>tf$. In this case, we have no information about $u*(t)$ for $tf.

Hence, for time-optimal control problems, it is convenient to transform the time coordinate into a coordinate within a fixed interval. For instance, with the left side of the terminal condition in Eq. (2), we could introduce a new coordinate s instead of t by setting
$s(t):=Φ(x(t))$
(15)

which ranges between the fixed boundaries $s0=Φ(x(t0))$ and $sf=Φ(x(tf))$. Based on this transformation, two solution strategies concerning the adjoint gradient method are pursued in this paper: (a) a hybrid formulation, where the state and adjoint equations are solved in the original time domain and only the controls and the gradient are defined in the fixed interval, and (b) the complete transformation of the time coordinate to a fixed interval. Both methods are discussed in detail below.

### 3.1 The Adjoint Method in a Hybrid Domain.

In many cases, it is difficult to realize a complete transformation of the time coordinate. If we are using a multibody simulation software, this would require a massive intervention into the software architecture. Hence, it would be preferable to solve the equations of motion in the time domain and to transform the gradient formula in Eq. (13) only. At a first glance, it seems that only the integration variable t has to be substituted by $s=Φ(x(t))$. However, Fig. 2 illustrates that the change of the integration variable has also an impact on the δ-operation describing the control variation. Let $u*(t)=u(t)+δu(t)$ be a perturbed control in terms of the original time. Since a control variation $δu(t)$ also causes a state variation $δx(t)$, we will also observe a variation of the state dependent coordinate $s(t)=Φ(x(t))$, which is up to first order given by
$δs(t)=Φx(x(t))δx(t)$
(16)
Setting $s*(t)=s(t)+δs(t)$, the control variation in terms of the time t reads
$δu(t)=u*(s*(t))−u(s(t))=u(s*(t))+δu(s*(t))−u(s(t))$
(17)
Expanding $u(s*(t))=u(s(t)+δs(t))$ and $δu(s(t)+δs(t))$ about s in a linear Taylor series we obtain
$δu(t)=u(s(t))+dudsδs(t)+δu(s(t))−u(s(t))=δu(s(t))+dudsδs(t)$
(18)
After inserting Eq. (16) for $δs(t)$ we end up with the relation
$δu(t)=δu(s(t))+dudsΦx(x(t))δx(t)$
(19)
Since we are looking for $δu(s)$ which causes the largest local decrease of the cost functional J, the variation $δu(t)$ in Eq. (11) must be expressed by this relation, yielding
$δJ¯=∫t0tf(Pu+pTfu)δu(s(t)) dt+∫t0tf[Px+pT(fx+fududsΦx(x(t)))+ṗT]δx(t) dt−[pT(tf)+ΦxT(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))]δx(tf)$
(20)

To eliminate the influence of all terms multiplied with $δx(t)$, we now define the adjoint variables in the following way:

$ṗ(t)=−PxT−(fx+fududsΦx(x(t)))Tp(t)$
(21a)
$p(tf)=−ΦxT(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))$
(21b)

Notice that we have added the term $fu(du/ds)Φx$ to the original definition of the adjoint system in Eq. (12a). However, also the new adjoint variables can be determined by solving Eq. (21) backward in time. Finally, the first variation of the cost functional reduces to
$δJ¯=∫t0tf(Pu+pTfu)δu(s(t))dt$
(22)
in which we can substitute the integration variable t by s. With
$dt=1ṡds=1Φxẋds=1Φxf(x,u)ds=g(x,u)ds$
(23)
where $g(x,u)=1/(Φxf(x,u))$, we obtain
$δJ¯=∫s0sf(Pu+pTfu)δu(s)g(x,u)ds$
(24)
Finally, the largest local decrease of the cost functional results from
$δu(s)=−ε (PuT+fuTp)g(x,u)$
(25)

where ε is a sufficiently small positive number.

With this result, we are ready to apply a gradient-based optimization procedure for finding a minimum of J, which can be summarized as follows:

1. Choose an initial guess for the control history $u(s)$ which causes the states to satisfy the terminal condition in Eq. (2).

2. Solve Eq. (1) for the state variables $x(t)$ until Eq. (2) is satisfied. During the numerical integration of the state equations, the argument s in $u(s)$ must be replaced by the original time t with Eq. (15).

3. Solve Eq. (21) for the adjoint variables $p(t)$ by a backward integration starting at t = tf.

4. Select a value for ε and compute the control variation $δu(s)$ for every s from Eq. (25) by involving the inverse function of s(t). Alternatively, a quasi-Newton method can be applied to the update formula in Eq. (25), e.g., the BFGS-method.

5. Update the control history by setting $u(s)→u(s)+δu(s)$. With the updated control $u(s)$, the cost functional should be reduced, if ε was selected sufficiently small.

6. Vary ε, resolve the state equations and evaluate the cost functional until an optimal value for ε is found, which minimizes the cost functional along the search direction. For that purpose, a line search algorithm as described in Ref. [24] can be used.

7. Replace the control history $u(s)$ by the updated control and repeat steps 2–6. Terminate the optimization procedure when Eq. (24) is sufficiently small (zero gradient).

In summary, the advantage of the presented method is that the equations of motion and also the adjoint system can be solved in the time domain and only the control variables and the gradient formula are transformed into the state space. Unfortunately, this requires the derivative of the controls with respect to s which can become infinite for bang–bang controls and could cause an undesirable excitation term in the adjoint equations in Eq. (21a). In the next paragraph, we show that this problem can be overcome by an elimination of the time coordinate also in the state equations in Eq. (1a). However, the price to pay is that a standard multibody simulation software is probably no longer applicable.

### 3.2 The Adjoint Method in s-Domain.

The function s(t) defined by Eq. (15) can be used to eliminate the time coordinate from the state equations in Eq. (1a). Using Eq. (23), we obtain
$ẋ=dxdsdsdt=x′(s)1g(x,u)=f(x,u)$
(26)
or, respectively,
$x′(s)=g(x,u)f(x,u):=f̃(x(s),u(s)), x(s0)=x0$
(27)
where $g(x,u)=1/(Φxf(x,u))$. For measuring the original time, we make use of Eq. (15) once more and include the additional differential equation
$t′(s)=dtds=g(x,u)$
(28)
which is used to determine the final time $tf=t(sf)$. Now we are looking for controlling the dynamic system such that
$J=tf+∫s0sfP(x(s),u(s))ds$
(29)
is minimized. Herein, $P(x(s),u(s))$ denotes again a penalty function to consider constraints on states and inputs. In order to derive a gradient of Eq. (29) with respect to the controls $u(s)$ we may extend J by the state equations in Eq. (27) and by (28) yielding
$J¯=tf+∫s0sf[P(x,u)+pT(f̃(x,u)−x′)+ξ(g(x,u)−t′)]ds$
(30)
in which $p(s)∈ℝn$ denotes a vector of adjoint variables and $ξ(s)$ is a single adjoint variable associated with the differential equation defining the time coordinate. The linear change of the extended cost functional is given by
$δJ¯=δtf+∫s0sf[Pxδx+Puδu+pT(f̃xδx+f̃uδu−δx′)+ξ(gxδx+guδu−δt′)]ds$
(31)
Integration by parts for the terms including $δx′$ and $δt′$ yields
$∫s0sfpTδx′ ds=−∫s0sfp′Tδx ds+pT(sf)δx(sf)∫s0sfξδt′ ds=−∫s0sfξ′δt ds+ξ(sf)δtf$
(32)
Note that the variations $δt0$ and $δx(s0)$ are zero since the initial conditions are prescribed. Inserting Eq. (32) into Eq. (31) and collecting all terms including $δx(s), δu(s)$, and $δt(s)$, yields
$δJ¯=∫s0sf[(Pu+pTf̃u+ξgu)δu+(Px+pTf̃x+p′T+ξgx)δx+ξ′δt]ds+(1−ξ(sf))δtf−pT(sf)δx(sf)$
(33)

The adjoint variables $p(s)$ and $ξ(s)$ can now be again defined such that all terms with $δx(s)$ and $δt(s)$ disappear. This is the case if $ξ′(s)=0$ and $ξ(sf)=1$, i.e., $ξ(s)=1$ for $s∈[s0,sf]$, and if

$p′(s)=−gxT−PxT−f̃xTp$
(34a)
$p(sf)=0$
(34b)

This final value problem may be solved for $p(s)$. Finally, the first variation of the cost functional reduces to
$δJ¯=∫s0sf(Pu+pTf̃u+gu)δu(s)ds$
(35)
showing the direct influence of $δu(s)$ on J. Finally, an update of the control signals is computed from Eq. (35), by choosing
$δu(s):=−ε(PuT+f̃uTp+guT)$
(36)

in which ε is again a small positive number determining the size of the control update.

Finally, we summarize the main steps of the algorithm for the s-domain:

1. Select a control history $u(s)$ in the s-interval $[s0,sf]$.

2. Solve the state equations in Eq. (27) for $x(s)$ and the differential equation in Eq. (28) for t(s), simultaneously.

3. Solve the adjoint differential equations in Eq. (34) for the adjoint variables $p(s)$ backward in s.

4. Compute the control update $δu(s)$ for every s by using Eq. (36). Note, instead of using the update $δu(s)$ directly, Eq. (36) can be used to apply a quasi-Newton method.

5. Improve the control history by setting $u(s)→u(s)+δu(s)$. If ε was selected sufficiently small, the updated control $u(s)$ reduces the cost functional in Eq. (29), in which the final time is given by $tf=t(sf)$.

6. Find a suitable factor ε, e.g., by using a line search algorithm.

7. Repeat steps 2–6 and stop the algorithm when $δJ¯$ in Eq. (35) is sufficiently small.

## 4 Numerical Examples

Now we consider two numerical examples in order to apply the proposed methods for computing the time-optimal control for dynamic systems. For the first example, the adjoint method in a hybrid domain is used for trajectory planning of a space flight, while in the second example the adjoint method in the s-domain is utilized for computing time-optimal vehicle maneuvers.

### 4.1 Time-Optimal Space Flight to the Sphere of Influence.

As a first example, we consider the time-optimal flight to the sphere of influence. The sphere of influence is a domain around a celestial body that indicates where the gravity of the body has a dominating effect on the dynamics of other space objects [25]. We are interested in the optimal control of a space vehicle so that we reach the sphere of influence in minimal time. We consider only planar motions described in polar coordinates r(t) and $φ(t)$, see Fig. 3.

The system equations of a space vehicle in the gravitational field of the Earth, see, e.g., Ref. [1], read
$ṙ=vrφ̇=vtrv̇r=vt2r−gρ2r2+Fmsin(u(r))v̇t=−vrvtr+Fmcos(u(r))ṁ=−FgIsp$
(37)
where m(t) denotes the mass of the vehicle, g the gravitational acceleration at sea level, and ρ the radius of the Earth. The vehicle can be maneuvered by a constant rocket thrust F in a variable direction described by the angle u(r) which is considered as our control variable. Restricting to small thrust angles, the state equations become linear in u(r) and therefore would lead to a classical bang–bang control. Nevertheless, notice that these differential equations are nonlinear in the states and control variables. Hence, no constraints/bounds on the thrust angle are introduced. Moreover, Isp denotes the specific impulse of the rocket propellant. The system equations have the form of Eq. (1a) if we introduce $x=(r,φ,vr,vt,m)T$ as the vector of state variables. At t = t0, the spacecraft is in a circular orbit of the Earth with known parameters. The maneuver under consideration is finished if r(t) is equal to the radius $rfinal$ of the sphere of influence of the Earth. Hence, our terminal condition reads
$Φ(x(tf))=x1(tf)=r(tf)=rfinal$
(38)
which defines tf implicitly. We want to compute u(r) such that the flight time to the sphere of influence is minimal. Hence, neither involving a penalty function nor a control constraint, the cost functional is simply given by
$J=∫t0tf1dt$
(39)

i.e., the length of the time interval $tf−t0$.

For finding the optimal control, we use the hybrid method described in Sec. 3.1 and determine the control u as a function of r, which corresponds to the coordinate s. The adjoint equations can be directly derived from Eq. (21) and read
$ṗ1=p2vtr2+p3[vt2r2−2gρ2r3−dudrFmcos(u(r))]−p4[vrvtr2−dudrFmsin(u(r))]ṗ2=0ṗ3=−p1+p4vtrṗ4=−p2r−2p3vtr+p4vrrṗ5=p3Fm2sin(u(r))+p4Fm2cos(u(r))$
(40)
with the initial values $p1(tf)=−1/vr(tf)$ and $p2(tf)=p3(tf)=p4(tf)=p5(tf)=0$. The formula for the control update, given by Eq. (25), yields
$δu(r)=−εFm(r)vr(r)[p3(r)cos(u(r))−p4(r)sin(u(r))]$
(41)

For the numerical example, the following parameters are used: The specific impulse $4.21×104 s$, the thrust force $F=1033 kN$, the radius of the Earth $ρ=6375 km$, the gravitational acceleration $g=9.8106×10−3 km/s2$, and the radius of the sphere of influence $rfinal=9.25×105 km$. Initially, the space vehicle is in a circular Earth orbit with altitude $h=300 km$. Hence, the initial conditions at $t=t0=0$ are given by: $r(0)=ρ+h, φ(0)=0, vr(0)=0, vt(0)=gρ2/(ρ+h)$ and $m(0)=115 t$. We consider two different initial controls, which are both far away from the optimal solution: (a) u(r)=0 and (b) $u(r)=r π/(2rfinal)$. In the first case, the thrust always acts in tangential direction. In the latter case, the thrust acts tangentially at the beginning and radially at the end, i.e., the sphere of influence is crossed in normal direction. The controls u(r) are discretized by 500 data points in time. Notice that a traditional gradient computation by direct differentiation would require 500 additional forward solutions of the state equations, whereas with our method the gradient is obtained already from one solution of the adjoint system equations. Figure 4 depicts the decrease of the final time with an increasing number of quasi-Newton iterations for both initial controls. The final time $1.8106×104 s$ for the initial control (a) and $1.5704×104 s$ for the initial control (b) are reduced to $tf=1.3187×104 s$ after the optimization process.

The time-optimal control angles u(r) computed from the initial controls (a) and (b) are shown in Fig. 5. The state histories r(t) and $φ(t)$ for the space vehicle are summarized in Fig. 6 and the flight trajectory of the space vehicle is plotted in Fig. 7. They are almost identical for both initial controls. The spacecraft moves nearly on a straight flight path. Notice that for the initial control (a) some small oscillations remain in the control u(r) at the end of the maneuver. The reason for this numerical artifact is that the final value $u(rfinal)$ is never updated by the gradient formula in Eq. (41) due to the zero values for p3 and p4 at t = tf and, hence, the control keeps its original value u =0. Clearly, the final control value has no influence on the flight time. For the initial control (a), the algorithm converges to a step function at $r=rfinal$ which requires a higher number of iterations. However, as it can be seen from the convergence plot in Fig. 4, the cost functional no longer decreases already after 20 iterations. This can be explained by the fact that the control history close to t = tf has only small influence on the cost functional. Hence, the oscillations in u(r) do not change the flight time and the flight path very much. However, Fig. 4 shows that choosing the initial control (b), for which the final value at t = tf coincides with the expected value $π/2$, increases the convergence rate and the oscillations at the end of the optimal control history in Fig. 5 vanish.

### 4.2 Time-Optimal Control of a Racing Car.

As a second example, we consider a mechanical model of a car for which accelerating, braking, and steering should be determined such that the time for a vehicle maneuver becomes minimal.

#### 4.2.1 Equations of Motion and Problem Formulation.

To describe the physical behavior of the vehicle, we use a single-track model and a Pacejka tire model for the road to wheel contact, see Fig. 8. The position of the vehicle is described by the absolute coordinates x and y of its center of gravity (CoG) and by the orientation angle ψ with respect to the x-axis. Moreover, m is the total mass of the vehicle and J denotes its moment of inertia. Both wheels have the effective radius R and the moment of inertia Jw about their rotation axis. The lengths a and b are the distances measured from the center of gravity to the rear axle and to the front axle. The control inputs of the model are the steering angle u1 of the front axis and the driving/braking torque u2 on the rear axis.

Since, for the Pacejka tire model, the tire forces $F1,X, F1,Y, F2,X$, and $F2,Y$ must be described in the body-fixed frame of the chassis, it is convenient to formulate the equations of motion with respect to body-fixed coordinates, as proposed in Refs. [6] and [26].

If VX and VY denote the components of the absolute velocity of the CoG with respect to the body fixed frame, the equations of motion read
$ẋ=VX cos(ψ)−VY sin(ψ)ẏ=VX sin(ψ)+VY cos(ψ)ψ̇=ωψV̇X=1m(mVYωψ+F1,X cos(u1)+F2,X−F1,Y sin(u1))V̇Y=1m(−mVXωψ+F1,Y cos(u1)+F2,Y+F1,X sin(u1))ω̇ψ=1J(F1,Ya cos(u1)−F2,Yb+F1,Xa sin(u1))ω̇1=−1JwF1,X Rω̇2=1Jw(u2−F2,X R)$
(42)

where $ωψ$ is the angular velocity $ψ̇$ of the chassis about the vertical and ω1 and ω2 denote the angular velocities of the wheels.

The determination of the tire forces $F1,X, F1,Y, F2,X$, and $F2,Y$ is the major part of the equations of motion. However, with respect to the computational effort, we use a simplified Pacejka tire model proposed in Ref. [6], where the camber angle of the tire is neglected. With the lateral and longitudinal slips
$α1=u1−arctan(VY+ωψaVX)κ1=ω1R−VX cos(u1)−(VY+ωψa)sin(u1)VX cos(u1)+(VY+ωψa)sin(u1)α2=−arctan(VY+ωψbVX)κ2=ω2R−VXVX$
(43)
for the front and the rear axis, the tire forces are given by
$F1,X=G ba+bfX(κ1) gX(κ1,α1) F1,Y=G ba+bfY(α1) gY(α1,κ1)F2,X=G aa+bfX(κ2) gX(κ2,α2) F2,Y=G aa+bfY(α2) gY(α2,κ2)$
(44)
in which G denotes the gravitational force on the vehicle. The Pacejka formulas read
$fi(x):=μi sin(ciarctan(bix−ai[bix−arctan(bix)]))gi(x,y):=cos(kiarctan(diy cos(arctan(eix))))$
(45)

with $i=X,Y$. The numerical values for the tire parameters ai, bi, ci, μi, ki, di, and ei are shown in Table 1.

To formulate the terminal condition and the penalty function, forcing the car to stay within the boundaries of the road, we introduce the curvilinear coordinates (s, r), which describe the vehicle position by the distance s along the road centerline and the deviation r measured normal to the centerline. The transformation from s and r to the Cartesian coordinates x and y is depicted in Fig. 9 and given by
$x(s,r)=xc(s)+yc′(s)ry(s,r)=yc(s)−xc′(s)r$
(46)

where $xc(s)$ and $yc(s)$ are the coordinates of the centerline in arc length description and primes denote derivatives with respect to s. Using, for example, a Newton scheme, s and r can be expressed by x and y from Eq. (46).

With the curvilinear coordinate s, the terminal condition, expressing that the vehicle crosses the finish line, is readily formulated as s = sf, where sf is the arc length coordinate of the finish line. Moreover, the penalty function for the road boundaries can be described by the lateral coordinate r as follows:
$P1(r):={0for |r|
(47)

Here, $r¯=w−h$, where w is the half road width and h is a small thickness parameter to activate the penalty function before the track constraint is violated. The penalty function is zero if $|r| and increases rapidly if $|r|>r¯$. Note, that $P1(r)$ is quadratic in r if $r¯<|r|, whereas it increases linearly if $|r|>r*$. Hence, $P1(r)$ is a once continuously differentiable function which is necessary to avoid jumps in the adjoint equations.

A similar penalty function is also used to constrain the driving torque u2
$P2(u2):={0for |u2|
(48)

Here, $u¯2$ is the maximum driving torque and $u2*$ separates the domains of quadratic and linear increase.

Both penalty functions are weighted and combined to one resulting penalty function $P(r,u2)=α1P1(r)+α2P2(u2)$. The cost functional to be minimized can then be defined by
$J=∫t0tf(1+P(r(t),u2(t))) dt$
(49)

where tf is the time to drive the vehicle through the road track.

#### 4.2.2 Transformation to the s-Domain.

For the solution of the problem posed above, we use the adjoint method in s-domain as described in Sec. 3.2. For that purpose, the equations of motion are first transformed from time to s-domain, where s is the arc length along the road centerline which has been introduced in Sec. 4.2.1. Notice that the boundaries for s are fixed, since we know s = s0 for the initial state and s = sf for the terminal state when the finish line is crossed.

Recall that the arc length s is given by Eq. (46) from which it can be expressed by the Cartesian coordinates x, y. The transformation of the time derivative of the state vector is given by
$dxdt=dxdsdsdt=x′(s)1g$
(50)
in which $g=1/ṡ$. The velocity $ṡ$ is obtained from differentiating Eq. (46) with respect to time
$ẋ=xc′(s)ṡ+yc″(s)ṡr+yc′(s)ṙẏ=yc′(s)ṡ−xc″(s)ṡr−xc′(s)ṙ$
(51)
Solving for $ṡ$ and $ṙ$ and using $xc′2+yc′2=1$ yields
$ṡ=−1r k(s)−1(xc′(s)ẋ+yc′(s)ẏ)$
(52a)
$ṙ=−1r k(s)−1[(rxc″(s)−yc′(s))ẋ+(xc′(s)+ryc″(s))ẏ]$
(52b)
where $k(s)=xc″(s)yc′(s)−xc′(s)yc″(s)$ denotes the signed curvature of the mean trajectory. After inserting the first two state equations from Eq. (42) for $ẋ$ and $ẏ$ in Eq. (52a), we obtain
$ṡ=1g=xc′(s)(VX cos(ψ)−VY sin(ψ))+yc′(s)(VX sin(ψ)+VY cos(ψ))1−r k(s)$
(53)

At this point, it is advantageous to replace the description of the vehicle position by (x, y) and use the curvilinear coordinates (s, r) instead. Since s is our independent variable now, we only need one state equation for r which is provided by Eq. (52b) after substituting again for $ẋ$ and $ẏ$ from the original state equations.

Finally, after applying Eq. (50) to Eq. (42) and replacing the first two equations by Eq. (52b), we end up with the following transformed state equations:
$r′=g(r k−1)[(VX cos(ψ)−VY sin(ψ))(xc″r−yc′)+(VX sin(ψ)+VY cos(ψ))(xc′+yc″r)]ψ′=gωψVX′=gm(mVYωψ+F1,X cos(u1)+F2,X−F1,Y sin(u1))VY′=gm(−mVXωψ+F1,Y cos(u1)+F2,Y+F1,X sin(u1))ωψ′=gJ(F1,Ya cos(u1)−F2,Yb+F1,Xa sin(u1))ω1′=−R gJwF1,Xω2′=gJw(u2−F2,XR)$
(54)
where $( )′$ denotes the derivative with respect to s. In the s-domain, the cost functional is readily described by
$J=tf+∫s0sfP(r(s),u2(s)) ds$
(55)
For measuring the final time, we add the differential equation
$t′(s)=g$
(56)

from which we may determine $tf=t(sf)$.

The adjoint equations associated with Eq. (54) which define the adjoint variables $p=(p1,p2,p3,p4,p5,p6,p7)T$ associated with the state variables r, ψ, VX, VY, $ωψ$, ω1, ω2 can be derived from Eq. (34a). Due to rather lengthy expressions, we refrain from writing down the equations here. The final values for $p(s)$ at s = sf are defined by Eq. (34b), i.e., by $p(sf)=0$. After solving the state equations and the associated adjoint equations, we can compute the update from Eq. (36) as follows:
$δu1(s)=−ε[−p6gRJw∂F1,X∂u1+p5gaJ(F1,X cos(u1)+∂F1,X∂u1sin(u1)−F1,Y sin(u1)+∂F1,Y∂u1cos(u1))+p4gm(F1,X cos(u1)+∂F1,X∂u1sin(u1)−F1,Y sin(u1)+∂F1,Y∂u1cos(u1))+p3gm(−F1,X sin(u1)+∂F1,X∂u1cos(u1)−F1,Y cos(u1)−∂F1,Y∂u1sin(u1))]δu2(s)=−ε[∂P∂u2+p7gJw]$
(57)

#### 4.2.3 Numerical Results.

For the numerical computations, we consider a road centerline which is composed of straights and Clothoids.3 The vehicle starts at the centerline of the track with r =0. Moreover, we prescribe the initial velocities with $VX=12 m/s$, VY = 0, $ωψ=0, ω1=VX/R$, and $ω2=VX/R$ at $s=s0=0$. The tire parameters and the vehicle parameters for the numerical computations are summarized in Tables 1 and 2.

In order to start the optimization procedure, we have to find a reasonable initial guess for the control of the vehicle. Therefore, we use a simple driver model in order to follow the road centerline and to apply suitable velocity profile. Various authors have already discussed the problem of path planning strategies [27,28], and the application of velocity profiles [27]. The goal of the first example is to identify the control of a vehicle driving through a single 90 deg curve. In Fig. 10, the convergence of the cost functional is shown. After approximately 988 quasi-Newton iterations, the total time can be reduced from $7.3742 s$ to $4.9504 s$. The optimization was stopped due to a sufficient small first-order optimality measure. The initial and the optimized trajectories are depicted in Fig. 11.

The optimal control signals, i.e., the steering angle $u1,opt$ and the drive torque $u2,opt$ are shown in Fig. 12 and compared to the initial controls $u1,ini$ and $u2,ini$. According to Pontryagin's minimum principle [29], see also Ref. [1], we expect the following cases for the optimal drive torque, which appears linear in the Hamiltonian:

$u2,opt(s):={u2,max,p7(s)<0u2,min,p7(s)>0singular,p7(s)=0$

where $u2,max≃u¯2$ and $u2,min≃−u¯2$. If the adjoint variable $p7(s)$ passes through zero, we identify a switching point of the control $u2(s)$. However, particular attention shall be paid to finite time intervals in which the adjoint variable $p7(s)$ remains zero. Such intervals are called singular intervals. In that case, the necessary condition, which minimizes the Hamiltonian, provides no information about the optimal control. Here, the maximum transmitted force of the tire is the physical bound for the optimal control, which is observed by the method itself.

To circumvent the singular interval problem, usually the following traditional strategies are pursed: On the one hand, a singular control can be computed from differentiation of the switching condition $p7=0$ and involving the state and costate equations, see Ref. [1]. On the other hand, as a numerical approach, a regularization term of the form, e.g., $ϵu2$ can be added to the cost functional. However, both strategies require either massive manipulations of the state and costate equations or an unphysical modification of the cost functional, whereas our method provides a natural approach to handle the singular case. With the presented approach, the control for a singular interval can be detected without any further modification of the optimization process. Figure 12 shows that the numerical result is in agreement with the theoretical principle.

Finally, we are looking for the time-optimal control of the single-track vehicle driving through a hairpin turn. The reduction of the cost functional over iterations is shown Fig. 13. After approximately 2254 quasi-Newton iterations, we stop the optimization procedure. Note that for industrial applications in most cases a much lower number of iterations is sufficient; however, to demonstrate Pontryagin's theory on singular intervals, a high accuracy is necessary. The total time can be reduced from $9.9182 s$ to $6.4712 s$. In Fig. 14, we have once more established the result of the preprocessed driver model and the final solution of the time-optimal control problem. The control signals are shown in the first two diagrams in Fig. 15. In the third diagram, we encounter the theory of Pontryagin, too, where the control behavior of the driving torque prevails in a similar manner as discussed before.

## 5 Conclusion

Based on the adjoint gradient computation, we have proposed a new way to solve time-optimal control problems. The novelty of the presented approach lies in the adaption of the adjoint method to a cost functional with variable integration limits. With this method, an analytical formula for the gradient is obtained, for which only one additional system of adjoint equations has to be solved. For a numerical evaluation, the gradient formula can be discretized as fine as desired without increasing the computational effort. Further benefits of the adjoint approach are summarized as follows:

• By using an efficient gradient-based approach, the solution of a two-point boundary value problem, which often requires the application of sophisticated homotopy strategies, can be avoided.

• An improved control which reduces the cost functional is obtained after every successful control update.

• The method can also be applied to more complex systems to find time-optimal controls. If standard simulation software, e.g., multibody simulation programs, is used, the gradient computation could be automatized if a module for the adjoint equations is implemented.

• Even singular time intervals, where the optimal control cannot be determined from Pontryagin's minimum principle directly, can be identified without any further modification of the method.

Both, the hybrid formulation discussed in Sec. 3.1 and the complete formulation in s-domain introduced in Sec. 3.2 were successfully applied to two examples. The convergence of the orbital example was quite fast. However, the computation of the minimum time vehicle maneuver shows a rather slow convergence near the optimum, which is expectable for gradient-based methods, in particular, if the solution converges to a bang–bang control. Nevertheless, the method converges despite a poor initial guess for the control signals.

## Footnotes

3

A Clothoid is a flat curve that is defined by the condition that the radius of curvature decreases inversely proportional to the arc length.

## Acknowledgment

The financial support by the Austrian Federal Ministry for Digital and Economic Affairs and the National Foundation for Research, Technology and Development is gratefully acknowledged. Karin Nachbagauer acknowledges support from the Austrian Science Fund (FWF): T733-N30, and from the Technical University of Munich—Institute for Advanced Study.

## References

References
1.
Kirk
,
D. E.
,
2004
,
Optimal Control Theory: An Introduction
,
Dover Publications, Inc.
,
Mineola, NY
.
2.
Sage
,
A. P.
, and
White
,
C. C.
,
1977
,
Optimum Systems Control
,
Prentice-Hall, Inc.
,
Englewood Cliffs, NJ
.
3.
Betts
,
J. T.
,
2010
,
Practical Methods for Optimal Control and Estimation Using Nonlinear Programming
,
SIAM
,
.
4.
Cossalter
,
V.
,
Lio
,
M. D.
,
Lot
,
R.
, and
Fabbri
,
L.
,
1999
, “
A General Method for the Evaluation of Vehicle Manoeuvrability With Special Emphasis on Motorcycles
,”
Veh. Syst. Dyn.
,
31
(
2
), pp.
113
135
.10.1076/vesd.31.2.113.2094
5.
Bertolazzi
,
E.
,
Biral
,
F.
, and
Lio
,
M. D.
,
2005
, “
Symbolic–Numeric Indirect Method for Solving Optimal Control Problems for Large Multibody Systems
,”
Multibody Syst. Dyn.
,
13
(
2
), pp.
233
252
.10.1007/s11044-005-3987-4
6.
Olofsson
,
B.
,
Lundahl
,
K.
,
Berntorp
,
K.
, and
Nielsen
,
L.
,
2013
, “
An Investigation of Optimal Vehicle Maneuvers for Different Road Conditions
,”
IFAC Proc. Vol.
,
46
(
21
), pp.
66
71
.10.3182/20130904-4-JP-2042.00007
7.
Lot
,
R.
, and
Biral
,
F.
,
2014
, “
A Curvilinear Abscissa Approach for the Lap Time Optimization of Racing Vehicles
,”
IFAC Proc. Vol.
,
47
(
3
), pp.
7559
7565
.10.3182/20140824-6-ZA-1003.00868
8.
Bianco
,
N. D.
,
Bertolazzi
,
E.
,
Biral
,
F.
, and
Massaro
,
M.
,
2019
, “
Comparison of Direct and Indirect Methods for Minimum Lap Time Optimal Control Problems
,”
Veh. Syst. Dyn.
,
57
(
5
), pp.
665
696
.10.1080/00423114.2018.1480048
9.
Bottasso
,
C. L.
,
Croce
,
A.
,
Ghezzi
,
L.
, and
Faure
,
P.
,
2004
, “
On the Solution of Inverse Dynamics and Trajectory Optimization Problems for Multibody Systems
,”
Multibody Syst. Dyn.
,
11
(
1
), pp.
1
22
.10.1023/B:MUBO.0000014875.66058.74
10.
Hendrikx
,
J. P. M.
,
,
T. J. J.
, and
Kriens
,
R. F. C.
,
1996
, “
Application of Optimal Control Theory to Inverse Simulation of Car Handling
,”
Veh. Syst. Dyn.
,
26
(
6
), pp.
449
461
.10.1080/00423119608969319
11.
Bobrow
,
J. E.
,
Dubowsky
,
S.
, and
Gibson
,
J. S.
,
1985
, “
Time-Optimal Control of Robotic Manipulators Along Specified Paths
,”
Int. J. Rob. Res.
,
4
(
3
), pp.
3
17
.10.1177/027836498500400301
12.
Reiter
,
A.
,
Müller
,
A.
, and
Gattringer
,
H.
,
2018
, “
On Higher Order Inverse Kinematics Methods in Time-Optimal Trajectory Planning for Kinematically Redundant Manipulators
,”
IEEE Trans. Ind. Inf.
,
14
(
4
), pp.
1681
1690
.10.1109/TII.2018.2792002
13.
Kelley
,
H. J.
,
1962
, “
,”
Optimization Techniques With Applications to Aerospace Systems
(Mathematics in Science and Engineering),
G.
Leitmann
, ed., Vol.
5
,
,
New York
, Chap. 6, pp.
206
254
.
14.
Bryson
,
A. E.
, and
Denham
,
W. F.
,
1964
, “
Optimal Programming Problems With Inequality Constraint II: Solution by Steepest Ascent
,”
AIAA J.
,
2
(
1
), pp.
25
34
.10.2514/3.2209
15.
Rao
,
S. S.
,
2009
,
Engineering Optimization
,
John Wiley & Sons
,
New York
.
16.
Bryson
,
A. E.
, and
Ho
,
Y. C.
,
1975
,
Applied Optimal Control: Optimization, Estimation and Control
,
Hemisphere
,
Washington, DC
.
17.
Bestle
,
D.
, and
Eberhard
,
P.
,
1992
, “
Analyzing and Optimizing Multibody Systems
,”
Mech. Struct. Mach.
,
20
(
1
), pp.
67
92
.10.1080/08905459208905161
18.
Haug
,
E. J.
,
Wehage
,
R. A.
, and
Mani
,
N. K.
,
1984
, “
Design Sensitivity Analysis of Large-Scale Constrained Dynamic Mechanical Systems
,”
ASME J. Mech., Trans., Autom. Des.
,
106
(
2
), pp.
156
162
.10.1115/1.3258573
19.
Steiner
,
W.
, and
Reichl
,
S.
,
2012
, “
The Optimal Control Approach to Dynamical Inverse Problems
,”
ASME J. Dyn. Syst., Meas., Control
,
134
(
2
), p.
021010
.10.1115/1.4005365
20.
Nachbagauer
,
K.
,
Oberpeilsteiner
,
S.
,
Sherif
,
K.
, and
Steiner
,
W.
,
2015
, “
The Use of the Adjoint Method for Solving Typical Optimization Problems in Multibody Dynamics
,”
ASME J. Comput. Nonlinear Dyn.
,
10
(
6
), p.
061011
.10.1115/1.4028417
21.
Schaffer
,
A. S.
,
2005
, “
On the Adjoint Formulation of Design Sensitivity Analysis of Multibody Dynamics
,” Dissertation,
University of Iowa
,
Iowa City, IA
.
22.
Ding
,
J.-Y.
,
Pan
,
Z.-K.
, and
Chen
,
L.-Q.
,
2007
, “
Second Order Adjoint Sensitivity Analysis of Multibody Systems Described by Differential-Algebraic Equations
,”
Multibody Syst. Dyn.
,
18
(
4
), pp.
599
617
.10.1007/s11044-007-9080-4
23.
Lanczos
,
C.
,
1986
,
The Variational Principles of Mechanics
,
Dover Publications, Inc.
,
Mineola, NY
.
24.
Nocedal
,
J.
, and
Wright
,
S.
,
2006
,
Numerical Optimization
,
,
New York
.
25.
Steiner
,
W.
, and
Schagerl
,
M.
,
2004
,
Raumflugmechanik
,
Springer
,
Berlin, Heidelberg
.
26.
Popp
,
K.
, and
Schiehlen
,
W.
,
2010
,
Ground Vehicle Dynamics
,
,
Berlin
.
27.
Casanova
,
D.
,
2000
, “
On Minimum Time Vehicle Manoeuvring: The Theoretical Optimal Lap
,” Dissertation,
Cranfield University
,
Cranfield, UK
.
28.
Frezza
,
R.
,
Beghi
,
A.
, and
Notarstefano
,
G.
,
2005
, “
Almost Kinematic Reducibility of a Car Model With Small Lateral Slip Angle for Control Design
,”
Proceedings of the IEEE International Symposium on Industrial Electronics (IEEE ISIE 2005)
,
Dubrovnik, Croatia
, June 20-23, Vol.
1
, pp.
343
348
.10.1109/ISIE.2005.1528934
29.
Pontryagin
,
L. S.
,
Boltyanskii
,
V.
,
Gamkrelidze
,
R.
, and
Mischchenko
,
E.
,
1962
, “
The Mathematical Theory of Optimal Processes
,”
John Wiley & Sons
,
New York
.