Abstract

In this article, we discuss a special class of time-optimal control problems for dynamic systems, where the final state of a system lies on a hyper-surface. In time domain, this endpoint constraint may be given by a scalar equation, which we call transversality condition. It is well known that such problems can be transformed to a two-point boundary value problem, which is usually hard to solve, and requires an initial guess close to the optimal solution. Hence, we propose a new gradient-based iterative solution strategy instead, where the gradient of the cost functional, i.e., of the final time, is computed with the adjoint method. Two formulations of the adjoint method are presented in order to solve such control problems. First, we consider a hybrid approach, where the state equations and the adjoint equations are formulated in time domain but the controls and the gradient formula are transformed to a spatial variable with fixed boundaries. Second, we introduce an alternative approach, in which we carry out a complete elimination of the time coordinate and utilize a formulation in the space domain. Both approaches are robust with respect to poor initial controls and yield a shorter final time and, hence, an improved control after every iteration. The presented method is tested with two classical examples from satellite and vehicle dynamics. However, it can also be extended to more complex systems, which are used in industrial applications.

1 Introduction

Among all optimal control problems, the solution of time-optimal problems is probably the most challenging ones. In this paper, we consider a dynamic system described by a set of minimal coordinates and given initial values. We are looking for the optimal control of the system such that the time for a prescribed process or maneuver becomes a minimum.

In order to solve such problems, fundamental books on optimal control as, e.g., Refs. [1] and [2] discuss direct and indirect methods. In summary, direct methods, also addressed as nonlinear programing, transform the dynamic problem to a static problem by a parameterization of the control variables, see Ref. [3], for example. Numerous methods follow a direct approach, e.g., interior point algorithms, active set, sequential quadratic programing, pseudo-spectral method, to name but a few. Indirect methods are based on the solution of the optimality conditions stated by Pontryagin resulting in a two-point boundary value problem. Solution strategies for the boundary value problem are discussed by various authors [47] and are already well established in a wide range in engineering sciences. The boundary value problem can, e.g., be solved with shooting methods or with discretization techniques in which both states and controls are discretized. For more details to direct and indirect methods, a good overview of both is given by Bianco et al. [8]. In a previous work, Bottasso et al. [9] presented a combined indirect approach for solving inverse and trajectory optimization problems in multibody dynamics, which also corresponds to the ideas presented in Ref. [5]. The latter mentioned work deals with the time-optimal control of a vehicle using an indirect approach to solve the boundary value problem.

A first significant formulation of the time-optimal control problem of a general vehicle maneuver has been stated by Hendrikx et al. [10]. This was followed by further articles concerning motorcycle lap time simulations in Ref. [4] and car lap time simulations discussed in Refs. [6] and [7]. Basically, all the mentioned authors have solved the boundary value problems associated with the optimal control problems. Time-optimal control problems play a significant role as well in robotic applications, see, e.g., Ref. [11] for the time-optimal control problem with specified trajectory. More recently, in Ref. [12] time-optimal motions following a predefined end-effector path are considered for kinematically redundant robots as well as for nonredundant robots. Here, a multiple shooting approach is used to solve the boundary value problem.

An indirect alternative to solving the underlying boundary value problem is given by gradient methods, independently developed by Kelley [13] and by Bryson and Denham [14] already in the sixties of the past century.

Here, the basic idea is to find a control history leading to a minimum of a cost functional. Numerous strategies are available to compute the control for which a functional attains a minimum, as, e.g., the method of the steepest descent, the conjugate gradient method, the gradient projection method, the Gauss-Newton method, or quasi-Newton methods like the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm when estimating the Hessian, see, e.g., Ref. [15] for an overview and comparison of performance behavior. In any cases, the gradient of the cost functional with respect to the control history has to be computed. A control signal is often time discretized in order to obtain a finite dimensional problem. In case of a time discretized control variable, the computational efforts are extremely high due to the numerical evaluation of the gradient at every time-step. However, the effort of the gradient computation can be significantly reduced by using the adjoint method, see, e.g., the pioneering work by Bryson and Ho [16]. The basic idea of this approach is the introduction of additional adjoint variables determined by a set of adjoint differential equations from which the gradient can be computed more efficiently than, e.g., with a numerical (direct) differentiation. To determine the partial derivatives with respect to each time-discrete control parameter by means of a numerical differentiation, the entire forward system must be solved once for a variation of every discretization parameter. With the adjoint method, however, the gradient can be computed very efficiently by solving two systems of ordinary differential equations only, i.e., by the solutions of the system equations and of the so-called adjoint equations. With the gradient information, the control input can be improved iteratively converging to an optimal control. Even by choosing a suboptimal initial guess for the control inputs, in general, the proposed method converges to the (local) optimal solution.

Various authors have already utilized this method in the field of inverse dynamics of multibody systems more than 20 years ago, as, e.g., Refs. [17] and [18], but nevertheless, the topic still provides open research potential.

More recently, Refs. [19] and [20] show how the adjoint method can be applied efficiently to a multibody system described by differential-algebraic equations of index three. A piecewise adjoint method is presented in Ref. [21], in which the coordinate partitioning underlying ordinary differential equations are formulated as a boundary value problem, and are solved by multiple shooting methods. It has to be emphasized that the adjoint method can be utilized to solve even highly nonlinear systems with a vast number of degrees-of-freedom in a straight forward way, as, e.g., in the application of second-order adjoint sensitivity analysis of large multibody systems in Ref. [22], in which a comparison between the adjoint method and direct differentiation shows a significant reduction in computational costs.

In this paper, a gradient-based solution approach using the adjoint method for time-optimal control problems for dynamic systems is presented, which has not yet been published in the open literature so far. The main novelty lies in the adaption of the adjoint gradient computation to a functional with variable integration limits. For that purpose, the adjoint method is formulated for two different domain approaches concerning a transformation between the time coordinate and a coordinate within a fixed interval essential for free final time problems. On the one hand, the adjoint method in a hybrid domain gives the advantage of solving the system and adjoint equations in time domain using classical solver and step size strategies, but defining the control and the gradient in the space domain. On the other hand, the adjoint method is presented in complete state space domain, in which the state and adjoint equations, cost functional, and control undergo a total transformation to a coordinate with fixed boundaries. Both methods arise as innovative tools for the efficient computation of the gradient of a cost functional for all kinds of optimization methods. Due to the novel consistent formulation, both presented methods even allow the application of preferable quasi-Newton methods using the BFGS-update for the Hessian. Therefore, time-optimal control problems can now be considered in an efficient way concerning convergence properties.

2 The Adjoint Gradient Method in Time Domain

The basic idea of the adjoint method tracing back to Kelley [13] and Bryson and Denham [14], has been applied by various authors, as, e.g., for parameter identification and inverse dynamic problems [20]. Here, the idea of the adjoint method from the latter mentioned reference, has been used for the application for time-optimal control problems. Considering a dynamic system described by a minimal set of independent coordinates, the regarding initial value problem can be stated in the form  
ẋ=f(x(t),u(t))
(1a)
 
x(t0)=x0
(1b)
where x(t)n denotes the vector of state variables and u(t)m the vector of control inputs. The final time tf>t0 of a maneuver or an operation period is implicitly defined by a scalar terminal condition for the state variables of the form  
Φ(x(tf))=sf
(2)
where Φ(x(tf)) is a function n and sf is a given number. For example, if we consider a car driving one lap on a race track, the operation period terminates, when the center of mass of the vehicle crosses the finish line. Now, we ask for the control which minimizes the operation period tft0 without violating inequality constraints for the state and control variables. For that purpose, we introduce a cost functional of the form  
J=t0tf[1+P(x(t),u(t))]dt
(3)

where P(x(t),u(t)) is a proper penalty function which is zero if the states and control variables satisfy the constraint conditions, and increases rapidly if they are not satisfied. Moreover, P should be continuously differentiable at least once. Unless violating the constraints, the cost functional J is simply the length of the time interval tft0. The time-optimal control problem is then formulated as the problem of finding controls u(t) in the time interval t[t0,tf] which minimizes the functional J.

One strategy for finding these controls, which is pursued in this paper, is to compute the gradient of the cost functional, i.e., the direction of the steepest descent, and to walk along this direction until a (local) minimum of J is reached. That means we are asking for the variation δu(t) of a given control u(t) resulting in the largest local decrease of J. In order to compute δu(t) we first augment the cost functional by the state equations in Eq. (1a) yielding  
J¯=t0tf[1+P(x,u)+pT(f(x,u)ẋ)]dt
(4)
where p(t)n denotes the vector of adjoint variables. If the state equations are satisfied, the term multiplied with p(t) is zero and J¯ is identical with J for any choice of p(t). Let us now consider an infinitesimal variation of the controls δu(t). This results in an infinitesimal change of the states δx(t) and, hence, of the final time δtf. The first-order variation of J¯ is given by  
δJ¯=t0tf[Pxδx+Puδu+pT(fxδx+fuδuδẋ)]dt+[1+P(x(tf),u(tf))+pT(tf)(f(x(tf),u(tf))ẋ(tf))]δtf
(5)
where fx and fu denote the partial derivatives of f(x(t),u(t)) with respect to x(t) and u(t), analogously, the same abbreviations apply to Px and Pu. Note, that a variation of the final time also contributes to the change of the integral but since the equations of motion are fulfilled at t = tf, the variation of the cost functional reduces to  
δJ¯=t0tf[Pxδx+Puδu+pT(fxδx+fuδuδẋ)]dt+[1+P(x(tf),u(tf))]δtf
(6)
Performing an integration by parts of the last term of the integrand pT(t)δẋ(t) yields  
δJ¯=t0tf[(Pu+pTfu)δu+(Px+fx+ṗT)δx]dtpT(tf)δx(tf)+[1+P(x(tf),u(tf))]δtf
(7)
where δx(t0)=0 as the initial conditions are fixed. Now we have to pay special attention to the boundary terms in δJ¯. Since the final states must satisfy Eq. (2), the variation of the final time and of the final states are not independent [1,23]. The relation between δtf and δx(tf) is illustrated in Fig. 1. The solid line shows the states terminating at tf. The dashed line shows a modified motion, caused by a variation of the controls, terminating at tf+δtf. Up to first order, the variation of the final states where the operation period ends is given by  
δxf=x*(tf+δtf)x(tf)=x(tf+δtf)+δx(tf+δtf)x(tf)=ẋ(tf)δtf+δx(tf)
(8)
The operation period with modified controls terminates when the condition Φ(x(tf)+δxf)=Φ(x(tf))+Φx(x(tf))δxf=sf is met. Since Φ(x(tf))=sf and after inserting δxf from Eq. (8), we obtain  
Φx(x(tf))δxf=Φx(x(tf))(ẋ(tf)δtf+δx(tf))=0
(9)
and after solving for δtf, we end up with the relation  
δtf=Φx(x(tf))Φx(x(tf))ẋ(tf)δx(tf)
(10)
where ẋ(tf)=f(x(tf),u(tf)). Substituting δtf in Eq. (7) and collecting all variations results in  
δJ¯=t0tf[(Pu+pTfu)δu(t)+(Px+pTfx+ṗT)δx]dt[pT(tf)+Φx(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))]δx(tf)
(11)

So far, we have not made any restrictions to the adjoint variables, now we shall find it convenient to choose p(t) such that all terms multiplied with δx vanish. Thus, the adjoint variables are defined by the linear final value problem  

ṗ(t)=PxTfxTp(t)
(12a)
 
p(tf)=ΦxT(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))
(12b)

which can be solved backward in time after the system equations have been solved for x(t) yielding the time variant Jacobi matrices fx(x(t),u(t)) and Px(x(t),u(t)). Proceeding in this way, the first variation of the cost functional reduces to  
δJ¯=t0tf(Pu+pTfu)δu(t)dt
(13)
Hence, we observe that the largest local decrease of δJ¯ is obtained if δu(t) is selected in the opposite direction of the term in the bracket, i.e., if we set  
δu(t)=ε[PuT(x(t),u(t))+fuT(x(t),u(t))p(t)]
(14)

where ε must be a sufficiently small positive number.

3 Transformation to a Coordinate With Fixed Boundaries

A problem with the presented adjoint method arises because the modified control u*(t)=u(t)+δu(t) also causes a change of the time interval [t0,tf] to [t0,tf*]. If tf*<tf, which should be the case for a control variation of the form like in Eq. (14), the improved control u*(t) can be truncated at t=tf* although the information about u*(t) is available also for tf*<t<tf. However, if ε was selected too large, the terminal condition may be eventually met at a later time instance tf*>tf. In this case, we have no information about u*(t) for tf<t<tf*.

Hence, for time-optimal control problems, it is convenient to transform the time coordinate into a coordinate within a fixed interval. For instance, with the left side of the terminal condition in Eq. (2), we could introduce a new coordinate s instead of t by setting  
s(t):=Φ(x(t))
(15)

which ranges between the fixed boundaries s0=Φ(x(t0)) and sf=Φ(x(tf)). Based on this transformation, two solution strategies concerning the adjoint gradient method are pursued in this paper: (a) a hybrid formulation, where the state and adjoint equations are solved in the original time domain and only the controls and the gradient are defined in the fixed interval, and (b) the complete transformation of the time coordinate to a fixed interval. Both methods are discussed in detail below.

3.1 The Adjoint Method in a Hybrid Domain.

In many cases, it is difficult to realize a complete transformation of the time coordinate. If we are using a multibody simulation software, this would require a massive intervention into the software architecture. Hence, it would be preferable to solve the equations of motion in the time domain and to transform the gradient formula in Eq. (13) only. At a first glance, it seems that only the integration variable t has to be substituted by s=Φ(x(t)). However, Fig. 2 illustrates that the change of the integration variable has also an impact on the δ-operation describing the control variation. Let u*(t)=u(t)+δu(t) be a perturbed control in terms of the original time. Since a control variation δu(t) also causes a state variation δx(t), we will also observe a variation of the state dependent coordinate s(t)=Φ(x(t)), which is up to first order given by  
δs(t)=Φx(x(t))δx(t)
(16)
Setting s*(t)=s(t)+δs(t), the control variation in terms of the time t reads  
δu(t)=u*(s*(t))u(s(t))=u(s*(t))+δu(s*(t))u(s(t))
(17)
Expanding u(s*(t))=u(s(t)+δs(t)) and δu(s(t)+δs(t)) about s in a linear Taylor series we obtain  
δu(t)=u(s(t))+dudsδs(t)+δu(s(t))u(s(t))=δu(s(t))+dudsδs(t)
(18)
After inserting Eq. (16) for δs(t) we end up with the relation  
δu(t)=δu(s(t))+dudsΦx(x(t))δx(t)
(19)
Since we are looking for δu(s) which causes the largest local decrease of the cost functional J, the variation δu(t) in Eq. (11) must be expressed by this relation, yielding  
δJ¯=t0tf(Pu+pTfu)δu(s(t))dt+t0tf[Px+pT(fx+fududsΦx(x(t)))+ṗT]δx(t)dt[pT(tf)+ΦxT(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))]δx(tf)
(20)

To eliminate the influence of all terms multiplied with δx(t), we now define the adjoint variables in the following way:  

ṗ(t)=PxT(fx+fududsΦx(x(t)))Tp(t)
(21a)
 
p(tf)=ΦxT(x(tf))1+P(x(tf),u(tf))Φx(x(tf))f(x(tf),u(tf))
(21b)

Notice that we have added the term fu(du/ds)Φx to the original definition of the adjoint system in Eq. (12a). However, also the new adjoint variables can be determined by solving Eq. (21) backward in time. Finally, the first variation of the cost functional reduces to  
δJ¯=t0tf(Pu+pTfu)δu(s(t))dt
(22)
in which we can substitute the integration variable t by s. With  
dt=1ṡds=1Φxẋds=1Φxf(x,u)ds=g(x,u)ds
(23)
where g(x,u)=1/(Φxf(x,u)), we obtain  
δJ¯=s0sf(Pu+pTfu)δu(s)g(x,u)ds
(24)
Finally, the largest local decrease of the cost functional results from  
δu(s)=ε(PuT+fuTp)g(x,u)
(25)

where ε is a sufficiently small positive number.

With this result, we are ready to apply a gradient-based optimization procedure for finding a minimum of J, which can be summarized as follows:

  1. Choose an initial guess for the control history u(s) which causes the states to satisfy the terminal condition in Eq. (2).

  2. Solve Eq. (1) for the state variables x(t) until Eq. (2) is satisfied. During the numerical integration of the state equations, the argument s in u(s) must be replaced by the original time t with Eq. (15).

  3. Solve Eq. (21) for the adjoint variables p(t) by a backward integration starting at t = tf.

  4. Select a value for ε and compute the control variation δu(s) for every s from Eq. (25) by involving the inverse function of s(t). Alternatively, a quasi-Newton method can be applied to the update formula in Eq. (25), e.g., the BFGS-method.

  5. Update the control history by setting u(s)u(s)+δu(s). With the updated control u(s), the cost functional should be reduced, if ε was selected sufficiently small.

  6. Vary ε, resolve the state equations and evaluate the cost functional until an optimal value for ε is found, which minimizes the cost functional along the search direction. For that purpose, a line search algorithm as described in Ref. [24] can be used.

  7. Replace the control history u(s) by the updated control and repeat steps 2–6. Terminate the optimization procedure when Eq. (24) is sufficiently small (zero gradient).

In summary, the advantage of the presented method is that the equations of motion and also the adjoint system can be solved in the time domain and only the control variables and the gradient formula are transformed into the state space. Unfortunately, this requires the derivative of the controls with respect to s which can become infinite for bang–bang controls and could cause an undesirable excitation term in the adjoint equations in Eq. (21a). In the next paragraph, we show that this problem can be overcome by an elimination of the time coordinate also in the state equations in Eq. (1a). However, the price to pay is that a standard multibody simulation software is probably no longer applicable.

3.2 The Adjoint Method in s-Domain.

The function s(t) defined by Eq. (15) can be used to eliminate the time coordinate from the state equations in Eq. (1a). Using Eq. (23), we obtain  
ẋ=dxdsdsdt=x(s)1g(x,u)=f(x,u)
(26)
or, respectively,  
x(s)=g(x,u)f(x,u):=f̃(x(s),u(s)),x(s0)=x0
(27)
where g(x,u)=1/(Φxf(x,u)). For measuring the original time, we make use of Eq. (15) once more and include the additional differential equation  
t(s)=dtds=g(x,u)
(28)
which is used to determine the final time tf=t(sf). Now we are looking for controlling the dynamic system such that  
J=tf+s0sfP(x(s),u(s))ds
(29)
is minimized. Herein, P(x(s),u(s)) denotes again a penalty function to consider constraints on states and inputs. In order to derive a gradient of Eq. (29) with respect to the controls u(s) we may extend J by the state equations in Eq. (27) and by (28) yielding  
J¯=tf+s0sf[P(x,u)+pT(f̃(x,u)x)+ξ(g(x,u)t)]ds
(30)
in which p(s)n denotes a vector of adjoint variables and ξ(s) is a single adjoint variable associated with the differential equation defining the time coordinate. The linear change of the extended cost functional is given by  
δJ¯=δtf+s0sf[Pxδx+Puδu+pT(f̃xδx+f̃uδuδx)+ξ(gxδx+guδuδt)]ds
(31)
Integration by parts for the terms including δx and δt yields  
s0sfpTδxds=s0sfpTδxds+pT(sf)δx(sf)s0sfξδtds=s0sfξδtds+ξ(sf)δtf
(32)
Note that the variations δt0 and δx(s0) are zero since the initial conditions are prescribed. Inserting Eq. (32) into Eq. (31) and collecting all terms including δx(s),δu(s), and δt(s), yields  
δJ¯=s0sf[(Pu+pTf̃u+ξgu)δu+(Px+pTf̃x+pT+ξgx)δx+ξδt]ds+(1ξ(sf))δtfpT(sf)δx(sf)
(33)

The adjoint variables p(s) and ξ(s) can now be again defined such that all terms with δx(s) and δt(s) disappear. This is the case if ξ(s)=0 and ξ(sf)=1, i.e., ξ(s)=1 for s[s0,sf], and if  

p(s)=gxTPxTf̃xTp
(34a)
 
p(sf)=0
(34b)

This final value problem may be solved for p(s). Finally, the first variation of the cost functional reduces to  
δJ¯=s0sf(Pu+pTf̃u+gu)δu(s)ds
(35)
showing the direct influence of δu(s) on J. Finally, an update of the control signals is computed from Eq. (35), by choosing  
δu(s):=ε(PuT+f̃uTp+guT)
(36)

in which ε is again a small positive number determining the size of the control update.

Finally, we summarize the main steps of the algorithm for the s-domain:

  1. Select a control history u(s) in the s-interval [s0,sf].

  2. Solve the state equations in Eq. (27) for x(s) and the differential equation in Eq. (28) for t(s), simultaneously.

  3. Solve the adjoint differential equations in Eq. (34) for the adjoint variables p(s) backward in s.

  4. Compute the control update δu(s) for every s by using Eq. (36). Note, instead of using the update δu(s) directly, Eq. (36) can be used to apply a quasi-Newton method.

  5. Improve the control history by setting u(s)u(s)+δu(s). If ε was selected sufficiently small, the updated control u(s) reduces the cost functional in Eq. (29), in which the final time is given by tf=t(sf).

  6. Find a suitable factor ε, e.g., by using a line search algorithm.

  7. Repeat steps 2–6 and stop the algorithm when δJ¯ in Eq. (35) is sufficiently small.

4 Numerical Examples

Now we consider two numerical examples in order to apply the proposed methods for computing the time-optimal control for dynamic systems. For the first example, the adjoint method in a hybrid domain is used for trajectory planning of a space flight, while in the second example the adjoint method in the s-domain is utilized for computing time-optimal vehicle maneuvers.

4.1 Time-Optimal Space Flight to the Sphere of Influence.

As a first example, we consider the time-optimal flight to the sphere of influence. The sphere of influence is a domain around a celestial body that indicates where the gravity of the body has a dominating effect on the dynamics of other space objects [25]. We are interested in the optimal control of a space vehicle so that we reach the sphere of influence in minimal time. We consider only planar motions described in polar coordinates r(t) and φ(t), see Fig. 3.

The system equations of a space vehicle in the gravitational field of the Earth, see, e.g., Ref. [1], read  
ṙ=vrφ̇=vtrv̇r=vt2rgρ2r2+Fmsin(u(r))v̇t=vrvtr+Fmcos(u(r))ṁ=FgIsp
(37)
where m(t) denotes the mass of the vehicle, g the gravitational acceleration at sea level, and ρ the radius of the Earth. The vehicle can be maneuvered by a constant rocket thrust F in a variable direction described by the angle u(r) which is considered as our control variable. Restricting to small thrust angles, the state equations become linear in u(r) and therefore would lead to a classical bang–bang control. Nevertheless, notice that these differential equations are nonlinear in the states and control variables. Hence, no constraints/bounds on the thrust angle are introduced. Moreover, Isp denotes the specific impulse of the rocket propellant. The system equations have the form of Eq. (1a) if we introduce x=(r,φ,vr,vt,m)T as the vector of state variables. At t = t0, the spacecraft is in a circular orbit of the Earth with known parameters. The maneuver under consideration is finished if r(t) is equal to the radius rfinal of the sphere of influence of the Earth. Hence, our terminal condition reads  
Φ(x(tf))=x1(tf)=r(tf)=rfinal
(38)
which defines tf implicitly. We want to compute u(r) such that the flight time to the sphere of influence is minimal. Hence, neither involving a penalty function nor a control constraint, the cost functional is simply given by  
J=t0tf1dt
(39)

i.e., the length of the time interval tft0.

For finding the optimal control, we use the hybrid method described in Sec. 3.1 and determine the control u as a function of r, which corresponds to the coordinate s. The adjoint equations can be directly derived from Eq. (21) and read  
ṗ1=p2vtr2+p3[vt2r22gρ2r3dudrFmcos(u(r))]p4[vrvtr2dudrFmsin(u(r))]ṗ2=0ṗ3=p1+p4vtrṗ4=p2r2p3vtr+p4vrrṗ5=p3Fm2sin(u(r))+p4Fm2cos(u(r))
(40)
with the initial values p1(tf)=1/vr(tf) and p2(tf)=p3(tf)=p4(tf)=p5(tf)=0. The formula for the control update, given by Eq. (25), yields  
δu(r)=εFm(r)vr(r)[p3(r)cos(u(r))p4(r)sin(u(r))]
(41)

For the numerical example, the following parameters are used: The specific impulse 4.21×104s, the thrust force F=1033kN, the radius of the Earth ρ=6375km, the gravitational acceleration g=9.8106×103km/s2, and the radius of the sphere of influence rfinal=9.25×105km. Initially, the space vehicle is in a circular Earth orbit with altitude h=300km. Hence, the initial conditions at t=t0=0 are given by: r(0)=ρ+h,φ(0)=0,vr(0)=0,vt(0)=gρ2/(ρ+h) and m(0)=115t. We consider two different initial controls, which are both far away from the optimal solution: (a) u(r)=0 and (b) u(r)=rπ/(2rfinal). In the first case, the thrust always acts in tangential direction. In the latter case, the thrust acts tangentially at the beginning and radially at the end, i.e., the sphere of influence is crossed in normal direction. The controls u(r) are discretized by 500 data points in time. Notice that a traditional gradient computation by direct differentiation would require 500 additional forward solutions of the state equations, whereas with our method the gradient is obtained already from one solution of the adjoint system equations. Figure 4 depicts the decrease of the final time with an increasing number of quasi-Newton iterations for both initial controls. The final time 1.8106×104s for the initial control (a) and 1.5704×104s for the initial control (b) are reduced to tf=1.3187×104s after the optimization process.

The time-optimal control angles u(r) computed from the initial controls (a) and (b) are shown in Fig. 5. The state histories r(t) and φ(t) for the space vehicle are summarized in Fig. 6 and the flight trajectory of the space vehicle is plotted in Fig. 7. They are almost identical for both initial controls. The spacecraft moves nearly on a straight flight path. Notice that for the initial control (a) some small oscillations remain in the control u(r) at the end of the maneuver. The reason for this numerical artifact is that the final value u(rfinal) is never updated by the gradient formula in Eq. (41) due to the zero values for p3 and p4 at t = tf and, hence, the control keeps its original value u =0. Clearly, the final control value has no influence on the flight time. For the initial control (a), the algorithm converges to a step function at r=rfinal which requires a higher number of iterations. However, as it can be seen from the convergence plot in Fig. 4, the cost functional no longer decreases already after 20 iterations. This can be explained by the fact that the control history close to t = tf has only small influence on the cost functional. Hence, the oscillations in u(r) do not change the flight time and the flight path very much. However, Fig. 4 shows that choosing the initial control (b), for which the final value at t = tf coincides with the expected value π/2, increases the convergence rate and the oscillations at the end of the optimal control history in Fig. 5 vanish.

4.2 Time-Optimal Control of a Racing Car.

As a second example, we consider a mechanical model of a car for which accelerating, braking, and steering should be determined such that the time for a vehicle maneuver becomes minimal.

4.2.1 Equations of Motion and Problem Formulation.

To describe the physical behavior of the vehicle, we use a single-track model and a Pacejka tire model for the road to wheel contact, see Fig. 8. The position of the vehicle is described by the absolute coordinates x and y of its center of gravity (CoG) and by the orientation angle ψ with respect to the x-axis. Moreover, m is the total mass of the vehicle and J denotes its moment of inertia. Both wheels have the effective radius R and the moment of inertia Jw about their rotation axis. The lengths a and b are the distances measured from the center of gravity to the rear axle and to the front axle. The control inputs of the model are the steering angle u1 of the front axis and the driving/braking torque u2 on the rear axis.

Since, for the Pacejka tire model, the tire forces F1,X,F1,Y,F2,X, and F2,Y must be described in the body-fixed frame of the chassis, it is convenient to formulate the equations of motion with respect to body-fixed coordinates, as proposed in Refs. [6] and [26].

If VX and VY denote the components of the absolute velocity of the CoG with respect to the body fixed frame, the equations of motion read  
ẋ=VXcos(ψ)VYsin(ψ)ẏ=VXsin(ψ)+VYcos(ψ)ψ̇=ωψV̇X=1m(mVYωψ+F1,Xcos(u1)+F2,XF1,Ysin(u1))V̇Y=1m(mVXωψ+F1,Ycos(u1)+F2,Y+F1,Xsin(u1))ω̇ψ=1J(F1,Yacos(u1)F2,Yb+F1,Xasin(u1))ω̇1=1JwF1,XRω̇2=1Jw(u2F2,XR)
(42)

where ωψ is the angular velocity ψ̇ of the chassis about the vertical and ω1 and ω2 denote the angular velocities of the wheels.

The determination of the tire forces F1,X,F1,Y,F2,X, and F2,Y is the major part of the equations of motion. However, with respect to the computational effort, we use a simplified Pacejka tire model proposed in Ref. [6], where the camber angle of the tire is neglected. With the lateral and longitudinal slips  
α1=u1arctan(VY+ωψaVX)κ1=ω1RVXcos(u1)(VY+ωψa)sin(u1)VXcos(u1)+(VY+ωψa)sin(u1)α2=arctan(VY+ωψbVX)κ2=ω2RVXVX
(43)
for the front and the rear axis, the tire forces are given by  
F1,X=Gba+bfX(κ1)gX(κ1,α1)F1,Y=Gba+bfY(α1)gY(α1,κ1)F2,X=Gaa+bfX(κ2)gX(κ2,α2)F2,Y=Gaa+bfY(α2)gY(α2,κ2)
(44)
in which G denotes the gravitational force on the vehicle. The Pacejka formulas read  
fi(x):=μisin(ciarctan(bixai[bixarctan(bix)]))gi(x,y):=cos(kiarctan(diycos(arctan(eix))))
(45)

with i=X,Y. The numerical values for the tire parameters ai, bi, ci, μi, ki, di, and ei are shown in Table 1.

To formulate the terminal condition and the penalty function, forcing the car to stay within the boundaries of the road, we introduce the curvilinear coordinates (s, r), which describe the vehicle position by the distance s along the road centerline and the deviation r measured normal to the centerline. The transformation from s and r to the Cartesian coordinates x and y is depicted in Fig. 9 and given by  
x(s,r)=xc(s)+yc(s)ry(s,r)=yc(s)xc(s)r
(46)

where xc(s) and yc(s) are the coordinates of the centerline in arc length description and primes denote derivatives with respect to s. Using, for example, a Newton scheme, s and r can be expressed by x and y from Eq. (46).

With the curvilinear coordinate s, the terminal condition, expressing that the vehicle crosses the finish line, is readily formulated as s = sf, where sf is the arc length coordinate of the finish line. Moreover, the penalty function for the road boundaries can be described by the lateral coordinate r as follows:  
P1(r):={0for|r|<r¯12(|r|r¯)2forr¯<|r|<r*12(r*r¯)2+(r*r¯)(|r|r*)otherwise
(47)

Here, r¯=wh, where w is the half road width and h is a small thickness parameter to activate the penalty function before the track constraint is violated. The penalty function is zero if |r|<r¯ and increases rapidly if |r|>r¯. Note, that P1(r) is quadratic in r if r¯<|r|<r*, whereas it increases linearly if |r|>r*. Hence, P1(r) is a once continuously differentiable function which is necessary to avoid jumps in the adjoint equations.

A similar penalty function is also used to constrain the driving torque u2 
P2(u2):={0for|u2|<u¯212(|u2|u¯2)2foru¯2<|u2|<u2*12(u2*u¯2)2+(u2*u¯2)(|u2|u2*)otherwise
(48)

Here, u¯2 is the maximum driving torque and u2* separates the domains of quadratic and linear increase.

Both penalty functions are weighted and combined to one resulting penalty function P(r,u2)=α1P1(r)+α2P2(u2). The cost functional to be minimized can then be defined by  
J=t0tf(1+P(r(t),u2(t)))dt
(49)

where tf is the time to drive the vehicle through the road track.

4.2.2 Transformation to the s-Domain.

For the solution of the problem posed above, we use the adjoint method in s-domain as described in Sec. 3.2. For that purpose, the equations of motion are first transformed from time to s-domain, where s is the arc length along the road centerline which has been introduced in Sec. 4.2.1. Notice that the boundaries for s are fixed, since we know s = s0 for the initial state and s = sf for the terminal state when the finish line is crossed.

Recall that the arc length s is given by Eq. (46) from which it can be expressed by the Cartesian coordinates x, y. The transformation of the time derivative of the state vector is given by  
dxdt=dxdsdsdt=x(s)1g
(50)
in which g=1/ṡ. The velocity ṡ is obtained from differentiating Eq. (46) with respect to time  
ẋ=xc(s)ṡ+yc(s)ṡr+yc(s)ṙẏ=yc(s)ṡxc(s)ṡrxc(s)ṙ
(51)
Solving for ṡ and ṙ and using xc2+yc2=1 yields  
ṡ=1rk(s)1(xc(s)ẋ+yc(s)ẏ)
(52a)
 
ṙ=1rk(s)1[(rxc(s)yc(s))ẋ+(xc(s)+ryc(s))ẏ]
(52b)
where k(s)=xc(s)yc(s)xc(s)yc(s) denotes the signed curvature of the mean trajectory. After inserting the first two state equations from Eq. (42) for ẋ and ẏ in Eq. (52a), we obtain  
ṡ=1g=xc(s)(VXcos(ψ)VYsin(ψ))+yc(s)(VXsin(ψ)+VYcos(ψ))1rk(s)
(53)

At this point, it is advantageous to replace the description of the vehicle position by (x, y) and use the curvilinear coordinates (s, r) instead. Since s is our independent variable now, we only need one state equation for r which is provided by Eq. (52b) after substituting again for ẋ and ẏ from the original state equations.

Finally, after applying Eq. (50) to Eq. (42) and replacing the first two equations by Eq. (52b), we end up with the following transformed state equations:  
r=g(rk1)[(VXcos(ψ)VYsin(ψ))(xcryc)+(VXsin(ψ)+VYcos(ψ))(xc+ycr)]ψ=gωψVX=gm(mVYωψ+F1,Xcos(u1)+F2,XF1,Ysin(u1))VY=gm(mVXωψ+F1,Ycos(u1)+F2,Y+F1,Xsin(u1))ωψ=gJ(F1,Yacos(u1)F2,Yb+F1,Xasin(u1))ω1=RgJwF1,Xω2=gJw(u2F2,XR)
(54)
where () denotes the derivative with respect to s. In the s-domain, the cost functional is readily described by  
J=tf+s0sfP(r(s),u2(s))ds
(55)
For measuring the final time, we add the differential equation  
t(s)=g
(56)

from which we may determine tf=t(sf).

The adjoint equations associated with Eq. (54) which define the adjoint variables p=(p1,p2,p3,p4,p5,p6,p7)T associated with the state variables r, ψ, VX, VY, ωψ, ω1, ω2 can be derived from Eq. (34a). Due to rather lengthy expressions, we refrain from writing down the equations here. The final values for p(s) at s = sf are defined by Eq. (34b), i.e., by p(sf)=0. After solving the state equations and the associated adjoint equations, we can compute the update from Eq. (36) as follows:  
δu1(s)=ε[p6gRJwF1,Xu1+p5gaJ(F1,Xcos(u1)+F1,Xu1sin(u1)F1,Ysin(u1)+F1,Yu1cos(u1))+p4gm(F1,Xcos(u1)+F1,Xu1sin(u1)F1,Ysin(u1)+F1,Yu1cos(u1))+p3gm(F1,Xsin(u1)+F1,Xu1cos(u1)F1,Ycos(u1)F1,Yu1sin(u1))]δu2(s)=ε[Pu2+p7gJw]
(57)

4.2.3 Numerical Results.

For the numerical computations, we consider a road centerline which is composed of straights and Clothoids.3 The vehicle starts at the centerline of the track with r =0. Moreover, we prescribe the initial velocities with VX=12m/s, VY = 0, ωψ=0,ω1=VX/R, and ω2=VX/R at s=s0=0. The tire parameters and the vehicle parameters for the numerical computations are summarized in Tables 1 and 2.

In order to start the optimization procedure, we have to find a reasonable initial guess for the control of the vehicle. Therefore, we use a simple driver model in order to follow the road centerline and to apply suitable velocity profile. Various authors have already discussed the problem of path planning strategies [27,28], and the application of velocity profiles [27]. The goal of the first example is to identify the control of a vehicle driving through a single 90 deg curve. In Fig. 10, the convergence of the cost functional is shown. After approximately 988 quasi-Newton iterations, the total time can be reduced from 7.3742s to 4.9504s. The optimization was stopped due to a sufficient small first-order optimality measure. The initial and the optimized trajectories are depicted in Fig. 11.

The optimal control signals, i.e., the steering angle u1,opt and the drive torque u2,opt are shown in Fig. 12 and compared to the initial controls u1,ini and u2,ini. According to Pontryagin's minimum principle [29], see also Ref. [1], we expect the following cases for the optimal drive torque, which appears linear in the Hamiltonian:

 
u2,opt(s):={u2,max,p7(s)<0u2,min,p7(s)>0singular,p7(s)=0

where u2,maxu¯2 and u2,minu¯2. If the adjoint variable p7(s) passes through zero, we identify a switching point of the control u2(s). However, particular attention shall be paid to finite time intervals in which the adjoint variable p7(s) remains zero. Such intervals are called singular intervals. In that case, the necessary condition, which minimizes the Hamiltonian, provides no information about the optimal control. Here, the maximum transmitted force of the tire is the physical bound for the optimal control, which is observed by the method itself.

To circumvent the singular interval problem, usually the following traditional strategies are pursed: On the one hand, a singular control can be computed from differentiation of the switching condition p7=0 and involving the state and costate equations, see Ref. [1]. On the other hand, as a numerical approach, a regularization term of the form, e.g., ϵu2 can be added to the cost functional. However, both strategies require either massive manipulations of the state and costate equations or an unphysical modification of the cost functional, whereas our method provides a natural approach to handle the singular case. With the presented approach, the control for a singular interval can be detected without any further modification of the optimization process. Figure 12 shows that the numerical result is in agreement with the theoretical principle.

Finally, we are looking for the time-optimal control of the single-track vehicle driving through a hairpin turn. The reduction of the cost functional over iterations is shown Fig. 13. After approximately 2254 quasi-Newton iterations, we stop the optimization procedure. Note that for industrial applications in most cases a much lower number of iterations is sufficient; however, to demonstrate Pontryagin's theory on singular intervals, a high accuracy is necessary. The total time can be reduced from 9.9182s to 6.4712s. In Fig. 14, we have once more established the result of the preprocessed driver model and the final solution of the time-optimal control problem. The control signals are shown in the first two diagrams in Fig. 15. In the third diagram, we encounter the theory of Pontryagin, too, where the control behavior of the driving torque prevails in a similar manner as discussed before.

5 Conclusion

Based on the adjoint gradient computation, we have proposed a new way to solve time-optimal control problems. The novelty of the presented approach lies in the adaption of the adjoint method to a cost functional with variable integration limits. With this method, an analytical formula for the gradient is obtained, for which only one additional system of adjoint equations has to be solved. For a numerical evaluation, the gradient formula can be discretized as fine as desired without increasing the computational effort. Further benefits of the adjoint approach are summarized as follows:

  • By using an efficient gradient-based approach, the solution of a two-point boundary value problem, which often requires the application of sophisticated homotopy strategies, can be avoided.

  • An improved control which reduces the cost functional is obtained after every successful control update.

  • The method can also be applied to more complex systems to find time-optimal controls. If standard simulation software, e.g., multibody simulation programs, is used, the gradient computation could be automatized if a module for the adjoint equations is implemented.

  • Even singular time intervals, where the optimal control cannot be determined from Pontryagin's minimum principle directly, can be identified without any further modification of the method.

Both, the hybrid formulation discussed in Sec. 3.1 and the complete formulation in s-domain introduced in Sec. 3.2 were successfully applied to two examples. The convergence of the orbital example was quite fast. However, the computation of the minimum time vehicle maneuver shows a rather slow convergence near the optimum, which is expectable for gradient-based methods, in particular, if the solution converges to a bang–bang control. Nevertheless, the method converges despite a poor initial guess for the control signals.

Footnotes

3

A Clothoid is a flat curve that is defined by the condition that the radius of curvature decreases inversely proportional to the arc length.

Acknowledgment

The financial support by the Austrian Federal Ministry for Digital and Economic Affairs and the National Foundation for Research, Technology and Development is gratefully acknowledged. Karin Nachbagauer acknowledges support from the Austrian Science Fund (FWF): T733-N30, and from the Technical University of Munich—Institute for Advanced Study.

References

References
1.
Kirk
,
D. E.
,
2004
,
Optimal Control Theory: An Introduction
,
Dover Publications, Inc.
,
Mineola, NY
.
2.
Sage
,
A. P.
, and
White
,
C. C.
,
1977
,
Optimum Systems Control
,
Prentice-Hall, Inc.
,
Englewood Cliffs, NJ
.
3.
Betts
,
J. T.
,
2010
,
Practical Methods for Optimal Control and Estimation Using Nonlinear Programming
,
SIAM
,
Philadelphia, PA
.
4.
Cossalter
,
V.
,
Lio
,
M. D.
,
Lot
,
R.
, and
Fabbri
,
L.
,
1999
, “
A General Method for the Evaluation of Vehicle Manoeuvrability With Special Emphasis on Motorcycles
,”
Veh. Syst. Dyn.
,
31
(
2
), pp.
113
135
.10.1076/vesd.31.2.113.2094
5.
Bertolazzi
,
E.
,
Biral
,
F.
, and
Lio
,
M. D.
,
2005
, “
Symbolic–Numeric Indirect Method for Solving Optimal Control Problems for Large Multibody Systems
,”
Multibody Syst. Dyn.
,
13
(
2
), pp.
233
252
.10.1007/s11044-005-3987-4
6.
Olofsson
,
B.
,
Lundahl
,
K.
,
Berntorp
,
K.
, and
Nielsen
,
L.
,
2013
, “
An Investigation of Optimal Vehicle Maneuvers for Different Road Conditions
,”
IFAC Proc. Vol.
,
46
(
21
), pp.
66
71
.10.3182/20130904-4-JP-2042.00007
7.
Lot
,
R.
, and
Biral
,
F.
,
2014
, “
A Curvilinear Abscissa Approach for the Lap Time Optimization of Racing Vehicles
,”
IFAC Proc. Vol.
,
47
(
3
), pp.
7559
7565
.10.3182/20140824-6-ZA-1003.00868
8.
Bianco
,
N. D.
,
Bertolazzi
,
E.
,
Biral
,
F.
, and
Massaro
,
M.
,
2019
, “
Comparison of Direct and Indirect Methods for Minimum Lap Time Optimal Control Problems
,”
Veh. Syst. Dyn.
,
57
(
5
), pp.
665
696
.10.1080/00423114.2018.1480048
9.
Bottasso
,
C. L.
,
Croce
,
A.
,
Ghezzi
,
L.
, and
Faure
,
P.
,
2004
, “
On the Solution of Inverse Dynamics and Trajectory Optimization Problems for Multibody Systems
,”
Multibody Syst. Dyn.
,
11
(
1
), pp.
1
22
.10.1023/B:MUBO.0000014875.66058.74
10.
Hendrikx
,
J. P. M.
,
Meijlink
,
T. J. J.
, and
Kriens
,
R. F. C.
,
1996
, “
Application of Optimal Control Theory to Inverse Simulation of Car Handling
,”
Veh. Syst. Dyn.
,
26
(
6
), pp.
449
461
.10.1080/00423119608969319
11.
Bobrow
,
J. E.
,
Dubowsky
,
S.
, and
Gibson
,
J. S.
,
1985
, “
Time-Optimal Control of Robotic Manipulators Along Specified Paths
,”
Int. J. Rob. Res.
,
4
(
3
), pp.
3
17
.10.1177/027836498500400301
12.
Reiter
,
A.
,
Müller
,
A.
, and
Gattringer
,
H.
,
2018
, “
On Higher Order Inverse Kinematics Methods in Time-Optimal Trajectory Planning for Kinematically Redundant Manipulators
,”
IEEE Trans. Ind. Inf.
,
14
(
4
), pp.
1681
1690
.10.1109/TII.2018.2792002
13.
Kelley
,
H. J.
,
1962
, “
Method of Gradients
,”
Optimization Techniques With Applications to Aerospace Systems
(Mathematics in Science and Engineering),
G.
Leitmann
, ed., Vol.
5
,
Academic Press
,
New York
, Chap. 6, pp.
206
254
.
14.
Bryson
,
A. E.
, and
Denham
,
W. F.
,
1964
, “
Optimal Programming Problems With Inequality Constraint II: Solution by Steepest Ascent
,”
AIAA J.
,
2
(
1
), pp.
25
34
.10.2514/3.2209
15.
Rao
,
S. S.
,
2009
,
Engineering Optimization
,
John Wiley & Sons
,
New York
.
16.
Bryson
,
A. E.
, and
Ho
,
Y. C.
,
1975
,
Applied Optimal Control: Optimization, Estimation and Control
,
Hemisphere
,
Washington, DC
.
17.
Bestle
,
D.
, and
Eberhard
,
P.
,
1992
, “
Analyzing and Optimizing Multibody Systems
,”
Mech. Struct. Mach.
,
20
(
1
), pp.
67
92
.10.1080/08905459208905161
18.
Haug
,
E. J.
,
Wehage
,
R. A.
, and
Mani
,
N. K.
,
1984
, “
Design Sensitivity Analysis of Large-Scale Constrained Dynamic Mechanical Systems
,”
ASME J. Mech., Trans., Autom. Des.
,
106
(
2
), pp.
156
162
.10.1115/1.3258573
19.
Steiner
,
W.
, and
Reichl
,
S.
,
2012
, “
The Optimal Control Approach to Dynamical Inverse Problems
,”
ASME J. Dyn. Syst., Meas., Control
,
134
(
2
), p.
021010
.10.1115/1.4005365
20.
Nachbagauer
,
K.
,
Oberpeilsteiner
,
S.
,
Sherif
,
K.
, and
Steiner
,
W.
,
2015
, “
The Use of the Adjoint Method for Solving Typical Optimization Problems in Multibody Dynamics
,”
ASME J. Comput. Nonlinear Dyn.
,
10
(
6
), p.
061011
.10.1115/1.4028417
21.
Schaffer
,
A. S.
,
2005
, “
On the Adjoint Formulation of Design Sensitivity Analysis of Multibody Dynamics
,” Dissertation,
University of Iowa
,
Iowa City, IA
.
22.
Ding
,
J.-Y.
,
Pan
,
Z.-K.
, and
Chen
,
L.-Q.
,
2007
, “
Second Order Adjoint Sensitivity Analysis of Multibody Systems Described by Differential-Algebraic Equations
,”
Multibody Syst. Dyn.
,
18
(
4
), pp.
599
617
.10.1007/s11044-007-9080-4
23.
Lanczos
,
C.
,
1986
,
The Variational Principles of Mechanics
,
Dover Publications, Inc.
,
Mineola, NY
.
24.
Nocedal
,
J.
, and
Wright
,
S.
,
2006
,
Numerical Optimization
,
Springer Science & Business Media
,
New York
.
25.
Steiner
,
W.
, and
Schagerl
,
M.
,
2004
,
Raumflugmechanik
,
Springer
,
Berlin, Heidelberg
.
26.
Popp
,
K.
, and
Schiehlen
,
W.
,
2010
,
Ground Vehicle Dynamics
,
Springer Science & Business Media
,
Berlin
.
27.
Casanova
,
D.
,
2000
, “
On Minimum Time Vehicle Manoeuvring: The Theoretical Optimal Lap
,” Dissertation,
Cranfield University
,
Cranfield, UK
.
28.
Frezza
,
R.
,
Beghi
,
A.
, and
Notarstefano
,
G.
,
2005
, “
Almost Kinematic Reducibility of a Car Model With Small Lateral Slip Angle for Control Design
,”
Proceedings of the IEEE International Symposium on Industrial Electronics (IEEE ISIE 2005)
,
Dubrovnik, Croatia
, June 20-23, Vol.
1
, pp.
343
348
.10.1109/ISIE.2005.1528934
29.
Pontryagin
,
L. S.
,
Boltyanskii
,
V.
,
Gamkrelidze
,
R.
, and
Mischchenko
,
E.
,
1962
, “
The Mathematical Theory of Optimal Processes
,”
John Wiley & Sons
,
New York
.