Abstract

Technology advancement for on-road vehicles has gained significant momentum in the past decades, particularly in the fields of vehicle automation and powertrain electrification. The optimization of powertrain controls for autonomous vehicles typically considers the vehicle’s external dynamics and powertrain dynamics separately, and one key aspect is often overlooked. This aspect, known as flexible power demand, recognizes that the powertrain control system does not have to precisely match the power requested by the vehicle motion controller at all times. Leveraging this feature adds an extra degree of freedom to the powertrain control and can lead to control designs that achieve improved fuel economy while maintaining safety and drive comfort. The present research investigates the use of an approximate dynamic programming (ADP) approach to develop a powertrain controller that takes the flexibility in power demand into account within the ADP framework. The concept of reachable sets is incorporated into the ADP framework to ensure safety, improve ride comfort, and enhance the accuracy of the optimization solution. The formulation is based on an autonomous hybrid electric vehicle, while the methodology can also be applied to other types of vehicles. It is also found that the ADP algorithm must be customized for this particular control problem to prevent convergence issues. Finally, a case study is presented to evaluate the effectiveness of flexible power demand, as addressed by the ADP method. The results demonstrate a 14.1% improvement in fuel economy compared to a scenario without flexible power demand.

1 Introduction

Transportation accounts for over 70% of the total oil consumption in the United States, and almost 65% of the U.S. transportation consumption is from passenger vehicles [2]. The need for lower fuel consumption and cleaner powertrain operation has been driving continuous development in automotive technologies, specifically in the field of powertrain electrification/hybridization [3]. Meanwhile, research on autonomous vehicles has gained major momentum recently, and a much higher percentage of autonomous/semi-autonomous vehicles is expected to be on the road in the near future [4,5]. Although each technology has its own advantages, combining powertrain electrification/hybridization and autonomy has the potential to significantly enhance the fuel efficiency of vehicles.

Hybridization is an intermediate step on the path toward full electrification [6]. A hybrid electric vehicle (HEV) is propelled by an internal combustion engine (ICE) and a battery pack that interacts with electric motors [7]. The presence of an alternative power source gives the powertrain an extra degree of freedom, and thus, the engine can be controlled to operate close to its optimal region [8]. The deficit or surplus of the engine power is taken from or stored in the battery pack. The task of optimally supplying the requested power from the power sources is called powertrain energy management or power-split management [9]. To solve this optimization problem, several methodologies have been applied in the literature, including dynamic programming (DP) [10,11], model predictive control [12,13], rule-based methods [14,15], equivalent consumption minimization strategy [16,17], Pontryagin’s minimum principle (PMP) [18–20], and reinforcement learning [21–23]. Powertrain energy management is a crucial task since poor power management can result in battery overcharge, battery depletion, and eventually poor fuel economy.

Proper control of autonomous vehicles has the potential to dramatically reduce congestion, injuries, and fuel consumption [24–26]. For example, autonomous vehicles (AVs) can keep shorter gaps to their leading vehicles, which improves traffic throughput [27,28]. Also, researchers have shown that AVs can reduce fuel consumption by eliminating stop-and-go waves in traffic [29]. Combining autonomy and powertrain hybridization results in an autonomous HEV. As shown in Fig. 1, an autonomous HEV has two levels of control: an upper-level controller and a lower-level controller. The upper-level controller is in charge of optimizing the external dynamics of the vehicle and decides how much driveline torque is needed to meet the maneuvering goals. The lower-level controller is responsible for efficiently allocating the requested driving torque between the ICE and the electric power source.

Fig. 1
Two levels of control for autonomous HEVs [30]

Most existing research studies these two levels separately: some studies focus only on the lower-level control design (powertrain energy management) [31–36], and some address solely the upper-level control design (motion tracking/coordination) [32,37–43]. Studies from the U.S. Department of Energy have shown that combining these two optimization problems can offer fuel-saving potential that cannot be achieved by powertrain optimization alone [44,45]. The underlying reason is that when these levels are solved separately, the upper-level controller cannot observe the dynamics of the powertrain, and thus, the requested driving power may not account for the powertrain’s most efficient operating condition.

The benefits of jointly optimizing these two levels have motivated researchers to solve speed control and powertrain energy management in a unified framework [46–50]. To mitigate the computational burden due to the strong coupling between the dynamics of the upper level and lower level, Refs. [46–49] proposed a hierarchical control framework to optimize (1) the vehicle’s speed profile and (2) the powertrain efficiency of the vehicle for the optimal speed profile derived in (1). Although these studies have shown promising results, the two levels are still solved separately, and the full potential of integrated optimization for fuel minimization remains unrealized. Ma et al. [50] devised a control architecture to solve the vehicle dynamics and the powertrain dynamics in one optimization problem. Nevertheless, the drawback of their methodology (forward DP) is that it depends on the system’s initial condition, and a new optimization is needed whenever the initial condition changes [51].

This paper explores a customized approach in which the two levels are still optimized separately and the lower-level controller receives the power demand from the upper-level controller. However, the lower-level controller is given an extra degree of freedom by exploiting a property unique to autonomous HEVs, i.e., flexibility in power demand. This property implies that the vehicle does not have to precisely match the power demanded by the upper-level controller in real time. The supplied power can deviate to some extent from what is required by the upper-level controller. This feature is only achievable in autonomous HEVs, where both the upper-level and lower-level controllers operate in the background and no human driver intervenes. This approach has recently been pursued by some researchers [30,52], with promising results in terms of fuel consumption reduction. However, the methodologies they used make the approach less practical for HEV power management. In Ref. [30], PMP was used to solve the powertrain control optimization. For conventional HEVs, PMP is a powerful method for addressing powertrain energy management. However, when constraints on states exist in the problem, it can be challenging to implement PMP so that they are met [53,54]. Thus, for autonomous HEVs with flexible driveline power demand wherein state constraints are applied [30], PMP can cause convergence issues. Also, Zhang et al. [52] deployed an adaptive equivalent consumption minimization strategy to optimize power splits for an automated parallel HEV with flexible power demand. However, the main drawback in Ref. [52] is that the proposed method could not satisfy the final conditions.

The research presented in this paper investigates a customized approximate dynamic programming (ADP) approach, which is used for the first time to address powertrain optimization with flexible power demand. Solving this problem using ADP is independent of the initial condition of the system and requires much less memory than conventional DP. However, directly using standard ADP to solve the optimal control problem can be challenging. The standard ADP method requires the dynamical system to have a quadratic cost function and a structure that is affine in the inputs. These conditions do not hold in the energy management of HEVs. Also, the system has nonlinear constraints on states and control inputs, which standard ADP cannot handle. Therefore, two measures are adopted to address these challenges. First, the ADP method is modified so that it can handle non-quadratic cost functions with non-affine inputs. Second, the concept of reachable sets [55] is adopted in implementing ADP to meet the nonlinear constraints on states and control inputs. Studies have shown that utilizing the concept of reachable sets can noticeably improve the accuracy and the efficiency of the optimization solution [55].

The outline of this paper is as follows. In Sec. 2, the HEV modeling is discussed. Reachable sets and the ADP framework are introduced in Sec. 3, followed by a numerical example in Sec. 4. Finally, concluding remarks are given in Sec. 5.

2 Hybrid Electric Vehicle Dynamics

2.1 Upper-Level Dynamics.

Figure 2 illustrates the free-body diagram of a vehicle moving in a straight path within an inertial frame of reference. Note that the reaction forces on each individual wheel are summed up at their mid-axles (bicycle model). Letting x represent the longitudinal displacement of the center of mass of the vehicle C, and v denote its longitudinal velocity, the external kinematics and dynamics of the vehicle can be expressed as follows:
(1)
(2)
(3)
(4)
here m, r, Af, fdrag, fR, and μR represent the vehicle’s mass, wheel radius, effective frontal area, drag force, rolling resistance force, and rolling resistance coefficient, respectively. Additionally, ρ and Cdrag correspond to the air density and coefficient of drag, respectively.
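As an illustration of how these quantities interact, the following is a minimal Python sketch of a standard point-mass longitudinal model. The force expressions used here (quadratic aerodynamic drag, constant rolling resistance on a flat road, traction force equal to driveline torque divided by wheel radius) and all numeric parameter values are assumptions made for illustration; they are not taken from this study, and the exact form of Eqs. (1)–(4) may differ.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2 (symbol introduced for this sketch)


@dataclass
class VehicleParams:
    # Placeholder values, assumed for illustration only.
    m: float = 1500.0     # vehicle mass, kg
    r: float = 0.3        # wheel radius, m
    A_f: float = 2.2      # effective frontal area, m^2
    C_drag: float = 0.3   # drag coefficient
    mu_R: float = 0.01    # rolling resistance coefficient
    rho: float = 1.2      # air density, kg/m^3


def drag_force(p: VehicleParams, v: float) -> float:
    """Assumed quadratic aerodynamic drag."""
    return 0.5 * p.rho * p.C_drag * p.A_f * v * v


def rolling_force(p: VehicleParams) -> float:
    """Assumed constant rolling resistance on a flat road."""
    return p.mu_R * p.m * G


def step(p: VehicleParams, x: float, v: float, T_d: float, dt: float):
    """One explicit Euler step of x_dot = v, m * v_dot = T_d / r - f_drag - f_R."""
    a = (T_d / p.r - drag_force(p, v) - rolling_force(p)) / p.m
    return x + dt * v, v + dt * a
```

With a driveline torque that exactly balances drag and rolling resistance, the sketch holds the velocity constant, which is a quick sanity check on the sign conventions.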
Fig. 2
Free-body diagram of the vehicle

2.2 Lower-Level Dynamics

2.2.1 Powertrain Dynamics.

In this study, a power-split hybrid powertrain is considered (Fig. 3). This mechanism, which was commercialized by Toyota, is known as the Toyota hybrid system (THS). THS consists of a battery pack, an ICE, a coupler gear set, a planetary gear set, an inverter, and two electric machines. In the literature, the electric machine that mostly provides motion power is labeled the “motor,” and the one that is expected to operate mostly as an electricity generation unit is called the “generator.” The powertrain splits the power between the ICE and the battery through the planetary gear set, which consists of three main elements: the sun, the carrier, and the ring. The engine is linked to the carrier, and the sun is connected to the generator.

Fig. 3
THS powertrain schematic

Gear G0 merges the ring’s power and the motor’s power through two identical gears, G1 and G2, to drive the driveline. Neglecting the inertia of the moving parts in the powertrain and using the power balance at the planetary and the coupler gear sets, the following algebraic equations between different components hold:

(5)
(6)
(7)
(8)
(9)
(10)
where ωe, ωr, ωg, ωm, and ωd denote the angular velocity of the engine, the ring, the generator, the motor, and the driveline, respectively. Likewise, Tg, Te, Tr, Td, and Tm represent the torque of the generator, the engine, the ring, the driveline, and the motor, respectively. Also, rr, rs, and kC denote the radius of the ring gear, the radius of the sun gear, and the final gear ratio at the coupler gear set, respectively.
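To make the role of the planetary gear set concrete, the sketch below applies the standard kinematic constraint of a planetary gear (carrier speed as the radius-weighted average of sun and ring speeds) with the engine on the carrier and the generator on the sun, as described above. The torque split assumes negligible inertia and losses and reports magnitudes only; sign conventions, function names, and the numeric radii used in the usage note are assumptions for illustration and may not match Eqs. (5)–(10).

```python
def generator_speed(omega_e: float, omega_r: float, r_r: float, r_s: float) -> float:
    """Sun (generator) speed from omega_g * r_s + omega_r * r_r = omega_e * (r_r + r_s)."""
    return (omega_e * (r_r + r_s) - omega_r * r_r) / r_s


def carrier_torque_split(T_e: float, r_r: float, r_s: float):
    """Split of the carrier (engine) torque between sun and ring,
    neglecting inertia and friction. Returns (T_sun, T_ring) magnitudes."""
    T_sun = T_e * r_s / (r_r + r_s)
    T_ring = T_e * r_r / (r_r + r_s)
    return T_sun, T_ring
```

For example, with placeholder radii `r_r = 78` and `r_s = 30`, equal engine and ring speeds imply that the generator turns at the same speed, and the mechanical power entering through the carrier equals the sum of the powers leaving through the sun and the ring.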
Lastly, considering the power balance at the inverter reveals the following algebraic equation:
(11)
where Pbatt, Pm, and Pg denote the battery power, the motor power, and the generator power, respectively. Note that the motor power and generator power can be positive (when the machine operates as a motor) or negative (when it operates as a generator). Parameters μm and μg denote the efficiency coefficients of the motor and the generator, respectively, when they operate as electricity-producing units. These coefficients vary between 0 and 1. Therefore, km and kg are equal to −1 when their corresponding electrical machines produce motion power, and equal to 1 otherwise. Also, a positive Pbatt means the battery is discharging, while a negative Pbatt means it is charging. Using Eqs. (5)–(11), Pbatt is formulated as:
(12)

2.2.2 Battery Dynamics.

The dynamics of the battery are modeled using an equivalent circuit model [56]
(13)
Here, the state of charge, open-circuit voltage, internal resistance, and capacity of the battery are represented by SoC, Vbatt, Rbatt, and Qbatt, respectively. Note that the nominal operating range of SoC is around 40–80% in charge-sustaining operations, where the initial and final values of SoC are equal. For this operating range, the battery’s parameters are almost constant [57].
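A common form of the equivalent circuit model treats the battery as an open-circuit voltage source in series with an internal resistance. The Python sketch below follows that form, under the assumptions that Pbatt is the terminal power (positive when discharging) and that Qbatt is expressed in ampere-seconds; the function names and numeric values are illustrative and not taken from this study. The square root only has a real value when Pbatt does not exceed Vbatt²/(4Rbatt), which is why the sketch rejects larger power requests.

```python
import math


def battery_current(P_batt: float, V_batt: float, R_batt: float) -> float:
    """Current solving V_batt * I - R_batt * I**2 = P_batt (smaller root)."""
    disc = V_batt ** 2 - 4.0 * R_batt * P_batt
    if disc < 0.0:
        # Requested power exceeds what the circuit can deliver.
        raise ValueError("P_batt exceeds V_batt**2 / (4 * R_batt)")
    return (V_batt - math.sqrt(disc)) / (2.0 * R_batt)


def soc_rate(P_batt: float, V_batt: float, R_batt: float, Q_batt: float) -> float:
    """Time derivative of SoC; negative while discharging, positive while charging."""
    return -battery_current(P_batt, V_batt, R_batt) / Q_batt
```

In this sketch a positive power request drains the battery, a negative one charges it, and zero power leaves the SoC unchanged.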

2.2.3 Fuel Consumption Dynamics.

The fuel consumption dynamics are generally governed by the engine’s angular velocity ωe and the engine’s torque Te as depicted by Eq. (14).
(14)
where γ: R+ × R+ → R+ is a mapping whose inputs are ωe and Te and whose output is the fuel consumption rate. R+ denotes the set of non-negative real numbers. An engine map generated from experimental data is typically employed to capture this correlation, as illustrated in Fig. 4. The solid line and the thick dashed line depict the maximum engine torque and the optimal engine torque T̄e for different values of ωe, respectively. Given the two degrees of freedom in the powertrain for power supply, it is a reasonable assumption [50,52] that for any engine power Pe(t) = Te(t)ωe(t), the corresponding solution pair (Te(t), ωe(t)) will position the engine at its optimal efficiency point on the engine map. T̄e for this point is mathematically approximated by the following equation:
Fig. 4
(15)

2.2.4 Flexible Power Demand Dynamics.

As stated before, the upper-level controller determines the torque and the power required by the vehicle based on the upper-level dynamics. Conventionally, these power and torque demands are fed to the lower-level controller. The lower-level controller determines the distribution of power among the ICE, electrical machines, and the battery pack to meet the power demand. In the majority of research studies, the driveline power demand is strictly met by the lower-level controller, i.e., the power demand is fixed. However, an autonomous HEV has a unique property that gives the lower-level controller a certain degree of flexibility in meeting the instantaneous power requested by the upper level, called flexible power demand. This flexibility in power demand, which inherently introduces a corresponding flexibility in torque demand as illustrated in Fig. 5, adds an extra degree of freedom to the powertrain control optimization, which can further improve fuel efficiency. Flexible power demand and flexible torque demand refer to the same method and are used interchangeably throughout this paper.

Fig. 5
Energy management hierarchy with flexible torque demand

Variations in the power supplied by the powertrain will lead to deviations in the anticipated acceleration determined by the upper-level controller. Consequently, there will be corresponding deviations in the expected velocity and displacement of the vehicle. However, in order to ensure that the vehicle reaches its intended destination at the end of the drive cycle, allowances for velocity and displacement deviations are made only during intermediate steps. In other words, although the longitudinal displacement, velocity, and driveline torque demand may differ from their expected values at each intermediate moment determined by the upper-level controller, the deviation in the longitudinal displacement and velocity must diminish as time approaches the end of the considered time horizon (note that the time horizon can be a short period, and the optimization can be done for each time period going forward one by one).

Denoting the flexible driveline torque, longitudinal displacement, and velocity as T̃d, x̃, and ṽ, respectively, the external dynamics of the vehicle can be rewritten to incorporate the aforementioned flexibilities as follows:
(16)
(17)
(18)
It is worth noting that x̃(0) = x(0) and ṽ(0) = v(0) are valid assumptions based on the fact that at the start of the drive cycle, the lower-level controller begins with the same initial longitudinal displacement and velocity as determined by the upper level.
Define Δx as the difference between the flexible longitudinal displacement and the longitudinal displacement obtained from the upper-level controller as follows
(19)
 Also, Δv is defined in a similar way
(20)
By taking into account Eqs. (1)–(4) and (16)–(18), the following relationships can be derived:
(21)
(22)
where ΔTd = T̃d − Td denotes the amount of flexibility in the torque demand and serves as a control input.
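The structure of the deviation dynamics can be illustrated with a short worked derivation. It assumes that the upper-level force balance takes the standard form with quadratic aerodynamic drag and constant rolling resistance, where g (the gravitational acceleration) is a symbol introduced here for illustration; the exact form of Eqs. (21) and (22) may differ. Subtracting the nominal dynamics from the flexible dynamics cancels the rolling resistance term and leaves a drag term that depends on both Δv and the nominal velocity v.

```latex
\begin{align}
  m\,\dot{v} &= \frac{T_d}{r} - \tfrac{1}{2}\rho C_{drag} A_f\, v^{2} - \mu_R m g, \\
  m\,\dot{\tilde{v}} &= \frac{\tilde{T}_d}{r} - \tfrac{1}{2}\rho C_{drag} A_f\, \tilde{v}^{2} - \mu_R m g, \\
  \Delta\dot{x} &= \dot{\tilde{x}} - \dot{x} = \tilde{v} - v = \Delta v, \\
  m\,\Delta\dot{v} &= \frac{\Delta T_d}{r} - \tfrac{1}{2}\rho C_{drag} A_f \left(\tilde{v}^{2} - v^{2}\right)
                    = \frac{\Delta T_d}{r} - \tfrac{1}{2}\rho C_{drag} A_f\, \Delta v\,(\Delta v + 2v).
\end{align}
```

The last step uses \(\tilde{v}^{2} - v^{2} = (\tilde{v} - v)(\tilde{v} + v)\) with \(\tilde{v} = v + \Delta v\).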

2.3 State Space Model.

In order to monitor the SoC of the battery and the deviations in longitudinal displacement and velocity, we define the following state vector:
(23)
Similarly, the vector of inputs is defined as follows:
(24)
 By utilizing Eq. (15), the input vector can be simplified to
(25)
 Taking into account Eqs. (12), (13), (21), and (22), the system dynamics can be summarized as
(26)
where F(·,·) is specified as follows
(27)

3 Optimal Control Formulation

Consider a general dynamical system with n states and m inputs:
(28)
where x(t) ∈ Rⁿ and u(t) ∈ Rᵐ are the state and the input vector of the system, and F(·,·) represents the governing dynamical equation of the system. To tackle the optimization problem, it is necessary to establish a cost function. The cost function Jc, in its most comprehensive form, can be defined as follows:
(29)
where tf denotes the final time, and λ(x(t),u(t)) represents the cost associated with intermediate states and inputs. It is important to note that x(tf) denotes the final state vector, and ψ: Rⁿ → R+ refers to the penalizing function, which is a design parameter chosen as a non-negative function. The purpose of this function is to ensure that the system reaches the desired terminal point xdes(tf) by penalizing state vectors that deviate significantly from xdes(tf). By introducing the discretization sample time as δt and the discrete time index as k, Eq. (28) can be discretized using the Euler method
(30)
 Likewise, Eq. (29) can be discretized as
(31)
Here, N is defined as tf/δt. In accordance with the definition of the cost function in Eq. (31), the cost-to-go Vk(x(k)) is determined as the cost from the state x(k) at time index k to the end of the time horizon, and can be expressed as
(32)
This equation straightforwardly indicates that the cost of transitioning from x(k) at time index k to x(N) is equivalent to the cost of transitioning from x(k) to x(k+1) plus the cost of transitioning from x(k+1) to x(N). It is also important to note that
(33)
Let the optimal cost-to-go Vk*(x(k)) be defined as the minimum cost of transitioning from x(k) to x(N). According to Bellman’s principle of optimality [58], Vk*(x(k)) can be expressed recursively as
(34)
where the control input for which Vk*(x(k)) is attained is called the optimal control input u*(x(k)).
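The backward recursion implied by Bellman’s principle can be sketched in tabular form as follows. This is a toy illustration on a small integer grid, not the formulation used for the HEV problem: the scalar system, the cost functions, and all names (`backward_dp`, `toy_V`, `toy_policy`) are assumptions chosen so that the recursion is easy to follow. Transitions that leave the grid are simply discarded.

```python
def backward_dp(states, inputs, step, stage_cost, terminal_cost, N):
    """Tabular Bellman recursion: V[k][x] = min_u (stage_cost + V[k+1][x_next])."""
    V = [dict() for _ in range(N + 1)]
    policy = [dict() for _ in range(N)]
    for x in states:
        V[N][x] = terminal_cost(x)          # cost-to-go at the final step
    for k in range(N - 1, -1, -1):          # proceed backward in time
        for x in states:
            best_u, best_cost = None, float("inf")
            for u in inputs:
                x_next = step(x, u)
                if x_next not in V[k + 1]:  # successor outside the grid
                    continue
                cost = stage_cost(x, u) + V[k + 1][x_next]
                if cost < best_cost:
                    best_u, best_cost = u, cost
            V[k][x] = best_cost
            policy[k][x] = best_u
    return V, policy


# Toy problem (assumed): integrator x(k+1) = x(k) + u(k) on the grid {-3, ..., 3}.
toy_V, toy_policy = backward_dp(
    states=range(-3, 4),
    inputs=(-1, 0, 1),
    step=lambda x, u: x + u,
    stage_cost=lambda x, u: u * u + abs(x),
    terminal_cost=lambda x: 10 * abs(x),
    N=3,
)
```

In this toy problem the table `toy_V[0]` gives the optimal cost from every grid point, and `toy_policy[0]` gives the first optimal input, which moves the state toward the origin.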

3.1 Standard Approximate Dynamic Programming Methods.

Assume that the dynamical system in Eq. (30) is control affine as follows:
(35)
Also, assume that the cost function is quadratic, with the intermediate cost given as follows:
(36)
where f(·) and g(·) represent the dynamics of the system, and matrices Q and R are positive semi-definite and positive definite, respectively. Since the optimal control input u*(x(k)) attains the optimal cost-to-go Vk*(x(k)), it satisfies the Bellman optimality condition, ∂Vk*(x(k))/∂u*(x(k)) = 0, and can be calculated as [59]
(37)
There are two main types of ADP methods to solve this equation: adaptive critics and single network adaptive critics (SNAC). Although both use neural networks to find the sub-optimal solution, the main difference between them is that in SNAC, a single neural network is used to approximate the derivative of the cost-to-go, i.e., ∂V*k+1(x(k+1))/∂x(k+1). In adaptive critics, two sets of neural networks are used: one to approximate the optimal cost-to-go Vk*(x(k)) and one to approximate the optimal action u*(x(k)). Both of these conventional ADP methods are fast and provide sub-optimal solutions. However, they come with some restrictions. First, they assume that the system is control affine [60–63], and this assumption is essential to solving for the optimal control inputs. Although there are alternative methods to alleviate this requirement, they typically induce other challenges. Second, the cost function needs to be a quadratic function of the states and the inputs [60–62]. Third, the conventional ADP methods usually solve problems with unconstrained inputs [60–62]. Note that in Eq. (37), the inputs found are proportional to the magnitude of ∂V*k+1/∂x(k+1), and they can be unreasonably high if not well constrained.

3.2 Customized Approximate Dynamic Programming Method.

With the restrictions explained in Sec. 3.1, standard ADP methods cannot be directly applied to the HEV power control optimization problem considered in this paper. First, the system described in Eq. (26) is highly non-affine due to the battery dynamics. Second, in order to minimize fuel consumption throughout the drive cycle, the intermediate cost is defined as follows:
(38)
Thus, the cost function is not a quadratic function of the states and inputs. Additionally, it is crucial to impose constraints on the inputs to ensure they adhere to the physical limitations of engine operation and drive comfort. Furthermore, Eq. (13) imposes a complex nonlinear constraint of states and inputs on the system, i.e., Pbatt(k) ≤ Vbatt²/(4Rbatt). To the best of the authors’ knowledge, none of the current ADP algorithms can address this type of nonlinear constraint.

To address these issues, in this study, we propose to modify the original ADP method to offer a customized ADP method for the optimization problem considered. First of all, this customized ADP method is applicable to both control affine and non-affine systems. Furthermore, it is capable of handling non-quadratic nonlinear cost functions associated with both states and inputs. Finally, it adopts a way to handle complex constraints using the concept of reachable sets.

Particularly, in this method, the approximated optimal cost-to-go V̄k*(X(k)) at each step is determined using deep neural networks (DNNs)
(39)
In this DNN, the input consists of the current state X(k), and the output is the approximation of the optimal cost-to-go V̄k*(X(k)). The architecture of the DNN used in this study, denoted as ϕk(·), is illustrated in Fig. 6. To detail this method, first, the concept of reachable sets needs to be addressed.

3.3 Reachable Set.

To address the complex constraints associated with the optimization problem, the concept of reachable sets is incorporated into the customized ADP framework. The reachable set, at each step k, is defined as the set of points for which there is at least one sequence of actions under which the system moves from the current state to a state in the desired region at the end of the time horizon. Figure 7 illustrates the concept of reachable sets for a two-dimensional dynamical system. For instance, in this figure, x(k) is a point in the reachable set at step k since there is at least one sequence of actions {u(k), …, u(N−2), u(N−1)} that can take x(k) to a point x(N) in the desired region at step N. Note that this sequence of actions may not necessarily be the optimal sequence of actions from x(k) to the end of the horizon.

Fig. 6
Architecture of the deep neural network
Fig. 7
Illustration of reachable sets

In this figure, each rectangle represents the horizon of the state space at its corresponding time index, and the shaded region within the rectangle illustrates the reachable set at that time index. To determine the reachable set at each step, we proceed backward in time.

Specifically, for the system defined in Sec. 2.3, the reachable sets are obtained as follows: for k=N, the reachable set is defined as the set containing all the desired terminal points Xdes(N)
(40)
where Xmin(N) and Xmax(N) denote the minimum and the maximum range for Xdes(N), respectively.
Given the constraints on control inputs and the constraints set by the system dynamics, i.e., X(k) ∈ 𝒳(k) and U(k) ∈ 𝒰(k), the reachable sets from step N−1 to step 1 are found following the method in Ref. [55] as
(41)
 The constraints specified in Eq. (41) are imposed to ensure that the states and inputs of the system remain within the feasible region. For instance, in the case of HEVs, the SoC must be within the range of [0,1], and the engine speed ωe should not exceed its maximum value (approximately 450 rad/s) or be negative. Furthermore, the last constraint in Eq. (41) is introduced to guarantee that SoC remains a real number.
The time-variant state grid 𝒳(k) and input grid 𝒰(k) can be defined as follows:
(42)
(43)
where (Xmin(k), Xmax(k)) and (Umin(k), Umax(k)) denote the minimum and the maximum limits for 𝒳(k) and 𝒰(k), respectively. In addition, the control input for which Vk*(X(k)) is attained is the optimal control input, U*(X(k)).
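The backward construction of the reachable sets can be sketched on a discrete grid as follows. This is a simplified illustration rather than the method of Ref. [55]: it assumes time-invariant state and input grids, a step function that maps grid points to grid points, and a hypothetical `feasible` predicate standing in for the state and input constraints (for the HEV problem, this is where a check such as the battery power limit would go).

```python
def backward_reachable_sets(state_grid, input_grid, step, terminal_set, N,
                            feasible=lambda x, u: True):
    """R[N] is the desired terminal set; R[k] collects the grid points from
    which at least one admissible input leads into R[k+1]."""
    R = [None] * (N + 1)
    R[N] = set(terminal_set)
    for k in range(N - 1, -1, -1):          # proceed backward in time
        R[k] = {
            x for x in state_grid
            if any(feasible(x, u) and step(x, u) in R[k + 1] for u in input_grid)
        }
    return R


# Toy example (assumed): integrator on an integer grid, must end at the origin.
toy_R = backward_reachable_sets(
    state_grid=range(-5, 6),
    input_grid=(-1, 0, 1),
    step=lambda x, u: x + u,
    terminal_set={0},
    N=3,
)
```

In the toy example the sets grow by one grid point on each side per backward step, so states further than k steps from the origin at time N − k are excluded from training.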
Remark 1

The minimum and maximum bounds for the time-variant state and input grids shall be selected such that the control design goals are ensured. In this study, for example, a large value for Δxmin(k) or Δxmax(k) could cause the vehicle to hit the rear or front vehicle. Even with no surrounding traffic, where no value of Δxmin(k) or Δxmax(k) would cause an accident, a large value for ΔTd,min(k) or ΔTd,max(k) would allow the controller to deviate harshly from the requested drive torque and hence could compromise drive comfort. Thus, a justifiable tolerance is one that enables the control designer to use the reachable sets to enforce the control design goals.

3.4 Approximate Dynamic Programming Training Within the Reachable Set.

Once the reachable set is identified, the customized ADP method is trained on it. Unlike the standard ADP, in which the training region is not restricted, the customized ADP used in this paper is trained only within the reachable set. In this way, it can be ensured that the identified optimal control solution always meets the constraints of the optimization problem. Specifically, only the points inside the reachable set R(k) are used to train the network ϕk(·). Once the network is trained, it can be used to approximate the cost-to-go within the desired set. The training procedure is done backward in time. At first, the network learns the optimal cost-to-go at the final step using Eq. (34) for the points inside R(N), i.e., ϕN(·) approximates VN*(·). Then, at each step k < N, h random training samples inside R(k) are chosen. Subsequently, the optimal control input U*(k) is required to determine the approximated minimum cost-to-go V̄k*(·). In the standard ADP methods [60–63], the cost-to-go is defined in a quadratic form and the system is control affine (Eq. (35)); thus, this step is solved analytically through Eq. (37). However, for the particular optimization problem considered in this paper, the cost function is not quadratic and the system is not control affine. To still be able to find the optimal control inputs under this setting, we replace the analytical solution in Eq. (37) by a numerical process shown as Steps 3 and 4 in Algorithm 1. This process compares the costs associated with different values of the control inputs within the allowable range and chooses the one with the minimum cost. Note that this numerical step does not add much computation time since the dimension of the input space is low. Once this step is done for all the sample points, the sample state points and their corresponding V̄k*(·) are used as the features and the targets to train the neural network ϕk(·), i.e., Steps 5–10 in Algorithm 1.
To compute the norm in Step 8 of Algorithm 1, the mean absolute error is used:
(44)

After completing the training process, the trained neural network is utilized to obtain the sub-optimal constrained control input sequence given the initial condition X(0). The implementation algorithm is further described in Algorithm 2.

Algorithm 1: Training neural network

1 Select a small positive number α and a sufficiently large integer IterMax;
2 for k = N : −1 : 0 do
3   Choose h different random training samples Xl(k) in the reachable set R(k), where l ∈ {1, 2, ..., h};
4   For each training sample Xl(k), find V̄k*(Xl(k)) using Eq. (44);
5   Initialize ϕk^0(·) with random parameters;
6   for i = 1 : IterMax do
7     Update the neural network ϕk^i(·) and find the parameters to approximate V̄k*(Xl(k)) using backpropagation on the entire set of training samples;
8     if ‖ϕk^i(·) − ϕk^(i−1)(·)‖ ≤ α then
9       Break;
      end
    end
10  ϕk(·) ← ϕk^i(·);
  end
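The training loop can be sketched in pure Python on a one-dimensional toy problem. Several simplifications are assumed and should be read as such: a quadratic least-squares model trained by gradient descent stands in for the DNN ϕk(·), the reachable set at each step is represented by a hypothetical interval `reach[k]`, and the stopping rule compares the mean absolute change of the model parameters with α. The sketch shows the order of operations (sample, enumerate inputs for targets, fit, move one step back); it is not the network or the problem used in this study.

```python
import random


def feats(x):
    # Quadratic features; a stand-in for the DNN (assumption).
    return (1.0, x, x * x)


def predict(w, x):
    return sum(wi * fi for wi, fi in zip(w, feats(x)))


def fit(xs, ys, alpha=1e-10, iter_max=3000, lr=0.3):
    """Gradient descent on the mean squared error; stops once the mean
    absolute parameter change is at most alpha."""
    rows = [feats(x) for x in xs]
    w, n = [0.0, 0.0, 0.0], len(xs)
    for _ in range(iter_max):
        grad = [0.0, 0.0, 0.0]
        for row, y in zip(rows, ys):
            err = sum(wi * fi for wi, fi in zip(w, row)) - y
            for j in range(3):
                grad[j] += 2.0 * err * row[j] / n
        w_new = [wi - lr * gi for wi, gi in zip(w, grad)]
        change = sum(abs(a - b) for a, b in zip(w_new, w)) / 3.0
        w = w_new
        if change <= alpha:
            break
    return w


def train_backward(N, reach, inputs, step, stage_cost, terminal_cost, dt,
                   h=40, seed=0):
    """Fit one cost-to-go model per step, backward in time, using only
    samples drawn inside the (interval) reachable set of that step."""
    rng = random.Random(seed)
    weights = [None] * (N + 1)
    for k in range(N, -1, -1):
        lo, hi = reach[k]
        xs = [rng.uniform(lo, hi) for _ in range(h)]
        if k == N:
            ys = [terminal_cost(x) for x in xs]
        else:
            nlo, nhi = reach[k + 1]
            ys = []
            for x in xs:
                # Enumerate inputs; keep successors inside the next reachable set.
                costs = [dt * stage_cost(x, u) + predict(weights[k + 1], step(x, u, dt))
                         for u in inputs if nlo <= step(x, u, dt) <= nhi]
                ys.append(min(costs))
        weights[k] = fit(xs, ys)
    return weights


# Toy problem (assumed): x(k+1) = x(k) + dt * u(k), stage cost u^2, terminal cost x^2.
toy_weights = train_backward(
    N=2, reach=[(-1.0, 1.0)] * 3, inputs=(-1.0, -0.5, 0.0, 0.5, 1.0),
    step=lambda x, u, dt: x + dt * u,
    stage_cost=lambda x, u: u * u,
    terminal_cost=lambda x: x * x, dt=0.5,
)
```

Because the zero input keeps every sample inside the interval, each sample in the toy problem has at least one admissible successor; a real implementation would need the reachable sets to guarantee this.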

Algorithm 2: Implementation

1 for k = 0 : N−1 do
2   U*(X(k)) = argmin over U(k) ∈ 𝒰(k) of (δt ṁfuel(k) + ϕk+1(X(k+1)));
3   X(k+1) = X(k) + δt F(X(k), U(k))
  end
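The forward pass of Algorithm 2 can be sketched as a greedy rollout that, at each step, picks the input minimizing the stage cost plus the approximated cost-to-go of the successor state. In the sketch below, `cost_to_go` is a hypothetical list of callables standing in for the trained networks, and the toy system and costs in the usage lines are assumptions for illustration.

```python
def rollout(x0, N, inputs, step, stage_cost, cost_to_go, dt):
    """Forward implementation: at step k choose
    u* = argmin_u dt * stage_cost(x, u) + cost_to_go[k + 1](x_next),
    then advance the state with the chosen input."""
    xs, us = [x0], []
    x = x0
    for k in range(N):
        u_best = min(
            inputs,
            key=lambda u: dt * stage_cost(x, u) + cost_to_go[k + 1](step(x, u, dt)),
        )
        x = step(x, u_best, dt)
        us.append(u_best)
        xs.append(x)
    return xs, us


# Toy usage (assumed): integrator with a hand-written cost-to-go 10 * |x|.
toy_xs, toy_us = rollout(
    x0=2, N=3, inputs=(-1, 0, 1),
    step=lambda x, u, dt: x + dt * u,
    stage_cost=lambda x, u: u * u,
    cost_to_go=[lambda x: 10 * abs(x)] * 4,
    dt=1,
)
```

In the toy usage the rollout drives the state to the origin in two steps and then holds it there with zero input.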

Remark 2

As reinforced in Algorithms 1 and 2, the proposed method is a global optimization method meaning that it solves the powertrain energy management problem for the whole drive cycle in a single simulation. Then after the end of the time horizon, a new set of control commands is optimized for the next time horizon. This process will continue repeatedly.

4 Simulation Results

In this section, the performance of the proposed controller is evaluated using a real-world data set [64]. The data set contains recorded information on the external kinematics of vehicles, such as velocity, acceleration, and headway to leading and rear vehicles. The data were collected on a 150-meter-long section of the I-35 Corridor in Austin, TX. To assess the effectiveness of the controller, a random vehicle from the data set is selected. Figure 8 displays the baseline velocity profile of the selected vehicle.

Fig. 8
Baseline velocity profile of the vehicle before applying flexible power demand method
To ensure passenger safety and drive comfort, the corresponding constraints in Eqs. (45)–(47) are enforced on Δx, Δv, and ΔTd. These constraints restrict the maximum range of flexibility in longitudinal displacement Δx and velocity Δv based on the relative distances and velocities with respect to the front and rear vehicles at each step. The flexibility in driveline torque ΔTd is constrained by the maximum torque of the engine and motor at the current speed. For simplicity, the limits of these flexibilities are kept constant [52,65]. Refer to Table 2 for these constant limits.
(45)
(46)
(47)
Remark 3

It is important to note that ΔTd(k) is one of the inputs referenced in Eqs. (24) and (25). As explained in Eq. (43), the minimum and maximum limits for the input space can vary over time. This time-varying limitation is particularly useful in segments of the drive cycle with already high torque demand. In such segments, it is advisable to reduce the deviation limit ΔTd,max to ensure that the HEV’s drive comfort is not compromised. Since the drive cycle in this study does not involve excessively high torque demand, these limits are kept constant for simplicity throughout the entire drive cycle.

Next, the penalty function ψ(X(N)) is designed to ensure that the vehicle reaches a point in the desired terminal set. As mentioned, the controller is trained using points inside the corresponding reachable set. A narrow desired terminal set is therefore unfavorable from the training perspective, because the DNN cannot be trained well on it. The desired terminal set must be broad enough while still favoring points close to the ideal terminal conditions, i.e., SoCideal(N) = SoC(0), Δxideal(N) = 0, and Δvideal(N) = 0. We define ψ(X(N)) as follows:
(48)
where
(49)
Here, x1, x2, and x3 represent Δx, Δv, and SoC, respectively. Also, xi,min(N) and xi,max(N) represent the lower and upper bounds of the desired terminal set, and xi,low and xi,high represent the limits of the region in which the cost is zero. For points outside that region, the cost increases linearly up to the constant value η.
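The shape of this terminal cost can be illustrated with a short sketch. It follows only the verbal description above (zero inside [xi,low, xi,high], then a linear rise to η at the bounds of the terminal set) and is not a transcription of Eqs. (48) and (49). Three points are assumptions: the per-state terms are summed, the cost saturates at η beyond the terminal-set bounds, and the lower limits for Δx and Δv are the negatives of the Table 2 magnitudes. The function names are hypothetical.

```python
def penalty_1d(x, x_min, x_low, x_high, x_max, eta):
    """Piecewise-linear penalty for one terminal state."""
    if x_low <= x <= x_high:
        return 0.0  # zero-cost region
    if x < x_low:
        fraction = (x_low - x) / (x_low - x_min)
    else:
        fraction = (x - x_high) / (x_max - x_high)
    # Assumption: the penalty saturates at eta beyond the terminal-set bounds.
    return eta * min(fraction, 1.0)


ETA = 2.5  # Table 2

# (x_min(N), x_low, x_high, x_max(N)) with magnitudes from Table 2.
# Negative lower limits for dx and dv are assumed.
TERMINAL_LIMITS = {
    "dx": (-2.0, -0.5, 0.5, 2.0),     # m
    "dv": (-1.5, -0.5, 0.5, 1.5),     # m/s
    "soc": (53.0, 59.0, 63.0, 67.0),  # %
}


def terminal_penalty(dx, dv, soc):
    """Assumption: the total penalty is the sum of the three 1-D terms."""
    state = {"dx": dx, "dv": dv, "soc": soc}
    return sum(
        penalty_1d(state[name], *limits, ETA)
        for name, limits in TERMINAL_LIMITS.items()
    )


print(terminal_penalty(0.0, 0.0, 60.0))  # ideal terminal point
print(terminal_penalty(0.0, 0.0, 56.0))  # SoC halfway between 59% and 53%
```

The wide zero-cost band is what keeps the terminal set broad enough for training while still steering the solution toward the ideal terminal conditions.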

The control performance of the system under the proposed control method is evaluated for the initial condition X(0) = [0 m, 0 m/s, 60%]^T. The results are depicted in Figs. 9–13. To provide a basis for comparison, the performance of the system under the baseline approach is also shown in Figs. 11–13. In the baseline optimization, the flexibility in power demand is not considered (fixed demand), and the controller’s task is to supply exactly the power required by the upper-level dynamics. Consequently, in the baseline optimization, the state space is reduced to [SoC], and the input space consists of [ωe].

Fig. 9
Displacement deviation history for the vehicle after applying flexible power demand method
Fig. 10
Velocity deviation history for the vehicle after applying flexible power demand method
Fig. 11
Battery state of charge history of the vehicle before and after applying flexible power demand method
Fig. 12
Engine power history of the vehicle before and after applying flexible power demand method
Fig. 13
Fuel consumption history of the vehicle before and after applying flexible power demand method
Figure 13 shows the fuel consumption histories for both methods. In practice, the terminal SoC may differ slightly from the desired terminal SoC. Following Ref. [66], a fuel consumption compensation method, Eq. (50), is used in this paper to account for the deviation of the final SoC from its desired value.
(50)
where Fuelcomp, Fuelact, and ΔSoC represent the compensated fuel consumption corresponding to a zero SoC deviation, the actual fuel consumption, and the final SoC deviation, respectively. Parameter κ, which converts ΔSoC into an equivalent amount of fuel, is a curve-fitting coefficient [66]. Comparing the compensated fuel consumption of the two strategies in Table 1 shows that the proposed algorithm achieves a 14.1% fuel economy improvement over the baseline (fixed demand) strategy.
Table 1

Summary of results for the drive cycle

| Method | Terminal Δx (m) | Terminal Δv (m/s) | Terminal SoC (%) | Fuelcomp (g) |
|---|---|---|---|---|
| Flexible demand | 0.20 | 0.14 | 59.45 | 10.81 |
| Fixed demand | - | - | 60.17 | 12.59 |
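The 14.1% figure can be reproduced from the compensated values in Table 1. The sketch below shows that arithmetic, together with an illustrative SoC compensation of the kind described for Eq. (50). The additive form and the sign convention (ΔSoC taken as desired minus final SoC, so that ending below the target adds fuel) are assumptions, since only the verbal description is used here, and the function names are hypothetical.

```python
KAPPA = 2.4  # g of fuel per % of SoC, from Table 2


def compensate_fuel(fuel_actual_g, soc_final, soc_desired, kappa=KAPPA):
    """Illustrative SoC compensation of the measured fuel use.

    Assumption: delta_soc = desired - final, so a final SoC below the
    target increases the compensated fuel consumption.
    """
    delta_soc = soc_desired - soc_final
    return fuel_actual_g + kappa * delta_soc


def relative_saving(fuel_baseline_g, fuel_proposed_g):
    """Percentage reduction of the proposed method relative to the baseline."""
    return 100.0 * (fuel_baseline_g - fuel_proposed_g) / fuel_baseline_g


# Compensated fuel consumption from Table 1 (g).
saving = relative_saving(fuel_baseline_g=12.59, fuel_proposed_g=10.81)
print(f"{saving:.1f}%")  # 14.1%
```

Note that the percentage is computed relative to the fixed-demand baseline, i.e., (12.59 − 10.81) / 12.59.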

5 Conclusion

Traditionally, vehicle coordination optimization and powertrain energy management have been studied separately in the context of HEVs. This paper instead introduces an approach that combines and optimizes these two levels simultaneously by leveraging a feature unique to autonomous HEVs: flexible power demand. The concept of flexible power demand acknowledges that the powertrain does not necessarily have to meet the power required by the external dynamics of an autonomous HEV at every step. By exploiting this flexibility, the proposed powertrain energy management method aims to enhance fuel economy.

The optimization problem is formulated within the framework of a customized approximate dynamic programming method, utilizing the notion of reachable sets. To evaluate the effectiveness of the proposed method, a case study is conducted using real-world data. The results demonstrate a 14.1% reduction in compensated fuel consumption compared with a conventional optimization method that employs fixed power demand. This highlights the potential of the proposed approach for achieving enhanced fuel efficiency in HEVs.

Acknowledgment

This work was partially supported by the National Science Foundation under Grant No. 1826410.

Conflict of Interest

There are no conflicts of interest.

Data Availability Statement

The authors attest that all data for this study are included in the paper.

Appendix

The environmental characteristics, vehicle specifications, and tuning parameters used for training the neural networks in this study are provided in Table 2.

Table 2

Specifications of the environment, vehicle system, and neural network

| Parameter | Value | Parameter | Value | Parameter | Value | Parameter | Value |
|---|---|---|---|---|---|---|---|
| m | 1350 kg | r | 0.28 m | rr | 0.078 m | rs | 0.030 m |
| μR | 0.007 | ρ | 1.225 kg/m³ | Cd | 0.3 | Af | 2.2 m² |
| kC | 3.9 | μg | 0.9 | μm | 0.9 | Vbatt | 202 V |
| Qbatt | 23400 A·s | Rbatt | 0.45 Ω | Δxmin(k) | −3.5 m | Δxmax(k) | 3.5 m |
| Δvmin(k) | −2.5 m/s | Δvmax(k) | 2.5 m/s | SoCmin(k) | 50% | SoCmax(k) | 70% |
| ΔTd,min(k) | −150 N·m | ΔTd,max(k) | 150 N·m | ωe,min(k) | 0 rad/s | ωe,max(k) | 450 rad/s |
| Δxmin(N) | −2.0 m | Δxmax(N) | 2.0 m | Δxlow | −0.5 m | Δxhigh | 0.5 m |
| Δvmin(N) | −1.5 m/s | Δvmax(N) | 1.5 m/s | Δvlow | −0.5 m/s | Δvhigh | 0.5 m/s |
| SoCmin(N) | 53% | SoCmax(N) | 67% | SoClow | 59% | SoChigh | 63% |
| α | 0.05 | IterMax | 15 | h | 7000 | η | 2.5 |
| κ | 2.4 g/% | | | | | | |

Footnotes

1

A preliminary version of this work was presented at a conference [1]. This journal paper differs significantly from the conference version: it presents a more detailed algorithm derivation and explanation, the adaptation of the reachable set for ADP training, and additional test results. In addition, the text has been substantially reworded to avoid repetition.

3

Properties of the vehicle system and specifications of the environment mentioned in this section can be found in Table 2.

4

The reachable set’s minimum and maximum range for the state space and input space used in this study are listed in Table 2.

5

The tuning parameters used to train the neural networks mentioned in this study are provided in Table 2.

References

1. Kargar, M., and Song, X., 2022, “Power Control Optimization for Autonomous Hybrid Electric Vehicles With Flexible Driveline Torque Demand,” 2022 American Control Conference (ACC), Atlanta, GA, IEEE, pp. 2012–2017.
2. Qu, X., Yu, Y., Zhou, M., Lin, C.-T., and Wang, X., 2020, “Jointly Dampening Traffic Oscillations and Improving Energy Consumption With Electric, Connected and Automated Vehicles: A Reinforcement Learning Based Approach,” Appl. Energy, 257, p. 114030.
3. Zhang, F., Wang, L., Coskun, S., Pang, H., Cui, Y., and Xi, J., 2020, “Energy Management Strategies for Hybrid Electric Vehicles: Review, Classification, Comparison, and Outlook,” Energies, 13(13), p. 3352.
4. Kavas-Torris, O., Cantas, M. R., Meneses Cime, K., Aksun Guvenc, B., and Guvenc, L., 2020, “The Effects of Varying Penetration Rates of L4-L5 Autonomous Vehicles on Fuel Efficiency and Mobility of Traffic Networks,” Technical Report.
5. Duarte, F., and Ratti, C., 2018, “The Impact of Autonomous Vehicles on Cities: A Review,” J. Urban Technol., 25(4), pp. 3–18.
6. Doshi, P., Kapur, D., and Iyer, R., 2017, “Hybridization-Bridge for Electrification,” 2017 IEEE Transportation Electrification Conference (ITEC-India), Pune, India, IEEE, pp. 1–5.
7. Liu, J., and Peng, H., 2008, “Modeling and Control of a Power-Split Hybrid Vehicle,” IEEE Trans. Control Syst. Technol., 16(6), pp. 1242–1251.
8. Hong, S., Kim, H., and Kim, J., 2015, “Motor Control Algorithm for an Optimal Engine Operation of Power Split Hybrid Electric Vehicle,” Int. J. Autom. Technol., 16(1), pp. 97–105.
9. Borhan, H. A., Vahidi, A., Phillips, A. M., Kuang, M. L., and Kolmanovsky, I. V., 2009, “Predictive Energy Management of a Power-Split Hybrid Electric Vehicle,” 2009 American Control Conference, St. Louis, MO, IEEE, pp. 3970–3976.
10. Wang, R., and Lukic, S. M., 2012, “Dynamic Programming Technique in Hybrid Electric Vehicle Optimization,” 2012 IEEE International Electric Vehicle Conference, Greenville, SC, IEEE, pp. 1–8.
11. Pérez, L. V., Bossio, G. R., Moitre, D., and García, G. O., 2006, “Optimization of Power Management in an Hybrid Electric Vehicle Using Dynamic Programming,” Math. Comput. Simul., 73(1–4), pp. 244–254.
12. Zeng, X., and Wang, J., 2015, “A Parallel Hybrid Electric Vehicle Energy Management Strategy Using Stochastic Model Predictive Control With Road Grade Preview,” IEEE Trans. Control Syst. Technol., 23(6), pp. 2416–2423.
13. Huang, Y., Wang, H., Khajepour, A., He, H., and Ji, J., 2017, “Model Predictive Control Power Management Strategies for HEVs: A Review,” J. Power Sources, 341, pp. 91–106.
14. Hofman, T., Steinbuch, M., Van Druten, R., and Serrarens, A., 2007, “Rule-Based Energy Management Strategies for Hybrid Vehicles,” Int. J. Electr. Hybrid Vehicles, 1(1), pp. 71–94.
15. Jalil, N., Kheir, N. A., and Salman, M., 1997, “A Rule-Based Energy Management Strategy for a Series Hybrid Vehicle,” Proceedings of the 1997 American Control Conference (Cat. No. 97CH36041), Albuquerque, NM, Vol. 1, IEEE, pp. 689–693.
16. Paganelli, G., Delprat, S., Guerra, T.-M., Rimaux, J., and Santin, J.-J., 2002, “Equivalent Consumption Minimization Strategy for Parallel Hybrid Powertrains,” Vehicular Technology Conference. IEEE 55th Vehicular Technology Conference. VTC Spring 2002 (Cat. No. 02CH37367), Birmingham, AL, Vol. 4, IEEE, pp. 2076–2081.
17. Škugor, B., Deur, J., Cipek, M., and Pavković, D., 2014, “Design of a Power-Split Hybrid Electric Vehicle Control System Utilizing a Rule-Based Controller and an Equivalent Consumption Minimization Strategy,” Proc. Inst. Mech. Eng., Part D: J. Automobile Eng., 228(6), pp. 631–648.
18. Yuan, Z., Teng, L., Fengchun, S., and Peng, H., 2013, “Comparative Study of Dynamic Programming and Pontryagin’s Minimum Principle on Energy Management for a Parallel Hybrid Electric Vehicle,” Energies, 6(4), pp. 2305–2318.
19. Jeong, J., Lee, D., Kim, N., Zheng, C., Park, Y.-I., and Cha, S. W., 2014, “Development of PMP-Based Power Management Strategy for a Parallel Hybrid Electric Bus,” Int. J. Precis. Eng. Manuf., 15(2), pp. 345–353.
20. Ahmadizadeh, P., Mashadi, B., and Lodaya, D., 2017, “Energy Management of a Dual-Mode Power-Split Powertrain Based on the Pontryagin’s Minimum Principle,” IET Intell. Transp. Syst., 11(9), pp. 561–571.
21. Lian, R., Peng, J., Wu, Y., Tan, H., and Zhang, H., 2020, “Rule-Interposing Deep Reinforcement Learning Based Energy Management Strategy for Power-Split Hybrid Electric Vehicle,” Energy, 197, p. 117297.
22. Wu, J., He, H., Peng, J., Li, Y., and Li, Z., 2018, “Continuous Reinforcement Learning of Energy Management With Deep Q Network for a Power Split Hybrid Electric Bus,” Appl. Energy, 222, pp. 799–811.
23. Yazar, O., Coskun, S., Li, L., Zhang, F., and Huang, C., 2023, “Actor-Critic TD3-Based Deep Reinforcement Learning for Energy Management Strategy of HEV,” 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Istanbul, Turkey, IEEE, pp. 1–6.
24. Fagnant, D. J., and Kockelman, K., 2015, “Preparing a Nation for Autonomous Vehicles: Opportunities, Barriers and Policy Recommendations,” Transp. Res. Part A: Policy Practice, 77, pp. 167–181.
25. Rashid, M. M., Farzaneh, F., Seyedi, M., and Jung, S., 2023, “Evaluation of Risk Injury in Pedestrians’ Head and Chest Region During Collision With an Autonomous Bus,” Int. J. Crashworthiness, 29(2), pp. 1–11.
26. Martínez-Díaz, M., and Soriguera, F., 2018, “Autonomous Vehicles: Theoretical and Practical Challenges,” Transp. Res. Procedia, 33, pp. 275–282.
27. Litman, T., 2020, “Autonomous Vehicle Implementation Predictions: Implications for Transport Planning.”
28. Ye, L., and Yamamoto, T., 2019, “Evaluating the Impact of Connected and Autonomous Vehicles on Traffic Safety,” Physica A: Stat. Mech. Appl., 526, p. 121009.
29. Stern, R. E., Cui, S., Delle Monache, M. L., Bhadani, R., Bunting, M., Churchill, M., Hamilton, N., et al., 2018, “Dissipation of Stop-and-Go Waves Via Control of Autonomous Vehicles: Field Experiments,” Transp. Res. Part C: Emerg. Technol., 89, pp. 205–221.
30. Ghasemi, M., and Song, X., 2018, “Powertrain Energy Management for Autonomous Hybrid Electric Vehicles With Flexible Driveline Power Demand,” IEEE Trans. Control Syst. Technol., 27(5), pp. 2229–2236.
31. Panday, A., and Bansal, H. O., 2014, “A Review of Optimal Energy Management Strategies for Hybrid Electric Vehicle,” Int. J. Vehicular Technol., 2014(1), p. 160510.
32. Zulkefli, M. A. M., Zheng, J., Sun, Z., and Liu, H. X., 2014, “Hybrid Powertrain Optimization With Trajectory Prediction Based on Inter-Vehicle-Communication and Vehicle-Infrastructure-Integration,” Transp. Res. Part C: Emerg. Technol., 45, pp. 41–63.
33. Kim, N., Cha, S. W., and Peng, H., 2011, “Optimal Equivalent Fuel Consumption for Hybrid Electric Vehicles,” IEEE Trans. Control Syst. Technol., 20(3), pp. 817–825.
34. Kim, H., and Kum, D., 2016, “Comprehensive Design Methodology of Input- and Output-Split Hybrid Electric Vehicles: In Search of Optimal Configuration,” IEEE/ASME Trans. Mechatron., 21(6), pp. 2912–2923.
35. Kim, S. J., Kim, K.-S., and Kum, D., 2015, “Feasibility Assessment and Design Optimization of a Clutchless Multimode Parallel Hybrid Electric Powertrain,” IEEE/ASME Trans. Mechatron., 21(2), pp. 774–786.
36. Kim, N., Rousseau, A., and Lee, D., 2011, “A Jump Condition of PMP-Based Control for PHEVs,” J. Power Sources, 196(23), pp. 10380–10386.
37. Anderson, S. J., Peters, S. C., Pilutti, T. E., and Iagnemma, K., 2010, “An Optimal-Control-Based Framework for Trajectory Planning, Threat Assessment, and Semi-Autonomous Control of Passenger Vehicles in Hazard Avoidance Scenarios,” Int. J. Vehicle Autonom. Syst., 8(2–4), pp. 190–216.
38. Foderaro, G., Ferrari, S., and Wettergren, T. A., 2014, “Distributed Optimal Control for Multi-agent Trajectory Optimization,” Automatica, 50(1), pp. 149–154.
39. Jantapremjit, P., and Wilson, P. A., 2007, “Control and Guidance for Homing and Docking Tasks Using an Autonomous Underwater Vehicle,” 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, IEEE, pp. 3672–3677.
40. Ma, J., Zheng, Y., and Wang, L., 2015, “LQR-Based Optimal Topology of Leader-Following Consensus,” Int. J. Robust Nonlinear Control, 25(17), pp. 3404–3421.
41. Zhang, H., Feng, T., Yang, G.-H., and Liang, H., 2014, “Distributed Cooperative Optimal Control for Multiagent Systems on Directed Graphs: An Inverse Optimal Approach,” IEEE Trans. Cybern., 45(7), pp. 1315–1326.
42. Zhang, H., Zhang, J., Yang, G.-H., and Luo, Y., 2014, “Leader-Based Optimal Coordination Control for the Consensus Problem of Multiagent Differential Games Via Fuzzy Adaptive Dynamic Programming,” IEEE Trans. Fuzzy Syst., 23(1), pp. 152–163.
43. Yao, Q., Tian, Y., Wang, Q., and Wang, S., 2020, “Control Strategies on Path Tracking for Autonomous Vehicle: State of the Art and Future Challenges,” IEEE Access, 8, pp. 161211–161222.
44. Atkinson, C., Lewis, A., Salvia, A., and Vishwanathan, G., 2015, “Powertrain Innovations for Connected and Autonomous Vehicles,” Proceedings of Powertrain Innovations Workshop, Advanced Research Projects Agency-Energy, pp. 1–8.
45. Kargar, M., Zhang, C., and Song, X., 2022, “Integrated Optimization of Powertrain Energy Management and Vehicle Motion Control for Autonomous Hybrid Electric Vehicles,” 2022 American Control Conference (ACC), Atlanta, GA, pp. 404–409.
46. Zhang, L., Ye, X., Xia, X., and Barzegar, F., 2020, “A Real-Time Energy Management and Speed Controller for an Electric Vehicle Powered by a Hybrid Energy Storage System,” IEEE Trans. Ind. Inf., 16(10), pp. 6272–6280.
47. Wang, W., Guo, X., Yang, C., Zhang, Y., Zhao, Y., Huang, D., and Xiang, C., 2022, “A Multi-objective Optimization Energy Management Strategy for Power Split HEV Based on Velocity Prediction,” Energy, 238, p. 121714.
48. Zhao, L., Mahbub, A. I., and Malikopoulos, A. A., 2019, “Optimal Vehicle Dynamics and Powertrain Control for Connected and Automated Vehicles,” 2019 IEEE Conference on Control Technology and Applications (CCTA), Hong Kong, China, IEEE, pp. 33–38.
49. Mahbub, A., and Malikopoulos, A. A., 2019, “Concurrent Optimization of Vehicle Dynamics and Powertrain Operation Using Connectivity and Automation,” arXiv preprint arXiv:1911.03475.
50. Ma, G., Ghasemi, M., and Song, X., 2017, “Integrated Powertrain Energy Management and Vehicle Coordination for Multiple Connected Hybrid Electric Vehicles,” IEEE Trans. Vehicular Technol., 67(4), pp. 2893–2899.
51. Engbroks, L., Görke, D., Schmiedler, S., Strenkert, J., and Geringer, B., 2018, “Applying Forward Dynamic Programming to Combined Energy and Thermal Management Optimization of Hybrid Electric Vehicles,” IFAC-PapersOnLine, 51(31), pp. 383–389.
52. Zhang, F., Hu, X., Langari, R., Wang, L., Cui, Y., and Pang, H., 2021, “Adaptive Energy Management in Automated Hybrid Electric Vehicles With Flexible Torque Request,” Energy, 214, p. 118873.
53. Sánchez, M., Delprat, S., and Hofman, T., 2020, “Energy Management of Hybrid Vehicles With State Constraints: A Penalty and Implicit Hamiltonian Minimization Approach,” Appl. Energy, 260, p. 114149.
54. Serrao, L., Onori, S., and Rizzoni, G., 2009, “ECMS As a Realization of Pontryagin’s Minimum Principle for HEV Control,” 2009 American Control Conference, IEEE, pp. 3964–3969.
55. Elbert, P., Ebbesen, S., and Guzzella, L., 2012, “Implementation of Dynamic Programming for N-Dimensional Optimal Control Problems With Final State Constraints,” IEEE Trans. Control Syst. Technol., 21(3), pp. 924–931.
56. Kargar, M., Zhang, C., and Song, X., 2023, “Integrated Optimization of Power Management and Vehicle Motion Control for Autonomous Hybrid Electric Vehicles,” IEEE Trans. Vehicular Technol., 72(9), pp. 11147–11155.
57. Kim, N., Cha, S., and Peng, H., 2010, “Optimal Control of Hybrid Electric Vehicles Based on Pontryagin’s Minimum Principle,” IEEE Trans. Control Syst. Technol., 19(5), pp. 1279–1287.
58. Bellman, R., 1966, “Dynamic Programming,” Science, 153(3731), pp. 34–37.
59. Lewis, F. L., and Vrabie, D., 2009, “Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control,” IEEE Circuits Syst. Mag., 9(3), pp. 32–50.
60. Heydari, A., and Balakrishnan, S. N., 2012, “Finite-Horizon Control-Constrained Nonlinear Optimal Control Using Single Network Adaptive Critics,” IEEE Trans. Neural Netw. Learn. Syst., 24(1), pp. 145–157.
61. Li, C., Ding, J., Lewis, F. L., and Chai, T., 2021, “A Novel Adaptive Dynamic Programming Based on Tracking Error for Nonlinear Discrete-Time Systems,” Automatica, 129, p. 109687.
62. Kiumarsi, B., AlQaudi, B., Modares, H., Lewis, F. L., and Levine, D. S., 2019, “Optimal Control Using Adaptive Resonance Theory and Q-Learning,” Neurocomputing, 361, pp. 119–125.
63. Abu-Khalaf, M., and Lewis, F. L., 2005, “Nearly Optimal Control Laws for Nonlinear Systems With Saturating Actuators Using a Neural Network HJB Approach,” Automatica, 41(5), pp. 779–791.
64. Khajeh-Hosseini, M., and Talebpour, A., 2021, “A Novel Clustering Approach to Identify Vehicles Equipped With Adaptive Cruise Control in a Vehicle Trajectory Data,” 100th Annual Meeting of the Transportation Research Board of National Academies, Washington, DC.
65. Kargar, M., Sardarmehni, T., and Song, X., 2022, “Optimal Powertrain Energy Management for Autonomous Hybrid Electric Vehicles With Flexible Driveline Power Demand Using Approximate Dynamic Programming,” IEEE Trans. Vehicular Technol., 71(12), pp. 12564–12575.
66. Onori, S., Serrao, L., and Rizzoni, G., 2016, “Hybrid Electric Vehicles: Energy Management Strategies.”