Abstract
Technology for on-road vehicles has advanced rapidly in the past decades, particularly in the fields of vehicle automation and powertrain electrification. The optimization of powertrain controls for autonomous vehicles typically considers the vehicle’s external dynamics and powertrain dynamics separately, and one key aspect is often overlooked. This aspect, known as flexible power demand, recognizes that the powertrain control system does not have to match the power requested by the vehicle motion controller precisely at all times. Leveraging this feature adds an extra degree of freedom to the powertrain control and can lead to control designs with improved fuel economy while maintaining safety and ride comfort. The present research investigates the use of an approximate dynamic programming (ADP) approach to develop a powertrain controller that takes the flexibility in power demand into account within the ADP framework. The concept of reachable sets is incorporated into the ADP framework to ensure safety, improve ride comfort, and enhance the accuracy of the optimization solution. The formulation is based on an autonomous hybrid electric vehicle, but the methodology can also be applied to other types of vehicles. It is also found that the ADP algorithm must be customized for this particular control problem to prevent convergence issues. Finally, a case study is presented to evaluate the effectiveness of flexible power demand, as addressed by the ADP method. The case study demonstrates a 14.1% improvement in fuel economy compared to a scenario without flexible power demand.
1 Introduction
Transportation accounts for over 70% of the total oil consumption in the United States, and almost 65% of U.S. transportation consumption is from passenger vehicles [2]. The need for lower fuel consumption and cleaner powertrain operation has been driving continuous development in automotive technologies, specifically in the field of powertrain electrification/hybridization [3]. Meanwhile, research on autonomous vehicles has gained major momentum recently, and a much higher percentage of autonomous/semi-autonomous vehicles is expected to be on the road in the near future [4,5]. Although each technology has its own advantages, combining powertrain electrification/hybridization with autonomy has the potential to significantly enhance the fuel efficiency of vehicles.
Hybridization is an intermediate step on the path toward full electrification [6]. A hybrid electric vehicle (HEV) is propelled by an internal combustion engine (ICE) and a battery pack that interacts with electric motors [7]. The presence of an alternative power source gives the powertrain an extra degree of freedom, and thus, the engine can be controlled to operate close to its optimal region [8]. The deficit or surplus of the engine power is taken from or stored in the battery pack. The task of optimally supplying the requested power from the power sources is called powertrain energy management or power-split management [9]. To solve this optimization problem, several methodologies have been applied in the literature, including dynamic programming (DP) [10,11], model predictive control [12,13], rule-based methods [14,15], equivalent consumption minimization strategy [16,17], Pontryagin’s minimum principle (PMP) [18–20], and reinforcement learning [21–23]. Powertrain energy management is a crucial task, since poor power management can result in battery overcharge, battery depletion, and eventually poor fuel economy.
Proper control of autonomous vehicles has the potential to dramatically reduce congestion, injuries, and fuel consumption [24–26]. For example, autonomous vehicles (AVs) can have shorter gaps with their leading vehicles which results in improving traffic throughput [27,28]. Also, researchers have shown AVs can improve fuel consumption by eliminating the stop-and-go waves in traffic [29]. Combining autonomy and powertrain hybridization results in an autonomous HEV. As shown in Fig. 1, an autonomous HEV has two levels of control: an upper-level controller and a lower-level controller. The upper-level controller is in charge of optimizing the external dynamics of the vehicle and decides how much driveline torque is needed to meet the maneuvering goals. The lower-level controller is responsible for efficiently allocating the requested driving torque between the ICE and the electric power source.
Most existing research studies these two levels separately: some studies focus only on the lower-level control design (powertrain energy management) [31–36], and others focus solely on the upper-level control design (motion tracking/coordination) [32,37–43]. Studies from the U.S. Department of Energy have shown that combining these two optimization problems can offer fuel-saving potential that cannot be achieved by powertrain optimization alone [44,45]. The underlying reason is that when these levels are solved separately, the upper-level controller cannot observe the dynamics of the powertrain, and thus, the requested driving power may not account for the powertrain’s most efficient operating condition.
The benefits of jointly optimizing these two levels have motivated researchers to solve speed control and powertrain energy management in a unified framework [46–50]. To mitigate the computational burden caused by the strong coupling between the upper-level and lower-level dynamics, Refs. [46–49] proposed a hierarchical control framework to optimize (1) the vehicle’s speed profile and (2) the powertrain efficiency of the vehicle for the optimal speed profile derived in (1). Although these studies have shown promising results, the two levels are still solved separately, and the full fuel-saving potential of integrated optimization remains unrealized. Ma et al. [50] devised a control architecture to solve the vehicle dynamics and the powertrain dynamics in one optimization problem. Nevertheless, the drawback of their methodology (forward DP) is that it depends on the system’s initial condition, so a new optimization is needed whenever the initial condition changes [51].
This paper explores a customized approach in which the two levels are still optimized separately and the lower-level controller receives the power demand from the upper-level controller. However, the lower-level controller is given an extra degree of freedom by exploiting a unique property of autonomous HEVs, i.e., flexibility in power demand. This property implies that the vehicle does not have to precisely match the power demanded by the upper-level controller in real time; the supplied power may deviate to some extent from what the upper-level controller requires. This feature is only achievable in autonomous HEVs, where both the upper-level and lower-level controllers operate in the background and no human driver intervenes. This approach has recently been pursued by some researchers [30,52] and has shown promising results in terms of fuel consumption reduction. However, the methodologies used make these approaches less practical for HEV power management. In Ref. [30], PMP was used to solve the powertrain control optimization. For conventional HEVs, PMP is a powerful method for addressing powertrain energy management. However, when the problem includes state constraints, it can be challenging to implement PMP so that they are met [53,54]. Thus, for autonomous HEVs with flexible driveline power demand, wherein state constraints are applied [30], PMP can cause convergence issues. Zhang et al. [52] deployed an adaptive equivalent consumption minimization strategy to optimize power splits for an automated parallel HEV with flexible power demand. However, the main drawback in Ref. [52] is that the proposed method could not satisfy the final conditions.
The research presented in this paper investigates a customized approximate dynamic programming (ADP) approach, which is used for the first time to address powertrain optimization with flexible power demand. Solving this problem using ADP is independent of the initial condition of the system and requires much less memory than conventional DP. However, directly using standard ADP to solve the optimal control problem can be challenging. The standard ADP method requires the dynamical system to have a quadratic cost function to minimize, as well as an affine structure in the inputs. These conditions do not hold in the energy management of HEVs. Also, the system has nonlinear constraints on states and control inputs, which standard ADP cannot handle. Therefore, two methods are adopted to address these challenges. First, the ADP method is modified so that it can handle non-quadratic cost functions with non-affine inputs. Second, the concept of reachable sets [55] is adopted in implementing ADP to meet the nonlinear constraints on states and control inputs. Studies have shown that utilizing the concept of reachable sets can noticeably improve the accuracy and efficiency of the optimization solution [55].
2 Hybrid Electric Vehicle Dynamics
2.1 Upper-Level Dynamics.
2.2 Lower-Level Dynamics
2.2.1 Powertrain Dynamics.
In this study, a power-split hybrid powertrain is considered (Fig. 3). This mechanism, which was commercialized by Toyota, is known as the Toyota hybrid system (THS). The THS consists of a battery pack, an ICE, a coupler gear set, a planetary gear set, an inverter, and two electric machines. In the literature, the electric machine that mostly provides motive power is labeled the “motor,” and the one that is expected to operate mostly as an electricity generation unit is called the “generator.” The powertrain splits the power between the ICE and the battery through the planetary gear set, which consists of three main elements: the sun, the carrier, and the ring. The engine is linked to the carrier, and the sun is connected to the generator.
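As a rough illustration of how a planetary gear set couples the engine (carrier), generator (sun), and ring, the following Python sketch uses the standard textbook speed constraint and static torque split for an ideal, inertia-free planetary gear set. It is not taken from the paper's own equations; the function names, the sign convention (torque magnitudes only), and the numerical values of the sun and ring radii are assumptions chosen for illustration.

```python
# Minimal sketch of ideal planetary gear relations (textbook form, assumed here).
# Engine -> carrier, generator -> sun, ring -> driveline side.

def sun_speed(omega_carrier, omega_ring, r_sun, r_ring):
    """Speed constraint: omega_sun*r_sun + omega_ring*r_ring = omega_carrier*(r_sun + r_ring)."""
    return (omega_carrier * (r_sun + r_ring) - omega_ring * r_ring) / r_sun


def split_carrier_torque(torque_carrier, r_sun, r_ring):
    """Static split of the carrier torque between sun and ring (inertia and losses neglected)."""
    total = r_sun + r_ring
    return torque_carrier * r_sun / total, torque_carrier * r_ring / total


# Hypothetical radii in meters and operating point in rad/s and N.m.
R_SUN, R_RING = 0.030, 0.078
omega_engine, omega_ring = 250.0, 200.0
torque_engine = 100.0

omega_generator = sun_speed(omega_engine, omega_ring, R_SUN, R_RING)
torque_sun, torque_ring = split_carrier_torque(torque_engine, R_SUN, R_RING)

# Power entering at the carrier equals the power leaving through sun and ring.
power_in = torque_engine * omega_engine
power_out = torque_sun * omega_generator + torque_ring * omega_ring
```

Under these assumptions, the engine speed can be chosen independently of the ring (vehicle) speed by adjusting the generator speed, which is the mechanism that lets the engine operate near its efficient region.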
Gear G0 merges the ring’s power and the motor’s power through two identical gears, G1 and G2, to drive the driveline. Neglecting the inertia of the moving parts in the powertrain and using the power balance at the planetary and the coupler gear sets, the following algebraic equations between different components hold:
2.2.2 Battery Dynamics.
2.2.3 Fuel Consumption Dynamics.
2.2.4 Flexible Power Demand Dynamics.
As stated before, the upper-level controller determines the torque and the power required by the vehicle based on the upper-level dynamics. Conventionally, these power and torque demands are fed to the lower-level controller, which determines the distribution of power among the ICE, the electric machines, and the battery pack to meet the power demand. In the majority of research studies, the driveline power demand is strictly met by the lower-level controller, i.e., the power demand is fixed. However, an autonomous HEV has a unique property, called flexible power demand, that gives the lower-level controller a certain degree of flexibility in meeting the instantaneous power requested by the upper level. This flexibility in power demand, which inherently introduces a corresponding flexibility in torque demand as illustrated in Fig. 5, adds an extra degree of freedom to the powertrain control optimization, which can further improve fuel efficiency. Flexible power demand and flexible torque demand refer to the same method and are used interchangeably throughout this paper.
Variations in the power supplied by the powertrain will lead to deviations in the anticipated acceleration determined by the upper-level controller. Consequently, there will be corresponding deviations in the expected velocity and displacement of the vehicle. However, in order to ensure that the vehicle reaches its intended destination at the end of the drive cycle, allowances for velocity and displacement deviations are made only during intermediate steps. In other words, although the longitudinal displacement, velocity, and driveline torque demand may differ from their expected values at each intermediate moment determined by the upper-level controller, the deviation in the longitudinal displacement and velocity must diminish as time approaches the end of the considered time horizon (note that the time horizon can be a short period, and the optimization can be done for each time period going forward one by one).
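The propagation of a torque deviation into velocity and displacement deviations can be sketched with a simple forward-Euler integration. The sketch below is only an illustration under stated assumptions: the torque deviation is treated as acting directly at the wheel, changes in resistive forces caused by the velocity deviation are neglected, the mass and radius are the values of m and r from the appendix table (with r assumed to be the wheel radius), and the function name, time step, and torque sequence are hypothetical.

```python
# Minimal sketch: how a sequence of torque deviations shifts velocity and displacement.
# Assumptions: wheel-level torque deviation, no change in resistive forces, forward Euler.

MASS = 1350.0        # kg, m from the appendix table
WHEEL_RADIUS = 0.28  # m, r from the appendix table (assumed to be the wheel radius)


def propagate_deviation(torque_deviations, dt=1.0, mass=MASS, wheel_radius=WHEEL_RADIUS):
    """Return the list of (displacement deviation, velocity deviation) pairs over time."""
    dx, dv = 0.0, 0.0
    history = [(dx, dv)]
    for d_torque in torque_deviations:
        d_accel = d_torque / (mass * wheel_radius)   # acceleration deviation in m/s^2
        dx, dv = dx + dv * dt, dv + d_accel * dt     # update both with the old dv
        history.append((dx, dv))
    return history


# Hypothetical deviation sequence (N.m): surplus, deficit, deficit, surplus.
history = propagate_deviation([50.0, -50.0, -50.0, 50.0])
final_dx, final_dv = history[-1]
max_dx = max(abs(dx) for dx, _ in history)
```

In this example the vehicle temporarily runs ahead of its expected position during the intermediate steps, while both deviations return to zero at the end of the horizon, which is the terminal requirement described above.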
2.3 State Space Model.
3 Optimal Control Formulation
3.1 Standard Approximate Dynamic Programming Methods.
3.2 Customized Approximate Dynamic Programming Method.
To address these issues, in this study, we propose to modify the original ADP method to offer a customized ADP method for the optimization problem considered. First of all, this customized ADP method is applicable to both control affine and non-affine systems. Furthermore, it is capable of handling non-quadratic nonlinear cost functions associated with both states and inputs. Finally, it adopts a way to handle complex constraints using the concept of reachable sets.
3.3 Reachable Set.
To address the complex constraints associated with the optimization problem, the concept of reachable sets is incorporated into the customized ADP framework. The reachable set at each time step is defined as the set of states for which there is at least one sequence of actions under which the system moves from the current state to a state in the desired region at the end of the time horizon. Figure 7 illustrates the concept of reachable sets for a two-dimensional dynamical system. In this figure, a point belongs to the reachable set at a given step if at least one sequence of actions can transfer it to a point in the desired region at the final step. Note that this sequence of actions is not necessarily the optimal sequence from that point to the end of the horizon.
In this figure, each rectangle represents the horizon of the state space at its corresponding time index, and the shaded region within the rectangle illustrates the reachable set at that time index. To determine the reachable set at each step, we proceed backward in time.
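The backward construction can be sketched on a small gridded system. The following Python example is a toy illustration, not the paper's implementation: the two-dimensional integer dynamics, the grid bounds, the input set, and the function names are all assumptions. Starting from the desired region at the final step, each earlier reachable set collects the grid states that have at least one input leading into the next step's reachable set.

```python
# Minimal sketch of backward reachable-set computation on a finite grid (toy system).
from itertools import product


def backward_reachable_sets(states, inputs, step, target, horizon):
    """Return a list whose k-th entry is the set of states that can reach `target` at `horizon`."""
    reachable = [set() for _ in range(horizon + 1)]
    reachable[horizon] = set(target) & set(states)
    for k in range(horizon - 1, -1, -1):          # proceed backward in time
        reachable[k] = {
            x for x in states
            if any(step(x, u) in reachable[k + 1] for u in inputs)
        }
    return reachable


def step(x, u):
    """Hypothetical double-integrator-like dynamics on integers: (position, velocity)."""
    position, velocity = x
    return (position + velocity, velocity + u)


states = set(product(range(-3, 4), range(-2, 3)))   # bounded state grid
inputs = (-1, 0, 1)                                 # admissible inputs
sets = backward_reachable_sets(states, inputs, step, {(0, 0)}, horizon=3)
```

Here `sets[2]` contains only the three states that can be driven to the origin in one step, and the sets grow as more steps remain. States outside `sets[k]` can be excluded from training and from the input search at step k, since no admissible input sequence brings them to the desired region.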
The minimum and maximum bounds of the time-variant state and input grids must be selected such that the control design goals are ensured. In this study, for example, overly large bounds on the allowed deviations could cause the vehicle to hit the rear or front vehicle. Even with no surrounding traffic, where no bound would lead to a collision, large bounds would still allow the controller to deviate harshly from the requested drive torque and could therefore compromise ride comfort. Thus, a justifiable tolerance is one that lets the control designer use the reachable sets to enforce the control design goals.
3.4 Approximate Dynamic Programming Training Within the Reachable Set.
After the training process is complete, the trained neural network is used to obtain the sub-optimal constrained control input sequence for a given initial condition. The implementation algorithm is further described in Algorithm 2.
Algorithm 1 Training neural network
1 Select a small positive number and a big enough integer;
2 for … do
3   Choose different random training samples in the reachable set;
4   For each training sample, find the target value using Eq. (44);
5   Initialize the neural network with random parameters;
6   for … do
7     Update the neural network and find the parameters that approximate the target value using backpropagation on the entire set of training samples;
8     if … then
9       Break;
      end
    end
10  …;
  end
Algorithm 2 Implementation
1 for … do
2   …;
3   …
  end
As outlined in Algorithms 1 and 2, the proposed method is a global optimization method, meaning that it solves the powertrain energy management problem for the whole drive cycle in a single run. After the end of the time horizon, a new set of control commands is optimized for the next time horizon, and this process repeats.
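The overall pattern of backward training followed by forward implementation can be sketched on a toy problem. The example below is a simplified stand-in and not the paper's algorithm: a polynomial least-squares fit (NumPy `polyfit`) replaces the neural network, the scalar dynamics, the non-quadratic stage cost, the interval-shaped reachable sets, and all names and numerical values are assumptions. It only illustrates how the cost-to-go can be fitted from random samples inside the reachable set at each step and how the reachable set restricts the input search.

```python
# Toy sketch: backward cost-to-go fitting inside reachable sets, then a forward rollout.
# A polynomial fit stands in for the neural network; everything here is hypothetical.
import numpy as np

N = 5                              # horizon length
U = np.linspace(-1.0, 1.0, 21)     # admissible input grid
EPS = 1e-9                         # tolerance for boundary comparisons


def bound(k):
    """Half-width of the toy reachable interval at step k (target is |x| <= 0.1 at k = N)."""
    return min(2.0, 0.1 + (N - k) * 1.0)


def stage_cost(x, u):
    """Non-quadratic toy stage cost."""
    return np.abs(u) + 0.1 * x**4


def q_values(x, k, coeffs_next):
    """Cost of each input at state(s) x; infinite if the next state leaves the reachable set."""
    x = np.atleast_1d(x)[:, None]
    x_next = x + U[None, :]                         # toy dynamics: x_next = x + u
    q = stage_cost(x, U[None, :]) + np.polyval(coeffs_next, x_next)
    return np.where(np.abs(x_next) <= bound(k + 1) + EPS, q, np.inf)


def train(n_samples=200, degree=4, seed=0):
    """Fit one cost-to-go approximation per step, moving backward from the final step."""
    rng = np.random.default_rng(seed)
    coeffs = [None] * (N + 1)
    coeffs[N] = np.zeros(degree + 1)                # zero terminal cost inside the target
    for k in range(N - 1, -1, -1):
        xs = rng.uniform(-bound(k), bound(k), n_samples)   # samples in the reachable set
        targets = q_values(xs, k, coeffs[k + 1]).min(axis=1)
        coeffs[k] = np.polyfit(xs, targets, degree)
    return coeffs


def rollout(x0, coeffs):
    """Forward pass: pick the feasible input with the lowest approximate cost at each step."""
    x, inputs = float(x0), []
    for k in range(N):
        u = float(U[np.argmin(q_values(x, k, coeffs[k + 1])[0])])
        inputs.append(u)
        x = x + u
    return x, inputs


coeffs = train()
x_final, inputs = rollout(1.5, coeffs)
```

Because infeasible inputs are masked out with infinite cost, the rollout ends inside the target interval even when the fitted cost-to-go is inaccurate. Since the fitted functions do not depend on a particular starting state, the same `coeffs` can be reused for any initial condition inside the reachable set at step 0.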
4 Simulation Results
In this section, the performance of the proposed controller is evaluated using a real-world data set [64]. The data set contains recorded information on the external kinematics of vehicles, such as velocity, acceleration, and headway to leading and rear vehicles. The data were collected on a 150-meter-long section of the I-35 Corridor in Austin, TX. To assess the effectiveness of the controller, a random vehicle from the data set is selected. Figure 8 displays the baseline velocity profile of the selected vehicle.
It is important to note that the deviation in the driveline torque demand is one of the inputs referenced in Eqs. (24) and (25). As explained in Eq. (43), the minimum and maximum limits for the input space can vary over time. This time-varying limitation is particularly useful in segments of the drive cycle where the torque demand is already high. In such segments, it is advisable to reduce the limit on the torque deviation to ensure that the HEV’s ride comfort is not compromised. Since the drive cycle in this study does not involve excessively high torque demand, these limits are kept constant throughout the entire drive cycle for simplicity.
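A time-varying input limit of this kind could be expressed as a simple rule that tightens the allowed torque deviation whenever the requested torque is high. The sketch below is only an illustration: the function names, the threshold, and the limit values are hypothetical and are not the values used in this study, which keeps the limits constant.

```python
# Minimal sketch of a demand-dependent torque-deviation limit (all values hypothetical).

def torque_deviation_limit(requested_torque, nominal_limit=15.0,
                           high_torque=120.0, reduced_limit=5.0):
    """Return the allowed deviation magnitude (N.m) for the current requested torque."""
    if abs(requested_torque) >= high_torque:
        return reduced_limit        # tighter limit in high-demand segments
    return nominal_limit


def clip_deviation(requested_torque, deviation, **limits):
    """Saturate a candidate torque deviation to the limit valid at this time step."""
    limit = torque_deviation_limit(requested_torque, **limits)
    return max(-limit, min(limit, deviation))


low_demand_limit = torque_deviation_limit(50.0)
high_demand_limit = torque_deviation_limit(130.0)
clipped = clip_deviation(130.0, 12.0)
```

Evaluating such a rule along the requested torque profile yields a limit per time step, which can then define the input grid bounds at that step.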
The control performance of the system utilizing the proposed control method is evaluated for a given initial condition. The performance results are depicted in Figs. 9–13. To provide a basis for comparison, the performance of the system under the baseline approach is also illustrated in Figs. 11–13. In the baseline optimization, the flexibility in power demand is not considered (fixed demand), and the controller’s task is to supply the precise amount of power required by the upper-level dynamics. Consequently, in the baseline optimization, the state space and the input space are both reduced, since no deviation from the requested power is allowed.

Battery state of charge history of the vehicle before and after applying flexible power demand method
5 Conclusion
Traditionally, vehicle coordination optimization and powertrain energy management have been studied separately in the context of HEVs. However, this paper introduces a novel approach that combines and optimizes these two levels simultaneously by leveraging a unique feature which exists in autonomous HEVs: flexible power demand. The concept of flexible power demand acknowledges that the powertrain does not necessarily have to meet the power required by the external dynamics of an autonomous HEV at every step. By exploiting this flexibility, the proposed powertrain energy management method aims to enhance fuel economy.
The optimization problem is formulated within the framework of a customized approximate dynamic programming method, utilizing the notion of reachable sets. To evaluate the effectiveness of the proposed method, a case study is conducted using real-world data. The results demonstrate a 14.1% improvement in fuel economy compared to a conventional optimization method that employs fixed power demand. This highlights the potential of the proposed approach for achieving enhanced fuel efficiency in HEVs.
Acknowledgment
This work was partially supported by the National Science Foundation under Grant No. 1826410.
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The authors attest that all data for this study are included in the paper.
Appendix
The environmental characteristics, vehicle specifications, and tuning parameters used for training the neural networks in this study are provided in Table 2.
Specifications of the environment, vehicle system, and neural network
Parameter | Value | Parameter | Value | Parameter | Value | Parameter | Value |
---|---|---|---|---|---|---|---|
m | 1350 kg | r | 0.28 m | 0.078 m | 0.030 m | ||
0.007 | 0.3 | ||||||
3.9 | 0.9 | 0.9 | 202 V | ||||
23400 A.s | 3.5 m | ||||||
2.5 m/s | 50% | 70% | |||||
150 N.m | 0 rad/s | 450 rad/s | |||||
2.0 m | 0.5 m | ||||||
1.5 m/s | 0.5 m/s | ||||||
53% | 67% | 59% | 63% | ||||
0.05 | 15 | 7000 | 2.5 | ||||
2.4 g/% |
Footnotes
A preliminary version of this work was presented at a conference [1]. This journal paper differs significantly from the conference version: it presents a more detailed algorithm derivation and explanation, the adaptation of reachable sets for ADP training, and more testing results. In addition, the text has been substantially reworded to avoid repetition.
Properties of the vehicle system and specifications of the environment mentioned in this section can be found in Table 2.
The reachable set’s minimum and maximum range for the state space and input space used in this study are listed in Table 2.
The tuning parameters used to train the neural networks mentioned in this study are provided in Table 2.