Abstract
Predicting discharge capacities of lithium-ion batteries (LIBs) is essential for safe battery operation in electric vehicles (EVs). In this paper, a convolutional neural network-long short term memory (CNN-LSTM) approach is proposed to estimate the discharge capacity of LIBs. The parameters such as the voltage, current, temperature, and charge/discharge capacity are recorded from a battery management system (BMS) at various stages of the charge–discharge cycles. The experiments are conducted to obtain the data at different cycles, where each cycle is divided into four steps. Each testing cycle comprises charging, rest, discharging, and rest. In the predictive model, the initial layers are convolutional layers that help in feature extraction. Then, the long and short term memory layer is used to retain or forget related information. Finally, the prediction is completed by selecting the corresponding activation function. The evaluation model is established via the multiple train test split method. The lower values of weighted mean squared error suggest that discharge capacity estimation using CNN-LSTM is a reliable method. The CNN-LSTM approach can further be compiled in BMSs of EVs to get real-time status for state of charge and state of health values.
1 Introduction
Because of the high energy per unit mass relative to other electrical energy storage systems, lithium-ion batteries (LIBs) are widely used as power sources in industrial equipment [1–4]. They also have better temperature performance, high power-to-weight ratio, high energy efficiency, and low self-discharge, which gives them prolonged life. Most electric vehicles (EVs) use these LIBs [5–8].
Since LIBs primarily act as an energy storage device in machines requiring high power, such as EVs, the capacity of each LIB is a significant factor in determining the performance of the entire pack [9,10]. However, the capacity of LIB decreases with the increase of number of cycles, known as aging effect of batteries [11]. The performance of LIB declines due to the changes in the battery internal structure during each cycle. To ensure the safety and reliability of individual cells, a battery management system (BMS) needs to estimate the state of health (SOH) and state of charge (SOC) [11]. The discharge capacity of a battery is one of the most important factors for determining SOH. By predicting discharge capacity, it is possible to estimate the lifetime of a battery cell which prevents forthcoming failures by replacing the cell at the right time [12]. Generally, the battery is considered to reach its end of life (EoL) when the battery capacity decreases to 70–80% of its initial capacity, or the battery resistance increases to 160% of its initial resistance.
To predict battery capacity, the parameters such as current, voltage, charge, and temperature should be measured by the BMS [13]. Based on traditional mechanism analysis, researchers have designed various data-driven research methods, such as neural network [14,15], Box-Cox transformation [16], support vector machines [17,18], autoregressive approach [19], and others [20,21]. When corresponding battery data are known, these data-driven models can better train data, extract features, and establish correlations between parameters.
The capacity estimation using traditional data-driven models does not study the characteristics of data. It only extracts features and fits machine learning models. Capacity is closely related to the prediction of SOH, SOC, and other states, and they can all characterize the battery usage status. Dai et al. [22] proposed a multiple regression model combined with stress effects to calculate SOC accurately. Moreover, the standard Kalman filter and improved Kalman filter have been used with a support vector machine for SOC and SOH estimation [23]. The Bayesian Monte Carlo method for SOH and remaining useful life (RUL) prediction has been developed in Ref. [24]. Zheng et al. [25] used the incremental capacity analysis and differential voltage analysis methods to accomplish the battery estimation of SOC and capacity. Yang et al. [26] realized the prediction of battery capacity and SOH through an equivalent circuit model, battery degradation model, and adaptive extended Kalman filtering. Stroe and Schaltz [27] studied the mechanism of battery capacity degradation through incremental capacity analysis, constructed corresponding capacity and SOH prediction models, and verified the accuracy through experiments.
Artificial neural networks (ANNs) have been used for battery state estimation in a novel way [15]. ANN helps capture the non-linearity in the data and helps find regular interval patterns using historical information stored in data [28]. However, during the actual experiment, time-series data are not necessarily regular and may be random. In that case, it would be difficult for simple neural networks to capture the data characteristics accurately. In such instances, the long short term memory (LSTM) network performs better. It can capture relevant information with long intervals or delays in the time-series [29]. A convolutional neural network (CNN)-LSTM model is a hybrid model consisting of a CNN followed by an LSTM network [30–32]. This architecture helps in extracting features and information stored at various intervals in data. It would be crucial to complement the advantages of both CNN and LSTM and explore its application for battery state estimation.
The main contributions of this paper are as follows: (1) a method for predicting battery discharge capacity through CNN-LSTM is proposed, and its feasibility is verified through experiments. (2) Through the multiple train test split method, the model parameters are selected reasonably and the prediction accuracy is improved.
The rest of the paper comprises problem definition, data analysis, the architecture of CNN-LSTM network, hyperparameter tuning, model evaluation, and conclusion.
2 Problem Definition
Experimental data were collected over four months, and it is measured directly from the laboratory in 2018. The basic information of the battery is shown in Table 1 [11,33]. The data acquisition process contained four steps, which are charging, rest, discharging, and rest. The plots for charge capacity, voltage, current, and discharge capacity in one cycle are shown in Fig. 1.
The basic information of LIB and test
Project | Information |
---|---|
Battery | 18650 |
Manufacturing company | Xingyuan Electronic Technology Co., Ltd. |
Initial capacity | 2.4 A h |
Working voltage | 2.7 –4.2 V |
Working current | 1.3 A |
Test device | Arbin testing system |
Environment temperature | 25 °C |
Project | Information |
---|---|
Battery | 18650 |
Manufacturing company | Xingyuan Electronic Technology Co., Ltd. |
Initial capacity | 2.4 A h |
Working voltage | 2.7 –4.2 V |
Working current | 1.3 A |
Test device | Arbin testing system |
Environment temperature | 25 °C |
According to the graphs in our data, the charging process can be divided into two parts. In one part, the current remains constant. In other, the voltage remains constant. When the current is steady at 1.3 A, the voltage increases until it reaches a value of 4.2 V. Then, the current decreases and reaches 0.05 A at this constant voltage. We set a 30 min rest time to reduce data errors. The discharging step occurs at constant current −1.3 A until the voltage reaches 2.75 V.
Since it is difficult to measure the SOH when its capacity decreases directly, we formulate a method to measure and analyze other parameters to establish a relationship. These parameters are easy to measure and are more accurate. The test procedures of LIB key parameters are shown in Table 2.
Test procedures of LIB
Step | Process | Value | Cutoff value |
---|---|---|---|
1 | Charging | Constant current: 1.3 A Constant voltage: 4.2 V | Voltage limit = 4.2 V Charging current ≤0.05 A |
2 | Rest | Time: 30 min | |
3 | Discharging | Constant current: −1.3 A | Discharging voltage ≤2.75 V |
4 | Rest | Time: 30 min |
Step | Process | Value | Cutoff value |
---|---|---|---|
1 | Charging | Constant current: 1.3 A Constant voltage: 4.2 V | Voltage limit = 4.2 V Charging current ≤0.05 A |
2 | Rest | Time: 30 min | |
3 | Discharging | Constant current: −1.3 A | Discharging voltage ≤2.75 V |
4 | Rest | Time: 30 min |
These data are periodic. Each parameter current, voltage, and charge capacity repeat themselves after a cycle completes. We use the trend data, seasonal data, and residuals to decompose our real data, as shown in Fig. 2. It means the battery discharge capacity is a set of time-series data. In order to better express its characteristics, we have made an autocorrelation graph of the current battery capacity and the previous capacity observations, as shown in Fig. 3.

Autocorrelation in discharge capacity value: (a) whole autocorrelation and (b) partial autocorrelation
In this paper, we have used the CNN-LSTM network. LSTM helps capture sequence pattern information, while CNN network helps filter out noise of input data and extract important features.
3 Convolutional Neural Network-Long Short Term Memory Networks
3.1 Convolutional Neural Network Architecture.
The convolutional layer in CNN is generally followed by a max-pooling layer and a fully connected dense layer. The convolutional layer has been known to extract features from segments of one-dimensional (1D) data. The extracted features from data segments are mapped to the internal segment features. The 1D CNN is generally implemented to derive features from different length segments of the overall dataset, where it is not so important where the information is located in the segment. Here we have used a sequence of length 48.
After this comes the max-pooling layer. Since the output feature after the convolutional layer is sensitive to the feature location, we down sample it. The pooling layer extracts information from different segments of feature maps to down sample the map. Our model uses the max-pooling method, a commonly used pooling method that summarizes the most activated feature information.
The first fully connected layer takes the result of the pooling layer and applies weight to give a correct prediction. The final fully connected layer provides the final output.
3.2 Long Short Term Memory Architecture.
LSTM network helps us selectively remember and forget things. It helps in capturing information stored at various intervals. The information at any state depends on three values: the information stored in memory after previous state, the output of previous unit, and the current state input.
The output gate decides the output of the current layer. A sigmoid function filters the part of unit to generate a result ot. This is then multiplied with the tanh layer to generate the final output matrix. The input gate, forget gate, and output gate together can learn long-term dependencies simultaneously, avoiding gradients vanishing and exploding. The entire process is shown in Fig. 4.
4 Model Formulation Procedure
To make the capacity prediction of LIB reliable, we use the CNN-LSTM model here. The CNN-LSTM model architecture is summarized in Fig. 5. There are four inputs, and each input is discretized into 48 time-steps. Hence, the shape of input matrix here is 48 * 4. Each column of the matrix represents each input. So the basic structure of the input matrix is [Vi Ii Ci Ti]48*4.
The input data are first fed into the initial 1D CNN layer. The first layer is a feature detecting layer. In our practical model, the height of the filter is 3. One filter allows the model to learn one feature. To learn more features, we define 64 such filters in our first layer. Generally, the output dimension of 1D CNN is calculated as . The stride defaults to the length of one unit. But if several such layers are stacked, the input dimension at one point might become smaller than kernel size. To prevent loss of input data, we add padding. This ensures that the output dimension is similar to the input dimension. Since there are 64 filters, the output after the first 1D convolutional layer is a matrix of dimension 48 * 64. Then we passed it into the max-pooling layer to reduce the output complexity and prevent overfitting. The shape of the output data after the pooling layer reduces to 24 * 48. Another sequence of the 1D convolutional layer follows to extract higher-level features. Then, pass the output of this layer to the LSTM network to capture information at different time intervals. Finally, a sequence of dense layers is added to this model. The final dense layer reduces the dimension of the output from 1 * 32 to 1 * 1. The activation function used in every layer is rectified linear units except for the last layer, where it is linear.
4.1 Hyperparameter Tuning.
While modeling, a set of parameters need to be selected like learning rate, batch size, optimizer, number of layers, etc. Here we manually tuned the parameters and chose the one which was giving better results. We built a baseline model, fixed batch size, and number of epochs and observed different combinations of learning rate and optimizer. The results of each combination have been summarized. The process was repeated with different parameters. The basic architecture of the model is shown in Table 3, and the results are summarized in Tables 4–6.
Model architecture
Layer (type) | Output shape | Parameter # |
---|---|---|
conv1d_1 (Conv1D) | (None, 48, 64) | 832 |
max_pooling1d_1 (MaxPooling1) | (None, 24, 64) | 0 |
conv1d_2 (Conv1D) | (None, 24, 32) | 6176 |
max_pooling1d_2 (MaxPooling1) | (None, 12, 32) | 0 |
lstm_1 (LSTM) | (None, 8) | 1312 |
dense_1 (dense) | (None, 32) | 288 |
dense_2 (dense) | (None, 32) | 1056 |
dense_3 (dense) | (None, 1) | 33 |
activation_1 (activation) | (None, 1) | 0 |
Layer (type) | Output shape | Parameter # |
---|---|---|
conv1d_1 (Conv1D) | (None, 48, 64) | 832 |
max_pooling1d_1 (MaxPooling1) | (None, 24, 64) | 0 |
conv1d_2 (Conv1D) | (None, 24, 32) | 6176 |
max_pooling1d_2 (MaxPooling1) | (None, 12, 32) | 0 |
lstm_1 (LSTM) | (None, 8) | 1312 |
dense_1 (dense) | (None, 32) | 288 |
dense_2 (dense) | (None, 32) | 1056 |
dense_3 (dense) | (None, 1) | 33 |
activation_1 (activation) | (None, 1) | 0 |
Results for cell 1 under different conditions
Epochs = 10, batch size = 100 | Epochs = 10, batch size = 500 | ||||
---|---|---|---|---|---|
Learning rate | Optimizer | Result (k-fold) | Learning rate | Optimizer | Result (k-fold) |
0.001 | rmsprop | 4.123 × 10−3 | 0.001 | rmsprop | 5.061 × 10−3 |
0.001 | adam | 9.560 × 10−3 | 0.001 | adam | 4.576 × 10−3 |
0.01 | rmsprop | 4.397 × 10−3 | 0.01 | rmsprop | 1.220 × 10−2 |
0.01 | adam | 7.920 × 10−3 | 0.01 | adam | 4.361 × 10−3 |
0.1 | rmsprop | 6.081 × 10−3 | 0.1 | rmsprop | 6.939 × 10−3 |
0.1 | adam | 4.966 × 10−3 | 0.1 | adam | 1.036 × 10−2 |
Epochs = 10, batch size = 100 | Epochs = 10, batch size = 500 | ||||
---|---|---|---|---|---|
Learning rate | Optimizer | Result (k-fold) | Learning rate | Optimizer | Result (k-fold) |
0.001 | rmsprop | 4.123 × 10−3 | 0.001 | rmsprop | 5.061 × 10−3 |
0.001 | adam | 9.560 × 10−3 | 0.001 | adam | 4.576 × 10−3 |
0.01 | rmsprop | 4.397 × 10−3 | 0.01 | rmsprop | 1.220 × 10−2 |
0.01 | adam | 7.920 × 10−3 | 0.01 | adam | 4.361 × 10−3 |
0.1 | rmsprop | 6.081 × 10−3 | 0.1 | rmsprop | 6.939 × 10−3 |
0.1 | adam | 4.966 × 10−3 | 0.1 | adam | 1.036 × 10−2 |
Note: Bold value is the better result of our scheme.
Results for cell 2 under different conditions
Epochs = 10, batch size = 100 | Epochs = 10, batch size = 500 | ||||
---|---|---|---|---|---|
Learning rate | Optimizer | Result (k-fold) | Learning rate | Optimizer | Result (k-fold) |
0.001 | rmsprop | 8.250 × 10−3 | 0.001 | rmsprop | 6.940 × 10−3 |
0.001 | adam | 6.994 × 10−3 | 0.001 | adam | 7.385 × 10−3 |
0.01 | rmsprop | 7.979 × 10−3 | 0.01 | rmsprop | 6.314 × 10−3 |
0.01 | adam | 1.229 × 10−2 | 0.01 | adam | 7.293 × 10−3 |
0.1 | rmsprop | 6.722 × 10−3 | 0.1 | rmsprop | 9.592 × 10−3 |
0.1 | adam | 1.289 × 10−2 | 0.1 | adam | 1.423 × 10−2 |
Epochs = 10, batch size = 100 | Epochs = 10, batch size = 500 | ||||
---|---|---|---|---|---|
Learning rate | Optimizer | Result (k-fold) | Learning rate | Optimizer | Result (k-fold) |
0.001 | rmsprop | 8.250 × 10−3 | 0.001 | rmsprop | 6.940 × 10−3 |
0.001 | adam | 6.994 × 10−3 | 0.001 | adam | 7.385 × 10−3 |
0.01 | rmsprop | 7.979 × 10−3 | 0.01 | rmsprop | 6.314 × 10−3 |
0.01 | adam | 1.229 × 10−2 | 0.01 | adam | 7.293 × 10−3 |
0.1 | rmsprop | 6.722 × 10−3 | 0.1 | rmsprop | 9.592 × 10−3 |
0.1 | adam | 1.289 × 10−2 | 0.1 | adam | 1.423 × 10−2 |
Note: Bold value is the better result of our scheme.
4.2 Model Evaluation and Results.
In general, for machine learning problems, the dataset is split randomly into training and test sets, but for time-series data, the observations are not independent. Hence random splitting is not reliable. The observations in our dataset are autocorrelated in nature and contain trend and seasonality components as well. The observation at the tth time-step is dependent on observation at the (t − 1)th time-step. Therefore, a method similar to k-fold cross-validation has been used in the paper to evaluate the model. We divide the dataset into six folds and start with training the model. The first fold is trained with a set of hyperparameters and then test on the second fold. After this, we train the first and second fold and test on the third fold. We train the model for a certain period and test it in the upcoming period. The metrics used for evaluation is MSE and root-mean-squared error (RMSE).
The MSE and RMSE for each fold and the final score of models have been summarized in Table 7.
Results obtained for cell 1 and cell 2
Cell | Metrics | Fold-1 | Fold-2 | Fold-3 | Fold-4 | Fold-5 | Weighted average |
---|---|---|---|---|---|---|---|
1 | MSE | 5.249 × 10−3 | 4.819 × 10−3 | 4.504 × 10−3 | 3.658 × 10−3 | 3.763 × 10−3 | 4.123 × 10−3 |
RMSE | 7.245 × 10−2 | 6.942 × 10−2 | 6.711 × 10−2 | 6.048 × 10−2 | 6.134 × 10−2 | 6.421 × 10−2 | |
2 | MSE | 1.989 × 10−2 | 7.996 × 10−3 | 5.446 × 10−3 | 6.330 × 10−3 | 4.657 × 10−3 | 6.722 × 10−3 |
RMSE | 1.410 × 10−1 | 8.942 × 10−2 | 7.380 × 10−2 | 7.956 × 10−2 | 6.824 × 10−3 | 8.199 × 10−2 |
Cell | Metrics | Fold-1 | Fold-2 | Fold-3 | Fold-4 | Fold-5 | Weighted average |
---|---|---|---|---|---|---|---|
1 | MSE | 5.249 × 10−3 | 4.819 × 10−3 | 4.504 × 10−3 | 3.658 × 10−3 | 3.763 × 10−3 | 4.123 × 10−3 |
RMSE | 7.245 × 10−2 | 6.942 × 10−2 | 6.711 × 10−2 | 6.048 × 10−2 | 6.134 × 10−2 | 6.421 × 10−2 | |
2 | MSE | 1.989 × 10−2 | 7.996 × 10−3 | 5.446 × 10−3 | 6.330 × 10−3 | 4.657 × 10−3 | 6.722 × 10−3 |
RMSE | 1.410 × 10−1 | 8.942 × 10−2 | 7.380 × 10−2 | 7.956 × 10−2 | 6.824 × 10−3 | 8.199 × 10−2 |
In both cells, the respective models suggest better performance as the number of folds increases. The real and predicted values for each cell in Fig. 7 show that this method can capture the information stored at various intervals, give reliable predictions, and have good prediction accuracy.
5 Conclusions
This paper discusses LIB degradation in EVs and proposes a CNN-LSTM deep learning model to predict battery discharge capacity. We use the voltage, current, temperature, and charging capacity in the experimental data as the input of the model and the discharge capacity as the output of the model. Through the convolutional layer and the long and short term memory unit, the deep timing features in the data are effectively extracted, and the correlation between input and output is constructed through the activation function. At the same time, because the data are time-related, traditional verification methods are no longer applicable. We propose a multiple train test split method similar to k-fold cross-validation, which effectively avoids the phenomenon of future values to predict past values. The prediction accuracy displays that MSE is 4.123 × 10−3 and RMSE is 6.421 × 10−2 (the result of another verification cell dataset: MSE is 6.722 × 10−3, RMSE is 8.199 × 10−2), which has high prediction accuracy and proves the reliability of our method.
Battery parameter detection and prediction system via digital twin to achieve real-time observation of battery SOC and SOH is our next research orientation. Meanwhile, improving the prediction accuracy through novel deep learning models is also our important research focus.
Acknowledgment
This work was partially supported by the Science and Technology Innovation Program of “Chengdu-Chongqing Double City Economic Circle Construction” (Grant No. KJCXZD2020013), the Special Funding for Postdoctoral Research Program in Chongqing (Grant No. XmT2020115), and the China Postdoctoral Science Foundation (Grant No. 2020M683237).
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtained from the corresponding author upon reasonable request.