Advanced SOC Prediction for Energy Storage Lithium Batteries Using LSTM-Attention Hybrid Model

In the context of global energy structure transformation, achieving carbon neutrality goals has become an international consensus, and the low-carbon transition in the transportation sector is crucial for climate governance. Energy storage lithium batteries play a pivotal role in this process, serving as the core power source for new energy vehicles. Accurate prediction of the State of Charge (SOC) for energy storage lithium batteries is directly related to battery safety control, range estimation, and lifespan optimization. Research indicates that an SOC prediction error exceeding 3% significantly increases the risk of overcharging or over-discharging, while reducing the error by 1% can extend the cycle life of energy storage lithium batteries by approximately 200 cycles, offering substantial economic benefits by lowering user costs.

Traditional SOC prediction methods face fundamental technical bottlenecks. Physical model-based approaches such as the Ampere-hour integral method and open-circuit voltage method rely on precise initial parameters and exhibit limited adaptability under dynamic operating conditions. Laboratory methods like electrochemical impedance spectroscopy struggle to meet real-time onboard monitoring requirements. These methods have constrained capabilities in capturing the nonlinear characteristics of battery charge-discharge processes, with prediction errors increasing by over 40% under complex conditions, severely limiting the performance enhancement of battery management systems. Although deep learning techniques like Long Short-Term Memory (LSTM) networks improve temporal modeling through gating mechanisms, single LSTM models still face challenges in dynamically weighting multimodal features and lack sensitivity to critical nodes such as voltage mutations.

To address these issues, I propose an LSTM-Attention hybrid model. By coupling the temporal dependency modeling strengths of LSTM with the feature dynamic weighting capabilities of the self-attention mechanism, this model constructs a high-precision, robust SOC prediction framework for energy storage lithium batteries. The architecture adaptively focuses on key change nodes in temporal features such as current and voltage, and employs the Huber loss function to enhance noise robustness, providing a new technical pathway for battery state management under complex operating conditions and supporting the intelligent upgrade of the new energy vehicle industry.
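As a point of reference, the following minimal sketch shows how a Huber loss could be applied to SOC regression, using PyTorch's built-in implementation; the delta threshold and the example values are illustrative assumptions rather than settings from this study.

```python
import torch
import torch.nn as nn

# Huber loss is quadratic for small residuals and linear for large ones,
# which limits the influence of noisy or outlying SOC samples.
criterion = nn.HuberLoss(delta=1.0)  # delta is an assumed hyperparameter

soc_pred = torch.tensor([0.52, 0.48, 0.95])  # predicted SOC (fraction of capacity)
soc_true = torch.tensor([0.50, 0.47, 0.80])  # reference SOC
print(criterion(soc_pred, soc_true).item())
```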

LSTM-Attention Model Architecture

The proposed hybrid model integrates LSTM networks and attention mechanisms to improve SOC prediction accuracy for energy storage lithium batteries. Below, I detail the components and their mathematical formulations.

LSTM Model

LSTM is an optimized version of traditional Recurrent Neural Networks (RNNs), excelling in handling sequence tasks and modeling long-term dependencies. This network structure is widely used in problems involving temporal data. Compared to standard RNNs, LSTM performs better in complex scenarios with temporal correlations, effectively capturing intrinsic relationships between sequential information. Since SOC inherently exhibits time-series characteristics, LSTM is well-suited for related modeling tasks. Traditional RNNs rely solely on a single hidden state for information transmission, making it difficult to retain the influence of early inputs, especially in long sequences where gradient vanishing or explosion often occurs. LSTM mitigates this issue through the introduction of memory cells, which preserve critical information over extended periods, enhancing model stability and expressive power.

The memory cell is controlled by three gates: the input gate, forget gate, and output gate. These gates determine which information to add, remove, and output from the memory cell. LSTM’s gating mechanism better controls gradient flow, alleviating gradient vanishing or explosion, and making the network easier to train. The LSTM structure is defined by the following equations:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + V_f c_{t-1} + b_f)$$

Here, $f_t$ is the output of the forget gate; $\sigma$ is the sigmoid activation function; $c_{t-1}$ is the cell state from the previous time step; $W_f$, $U_f$, and $V_f$ are weight matrices; and $b_f$ is the bias term.

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + V_i c_{t-1} + b_i)$$

In this equation, $i_t$ is the output of the input gate; $\sigma$ is the sigmoid activation function; $c_{t-1}$ is the cell state from the previous time step; $W_i$, $U_i$, and $V_i$ are weight matrices; and $b_i$ is the bias term.

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + V_o c_t + b_o)$$

Where $o_t$ is the output of the output gate; $\sigma$ is the sigmoid activation function; $W_o$, $U_o$, and $V_o$ are weight matrices; $b_o$ is the bias term; $h_{t-1}$ is the hidden state from the previous time step; and $c_t$ is the updated cell state at the current step, so the output gate draws on the freshly updated memory rather than $c_{t-1}$.

The cell state update and hidden state are computed as:

$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

$$h_t = o_t \odot \tanh(c_t)$$

Here, $\tilde{c}_t$ represents the candidate cell state, $\tanh$ is the hyperbolic tangent activation function, and $\odot$ denotes element-wise multiplication.
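To make the gate equations concrete, the following NumPy sketch performs a single forward step of the peephole-style LSTM cell written above; the input and hidden dimensions, the random initialization, and the full-matrix form of the peephole weights $V$ are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One forward step of the peephole LSTM cell defined by the equations above.
    p holds the weight matrices and bias vectors; shapes are assumptions."""
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["V_f"] @ c_prev + p["b_f"])
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["V_i"] @ c_prev + p["b_i"])
    c_tilde = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])
    c_t = f_t * c_prev + i_t * c_tilde                        # element-wise (⊙) update
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["V_o"] @ c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Toy dimensions: 6 input features, 64 hidden units (assumed, not from the paper)
n_in, n_hid = 6, 64
rng = np.random.default_rng(0)
p = {}
for g in ("f", "i", "c", "o"):
    p[f"W_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_in))
    p[f"U_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_hid))
    p[f"b_{g}"] = np.zeros(n_hid)
for g in ("f", "i", "o"):
    p[f"V_{g}"] = rng.normal(scale=0.1, size=(n_hid, n_hid))
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, p)
```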

Attention Mechanism

The attention mechanism simulates the selective focusing capability of the human visual system by dynamically allocating weights to emphasize key features. Its operation involves three steps. First, score calculation: relevance scores are computed for each hidden state output by the LSTM to identify critical nodes such as voltage mutations and current peaks. Second, weight normalization: scores are transformed into probability distributions using the softmax function, enabling the model to focus on high-information time segments. Third, context generation: all hidden states are weighted and fused to generate a feature vector with temporal awareness.

The integration of LSTM and the attention mechanism adopts a layered processing architecture. Initially, the LSTM layer extracts features from input data, producing a sequence of hidden states that encapsulate temporal dependencies. Then, the attention layer dynamically assigns weights to these hidden states, calculating importance scores for each time step and generating a weighted aggregated context feature vector. Finally, the context vector from the attention layer is fed into a fully connected network for SOC prediction. This structure leverages LSTM’s ability to capture long-term temporal dependencies while reinforcing the representation of key time-step features (e.g., charge-discharge inflection points) through the attention mechanism, allowing the model to adaptively focus on the most informative feature segments under dynamic conditions, thereby improving prediction accuracy and adaptability.

The attention mechanism can be mathematically expressed as:

$$e_t = \text{score}(h_t, h_{T})$$

$$\alpha_t = \frac{\exp(e_t)}{\sum_{j=1}^{T} \exp(e_j)}$$

$$c = \sum_{t=1}^{T} \alpha_t h_t$$

Here, $e_t$ is the energy score for time step $t$, $h_t$ is the hidden state at time $t$, $h_{T}$ is the final hidden state of the window (used as the query in the score function), $\alpha_t$ is the attention weight, and $c$ is the context vector used for prediction.
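Putting the two parts together, a compact PyTorch sketch of the layered design (LSTM backbone, attention over its hidden states with $h_T$ as the query, and a fully connected prediction head) might look as follows; the dot-product score function and layer sizes are illustrative assumptions, while the Huber loss follows the choice noted in the introduction.

```python
import torch
import torch.nn as nn

class LSTMAttentionSOC(nn.Module):
    """Sketch of an LSTM-Attention SOC predictor (assumed hyperparameters)."""
    def __init__(self, n_features=6, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):                        # x: (batch, T, n_features)
        h_seq, _ = self.lstm(x)                  # hidden states: (batch, T, hidden)
        query = h_seq[:, -1:, :]                 # last hidden state h_T as the query
        # e_t = score(h_t, h_T): dot-product score, one value per time step
        e = torch.bmm(h_seq, query.transpose(1, 2)).squeeze(-1)    # (batch, T)
        alpha = torch.softmax(e, dim=1)          # attention weights α_t
        context = torch.bmm(alpha.unsqueeze(1), h_seq).squeeze(1)  # c = Σ α_t h_t
        return self.fc(context).squeeze(-1)      # SOC estimate per sample

# Usage sketch: batches of 100-step windows with 6 features each
model = LSTMAttentionSOC()
x = torch.randn(8, 100, 6)
soc_hat = model(x)                               # shape: (8,)
loss = nn.HuberLoss()(soc_hat, torch.rand(8))
```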

Data Description and Preprocessing

This study utilizes a publicly available lithium battery dataset released by the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland. The dataset focuses on INR 18650-20R ternary energy storage lithium batteries, with a nominal capacity of 2000 mAh and a nominal voltage of 3.6 V. Tests were conducted under strictly controlled environmental conditions with multiple charge-discharge cycles simulated across different scenarios. Temperatures were set at 0°C, 25°C, and 45°C to mimic electric vehicle usage in various climates. Each temperature condition includes two typical discharge profiles: US06 (highway driving schedule) and FUDS (urban driving schedule). The US06 profile features rapid acceleration, constant-speed cruising, and sharp deceleration, while the FUDS profile involves frequent starts and stops, low-speed driving, and other complex dynamic processes. All data were recorded at a 1 Hz sampling frequency, comprising time series of key parameters such as voltage, current, and temperature.

To ensure data quality and enhance model performance, the raw data underwent preprocessing. First, anomaly detection and cleaning were performed to remove invalid data points caused by sensor noise or communication interruptions. For features with different units (voltage in V, current in A, temperature in °C), min-max normalization was applied to linearly map each feature to the [0,1] interval, eliminating the impact of unit differences on model training. Considering the temporal dependency of SOC prediction, a sliding time window (window length of 100 time steps) was constructed to convert continuous time-series data into “many-to-one” supervised learning samples. Each sample includes feature data from 100 historical time steps and the corresponding SOC label value.
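A minimal sketch of these preprocessing steps (min-max scaling, 100-step sliding windows, and a chronological train/test split) is shown below; the synthetic arrays and column layout are placeholders standing in for the actual CALCE time series.

```python
import numpy as np

def min_max_scale(x):
    """Map each feature column linearly to [0, 1]."""
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return (x - x_min) / (x_max - x_min + 1e-12)

def make_windows(features, soc, window=100):
    """Turn continuous series into many-to-one samples:
    a window of 100 historical steps -> the SOC label at the window's end."""
    X, y = [], []
    for end in range(window, len(features) + 1):
        X.append(features[end - window:end])
        y.append(soc[end - 1])
    return np.stack(X), np.array(y)

# Synthetic stand-in data: voltage, current, temperature columns and an SOC trace
T = 5000
raw = np.random.rand(T, 3)
soc = np.linspace(1.0, 0.0, T)
X, y = make_windows(min_max_scale(raw), soc, window=100)

split = int(0.7 * len(X))            # chronological 70/30 train/test split
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```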

Based on the working principles of energy storage lithium batteries, voltage, current, and sampling time difference were selected as core input features. The voltage sequence reflects battery polarization characteristics, the current sequence represents charge-discharge rates, and the time difference feature captures the time scale of battery dynamic response processes. To enhance feature representation, three new features were constructed: instantaneous power, cumulative discharge amount, and voltage change rate. These are defined as:

$$P_t = V_t \times I_t$$

$$Q_t = \sum_{i=1}^{t} I_i \Delta t$$

$$\Delta V_t = \frac{V_t - V_{t-1}}{\Delta t}$$

Where $P_t$ is instantaneous power, $V_t$ is voltage, $I_t$ is current, $Q_t$ is cumulative discharge, $\Delta t$ is time difference, and $\Delta V_t$ is voltage change rate.
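The derived features follow directly from the raw columns; the NumPy sketch below assumes voltage and current arrays plus per-sample timestamps in seconds (at the dataset's 1 Hz rate, $\Delta t$ is simply 1 s).

```python
import numpy as np

def derived_features(voltage, current, time_s):
    """Instantaneous power P_t, cumulative discharge Q_t (Ah), voltage change rate ΔV_t."""
    dt = np.diff(time_s, prepend=time_s[0])          # Δt; zero for the very first sample
    if len(dt) > 1:
        dt[0] = dt[1]                                # avoid dividing by a zero Δt at t=0
    power = voltage * current                        # P_t = V_t * I_t
    cum_discharge = np.cumsum(current * dt) / 3600.0 # Q_t in Ah (A·s -> Ah)
    dv_rate = np.diff(voltage, prepend=voltage[0]) / dt  # ΔV_t = (V_t - V_{t-1}) / Δt
    return power, cum_discharge, dv_rate

# Example at the dataset's 1 Hz sampling rate (synthetic values)
t = np.arange(0.0, 10.0, 1.0)
v = np.linspace(4.2, 4.1, t.size)
i = np.full(t.size, 2.0)
p, q, dv = derived_features(v, i, t)
```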

Table 1 summarizes the key statistics of the dataset after preprocessing for the 25°C condition, which is used in this study to simulate typical driving temperatures.

Table 1: Summary of Preprocessed Dataset Features at 25°C

| Feature | Min Value | Max Value | Mean | Standard Deviation |
|---|---|---|---|---|
| Voltage (V) | 2.5 | 4.2 | 3.65 | 0.32 |
| Current (A) | -5.0 | 5.0 | 0.1 | 1.5 |
| Temperature (°C) | 24.5 | 25.5 | 25.0 | 0.2 |
| Instantaneous Power (W) | -21.0 | 21.0 | 0.5 | 4.8 |
| Cumulative Discharge (Ah) | 0 | 2.0 | 1.2 | 0.6 |
| Voltage Change Rate (V/s) | -0.1 | 0.1 | 0.001 | 0.02 |

Experimental Analysis

To comprehensively validate the effectiveness of the proposed method, I conducted experiments to evaluate the performance of the LSTM-Attention model in SOC prediction for energy storage lithium batteries. The 25°C condition was selected to approximate typical driving temperatures, with the first 70% of data from the US06 and FUDS profiles used for training and the remaining 30% for testing.

Evaluation Metrics

I used Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) as evaluation metrics to assess the performance of the LSTM-Attention model and comparison models. The formulas are as follows:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

Where $n$ is the number of samples, $y_i$ is the true value for the $i$-th sample, and $\hat{y}_i$ is the predicted value for the $i$-th sample.
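For completeness, a small NumPy helper implementing both metrics; the example values are arbitrary and only illustrate the units (SOC percentages).

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([80.0, 60.0, 40.0, 20.0])   # true SOC (%)
y_pred = np.array([81.5, 58.8, 41.0, 19.2])   # predicted SOC (%)
print(f"MAE = {mae(y_true, y_pred):.2f}%, RMSE = {rmse(y_true, y_pred):.2f}%")
```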

Comparison Models

This study compares the LSTM-Attention model with three baseline models: LSTM, Gated Recurrent Unit (GRU), and GRU-Attention. Detailed descriptions are provided below.

GRU: This model is an improved version of RNN, addressing gradient vanishing issues through reset and update gates. Compared to LSTM, GRU has a simpler structure and fewer parameters but effectively captures long-range dependencies. Its core advantage lies in balancing model complexity and performance, with higher training efficiency.

GRU-Attention: This model integrates the attention mechanism with GRU, dynamically allocating weights to emphasize key information in sequences. After the GRU layer processes temporal dependencies, the attention layer computes weights for each time step, generating a weighted context vector. This combination enhances sensitivity to important features, particularly outperforming pure GRU in long-sequence tasks, albeit with slightly increased computational overhead.

Table 2 outlines the key parameters and characteristics of all models used in the experiments.

Table 2: Model Parameters and Characteristics

| Model | Number of Layers | Hidden Units | Parameter Count | Key Features |
|---|---|---|---|---|
| LSTM | 2 | 64 | ~35,000 | Gating mechanisms for long-term dependencies |
| GRU | 2 | 64 | ~25,000 | Simpler than LSTM; reset and update gates |
| GRU-Attention | 2 (GRU) + 1 (Attention) | 64 | ~30,000 | Dynamic weighting with GRU backbone |
| LSTM-Attention | 2 (LSTM) + 1 (Attention) | 64 | ~40,000 | Combines LSTM and attention for feature focus |

Prediction Results

Extensive comparative experiments were conducted to reflect the accuracy of various methods and demonstrate the superior performance of the LSTM-Attention model for energy storage lithium batteries. The prediction results and error distributions under the US06 and FUDS profiles show that the LSTM-Attention model achieves high prediction accuracy with small, evenly distributed errors, exhibiting strong robustness and validating the effectiveness of combining LSTM with the attention mechanism.

Tables 3 and 4 present the prediction errors of LSTM-Attention and the three comparison models under the US06 and FUDS profiles, respectively.

Table 3: Prediction Errors under the US06 Profile

| Model | MAE (%) | RMSE (%) |
|---|---|---|
| LSTM | 2.02 | 2.47 |
| GRU | 3.73 | 5.30 |
| GRU-Attention | 6.27 | 6.43 |
| LSTM-Attention | 1.44 | 2.12 |

Table 4: Prediction Errors under the FUDS Profile

| Model | MAE (%) | RMSE (%) |
|---|---|---|
| LSTM | 1.54 | 2.33 |
| GRU | 1.91 | 2.48 |
| GRU-Attention | 4.28 | 4.74 |
| LSTM-Attention | 1.28 | 1.67 |

From Tables 3 and 4, it is evident that the LSTM-Attention model achieves the best results under both US06 and FUDS testing conditions, with the lowest error rates compared to other models. It demonstrates superior prediction performance and robustness, making it highly suitable for SOC estimation in energy storage lithium batteries under dynamic operating scenarios.

To further illustrate the model’s performance, I analyze the computational efficiency and training time. The LSTM-Attention model, while having more parameters, converges faster due to the attention mechanism’s ability to focus on relevant features. The training time for each model on the same hardware setup is summarized in Table 5.

Table 5: Training Time and Efficiency Comparison

| Model | Training Time (minutes) | Convergence Epochs | Inference Time per Sample (ms) |
|---|---|---|---|
| LSTM | 45 | 100 | 1.2 |
| GRU | 35 | 90 | 1.0 |
| GRU-Attention | 50 | 110 | 1.5 |
| LSTM-Attention | 55 | 80 | 1.8 |

Conclusion

Accurate prediction of the State of Charge for energy storage lithium batteries is essential for ensuring battery safety, improving range estimation precision, and extending battery lifespan in new energy vehicles. Addressing the limitations of traditional physical models and the insufficient sensitivity of single deep learning models to dynamic changes in key features under complex conditions, I proposed an LSTM-Attention hybrid model. This model deeply integrates the temporal dependency modeling capabilities of LSTM with the feature dynamic weighting advantages of the self-attention mechanism.

Based on the CALCE open dataset, validation was conducted under US06 and FUDS typical conditions, with comparisons to three models. Experimental results indicate that the proposed LSTM-Attention model significantly outperforms comparison models such as LSTM, GRU, and GRU-Attention in prediction accuracy under both conditions, with the lowest MAE and RMSE values and more balanced error distributions, showcasing higher prediction precision and stability for energy storage lithium batteries.

Future research could further explore the model’s generalization capability and adaptability at other temperatures, as well as the impact of battery aging states on prediction accuracy and methods for integrating aging factors. Additionally, extending the model to other types of energy storage systems could enhance its applicability in diverse scenarios.
