In the pursuit of advancing the accuracy and engineering adaptability of State of Health (SOH) estimation for energy storage lithium batteries, we have developed a novel approach that leverages fragmented operational data from real-world scenarios. The proliferation of energy storage systems, particularly lithium-ion batteries, is critical for achieving carbon neutrality and constructing modern power grids. However, the performance degradation of these energy storage lithium batteries over time poses significant challenges, including reduced efficiency and increased safety risks. Traditional SOH estimation methods, such as model-based techniques including electrochemical and equivalent circuit models, often struggle with robustness and parameter identification errors. Data-driven approaches, while promising, frequently rely on laboratory data that may not translate well to field conditions. To address these limitations, we analyzed operational data from a photovoltaic energy storage power station and designed simulated condition experiments to identify key characteristic parameters for SOH evaluation. By focusing on charging voltage differences within specific voltage ranges and employing an improved neural network algorithm, we have created a model that demonstrates high accuracy and practical applicability. This article details our methodology, results, and the application of this model in real-world settings, emphasizing the importance of fragmented data in enhancing the reliability of energy storage lithium battery management.
The importance of accurately estimating the SOH of energy storage lithium batteries cannot be overstated, as it directly impacts the longevity and safety of battery energy storage systems. SOH is typically defined as the ratio of the current maximum usable capacity to the rated capacity, expressed as a percentage: $$ \text{SOH} = \frac{Q_C}{Q_{\text{new}}} \times 100\% $$ where \( Q_C \) is the current capacity and \( Q_{\text{new}} \) is the nominal capacity. In practical applications, especially in energy storage systems, batteries undergo partial charge-discharge cycles, leading to fragmented data that complicates SOH assessment. Our analysis of a 50 MW/100 MWh energy storage power station revealed that batteries often operate within a State of Charge (SOC) range of 20% to 100%, with charging currents stabilizing around 0.5 C near the end of charge cycles. This observation guided our selection of characteristic parameters derived from charging voltage segments, specifically the voltage difference in the 30 minutes before reaching 3.41 V, with time intervals of 1 minute, 3 minutes, and 5 minutes. These parameters were chosen based on the aging mechanisms of lithium iron phosphate batteries, where differential voltage (dV/dQ) analysis shows peak shifts indicative of active lithium and material loss. For instance, the relationship between voltage change and capacity can be approximated as: $$ \frac{dV}{dQ} \approx \frac{\Delta V}{\Delta Q} $$ highlighting the correlation between voltage differences and battery degradation.
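As a concrete illustration of the finite-difference approximation above, the snippet below computes dV/dQ from a minute-resolution charging record. The voltage and cumulative-charge values are synthetic stand-ins chosen for illustration, not data from the station described here:

```python
import numpy as np

# Hypothetical charging record sampled once per minute near end of charge:
# voltage (V) and cumulative charge (Ah). Values are illustrative only.
voltage = np.array([3.35, 3.37, 3.38, 3.40, 3.41])
charge = np.array([15.0, 15.2, 15.4, 15.6, 15.8])

# Finite-difference approximation dV/dQ ~= delta_V / delta_Q between samples
dv_dq = np.diff(voltage) / np.diff(charge)
```

Peaks and shifts in such a dV/dQ series are what the differential voltage analysis tracks as the battery ages.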
To build our dataset, we conducted cycling tests on 20 Ah lithium iron phosphate batteries under simulated conditions that mirrored the operational profile of the energy storage power station. The tests were performed in a controlled environment at 25°C ± 2°C, using a battery tester to apply charge-discharge cycles between 2.5 V and 3.65 V, with a constant current of 0.5 C. The SOC range was maintained at 20% to 100% to replicate real-world usage, and capacity calibration was performed every 100 cycles to track SOH degradation. Three batteries, labeled LFP20-1#, LFP20-2#, and LFP20-3#, were used, with initial capacities of approximately 19.3 Ah. After 4,100 cycles, the capacities degraded to around 17.7 Ah, representing a capacity retention of about 92%. The charging data, particularly the voltage profiles, were recorded at high resolution to extract the voltage differences for the specified time intervals. This dataset formed the basis for our modeling efforts, with a total of over 12,000 data points collected across the three batteries.

The characteristic parameters for SOH estimation were derived from the charging voltage data in the segment preceding 3.41 V. We computed the voltage differences for 1-minute, 3-minute, and 5-minute intervals, resulting in multidimensional feature vectors. For example, the 1-minute interval produced 29 voltage difference values per cycle, while the 3-minute and 5-minute intervals yielded 10 and 6 values, respectively. These features were normalized using min-max scaling to ensure consistent model training: $$ X' = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} $$ where \( X \) is the original data, and \( X_{\text{min}} \) and \( X_{\text{max}} \) are the minimum and maximum values. The normalized dataset was split into training and testing sets in an 8:2 ratio, with an additional validation set of 1,200 cycles from a separate battery of the same type to evaluate model generalization. The target variable, SOH, was calculated using the capacity measurements from calibration tests, with the equivalent capacity for the 20% to 100% SOC range adjusted as: $$ Q_d = 0.8 \times Q_s $$ where \( Q_s \) is the measured capacity in the simulated cycle.
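The feature-extraction and scaling steps above can be sketched as follows. The voltage series is synthetic, and the exact number of 1-minute differences depends on the endpoint convention for the 30-minute window (the counts reported here are 29, 10, and 6; a 31-sample window gives 30, 10, and 6), so the sketch illustrates the mechanics rather than reproducing the dataset:

```python
import numpy as np

def voltage_differences(v, step):
    """Differences between samples `step` minutes apart, over the window
    of per-minute voltage samples ending at the 3.41 V threshold."""
    return v[step::step] - v[:-step:step]

def min_max_scale(x, x_min, x_max):
    # X' = (X - X_min) / (X_max - X_min)
    return (x - x_min) / (x_max - x_min)

# 31 synthetic samples spanning the 30 minutes before 3.41 V; a square-root
# ramp mimics the flattening of the charge curve (illustrative only).
v = 3.30 + 0.11 * np.sqrt(np.linspace(0.0, 1.0, 31))

f1 = voltage_differences(v, 1)  # 1-minute interval features
f3 = voltage_differences(v, 3)  # 3-minute interval features
f5 = voltage_differences(v, 5)  # 5-minute interval features

# Normalize the 1-minute features with min-max scaling
scaled = min_max_scale(f1, f1.min(), f1.max())

# Equivalent capacity for the 20%-100% SOC window: Q_d = 0.8 * Q_s
q_s = 19.3          # measured capacity in a simulated cycle (Ah)
q_d = 0.8 * q_s
```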
We employed a Genetic Algorithm-Improved Back Propagation (GA-BP) neural network to model the relationship between the voltage difference features and SOH. The BP neural network is adept at handling nonlinear mappings but is prone to local optima and slow convergence. The genetic algorithm was integrated to optimize the initial weights and thresholds, enhancing global search capability and accelerating training. The GA process involves encoding, selection, crossover, and mutation operations to evolve a population of solutions. For instance, chromosomes representing weight sets undergo crossover and mutation to generate new offspring, with fitness evaluation based on prediction error. The BP network architecture consisted of an input layer with nodes corresponding to the number of voltage difference features (e.g., 29 for 1-minute intervals), three hidden layers with 10 nodes each, and an output layer with a single node for SOH estimation. The activation function used was sigmoid, and the training involved forward propagation of inputs and backward propagation of errors to adjust weights and thresholds. The combined GA-BP approach reduced the risk of overfitting and improved model accuracy, as demonstrated by the low errors in our results.
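The GA stage described above can be sketched in miniature. This is a toy illustration under simplifying assumptions: the network is small (one sigmoid hidden layer rather than three), the data are synthetic, and the GA evolves a flat weight vector against a negative-MSE fitness with elitist selection, blend crossover, and Gaussian mutation. In the full GA-BP scheme, the best chromosome would seed standard backpropagation training rather than serve as the final model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression stand-in: a scalar target from 4-dim features
X = rng.uniform(-1, 1, size=(64, 4))
y = np.tanh(X @ np.array([0.5, -0.3, 0.8, 0.1]))

def unpack(w):
    # Decode a flat chromosome into 4->5 (sigmoid) -> 1 network parameters
    W1 = w[:20].reshape(4, 5); b1 = w[20:25]
    W2 = w[25:30].reshape(5, 1); b2 = w[30]
    return W1, b1, W2, b2

def predict(w, X):
    W1, b1, W2, b2 = unpack(w)
    h = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))  # sigmoid hidden layer
    return (h @ W2).ravel() + b2

def fitness(w):
    return -np.mean((predict(w, X) - y) ** 2)  # negative MSE (maximized)

pop = rng.normal(0.0, 0.5, size=(40, 31))     # initial population
start_mse = -max(fitness(w) for w in pop)

for gen in range(60):
    order = np.argsort([fitness(w) for w in pop])[::-1]
    elite = pop[order[:10]]                   # elitist selection
    children = []
    while len(children) < 30:
        pa, pb = elite[rng.integers(10)], elite[rng.integers(10)]
        alpha = rng.uniform(0.0, 1.0, size=31)
        child = alpha * pa + (1 - alpha) * pb  # blend crossover
        child += rng.normal(0.0, 0.05, size=31)  # Gaussian mutation
        children.append(child)
    pop = np.vstack([elite, children])

best = max(pop, key=fitness)  # would seed BP training in the full scheme
```

Because the top chromosomes are carried over unchanged each generation, the best fitness is monotonically non-decreasing, which is the property that makes the GA a safe initializer for BP.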
The performance of the models was evaluated using Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE), defined as: $$ E_{\text{MAP}} = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{Y(n) - S(n)}{S(n)} \right| \times 100\% $$ and $$ E_{\text{RMS}} = \sqrt{ \frac{1}{N} \sum_{n=1}^{N} (Y(n) - S(n))^2 } $$ where \( Y(n) \) is the predicted SOH, \( S(n) \) is the actual SOH, and \( N \) is the number of samples. The validation with 1,200 unseen cycles showed that the model using 1-minute interval voltage differences achieved the highest accuracy, with a MAPE of 0.37% and RMSE of 0.4565. Comparative results for the different time intervals are summarized in Table 1, highlighting the trade-off between feature granularity and estimation precision. The superior performance of the 1-minute interval model can be attributed to its ability to capture finer details in the voltage profile, which are lost with longer intervals. This underscores the value of high-resolution data in SOH estimation for energy storage lithium batteries.
| Model | Feature Parameter | MAPE (%) | RMSE |
|---|---|---|---|
| M-1 | 1-minute interval voltage difference | 0.37 | 0.4565 |
| M-2 | 3-minute interval voltage difference | 0.40 | 0.5095 |
| M-3 | 5-minute interval voltage difference | 0.44 | 0.5224 |
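The two error metrics defined above are straightforward to implement. The values below are illustrative placeholders, not the paper's validation results:

```python
import numpy as np

def mape(y_pred, y_true):
    # E_MAP = (1/N) * sum |(Y(n) - S(n)) / S(n)| * 100%
    return np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0

def rmse(y_pred, y_true):
    # E_RMS = sqrt((1/N) * sum (Y(n) - S(n))^2)
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

s = np.array([95.0, 94.5, 94.0, 93.5])   # actual SOH (%), illustrative
yp = np.array([95.2, 94.3, 94.1, 93.1])  # predicted SOH (%), illustrative
```

Note that MAPE normalizes each error by the true SOH while RMSE keeps the error in SOH percentage points, which is why both are reported together in Table 1.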
To enhance the model’s generalizability, we applied transfer learning to adapt it for 260 Ah lithium iron phosphate batteries, which are commonly used in energy storage power stations. Initially, the model trained on 20 Ah batteries exhibited a maximum error of 5.52% when applied directly to the larger batteries, due to differences in battery characteristics and operating conditions. We fixed the first two hidden layers of the pre-trained GA-BP model and fine-tuned the last hidden layer and output layer using 60 additional data points from 260 Ah batteries. This approach leveraged the pre-learned features while adapting to the new data, reducing the maximum error to 1.89%. The effectiveness of model transfer is evident in the improved SOH estimates, as shown in Table 2, which compares the errors before and after transfer. This process demonstrates the practicality of using limited data from field deployments to refine models for specific energy storage lithium battery types, facilitating broader application.
| Condition | Maximum Error (%) | Average MAPE (%) |
|---|---|---|
| Before Transfer | 5.52 | 2.85 |
| After Transfer | 1.89 | 0.65 |
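The layer-freezing idea behind the transfer step can be sketched as follows. This is a minimal numpy illustration with random stand-in weights and synthetic fine-tuning data (the actual pre-trained GA-BP model and the 60 points from 260 Ah batteries are not reproduced): gradient updates touch only the last hidden layer and the output layer, while the first two hidden layers stay frozen:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-in for the pre-trained network: 6 inputs, three hidden layers of
# 10 nodes, one SOH output (weights are random placeholders).
shapes = [(6, 10), (10, 10), (10, 10), (10, 1)]
W = [rng.normal(size=s) for s in shapes]
b = [np.zeros(s[1]) for s in shapes]

def forward(x):
    acts = [x]
    for Wi, bi in zip(W, b):
        acts.append(sigmoid(acts[-1] @ Wi + bi))
    return acts

# Synthetic stand-in for the 60 fine-tuning points from 260 Ah cells
Xf = rng.uniform(0.0, 1.0, size=(60, 6))
yf = rng.uniform(0.9, 1.0, size=(60, 1))   # SOH targets as fractions

def mse():
    return float(np.mean((forward(Xf)[-1] - yf) ** 2))

before = mse()
lr = 0.5
for _ in range(300):
    acts = forward(Xf)
    out = acts[-1]
    # Backpropagate through the output and LAST hidden layer only;
    # W[0], W[1] (first two hidden layers) remain frozen.
    d_out = (out - yf) * out * (1 - out)
    d_h = (d_out @ W[3].T) * acts[-2] * (1 - acts[-2])
    W[3] -= lr * acts[-2].T @ d_out / len(Xf)
    b[3] -= lr * d_out.mean(axis=0)
    W[2] -= lr * acts[-3].T @ d_h / len(Xf)
    b[2] -= lr * d_h.mean(axis=0)
after = mse()
```

Freezing the early layers preserves the general voltage-to-SOH features learned on the 20 Ah cells, so the small 260 Ah dataset only has to correct the mapping near the output.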
We further validated the model’s engineering adaptability by applying it to a cluster of energy storage lithium batteries in the photovoltaic power station. Using the 1-minute interval voltage difference features extracted from the station’s operational data, we estimated the SOH for 224 battery cells. The results, depicted in Figure 1, show that the SOH values are distributed around 95%, indicating consistent performance across the cluster. This batch estimation capability is crucial for large-scale energy storage systems, where manual testing is impractical. The model’s ability to handle fragmented data from real-world operations, combined with its high accuracy, makes it a valuable tool for proactive maintenance and safety management of energy storage lithium batteries. Moreover, the use of genetic algorithm optimization in the neural network training ensured robust performance, even with noisy field data.
The aging mechanism of energy storage lithium batteries, particularly lithium iron phosphate types, involves complex electrochemical processes that affect voltage characteristics. Our differential voltage analysis revealed that peaks in the dV/dQ curve, such as P1 and P2, shift with degradation, indicating loss of active materials and lithium inventory. The voltage difference in the 3.41 V region captures these changes effectively, as it corresponds to critical phase transitions in the electrode materials. The relationship between voltage difference and capacity loss can be modeled using empirical equations, such as: $$ \Delta V = k \cdot \Delta Q + c $$ where \( k \) and \( c \) are constants derived from regression analysis. This linear approximation, while simplified, aligns with the observed trends in our data and supports the use of voltage-based features for SOH estimation. Additionally, the integration of genetic algorithms into the neural network framework addresses the nonlinearities in battery aging, enabling more accurate predictions over the lifespan of energy storage lithium batteries.
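The constants \( k \) and \( c \) in the empirical relation above can be recovered by ordinary least squares. The sketch below fits synthetic \((\Delta Q, \Delta V)\) pairs generated from assumed values of \( k \) and \( c \) plus noise; these numbers are illustrative, not regression results from the study:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic pairs following delta_V = k * delta_Q + c with assumed
# k = 0.05 V/Ah and c = 0.002 V, plus small measurement noise.
dq = np.linspace(0.5, 2.0, 20)                     # capacity deltas (Ah)
dv = 0.05 * dq + 0.002 + rng.normal(0, 1e-4, 20)   # voltage deltas (V)

k, c = np.polyfit(dq, dv, 1)  # least-squares slope and intercept
```

In practice the regression would be run per battery type, since the slope reflects how steeply the charge curve shifts as active material and lithium inventory are lost.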
In conclusion, our study presents a robust framework for SOH estimation of energy storage lithium batteries using fragmented charging data. The key findings include the superiority of 1-minute interval voltage differences as feature parameters, with a MAPE of 0.37% and RMSE of 0.4565, and the successful application of model transfer to reduce errors in larger batteries. The practical implementation in a photovoltaic energy storage power station demonstrates the model’s engineering viability, offering a scalable solution for monitoring battery health in real-time. Future work could explore the integration of additional features, such as temperature and current variations, to further enhance accuracy. Overall, this approach underscores the potential of data-driven methods combined with advanced optimization techniques to improve the management and longevity of energy storage lithium batteries, contributing to the reliability and safety of modern energy systems.
