Abstract
Accurate estimation of the State of Health (SOH) of energy storage batteries is critical to the operational efficiency and safety of battery energy storage systems (BESS). This study proposes a data-driven framework that combines fragmented operational data with a Genetic Algorithm-optimized Back Propagation (GA-BP) neural network to estimate the SOH of lithium iron phosphate (LiFePO₄) energy storage batteries. By analyzing real-world operational data from a 50 MW/100 MWh photovoltaic energy storage power station, voltage differences within the 30-minute charging phase preceding 3.41 V (at 1-, 3-, and 5-minute intervals) were identified as key feature parameters. A GA-BP neural network model was trained using laboratory data from 20 Ah LiFePO₄ cells and validated on 1,200 independent test cycles. The model achieved a mean absolute percentage error (MAPE) of 0.37% and a root mean square error (RMSE) of 0.4565 when using 1-minute interval voltage differences. Model migration further reduced the maximum SOH estimation error for 260 Ah batteries from 5.52% to 1.89%. The framework demonstrated robust engineering adaptability in batch SOH estimation for battery clusters in photovoltaic stations.

1. Introduction
Energy storage batteries, particularly lithium-ion variants, are pivotal in achieving carbon neutrality and stabilizing modern power grids. However, performance degradation and inconsistency among cells during operation pose significant challenges to system longevity and safety. Traditional SOH estimation methods, such as electrochemical models and equivalent circuit models, suffer from poor robustness or parameter identification errors. Data-driven approaches, including Long Short-Term Memory (LSTM) and Support Vector Regression (SVR), often lack adaptability to real-world operating conditions.
This study addresses these limitations by leveraging fragmented operational data from energy storage batteries. A hybrid GA-BP neural network is developed to map voltage differences during charging to SOH, with model migration extending the trained model to batteries of a different capacity.
2. Methodology
2.1 Feature Parameter Selection
The aging mechanism of LiFePO₄ energy storage batteries was analyzed using differential voltage (dV/dQ) curves. Two characteristic peaks (P1 and P2) were observed during charging, with P2 shifting leftward as SOH declines (Figure 1). The 30-minute charging phase before the voltage reaches 3.41 V was identified as the optimal feature window because the current is stable in this phase and the voltage response is sensitive to capacity fade. Voltage differences at 1-, 3-, and 5-minute intervals within this window were extracted as feature parameters (Table 1 shows the 1-minute case).
Table 1: Voltage Difference Data for 1-Minute Intervals
| Cycle | ΔV₁ (mV) | ΔV₂ (mV) | … | ΔVₙ (mV) |
|---|---|---|---|---|
| 1 | 0.2 | 0.2 | … | 0.9 |
| 2 | 0.8 | 0.1 | … | 1.2 |
| … | … | … | … | … |
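The feature extraction described in Section 2.1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `voltage_difference_features`, the assumption of uniformly sampled voltages (one sample every `dt_s` seconds), and the choice of successive differences between window samples are all hypothetical, since the source does not specify these details.

```python
def voltage_difference_features(voltages, dt_s=60, v_end=3.41,
                                window_min=30, step_min=1):
    """Return successive voltage differences (mV) inside the window that ends
    when the charging voltage first reaches v_end.

    Assumptions (not from the source): `voltages` is one charging segment in
    volts, sampled uniformly every `dt_s` seconds.
    """
    # Index of the first sample at or above the end-of-window voltage.
    end = next((i for i, v in enumerate(voltages) if v >= v_end), None)
    if end is None:
        raise ValueError("charging segment never reaches v_end")

    n_window = window_min * 60 // dt_s      # samples spanned by the window
    if end < n_window:
        raise ValueError("less than window_min minutes of data before v_end")

    stride = step_min * 60 // dt_s          # samples per feature interval
    window = voltages[end - n_window:end + 1:stride]
    # Difference between consecutive window samples, converted from V to mV.
    return [round((b - a) * 1000.0, 3) for a, b in zip(window, window[1:])]
```

With 1-minute sampling, a 30-minute window gives 30 features at 1-minute intervals, 10 at 3-minute intervals, and 6 at 5-minute intervals.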
2.2 Data Preprocessing
SOH is defined as:
$$\mathrm{SOH} = \frac{Q_c}{Q_{new}} \times 100\%$$
where $Q_c$ is the current maximum available capacity and $Q_{new}$ is the rated capacity. Data normalization was applied to mitigate dimensional disparities:
$$X' = \frac{X - X_{min}}{X_{max} - X_{min}}$$
The dataset was split into training (80%) and testing (20%) sets.
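The preprocessing steps above can be sketched in a few lines. The function names and the seeded random shuffle before the 80/20 split are assumptions made for illustration; the source does not say whether the split was random or chronological.

```python
import random

def soh_percent(q_current, q_new):
    """SOH = Q_c / Q_new * 100 %, with both capacities in the same unit (e.g. Ah)."""
    return q_current / q_new * 100.0

def min_max_normalize(column):
    """Scale one feature column to [0, 1] using X' = (X - X_min) / (X_max - X_min)."""
    lo, hi = min(column), max(column)
    if hi == lo:                       # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it 80/20 (shuffling is an assumption)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

For example, a 20 Ah cell that currently delivers 18 Ah has `soh_percent(18.0, 20.0)` of about 90. In practice it is common to compute the minimum and maximum on the training set only and reuse them for the test set, so that test data does not leak into the scaling.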
2.3 GA-BP Neural Network Architecture
The BP neural network comprised:
- Input layer: Nodes corresponding to voltage differences.
- Hidden layers: Three layers with 10 nodes each.
- Output layer: Single node for SOH estimation.
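A Genetic Algorithm (GA) optimized the initial weights and thresholds of the BP network to reduce the risk of training converging to a poor local minimum (Figure 2). The fitness function was based on the prediction error during training, so fitter chromosomes correspond to lower error.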
Figure 2: GA-BP Neural Network Training Flow
- Encode initial weights/thresholds as chromosomes.
- Apply selection, crossover, and mutation to the chromosomes.
- Update BP network parameters iteratively.
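The training flow above can be illustrated with a small pure-Python sketch. It is a simplified stand-in, not the authors' network: it uses a single hidden layer of 4 tanh nodes instead of three layers of 10, and the GA settings (truncation selection, single-point crossover, Gaussian mutation, population size, mutation rate) are assumptions chosen only to keep the example short.

```python
import math
import random

N_IN, N_HID = 3, 4   # toy sizes (assumption); the paper uses three hidden layers of 10 nodes
N_PARAMS = N_HID * N_IN + 2 * N_HID + 1   # W1, b1, W2, b2 flattened into one chromosome

def unpack(p):
    """Split a flat chromosome into hidden weights/biases and output weights/bias."""
    o = N_HID * N_IN
    w1 = [p[j * N_IN:(j + 1) * N_IN] for j in range(N_HID)]
    return w1, p[o:o + N_HID], p[o + N_HID:o + 2 * N_HID], p[-1]

def forward(p, x):
    """Return the network output and the hidden activations for one sample."""
    w1, b1, w2, b2 = unpack(p)
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return sum(w * hj for w, hj in zip(w2, h)) + b2, h

def mse(p, X, y):
    return sum((forward(p, x)[0] - t) ** 2 for x, t in zip(X, y)) / len(X)

def ga_search(X, y, rng, pop_size=30, generations=40, mut_rate=0.1):
    """Evolve initial weights/thresholds; fitness is the training MSE (lower is fitter)."""
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(N_PARAMS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: mse(p, X, y))
        parents = pop[:pop_size // 2]               # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_PARAMS)        # single-point crossover
            child = a[:cut] + b[cut:]
            children.append([g + rng.gauss(0.0, 0.2) if rng.random() < mut_rate else g
                             for g in child])       # Gaussian mutation
        pop = parents + children
    return min(pop, key=lambda p: mse(p, X, y))

def bp_train(p, X, y, lr=0.05, epochs=300):
    """Full-batch gradient descent (back propagation) starting from the GA seed."""
    best, best_loss = list(p), mse(p, X, y)
    p = list(p)
    o = N_HID * N_IN
    for _ in range(epochs):
        grad = [0.0] * N_PARAMS
        w2 = unpack(p)[2]
        for x, t in zip(X, y):
            out, h = forward(p, x)
            err = 2.0 * (out - t) / len(X)          # d(MSE)/d(output)
            for j in range(N_HID):
                dh = err * w2[j] * (1.0 - h[j] ** 2)    # back through tanh
                for i in range(N_IN):
                    grad[j * N_IN + i] += dh * x[i]
                grad[o + j] += dh
                grad[o + N_HID + j] += err * h[j]
            grad[-1] += err
        p = [w - lr * g for w, g in zip(p, grad)]
        loss = mse(p, X, y)
        if loss < best_loss:                         # keep the best parameters seen
            best, best_loss = list(p), loss
    return best
```

A typical call would be `seed = ga_search(X, y, random.Random(0))` followed by `params = bp_train(seed, X, y)`, where `X` holds normalized voltage-difference features and `y` the SOH targets. The GA only supplies the starting point; the gradient-based BP step does the fine fitting.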
3. Results and Discussion
3.1 Model Performance
The GA-BP model was validated on 1,200 test cycles. Voltage differences at 1-minute intervals yielded the highest accuracy:
Table 2: Model Accuracy Comparison
| Feature Interval | MAPE (%) | RMSE |
|---|---|---|
| 1-minute | 0.37 | 0.4565 |
| 3-minute | 0.40 | 0.5095 |
| 5-minute | 0.44 | 0.5224 |
Shorter intervals preserved temporal resolution, reducing information loss during nonlinear aging phases.
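For reference, the two accuracy metrics reported in Table 2 can be computed as below. This is a generic sketch of the standard definitions; the function names are illustrative, and the source does not state the unit in which RMSE is reported.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the inputs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

Because RMSE squares each residual, it penalizes occasional large errors more heavily than MAPE does, which is why the two metrics are usually reported together.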
3.2 Model Migration
Direct application of the 20 Ah-trained model to 260 Ah energy storage batteries resulted in a maximum error of 5.52%. By freezing the first two hidden layers and retraining the final layer with 60 migration samples, the maximum error dropped to 1.89% (Figure 3).
Table 3: Error Reduction via Model Migration
| Battery Capacity | Max Error (Original) | Max Error (Migrated) |
|---|---|---|
| 260 Ah | 5.52% | 1.89% |
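The migration idea (freeze the early layers, retrain only the last one on a small set of target-domain samples) can be sketched as follows. This is a simplified illustration under stated assumptions: the frozen layers are collapsed into a single fixed tanh layer, only a linear output layer is retrained, and all function names and hyperparameters are hypothetical rather than taken from the source.

```python
import math

def hidden_features(x, frozen_w, frozen_b):
    """Frozen hidden layer: tanh activations from weights that are never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(frozen_w, frozen_b)]

def retrain_output_layer(H, y, w_out, b_out, lr=0.1, epochs=2000):
    """Gradient descent on the output layer only, given precomputed hidden features H."""
    w, b = list(w_out), b_out
    n = len(H)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for h, t in zip(H, y):
            # Gradient of the mean squared error with respect to the output.
            err = 2.0 * (sum(wj * hj for wj, hj in zip(w, h)) + b - t) / n
            gw = [g + err * hj for g, hj in zip(gw, h)]
            gb += err
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def max_abs_error(H, y, w, b):
    """Largest absolute estimation error over a set of samples."""
    return max(abs(sum(wj * hj for wj, hj in zip(w, h)) + b - t) for h, t in zip(H, y))
```

In this setup, `H` would be the hidden features of the migration samples from the 260 Ah batteries, computed once with the frozen weights learned on 20 Ah data. Retraining only the last layer keeps the number of free parameters small, which is what makes a set of about 60 samples sufficient in principle.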
3.3 Engineering Application
The model estimated SOH for 224 cells in a photovoltaic station cluster, with results clustered around 95% (Figure 4). This validated its practicality for large-scale energy storage battery monitoring.
4. Conclusion
This study presents a framework for SOH estimation of energy storage batteries that is driven by fragmented operational data. Key contributions include:
- Feature Selection: Voltage differences at 1-minute intervals within the 30-minute charging phase preceding 3.41 V gave the most accurate SOH model.
- Algorithm Enhancement: The GA-optimized BP neural network reached a MAPE of 0.37% with these features.
- Scalability: Model migration reduced the maximum estimation error for 260 Ah batteries from 5.52% to 1.89%, a relative reduction of 65.8%.
- Industrial Relevance: Successful deployment in a 50 MW/100 MWh station confirmed engineering adaptability.
Future work will integrate temperature and current dynamics to further refine SOH estimation for energy storage batteries.
