Real-Time SOC Calibration Strategy for Energy Storage Cells Based on Signal Processing and Machine Learning

In the management of energy storage systems, accurately estimating the State of Charge (SOC) of energy storage cells is critical for ensuring operational safety, optimizing performance, and extending battery lifespan. However, real-time SOC calibration often faces challenges such as abrupt jumps and inaccuracies due to noisy sensor data and dynamic operating conditions. To address these issues, I propose a novel strategy that integrates signal processing techniques with machine learning algorithms for real-time SOC calibration. This approach leverages Kalman smoothing, XGBoost modeling, wavelet transform denoising, and isotonic regression to enhance the accuracy and stability of SOC estimates. By processing voltage, temperature, and current data from energy storage cells, this strategy effectively reduces fluctuations and biases, providing reliable SOC values for battery management systems. The following sections detail the methodology, experimental validation, and results, demonstrating the efficacy of this approach in practical scenarios.

SOC calibration for energy storage cells directly impacts energy efficiency, system reliability, and user experience. Traditional methods, such as the open-circuit voltage (OCV) method and current integration, often struggle with dynamic changes and noise interference. For instance, the OCV method is simple but slow to respond, while current integration requires high-precision sensors and must account for charge/discharge efficiency losses. More advanced techniques like Kalman filtering and machine learning offer improvements but come with complexities in implementation and computational demands. My strategy aims to overcome these limitations by combining multiple data processing steps into a cohesive framework. Specifically, I focus on energy storage cells used in grid-scale applications, where real-time SOC estimation is essential for load balancing and peak shaving. The key innovations include applying Kalman smoothing to input features, using XGBoost for SOC prediction, denoising with wavelet transforms, and enforcing monotonicity through isotonic regression. This holistic approach is intended to keep SOC values consistent and accurate, even under varying operational conditions.

To provide context, I first review common SOC calibration strategies and their drawbacks. Table 1 summarizes the advantages and disadvantages of widely used methods, highlighting the need for a more robust solution. As shown, each strategy has trade-offs between simplicity, accuracy, and computational cost. My proposed method addresses these by leveraging the strengths of signal processing and machine learning, tailored specifically for energy storage cells. In the following sections, I describe the step-by-step process, supported by mathematical formulations and empirical results.

Table 1: Comparison of Common SOC Calibration Strategies for Energy Storage Cells

| Strategy | Advantages | Disadvantages |
| --- | --- | --- |
| Open-Circuit Voltage (OCV) Method | Simple and intuitive; requires minimal computational resources. | Slow response to dynamic changes; affected by temperature and discharge rates. |
| Current Integration | Dynamic tracking of charge/discharge states; effective during active cycles. | Requires precise current measurement; complex due to efficiency considerations. |
| Kalman Filtering | Dynamic state estimation using system models; improves accuracy with sensor data. | Needs accurate prior knowledge of models and errors; computationally intensive. |
| Machine and Deep Learning | Handles nonlinearities and complex dynamics; high precision with sufficient data. | Data-intensive training and tuning; high computational load; potential latency issues. |
| Data Fusion and Multi-Sensor Integration | Reduces single-sensor errors; enhances robustness and accuracy. | Complex algorithm design; requires sensor calibration and synchronization. |

The methodology for real-time SOC calibration of energy storage cells consists of six sequential steps, each designed to refine the data and improve estimation quality. First, I collect voltage, temperature, current, and SOC data from the battery management system at regular intervals, such as every 30 seconds. This data serves as the foundation for all subsequent processing. Second, I apply Kalman smoothing to the voltage, temperature, and current data to reduce noise and fluctuations. The Kalman algorithm, as implemented here, uses a simplified approach for one-dimensional sequences. Let $$X_k^{\text{obs}}$$ represent the observed value at time $$k$$, $$P_k^{\text{obs}}$$ the observed deviation, $$X_k^{\text{pred}}$$ the predicted value, and $$P_k^{\text{pred}}$$ the predicted deviation. The formulas are as follows:

Predicted value: $$X_k^{\text{pred}} = A \cdot X_{k-1}^{\text{true}} + B \cdot U_{k-1}$$

Predicted deviation: $$P_k^{\text{pred}} = (1 - H_{k-1}) \cdot P_{k-1}^{\text{pred}}$$

Kalman gain: $$H_k = \frac{(P_k^{\text{pred}})^2}{(P_k^{\text{pred}})^2 + (P_k^{\text{obs}})^2}$$

True value: $$X_k^{\text{true}} = H_k \cdot X_k^{\text{obs}} + (1 - H_k) \cdot X_k^{\text{pred}}$$

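Here, $$A$$ is the state transition coefficient, $$B$$ is the control input coefficient, and $$U_{k-1}$$ is the external control input; because each sequence is one-dimensional, the usual state transition and control input matrices reduce to scalars. $$H_k$$ is the Kalman gain, and $$X_k^{\text{true}}$$ is the filtered estimate that is carried into the next step. For energy storage cells, I set initial values for $$P^{\text{pred}}$$ and $$P^{\text{obs}}$$ based on empirical data, so that the smoothed sequences exhibit reduced volatility. This step is crucial for stabilizing inputs to the machine learning model.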
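The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the experiments: the function name `kalman_smooth`, the default values for `p_pred0` and `p_obs`, the choice to seed the estimate with the first measurement, and the assumption that the gain before the first update is zero are all mine. Note that, as written, the recursion has no process-noise term, so the gain shrinks as more samples arrive; a practical deployment might need to reset or floor the predicted deviation.

```python
def kalman_smooth(observations, p_pred0=1.0, p_obs=1.0, a=1.0, b=0.0, controls=None):
    """Smooth a 1-D sequence with the scalar recursion given in the text.

    p_pred0 and p_obs are placeholder defaults; the article sets them empirically.
    """
    if not observations:
        return []
    if controls is None:
        controls = [0.0] * len(observations)  # assumption: no control input

    x_true = float(observations[0])  # assumption: seed with the first sample
    p_pred = float(p_pred0)
    gain = 0.0                       # assumption: H_0 = 0 before any update
    smoothed = [x_true]

    for k in range(1, len(observations)):
        x_pred = a * x_true + b * controls[k - 1]         # predicted value
        p_pred = (1.0 - gain) * p_pred                    # predicted deviation
        gain = p_pred ** 2 / (p_pred ** 2 + p_obs ** 2)   # Kalman gain H_k
        x_true = gain * observations[k] + (1.0 - gain) * x_pred
        smoothed.append(x_true)
    return smoothed


if __name__ == "__main__":
    noisy_current = [0.0, 4.0, -3.0, 5.0, -4.0, 3.0]  # made-up values
    print(kalman_smooth(noisy_current))
```

The same function would be applied separately to the voltage, current, and temperature sequences before they are passed to the model.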

Third, I build an XGBoost model to predict SOC using the smoothed voltage, temperature, and current data. XGBoost is chosen for its efficiency and ability to handle nonlinear relationships. The model parameters include a base score of 0.5, booster set to ‘gbtree’, importance type as ‘gain’, learning rate of 0.3, max depth of 6, min child weight of 1, and 100 estimators. These default settings provide a balance between performance and computational cost. The model is trained on historical data from energy storage cells, where the input features are the smoothed values, and the target is the SOC. The training process involves splitting data into training and test sets, typically in an 80:20 ratio, to evaluate performance using metrics like mean squared error (MSE).
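As a rough sketch of this step, the listed parameters map directly onto the scikit-learn style `XGBRegressor` class from the `xgboost` package. The helper names (`chronological_split`, `mean_squared_error`, `train_soc_model`) and the feature layout are hypothetical, and the choice to split chronologically rather than at random is my assumption, since the text only gives the 80:20 ratio.

```python
# Parameters as listed in the text.
XGB_PARAMS = {
    "base_score": 0.5,
    "booster": "gbtree",
    "importance_type": "gain",
    "learning_rate": 0.3,
    "max_depth": 6,
    "min_child_weight": 1,
    "n_estimators": 100,
}


def chronological_split(rows, targets, train_fraction=0.8):
    """Split time-ordered data into train/test parts without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:], targets[:cut], targets[cut:]


def mean_squared_error(actual, predicted):
    """Plain MSE between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def train_soc_model(rows, targets):
    """Fit the regressor on smoothed features and report the held-out MSE.

    rows: list of [voltage, current, max_temp, min_temp] (assumed layout).
    targets: list of SOC values in percent.
    """
    # Imported here so the helpers above can be used without these packages.
    import numpy as np
    from xgboost import XGBRegressor

    x_train, x_test, y_train, y_test = chronological_split(rows, targets)
    model = XGBRegressor(**XGB_PARAMS)
    model.fit(np.asarray(x_train, dtype=float), np.asarray(y_train, dtype=float))
    predictions = model.predict(np.asarray(x_test, dtype=float))
    return model, mean_squared_error(y_test, predictions.tolist())
```

Calling `train_soc_model(rows, targets)` on the Kalman-smoothed training records would return the fitted model together with its test-set MSE.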

Fourth, I denoise the predicted SOC values using wavelet transform (WT). Wavelet transform decomposes the SOC sequence into different frequency components, applies thresholding to remove noise, and reconstructs the signal. I use the Daubechies 8 (db8) wavelet for decomposition. The threshold is set to 1 based on visual inspection and prior knowledge of SOC behavior. The denoising process involves:

Decomposition: The SOC signal is decomposed into approximation and detail coefficients at multiple levels.

Thresholding: Detail coefficients whose magnitude falls below the threshold are suppressed to eliminate noise. A common data-driven choice is the universal threshold $$\lambda = \sigma \sqrt{2 \log N}$$, where $$\sigma$$ is the noise standard deviation and $$N$$ is the signal length; here, however, I use a fixed threshold of 1 for simplicity.

Reconstruction: The thresholded coefficients are inverse-transformed to obtain the denoised SOC sequence.

This step reduces prediction biases, especially in phases where current and voltage fluctuations cause SOC irregularities.
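The thresholding step can be illustrated in isolation. The sketch below shows the fixed threshold of 1 next to the universal threshold formula; the function names are hypothetical, the text does not say whether hard or soft thresholding is used (both are shown), and estimating $$\sigma$$ from the median absolute value of the detail coefficients divided by 0.6745 is a common convention that I am assuming here. The decomposition and reconstruction themselves would come from a wavelet library, for example `pywt.wavedec` and `pywt.waverec` in PyWavelets with the `'db8'` wavelet.

```python
import math
import statistics


def universal_threshold(detail_coeffs):
    """lambda = sigma * sqrt(2 * ln N), with sigma estimated from the MAD."""
    n = len(detail_coeffs)
    # Assumption: robust noise estimate from the median absolute coefficient.
    sigma = statistics.median([abs(c) for c in detail_coeffs]) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


def apply_threshold(coeffs, lam=1.0, mode="hard"):
    """Suppress small coefficients; lam=1.0 matches the fixed threshold in the text."""
    if mode == "hard":
        # Keep coefficients above the threshold unchanged, zero the rest.
        return [c if abs(c) > lam else 0.0 for c in coeffs]
    if mode == "soft":
        # Shrink every coefficient toward zero by lam.
        return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    details = [0.3, -2.4, 0.8, 1.7, -0.2]  # made-up detail coefficients
    print(apply_threshold(details, lam=1.0, mode="hard"))
    print(apply_threshold(details, lam=1.0, mode="soft"))
    print(universal_threshold(details))
```

Only the detail coefficients are normally thresholded; the approximation coefficients carry the slow SOC trend and are left untouched.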

Fifth, I apply isotonic regression to the denoised SOC sequence to ensure monotonicity within each charge or discharge phase. Isotonic regression fits a monotone function to the data, minimizing the sum of squared errors while preserving order: non-decreasing during charging and, with the inequality reversed, non-increasing during discharging. For a sequence of indices $$i$$ and corresponding SOC values $$y_i$$ in a charging phase, the algorithm solves:

$$\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$ subject to $$\hat{y}_i \leq \hat{y}_j$$ for all $$i < j$$

This is implemented using dynamic programming, resulting in a smoothed SOC curve that adheres to the expected monotonic behavior of energy storage cells.
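For illustration, the same least-squares problem can be solved with the pool-adjacent-violators algorithm, a standard alternative to a dynamic-programming formulation. The sketch below is that swapped-in technique, not the implementation used in the experiments, and the name `isotonic_fit` is hypothetical.

```python
def isotonic_fit(values, increasing=True):
    """Least-squares monotone fit via pool-adjacent-violators."""
    if not increasing:
        # A non-increasing fit is a non-decreasing fit of the negated data.
        return [-v for v in isotonic_fit([-v for v in values])]

    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # every member gets the block mean
    return fitted


if __name__ == "__main__":
    charging_soc = [20.0, 21.5, 21.0, 23.0, 22.5, 25.0]  # made-up values
    print(isotonic_fit(charging_soc))
    print(isotonic_fit([80.0, 78.0, 79.0, 75.0], increasing=False))
```

Each violation of the ordering is resolved by replacing the offending neighbors with their mean, which is why small dips during charging turn into short flat stretches rather than jumps.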

Finally, the calibrated SOC value is returned for use in the battery management system. This integrated strategy enhances real-time SOC estimation by addressing data noise, model errors, and physical constraints. The entire workflow is designed for efficiency, making it suitable for practical applications where computational resources are limited.

For experimental validation, I use real-world data from an energy storage station, focusing on energy storage cells with specifications outlined in Table 2. The data spans from March 1 to March 19, 2024, with training data from March 1-18 and test data from March 19. The station comprises outdoor liquid-cooled energy storage systems, battery modules, cooling units, battery management systems, fire protection, power conversion systems, and energy management systems. Data points include total voltage, current, maximum and minimum temperatures, and SOC, recorded at regular intervals.

Table 2: Specifications of the Energy Storage Station for Energy Storage Cells

| Component | Specifications | Quantity | Unit |
| --- | --- | --- | --- |
| Outdoor Liquid-Cooled Energy Storage System | 100 kW/215 kWh, 768 V, 0.5C | 12 | units |
| Battery Module | 43 kWh, 153.6 V, 0.5C | 5 | sets |
| Liquid Cooler | Cooling capacity: 3 kW | 1 | unit |
| Battery Management System (BMS) | Compatible with batteries | 1 | set |
| Fire Protection System | Smoke and temperature sensors | 1 | set |
| Power Conversion System (PCS) | Rated power 100 kW, AC output 400 V/50 Hz, DC input 600-900 V, three-phase four-wire | 1 | set |
| Enclosure and Accessories | Customized | 1 | set |
| Energy Management System (EMS) | Customized | 1 | set |

The training data consists of 50,502 records, with examples provided in Table 3. Each record includes timestamps, total voltage, current, maximum temperature, minimum temperature, and SOC. I preprocess this data by applying Kalman smoothing to the current, voltage, and temperature sequences. For instance, the current data shows significant fluctuations, which are reduced after smoothing, as illustrated in the processed sequences. Similarly, voltage and temperature data are stabilized, ensuring that the XGBoost model receives clean inputs.

Table 3: Sample Training Data for Energy Storage Cells

| Timestamp | Total Voltage (V) | Current (A) | Max Temperature (°C) | Min Temperature (°C) | SOC (%) |
| --- | --- | --- | --- | --- | --- |
| 2024/3/1 0:01 | 797.8 | 0 | 12 | 9 | 100 |
| 2024/3/1 0:02 | 797.9 | 0 | 12 | 9 | 100 |
| 2024/3/1 0:03 | 797.9 | 0 | 12 | 9 | 100 |
| 2024/3/18 23:59 | 766.6 | 0 | 30 | 23 | 13 |

After smoothing, I train the XGBoost model on the training set. The model achieves an MSE of 0.7869 on the test set, with mean and variance values closely matching the actual data (mean: 55.0938 vs. 55.0856, variance: 32.8064 vs. 32.7915). This indicates that the model generalizes well and captures the underlying patterns of SOC for energy storage cells. Feature importance analysis reveals that current and voltage are the most influential inputs, consistent with the physics of battery behavior.

For prediction, I use data from March 19, 2024, comprising 2,878 records. The raw current and voltage data exhibit notable fluctuations, particularly during the second charging phase, which are mitigated by Kalman smoothing. The XGBoost model then predicts SOC, but these predictions still show noise due to residual variances. Wavelet transform denoising is applied, using the db8 wavelet and a threshold of 1, which smooths the SOC sequence. Finally, isotonic regression enforces monotonicity, dividing the SOC sequence into segments corresponding to charge, discharge, and idle states. The calibrated SOC values demonstrate improved consistency and accuracy compared to the original BMS-recorded SOC.
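The segmentation into charge, discharge, and idle states is not spelled out above. One plausible way to do it, sketched here purely as an assumption, is to label each sample by the sign of the smoothed current and group consecutive samples with the same label. The sign convention (positive current means charging), the 1 A deadband, and the function names are all hypothetical.

```python
def label_state(current, deadband=1.0):
    """Classify one sample; currents within the deadband count as idle."""
    if current > deadband:
        return "charge"      # assumption: positive current means charging
    if current < -deadband:
        return "discharge"
    return "idle"


def split_segments(currents, deadband=1.0):
    """Return (state, start, end) runs, with end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(currents) + 1):
        # Close the current run at the end of the data or when the state changes.
        if (i == len(currents)
                or label_state(currents[i], deadband) != label_state(currents[start], deadband)):
            segments.append((label_state(currents[start], deadband), start, i))
            start = i
    return segments


if __name__ == "__main__":
    smoothed_current = [0.0, 0.0, 50.0, 60.0, 0.0, -40.0, -45.0]  # made-up values
    for state, start, end in split_segments(smoothed_current):
        print(state, start, end)
```

Each run can then be fitted with a non-decreasing constraint for charge segments and a non-increasing constraint for discharge segments, while idle segments are left flat or unchanged.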

To assess the performance, I compare the calibrated SOC with the SOC recorded by the BMS. In multiple scenarios, the calibrated SOC avoids abrupt jumps and maintains a smooth trajectory. For example, in cases where the BMS-recorded SOC drops suddenly from 14% to 0%, the calibrated SOC remains stable, which is more consistent with physically plausible battery behavior. Similarly, during charging phases, the calibrated SOC increases monotonically, whereas the BMS-recorded SOC may oscillate. These results underscore the effectiveness of the strategy in real-world applications for energy storage cells.

In conclusion, the proposed SOC calibration strategy for energy storage cells combines signal processing and machine learning to address key challenges in real-time estimation. By integrating Kalman smoothing, XGBoost modeling, wavelet denoising, and isotonic regression, I achieve enhanced accuracy, stability, and physical consistency. Experimental results on real data confirm that this approach reduces errors and anomalies, providing reliable SOC values for battery management. Future work could explore adaptive thresholding in wavelet transforms and model optimization for larger datasets, further advancing the calibration of energy storage cells in diverse operational environments.
