State of Charge Estimation for LiFePO4 Batteries Using Ultrasonic Time-Domain Features and Random Forest Regression

The accurate and reliable estimation of the State of Charge (SOC) remains a critical function of the Battery Management System (BMS) in electric vehicles and energy storage systems. Among battery chemistries, Lithium Iron Phosphate (LiFePO4) is widely favored for its safety, long cycle life, and thermal stability. However, its characteristically flat open-circuit voltage (OCV) versus SOC curve, especially across the mid-SOC plateau, presents a significant challenge for traditional estimation methods that rely heavily on voltage signals. Within this plateau, large SOC variations produce only minor voltage changes, so the voltage signal is weakly sensitive to SOC and small measurement errors translate into considerable estimation errors. This intrinsic limitation of LiFePO4 batteries motivates the exploration of alternative, non-invasive sensing modalities that provide complementary information tied to the battery’s internal state.

Ultrasonic testing has emerged as a powerful technique for the in-situ, non-destructive characterization of battery internals. The underlying principle is based on the interaction between acoustic waves and the battery’s multi-layered porous structure. During charge and discharge cycles, the intercalation and deintercalation of lithium ions cause volumetric changes (expansion/contraction) in the electrode materials, primarily the graphite anode. These morphological changes alter the material’s physical properties, such as density ($$ \rho $$) and the effective Young’s modulus ($$ E $$). According to acoustic theory, these properties directly influence the acoustic impedance ($$ Z $$) and the speed of sound ($$ C $$) within the medium:

$$ Z = \sqrt{\rho E} $$

$$ C = \sqrt{\frac{K + \frac{4}{3}G}{\rho}} $$

where $$ K $$ is the bulk modulus and $$ G $$ is the shear modulus. Consequently, as the SOC of a LiFePO4 battery changes, the resulting variations in $$ \rho $$ and $$ E $$ modulate the ultrasonic waves propagating through the cell. This modulation is captured in the received time-domain signal, establishing a robust “structure-property-performance” relationship that can be exploited for SOC estimation, independent of the flat OCV curve.
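As a hedged reading of how the two expressions fit together (standard acoustics for an isotropic medium, not a derivation given in the source): the characteristic impedance is the product of density and sound speed, so the impedance formula corresponds to the thin-rod wave speed, while the bulk longitudinal speed uses the P-wave modulus $$ M = K + \frac{4}{3}G $$. Poisson's ratio $$ \nu $$ is a symbol introduced here for illustration.

$$ Z = \rho C, \qquad C_{\text{rod}} = \sqrt{\frac{E}{\rho}} \;\Rightarrow\; Z = \rho \sqrt{\frac{E}{\rho}} = \sqrt{\rho E} $$

$$ M = K + \frac{4}{3}G = \frac{E(1-\nu)}{(1+\nu)(1-2\nu)} \;\Rightarrow\; C = \sqrt{\frac{M}{\rho}} \propto \sqrt{\frac{E}{\rho}} \quad \text{at fixed } \nu $$

Under this assumption, a change in $$ \rho $$ or $$ E $$ shifts both the impedance and the sound speed, which is why amplitude-type and timing-type features can both respond to SOC.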

This article details a comprehensive methodology for estimating the SOC of LiFePO4 batteries, particularly within the challenging plateau region. The approach fuses high-correlation ultrasonic time-domain features with a low-complexity, high-accuracy Random Forest regression model. We begin by analyzing the consistency and correlation of conventional ultrasonic features under varying operational conditions. Subsequently, we expand the feature set by extracting multi-dimensional descriptors from the ultrasonic signal envelope. Finally, after comparing various data-driven and model-driven algorithms, we demonstrate that a model trained on ultrasonic features using Random Forest provides exceptional estimation accuracy under both static and dynamic load profiles.

Ultrasonic Signal Acquisition and Preprocessing

The experimental setup for ultrasonic testing of LiFePO4 batteries integrates electrical cycling, thermal control, and acoustic measurement modules. A pair of piezoelectric transducers, coupled to the surface of a commercial 1 Ah LiFePO4 pouch cell using high-vacuum grease, are used in a through-transmission mode. One transducer acts as a transmitter, excited by a pulsed signal from an ultrasonic pulser-receiver, while the other acts as a receiver. The received signal, which carries the imprint of the battery’s internal state, is digitized and stored synchronously with electrical data (current, voltage) and temperature.

To ensure the integrity of the extracted features, the raw ultrasonic signal is denoised first. Wavelet-based denoising is employed because it handles non-stationary signals such as ultrasonic pulses well. A Symlet wavelet basis with 9 decomposition levels and the ‘rigrsure’ threshold rule (threshold selection by Stein's unbiased risk estimate) gave the best signal-to-noise ratio (SNR > 70 dB), removing high-frequency noise while preserving the essential signal structure. The denoised signal is then normalized, and its upper envelope is fitted for subsequent feature extraction.
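The normalization and envelope step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the source does not say how the upper envelope is fitted, so linear interpolation between local maxima of the rectified signal is an assumption, `normalize` and `upper_envelope` are hypothetical helper names, and the wavelet denoising itself is not reproduced here.

```python
import math

def normalize(signal):
    """Scale samples so the largest absolute value is 1."""
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal] if peak > 0 else list(signal)

def upper_envelope(signal):
    """Linearly interpolate between local maxima of |signal| (assumed method)."""
    mag = [abs(s) for s in signal]
    n = len(mag)
    # Indices of local maxima; both endpoints are always kept as anchors.
    knots = [0] + [i for i in range(1, n - 1)
                   if mag[i] >= mag[i - 1] and mag[i] > mag[i + 1]] + [n - 1]
    env = [0.0] * n
    for left, right in zip(knots, knots[1:]):
        for i in range(left, right + 1):
            frac = (i - left) / (right - left)
            env[i] = mag[left] + frac * (mag[right] - mag[left])
    return env

# Synthetic Gaussian-windowed tone burst standing in for a received pulse.
burst = [3.0 * math.exp(-((i - 100) / 30.0) ** 2) * math.sin(0.8 * i)
         for i in range(200)]
env = upper_envelope(normalize(burst))
```

A smoother fit (for example a spline through the same maxima) would serve equally well; the point is only that later features are computed on `env` rather than on the oscillating waveform.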

Analysis of Conventional Ultrasonic Features

Initial investigations focus on two conventional time-domain ultrasonic features: Signal Amplitude (SA) and Time-of-Flight (TOF). SA is the peak amplitude of the received ultrasonic pulse, primarily influenced by the acoustic impedance mismatch at material interfaces. TOF is the propagation time corresponding to the SA peak, directly related to the effective speed of sound within the battery stack.
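A minimal sketch of how SA and TOF might be read from a digitized waveform. The sample rate, the trigger index, and the use of the rectified peak are assumptions made for illustration; `sa_and_tof` is a hypothetical helper name.

```python
def sa_and_tof(signal, sample_rate_hz, trigger_index=0):
    """Return (SA, TOF): peak |amplitude| and its delay after the trigger in seconds."""
    peak_index = max(range(len(signal)), key=lambda i: abs(signal[i]))
    sa = abs(signal[peak_index])
    tof = (peak_index - trigger_index) / sample_rate_hz
    return sa, tof

# Hypothetical 100 MHz digitizer; the pulse peak sits at sample 250.
samples = [0.0] * 500
samples[249], samples[250], samples[251] = 0.4, -0.9, 0.5
sa, tof = sa_and_tof(samples, sample_rate_hz=100e6)
```

With these illustrative numbers the peak amplitude is 0.9 and the time-of-flight is 2.5 microseconds.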

Experiments were conducted to evaluate the behavior of these features under different ultrasonic excitation frequencies (0.5, 1.0, 1.5, 2.0, 2.5, 4.0 MHz), current rates (0.4C, 0.6C, 0.8C, 1.0C), and temperatures (5°C, 10°C, 25°C, 40°C). The key metrics assessed were consistency (repeatability across consecutive cycles) and correlation with the true SOC (calculated via coulomb counting).
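The two metrics' ingredients can be sketched in a few lines: a coulomb-counting reference SOC and a Pearson correlation against a feature. The sign convention (positive current means charging), the time step, and the toy feature are assumptions; only the 1 Ah capacity and the 0.4C rate come from the text.

```python
import math

def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Integrate current (positive = charging) into an SOC trace in [0, 1]."""
    soc, trace = soc0, [soc0]
    for current in currents_a:
        soc = min(1.0, max(0.0, soc + current * dt_s / (capacity_ah * 3600.0)))
        trace.append(soc)
    return trace

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# 1 Ah cell charged at 0.4C (0.4 A) for 30 min in 60 s steps, starting at 20 % SOC.
soc = coulomb_count(0.20, [0.4] * 30, dt_s=60.0, capacity_ah=1.0)
toy_feature = [2.0 * s + 1.0 for s in soc]   # perfectly linear stand-in feature
pcc = pearson(toy_feature, soc)
```

Thirty minutes at 0.4C adds 20 % SOC, so the trace ends at 40 %, and a perfectly linear feature gives a PCC of 1.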

Effect of Excitation Frequency: The results indicated a clear frequency dependence. Lower frequencies (e.g., 0.5 MHz) showed poorer cycle-to-cycle consistency and only moderate linear correlation with SOC. Raising the frequency to 4 MHz improved both consistency and correlation markedly, although the trend was not strictly monotonic: in the table below, the TOF correlation at 2.0 MHz (0.442) is lower than at 0.5 MHz (0.783). At 4 MHz, the SA and TOF curves from consecutive charge/discharge cycles nearly overlapped, demonstrating excellent repeatability. The Pearson correlation coefficients (PCC) at 4 MHz were the highest, reaching 0.911 for SA and 0.924 for TOF, confirming a strong linear relationship suitable for modeling.

Effect of Current Rate and Temperature: Using the optimal 4 MHz excitation, tests across different C-rates and temperatures showed that these operational parameters had a minimal impact on the high consistency and strong linear correlation already established. The features maintained PCC values above 0.90 in most cases. This robustness is crucial for practical BMS applications where operating conditions vary. The results underscore that high-frequency ultrasonic interrogation yields stable, SOC-sensitive features for the LiFePO4 battery.

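| Condition | Parameter | Signal Amplitude (SA) PCC | Time-of-Flight (TOF) PCC |
| --- | --- | --- | --- |
| Frequency | 0.5 MHz | 0.729 | 0.783 |
| Frequency | 2.0 MHz | 0.817 | 0.442 |
| Frequency | 4.0 MHz | 0.911 | 0.924 |
| Current Rate (at 4 MHz) | 0.4C | 0.922 | 0.920 |
| Current Rate (at 4 MHz) | 1.0C | 0.903 | 0.924 |
| Temperature (at 4 MHz) | 5°C | 0.930 | 0.909 |
| Temperature (at 4 MHz) | 40°C | 0.907 | 0.928 |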

Expansion and Selection of High-Correlation Time-Domain Features

While SA and TOF are effective, the information within the ultrasonic signal envelope is richer. To better characterize the subtle shape changes correlated with SOC, we extract an extended set of ten time-domain features from the upper envelope line, as illustrated in the conceptual figure. These features describe the geometry and dynamics of the ultrasonic pulse:

  • Slope Features: $$ k_{ab}, k_{bc}, k_{cd}, k_{de}, k_{ac}, k_{ce} $$ (slopes between characteristic points on the envelope).
  • Temporal Features: Rise time ($$ t_r $$), Fall time ($$ t_f $$), Duration ($$ t_w $$).
  • Integral Feature: Envelope area ($$ S $$).
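A minimal sketch of how such descriptors might be computed from a sampled envelope. The characteristic points a to e are defined in a figure that is not reproduced in the text, so the definitions below are assumptions: the pulse is taken to start and end where the envelope crosses a hypothetical 10 % of its peak, and `k_rise` is an illustrative start-to-peak slope, not necessarily any of the six slopes listed above.

```python
def envelope_features(env, dt, level=0.1):
    """Temporal, integral and one slope feature from a sampled upper envelope.

    `level` (fraction of the peak marking pulse start and end) is an assumed
    definition, not taken from the source.
    """
    peak_i = max(range(len(env)), key=env.__getitem__)
    above = [i for i, v in enumerate(env) if v >= level * env[peak_i]]
    start_i, end_i = above[0], above[-1]
    rise = (peak_i - start_i) * dt
    # Trapezoidal integration of the whole envelope.
    area = sum((env[i] + env[i + 1]) * dt / 2.0 for i in range(len(env) - 1))
    return {
        "t_r": rise,                              # rise time
        "t_f": (end_i - peak_i) * dt,             # fall time
        "t_w": (end_i - start_i) * dt,            # duration
        "S": area,                                # envelope area
        "k_rise": (env[peak_i] - env[start_i]) / rise if rise > 0 else 0.0,
    }

# Toy triangular envelope sampled every 0.5 time units.
triangle = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5, 0.0]
features = envelope_features(triangle, dt=0.5)
```

For the toy envelope this gives a rise time of 1.5, a fall time of 0.5, a duration of 2.0, an area of 1.5, and a rising slope of 0.5.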

The correlation of each expanded feature with the SOC was calculated. A filtering process based on the absolute Pearson correlation coefficient (|PCC|) was used to select the most relevant features for SOC estimation. The ranking revealed that four expanded features exhibited a higher absolute correlation with SOC than either conventional feature, SA or TOF.

| Rank | Ultrasonic Time-Domain Feature | Pearson Correlation Coefficient (PCC) with SOC |
| --- | --- | --- |
| 1 | Slope $$ k_{ab} $$ | 0.958 |
| 2 | Slope $$ k_{ac} $$ | 0.951 |
| 3 | Rise Time $$ t_r $$ | -0.940 |
| 4 | Duration $$ t_w $$ | -0.934 |
| 5 | Time-of-Flight (TOF) | 0.924 |
| 6 | Signal Amplitude (SA) | 0.911 |
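The |PCC| filter can be illustrated with the six tabulated coefficients. The 0.9 cut-off is a hypothetical threshold chosen for the example; the source does not state the value it used.

```python
# PCC values with SOC as reported in the table above.
pcc = {"k_ab": 0.958, "k_ac": 0.951, "t_r": -0.940,
       "t_w": -0.934, "SA": 0.911, "TOF": 0.924}

# Rank by absolute correlation, strongest first.
ranked = sorted(pcc, key=lambda name: abs(pcc[name]), reverse=True)

# Keep features whose |PCC| clears an assumed cut-off.
threshold = 0.9
selected = [name for name in ranked if abs(pcc[name]) >= threshold]
```

Sorting on the absolute value is what lets the negatively correlated temporal features rank alongside the slopes, and it places TOF (0.924) ahead of SA (0.911).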

The four high-correlation features ($$ k_{ab}, k_{ac}, t_r, t_w $$) along with the two conventional features (SA, TOF) form a six-dimensional input vector. This feature set captures complementary aspects of the ultrasonic signal’s interaction with the evolving internal state of the LiFePO4 battery, providing a rich and descriptive basis for the regression model.

Random Forest Regression for SOC Estimation

Among various machine learning algorithms evaluated (Least Squares, Elastic Net, Support Vector Machines, Backpropagation Neural Networks), the Random Forest (RF) regressor demonstrated the best combination of accuracy and computational efficiency for this task. RF is an ensemble learning method that operates by constructing a multitude of decision trees during training. The final SOC estimate is the average prediction of the individual trees, which reduces overfitting and improves generalization.

Given a training dataset $$ G_n = \{(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\} $$ where $$ X_i $$ is the 6-dimensional ultrasonic feature vector and $$ Y_i $$ is the corresponding SOC value, the RF algorithm follows these steps:

  1. Bootstrap Sampling: Create $$ q $$ new training sets $$ G_n^1, G_n^2, \ldots, G_n^q $$ by randomly sampling from $$ G_n $$ with replacement.
  2. Tree Construction: For each bootstrapped set, grow a decision tree. At each node, instead of searching over all 6 features, a random subset of $$ r $$ features is considered to find the best split (e.g., based on minimizing variance).
  3. Prediction: Each tree provides a SOC estimate $$ \hat{Y}_t = S(X, G_n^t) $$. The final RF estimate is the average:

    $$ \hat{Y}_{RF} = \frac{1}{q} \sum_{t=1}^{q} S(X, G_n^t) $$

The model was trained using data from multiple LiFePO4 batteries under constant-current conditions. Key hyperparameters, the number of trees ($$ q = 100 $$) and the number of features considered at each split ($$ r = 3 $$), were optimized for performance.
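The three steps above can be sketched in plain Python. This is a teaching sketch under stated assumptions, not the authors' implementation: the tree depth and leaf-size limits are arbitrary, the training data are synthetic (six features that vary linearly with SOC plus small noise), and all function names are hypothetical. The demo uses fewer trees than the source's $$ q = 100 $$ to keep it quick.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(rows, ys, r, rng, depth=0, max_depth=6, min_leaf=2):
    """Grow a regression tree; each node searches a random subset of r features."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return sum(ys) / len(ys)                      # leaf: mean SOC
    best = None
    for f in rng.sample(range(len(rows[0])), r):
        for thr in sorted({row[f] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, ys) if row[f] <= thr]
            right = [y for row, y in zip(rows, ys) if row[f] > thr]
            if min(len(left), len(right)) < min_leaf:
                continue
            score = sse(left) + sse(right)            # variance-minimizing split
            if best is None or score < best[0]:
                best = (score, f, thr)
    if best is None:
        return sum(ys) / len(ys)
    _, f, thr = best
    lo = [(row, y) for row, y in zip(rows, ys) if row[f] <= thr]
    hi = [(row, y) for row, y in zip(rows, ys) if row[f] > thr]
    return (f, thr,
            build_tree([p[0] for p in lo], [p[1] for p in lo], r, rng,
                       depth + 1, max_depth, min_leaf),
            build_tree([p[0] for p in hi], [p[1] for p in hi], r, rng,
                       depth + 1, max_depth, min_leaf))

def predict_tree(node, x):
    """Walk from the root to a leaf and return its stored mean."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if x[f] <= thr else right
    return node

def fit_forest(rows, ys, q=100, r=3, seed=0):
    """Train q trees, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(q):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [ys[i] for i in idx], r, rng))
    return forest

def predict_forest(forest, x):
    """Average the individual tree estimates."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)

# Synthetic stand-in data: 40 samples across the 20-80 % plateau.
rng = random.Random(1)
weights = [1.0, 0.8, -0.9, -0.7, 0.5, 0.6]            # hypothetical sensitivities
soc_train = [0.2 + 0.6 * i / 39 for i in range(40)]
x_train = [[w * s + rng.gauss(0.0, 0.002) for w in weights] for s in soc_train]
forest = fit_forest(x_train, soc_train, q=25, r=3)
estimate = predict_forest(forest, [w * 0.5 for w in weights])
```

In practice a library implementation would normally be used; the hand-written version only makes the bootstrap, random-subset split, and averaging steps visible.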

Estimation Performance and Validation

The performance of the Ultrasonic-RF model was evaluated and compared against other methods under both constant-current (CC) and dynamic driving cycles.

Constant-Current Performance: Under CC charge and discharge within the SOC plateau region (20%-80%), the RF model achieved the lowest errors of the five algorithms compared (RMSE of 1.22% on charge and 1.38% on discharge). Its estimates almost overlapped the reference SOC curve.

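| Algorithm | CC Charge (Plateau) RMSE (%) | CC Charge (Plateau) MAE (%) | CC Discharge (Plateau) RMSE (%) | CC Discharge (Plateau) MAE (%) | Avg. Comp. Time (s) |
| --- | --- | --- | --- | --- | --- |
| Least Squares | 7.67 | 7.02 | 13.5 | 12.1 | ~0.25 |
| Elastic Net | 1.23 | 0.97 | 1.88 | 1.50 | ~11.9 |
| SVM | 5.28 | 4.19 | 2.17 | 1.67 | ~120.0 |
| BP Neural Network | 1.63 | 1.27 | 1.67 | 1.37 | ~6.0 |
| Random Forest (Proposed) | 1.22 | 0.95 | 1.38 | 0.99 | ~7.8 |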
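For reference, the two error metrics in the table follow their usual definitions, sketched here on made-up SOC values in percent (the numbers are illustrative, not experimental data).

```python
import math

def rmse(estimate, reference):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference))
                     / len(reference))

def mae(estimate, reference):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(e - r) for e, r in zip(estimate, reference)) / len(reference)

reference_soc = [20.0, 40.0, 60.0, 80.0]   # illustrative reference SOC (%)
estimated_soc = [21.0, 39.0, 62.0, 80.0]   # illustrative estimates (%)
example_rmse = rmse(estimated_soc, reference_soc)
example_mae = mae(estimated_soc, reference_soc)
```

RMSE penalizes the single 2-point error more heavily than MAE does, which is why RMSE is never smaller than MAE in the table.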

Dynamic Profile Validation: To assess real-world applicability, the model was validated under standard automotive driving cycles—Dynamic Stress Test (DST) and the New European Driving Cycle (NEDC). The Ultrasonic-RF model was also compared against a traditional model-based approach using a Thevenin equivalent circuit model combined with an Extended Kalman Filter (EKF).

| Driving Cycle | Algorithm | RMSE (%) | MAE (%) | Computation Time (s) |
| --- | --- | --- | --- | --- |
| DST | EKF (Model-based) | 2.75 | 2.53 | 7.82 |
| DST | Ultrasonic-RF (Proposed) | 1.93 | 1.63 | 8.43 |
| NEDC | EKF (Model-based) | 2.39 | 1.97 | 8.73 |
| NEDC | Ultrasonic-RF (Proposed) | 1.66 | 1.42 | 9.62 |
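The relative accuracy gain and the added computation time can be checked directly from the tabulated values; the short calculation below uses only the numbers in the table above.

```python
# (RMSE %, computation time s) per driving cycle, from the table above.
results = {
    "DST": {"EKF": (2.75, 7.82), "RF": (1.93, 8.43)},
    "NEDC": {"EKF": (2.39, 8.73), "RF": (1.66, 9.62)},
}

rmse_reduction = {}
time_overhead = {}
for cycle, row in results.items():
    ekf_rmse, ekf_time = row["EKF"]
    rf_rmse, rf_time = row["RF"]
    rmse_reduction[cycle] = (ekf_rmse - rf_rmse) / ekf_rmse   # relative reduction
    time_overhead[cycle] = rf_time - ekf_time                 # extra seconds

mean_reduction = sum(rmse_reduction.values()) / len(rmse_reduction)
```

The RMSE falls by about 29.8 % on DST and 30.5 % on NEDC, for a mean of roughly 30 %, while the extra computation time is about 0.61 s and 0.89 s respectively.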

The proposed data-driven method outperforms the model-based EKF on both driving cycles, reducing RMSE by approximately 30% (from 2.75% to 1.93% on DST and from 2.39% to 1.66% on NEDC). The ultrasonic feature processing and RF prediction add a marginal computational overhead (less than 1 second), which is small relative to the gain in estimation accuracy. This trade-off is favorable for a BMS, where accurate SOC knowledge is paramount for safety, longevity, and user experience. The EKF’s performance suffers precisely because of the inherent challenge it faces with the LiFePO4 battery: in the flat voltage plateau, the OCV barely changes with SOC, so the voltage error feedback used for correction is minuscule, leading to poor observability and drift that the filter cannot correct.
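The observability argument can be illustrated with a one-state Kalman update, which is a deliberate simplification of the Thevenin-model EKF used in the comparison. The OCV slopes and the variances below are illustrative values, not measurements from the source, and `corrected_fraction` is a hypothetical helper name.

```python
def corrected_fraction(ocv_slope, p_soc, r_volt):
    """Fraction of an SOC error removed by one scalar Kalman update.

    ocv_slope: dOCV/dSOC in volts per unit SOC (the measurement Jacobian)
    p_soc:     prior SOC error variance
    r_volt:    voltage measurement noise variance
    """
    gain = p_soc * ocv_slope / (ocv_slope ** 2 * p_soc + r_volt)
    return gain * ocv_slope

p_soc = 0.01       # assumed prior variance (10 % SOC standard deviation)
r_volt = 1e-4      # assumed sensor noise (10 mV standard deviation)
steep = corrected_fraction(1.0, p_soc, r_volt)    # steep part of the OCV curve
flat = corrected_fraction(0.05, p_soc, r_volt)    # plateau-like slope
```

With these assumed numbers a single update removes about 99 % of the SOC error where the curve is steep but only 20 % on the plateau, because the innovation scales with the slope and is swamped by sensor noise.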

Conclusion and Outlook

This work presents an effective solution to the persistent challenge of estimating the State of Charge in LiFePO4 batteries, especially within their flat voltage plateau. By shifting the sensing modality from electrical to acoustic, we exploit the consistent, highly SOC-correlated physical changes within the battery that are reflected in transmitted ultrasonic waves. The systematic analysis confirms that high-frequency (4 MHz) ultrasonic interrogation provides robust features across varying operational conditions. Expanding the feature set to include shape-descriptive parameters such as specific slopes and temporal characteristics further enhances the information content relevant to SOC.

The fusion of these high-correlation ultrasonic time-domain features with a Random Forest regression model creates a powerful, data-driven estimator. It bypasses the need for an exact electrochemical model and does not rely on the voltage signal, so it is not constrained by the flat OCV curve of the LiFePO4 battery. The method demonstrates high accuracy (RMSE < 2.0%, MAE < 1.7%) under both static and dynamic loads, with a reasonable computational cost suitable for embedded BMS implementation.

For practical deployment, future work could focus on integrating miniature, low-cost ultrasonic transducers directly into the battery module packaging or between cells and cooling plates. This would enable permanent, in-situ monitoring. Furthermore, the ultrasonic sensing framework shows great promise for multi-state estimation, potentially extending to diagnosing State of Health (SOH) by tracking long-term feature degradation, detecting early-stage failures like lithium plating or electrode delamination, and providing thermal monitoring—all from a single, non-invasive sensor suite. This approach marks a significant step towards more intelligent, reliable, and comprehensive management systems for LiFePO4 battery packs and beyond.
