Lithium-ion Battery RUL Prediction Under Multiple Health Factors Using SABO-ELM Model

Accurately predicting the Remaining Useful Life (RUL) of lithium-ion batteries is essential for the safety, reliability, and economic efficiency of the systems that depend on them, most notably electric vehicles and grid-scale energy storage. The degradation of a lithium-ion battery is a complex electrochemical process influenced by operating conditions, leading to capacity fade and power deterioration. Precise RUL estimation allows for proactive maintenance, prevents catastrophic failures, and optimizes battery usage. However, the task is complicated by capacity regeneration phenomena, measurement noise, and the non-linear nature of battery aging.

Traditional model-based approaches, which rely on detailed electrochemical or equivalent circuit models, often struggle with model complexity and parameter identification. Consequently, data-driven methods have gained prominence. These methods use historical operational data to train machine learning models that learn the mapping between measurable battery parameters and the battery's degradation trajectory. An effective data-driven RUL prediction framework rests on two aspects: first, the extraction of informative Health Factors (HFs) that are strongly correlated with the underlying state of health; and second, the selection and optimization of a predictive model that is both accurate and robust against the inherent randomness and noise in the data.

This work presents a novel, hybrid data-driven methodology that integrates advanced signal processing, systematic health feature engineering, and a metaheuristic-optimized machine learning model to enhance the accuracy and stability of RUL prediction for lithium-ion batteries. The proposed method is validated using a well-known public dataset, demonstrating superior performance compared to several benchmark models.

Data Source and Health Indicator Extraction

The experimental data used in this study come from the publicly available NASA lithium-ion battery dataset. Cells B05, B06, and B07 are employed for model development and validation. These are 18650-type lithium-ion batteries with a nominal capacity of 2 Ah. They underwent repeated charge-discharge cycles under controlled conditions: a constant-current (CC) charge at 1.5 A to 4.2 V, followed by a constant-voltage (CV) charge until the current dropped to 20 mA, and finally a constant-current discharge at 2.0 A to a specified cut-off voltage (2.7 V for B05, 2.5 V for B06, and 2.2 V for B07). A battery is considered to have reached its end-of-life (EOL) when its available capacity fades to 70% of its nominal value (1.4 Ah). For cell B07, which did not degrade to 1.4 Ah, a threshold of 1.45 Ah is used. The capacity degradation curves clearly illustrate the non-linear aging and occasional capacity regeneration phenomena typical of lithium-ion batteries.
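The EOL definition above translates into a per-cycle RUL label in a straightforward way. The following is a minimal sketch, assuming that EOL is taken as the first cycle at which capacity reaches the threshold (the text does not say how later regeneration above the threshold is handled); the function names and the capacity values are hypothetical.

```python
import numpy as np

def eol_cycle(capacity, threshold):
    """Return the 1-based index of the first cycle at or below the threshold, or None."""
    below = np.flatnonzero(np.asarray(capacity, dtype=float) <= threshold)
    return int(below[0]) + 1 if below.size else None

def rul_per_cycle(capacity, threshold):
    """RUL at each cycle = cycles remaining until EOL (0 once EOL is reached)."""
    eol = eol_cycle(capacity, threshold)
    if eol is None:          # the cell never reached the threshold
        return None
    cycles = np.arange(1, len(capacity) + 1)
    return np.maximum(eol - cycles, 0)

# Illustrative (synthetic) fade curve with one regeneration bump, in Ah.
capacity = [1.85, 1.70, 1.55, 1.45, 1.47, 1.40, 1.35]
eol = eol_cycle(capacity, threshold=1.4)       # first cycle at or below 1.4 Ah
rul = rul_per_cycle(capacity, threshold=1.4)   # remaining cycles at each step
```

For a cell such as B07, the same functions would be called with the 1.45 Ah threshold.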

Incremental Capacity (IC) Analysis and Denoising

Incremental Capacity Analysis (ICA) is a powerful diagnostic tool that transforms voltage-capacity (V-Q) charge/discharge curves into IC curves by computing the derivative of capacity with respect to voltage, $dQ/dV$. Peaks and valleys in the IC curve are directly related to phase transition events within the battery electrodes, making it highly sensitive to degradation mechanisms. The IC value at a given point is calculated from discrete measurement data as:

$$IC = \frac{dQ}{dV} \approx \frac{\Delta Q}{\Delta V}$$
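As an illustration of the finite-difference approximation, the sketch below computes an IC curve from sampled voltage and capacity arrays. Resampling onto a uniform voltage grid before differencing is an assumption made here to avoid dividing by very small $\Delta V$ values; the function name and the grid step `dv` are hypothetical and not taken from the study.

```python
import numpy as np

def incremental_capacity(voltage, capacity, dv=0.005):
    """Approximate dQ/dV by finite differences on a uniform voltage grid.

    Assumes voltage is (nearly) monotonic over the segment, as in a CC charge.
    """
    v = np.asarray(voltage, dtype=float)
    q = np.asarray(capacity, dtype=float)
    order = np.argsort(v)                  # np.interp needs increasing x values
    v, q = v[order], q[order]
    grid = np.arange(v[0], v[-1], dv)      # uniform voltage grid with step dv
    q_grid = np.interp(grid, v, q)         # capacity resampled on that grid
    ic = np.diff(q_grid) / dv              # delta Q / delta V per interval
    v_mid = grid[:-1] + dv / 2.0           # voltage at each interval midpoint
    return v_mid, ic
```

A larger `dv` gives a smoother but less detailed curve, so the step acts as a first, crude noise control ahead of any filtering.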

However, raw IC curves derived from experimental data are often noisy, which obscures critical features, so a filtering step is essential. In this work, a Kalman filter is applied to smooth the raw IC curves. The Kalman filter operates recursively on a stream of noisy measurements and, for a linear system with Gaussian noise, produces the minimum mean-square-error estimate of the underlying system state. The process can be summarized by the standard Kalman filter equations. The prediction step is:

$$
\begin{aligned}
\hat{x}_{k|k-1} &= F_k \hat{x}_{k-1|k-1} + B_k u_k \\
P_{k|k-1} &= F_k P_{k-1|k-1} F_k^T + Q_k
\end{aligned}
$$

And the update step is:

$$
\begin{aligned}
\tilde{y}_k &= z_k - H_k \hat{x}_{k|k-1} \\
S_k &= H_k P_{k|k-1} H_k^T + R_k \\
K_k &= P_{k|k-1} H_k^T S_k^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{y}_k \\
P_{k|k} &= (I - K_k H_k) P_{k|k-1}
\end{aligned}
$$

where $\hat{x}$ is the state estimate (the smoothed IC value), $P$ is the estimation error covariance, $F$ is the state transition model, $B$ and $u$ are the control-input model and the control input, $z$ is the measurement, $H$ is the observation model, $\tilde{y}$ is the innovation with covariance $S$, $Q$ and $R$ are the process and measurement noise covariances, $K$ is the Kalman gain, and $I$ is the identity matrix. The filtered IC curve is considerably smoother, with more distinct and analyzable peaks, and provides a reliable basis for feature extraction.
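For a single IC curve, the equations above reduce to a few scalar operations. The sketch below assumes a random-walk state model ($F = 1$, $H = 1$, no control input), which is one common choice for smoothing and is not stated in the text; the function name and the noise variances `q` and `r` are hypothetical tuning values.

```python
import numpy as np

def kalman_filter_1d(z, q=1e-4, r=1e-2):
    """Scalar Kalman filter with F = 1, H = 1, B = 0 (random-walk assumption)."""
    z = np.asarray(z, dtype=float)
    x, p = z[0], 1.0                   # initial state estimate and covariance
    out = np.empty_like(z)
    out[0] = x
    for k in range(1, len(z)):
        # Prediction step: the state is carried over, uncertainty grows by q.
        x_pred = x
        p_pred = p + q
        # Update step: blend the prediction with the new measurement z[k].
        innovation = z[k] - x_pred
        s = p_pred + r                 # innovation covariance
        gain = p_pred / s              # Kalman gain
        x = x_pred + gain * innovation
        p = (1.0 - gain) * p_pred
        out[k] = x
    return out
```

A smaller ratio `q / r` gives heavier smoothing at the cost of more lag, so the two values would need to be tuned so that the IC peak position is not shifted noticeably.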

Health Factor (HF) Extraction and Selection

Instead of relying on a single indirect health indicator, this work extracts multiple HFs from the denoised IC curve to capture the degradation signature of the lithium-ion battery more comprehensively. The selection of HFs is guided by practical charging scenarios, where batteries are often not fully discharged. A voltage window from 3.85 V to 4.09 V is chosen as the region of interest. Within this window, ten distinct HFs are extracted:

HF1 to HF8: The IC ($dQ/dV$) values at eight equally spaced voltage points (every 0.03 V) within the defined window.
HF9: The magnitude of the main peak in the IC curve ($IC_{peak}$).
HF10: The voltage at which the main peak occurs ($V_{peak}$).
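The ten HFs listed above can be read off a filtered IC curve as sketched below. Two details are assumptions, since the text does not fix them: the eight sample points are taken to start at 3.85 V (so they end at 4.06 V), and the main peak is searched only inside the 3.85 V to 4.09 V window. The function name is hypothetical.

```python
import numpy as np

def extract_health_factors(v, ic, v_start=3.85, step=0.03, n_points=8):
    """Return [HF1..HF8, IC_peak, V_peak] from an IC curve.

    v must be increasing; ic holds the (filtered) dQ/dV values at those voltages.
    """
    v = np.asarray(v, dtype=float)
    ic = np.asarray(ic, dtype=float)
    # HF1..HF8: IC values at equally spaced voltages (start-inclusive assumption).
    sample_v = v_start + step * np.arange(n_points)
    hf_points = np.interp(sample_v, v, ic)
    # HF9, HF10: height and location of the largest IC value inside the window.
    window = (v >= v_start) & (v <= v_start + step * n_points)
    peak_idx = np.argmax(ic[window])
    ic_peak = ic[window][peak_idx]
    v_peak = v[window][peak_idx]
    return np.concatenate([hf_points, [ic_peak, v_peak]])
```

Calling this once per cycle and stacking the results row by row gives a cycles-by-10 feature matrix for the subsequent correlation analysis.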

To ensure that the selected HFs are truly informative and to reduce model dimensionality, a correlation analysis is performed. Spearman's rank correlation coefficient is used because it measures monotonic association and is therefore robust to non-linear relationships. For two variables $X$ and $Y$, Spearman's coefficient $\rho_s$ is calculated as:

$$\rho_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference between the ranks of the $i$-th pair of observations, and $n$ is the number of observations. The absolute value of $\rho_s$ indicates the strength of the monotonic relationship: 0.8 to 1.0 (very strong), 0.6 to 0.8 (strong), 0.4 to 0.6 (moderate). HFs with a very strong correlation with the actual battery capacity ($|\rho_s| > 0.89$ in this study) are retained. The analysis for the three batteries yields the following highly correlated HF sets:

Battery	Selected Health Factors (HFs)
B05	HF1, HF2, HF3, HF4, HF5, HF9
B06	HF1, HF2, HF3, HF4, HF5, HF9, HF10
B07	HF1, HF2, HF3, HF4, HF5, HF9

This selection process effectively removes redundant or weakly correlated features, providing a robust and concise input vector for the subsequent prediction model that accurately represents the degradation state of the lithium-ion battery.
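The selection rule can be sketched directly from the rank-difference formula. The code below assumes there are no tied values (the simplified formula is exact only in that case) and uses hypothetical function names; the 0.89 threshold is the one quoted in the text.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman coefficient via 1 - 6*sum(d_i^2) / (n*(n^2 - 1)); assumes no ties."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rank_x = np.argsort(np.argsort(x))   # 0-based ranks; the offset cancels in d
    rank_y = np.argsort(np.argsort(y))
    d = (rank_x - rank_y).astype(float)
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))

def select_health_factors(hf_matrix, capacity, threshold=0.89):
    """Return the column indices whose |rho_s| with capacity exceeds the threshold."""
    hf_matrix = np.asarray(hf_matrix, dtype=float)
    return [j for j in range(hf_matrix.shape[1])
            if abs(spearman_rho(hf_matrix[:, j], capacity)) > threshold]
```

Because the coefficient depends only on ranks, a feature that falls monotonically with capacity is retained just like one that rises with it.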

The Proposed SABO-ELM Prediction Framework

Extreme Learning Machine (ELM) as the Base Predictor

The Extreme Learning Machine (ELM) is chosen as the base regression model for RUL prediction due to its extremely fast training speed and good generalization performance. Unlike traditional neural networks that iteratively tune all parameters, ELM randomly assigns weights and biases for the hidden layer and analytically determines the output weights. For $N$ arbitrary distinct samples $( \mathbf{x}_i, \mathbf{t}_i )$, where $\mathbf{x}_i \in \mathbb{R}^n$ and $\mathbf{t}_i \in \mathbb{R}^m$, the standard single-hidden layer feedforward network (SLFN) with $L$ hidden nodes is modeled as:

$$
\sum_{i=1}^{L} \boldsymbol{\beta}_i g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = \mathbf{o}_j, \quad j=1, \ldots, N
$$

where $g(\cdot)$ is the activation function, $\mathbf{w}_i$ is the input weight vector, $b_i$ is the bias of the $i$-th hidden neuron, and $\boldsymbol{\beta}_i$ is the output weight vector connecting the $i$-th hidden neuron to the output layer. This can be written compactly as $\mathbf{H} \boldsymbol{\beta} = \mathbf{T}$, where $\mathbf{H}$ is the hidden layer output matrix:

$$
\mathbf{H} =
\begin{bmatrix}
g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_L \cdot \mathbf{x}_1 + b_L) \\
\vdots & \ddots & \vdots \\
g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_L \cdot \mathbf{x}_N + b_L)
\end{bmatrix}_{N \times L}
$$

Here, $\boldsymbol{\beta} = [\boldsymbol{\beta}_1^T, \ldots, \boldsymbol{\beta}_L^T]^T$ and $\mathbf{T} = [\mathbf{t}_1^T, \ldots, \mathbf{t}_N^T]^T$. The minimum-norm least-squares solution for the output weights is:

$$
\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger} \mathbf{T}
$$

where $\mathbf{H}^{\dagger}$ is the Moore-Penrose generalized inverse of matrix $\mathbf{H}$. While fast, the random initialization of $\mathbf{w}_i$ and $b_i$ can lead to unstable performance and sub-optimal generalization in predicting the RUL of a lithium-ion battery.
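The ELM training procedure is short enough to sketch in full. The version below assumes a sigmoid activation and input weights and biases drawn uniformly from $[-1, 1]$; neither choice is stated in the text, and the class name and default hidden-layer size are hypothetical.

```python
import numpy as np

class ELMRegressor:
    """Single-hidden-layer network with random hidden parameters (basic ELM)."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden layer output matrix H with entries g(w_i . x_j + b_i), g = sigmoid.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, T):
        X = np.asarray(X, dtype=float)
        T = np.asarray(T, dtype=float)
        # Input weights and biases are drawn once at random and never trained.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: beta = pinv(H) @ T (Moore-Penrose pseudoinverse).
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta
```

Changing `seed` changes `W` and `b`, and with them the fitted model, which is the source of the run-to-run variability discussed above.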

Subtraction-Average-Based Optimizer (SABO)

To overcome the instability of ELM, the Subtraction-Average-Based Optimizer (SABO), a novel metaheuristic algorithm, is employed to optimize the input weights and hidden biases. SABO’s update mechanism is uniquely based on the concept of the average of difference vectors between population members. The core position update for a search agent $\mathbf{K}_i$ is defined as:

$$
\mathbf{K}_i^{\text{new}} = \mathbf{K}_i + \mathbf{r}_i \cdot \frac{1}{N} \sum_{j=1}^{N} (\mathbf{K}_i \ \dot{-} \ \mathbf{K}_j)
$$

where $\mathbf{r}_i$ is a random vector, $N$ is the population size, and the operator $\dot{-}$ represents a special “subtraction by average” operation. This operator is defined as:

$$
\mathbf{A} \ \dot{-} \ \mathbf{B} = \text{sign}(F(\mathbf{A}) - F(\mathbf{B})) (\mathbf{A} - \overset{\rightarrow}{v} \cdot \mathbf{B})
$$

Here, $\text{sign}(\cdot)$ is the signum function, $F(\cdot)$ is the objective function (e.g., prediction error), and $\overset{\rightarrow}{v}$ is a random vector with components drawn from the set $\{1, 2\}$. This update rule balances exploration and exploitation of the search space and drives the population toward better solutions, here the set of ELM parameters that minimizes the prediction error for the lithium-ion battery RUL task.
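The two equations above can be turned into a compact optimizer loop. The sketch below follows them term by term, with three assumptions that the text does not specify: $\mathbf{r}_i$ is drawn uniformly from $[0, 1)$, candidates are clipped to the search bounds, and a candidate replaces the current agent only if it lowers the objective (greedy acceptance). The function name and default settings are hypothetical.

```python
import numpy as np

def sabo_minimize(objective, dim, bounds=(-1.0, 1.0), pop_size=20, n_iter=50, seed=0):
    """Minimize `objective` with the subtraction-average update rule."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))          # search agents K_i
    fit = np.array([objective(k) for k in pop], dtype=float)
    history = [float(fit.min())]
    for _ in range(n_iter):
        for i in range(pop_size):
            v = rng.integers(1, 3, size=(pop_size, dim))     # components in {1, 2}
            sign = np.sign(fit[i] - fit)[:, None]            # sign(F(K_i) - F(K_j))
            diff = sign * (pop[i] - v * pop)                 # K_i (v-minus) K_j, all j
            r = rng.random(dim)                              # uniform r_i (assumption)
            candidate = np.clip(pop[i] + r * diff.mean(axis=0), lo, hi)
            f_new = float(objective(candidate))
            if f_new < fit[i]:                               # greedy acceptance (assumption)
                pop[i] = candidate
                fit[i] = f_new
        history.append(float(fit.min()))
    best = int(np.argmin(fit))
    return pop[best].copy(), float(fit[best]), history
```

With greedy acceptance the best objective value recorded in `history` can never increase from one iteration to the next, which makes convergence easy to monitor.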

SABO-ELM Integration for RUL Prediction

The integrated SABO-ELM framework works as follows: The SABO algorithm’s search agents represent potential sets of ELM’s input weights and hidden layer biases. The objective function minimized by SABO is the Root Mean Square Error (RMSE) of the ELM model’s prediction on the training data. The optimal parameters found by SABO are then used to construct the final, stable ELM model for RUL estimation. This hybrid approach leverages the global search capability of SABO to find a robust parameter initialization, thereby mitigating ELM’s randomness and significantly improving the prediction accuracy and consistency for the degradation of lithium-ion batteries.
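The coupling between the optimizer and the ELM comes down to an objective function that decodes a search agent into input weights and biases and returns the training RMSE. The sketch below shows one way to write it, assuming a sigmoid activation and a flat agent layout (all weights first, then the biases); the function name and layout are hypothetical.

```python
import numpy as np

def elm_rmse_objective(X, T, n_hidden):
    """Build an objective mapping an agent vector to the ELM training RMSE.

    Returns the objective and the required agent dimension.
    """
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float)
    n_in = X.shape[1]
    n_weights = n_in * n_hidden

    def objective(agent):
        agent = np.asarray(agent, dtype=float)
        W = agent[:n_weights].reshape(n_in, n_hidden)    # input weights
        b = agent[n_weights:]                            # hidden biases
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # hidden layer output
        beta = np.linalg.pinv(H) @ T                     # analytic output weights
        return float(np.sqrt(np.mean((H @ beta - T) ** 2)))

    return objective, n_weights + n_hidden
```

Any population-based minimizer can then be run on `objective` over vectors of the returned dimension, and the best agent found is decoded once more to build the final ELM.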

Experimental Validation and Results Analysis

Evaluation Metrics and Experimental Setup

The performance of the proposed and comparative models is assessed using three standard statistical metrics:

Mean Absolute Error (MAE): $MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$
Mean Absolute Percentage Error (MAPE): $MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$
Root Mean Square Error (RMSE): $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$

where $y_i$ is the true RUL or capacity, $\hat{y}_i$ is the predicted value, and $n$ is the number of test cycles. For each battery, data from the initial 70% of cycles is used for training, and the remaining 30% is used for testing the RUL prediction. The proposed SABO-ELM model is compared against three benchmarks: the standard ELM, an ELM optimized by the Beluga Whale Optimization algorithm (BWO-ELM), and a Long Short-Term Memory (LSTM) network, a popular deep learning model for sequence prediction.
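The three metrics and the chronological split are simple to express in code. The following is a minimal sketch with hypothetical function names; it implements the formulas as written and does not attempt to reproduce any normalization behind the percentage values reported in the tables.

```python
import numpy as np

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def mape(y, y_hat):
    # Expressed in percent; y must not contain zeros.
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def chronological_split(data, train_fraction=0.7):
    """Split cycle-ordered data into the first fraction (train) and the rest (test)."""
    n_train = int(round(train_fraction * len(data)))
    return data[:n_train], data[n_train:]

# Tiny worked example with made-up values.
y = np.array([2.0, 4.0])
y_hat = np.array([1.0, 5.0])
scores = (mae(y, y_hat), mape(y, y_hat), rmse(y, y_hat))
```

The split keeps the cycle order intact, so the model is always evaluated on cycles that come after those it was trained on. The same helper with `train_fraction=0.5` or `0.6` gives earlier prediction starting points.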

Impact of Health Factor Selection

An ablation study supports the Spearman-based HF selection. Three HF sets are compared: the selected set (G1), the full set of 10 HFs without selection (G2), and an over-filtered set in which HFs with inter-correlation above 0.95 were removed (G3). As the table below shows, G1 gives the lowest MAE and MAPE on all three batteries; its RMSE is marginally higher than that of G2 on B05 and of G3 on B07. Overall, the results support the feature engineering process for lithium-ion battery prognostics.

Battery	HF Set	MAE (%)	MAPE (%)	RMSE (%)
B05	G1 (Selected)	1.36	1.01	1.82
	G2 (All 10 HFs)	1.50	1.13	1.73
	G3 (Over-filtered)	1.38	1.04	1.85
B06	G1 (Selected)	1.34	1.06	1.96
	G2 (All 10 HFs)	2.43	1.96	3.16
	G3 (Over-filtered)	1.45	1.17	2.01
B07	G1 (Selected)	1.42	0.97	1.91
	G2 (All 10 HFs)	1.64	1.14	2.05
	G3 (Over-filtered)	1.47	1.02	1.76

Performance Comparison with Benchmark Models

The comparison of prediction models shows that the proposed SABO-ELM approach gives the lowest errors of the four models when estimating the RUL of lithium-ion batteries. The prediction curves show that SABO-ELM most closely tracks the actual capacity fade, including the subtle capacity regeneration “bumps,” while other models such as the standard ELM deviate significantly and are unstable.

The quantitative results summarized in the table below support this. On battery B07, for instance, SABO-ELM achieves an MAE of 1.42%, a MAPE of 0.97%, and an RMSE of 1.91%, whereas the standard ELM yields 5.36%, 4.80%, and 3.26%, respectively. BWO-ELM improves on the standard ELM but is still slightly outperformed by SABO-ELM on every cell and metric. Most notably, relative to the LSTM deep learning model, SABO-ELM reduces MAE by 52.03%, MAPE by 51.98%, and RMSE by 42.99% on B07, the cell with the largest gain. This highlights the effectiveness of the metaheuristic optimization in conjunction with a well-tuned feature set.

Battery	Model	MAE (%)	MAPE (%)	RMSE (%)
B05	ELM	3.88	2.86	4.57
	BWO-ELM	1.40	1.04	1.90
	LSTM	2.67	1.97	3.06
	SABO-ELM (Proposed)	1.36	1.01	1.82
B06	ELM	4.31	3.32	4.99
	BWO-ELM	1.49	1.17	2.36
	LSTM	2.22	1.89	2.49
	SABO-ELM (Proposed)	1.34	1.06	1.96
B07	ELM	5.36	4.80	3.26
	BWO-ELM	1.50	1.03	2.03
	LSTM	2.96	2.02	3.35
	SABO-ELM (Proposed)	1.42	0.97	1.91
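The relative error reductions against LSTM can be recomputed directly from the table above. The short sketch below does so for each cell, using the tabulated (MAE, MAPE, RMSE) values; the variable and function names are hypothetical.

```python
# (MAE, MAPE, RMSE) in percent, copied from the comparison table.
lstm = {"B05": (2.67, 1.97, 3.06), "B06": (2.22, 1.89, 2.49), "B07": (2.96, 2.02, 3.35)}
sabo = {"B05": (1.36, 1.01, 1.82), "B06": (1.34, 1.06, 1.96), "B07": (1.42, 0.97, 1.91)}

def relative_reduction(baseline, proposed):
    """Percentage reduction of each metric relative to the baseline model."""
    return [round(100.0 * (b - p) / b, 2) for b, p in zip(baseline, proposed)]

reductions = {cell: relative_reduction(lstm[cell], sabo[cell]) for cell in lstm}
```

For B07 this reproduces the quoted reductions of 52.03%, 51.98%, and 42.99%; the reductions for B05 and B06 are smaller on all three metrics.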

Robustness Analysis at Different Prediction Starting Points

To evaluate the robustness and early prediction capability of the model, experiments are conducted with different training set sizes, simulating predictions that start earlier in the battery’s life (at 50% and 60% of total cycles). The results show that while prediction error naturally increases as less data is available for training, the SABO-ELM model maintains commendable accuracy. Even when predicting RUL starting from the 50% cycle point, the maximum MAPE across all batteries remains below 1.6%, and the RMSE below 2.2%. This demonstrates the model’s reliability and its potential for practical early-warning applications in lithium-ion battery management systems.

Conclusion

This work has presented a comprehensive and effective framework for predicting the Remaining Useful Life (RUL) of lithium-ion batteries. The core contributions are threefold. First, a systematic health feature engineering methodology was developed, using Incremental Capacity (IC) curves denoised via a Kalman filter and extracting multiple voltage-point and peak-based Health Factors (HFs). Spearman correlation analysis was then applied to select a compact feature set that is highly indicative of lithium-ion battery degradation. Second, to address the instability inherent in the standard Extreme Learning Machine (ELM), the Subtraction-Average-Based Optimizer (SABO) was integrated to optimize ELM’s input weights and hidden biases. The resulting SABO-ELM hybrid reduces the model's sensitivity to random initialization, yielding a stable and highly accurate predictor.

Third, extensive validation on the NASA lithium-ion battery dataset showed that the SABO-ELM model outperformed the benchmark models, namely the standard ELM, BWO-ELM, and an LSTM network. With 70% of the cycles used for training, its MAPE and RMSE stayed below 2% on all three cells; when prediction started from the 50% cycle point, MAPE remained below 1.6% and RMSE below 2.2%. The method’s ability to track capacity fade, including regeneration phenomena, underscores its practical value. Future work will focus on validating the framework on larger and more diverse battery datasets under varying operational profiles and on exploring transfer learning techniques to improve its generalizability across different lithium-ion battery chemistries and formats.