The accurate prognostics and health management (PHM) of lithium-ion batteries is paramount for ensuring the safety, reliability, and economic efficiency of systems ranging from electric vehicles to grid-scale energy storage. At the core of PHM lies the precise prediction of the battery’s remaining useful life (RUL), which signifies the number of operational cycles remaining before the battery’s capacity degrades to a predefined failure threshold. A reliable RUL prediction facilitates timely maintenance, prevents catastrophic failures, and optimizes battery usage strategies. However, forecasting the RUL of a lithium-ion battery is inherently challenging due to the complex, nonlinear, and dynamic electrochemical degradation processes involved.

Existing prediction methodologies can be broadly categorized into model-based and data-driven approaches. Model-based methods, such as electrochemical or equivalent circuit models, attempt to describe the internal physics of the lithium-ion battery. While potentially accurate, they often require intricate parameterization, are sensitive to operating conditions, and struggle to generalize across different battery types and usage patterns. Data-driven methods, in contrast, bypass the need for explicit physical models by learning the degradation patterns directly from historical operational data. Techniques like Support Vector Machines (SVM) and Recurrent Neural Networks (RNN) have shown promise. However, standard RNNs suffer from vanishing or exploding gradient problems when learning long-term dependencies in time-series data like capacity fade.
The Long Short-Term Memory (LSTM) network, a specialized variant of RNN, effectively addresses these issues through its gated cell structure, making it a powerful tool for sequence prediction tasks like battery RUL forecasting. Nonetheless, the performance of an LSTM model is highly sensitive to its hyperparameters (e.g., number of hidden units, learning rate). Manually tuning these parameters is suboptimal and time-consuming. Furthermore, the raw capacity degradation data of a lithium-ion battery is often non-stationary and contains noise, which can impede the learning process of any data-driven model.
To overcome these limitations and achieve superior prediction accuracy and robustness, this work proposes a novel hybrid prediction framework that synergistically integrates Empirical Mode Decomposition (EMD), a genetically-enhanced Particle Swarm Optimization algorithm (GAIPSO), and an LSTM network. The core innovation lies in a multi-stage processing pipeline: first, the non-stationary capacity data is decomposed into simpler, more predictable components using EMD; second, a comprehensively improved PSO algorithm, which incorporates chaotic initialization, adaptive mechanisms, and genetic operations, is developed to efficiently and globally optimize the LSTM’s hyperparameters; finally, the optimized LSTM models are trained on the decomposed data components to perform the RUL prediction. The efficacy of the proposed EMD-GAIPSO-LSTM model is rigorously validated using publicly available lithium-ion battery aging datasets.
Methodological Framework
The proposed framework for lithium-ion battery RUL prediction follows a structured pipeline designed to handle data complexity and model optimization challenges. The complete workflow comprises four stages, elaborated in the subsequent sections.
1. Data Acquisition and Preprocessing: The cycle-by-cycle discharge capacity of the lithium-ion battery, which serves as the primary health indicator, is collected. The sequence is then normalized to facilitate stable neural network training.
2. Signal Decomposition via EMD: The normalized capacity fade sequence, denoted as \( s(t) \), is processed using Empirical Mode Decomposition. EMD adaptively decomposes the complex, nonlinear signal into a finite set of Intrinsic Mode Functions (IMFs) and a residual trend \( r(t) \). This process isolates different frequency components inherent in the degradation process, effectively denoising the data and simplifying the prediction task for subsequent models. The original signal can be reconstructed as:
$$ s(t) = \sum_{i=1}^{k} IMF_i(t) + r_k(t) $$
Each resulting component (IMFs and residual) represents a smoother, more stationary sub-series, making them more amenable to modeling with LSTM networks.
3. Hyperparameter Optimization with GAIPSO: A separate LSTM model is constructed to forecast each decomposed component. The critical hyperparameters of each LSTM model are not set manually but are tuned by the proposed GAIPSO algorithm, an improved PSO hybridized with genetic algorithm operators. GAIPSO is a significant enhancement over standard PSO, designed to escape local optima and achieve superior global search capability across the complex parameter space of LSTM models applied to lithium-ion battery data.
4. Component Prediction and Reconstruction: Each optimized LSTM model is trained on its corresponding decomposed component from the training dataset. After training, the models predict the future trajectory of each component. The final RUL prediction for the lithium-ion battery is obtained by summing the predictions of all individual components (IMFs and residual) to reconstruct the complete capacity fade curve. The RUL is then calculated as the number of cycles from the prediction start point until the reconstructed capacity falls below the failure threshold (e.g., 70% of nominal capacity).
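The EMD step of the pipeline above can be sketched as follows. This is a minimal, illustrative implementation of the sifting procedure (the function names `emd` and `_envelope` are ours, not from the original work), assuming NumPy and SciPy are available; production code would typically use a dedicated library such as PyEMD.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def _envelope(x, idx):
    """Cubic-spline envelope through the extrema, pinned at both endpoints."""
    pts = np.concatenate(([0], idx, [len(x) - 1]))
    return CubicSpline(pts, x[pts])(np.arange(len(x)))

def emd(signal, max_imfs=8, sd_thresh=0.2, max_sift=50):
    """Minimal EMD: returns (imfs, residual) such that
    signal == sum(imfs) + residual, by construction."""
    residual = np.asarray(signal, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        h = residual.copy()
        for _ in range(max_sift):
            # strict local maxima / minima of the current sift iterate
            maxima = np.where((h[1:-1] > h[:-2]) & (h[1:-1] > h[2:]))[0] + 1
            minima = np.where((h[1:-1] < h[:-2]) & (h[1:-1] < h[2:]))[0] + 1
            if len(maxima) < 2 or len(minima) < 2:
                break                      # (near-)monotonic: no IMF left
            mean_env = 0.5 * (_envelope(h, maxima) + _envelope(h, minima))
            h_new = h - mean_env
            # simplified Cauchy-style stopping criterion for the sift
            sd = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
            h = h_new
            if sd < sd_thresh:
                break
        if np.allclose(h, residual):       # nothing was extracted: stop
            break
        imfs.append(h)
        residual = residual - h
    return imfs, residual
```

Because each IMF is subtracted from the running residual, summing the returned components reconstructs the input exactly, mirroring the reconstruction identity above.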
The Enhanced GAIPSO Optimization Algorithm
The standard Particle Swarm Optimization (PSO) algorithm is a population-based metaheuristic inspired by bird flocking. Each particle, representing a candidate solution (a set of LSTM hyperparameters), moves through the search space by adjusting its velocity \( v_{ij} \) and position \( x_{ij} \) based on its own best-known position (\( p_{ij} \)) and the swarm’s global best-known position (\( p_{gj} \)). The update equations for the \( i \)-th particle in dimension \( j \) at iteration \( t \) are:
$$ v_{ij}^{t+1} = \omega \cdot v_{ij}^{t} + c_1 r_1 (p_{ij} - x_{ij}^{t}) + c_2 r_2 (p_{gj} - x_{ij}^{t}) $$
$$ x_{ij}^{t+1} = x_{ij}^{t} + v_{ij}^{t+1} $$
where \( \omega \) is the inertia weight, \( c_1 \) and \( c_2 \) are acceleration coefficients, and \( r_1, r_2 \) are random numbers in [0,1]. For lithium-ion battery RUL prediction, the fitness of a particle is defined as the Root Mean Square Error (RMSE) of the LSTM model’s capacity prediction on a validation set:
$$ \text{fitness} = \sqrt{ \frac{1}{n} \sum_{k=1}^{n} (y_k - \hat{y}_k)^2 } $$
where \( y_k \) is the true capacity and \( \hat{y}_k \) is the predicted capacity.
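As a baseline illustration, the sketch below implements the standard PSO update equations, minimizing the Sphere benchmark as a stand-in fitness; in the battery application, each fitness call would instead train an LSTM with the candidate hyperparameters and return its validation RMSE. All names here are illustrative.

```python
import numpy as np

def pso(fitness, dim, n_particles=30, iters=200, bounds=(-5.0, 5.0),
        omega=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO minimizing `fitness`; returns (best position, best value)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros((n_particles, dim))              # particle velocities
    p_best = x.copy()                             # personal best positions
    p_fit = np.array([fitness(p) for p in x])     # personal best values
    g_best = p_best[np.argmin(p_fit)].copy()      # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # velocity and position updates from the equations above
        v = omega * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([fitness(p) for p in x])
        improved = f < p_fit
        p_best[improved], p_fit[improved] = x[improved], f[improved]
        g_best = p_best[np.argmin(p_fit)].copy()
    return g_best, float(p_fit.min())

sphere = lambda z: float(np.sum(z ** 2))          # stand-in fitness
best_x, best_f = pso(sphere, dim=5)
```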
While effective for many problems, standard PSO can prematurely converge to local optima when applied to the high-dimensional, non-convex optimization landscape of LSTM hyperparameters for lithium-ion battery modeling. To overcome this, we introduce four key enhancements, culminating in the GAIPSO algorithm.
1. Logistic Chaotic Mapping for Population Initialization: To ensure a diverse initial population that better covers the search space, the initial positions of particles are generated using a Logistic chaotic map instead of pure random sampling. This increases the ergodicity of the initial solutions.
$$ X_{i+1} = \mu X_i (1 - X_i), \quad X_i \in (0,1), \ \mu \in (3.57, 4] $$
2. Adaptive Inertia Weight: Instead of a linearly decreasing inertia weight, a dynamically adaptive weight is employed. This weight adjusts based on the particle’s current fitness relative to the swarm’s average and minimum fitness, allowing for more refined exploration and exploitation.
$$
\omega_{ij}^t =
\begin{cases}
\omega_{min} + (\omega_{max} - \omega_{min}) \cdot \dfrac{f(x_{ij}^t) - f_{min}^t}{f_a^t - f_{min}^t}, & \text{if } f(x_{ij}^t) \le f_a^t \\
\omega_{max}, & \text{if } f(x_{ij}^t) > f_a^t
\end{cases}
$$
where \( f_a^t \) is the average fitness and \( f_{min}^t \) is the minimum fitness at iteration \( t \).
3. Improved Velocity Update Formula: The velocity update is modified to guide particles using the midpoint between the personal best and global best, encouraging a more balanced search direction.
$$ v_{ij}^{t+1} = \omega \cdot v_{ij}^{t} + c_1 r_1 \left( \frac{p_{ij} + p_{gj}}{2} - x_{ij}^{t} \right) + c_2 r_2 \left( \frac{p_{ij} - p_{gj}}{2} - x_{ij}^{t} \right) $$
4. Fusion with Genetic Algorithm (GA) Operations: After the PSO position update, genetic algorithm operators—selection, crossover, and mutation—are applied to the swarm to further enhance global search capability and population diversity.
- Selection: Particles are selected for the genetic operations using a roulette wheel selection based on the inverse of their fitness, giving fitter particles a higher probability.
- Crossover: A single-point crossover is performed between selected parent particles to generate offspring, exchanging hyperparameter information.
- Mutation: A Gaussian mutation operator is applied to the offspring to introduce random variations and help escape local optima:
$$ x_{ij}^{new} = x_{ij}^{cross} + \sigma \cdot \mathcal{N}(0,1) $$
where \( \sigma \) is a scaling parameter.
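The four enhancements might be coded roughly as follows. This is a sketch under our own naming and conventions (e.g. the crossover pairing and the clamping of the chaotic orbit are assumptions; the original work may differ in detail):

```python
import numpy as np

def chaotic_init(n_particles, dim, lo, hi, mu=3.99, seed=0):
    """Enhancement 1: logistic-map initialization, X_{i+1} = mu*X_i*(1-X_i)."""
    x = np.random.default_rng(seed).uniform(0.1, 0.9, dim)  # seed orbit in (0,1)
    pop = np.empty((n_particles, dim))
    for i in range(n_particles):
        x = mu * x * (1 - x)                  # one chaotic step per particle
        pop[i] = lo + (hi - lo) * x           # map (0,1) onto the search range
    return pop

def adaptive_omega(f, w_min=0.4, w_max=0.9):
    """Enhancement 2: fitter-than-average particles get a smaller inertia
    weight (exploitation); the rest keep w_max (exploration)."""
    f_avg, f_min = f.mean(), f.min()
    w = np.full_like(f, w_max)
    better = f <= f_avg
    w[better] = w_min + (w_max - w_min) * (f[better] - f_min) / max(f_avg - f_min, 1e-12)
    return w

def gaipso_velocity(v, x, p_best, g_best, w, c1, c2, rng):
    """Enhancement 3: velocity update steering via midpoints of p_best/g_best."""
    r1, r2 = rng.random(2)
    return (w[:, None] * v
            + c1 * r1 * ((p_best + g_best) / 2 - x)
            + c2 * r2 * ((p_best - g_best) / 2 - x))

def genetic_step(pop, f, sigma=0.1, rng=None):
    """Enhancement 4: roulette selection on inverse fitness, single-point
    crossover, then Gaussian mutation x_new = x_cross + sigma * N(0, 1)."""
    if rng is None:
        rng = np.random.default_rng()
    n, dim = pop.shape
    weights = 1.0 / (f - f.min() + 1e-9)      # lower fitness -> larger weight
    parents = pop[rng.choice(n, size=n, p=weights / weights.sum())]
    children = parents.copy()
    for a in range(0, n - 1, 2):              # crossover on consecutive pairs
        cut = rng.integers(1, dim) if dim > 1 else 0
        children[a, cut:], children[a + 1, cut:] = (
            parents[a + 1, cut:].copy(), parents[a, cut:].copy())
    return children + sigma * rng.normal(size=children.shape)
```

In a full GAIPSO loop, `chaotic_init` replaces random initialization, `adaptive_omega` and `gaipso_velocity` replace the standard velocity update, and `genetic_step` is applied to the swarm after each position update.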
The performance of the proposed GAIPSO algorithm was benchmarked against standard PSO and the Sparrow Search Algorithm (SSA) using three standard test functions (Sphere, Schwefel 1.2, Rastrigin). The results, averaged over 10 independent runs, are summarized in the table below. GAIPSO consistently achieved results closest to the theoretical optimum (0) with the smallest standard deviation, demonstrating superior accuracy and robustness.
| Function | Algorithm | Best Value | Average Value | Standard Deviation |
|---|---|---|---|---|
| F1 (Sphere) | SSA | 0.932 | 1.091 | 1.212 |
| F1 (Sphere) | PSO | 0 | 3.264e-4 | 2.254e-3 |
| F1 (Sphere) | GAIPSO | 0 | 2.674e-9 | 1.036e-9 |
| F2 (Schwefel 1.2) | SSA | 0.317 | 0.565 | 0.227 |
| F2 (Schwefel 1.2) | PSO | 3.942e-4 | 4.312e-2 | 6.229e-3 |
| F2 (Schwefel 1.2) | GAIPSO | 0 | 1.538e-8 | 8.264e-9 |
| F3 (Rastrigin) | SSA | 0.307 | 0.384 | 0.262 |
| F3 (Rastrigin) | PSO | 0 | 2.186e-4 | 1.529e-3 |
| F3 (Rastrigin) | GAIPSO | 0 | 0 | 0 |
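For reference, the three benchmark functions have standard closed forms, each with a theoretical optimum of 0 at the origin; they can be written compactly as:

```python
import numpy as np

def sphere(z):
    """F1: unimodal sum of squares."""
    return float(np.sum(z ** 2))

def schwefel_1_2(z):
    """F2: sum of squared cumulative sums (non-separable)."""
    return float(np.sum(np.cumsum(z) ** 2))

def rastrigin(z):
    """F3: highly multimodal, with a regular grid of local minima."""
    return float(np.sum(z ** 2 - 10 * np.cos(2 * np.pi * z) + 10))
```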
LSTM Neural Network Fundamentals
The Long Short-Term Memory network is a specialized recurrent neural network architecture designed to model long-range temporal dependencies. Its key innovation is the memory cell, which maintains a state over time, regulated by three gating mechanisms: the forget gate, input gate, and output gate.
At each time step \( t \), the LSTM unit receives the current input \( Z_t \) and the previous hidden state \( H_{t-1} \). The operations within the unit are defined as follows:
Forget Gate (\(f_t\)): Determines what information from the previous cell state \( C_{t-1} \) should be discarded.
$$ f_t = \sigma(W_f \cdot [H_{t-1}, Z_t] + b_f) $$
Input Gate (\(i_t\)) and Candidate Cell State (\(\tilde{C}_t\)): The input gate decides which new values to update, and a tanh layer creates a vector of candidate values.
$$ i_t = \sigma(W_i \cdot [H_{t-1}, Z_t] + b_i) $$
$$ \tilde{C}_t = \tanh(W_c \cdot [H_{t-1}, Z_t] + b_c) $$
Cell State Update (\(C_t\)): The old cell state is updated by combining the selectively forgotten past information and the selectively remembered new candidate information.
$$ C_t = f_t * C_{t-1} + i_t * \tilde{C}_t $$
Output Gate (\(o_t\)) and Hidden State (\(H_t\)): The output gate filters the updated cell state to produce the new hidden state, which is also the output for that time step.
$$ o_t = \sigma(W_o \cdot [H_{t-1}, Z_t] + b_o) $$
$$ H_t = o_t * \tanh(C_t) $$
Here, \( \sigma \) denotes the sigmoid activation function, \( * \) denotes element-wise multiplication, \( W \) terms are weight matrices, and \( b \) terms are bias vectors. For predicting the degradation of a lithium-ion battery, the sequence of capacity values forms the input \( Z_t \), and the LSTM learns to output the subsequent capacity values.
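The gate equations translate directly into code. The sketch below runs a toy capacity sequence through a single randomly initialized LSTM cell; the weight shapes, dictionary layout, and capacity values are illustrative, not the trained model from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(Z_t, H_prev, C_prev, W, b):
    """One LSTM step following the gate equations above.

    W maps gate name -> weight matrix of shape (hidden, hidden + input);
    b maps gate name -> bias vector of shape (hidden,).
    """
    zh = np.concatenate([H_prev, Z_t])           # [H_{t-1}, Z_t]
    f = sigmoid(W['f'] @ zh + b['f'])            # forget gate
    i = sigmoid(W['i'] @ zh + b['i'])            # input gate
    C_tilde = np.tanh(W['c'] @ zh + b['c'])      # candidate cell state
    C = f * C_prev + i * C_tilde                 # cell state update
    o = sigmoid(W['o'] @ zh + b['o'])            # output gate
    H = o * np.tanh(C)                           # hidden state / output
    return H, C

# run a toy capacity sequence through a randomly initialized cell
rng = np.random.default_rng(0)
hidden, n_in = 8, 1
W = {g: rng.normal(scale=0.1, size=(hidden, hidden + n_in)) for g in 'fico'}
b = {g: np.zeros(hidden) for g in 'fico'}
H, C = np.zeros(hidden), np.zeros(hidden)
for cap in [1.85, 1.84, 1.82, 1.81]:             # toy capacity values (Ah)
    H, C = lstm_step(np.array([cap]), H, C, W, b)
```

In practice a trained dense output layer would map the final hidden state `H` to the next predicted capacity value.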
Experimental Validation and Results
To validate the proposed EMD-GAIPSO-LSTM model, experiments were conducted using the publicly available NASA Prognostics Center of Excellence lithium-ion battery dataset. Four cells (B0005, B0006, B0007, B0018) were selected. All cells had a nominal capacity of 2 Ah and were cycled under identical conditions: charged at 1.5 A in constant current (CC) mode until reaching 4.2 V, followed by constant voltage (CV) charging until the current dropped to 20 mA, and then discharged at 2 A in CC mode to a specified cutoff voltage. The battery is considered to have reached its end-of-life (EOL) when its capacity fades to 70% of its nominal value (1.4 Ah). The cycling parameters are detailed below.
| Cell ID | Charge Current (A) | Discharge Current (A) | Charge/Discharge Cutoff Voltage (V) | Nominal Capacity (Ah) |
|---|---|---|---|---|
| B0005 | 1.5 | 2.0 | 4.2 / 2.7 | 2.0 |
| B0006 | 1.5 | 2.0 | 4.2 / 2.5 | 2.0 |
| B0007 | 1.5 | 2.0 | 4.2 / 2.3 | 2.0 |
| B0018 | 1.5 | 2.0 | 4.2 / 2.5 | 2.0 |
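Given a predicted capacity trajectory, the end-of-life rule above reduces to finding the first crossing of the 1.4 Ah threshold. A small helper might look like this (the function name and toy fade curve are ours, for illustration):

```python
import numpy as np

def rul_from_capacity(capacity, start_cycle, threshold=1.4):
    """Cycles from start_cycle until capacity first falls below threshold.

    Returns None if the threshold is never crossed in the trajectory.
    """
    capacity = np.asarray(capacity, dtype=float)
    below = np.where(capacity < threshold)[0]
    if len(below) == 0:
        return None
    return int(below[0] - start_cycle)

fade = np.linspace(2.0, 1.3, 140)    # toy linear fade over 140 cycles
rul = rul_from_capacity(fade, start_cycle=70)
```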
The capacity fade sequences from these four lithium-ion batteries were used for training and testing. Two prediction scenarios were established to evaluate model robustness with limited early-cycle data: using the first 50% and the first 60% of the capacity data as the training set, with the remaining cycles used for testing. The proposed EMD-GAIPSO-LSTM model was compared against two benchmark models: a standard PSO-optimized LSTM (PSO-LSTM) and a Sparrow Search Algorithm-optimized LSTM (SSA-LSTM).
The hyperparameters for the LSTM optimized by GAIPSO included the number of hidden units, initial learning rate, maximum number of epochs, learning rate drop factor, and learning rate drop period. The search spaces for these parameters were defined, and the GAIPSO algorithm was tasked with finding the optimal combination that minimizes the prediction RMSE on a held-out validation set from the training data.
The prediction performance was quantitatively evaluated using three standard metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the Coefficient of Determination (\(R^2\)).
$$ MAE = \frac{1}{n} \sum_{k=1}^{n} | y_k - \hat{y}_k | $$
$$ RMSE = \sqrt{ \frac{1}{n} \sum_{k=1}^{n} ( y_k - \hat{y}_k )^2 } $$
$$ R^2 = 1 - \frac{ \sum_{k=1}^{n} ( y_k - \hat{y}_k )^2 }{ \sum_{k=1}^{n} ( y_k - \bar{y} )^2 } $$
Lower MAE and RMSE values indicate better accuracy, while an \(R^2\) value closer to 1 indicates a better fit of the model to the actual lithium-ion battery degradation trend.
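The three metrics are straightforward to compute from the measured and predicted capacity series; the function names and toy values below are illustrative:

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

y = np.array([1.80, 1.78, 1.75, 1.71, 1.66])      # measured capacity (Ah)
y_hat = np.array([1.81, 1.77, 1.75, 1.70, 1.67])  # predicted capacity (Ah)
```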
The prediction results for all four lithium-ion battery cells under both training scenarios are consolidated in the following table. In every case, the EMD-GAIPSO-LSTM predictions closely tracked the actual capacity fade trajectory, even near the end-of-life region where the drop steepens, whereas the PSO-LSTM and SSA-LSTM models deviated more, especially in the later cycles.
| Cell ID | Train % | Model | MAE | RMSE | R² |
|---|---|---|---|---|---|
| B0005 | 60% | SSA-LSTM | 0.02022 | 0.02484 | 0.9626 |
| B0005 | 60% | PSO-LSTM | 0.01785 | 0.02159 | 0.9718 |
| B0005 | 60% | EMD-GAIPSO-LSTM | 0.01204 | 0.01372 | 0.9891 |
| B0005 | 50% | SSA-LSTM | 0.03150 | 0.03591 | 0.9221 |
| B0005 | 50% | PSO-LSTM | 0.02593 | 0.02987 | 0.9462 |
| B0005 | 50% | EMD-GAIPSO-LSTM | 0.01873 | 0.02112 | 0.9731 |
| B0006 | 60% | SSA-LSTM | 0.01895 | 0.02213 | 0.9674 |
| B0006 | 60% | PSO-LSTM | 0.01681 | 0.01945 | 0.9748 |
| B0006 | 60% | EMD-GAIPSO-LSTM | 0.01178 | 0.01345 | 0.9872 |
| B0006 | 50% | SSA-LSTM | 0.02834 | 0.03208 | 0.9330 |
| B0006 | 50% | PSO-LSTM | 0.02340 | 0.02710 | 0.9521 |
| B0006 | 50% | EMD-GAIPSO-LSTM | 0.01702 | 0.01958 | 0.9750 |
| B0007 | 60% | SSA-LSTM | 0.01988 | 0.02342 | 0.9601 |
| B0007 | 60% | PSO-LSTM | 0.01765 | 0.02067 | 0.9689 |
| B0007 | 60% | EMD-GAIPSO-LSTM | 0.01191 | 0.01388 | 0.9860 |
| B0007 | 50% | SSA-LSTM | 0.03012 | 0.03465 | 0.9255 |
| B0007 | 50% | PSO-LSTM | 0.02488 | 0.02893 | 0.9483 |
| B0007 | 50% | EMD-GAIPSO-LSTM | 0.01745 | 0.01986 | 0.9744 |
| B0018 | 60% | SSA-LSTM | 0.02055 | 0.02410 | 0.9645 |
| B0018 | 60% | PSO-LSTM | 0.01811 | 0.02101 | 0.9729 |
| B0018 | 60% | EMD-GAIPSO-LSTM | 0.01216 | 0.01398 | 0.9883 |
| B0018 | 50% | SSA-LSTM | 0.03189 | 0.03644 | 0.9218 |
| B0018 | 50% | PSO-LSTM | 0.02631 | 0.03021 | 0.9467 |
| B0018 | 50% | EMD-GAIPSO-LSTM | 0.01891 | 0.02145 | 0.9733 |
The results clearly demonstrate the superiority of the proposed EMD-GAIPSO-LSTM model across all test cases for the lithium-ion batteries. Key observations are:
1. Superior Accuracy: The EMD-GAIPSO-LSTM model consistently achieved the lowest MAE and RMSE and the highest \(R^2\) values. For instance, with 60% training data for cell B0005, it reduced MAE by approximately 0.00818 and RMSE by 0.01112 relative to the SSA-LSTM model, while improving \(R^2\) by over 0.0265. Across all four cells in the 60% training case, its MAE stayed at or below 0.01216, its RMSE at or below 0.01398, and its \(R^2\) at or above 0.9860.
2. Robustness to Limited Data: As expected, prediction accuracy degraded for all models when only 50% of the data was used for training. However, the degradation was less severe for the EMD-GAIPSO-LSTM model than for the benchmarks: it maintained \(R^2\) values at or above 0.9731, indicating a strong capability to learn degradation trends from early-cycle data, which is critical for practical early-stage RUL prediction of a lithium-ion battery.
3. Effective Framework: The integration of EMD successfully handled the non-stationarity in the lithium-ion battery capacity data. The GAIPSO algorithm proved more effective than standard PSO and SSA in finding optimal LSTM configurations, leading to more accurate and stable prediction models. The hybrid approach effectively combines signal processing, advanced optimization, and deep learning, making it a powerful tool for lithium-ion battery prognostics.
Conclusion
This research presents a comprehensive and effective data-driven framework for predicting the remaining useful life of lithium-ion batteries. The proposed EMD-GAIPSO-LSTM model addresses key challenges in battery prognostics: the non-stationary nature of degradation signals and the difficulty in tuning deep learning model parameters. The empirical mode decomposition acts as a powerful preprocessor, simplifying the complex capacity fade signal into more manageable components. The genetically-enhanced particle swarm optimization algorithm represents a significant improvement over conventional optimizers, exhibiting superior global search capability and convergence speed, which is crucial for efficiently navigating the hyperparameter space of LSTM models. When applied to real-world lithium-ion battery aging data, the integrated model demonstrates remarkable prediction accuracy and robustness, outperforming established benchmark models like SSA-LSTM and PSO-LSTM by a significant margin, with an average accuracy improvement of over 4.7% and 2.5%, respectively.
The implications of this work are substantial for the field of battery management systems. The enhanced prediction accuracy and stability, especially with limited early-cycle data, make the EMD-GAIPSO-LSTM model a promising candidate for practical deployment. In electric vehicles, accurate early RUL forecasts can inform more intelligent energy management strategies, optimally balancing battery usage with other power sources to prolong the battery pack’s service life. In grid storage applications, it can enable predictive maintenance schedules, reducing downtime and operational costs. Future work will focus on extending this framework to incorporate real-time operational data from batteries, such as voltage, current, and temperature profiles, to further improve prediction adaptability under varying load conditions and to validate the model on a broader range of lithium-ion battery chemistries and formats.
