Aging Performance Modeling of Li-ion Batteries Based on VaDE-WGANGP

Studying the aging performance of Li-ion battery systems is fundamental to ensuring reliability and safety in applications ranging from electric vehicles to grid-scale energy storage. Full life-cycle aging tests, however, are time-consuming and resource-intensive, which creates a bottleneck for research that relies on large, statistically significant datasets. Data-driven algorithms for state-of-health (SOH) estimation or lifetime prediction also require substantial volumes of training data, which may not be readily available. To address this data scarcity and to enable a more detailed analysis of inter-cell aging variations, this work proposes a generative modeling framework for simulating the fundamental external characteristics of Li-ion battery cells across their entire lifespan.

The core methodology integrates two generative models: Variational Deep Embedding (VaDE) and the Wasserstein Generative Adversarial Network with Gradient Penalty (WGANGP). The combined VaDE-WGANGP architecture is designed to learn the underlying probability distribution of battery aging data. The model takes as input the key operational signatures of a battery's discharge cycle: the voltage, current, and discharged capacity profiles. Through its encoder, the VaDE component maps this high-dimensional data into a structured latent space governed by a Gaussian Mixture Model (GMM). This latent space captures the statistical characteristics of cells at various states of health and the differences between them. By sampling from the learned distribution and passing the samples through the decoder (which also serves as the generator for the WGANGP), the model can synthesize realistic, novel battery cycling data. This approach can be used to augment limited experimental datasets, to support the development and validation of data-driven battery management algorithms, and to analyze aging heterogeneity among Li-ion battery cells.

Training and evaluation are conducted on a publicly available dataset comprising full life-cycle tests of numerous Li-ion battery cells. The performance analysis shows that the VaDE-WGANGP model not only clusters aging data in the latent space better than standalone models such as VAE or VaDE but also generates higher-fidelity simulated voltage, current, and capacity curves. The utility of the generated data is further validated by using it to augment the training set of a convolutional neural network-based SOH estimator, which improves estimation accuracy. This indicates that the synthetic data preserves the essential aging characteristics of real Li-ion battery cells.

Methodological Framework: The VaDE-WGANGP Architecture

The proposed model integrates a clustering-oriented generative network with an adversarial training framework. Its primary objective is to learn a rich, structured representation of Li-ion battery aging data and then use this representation for high-quality data generation.

1. Variational Deep Embedding (VaDE) as a Structured Prior

VaDE extends the standard Variational Autoencoder (VAE) by imposing a Gaussian Mixture Model (GMM) as the prior distribution in the latent space. For a dataset of battery cycling profiles \( \mathbf{x} \), VaDE assumes the data is generated from a latent variable \( \mathbf{z} \) which follows a mixture of \( K \) Gaussian distributions. The generative process is:
$$ p(\mathbf{x}, \mathbf{z}, t) = p_{\theta}(\mathbf{x}|\mathbf{z}) p(\mathbf{z}|t) p(t) $$
where \( t \) is a categorical variable indicating the mixture component, \( p(t) = \text{Cat}(\boldsymbol{\pi}) \), \( p(\mathbf{z}|t) = \mathcal{N}(\mathbf{z}|\boldsymbol{\mu}_t, \boldsymbol{\sigma}_t^2\mathbf{I}) \), and \( p_{\theta}(\mathbf{x}|\mathbf{z}) \) is the likelihood (decoder). The key is to infer the posterior \( p(\mathbf{z}, t|\mathbf{x}) \), which is approximated by a variational distribution \( q_{\phi}(\mathbf{z}, t|\mathbf{x}) = q_{\phi}(\mathbf{z}|\mathbf{x}) q_{\phi}(t|\mathbf{x}) \). The model is trained by maximizing the Evidence Lower Bound (ELBO):
$$ \mathcal{L}_{\text{VaDE}} = \mathbb{E}_{q_{\phi}(\mathbf{z}, t|\mathbf{x})}[\log p_{\theta}(\mathbf{x}|\mathbf{z})] - \text{KL}(q_{\phi}(\mathbf{z}, t|\mathbf{x}) \,\|\, p(\mathbf{z}, t)) $$
This objective decomposes into a reconstruction term and a regularization term that forces the encoded distribution to align with the GMM prior. In the context of Li-ion battery aging, each mixture component \( t \) can naturally correspond to a distinct SOH interval, allowing the model to disentangle and cluster aging stages within the latent space.
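To make the role of the mixture component \( t \) concrete, the sketch below computes the posterior over components for a single latent point, \( p(t=k|\mathbf{z}) \propto \pi_k \, \mathcal{N}(\mathbf{z}|\boldsymbol{\mu}_k, \boldsymbol{\sigma}_k^2\mathbf{I}) \). In the original VaDE formulation this quantity is commonly used as the approximation of \( q_{\phi}(t|\mathbf{x}) \); whether the model described here does exactly that is an assumption. The function names and the toy parameter values are illustrative and not taken from the source.

```python
import math

def log_gaussian_diag(z, mu, sigma2):
    """Log density of a Gaussian N(z | mu, diag(sigma2)) with per-dimension variances."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s2) + (zi - mi) ** 2 / s2)
        for zi, mi, s2 in zip(z, mu, sigma2)
    )

def component_posterior(z, pis, mus, sigma2s):
    """Return p(t = k | z) for every mixture component k (the values sum to 1)."""
    log_joint = [
        math.log(pi) + log_gaussian_diag(z, mu, s2)
        for pi, mu, s2 in zip(pis, mus, sigma2s)
    ]
    m = max(log_joint)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Toy 2-D latent space with K = 2 components (illustrative values only).
pis = [0.5, 0.5]
mus = [[0.0, 0.0], [4.0, 4.0]]
sigma2s = [[1.0, 1.0], [1.0, 1.0]]
gamma = component_posterior([0.2, -0.1], pis, mus, sigma2s)
```

Taking the index of the largest entry of `gamma` gives a hard cluster assignment, which is the kind of assignment that can later be compared against SOH interval labels.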

2. Wasserstein GAN with Gradient Penalty (WGANGP) for Enhanced Generation

While VaDE provides a structured latent space, the quality of data generated from its decoder can be further refined. Generative Adversarial Networks (GANs) excel at producing sharp, realistic samples. We employ WGANGP, a stabilized variant of GAN, which minimizes the Wasserstein-1 distance between real and generated data distributions. It consists of a generator \( G \) and a critic \( D \). The critic loss with gradient penalty is:
$$ \mathcal{L}_D = \mathbb{E}_{\tilde{\mathbf{x}} \sim \mathbb{P}_g}[D(\tilde{\mathbf{x}})] - \mathbb{E}_{\mathbf{x} \sim \mathbb{P}_r}[D(\mathbf{x})] + \lambda \mathbb{E}_{\hat{\mathbf{x}} \sim \mathbb{P}_{\hat{\mathbf{x}}}}[ (\| \nabla_{\hat{\mathbf{x}}} D(\hat{\mathbf{x}}) \|_2 - 1)^2 ] $$
where \( \mathbb{P}_r \) is the real data distribution (aging profiles), \( \mathbb{P}_g \) is the generator distribution, \( \hat{\mathbf{x}} \) is a random interpolation between real and generated samples, and \( \lambda \) is the penalty coefficient. The generator aims to minimize:
$$ \mathcal{L}_G = -\mathbb{E}_{\tilde{\mathbf{x}} \sim \mathbb{P}_g}[D(\tilde{\mathbf{x}})] $$
WGANGP offers more stable training and more meaningful loss values than standard GANs, which is important for modeling the complex dynamics of Li-ion battery discharge curves.
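The two losses above can be illustrated with a deliberately tiny example. The sketch below replaces the neural-network critic with a hypothetical one-layer critic \( D(\mathbf{x}) = \tanh(\mathbf{w} \cdot \mathbf{x} + b) \), so that the input gradient needed for the penalty can be written analytically instead of relying on automatic differentiation. The toy critic, the function names, and the default `lam=10.0` (matching the penalty coefficient reported later) are assumptions made for illustration; a real implementation would use a deep-learning framework.

```python
import math
import random

def critic(x, w, b):
    """Toy scalar critic D(x) = tanh(w . x + b); a stand-in for a neural network."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)

def critic_grad(x, w, b):
    """Analytic gradient of the toy critic with respect to its input x."""
    d = critic(x, w, b)
    return [(1.0 - d * d) * wi for wi in w]

def critic_loss(real, fake, w, b, lam=10.0, rng=random):
    """L_D = E[D(fake)] - E[D(real)] + lam * E[(||grad D(x_hat)||_2 - 1)^2]."""
    mean_fake = sum(critic(x, w, b) for x in fake) / len(fake)
    mean_real = sum(critic(x, w, b) for x in real) / len(real)
    penalty = 0.0
    for xr, xf in zip(real, fake):
        eps = rng.random()  # one interpolation weight per real/fake pair
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(xr, xf)]
        grad_norm = math.sqrt(sum(g * g for g in critic_grad(x_hat, w, b)))
        penalty += (grad_norm - 1.0) ** 2
    penalty /= len(real)
    return mean_fake - mean_real + lam * penalty

def generator_loss(fake, w, b):
    """L_G = -E[D(fake)]."""
    return -sum(critic(x, w, b) for x in fake) / len(fake)
```

The penalty term pulls the norm of the critic's input gradient toward 1 at points interpolated between real and generated samples, which is how WGAN-GP softly enforces the Lipschitz constraint.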

3. Integrated VaDE-WGANGP Model

The integrated architecture shares parameters between the VaDE decoder and the WGANGP generator. The encoder \( E_{\phi} \) compresses a Li-ion battery input profile \( \mathbf{x} \) to the parameters of the latent distribution \( (\boldsymbol{\mu}_z, \boldsymbol{\sigma}_z) \). The generator \( G_{\theta} \) takes a sampled latent variable \( \mathbf{z} \) (or a sample from the GMM) and reconstructs or generates a profile \( \hat{\mathbf{x}} \). The critic \( D_{\psi} \) then tries to distinguish between real \( \mathbf{x} \) and generated \( \hat{\mathbf{x}} \) profiles. The total loss function for the integrated model is a combination:
$$ \mathcal{L}_{\text{Total}} = \mathcal{L}_{\text{VaDE}} + \alpha \mathcal{L}_G + \beta \mathcal{L}_D $$
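where \( \alpha \) and \( \beta \) are weighting coefficients. Because \( \mathcal{L}_{\text{VaDE}} \) was defined above as a lower bound to be maximized while \( \mathcal{L}_G \) and \( \mathcal{L}_D \) are minimized, it presumably enters this sum as the negative ELBO; likewise, as is usual for WGAN-GP, the critic term would update only \( D_{\psi} \) while the other terms update \( E_{\phi} \) and \( G_{\theta} \). The adversarial feedback from \( D \) guides the generator (and consequently the encoder) to produce more realistic samples, improving on the sometimes blurry outputs of standalone VaDE. This combination allows the model not only to cluster aging data effectively but also to generate high-fidelity, novel aging trajectories for Li-ion battery cells.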
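Generation from the GMM prior can be sketched as two sampling steps: pick a mixture component (or fix it to target one aging stage), then draw a latent vector from that component's Gaussian. The snippet below shows only this sampling step; the trained generator \( G_{\theta} \) that would turn `z` into a profile is not shown. All names and parameter values are hypothetical placeholders, not learned values from the source.

```python
import random

def sample_component(pis, rng):
    """Draw a component index t ~ Cat(pi)."""
    return rng.choices(range(len(pis)), weights=pis, k=1)[0]

def sample_latent(component, mus, sigmas, rng):
    """Draw z ~ N(mu_t, diag(sigma_t^2)) for the chosen mixture component t."""
    return [rng.gauss(m, s) for m, s in zip(mus[component], sigmas[component])]

rng = random.Random(0)                       # fixed seed for repeatability
pis = [0.2] * 5                              # placeholder mixing weights, K = 5
mus = [[float(k)] * 10 for k in range(5)]    # placeholder means, latent dim 10
sigmas = [[0.1] * 10 for _ in range(5)]      # placeholder standard deviations

t = sample_component(pis, rng)               # unconditional: random component
z = sample_latent(t, mus, sigmas, rng)
z_fixed = sample_latent(2, mus, sigmas, rng) # conditional: force component 2
```

Fixing the component index is what would allow profiles to be generated for a chosen aging stage, provided the learned components do align with SOH intervals.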

Data Processing and Experimental Setup

The model was trained and validated using data from the Stanford Li-ion battery dataset, which contains full life-cycle tests of A123 18650 LFP cells. The following steps outline the data pipeline:

Data Selection: Discharge segments from 30 randomly selected cells were used. The discharge was performed at a constant current of 4 A, providing a consistent operational context in which to study aging effects.

Feature Extraction: Three time-series profiles from each discharge cycle were taken as input features: discharge voltage (V), discharge current (A), and discharged capacity (Ah). These encapsulate the primary external aging signatures of a Li-ion battery.

Preprocessing: Each profile was interpolated to a fixed length of 500 time steps, creating a uniform input dimension. The final input tensor shape was \( N \times 500 \times 3 \), where \( N \) is the total number of discharge cycles across all selected cells (9636 cycles). The SOH for each cycle was calculated and used to label data into five contiguous intervals: 100%-96%, 96%-92%, 92%-88%, 88%-84%, and 84%-80%.
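The preprocessing step can be sketched in plain Python as follows. The sketch assumes linear interpolation onto an evenly spaced time grid and assumes that SOH is expressed as a fraction of nominal capacity; the source specifies neither the interpolation scheme nor how interval boundaries are assigned, so `resample`, `soh_interval`, and the boundary handling are illustrative choices.

```python
def resample(times, values, n_points=500):
    """Linearly interpolate (times, values) onto n_points evenly spaced times.

    Assumes `times` is strictly increasing and has at least two entries.
    """
    t0, t1 = times[0], times[-1]
    out, j = [], 0
    for i in range(n_points):
        t = t0 + (t1 - t0) * i / (n_points - 1)
        # Advance to the source segment [times[j], times[j + 1]] containing t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

def soh_interval(soh):
    """Map SOH (as a fraction) to one of five labels.

    0: 100%-96%, 1: 96%-92%, 2: 92%-88%, 3: 88%-84%, 4: 84%-80%.
    Lower bounds are treated as inclusive (an assumption); returns None below 80%.
    """
    for label, lower in enumerate([0.96, 0.92, 0.88, 0.84, 0.80]):
        if soh >= lower:
            return label
    return None

# One cycle becomes a 500 x 3 array-like: voltage, current, and capacity columns.
voltage_500 = resample([0.0, 10.0, 20.0], [3.3, 3.2, 2.0], n_points=500)
```

Applying `resample` to each of the three profiles of a cycle and stacking the results yields one \( 500 \times 3 \) sample of the input tensor.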

| Hyperparameter | Value/Choice |
| --- | --- |
| Latent Space Dimension | 10 |
| Number of GMM Components (K) | 5 |
| Batch Size | 64 |
| Learning Rate | 0.001 |
| Optimizer | Adam |
| Gradient Penalty Coefficient (λ) | 10 |
| Pre-training Epochs (VAE) | 20 |
| Total Training Epochs | 100 |
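For reproducibility, the reported hyperparameters can be collected in a single configuration object. The sketch below is one possible way to do this; the class name `TrainConfig` and its field names are hypothetical, and the last two fields restate the input shape given in the preprocessing step rather than entries of the hyperparameter list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    latent_dim: int = 10          # latent space dimension
    n_components: int = 5         # number of GMM components K
    batch_size: int = 64
    learning_rate: float = 1e-3
    optimizer: str = "adam"
    gp_lambda: float = 10.0       # gradient penalty coefficient (lambda)
    pretrain_epochs: int = 20     # VAE pre-training epochs
    total_epochs: int = 100
    seq_len: int = 500            # time steps per profile after interpolation
    n_features: int = 3           # voltage, current, discharged capacity

cfg = TrainConfig()

# Batches per epoch if all 9636 cycles were used for training (an assumption;
# the source does not state how the cycles are split).
steps_per_epoch = -(-9636 // cfg.batch_size)  # ceiling division
```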

Results and Analysis

1. Latent Space Clustering of Aging Characteristics

The ability of the model to structure the latent space was evaluated by visualizing the encoded data using t-SNE. The VaDE-WGANGP model demonstrated superior clustering performance compared to VAE and standalone VaDE. The latent representations formed more distinct, separable clusters corresponding to the five SOH intervals, with minimal overlap between adjacent aging stages. This indicates that the model learned to compress the high-dimensional aging data into a lower-dimensional space where the progression of Li-ion battery degradation is clearly organized.

| Model | Clustering Accuracy (ACC) (%) |
| --- | --- |
| VAE | 30.48 |
| VaDE | 74.55 |
| VaDE-WGANGP | 83.68 |

The quantitative clustering accuracy, calculated by matching the GMM component assignments to the true SOH interval labels, confirms this observation. The adversarial loss (WGANGP) not only aids generation but also refines the discriminative power of the encoder, leading to a more informative latent space for Li-ion battery aging states.
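Clustering accuracy of this kind is typically computed by finding the one-to-one mapping from cluster indices to labels that maximizes the number of correctly matched samples. The sketch below assumes that definition (the source does not give the exact formula) and uses a brute-force search over permutations, which is practical for \( K = 5 \) (120 permutations); larger \( K \) would call for the Hungarian algorithm instead. The function name is illustrative.

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_ids, n_clusters):
    """Best-case accuracy over all one-to-one cluster-to-label mappings."""
    # counts[c][t] = number of samples in cluster c whose true label is t
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    for t, c in zip(true_labels, cluster_ids):
        counts[c][t] += 1
    best = max(
        sum(counts[c][perm[c]] for c in range(n_clusters))
        for perm in permutations(range(n_clusters))
    )
    return best / len(true_labels)

# Cluster indices are arbitrary, so a pure relabeling still scores 1.0.
acc = clustering_accuracy([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1], n_clusters=3)
```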

2. Quality of Generated Battery Profiles

The primary goal is to generate realistic discharge profiles. The quality was first assessed via the reconstruction error on the training set, which measures how well the model can encode and decode known data points. A lower reconstruction error suggests a more accurate underlying model.

| Model | Mean Squared Reconstruction Error |
| --- | --- |
| VAE | 0.8501 |
| VaDE | 0.0664 |
| VaDE-WGANGP | 0.0022 |

Qualitatively, the voltage profiles generated by VaDE-WGANGP for a specific SOH interval (e.g., 96%-92%) show high fidelity. They capture the key characteristics of real Li-ion battery discharge: the initial voltage drop, the long flat voltage plateau, and the final rapid voltage drop at the end of discharge. The generated curves are smooth and exhibit the natural cell-to-cell variation observed in the real dataset, unlike VAE-generated curves, which often appear noisy and miss key transitional features. The model generated plausible profiles for all three features (voltage, current, capacity) across all five SOH intervals, demonstrating its capability to simulate the entire aging trajectory of a Li-ion battery.

3. Utility of Generated Data for SOH Estimation

A critical test of the generated data's utility is its performance in downstream tasks. We augmented a small original training set of Li-ion battery data with synthetic data from each model (VAE, VaDE, VaDE-WGANGP). A Convolutional Neural Network (CNN) was then trained on each augmented set to estimate SOH from discharge voltage and current profiles, and tested on a held-out set of real cells.

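| Training Data | Test Cell | RMSE (%) | MAE (%) |
| --- | --- | --- | --- |
| Original Data Only | Cell 23 | 4.96 | 4.12 |
| Original Data Only | Cell 24 | 4.75 | 4.55 |
| Original Data Only | Cell 36 | 5.36 | 4.16 |
| Original Data Only | Cell 37 | 5.26 | 3.64 |
| Original + VAE Generated | Cell 23 | 6.77 | 5.26 |
| Original + VAE Generated | Cell 24 | 6.62 | 6.12 |
| Original + VAE Generated | Cell 36 | 8.96 | 7.54 |
| Original + VAE Generated | Cell 37 | 7.92 | 6.53 |
| Original + VaDE Generated | Cell 23 | 3.59 | 3.04 |
| Original + VaDE Generated | Cell 24 | 3.87 | 3.08 |
| Original + VaDE Generated | Cell 36 | 4.32 | 3.66 |
| Original + VaDE Generated | Cell 37 | 3.14 | 2.60 |
| Original + VaDE-WGANGP Generated | Cell 23 | 2.38 | 2.13 |
| Original + VaDE-WGANGP Generated | Cell 24 | 3.23 | 2.73 |
| Original + VaDE-WGANGP Generated | Cell 36 | 3.43 | 2.97 |
| Original + VaDE-WGANGP Generated | Cell 37 | 2.83 | 2.08 |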

The results show a consistent ordering. Augmentation with low-quality VAE-generated data degraded SOH estimation on all four test cells (mean RMSE 7.57% versus 5.08% with original data only). Augmentation with VaDE-generated data reduced the mean RMSE to 3.73%. The lowest root mean square error (RMSE) and mean absolute error (MAE) on every test cell were obtained when the training set was augmented with data generated by the integrated VaDE-WGANGP model (mean RMSE 2.97%, mean MAE 2.48%). This indicates that the synthetic data produced by the proposed model contains meaningful and generalizable aging patterns that enhance the learning capability of a data-driven Li-ion battery SOH estimator, supporting its practical utility for overcoming data scarcity.
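As a quick check on the comparison, the sketch below defines the two error metrics in their usual form and averages the per-cell RMSE values reported in the table over the four test cells. The metric definitions are the standard ones and are assumed to match those used in the study; the dictionary simply restates the reported numbers.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Per-cell RMSE (%) for Cells 23, 24, 36, 37, as reported in the table.
reported_rmse = {
    "Original Data Only": [4.96, 4.75, 5.36, 5.26],
    "Original + VAE Generated": [6.77, 6.62, 8.96, 7.92],
    "Original + VaDE Generated": [3.59, 3.87, 4.32, 3.14],
    "Original + VaDE-WGANGP Generated": [2.38, 3.23, 3.43, 2.83],
}
mean_rmse = {name: sum(v) / len(v) for name, v in reported_rmse.items()}
```

Averaged this way, the mean RMSE is about 5.08% for the original data alone and about 2.97% with VaDE-WGANGP augmentation.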

Conclusion

This work presents a generative modeling approach for simulating the aging characteristics of Li-ion battery cells. By integrating the structured probabilistic framework of Variational Deep Embedding (VaDE) with the high-fidelity adversarial training of WGANGP, the VaDE-WGANGP model learns the complex, multi-modal distribution of battery discharge profiles across a full lifespan. The model demonstrates two key capabilities. First, it creates a well-clustered latent space in which distinct states of health are separable, providing an insightful low-dimensional representation of Li-ion battery degradation. Second, and more importantly, it can generate high-quality, realistic synthetic voltage, current, and capacity profiles for any of the modeled SOH intervals.

The practical value of this generation capability was confirmed through a downstream SOH estimation task, where augmenting a small real dataset with VaDE-WGANGP-generated data led to a significant reduction in estimation error. This establishes the model as a useful tool for data augmentation in Li-ion battery research, particularly when experimental data is limited. Furthermore, the model's ability to capture and simulate cell-to-cell variations offers a new avenue for studying aging heterogeneity within battery packs, a critical factor for system-level reliability and longevity assessment.
