Data-Driven State of Health Estimation for Sodium-Ion Batteries

As the demand for energy storage solutions grows, sodium-ion batteries have emerged as a promising alternative to lithium-ion batteries due to the abundance of sodium resources and similar working principles. However, sodium-ion batteries, like other electrochemical systems, experience performance degradation over time and cycles, which necessitates accurate State of Health (SOH) estimation for safe and efficient applications. In this article, I propose a comprehensive data-driven framework for SOH estimation of sodium-ion batteries, leveraging feature extraction, selection, and machine learning algorithms. The methodology is validated using experimental data from 2.5 Ah pouch-type sodium-ion batteries, demonstrating high accuracy with root mean square errors below 1.6%, and particularly below 0.8% for Gaussian process regression.

Sodium-ion batteries are gaining traction in large-scale energy storage and electric mobility due to the limited availability of lithium. Sodium constitutes approximately 2.83% of the Earth’s crust, compared to only 0.01% for lithium, making it a more sustainable option. However, sodium-ion batteries undergo complex aging processes influenced by operational conditions, leading to capacity fade and increased internal resistance. Accurately estimating SOH is crucial for battery management systems to optimize performance, extend lifespan, and ensure safety. SOH is defined as the ratio of the current capacity to the nominal capacity, expressed as:

$$SOH = \frac{Q_{act}}{Q_{nom}} \times 100\%$$

where $Q_{act}$ is the actual capacity at the current aging state, and $Q_{nom}$ is the nominal capacity. Traditional SOH estimation methods, such as model-based approaches, require precise electrochemical models, which are challenging to develop for sodium-ion batteries due to unclear degradation mechanisms. Therefore, data-driven methods, which rely on machine learning to map aging features to SOH without explicit models, offer a viable solution.
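
To make the SOH definition concrete with hypothetical numbers: suppose a cell with the 2.5 Ah nominal capacity used in this study delivers 2.2 Ah at its current aging state (the 2.2 Ah value is illustrative, not a measurement). Then

$$SOH = \frac{2.2\ \text{Ah}}{2.5\ \text{Ah}} \times 100\% = 88\%$$

By the same formula, an SOH of 80% for such a cell corresponds to a remaining capacity of 2.0 Ah.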

The core of my data-driven approach involves three steps: feature extraction from charging data, feature selection to identify the most informative indicators, and SOH estimation using machine learning models. Charging data is utilized because charging processes are often controlled and reproducible, unlike discharging cycles that vary with usage. From the charging voltage curves, I extract statistical parameters, interval capacities, and incremental capacity (IC) curve features. The IC curve, derived from voltage and capacity data, highlights phase transitions and side reactions in sodium-ion batteries, with peaks corresponding to electrochemical processes. The capacity as a function of voltage is computed as:

$$Q(V) = \int_{V(t)=V_{lower}}^{V(t)=V} |I(t)| dt$$

where $V_{lower}$ is the lower cutoff voltage, and $I(t)$ is the current. For a voltage interval $[V_1, V_2]$, the capacity increment is:

$$\Delta Q(V_1, V_2) = \int_{V(t)=V_1}^{V(t)=V_2} |I(t)| dt = Q(V_2) - Q(V_1)$$
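
The two integrals above can be sketched numerically as follows. This is a minimal illustration, not the exact processing used in the study: it assumes NumPy, a charge record sampled as time, current, and voltage arrays, and a voltage that rises monotonically during charging so that interpolation over voltage is valid. The function names and the synthetic constant-current curve are hypothetical.

```python
import numpy as np

def capacity_vs_voltage(t_s, current_a, voltage_v, v_lower=3.20):
    """Cumulative charged capacity Q(V) in Ah, counted from the first sample at or above v_lower."""
    t_s = np.asarray(t_s, dtype=float)
    i_abs = np.abs(np.asarray(current_a, dtype=float))
    voltage_v = np.asarray(voltage_v, dtype=float)
    start = int(np.argmax(voltage_v >= v_lower))  # assumes the curve reaches v_lower
    # Trapezoidal integration of |I(t)| over time, converted from A*s to Ah
    dt = np.diff(t_s[start:])
    steps = 0.5 * (i_abs[start:-1] + i_abs[start + 1:]) * dt
    q_ah = np.concatenate(([0.0], np.cumsum(steps))) / 3600.0
    return voltage_v[start:], q_ah

def interval_capacity(v, q_ah, v1, v2):
    """Delta Q(V1, V2) = Q(V2) - Q(V1), by linear interpolation on a monotonic charge curve."""
    q1, q2 = np.interp([v1, v2], v, q_ah)
    return float(q2 - q1)

# Synthetic example (not measured data): 2.5 A constant current for 3000 s at 1 Hz,
# with the voltage rising linearly from 3.20 V to 3.95 V.
t = np.arange(0.0, 3001.0)
i = np.full_like(t, 2.5)
v_raw = np.linspace(3.20, 3.95, t.size)

v, q = capacity_vs_voltage(t, i, v_raw)
dq_full = interval_capacity(v, q, 3.20, 3.95)
dq_segment = interval_capacity(v, q, 3.35, 3.50)
```

On real data the voltage is noisy, so some smoothing or resampling onto a voltage grid would likely be needed before interpolating.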

To handle partial charging scenarios common in real-world applications, I focus on the voltage range from 3.20 V to 3.95 V, which represents typical operating conditions for sodium-ion batteries. Features are extracted through equal-width grouping and equal-capacity grouping. In equal-width grouping, the voltage range is divided into segments of fixed width, such as 0.15 V, and statistics such as mean voltage, voltage standard deviation, and capacity are computed within each segment. In equal-capacity grouping, the charging capacity is partitioned into equal shares that define the voltage intervals, from which voltage extremes and variations are extracted. Additionally, IC curve features (peak height, position, area, half-width, and slopes) are calculated to capture degradation patterns. Table 1 summarizes the feature extraction methods and resulting parameters.

Table 1: Feature Extraction Methods for Sodium-Ion Batteries

| Method | Intervals | Extracted Features |
|---|---|---|
| Equal-Width Grouping | 3.20–3.35 V, 3.35–3.50 V, 3.50–3.65 V, 3.65–3.80 V, 3.80–3.95 V | Mean voltage, voltage standard deviation, capacity in interval |
| Equal-Width Grouping (Extended) | 3.20–3.50 V, 3.20–3.60 V, 3.20–3.70 V, 3.30–3.60 V, 3.30–3.70 V | Capacity in interval |
| Equal-Capacity Grouping | 33%–67%, 67%–100%, 33%–100% of capacity | Mean voltage, voltage standard deviation, minimum voltage, maximum voltage |
| Incremental Capacity Curve | Full voltage range | Peak height, peak position, peak area, peak half-width, left slope, right slope |
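
The two grouping schemes in Table 1 can be sketched as below. This is a rough illustration under stated assumptions rather than the study's implementation: it assumes NumPy, a charge curve already available as matching voltage and cumulative-capacity arrays with monotonically increasing voltage, and data that cover every requested interval. The function names, dictionary keys, and the synthetic linear curve are hypothetical.

```python
import numpy as np

def equal_width_features(v, q_ah, edges):
    """Mean voltage, voltage std and capacity for each [lo, hi) voltage segment."""
    feats = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        seg_v = v[(v >= lo) & (v < hi)]
        feats[f"{lo:.2f}-{hi:.2f} V"] = {
            "mean_v": float(seg_v.mean()),
            "std_v": float(seg_v.std()),
            # capacity charged between the two segment edges
            "capacity_ah": float(np.interp(hi, v, q_ah) - np.interp(lo, v, q_ah)),
        }
    return feats

def equal_capacity_features(v, q_ah, lo_frac, hi_frac):
    """Voltage statistics over the samples whose charged capacity lies
    between lo_frac and hi_frac of the total charged capacity."""
    frac = q_ah / q_ah[-1]
    seg_v = v[(frac >= lo_frac) & (frac <= hi_frac)]
    return {
        "mean_v": float(seg_v.mean()),
        "std_v": float(seg_v.std()),
        "min_v": float(seg_v.min()),
        "max_v": float(seg_v.max()),
    }

# Synthetic charge curve (not measured data): linear voltage rise, 2.0 Ah charged in total
v = np.linspace(3.20, 3.95, 3001)
q = np.linspace(0.0, 2.0, 3001)

edges = [3.20, 3.35, 3.50, 3.65, 3.80, 3.95]       # 0.15 V segments from Table 1
width_feats = equal_width_features(v, q, edges)
cap_feats = equal_capacity_features(v, q, 0.67, 1.00)  # the 67%-100% capacity interval
```

Writing the segment edges out explicitly, instead of generating them with a floating-point step, avoids small rounding mismatches at the interval boundaries.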

From an initial set of 50 features, I apply a multi-stage feature selection process to eliminate redundant or irrelevant indicators, enhancing model efficiency and preventing overfitting. First, variance filtering removes features with variance below $10^{-4}$, as they show negligible change during aging. Second, grey relational analysis (GRA) evaluates the geometric similarity between each feature’s trajectory and the capacity degradation curve. The grey relational grade for a feature $x_j$ with respect to SOH $y$, obtained by averaging the pointwise grey relational coefficients over all samples, is computed as:

$$r_j = \frac{1}{n} \sum_{i=1}^{n} \frac{\min(\delta_j) + \rho \max(\delta_j)}{\delta_j(i) + \rho \max(\delta_j)}$$

where $\delta_j(i) = |y_i - x_{ij}|$ is the absolute difference at sample $i$, $\rho$ is a resolution coefficient (typically 0.5), and $n$ is the number of samples. Features with $r_j \geq 0.65$ are retained. Third, recursive feature elimination (RFE) based on support vector machine (SVM) weights ranks the remaining features by importance, iteratively removing the least significant ones. The weight vector $w$ from SVM training is used to score features, with the feature having the smallest weight magnitude removed in each iteration. After RFE, the top four features are selected: voltage standard deviation in the 33%–100% capacity interval, minimum voltage in the 67%–100% capacity interval, maximum voltage in the 33%–67% capacity interval, and capacity in the 3.20–3.95 V voltage interval. These features effectively capture aging dynamics in sodium-ion batteries, as summarized in Table 2.

Table 2: Selected Aging Features for Sodium-Ion Battery SOH Estimation

| Feature Description | Source Method | Importance Rank |
|---|---|---|
| Voltage standard deviation in 33%–100% capacity interval | Equal-Capacity Grouping | 1 |
| Minimum voltage in 67%–100% capacity interval | Equal-Capacity Grouping | 2 |
| Maximum voltage in 33%–67% capacity interval | Equal-Capacity Grouping | 3 |
| Capacity in 3.20–3.95 V voltage interval | Equal-Width Grouping | 4 |
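
The three selection stages described above can be sketched as follows, assuming NumPy. Several choices here are assumptions rather than details from the study: the variance filter is applied to the features as given, features and SOH are min-max scaled before the grey relational step, and the minimum and maximum differences are taken over all features and samples, as in the usual formulation of grey relational analysis. The elimination step uses least-squares weights as a plainly named stand-in for linear-SVM weights, so it illustrates the recursion rather than the exact SVM-based ranking. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def minmax(a):
    """Scale each column (or a 1-D sequence) to [0, 1]; constant data map to 0."""
    a = np.asarray(a, dtype=float)
    lo, span = a.min(axis=0), a.max(axis=0) - a.min(axis=0)
    return (a - lo) / np.where(span > 0, span, 1.0)

def variance_filter(X, threshold=1e-4):
    """Indices of the columns whose variance is at least `threshold`."""
    return np.flatnonzero(np.var(X, axis=0) >= threshold)

def grey_relational_grades(X, y, rho=0.5):
    """Grey relational grade r_j of every column of X against y."""
    delta = np.abs(minmax(y)[:, None] - minmax(X))   # delta_j(i), shape (samples, features)
    d_min, d_max = delta.min(), delta.max()          # taken over all features and samples
    if d_max == 0.0:                                 # every feature coincides with y
        return np.ones(X.shape[1])
    return ((d_min + rho * d_max) / (delta + rho * d_max)).mean(axis=0)

def rfe_linear(X, y, n_keep=4):
    """Recursive feature elimination; least-squares weights stand in for linear-SVM weights."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)        # standardise so weights are comparable
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        w, *_ = np.linalg.lstsq(Xs[:, keep], y - y.mean(), rcond=None)
        keep.pop(int(np.argmin(np.abs(w))))          # drop the feature with the smallest |w|
    return keep

# Synthetic illustration (not the article's data)
rng = np.random.default_rng(0)
soh = np.linspace(100.0, 80.0, 200)
X = np.column_stack([
    0.02 * soh + rng.normal(0, 0.002, 200),   # tracks the SOH trajectory
    3.30 + rng.normal(0, 1e-4, 200),          # nearly constant, removed by the variance filter
    rng.uniform(0.0, 1.0, 200),               # unrelated to aging
])
kept = variance_filter(X)
grades = grey_relational_grades(X[:, kept], soh)
selected = kept[grades >= 0.65]               # retention threshold from the text
```

A feature that rises while SOH falls would score poorly with this scaling, so in practice such features might need to be sign-flipped before the grey relational step.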

With the selected features, I develop SOH estimation models using four machine learning algorithms: multiple linear regression (MLR), support vector machine (SVM), Gaussian process regression (GPR), and backpropagation neural network (BPNN). Each model is trained on data from three sodium-ion batteries and tested on the fourth, with data normalized to mitigate scale effects. For MLR, the model is expressed as:

$$Y = \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4 + \beta$$

where $Y$ is the estimated SOH, $x_1$ to $x_4$ are the selected features, $\alpha_1$ to $\alpha_4$ are the regression coefficients, and $\beta$ is the intercept. SVM employs a Gaussian kernel to handle nonlinearities, with the decision function:

$$f(x) = \sum_{i=1}^{m} (\hat{\alpha}_i - \alpha_i) K(x_i, x) + b$$

where $K(x_i, x) = \exp(-\gamma ||x_i - x||^2)$ is the Gaussian kernel evaluated between training sample $x_i$ and the input $x$, $m$ is the number of training samples, and $\hat{\alpha}_i$, $\alpha_i$, and $b$ are determined during training, with the hyperparameters (including $\gamma$) tuned via grid search with 5-fold cross-validation. GPR assumes a Gaussian process prior over functions, with predictions following a posterior distribution. The squared exponential kernel is used:

$$k(x, x') = \sigma_f^2 \exp\left(-\frac{||x - x'||^2}{2l^2}\right) + \sigma_n^2 \delta_{xx'}$$

where $\sigma_f^2$ is the signal variance, $l$ is the length scale, $\sigma_n^2$ is the noise variance, and $\delta_{xx'}$ is the Kronecker delta, so that the noise term contributes $\sigma_n^2 I$ to the covariance matrix of the training samples. Hyperparameters are optimized through marginal likelihood maximization. BPNN consists of an input layer (4 neurons), a hidden layer (3 neurons), and an output layer (1 neuron), with weights updated via backpropagation to minimize mean squared error. The activation function for the hidden layer is sigmoid, and for the output layer is linear. The network error $E$ is computed as:

$$E = \frac{1}{2N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$$

where $\hat{y}_i$ is the predicted SOH, $y_i$ is the true SOH, and $N$ is the sample size. Training stops when $E \leq 10^{-3}$ or after 100 iterations.
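
As an illustration of the GPR model described above, the following sketch computes the posterior mean and standard deviation with a squared exponential kernel, using NumPy only. It is a simplified stand-in under stated assumptions: the hyperparameters are fixed by hand instead of being optimized through marginal likelihood maximization, the training targets are centred on their mean, and the four input features and SOH values are synthetic. The function names are hypothetical.

```python
import numpy as np

def sq_exp_kernel(A, B, sigma_f, length):
    """Squared exponential kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return sigma_f**2 * np.exp(-np.maximum(d2, 0.0) / (2.0 * length**2))

def gpr_predict(X_train, y_train, X_test, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Posterior mean and standard deviation of the latent function at X_test."""
    y_mean = y_train.mean()
    # Noise variance enters as sigma_n^2 * I on the training covariance
    K = sq_exp_kernel(X_train, X_train, sigma_f, length) + sigma_n**2 * np.eye(len(X_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    K_s = sq_exp_kernel(X_train, X_test, sigma_f, length)
    mean = K_s.T @ alpha + y_mean
    v = np.linalg.solve(L, K_s)
    var = sigma_f**2 - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Synthetic data (not the article's measurements): four features, SOH between 80% and 100%
rng = np.random.default_rng(0)
w_true = np.array([8.0, 6.0, 4.0, 2.0])
X_train = rng.uniform(0.0, 1.0, (80, 4))
y_train = 100.0 - X_train @ w_true + rng.normal(0.0, 0.2, 80)
X_test = rng.uniform(0.0, 1.0, (20, 4))
y_test = 100.0 - X_test @ w_true

mean, std = gpr_predict(X_train, y_train, X_test, sigma_f=10.0, length=1.0, sigma_n=0.2)
```

The returned standard deviation is what gives GPR its uncertainty estimate: it grows for inputs far from the training data, which a point estimator such as MLR cannot signal.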

To validate the models, I conduct accelerated aging tests on 2.5 Ah pouch-type sodium-ion batteries with layered oxide cathodes and hard carbon anodes. The tests involve cycling at 25°C with a 1C charge rate to 3.95 V, a 10-minute rest, and discharge at 0.5C or 1C to 1.5 V until SOH drops to 80%. Data is sampled at 1 Hz, and cycles show capacity degradation with local fluctuations, typical for sodium-ion batteries. The dataset includes four batteries, with three used for training and one for testing in a rotated manner. Performance is evaluated using root mean square error (RMSE) and coefficient of determination ($R^2$), defined as:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - f(x_i))^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - f(x_i))^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $y_i$ is the true SOH, $f(x_i)$ is the estimated SOH, and $\bar{y}$ is the mean true SOH. Results indicate that all models achieve accurate SOH estimation for sodium-ion batteries, with RMSE values below 1.6% and $R^2$ above 98%. GPR performs best, with an RMSE of 0.8% and $R^2$ of 99.5%, followed by MLR (RMSE 1.2%, $R^2$ 98.8%), BPNN (RMSE 1.4%, $R^2$ 98.6%), and SVM (RMSE 1.5%, $R^2$ 98.2%). The superior performance of GPR can be attributed to its probabilistic framework, which provides uncertainty estimates and handles small datasets effectively, as often encountered with sodium-ion battery aging data. Table 3 summarizes the performance metrics.

Table 3: Performance Comparison of SOH Estimation Models for Sodium-Ion Batteries

| Model | RMSE (%) | $R^2$ (%) | Key Characteristics |
|---|---|---|---|
| Multiple Linear Regression | 1.2 | 98.8 | Linear mapping, simple and interpretable |
| Support Vector Machine | 1.5 | 98.2 | Nonlinear kernel, robust to overfitting |
| Gaussian Process Regression | 0.8 | 99.5 | Probabilistic output, handles uncertainty |
| BP Neural Network | 1.4 | 98.6 | Nonlinear approximation, requires tuning |
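
The rotated train-on-three, test-on-one evaluation and the two metrics can be sketched as follows, assuming NumPy and using MLR as the model because it needs only a least-squares fit. The synthetic cells, the cell names, and the function names are hypothetical, and the scores this produces say nothing about the values in Table 3. Normalizing with training statistics only is my assumption about how the normalization step would be kept free of test-data leakage.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def fit_mlr(X, y):
    """Least-squares fit of Y = X @ alpha + beta; returns (alpha, beta)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def leave_one_battery_out(cells):
    """cells maps a cell name to (X, y). Each cell is held out once for testing."""
    scores = {}
    for held_out in cells:
        X_tr = np.vstack([X for name, (X, _) in cells.items() if name != held_out])
        y_tr = np.concatenate([y for name, (_, y) in cells.items() if name != held_out])
        mu, sd = X_tr.mean(axis=0), X_tr.std(axis=0)   # training statistics only
        alpha, beta = fit_mlr((X_tr - mu) / sd, y_tr)
        X_te, y_te = cells[held_out]
        y_hat = ((X_te - mu) / sd) @ alpha + beta
        scores[held_out] = (rmse(y_te, y_hat), r2(y_te, y_hat))
    return scores

# Four synthetic cells (not measured data): features vary linearly with SOH plus noise
rng = np.random.default_rng(0)
slopes = np.array([0.01, -0.005, 0.02, 0.025])
offsets = np.array([3.0, 4.0, 3.5, 0.0])
cells = {}
for name in ["cell_1", "cell_2", "cell_3", "cell_4"]:
    soh = np.linspace(100.0, 80.0, 150)
    X = offsets + soh[:, None] * slopes + rng.normal(0.0, 0.005, (150, 4))
    cells[name] = (X, soh)

scores = leave_one_battery_out(cells)   # name -> (RMSE in SOH percentage points, R^2)
```

Here $R^2$ is returned as a fraction; multiplying by 100 gives the percentage form used in Table 3.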

The estimation results demonstrate that the selected aging features, all derived from partial charging data, correlate closely with capacity fade in sodium-ion batteries. For instance, the voltage standard deviation in the 33%–100% capacity interval reflects increased internal resistance and electrode degradation, while interval capacities indicate active material loss. The IC curve features, though not among the top four after selection, provide insights into phase transitions specific to sodium-ion chemistry, such as volume changes in hard carbon anodes. However, for practical SOH estimation, direct voltage-based features suffice, reducing computational complexity. The data-driven approach bypasses the need for explicit degradation models, which are challenging to develop for sodium-ion batteries because the technology is still at a nascent stage and its aging mechanisms vary.

In real-world applications, sodium-ion battery management systems can leverage this framework by monitoring charging cycles to extract features and apply trained models for online SOH estimation. The method is adaptable to different sodium-ion battery chemistries and operational scenarios, though feature selection may require recalibration for new datasets. Challenges include noise in voltage measurements and limited data availability, which can be addressed through signal processing techniques and transfer learning. Future work could explore deep learning models for automatic feature extraction or integrate temperature and impedance data for enhanced accuracy.

In conclusion, I have presented a data-driven methodology for State of Health estimation in sodium-ion batteries, combining feature engineering and machine learning. The framework achieves high precision, with Gaussian process regression yielding an RMSE below 0.8%, enabling reliable battery health monitoring. As sodium-ion battery technology advances for grid storage and electric vehicles, such data-driven tools will be essential for optimizing performance and ensuring safety. The proposed approach underscores the potential of machine learning in addressing complex aging phenomena in emerging battery systems like sodium-ion batteries, paving the way for smarter energy storage solutions.
