The escalating global energy crisis and environmental degradation driven by fossil fuel dependence necessitate the urgent development of sustainable and clean energy sources. Solar energy, as an abundant and renewable resource, stands out as a pivotal solution for achieving carbon neutrality and energy security. While silicon-based solar cells dominate the current photovoltaic market, their limitations in cost, flexibility, and efficiency for diverse applications have spurred research into alternative technologies. Perovskite solar cells have emerged as a promising candidate due to their high power conversion efficiency, low-cost fabrication, and tunable optoelectronic properties. However, the toxicity of lead-based perovskites and their instability under operational conditions pose significant barriers to commercialization. To address these challenges, this study focuses on accelerating the discovery of lead-free inorganic halide double perovskite materials with optimal bandgaps for high-performance perovskite solar cells. Traditional experimental and computational methods, such as trial-and-error approaches and density functional theory (DFT) calculations, are often time-consuming and resource-intensive. Here, we propose a synergistic strategy that integrates deep learning with DFT to efficiently predict and optimize material properties, enabling the rapid screening of environmentally friendly perovskite solar cell absorbers.

The integration of machine learning, particularly deep learning, with DFT calculations offers a transformative approach to materials science. By leveraging large datasets from sources like the Materials Project, machine learning models can learn complex relationships between material compositions, structures, and properties, such as bandgap. This data-driven paradigm reduces the reliance on costly experiments and accelerates the design of novel perovskite solar cell materials. In this work, we constructed a comprehensive database of 1181 inorganic halide double perovskites and developed a deep neural network (DNN) model for bandgap prediction. We compared the DNN’s performance with traditional machine learning algorithms, including Random Forest Regression (RFR), Gradient Boosting Regression (GBR), Support Vector Regression (SVR), and eXtreme Gradient Boosting Regression (XGBR). The DNN model demonstrated superior accuracy, achieving a mean absolute error (MAE) of 0.264 eV and a coefficient of determination (R²) of 0.925 on the test set. Using this model, we screened 55 candidate materials and identified four promising lead-free double perovskites: Cs₂GaAgCl₆, Cs₂AgIrF₆, Cs₂InAgCl₆, and Cs₂AlAgBr₆. These materials exhibit direct bandgaps within the ideal range of 1.0–2.0 eV for perovskite solar cell applications, along with high thermodynamic and structural stability. Device simulations using SCAPS-1D further confirmed their potential, with Cs₂AgIrF₆ and Cs₂AlAgBr₆ achieving simulated efficiencies of 23.71% and 22.37%, respectively. This study underscores the efficacy of combining deep learning and DFT for rational material design, paving the way for high-performance, stable, and eco-friendly perovskite solar cells.
Methodology
Density Functional Theory Calculations
All DFT calculations were performed using the Vienna Ab initio Simulation Package (VASP), which implements the projector augmented wave (PAW) method with a plane-wave basis set. The exchange-correlation functional was treated within the generalized gradient approximation (GGA) using the Perdew-Burke-Ernzerhof (PBE) formulation. A plane-wave cutoff energy of 400 eV was used to describe the electron wavefunctions. For structural optimization, the convergence criteria were set to 1 × 10⁻³ eV for electronic self-consistency and 0.02 eV/Å for ionic forces, and both cell shape and volume were allowed to relax. A 3×3×3 k-point mesh was employed for structural relaxation, while a 5×5×5 k-point mesh was used for electronic property calculations. The band structures were plotted along the high-symmetry path Gamma (0, 0, 0) → X (0.5, 0, 0) → R (0.5, 0.5, 0.5) → M (0.5, 0.5, 0) → Gamma (0, 0, 0) → R (0.5, 0.5, 0.5). Crystal structures were visualized using VESTA software. The formation energy (ΔH) was calculated to assess thermodynamic stability, with negative values indicating stable compounds. The tolerance factor (f_T) and octahedral factor (f_O) were computed to evaluate structural stability, using the following equations:
$$ f_T = \frac{R_A + R_X}{\sqrt{2} \left( \frac{R_B + R_{B'}}{2} + R_X \right)} $$
$$ f_O = \frac{R_B + R_{B'}}{2 R_X} $$
where R_A, R_B, R_{B'}, and R_X are the ionic radii of the A-site, B-site, B'-site, and X-site elements, respectively. Stable perovskites typically satisfy 0.75 ≤ f_T ≤ 1.08 and 0.4 ≤ f_O ≤ 1.0.
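The two structural factors are simple enough to check by hand. The following is a minimal sketch that evaluates them for Cs₂InAgCl₆, assuming commonly tabulated Shannon ionic radii (the text does not state which radii were used, so the `RADII` values and the function names are assumptions made for illustration).

```python
import math


def tolerance_factor(r_a, r_b, r_b_prime, r_x):
    """f_T for an A2BB'X6 double perovskite, using the mean B-site radius."""
    r_b_mean = (r_b + r_b_prime) / 2.0
    return (r_a + r_x) / (math.sqrt(2.0) * (r_b_mean + r_x))


def octahedral_factor(r_b, r_b_prime, r_x):
    """f_O: mean B-site radius divided by the X-site radius."""
    return (r_b + r_b_prime) / (2.0 * r_x)


def is_structurally_stable(f_t, f_o):
    """Apply the empirical stability windows quoted in the text."""
    return 0.75 <= f_t <= 1.08 and 0.4 <= f_o <= 1.0


# ASSUMPTION: Shannon ionic radii in angstrom
# (Cs+ 12-coordinate; In3+, Ag+, and Cl- 6-coordinate). Not taken from the text.
RADII = {"Cs": 1.88, "In": 0.80, "Ag": 1.15, "Cl": 1.81}

f_t = tolerance_factor(RADII["Cs"], RADII["In"], RADII["Ag"], RADII["Cl"])
f_o = octahedral_factor(RADII["In"], RADII["Ag"], RADII["Cl"])
print(f"f_T = {f_t:.2f}, f_O = {f_o:.2f}, stable = {is_structurally_stable(f_t, f_o)}")
```

With these assumed radii the factors come out near 0.94 and 0.54, both inside the stability windows given above.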
Deep Learning Model Training
The machine learning database was constructed from the Materials Project and contains 1181 inorganic halide double perovskites after filtering for bandgaps between 0.5 eV and 4.0 eV, which covers the range relevant to perovskite solar cell applications. Initially, over 100 feature variables were generated using the matminer and pymatgen libraries; 30 were retained after removing features whose pairwise Pearson correlation exceeded a threshold of 0.8, which limits redundancy among the inputs. Key features included atomic density, Mendeleev numbers of the B and X sites, energy above hull, and average bulk modulus, along with other electronic and structural descriptors. The dataset was split into 60% training, 20% validation, and 20% test sets. Five machine learning models were implemented: RFR, GBR, SVR, XGBR, and DNN. The DNN architecture consisted of an input layer with 30 neurons, two hidden layers with 768 and 384 neurons using the Swish activation function, and an output layer with a linear activation for regression. The model was trained for up to 1500 epochs with an initial learning rate of 0.001, with early stopping and learning-rate reduction on plateau used as callbacks to limit overfitting. Hyperparameters for the other models were optimized via grid search with 10-fold cross-validation, as summarized in the table below.
| Model | Hyperparameters |
|---|---|
| RFR | bootstrap=True, max_depth=8, max_features='sqrt', min_samples_leaf=2, min_samples_split=2, n_estimators=300 |
| GBR | learning_rate=0.05, max_depth=3, max_features='sqrt', min_samples_leaf=5, min_samples_split=10, n_estimators=150, subsample=0.8 |
| SVR | kernel='rbf', gamma=0.1, epsilon=0.2, C=15 |
| XGBR | colsample_bytree=0.2, gamma=0.1, learning_rate=0.1, max_depth=3, min_child_weight=2, n_estimators=150, subsample=0.4, reg_alpha=0.1, reg_lambda=1 |
| DNN | hidden_layers=[768, 384], activation='Swish', optimizer='Adam', learning_rate=0.001, epochs=1500 |
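
The correlation-based feature reduction described above can be sketched in a few lines. This is an illustrative, dependency-free version, not the authors' pipeline: the text does not say which member of a correlated pair was discarded, so the greedy "keep the first, drop later duplicates" order is an assumption, as are the toy feature names and values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.8):
    """Greedily keep a feature only if |r| <= threshold against every kept feature.

    ASSUMPTION: features are visited in insertion order, so the first member
    of a highly correlated pair survives.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical values, for illustration only).
toy_features = {
    "atomic_density": [1.0, 2.0, 3.0, 4.0, 5.0],
    "atomic_density_rescaled": [2.1, 4.0, 6.2, 7.9, 10.1],  # near-duplicate
    "avg_bulk_modulus": [5.0, 1.0, 4.0, 2.0, 3.0],
}

kept = drop_correlated(toy_features, threshold=0.8)
print(kept)
```

Here the rescaled copy of the first feature is removed, while the weakly correlated third feature is retained.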
Model performance was evaluated using the coefficient of determination (R²), mean absolute error (MAE), and mean squared error (MSE), defined as:
$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - y_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} $$
$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |Y_i - y_i| $$
$$ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2 $$
where Y_i is the DFT-calculated bandgap, y_i is the predicted bandgap, and \bar{Y} is the mean of DFT bandgaps. Higher R² and lower MAE/MSE indicate better performance.
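As a concrete reading of these definitions, the sketch below implements the three metrics directly from the formulas. The function names and the toy bandgap values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared deviation between reference and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values in eV (hypothetical, not from the dataset).
dft_gaps = [1.0, 2.0, 3.0, 4.0]
predicted_gaps = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(dft_gaps, predicted_gaps)
mae = mean_absolute_error(dft_gaps, predicted_gaps)
mse = mean_squared_error(dft_gaps, predicted_gaps)
print(f"R2 = {r2:.3f}, MAE = {mae:.3f} eV, MSE = {mse:.3f} eV^2")
```

For this toy set the residual sum of squares is 0.1 and the total sum of squares is 5.0, so R² is 0.98, with an MAE of 0.15 eV and an MSE of 0.025 eV².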
Device Simulation
The photovoltaic performance of selected perovskite solar cell materials was simulated using SCAPS-1D, a one-dimensional solar cell capacitance simulator. The device structure was configured as a conventional n-i-p stack: FTO/TiO₂/perovskite absorber/Cu₂O/Au. The simulation solved the drift-diffusion, Poisson, and continuity equations to model carrier transport and recombination. Key equations include:
$$ \frac{\partial^2 \phi}{\partial x^2} = -\frac{q}{\epsilon} \left[ p - n + N_D^+ - N_A^- + p_t - n_t \right] $$
$$ J_n = q \mu_n n E + q D_n \frac{\partial n}{\partial x}, \quad J_p = q \mu_p p E - q D_p \frac{\partial p}{\partial x} $$
$$ \frac{\partial n}{\partial t} = G_n - R_n + \frac{1}{q} \frac{\partial J_n}{\partial x}, \quad \frac{\partial p}{\partial t} = G_p - R_p - \frac{1}{q} \frac{\partial J_p}{\partial x} $$
where φ is the electrostatic potential, q is the elementary charge, ε is the permittivity, n and p are the free electron and hole densities, N_D^+ and N_A^- are the ionized donor and acceptor concentrations, p_t and n_t are the trapped hole and electron densities, E is the electric field, J_n and J_p are the electron and hole current densities, μ_n and μ_p are the mobilities, D_n and D_p are the diffusion coefficients, and G and R are the generation and recombination rates. The parameters for each layer are listed in the table below.
| Material | Thickness (nm) | Bandgap (eV) | Electron Affinity (eV) | Dielectric Constant | N_C (cm⁻³) | N_V (cm⁻³) | μ_e (cm²/V·s) | μ_h (cm²/V·s) |
|---|---|---|---|---|---|---|---|---|
| FTO | 300 | 3.5 | 4.0 | 9 | 2.2×10¹⁹ | 1.8×10¹⁹ | 20 | 10 |
| TiO₂ | 50 | 3.2 | 4.0 | 9 | 1.0×10¹⁸ | 1.0×10¹⁹ | 2 | 1 |
| Cu₂O | 300 | 2.17 | 3.2 | 6.6 | 2.5×10²⁰ | 1.0×10²⁰ | 3.4 | 3.4 |
| Cs₂AgIrF₆ | 600 | 1.36 | 4.1 | 10 | 1.72×10¹⁹ | 1.0×10¹⁹ | 12 | 5 |
| Cs₂AlAgBr₆ | 600 | 1.20 | 4.15 | 8 | 2.0×10¹⁹ | 2.0×10¹⁹ | 10 | 5 |
| Cs₂GaAgCl₆ | 600 | 1.02 | 4.3 | 12 | 2.0×10¹⁹ | 2.0×10¹⁹ | 11.8 | 2 |
| Cs₂InAgCl₆ | 600 | 1.08 | 4.0 | 10 | 2.0×10¹⁹ | 2.0×10¹⁹ | 5 | 3 |
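
The layer table lists mobilities but not the diffusion coefficients D_n and D_p that appear in the drift-diffusion equations. A common way to connect the two is the Einstein relation, D = μ k_B T / q. The sketch below applies it to the absorber mobilities from the table, assuming the Einstein relation holds and a temperature of 300 K; neither assumption is stated in the text, and the function names are hypothetical.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def thermal_voltage(temperature_k=300.0):
    """k_B * T / q in volts (numerically equal to k_B * T in eV)."""
    return K_B_EV_PER_K * temperature_k


def diffusion_coefficient(mobility_cm2_per_vs, temperature_k=300.0):
    """Einstein relation D = mu * k_B * T / q, returned in cm^2/s."""
    return mobility_cm2_per_vs * thermal_voltage(temperature_k)


# (electron mobility, hole mobility) in cm^2/(V s), copied from the layer table.
absorber_mobilities = {
    "Cs2AgIrF6": (12.0, 5.0),
    "Cs2AlAgBr6": (10.0, 5.0),
    "Cs2GaAgCl6": (11.8, 2.0),
    "Cs2InAgCl6": (5.0, 3.0),
}

# ASSUMPTION: T = 300 K for all layers.
diffusion = {
    name: (diffusion_coefficient(mu_e), diffusion_coefficient(mu_h))
    for name, (mu_e, mu_h) in absorber_mobilities.items()
}

for name, (d_n, d_p) in diffusion.items():
    print(f"{name}: D_n = {d_n:.3f} cm^2/s, D_p = {d_p:.3f} cm^2/s")
```

At 300 K the thermal voltage is about 25.9 mV, so an electron mobility of 12 cm²/V·s corresponds to a diffusion coefficient of roughly 0.31 cm²/s.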
Results and Discussion
Feature Importance Analysis
The DNN model’s interpretability was enhanced using SHapley Additive exPlanations (SHAP) to identify key features influencing bandgap predictions. The top five features were atomic density, Mendeleev number of the X-site (C-mendeleev number), energy above hull, Mendeleev number of the B-site, and average bulk modulus. Atomic density exhibited the highest importance, with high values correlating to larger bandgaps due to reduced B-X bond lengths and enhanced orbital overlap. Conversely, low atomic density led to smaller bandgaps from weaker orbital coupling. The Mendeleev numbers of B and X sites, representing elemental positions in the periodic table, directly impact orbital hybridization and spin-orbit coupling, thereby modulating bandgaps. Energy above hull, indicative of metastability, showed a slight positive correlation with bandgap, as structural distortions from high energy states can enhance spin-orbit coupling and band localization. These insights align with fundamental physical mechanisms, validating the DNN’s ability to capture complex relationships in perovskite solar cell materials.
| Rank | Feature | Importance Score |
|---|---|---|
| 1 | Atomic Density | 0.156 |
| 2 | C-mendeleev number | 0.142 |
| 3 | Energy above hull | 0.138 |
| 4 | B-mendeleev number | 0.125 |
| 5 | Avg bulk modulus | 0.112 |
| 6 | Avg critical temperature | 0.098 |
| 7 | Avg van der Waals radius | 0.085 |
| 8 | Max unfilled d electrons | 0.074 |
| 9 | Avg electronegativity | 0.069 |
| 10 | Valence electrons | 0.061 |
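
SHAP scores are built on Shapley values, which average a feature's marginal contribution over all subsets of the other features. The sketch below computes exact Shapley values by brute force for a small toy model. It illustrates the underlying idea only: it is not the SHAP library's algorithm and not the authors' DNN, and the toy model and inputs are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_model(features):
    """Hypothetical model with one interaction term: f = x0 * x1 + x2."""
    return features[0] * features[1] + features[2]


x = [2.0, 3.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The interaction term x0·x1 contributes 6 in total and is split evenly between the first two features, the additive term is credited entirely to the third, and the attributions sum to the difference between the model output at `x` and at the baseline.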
Screening of Predicted Systems
Using the trained DNN model, bandgaps were predicted for 55 candidate double perovskite materials. Screening criteria included direct bandgaps between 1.0 eV and 2.0 eV (accounting for PBE’s bandgap underestimation), negative formation energy for thermodynamic stability, and tolerance and octahedral factors within stable ranges. After applying these filters, four lead-free double perovskites were selected: Cs₂GaAgCl₆, Cs₂AgIrF₆, Cs₂InAgCl₆, and Cs₂AlAgBr₆. DFT calculations confirmed their direct bandgaps: 1.02 eV, 1.36 eV, 1.08 eV, and 1.20 eV, respectively. All compounds exhibited negative formation energies and stable structural parameters, supporting their viability as absorber layers in perovskite solar cells. The avoidance of toxic elements like lead further underscores their environmental friendliness, making them strong candidates for sustainable perovskite solar cell development.
| Compound | ML Bandgap (eV) | DFT Bandgap (eV) | Bandgap Type | ΔH (eV/atom) | f_T | f_O |
|---|---|---|---|---|---|---|
| Cs₂GaAgCl₆ | 1.3885 | 1.0153 | Direct | -1.70 | 0.99 | 0.53 |
| Cs₂AgIrF₆ | 1.3377 | 1.3644 | Direct | -2.09 | 0.84 | 0.75 |
| Cs₂InAgCl₆ | 1.5347 | 1.0806 | Direct | -1.74 | 0.94 | 0.54 |
| Cs₂AlAgBr₆ | 1.4479 | 1.2018 | Direct | -1.60 | 0.98 | 0.47 |
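
The screening criteria can be expressed as a single predicate over the tabulated quantities. The sketch below applies it to the four rows of the table plus one clearly hypothetical indirect-gap entry that should be rejected. Applying the bandgap window to the DFT value (rather than the ML prediction) is an assumption, and the dictionary keys are illustrative names.

```python
def passes_screen(c):
    """Direct gap in [1.0, 2.0] eV, negative formation energy, stable f_T and f_O."""
    return (
        c["direct"]
        and 1.0 <= c["dft_gap"] <= 2.0   # ASSUMPTION: window applied to the DFT gap
        and c["delta_h"] < 0.0
        and 0.75 <= c["f_t"] <= 1.08
        and 0.4 <= c["f_o"] <= 1.0
    )


candidates = [
    {"name": "Cs2GaAgCl6", "ml_gap": 1.3885, "dft_gap": 1.0153, "direct": True, "delta_h": -1.70, "f_t": 0.99, "f_o": 0.53},
    {"name": "Cs2AgIrF6", "ml_gap": 1.3377, "dft_gap": 1.3644, "direct": True, "delta_h": -2.09, "f_t": 0.84, "f_o": 0.75},
    {"name": "Cs2InAgCl6", "ml_gap": 1.5347, "dft_gap": 1.0806, "direct": True, "delta_h": -1.74, "f_t": 0.94, "f_o": 0.54},
    {"name": "Cs2AlAgBr6", "ml_gap": 1.4479, "dft_gap": 1.2018, "direct": True, "delta_h": -1.60, "f_t": 0.98, "f_o": 0.47},
    # HYPOTHETICAL entry, not from the study: an indirect-gap compound that must fail.
    {"name": "hypothetical-indirect", "ml_gap": 1.5, "dft_gap": 1.4, "direct": False, "delta_h": -1.0, "f_t": 0.95, "f_o": 0.50},
]

selected = [c["name"] for c in candidates if passes_screen(c)]

# Mean absolute deviation between ML and DFT gaps for the four tabulated compounds.
tabulated = candidates[:4]
mean_abs_dev = sum(abs(c["ml_gap"] - c["dft_gap"]) for c in tabulated) / len(tabulated)

print(selected)
print(f"mean |ML - DFT| = {mean_abs_dev:.3f} eV")
```

For the four tabulated compounds the mean absolute ML-to-DFT deviation is about 0.275 eV, close to the reported test-set MAE of 0.264 eV.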
Photovoltaic Performance Analysis
SCAPS-1D simulations were conducted to evaluate the photovoltaic performance of the screened perovskites in an n-i-p device structure. The energy level alignment showed proper band offsets between the perovskite absorbers and charge transport layers (TiO₂ and Cu₂O), facilitating efficient carrier extraction. The current-density-voltage (J-V) characteristics and external quantum efficiency (EQE) spectra were analyzed under AM1.5G illumination. Cs₂AgIrF₆ and Cs₂AlAgBr₆ demonstrated the highest efficiencies, attributed to their optimal bandgaps and high carrier mobilities. The open-circuit voltage (V_OC), short-circuit current density (J_SC), fill factor (FF), and power conversion efficiency (PCE) are summarized below.
| Perovskite Absorber | V_OC (V) | J_SC (mA/cm²) | FF (%) | PCE (%) |
|---|---|---|---|---|
| Cs₂AgIrF₆ | 0.94 | 31.19 | 80.81 | 23.71 |
| Cs₂AlAgBr₆ | 0.78 | 36.73 | 77.66 | 22.37 |
| Cs₂GaAgCl₆ | 0.61 | 44.55 | 73.36 | 19.85 |
| Cs₂InAgCl₆ | 0.68 | 41.85 | 75.32 | 21.29 |
Cs₂AgIrF₆ achieved the highest PCE of 23.71%, with a high V_OC of 0.94 V and FF of 80.81%, indicating minimal non-radiative recombination and efficient carrier separation. Cs₂AlAgBr₆ followed with a PCE of 22.37%, benefiting from a high J_SC of 36.73 mA/cm² due to its narrower bandgap. The high FFs in both cases suggest low series resistance and high shunt resistance, reducing energy losses. In contrast, Cs₂GaAgCl₆ had the highest J_SC but the lowest PCE, as its low V_OC and FF implied severe internal recombination. These results highlight the critical role of bandgap and material properties in optimizing perovskite solar cell performance. The synergy between deep learning and DFT enables the identification of such high-performing materials, accelerating the development of efficient perovskite solar cells.
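The four device metrics are linked by PCE = V_OC × J_SC × FF / P_in. The sketch below recomputes the PCE from the tabulated V_OC, J_SC, and FF as a consistency check, assuming the standard AM1.5G incident power of 100 mW/cm². The function and variable names are illustrative; small differences from the reported values are expected because the tabulated inputs are rounded.

```python
P_IN_MW_PER_CM2 = 100.0  # ASSUMPTION: standard AM1.5G incident power density


def pce_percent(v_oc, j_sc, ff_percent, p_in=P_IN_MW_PER_CM2):
    """PCE in percent from V_OC (V), J_SC (mA/cm^2), and FF (%)."""
    output_power = v_oc * j_sc * (ff_percent / 100.0)  # mW/cm^2
    return 100.0 * output_power / p_in


# name: (V_OC, J_SC, FF, reported PCE), copied from the performance table.
devices = {
    "Cs2AgIrF6": (0.94, 31.19, 80.81, 23.71),
    "Cs2AlAgBr6": (0.78, 36.73, 77.66, 22.37),
    "Cs2GaAgCl6": (0.61, 44.55, 73.36, 19.85),
    "Cs2InAgCl6": (0.68, 41.85, 75.32, 21.29),
}

recomputed = {
    name: pce_percent(v_oc, j_sc, ff) for name, (v_oc, j_sc, ff, _) in devices.items()
}

for name, (_, _, _, reported) in devices.items():
    print(f"{name}: recomputed {recomputed[name]:.2f}% vs reported {reported:.2f}%")
```

The recomputed values agree with the reported PCEs to within about 0.15 percentage points, which is consistent with V_OC being quoted to only two decimal places.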
Conclusion
This study demonstrates the synergy between deep learning and density functional theory for accelerating the discovery of lead-free double perovskite materials for solar cell applications. By constructing a database of 1181 inorganic halide double perovskites and developing a deep neural network model, we achieved accurate bandgap predictions with an MAE of 0.264 eV and R² of 0.925, outperforming traditional machine learning algorithms. SHAP analysis of the DNN revealed key features, such as atomic density and Mendeleev numbers, that govern bandgap variations through fundamental physical mechanisms. Screening of 55 candidates yielded four promising perovskites (Cs₂GaAgCl₆, Cs₂AgIrF₆, Cs₂InAgCl₆, and Cs₂AlAgBr₆) with direct bandgaps in the 1.0–2.0 eV range targeted for perovskite solar cells, along with negative formation energies, stable structural factors, and lead-free compositions. Device simulations indicated high photovoltaic performance, with Cs₂AgIrF₆ and Cs₂AlAgBr₆ reaching simulated efficiencies of 23.71% and 22.37%, respectively. This integrated approach not only reduces the time and cost associated with traditional methods but also provides a rational framework for designing high-performance, stable, and sustainable perovskite solar cells. Future work will involve experimental validation and extension to other material properties.
