Semantic segmentation of infrared images plays a pivotal role in detecting defects and ensuring the operational efficiency of solar panels. Traditional methods often struggle with edge adhesion, background noise, and insufficient feature extraction in complex environments. To address these challenges, this study proposes an enhanced U-Net model tailored for solar panel infrared image segmentation. The model integrates three key innovations: white-edge preprocessing, a VGG16-based encoder, and a Res-CBAM attention mechanism. Experimental results demonstrate a significant improvement in segmentation accuracy, achieving a mean Intersection over Union (mIoU) of 99.73% and an accuracy of 99.87%, outperforming state-of-the-art models such as DeepLabV3+ and HRNetV2.

1. Introduction
The global transition toward renewable energy has accelerated the adoption of solar panels. However, surface defects, such as cracks or hotspots, reduce energy conversion efficiency and are often visible as high-intensity regions in infrared (IR) images. Accurate segmentation of solar panels from IR imagery is critical for automated fault detection systems. Conventional segmentation techniques, including region growing and K-means clustering, lack robustness against noise and complex backgrounds. Deep learning models, particularly U-Net, have shown promise in medical imaging but face limitations in large-scale solar panel applications due to information loss and inadequate edge detection.
This work introduces an optimized U-Net architecture that addresses these limitations through:
- White-edge preprocessing to enhance boundary features.
- VGG16 encoder for hierarchical semantic extraction.
- Res-CBAM attention modules to suppress noise and prioritize critical regions.
2. Methodology
2.1 White-Edge Preprocessing
The irregular shapes and low contrast of solar panels in IR images necessitate robust preprocessing. The white-edge technique highlights panel boundaries by assigning pixel values as follows:
- Solar panel region: Pixel value = 1.
- Background: Pixel value = 0.
- Edges: Pixel value = 255.
Mathematically, the processed image I′ is generated by:
I′ = I × (1 − M) + M × 255
where I is the original image and M is a binary mask derived from annotated contours. This step amplifies edge features, improving the model’s ability to distinguish adjacent solar panels (Figure 1).
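As a rough illustration, the white-edge transform above could be implemented with OpenCV and NumPy as follows. This is a minimal sketch: the function name, the edge thickness, and the use of cv2.drawContours to rasterize the annotated contours into the mask M are assumptions, since the paper does not specify these details.

```python
import cv2
import numpy as np

def white_edge_preprocess(image: np.ndarray, contours) -> np.ndarray:
    """Apply I' = I * (1 - M) + M * 255, where M is a binary mask
    rasterized from the annotated panel contours."""
    # Rasterize the annotated contours into a binary mask M (1 on edges, 0 elsewhere).
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    cv2.drawContours(mask, contours, -1, 1, 2)  # thickness of 2 px is an assumption

    # Broadcast the mask across channels for multi-channel (e.g., RGB) inputs.
    if image.ndim == 3:
        mask = mask[..., None]

    # Keep original intensities off the edges, set edge pixels to 255.
    processed = image.astype(np.float32) * (1 - mask) + mask * 255.0
    return processed.astype(np.uint8)
```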
2.2 VGG16 Encoder
The original U-Net encoder is replaced with the first 13 convolutional layers of VGG16. This modification enhances shallow feature extraction while maintaining computational efficiency. For an input image of size 512×512×3, the encoder generates multi-scale feature maps:
- Block 1: 512×512×64
- Block 2: 256×256×128
- Block 3: 128×128×256
- Block 4: 64×64×512
- Block 5: 32×32×512
These features are propagated to the decoder via skip connections, preserving spatial details critical for solar panel segmentation.
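A possible PyTorch/torchvision realization of this encoder is sketched below. The slicing indices follow torchvision's vgg16 layer layout so that each block ends with the feature-map sizes listed above; the class name and the pretrained-weights choice are illustrative assumptions.

```python
import torch.nn as nn
from torchvision import models

class VGG16Encoder(nn.Module):
    """First 13 convolutional layers of VGG16, split into five blocks that
    yield the 64/128/256/512/512-channel feature maps listed above."""

    def __init__(self, pretrained: bool = True):
        super().__init__()
        weights = models.VGG16_Weights.DEFAULT if pretrained else None
        features = models.vgg16(weights=weights).features
        # Split vgg16.features so each block ends just before the next
        # max-pooling halves the spatial resolution.
        self.block1 = features[:4]     # conv1_x          -> 512x512x64
        self.block2 = features[4:9]    # pool + conv2_x   -> 256x256x128
        self.block3 = features[9:16]   # pool + conv3_x   -> 128x128x256
        self.block4 = features[16:23]  # pool + conv4_x   -> 64x64x512
        self.block5 = features[23:30]  # pool + conv5_x   -> 32x32x512

    def forward(self, x):
        f1 = self.block1(x)
        f2 = self.block2(f1)
        f3 = self.block3(f2)
        f4 = self.block4(f3)
        f5 = self.block5(f4)
        # The five maps feed the decoder through skip connections.
        return f1, f2, f3, f4, f5
```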
2.3 Res-CBAM Attention Mechanism
The Convolutional Block Attention Module (CBAM) combines channel and spatial attention to refine feature maps. Let F ∈ R^{H×W×C} denote an input feature map.
Channel Attention:
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
F′ = M_c(F) ⊗ F
where σ is the sigmoid function and ⊗ denotes element-wise multiplication.
Spatial Attention:
M_s(F′) = σ(f^{7×7}([AvgPool(F′); MaxPool(F′)]))
F″ = M_s(F′) ⊗ F′
where f^{7×7} is a 7×7 convolutional layer.
Res-CBAM integrates a residual connection to mitigate vanishing gradients:
O = F + F″
This module enhances the model’s focus on solar panel edges while suppressing irrelevant background features.
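A compact PyTorch sketch of a Res-CBAM block consistent with the equations above is shown below. The reduction ratio of 16 and the class name are illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn as nn

class ResCBAM(nn.Module):
    """CBAM (channel then spatial attention) with a residual skip:
    O = F + M_s(F') ⊗ F', where F' = M_c(F) ⊗ F."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # Shared MLP applied to both the avg- and max-pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # 7x7 convolution over the concatenated avg/max spatial maps.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f.shape

        # Channel attention: M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F)))
        avg = self.mlp(f.mean(dim=(2, 3)))
        mx = self.mlp(f.amax(dim=(2, 3)))
        mc = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        f_prime = mc * f

        # Spatial attention: M_s(F') = sigmoid(f_7x7([AvgPool(F'); MaxPool(F')]))
        avg_map = f_prime.mean(dim=1, keepdim=True)
        max_map = f_prime.amax(dim=1, keepdim=True)
        ms = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
        f_double_prime = ms * f_prime

        # Residual connection: O = F + F''
        return f + f_double_prime
```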
3. Experimental Setup
3.1 Dataset and Training
A dataset of 2,000 IR images of solar panels was collected using an HT20 thermal camera mounted on a DJI M300RTK drone. The images, spanning residential and industrial installations, were split into training (80%), validation (10%), and testing (10%) sets.
Training Parameters:
- Framework: PyTorch 2.3.0
- Hardware: NVIDIA RTX 3080 Ti
- Input size: 512×512
- Batch size: 16
- Optimizer: Adam (LR = 1×10⁻⁴)
- Loss: Binary cross-entropy
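A minimal PyTorch sketch of the optimizer and loss configuration listed above, together with a single training step. The function names are illustrative; BCEWithLogitsLoss is assumed here for numerical stability (nn.BCELoss would be used instead if the network already ends in a sigmoid).

```python
from torch import nn, optim

def make_training_objects(model: nn.Module):
    """Optimizer and loss matching the settings above: Adam with LR = 1e-4
    and binary cross-entropy."""
    optimizer = optim.Adam(model.parameters(), lr=1e-4)
    # Binary cross-entropy computed on raw logits (sigmoid folded into the loss).
    criterion = nn.BCEWithLogitsLoss()
    return optimizer, criterion

def train_step(model, optimizer, criterion, images, masks):
    """One optimization step on a batch of 512x512 IR images and binary masks."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)           # expected shape (B, 1, 512, 512)
    loss = criterion(logits, masks)  # masks in {0, 1}, same shape as logits
    loss.backward()
    optimizer.step()
    return loss.item()
```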
3.2 Evaluation Metrics
Performance was measured using:
- mIoU:
mIoU = (1/K) Σ_{i=1}^{K} TP_i / (TP_i + FP_i + FN_i)
- Accuracy:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where K = 2 (solar panel vs. background).
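As a small sketch, both metrics can be computed from integer label maps as follows; the function name is illustrative and not part of the paper.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, target: np.ndarray, num_classes: int = 2):
    """Compute mIoU and pixel accuracy for K classes
    (here K = 2: solar panel vs. background)."""
    ious = []
    for k in range(num_classes):
        tp = np.sum((pred == k) & (target == k))
        fp = np.sum((pred == k) & (target != k))
        fn = np.sum((pred != k) & (target == k))
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else np.nan)
    miou = np.nanmean(ious)                  # mean IoU over the K classes
    accuracy = np.mean(pred == target)       # (TP + TN) / (TP + TN + FP + FN)
    return miou, accuracy
```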
4. Results and Analysis
4.1 Comparative Performance
The proposed model outperformed existing architectures (Table 1):
| Model | mIoU | Accuracy |
| --- | --- | --- |
| DeepLabV3+ | 97.32% | 98.17% |
| PSP-Net | 96.28% | 97.65% |
| U-Net | 97.72% | 98.37% |
| Res-U-Net | 97.93% | 97.93% |
| HRNetV2 | 97.58% | 97.96% |
| Proposed | 99.73% | 99.87% |
Key advantages include:
- Edge clarity: Reduced adhesion between adjacent solar panels.
- Noise resilience: Suppressed background interference from buildings or vegetation.
- Detail preservation: Avoided over-smoothing in low-light conditions.
4.2 Ablation Study
Ablation tests confirmed the contribution of each component (Table 2):
| Configuration | mIoU | Accuracy |
| --- | --- | --- |
| Baseline U-Net | 97.72% | 98.37% |
| + White-edge | 99.41% | 99.71% |
| + VGG16 | 99.22% | 99.62% |
| + Res-CBAM | 98.97% | 99.47% |
| Full Model | 99.73% | 99.87% |
The synergistic effect of all modules maximized performance, particularly in complex scenes.
4.3 Attention Mechanism Comparison
Res-CBAM outperformed other attention strategies (Table 3):
| Attention | mIoU | Accuracy |
| --- | --- | --- |
| CA | 98.74% | 99.39% |
| ECA | 98.79% | 99.41% |
| CBAM | 98.72% | 99.37% |
| Res-CBAM | 98.97% | 99.49% |
5. Conclusion
This study presents an enhanced U-Net model for semantic segmentation of solar panel IR images. By integrating white-edge preprocessing, a VGG16 encoder, and Res-CBAM attention, the model achieves state-of-the-art accuracy in detecting panel boundaries and suppressing noise. The improvements are validated through rigorous comparisons and ablation studies, demonstrating superior performance in diverse environmental conditions. Future work will explore real-time deployment and scalability to larger solar farms.