Abstract
To address the low accuracy of solar panel defect detection and classification, we propose an enhanced algorithm that combines an improved Single Shot MultiBox Detector (SSD) with an improved Residual Network (ResNet). Incorporating the Convolutional Block Attention Module (CBAM) into the VGG16 backbone of SSD strengthens multi-scale feature extraction. We also redesign the aspect ratios of the SSD default boxes to better match solar panel geometries. For ResNet, we embed Squeeze-and-Excitation (SENet) modules into each residual block to emphasize informative channels. Experimental results show that the improved SSD achieves 97.23% detection accuracy at 21 frames per second (FPS), while the enhanced ResNet attains over 95% classification accuracy for defects such as cracks, shadows, and grid interruptions.

1. Introduction
Solar panels are critical components in renewable energy systems, yet defects like micro-cracks, hotspots, and cell fractures significantly degrade their efficiency. Traditional inspection methods rely on manual labor or basic computer vision techniques, which are time-consuming and error-prone. Recent advances in deep learning, particularly object detection and classification frameworks like SSD and ResNet, offer promising solutions. However, these models often struggle with the unique geometries and subtle defects of solar panels.
Our work addresses these limitations by:
- Enhancing SSD with CBAM for adaptive feature recalibration.
- Redesigning default boxes in SSD to match solar panel aspect ratios.
- Integrating SENet into ResNet to prioritize critical defect-related channels.
2. Methodology
2.1 Improved SSD for Solar Panel Localization
Baseline SSD Architecture:
SSD employs a VGG16 backbone to extract multi-scale features. However, uniform weighting across channels and spatial dimensions limits its precision.
CBAM Integration:
We integrate CBAM, which sequentially applies channel and spatial attention (Figure 1):
- Channel Attention: Highlights informative channels using global average and max pooling: $M_c(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F)))$
- Spatial Attention: Focuses on defect regions via convolutional operations: $M_s(F) = \sigma(f^{7 \times 7}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)]))$, where $\sigma$ denotes the sigmoid function.
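
The two attention steps can be sketched in plain Python on nested lists. This is only an illustration of the formulas, not the authors' TensorFlow implementation: the weight matrices `W0` and `W1`, the `kernel` argument, the zero padding, and the omission of biases are all assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(F, W0, W1):
    """M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))); F is a list of C maps (H x W)."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in F]
    mx = [max(map(max, ch)) for ch in F]

    def mlp(v):
        # Shared two-layer MLP: ReLU hidden layer, linear output (no biases, an assumption).
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in W0]
        return [sum(w * h for w, h in zip(row, hidden)) for row in W1]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(F, kernel):
    """M_s(F) = sigmoid(conv([AvgPool(F); MaxPool(F)])), pooling across channels."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    maps = [avg, mx]
    k = len(kernel[0])  # kernel[0] weights the average map, kernel[1] the max map
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # zero padding outside the map
                            acc += kernel[m][di][dj] * maps[m][y][x]
            out[i][j] = sigmoid(acc)
    return out

def cbam(F, W0, W1, kernel):
    """Apply channel attention first, then spatial attention, as described in the text."""
    mc = channel_attention(F, W0, W1)
    F1 = [[[mc[c] * v for v in row] for row in F[c]] for c in range(len(F))]
    ms = spatial_attention(F1, kernel)
    return [[[ms[i][j] * F1[c][i][j] for j in range(len(ms[0]))]
             for i in range(len(ms))] for c in range(len(F1))]
```

With all weights set to zero, both attention maps equal 0.5 everywhere, so `cbam` returns the input scaled by 0.25. That makes a convenient sanity check before plugging in learned weights.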
Default Box Redesign:
To accommodate rectangular solar panels, we adjust default box dimensions using: $w_k^a = S_k \sqrt{a_r}$, $h_k^a = S_k / \sqrt{a_r}$,
where $S_k$ scales with feature map resolution, and $a_r \in \{1, 2, 3, \frac{1}{2}, \frac{1}{3}\}$.
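
As a quick numerical check of the default box formula, the following sketch computes the width and height for each aspect ratio at a single scale. The function name `default_box_sizes` and the scale value 0.2 are illustrative assumptions, not values from the paper.

```python
import math

ASPECT_RATIOS = (1, 2, 3, 1 / 2, 1 / 3)  # a_r values listed in the text

def default_box_sizes(s_k, aspect_ratios=ASPECT_RATIOS):
    """Return (width, height) pairs with w = s_k * sqrt(a_r) and h = s_k / sqrt(a_r)."""
    return [(s_k * math.sqrt(a), s_k / math.sqrt(a)) for a in aspect_ratios]

# s_k = 0.2 is an illustrative scale, not a value from the paper.
for a, (w, h) in zip(ASPECT_RATIOS, default_box_sizes(0.2)):
    print(f"a_r={a:.3f}  w={w:.4f}  h={h:.4f}")
```

Every box at a given scale has the same area $S_k^2$, and its width-to-height ratio equals $a_r$, so the aspect ratio changes the shape of the box without changing its size.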
2.2 Enhanced ResNet for Defect Classification
Baseline ResNet Limitations:
Standard ResNet treats all channels equally, diluting subtle defect features.
SENet Integration:
We embed SENet after each residual block (Figure 2). For input feature $U$, SENet performs:
- Squeeze: Global average pooling to capture channel-wise statistics: $z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} U_c(i, j)$
- Excitation: Learn channel dependencies via fully connected layers: $s = \sigma(W_2 \, \delta(W_1 z))$, where $\delta$ is ReLU, and $W_1$, $W_2$ are weights.
- Reweighting: Scale original features with $s$: $\tilde{U}_c = s_c \cdot U_c$
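
The three steps can be traced on a small example with the sketch below. It works on nested Python lists rather than PyTorch tensors and omits biases, as the formulas do. The function name `se_block` and the shapes of `W1` and `W2` are assumptions for illustration, not the authors' code.

```python
import math

def se_block(U, W1, W2):
    """Reweight the channels of U (a list of C maps, each H x W) with an SE gate.

    W1 has shape (C_reduced x C) and W2 has shape (C x C_reduced).
    Returns the rescaled feature maps and the gate vector s.
    """
    # Squeeze: global average pooling gives one statistic z_c per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in U]
    # Excitation: s = sigmoid(W2 * relu(W1 * z)).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in W1]
    s = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
         for row in W2]
    # Reweighting: scale every value in channel c by s_c.
    scaled = [[[s[c] * v for v in row] for row in U[c]] for c in range(len(U))]
    return scaled, s
```

Because each gate value lies in (0, 1), the block can only attenuate channels relative to one another. A channel is emphasized by being suppressed less than the others.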
3. Experimental Setup
3.1 Dataset Preparation
- Source: 2,000 solar panel images with defects (cracks, shadows, grid interruptions).
- Augmentation: Rotation, flipping, and contrast adjustment expanded the dataset to 8,000 images.
- Annotation: Labelme software for bounding box annotation (solar panels) and defect classification.
- Split: 80% training, 20% testing.
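
A minimal sketch of the 80/20 split follows, assuming a simple seeded shuffle. The text does not state how the split was drawn or whether it preceded augmentation, so the helper `train_test_split`, the seed, and the placeholder file names are all hypothetical.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder file names standing in for the 8,000 augmented images.
images = [f"panel_{i:04d}.png" for i in range(8000)]
train, test = train_test_split(images)
print(len(train), len(test))  # 6400 1600
```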
3.2 Training Configuration
Component | Specification |
---|---|
Hardware | Intel i5-9400F, GTX 2060, 16GB RAM |
Framework | TensorFlow (SSD), PyTorch (ResNet) |
Optimizer | Adam (SSD), SGD (ResNet) |
Batch Size | 16 (SSD), 32 (ResNet) |
Learning Rate | 1e-4 (SSD), 1e-3 (ResNet) |
3.3 Evaluation Metrics
- Accuracy: $\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}$
- FPS: Frames processed per second.
- Computational Load: Floating-point operations (FLOPs).
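
The first two metrics are simple enough to sketch directly. The helper names below are hypothetical, and the timing loop is one plausible way to measure FPS, not necessarily the procedure used in the experiments.

```python
import time

def accuracy(tp, tn, fp, fn):
    """Acc = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total

def frames_per_second(process_frame, frames):
    """Run process_frame on every frame and return the average frames per second."""
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

For example, `accuracy(90, 5, 3, 2)` counts 95 correct predictions out of 100 and returns 0.95. In practice `process_frame` would wrap a full forward pass of the detector, and a few warm-up frames are usually excluded from the timing.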
4. Results and Analysis
4.1 Solar Panel Detection Performance
Improved SSD vs. Baseline Models
Model | Accuracy (%) | FPS |
---|---|---|
Faster R-CNN | 90.37 | 7 |
RetinaNet | 85.99 | 30 |
YOLOv3 | 87.59 | 42 |
Our SSD | 97.23 | 21 |
Key observations:
- Our SSD achieves 9.64 percentage points higher accuracy than YOLOv3 (97.23% vs. 87.59%).
- Faster convergence: Stabilizes at 300 epochs vs. 500 for baseline SSD (Figure 3).
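
The accuracy and speed trade-off can be read off the detection table with a few lines of arithmetic. The sketch below only recomputes differences from the reported numbers. The dictionary layout and the helper `compare` are illustrative.

```python
# (accuracy %, FPS) copied from the detection table above.
DETECTORS = {
    "Faster R-CNN": (90.37, 7),
    "RetinaNet": (85.99, 30),
    "YOLOv3": (87.59, 42),
    "Our SSD": (97.23, 21),
}

def compare(ours="Our SSD"):
    """Return {model: (accuracy gap in percentage points, FPS ratio)} relative to `ours`."""
    ours_acc, ours_fps = DETECTORS[ours]
    return {
        name: (ours_acc - acc, ours_fps / fps)
        for name, (acc, fps) in DETECTORS.items()
        if name != ours
    }

for name, (gap, ratio) in compare().items():
    print(f"{name}: {gap:+.2f} points, {ratio:.2f}x FPS")
```

The improved SSD leads every listed model in accuracy. It runs three times faster than Faster R-CNN but at half the frame rate of YOLOv3.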
Default Box Impact:
Redesigned default boxes reduce localization errors by 22%, aligning better with solar panel shapes.
4.2 Defect Classification Performance
Improved ResNet vs. Baseline Models
Model | Crack (%) | Shadow (%) | Grid (%) | FLOPs (G) |
---|---|---|---|---|
SVM | 93.19 | 89.45 | 90.12 | 1.27 |
VGG16 | 87.44 | 84.35 | 88.67 | 7.77 |
ResNet | 90.36 | 87.45 | 90.11 | 3.75 |
Our ResNet | 96.11 | 95.99 | 97.08 | 3.89 |
Key observations:
- SENet boosts accuracy over baseline ResNet by 5.75 to 8.54 percentage points across defect types.
- Minimal FLOPs increase (3.75G → 3.89G, about 3.7%), keeping the added computational cost small.
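
The per-class gains and the FLOPs overhead follow directly from the classification table. The sketch below recomputes them. The dictionary layout and function names are illustrative.

```python
# Per-class accuracy (%) and FLOPs (G) copied from the classification table above.
RESNET = {"Crack": 90.36, "Shadow": 87.45, "Grid": 90.11, "FLOPs": 3.75}
OURS = {"Crack": 96.11, "Shadow": 95.99, "Grid": 97.08, "FLOPs": 3.89}

def accuracy_gains():
    """Gain of the SENet-augmented ResNet over baseline ResNet, in percentage points."""
    return {k: OURS[k] - RESNET[k] for k in ("Crack", "Shadow", "Grid")}

def flops_overhead_percent():
    """Relative increase in FLOPs caused by the added SENet modules."""
    return (OURS["FLOPs"] - RESNET["FLOPs"]) / RESNET["FLOPs"] * 100

for defect, gain in accuracy_gains().items():
    print(f"{defect}: {gain:+.2f} points")
print(f"FLOPs overhead: {flops_overhead_percent():.1f}%")
```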
5. Conclusion
Our hybrid framework, combining CBAM-enhanced SSD and SENet-augmented ResNet, significantly advances solar panel defect identification. Key contributions include:
- CBAM-SSD: 97.23% detection accuracy with optimized default boxes.
- SENet-ResNet: >95% classification accuracy for critical defects.
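In accuracy, this approach outperforms Faster R-CNN, RetinaNet, and YOLOv3 on detection and SVM, VGG16, and baseline ResNet on classification, although RetinaNet and YOLOv3 run at higher frame rates (30 and 42 FPS vs. 21 FPS). Future work will explore lightweight variants for edge deployment.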