Efficient Solar Panel Contamination Detection Using an Enhanced SSD Algorithm with Lightweight Attention Mechanisms

In the rapidly expanding field of renewable energy, solar panels are critical components of photovoltaic power stations. Over time, surface contamination such as dust, bird droppings, snow, and physical damage can significantly reduce their energy conversion efficiency. Traditional inspection methods, including electrical characteristic analysis and manual visual checks, are either resource-intensive or lack the precision required for large-scale deployments. To address these challenges, this paper proposes an improved Single Shot MultiBox Detector (SSD) algorithm tailored for efficient solar panel contamination detection. By integrating lightweight architectures and attention mechanisms, our method achieves high accuracy while maintaining real-time performance, making it ideal for deployment on unmanned aerial vehicles (UAVs) equipped with embedded systems.

1. Introduction

Solar panels are prone to contamination during long-term operation, which can reduce power generation efficiency by up to 30% in severe cases. Existing defect detection methods, such as electrical signature analysis and morphology-based image processing, struggle to balance speed and accuracy. Deep learning-based approaches, particularly single-stage detectors like SSD, offer a promising solution. However, conventional SSD models suffer from high computational complexity, limiting their applicability in resource-constrained UAV systems. This work introduces a lightweight SSD framework optimized for solar panel inspection, leveraging MobileNetV3 as the backbone network and integrating a coordinate attention (CA) mechanism. These innovations reduce model complexity while enhancing feature extraction capabilities, enabling real-time detection with minimal hardware requirements.

2. Methodology

2.1 Lightweight SSD-MobileNetV3 Architecture

The original SSD algorithm employs a multi-scale feature extraction strategy but relies on computationally heavy backbones such as VGG16. To adapt it for solar panel defect detection, we replace the backbone with MobileNetV3-Large, which is built from inverted residual bottleneck blocks (Figure 1). This design reduces parameters and floating-point operations (FLOPs) through depthwise separable convolutions and a linear (activation-free) projection layer at the end of each block. The modified SSD architecture processes input images at 300×300 resolution and generates predictions across six feature map scales (19×19 down to 1×1) to detect contaminants of varying sizes.
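To give a feel for the multi-scale prediction head, the short sketch below counts grid cells and default boxes across six feature maps. The intermediate map sizes (10, 5, 3, 2) and the number of default boxes per cell (6) are assumptions chosen for illustration; the text only states the range 19×19 to 1×1.

```python
# Intermediate sizes are assumed; the paper only gives the range 19x19 to 1x1.
FEATURE_MAP_SIZES = [19, 10, 5, 3, 2, 1]
BOXES_PER_CELL = 6  # assumed number of default boxes per grid cell


def count_default_boxes(sizes, boxes_per_cell):
    """Return the cell count per feature map and the total number of default boxes."""
    cells = [s * s for s in sizes]
    return cells, sum(cells) * boxes_per_cell


cells, total_boxes = count_default_boxes(FEATURE_MAP_SIZES, BOXES_PER_CELL)
print(cells, total_boxes)
```

Under these assumptions most candidate boxes come from the 19×19 map, which is where small contaminants are expected to be matched.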

Inverted Residual Bottleneck Block: Input → 1×1 Conv (Expand) → Depthwise 3×3 Conv → 1×1 Conv (Reduce) → Output

This structure first expands the channel dimension, applies depthwise convolution for spatial feature extraction, and then compresses channels to retain critical information efficiently.
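The parameter savings can be illustrated with a rough weight count. The sketch below compares a dense 3×3 convolution with a depthwise separable one and counts the weights of one expand, depthwise, reduce block. The channel widths and the expansion factor of 4 are illustrative assumptions, not values from this paper, and biases, batch normalization, and any squeeze-and-excitation weights are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a dense k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Depthwise k x k (one filter per input channel) plus a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def inverted_residual_params(c_in, c_out, expand=4, k=3):
    """1x1 expand -> depthwise k x k -> 1x1 reduce (bias and BN ignored)."""
    c_mid = c_in * expand  # expanded channel width (expansion factor is assumed)
    return c_in * c_mid + k * k * c_mid + c_mid * c_out


dense = standard_conv_params(256, 256)
separable = depthwise_separable_params(256, 256)
print(dense, separable, round(dense / separable, 2))
print(inverted_residual_params(64, 64))
```

At equal width, the separable form needs roughly 1/C_out + 1/k² of the dense weights, which is where most of the reduction comes from.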

2.2 Coordinate Attention Mechanism

To improve the model’s focus on contamination regions, we integrate a lightweight CA mechanism. Unlike traditional channel attention, CA encodes spatial coordinates into attention weights, enabling the model to prioritize critical areas in solar panel images.

Mathematical Formulation:

Horizontal and vertical average pooling:

$$z^h = \frac{1}{W} \sum_{0 \le j < W} x_{h,j}, \qquad z^w = \frac{1}{H} \sum_{0 \le i < H} x_{i,w}$$

Concatenation and convolution:

$$f = \delta\big(\mathrm{Conv1D}(\mathrm{Concat}(z^h, z^w))\big)$$

Attention weights:

$$a^h = \sigma(W_h f), \qquad a^w = \sigma(W_w f)$$

Output feature map:

$$y_{h,w,c} = x_{h,w,c} \cdot a^h_h \cdot a^w_w$$

Here, $\sigma$ denotes the sigmoid function, $\delta$ a non-linear activation function, and $W_h$, $W_w$ are learnable parameters. This mechanism enhances the detection of small contaminants like bird droppings and localized damage on solar panels.
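The four steps above can be sketched in NumPy for a single feature map of shape (C, H, W). This is a minimal illustration, not the trained module: random matrices stand in for the learned weights, the shared convolution is written as a matrix product over channels, ReLU is assumed for the activation δ, and the intermediate map f is split back into its height and width parts before W_h and W_w are applied, which is an assumption about how the compact notation is realized.

```python
import numpy as np


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


def coordinate_attention(x, w_f, w_h, w_w):
    """x: (C, H, W) feature map; w_f: (C_r, C); w_h and w_w: (C, C_r)."""
    c, h, w = x.shape
    z_h = x.mean(axis=2)                    # (C, H): average over the width
    z_w = x.mean(axis=1)                    # (C, W): average over the height
    z = np.concatenate([z_h, z_w], axis=1)  # (C, H + W)
    f = np.maximum(w_f @ z, 0.0)            # (C_r, H + W); ReLU as delta (assumed)
    f_h, f_w = f[:, :h], f[:, h:]           # split per direction (assumed)
    a_h = sigmoid(w_h @ f_h)                # (C, H) attention along the height
    a_w = sigmoid(w_w @ f_w)                # (C, W) attention along the width
    # Broadcast both attention maps over the full (C, H, W) grid.
    return x * a_h[:, :, None] * a_w[:, None, :]


rng = np.random.default_rng(0)
C, H, W, C_r = 8, 5, 7, 4  # hypothetical sizes for the demo
x = rng.standard_normal((C, H, W))
y = coordinate_attention(
    x,
    rng.standard_normal((C_r, C)),
    rng.standard_normal((C, C_r)),
    rng.standard_normal((C, C_r)),
)
print(y.shape)
```

Because both attention maps lie in (0, 1), each output value is a damped copy of the input, with the damping depending on its row and column position.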

2.3 Mosaic Data Augmentation

Solar panel defect datasets are often limited in size and diversity. To address this, we apply Mosaic augmentation, which combines four training images into one through scaling, cropping, and stitching (Figure 2). This technique simulates complex real-world scenarios, improving model robustness against varying contamination patterns and lighting conditions.
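A simplified version of the stitching step might look as follows. This sketch only handles the images: each of the four inputs is rescaled (nearest neighbour) to fill one quadrant around a random split point. Random cropping and the remapping of bounding boxes, which a full Mosaic pipeline also needs, are omitted, and the helper names are hypothetical.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]


def mosaic(images, size=300, rng=None):
    """Stitch four (H, W, 3) images into one size x size canvas."""
    assert len(images) == 4
    if rng is None:
        rng = np.random.default_rng()
    # Random split point, kept away from the borders so no tile is empty.
    cx = int(rng.integers(size // 4, 3 * size // 4))
    cy = int(rng.integers(size // 4, 3 * size // 4))
    canvas = np.zeros((size, size, 3), dtype=images[0].dtype)
    # (row_start, row_end, col_start, col_end) of the four quadrants
    regions = [(0, cy, 0, cx), (0, cy, cx, size),
               (cy, size, 0, cx), (cy, size, cx, size)]
    for img, (r0, r1, c0, c1) in zip(images, regions):
        canvas[r0:r1, c0:c1] = resize_nearest(img, r1 - r0, c1 - c0)
    return canvas, (cx, cy)


# Four constant-valued dummy images of different heights stand in for photos.
tiles = [np.full((120 + 40 * i, 200, 3), i + 1, dtype=np.uint8) for i in range(4)]
canvas, (cx, cy) = mosaic(tiles, size=300, rng=np.random.default_rng(0))
print(canvas.shape, cx, cy)
```

In a detection setting, each source image's boxes would have to be scaled and shifted by the same quadrant transform before training.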

3. Experiments and Results

3.1 Dataset Preparation

We curated a custom dataset of 7,412 solar panel images, annotated with five classes: clean, bird droppings, dirt, electrical damage, and physical damage. The dataset was split into training (80%), validation (10%), and test (10%) sets (Table 1).

Table 1: Solar Panel Contamination Dataset

Category	Training	Validation	Test
Clean	1,124	119	135
Bird Droppings	1,459	167	180
Dirt	1,703	105	83
Electrical Damage	374	34	35
Physical Damage	297	24	33

3.2 Implementation Details

Training was conducted on an NVIDIA RTX 3060 GPU using PyTorch. Key hyperparameters included:

Input resolution: 300×300
Batch size: 32
Optimizer: SGD (momentum=0.9)
Learning rate: 0.01 (cosine annealing)
Epochs: 300
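For reference, the learning-rate schedule implied by these settings can be written in closed form. The sketch below assumes the rate decays to a minimum of 0 over the full 300 epochs with no warm-up or restarts; neither detail is stated above.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.01, total_epochs=300, min_lr=0.0):
    """Closed-form cosine annealing: base_lr at epoch 0, min_lr at the last epoch."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cos_term


for epoch in (0, 75, 150, 225, 300):
    print(epoch, round(cosine_annealing_lr(epoch), 6))
```

The rate stays near 0.01 early on, passes through half of its initial value at the midpoint, and flattens out toward the end of training.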

3.3 Ablation Study

We evaluated the contribution of each component (Table 2). Replacing ResNet50 with MobileNetV3 reduced parameters by 68.3% and FLOPs by 54.3%. With Mosaic enabled, adding the CA mechanism improved mean average precision (mAP) by 5.21 percentage points (77.50% to 82.71%). With CA enabled, adding Mosaic augmentation raised accuracy by about 8 percentage points (84.11% to 92.28%).

Table 2: Ablation Study Results

Model	Backbone	CA	Mosaic	mAP (%)	Accuracy (%)	Params	FLOPs (G)
Baseline	ResNet50	No	Yes	72.68	78.43	44.55M	30.5
MobileNetV3	MobileNetV3	No	Yes	77.50	81.03	14.11M	13.7
+CA	MobileNetV3	Yes	No	78.41	84.11	14.11M	13.7
Proposed	MobileNetV3	Yes	Yes	82.71	92.28	14.11M	13.7

3.4 Comparative Analysis

Our model outperformed the compared detectors (Table 3). Compared to Faster R-CNN, it achieved higher mAP (82.71% vs. 80.39%) with 92.6% fewer parameters. Against YOLOv3, accuracy improved by 11.3 percentage points (94.2% vs. 82.9%). Inference speed reached 45.6 FPS, making the model suitable for real-time solar panel inspections.

Table 3: Performance Comparison

Model	mAP (%)	Accuracy (%)	Params	FLOPs (G)	FPS
Faster R-CNN	80.39	89.6	191.39M	240.0	3.2
YOLOv3	72.63	82.9	61.52M	20.6	21.1
SSD-ResNet50	72.68	78.4	44.55M	30.5	23.9
Proposed	82.71	94.2	14.15M	13.8	45.6
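The relative figures quoted for this comparison follow directly from Table 3. The short script below recomputes them from the table values; the helper name and dictionary layout are only for illustration.

```python
# (parameters in millions, accuracy in %, FPS), copied from Table 3
models = {
    "Faster R-CNN": (191.39, 89.6, 3.2),
    "YOLOv3": (61.52, 82.9, 21.1),
    "SSD-ResNet50": (44.55, 78.4, 23.9),
    "Proposed": (14.15, 94.2, 45.6),
}


def compare(baseline, ours="Proposed"):
    """Relative parameter reduction, accuracy gain (points), and FPS ratio."""
    p_b, acc_b, fps_b = models[baseline]
    p_o, acc_o, fps_o = models[ours]
    return {
        "param_reduction_pct": round(100 * (1 - p_o / p_b), 1),
        "accuracy_gain_pts": round(acc_o - acc_b, 1),
        "speedup": round(fps_o / fps_b, 2),
    }


for name in ("Faster R-CNN", "YOLOv3", "SSD-ResNet50"):
    print(name, compare(name))
```

Accuracy differences are reported in percentage points, while the parameter reduction is a relative percentage of the baseline's size.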

4. Conclusion

This work presents a lightweight SSD-based framework for detecting solar panel contamination with high accuracy and efficiency. By integrating MobileNetV3 and a coordinate attention mechanism, the model achieves a 68.9% reduction in computational cost and a 4.3% improvement in mAP over the original SSD. The inclusion of Mosaic data augmentation further enhances generalization, enabling robust performance across diverse contamination types. With a detection speed of 45.6 FPS and a compact model size of 18.3 MB, our approach is ideally suited for UAV-mounted embedded systems, facilitating timely maintenance of solar panels and maximizing energy output in photovoltaic power stations. Future work will explore edge deployment optimizations and multi-modal data fusion for broader applications.