Improved Dn-YOLOv7 for Small Defect Detection on Photovoltaic Panels

With the increasing demand for renewable energy, solar power generation has become a critical component of the global energy mix. Photovoltaic panels are widely deployed, but their efficiency and reliability are often compromised by surface defects such as cracks, broken grids, and spots. These defects, typically small in size, pose significant challenges for detection systems, especially in aerial imagery where noise and complex backgrounds are prevalent. In this work, we propose an enhanced Dn-YOLOv7 algorithm to address these issues by improving noise robustness and small-target detection capabilities. Our approach integrates a novel denoising module, advanced loss functions, and coordinate-aware convolutions to achieve superior performance in noisy environments.

The detection of defects on photovoltaic panels is crucial for maintaining the operational efficiency of solar farms. However, aerial images captured by drones often contain various types of noise, including Gaussian and impulse noise, which can degrade detection accuracy. Traditional methods struggle with these challenges, as they either lack real-time performance or fail to preserve small target features. Our improved Dn-YOLOv7 model leverages deep learning techniques to mitigate noise interference and enhance the detection of minute defects on solar panels. This article details the methodology, experimental setup, and results, demonstrating the model’s effectiveness through comprehensive evaluations.

Our contributions include the introduction of a Denoising Block (DnBlock) that utilizes a noise-tolerant loss function, the replacement of the Intersection over Union (IoU) loss with Normalized Gaussian Wasserstein Distance (NWD) for better small-target handling, and the integration of CoordConv to reduce feature loss during denoising. These innovations collectively improve the model’s ability to detect defects in photovoltaic systems under noisy conditions. The following sections elaborate on each component, supported by mathematical formulations and empirical data.

The core of our approach lies in the modified Dn-YOLOv7 architecture, which consists of three main parts: the backbone network, the neck network, and the head network. The backbone incorporates Conv, BN, and SiLU (CBS) modules, ELAN, MPConv, and our proposed DnBlock. The neck network uses SPPCSPC and upsampling modules, while the head network employs Rep parameterization and auxiliary training heads. Key improvements include the DnBlock for denoising, NWD for regression loss, and CoordConv for enhanced feature extraction. These elements work together to address the specific challenges of photovoltaic panel defect detection.

To understand the denoising mechanism, we developed the DnBlock based on the DnCNN framework. It employs a residual learning strategy to predict noise maps from noisy inputs, which are then subtracted from the original image to recover clean features. The DnBlock uses multiple CBS layers and a concatenation operation to fuse features, resulting in a 256-dimensional denoised feature map. The loss function for this module is critical; we replace Mean Squared Error (MSE) with Mean Absolute Error (MAE) due to its symmetric properties and higher noise tolerance. For a dataset with uniform noise ratio $\eta$, the risk $R_{\eta}^L$ under loss function $L$ is defined as:

$$ R_{\eta}^L(f) = E_{x,\hat{y}} [L(f(x), \hat{y})] = (1 - \eta) R_L(f) + \frac{\eta}{k-1} \left( C - R_L(f) \right) = \frac{C\eta}{k-1} + \left[ 1 - \frac{k\eta}{k-1} \right] R_L(f), $$

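where $C$ is the constant obtained by summing the loss over all $k$ classes (the symmetry condition), $k$ is the number of classes, $\hat{y}$ is the possibly corrupted label, $R_L(f)$ is the risk on clean labels, and $f(x)$ is the model output. The difference in risk between two functions $f^*$ and $f$ is: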

$$ R_{\eta}^L(f^*) - R_{\eta}^L(f) = \left[ 1 - \frac{k\eta}{k-1} \right] \left[ R_L(f^*) - R_L(f) \right]. $$

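The constant term cancels, and the factor $1 - \frac{k\eta}{k-1}$ is positive whenever $\eta < \frac{k-1}{k}$, so the noisy and clean risks rank any two functions in the same order: a minimizer of the clean risk also minimizes the noisy risk. Symmetric loss functions like MAE are therefore noise-tolerant, because uniform label noise only rescales the risk and adds a constant. The MAE for a sample $e_i$ and output $u_i$ is computed as: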

$$ R_{MAE}^L(f) = \sum_{i=1}^n |e_i - u_i| = \sum_{i=1}^n (1 + A - 2u_i) = n + (n-2)A, $$

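where $A$ is a constant related to the activation function. The last equality holds when the outputs satisfy $\sum_{i=1}^n u_i = A$ (for a softmax output, $A = 1$ and the total is $2(n-1)$), so the sum does not depend on the individual outputs, which is the symmetry property. This formulation ensures robustness in noisy conditions, which is essential for aerial images of photovoltaic panels.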
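The symmetry argument can be checked numerically on a single sample. The sketch below is illustrative rather than the authors' code: it assumes a softmax-style output whose components sum to 1, so the class-summed MAE equals $2(k-1)$, and the helper names (`mae`, `class_sum`, `expected_noisy_loss`, `closed_form`) are hypothetical. Taking the expectation of the per-sample identity over the data gives the risk identity stated earlier.

```python
def mae(label, probs):
    """L1 distance between the one-hot vector for `label` and the output `probs`."""
    return sum(abs((1.0 if i == label else 0.0) - p) for i, p in enumerate(probs))


def class_sum(probs):
    """Sum of the MAE over every possible label; constant for a symmetric loss."""
    return sum(mae(j, probs) for j in range(len(probs)))


def expected_noisy_loss(label, probs, eta):
    """Expected MAE when the label is flipped uniformly to another class with probability eta."""
    k = len(probs)
    wrong = sum(mae(j, probs) for j in range(k) if j != label)
    return (1 - eta) * mae(label, probs) + eta / (k - 1) * wrong


def closed_form(label, probs, eta):
    """C*eta/(k-1) + (1 - k*eta/(k-1)) * clean loss, with C = 2(k-1) (assumes probs sum to 1)."""
    k = len(probs)
    c = 2.0 * (k - 1)
    return c * eta / (k - 1) + (1 - k * eta / (k - 1)) * mae(label, probs)


probs = [0.7, 0.2, 0.1]  # assumed softmax output for a 3-class sample
print(class_sum(probs))
print(expected_noisy_loss(0, probs, 0.3), closed_form(0, probs, 0.3))
```

The two printed noisy-loss values agree up to floating-point error, which is the per-sample version of the statement that noise only rescales the clean loss and adds a constant.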

For small-target detection, we replace the Complete IoU (CIoU) loss with NWD. This change addresses the sensitivity of IoU to scale variations, as small targets like defects in solar panels often have dimensions as low as 25×25 pixels in 640×640 images. The NWD models bounding boxes as 2D Gaussian distributions and measures similarity using the Wasserstein distance. The similarity $S_{NWD}$ between predicted box $N_p$ and ground truth $N_t$ is given by:

$$ S_{NWD}(N_p, N_t) = \exp \left( -\frac{ \left\| [cX_p, cY_p, W_p, H_p]^T - [cX_t, cY_t, W_t, H_t]^T \right\|_2^2 }{c} \right), $$

where $cX, cY, W, H$ denote the center coordinates, width, and height, and $c$ is a constant set to 500 based on the average pixel size of targets. This loss function provides smoother gradients and better convergence for small objects.
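To illustrate why a distance-based similarity helps on tiny boxes, the following sketch compares it with plain IoU. It is a minimal illustration under stated assumptions, not the authors' implementation: the exponent is read as the squared Euclidean distance between the two $(cX, cY, W, H)$ vectors divided by $c = 500$, the loss is taken as $1 - S_{NWD}$, and the function names are hypothetical. The original NWD formulation uses half-widths and half-heights and a square root in the exponent; that variant is not shown here.

```python
import math


def nwd_similarity(pred, target, c=500.0):
    """Similarity between two boxes given as (cx, cy, w, h); equals 1.0 for identical boxes."""
    sq_dist = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.exp(-sq_dist / c)


def nwd_loss(pred, target, c=500.0):
    """Assumed regression loss: one minus the similarity."""
    return 1.0 - nwd_similarity(pred, target, c)


def iou(pred, target):
    """Plain IoU for boxes given as (cx, cy, w, h), for comparison only."""
    def corners(box):
        cx, cy, w, h = box
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    ax1, ay1, ax2, ay2 = corners(pred)
    bx1, by1, bx2, by2 = corners(target)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = pred[2] * pred[3] + target[2] * target[3] - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes whose centers are 20 pixels apart: they do not overlap.
pred, target = (100, 100, 10, 10), (120, 100, 10, 10)
print(iou(pred, target), nwd_similarity(pred, target))
```

For the non-overlapping pair, IoU is exactly zero and carries no information about how far apart the boxes are, whereas the similarity stays strictly between 0 and 1 and shrinks smoothly with the offset.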

Additionally, we integrate CoordConv into the DnBlock and head network to handle non-uniform noise. CoordConv appends coordinate channels (horizontal and vertical) to the input, enabling the network to spatialize features effectively. This is particularly useful for preserving edge information in noisy images of photovoltaic panels. The structure involves simple concatenation of coordinate maps, adding minimal computational overhead while enhancing feature representation.
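The concatenation step can be sketched in a few lines. This is an illustration only: it assumes a channels-first feature map stored as nested lists and coordinates normalized to $[-1, 1]$, a common CoordConv convention that is not confirmed here, and `add_coord_channels` is a hypothetical helper. In a real network the result would be fed to an ordinary convolution layer.

```python
def add_coord_channels(feature_map):
    """Append a horizontal and a vertical coordinate channel to a [C][H][W] nested list."""
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    def scale(index, size):
        # Map 0..size-1 onto [-1, 1]; a single row or column gets 0.0.
        return 2.0 * index / (size - 1) - 1.0 if size > 1 else 0.0

    x_channel = [[scale(x, width) for x in range(width)] for _ in range(height)]
    y_channel = [[scale(y, height) for _ in range(width)] for y in range(height)]
    return feature_map + [x_channel, y_channel]


features = [[[0.5, 0.1, 0.9],
             [0.3, 0.7, 0.2]]]  # one channel, 2 rows, 3 columns
augmented = add_coord_channels(features)
print(len(augmented))  # number of channels after concatenation
print(augmented[1])    # horizontal coordinate channel
print(augmented[2])    # vertical coordinate channel
```

Only two extra input channels are added regardless of the feature depth, which is consistent with the small computational overhead described above.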

Our experimental setup uses a publicly available dataset of photovoltaic panel defects, including 2,700 images (1,920 for training, 480 for testing, 300 for validation). Defects are categorized into cracks, broken grids, and spots. We train the model for 200 epochs with an input size of 640×640 pixels, batch size of 64, and Adam optimizer. The learning rate starts at 0.01 with warm-up, weight decay is 0.005, and momentum is 0.9. Evaluations are conducted under various noise conditions: Gaussian noise (mean 0, standard deviations 0.12 and 0.24) and impulse noise (15% and 30%). Performance metrics include mean Average Precision (mAP) and detection speed (frames per second). We also define a precision shift $R$ to quantify noise impact:

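$$ R = \frac{\sum_{i=2}^N (P_1 - P_i)}{N-1}, $$

where $P_1$ is the baseline mAP without noise, $P_i$ ($i = 2, \dots, N$) is the mAP in the $i$-th experiment under noise, and $N = 5$ for the four noise conditions used here. $R$ is thus the average mAP drop, in percentage points, caused by noise. A lower $R$ indicates better noise robustness.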
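As a worked check, the precision shift can be computed as the mean drop from the noise-free mAP, which is the reading that reproduces the reported $R$ values. The sketch below uses two rows of mAP values from Table 1; the function name `precision_shift` is a hypothetical helper.

```python
def precision_shift(maps):
    """Mean mAP drop relative to the noise-free baseline.

    maps[0] is the noise-free mAP (P_1); maps[1:] are the mAPs under noise.
    """
    baseline, noisy = maps[0], maps[1:]
    if not noisy:
        raise ValueError("at least one noisy experiment is required")
    return sum(baseline - p for p in noisy) / len(noisy)


# Order: no noise, Gaussian 0.12, Gaussian 0.24, impulse 15%, impulse 30%.
table1_rows = {
    "YOLOv7s": [93.9, 92.3, 82.5, 88.6, 78.6],
    "Full Model": [96.6, 94.9, 91.4, 91.2, 85.4],
}
for name, maps in table1_rows.items():
    print(name, precision_shift(maps))
```

Up to rounding, the results match the $R$ column of Table 1 (8.40 for the baseline and 5.88 for the full model).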

Ablation studies validate each component’s contribution. We compare eight configurations, built from the baseline YOLOv7s by adding DnBlock, NWD, and CoordConv; Table 1 summarizes the baseline, the variants that add a single component, and the full model. The full improved Dn-YOLOv7 model achieves the highest mAP in every noise condition, at a detection speed of 78.0 fps compared with 88.0 fps for the baseline.

Table 1: Ablation Study Results (mAP in %, speed in fps)

| Experiment | No noise | Gaussian (σ=0.12) | Gaussian (σ=0.24) | Impulse (15%) | Impulse (30%) | Speed (fps) | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv7s | 93.9 | 92.3 | 82.5 | 88.6 | 78.6 | 88.0 | 8.40 |
| + DnBlock | 95.9 | 94.4 | 90.1 | 89.7 | 82.5 | 79.0 | 6.73 |
| + NWD | 95.3 | 94.2 | 88.9 | 89.9 | 80.8 | 85.0 | 6.85 |
| + CoordConv | 95.0 | 93.3 | 88.1 | 88.3 | 78.9 | 80.0 | 7.85 |
| Full Model | 96.6 | 94.9 | 91.4 | 91.2 | 85.4 | 78.0 | 5.88 |

The full model achieves an mAP of 96.6% without noise, 91.4% under strong Gaussian noise (σ=0.24), and 85.4% under 30% impulse noise, with a detection speed of 78.0 fps. The precision shift $R$ falls from 8.40 for the baseline to 5.88, indicating enhanced noise robustness. Comparative experiments with Fast R-CNN, SSD, YOLOv5, YOLOv6, and YOLOv8 are shown in Table 2. Our model has the highest mAP in all four noisy conditions and the lowest $R$, while YOLOv8 is marginally more accurate without noise (96.7% versus 96.6%) and YOLOv5, YOLOv6, and YOLOv8 run faster (91.0 to 118.0 fps versus 78.0 fps).

Table 2: Comparative Model Performance (mAP in %, speed in fps)

| Model | No noise | Gaussian (σ=0.12) | Gaussian (σ=0.24) | Impulse (15%) | Impulse (30%) | Speed (fps) | R |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Fast R-CNN | 73.9 | 69.4 | 62.2 | 68.6 | 62.1 | 30.0 | 8.3 |
| SSD | 71.4 | 68.2 | 62.5 | 65.1 | 59.2 | 32.0 | 7.7 |
| YOLOv5 | 88.4 | 84.6 | 76.3 | 81.2 | 68.5 | 91.0 | 10.8 |
| YOLOv6 | 93.4 | 85.6 | 75.8 | 81.3 | 78.3 | 98.0 | 13.2 |
| YOLOv8 | 96.7 | 93.6 | 90.8 | 90.9 | 83.4 | 118.0 | 7.0 |
| Improved Dn-YOLOv7 | 96.6 | 94.9 | 91.4 | 91.2 | 85.4 | 78.0 | 5.9 |

Visual results confirm that our model reduces false positives and missed detections in noisy images of photovoltaic panels. For instance, in high-noise scenarios, it accurately identifies small defects while other models falter. This underscores the practical value of our approach for real-world solar panel inspection.

In conclusion, the improved Dn-YOLOv7 algorithm effectively addresses the challenges of noise and small targets in photovoltaic panel defect detection. By integrating denoising techniques, advanced loss functions, and coordinate-aware convolutions, we achieve high accuracy and robustness. Future work will focus on optimizing inference speed and testing under diverse noise conditions to enhance applicability in mobile deployments. This research contributes to the reliable monitoring of solar energy systems, supporting the sustainable growth of photovoltaic technology.

The mathematical foundations and experimental validations presented here highlight the model’s capabilities. As solar power continues to expand, such advanced detection methods will play a vital role in maintaining the efficiency and longevity of photovoltaic installations. We believe our work sets a benchmark for small-target detection in noisy environments, with potential extensions to other domains beyond solar panels.
