Lightweight Defect Detection in Solar Panels Using Improved YOLOv5s

As the world increasingly shifts towards renewable energy to mitigate climate change, solar energy has emerged as a critical component due to its abundance and sustainability. Solar panels, which convert sunlight into electricity, are the cornerstone of photovoltaic systems. However, during manufacturing, various defects such as cracks, hot spots, black edges, scratches, and inactive cells can occur, compromising efficiency and longevity. Traditional manual inspection methods are labor-intensive, prone to errors, and costly. With advancements in deep learning, automated defect detection has become feasible, but many existing models face challenges in balancing accuracy, speed, and computational cost, especially for deployment on resource-constrained devices. In this work, I propose a lightweight network based on YOLOv5s, termed LPV-YOLO, for real-time defect detection in solar panels. This approach significantly reduces model parameters and complexity while maintaining high detection accuracy, making it suitable for mobile or edge devices in industrial settings.

The importance of solar panels in the global energy transition cannot be overstated. They harness solar power, a clean and inexhaustible resource, to generate electricity without greenhouse gas emissions. However, the production process of solar panels is complex, and defects can arise from factors like material impurities, mechanical stress, or environmental exposure. These defects, if undetected, can lead to reduced power output, safety hazards, and increased maintenance costs. Therefore, developing efficient and accurate defect detection systems is crucial for ensuring the quality and reliability of solar panels. In recent years, deep learning-based methods have shown promise, but they often require substantial computational resources, limiting their practicality for real-time applications. My goal is to address this gap by designing a lightweight model that achieves a balance between performance and efficiency, specifically tailored for solar panel inspection.

Defect detection in solar panels involves identifying various types of anomalies from images. Common defects include cracks (irregular black shapes), hot spots (bright white patches), black edges (narrow dark lines along edges), scratches (thin lines similar to the background), and inactive cells (large rectangular areas). These defects vary in size, shape, and contrast, making detection challenging. Traditional computer vision techniques often struggle with such variability, leading to the adoption of convolutional neural networks (CNNs). For instance, ResNet-based models with attention mechanisms have been used to enhance feature fusion for crack detection, while YOLO variants have been applied for real-time object detection. However, many of these models are computationally heavy, with high parameter counts, which hinders deployment on devices with limited resources. In my review, I found that lightweight networks like GhostNet and MobileNetV2 offer potential solutions by reducing redundancy in feature maps, but they may sacrifice accuracy. Thus, I aim to integrate lightweight modules with attention mechanisms to optimize both speed and precision for solar panel defect detection.

My proposed LPV-YOLO network is built upon the YOLOv5s architecture, which is known for its balance of speed and accuracy. To reduce the model’s parameter count and computational cost, I introduce several key modifications. First, I leverage the Ghost module, which produces only a fraction of the output feature maps with ordinary convolution and generates the remaining, largely redundant maps from them through cheap linear operations. Specifically, for an input feature map with c channels, height h, and width w, the Ghost module reduces computational cost by approximately a factor of s, where s is the ratio of output feature maps to convolution-generated (intrinsic) feature maps, so each intrinsic map yields s - 1 additional maps through cheap operations. The theoretical speed-up ratio $$r_s$$ can be expressed as:

$$r_s = \frac{n \cdot h' \cdot w' \cdot c \cdot k \cdot k}{\frac{n}{s} \cdot h' \cdot w' \cdot c \cdot k \cdot k + (s-1) \cdot \frac{n}{s} \cdot h' \cdot w' \cdot d \cdot d} \approx \frac{s \cdot c}{s + c - 1} \approx s,$$

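where n is the number of output channels, $$h'$$ and $$w'$$ are the height and width of the output feature map, k is the kernel size of the standard convolution, and d is the kernel size of the linear operations. The approximations hold when d is close to k and s is much smaller than c. To further enhance performance, I incorporate the Mish activation function, which is smooth and non-monotonic and helps avoid overfitting. The Mish function is defined as: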

$$f_{\text{Mish}}(x) = x \cdot \tanh(\ln(1 + e^x)).$$
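As a quick numerical illustration of the Mish definition, the following minimal sketch evaluates it in plain Python. The helper names `softplus` and `mish` are chosen here for illustration, and the rewritten softplus is only a standard trick to avoid overflow for large inputs; it is not taken from the LPV-YOLO implementation.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    # max(x, 0) + ln(1 + e^{-|x|}) equals ln(1 + e^x) but never overflows.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(ln(1 + e^x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, the function passes small negative values through (with a reduced magnitude) and is smooth around zero, while approaching the identity for large positive inputs.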

By combining Ghost modules with Mish, I design two new modules: GhostMConv and C3MGhost. These replace the standard Conv and C3 modules in the backbone and neck of YOLOv5s, leading to a lighter network without significant accuracy loss. The backbone network thus becomes more efficient, focusing on essential features for defect detection in solar panels.
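The speed-up ratio $$r_s$$ given earlier can be checked numerically. The sketch below counts multiply-accumulate operations for a standard convolution and for a Ghost module and compares the exact ratio with the approximation $$s \cdot c / (s + c - 1)$$. The layer dimensions and the choice s = 2 are illustrative assumptions, not values reported for LPV-YOLO, and the function name `ghost_speedup` is hypothetical.

```python
def ghost_speedup(n: int, c: int, h_out: int, w_out: int,
                  k: int, d: int, s: int) -> float:
    """Exact ratio of standard-convolution cost to Ghost-module cost."""
    if n % s != 0:
        raise ValueError("n must be divisible by s")
    # Cost of producing all n output maps with ordinary k x k convolution.
    standard = n * h_out * w_out * c * k * k
    # Cost of producing the n/s intrinsic maps with ordinary convolution.
    primary = (n // s) * h_out * w_out * c * k * k
    # Cost of the (s - 1) cheap d x d operations applied to each intrinsic map.
    cheap = (s - 1) * (n // s) * h_out * w_out * d * d
    return standard / (primary + cheap)


if __name__ == "__main__":
    # Assumed example layer: 64 -> 128 channels on an 80 x 80 map, k = d = 3.
    n, c, s = 128, 64, 2
    exact = ghost_speedup(n=n, c=c, h_out=80, w_out=80, k=3, d=3, s=s)
    approx = s * c / (s + c - 1)
    print(f"exact r_s = {exact:.4f}, s*c/(s+c-1) = {approx:.4f}, s = {s}")
```

With s = 2 the ratio is just under 2, which is in line with the roughly halved parameter count and GFLOPs reported later for the full network.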

To compensate for any accuracy reduction due to lightweight design, I propose an attention pyramid pooling module. This module integrates the SimAM attention mechanism with a spatial pyramid pooling (SPP) layer. SimAM is a parameter-free attention module that evaluates neuron importance based on energy functions, enhancing both spatial and channel features. For a neuron t in a channel, the energy function $$e_t$$ is given by:

$$e_t = \frac{4(\hat{\sigma}^2 + \lambda)}{(t - \hat{\mu})^2 + 2\hat{\sigma}^2 + 2\lambda},$$

where $$\hat{\mu}$$ and $$\hat{\sigma}^2$$ are the mean and variance of all neurons in the channel, and $$\lambda$$ is a regularization coefficient. Lower energy indicates greater importance. The output feature $$\tilde{X}$$ is then computed as:

$$\tilde{X} = \text{sigmoid}\left(\frac{1}{E}\right) \otimes X,$$

where E is the matrix of all $$e_t$$ values across channels and spatial dimensions. By embedding SimAM into the SPP layer, I create the MSSPPF module, which performs multi-scale feature fusion without adding parameters. This helps capture defects of varying sizes in solar panels, such as small scratches or large inactive areas.
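To make the SimAM weighting concrete, here is a minimal sketch that applies the energy function and the sigmoid gating above to a single channel stored as a nested list. It assumes the population variance over the channel and a default $$\lambda$$ of 1e-4; both are assumptions made for this illustration, and the function names are hypothetical rather than part of the LPV-YOLO code.

```python
import math


def simam_weights(channel, lam=1e-4):
    """Return sigmoid(1 / e_t) for every neuron t in one H x W channel."""
    values = [v for row in channel for v in row]
    mu = sum(values) / len(values)
    # Assumption: population variance over all neurons in the channel.
    var = sum((v - mu) ** 2 for v in values) / len(values)
    weights = []
    for row in channel:
        w_row = []
        for t in row:
            # 1 / e_t = ((t - mu)^2 + 2 var + 2 lam) / (4 (var + lam))
            inv_energy = ((t - mu) ** 2 + 2 * var + 2 * lam) / (4 * (var + lam))
            w_row.append(1.0 / (1.0 + math.exp(-inv_energy)))
        weights.append(w_row)
    return weights


def simam_channel(channel, lam=1e-4):
    """Scale each neuron by its SimAM weight (element-wise product)."""
    weights = simam_weights(channel, lam)
    return [[t * w for t, w in zip(row, w_row)]
            for row, w_row in zip(channel, weights)]


if __name__ == "__main__":
    # A flat channel with one outlier, e.g. a bright hot-spot pixel.
    channel = [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]]
    for row in simam_weights(channel):
        print([round(w, 3) for w in row])
```

The neuron that deviates most from the channel mean has the lowest energy and therefore receives the largest weight, and no learnable parameters are involved.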

Additionally, I embed the SE (Squeeze-and-Excitation) channel attention module in the neck network. SE attention recalibrates channel weights to emphasize informative features. It consists of a squeeze operation (global average pooling) and an excitation operation (fully connected layers with sigmoid activation). This allows the model to focus on critical defect-related channels, improving detection accuracy for solar panel anomalies. The integration of SE attention ensures that the model effectively distinguishes defects from background noise, which is common in solar panel images due to reflections or dirt.
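The squeeze and excitation steps can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions: biases are omitted, the ReLU between the two fully connected layers follows the usual SE design, and the weight matrices `w1` and `w2` are hand-picked placeholders rather than trained values from LPV-YOLO.

```python
import math


def se_block(x, w1, w2):
    """Apply squeeze-and-excitation to x (list of C channels, each H x W).

    w1 has shape (C/r) x C and w2 has shape C x (C/r), where r is the
    reduction ratio. Returns the rescaled feature map and channel scales.
    """
    # Squeeze: global average pooling, one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, first FC layer followed by ReLU.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    # Excitation, second FC layer followed by sigmoid.
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Recalibration: multiply every channel by its scale.
    out = [[[v * s for v in row] for row in ch] for ch, s in zip(x, scale)]
    return out, scale


if __name__ == "__main__":
    x = [[[1.0, 1.0], [1.0, 1.0]],   # channel 0
         [[3.0, 3.0], [3.0, 3.0]]]   # channel 1
    w1 = [[1.0, 1.0]]                # placeholder weights, C = 2, r = 2
    w2 = [[1.0], [-1.0]]
    out, scale = se_block(x, w1, w2)
    print("channel scales:", [round(s, 3) for s in scale])
```

In this toy setting the first channel is kept almost unchanged while the second is strongly suppressed, which is the recalibration behaviour the neck relies on to emphasize defect-related channels.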

For training and evaluation, I use a dataset of solar panel defects, which includes images with annotations for five defect types: crack, hot spot, black edge, scratch, and inactive cell. To address data imbalance and augment the dataset, I employ CycleGAN for generating synthetic defect images. CycleGAN uses cycle-consistency loss to translate images between domains without paired data. The loss function combines GAN loss and cycle-consistency loss:

$$L = L_{\text{GAN}} + L_{\text{cycle}},$$

where $$L_{\text{GAN}}$$ ensures realistic image generation, and $$L_{\text{cycle}}$$ preserves content consistency. This augmentation expands the dataset, enhancing model generalization for solar panel defect detection.
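The cycle-consistency term can be illustrated with a toy example. The sketch below treats images as flat lists of pixel values and uses the mean absolute (L1) reconstruction error in both translation directions; the two mapping functions are trivial stand-ins for trained generators, and all names are hypothetical.

```python
def l1_mean(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(g, f, x, y):
    """L_cycle = |f(g(x)) - x| + |g(f(y)) - y|, averaged over pixels.

    g maps domain X (e.g. defect-free) to domain Y (e.g. defective),
    f maps Y back to X.
    """
    forward = l1_mean(f(g(x)), x)    # X -> Y -> X should reproduce x
    backward = l1_mean(g(f(y)), y)   # Y -> X -> Y should reproduce y
    return forward + backward


def total_loss(l_gan, l_cycle):
    """Combined objective as written in the text: L = L_GAN + L_cycle."""
    return l_gan + l_cycle


# Toy stand-ins for the two generators (assumptions for illustration only).
def to_defect(img):
    return [p + 0.25 for p in img]


def to_clean(img):
    return [p - 0.25 for p in img]


if __name__ == "__main__":
    x = [0.0, 0.5, 1.0]
    y = [0.25, 0.75]
    print("consistent pair:", cycle_consistency_loss(to_defect, to_clean, x, y))
    print("broken inverse: ", cycle_consistency_loss(to_defect, lambda i: i, x, y))
```

When the two mappings invert each other the term is zero; any content lost or altered in the round trip increases it, which is what keeps the synthetic defect images anchored to the original panel layout.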

In my experiments, I set the input image size to 640×640 and train the model for 300 epochs using the SGD optimizer with a batch size of 16. The initial learning rate is 0.001, the momentum is 0.937, and the weight decay is 0.0005. I evaluate the model using mean average precision (mAP), parameter count, model size, computational cost (GFLOPs), and frames per second (FPS). The mAP is calculated at different IoU thresholds, with IoU defined as the intersection over union between predicted and ground-truth bounding boxes:

$$\text{IoU} = \frac{|B \cap B_{\text{GT}}|}{|B \cup B_{\text{GT}}|},$$

where B is the predicted box, $$B_{\text{GT}}$$ is the ground-truth box, and $$|\cdot|$$ denotes area. Precision (P) and recall (R) are derived from the numbers of true positives (TP), false positives (FP), and false negatives (FN):

$$P = \frac{N_{\text{TP}}}{N_{\text{TP}} + N_{\text{FP}}}, \quad R = \frac{N_{\text{TP}}}{N_{\text{TP}} + N_{\text{FN}}}.$$

The average precision (AP) for each class is the area under the P-R curve, and mAP is the mean across all classes. These metrics are crucial for assessing the model’s performance on solar panel defects.
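The metrics above can be sketched as follows. Boxes are assumed to be in `(x1, y1, x2, y2)` corner format, and the AP routine uses all-point interpolation with a monotone precision envelope, which is one common convention and not necessarily the exact procedure used to produce the results reported here. All function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """P = TP / (TP + FP), R = TP / (TP + FN); 0.0 if undefined."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    return p, r


def average_precision(recalls, precisions):
    """Area under the P-R curve (recalls must be sorted ascending)."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing when read from left to right.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall increases.
    return sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]
               for i in range(len(mrec) - 1))


def mean_average_precision(ap_per_class):
    """mAP: mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


if __name__ == "__main__":
    print("IoU:", iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print("P, R:", precision_recall(tp=8, fp=2, fn=2))
    print("AP:", average_precision([0.5, 1.0], [1.0, 0.5]))
```

A detection counts as a true positive only when its IoU with a ground-truth box of the same class reaches the chosen threshold, so the threshold directly controls the TP, FP, and FN counts that feed into P and R.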

The results demonstrate that LPV-YOLO achieves an mAP of 93.8% on the solar panel defect dataset, only 0.6 percentage points below the original YOLOv5s (94.4%). In exchange, the parameter count is reduced by 49%, the model size by 46%, and the computational cost by 50%. Specifically, LPV-YOLO has 3.71 million parameters, a model size of 7.4 MB, and 8.3 GFLOPs, and it runs at 70.42 FPS, which is below the 91.74 FPS of the baseline but still suitable for real-time inspection of solar panels. The per-class average precision is high: 93.2% for cracks, 92.5% for hot spots, 97.1% for black edges, 87.0% for scratches, and 99.1% for inactive cells. Scratches, which are thin and similar to the background, remain the hardest class, but the model is robust across the other defect types.

To validate the effectiveness of each module, I conduct ablation studies. The baseline YOLOv5s achieves 94.4% mAP with 7.23 million parameters. Replacing the standard Conv and C3 modules with GhostMConv and C3MGhost (the MGhost configuration) reduces the parameter count to 3.70 million but lowers mAP to 92.1%. Adding MSSPPF recovers mAP to 93.3% without increasing the parameter count. Finally, adding SE attention (the full LPV-YOLO) raises mAP to 93.8% with 3.71 million parameters. The table below summarizes these results:

| Model | mAP (%) | Parameters (10⁶) | Size (MB) | GFLOPs | FPS |
|---|---|---|---|---|---|
| YOLOv5s (Baseline) | 94.4 | 7.23 | 13.7 | 16.5 | 91.74 |
| + MGhost | 92.1 | 3.70 | 7.42 | 8.2 | 67.11 |
| + MGhost + MSSPPF | 93.3 | 3.70 | 7.36 | 8.2 | 65.52 |
| LPV-YOLO (Full) | 93.8 | 3.71 | 7.40 | 8.3 | 70.42 |
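The relative savings quoted for LPV-YOLO follow directly from the first and last rows of the ablation table. The short sketch below recomputes them; the dictionary keys are illustrative names, and the numbers are copied from the table.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


# Values taken from the ablation table (baseline vs. full model).
baseline = {"map": 94.4, "params_m": 7.23, "size_mb": 13.7,
            "gflops": 16.5, "fps": 91.74}
lpv_yolo = {"map": 93.8, "params_m": 3.71, "size_mb": 7.40,
            "gflops": 8.3, "fps": 70.42}

if __name__ == "__main__":
    for key in ("params_m", "size_mb", "gflops", "fps"):
        pct = relative_reduction(baseline[key], lpv_yolo[key])
        print(f"{key}: {baseline[key]} -> {lpv_yolo[key]} ({pct:.1f}% lower)")
    drop = baseline["map"] - lpv_yolo["map"]
    print(f"mAP drop: {drop:.1f} percentage points")
```

The halving of parameters and GFLOPs does not translate into a higher frame rate on the test hardware: FPS is lower than for the baseline, so the gain of the lightweight design is in memory footprint and compute budget rather than raw throughput.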

Furthermore, I compare LPV-YOLO with other end-to-end detection networks on the solar panel defect dataset. The table below highlights that LPV-YOLO offers a favorable balance of accuracy and efficiency:

| Model | mAP (%) | Parameters (10⁶) | Size (MB) | GFLOPs | FPS |
|---|---|---|---|---|---|
| YOLOv5s | 94.4 | 7.23 | 13.7 | 16.5 | 91.74 |
| YOLOv7 | 88.0 | 9.33 | 19.0 | 26.7 | 107.53 |
| SSD300 | 77.7 | 34.30 | 90.0 | 51.6 | 71.00 |
| RetinaNet | 72.2 | 41.90 | 139.0 | 212.0 | 42.90 |
| LPV-YOLO | 93.8 | 3.71 | 7.4 | 8.3 | 70.42 |

The lightweight design of LPV-YOLO makes it particularly advantageous for deploying on mobile devices or embedded systems in solar panel manufacturing plants. By reducing computational demands, it enables real-time inspection without expensive hardware, potentially lowering costs and improving production throughput. The integration of attention mechanisms ensures that even subtle defects in solar panels, such as fine scratches or faint hot spots, are detected with high precision. This is critical for maintaining the quality standards of solar panels, which directly impact energy conversion efficiency and long-term reliability.

In conclusion, my proposed LPV-YOLO network effectively addresses the challenges of defect detection in solar panels by combining lightweight modules with attention mechanisms. The use of GhostMConv and C3MGhost reduces parameters and computations, while MSSPPF and SE attention enhance feature representation and channel focus. Experimental results confirm that the model maintains high accuracy with significantly lower resource requirements, making it a practical solution for industrial applications. Future work could involve extending this approach to other renewable energy components or integrating it with IoT systems for continuous monitoring of solar panels in the field. As the demand for solar energy grows, such automated detection systems will play a vital role in ensuring the sustainability and efficiency of solar power generation.

The mathematical formulations and architectural innovations presented here underscore the importance of optimizing deep learning models for specific tasks like solar panel inspection. By prioritizing both speed and accuracy, LPV-YOLO sets a benchmark for lightweight defect detection in solar panels, contributing to the broader goal of enhancing renewable energy technologies. As I continue to refine this model, I aim to explore adaptive learning techniques and multi-modal data fusion to further improve performance across diverse environmental conditions and defect types in solar panels.