Research on Hot Spot Detection in Photovoltaic Panels Using Thermal Infrared Images

In recent years, the global shift toward renewable energy has emphasized the importance of photovoltaic systems, which convert solar energy into electricity. However, solar panels operating outdoors are susceptible to partial shading from dust, debris, or bird droppings, which causes localized overheating known as hot spots. These hot spots can significantly reduce panel efficiency and pose safety risks, including potential fire hazards. Effective hot spot detection is therefore crucial for maintaining the performance and longevity of photovoltaic systems. This study explores two primary approaches for hot spot detection: traditional image processing techniques and machine learning-based object detection algorithms. By comparing these methods, I aim to find a robust solution that detects hot spots in thermal infrared images of solar panels accurately and efficiently.

Traditional image processing methods rely on manipulating pixel-level features, such as color, intensity, and edges, to segment or detect objects. For hot spot detection in photovoltaic panels, these methods often involve region segmentation and edge detection algorithms. Region segmentation techniques, like thresholding, histogram equalization, and HSV color extraction, aim to isolate hot spots based on their distinct thermal signatures. For instance, thresholding involves converting an image to grayscale and applying a predefined or adaptive threshold to separate hot spots from the background. The process can be mathematically represented as:

$$ I_{binary}(x,y) = \begin{cases} 1 & \text{if } I_{gray}(x,y) \geq T \\ 0 & \text{otherwise} \end{cases} $$

where $ I_{gray} $ is the grayscale image, $ T $ is the threshold value, and $ I_{binary} $ is the resulting binary image. However, this method struggles when the background and hot spots have similar intensities, leading to false positives. Histogram equalization enhances contrast by redistributing pixel intensities, but it can amplify noise in thermal images, making hot spot detection challenging. HSV color extraction leverages the hue, saturation, and value components to segment hot spots based on color differences. The transformation from RGB to HSV is given by:

$$ V = \max(R, G, B) $$
$$ S = \begin{cases} \frac{V - \min(R, G, B)}{V} & \text{if } V \neq 0 \\ 0 & \text{otherwise} \end{cases} $$
$$ H = \begin{cases} 60 \cdot \frac{G - B}{V - \min(R, G, B)} & \text{if } V = R \\ 120 + 60 \cdot \frac{B - R}{V - \min(R, G, B)} & \text{if } V = G \\ 240 + 60 \cdot \frac{R - G}{V - \min(R, G, B)} & \text{if } V = B \end{cases} $$

While this approach can be intuitive, its effectiveness depends heavily on image quality, and it often fails in noisy or low-resolution thermal images commonly captured by infrared cameras.
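
The thresholding rule and the RGB-to-HSV conversion above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the study: the helper names (`rgb_to_hsv`, `threshold`, `hot_mask`), the convention of setting $ H = 0 $ for grey pixels, and the hue, saturation, and value limits in `hot_mask` are all assumptions. The limits presuppose a thermal palette in which hot regions are rendered red to yellow.

```python
def rgb_to_hsv(r, g, b):
    """Convert one RGB pixel (floats in [0, 1]) to (H in degrees, S, V)."""
    v = max(r, g, b)
    delta = v - min(r, g, b)
    s = delta / v if v != 0 else 0.0
    if delta == 0:
        h = 0.0  # hue is undefined for grey pixels; 0 is an assumed convention
    elif v == r:
        h = 60.0 * (g - b) / delta
    elif v == g:
        h = 120.0 + 60.0 * (b - r) / delta
    else:
        h = 240.0 + 60.0 * (r - g) / delta
    return h % 360.0, s, v  # wrap negative hues into [0, 360)


def threshold(gray, t):
    """Binarise a grayscale image (list of rows): 1 where pixel >= t, else 0."""
    return [[1 if px >= t else 0 for px in row] for row in gray]


def hot_mask(rgb_image, hue_max=60.0, s_min=0.5, v_min=0.8):
    """Flag saturated, bright pixels whose hue lies between red and yellow.

    The three limits are hypothetical and depend on the camera palette.
    """
    mask = []
    for row in rgb_image:
        out = []
        for r, g, b in row:
            h, s, v = rgb_to_hsv(r, g, b)
            out.append(1 if h <= hue_max and s >= s_min and v >= v_min else 0)
        mask.append(out)
    return mask
```

For example, `threshold([[10, 200], [128, 127]], 128)` marks only the two pixels at or above 128, and `hot_mask` keeps a pure red pixel while rejecting a pure blue one. Both functions make a fixed per-pixel decision, which is why they break down when background and hot spot intensities overlap.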

Edge detection methods, such as the Sobel and Canny operators, focus on identifying boundaries between regions by computing gradients. The Sobel operator uses convolutional kernels to approximate horizontal and vertical gradients:

$$ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * I \quad \text{and} \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} * I $$
$$ G = \sqrt{G_x^2 + G_y^2} $$

where $ G $ is the gradient magnitude. This method is sensitive to noise and may detect irrelevant edges in complex backgrounds. The Canny edge detector improves upon this by applying Gaussian smoothing, non-maximum suppression, and hysteresis thresholding. In addition to the gradient magnitude $ G $, it uses the gradient direction, computed as:

$$ \theta = \arctan\left(\frac{G_y}{G_x}\right) $$

and edges are retained only if they exceed certain thresholds. However, both methods require careful parameter tuning and perform poorly in images with high noise levels, which are typical in thermal imaging of photovoltaic panels. The limitations of traditional image processing highlight the need for more adaptive approaches, such as machine learning, which can learn complex features from data.
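
To make the gradient computation concrete, the following sketch applies the two Sobel kernels to a small grayscale image stored as a list of rows and returns the magnitude and direction at each interior pixel. It is an illustration under stated assumptions rather than a full edge detector: the function names are hypothetical, border pixels are simply left at zero, the kernels are applied as a cross-correlation (no kernel flip, as is common in image libraries), and `math.atan2` is used in place of a plain arctangent so that $ G_x = 0 $ does not cause a division by zero.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel, y, x):
    """Weighted sum of the 3x3 neighbourhood centred at row y, column x."""
    return sum(
        kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )


def sobel(image):
    """Return (magnitude, direction) maps; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = apply_kernel(image, SOBEL_X, y, x)
            gy = apply_kernel(image, SOBEL_Y, y, x)
            mag[y][x] = math.hypot(gx, gy)   # sqrt(gx^2 + gy^2)
            ang[y][x] = math.atan2(gy, gx)   # direction in radians
    return mag, ang
```

On a vertical step edge (columns `0, 0, 10, 10`), the interior magnitude is 40 with direction 0, meaning the gradient points along the x axis. Every isolated noisy pixel produces a similar response, which illustrates why Sobel output on raw thermal images needs smoothing and thresholding before it is usable.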

To address the shortcomings of traditional methods, I investigated machine learning-based object detection algorithms, which leverage deep learning to automatically extract features from images. Among these, the YOLOv4 architecture has shown promise due to its balance of speed and accuracy. Building on it, I developed an improved version of YOLOv4 tailored for hot spot detection in photovoltaic panels. The original YOLOv4 uses a CSPDarkNet53 backbone for feature extraction; I replaced it with MobileNetV3 to reduce computational complexity and improve efficiency. MobileNetV3 incorporates depthwise separable convolutions, linear bottleneck structures, and an h-swish activation function, defined as:

$$ \text{h-swish}(x) = x \cdot \frac{\text{ReLU6}(x + 3)}{6} $$

which is used in place of ReLU6 in parts of the network. It approximates the swish activation with a piecewise-linear function, so it improves accuracy without the cost of computing a sigmoid. Additionally, I substituted large convolutional kernels (e.g., 5×5 and 7×7) with stacks of 3×3 kernels to further decrease the number of parameters. The MobileNetV3 backbone also integrates a Squeeze-and-Excitation (SE) attention mechanism, which recalibrates channel-wise feature responses by computing squeeze and excitation operations:

$$ z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i,j) $$
$$ s = \sigma(W_2 \delta(W_1 z)) $$

where $ u_c $ is the $ c $-th channel of the $ H \times W $ input feature map, $ z_c $ is the squeezed (globally average-pooled) feature, $ \sigma $ is the sigmoid function, $ \delta $ is ReLU, and $ W_1 $ and $ W_2 $ are the weights of two fully connected layers. Each channel $ u_c $ is then rescaled by its weight $ s_c $, which allows the model to emphasize the features relevant for hot spot detection. The overall network architecture includes an SPP module for multi-scale feature pooling, a PANet for feature aggregation, and a YOLO Head for output prediction. The loss function combines localization, confidence, and classification losses:

$$ L = \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (w_i - \hat{w}_i)^2 + (h_i - \hat{h}_i)^2 \right] - \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ C_i \log(\hat{C}_i) + (1 - C_i) \log(1 - \hat{C}_i) \right] - \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left[ C_i \log(\hat{C}_i) + (1 - C_i) \log(1 - \hat{C}_i) \right] - \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left[ p_i(c) \log(\hat{p}_i(c)) + (1 - p_i(c)) \log(1 - \hat{p}_i(c)) \right] $$

where $ S $ is the grid size, $ B $ is the number of bounding boxes, $ \mathbb{1}_{ij}^{\text{obj}} $ indicates if the box contains an object, and $ \lambda $ terms are weighting factors.
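
The h-swish activation and the SE squeeze and excitation steps described above are simple enough to sketch without a deep learning framework. The code below follows the equations as written in this section (a sigmoid gate, ReLU in the hidden layer, no bias terms) and adds the final channel-wise rescaling. The function names, the nested-list tensor layout, and the weight shapes are assumptions made for illustration, not details taken from the trained model.

```python
import math


def relu6(x):
    return min(max(x, 0.0), 6.0)


def h_swish(x):
    """h-swish(x) = x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def se_recalibrate(u, w1, w2):
    """Apply squeeze-and-excitation to a feature map.

    u  : list of C channels, each an H x W list of rows
    w1 : (C/r) x C weight matrix, w2 : C x (C/r) weight matrix
    Returns (rescaled feature map, channel weights s).
    """
    # Squeeze: global average pooling gives one value z_c per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in u]
    # Excitation: s = sigmoid(W2 * ReLU(W1 * z)).
    hidden = [max(a, 0.0) for a in matvec(w1, z)]
    s = [sigmoid(a) for a in matvec(w2, hidden)]
    # Rescale every pixel of channel c by its weight s_c.
    scaled = [[[s_c * px for px in row] for row in ch] for s_c, ch in zip(s, u)]
    return scaled, s
```

As a quick check, `h_swish(-3.0)` and `h_swish(0.0)` are both zero, while `h_swish(3.0)` returns 3.0, so the function is the identity for inputs of 3 and above. In `se_recalibrate`, a channel whose excitation weight is close to 1 passes through almost unchanged, while a channel with a small weight is suppressed.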

For training and evaluation, I constructed a dataset of thermal infrared images featuring hot spots in photovoltaic panels. The dataset included 1,410 images with a resolution of 560×350 pixels, obtained through a combination of field captures and simulated hot spots using materials like toothpaste to mimic shading effects. Data augmentation techniques, such as rotation, scaling, and mirroring, were applied to increase diversity and prevent overfitting. The images were annotated using bounding boxes to label hot spot regions. The dataset was split into training and testing sets in an 8:2 ratio, and transfer learning was employed by initializing the model with pre-trained weights from ImageNet to accelerate convergence.
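
Two of the data preparation steps above, the 8:2 split and mirroring with bounding box annotations, can be sketched as follows. This is a hedged illustration: the function names, the fixed random seed, and the `(x_min, y_min, x_max, y_max)` pixel box format are assumptions, since the annotation format and the exact splitting procedure are not specified here.

```python
import random


def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def mirror_box(box, image_width):
    """Horizontally mirror a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    # The left and right edges swap roles after mirroring.
    return (image_width - x_max, y_min, image_width - x_min, y_max)
```

With 1,410 images, `split_dataset(range(1410))` yields 1,128 training and 282 testing items. For a 560-pixel-wide image, `mirror_box((100, 50, 160, 90), 560)` gives `(400, 50, 460, 90)`. When augmented copies are generated, splitting before augmenting keeps variants of the same source image from appearing in both sets.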

Experimental results demonstrated the effectiveness of the improved YOLOv4 model. Performance was evaluated using average precision (AP), intersection over union (IoU), precision, and recall. The improved model achieved an AP of 93.42%, an IoU of 92.31%, a precision of 94.36%, and a recall of 92.27%. It outperformed SSD and the original YOLOv4 on all four metrics and Faster R-CNN on AP, precision, and recall; Faster R-CNN retained the highest IoU (93.61%). The table below summarizes the comparison:

| Model | AP (%) | IoU (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| SSD | 87.29 | 89.69 | 87.54 | 85.39 |
| Faster R-CNN | 92.17 | 93.61 | 91.26 | 90.68 |
| YOLOv4 | 90.03 | 91.81 | 91.28 | 90.12 |
| Improved YOLOv4 | 93.42 | 92.31 | 94.36 | 92.27 |
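
For readers less familiar with the metrics in the table, the sketch below shows one common way to compute IoU for a pair of boxes and to derive precision and recall for a single image. It is not the evaluation code behind the reported numbers: the function names, the greedy matching strategy, and the 0.5 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match predicted boxes to ground-truth boxes.

    predictions are assumed to be sorted by descending confidence.
    """
    unmatched = list(ground_truth)
    tp = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box matches at most once
            tp += 1
    fp = len(predictions) - tp      # predictions with no matching hot spot
    fn = len(unmatched)             # hot spots that were missed
    precision = tp / (tp + fp) if predictions else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    return precision, recall
```

Two identical boxes give an IoU of 1.0, and two 10×10 boxes offset by half their width give 1/3. With one correct and one spurious prediction against two labelled hot spots, `precision_recall` returns `(0.5, 0.5)`.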

The precision-recall and F1 score curves further validated the model’s robustness, with the improved YOLOv4 showing higher area-under-the-curve values than the compared models. Visual inspection of the detection results showed that the model localized hot spots accurately, with no false positives or missed detections in the inspected images, even when an image contained multiple hot spots or a complex background. These results are consistent with the SE attention mechanism improving feature discrimination for photovoltaic panel analysis.
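
The F1 score and the area under the precision-recall curve mentioned above can be computed as in the following sketch. The function names and the input format (a list of `(confidence, is_true_positive)` pairs) are assumptions, and the all-point interpolation used for AP is one common convention; the interpolation scheme behind the reported AP is not stated.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def average_precision(scored_hits, num_ground_truth):
    """Area under the precision-recall curve (all-point interpolation).

    scored_hits: one (confidence, is_true_positive) pair per detection.
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = 0
    for rank, (_, hit) in enumerate(ranked, start=1):
        tp += 1 if hit else 0
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    # Replace each precision with the maximum precision to its right,
    # which removes the zigzag from the raw curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

For instance, `f1_score(0.9436, 0.9227)`, using the precision and recall reported for the improved model, evaluates to roughly 0.933. Three detections ranked hit, miss, hit against two labelled hot spots give an AP of 5/6.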

In conclusion, this research underscores the limitations of traditional image processing methods for hot spot detection in photovoltaic panels, which are highly dependent on image quality and prone to errors in noisy thermal images. The proposed improved YOLOv4 model, with its MobileNetV3 backbone and attention mechanisms, offers a superior alternative by leveraging deep learning to achieve high accuracy and efficiency. The experimental outcomes confirm its potential for practical applications in monitoring and maintaining solar energy systems. Future work could focus on integrating real-time detection capabilities and expanding the dataset to include more diverse environmental conditions to further enhance the model’s generalization for photovoltaic infrastructure.