Hot Spot Detection Method of Solar Panel Based on Thermal Infrared Images

1. Introduction

With the adoption of the “carbon peaking” and “carbon neutrality” goals, solar power generation has become an important branch of the new energy industry. Solar panels exposed to the outdoor environment for long periods are highly susceptible to hot spot effects caused by shading and stains. Hot spots not only reduce the power generation efficiency of the photovoltaic system but also threaten the safe and stable operation of the photovoltaic power station. The detection of hot spots on solar panels is therefore of great significance.

At present, there are two main methods for hot spot detection: traditional image processing algorithms and machine learning-based target detection algorithms. The traditional image processing algorithm analyzes the color, texture, and other features of the image to detect hot spots. However, this method is highly dependent on the quality of the image and is easily affected by factors such as illumination and background. The machine learning-based target detection algorithm uses a large amount of labeled data for training to achieve automatic hot spot detection. This method has higher accuracy and robustness but requires a large amount of training data and computational resources.

In this paper, we study the hot spot detection methods based on traditional image processing and machine learning. We analyze the advantages and disadvantages of different methods and propose an improved YOLOv4 hot spot detection method. The experimental results show that the proposed method has high accuracy and robustness and can effectively detect hot spots on solar panels.

2. Hot Spot Detection Based on Traditional Image Processing

2.1 Region Segmentation-based Hot Spot Detection

2.1.1 Threshold Segmentation

The threshold segmentation method thresholds the image and extracts the target object by removing the background. The process is as follows: first, convert the image to grayscale; then, choose a grayscale reference value, either experimentally or with an adaptive method; finally, set the pixels whose grayscale values are above the reference value to one of black or white and the remaining pixels to the other, which completes the extraction of the target.
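The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in this paper: the image is a nested list of 8-bit grayscale values, the function names are hypothetical, and Otsu's method is assumed as one possible "adaptive method" for choosing the reference value.

```python
def threshold_segment(gray, t):
    """Binary mask: 255 (white) where the pixel is brighter than t, else 0."""
    return [[255 if px > t else 0 for px in row] for row in gray]


def otsu_threshold(gray):
    """Adaptive reference value via Otsu's method (an assumed choice)."""
    hist = [0] * 256
    for row in gray:
        for px in row:
            hist[px] += 1
    total = sum(hist)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split here
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (m0 - m1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy 3x4 "thermal" image: three bright (hot) pixels on a dark background.
gray = [
    [30, 32, 31, 200],
    [29, 210, 205, 33],
    [31, 30, 28, 32],
]
t = otsu_threshold(gray)
mask = threshold_segment(gray, t)
```

When the background grayscale approaches that of the hot spot, no single value of `t` separates the two classes, which is the failure mode discussed below.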

Threshold segmentation hot spot detection is shown in Figure 1. When the background grayscale is similar to the target, as shown in Figure 1(b), there are a large number of false detections in the image. This shows that the threshold segmentation method is highly dependent on the similarity between the target and the background and is very sensitive to the quality of the image.

Figure 1	Threshold Segmentation Hot Spot Detection
(a) Image to be Detected	The original solar panel thermal infrared image with hot spots.
(b) Detection Result	The result after threshold segmentation, where hot spots are extracted, but there are also some misclassifications.

2.1.2 HSV Color Extraction

The HSV color extraction method performs color extraction and segmentation on the thermal infrared image in the HSV color model. The color parameters of the HSV model are hue (H), saturation (S), and value (V). H is measured as an angle and ranges from 0° to 360°: red is at 0°, green is at 120° measured counterclockwise from red, and blue is at 240°. S represents the degree to which the color approaches the pure spectral color, and V represents the brightness of the color.
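A minimal sketch of hue-based extraction, using only the `colorsys` module from the Python standard library. The function names, the hue range (40° to 70°), and the saturation and value floors are illustrative assumptions; the hue range that actually corresponds to a hot spot depends on the color palette of the thermal imager.

```python
import colorsys


def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (H in degrees, S in [0, 1], V in [0, 1])."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


def hue_mask(rgb_image, h_lo, h_hi, s_min=0.5, v_min=0.5):
    """True where a pixel's hue lies in [h_lo, h_hi] and it is vivid enough."""
    mask = []
    for row in rgb_image:
        out = []
        for (r, g, b) in row:
            h, s, v = rgb_to_hsv_degrees(r, g, b)
            out.append(h_lo <= h <= h_hi and s >= s_min and v >= v_min)
        mask.append(out)
    return mask


# Hypothetical palette: hot pixels rendered yellow, background purple.
image = [
    [(60, 20, 120), (255, 230, 40)],
    [(60, 20, 120), (60, 20, 120)],
]
hot = hue_mask(image, h_lo=40.0, h_hi=70.0)
```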

HSV color extraction hot spot detection is shown in Figure 3. In the HSV model images in Figure 3(b) and Figure 3(e), the color of the hot spot part is significantly darker than that of the normal part. This indicates that the HSV color extraction method can perform more intuitive color extraction and segmentation. However, the detection result in Figure 3(c) and Figure 3(f) shows that the hot spot detection effect depends on the similarity between the hot spot and the background grayscale and is highly dependent on the quality of the image to be tested.

Figure 3	HSV Color Extraction Hot Spot Detection
(a) Image to be Detected 1	A thermal infrared image of a solar panel with hot spots.
(b) HSV Color Space Effect 1	The image converted to the HSV color space, where the hot spot area shows different color characteristics.
(c) Detection Result 1	The result of hot spot extraction, with some hot spots detected accurately.
(d) Image to be Detected 2	Another thermal infrared image for testing.
(e) HSV Color Space Effect 2	The HSV color space conversion result of the second image.
(f) Detection Result 2	The hot spot detection result of the second image, with varying degrees of accuracy.

2.2 Edge Detection-based Hot Spot Detection

2.2.1 Sobel Operator Gradient Detection

The Sobel operator gradient detection method, also known as image gradient calculation, obtains the first-order gradient of a digital image in order to locate the edges of the target object. For each pixel, it computes a weighted difference of the grayscale values in the neighborhood above, below, left, and right of that pixel, so that the response reaches an extreme value at object edges and the edge trajectory can be traced. The Sobel operator is a discrete difference operator that approximates the gradient of the image brightness function; applied at any point in the image, it yields the corresponding gradient vector or its normal vector.

In hot spot detection, the maximum response obtained by convolving each pixel with the operator templates is taken as the output value of that pixel, which gives the edge image of the hot spot target. However, noise edges in the background are detected as well. The method is therefore suitable only for images with obvious hot spot boundaries, a uniform background, and no noise.
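The following sketch applies the two standard 3×3 Sobel templates to a nested-list grayscale image and, as described above, keeps the larger of the two absolute responses at each pixel. The function name is hypothetical, and border pixels are simply left at zero, which is a simplifying assumption.

```python
# Standard 3x3 Sobel templates for horizontal and vertical differences.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(gray):
    """Per-pixel edge strength: max(|Gx|, |Gy|); border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    px = gray[y + dy][x + dx]
                    # Correlation rather than flipped convolution; the
                    # absolute response is identical for these templates.
                    gx += SOBEL_X[dy + 1][dx + 1] * px
                    gy += SOBEL_Y[dy + 1][dx + 1] * px
            out[y][x] = max(abs(gx), abs(gy))
    return out


# A vertical step edge between a cool (0) and a hot (100) region.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_edges(step)
```

Because the response is a plain difference of neighboring values, any isolated noisy pixel produces a response of the same kind as a true hot spot boundary.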

Sobel edge detection hot spot detection is shown in Figure 4. The edge detection effect in Figure 4(b) shows that the Sobel operator can detect the edges of the hot spot, but it also detects some noise edges.

Figure 4	Sobel Edge Detection Hot Spot Detection
(a) Image to be Detected	The original solar panel thermal infrared image.
(b) Edge Detection Effect	The result of Sobel edge detection, showing the detected edges including those of the hot spot and some noise.

2.2.2 Canny Edge Detection

The main purpose of the Canny edge detection algorithm is to significantly reduce the amount of image data while preserving the structural attributes of the image. The algorithm proceeds as follows: first, convert the image to grayscale and smooth it with a filter; then, calculate the gradient at each pixel; next, apply non-maximum suppression to eliminate spurious responses from edge detection and make fuzzy boundaries clear. Finally, a gradient threshold range, consisting of a low and a high threshold, is applied to the output pixels: the potential boundaries of the detected target are determined hierarchically, and isolated weak edges are suppressed to complete the detection.
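The final thresholding stage can be illustrated in isolation. The sketch below is not a full Canny implementation: it assumes the gradient magnitude has already been computed and thinned, and the function name and toy magnitudes are hypothetical. Pixels at or above the high threshold are kept as strong edges, and pixels at or above the low threshold are kept only if they connect to a strong edge through 8-neighbors.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) linked to them."""
    h, w = len(magnitude), len(magnitude[0])
    edges = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the search with every strong pixel.
    for y in range(h):
        for x in range(w):
            if magnitude[y][x] >= high:
                edges[y][x] = True
                queue.append((y, x))
    # Grow edges into connected weak pixels (8-connectivity).
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx]
                        and magnitude[ny][nx] >= low):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges


magnitude = [
    [0, 60, 0, 0, 0],
    [0, 130, 70, 0, 0],
    [0, 0, 0, 0, 90],
    [0, 0, 0, 0, 0],
]
edges_50_120 = hysteresis(magnitude, low=50, high=120)
edges_100_180 = hysteresis(magnitude, low=100, high=180)
```

With the range 50 to 120, the strong pixel and its two weak neighbors survive while the isolated weak pixel is dropped; with the range 100 to 180, nothing survives, which shows how strongly the result depends on the chosen range.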

Canny edge detection hot spot detection is shown in Figure 5, which gives the detection results for different gradient threshold ranges. It can be seen that the Canny edge detection method is highly dependent on the selection of the thresholds and, like other edge detection methods, has difficulty removing background interference.

Figure 5	Canny Edge Detection Hot Spot Detection
(a) Image to be Detected	The original solar panel thermal infrared image.
(b) Threshold Range 50 – 120	The detection result with a threshold range of 50 – 120, showing some hot spots detected.
(c) Threshold Range 100 – 180	The detection result with a threshold range of 100 – 180, with different hot spot detection accuracies.

2.3 Analysis of Hot Spot Detection Methods Based on Traditional Image Processing

The experimental results above show that traditional image processing methods place extremely high requirements on the difference between the target and the background. In thermal infrared images of solar panels, hot spots often have color characteristics similar to the background. The region segmentation method requires a clear difference in color and morphology between the hot spots and the background of normally operating panels; when this difference is small, a large number of noise regions are falsely detected. The edge detection method is even more sensitive to image noise. It judges whether a pixel is a boundary point by calculating the gradient of each pixel in each direction, so noise points in the image lead to a large number of false edge detections.

Because the hardware of the infrared thermal imager limits the image resolution, thermal infrared images contain many noisy regions, which greatly degrades hot spot detection in practice. In summary, traditional image processing methods demand an image quality that current conventional infrared thermal imagers can hardly guarantee. A detection method with lower dependence on image quality and higher robustness is therefore needed.

3. Hot Spot Detection Method Based on Improved YOLOv4

3.1 Dataset Construction

A complete and reliable dataset is the prerequisite for effective scene understanding by a machine learning-based target detection method. The construction of the hot spot dataset in this paper consists of data acquisition and data processing, both of which follow a defined set of dataset construction specifications. This paper first formulates these specifications and then describes the data acquisition and data processing steps used to construct a hot spot image dataset that meets practical requirements.

In practice, industry restrictions make solar panel hot spot data very difficult to acquire. This paper therefore combines field shooting of real solar panel hot spots with simulated hot spots. The simulation draws on the experiment of the Ishihara laboratory in Japan, which covers the photovoltaic panel with sand to induce the hot spot effect. In this experiment, toothpaste applied in different sizes and shapes is used to simulate the influence of bird droppings and mud on the photovoltaic panel and thereby obtain simulated hot spots.

Dataset processing mainly consists of data augmentation and data annotation. Before semantic annotation, the data is screened twice. The first screening excludes image data that does not meet the specifications, which avoids generating more unqualified images during augmentation and reduces the workload of manual annotation. Augmentation is needed because only a limited amount of data can be collected in the field; to improve the generalization ability of the model and prevent overfitting, the dataset must be expanded. Common data augmentation methods include mirroring, rotation, scaling, random cropping, and color jittering. Since the colors in an infrared thermal image encode temperature information, color jittering is excluded, and only mirroring, rotation, scaling, and random cropping are applied to the existing data, each with a certain probability.

To ensure that the augmented dataset still conforms to the dataset construction principles, the augmented images are screened a second time to remove invalid data. The labelImg software is then used for manual semantic annotation to frame the actual hot spot regions.
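As a rough illustration of probability-based geometric augmentation, the sketch below mirrors and rotates a nested-list image with a given probability and transforms the bounding boxes accordingly. The box format `(xmin, ymin, xmax, ymax)` in pixel coordinates, the function names, and the restriction to 90° rotation are assumptions made for brevity; scaling and random cropping are omitted, and in the workflow described above annotation happens after augmentation, so the box handling is shown only for completeness. No color operation is applied, since color encodes temperature.

```python
import random


def hflip(image, boxes):
    """Mirror the image left-right and mirror each box with it."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - xmax, ymin, width - xmin, ymax)
                 for (xmin, ymin, xmax, ymax) in boxes]
    return flipped, new_boxes


def rot90_cw(image, boxes):
    """Rotate the image 90 degrees clockwise; a point (x, y) maps to (H - y, x)."""
    height = len(image)
    rotated = [list(col) for col in zip(*image[::-1])]
    new_boxes = [(height - ymax, xmin, height - ymin, xmax)
                 for (xmin, ymin, xmax, ymax) in boxes]
    return rotated, new_boxes


def augment(image, boxes, p, rng):
    """Apply each geometric operation independently with probability p."""
    for op in (hflip, rot90_cw):
        if rng.random() < p:
            image, boxes = op(image, boxes)
    return image, boxes


image = [[1, 2, 3], [4, 5, 6]]   # 2 rows x 3 columns
boxes = [(2, 0, 3, 1)]           # box around the pixel holding value 3
rng = random.Random(0)
aug_image, aug_boxes = augment(image, boxes, p=1.0, rng=rng)
```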

Combined with the actual characteristics of solar panel hot spots, this paper constructs a solar panel infrared thermal image hot spot dataset through the above dataset construction process according to the dataset construction principles. The dataset includes 1410 thermal infrared images with a resolution of 560×350 and containing hot spot target labels.

3.2 Improved YOLOv4 Network Structure

The YOLOv4 target detection algorithm is a regression-based, end-to-end deep convolutional neural network for object detection. It predicts multiple candidate boxes in a single pass, directly regresses the object position in the output layer, and automatically labels the category of the object within each box. YOLOv4 improves on the speed and accuracy of the previous generations of the You Only Look Once (YOLO) family. Its main architecture consists of the backbone feature extraction network, Spatial Pyramid Pooling (SPP), the Path Aggregation Network (PANet), and the YOLO Head module.

In the actual hot spot detection task, the original backbone feature extraction network, CSPDarkNet53, contains 104 convolutional layers. The resulting number of convolution operations occupies a large amount of computing resources and runs slowly. Therefore, this paper replaces the backbone with MobileNetV3 for preliminary feature extraction. MobileNetV3 inherits the depthwise separable convolution of MobileNetV1 and the linear bottleneck and inverted residual module of MobileNetV2, and introduces the h-swish activation function, which this paper uses in place of ReLU6. This reduces the amount of computation while improving performance. To further reduce the number of parameters and speed up the model, the implementation replaces each 5×5 convolution kernel with two stacked 3×3 kernels and each 7×7 kernel with three stacked 3×3 kernels. In addition, MobileNetV3 incorporates the Squeeze-and-Excitation Networks (SENet) channel attention mechanism, which increases the weights of channels with salient target features and thereby improves model accuracy.
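Two of the design choices above can be checked with simple arithmetic. The sketch below defines h-swish in its usual form, x · ReLU6(x + 3) / 6, and counts convolution weights (bias terms ignored) for a hypothetical layer with 64 input and 64 output channels; the channel count and function names are illustrative assumptions, not values from this paper.

```python
def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x):
    """Hard swish: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


C = 64  # hypothetical channel count
one_5x5 = conv_params(5, C, C)        # 25 * C^2
two_3x3 = 2 * conv_params(3, C, C)    # 18 * C^2, same 5x5 receptive field
one_7x7 = conv_params(7, C, C)        # 49 * C^2
three_3x3 = 3 * conv_params(3, C, C)  # 27 * C^2, same 7x7 receptive field
standard_3x3 = conv_params(3, C, C)
separable_3x3 = depthwise_separable_params(3, C, C)
```

With stride 1, two stacked 3×3 kernels cover the same 5×5 receptive field with 28% fewer weights, and three stacked 3×3 kernels cover a 7×7 field with about 45% fewer.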

The improved YOLOv4 network architecture is shown in Figure 7. As can be seen from Figure 7, in addition to the backbone feature network MobileNetV3, the improved YOLOv4 network architecture also includes the SPP module, the PANet module, and the YOLO Head module. The SPP module is used to solve the problem of different feature size images entering the fully connected layer. The PANet module is used for parameter aggregation to adapt to different levels of target detection. The YOLO Head module, as the detection head, is used to calculate the loss function and reshape the data format as needed to activate the original coordinate grid points.

3.3 Experimental Verification

The experiment is implemented in Python 3 using the Keras framework with the TensorFlow backend. The constructed photovoltaic hot spot dataset is divided into a training set and a test set at a ratio of 8:2. According to the idea of transfer learning, the features extracted by the backbone network are general-purpose. Therefore, a model pre-trained on the ImageNet dataset is used to initialize the model in this paper and speed up convergence. For training, the learning rate is adjusted dynamically, with an initial learning rate of 0.001 and a decay coefficient of 0.9. After training is completed, the model is evaluated using the Precision-Recall (PR) curve and the F1-score curve.
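The split and the learning-rate schedule can be sketched as follows. The paper states only the 8:2 ratio, the initial rate, and the decay coefficient, so the shuffling seed, the function names, and the assumption that the decay is applied once per epoch are all illustrative.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle and split into (train, test) at the given ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_ratio)
    return items[:n_train], items[n_train:]


def learning_rate(epoch, initial_lr=0.001, decay=0.9):
    """Exponential decay, assumed here to be applied once per epoch."""
    return initial_lr * decay ** epoch


# 1410 image indices, matching the dataset size reported in Section 3.1.
train, test = split_dataset(range(1410))
schedule = [learning_rate(e) for e in range(5)]
```

Under these assumptions the split yields 1128 training images and 282 test images.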

The model PR curve and F1 curve are shown in Figure 8. The PR curve shows the relationship between the precision and recall of the model, and the F1 curve shows the comprehensive performance of the model. A higher F1 score indicates better performance of the model.
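For reference, precision, recall, and the F1 score follow from the counts of true positives (TP), false positives (FP), and false negatives (FN) at a given confidence threshold. The sketch below uses hypothetical counts chosen only for illustration; they are not results from this paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts at one confidence threshold.
p, r, f1 = precision_recall_f1(tp=92, fp=6, fn=8)
```

Sweeping the confidence threshold and recomputing these values at each step traces out the PR curve and the F1 curve.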

Figure 8	Model PR Curve and F1 Curve
(a) PR Curve	The curve shows the trade-off between precision and recall for different models.
(b) F1 Curve	The curve shows the F1 score for different models at different thresholds.

This paper uses the trained model for the hot spot detection task and selects the original hot spot images that are collected in the same batch but not involved in the dataset construction for model testing. The hot spot detection result of the improved YOLOv4 is shown in Figure 9. It can be seen from Figure 9 that the trained model can effectively frame the hot spots with high confidence and complete the detection task. After predicting all the test data, this paper obtains that the average precision (AP) is 93.42%; the intersection over union (IoU) between the predicted target box and the actual box is 92.3%; the precision is 94%; and the recall is 92%.
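The intersection over union between a predicted box and an annotated box can be computed as in the sketch below. The box format `(xmin, ymin, xmax, ymax)` and the function name are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)
annotated = (5, 0, 15, 10)
overlap = iou(predicted, annotated)
```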

Figure 9	Improved YOLOv4 Hot Spot Detection Result
Detection Result	The image shows the detected hot spots with bounding boxes and confidence scores.

To further verify the effectiveness of the proposed method in the hot spot detection task, this paper uses the SSD, Faster R-CNN, and YOLOv4 target detection models for comparative experiments. The experimental results are shown in Table 1. All four models can detect hot spots and correctly frame them in most of the test images. However, when the image to be detected contains multiple hot spots, the models behave differently: SSD fails to detect all of them and produces a large number of missed detections; Faster R-CNN detects hot spots in most cases but produces false detections when targets resembling hot spots are present; YOLOv4 also has missed detections. In contrast, the improved YOLOv4 model proposed in this paper benefits from the SENet channel attention mechanism in the MobileNetV3 structure, which lets the model learn the hot spot features more thoroughly and increases the weights of the channels that carry those features. As a result, the improved YOLOv4 model produces neither missed nor false detections on the experimental test images and frames the hot spot regions completely and accurately. As can also be seen from Table 1, the detection accuracy of the improved YOLOv4 model (the Precision column) reaches 94.36%, which is 6.82, 3.10, and 3.08 percentage points higher than that of SSD, Faster R-CNN, and YOLOv4, respectively.

Model	AP (%)	IoU (%)	Precision (%)	Recall (%)
SSD	87.29	89.69	87.54	85.39
Faster R-CNN	92.17	93.61	91.26	90.68
YOLOv4	90.03	91.81	91.28	90.12
Improved YOLOv4	93.42	92.31	94.36	92.27
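The improvements quoted in the text can be reproduced from the Precision column of the comparison table. The short check below copies the table values into a dictionary; the variable and function names are illustrative.

```python
# Values copied from the comparison table (all in percent).
results = {
    "SSD": {"AP": 87.29, "IoU": 89.69, "Precision": 87.54, "Recall": 85.39},
    "Faster R-CNN": {"AP": 92.17, "IoU": 93.61, "Precision": 91.26, "Recall": 90.68},
    "YOLOv4": {"AP": 90.03, "IoU": 91.81, "Precision": 91.28, "Recall": 90.12},
    "Improved YOLOv4": {"AP": 93.42, "IoU": 92.31, "Precision": 94.36, "Recall": 92.27},
}


def gap(metric, model, reference="Improved YOLOv4"):
    """Difference in percentage points between the reference and a model."""
    return round(results[reference][metric] - results[model][metric], 2)


precision_gaps = {m: gap("Precision", m)
                  for m in results if m != "Improved YOLOv4"}
```

Note that Faster R-CNN has a higher IoU than the improved model in the same table (93.61 against 92.31), so the advantage reported in the text applies to precision, AP, and recall rather than to every metric.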

4. Conclusion

This paper analyzes the hot spot detection process based on traditional image processing methods and shows that these methods depend excessively on image quality. They often produce false detections of noise that resembles hot spot features and miss hot spots whose features are not obvious. To address the problems that traditional image processing methods cannot easily solve, this paper studies deep learning-based target detection and proposes an improved YOLOv4 model, obtained by changing the YOLOv4 backbone to the MobileNetV3 structure, replacing the ReLU6 activation function with h-swish, and replacing large convolution kernels with stacks of 3×3 kernels. The experimental and comparative results show that the improved YOLOv4 model is intuitive, efficient, and highly accurate in the hot spot detection task, and that its detection accuracy is better than that of the comparison models, including YOLOv4. The improved YOLOv4 model therefore has engineering application value.

In future research, we can further explore the application of deep learning algorithms in solar panel hot spot detection, such as using more advanced neural network architectures or improving the performance of the model through optimization algorithms. At the same time, we can also study the combination of multiple detection methods to improve the accuracy and reliability of hot spot detection. In addition, with the development of technology, new types of sensors and imaging devices may be applied to solar panel monitoring, which will provide more data sources and possibilities for hot spot detection. We should also pay attention to the practical application scenarios of hot spot detection and develop more user-friendly and efficient detection systems to contribute to the safe and efficient operation of photovoltaic power plants.

In conclusion, the research on solar panel hot spot detection is of great significance for improving the performance and reliability of photovoltaic power generation systems. Through continuous innovation and improvement of detection methods, we can better promote the development and application of solar energy.