Abstract
Against the backdrop of China’s “dual carbon” goals, solar photovoltaic technology is developing rapidly. Photovoltaic installations cover large areas, have irregular layouts, and appear in many different scenes. An efficient method for extracting photovoltaic installations is therefore needed to support real-time detection of their type, location, quantity, and extent. Because the number of photovoltaic images is large and the training areas of individual images have little in common, traditional supervised classification performs poorly and yields low accuracy. Manual methods are costly and time-consuming, which lengthens the detection cycle and lowers efficiency. This paper therefore uses a deep learning model based on a convolutional neural network (CNN) to identify and extract photovoltaic installations from high-resolution remote sensing images, enabling high-precision intelligent detection and assessment.

1. Introduction
1.1 Research Background and Significance
In the context of global carbon reduction, solar photovoltaic technology is undergoing rapid development. China, with its vast territory and abundant solar energy resources, is one of the countries best placed to make use of solar energy. At the 75th United Nations General Assembly, President Xi Jinping proposed the “dual carbon” goals, namely carbon peaking and carbon neutrality. The Chinese government has decided to enhance the country’s nationally determined contributions and adopt more forceful policies and measures to address increasingly severe resource and climate pressures.
[Figure: Distribution of solar energy resources in China]
The significance of this research lies in two aspects: (1) it avoids direct contact between inspectors and photovoltaic panels, preventing human-caused damage; (2) it shortens the detection cycle and allows targeted maintenance and replacement afterwards. Take the Talatan Photovoltaic Power Station in Qinghai Province as an example: it is currently the largest solar photovoltaic power generation base in China in terms of total area. However, the natural conditions at its location are relatively complex, and relying solely on manual inspection and maintenance is very difficult.
1.2 Research Status at Home and Abroad
Various studies relevant to the detection and extraction of photovoltaic installations have been conducted. For example, Shi Wenxi et al. [22] proposed a method based on the Residual Network (ResNet) and transferred it to the automatic identification of agricultural greenhouses, achieving high-precision identification of greenhouses in Sentinel-2 satellite images. Yu Haiyang et al. [23] improved the U-Net model, increasing its robustness and its ability to monitor road cracks.
A review of current research at home and abroad shows that computer-based supervised classification can be attempted for photovoltaic extraction. However, photovoltaic installations exhibit pronounced multi-scale features, large differences between scenes, and diverse interfering factors, which makes high-precision extraction with deep learning difficult. The main difficulties are: (1) a lack of high-quality photovoltaic images and datasets covering different scenarios; (2) complex installation scenes and diverse interfering factors that hinder extraction.
1.3 Research Content and Technical Route
This paper improves and optimizes the U-Net model to obtain a method suited to identifying and extracting photovoltaic installations. The main work includes: (1) selection and processing of photovoltaic datasets; (2) improvement of the U-Net model; (3) experimental comparison and analysis.
Table 1: Main Work and Technical Route
Step | Content |
---|---|
1 | Selection and processing of photovoltaic datasets |
2 | Improvement of the U-Net model |
3 | Experimental comparison and analysis |
2. Research Theory and Related Technologies
2.1 Basic Theory of Supervised Classification
Supervised classification is a commonly used method in remote sensing image classification. It requires a training dataset with known category labels to train the classifier, which is then used to classify unknown data.
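The train-then-classify workflow described above can be illustrated with a minimum-distance-to-mean classifier, one simple supervised method. This is a minimal sketch, not a method taken from this paper: the three-band pixel values and the two class labels are invented for illustration.

```python
import numpy as np

def fit_class_means(samples, labels):
    """Compute the mean feature vector of each labeled class."""
    classes = np.unique(labels)
    means = np.stack([samples[labels == c].mean(axis=0) for c in classes])
    return classes, means

def classify(pixels, classes, means):
    """Assign each pixel to the class whose mean is nearest (Euclidean distance)."""
    # distances has shape (n_pixels, n_classes)
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(distances, axis=1)]

# Hypothetical training pixels with 3 spectral bands each:
# label 1 = photovoltaic panel, label 0 = background
train_x = np.array([[0.10, 0.12, 0.09], [0.14, 0.10, 0.11],
                    [0.60, 0.55, 0.70], [0.58, 0.61, 0.66]])
train_y = np.array([1, 1, 0, 0])

classes, means = fit_class_means(train_x, train_y)

# Unlabeled pixels to classify
unknown = np.array([[0.12, 0.11, 0.10], [0.62, 0.57, 0.69]])
predicted = classify(unknown, classes, means)
```

Because the class means come only from the labeled training pixels, a classifier of this kind depends heavily on how representative those pixels are of the image being classified.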
2.2 Basic Theory of Deep Learning Methods Based on Convolutional Neural Networks
A convolutional neural network (CNN) is a deep learning model that has achieved remarkable results in image classification, object detection, and semantic segmentation. A CNN mainly consists of convolutional layers, pooling layers, fully connected layers, and other components.
2.3 Convolutional Layer
The convolutional layer is the main component for extracting image features. It consists of multiple convolution kernels whose parameters are learned through the backpropagation algorithm. The convolution operation slides each kernel, which acts as a feature detector, over the input data and records its responses in a feature map. This extracts and compresses the required target features and lays the foundation for subsequent identification.
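To make the operation concrete, the following minimal sketch applies one kernel to a toy image with no padding and a stride of 1. The vertical-edge kernel is set by hand purely for illustration; in a trained network the kernel weights would be learned through backpropagation.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel and the window under it, summed
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return feature_map

# Toy 5x5 image: dark on the left, bright on the right
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-set vertical-edge detector (illustrative, not learned)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = conv2d(image, kernel)
```

The feature map responds strongly where the window straddles the dark-to-bright edge and is zero over the uniform bright region, which is the sense in which a kernel acts as a feature detector.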
Table 2: Components of a Convolutional Neural Network
Component | Description |
---|---|
Convolutional Layer | Extracts image features |
Pooling Layer | Reduces the dimensionality of features while preserving important information |
Fully Connected Layer | Combines extracted features for classification or regression |
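The pooling step in Table 2 can be sketched as non-overlapping max pooling. The 2x2 window is an assumed, commonly used setting, and the input values are invented.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h, w = feature_map.shape
    h_out, w_out = h // size, w // size
    trimmed = feature_map[:h_out * size, :w_out * size]
    # Split each spatial axis into (window index, position inside window)
    windows = trimmed.reshape(h_out, size, w_out, size)
    return windows.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 8]], dtype=float)

pooled = max_pool2d(feature_map)
```

Each output value keeps the strongest response in its window, so the map shrinks from 4x4 to 2x2 while the dominant activations are preserved.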
2.4 Classic Convolutional Neural Network Models
This section introduces five classic CNN models: U-Net, DeepLabV3+, PSPNet, SegNet, and HRNet, presenting their structural frameworks and principles, which will be used as comparison models in later experiments.
2.4.1 U-Net
U-Net is named after its U-shaped structure. Proposed in 2015, it was originally used for medical image segmentation. Its main features are skip connections and a fully convolutional architecture, which make it suitable for large-scale image classification with moderate model depth and computational overhead.
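The skip connection can be sketched at the level of array shapes. This is a simplified illustration: the convolutions between stages are omitted, nearest-neighbor upsampling stands in for whatever upsampling the network actually uses, and the channel count and image size are arbitrary assumptions.

```python
import numpy as np

def downsample(x):
    """2x2 max pooling on a (channels, height, width) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample(x):
    """Nearest-neighbor 2x upsampling on a (channels, height, width) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
encoder_features = rng.random((16, 64, 64))  # hypothetical encoder-stage output

bottleneck = downsample(encoder_features)    # (16, 32, 32)
decoder_features = upsample(bottleneck)      # (16, 64, 64)

# Skip connection: concatenate encoder and decoder features along the channel axis
merged = np.concatenate([encoder_features, decoder_features], axis=0)
```

The concatenation gives the decoder direct access to the full-resolution encoder features, which carry the fine spatial detail lost during downsampling.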
3. Data Selection and Processing
This paper uses the photovoltaic remote sensing images publicly released by Hou Jiang et al. [29], which were manually screened to determine the final experimental data. The model training dataset covers four scenarios: WaterSurface, Grassland, SalineAlkali, and Cropland; the model validation dataset is Shrubwood.
Table 3: Dataset Scenarios
Scenario | Description |
---|---|
WaterSurface | Reservoirs and large ponds |
Grassland | Low-density grassland |
SalineAlkali | Saline-alkali land |
Cropland | Farmland |
Shrubwood | Shrubland; used for validation |
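The scenario-based split can be expressed as a short routine. The scenario names come from Table 3; the file names and the (scenario, filename) pairing are hypothetical and do not reflect the actual layout of the released dataset.

```python
TRAIN_SCENARIOS = {"WaterSurface", "Grassland", "SalineAlkali", "Cropland"}
VALIDATION_SCENARIOS = {"Shrubwood"}

def split_by_scenario(samples):
    """Split (scenario, filename) pairs into training and validation file lists."""
    train, validation = [], []
    for scenario, filename in samples:
        if scenario in TRAIN_SCENARIOS:
            train.append(filename)
        elif scenario in VALIDATION_SCENARIOS:
            validation.append(filename)
        else:
            raise ValueError(f"Unknown scenario: {scenario}")
    return train, validation

# Hypothetical file names, for illustration only
samples = [
    ("WaterSurface", "water_001.png"),
    ("Grassland", "grass_001.png"),
    ("Shrubwood", "shrub_001.png"),
    ("Cropland", "crop_001.png"),
    ("SalineAlkali", "saline_001.png"),
]

train_files, validation_files = split_by_scenario(samples)
```

Holding out an entire scenario for validation, rather than a random subset of images, tests whether the model generalizes to a land-cover type it has not seen during training.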
4. Improved U-Net Model
4.1 Overall Network Structure of Improved U-Net
The improved U-Net integrates the Convolutional Block Attention Module (CBAM) into the upsampling and downsampling paths. CBAM assigns different weights along the channel and spatial dimensions so that the network focuses more on photovoltaic pixels and is less affected by non-target features. The embedded residual module plays a key role in mitigating the degradation and vanishing-gradient problems that arise as network depth and the number of model parameters increase.
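The channel and spatial weighting can be sketched as follows. This follows the general CBAM formulation (channel attention from a shared two-layer network over average- and max-pooled descriptors, then spatial attention from a convolution over channel-wise average and max maps), not this paper's exact configuration. The random weights, reduction ratio of 4, 7x7 kernel, and the placement of the residual shortcut are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Reweight channels of x (C, H, W) using a shared two-layer MLP."""
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer
    avg_desc = x.mean(axis=(1, 2))  # (C,)
    max_desc = x.max(axis=(1, 2))   # (C,)
    weights = sigmoid(mlp(avg_desc) + mlp(max_desc))
    return x * weights[:, None, None]

def spatial_attention(x, kernel):
    """Reweight pixels of x (C, H, W) using a (2, k, k) kernel."""
    stacked = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = x.shape[1:]
    response = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            response[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(response)[None, :, :]

def cbam(x, w1, w2, kernel):
    """Channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)

rng = np.random.default_rng(0)
channels, reduction = 8, 4  # assumed sizes
x = rng.standard_normal((channels, 16, 16))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1

refined = cbam(x, w1, w2, kernel)

# Residual-style identity shortcut (assumed placement): the input is added
# back, so gradients can bypass the attention branch
output = x + refined
```

Both attention maps lie between 0 and 1, so the module can only scale features down; the shortcut keeps the original signal available regardless of how strongly the attention suppresses a region.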
5. Experiments and Analysis
5.1 Experimental Comparison
This section compares the proposed algorithm with three supervised classification algorithms to highlight the advantages of deep learning in photovoltaic extraction. Ablation experiments were also conducted on the improved U-Net model to verify the effectiveness of the improvements.
Table 4: Comparison Results of Different Models
Model | mP (Precision) | mR (Recall) | mF1 | mIoU |
---|---|---|---|---|
U-Net | 0.9132 | 0.9186 | 0.9159 | 0.9084 |
ResNet | 0.9175 | 0.9247 | 0.9211 | 0.9108 |
RN+CA | 0.9353 | 0.9584 | 0.9467 | 0.9206 |
Ours (Improved U-Net) | Highest | Highest | Highest | Highest |
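The metrics in Table 4 can be computed from a predicted mask and a reference mask. The sketch below evaluates only the positive (photovoltaic) class on an invented toy example; the "m" prefix in the table is assumed here to denote a mean over classes, which would require repeating the calculation per class and averaging.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Precision, recall, F1, and IoU for the positive class of two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = int(np.sum(pred & truth))    # predicted panel, actually panel
    fp = int(np.sum(pred & ~truth))   # predicted panel, actually background
    fn = int(np.sum(~pred & truth))   # missed panel pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

# Toy masks, invented for illustration: 1 = photovoltaic, 0 = background
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0]])

metrics = segmentation_metrics(pred, truth)
```

In this toy case there are three true positives, one false positive, and one false negative, so precision, recall, and F1 are all 0.75 and IoU is 0.6.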
5.2 Model Validation and Application
The proposed model was applied to multi-year change detection at the Panda Power Station, both to test how well the model transfers to a new site and to estimate the installation status and area of the power station in different years.
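Estimating installed area from an extracted mask amounts to counting photovoltaic pixels and multiplying by the ground area of one pixel. The sketch below assumes two co-registered binary masks from different years; the masks and the 2 m pixel size are hypothetical and are not values from this paper.

```python
import numpy as np

def panel_area_m2(mask, pixel_size_m):
    """Area of positive pixels, given the ground length of one pixel edge in meters."""
    return float(np.count_nonzero(mask)) * pixel_size_m ** 2

# Hypothetical extraction results for two years (1 = photovoltaic)
mask_year_a = np.zeros((100, 100), dtype=np.uint8)
mask_year_a[10:30, 10:50] = 1   # 20 x 40 = 800 pixels

mask_year_b = np.zeros((100, 100), dtype=np.uint8)
mask_year_b[10:60, 10:50] = 1   # 50 x 40 = 2000 pixels

pixel_size_m = 2.0              # assumed ground sampling distance

area_a = panel_area_m2(mask_year_a, pixel_size_m)
area_b = panel_area_m2(mask_year_b, pixel_size_m)

# Pixels that are photovoltaic in the later year but not in the earlier one
newly_installed = (mask_year_b == 1) & (mask_year_a == 0)
new_area = panel_area_m2(newly_installed, pixel_size_m)
```

The per-pixel difference between the two masks also shows where new panels appeared, not only how much the total area changed.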
6. Conclusion
This paper proposes an improved U-Net model for high-precision extraction of photovoltaic installations from high-resolution remote sensing images. Experimental results demonstrate the superiority and applicability of the proposed model compared with other semantic segmentation models. Future research will further optimize the model and expand the dataset to improve generalization ability.