In the pursuit of carbon peak and neutrality goals, the global energy landscape is rapidly transitioning towards renewable sources. Solar energy, harnessed through photovoltaic technology, stands out as a pivotal, clean, and practical solution. However, the operational efficiency of solar panels is significantly compromised by environmental factors, with dust accumulation being a primary culprit. The deposition of dust on the surface of solar panels not only reduces light transmittance but also leads to localized heating and potential long-term degradation. The characteristics of dust—such as particle shape, size, and composition—vary widely depending on geographical and environmental conditions, directly influencing the degree of power loss. Consequently, effective identification and analysis of dust types on solar panels are crucial for optimizing maintenance schedules, improving system performance, and informing site selection for new solar installations. Traditional methods for dust analysis, such as scanning electron microscopy (SEM), are accurate but costly, time-consuming, and not feasible for large-scale or field applications. This limitation motivates the exploration of computer vision and deep learning techniques for automated, efficient, and cost-effective dust identification.
Recent advancements in convolutional neural networks (CNNs) have revolutionized image classification tasks, offering high accuracy in various domains. For practical deployment, especially on resource-constrained devices like portable inspection tools, lightweight CNN models are essential. Among these, ShuffleNetV2 has emerged as a highly efficient architecture, balancing accuracy and computational complexity. In this work, I propose an improved ShuffleNetV2-based model specifically designed for identifying dust particles on solar panels. The enhancements integrate a smoother activation function, multi-scale feature extraction, and an advanced attention mechanism to boost performance while keeping the model lightweight. The core objective is to develop a robust system that can classify dust from different regions, aiding in the assessment of solar panel soiling and its impact on energy yield.

The detrimental effects of dust on solar panels are well-documented. Particles accumulate on the surface, scattering and absorbing incident sunlight, which reduces the amount of radiation reaching the photovoltaic cells. The extent of power loss depends not just on the dust density but also on the optical properties of the particles, which are dictated by their morphology and material composition. For instance, angular particles may cause more significant transmittance reduction compared to spherical ones. Therefore, classifying dust types can provide insights into the soiling rate and help predict cleaning requirements. Existing studies have categorized dust into various shapes—rectangular, hexagonal, elliptical, spherical, triangular—using microscopic analysis. However, these methods lack scalability. Deep learning approaches, particularly CNNs, have shown promise in particle shape classification from images, but often rely on heavy models unsuitable for edge devices. Hence, there is a clear need for a lightweight yet accurate model tailored for solar panel dust identification.
ShuffleNetV2 serves as an excellent baseline due to its design principles favoring efficiency: maintaining many equal-width channels, using group convolution prudently, reducing network fragmentation, and minimizing element-wise operations. Its basic unit employs channel split, pointwise convolutions, depthwise convolution, and channel shuffle operations to enable effective information flow with low computation. However, for the nuanced task of dust particle recognition—where features can be subtle and varied—the baseline model may benefit from specific refinements. My improvements target three aspects: activation function, convolutional kernel diversity, and attention mechanism.
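As a concrete illustration, the channel shuffle operation at the heart of the ShuffleNetV2 unit can be sketched in a few lines of PyTorch (the function name is mine, for illustration):

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so information mixes between branches."""
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)  # (N, g, C/g, H, W)
    x = x.transpose(1, 2).contiguous()        # swap group and per-group channel dims
    return x.view(n, c, h, w)
```

After the split branches are concatenated, shuffling lets channels produced by one branch reach the other branch in the next unit, which is what keeps the split-based design expressive at low cost.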
First, I replace the conventional ReLU activation function with the Mish function. ReLU can suffer from the “dying neuron” problem where gradients become zero for negative inputs, potentially hindering learning. Mish, being a smooth, non-monotonic activation, allows better gradient flow and information propagation through the network. The Mish function is defined as:
$$ f(x) = x \cdot \tanh(\ln(1 + e^x)) $$
This function provides improved optimization properties and often leads to higher accuracy in deep networks. In the context of dust images, which may contain complex textures and edges, Mish helps in capturing finer details.
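A minimal PyTorch sketch of the definition above (note that recent PyTorch versions also ship this activation directly as `torch.nn.Mish`):

```python
import torch
import torch.nn.functional as F

def mish(x: torch.Tensor) -> torch.Tensor:
    # x * tanh(softplus(x)); softplus(x) = ln(1 + e^x), computed stably by F.softplus
    return x * torch.tanh(F.softplus(x))
```

Unlike ReLU, the output is nonzero (though small) for negative inputs, so gradients continue to flow there.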
Second, to handle the multi-scale nature of dust particles—ranging from fine specks to larger aggregates—I incorporate mixed-depthwise convolution. Instead of using a single kernel size (e.g., 3×3) for depthwise convolution, I employ a mixture of kernels (3×3, 5×5, and 7×7) across different channels within the same layer. This approach, inspired by MixNet, enables the network to capture patterns at various scales without a substantial increase in parameters or computations. The mixed-depthwise convolution can be represented as applying different kernel sizes to partitioned channel groups:
$$ \text{Output} = \text{Concat}(\text{DWConv}_{3\times3}(\mathbf{X}_1), \text{DWConv}_{5\times5}(\mathbf{X}_2), \text{DWConv}_{7\times7}(\mathbf{X}_3)) $$
where the input feature map \(\mathbf{X}\) is split along the channel dimension into groups \(\mathbf{X}_1, \mathbf{X}_2, \mathbf{X}_3\). This diversity enriches feature extraction, which is critical for distinguishing between dust types that may appear similar at a single scale.
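A sketch of such a mixed-kernel depthwise layer, assuming an even three-way channel split (the class name and split policy are my own, not from a specific library):

```python
import torch
import torch.nn as nn

class MixedDepthwiseConv(nn.Module):
    """Depthwise convolution with a different kernel size per channel group."""
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # split channels as evenly as possible across the kernel sizes
        splits = [channels // len(kernel_sizes)] * len(kernel_sizes)
        splits[0] += channels - sum(splits)
        self.splits = splits
        self.convs = nn.ModuleList(
            nn.Conv2d(c, c, k, padding=k // 2, groups=c)  # groups=c => depthwise
            for c, k in zip(splits, kernel_sizes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = torch.split(x, self.splits, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)], dim=1)
```

Because each kernel is still depthwise, the extra cost of the 5×5 and 7×7 branches stays small relative to a standard convolution of the same sizes.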
Third, I integrate a Coordinate Attention (CA) module to enhance feature representation by embedding spatial position information. Traditional attention mechanisms like Squeeze-and-Excitation (SE) focus on channel-wise relationships but ignore spatial coordinates, which can be important for locating dust particles within an image. The CA module separately pools features along the height and width directions, encodes them into attention maps, and then re-calibrates the features. This process captures long-range dependencies along one spatial direction, helping the model to focus on relevant regions—such as dust clusters—while suppressing background noise. Mathematically, for an input feature map \(\mathbf{X} \in \mathbb{R}^{C \times H \times W}\), the CA module computes:
$$ z_c^h(h) = \frac{1}{W} \sum_{0 \leq i < W} x_c(h, i) $$
$$ z_c^w(w) = \frac{1}{H} \sum_{0 \leq j < H} x_c(j, w) $$
These pooled features are then transformed via convolutional layers and combined to produce attention weights that are applied to the input. By replacing the pointwise convolution at the tail of the right branch in the ShuffleNetV2 basic unit with the CA module, I reduce computational cost while improving accuracy.
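Putting the two direction-aware pooled descriptors and the re-calibration together, a compact PyTorch sketch of the CA block might look as follows (the reduction ratio and intermediate activation follow common choices from the original CA design, not values specified here):

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate Attention sketch: pool along H and W separately, then re-weight."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                   # (N, C, H, 1): avg over W
        pool_w = x.mean(dim=2, keepdim=True).transpose(2, 3)   # (N, C, W, 1): avg over H
        y = torch.cat([pool_h, pool_w], dim=2)                 # joint encoding, (N, C, H+W, 1)
        y = self.act(self.bn(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                  # (N, C, H, 1) attention
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))  # (N, C, 1, W) attention
        return x * a_h * a_w                                   # broadcasted re-calibration
```

The two attention maps broadcast across the opposite spatial axis, so each output position is weighted by both its row and its column context.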
The overall architecture of my improved ShuffleNetV2 model is summarized in the following table, detailing the modifications applied to the basic units for both stride=1 and stride=2 cases:
| Component | Modification | Purpose |
|---|---|---|
| Activation Function | ReLU → Mish | Enhance gradient flow and feature information propagation |
| Depthwise Convolution | Single kernel → Mixed kernels (3×3, 5×5, 7×7) | Capture multi-scale dust particle features |
| Attention Mechanism | Pointwise convolution → Coordinate Attention (CA) module | Incorporate spatial position information, suppress noise |
| Basic Unit (Stride=1) | Right branch tail: pointwise conv replaced with CA | Reduce computation, improve feature recalibration |
| Basic Unit (Stride=2) | Similar modifications applied for downsampling | Maintain consistency across network stages |
To train and evaluate the model, I curated a dataset of dust accumulation images from solar panels in four distinct regions. The images were captured using digital microscopy, showcasing a variety of particle shapes and distributions. The original dataset comprised 239 images, which I expanded through data augmentation techniques—including brightness adjustment, horizontal flipping, rotation, and translation—to prevent overfitting and improve generalization. After augmentation, the dataset contained 718 images, split into 80% for training and 20% for testing. The augmentation process is crucial because real-world dust on solar panels can appear under different lighting and orientation conditions.
Experimental parameters were set as follows: initial learning rate of 0.001, 100 training epochs, batch size of 24, Adam optimizer, and cross-entropy loss. I employed transfer learning by initializing the model with pre-trained ShuffleNetV2 weights to accelerate convergence. The evaluation metrics included accuracy, parameter count, and floating-point operations (FLOPs, reported in units of 10^8 operations). Accuracy measures the model’s classification performance, while parameter count and FLOPs assess model complexity and inference efficiency—key factors for deployment on portable devices used for inspecting solar panels.
The results from ablation studies demonstrate the incremental benefits of each modification. Starting with the baseline ShuffleNetV2, I sequentially added Mish activation, CA module, and mixed-depthwise convolution. The following table summarizes the performance on the test set:
| Model Variant | Accuracy (%) | Parameters | FLOPs (×10^8) |
|---|---|---|---|
| ShuffleNetV2 (Baseline) | 87.44 | 2,278,604 | 1.50 |
| + Mish Activation | 88.09 | 2,278,604 | 1.49 |
| + CA Module | 88.98 | 1,992,700 | 1.08 |
| + Mixed Depthwise (3,5,7) | 92.25 | 2,031,288 | 1.14 |
The data augmentation itself contributed a 0.51 percentage point increase in accuracy compared to using the raw dataset. The Mish activation provided a 0.65-point boost over the baseline while slightly reducing FLOPs. Replacing the pointwise convolution with the CA module raised accuracy to 88.98% (1.54 points above the baseline) while significantly lowering parameters and FLOPs, highlighting its efficiency. Incorporating mixed-depthwise convolution with kernel sizes 3, 5, and 7 yielded the best accuracy of 92.25%, a substantial 4.81-point gain over the baseline, with only a modest rise in complexity relative to the CA-enhanced version. I also experimented with larger kernel mixtures (e.g., adding 9×9), but accuracy plateaued or decreased, indicating that 3,5,7 offers an optimal balance for dust feature scales.
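The parameter counts in the tables can be reproduced by summing tensor sizes over the model; measuring FLOPs additionally requires a profiler (tools such as `thop` or `ptflops` are common options, mentioned here as possibilities rather than as the tools used in this work):

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of learnable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```

For example, a single `nn.Linear(10, 5)` layer has 10×5 weights plus 5 biases, i.e. 55 parameters.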
To contextualize the performance, I compared the improved ShuffleNetV2 model against other established networks, including lightweight models like MobileNetV2 and GhostNet, as well as a deeper ResNet34. The comparison, under identical experimental conditions, is shown below:
| Model | Accuracy (%) | Parameters | FLOPs (×10^8) |
|---|---|---|---|
| Improved ShuffleNetV2 (Ours) | 92.25 | 2,031,288 | 1.14 |
| MobileNetV2 | 91.14 | 3,504,872 | 3.20 |
| ResNet34 | 91.60 | 11,689,512 | 18.21 |
| GhostNet | 87.74 | 5,183,016 | 1.50 |
| ShuffleNetV2 (Baseline) | 87.44 | 2,278,604 | 1.50 |
The improved model achieves the highest accuracy while also having the lowest parameter count and FLOPs of all contenders, including the baseline ShuffleNetV2. It outperforms MobileNetV2 by 1.11 points and ResNet34 by 0.65 points, despite being far more efficient. This demonstrates the effectiveness of the proposed modifications for the specific task of dust identification on solar panels.
The success of the improved model can be attributed to several factors. The Mish activation facilitates better learning of complex dust patterns. The mixed-depthwise convolution adapts to varying particle sizes—critical since dust on solar panels can range from fine powder to larger debris. The CA module helps the model focus on relevant spatial regions, reducing confusion from background textures or image artifacts. Importantly, these enhancements align with the need for lightweight models; the CA module, in particular, replaces heavy pointwise convolutions, cutting down parameters and computations. From a practical standpoint, this means the model can be deployed on mobile or embedded systems for real-time inspection of solar panels in the field, enabling frequent monitoring without requiring expensive equipment.
However, there are limitations to this study. The dataset, though augmented, is from only four regions; expanding to more diverse geographical locations would improve model robustness. Additionally, the current work focuses on classification accuracy but does not directly correlate dust types with power loss metrics of solar panels. Future research should integrate physical models to estimate efficiency degradation based on identified dust categories. Another direction is to explore real-time video analysis for dynamic dust accumulation monitoring on solar panels.
In conclusion, I have developed an enhanced ShuffleNetV2-based model for identifying dust particles on solar panels. By incorporating Mish activation, mixed-depthwise convolution, and coordinate attention, the model achieves superior accuracy and reduced computational complexity compared to existing lightweight networks. This work underscores the potential of tailored deep learning solutions for maintaining solar panel efficiency, contributing to the broader goal of optimizing renewable energy systems. As solar energy continues to expand, such automated inspection tools will become increasingly valuable for ensuring peak performance and longevity of solar installations worldwide.
