Dynamic Model Identification of Solar Inverters Using Recursive Least Squares

The growing integration of renewable energy sources into power grids necessitates accurate and efficient modeling of generation units. Among these, solar photovoltaic (PV) systems are a cornerstone technology. The core component enabling grid connection is the solar inverter, which converts direct current (DC) from the PV array into grid-synchronized alternating current (AC). The dynamic behavior of the solar inverter significantly influences the overall stability and power quality of the network. Therefore, developing precise dynamic models for solar inverters is paramount for system analysis, control design, and real-time simulation.

Traditional modeling approaches for solar inverters often rely on detailed physical principles, involving the explicit representation of power electronic switches, filter circuits, and complex control loops (e.g., Phase-Locked Loops, voltage and current controllers). While highly accurate, such white-box models are computationally intensive, require extensive knowledge of proprietary control algorithms, and depend on parameters that are often not readily available to system operators. This complexity poses a challenge for large-scale power system studies where numerous solar inverter units need to be simulated.

To address these challenges, we propose a data-driven, “black-box” identification method for constructing dynamic models of three-phase solar inverters. The fundamental idea is to treat the entire solar inverter unit—encompassing the DC link, power electronics, filters, and all control systems—as a single dynamic system whose internal structure is unknown. We focus solely on the relationship between its easily measurable external inputs and outputs. For a grid-connected solar inverter, the natural inputs are the DC-side voltage and current from the PV array, while the outputs are the three-phase AC-side voltages and currents fed into the grid. Consequently, the solar inverter is treated as a Multiple-Input Multiple-Output (MIMO) dynamic system.

The core of our method is the application of the Recursive Least Squares (RLS) algorithm for online parameter identification. Unlike offline batch least-squares methods that process all data at once, the RLS algorithm updates the model parameters recursively as each new data sample arrives. This makes it well suited for real-time or online identification, allowing the model of the solar inverter to adapt to changing operating conditions, such as variations in solar irradiance and temperature throughout the day. The identified model takes the form of a discrete-time transfer function matrix or difference equation, providing a compact and computationally efficient representation of the solar inverter’s dynamics for system-level studies.

System Structure and Data Flow for Solar Inverter Identification

The proposed identification framework is built on a clear definition of the solar inverter’s boundary and a structured data acquisition pipeline. The central concept is the “black-box” view of the solar inverter unit.

Defining Inputs and Outputs of the Solar Inverter System

We consider a standard three-phase two-level voltage source converter as the target solar inverter. The physical setup includes the PV array, the DC-link capacitor, the IGBT bridge, the AC filter inductors, and the connection to the grid. However, for our identification purpose, we ignore this internal topology. The system boundary is drawn at the DC input terminals and the AC output terminals of the solar inverter.

Inputs (2): DC-link voltage ($u_{dc}(t)$) and DC input current ($i_{dc}(t)$).
Outputs (6): Three-phase AC line-to-neutral voltages ($u_a(t), u_b(t), u_c(t)$) and three-phase AC line currents ($i_a(t), i_b(t), i_c(t)$).

Thus, the solar inverter is formally defined as a 2-input, 6-output dynamic system. The objective is to identify a mathematical model that maps the time-series of inputs to the time-series of outputs.

Data Acquisition and Processing Flow

Accurate identification requires high-quality, synchronized measurement data. The data flow proceeds as follows:

Measurement: Current and voltage transducers are installed at the DC input and AC output of the operational solar inverter. A synchronized data acquisition system records all eight signals ($u_{dc}, i_{dc}, u_a, u_b, u_c, i_a, i_b, i_c$) simultaneously at a high sampling rate.
Data Transfer & Screening: The recorded waveforms are transferred to a processing computer. The raw data undergoes screening to remove outliers, bad data segments (e.g., during inverter startup/shutdown), and to ensure signal quality.
Model Identification Loop: The preprocessed data streams are fed into the RLS identification algorithm.
- The algorithm is initialized with a candidate model structure (difference equation order) and initial parameter guesses.
- It processes data sequentially. For each new sample pair (inputs and outputs), it calculates the model’s predicted output.
- The error between the predicted output and the actual measured output is used to recursively update the model parameters.
- This process continues, and the model parameters evolve over time, capturing the dynamic behavior of the solar inverter.
Validation: The final identified model is validated by comparing its simulated output (driven by the measured DC inputs) against the held-out or subsequent measured AC outputs, using error metrics like Root Mean Square Error (RMSE).
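The screening and validation steps above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the outlier rule (median absolute deviation with a threshold of 6) and the names `screen_mask` and `rmse` are assumptions introduced here, since the text does not specify how bad samples are detected.

```python
import numpy as np

def screen_mask(signal, threshold=6.0):
    """Return a boolean mask that is True for samples kept after screening.

    Assumption: a sample is an outlier when it lies more than `threshold`
    scaled median absolute deviations (MAD) away from the median.
    """
    signal = np.asarray(signal, dtype=float)
    median = np.median(signal)
    mad = np.median(np.abs(signal - median))
    if mad == 0.0:
        # Constant signal: nothing can be flagged, keep everything.
        return np.ones(signal.shape, dtype=bool)
    return np.abs(signal - median) <= threshold * 1.4826 * mad

def rmse(measured, simulated):
    """Root Mean Square Error between measured and simulated outputs."""
    measured = np.asarray(measured, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((measured - simulated) ** 2)))

# Hypothetical DC-link voltage record with one corrupted sample.
u_dc = np.array([500.0, 501.0, 499.0, 500.5, 5000.0, 499.5, 500.0])
keep = screen_mask(u_dc)
print(keep)                           # only the 5000 V sample is flagged
print(rmse([0.0, 0.0], [3.0, 4.0]))   # sqrt((9 + 16) / 2)
```

Because the ARX regressors rely on consecutive samples, a flagged sample would in practice lead to excluding the surrounding segment rather than deleting a single point.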

Recursive Least Squares Algorithm for Solar Inverter Dynamic Modeling

The RLS algorithm is a cornerstone of online system identification. Its application to our solar inverter problem involves formulating the model, defining the parameter vectors, and implementing the recursive update equations.

Discrete-Time Model Structure for the Solar Inverter

We represent the input-output relationship for each output channel using a discrete-time AutoRegressive with eXogenous input (ARX) model. Given the system has two inputs, the model for a single output (e.g., phase-A voltage $u_a$) is:

$$
\begin{aligned}
u_a(k) &+ d_{ua1}u_a(k-1) + d_{ua2}u_a(k-2) + \dots + d_{ua n_1}u_a(k-n_1) = \\
& e_{ua0}u_{dc}(k) + e_{ua1}u_{dc}(k-1) + \dots + e_{ua n_1}u_{dc}(k-n_1) + \\
& f_{ua0}i_{dc}(k) + f_{ua1}i_{dc}(k-1) + \dots + f_{ua n_1}i_{dc}(k-n_1) + \varepsilon_{ua}(k)
\end{aligned}
$$

where $k$ is the discrete-time index, $n_1$ is the model order for the phase-A voltage channel, $d_{uai}$, $e_{uaj}$, and $f_{uaj}$ are the parameters to be identified, and $\varepsilon_{ua}(k)$ is the modeling error, assumed to be white noise. A similar equation holds for the other five outputs (phase B and C voltages, and all three currents), each potentially with its own optimal order ($n_2, n_3, \dots, n_6$). This set of equations constitutes the full MIMO model of the solar inverter. In transfer function form, the relationship for phase A is:

$$
\begin{bmatrix} U_a(z) \\ I_a(z) \end{bmatrix} =
\begin{bmatrix}
\frac{B_{11}(z^{-1})}{A_{ua}(z^{-1})} & \frac{B_{12}(z^{-1})}{A_{ua}(z^{-1})} \\[6pt]
\frac{B_{21}(z^{-1})}{A_{ia}(z^{-1})} & \frac{B_{22}(z^{-1})}{A_{ia}(z^{-1})}
\end{bmatrix}
\begin{bmatrix} U_{dc}(z) \\ I_{dc}(z) \end{bmatrix}
$$

where $A_{ua}(z^{-1}) = 1 + d_{ua1}z^{-1} + \dots + d_{ua n_1}z^{-n_1}$, $B_{11}(z^{-1}) = e_{ua0} + e_{ua1}z^{-1} + \dots + e_{ua n_1}z^{-n_1}$, etc.
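To make the difference-equation form concrete, the sketch below runs a free simulation of one ARX output channel from the two DC inputs. The function name `simulate_arx`, the zero initial conditions, and the coefficient values are assumptions chosen for illustration; they are not identified inverter parameters.

```python
import numpy as np

def simulate_arx(u_dc, i_dc, d, e, f):
    """Free-run simulation of one ARX output channel.

    d = [d_1, ..., d_n], e = [e_0, ..., e_n], f = [f_0, ..., f_n].
    Samples before k = n are taken as zero (an assumption of this sketch).
    """
    u_dc = np.asarray(u_dc, dtype=float)
    i_dc = np.asarray(i_dc, dtype=float)
    d, e, f = (np.asarray(c, dtype=float) for c in (d, e, f))
    n = len(d)
    y = np.zeros(len(u_dc))
    for k in range(n, len(u_dc)):
        y_past = y[k - n:k][::-1]          # y(k-1), ..., y(k-n)
        u_win = u_dc[k - n:k + 1][::-1]    # u_dc(k), ..., u_dc(k-n)
        i_win = i_dc[k - n:k + 1][::-1]    # i_dc(k), ..., i_dc(k-n)
        # Move the autoregressive terms to the right-hand side of the ARX equation.
        y[k] = -(d @ y_past) + e @ u_win + f @ i_win
    return y

# Hypothetical first-order channel: A(z^-1) = 1 - 0.5 z^-1, B_11(z^-1) = 1.
u_dc = np.ones(50)
i_dc = np.zeros(50)
y = simulate_arx(u_dc, i_dc, d=[-0.5], e=[1.0, 0.0], f=[0.0, 0.0])
print(y[:4], y[-1])
```

For a unit step on the first input, the output settles at the DC gain $B_{11}(1)/A_{ua}(1) = 1/(1-0.5) = 2$, which is a quick sanity check on the sign convention of the $d$ coefficients.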

Recursive Parameter Estimation Formulation

To apply RLS, we rewrite the ARX equation in linear regression form. For the phase-A voltage output at time $k$:

$$
u_a(k) = \mathbf{h}_{ua}^T(k) \boldsymbol{\theta}_{ua} + \varepsilon_{ua}(k)
$$

where:

$\boldsymbol{\theta}_{ua} = [-d_{ua1}, \dots, -d_{ua n_1}, e_{ua0}, \dots, e_{ua n_1}, f_{ua0}, \dots, f_{ua n_1}]^T$ is the parameter vector.
$\mathbf{h}_{ua}(k) = [u_a(k-1), \dots, u_a(k-n_1), u_{dc}(k), \dots, u_{dc}(k-n_1), i_{dc}(k), \dots, i_{dc}(k-n_1)]^T$ is the regression vector containing past outputs and current/past inputs.
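As an illustration of the ordering defined above, the following sketch assembles the regression vector for one output channel from sampled arrays. The helper name `regressor` and the numeric arrays are hypothetical; only the element ordering is taken from the definitions of $\boldsymbol{\theta}_{ua}$ and $\mathbf{h}_{ua}(k)$.

```python
import numpy as np

def regressor(y, u_dc, i_dc, k, n):
    """Build h(k) = [y(k-1..k-n), u_dc(k..k-n), i_dc(k..k-n)] for k >= n."""
    return np.concatenate((
        y[k - n:k][::-1],          # past outputs, most recent first
        u_dc[k - n:k + 1][::-1],   # current and past DC voltage samples
        i_dc[k - n:k + 1][::-1],   # current and past DC current samples
    ))

# Hypothetical sample records, chosen so each entry is easy to trace.
y = np.arange(10.0)            # stands in for u_a(0), ..., u_a(9)
u_dc = 10.0 + np.arange(10.0)
i_dc = 20.0 + np.arange(10.0)

h = regressor(y, u_dc, i_dc, k=5, n=2)
print(h)        # [ 4.  3. 15. 14. 13. 25. 24. 23.]
print(len(h))   # 3 * n + 2 entries, matching the length of theta
```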

The RLS algorithm provides an efficient way to update the parameter estimate $\hat{\boldsymbol{\theta}}_{ua}(k)$ as new data $\{\mathbf{h}_{ua}(k), u_a(k)\}$ arrives:

$$
\begin{aligned}
\text{Gain Update:} \quad & \mathbf{K}_{ua}(k) = \frac{\mathbf{P}_{ua}(k-1) \mathbf{h}_{ua}(k)}{\lambda + \mathbf{h}_{ua}^T(k) \mathbf{P}_{ua}(k-1) \mathbf{h}_{ua}(k)} \\
\text{Parameter Update:} \quad & \hat{\boldsymbol{\theta}}_{ua}(k) = \hat{\boldsymbol{\theta}}_{ua}(k-1) + \mathbf{K}_{ua}(k)\left[u_a(k) - \mathbf{h}_{ua}^T(k) \hat{\boldsymbol{\theta}}_{ua}(k-1)\right] \\
\text{Covariance Update:} \quad & \mathbf{P}_{ua}(k) = \lambda^{-1} \left[ \mathbf{I} - \mathbf{K}_{ua}(k) \mathbf{h}_{ua}^T(k) \right] \mathbf{P}_{ua}(k-1)
\end{aligned}
$$

Here, $\mathbf{P}_{ua}(k)$ is the error covariance matrix, and $\lambda$ is the forgetting factor ($0 < \lambda \leq 1$). A forgetting factor less than 1 (e.g., 0.995) is crucial for tracking the time-varying dynamics of the solar inverter under changing environmental conditions, as it gradually discounts old data. The same set of equations runs in parallel for all six output channels of the solar inverter, each maintaining its own parameter vector $\hat{\boldsymbol{\theta}}$ and covariance matrix $\mathbf{P}$.
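The three update equations translate almost line by line into code. The sketch below is one possible NumPy rendering for a single output channel; the function name `rls_step` and the synthetic test problem (random regressors with a known parameter vector) are assumptions made for illustration and do not represent inverter data.

```python
import numpy as np

def rls_step(theta, P, h, y, lam=0.995):
    """One RLS update with forgetting factor `lam`."""
    Ph = P @ h
    K = Ph / (lam + h @ Ph)                    # gain update
    theta_new = theta + K * (y - h @ theta)    # parameter update
    P_new = (P - np.outer(K, h) @ P) / lam     # covariance update
    return theta_new, P_new

# Synthetic check: recover a known parameter vector from noisy samples.
rng = np.random.default_rng(0)
theta_true = np.array([0.8, -0.3, 1.5])
theta = np.zeros(3)
P = 1e6 * np.eye(3)   # large initial covariance, as in the initialization step
for _ in range(200):
    h = rng.standard_normal(3)
    y = h @ theta_true + 0.01 * rng.standard_normal()
    theta, P = rls_step(theta, P, h, y)
print(theta)
```

Running six such recursions side by side, one per output channel, gives the parallel structure described above, since each channel keeps its own `theta` and `P`.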

Identification Procedure and Model Order Selection

The overall identification procedure involves an outer loop for model order selection and an inner loop for RLS execution.

Initialization: Set initial parameter vector $\hat{\boldsymbol{\theta}}(0)$ to a small random vector or zero, and the covariance matrix $\mathbf{P}(0)$ to a large positive definite matrix (e.g., $\eta \mathbf{I}$, where $\eta = 10^6$).
Outer Loop (Order Selection): Start with a low model order $n=1$.
Inner Loop (Online RLS): For each time step $k = 1, 2, \dots, N$ (total data points):
- Construct the regression vector $\mathbf{h}(k)$ using measured data up to time $k$.
- Execute the RLS update equations to obtain $\hat{\boldsymbol{\theta}}(k)$.
Error Calculation: After processing the dataset, simulate the identified model (using the final parameters or the time-varying series) and calculate the RMSE against the true output.
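Order Increment: Increment the model order $n = n+1$ and repeat the inner loop and the error calculation, re-initializing $\hat{\boldsymbol{\theta}}(0)$ and $\mathbf{P}(0)$ because the parameter vector grows with $n$. The optimal order $n^*$ is chosen as the lowest order beyond which the RMSE does not improve significantly or starts to increase due to overfitting.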
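The procedure above can be sketched as a pair of nested loops. Several choices here are assumptions rather than details given in the text: the RMSE is computed from one-step-ahead predictions with the final parameter vector, "does not improve significantly" is taken to mean a relative RMSE reduction below `min_gain` (5%), and the data come from a synthetic first-order system instead of inverter measurements.

```python
import numpy as np

def regressor(y, u_dc, i_dc, k, n):
    return np.concatenate((y[k - n:k][::-1],
                           u_dc[k - n:k + 1][::-1],
                           i_dc[k - n:k + 1][::-1]))

def identify(y, u_dc, i_dc, n, lam=0.999, eta=1e6):
    """Inner loop: run RLS over the whole record for a fixed order n."""
    size = 3 * n + 2
    theta, P = np.zeros(size), eta * np.eye(size)
    for k in range(n, len(y)):
        h = regressor(y, u_dc, i_dc, k, n)
        Ph = P @ h
        K = Ph / (lam + h @ Ph)
        theta = theta + K * (y[k] - h @ theta)
        P = (P - np.outer(K, h) @ P) / lam
    return theta

def prediction_rmse(y, u_dc, i_dc, theta, n):
    """One-step-ahead prediction RMSE using the final parameters."""
    errors = [y[k] - regressor(y, u_dc, i_dc, k, n) @ theta
              for k in range(n, len(y))]
    return float(np.sqrt(np.mean(np.square(errors))))

def select_order(y, u_dc, i_dc, max_order=3, min_gain=0.05):
    """Outer loop: keep raising the order while the RMSE still drops noticeably."""
    scores = {}
    for n in range(1, max_order + 1):
        theta = identify(y, u_dc, i_dc, n)
        scores[n] = prediction_rmse(y, u_dc, i_dc, theta, n)
    best = 1
    while best < max_order and scores[best + 1] < (1.0 - min_gain) * scores[best]:
        best += 1
    return best, scores

# Synthetic first-order system with two inputs and small equation noise.
rng = np.random.default_rng(1)
N = 1500
u_dc = rng.standard_normal(N)
i_dc = rng.standard_normal(N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = (0.6 * y[k - 1] + 0.5 * u_dc[k] + 0.2 * u_dc[k - 1]
            + 0.3 * i_dc[k] - 0.1 * i_dc[k - 1] + 0.05 * rng.standard_normal())

best, scores = select_order(y, u_dc, i_dc)
print(best, scores)
```

The stopping threshold is a design choice: a smaller `min_gain` favors higher orders, while a larger one favors the most compact model.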

A key advantage observed in our work with solar inverters is that the optimal model order $n^*$ is typically low (often 1 or 2). This leads to very compact models with few parameters, resulting in minimal memory footprint and extremely fast real-time execution, which is ideal for digital twin applications or embedding in grid simulation tools.

Experimental Case Study: Modeling a Three-Phase Solar Inverter

To validate the proposed RLS-based identification method, we conducted a detailed numerical case study using a simulated 100 kW grid-connected PV system. The solar inverter was a standard three-phase VSC with voltage-oriented control (VOC).

Test Data Generation Under Various Conditions

Real-world irradiance and temperature profiles from both summer and winter seasons were used to drive a high-fidelity simulation model in MATLAB/Simulink. This generated realistic input-output data for the solar inverter. Data was collected for three typical weather scenarios per season:

Clear Sunny Day
Cloudy/Overcast Day
Rainy (Summer) / Snowy (Winter) Day

For each scenario, the DC-side voltages/currents and AC-side three-phase voltages/currents were recorded at a 1 kHz sampling rate over the main daylight hours (e.g., 08:00-18:00). This resulted in long, time-varying datasets (over 100,000 samples each) representing the dynamic operation of the solar inverter. The key parameters of the benchmark solar inverter system are summarized below.

Table 1: Parameters of the Benchmark Three-Phase Solar Inverter System
Component	Specification / Parameter
System Configuration	Two-stage (DC/DC Booster + DC/AC Inverter)
PV Array	Composed of SPR-305-WHT modules
MPPT Algorithm	Perturb & Observe (P&O)
Inverter Topology	Three-phase, two-level Voltage Source Converter (VSC)
Control Strategy	Voltage-Oriented Control (VOC) with PI regulators
DC-Link Voltage (Rated)	500 V
AC Output Voltage (Phase, RMS)	120 V (209 V peak)
AC Output Current (Rated, RMS)	175 A (308 A peak)

Model Identification Results and Performance Analysis

The RLS algorithm with a forgetting factor $\lambda=0.999$ was applied to each of the six weather datasets. The first task was determining the optimal model order for each output channel. The following table shows the RMSE for different orders during the identification phase for the phase-A voltage and current models.

Table 2: Model Order Selection Based on Identification RMSE for Phase-A Outputs
Season	Output	Order=1	Order=2	Order=3	Optimal Order (n*)
Summer	Voltage ($u_a$)	0.599	>100	>100	1
Summer	Current ($i_a$)	1.722	>100	>100	1
Winter	Voltage ($u_a$)	0.357	>100	>100	1
Winter	Current ($i_a$)	3.786	2.906	2.681	3

The results show that a first-order model is sufficient for three of the four phase-A channels listed. Only the winter current benefits from a higher order ($n^* = 3$), where the RMSE keeps decreasing up to order 3. For the other channels, raising the order leads to overfitting and numerical instability, indicated by RMSE values above 100.

Using the optimal orders, the final identified models were simulated. The simulated outputs showed excellent agreement with the measured test data. The following table summarizes the maximum relative errors observed during the simulation for the phase-A outputs across all weather scenarios.

Table 3: Maximum Relative Error of Identified Solar Inverter Models for Phase-A
Season	Weather	Voltage Max Positive Error	Voltage Max Negative Error	Current Max Positive Error	Current Max Negative Error
Summer	Sunny	+0.50%	-1.10%	+1.30%	-21.20%
Summer	Cloudy	+12.10%	-13.00%	+17.60%	-79.90%
Summer	Rainy	+0.50%	-1.10%	+2.10%	-80.90%
Winter	Sunny	+0.40%	-0.60%	+5.60%	-56.00%
Winter	Cloudy	+0.40%	-0.50%	+5.40%	-55.90%
Winter	Snowy	+0.50%	-0.90%	+1.60%	-22.00%

Key observations from the results are:

High Accuracy for Voltage: The voltage output models are highly accurate, with relative errors within about ±1.1% in five of the six scenarios; the summer cloudy day is the exception, with errors of roughly ±13%. This is because the AC voltage is tightly regulated by the solar inverter’s control system to follow the grid voltage.
Dynamic Tracking of Current: The current output models show higher relative errors, particularly during transient periods like rapid cloud passages (seen in cloudy/rainy scenarios). The error magnitude is larger when the current value is small (e.g., near zero at dawn/dusk). However, the absolute error remains acceptable for dynamic studies. The RLS algorithm successfully tracks the significant variations in current caused by changing solar power input.
Parameter Evolution: The time-varying nature of the identified parameters was clearly observed. Parameters remained relatively constant during steady irradiation but changed rapidly yet smoothly in response to step changes in DC power, demonstrating the adaptive capability of the online RLS method for solar inverter modeling.
General Applicability: The method performed robustly across all six datasets, covering different seasons and weather patterns, which supports its applicability to modeling solar inverters under diverse operating conditions.
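The observation that relative errors grow when the current is small can be reproduced with a short calculation. The sketch assumes the relative error is defined as (simulated − measured) / |measured|, which the text does not state explicitly; the function name `max_relative_errors` and the sample values are hypothetical.

```python
import numpy as np

def max_relative_errors(measured, simulated, floor=1e-9):
    """Return (max positive, max negative) relative error in percent.

    Assumption: relative error = (simulated - measured) / |measured|.
    Samples with |measured| below `floor` are skipped to avoid division by zero.
    """
    measured = np.asarray(measured, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    keep = np.abs(measured) > floor
    rel = 100.0 * (simulated[keep] - measured[keep]) / np.abs(measured[keep])
    return float(rel.max()), float(rel.min())

# The same 0.5 A absolute error at a large and at a small current value.
measured = [100.0, 1.0]
simulated = [100.5, 0.5]
max_pos, max_neg = max_relative_errors(measured, simulated)
print(max_pos, max_neg)
```

An absolute error of 0.5 A is 0.5% of a 100 A sample but 50% of a 1 A sample, which is consistent with the large negative current errors in Table 3 occurring alongside small voltage errors.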

Discussion, Advantages, and Future Extensions

The proposed RLS-based black-box identification presents a powerful alternative to physics-based modeling of solar inverters. Its primary advantage lies in its simplicity and practicality. System operators or researchers do not need detailed circuit diagrams or proprietary control code of the solar inverter. The model is derived solely from standard, accessible measurements (DC and AC side voltages/currents), which are routinely available in modern PV plants via SCADA systems.

The online, recursive nature of the algorithm is another significant strength. It enables the creation of a dynamic or adaptive model of the solar inverter. As the operating point shifts due to daily and seasonal cycles, or as the solar inverter’s performance degrades over time, the model parameters can be continuously updated to reflect the current behavior. This is ideal for digital twin applications, real-time stability assessment, and predictive maintenance.

Furthermore, the low order of the identified models leads to high computational efficiency. Simulating a simple difference equation model is orders of magnitude faster than simulating a detailed switching model of a solar inverter. This makes large-scale simulation studies involving hundreds or thousands of solar inverter units computationally feasible.

However, some limitations and future research directions are noted:

Data Saturation and Forgetting Factor: The standard RLS can suffer from “data saturation,” where it becomes less sensitive to new data after processing a large number of samples. The introduction of a forgetting factor ($\lambda$) mitigates this, as used in our study. Tuning $\lambda$ provides a trade-off between tracking agility and parameter estimate stability: the effective memory is roughly $1/(1-\lambda)$ samples, so $\lambda = 0.999$ weights about the last 1,000 samples (about 1 s at the 1 kHz sampling rate used here).
Excitation Requirement: The identification quality depends on the input data being sufficiently “exciting,” i.e., containing rich dynamic variations. Periods of constant, steady irradiation may not provide enough information to identify all dynamics accurately. Data from days with varying clouds or from tests with deliberate small grid disturbances are ideal.
Future Enhancements: The method can be extended in several ways:
1. Using more sophisticated model structures like ARMAX or state-space models to better handle specific noise characteristics.
2. Incorporating regularization techniques to improve numerical robustness during online updates.
3. Developing methods to physically interpret the identified black-box parameters in relation to the solar inverter’s internal control gains and filter time constants.
4. Validating the approach with field measurement data from commercial solar inverter units of various topologies and power ratings.
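The excitation requirement noted above can be checked numerically before running the identification. One common heuristic, assumed here because the text does not prescribe a specific test, is the condition number of the information matrix $\mathbf{H}^T\mathbf{H}$ built from stacked regression vectors. For brevity the sketch stacks only the two current input samples per row; the name `excitation_condition_number` and all signal values are hypothetical.

```python
import numpy as np

def excitation_condition_number(H):
    """Condition number of the information matrix H^T H (rows of H are regressors)."""
    H = np.asarray(H, dtype=float)
    return float(np.linalg.cond(H.T @ H))

rng = np.random.default_rng(0)
N = 500

# Steady irradiation: both DC inputs are constant, so the columns are collinear.
steady = np.column_stack((np.full(N, 500.0), np.full(N, 10.0)))

# Passing clouds: both inputs fluctuate independently around their mean.
varying = np.column_stack((500.0 + 5.0 * rng.standard_normal(N),
                           10.0 + 2.0 * rng.standard_normal(N)))

cond_steady = excitation_condition_number(steady)
cond_varying = excitation_condition_number(varying)
print(cond_steady > cond_varying)
```

A very large or infinite condition number indicates that some parameter directions cannot be distinguished from the data, which is the situation described for periods of constant irradiation.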

Conclusion

In this work, we have successfully developed and demonstrated a recursive least squares-based method for the dynamic model identification of three-phase solar inverters. By adopting a black-box perspective, we treat the solar inverter as a 2-input, 6-output dynamic system. The RLS algorithm provides an efficient mechanism to identify a low-order discrete-time model (ARX type) online, using only measured DC input and AC output data. The case study, encompassing summer and winter data under sunny, cloudy, and rainy/snowy conditions, confirmed the method’s high accuracy, robustness, and broad applicability. The voltage models were particularly precise, while the current models effectively captured the dynamic power variations. The resulting models are computationally compact and adaptable, making them highly suitable for integration into large-scale power system simulation tools, real-time monitoring platforms, and digital twins for modern grids with high penetration of solar photovoltaic generation. This data-driven approach offers a practical and effective pathway for obtaining accurate dynamic representations of solar inverters without requiring intimate knowledge of their internal design.