Correlated power time series of individual wind turbines: A data driven model approach

Wind farms can be regarded as complex systems that are, on the one hand, coupled to the nonlinear, stochastic characteristics of weather and, on the other hand, strongly influenced by supervisory control mechanisms. One crucial problem in this context today is the predictability of wind energy as an intermittent renewable resource with additional non-stationary nature. In this context, we analyze the power time series measured in an offshore wind farm for a total period of one year with a time resolution of 10 min. Applying detrended fluctuation analysis, we characterize the autocorrelation of power time series and find a Hurst exponent in the persistent regime with cross-over behavior. To enrich the modeling perspective of complex large wind energy systems, we develop a stochastic reduced-form model ofpower time series. The observed transitions between two dominating power generation phases are reflected by a bistable deterministic component, while correlated stochastic fluctuations account for the identified persistence. The model succeeds to qualitatively reproduce several empirical characteristics such as the autocorrelation function and the bimodal probability density function.


Introduction
In the context of anthropogenic climate change, the challenge of reducing carbon emissions is of central importance. Renewable sources of energy are considered to be one of the most promising solutions in the electricity sector to cover an increasing energy demand without exacerbating the high carbon emissions coupled to it. Wind energy in particular appears to be one of the most strongly increasing sources of renewable energy [1,2] but demands an extraordinary adaption of grids and related power systems due to its intermittent nature [3,4]. It consequently raises the need for a profound understanding of this intermittency and the opportunity to perform extensive studies on the reliability of power systems by suitable models. Such models can only be calibrated with respect to empirical data while different approaches are required to reflect features on multiple spatial and temporal scales [5,6]. Moreover, they can be used to study the impact of certain well known statistical characteristics on the resulting dynamics. This provides wind farm controllers with a rich set of tools to analyse variable scenarios taking influences into account that are known to be of importance.
In this work, we focus on the dynamics of single wind turbine (WT) power generation in an offshore wind farm. While especially power generation of aggregated wind farms or even complexes of several wind farms have received considerable attention, the challenge of intermittent and stochastic characteristics is particularly high on the scale of single turbines [7]. Still, it does not vanish for larger units like wind farms or national wide wind power generation [8]. How generated power fluctuations from geographically separated wind farms are dampened when aggregated [9,10] is of crucial importance for large scale grid stability [11,12].
Our perspective on this problem is to study the autocorrelation of power time series as an indicator of their intermittent and nonstationary nature in a first step. In most modelling approaches, information on the complex temporal evolution of variations is included and hence autocorrelation is of great general importance. Nonstationarity is another feature that is frequently encountered dealing with complex systems [13]. We address this within the analysis of power output correlations using the popular method of Detrended Fluctuation Analysis (DFA) [14]. It can deal with polynomial nonstationarities in a generalized fashion. The fluctuations around these polynomial trends are subject to the correlation analysis which yields a Hurst exponent as an indicator for the strength and nature of the autocorrelation.
The idea that power time series should exhibit scaling behaviour is based on the finding that the underlying wind speed dynamics are governed by atmospherical turbulence [15]. The traditional tool to capture this aspect is the power spectrum obtained by Fourier analysis [16]. Several studies have examined the power spectral density both for wind speed [17,18] and power time series [19,20], also uncovering how aggregation on different spatial scales impacts power fluctuations [21,10]. Other classical methods range from structure functions to estimators for the Hurst exponent based on scale dependent variance calculation on the respective time series [22]. Still, these methods do not take the above mentioned nonstationarity into account. To this extent, several works study the autocorrelation of wind speed time series u(t) applying DFA and Multifractal DFA (MFDFA) and conclude that it behaves persistent with a multifractal dependence [23,24,25]. While most research in this context focuses on local wind speed measurements of single sites, some studies reveal multifractality of wind speed on a larger spatial scale. For instance, [26] incorporates wind speed data spatially distributed all over Switzerland and finds multifractal dependence of persistence for the cooperative behaviour in the context of monitoring systems. Similar research for power time series appears more limited. The authors of [27,28] reveal a high degree of multifractality for both wind speed and power time series of an aggregated wind farm and join both in a description via generalized correlation functions. Furthermore, aggregated power output of wind farms is known to show complex correlations in terms of persistence and multifractality. The authors in [29] uncover multifractality for power time series of an aggregated wind farm in South Australia with data on a similar time scale. They classify power time series in the persistent regime, especially on short time scales of several minutes. Yet to our best knowledge, no research has employed DFA or multifractal methods focusing on single WTs' power output.
After we have obtained a better understanding of autocorrelations, we put forward a model based on these theoretical insights and an important feature of the empirical probability density function (PDF) of power output. A broad range of approaches is considered for wind power generation models in the respective literature. A general distinction of models can be based on whether power is directly or indirectly modelled, e.g. by mapping certain variables like measured wind speed on power via a distinct transformation. Our work contributes to direct power modelling of time series for single WTs. Most models aim at finding a precise point forecast of time series for a certain time scale and horizon [30]. In contrast to that, other effort is put into modelling different properties related to power output such as the power curve [31,32], wind power ramps [33] or power density estimates [34]. Models also vary in terms of the time scale [35] that ranges from ultra-short-term (ms-s) [36] to long-term forecasts (months) [37].
The model proposed in this work aims at modelling short-term power time series without the objective to give precise point forecasts. Instead, we aim at deepening the systemic understanding of a WT due to its stochastic nature on the one hand and the impact of control mechanisms (e.g. curtailment) on the other hand. Following this motivation, the model is based on a single nonlinear stochastic differential equation that is split into two components in a Langevin fashion. Since WTs are directly coupled to the complex atmospherical dynamics of wind, it is paramount to incorporate a stochastic component that allows for a certain degree of complex diffusive behaviour. A sufficient approach to include such complex diffusive dynamics without a loss of simple applicability is given by Fractional Gaussian Noise (FGN) [38]. Time series generated as FGN entail a certain degree of correlation and yield Fractional Brownian Motion when cumulated. Thus we are able to reflect on the results from the correlation analysis from a model perspective. The second component of our model takes the impact of control mechanisms on power output into account. We address this feature in two steps: a deterministic component in the differential equation accounts for the characteristical shape of the PDF as a first mechanism to focus the power values around the respective fixed points. The second step incorporates control mechanisms such as curtailing in a simplified numerical fashion. Finally, we include a simple first approach to account for a time dependent seasonal drive of power output aswell. By constructing our model in such a way, its easily distinguishable components and parameters give way to a better understanding of how certain theoretical features affect power time series qualitatively, such as the degree of autocorrelation. It further yields the opportunity to test parametrized scenarios and may be used in large power network simulations based on simplified models of single wind turbine dynamics.
The paper is organised as follows: We present the data set and perform a cleansing procedure on it in sec. 2 such that we can get a first impression of its fundamental characteristics afterwards. In sec. 3, we identify the autocorrelation of power time series via the traditional autocorrelation function and the method of DFA. The stochastic model we introduce in sec. 4 is based on these empirical findings and will enlarge upon our understanding of how varying autocorrelations have an impact on fundamental statistical features by comparison with the data. When we present the results, we will find a sufficient agreement between the empirical and the modelled features. We summarize our results in sec. 5.

Data Treatment and Characteristic Features
We briefly introduce the data set in 2.1. The data cleansing procedure described in 2.2 is important to focus only on a reasonable subset of the empirical data [39]. Finally, we get a first glance of the most substantial characteristic features of the wind farm data in 2.3.

Data Set
The data set we analyse comprises time series of 30 WTs located at the German offshore windfarm RIFFGAT. Several observables are measured via a SCADA (Supervisory Control and Data Acquisition) system, however we will focus only on the active power output of the WTs. The respective time series cover a total period of one year between 01/03/2014 and 28/02/2015. All analysis is carried out on ten minute average values calculated from data points measured with 1 Hz frequency. Thus the timestamp precision is limited to ten minutes and we obtain 52560 values in total.

Data Cleansing
In the following we briefly present the applied data cleansing procedure. We detect outliers and ensure that we only consider a reasonable subset of the initially measured data. Every data value we consider to be erroreneous will be set to NA (not available) and is not included in any upcoming analysis. In a first step we ensure that there are no redundant timestamps in any time series and look for consecutive identical values. Since one ten minute average value is based on 600 measured values, it can be rated as a highly unlikely case that two consecutive 10-minute averages are identical with a five digit precision in the data. This could only be conceivable if e.g. constant rated power is generated for ten minutes without any variation. To rule out this case, we also took the maximum and minimum values for the respective ten minute intervals into account.
Apart from the ten minute average values, we inspect the respective standard deviations. Any data value with a vanishing ten minute standard deviation is set to NA. In a last step we analyse if there are unreasonable changes of consecutive power values which we will call power increments. We consider each power increment to be unreasonably high (regardless of its direction) if it meets both of the following criteria: the power increment Ξ + = (P(t + 1) − P(t)) P + relative to the so called rated power output P + of the WT is higher than a certain threshold Ξ 0 and the respective minimum power P min in a ten minute time interval is higher than the maximum power P max in the previous time interval by a certain factor q: P min > qP max which yields unphysical data. We choose the threshold value Ξ 0 and the factor q in a way that limits extreme power increments to a typical value found in the respective literature [40]. This yields Ξ 0 = 0.67 and q = 0.99. The higher these values are chosen, the more unphysical ramps are still kept in the data which results in a higher number of strong jumps between low and high power generation. If we apply a too strict choice, some of the true strong ramps that resemble intermittent fluctuations are spuriously eliminated, also biasing results for temporal correlations. After applying the stated cleansing steps, we obtain 9.58% of NA values in the data. We will only consider pairs of values containing no NA value in every calculation of correlations between time series.

Bimodality and Power Increments
To achieve a sufficient understanding of wind power data, some basic facts about the conversion of wind speed u into active power output P and the control of WTs have to be outlined [41,42]. Although an increase in wind speed obviously leads to a higher gain of generated power in general, WTs only run within a certain operating range. This limitation is due to the finite performance of power generators. In fact, the operation of WTs is limited by a lower cut-in value u − and an upper cut-off value u + of wind speed u. Below u − , there is simply not enough wind energy for an economic use of the turbine so that P = P − (= 0 kW ). When u exceeds u + , power is controlled to remain constant at the rated power output P = P + (= 3600 kW ). For even higher wind speeds, it becomes essential to avoid mechanical damage and the WTs are turned down by a continuous adjustment of the rotor blades. Within u − ≤ u ≤ u + , the generated power of an ideal WT increases proportionally to u 3 . For this work the most important conclusion to draw from these control mechanisms is that power time series have to be at least in parts artificially flattened. We expect power values to be constant at P = P − or P = P + for certain time periods.
This manifests itself in fig.1a as a striking bimodal pattern in the five displayed time series and a striking bimodality of the empirical PDF in fig.1b, visualized as a histogram. Most power values are concentrated around zero power generation P − and active power output P + . In fact, the intervals 50 kW < P < 200 kW and 3550 kW < P < 3800 kW around the two peaks sum up to 41.62% of all values. This feature also governs the dynamics observed from the time series and is observed in several different works [43,34,44,45]. Nevertheless, other analyses [46,47] also show unimodal distributions around P − or flatter, less concentrated distributions. While the first observation is related to the different efficiencies of WTs, the latter is mostly found for aggregated power of several wind farms where WTs with different rated power outputs are combined. Despite of this bimodal shape of the PDF, also values exceeding the rated power can be observed. Finally, strong downward ramps to zero power output can be observed for some of the WTs.  We will characterize such power ramps by the increments Ξ k (t) = P k (t + ∆t) − P k (t) which are a fundamental property for the understanding and management of wind farms. We define them as the standardized, dimensionless differences For our data, we only analyse ∆t = 10 min. A visual inspection of fig.2a shows an example of this property for one WT in a time period of two weeks. Apparently, increments of similar size cluster in time, indicating some degree of correlation. This generally indicates that simple random walk models do not yield a sufficient description that captures the complex temporal correlations of time series. Power seems to fluctuate symmetrically but clearly in a non-Gaussian fashion as it can be seen in the PDF of all Ξ k (t) for the total time period in fig.2b. Here we compare the empirical distribution to a Gaussian (black) with the same mean value and standard deviation. For the aggregated wind farm, the strong fluctuations appear slightly dampened. These findings are in accordance with results in the literature for increments on even shorter time scales [8].  If the PDF deviates from Gaussain statistics we speak of intermittent behavior after Kolmogorov 1962, expressed by a heavy tailed distribution and multifractal statistics. Note this turbulent intermittency term is not the same as the alternative denotation of intermittency, i.e. nonstationary time series switching between different flow states like between laminar and turbulent flows.

Autocorrelation
Both the bimodality of power, persisting at values around zero and rated power output, and the complex dynamics of power increments suggest that an analysis of correlations can be fruitful. As a first step, we display linear autocorrelations for power time series P k (t).
To this extent, we use the Pearson Correlation with a time delay τ defined by Quantifying autocorrelations of power time series yields valuable information on how power can be generally modelled (sec. 4). In this paper, we only consider autocorrelation of power time series. With Θ(τ) as a correlation measure we only account for linear dependence. Moreover, it is sensible to outliers and only gives sufficient information for time series with finite variance. Figure 3a shows Θ(τ) for power time series of all 30 WTs. It is shown up to a lag of seven days which is approximately the point where they drop below a significant level (dashed horizontal lines). As such we use a simple surrogate approach and shuffle the time series, eliminating temporal information but preserving the PDF. We identify a (constant) confidence band by calculating its width as 2σ of the autocorrelation after the shuffeling process, averaged over all time series. The autocorrelation of all P k (t) slowly decreases over three orders of magnitude and thus indicates long-range dependence. Nevertheless, Θ(τ) does not show a typical power law decay but runs through several local maxima, reflecting the inherent nonstationarity. In [48] an explanation for this observation is provided that corroborates our results: the autocorrelation of wind speed decreases with slightly visible maxima due to weather related seasonalities. Since power is closely coupled to wind speed and persists at an almost constant level for values around u − and u + , the local maxima are not only sustained but amplified. The displayed autocorrelations do finally not show significant differences between single WTs even though it is known that relative positions of WTs play a role for power generation, e.g. through wind shear effect [49]. Such differences could potentially become visible in lagged cross-correlations between WTs of varying relative position which is not within the scope of this work though.  The scaling behaviour of power time series is expected to be related to that of wind time series. For the latter, it is well known that the Fourier power spectrum obtained yields a power law decay E( f ) ∝ f −β with β = 5 3 estimated through linear regression in the log-log plot. This finding is in accordance with results one obtains from Kolmogorov's turbulence model. A simple way to confirm that similar scaling can be found for the power time series, we estimate β from a respective Fourier power spectrum in fig.3b. To this extent, we compute a smoothed spectrum (black) by splitting the full time series P(t) of an exemplary WT into ten chunks and average over spectral power values obtained for each of the single segments (example in gray). The linear MLE regression (orange) yields β = 1.58 ≈ 5 3 within a linear region of sufficient width and thus matches both the expected universal scaling (red) and results from similar data well [50].
The presence of nonstationarity leads to biased or spurious detection of autocorrelations [51]. To uncover autocorrelations in presence of nonstationary we apply DFA [52] to the power time series P k (t). The main idea of DFA is to eliminate nonstationarity in a generalized fashion by subtracting polynomial trends and consequently focusing on correlations that are present in the remaining noise. The method aims at calculating the Hurst exponent H, introduced by Hurst in 1951 [53]. We distinguish between persistent (0.5 < H < 1) and antipersistent (0 < H < 0.5) behaviour of a time series. The special case H = 0.5 yields diffusive behaviour i.e. uncorrelated white noise with X H (t)X H (t ) = 0 for the time series X(t) at arbitrary times t and t . Cumulated white noise entails Brownian Motion. The persistent regime H > 0.5 results in super-diffusive dynamics and long-range dependence of the increments X(t). Antipersistence yields the opposite case, i.e. a time series changes its direction more frequently than a diffusive time series. The respective extension of a white noise process is called Fractional Gaussian Noise (FGN) and will be adressed later.
We now briefly sketch the method of DFA and refer to [52] for a more detailed description. As a first step we calculate the integrated, mean adjusted power time series. A time series of equal length is obtained that is splitted into N s = T /s disjunct subsets of equal length s. The brackets round the value of T /s down to an integer value. We repeat this step with the reversed time series to incorporate all subsets. Subsequently, a polynomial quadratic detrending via MLE-regression (Maximum Likelihood Estimation) is applied for all subsets respectively. We then calculate the standard deviation F ν (s) of all detrended subsets and from that derive the average standard deviation. Repeating this calculation for different s, we obtain the fluctuation function that represents the scale dependent fluctuation strength. We estimate the scaling exponent α from F(s) ∝ s α empirically via linear regression in a double-logarithmic plot since we expect F(s) to increase as a power law. For stationary time series (e.g. FGN [22]), we can identify α with the Hurst exponent H. In the more general case frequently encountered for real data, even for the detrended time series some nonstationarity remains present. In this case, the Hurst exponent can only be estimated as H ≈ α − 1 which only strictly holds for a Fractional Brownian Motion (FBM) process. This distinction is sometimes not pointed out distinctively in the literature [54]. In general, α is also linked to the Fourier scaling exponent β via β = 2α − 1 = 1 + 2H, enabling us to assess the consistency of our results.
For applications, s must not be chosen too small for a significant outcome. The DFA method is known to cause misleading finite size effects with respect to short time scales s and the applied detrending generally needs to be regarded critically [55]. Another typical issue are values at the limits or outside of the range 0 < α < 1. For α > 1, we have to consider the time series as an integrated process with more complex nonstationarities (H = 1.5 equals Brownian Motion) [56,57]. Apart from that, the resulting increase of F(s) does not have to be completely linear but can contain crossovers with different slopes [58]. In many cases, these crossovers are meaningful and uncover different scaling regions that entail different correlations [14]. As we see in fig The same type of crossover behaviour is also found for hourly wind speed data at geographically different sites [54]. The authors conjecture that this scaling behaviour might arise from the different scales of weather patterns which would manifest itself in a multifractal spectrum of different scaling exponents. Yet, the crossover is not critically reflected on, even though misleading crossovers in DFA are known to appear with several known causes. In [14] it is suggested to test whether different orders of polynomial detrending change the crossover position which we ensured not to occur. Furthermore, the number of NA-values does not have a significant impact on our results as supposed in [59]. Although the scaling law of F(s) is still valid for α > 1 [60], this special case hints that unidentified trends remain after the detrending procedure. Such trends are also known as a source of erroneous crossovers [61,62] which we can not fully rule out. If such trends were the cause, similar trends would also be present in the wind speed data in [54] though. The consequences of such a crossover still seem to be of potential importance for possible modelling approaches which will be adressed in sec. 4. .

Stochastic Model
The identified features of single WT power time series motivate a model approach that could be calibrated with empirical data. We now put forward such a time series modelling approach that has the potential to be used in a larger simulation, aiming at a multi-scale model of wind power generation. In sec. 4.1, we introduce our model approach. We compare our model results to the empirical data in sec. 4.2 to gain first qualitative insights into the scope of the model.

Model Approach
Numerous approaches of power time series P(t) simulations are employed in the literature. A major fraction tends to simulate wind farm power time series only and does not concentrate on single WTs as we intend to do [63,64,65,35]. Another central distinction is the aim of the model. Our reduced-form approach does not intend to reproduce temporally ordered forecasts but only statistical features such as distributions and long-term averages. Several models in the literature manage to give precise forecasts of different time horizons of P(t) using black-box models with a high number of parameters or simple regression parameters that often lack a comprehensive contextual meaning. Typical model approaches in this context are Markov processes [66,48,67] and ARIMA (Autoregressive integrated moving average) processes [37] which both focus on the autocorrelation of P(t). Further approaches are based on nonlinear models [68], stochastic models [43] and stochastic drift-diffusion-models [20]. In contrast to the stochastic process approach in [20] which aims to model the stochastic wind speed / wind power relation in seconds by the estimators of Kramers-Moyal coefficients, we aim here to achieve a stochastic modelling of the power output based on 10 minute values.
We will try to introduce our model parameters with a comprehensible meaning. Furthermore, our model equation itself is not a regression formula but is based on our qualitative understanding of power time series. The simulations are carried out on a 10-minute time scale. The essence of our model comes down to one fundamental stochastic differential equation (SDE) based on central statistical features we identified in sec. 2 and sec. 3. We do not model NA-values separately and aim at simulating the raw uncleansed data that is directly obtained from the SCADA system [69]. For the sake of simplicity all model-related equations are formulated in a notation for continuous systems, yet baring in mind that we are dealing with discrete data. We now briefly introduce the stochastic model equation and summarize the related mathematical concepts. In our modelling framework, we will regard power generation as an autocorrelated stochastic process with a deterministic and a stochastic component. These drift-diffusion-types of models are often expressed by a Langevin equation of the general form dP Here, P(t) denotes the power time series. The first term equals the deterministic drift component with a drift parameter k and a potential function V (P). The second component incorporates stochastic fluctuations ξ H (t) with a constant diffusion strength D. The index H denounces the Hurst exponent which hints at the fact that we can include arbitrary power law-like correlations into our model. Thus, ξ H (t) itself is a solution to a simple FGN-stochastic process [70] with the defining property In this way, we can transfer our empirical knowledge about the autocorrelation of P(t) from sec. 3 to the model approach. Furthermore, we choose the deterministic potential function V (P) in a way that focuses on the time series dynamics around the fundamental fixed points. Since we have observed that power generation mostly concentrates around zero power output P − and rated power P + , we choose a bimodal approach with the double-well potential function While a describes the steepness of the potential, P 0 gives us the position at which it is centered. Such SDEs are used in different applications in the literature and are refered to as a description of an overdamped Brownian particle in a double-well potential [71] in statistical physics. As an illustration, we can think of the underlying dynamics as a particle that would jump between the potential minima, driven by correlated fluctuations. The latter consequently introduce a characteristic time scale for the transitions between the fixed points. Often, such models are extended by a driving periodical force that entails a biased occupation of one fixed point with respect to a certain seasonality. Since power generation from wind energy follows seasonal variations, we incorporate this into our model with the simple approach which is added to eq.4 as the driving force. We determine ω from Fourier analysis as the leading frequency within the spectrum of P(t). A gives us the adaptable strength of the seasonal variations. With this extension, we introduce another characteristic time scale into our approach. The interplay of both this component and the stochastic fluctuations determines the transition dynamics as described by the general phenomenon of stochastic resonance [72]. Our approach effectively biases the bimodal PDF of modelled power towards one of its peaks which reproduces the seasonal variation of wind power generation on a rather basic level. In the parameter estimation of ω, we incorporate data from a specific month to calibrate the seasonality according to the month in the data.
As we have seen in fig.1, the control of the WTs results in extremely narrow peaks of the PDF of P(t). This fact is caused by the intervening external control of power output (curtailment). Our model approach already succeeds to concentrate the power values around P − and P + as fixed points in a similar manner but fails to narrow the peaks down as sharply as required due to the simple analytical approach. Hence we have to incorporate the artificial flattening of time series we observe in the data into our model sufficiently. To do so, we cut off power values beyond the operating range P − ≤ P ≤ P + by setting them to the respective constant threshold value P − or P + . As we observe in the data, these limits are sometimes exceeded in the empirical time series anyway. Consequently, this procedure is only applied with a certain probability p ≷ . Note that the model can not be regarded as a typical Langevin-equation driven model since the correlated noise term ξ H (t) on the one hand and the artificial flattening due to curtailment on the other hand differ strongly from the standard delta-correlation of such models. Taking all explained considerations of our model approach into account, the resulting model formula is with the respective probabilities p ≷ for having a power value beyond the operation range and a N (0, σ ) -distributed random number z. The model includes ten parameters in total. A detailed description of the parameter calibration and all resulting values are given in app. 5. Six parameters (P 0 , ω, p ≷ , σ ≷ ) are calibrated on the data before a time series is to be modelled. P 0 determines around which power value the distribution of values should be centered. The frequency ω should include some limited degree of season-specific variation and is fixed via Fourier analysis. The remaining four parameters p ≷ and σ ≷ calibrate the curtailment and variability of power beyond the operating range. The three parameters a , A and D are optimized such that the model reproduces the empirical statistical features most effectively while avoiding overfitting. The Hurst exponent H will be varied in sec. 4.2 to observe how different autocorrelations affect the generated power.

Results
We get a first glance of the model results by inspecting simulated time series. We vary the Hurst exponent H from diffusive (H = 0.5) to persistent (H = 0.7) up to strongly persistent (H = 0.9) dynamics to account for different degrees of correlation within the model. Thus, our approach does not incorporate the uncovered crossover behaviour that would separate between different autocorrelations below and above s c but only account for scales s < s c in the stochastic component. Yet the deterministic component results in strong ramps that are intended to resemble the observed power ramps that lead to the Hurst exponent H > 1 for s > s c . We always compare the simulated results to only one exemplary WT since the model aims at characterizing the typical dynamics instead of a single specific WT.  The quality of the reproduced PDF shown in fig.6 for different values of H is not as obvious. This time, we choose the sampled time series P(t) whose PDF reproduces the height of the bimodal peaks most sufficiently but still reflects a typical result. While the PDF for H = 0.9 reproduces the bimodality of the power time series in a satisfactory manner, the average values (dashed line) clearly differ. For H = 0.5, peak heights can not be reproduced: a sample of P(t) with uncorrelated fluctuations tends to occupy both fixed points P − and P + with the same frequency. This behaviour is entailed by the enhanced frequency of transitions we observed in fig.5. Yet it fails to give a convincing result for the mean value even though it should be more stably centered for a balanced PDF.
The power increments Ξ k (t) are an essential quantity for the control of WTs and grid stability. Besides the analysis in terms of autocorrelation and power spectra of P k (t) (see below), higher statistical features of the power fluctuations can be grasped by investigating the statistics of the power increments, yielding higher order statistical features of power fluctuations. A comparison of increment statistics between real and modelled data can display strengths and weaknesses of the model to capture the intermittent nature of power generation, expressed by strong correlated fluctuations. To this extent, we compare the quantiles of the model results for the power increments with the empirical data. This empirical test clarifies qualitatively in which way the two distributions generally differ regarding their quantiles. From this we conclude that a certain threshold of persistence has to be exceeded so that big jumps between P − and P + dominate and generate strong heavy tails. Thus, the persistence of power is closely related to its intermittent behaviour regarding its fluctuations. The results we compared so far demonstrated that the model succesfully reproduces important statistical characteristics of power generation with respect to the statistical distribution. The variation of Hurst exponents as a tool to incorporate the temporal correlation of power time series should yet yield a strong impact on the autocorrelation function. Thus, we analyse how well the autocorrelation function Θ(τ) and also the power spectrum is reproduced by our model. Figure 8 shows in black the monthly averaged autocorrelation of empirical power time series. We compare autocorrelations for the three different Hurst exponents in the same average. The model in general succeeds to reproduce the slowly decreasing autocorrelation with a decay on the same time scale. Although we observe the expected relation that higher values of H lead to a more slowly decaying autocorrelation, none of the model results yields a sufficient reproduction of the decay for τ ≤ 1 d. Only for H = 0.9 the simulated autocorrelation approaches the empirical one within this interval. In general, the results for τ > 1 d are more applicable and even tend to reproduce the weak anticorrelation for τ > 5 d. The discussed seasonal bumps are not clearly distinguishable for any of the model results. The model autocorrelations show stronger fluctuations which indicates that the monthly average does not cancel out the impact of monthly varying time scales as much as for the empirical data. A more sophisticated approach for the seasonal component in our model could enhance this aspect. Finally, fig.3b shows the three corresponding averaged power spectra for the model time series (green). In contrast to the empirical spectrum, the slope remains almost constant for all H values even beyond the cut-off value (dashed line), yielding lower power for the higher frequencies. Within the fitted region, all three spectra show a qualitatively similar decay with β ≈ 5/3. In particular, the time series with the highest persistence again gives the most convincing result (H = 0.5 : β = 1.71 ± 0.03 , H = 0.7 : β = 1.69 ± 0.04 , H = 0.9 : β = 1.64 ± 0.03).

Conclusion
We analysed a data set comprising time series of 30 WTs located at the German offshore windfarm RIFFGAT over a total time period of one year. As important basic features, we found power values of each WT to be bimodally distributed with density peaks at zero power output P − and rated power output P + . The power output P k (T ) of a WT turned out to have a slowly decreasing autocorrelation which connects to general findings in the literature. Beyond these basic features, we took into account the non-stationarity of time series by applying DFA to both P k (T ). In terms of the Hurst exponent as a general classifier for autocorrelations of single WTs, different characteristic time scales were identified on which correlations vary between persistence and more involved nonstationarity, separated by a crossover s c ≈ 3 d. Based on these findings, we put forward a reduced form model for the power output P k (t). With this model we intended to provide a tool for reproducing several statistical properties and testing power generation scenarios with respect to empirically motivated parameters. To this extent, we employed a nonlinear stochastic differential equation with a double-well potential and a correlated diffusion component. Most parameters were estimated empirically while some were optimized to capture statistical properties of the data. The model succeeded to qualitatively reproduce important statistical characteristics such as a bimodal PDF, intermittent fluctuations and a slowly decreasing autocorrelation. It thus enriches the modelling perspective on power generation by focussing on a comprehensive set of statistical features only while still producing characteristical results. Possible applications include wind farm or large power network simulations that still aim at reflecting single WT dynamics adequately. Even for such aggregated large-scale approaches, taking correlations on a single WT scale into account is essential since models with uncorrelated fluctuations will underestimate important statistical effects such as heavy tailed distributions and persistence.
Anyway, the proposed model should be regarded as a first step towards a study that quantitatively reflects on the model dynamics in greater detail and undermines the significance of the obtained results. While a systematic examination of parameter effects on the outcome remains to be done, some parameter estimation methods could also be subject to further improvement. The observed crossover which uncovers different correlations on time scales t ≤ 3 d could be of interest for possible model extensions. The strong ramping behaviour on short time scales that is suspected to entail the found crossover is an important component in the context of predictability despite strong intermittent fluctuations. Finally, we only focused on single WTs. An evident extension of the proposed model would be to aggregate several correlated single WTs' [73] and wind farms' [74] model power outputs and study characteristics of aggregated output [73], especially with regards to an amplification of correlated fluctuations on the aggregated level.
The remaining four parameters should be regarded as more dynamical parameters that we try to estimate by optimizing a certain statistical feature with respect to this parameter. One exception is the Hurst exponent H which we vary between the three values H = 0.5 , 0.7 , 0.9 in sec. 4.2 towards a better understanding of the impact of power law autocorrelations on power generation. We briefly sketch the estimation methods for the parameter a of the potential function, the diffusion strength D and the strength of the seasonal variations A. For each parameter, we optimize it by sampling a reasonable number of time series for each of the 12 months.
• Curtailment indicator a: After we have set P 0 = 1800 kW, the most obvious choice for a would also be a = 1800 kW to ensure that the bimodal peaks center around the correct fixed points P − = 0 kW and P + = 3600 kW. Anyway, the artificial flattening procedure plays an important role in this context: if we choose a = 1800 kW, the amount of values that are either set to one of the thresholds P − or P + or scattered around them is too high compared to the empirical data. This hints at choosing a value a < 1800 kW. We optimize a by sampling power time series with different a < 1800 kW and calculating the amount p ≷ of power values above (below) or around P + (P − ). The resulting value (for fixed H) minimizes the difference between the respective empirical and simulated value. Consequently, we regard the difference (P 0 − a) as the power difference that entails how strongly the empirical power time series are subject to curtailment.
• Diffusion strength D: The constant diffusion strength D determines how strongly the stochastic component dominates power generation. It is an important parameter in the context of transition time scales between the two fixed points P − and P + . If we regard Dξ H (t) as the diffusion function of a Langevin equation, we can apply the standard method to obtain this diffusion function directly from the empirical data [77]. It can be shown [78] that the diffusion function can be extracted by calculating Since this method potentially yields errorenous results for correlated noise processes, we can only expect an approximate estimate of D [79]. Nevertheless, we follow this approach by minimizing the difference min D |D emp 2 (P(t)) − D sim 2 (P(t); D)| of the empirical and the model diffusion function with respect to D.

• Seasonal variation strength A:
A biases the power generation towards one of the two peaks of the bimodal power distribution. In winter months, a high average power output is likely whereas in spring, usually lower wind speeds occur which lead to a pronounced peak around P − . Thus we can calculate resembling an energy indicator for the empirical and simulated data to minimize the difference between the respective time-averaged values E(t) t . We consider A to be optimized when this difference of averaged accumulated power values is minimized for each month.
We summarize the parameters choices we obtain from the above mentioned estimation methods: