Comparative study of chaotic features in hourly wind speed using recurrence quantification analysis

Due to the shortage in electricity supply in Nigeria, there is a need to improve the alternative power generation from wind energy by analysing the wind speed data available in some parts of the country, for a better understanding of its underlying dynamics for the purpose of good prediction and modelling. The wind speed data used in this study were collected over a period of two years by National Space Research and Development Agency (NASRDA) from five different stations in the tropics namely; Abuja (7050 02.09"N and 6004 29.97"E), Akungba (6059 05.40"N and 5035 52.23"E), Nsukka (6051 28.14"N and 7024 28.15"E), Port Harcourt (4047 05.41"N and 6059 30.62"E), and Yola (9017 33.58"N and 12023 26.69"E). In this paper, recurrence plot (RP) and recurrence quantification analysis (RQA) are applied to investigate a non-linear deterministic dynamical process and nonstationarity in hourly wind speed data from the study areas. Using RQA for each month of the two years, it is observed that wind speed data for the wet months exhibit higher chaoticity than that of the dry months for all the stations, due to strong and weak monsoonal effect during the wet and dry seasons respectively. The results show that recurrence techniques are able to identify areas and periods for which the harvest of wind energy for power generation is good (high predictability) and poor (low predictability) in the study areas. This work also validates the RQA measures (Lmax, DET and ENT) used and establishes that they are similar/related as they give similar results for the dynamical characterization of the wind speed data. © 2018 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/1.4998674

The shortage in electricity supply in Nigeria suggests a shift to alternative power generation from wind speed. Hence, there is need to study wind speed data, for optimum power generation, using RQA measures. The RQA measures have been applied in the study of various data such rainfall data, geomagnetic data, ionospheric data, wind speed data etc obtained from different part of the world. This work extends such analyses to wind speed data collected from Nigeria for which such analyses have not been carried-out. This analysis could be useful in solving the problem of acute shortage in electricity supply experienced in Nigeria, by determining the best location for wind power plant and by predicting the period with good and poor harvest of wind power for proper planning. For these reasons wind speed data were collected from five locations in Nigeria comprising Abuja (7 0 50 02.09"N and 6 0 04 29.97"E), Akungba (6 0 59 05.40"N and 5 0 35 52.23"E), Nsukka (6 0 51 28.14"N and 7 0 24 28.15"E), Port Harcourt (4 0 47 05.41"N and 6 0 59 30.62"E), and Yola (9 0 17 33.58"N and 12 0 23 26.69"E). The results obtained from the different RQA measures (L max, DET and ENT) established that Yola and Abuja which are further to the North, are better locations for citing wind power plants than the rest of the regions considered. It is also established that the dry season period is better for the harvesting of wind energy than the wet season, in all the

I. INTRODUCTION
Wind is widely known to be one of the greatest sources of natural energy. Wind is simply air in motion which is caused by differences in atmospheric pressure as a result of the Sun's uneven heating of the air that envelopes the Earth. It is free, clean and available day and night for the production of cheap electrical power. The energy generated from wind solely depends on the wind speed in the area where the wind energy conversion system is located. Wind turbines convert wind power to wind energy which is used to generate electricity. Unlike conventional power plants, wind power which is alternative to fossil fuels, is readily available, clean, renewable, widely distributed, and produces no greenhouse gas emissions to the environment during operation. The generation of wind power has increased tremendously in developed countries, especially in Europe, due to the desire to reduce environmental degradation of the convectional energy sources; due to its advantage, there is a general growing interest in the wind power development in Nigeria. The world cumulative installed capacity of wind power gradually increased from 6100 MW in 1996to 158,505 MW in 2009(GWEC 2010. At the end of 2009 in Africa, Tunisia, Morocco, and Egypt were the leading countries with installed wind power capacity of 54MW, 253 MW and 430 MW respectively (GWEC 2010). In 2011, about 83 countries around the globe were using wind power on a commercial basis. Wind power production was over 2.5% of worldwide power as at 2010 and was growing at a rate of more than 25% per annum. In order to benefit from wind power in an electrical grid, it is necessary to predict the electrical energy generated by the wind (Foley et al. 2012).
In the liberalised electricity markets, the predictions of wind power from a couple of hours to a few days ahead are important where the expected power production and market prices are used as a yardstick to determine the optimum demanding strategy with the least possible risk (Gomes and Castro 2012).
For effective production and prediction of wind power, one should be able to characterize the nature and predictability of the recorded time series of the wind speed and the related wind power to find the proper model structure as well as the model inputs and to be able to claim the validity of its model (Bigdeli and Lafmejani Sadegh 2016).
From the above discussion, it is necessary to determine the nature of the fluctuations of the wind speed whose cubic is proportional to the amount of wind energy produced from a wind turbine. The fluctuations can be as a result of stochastic or chaotic behaviour of the wind speed. Some works have been done using statistical methods for wind speed predictions which includes autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) and its variants fitted to the time series of wind speed (Cadenas and Rivera 2007, Kamal and Jafri 1997, Kavasseri and Seetharaman, 2009. The presence of chaos in wind velocity was first studied by Tsonis and Elsner (1988), who studied ten seconds averages of vertical wind velocity recorded ten metres above the ground at the National Oceanic and Atmospheric Administration (NOOA) in Boulder, CO, USA. Grassberger and Procaccia (1983) used correlation dimension method and reported the presence of chaos with correlation dimension close to 7.3, Zeng and Pielke (1993) analyzed vertical wind velocity, reported low correlation dimensions between 3 and 4.5 and the presence of chaos. Bigdeli and Lafmejani Sadegh (2016) investigated the dynamic characterization and predictability of hourly wind speed and power time series via nonlinear time series analysis methods, and reported the stochastic nature of these time series.
Many other works have been carried out on atmospheric time series using nonlinear quantifiers like Lyapunov exponents, correlation dimension, recurrence plot, recurrence quantification analysis, etc., however, none of these attempted to use nonlinear tools to characterize wind speed data in Nigeria.
The need to develop alternative power supply with a particular focus on wind energy in Nigeria cannot be over emphasized. In this work, variation in monthly wind speed data were tested using recurrence plot and recurrence quantification analysis to identify regions and periods with good and poor harvesting of wind energy for power generation.
The rest of the paper is structured as follows: Section II describes the theoretical background of the study, section III describes the study areas, section IV deals with the analysis of the wind speed data, results and discussions, while section V concludes the report.

II. THEORETICAL BACKGROUND
A. Recurrence plot Eckmann et al. (1987) elaborated graphical tools known as Recurrence Plots which are based on Phase Space Reconstruction. The recurrence plots show the time-dependent behaviour of the dynamical systems which are pictured on the phase space. A recurrence plot (RP) is a visualisation of state-space dynamics that shows all those times at which a state of the dynamical system recurs. The RP is represented as where R ij is the recurrence matrix, N is the number of considered states x i , " i is a predefined threshold characterizing the distance between two neighbouring points, k . k a norm (i.e Euclidean norm), ⇥ is the Heaviside function, ! x i and ! x j define the point in phase space at which the system is situated at times i and j respectively. A recurrence of a state at time i at a different time j is therefore marked within a two-dimensional squared matrix with ones and zeros in which axes of the recurrence matrix are time axes. One can assign "black" dot to be one and a "white" dot to be zero. A black point in the RP signifies that the system returns to an "-neighborhood of the corresponding point in phase space (Thiel et al. 2002). The features in RP can be characterized into two patterns namely, typology (Eckmann et al. 1987) and texture Marwan (2003).
For the qualitative interpretation of a RP, Marwan (2003) proposed the following guidelines which can be classified into large scale (typology) and small scale patterns (texture).
(i) Homogeneous typologies indicates stationary. (ii) Fading in the upper left and lower right corners indicate nonstationarity and also contains a trend or drift. (iii) Disruptions (white bands) indicate that the process is nonstationarity; some states are rare or far from the normal which indicated that transitions may have occurred. (iv) Periodic patterns: the process gives diagonal lines and checkerboard structures; the period corresponds to the time distance between periodic patterns (e.g. lines). (v) For Single isolated points, the process is heavily fluctuated. If only these occur, then the process may be a random process. (vi) Diagonal lines (parallel with the LOI) correspond to similarities in the evolution of states at different times. The process may be regarded deterministic. if these diagonal lines are shorter, then the process can be from chaos. If the diagonals are shorter, it seems that the RP is related to the chaotic systems. (vii) Diagonal lines (orthogonal to the LOI) correspond to the similarities in the evolution of states at different times but they move in opposite directions in phase space. This may be a sign for an insufficient embedding. (viii) For Vertical and horizontal lines/clusters, system exhibiting Laminar states do not change or change slowly for some time.

B. Recurrence quantification analysis
The recurrence plot could contain hidden patterns which are not easily ascertained by visual inspection. In order to solve this problem, Zbilut and Webber (1992) proposed a tool called recurrence quantification analysis (RQA) to quantify the presence of patterns in the RPs. It quantifies the density of recurrence points as well as the histograms of the length of the diagonal and vertical lines in the RP. The measures to quantify the deterministic structure and complexity of RPs include the following: (i) The recurrence rate (RR) is the percentage of darkened pixels and density of recurrence points in recurrence plot.
Where N is the length of the time series and R i j is recurrence matrix which is one if the state of the system at time i and that at time j have a distance less than " and zero otherwise. Its measure can be used to detect changes in a dynamical system. It values can range from 0% (no recurrence) to 100% (full recurrence). (ii) Determinism (DET) is the ratio of recurrence points forming diagonal structures (of at least length l min ) to all recurrence points.
Where P(l) is the frequency distribution of the diagonal line lengths (for a diagonal parallel to the main diagonal); l is the length of the line structure. Long diagonal lines depict periodic signals (e.g. sine waves), short diagonal lines depict chaotic signals (e.g., Henon attractor) and no diagonal lines depict stochastic signals (e.g., random numbers) Webber and Zbilut (2005). It has been used successfully to quantify the degree of determinism of a physiological system (Webber and Zbilut 1994). (iii) Maxline (L max ) is the longest line segment measured parallel to the main diagonal in the plot. Eckmann et al. (1987) proposed that the longest diagonal line structure was inversely proportional to the most positive Lyapunov exponent. Trulla et al. (1996) was able to establish this for the logistic equation operating in its chaotic regime. The shorter the longest line is, the more divergent the trajectories will be. A periodic signal will give long line segments, while short lines indicate chaos.
N l is the total number of diagonal lines. (iv) Entropy (ENT) is the Shannon information entropy (Shannon 1948) of all diagonal lines lengths distributed over integer bins in a histogram. It is a measure of signal complexity which shows the richness of deterministic structuring. The higher the value of Entropy, the more complex of certainty structure of the system in recurrence plot.
(v) Laminarity (LAM) measures the percentage of recurrent points comprising vertical line structures rather than diagonal line structures. Laminarity can vary from 0% (no laminarity) to 100% (full laminarity).
where P(⌫) is the total number of vertical lines of the length ⌫ in the RP. Laminarity depicts the occurrence of laminar states in the system without laying emphasis on the length of these laminar phases.
These three measures (L max DET and ENT) used in this work are related in the following ways: Their measures are based on the diagonal lines of RP. When a system is chaotic, the L max , determinism and entropy are all low. And this occurs when the system is far from equilibrium.
When the system attains equilibrium, the values of L max , determinism and entropy are all high. Therefore, the similarities arise due to the fact that their measurements are based on the diagonal lines of RP, expressed in equations (3), (4) and (5).

C. Determination of embedding dimension and time delay
In the phase space reconstruction and recurrence analysis from a time series, two essential parameters are necessary to be estimated. These parameters are delay time (⌧) and embedding dimensions (m) which can be widely found in the literature (e.g., Kantz and Schreiber 1997).
There are two Standard approaches to estimate the optimal time delay ⌧: autocorrelation method and mutual information approach. The delay parameter can be estimated as the first zero of the autocorrelation function or as the first minimum of the mutual information (MI) function. The difference between these two methods is that the first looks for linear independence while the second measures a general dependence of two variables (takes into account nonlinear correlations). In nonlinear time series analysis, the second method seems to be preferred and is define in Takens (1981).
Where p i is the probability of finding a time series value in the i th interval in the partition and p i j (⌧) is the joint probability to find a time series value in the i th interval and a time series value in the j th interval after a time ⌧. Kennel et al. (1992)  Where a point in the phase-space represents the state of the system at a given time. The number of samples is N after reconstruction. The evolution of the system from some initial state (assumed to be known) can be described by the trajectories of the phase-space diagram and hence represent the history of the system.
Phase-space is known to be a very important tool because with a model and a set of appropriate variables, dynamics can represent a real-world system as the geometry of a single moving point (Sivakumar et al. 2002).

III. STUDY AREA AND METHODS
The study area is Nigeria located in the tropical zone and lies between latitudes 3 0 and 14 0 and longitudes 3 0 and 15 0 as shown in Fig. 1. The study area was subdivided along the latitude into three zones: Guinea (coast -8 0 N), Savanna (8 -11 0 N) and Sahel (11 -16 0 N) Omotosho and Abiodun (2007). The study area exhibits two distinct seasons: rainy and dry seasons with a single peak known as the summer maximum as a result of its distance from the equator.
The pattern of rainfall depends on the movement of the Inter Tropical Discontinuity (ITD) which is determined by the variation of the Inter-Tropical Convergence Zone (ITCZ) (Ojo and Falodun 2012). The ITCZ shifts northward over West Africa from the Southern Hemisphere to the Tropic of Cancer between March and June where it delivers rainfall to most Southern parts of the country earlier than in the Northern part as a result of heavy showers coming from pre-monsoonal convective clouds (Southwest trade wind). The ITCZ reaches first to the Southern part of the Nigeria due to their proximity to the equator. By June the ITCZ stands over the Tropic of Cancer and rainfall reaches a maximum in most Southern part of Nigeria and the Northern part experiences convectional rains, which reaches a maximum in July/August. By August/September the ITCZ returns to the Southern part of country and delivers increased precipitation for the second time. In December, the ITCZ shifts further southwards to the Tropic of Capricorn and so all the regions experience dry season. Thus, most part of the country experience dry season due to the prevalence of the northeastern trade wind over Nigeria (Nicholson 2013, Inter-Tropical Convergence Zone, Tropical Dry and Wet Climate).
The rainy season usually occurs between April and October in other regions while in the northern regions, it occurs between June and September.
It has been reported that a short dry break known as August break occurs within the rainy season between August and September (Adejuwon and Odekunle 2006). After the August break, a short rainy season starts in early September with its peak period usually at the end of September and ends mid-October.
During the dry season, Harmattan occurs as a result of a dust laden airmass that blows over West African subcontinent from Sahara Desert while airmass influences rainy season heavily which originates from the south atlantic ocean.
There are wide climatic variations in different regions of the country which makes the study area unique.
The data used in this study was obtained from NASRDA and it covers the period of two years starting from January, 2010 -December, 2011 with Campbell Scientific Automatic Weather Stations that generates real time data at five minutes update cycle throughout the day.
The software used for all the computations, in this research work, of RP and RQA analysis is the CRP Toolbox 5.16 coded by Norbert Marwan, and freely available on the web (Marwan 2010).

IV. ANALYSIS, RESULTS AND DISCUSSIONS
In this section, we give a detailed analysis of hourly wind speed data from January, 2010 to December 2011 using different RP and RQA.
We display in Figs. 2 and 3 the 3D phase space diagram for some selected months for both wet and dry seasons in all the locations which is reconstructed using equation (8). It is observed, for all cases, that the points are not evenly distributed, with high concentration in the middle of the phase space diagram. Consequently, the phase space diagram does not look like a random process, rather some signature of attractor-like shape is visible in it. This observation suggests for further investigation.
With the help of time delayed mutual information as shown in Fig. 4 and the false nearest neighbour depicted in Fig. 5, it is observed that any choice of delay 6 and any dimension 6 are the suitable choices for ⌧ and m, respectively. In order not to under-embed one system than the other, the embedding dimension is chosen from a system with the higher dimension and also from the distribution of time delayed mutual information, we pick a single value that seems to be good characterization across all the stations following the concept in Wallot (2017). By repeating similar analysis for all the time series of different stations considered in this study, these choices of ⌧ and m are found to be true. Thus, the choice of ⌧ as 6 and m as 6 for further analysis of this observation are reasonable.
The recurrence plots representing five stations (ABJ, AKG, NSK, PHT and YOL) for hourly wind speed are shown in Figs. 6(a-e) and 7(a-e).
Two recurrence plots are chosen from each station representing both dry and wet seasons. In this work, the recurrence threshold is chosen to range from 4% -5% of the maximum attractor diameter (i.e., 8% -10% of the corresponding attractor radius) based on the method suggested in Schinkel et al. (2008).
where " is the recurrence threshold, and d A is the maximum attractor diameter. Euclidean norm is used throughout this study. The data used was also normalized to zero mean and standard deviation of one. The recurrence plots considered in all the five stations exhibit nonhomogeneous but quasi-periodic recurrent structures reflecting that the distances between the diagonal lines vary.
The increased proportion of short line segments suggests deterministic structures which are clearly visible and are interrupted by isolated points or white horizontal and vertical spaces. Short line segments indicate a positive maximal Lyapunov exponent, (Eckmann et al. 1987, Marwan et al. 2007, Christodoulou et al. 2016) and hence, chaotic dynamics of the hourly wind speed has been identified in other studies (Karakasidis andCharakopoulos 2009, Drisya et al. 2014).
Since the qualitative analysis technique of the RP method is unable to give detailed information of the nonlinear dynamical system, there is a need for further analysis using RQA method. The RQA parameters such as L max , DET, and ENTR are used to further analyse the recurrence characteristics of hourly wind speed time series in each station. The recurrence parameters were used to study the monthly variability of the hourly wind speed data in the study area. The result is displayed in Table I.
It is obvious from Table I    Lmax towards the end of every year. From the beginning of the year (January) it decreases gradually towards the middle of the year (June) and rises during the short dry break between August and September known as the August break and thereafter it decreases for the remaining part of the rainy season before it rises again at the return of the dry season. It was observed that the dry months in all the stations exhibit lower chaoticity than the wet months (as L max is inversely proportional to the largest Lyapunov exponent). Notably, Nsukka station exhibits the highest chaoticity, followed by Akungba, Port Harcourt, Yola and Abuja stations in decreasing order of chaoticity. The low chaoticity observed for Yola and Abuja stations is due to the shortness of the wet season that exist between June and September every year in the regions. The gradual increase in the value of L max in all the stations in the month of October can be attributed to the effect of the intertropical discontinuity which affects the regions during this period. Low chaoticity in the months of December and January in all the stations for the two years is as a result of Tropical continental airmass (the harmattan) which is a wind that originates from North Africa and crosses the Sahara Desert into west Africa and to Nigeria. Therefore, while high chaoticity in all these regions is attributable to high effect of monsoon during the wet season, low chaoticity is attributable to weak effect of monsoon during the dry season. We depict in Figs. 9(a-e) and 10(a-e) respectively, the results of Determinism and Entropy for the period under investigation. The results exhibit the same trend with high values during the dry months and low values in wet months. The measure DET shown in Fig. 9 reveals the occurrence of regular patterns (predictability) in a time series in all the stations for the two seasons (wet and dry seasons). The measure DET for the two years ranges from 0.2780 to 0.5993 for Abuja, 0.1809 to 0.8004 for Akungba, 0.1016 to 0.4133 for Nsukka, 0.3673 to 0.7313 for Port Harcourt and 0.3077 to 0.7772 for Yola stations. High determinism (predictability) values could be found in Yola, Abuja, Port Harcourt and Akungba stations while the lowest is observed in Nsukka station during the dry season. During wet season, High determinism (predictability) values could be found in Yola and Abuja due to the fact that these regions experience short wet season between June and September compared to other stations that experience long wet season between April and October every year. Sudden changes in transitions into or out of a 'wet' or 'dry' state are hard to predict, while the behaviour within a season (within wet or dry seasons) follows to a certain extent similar pattern throughout (Eroglu et al 2016). High predictability during dry season is linked to the weak effect of monsoon while low predictability during wet season is linked to strong effect of monsoon. The values of entropy ranges from 0.5844 to 1.1630 for Abuja, 0.4649 to 1.4624 for Akungba, 0.2693 to 0.8930 for Nsukka, 0.7821 to 1.4038 for Port Harcourt and 0.5863 to 1.6875 for Yola stations. An increase in the entropy values at the onset of dry season for all the stations was noted for the rising complexity of the deterministic part of the dynamics underlying hourly wind speed process. The high value of entropy during dry season suggests stability while the low value during wet season suggests instability in all these regions. Recurrence quantification analysis has been able to reveal variability in hourly wind speed caused by high and low effect of monsoon during wet and dry seasons respectively in these regions.
To demonstrate the similarity in the three measures we introduce Fig. 11 plotted using L max , DET and ENT values from Akungba station.
From Fig. 11, it is observed that the trend of L max , DET and ENT measures follow the same trend. They decrease from a maximum in January (no rainfall) to a minimum in June (maximum rainfall) and therefrom increase to a maximum again in December (no rainfall).
In general, it is observed that the trend of L max , DET and ENT measures are similar which might be attributed to the fact that their measurements are based on the diagonal lines of RP.

V. CONCLUSION
In this study, the hourly wind speed records from five different stations across Nigeria were studied using the methods of RP and RQA and the underlying dynamics and hidden structures in the wind speed data were visualized and quantified. It was observed that the dynamics of wind speed change processes are characterized by nonlinearity, nonstationarity and chaos. Furthermore, the chaoticity in the wind speed data was observed to be high [low]  This study has, therefore, established that the northern region of the country and the dry season period are more suitable for generating wind power due to low chaoticity (high predictability) than the southern region and the wet season period which are characterized by high chaoticity (low predictability).
Finally, this work has established that the RQA measures (L max , DET and ENT) are similar in the dynamical characterization of wind speed data.