Double symbolic joint entropy in nonlinear dynamic complexity analysis

Symbolization, the basis of symbolic dynamic analysis, can be classified into global static and local dynamic approaches, which we combine through joint entropy for nonlinear dynamic complexity analysis. Two global static methods, the symbolic transformations of the Wessel N. symbolic entropy and of base-scale entropy, and two local dynamic ones, the symbolizations of permutation entropy and differential entropy, constitute four double symbolic joint entropies that accurately detect complexity in chaotic models, namely logistic and Henon map series. In the nonlinear dynamic analysis of different kinds of heart rate variability, the heartbeats of healthy young subjects have higher complexity than those of healthy elderly subjects, and congestive heart failure (CHF) patients have the lowest joint entropy values. Double symbolic joint entropy improves on each individual symbolic entropy, and the combination of base-scale and differential symbolizations gives the best complexity analysis. The test results show that double symbolic joint entropy is feasible for nonlinear dynamic complexity analysis.


I. INTRODUCTION
Heart rate variability (HRV), the variation in beat-to-beat intervals represented by RR or NN intervals [1], displays irregular and non-stationary behavior whose nonlinear dynamics provide valuable information for scientific and clinical cardiac research [2,3]. To measure its nonlinear dynamical features, complexity parameters such as fractal dimensions, Lyapunov exponents, and geometric and entropy methods have been proposed [4][5][6]. Symbolic dynamic analysis, a fast, simple and efficient method, provides rigorous ways to analyze nonlinear dynamics [7].
Symbolic time series analysis consists of symbolization followed by statistical analysis of the symbolic series, and it has been applied effectively in physiological signal analysis [8,9]. Symbolization transforms a series with infinitely many possible values into a symbol sequence over a given alphabet [10], so it greatly reduces the demands on the data and simplifies series analysis [2,11]. Symbolic transformations are classified into two groups, global static and local dynamic methods [12]. The symbolic transformation of Wessel N. et al. [13,14] and that of base-scale entropy [15] are global static approaches, while the symbolizations in permutation entropy [16] and differential entropy are typical local dynamic ones; all of them employ Shannon entropy to analyze the symbolic series. Our objective is to take both global and local dynamical information into account for a comprehensive analysis of nonlinear dynamic complexity. There are feasible ideas for combining the two kinds of symbolic methods, such as multi-dimension theory [15,16]. These attempts, however, are better described as compromises between maintaining the flexibility of global static methods on the one hand and extracting sufficient local dynamic information on the other. To make efficient use of both types of symbolization, we apply joint entropy to combine them for nonlinear dynamic complexity analysis.
* wangj@njupt.edu.cn
Our contribution is to conduct global static and local dynamic symbolic transformations simultaneously and to apply the joint entropy of the two symbolizations to the nonlinear dynamic analysis of classical chaotic models and three kinds of real-world HRV.

II. SYMBOLIC TRANSFORMATION
Symbolization is a process of coarse-graining or reduction. Its basic idea is to transform a series $X_L = \{x_1, x_2, \ldots, x_L\}$ into a symbolic sequence $S_N = \{s_1, s_2, \ldots, s_N\}$ whose elements $s_i$ take values from a finite set of symbols (letters from some alphabet). Global static methods symbolize according to intervals of the value range that are determined by a few parameters obtained from the whole sequence, whereas local dynamic approaches use the relationships between adjacent elements to build the symbolic representation. Both types of symbolization, targeting different kinds of nonlinear dynamical information, have been applied effectively in complexity detection.

A. Wessel N. Symbolization
To obtain a physiologically motivated symbolization that is relatively easy to interpret, Wessel N. et al. developed a pragmatic, context-dependent transformation with four symbols [13,14,17]. The transformation refers to three given levels, namely $(1-\alpha)\mu$, $\mu$ and $(1+\alpha)\mu$, and is performed as in Eq. (1), where $\mu$ is the series mean and $\alpha$ is a controlling parameter. According to tests, $\alpha$ is recommended to lie between 0.03 and 0.07, a range over which the resulting symbol sequences do not differ significantly in the nonlinear forecasting of cardiac arrhythmia features.
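As an illustration, the four-symbol transformation can be sketched in Python as follows. Eq. (1) is not reproduced in this text, so the assignment of the labels 0 to 3 to the four intervals, the function name `wessel_symbolize` and the default α = 0.05 are assumptions and not the original definition; only the three levels $(1-\alpha)\mu$, $\mu$ and $(1+\alpha)\mu$ come from the description above.

```python
def wessel_symbolize(series, alpha=0.05):
    """Map each value to one of four symbols using the levels
    (1 - alpha) * mu, mu and (1 + alpha) * mu (label order is assumed)."""
    mu = sum(series) / len(series)
    lower, upper = (1 - alpha) * mu, (1 + alpha) * mu
    symbols = []
    for x in series:
        if x > upper:
            symbols.append(1)   # well above the mean
        elif x > mu:
            symbols.append(0)   # slightly above the mean
        elif x > lower:
            symbols.append(2)   # slightly below (or equal to) the mean
        else:
            symbols.append(3)   # well below the mean
    return symbols


# Hypothetical RR intervals in milliseconds, mean exactly 1000.
rr = [1010, 1030, 1200, 800, 960]
print(wessel_symbolize(rr, alpha=0.05))
```

Because the levels are computed once from the whole series, the partition is the same for every element, which is what makes the method global and static.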

B. Base-scale Symbolization
The symbolization in base-scale entropy [18] is a four-symbol global method. It first performs multi-dimensional vector reconstruction as in Eq. (2) and then symbolizes each vector.
In Eq. (2), $m$ is the embedding dimension and $\tau$ is the delay time. The base scale of each reconstructed vector, defined as the root mean square of the differences between every two contiguous values in the vector, is then calculated as in Eq. (3).
The base-scale symbolic transformation follows Eq. (4), where $\mu_m$ is the mean of the $m$-dimensional vector $X_m(i)$ and $\alpha$ is a controlling parameter that can be chosen from 0.1 to 2. The multi-dimensional procedure brings adaptability and flexibility, as well as some local dynamical information. In this work, because we want the base-scale symbolization to act as a purely global method, we apply the transformation to the whole time series treated as a single vector.
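The whole-series variant described above can be sketched as follows. Eqs. (2) to (4) are not reproduced in this text, so the sketch assumes that the base scale is the root mean square of successive differences and that the four symbols are separated by the levels $\mu - \alpha\,BS$, $\mu$ and $\mu + \alpha\,BS$; the label order and the function names are hypothetical.

```python
import math


def base_scale(vector):
    """Root mean square of the differences between contiguous values."""
    diffs = [b - a for a, b in zip(vector, vector[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def base_scale_symbolize(vector, alpha=0.2):
    """Four-symbol transformation of one vector (here: the whole series).
    The interval-to-label assignment is an assumption."""
    mu = sum(vector) / len(vector)
    width = alpha * base_scale(vector)
    symbols = []
    for x in vector:
        if x > mu + width:
            symbols.append(1)   # far above the vector mean
        elif x > mu:
            symbols.append(0)   # just above the vector mean
        elif x > mu - width:
            symbols.append(2)   # just below (or equal to) the vector mean
        else:
            symbols.append(3)   # far below the vector mean
    return symbols


print(base_scale_symbolize([0.0, 4.0, 0.0, 4.0, 2.5, 1.5], alpha=0.25))
```

Unlike the Wessel N. partition, the interval width here scales with the variability of successive differences and not with the mean itself.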

C. Permutation Symbolic Transformation
Permutation entropy, with the advantages of simplicity, fast calculation and robustness, relies on a typical local dynamic symbolization [19,20]. It is a classical complexity parameter that compares neighboring values and maps the time series onto symbol sequences [21]. As in base-scale entropy, a multi-dimensional reconstruction is needed to transform the series into symbolic sequences. Within each reconstructed vector, the values are sorted by size, for example in ascending order, as $x_{i+(j_1-1)\tau} \le x_{i+(j_2-1)\tau} \le \cdots \le x_{i+(j_m-1)\tau}$. The symbol $\pi_j = \{j_1, j_2, \cdots, j_m\}$ is the sequence of the elements' original positions, and there are $m!$ possible permutations. Permutation entropy is the Shannon entropy of the permutations' probabilities, $H(m) = -\sum p(\pi_i)\log_2 p(\pi_i)$, where the sum runs over the permutations with $p(\pi_i) \neq 0$.
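A minimal Python sketch of this procedure is given below. The function names are hypothetical, positions are numbered from 0 instead of 1, and equal values are ordered by their position in the vector, which is one common convention and an assumption here.

```python
import math
from collections import Counter


def ordinal_patterns(series, m=3, tau=1):
    """Return, for each reconstructed vector, the original positions of its
    elements listed in ascending order of value (ties keep index order)."""
    patterns = []
    for i in range(len(series) - (m - 1) * tau):
        window = [series[i + j * tau] for j in range(m)]
        patterns.append(tuple(sorted(range(m), key=window.__getitem__)))
    return patterns


def permutation_entropy(series, m=3, tau=1):
    """Shannon entropy (in bits) of the observed ordinal patterns."""
    counts = Counter(ordinal_patterns(series, m, tau))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(ordinal_patterns([3, 1, 2]))
print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], m=2))
```

Only patterns that actually occur enter the sum, so the condition $p(\pi_i) \neq 0$ is satisfied automatically.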

D. Symbolization in Differential Entropy
Taking the differences between adjacent elements into account, we proposed differential entropy as a dynamic complexity measure. This symbolization owes its complexity detection to detailed local dynamic information: symbols are assigned from the differences between the current element and its forward and backward neighbors. Its controlling parameter $\alpha$ can be adjusted from 0.3 to 0.6.
The code series $C(i)$, whose construction is the step that follows symbolization, is formed by $m$-bit encoding of the symbolic sequences, and it is measured with classical statistics and information theory, such as Shannon entropy. Taking the symbols 'abc' as an example, the coding procedure could be $c(i) = a \cdot n^2 + b \cdot n + c$, where $n$ should not be smaller than the number of symbol types; the particular code form does not make a significant difference to the symbolic analysis. In the following symbolic dynamic analysis, 3-bit encoding is applied to all symbolic sequences.
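The encoding step and the entropy of the resulting code series can be sketched as follows. The sketch assumes that code words are taken from a window sliding one symbol at a time, which the text does not state; the function names are hypothetical.

```python
import math
from collections import Counter


def encode_words(symbols, m=3, n=4):
    """Encode every run of m consecutive symbols as a base-n integer,
    e.g. (a, b, c) -> a*n**2 + b*n + c for m = 3."""
    codes = []
    for i in range(len(symbols) - m + 1):
        code = 0
        for s in symbols[i:i + m]:
            code = code * n + s
        codes.append(code)
    return codes


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


codes = encode_words([0, 1, 2, 3], m=3, n=4)
print(codes, shannon_entropy(codes))
```

With four symbols and $n = 4$, every 3-symbol word receives a distinct code between 0 and 63, so the entropy of the code series equals the entropy of the words themselves.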

III. DOUBLE SYMBOLIZED DYNAMIC ANALYSIS
Global static symbolic transformations flexibly select the number and size of partitions according to the signal's characteristics, and local dynamic symbolizations effectively extract detailed local dynamic information. Obtaining both global static and local dynamic information is the main concern of this section.
Multi-dimension vector reconstruction [15,16,22], an attractive theoretical problem in its own right, is used in base-scale entropy and permutation entropy. Through vector reconstruction, the global static symbolization is carried out in each individual vector, which makes base-scale entropy more adaptable to signal changes. The symbolizations of different vectors are independent of each other, which improves the flexibility of the transformation and extracts some local dynamic information. In the extreme case where the reconstructed vectors are so short that each contains only 3 or even 2 elements, the global static methods are almost equivalent to local dynamic ones. In the multi-dimension method, the selection of the vector length and the setting of the delay factor deserve further in-depth research. Multi-dimension processing takes both kinds of symbolic transformation into account, but it remains a compromise and cannot fully serve both sides. Another way to combine the two symbolizations is to merge the two symbolic series directly into a new sequence. For example, if the global static sequence is '0123' and the local dynamic one is '1230', their combined symbols are '01 12 23 30' (the symbol of the global static method comes first in each 2-bit combination). The disadvantage of this combination is the increased number of symbols (if the two symbolizations have N and M symbols, the combination has N*M), but such attempts may still be worthwhile for unforeseen gains.
A feasible solution is to combine the two kinds of nonlinear dynamical information through joint entropy. Shannon's seminal work rationalized and initiated the early efforts in information theory and is the most influential contribution to entropy [23]. The information contents of two (sub)systems are illustrated in Fig. 1, and these relationships apply to the two kinds of symbolizations as well.

FIG. 1. The relationships between information entropy of two (sub)systems
Joint entropy, given in Eq. (6) or (7), measures the combined amount of information.
To combine the two different symbolizations, the global static and local dynamic symbolic series are obtained simultaneously as $X_G$ and $X_L$, and their joint entropy $H(X_G, X_L)$ is then computed.
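A minimal sketch of this combination is given below, using the standard definition of joint entropy as the Shannon entropy of the paired symbols; since Eqs. (6) and (7) are not reproduced in this text, the base-2 logarithm and the function names are assumptions. The example sequences '0123' and '1230' are the ones used earlier to illustrate symbol pairing.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def joint_entropy(global_codes, local_codes):
    """Entropy of the pairs (global code, local code) taken at the same index."""
    if len(global_codes) != len(local_codes):
        raise ValueError("both code series must have the same length")
    return shannon_entropy(list(zip(global_codes, local_codes)))


x_g = [0, 1, 2, 3]   # global static symbols
x_l = [1, 2, 3, 0]   # local dynamic symbols
h_g, h_l, h_joint = shannon_entropy(x_g), shannon_entropy(x_l), joint_entropy(x_g, x_l)
# Shared information: I(X_G; X_L) = H(X_G) + H(X_L) - H(X_G, X_L)
print(h_joint, h_g + h_l - h_joint)
```

The joint entropy never exceeds the sum of the two individual entropies, and it equals that sum only when the two symbol series share no information.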

IV. DOUBLE SYMBOLIC JOINT ENTROPY ANALYSIS OF CHAOTIC MODELS
The four double symbolic joint entropy methods are tested on logistic and Henon map series. The delay time in all four symbolizations is set to 1. Referring to the choices of controlling parameters in the original works and to their performance in logistic map analysis, we set α to 0.2 in base-scale entropy, 0.5 in differential entropy and 0.3 in the Wessel N. symbolic entropy.
The canonical form of the logistic difference equation, $x_{i+1} = r\,x_i(1 - x_i)$, is attractive by virtue of its extreme simplicity [24] and is widely applied in chaotic and nonlinear dynamical analysis. Its bifurcation diagram and the chaos detection of the four double symbolic joint entropies are shown in Fig. 2.
The logistic map becomes chaotic when $r$ exceeds the cut-off point $r^* \approx 3.57$ (more precisely, $r^* \approx 3.569946$), and its nonlinear complexity increases as $r$ increases. As shown in Fig. 2, the four combined joint entropies effectively identify the onset of chaotic behavior at $r^*$, and their values increase as the chaos of the logistic map strengthens. The double symbolic joint entropies are therefore effective complexity parameters.
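The contrast between the periodic and chaotic regimes can be reproduced with a short sketch. It uses a plain ordinal-pattern entropy as a stand-in complexity measure and not one of the four joint entropies; the initial value 0.4, the transient of 500 iterations, the values r = 3.2 and r = 3.9 and the function names are all illustrative assumptions.

```python
import math
from collections import Counter


def logistic_series(r, n, x0=0.4, transient=500):
    """Iterate x -> r*x*(1-x) and return n values after discarding a transient."""
    x = x0
    values = []
    for step in range(transient + n):
        x = r * x * (1.0 - x)
        if step >= transient:
            values.append(x)
    return values


def ordinal_entropy(series, m=3):
    """Shannon entropy (bits) of ordinal patterns of length m with delay 1."""
    patterns = Counter(
        tuple(sorted(range(m), key=lambda j: series[i + j]))
        for i in range(len(series) - m + 1)
    )
    total = sum(patterns.values())
    return -sum(c / total * math.log2(c / total) for c in patterns.values())


# Rounding removes last-bit jitter so the period-2 orbit yields exact ties.
periodic = [round(x, 9) for x in logistic_series(3.2, 1000)]
chaotic = logistic_series(3.9, 1000)
print(ordinal_entropy(periodic), ordinal_entropy(chaotic))
```

Below $r^*$ the orbit settles on a short cycle and only a few patterns occur, whereas in the chaotic regime more patterns appear and the entropy rises.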
In the Henon map tests, the four double symbolic joint entropies also show satisfactory detection of chaotic complexity. This discrete-time dynamical system is a simple mapping of the plane [25], defined by $x_{i+1} = r\,y_i + 1 - 1.4\,x_i^2$, $y_{i+1} = 0.3\,r\,x_i$, where $r$ is a controlling parameter. As $r$ increases from 0.9 to 1, the Henon system exhibits more chaotic behavior, and its four double symbolic joint entropies are listed in Table I.
As Table I shows, the four joint entropies increase as the chaotic behavior of the Henon series increases, which verifies their effective detection of nonlinear complexity.
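For reference, the Henon series used in such a test can be generated as sketched below, following the mapping as written above. The initial point (0, 0), the choice of returning only the x component and the function name are assumptions, since the text does not specify them.

```python
def henon_series(r, n, x0=0.0, y0=0.0, transient=0):
    """Iterate x' = r*y + 1 - 1.4*x**2, y' = 0.3*r*x and return n values
    of the x component after discarding a transient."""
    x, y = x0, y0
    xs = []
    for step in range(transient + n):
        # Both coordinates are updated simultaneously from the old (x, y).
        x, y = r * y + 1.0 - 1.4 * x * x, 0.3 * r * x
        if step >= transient:
            xs.append(x)
    return xs


first_steps = henon_series(1.0, 3)
attractor = henon_series(1.0, 2000, transient=100)
print(first_steps, min(attractor), max(attractor))
```

For $r = 1$ the mapping reduces to the classical Henon map with parameters 1.4 and 0.3.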

V. DOUBLE SYMBOLIC JOINT ENTROPY IN HRV ANALYSIS
Three kinds of heartbeat intervals (derived from ECG) from the PhysioNet database [26] are used in this work. The first group consists of 15 subjects with severe congestive heart failure (CHF), NYHA class 3-4 [27], including 11 men aged 22 to 71 and 4 women aged 54 to 63. The other two groups consist of 20 young (21 to 34 years old) and 20 elderly (68 to 85 years old) healthy subjects who underwent 2 hours of continuous data collection [28] in a resting state in sinus rhythm.
We first apply the four double symbolic joint entropy methods and the individual symbolic entropy approaches to the three kinds of HRV; the results are illustrated in Fig. 3a and 3b. In this part, all controlling parameters are set to 0.55. As Fig. 3a shows, the four double symbolic joint entropies distinguish the three kinds of HRV and agree with one another, and their ordering is consistent with the 'complexity-loss' theory of aging and disease [2,29,30,31]: the joint entropies of healthy young volunteers' heart rates are higher than those of healthy elderly ones, and CHF patients have the lowest values. Healthy young subjects have better cardiac states, and their heartbeats present inherently more complex processes than those of the elderly and CHF patients. The healthy elderly subjects show a slight weakening of cardiac function due to aging, so their dynamic features are reduced compared with the young ones. The CHF group, with profound abnormalities in cardiac function and severe damage to the cardiac control system, largely loses its HRV dynamic features and therefore has the lowest dynamic complexity.
T tests on the four double symbolic joint entropies of the three kinds of heartbeats are carried out, and the p values are listed in Table II.
According to Table II, the differences in complexity between every pair of heart signal groups extracted by WN-DE JEn and BS-DE JEn are significant (p < 0.05): the two DE-joint methods achieve satisfactory nonlinear distinctions among the three groups of HRV, and BS-DE joint entropy shows the best nonlinear dynamic complexity detection. WN-PE JEn and BS-PE JEn effectively separate the cardiac rhythms of CHF patients from those of the two healthy groups, but both fail to distinguish the elderly and young volunteers' heartbeats significantly (p = 0.083 and 0.086, larger than 0.05). The failure of the two PE-joint entropies to distinguish the two kinds of healthy HRV may stem from the misleading behavior of permutation entropy in the nonlinear analysis of these heart signals. In Fig. 3b, permutation entropy differs from the other three symbolic entropies in that the order of the three kinds of HRV is reversed: the heartbeats of CHF patients have the largest entropy, 5.372, those of elderly persons have a permutation entropy of 5.335, and the healthy young subjects' heart rates have the lowest, 5.121. We conjecture that this paradox is related to multi-scale theory: permutation entropy assigns higher complexity to certain pathologic processes, such as CHF, than to healthy dynamics because it fails to account for multi-scale information [9,29,31,32]. The multi-scale concept constructs coarse-grained series $y_j^{\tau} = \frac{1}{\tau}\sum_{i=(j-1)\tau+1}^{j\tau} x_i$, $1 \le j \le N/\tau$, and for scale 1 the series $\{y_j^{1}\}$ is the original series $\{x_i\}$. The single-scale nature of the permutation entropy used here may explain this inconsistency. Except for permutation entropy, the other three individual symbolic entropy methods effectively distinguish the different heart signals, and their results are consistent with the 'complexity-loss' theory.
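The coarse-graining step of the multi-scale concept can be sketched in a few lines. The function name is hypothetical, and dropping the trailing samples that do not fill a complete window is an assumption consistent with $1 \le j \le N/\tau$.

```python
def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`.
    Trailing values that do not fill a whole window are dropped."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    n_windows = len(series) // scale
    return [
        sum(series[j * scale:(j + 1) * scale]) / scale
        for j in range(n_windows)
    ]


x = [1, 2, 3, 4, 5, 6, 7]
print(coarse_grain(x, 1))   # scale 1 reproduces the original values
print(coarse_grain(x, 2))
```

A multi-scale analysis would then compute the chosen entropy on each coarse-grained series in turn.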
Independent-samples t tests for the other three symbolic entropy methods show that they all effectively distinguish the three kinds of heart rate variability (the CHF-Elderly p values of WNSE, BSE and DE are 0.031, 0.001 and 0.019, the corresponding Elderly-Young values are 0.002, 0.024 and 0.042, and the CHF-Young p values are all below 0.001).
From the above analysis, we find that double symbolic joint entropy improves the nonlinear complexity extraction of the individual symbolic entropies. The two DE-combined joint entropies, which reduce the between-group p values relative to each individual symbolic entropy, improve the distinctions among the three groups of heart signals and identify the different kinds of HRV more effectively. The two PE-combined joint entropies, which order the complexity of the three HRV groups correctly as shown in Fig. 3a, overcome the shortcoming of permutation entropy and effectively distinguish the HRV of CHF patients from that of the two healthy groups.
In this part, we observe the impact of data length on joint entropy analysis. The data length increases from 300 to 4500 in steps of 300 for the four joint entropy analyses of HRV, and the results are shown in Fig. 4. As Fig. 4a and 4b illustrate, for the two WN-joint entropies at short data lengths, the heartbeats of CHF patients have higher nonlinear complexity than those of the healthy elderly, and this relationship reverses once the data length reaches 2000 or more. In Fig. 4c and 4d, the BS-joint entropy values of the three groups of HRV follow the same order as in the previous analysis and become stable when the data length reaches 2000. Throughout Fig. 4, at short data lengths the entropy values of the two healthy groups increase, while those of CHF patients first increase and then decrease.
In all four subplots, healthy young people maintain higher entropy than the healthy elderly ones regardless of data length. The differences between the charts lie mainly in the entropy trends of CHF patients; we suppose the reason is that the HRV signals of CHF patients are poorly stable, which contributes to these fluctuations.
According to Fig. 4, the relationships among the joint entropies of the three HRV groups change at short data lengths and settle as the data length increases to about 2000 and beyond. We conclude that the four double symbolic joint entropies place certain requirements on data length in HRV analysis.

VI. DISCUSSIONS
To take into account both the flexibility of global static transformations and the detailed dynamic information extracted by local dynamic methods, we introduce joint entropy to combine the characteristics of the two kinds of symbolization.
Among the four individual symbolic transformations, three have a controlling parameter whose adjustment plays an important role in nonlinear dynamic analysis. Referring to the ranges given in the original works, we adjust the parameters for the complexity detection of chaotic models and physiological signals. In the logistic and Henon map analysis, α is set to 0.3 for the Wessel N. symbolic entropy, 0.2 for base-scale entropy and 0.5 for differential entropy, while for heartbeats α is set to 0.55 for all three symbolizations to achieve satisfactory results. The values chosen for the Wessel N. symbolization, 0.3 for chaotic models and 0.55 for HRV, both lie outside the range of 0.03 to 0.07 recommended in the original literature [13,14]. It seems that no single controlling parameter is optimal for all kinds of data. We conjecture that the reason lies in differences in structural information or dynamical complexity between types of signals, so the parameters should be adjusted accordingly. Whether our findings apply to other nonlinear signals remains to be validated.
Both global static and local dynamic symbolizations contain irreplaceable dynamic information about a series, and there is little redundant information between the two types of symbolic transformation. Taking the heartbeats of the first CHF patient, 'chf01', as an example, the four combined symbolic joint entropies are 7.0001, 6.9738, 8.8819 and 8.8555, which are approximately equal to the sums of the corresponding pairs of individual entropies: 1.6959 for WNSE, 3.5776 for BSE, 5.2779 for DE and 5.3042 for PE. The same holds for the healthy elderly persons and the CHF patients. Joint entropy values close to the sum of the individual symbolization entropies indicate that the two symbolic transformation approaches extract nonlinear dynamic information from different perspectives and that the overlap between the two symbolizations is very low.
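The arithmetic behind this comparison can be checked directly. The text lists the four joint entropies without naming the pairs, so the pairing below (WN-PE, WN-DE, BS-PE, BS-DE) is inferred from which sums match and should be read as an assumption; the variable names are illustrative.

```python
# Individual entropies of subject 'chf01' as quoted in the text.
individual = {"WNSE": 1.6959, "BSE": 3.5776, "DE": 5.2779, "PE": 5.3042}

# Joint entropies as quoted in the text; the pair labels are inferred.
joint = {
    ("WNSE", "PE"): 7.0001,
    ("WNSE", "DE"): 6.9738,
    ("BSE", "PE"): 8.8819,
    ("BSE", "DE"): 8.8555,
}

# Gap between the sum of individual entropies and the joint entropy.
# For exact entropies this gap is the shared (mutual) information.
gaps = {
    pair: individual[pair[0]] + individual[pair[1]] - h_joint
    for pair, h_joint in joint.items()
}
for pair, gap in gaps.items():
    print(pair, round(gap, 4))
```

All four gaps are within about 0.0001 of zero, which is the rounding precision of the quoted values.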

VII. CONCLUSIONS
The above analysis and tests show that it is feasible and effective to use the joint entropy of global static and local dynamic symbolizations for detecting the nonlinear dynamic complexity of series. In our nonlinear analysis of heart rate variability, double symbolic joint entropy improves the complexity detection of each individual symbolic entropy, and the results support the 'complexity-loss' theory of aging and disease.