From fractals to stochastics

30 Years of Nonlinear Dynamics in Geosciences Rhodes, Greece, 03-08 July 2016

From fractals to stochastics:

Seeking theoretical consistency in analysis of geophysical data Demetris Koutsoyiannis1*, Federico Lombardo2, Panayiotis Dimitriadis1, Yannis Markonis1 and Spencer Stevens3 1Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Greece 2 Independent researcher, Rome, Italy. 3 Independent researcher, London, UK. (*[email protected], http://www.itia.ntua.gr/dk/) Presentation available online: http://www.itia.ntua.gr/1627/

Part 1: On nonlinear dynamics and fractals

Tὸ ἀντίξουν συμφέρον καὶ ἐκ τῶν διαφερόντων καλλίστην ἁρμονίαν καὶ πάντα κατ' ἔριν γίνεσθαι Opposition unites, the finest harmony springs from difference, and all comes about by strife Heraclitus; ca. 540-480 BC D. Koutsoyiannis et al., From fractals to stochastics 1

30 Years of Nonlinear Dynamics?

http://books.google.com/ngrams/ D. Koutsoyiannis et al., From fractals to stochastics 2

30 Years of Nonlinear Dynamics?

(von Kármán, 1940) Kármán Vortex Street caused by wind flowing around the Juan Fernández Islands off the Chilean coast (Wikipedia) D. Koutsoyiannis et al., From fractals to stochastics 3

Why do we not prefer fractals (over stochastics)?  A lot of ambiguity  Confusion between local and global properties of processes  Use of the abstract mathematical processes as if they could apply to natural processes  Hasty use of stochastic concepts  Misspecification / misinterpretation of scaling laws  Neglect of statistical bias and variation  Confusion between different scaling behaviours Careful use of stochastics can deal with all problems involving fractals of non-deterministic type in a more rigorous manner and more effectively.

D. Koutsoyiannis et al., From fractals to stochastics 4

Ambiguity The terms fractal and multifractal remain without an agreed mathematical definition. Let me argue that this situation ought not create concern and steal time from useful work. Entire fields of mathematics thrive for centuries with a clear but evolving self-image, and nothing resembling a definition. (Mandelbrot, 1999, p. 14; Note: Mandelbrot coined the term fractal in 1975). [Mandelbrot, 1982] observes that "Ordinary words used in scientific discourse combine (a) diverse intuitive meanings, dependent on the user, and (b) formal definitions, each of which singles out one special meaning and enshrines it mathematically. The terms stationary and ergodic are fortunate in that mathematicians agree on them. However, experience indicates that many engineers, physicists, and practical statisticians pay lip service to the mathematical definition, but hold narrower views." That is, many mathematically stationary processes are not intuitively stationary. By and large, those processes exemplify wild randomness, a circumstance that provides genuine justification for distinguishing a narrower and a wider view of stationarity. (Mandelbrot, 1999, p.7) D. Koutsoyiannis et al., From fractals to stochastics 5

Ambiguity (contd.) We are done now with explaining the peaceful coexistence of two values of D: the dimension D = 1/H = 2 applies to that three-dimensional curve, as well as to the trail obtained by projecting on the plane (X, Y). However, the projections of the three dimensional curve on the planes (t,X) and (t,Y) are of dimension D = 2 –H = 1.5. (Mandelbrot, 1999, p. 45) Definition. The term multifractal denotes the most general category of multibox cartoons. It allows the generator to combine axial boxes and diagonal boxes with non-identical values of Hi from Hmin > 0 to Hmax < ∞. (Mandelbrot, 1999, p. 45) Contrast Each definition is a piece of secret ripped from Nature by the human spirit. I insist on this: any complicated thing, being illumined by definitions, being laid out in them, being broken up into pieces, will be separated into pieces completely transparent even to a child, excluding foggy and dark parts that our intuition whispers to us while acting, separating into logical pieces, then only can we move further, towards new successes due to definitions . . . (Nikolai Luzin; from Graham and Kantor, 2009) D. Koutsoyiannis et al., From fractals to stochastics 6

Attempts to remove ambiguity There is no “official” consensus on the definition of a fractal. However, what is generally agreed on is that the Hausdorff measure and Hausdorff dimension play a key role. One possible definition of a fractal is then for example that it is a set A ⊆ Rk whose Hausdorff dimension dimHaus A is not an integer. (Beran et al., 2013, p. 178) There are many definitions of fractal dimension. The most general and mathematically satisfactory one is the Hausdorff dimension DHaus. (Veneziano and Langousis, 2010, p. 4) However: In the context of time series analysis, fractal behaviour is often mentioned as synonym for long-range dependence. Though there are strong connections between the two notions, they are also in some sense completely different. (Beran et al., 2013, p. 178) Note: The Hausdorff dimension expresses a local property, as a radius δ for covering the set A tends to zero. This is more evident in the so-called box-counting dimension, which is an upper bound for DHaus (Beran et al., 2013, p. 181-182) and is defined as dimBox 𝐴 = lim𝛿→0 log 𝑁𝛿 / log 𝛿 ⁡where Nδ is the minimal number of sets Ui needed for a δ-cover of A. D. Koutsoyiannis et al., From fractals to stochastics 7

Confusion between local and global properties of processes  Most fractal literature confuses fractal behaviour with long-range dependence.  However, Mandelbrot (1999, p.3) referred to the difference of locality and globality but in a rather obscure way: The importance of the contrast between mildness and wildness is in part due to its links with a contrast between locality and globality.  Gneiting and Schlather (2004), using a Cauchy type autocovariance function* made it clear that the fractal and Hurst properties (long-range dependence) are two different things: o The fractal parameter determines the local properties of the process (as t → 0) o The Hurst parameter determines the global properties of the process (as t → ∞)  These issues were also elaborated in Beran et al. (2013).  Koutsoyiannis (2013a,b, 2016) introduced a Cauchy type climacogram which has better properties than the Cauchy type autocovariance function, allowing for negative autocorrelations (antipersistence) at large time lags while ensuring positive autocorrelations at small lags, as demanded for physical consistency. It was first proposed by Yaglom (1987, p. 365) and also referred to by Wackernagel (1995, p. 219; 1998, p. 246), while a similar autocorrelation was used by Koutsoyiannis (2000) for discrete time processes. *


Use of the abstract mathematical objects as if they could apply to natural processes  In mathematical processes the local and global properties can be the same (e.g. the Hurst-Kolmogorov process, described by a single scaling exponent applicable at all scales).  Scale independence or absence of characteristic scales in a process or a phenomenon is mathematically attractive.  However, in Nature these cannot be the case; for example: o If the set of points along a river’s bed had fractal dimension > 1 (meaning that number of sets of its δ-cover would be a power law of δ with exponent > 1 for arbitrary low δ) then the geometrical length of the river would be infinite and any particle of water would take infinite time to reach the sea. o If a Hurst-Kolmogorov process were applicable for arbitrary low time scales, it would entail infinite variance of the instantaneous, continuous-time process which would imply infinite energy (infrared catastrophe). o If an antipersistent Hurst-Kolmogorov process (with Hurst exponent H < 0.5) were applicable for arbitrary low time scales, it would entail negative autocovariance (anti-correlation) for arbitrary small lags which is absurd (in a natural process the autocorrelation should tend to 1 as lag tends to 0). D. Koutsoyiannis et al., From fractals to stochastics 9

Hasty use of stochastic concepts  Use of formulations and tools of stationary stochastic processes while making claims of nonstationarity (e.g. in a nonstationary process, the autocovariance and the spectral density are functions of two variables, one being related to “absolute” time—see Dechant and Lutz—2015, but they have typically been presented as functions of one variable).  Use of statistical estimations from data while claiming nonstationarity; however, a nonstationary process is nonergodic and thus the estimates are meaningless.  A classical example: reporting logarithmic slopes in empirical power spectra 𝑠 # < −1 (e.g. 𝑠 # = −1.5, etc.) for small frequencies w (tending to zero). Note that the superscript # denotes logarithmic derivative, i.e. #(

𝑓 𝑥) ≔

d(ln 𝑓(𝑥)) d⁡(ln 𝑥)⁡

=

𝑥𝑓⁡′ (𝑥) 𝑓(𝑥)

(1)

 A slope 𝑠 # (𝑤) < −1 is mathematically and physically possible for large w but infeasible for w → 0 (see proof in Appendix). Reported values 𝑠 # < −1 for small w are invalid and are due to inconsistent estimation algorithms (stemming from the fact that the periodogram constructed from empirical autocovariances is too rough and the estimation of slopes from this is too uncertain; cf. Koutsoyiannis, 2013a,b; Koutsoyiannis et al., 2013; Dimitriadis and Koutsoyiannis, 2015). D. Koutsoyiannis et al., From fractals to stochastics 10

Hasty use of stochastic concepts (contd.)

Spectral density, s(w)

 It is not difficult to use inappropriate estimators and get inconsistent results, as exemplified in the graph of the power spectrum.  In the example, 1024 data points have been generated from a stochastic process which has the theoretical power spectrum with the indicated varying slope (specifically, an HHK process—see below—with parameters κ = 0.5, H = 0.8, α = λ = 1—see Koutsoyiannis, 2014).  The standard empirical power 100 spectrum is too rough to recover the underlying model and its 10 parameters, and even by smoothing (here by averaging 1 from 8 segments) relies in high bias and a misleading constant 0.1 slope of –1.5. Power spectrum, empirical smoothed  The theoretically consistent 0.01 Power spectrum, theoretical asymptotic slopes (–0.6 and –2) Fitted power law: Slope = -1.5 can be recovered by other 0.001 methods (CBS—see below). 0.001 0.01 0.1 1 Frequency, w


Misspecification / misinterpretation of scaling laws  Experiment: A Google search with terms universal multifractal rainfall was performed (similar to the experiment in Koutsoyiannis, 2010).  The first (highest PageRank) paper was chosen and its first figure is reproduced here.


Misinterpretation of scaling laws (contd.)

E[(x(Δ))q]

 In the example illustrated in the graph, a time series with length N =213 = 8192 was generated from the Hurst-Kolmogorov process (see below) with Hurst coefficient H = 0.8 and Gaussian distribution N(1,1).  Some scaling laws seam to appear at a range of time scales, which are spurious. One could be misled to assume a multifractal behaviour and specify a K(q) function, where K(q) is the log-log slope of the raw moment E[(x(Δ))q] vs. the inverse time scale λ = 1/Δ for specified q.  The truth is that there is no multifractal behaviour here. As shown theoretically by Lombardo et al. (2014), (a) there is 1000 moment q = no constant slope (e.g., as λ → 0, or 1 2 Δ → ∞, K(q) = 0); also the slope 3 4 5 6 empirically estimated for small Δ 7 100 (large λ) is too low compared to its theoretical value.  This is a symptom of a more 10 general tendency in the fractal literature to treat observations (time series) deterministically, confusing random variables with 1 0.0001 0.001 0.01 0.1 1 their realizations and ignoring λ =1/Δ statistical bias and variation. D. Koutsoyiannis et al., From fractals to stochastics 13

Neglect of statistical bias and variation pdf

Mode  The example illustrates 10 Median that estimates of high1 order moments, which have been popular in 0.1 Mean, theoretical ≈ Mean, MC multifractal studies, have 0.01 no information content.  The graph presents 0.001 results of Monte Carlo simulation for the fifth 0.0001 moment of a Pareto 0.00001 distribution with shape 0.001 0.01 0.1 1 10 100 ͠m /m parameter 0.15 for sample size n = 100 (Papalexiou et al. 2010; see also Lombardo et al., 2014).  Here the theory guarantees that there is no estimation bias; however the distribution function is enormously skewed.  The mode is nearly two orders of magnitude less than the mean and the probability that a calculation, based on data, will reach the mean is two orders of magnitude lower than the probability of obtaining the mode. 5

5


Confusion between different scaling behaviours  Temporal scaling o It indicates dependence in time and is expressed as a power law of either:  autocorrelation vs. time lag and/or climacogram vs. time scale (Hurst);  structure function vs. time lag and/or differential climacogram vs. time scale (fractal/local behaviour).  Spatial scaling o It is similar to temporal scaling but indicating dependence in space.  State scaling o Totally irrelevant to temporal/spatial scaling; it is related to the marginal distribution of the process and indicates heavy-tailed distributions (power laws of probability of exceedence vs. state).  Scaling of high-order moments with time scale o Perhaps an artefact related to other types of scaling; usually spurious because high-order moments are not reliably estimated from data.  As already mentioned, scaling laws never extend to the entire range of scales.  Usually they are asymptotic laws, with different exponents at each edge.  Scaling laws, as asymptotic ones, abound because they are a mathematical necessity (Koutsoyiannis, 2014). D. Koutsoyiannis et al., From fractals to stochastics 15

Part 2: Stochastics


The meaning of randomness and stochastics Deterministic world view Indeterministic world view Sharp exactness Uncertainty Random = unpredictable, uncertain Regular variable x: it Random variable, x: an abstract mathematical represents a number entity whose realizations x belong to a set of possible numerical values. x is associated with a probability density (or mass) function f(x). Notice the different notation of random variables (underlined, Hemelrijk, 1966) from regular ones. Trajectory x(t): the Stochastic process x(t): A collection of (usually sequence of a system’s infinitely many) random variables x indexed by t states x as time t changes (typically representing time). It represents the evolution of some uncertain system over time. A realization (sample) x(t) of x(t) is a trajectory; if it is known at certain points ti it is a time series. Stochastics: The mathematics of random variables and stochastic processes. Stochastics = probability theory + statistics + stochastic processes D. Koutsoyiannis et al., From fractals to stochastics 17

From continuous time to discrete time processes X(t) ≔ (cumulative, nonstationary)

t

Most natural x(t) (instantaneous, processes continuous-time process) evolve in continuous time but they are observed in discrete time, xi := x(iΔ) (instantaneous process instantaneously or by averaging sampled at spacing Δ) Xi:=

0

Δ

2Δ

…

(i – 1)Δ

iΔ

t

(cumulative sampled at Important spacing Δ, nonstationary) note: The

= = X(iΔ)/Δ – Χ((i– 1)Δ)/Δ 0

Δ

2Δ

…

(i – 1)Δ

iΔ

t

(averaged at time scale Δ)

graphs display a realization of the process while the notation is for the process per se.


Second order properties of a stationary stochastic process  Autocovariance function, c(τ) := Cov[x(t), x(t + τ)], where τ is time lag.  Power spectrum (spectral density), s(w), where w is frequency (inverse time).  Structure function (semivariogram or variogram), ℎ(𝜏) ≔ 12 Var[𝑥(𝑡) − 𝑥(𝑡 + 𝜏)]. (𝛥)

 Climacogram, γ(Δ), where Δ denotes time scale, so that γ(Δ) := Var[𝑥𝑖 ].  All these properties are transformations of one another, i.e.: ∞

𝑠(𝑤) = 4 ∫0 𝑐 (𝜏) cos(2π𝑤𝜏) d𝜏, ℎ(𝜏) = 𝑐(0) − 𝑐(𝜏), 𝛾(𝛥) =

1 2 ∫0 (1

∞

𝑐 (𝜏) = ∫0 𝑠(𝑤) cos(2π𝑤𝜏) d𝑤 𝑐 (𝜏) = 𝑐(0) − ℎ(𝜏)

− 𝜉 )𝑐 (𝜉𝛥)d𝜉 ,

2 2 1 d (𝜏 𝛾(𝜏))

𝑐 (𝜏) = ⁡ 2

d𝜏2

(2) (3) (4)

 In estimation from data, the climacogram behaves better than all other tools, which involve high bias and statistical variation (Dimitriadis and Koutsoyiannis, 2015 Koutsoyiannis, 2016). The climacogram involves bias too, but this can be determined analytically and included in the estimation. D. Koutsoyiannis et al., From fractals to stochastics 19

Second order properties at discrete time  Once the continuous-time properties are determined, the discrete-time ones can be calculated.  For example, the autocovariance of the averaged process is: ( )

( )

( )

𝛥 𝑐𝑗 𝛥 = Cov [𝑥𝑖 𝛥 , 𝑥𝑖+𝑗 ]=

1

𝛤(|𝑗+1|𝛥)+𝛤((|𝑗−1|𝛥)

𝛥

2

2 ⁡(

− 𝛤 (|𝑗|𝛥))

(5)

where Γ(Δ) ≔ Var[X(Δ)] = Δ2γ(Δ).  Also, the power spectrum of the averaged process can be calculated from: ( ) ( ) (𝛥) 𝑠d𝛥 (𝜔) = 2𝑐0𝛥 + 4 ∑∞ 𝑐 cos(2π𝜔𝑗) 𝑗=1 𝑗

(6)

( ) where 𝜔 ≔ 𝑤𝛥, ⁡𝑠d𝛥 (𝜔) = 𝑠 (𝛥) (𝑤)/𝛥 (nondimensionalized frequency and spectral density, respectively).

 More details and additional cases can be found in Koutsoyiannis (2013b, 2016).


Climacogram based metrics of stochastic processes Metric / Usefulness Climacogram For the global asymptotic behaviour (Δ → ∞) Differential climacogram (DC) (or climacogram-based structure function, CBSF)

Definition γ(Δ) ≔ Var[xi(Δ)]

Comments For an ergodic process for Δ → ∞ γ(Δ) → 0 necessarily g(Δ) ≔ γ0 – γ(Δ) The definition presupposes that where γ0 = γ(0) is the variance of the instantaneous the variance γ0 is finite process x(t)

For the local asymptotic behaviour (Δ → 0) 2⁡ Climacogram-based 𝜓(𝑤) ≔ 𝛾(1⁄𝑤)𝑔(1⁄𝑤 ) 𝑤𝛾 0 spectrum (CBS) 2⁡𝛾(1⁄𝑤) 𝛾(1⁄𝑤) = (1 − ) For both the global and 𝑤 𝛾0 local asymptotic behaviour where w ≡ 1/Δ is frequency (as in the power spectrum)

It combines the climacogram and the CBSF; it is valid for both finite and infinite variance

Note: The DC is related to the structure function h(τ) by the same way as the climacogram is related to the autocovariance function c(τ): 2 2 1 d (𝜏 𝛾(𝜏))

𝑐 (𝜏 ) = ⁡ 2

d𝜏2

2 2 1 d (𝜏 𝑔(𝜏))

, ℎ (𝜏 ) = ⁡ 2

d𝜏2


Cautionary notes for model fitting  Direct estimation of any statistic of a process (except perhaps for the mean) is not possible merely from the data; we always need to assume a model.  Any statistical estimator 𝑠̂ of a true parameter s is biased either strictly (meaning: E⁡[𝑠̂ ⁡] ≠ 𝑠) or loosely (meaning: mode⁡[𝑠̂ ⁡] ≠ 𝑠).  Model fitting is necessarily based on discrete-time data and needs to consider the effects of (a) discretization and (b) bias.  The climacogram provides easy means to analytically estimate from its true expression (that in continuous time) both effects.  As an example, we consider a process with climacogram 𝛾(𝛥), from which we have a time series for an observation period T (multiple of Δ), each one giving (𝛥) the averaged process 𝑥𝑖 at a time step Δ, so that the sample size is n = 𝑇⁄𝛥.  The standard estimator 𝛾̂(𝛥) of the variance 𝛾(𝛥) of the averaged process is 𝛾̂(𝛥) ≔

1

(𝛥) ⁡ ∑𝑛𝑖=1 (𝑥𝑖 𝑛−1

(𝛵) 2 − 𝑥1 )

=

1

(𝛥) ⁄𝛥 ∑𝑇𝑖=1 ⁡ (𝑥 𝑖 𝑇 ⁄𝛥−1

−

(𝛵) 2 𝑥1 )

(7)

 As shown in Koutsoyiannis (2011, 2016) the bias can be calculated from E⁡[𝛾̂(𝛥)⁡] = 𝜂(𝛥, 𝛵)𝛾(𝛥),⁡⁡⁡⁡𝜂(𝛥, 𝛵) =

1−𝛾(𝑇)⁄𝛾(𝛥) 1−𝛥⁄𝑇

=

1−(𝛥⁄𝛵)2 𝛤(𝛵)⁄𝛤(𝛥) 1−𝛥⁄𝑇

(8)


Entropy and entropy production  The Boltzmann-Gibbs-Shannon entropy of a cumulative process X(t) with probability density function f(X; t) is a dimensionless quantity defined as: ⁡Φ[𝑋(𝑡)] ≔ Ε [– ln

𝑓(𝑋;𝑡) ] ℎ(𝑋)

∞

= − ∫−∞ ln

𝑓(𝑋;⁡𝑡) 𝑓(𝑋; ⁡𝑡)⁡𝑑𝑋 ℎ(𝑋)

(9)

where h(Χ) is the density of a background measure (typically Lebesgue).  The entropy production in logarithmic time (EPLT) is a dimensionless quantity, the derivative of entropy in logarithmic time (Koutsoyiannis, 2011): φ(t) ≡ φ[X(t)] ≔ Φ΄[X(t)] t ≡ dΦ[X(t)] / d(lnt) (10)  For a Gaussian process, the entropy depends on its variance Γ(t) only and is: Φ[X(t)] = (1/2) ln(2πe Γ(t)/h2), φ(t) = Γ΄(t) t / 2Γ(t) (11)  When the past (t < 0) and the present (t = 0) are observed, instead of the unconditional variance Γ(t) we should use a variance ΓC(t) conditional on the past and present: ΓC(t) ≈ 2Γ(t) – Γ(2t)/2,⁡⁡⁡⁡𝜑C (𝑡) =

𝛤C′ (𝑡)𝑡 2𝛤C (𝑡)

≈

(2𝛤′(𝑡)−𝛤′(2𝑡))𝑡 4𝛤(𝑡)−𝛤(2𝑡)

(12)


Resulting processes from maximizing entropy production  A Markov process: 𝑐 (𝜏) = 𝜆e−𝜏/𝛼 , −

1−e−𝛥⁄𝛼 𝛥⁄𝛼

)

maximizes entropy production for small times but minimizes it for large times.  A Hurst-Kolmogorov (HK) process: (14) 𝛾(𝛥) = 𝜆(𝛼/𝛥)2−2𝛨 maximizes entropy production for large times but minimizes it for small times  A Hybrid Hurst Kolmogorov process 𝐻−1 (15) 2𝜅 𝛾(𝛥) = 𝜆(1 + (𝛥⁄𝛼) ) 𝜅 maximizes entropy production both at small and large time scales.

φ(τ), φC(t)

(13)

Markov, unconditional Markov, conditional HK, unconditional+conditional HHK, unconditional HHK, conditional

1.75 1.5 1.25

Hurst coefficient, H

1 0.75

0.5 0.25 0 0.00001

0.001

0.1

10

1000

100000 t

100 γ(Δ), γC(Δ)

𝛾(𝛥) =

2𝜆 (1 𝛥⁄𝛼

2

10 1

0.1 0.01

Markov, unconditional Markov, conditional HK, unconditional HK, conditional HHK, unconditional HHK, conditional

0.001 0.0001 0.00001 0.00001

0.001

0.1

10

1000

Δ

100000


Part 3: Simulation of stochastic processes (at discrete time)


The symmetric moving average scheme  The so-called symmetric moving average (SMA) method (Koutsoyiannis 2000) can directly generate time series with any arbitrary autocorrelation function provided that it is mathematically feasible: 𝑥𝑖 = ∑∞ 𝑙=−∞ 𝑎|𝑙 | 𝑣𝑖+𝑙

(16)

where aj are coefficients calculated from the autocovariance function and vi is white noise averaged in discrete-time.  Assuming that we work for the averaged discrete-time process with power ( ) spectrum 𝑠d𝛥 (𝜔), it has been shown (Koutsoyiannis 2000) that the Fourier transform 𝑠d𝑎 (𝜔) of the al series of coefficients is related to the power spectrum of the discrete time process as ( ) 𝑠d𝑎 (𝜔) = √2𝑠d𝛥 (𝜔)

(17)

 Thus, to calculate al we first determine 𝑠d𝑎 (𝜔) from the power spectrum of the process and then we inverse the Fourier transform to estimate all al.


Handling of truncation error  It is expected that the coefficients al will decrease with increasing l and will be negligible beyond some q (l > q), so that we can truncate (16) to read 𝑞 𝑥𝑖 = ∑𝑙=−𝑞 𝑎|𝑙| 𝑣𝑖+𝑙 (18)  This would introduce some truncation error in the resulting autocovariance function. To adjust for this on the variance, we the calculate the al from 𝑎𝑙 = ⁡ 𝑎𝑙′ + ⁡𝑎⁡′′ ⁡ (19) where the coefficients 𝑎𝑙′ ⁡ are calculated from inversing the Fourier transform of either 𝑠d𝑎 (𝜔)⁡or⁡𝑠d𝑎 (𝜔)(1 − sinc(2π𝜔𝑞))(two options; Koutsoyiannis, 2016).  The constant 𝑎⁡′′ ⁡ is determined so that the variance is exactly preserved: 𝑞

𝑞

′ 𝛾(𝛥) = ∑𝑙=−𝑞 𝑎|2𝑙| = ∑𝑙=−𝑞(𝑎|𝑙| ⁡+ ⁡𝑎⁡′′ ⁡)

2

(20)

 Solving for 𝑎⁡′′ ⁡ , this yields: 𝛾(𝛥)−Σ𝛼′2

𝑎′′ = √ 𝑞

2𝑞+1

+(

Σ𝛼′

2

) ⁡−

2𝑞+1

Σ𝑎′ 2𝑞+1

(21)

𝑞

where Σ𝛼 ′ ≔ ∑𝑙=−𝑞 𝑎|′𝑙| ⁡and⁡Σ𝛼′2 ≔ ∑𝑙=−𝑞 𝑎′2|𝑙| . D. Koutsoyiannis et al., From fractals to stochastics 27

Handling of moments higher than second  In addition to being general for any second order properties (autocovariance function), the SMA method can explicitly preserve higher marginal moments.  Specifically, to produce a discrete-time process xi with coefficient of skewness 𝐶s,𝑥 we need to use a white-noise process vi with coefficient of skewness: 3/2

𝑞

𝐶s,𝑣 = ⁡ 𝐶s,𝑥

2 (∑𝑙=−𝑞 𝑎|𝑙| )

(22)

𝑞 3 ∑𝑙=−𝑞 𝑎|𝑙|

 Likewise, to produce a process xi with coefficient of kurtosis 𝐶k,𝑥 the process vi should have coefficient of kurtosis: 𝑞

2

𝑞

𝑞

2 2 2 𝐶k,𝑥 (∑𝑙=−𝑞 𝑎|𝑙| ) −6 ∑𝑙=−𝑞 ∑𝑘=−𝑞 𝑎|𝑙| 𝑎|𝑘| 𝐶k,𝑣 = ⁡ 𝑞 4 ∑𝑙=−𝑞 𝑎|𝑙|

(23)

 See details in Dimitriadis and Koutsoyiannis (2016).  Note that the method can also be used in multivariate processes, represented by vectors (Koutsoyiannis, 2000).


Simple marginal distributions for generation of nonGaussian white noise  Four-parameter distributions are needed to preserve skewness and kurtosis.  For light-tailed distributions of v we can use an extended and standardized version of the Kumaraswamy distribution (ESK) with distribution function: 𝐹 (𝑣 ) = 1 − (1 − (

𝑏 𝑣−𝑐 𝑎 𝑑

) )

(24)

 For heavy-tailed distributions we can use the Normal-Inverse Gaussian (NIG) with probability density: 𝑓 (𝑣 ) =

√𝑎2 +𝑏2 e𝑏+𝑎(𝑣−𝑐)/𝑑 π𝑑√1+((𝑣−𝑐 )/𝑑)2

K1 (√𝑎2 + 𝑏 2 √1 + ((𝑣 − 𝑐 )/𝑑)2 )

(25)

with K1 denoting a modified Bessel function of the third kind. Even though its mathematical form is involved, its moments are calculated analytically and the generation from the distribution is easy.  In both cases v is the value of the random variable, a and b are dimensionless shape parameters, c is location parameter and d scale parameter; c and d have same dimensions as v (see details in Dimitriadis and Koutsoyiannis, 2016). D. Koutsoyiannis et al., From fractals to stochastics 29

Range of skewness and kurtosis covered by the two distributions 100

global min min ESK max ESK min NIG ESK: a = 0.5 ESK: a = 1 ESK: b = 0.1 ESK: b = 1 NIG: a = ± 0.05 NIG: a = ± 0.5 NIG: b = 0.1 NIG: b = 0.5

Ck

10

1 -10

-5

0

5

10

Cs

Isopleths of parameters 𝑎⁡or⁡𝑏 of the ESK and the NIG distribution for the indicated skewness and kurtosis. D. Koutsoyiannis et al., From fractals to stochastics 30

Performance in the generation of non-Gaussian white noise 2.0

3

Weibull (λ=1, k=0.8) maximized entropy simulation

1.5

2

f(x) = xk-1 exp(-x/θ) / θk / Γ(k), x > 0

f(x)

f(x)

f(x) =(k/λ) (x/λ)k-1 exp(-(x/λ)k) , x ≥ 0 1.0

λ0=1.217; λ1=0.645; λ2=-2.065; λ3=3.788; λ4=9.8

0.5

λ0=2.33; λ1=0.29; λ2=-0.947; λ3=1.864; λ4=4.79

1

0.0

0

0

1

2

3

4

0

5

1

2

3

1.0

5

lognormal (μ=0; σ=0.5) maximized entropy simulation

0.8

5

Pareto (a=16; b=1) maximized entropy simulation

4

f(x) = exp(-(ln(x)-μ)2/2σ2)/(xσ√2π) , x > 0

f(x) = a ba / xa+1 , x ≥ b

3

f(x)

f(x)

0.6

4

x

x

1

λ0=0.007; λ1=-0.072; λ2=0.275; λ3=-0.596; λ4=1.133

0.4

λ0=1.701E+7; λ1=0.073; λ2=1.814; λ3=3.402; λ4=5

2

1

0.2

0

0.0 0

2

gamma (k=0.5, θ=1) maximized entropy simulation

1

2

3

x

4

5

1.0

1.1

1.2

1.3

1.4

1.5

x

Four two-parameter probability density functions, their approximations by maximum entropy distributions using four moments, i.e., 𝑓(𝑥) = 𝜆0 exp⁡(− ∑4𝑖=1(𝑥/𝜆𝑖 )𝑖 ), and by the empirical density from a single synthetic time series with n = 105. D. Koutsoyiannis et al., From fractals to stochastics 31

Part 4: Applications


Application 1: Microscale (turbulence)  Estimation of high moments involves large uncertainty and cannot be reliable in the typically short time series of geophysical processes.  On the contrary, high moments can be reliably estimated from large samples recorded in laboratory experiments at sampling intervals of μs.  Here we use grid-turbulence data provided by the Johns Hopkins University (http://www.me.jhu.edu/meneveau/datasets/datamap.html).  This dataset consists of 40 time series with n = 36×106 data points of longitudinal wind velocity along the flow direction, all measured at a sampling time interval of 25 μs by X-wire probes placed downstream of the grid (Kang et al., 2003).  By standardizing all series we formed a sample of 40 × 36 ×106 = 1.44 ×109 values to estimate the marginal distribution, and an ensemble of 40 series, each with 36 ×106 values to estimate the dependence structure through the climacogram.  We also performed simulation using the SMA framework with n =106 values.


Marginal distribution  The time series are nearly-Gaussian but not exactly Gaussian (skewness = 0.23; kurtosis = 3.08). This divergence of fully developed turbulent processes from normality has been also derived theoretically (Wilczek et al., 2011).  Interestingly, these small differences from normality result in highly nonnormal distribution of the white noise of the SMA model (skewness = 3.26; kurtosis = 12.30!) x := (w - wm)/ws -6

-4

-2

0

2

f(x)

1.E+00

1.E-02

1.E-04

observed N(0,1) simulation

1.E-06

4

6

Probability density function of the mean standardized time series of turbulent velocity compared to that of a single simulation using the SMA scheme preserving the first four moments; the standard normal distribution N(0,1) is also shown. D. Koutsoyiannis et al., From fractals to stochastics 34

Stochastic dependence of the turbulent velocity process Sum of two equally weighted processes, an HHK and a Markov: −𝛥⁄𝛼 𝐻−1 𝜆 𝜆 1 − e 𝛾(𝛥) = (1 + (𝛥⁄𝛼)2𝜅 ) 𝜅 + (1 − )⁡ ⁄ ⁄ 2 𝛥 𝛼 𝛥 𝛼

(26)

m (ms) Δ 1.E+00

γ

1.E-02 1.E+00

1.E-01

true expected

simulated model observed 1.E-02

1.E+02

1.E+04

1.E+06

Climacogram of the turbulent velocity process (observed is the average from the 40 time series); the five parameters of the model are estimated as: 𝜆 = 1.016, 𝛼 = 14 ms, 𝜅 = 0.375, 𝐻 = 0.84. D. Koutsoyiannis et al., From fractals to stochastics 35

Other second-order properties of the model compared to data 1.0E+02 1.0E+00 1.0E-01

CBS (s)

Differential climacogram CBV

1.0E+00

1.0E-02

1.0E-03 1.0E-05

observed true (model) expected (model) simulated 1.0E-03

1.0E-01 Δm (s)

1.0E+01

1.0E+03

1.0E-02

- 5/3

observed

1.0E-04

true (model) 1.0E-06

expected (model) simulated

1.0E-08 1.0E-03

1.0E-01

1.0E+01

w (Hz)

1.0E+03

1.0E+05

⁡

1.0E+02

power spectrum (s)

1.0E+00

- 5/3

1.0E-02

1.0E-04

observed true (model)

1.0E-06

expected (model) simulated

1.0E-08 1.0E-03

1.0E-01

1.0E+01

1.0E+03

1.0E+05

The power spectrum is rough and contains several artefacts that result from discretization and bias. The climacogram based metrics provide better curves.

w (Hz) D. Koutsoyiannis et al., From fractals to stochastics 36

Kurtosis of velocity increments The change of kurtosis of the velocity increments (differences) with increased time distance, τ (lag), is related to the intermittent behaviour of turbulence (Batchelor and Townsend, 1949). Therefore it is important to preserve this variation. 6

Empirical and simulated kurtosis vs. lag.

simulated model observed

Ck

5

4

3 1.E-01

1.E+00

1.E+01

τ (ms)

1.E+02

Not a mystery to have large kurtosis (> 5 here) in velocity increments, even though the velocity is 1.E+03 almost normal (kurtosis = 3.08) D. Koutsoyiannis et al., From fractals to stochastics 37

Even higher moments 𝑝

𝑆𝑝 ≔ E[|𝑥𝑖 − 𝑥𝑖+𝑟 | ] ≈ 𝑟 𝜁𝑝 5

observed She-Leveque (1994) simulated HHK-N(0,1): ~ K41 Markov-PBF

4

ζp

3 2 1 0 0

5

10

p

15

Not a mystery that empirical values depart from the straight line (regarded as manifestation of intermittency). No provision is needed to reproduce it; a good parsimonious model (with Hurst behaviour and slightly nonGaussian distribution) suffices.


Application 2: Medium scale (wind)  We can estimate high moments in geophysical processes accurately only after analyzing thousands of short time series.  Here we use hourly wind speed data by NOAA (www.ncdc.noaa.gov).  This dataset consists of 15 000 time series around the globe with 10 min average measurements every one hour. After several quality and quantity tests we ended up with approximately 3500 stations.

 We also performed multiple simulations using the SMA framework with n =3 ×105 values.

latitude (deg)

 By standardizing all series we formed a sample of ~109 values to estimate the 90 marginal distribution, and an ensemble of 3500 series, each 60 5 with 3 × 10 values on the 30 average, to estimate the dependence structure through 0 the climacogram. -30

-60 -90 -180

-120

-60

0

60

120

180

longitude (deg) D. Koutsoyiannis et al., From fractals to stochastics 39

Marginal distribution v/vs 0

5

10

15

20

25

30

1.E+00

Wind speed distribution (from ~109 values):

1.E-02

observed

1.E-03

modified Pareto IV

1.E-04

𝐹 (𝑣) = 1 − (1 + (𝑣⁄𝛼𝑣s )2 )−𝛽/2

1.E-05

where α = 2 and β = 3.

(27)

1.E-06 1.E-07 1.E-08 15

Sample skewness and kurtosis coefficients of 1000 hourly wind stations as well as of the corresponding white noise process of the SMA model.

10

Ck

f(x)

1.E-01

min global min ESK

5

max ESK min NIG wind (data)

wind (Vsma for H=0.9)

0

0

1

2

Cs

3

4

5


Stochastic dependence of the wind process Δ k (h)

1.E+00 1.E+00

1.E+01

1.E+02

1.E+03

1.E+04

1.E+05

1.E+06

γ

1.E-01

true 1.E-02

expected observed simulated simulation

1.E-03 Climacogram of the wind speed process (observed is the average from the 3500 time series); the four parameters of the model are estimated as: α = 1 h, κ = 0.5, λ = 1.3 and H = 0.82. D. Koutsoyiannis et al., From fractals to stochastics 41

Application 3: Megascale (temperature) NSSTC Lohle Taylor EPICA Zachos

Orbital forcing, 10-100 kyears

CRU Moberg GRIP Huybers Veizer

0.1

28 months

Common Purely random change perception: 1 month

Variance ratio, γ(Δ) / γ(1 month)

1

The actual climatic variability at the scale of 100 million years equals that of 28 months of a purely random climate!

The HK behaviour extends over all time scales

0.01 1.E-02 1.E-01 1.E+00 1.E+01 1.E+02 1.E+03 1.E+04 1.E+05 1.E+06 1.E+07 1.E+08

Climacograms constructed from the indicated instrumental and proxy data series (Markonis and Koutsoyiannis, 2013)

Scale, Δ (years)


Epilogue  Stochastic processes in continuous time offer a strong basis for modelling and interpretation of natural behaviours.  We owe the well-founded and rigorous mathematical theory of stochastics to Kolmogorov (1931, 1933, 1938), including the foundation of scaling processes (Kolmogorov, 1940). This theory has often be distorted but there exist textbooks consistent with it (e.g. Papoulis, 1991).  Calculating values of sample statistics without considering their statistical properties (bias and statistical variation) can yield misleading results.  Without proper attention to the underlying stochastics, we can even “identify” phenomena that do not exist and take statistical sampling effects as natural behaviours.  A general methodology for data analysis and construction of synthetic time series is possible provided that we have a good understanding of stochastics.

Data analysis

Inductive inference

Stochastics Prediction with Logical world view quantified uncertainty D. Koutsoyiannis et al., From fractals to stochastics 43

Appendix: Proof of infeasibility of too steep slopes in power spectrum for low frequencies  This proof is summarized here from Koutsoyiannis (2013b) and Koutsoyiannis et al. (2013).  Let us assume the contrary, i.e., that for frequency range 0 ≤ w ≤ ε (with ε however small) the loglog derivative is 𝑠 # (𝑤) = −𝛽, or else s(w) = α w–β where α and β are constants, with 𝛽 > 1.  As a result of (2) and (4) the climacogram is related to power spectrum by: ∞

(28)

𝛾 (𝛥) = ∫0 𝑠(𝑤) ⁡sinc 2 (π𝑤𝛥) d𝑤

 The sinc2 function within the integral takes significant values only for w < 1/Δ (cf. Papoulis, 1991, p. 433). Hence, assuming a scale Δ ≫ 1/ε, and with reference to (28) we may write: ∞

𝜀

𝛾 (𝛥) = ∫0 𝑠(𝑤) sinc 2 (π𝑤𝛥) d𝑤 ≈ ∫0 𝛼𝑤 −𝛽 ⁡sinc 2 (π𝑤𝛥) d𝑤

(29)

 On the other hand, it is easy to verify that, for 0 < w < 1/Δ, (30)

sinc(π𝑤𝛥) ≥ 1 − 𝑤𝛥 ≥ 0  Since ε ≫ 1/Δ, while the function in the integral (29) is nonnegative, 𝜀

1⁡/𝛥

𝛾 (𝛥) ≈ ∫0 𝛼𝑤 −𝛽 ⁡sinc 2 (π𝑤𝛥) d𝑤 ≥ ∫0

1⁡/𝛥

𝛼𝑤 −𝛽 ⁡sinc 2 (π𝑤𝛥) d𝑤 ≥ ∫0

𝛼𝑤 −𝛽 (1 − 𝑤𝛥)2 ⁡d𝑤

(31)

 Substituing ξ = wΔ in (31), we find: 1⁡

𝛾 (𝛥) ≥ 𝑎𝛥𝛽−1 ∫0 𝜉 −𝛽 (1 − 𝜉 )2 ⁡d𝜉

(32)


Appendix (contd.)  To evaluate the integral in (32) we take the limit for q → 0 of the integral: 𝐵(𝑞 ) ≔

1⁡ ∫𝑞 𝜉 −𝛽 (1

)2

− 𝜉 ⁡d𝜉 ⁡ =

𝑞1−𝛽 −1 𝛽−1

−2

𝑞2−𝛽 −1 𝛽−2

+

𝑞3−𝛽 −1 𝛽−3

(33)

 Clearly, for β > 1 the first term of the latter integral diverges for q → 0, i.e., B(0) = ∞ and thus, by virtue of the inequality (32), γ(Δ)= ∞. For a (mean) ergodic processes γ(Δ) should necessary tend to 0 for Δ → ∞ (Papoulis, 1991, p. 429). Therefore, the process is non-ergodic.† This analysis generalizes a result by Papoulis (1991, p. 434) who shows that an impulse at w = 0 corresponds to a non-ergodic process.  In a non-ergodic process there is no possibility to infer statistical properties from the samples, so the statistical analyses are in vain and hence the reported results not meaningful.  Sometimes reported slopes 𝑠 # < −1 are interpreted as indications of nonstationarity. Such interpretations are equally invalid because even the definition of the power spectrum as a function of frequency only (as well as those of autocorrelation and climacogram as functions of lag and scale, respectively) assumes stationarity.

†

It is interesting to note that, if |β| < 1, the integral in (29) can be evaluated to give: ∞

𝛾(𝛥) ≈ 𝛼 ∫ 𝑤 0

−𝛽

⁡sinc

2 (π𝑤𝛥)

sin(π𝛽⁄2) (2π)𝛽 𝛼Γ(1 − 𝛽) d𝑤 = ⁡ (π𝛽⁄2) 2(𝛽 + 1)𝛥1−𝛽

Clearly, for Δ → ∞, the last expression gives γ(Δ) → 0 and thus for |β| < 1 the process is mean ergodic. D. Koutsoyiannis et al., From fractals to stochastics 45

References Batchelor, G. K. and Townsend, A. A. (1949), The nature of turbulent motion at large wave-numbers, Proc. R. Soc. Lond. A ,199, 238-255. Beran, J., Feng, Y., Ghosh, S. and Kulik, R. (2013), Long-Memory Processes, Long-Memory Processes, Springer. Dechant, A. and Lutz, E. (2015), Wiener-Khinchin theorem for nonstationary scale-invariant processes, Physical Review Letters, 115(8), p.080603 Dimitriadis, P., and Koutsoyiannis, D. (2015), Climacogram versus autocovariance and power spectrum in stochastic modelling for Markovian and Hurst–Kolmogorov processes, Stochastic Environm. Research & Risk Assessment, 29 (6), 1649–1669, doi:10.1007/s00477-015-1023-7. Dimitriadis, P., and Koutsoyiannis, D. (2016), Stochastic synthesis approximating any process dependence and distribution (in preparation). Gneiting, T., and Schlather, M. (2004), Stochastic models that separate fractal dimension and the Hurst effect, Society for Industrial and Applied Mathematics Review, 46 (2), 269-282. Graham, L. and Kantor, J.-M. (2009), Naming Infinity: A True Story of Religious Mysticism and Mathematical Creativity, Harvard University Press. Kang, H. S., Chester, S., and Meneveau, C. (2003), Decaying turbulence in an active-grid-generated flow and comparisons with large-eddy simulation, Journal of Fluid Mechanics, 480, 129–160. Hemelrijk, J., Underlining random variables, Statistica Neerlandica, 20 (1), 1-7, 1966. Kolmogorov, A. N. (1931), Uber die analytischen Methoden in der Wahrscheinlichkcitsrechnung, Math. Ann. 104, 415-458. (English translation: On analytical methods in probability theory, In: Kolmogorov, A.N., 1992. Selected Works of A. N. Kolmogorov - Volume 2, Probability Theory and Mathematical Statistics A. N. Shiryayev, ed., Kluwer, Dordrecht, The Netherlands, pp. 62-108). Kolmogorov, A. N. (1933), Grundbegrijfe der Wahrscheinlichkeitsrechnung, Ergebnisseder Math. (2), Berlin. (2nd English Edition: Foundations of the Theory of Probability, 84 pp. Chelsea Publishing Company, New York, 1956). Kolmogorov, A. N. (1938), A simplified proof of the Birkhoff-Khinchin ergodic theorem, Uspekhi Matematicheskikh Nauk, 5, 52–56. (English edition: Kolmogorov, A.N., 1991, Selected Works of A. N. Kolmogorov - Volume 1, Mathematics and Mechanics, Tikhomirov, V. M. ed., Kluwer, Dordrecht, The Netherlands, pp. 271–276). Kolmogorov, A. N. (1940), Wiener spirals and some other interesting curves in a Hilbert space. Dokl. Akad. Nauk SSSR, 26, 115–118. English translation in: V.M. Tikhomirov, ed., 1991, Selected works of A.N. Kolmogorov, Volume I: Mathematics and mechanics, 324–326. Springer, Berlin. Koutsoyiannis, D. (2000), A generalized mathematical framework for stochastic simulation and forecast of hydrologic time series, Water Resources Research, 36 (6), 1519–1533. Koutsoyiannis, D. (2010), Some problems in inference from time series of geophysical processes (solicited), European Geosciences Union General Assembly 2010, Geophysical Research Abstracts, Vol. 12, Vienna, EGU2010-14229, European Geosciences Union (http://www.itia.ntua.gr/973/). Koutsoyiannis, D. (2011), Hurst-Kolmogorov dynamics as a result of extremal entropy production, Physica A, 390 (8), 1424–1432.


References (contd.) Koutsoyiannis, D. (2013a), Climacogram-based pseudospectrum: a simple tool to assess scaling properties, European Geosciences Union General Assembly 2013, Geophysical Research Abstracts, Vol. 15, Vienna, EGU2013-4209, European Geosciences Union, (http://itia.ntua.gr/1328). Koutsoyiannis, D. (2013b), Encolpion of Stochastics: Fundamentals of Stochastic Processes, Athens: Department of Water Resources and Environmental Engineering – National Technical University of Athens (http://www.itia.ntua.gr/1317/). Koutsoyiannis, D. (2014), Random musings on stochastics (Lorenz Lecture), AGU 2014 Fall Meeting, San Francisco, USA, American Geophysical Union, doi:10.13140/RG.2.1.2852.8804 (http://www.itia.ntua.gr/en/docinfo/1500/). Koutsoyiannis, D. (2016), Generic and parsimonious stochastic modelling for hydrology and beyond, Hydrological Sciences Journal, 61 (2), 225– 244, doi:10.1080/02626667.2015.1016950. Koutsoyiannis, D., Lombardo, F., Volpi, E., and Papalexiou S. M. (2013), Is consistency a limitation? — Reply to “Further (monofractal) limitations of climactograms” by Lovejoy et al., Comment in the review of “Just two moments! A cautionary note against use of high-order moments in multifractal models in hydrology” by Lombardo et al., Hydrol. Earth Syst. Sci. Discuss., 10, C5397, (http://www.hydrol-earth-syst-scidiscuss.net/10/C5397/2013/hessd-10-C5397-2013-supplement.pdf). Lombardo, F., Volpi, E., Koutsoyiannis, D., and Papalexiou, S. M. (2014), Just two moments! A cautionary note against use of high-order moments in multifractal models in hydrology, Hydrology and Earth System Sciences, 18, 243–255. Mandelbrot, B. B. (1982), The Fractal Geometry of Nature, W. H. Freeman, New York. Mandelbrot, B. B. (1999), Multifractals and 1/ƒ Noise: Wild Self-Affinity in Physics (1963-1976), Springer. Markonis, Y., and Koutsoyiannis, D. (2013), Climatic variability over time scales spanning nine orders of magnitude: Connecting Milankovitch cycles with Hurst–Kolmogorov dynamics, Surveys in Geophysics, 34(2), 181–207. Papalexiou, S. M., Koutsoyiannis, D., and Montanari, A. (2010), Mind the bias!, STAHY Official Workshop: Advances in statistical hydrology, Taormina, Italy, International Association of Hydrological Sciences. Papoulis, A. (1991), Probability, Random Variables, and Stochastic Processes, 3rd ed. New York: McGraw-Hill. Yaglom, A. M. (1987), Correlation Theory of Stationary and Related Random Functions, Springer. Veneziano, D. and Langousis, A. (2010), Scaling and fractals in hydrology, Advances in Data-based Approaches for Hydrologic Modeling and Forecasting, edited by B. Sivakumar and R. Berndtsson, 145 pages, World Scientific. von Kármán, T. (1940), The engineer grapples with nonlinear problems, Bull. Amer. Math. Soc., 46, 615-683, doi: 10.1090/S0002-9904-194007266-0. Wackernagel, H. (1995), Multivariate Geostatistics, Springer, Berlin. Wackernagel, H. (1998), Multivariate Geostatistics, 2nd completely revised edition, Springer, Berlin. Wilczek, M., Daitche, A., and Friedrich, R. (2011), On the velocity distribution in homogeneous isotropic turbulence: correlations and deviations from Gaussianity, Journal of Fluid Mechanics, 676, 191-217.