Geometry of Information Integration
arXiv:1709.02050v1 [cs.IT] 7 Sep 2017
Shun-ichi Amari^{1,2}, Naotsugu Tsuchiya^{3}, and Masafumi Oizumi^{1,2}

^{1} RIKEN Brain Science Institute
^{2} Araya Inc.
^{3} School of Psychological Sciences, Monash University
Abstract

Information geometry is used to quantify the amount of information integration within multiple terminals of a causal dynamical system. Integrated information quantifies how much information is lost when a system is split into parts and the information transmission between the parts is removed. Multiple measures of integrated information have been proposed. Here, we analyze four of the previously proposed measures and elucidate their relations from the viewpoint of information geometry. Two of them use dually flat manifolds to define the split model, while the other two use curved manifolds. We show that there are hierarchical structures among the measures, and we provide explicit expressions for them.
1 Introduction
It is an interesting problem to quantify how much information is integrated in a multi-terminal causal system. The concept of information integration was introduced by Tononi and colleagues in Integrated Information Theory (IIT), which attempts to quantify the levels and contents of consciousness [1, 2, 3]. Inspired by Tononi's idea, many variants of integrated information have been proposed [4, 5, 6, 7]. From a perspective different from that of IIT, Ay independently derived the same measure as the integrated information proposed in [4] in order to quantify the complexity of a system [8, 9].
In this paper, we use information geometry [10] to clarify the nature of various measures of integrated information as well as the relations among them. Consider a joint probability distribution p(x, y) of sender X and receiver Y, where x and y are vectors consisting of n components, denoting the actual values of X and Y. Here, y is stochastically generated depending on x; that is, information is sent from the sender X to the receiver Y. We consider a Markov model, where x_{t+1} (= y) is generated from x_t (= x) stochastically by a transition probability matrix p(x_{t+1}|x_t). In this way, we quantify how much information is integrated within a system through one step of state transition. To quantify the amount of integrated information, we need to consider a split version of the system in which information transmission between different elements is removed, so that we can compare the original joint probability with the split one. The joint probability distribution of the split model is denoted by q(x, y). We define the amount of information integration as the minimized Kullback-Leibler (KL) divergence between the original distribution p(x, y) and the split distribution q(x, y),

Φ = min_q D_KL[p(x, y) : q(x, y)],    (1)
which quantifies to what extent p(x, y) and q(x, y) differ. Minimizing the KL divergence means selecting the best approximation to the original distribution p(x, y) among the split distributions q(x, y). We need to choose a reasonable split model; for each distinct version of the split model, a corresponding measure of integrated information can be derived [8, 9, 4, 6, 7]. The present paper studies four reasonable split models and the respective measures of integrated information. Among the four, Φ_G, the geometric Φ defined in [7], is what we believe to be the most reasonable measure of information integration, in the sense that it purely quantifies causal influences between parts, although the others have their own meanings and useful characteristics.
2 Markovian Dynamical Systems
We consider a Markovian dynamical system

x_{t+1} = T x_t,    (2)
where x_t is the state of the system at time t and x_{t+1} is the state at the next time step t + 1; both are vectors consisting of n elements. T is a state transition operator, represented by the conditional probability distribution of the next state x_{t+1} given the current state x_t, p(x_{t+1}|x_t), which is called a transition probability matrix. Throughout this paper, we write x for x_t and y for x_{t+1} for ease of notation. Given the probability distribution of x at time t, p(x), the probability distribution of the next state, p(y), is given by

p(y) = Σ_x p(y|x) p(x),    (3)

and the joint probability distribution is given by

p(x, y) = p(y|x) p(x).    (4)
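As a concrete illustration of Eqs. (2)-(4), the following minimal sketch builds the joint distribution of a two-element binary system; the initial distribution and the transition matrix are random illustrative choices, not quantities taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# The 4 joint states of x = (x1, x2); y ranges over the same 4 states.
states = [(0, 0), (0, 1), (1, 0), (1, 1)]

# p(x): an arbitrary strictly positive distribution over the 4 states.
p_x = rng.random(4)
p_x /= p_x.sum()

# p(y|x): a 4x4 transition probability matrix; row a is p(y | x = states[a]).
p_y_given_x = rng.random((4, 4))
p_y_given_x /= p_y_given_x.sum(axis=1, keepdims=True)

# Eq. (4): joint distribution p(x, y) = p(y|x) p(x), stored as a 4x4 table.
p_xy = p_y_given_x * p_x[:, None]

# Eq. (3): marginal distribution of the next state, p(y) = sum_x p(y|x) p(x).
p_y = p_xy.sum(axis=0)

assert np.isclose(p_xy.sum(), 1.0) and np.isclose(p_y.sum(), 1.0)
```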
Throughout the paper, we use p(x) and p(y) to mean p_X(x) and p_Y(y), which explicitly denote the distributions of X and Y. The state x is supported by n terminals, and the information at terminals x_1, x_2, ..., x_n is integrated to give the information in the next state y = (y_1, ..., y_n), so that each y_i depends on all of x_1, ..., x_n. We quantify how much information is integrated among different terminals through a state transition. All such information is contained in the joint probability distribution p(x, y). We use a general model M to represent p(x, y), called a full model, which is a graphical model in which all the terminals of sender X and receiver Y are fully connected. We consider the discrete case, in particular the binary case, in which x_i and y_i are binary, taking values 0 or 1, although generalization to other cases (e.g., continuous variables, more than two discretization levels) is not difficult. We also study the case where continuous random variables are subject to Gaussian distributions. In order to quantify the amount of information integration, we consider a "split model" M_S, in which information transmission from one terminal x_i to the other terminals y_j (j ≠ i) is removed.
Figure 1: Full model p(x, y).
Let q(x, y) be the joint probability distribution of x and y in a split model. The amount of information integration in p(x, y) is measured by the KL divergence from p(x, y) to M_S, that is, the KL divergence from p(x, y) to q*(x, y), the particular instantiation of the split model that is closest to p(x, y). Integrated information is defined as the minimized KL divergence between the full model p and the split model q [7],

Φ = min_{q∈M_S} D_KL[p(x, y) : q(x, y)]
  = D_KL[p(x, y) : q*(x, y)].

Depending on the definition of the split model M_S, different measures of integrated information are obtained. Below, we elucidate the nature of the other three candidate measures of integrated information and their relations.
3 Stochastic Models of Causal Systems

3.1 Full model
A full model M, p(x, y), is a graphical model in which all the nodes (terminals) are connected (Fig. 1). We consider the binary case. In that case, p(x, y) is an exponential family and can be expanded as

p(x, y) = exp( Σ θ_i^X x_i + Σ θ_j^Y y_j + Σ θ_{ij}^{XX} x_i x_j + Σ θ_{ij}^{YY} y_i y_j + Σ θ_{ij}^{XY} x_i y_j + h(x, y) − ψ ),    (5)
where we show the linear and quadratic terms explicitly by using the parameters θ_i^X, θ_j^Y, θ_{ij}^{XX}, θ_{ij}^{YY}, θ_{ij}^{XY}. h(x, y) denotes the higher-order terms in x and y, and the last term ψ is the free energy (or cumulant generating function) corresponding to the normalizing factor. The set of distributions in the full model forms a dually flat statistical manifold [10]. We hereafter neglect the higher-order terms, since they vanish in the split models we consider. Then the parameters

θ = ( θ_i^X, θ_j^Y, θ_{ij}^{XX}, θ_{ij}^{YY}, θ_{ij}^{XY} )    (6)

form an e-coordinate system to specify a distribution p(x, y). The dual coordinate system, the m-coordinate system, is denoted by η,

η = ( η_i^X, η_j^Y, η_{ij}^{XX}, η_{ij}^{YY}, η_{ij}^{XY} ).    (7)
The components of η are expectations of the corresponding random variables. For example,

η_{ij}^{XX} = E[x_i x_j],    (8)
η_{ij}^{XY} = E[x_i y_j],    (9)

where E denotes the expectation. In the following, we consider the case where the number of elements is 2 (n = 2) for explanatory purposes; the generalization to larger n is straightforward.
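For instance, the η-coordinates of Eqs. (8) and (9) can be read off a joint table by direct summation, as in the following sketch; the 4 × 4 table here is a random stand-in for an actual joint distribution p(x, y).

```python
import numpy as np
from itertools import product

states = list(product((0, 1), repeat=2))      # (0,0), (0,1), (1,0), (1,1)

rng = np.random.default_rng(1)
p_xy = rng.random((4, 4))
p_xy /= p_xy.sum()                            # placeholder joint table p(x, y)

def eta_XX(i, j):
    # eta_ij^XX = E[x_i x_j]; summing over all (x, y) pairs marginalizes y.
    return sum(p_xy[a, b] * x[i] * x[j]
               for a, x in enumerate(states) for b in range(4))

def eta_XY(i, j):
    # eta_ij^XY = E[x_i y_j]
    return sum(p_xy[a, b] * x[i] * y[j]
               for a, x in enumerate(states) for b, y in enumerate(states))

print(eta_XX(0, 1), eta_XY(0, 1))             # E[x1 x2] and E[x1 y2]
```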
3.2 Fully split model
Figure 2: Fully split model q(x, y).

Ay considered a split model from the viewpoint of the complexity of a system [8, 9]. The split model q(y|x) is given by

q(y|x) = Π_i q(y_i|x_i),    (10)
where the conditional probability distribution of the whole system, q(y|x), is fully split into those of the parts. We call this model the "fully split model" M_FS. The corresponding measure was also introduced by Barrett and Seth [4], following the measure of integrated information proposed by Balduzzi and Tononi [2]. This split model deletes the branches connecting X_i and Y_j (i ≠ j) and also the branches connecting different Y_i and Y_j (i ≠ j). (Here, we use capital letters X and Y to emphasize random variables rather than their values.) This split model is reasonable because, when the terminals Y_i are split, all the branches connecting Y_i to the other nodes should be deleted except for the branches connecting X_i and Y_i. Branches connecting X_i and X_j remain as they are (Fig. 2). Note, however, that even though the branches connecting Y_i and Y_j are deleted, this does not imply that Y_i and Y_j (i ≠ j) are independent: when the inputs X_i and X_j are correlated, Y_i and Y_j are also correlated, even though no branches connect X_i and Y_j or Y_i and Y_j. When n = 2, the random variables X_i and Y_j have a Markovian structure,

Y_1 − X_1 − X_2 − Y_2,    (11)

so that Y_1 and Y_2 are conditionally independent when (X_1, X_2) is fixed. Also, X_2 and Y_1 (or X_1 and Y_2) are conditionally independent when X_1 (or X_2) is fixed. These constraints correspond to putting

θ_{12}^{XY} = θ_{21}^{XY} = θ_{12}^{YY} = 0    (12)
in the θ-coordinates. These are linear constraints in the θ-coordinates; thus, the fully split model M_FS is an exponential family and an e-flat submanifold of M. Given p(x, y) ∈ M, let q*(x, y) be the m-projection of p to M_FS. Then q*(x, y) is given by the minimizer of the KL divergence,

q*(x, y) = argmin_{q(x,y)∈M_FS} D_KL[p(x, y) : q(x, y)].    (13)
We use the mixed coordinate system of M,

ξ = ( η_i^X, η_j^Y, η_{ij}^{XX}, η_{ij}^{YY}, η_{11}^{XY}, η_{22}^{XY}; θ_{12}^{XY}, θ_{21}^{XY}, θ_{12}^{YY} ).    (14)

Then M_FS is specified by (12). Because of the Pythagorean theorem, the m-projection of p to M_FS that minimizes the KL divergence D_KL[p : M_FS] is explicitly given by

ξ* = ( η_i^X, η_j^Y, η_{ij}^{XX}, η_{ij}^{YY}, η_{11}^{XY}, η_{22}^{XY}; 0 )    (15)

in the ξ-coordinate system, where the η-part is the same as that of the mixed coordinates of p(x, y). By simple calculations, we obtain

q*(x, y) = p(x) p(y_1|x_1) p(y_2|x_2),    (16)

which means

q*(x) = p(x),    (17)
q*(y|x) = Π_i p(y_i|x_i).    (18)
The corresponding measure of integrated information is given by

Φ_FS = Σ_i H[Y_i|X_i] − H[Y|X],    (19)

where H[Y_i|X_i] and H[Y|X] are the conditional entropies of the corresponding random variables. This measure was termed "stochastic interaction" by Ay [8]. While Φ_FS is straightforward in its derivation and concept, it has an undesirable property as a measure of integrated information. Specifically, as we proposed in [6, 7], any measure of integrated information Φ is expected to satisfy the following constraint:

0 ≤ Φ ≤ I(X; Y),    (20)
where I(X; Y) is the mutual information between X and Y. This requirement is natural because Φ should quantify the "loss of information" caused by splitting a system into parts, i.e., by removing information transmission between the parts. The loss of information should not exceed the total amount of information in the whole system, I(X; Y), and should always be nonnegative; Φ should be 0 only when X and Y are independent. However, Φ_FS does not satisfy the requirement of the upper bound, as was pointed out in [6, 7]. This is because M_FS does not include the submanifold M_I consisting of the independent distributions of X and Y,

M_I = { q(x) q(y) }.    (21)

M_I is characterized by

θ_{ij}^{XY} = 0 (for all i, j).    (22)

It is an e-flat submanifold of M. The minimized KL divergence between p(x, y) and M_I is the mutual information,

I(X; Y) = min_{q∈M_I} D_KL[p(x, y) : q(x, y)].    (23)
Thus, while stochastic interaction, derived from the submanifold MF S , has a simple expression (Eq. 19) and nice properties on its own, it may not be an ideal measure of integrated information due to its violation of the upper-bound requirement.
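The following sketch (our own construction, not code from the paper) computes Φ_FS from Eq. (19) for a random joint table and compares it with the mutual information I(X; Y); depending on how correlated the inputs are in the chosen table, Φ_FS may exceed I(X; Y), which is exactly the violation of Eq. (20) discussed above.

```python
import numpy as np
from itertools import product

states = list(product((0, 1), repeat=2))

rng = np.random.default_rng(2)
p_xy = rng.random((4, 4))
p_xy /= p_xy.sum()                                 # joint table p(x, y)

def H(v):
    # Shannon entropy (in nats) of a probability vector.
    v = v[v > 0]
    return -(v * np.log(v)).sum()

def pair_marginal(i):
    # p(x_i, y_i) as a 2x2 table.
    m = np.zeros((2, 2))
    for a, x in enumerate(states):
        for b, y in enumerate(states):
            m[x[i], y[i]] += p_xy[a, b]
    return m

p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

# H[Y|X] = H[X, Y] - H[X]
H_Y_given_X = H(p_xy.ravel()) - H(p_x)

# sum_i H[Y_i|X_i], each term from the pairwise marginal p(x_i, y_i)
sum_H_Yi_given_Xi = sum(H(pair_marginal(i).ravel())
                        - H(pair_marginal(i).sum(axis=1)) for i in range(2))

phi_FS = sum_H_Yi_given_Xi - H_Y_given_X           # Eq. (19)
I_XY = H(p_x) + H(p_y) - H(p_xy.ravel())           # mutual information
print(phi_FS, I_XY)                                # phi_FS may exceed I_XY
```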
3.3 Diagonally split graphical model
Figure 3: Diagonally split graphical model q(x, y).

In order to overcome the above difficulties, we consider an undirected graphical model in which all the branches connecting x_i and y_j (i ≠ j) are deleted but all the other branches remain, as shown in Fig. 3. We call this model the "diagonally split graphical model" M_DS. The model is defined by

θ_{12}^{XY} = θ_{21}^{XY} = 0.    (24)
It is also an e-flat submanifold of M. The branches connecting different y_i remain, so that θ_{ij}^{YY} ≠ 0. The model does not remove direct interactions among the y_i, which can be caused by correlated noises applied directly to the output nodes (not through causal influences from x). The fully split model M_FS introduced in the previous section is an e-flat submanifold of M_DS, since θ_{ij}^{YY} = 0 (i ≠ j) is further required for M_FS.
In the case of n = 2, the full model M is 10-dimensional (excluding higher-order interactions), M_FS is 7-dimensional, and M_DS is 8-dimensional. M_DS satisfies the conditions that x_1 and y_2, as well as x_2 and y_1, are conditionally independent when (x_2, y_1) and (x_1, y_2), respectively, are fixed. However, no Markovian-type relations hold, because the graph is cyclic. The model is characterized by

q(x, y) = f(x) g(y) Π_i h(x_i, y_i).    (25)
We use the following mixed coordinates:

ξ = ( η_i^X, η_j^Y, η_{ij}^{XX}, η_{ij}^{YY}, η_{11}^{XY}, η_{22}^{XY}; θ_{12}^{XY}, θ_{21}^{XY} ).    (26)

Then the m-projection of p(x, y) to M_DS is given by

ξ* = ( η_i^X, η_j^Y, η_{ij}^{XX}, η_{ij}^{YY}, η_{11}^{XY}, η_{22}^{XY}; 0, 0 )    (27)
in these coordinates. This implies that

q*(x) = p(x),    (28)
q*(y) = p(y),    (29)
q*(y_i|x_i) = p(y_i|x_i), ∀ i.    (30)
The corresponding measure of integrated information is

Φ_DS = D_KL[p(x, y) : q*(x, y)].    (31)
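Because the m-projection onto M_DS preserves p(x), p(y), and the pairwise marginals behind Eqs. (28)-(30), q* can be computed numerically by iterative proportional fitting over these marginals. The following is a sketch of that approach; the fitting procedure is our own choice of algorithm, not one prescribed in the paper, and the joint table is again a random stand-in.

```python
import numpy as np
from itertools import product

states = list(product((0, 1), repeat=2))
rng = np.random.default_rng(3)
p = rng.random((4, 4))
p /= p.sum()                                       # full joint table p(x, y)

def marg_xiyi(t, i):
    # pairwise marginal t(x_i, y_i) of a 4x4 joint table t.
    m = np.zeros((2, 2))
    for a, x in enumerate(states):
        for b, y in enumerate(states):
            m[x[i], y[i]] += t[a, b]
    return m

q = np.full((4, 4), 1.0 / 16.0)                    # start from the uniform table
for _ in range(300):                               # IPF sweeps
    q *= (p.sum(axis=1) / q.sum(axis=1))[:, None]  # match q(x) = p(x)
    q *= (p.sum(axis=0) / q.sum(axis=0))[None, :]  # match q(y) = p(y)
    for i in range(2):                             # match q(x_i, y_i) = p(x_i, y_i)
        r = marg_xiyi(p, i) / marg_xiyi(q, i)
        q *= np.array([[r[x[i], y[i]] for y in states] for x in states])

phi_DS = (p * np.log(p / q)).sum()                 # Eq. (31)
print(phi_DS)
```

Starting from the uniform table keeps q inside the family of Eq. (25) throughout, so the fixed point of the sweeps is the m-projection q*.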
The measure Φ_DS satisfies the natural requirement for integrated information (Eq. 20); thus, it resolves the shortcoming of Φ_FS. However, one problem still remains. To illustrate it, let us consider the two-terminal Gaussian case (an autoregressive (AR) model), in which x is linearly transformed to y by a connectivity matrix A and Gaussian noise ε is added,

y = Ax + ε.    (32)
Here, in the two-terminal case, A is given by

A = ( A_11  A_12 ;  A_21  A_22 ),    (33)

and ε is zero-mean Gaussian noise whose covariance matrix is given by

Σ(E) = ( σ_1^2  σ_12 ;  σ_21  σ_2^2 ).    (34)
Let Σ(X) be the covariance matrix of x. Then, the joint probability distribution is written as

p(x, y) = exp{ −(1/2) [ x^T Σ(X)^{−1} x + (y − Ax)^T Σ(E)^{−1} (y − Ax) ] − ψ },    (35)

where the means of all random variables are assumed to be 0. The θ-coordinates consist of three matrices, θ = (θ_XX, θ_YY, θ_XY), with

θ_XX = Σ(X)^{−1},    (36)
θ_YY = Σ(E)^{−1},    (37)
θ_XY = −A Σ(E)^{−1},    (38)
and the corresponding η-coordinates are η = (η_XX, η_YY, η_XY), with

η_XX = Σ(X),   η_YY = A Σ(X) A^T + Σ(E),    (39)
η_XY = A Σ(X).    (40)
We project p(x, y) (Eq. 35) to M_DS. The closest point q*(x, y) is again given by an AR model,

y = A* x + ε*,    (41)

where A* and the covariance matrix of ε*, Σ(E*), are determined from A, Σ(X), and Σ(E). However, the off-diagonal elements of A* are not zero. Therefore, the deletion of the diagonal branches in the graphical model is not equivalent to the deletion of the off-diagonal elements of A in the Gaussian case. The off-diagonal elements of A, A_ij, determine the causal influences from x_i to y_j. In the diagonally split model M_DS, these causal influences remain, because the off-diagonal elements A*_ij are non-zero. Thus, the corresponding measure of integrated information Φ_DS (Eq. 31) does not purely quantify causal influences between the elements. In IIT, integrated information is designed to quantify causal influences [2, 3]. In this sense, it is desirable to have a split model that results in a diagonal connectivity matrix A.
3.4 Causally split model (Geometric model)
To derive a split model in which only the causal influences between elements are removed, we observe that the essential step is to remove the branches connecting x_i and y_j (i ≠ j) without destroying the other constituents. The minimal requirement for removing the effect of the branch (i, j) is to make x_i and y_j conditionally independent when all the other elements are fixed. In our case of n = 2, we should have the two Markovian conditions

X_1 − X_2 − Y_2,    (42)
X_2 − X_1 − Y_1.    (43)
The split model that satisfies the above conditions was introduced by Oizumi, Tsuchiya, and Amari [7] and was called the "geometric model" M_G, because information geometry was used as the guiding principle to obtain it. We may also call it the "causally split model," because the causal influences between elements are removed. The model M_G is an 8-dimensional submanifold of M in the case of n = 2, because there are two constraints (Eqs. 42 and 43). These constraints are expressed as

q(x_1, y_2|x_2) = q(x_1|x_2) q(y_2|x_2),    (44)
q(x_2, y_1|x_1) = q(x_2|x_1) q(y_1|x_1).    (45)
We can write down the constraints in terms of the θ-coordinates, but they are nonlinear; they are also nonlinear in the η-coordinates. Thus, M_G is a curved submanifold, and it is not easy to give an explicit solution for the m-projection of p(x, y) to M_G. We can solve the Gaussian case explicitly [7]. It is not difficult to prove that, when the Markovian conditions in Eqs. 42 and 43 are satisfied, the connectivity matrix A′ of an AR model in M_G,

y = A′ x + ε′,    (46)

is a diagonal matrix. From (38), we have

A′ = −θ_YY^{−1} θ_XY.    (47)
Thus, the constraints in Eqs. 42 and 43, expressed in terms of the θ-coordinates, are equivalent to the off-diagonal elements of the matrix θ_YY^{−1} θ_XY being 0; hence the constraints are nonlinear in the θ-coordinates. The corresponding measure of integrated information, Φ_G (geometric integrated information), is given explicitly by

Φ_G = (1/2) log ( |Σ(E′)| / |Σ(E)| ),    (48)
where Σ(E) is the noise covariance of p(x, y), Σ(E′) is that of the projected q*(x, y), and |Σ(E)| denotes the determinant of Σ(E). By construction, it is easy to see that Φ_G satisfies the requirements for integrated information,

0 ≤ Φ_G ≤ I(X; Y),    (49)

because the causally split model M_G includes the submanifold M_I consisting of the independent distributions of X and Y (Eq. 21). We believe that Φ_G is the best candidate measure in the sense that it is closest to the original philosophy of integrated information in IIT. In IIT, integrated information is designed to quantify causal influences between elements [2, 3]. Note that in IIT, "causal" influences are quantified within Pearl's interventionist framework [11, 2, 12], which attempts to quantify "actual" causation. In contrast, the causal influences quantified in this paper do not necessarily correspond to actual causation; Φ_G is related to observational measures of causation such as Granger causality and transfer entropy [7].
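In the Gaussian case, the m-projection onto M_G can also be carried out numerically. The sketch below (our own, with illustrative values of A, Σ(X), and Σ(E)) minimizes the Gaussian KL divergence over split models with diagonal A′ and compares the minimum with the closed form of Eq. (48); direct numerical minimization here stands in for the explicit solution of [7].

```python
import numpy as np
from scipy.optimize import minimize

Sx = np.array([[1.0, 0.5], [0.5, 1.0]])            # Sigma(X), illustrative
A  = np.array([[0.4, 0.2], [0.3, 0.5]])            # full connectivity matrix
SE = np.array([[1.0, 0.3], [0.3, 1.0]])            # Sigma(E), illustrative

def joint_cov(Sx_, A_, SE_):
    # covariance of (x, y) for y = A x + eps with x, eps independent
    return np.block([[Sx_, Sx_ @ A_.T],
                     [A_ @ Sx_, A_ @ Sx_ @ A_.T + SE_]])

Sp = joint_cov(Sx, A, SE)                          # covariance of the full model p

def kl_gauss(S0, S1):
    # KL( N(0, S0) || N(0, S1) ) for zero-mean Gaussians
    d = S0.shape[0]
    return 0.5 * (np.trace(np.linalg.solve(S1, S0)) - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def unpack(w):
    # split-model parameters: Cholesky factors of Sigma(X') and Sigma(E'),
    # plus the two diagonal entries of A' (off-diagonals fixed to zero)
    Lx = np.array([[w[0], 0.0], [w[1], w[2]]])
    Le = np.array([[w[3], 0.0], [w[4], w[5]]])
    return Lx @ Lx.T, np.diag(w[6:8]), Le @ Le.T

def obj(w):
    try:
        val = kl_gauss(Sp, joint_cov(*unpack(w)))
    except np.linalg.LinAlgError:
        return 1e6                                 # guard degenerate steps
    return val if np.isfinite(val) else 1e6

w0 = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.1, 0.1])
res = minimize(obj, w0, method="Nelder-Mead",
               options={"maxiter": 20000, "fatol": 1e-12, "xatol": 1e-10})

SEq = unpack(res.x)[2]                             # Sigma(E') at the optimum
# res.fun is Phi_G; it should approximately match the closed form of Eq. (48).
print(res.fun, 0.5 * np.log(np.linalg.det(SEq) / np.linalg.det(SE)))
```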
3.5 Mismatched decoding model
As a different direction from the above measures of integrated information, we can consider another model, called the mismatched decoding model M_MD. We use the concept of mismatched decoding in information theory proposed by Merhav et al. [13]; we have utilized this concept in the context of neuroscience [14, 15, 16, 6, 17]. To introduce the decoding perspective, consider a situation where we try to estimate the input x when the output y is observed. When we know the correct joint probability distribution p(x, y), we can estimate x by using the true posterior p(x|y). This is optimal, matched decoding. When we instead use a split model q(x, y) for decoding, there is always a loss of information. This type of decoding is called mismatched decoding because the decoding model q(x, y) differs from the actual probability distribution p(x, y). We previously considered the fully split model as a mismatched decoding model [6],

q(y|x) = Π_i q(y_i|x_i).    (50)
By using Merhav's framework, the information loss incurred when q(y|x) is used for decoding can be quantified by [13, 6]

Φ_MD = min_β D_KL[p(x, y) : q(x, y; β)],    (51)

where

q(x, y; β) = p(x) p(y) Π_i p(y_i|x_i)^β / ( Σ_{x′} p(x′) Π_i p(y_i|x′_i)^β ).    (52)
To quantify the information loss Φ_MD, the KL divergence needs to be minimized with respect to the one-dimensional parameter β. We call q(x, y; β) the "mismatched decoding model" M_MD; it forms a one-dimensional submanifold. As can be seen in Eq. 52, no interaction terms between x_i and y_j (i ≠ j) are included. Thus, M_MD is included in the diagonally split graphical model M_DS. The optimal β*, which minimizes the KL divergence, is given by projecting p(x, y) to M_MD. Since M_MD is not an e-flat submanifold, it is difficult to obtain an analytical expression for q*(x, y). However, the minimization of the KL divergence is a convex problem, so the optimal β* can easily be found by numerical methods such as gradient descent [18, 6].
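Because the minimization in Eq. (51) is over the single scalar β, Φ_MD is easy to compute numerically. The following sketch (our own, with a random joint table and an illustrative search bracket for β) implements Eqs. (51)-(52) for a binary two-element system.

```python
import numpy as np
from itertools import product
from scipy.optimize import minimize_scalar

states = list(product((0, 1), repeat=2))
rng = np.random.default_rng(5)
p = rng.random((4, 4))
p /= p.sum()                                       # joint table p(x, y)

p_x, p_y = p.sum(axis=1), p.sum(axis=0)

def cond_yi_given_xi(i):
    # p(y_i | x_i) as a 2x2 table, from the pairwise marginal p(x_i, y_i)
    m = np.zeros((2, 2))
    for a, x in enumerate(states):
        for b, y in enumerate(states):
            m[x[i], y[i]] += p[a, b]
    return m / m.sum(axis=1, keepdims=True)

C = [cond_yi_given_xi(i) for i in range(2)]

def q_beta(beta):
    # Eq. (52): q(x, y; beta) = p(x) p(y) prod_i p(y_i|x_i)^beta / Z(y; beta)
    lik = np.array([[np.prod([C[i][x[i], y[i]] for i in range(2)])
                     for y in states] for x in states]) ** beta
    Z = (p_x[:, None] * lik).sum(axis=0)           # sum over x', one value per y
    return p_x[:, None] * p_y[None, :] * lik / Z[None, :]

def kl(beta):
    q = q_beta(beta)
    return (p * np.log(p / q)).sum()

# Eq. (51): one-dimensional minimization over beta; the bracket [0, 5] is an
# illustrative assumption, not a bound given in the paper.
res = minimize_scalar(kl, bounds=(0.0, 5.0), method="bounded")
print(res.x, res.fun)                              # beta*, Phi_MD
```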
4 Comparison of various measures of integrated information
We have derived four measures of integrated information from four different definitions of the split model. In this section, we elucidate their relations. First, M_FS and M_DS are e-flat submanifolds, forming exponential families. Therefore, we can directly apply the Pythagorean projection theorem, and the projected q*(x, y) is explicitly obtained by using the mixed coordinates. In contrast, M_G and M_MD are curved submanifolds, and thus it is difficult to obtain the projected q*(x, y) analytically in general. The natural requirements for integrated information,

0 ≤ Φ ≤ I(X; Y),    (53)
are satisfied by all the measures of integrated information except the one derived from M_FS. This is because M_FS does not include M_I (Eq. 21), while the other split models include M_I. In general, when M_1 ⊃ M_2,

min_{q∈M_1} D_KL[p : q] ≤ min_{q∈M_2} D_KL[p : q],    (54)

and therefore

Φ_2 ≥ Φ_1.    (55)
We have proved

M_DS ⊃ M_FS, M_DS ⊃ M_MD,    (56)
M_G ⊃ M_FS.    (57)
From these relations between the split models, we obtain the relations between the corresponding measures of integrated information:

Φ_FS ≥ Φ_DS, Φ_MD ≥ Φ_DS, Φ_FS ≥ Φ_G.    (58)

M_FS is included in the intersection of M_DS and M_G. M_I is included in M_DS, M_G, and M_MD. The relations among the four measures of integrated information are schematically summarized in Fig. 4.
Figure 4: Relations among the four split models. The fully split model M_FS and the diagonally split graphical model M_DS are dually flat manifolds. M_FS is represented by a magenta line on the axis of θ_{ii}^{XY}; M_DS is represented by a blue square spanned by the two axes θ_{ii}^{XY} and θ_{12}^{YY}. The causally split model (geometric model) M_G and the mismatched decoding model M_MD are curved manifolds: M_G is represented by a curved green surface, and M_MD by a curved red line inside the surface of M_DS. P_ind is the independent distribution of x and y, P_ind = p(x)p(y), represented by a black point. P_ind is included in M_DS, M_MD, and M_G, but not in M_FS.
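As a quick numerical sanity check of the first inequality in Eq. (58), the following sketch (our own, reusing the constructions from Sections 3.2 and 3.3) computes Φ_FS from Eq. (19) and Φ_DS by iterative proportional fitting, and verifies Φ_FS ≥ Φ_DS on a random joint table.

```python
import numpy as np
from itertools import product

states = list(product((0, 1), repeat=2))
rng = np.random.default_rng(6)
p = rng.random((4, 4))
p /= p.sum()

def H(v):
    v = v[v > 0]
    return -(v * np.log(v)).sum()                  # entropy in nats

def marg_xiyi(t, i):
    m = np.zeros((2, 2))
    for a, x in enumerate(states):
        for b, y in enumerate(states):
            m[x[i], y[i]] += t[a, b]
    return m

# Phi_FS from Eq. (19): sum_i H[Y_i|X_i] - H[Y|X]
phi_FS = sum(H(marg_xiyi(p, i).ravel()) - H(marg_xiyi(p, i).sum(axis=1))
             for i in range(2)) - (H(p.ravel()) - H(p.sum(axis=1)))

# Phi_DS via iterative proportional fitting onto M_DS (as sketched above)
q = np.full((4, 4), 1.0 / 16.0)
for _ in range(300):
    q *= (p.sum(axis=1) / q.sum(axis=1))[:, None]
    q *= (p.sum(axis=0) / q.sum(axis=0))[None, :]
    for i in range(2):
        r = marg_xiyi(p, i) / marg_xiyi(q, i)
        q *= np.array([[r[x[i], y[i]] for y in states] for x in states])
phi_DS = (p * np.log(p / q)).sum()

assert phi_FS >= phi_DS - 1e-6                     # Eq. (58), first inequality
print(phi_FS, phi_DS)
```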
5 Conclusions
We studied four different measures of integrated information in a causal stochastic dynamical system from the unified viewpoint of information geometry. The four measures have their own meanings and characteristics. We elucidated their relations and the hierarchical structure among them (Fig. 4). One can define a measure of information transfer for each branch, but the effects of the branches are not additive but subadditive. Therefore, the collective behavior of deleting branches requires further study [19]. This remains a problem for future work.
References

[1] Tononi G. An information integration theory of consciousness. BMC Neurosci. 2004;5:42. doi:10.1186/1471-2202-5-42.

[2] Balduzzi D, Tononi G. Integrated information in discrete dynamical systems: motivation and theoretical framework. PLoS Comput Biol. 2008;4(6):e1000091. doi:10.1371/journal.pcbi.1000091.

[3] Oizumi M, Albantakis L, Tononi G. From the phenomenology to the mechanisms of consciousness: integrated information theory 3.0. PLoS Comput Biol. 2014;10(5):e1003588. doi:10.1371/journal.pcbi.1003588.

[4] Barrett AB, Barnett L, Seth AK. Multivariate Granger causality and generalized variance. Phys Rev E. 2010;81(4):041907. doi:10.1103/PhysRevE.81.041907.

[5] Tegmark M. Improved measures of integrated information. PLoS Comput Biol. 2016;12(11):e1005123.

[6] Oizumi M, Amari S, Yanagawa T, Fujii N, Tsuchiya N. Measuring integrated information from the decoding perspective. PLoS Comput Biol. 2016;12(1):e1004654. doi:10.1371/journal.pcbi.1004654.

[7] Oizumi M, Tsuchiya N, Amari S. Unified framework for information integration based on information geometry. Proc Natl Acad Sci USA. 2016;113(51):14817–14822.

[8] Ay N. Information geometry on complexity and stochastic interaction. MPI MIS Preprint 95. 2001.

[9] Ay N. Information geometry on complexity and stochastic interaction. Entropy. 2015;17(4):2432–2458. doi:10.3390/e17042432.

[10] Amari S. Information Geometry and Its Applications. Springer; 2016.

[11] Pearl J. Causality. Cambridge University Press; 2009.

[12] Ay N, Polani D. Information flows in causal networks. Advances in Complex Systems. 2008;11(1):17–41.

[13] Merhav N, Kaplan G, Lapidoth A, Shitz SS. On information rates for mismatched decoders. IEEE Transactions on Information Theory. 1994;40(6):1953–1967.

[14] Oizumi M, Ishii T, Ishibashi K, Hosoya T, Okada M. Mismatched decoding in the brain. Journal of Neuroscience. 2010;30(13):4815–4826.

[15] Oizumi M, Okada M, Amari S. Information loss associated with imperfect observation and mismatched decoding. Frontiers in Computational Neuroscience. 2011;5.

[16] Boly M, Sasai S, Gosseries O, Oizumi M, Casali A, Massimini M, et al. Stimulus set meaningfulness and neurophysiological differentiation: a functional magnetic resonance imaging study. PLoS One. 2015;10(5):e0125337. doi:10.1371/journal.pone.0125337.

[17] Haun AM, Oizumi M, Kovach CK, Kawasaki H, Oya H, Howard MA, et al. Contents of consciousness investigated as integrated information in direct human brain recordings. eNeuro. 2017; in press.

[18] Latham PE, Nirenberg S. Synergy, redundancy, and independence in population codes, revisited. Journal of Neuroscience. 2005;25(21):5195–5206.

[19] Jost J, Bertschinger N, Olbrich E, Ay N, Frankel S. An information theoretic approach to system differentiation on the basis of statistical dependencies between subsystems. Physica A: Statistical Mechanics and its Applications. 2007;378(1):1–10.