Appendix A: Clustering and dimension reduction of correlation matrices
A.1 Pearson correlations
To illustrate the methods used, we start with 25 signals of synthetic data representing artificial multi-channel recordings from a dynamical network (Fig. 1); each signal can be viewed as a channel recording from a specific component of the network.
The first step in the analysis is to calculate the similarity matrix. In general, the similarity matrix can represent different observables (correlation, mutual correlation, coherence, etc.). In the example shown in detail in Appendix A, and in most of the examples shown in this chapter, we consider the correlation matrix computed using Pearson correlations [12]:
$$C_{i,j} = \frac{\langle (X_i(t) - \mu_i)(X_j(t) - \mu_j) \rangle}{\sigma_i \sigma_j} \qquad \text{(A-1)}$$
where X_i and X_j are the recorded signals of components i and j, with corresponding means μ_i and μ_j and sample standard deviations (STD) σ_i and σ_j, and ⟨·⟩ denotes averaging over time.
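As a concrete illustration, the correlation matrix of Eq. (A-1) can be computed with standard numerical tools. The short Python sketch below assumes the N recorded signals are stored as the rows of an array; the array name and sizes are illustrative, not taken from the original data set.

```python
import numpy as np

def pearson_correlation_matrix(signals: np.ndarray) -> np.ndarray:
    """signals: array of shape (N, T) -- N channels, T time samples."""
    # np.corrcoef subtracts the mean of each row and divides by its sample
    # STD, which is exactly the normalization of Eq. (A-1).
    return np.corrcoef(signals)

# Hypothetical synthetic data: 25 channels, 1000 samples each.
rng = np.random.default_rng(0)
X = rng.standard_normal((25, 1000))
C = pearson_correlation_matrix(X)   # C[i, j] is C_{i,j} of Eq. (A-1)
```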
A.2 Dendrogram clustering
As mentioned in the text, clustering algorithms are used to identify sub-groups of components with higher inter-correlations. The dendrogram clustering method described here is based on the correlation distances D_i,j between components i and j: the Euclidean distances in the N-dimensional space of correlations that corresponds to the correlation matrix. The vector location of a component (or node) i in this space is the corresponding row C_i of the correlation matrix. Hence the Euclidean distances are simply given by¹:

$$D_{ij} = \left\| \vec{C}_i - \vec{C}_j \right\| = \sqrt{\sum_{k=1}^{N} (C_{ik} - C_{jk})^2} \qquad \text{(A-2)}$$
¹ Since the nodes are actually placed on an (N-1)-dimensional surface in the N-dimensional space of correlations, another metric can be the Euclidean distance on the (N-1)-dimensional surface, obtained by eliminating the terms (i,i) and (j,j) from the sum, or on the even smaller (N-2)-dimensional sub-space obtained by eliminating the terms k=i and k=j from the summation.
Note that the location vector of a node in the space of correlations simply represents the correlations of the corresponding component with all the other components of the network. Therefore, the correlation distance between two components depends both on the similarity between the activities of the two components and on the relative similarities between the activity of each of the components and the rest of the components in the network. Typically, when two components are highly correlated they will also have similar correlations with the other components, yet the correspondence is not exact, especially when the dynamics has high complexity. Therefore the correlation distance between two components can provide important additional information beyond their direct correlation.
After calculating the distance matrix, a hierarchical tree is constructed [33]. This dendrogram links channels by distance hierarchy: the pair of vectors with the shortest distance is linked first, followed by the second pair, and so on [34]. When two vectors are linked, their combined distance to the other vectors is recalculated and the rank of the distance matrix is decreased by one. Note that the vectors in the correlation matrix are not changed; only the distance matrix is reordered. The combined distances can be calculated in different ways (average, nearest neighbor, median, etc.). In our calculations, we have used the nearest-neighbor scheme. The algorithm terminates when all channels are linked into one group. The output of the algorithm is a dendrogram plot (Fig. A1). Using the linkage function, the correlation matrix is then reordered to follow the linkage information. The clustering procedure is very illuminating in rearranging the data into sub-groups with similar characteristics, enabling us to focus on subsets of the original data that contain well-defined sub-groups of components. A short sketch of this procedure is given below.
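A hedged sketch of this clustering step, assuming the correlation matrix C of Eq. (A-1) as input; SciPy's 'single' linkage corresponds to the nearest-neighbor scheme used here, and the function name is illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, leaves_list
from scipy.spatial.distance import pdist

def cluster_correlation_matrix(C: np.ndarray):
    # Correlation distances D_ij of Eq. (A-2): Euclidean distances between
    # the rows of the correlation matrix.
    D = pdist(C, metric='euclidean')
    Z = linkage(D, method='single')        # nearest-neighbor linkage
    order = leaves_list(Z)                 # leaf order of the dendrogram
    C_sorted = C[np.ix_(order, order)]     # reorder rows and columns
    return Z, order, C_sorted

# dendrogram(Z) then produces a plot analogous to Fig. A1.
```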
Fig. A1: Dendrogram plot of the correlation matrix. Labels on the X axis are node numbers, while the Y axis is the correlation distance (or dissimilarity). The correlation matrix is taken from Fig. 1. Note that two distinct sub-groups are deciphered: the leftmost group (channels 17-24) contains nine channels of the same frequency, and the rightmost group (3-5) is another set of four channels of the same frequency. The middle group contains pure noise signals.
A.3 Dimension reduction
Much effort has been devoted to the development of dimension-reduction algorithms. The motivating idea is to project the data contained in the N-dimensional correlation matrix onto a low-dimensional space that captures most of the relevant information. Usually this is done by first identifying the principal directions in the correlation space, which represent the directions of maximal variation. The next step is to project the nodes onto a few (typically 1-3) leading directions. We illustrate the idea using the Principal Component Analysis (PCA) algorithm, which is widely used for visualizing high-dimensional information.
The PCA algorithm approximates the "best" linear representation of the information in a lower dimensionality. To do this, the algorithm finds the directions that capture the most information, or in other words, the most variance. A reduction from 3D to 1D is illustrated in Fig. A2.
It can be shown [6] that the analytic solution to this maximization problem involves the diagonalization of the correlation matrix. The eigen-vector with the highest associated eigen-value represents the direction in which the most variance is captured; the eigen-vector with the second highest eigen-value represents the next direction, etc.
Fig. A2: Pictorial illustration of dimension reduction. The example illustrates a
reduction from 3 dimensional space into 1 dimensional space. Original “information” (left) is
represented using only one dimension (right). The principal direction is chosen to represent
the maximal variations.
The PCA procedure involves the following steps. The correlation matrix is diagonalized and the eigen-values are ordered in descending order:

$$C = U \cdot G \cdot U^T, \qquad G_{ii} \leq G_{jj} \ \text{for any}\ i > j \qquad \text{(A-3)}$$
The algorithm used for the diagonalization is the Singular Value Decomposition (SVD) algorithm [7]. Once the eigen-vectors are calculated, the original data is projected onto these vectors:

$$C_{PCA} = U^T \cdot C \qquad \text{(A-4)}$$
This new C_PCA matrix is a description of the original C matrix using not the original axes (each of which corresponds to the correlations of a specific component with all the others), but a superposition of these axes (the eigen-vectors). The first few eigen-vectors, which represent the majority of the variation, are called the Primary Vectors (PV's). It is then possible to plot the new C_PCA matrix using only the first PV's.
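A minimal sketch of the projection of Eqs. (A-3) and (A-4), assuming the correlation matrix C as input; SVD is used for the diagonalization as stated in the text, and the variable names are illustrative.

```python
import numpy as np

def pca_project(C: np.ndarray, n_components: int = 3):
    # For the symmetric matrix C, the SVD C = U * diag(g) * U^T returns the
    # eigen-vectors (columns of U) ordered by descending eigen-value g.
    U, g, _ = np.linalg.svd(C)
    C_pca = U.T @ C            # Eq. (A-4): project C on the eigen-vectors
    return C_pca[:n_components], U[:, :n_components], g

# C_pca then holds the coordinates of every node in the space of the first
# n_components principal vectors (PV's), e.g. the 3-D spaces plotted later.
```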
Appendix B: Inclusion of temporal information
As mentioned in the introduction, the similarity matrices do not include essential information about the network's behavior: the temporal propagation of the activity, i.e., the relative timing between the components. This additional information (when available) is usually presented in temporal ordering matrices, whose element T_i,j describes the relative timing or phase difference between the activities of components i and j. There are various methods to evaluate the temporal ordering matrices. In studies of ECoG-recorded brain activity, it is usually calculated in terms of 'phase coherences', the relative imaginary parts of the Fourier transforms of the activities of channels i and j.
Another approach is to calculate the cross-correlation function (Equation B-3) between every two components i and j and to set T_i,j to be the relative time shift at which the cross-correlation function has its maximum.
Ci , j ( ) 
 ( X i (t   )   i )( X j (t )   j ) 
(B-3)
 i j
Here τ is the time delay or phase lag between the two signals. This cross-correlation function can be multiplied by a weight function (e.g. assigning higher weight to small time shifts).
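A hedged sketch of the time-delay estimation of Eq. (B-3): the normalized cross-correlation of a channel pair is computed and T_i,j is taken as the lag of its maximum. The function and its default arguments are illustrative, not the authors' implementation.

```python
import numpy as np

def pair_time_delay(x_i: np.ndarray, x_j: np.ndarray, dt: float = 1.0) -> float:
    xi = (x_i - x_i.mean()) / x_i.std()
    xj = (x_j - x_j.mean()) / x_j.std()
    cc = np.correlate(xi, xj, mode='full') / len(xi)   # ~ C_ij(tau) of Eq. (B-3)
    lags = np.arange(-len(xi) + 1, len(xi))
    # A weight function favoring small |tau| could be applied to cc here.
    return lags[np.argmax(cc)] * dt                    # T_ij: lag of the maximum
```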
In both cases the problem is how to extract the global time ordering from the relative
timing between pairs. A useful algorithm for completion of this task is described in
the next section.
B.1: Extraction of the global time ordering from pair time delays
The algorithm has to convert the timing matrix T_i,j into a single time-propagation or global temporal ordering vector TO(k). The index k of the TO vector represents the global temporal ordering of the k-th component. Keeping this in mind, the elements of the pair timing matrix T_i,j simply represent the difference between two elements of the TO vector. That is:

$$T_{i,j} = TO(i) - TO(j) \qquad \text{(B-4)}$$
Therefore all the rows of the matrix T_i,j carry the same global temporal ordering information, each shifted by a constant. For example, by taking the first row (T_1j), transposing it and adding it to the entire matrix (after replicating it to the proper dimensions), we arrive at:

$$T'_{ij} = T_{ij} + T_{1i} = (T_i - T_j) + (T_1 - T_i) = T_1 - T_j = T_{1j} \qquad \text{(B-5)}$$
This means that every row of the new T'_ij is the same as the first row: all timing information has been shifted to be relative to the first channel. In the ideal case, we could have taken just the first row of T_ij (or any other row) and extracted the global temporal ordering (Figs. B1 a,b). In non-ideal, real cases there is usually a high level of noise, different signals have different intensities, and the sampling is over a finite time window. In general, when the cross-correlation function is low and/or has high variations, so that its maximum is poorly defined, it is harder to assign a time lag between the two components in a statistically significant way (in these cases the phase coherence is usually also poorly defined). For this reason, the phase information in elements of low correlation is usually embedded in the noise. To overcome this we developed the following "remedy" algorithm: the matrix of relative timing is processed by assigning a non-numeric value (NNV) to relative timings with a low corresponding maximum cross-correlation or a non-significant detection (marked by X in Figs. B1 c,d). After discarding phases that stem from noise, we are not home free: even for significantly correlated signals, the corresponding relative timings will vary due to noise and finite-sampling limitations.
The following algorithm has been designed to handle both limitations. A list is created containing the columns that have not yet received any phase shift. The timing matrix is then scanned line by line. If an NNV element is found, no value is added to the corresponding row. If scanning down the rows reveals the missing time delay (element (2,3) in the timing matrix of Fig. B1 c), it is added to the proper row and only to it (since the other rows have already been updated). The algorithm is illustrated for several simple cases in Fig. B1. After exhausting all rows of the timing matrix, the median of each column is calculated, and the result is the timing vector. The median is used rather than a simple average in order to filter out large phase errors. A hedged sketch of this procedure is given below.
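A hedged sketch of this "remedy" procedure (illustrative, not the authors' exact code): non-significant delays are marked as NaN (the NNV above), each row of the timing matrix is shifted to be relative to channel 1 where possible, and the global ordering is obtained from the per-column median.

```python
import numpy as np

def global_temporal_ordering(T: np.ndarray) -> np.ndarray:
    """T: N x N matrix with T[i, j] ~ TO(i) - TO(j); NaN marks NNV entries."""
    N = T.shape[0]
    offset = np.full(N, np.nan)
    offset[0] = 0.0                       # channel 1 anchors the ordering
    # Propagate offsets row by row: offset[j] = TO(1) - TO(j) becomes known
    # once some channel i with a known offset has a significant delay T[i, j].
    for _ in range(N):                    # repeat until no more entries fill
        for i in range(N):
            if np.isnan(offset[i]):
                continue
            for j in range(N):
                if np.isnan(offset[j]) and not np.isnan(T[i, j]):
                    offset[j] = offset[i] + T[i, j]
    # Channels unreachable from channel 1 (weakly coupled sub-groups) are
    # given a zero offset, so each sub-group keeps only its internal ordering.
    offset[np.isnan(offset)] = 0.0
    # Shift every row by the offset of its own channel (the trick of Eq. B-5)
    # and take the per-column median, which filters out large phase errors.
    shifted = T + offset[:, None]
    TO = -np.nanmedian(shifted, axis=0)
    return TO - TO[0]                     # ordering relative to channel 1
```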
In Fig. B1 c a single NNV element is present, so it can be properly handled and the algorithm manages to extract all the timing information. In Fig. B1 d we show a hypothetical situation in which the data is composed of two sub-groups with very low (noise-level) correlations between them. This makes it impossible to determine a global temporal ordering of all the components; only a global temporal ordering within each of the sub-groups can be determined. The algorithm automatically sets the global temporal ordering of the first sub-group to be relative to channel 1 (which is assigned a timing of 0) and that of the second sub-group to be relative to channel 3 (which is therefore also assigned a timing of 0). This artifact (of having two weakly correlated sub-groups start their temporal ordering at the same time) can easily be detected when the global time ordering is projected on the functional manifold to create the causal manifold. It will simply be reflected in two foci (nodes where the activity starts) that are relatively far apart and have low correlations between them. This also illustrates the crucial importance of projecting the temporal ordering information together with the functional correlations.
When dealing with periodic signals, there might be an inconsistency in a column, because the detected phase cannot indicate whether the time delay extends over a whole cycle. Before computing the median of a column, multiples of whole cycles should therefore be removed.
In the above-described situation of two (or more) sub-groups with very low correlations between them, the relative timing of the sub-groups can be determined using other measures that represent collective or global properties of each of the sub-groups. For example, in recorded brain activity one can look at the total intensity of the activity (RMS of the recorded voltages) of each sub-group over a longer time window and determine which sub-group shows the first increase in activity. Another collective measure is the combined internal connectivity of each sub-group (the sum over the correlations between its elements).
B.2 Direct evaluation of global temporal ordering
For networks in which the activity comes in bursts (such as neural networks), it is possible to determine the global temporal ordering directly by evaluating the relative timing of the activity of each individual component with respect to the beginning or the center of the time window of the collective bursts.
Fig. B1: Conversion of the relative (pair) timing matrix into a global temporal
ordering vector. a,b: ideal data. c,d: missing timing data due to low correlation. e:
noise inconsistency in the timing matrix.
Recently, a new notion, the "temporal center of mass", has been developed in the context of neural networks [4]. The idea is to regard the activity density of each neuron i as a temporal weight function, so that its temporal center of mass T_i^n during the n-th Synchronized Bursting Event (SBE; see section 3.1 for details about neural network activity) is given by:

$$T_i^n = \frac{\int (t - T^n)\, D_i(t - T^n)\, dt}{\int D_i(t - T^n)\, dt} \qquad \text{(B-6)}$$
where the integral is over the time window of the SBE and T^n marks the temporal location of the n-th SBE, which is the combined "center of mass" of all the neurons. As shown in Fig. B2, the temporal center of mass of each neuron can vary between different SBEs. Therefore we define the relative timing of neuron i to be T_i ≡ ⟨T_i^n⟩_n, the average over the sequence of SBEs. Similarly, we define the temporal ordering matrix as:

$$T_{i,j} = T_i^n - T_j^n \qquad \text{(B-7)}$$
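A minimal sketch of Eq. (B-6), assuming the activity density D_i of neuron i within one SBE is available as a sampled array; the names and the sampling grid are illustrative.

```python
import numpy as np

def temporal_center_of_mass(t: np.ndarray, D_i: np.ndarray, T_n: float) -> float:
    """t: sample times inside the n-th SBE window; D_i: activity density of
    neuron i at those times; T_n: temporal location of the n-th SBE."""
    rel = t - T_n
    # Discrete (trapezoidal) version of the two integrals of Eq. (B-6).
    return np.trapz(rel * D_i, t) / np.trapz(D_i, t)
```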
Next we examine the activity propagation in real space, shown in Fig. B2. It can be seen that the activity starts near the center of the network, in the vicinity of electrodes (11, 19, 27), and propagates in time in two directions, towards electrodes (3, 4) and towards electrodes (36, 37), terminating at two opposite sides (near electrode 14 and near electrode 15). Interestingly, when the temporal information is superimposed on the corresponding holographic network in the 3-D space of the leading PCA eigenvectors, the activity propagates along the manifold in an orderly fashion from one end to the other (Fig. B2). For this reason, it is proposed to view the resulting manifold, which includes the temporal information, as a causal manifold.
Fig. B2: Inclusion of the temporal information. Left (a): illustration of the activity propagation in the physical space of the neural network. At each neuron position, we mark by a color code (blue: early; red: late) the values of its corresponding T_i^n for all the recorded SBEs. Right (b): illustration of the causal manifold obtained by projection of the temporal information on the functional manifold in the 3-D space of the leading PCA principal vectors.
Appendix C: Mathematical analyses of the Affinity transformation
C.1: Sensitivity to the collective normalization
In section 2.2 we presented various options for performing the collective normalization. We emphasize that there is no "optimal" normalization. Each normalization scheme has an advantage in capturing different features embedded in the correlation matrix, e.g. the separation of the clusters, the internal structure of the clusters, the mutual relations between the clusters, etc. A hedged sketch of the two normalization schemes discussed in this appendix is given below.
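The following Python sketch illustrates the two normalizations named above. The exact definitions are those of Eqs. 2-3 in section 2.2, which are not reproduced in this appendix; the forms assumed below, A_ij = C_ij / d_ij for the correlation-distance normalization and A^MC_ij = C_ij · MC_ij for the meta-correlation (mutual-correlations) normalization, are assumptions consistent with the differentials in Eqs. (C-1) and (C-7), not a verbatim implementation.

```python
import numpy as np

def affinity_matrices(C: np.ndarray):
    N = C.shape[0]
    d = np.zeros((N, N))      # correlation distances (k = i, j terms excluded)
    MC = np.eye(N)            # meta-correlations between rows of C
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            mask = np.ones(N, dtype=bool)
            mask[[i, j]] = False
            diff = C[i, mask] - C[j, mask]
            d[i, j] = np.sqrt(np.sum(diff ** 2))
            # Pearson correlation between rows i and j of C -- one common
            # reading of the "meta-correlation" of Eq. (C-7).
            MC[i, j] = np.corrcoef(C[i, mask], C[j, mask])[0, 1]
    A_dist = np.zeros_like(C)
    nz = d > 0
    A_dist[nz] = C[nz] / d[nz]        # assumed distance-normalized affinity
    A_mc = C * MC                     # assumed meta-correlation affinity
    return A_dist, A_mc
```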
Fig. C1: Illustration of the effect of the affinity transformation for a correlation matrix of recorded brain activity. Top left (a) is the sorted correlation matrix. Top middle (b) is the affinity matrix computed by the correlation-distances normalization, and top right (c) shows the affinity matrix computed using the mutual-correlations normalization. Bottom left (d) shows the projection of the correlation matrix on the 3-D space of the leading PCA vectors. Bottom right (e) shows the projection of the affinity matrix in the top middle on the corresponding 3-D space. The affinity transformation helps to decipher the existence of two main sub-groups, one of which is further composed of two sub-sub-groups.
C.2: Analysis of the principal vectors
To further understand the effect of the affinity transformation, we analyze the principal vectors computed by the PCA algorithm when applied to the correlation matrix, in comparison with its application to the affinity matrix (Fig. C2). As can be seen, for the correlation matrix the first principal vectors have close to equal weight from all the channels. This is in contrast to the case of the affinity matrix, in which each of the principal vectors has higher weights for the channels that belong to one of the distinct groups in the sorted correlation matrix. This property leads to two effects:
1. Channels of distinct sub-groups will have a high value of projection on
their respective principal vector, while having very low projections on
other ones.
2. Channels that do not belong to one of the distinct sub-groups will have small projections on the principal vectors of the distinct groups and as a result will be located close to the origin of the 3-D space.
These two effects cause groups to be displayed along the primary axes in the affinity PCA (group orthogonality is a by-product), while uncorrelated nodes are displayed near the origin.
As can be seen in Fig. C2, the affinity transformation can enhance well-correlated sub-groups while at the same time attenuating inter-group correlations. Together with the PCA procedure, groups become easier to detect and analyze. The price of this morphological group amplification is losing inner-group information, which is moved to higher-order principal vectors. Some of this lost information is retrieved by linking the nodes according to the non-normalized similarities.
Fig. C2: Analyzing the properties of the principal vectors. The top two graphs (a) show the weights (components) of the nodes in the three principal vectors of the PCA algorithm (green, blue and red respectively). The result for the correlation matrix is shown in the top graph and for the affinity matrix in the one below. G1, G2 and G3 correspond to the three distinct sub-groups that are observed in the sorted correlation matrix (Fig. C1 a). At the bottom (b) we show the distribution of the eigen-values for all the PCA vectors. Blue circles are for the correlation matrix, green dots are for the affinity matrix computed by correlation-distances normalization, and red + are for the affinity matrix computed by the mutual-correlations normalization. In the correlation matrix the first three principal vectors represent 55% of the variation. In the affinity case the first three principal vectors represent only about 40% of the variation, for the following reason: while inter-group correlation is attenuated, inner-group correlation is amplified (relatively in mutual correlation and absolutely in distance affinity). This amplification causes members of a group to actually be located further apart in the affinity space than they were in the correlation space. This is not captured by the leading principal vectors, since they represent each group as a whole. Therefore, although they are more distant, they appear closer in the PCA. The manifestation of this fact is seen in the changes of the distribution of the eigen-values for the three cases.
C.3 Noise sensitivity
The effect of noise at the channel level is a concern since the transformation entails
dividing by distance, which might be sensitive to noise.
From the first definition of the affinity transformation (section 2.2, Eq. 2), we obtain
by direct differentiation:
$$\frac{\delta A_{ij}}{A_{ij}} = \frac{\delta C_{ij}}{C_{ij}} - \frac{1}{d_{ij}^{2}} \sum_{k \neq i,j} (C_{ik} - C_{jk})\,\delta(C_{ik} - C_{jk}) \qquad \text{(C-1)}$$
We define the noise added to the recorded channels as:
$$\hat{x}_i^{\,noise}(t) = \hat{x}_i(t) + n_i(t) \qquad \text{(C-2)}$$
Using (C-2) in the correlation calculation we have:
$$C_{ij}^{noise} = \frac{\langle \hat{x}_i \hat{x}_j \rangle}{\left[ \left(\langle \hat{x}_i^2 \rangle + \langle \hat{n}_i^2 \rangle\right)\left(\langle \hat{x}_j^2 \rangle + \langle \hat{n}_j^2 \rangle\right) \right]^{1/2}} \qquad \text{(C-3)}$$
It has been assumed here that the noise mean is zero, that the noise is uncorrelated with the channels, and that the noise in channel i is uncorrelated with the noise in channel j (⟨n_i⟩ = 0, ⟨x̂_i n_i⟩ = 0, ⟨n_i n_j⟩ = 0). It is clear that added noise will lower the correlation between nodes, since it creates more variability in the signals recorded from each component while it does not affect their covariance.
If we further assume that the functional form is similar for all signals and that the noise level is smaller than the signals, we obtain:
$$C_{ij}^{noise} = C_{ij}\,\frac{\left[ \langle \hat{x}_i^2 \rangle \langle \hat{x}_j^2 \rangle \right]^{1/2}}{\left[ \left(\langle \hat{x}_i^2 \rangle + \langle \hat{n}_i^2 \rangle\right)\left(\langle \hat{x}_j^2 \rangle + \langle \hat{n}_j^2 \rangle\right) \right]^{1/2}} \approx C_{ij}\left(1 + \kappa_{ij}\right) = C_{ij} + \kappa_{ij} C_{ij} \qquad \text{(C-4)}$$

$$\kappa_{ij} = -\frac{\langle \hat{n}_i^2 \rangle + \langle \hat{n}_j^2 \rangle}{2\left[ \langle \hat{x}_i^2 \rangle \langle \hat{x}_j^2 \rangle \right]^{1/2}} \approx -\frac{\langle n^2 \rangle}{\langle \hat{x}^2 \rangle} \qquad \text{(C-5)}$$
The last equality was derived assuming that the channels have approximately the same variability, so that the linear coefficient κ is the same for all C_ij. Inserting the equality (C-5) into (C-1), we arrive at the following equation for the noise sensitivity (for the Euclidean distance metric):
$$\frac{\delta A_{ij}}{A_{ij}} = \kappa - \frac{\kappa}{d_{ij}^{2}} \sum_{k \neq i,j} (C_{ik} - C_{jk})^2 = \kappa - \kappa = 0 \qquad \text{(C-6)}$$
This result implies that the affinity calculated with this metric, under the assumptions above (low, uncorrelated noise with similar properties in all signals), has no first-order noise effect. It is only sensitive to second-order and higher noise contributions.
To verify the validity of the above analysis, we show in Fig. C3 the effect of increased noise on the correlation and affinity matrices for the synthetic data set. The correlation relative error is negative, as expected. Most elements of the affinity matrix are more resilient under noise: up to a noise amplitude of 0.15 there is very little variation in most elements of the affinity matrix. For higher noise there is a small bounded error, but it is smaller than the correlation matrix error. The exception is A_13 (green line in Fig. C3). Since C_13 = 0.99 with no noise, d_13 is very small and A_13 is very large (A_13 ≈ 25). The relative error in this case is comparable to the correlation relative error, but it is non-monotonic and might be a cause for concern. However, correlations that high are practically never encountered in neural recordings, and with the increased size of the correlation matrix, distances will invariably grow larger. This problem of error due to small distances was never encountered in practice.
Fig. C3: Noise impact on correlations and functional correlations. Equally distributed noise was added to synthetic data of 10 channels. Top left: 5 of the 45 correlations are displayed as a function of the noise amplitude. The green line is C(1,3), which equals 0.99. Bottom left: relative error due to noise in the correlations. Top right: the same five members of the affinity matrix. Note that A(1,3), in green, is two orders of magnitude higher (scaled on the green right-hand axis). Bottom right: relative error in the affinity matrix due to noise.
In order to maintain general applicability in the face of the potential error with small distances, we repeat the noise analysis for the affinity through meta-correlation. Using direct differentiation of Eq. 3:
$$\frac{\delta A^{MC}_{ij}}{A^{MC}_{ij}} = \frac{\sum_{k \neq i,j} \left( \hat{C}_{jk}\,\delta\hat{C}_{ik} + \hat{C}_{ik}\,\delta\hat{C}_{jk} \right)}{\sum_{k \neq i,j} \hat{C}_{ik}\hat{C}_{jk}} - \frac{\sum_{k \neq i,j} \hat{C}_{ik}\,\delta\hat{C}_{ik}}{\sum_{k \neq i,j} \hat{C}_{ik}^{2}} - \frac{\sum_{k \neq i,j} \hat{C}_{jk}\,\delta\hat{C}_{jk}}{\sum_{k \neq i,j} \hat{C}_{jk}^{2}} \qquad \text{(C-7)}$$
Using (C-4,5) we arrive at:
$$\frac{\delta A^{MC}_{ij}}{A^{MC}_{ij}} \approx \kappa \approx \frac{\delta C_{ij}}{C_{ij}} \qquad \text{(C-8)}$$
This means that the error due to noise in the MC affinity transformation is the same as the error in the correlation matrix. This result is verified in Fig. C4.
Fig. C4: Noise sensitivity of the MC functional correlation. We show a comparison of the noise-induced error in the meta-correlation affinity matrix (right) with that in the correlation matrix (left).
We note that the affinity transformation can be viewed as a "group gain" function. In this regard, the normalization by the correlation distances has a much higher "gain" than the normalization by meta-correlation. In the latter case there is no gain (since a correlation is always less than or equal to one), only attenuation. In that respect the MCA is a "well-behaved" transformation that has no inherent singular effects for very high correlations. To compensate for the small gain of the MCA, it is possible to multiply the correlation matrix by higher orders of the meta-correlation matrix without increasing the noise-induced error.
Appendix D: The dynamic synapses and soma model
Computer modeling can serve as a powerful research tool in studies of biological dynamical networks [27,28], provided it is utilized and analyzed in a proper manner, adapted to the special autonomous (regulating) nature of these systems. Guided by this realization, we have developed a new model for neural networks in which both the neurons and the synapses connecting them are described as dynamical elements [13]. To model the neurons, we have adopted the Morris-Lecar [29] (M-L) dynamical description. The reasons for this choice are several: (1) in the M-L element, memory can be related to the dynamics of potassium channels; (2) it has been shown by Abbott and Kepler [30] that the M-L equations can be viewed as a reduction (to two variables) of the Hodgkin-Huxley model; (3) the M-L dynamical system has a special phase space, which can lead to the generation of scale-free temporal behavior in neuronal firing when fed with a simple noise current.
The simple version of the Morris-Lecar model reads:

$$\dot{V} = -I_{ion}(V, W) + I_{ext}(t), \qquad \dot{W} = \frac{W_{\infty}(V) - W}{\tau_W(V)} \qquad \text{(D-1)}$$
with I_ion(V, W) representing the contribution of the internal ionic Ca2+, K+ and leakage currents, with their corresponding channel conductivities g_Ca, g_K and g_L being constant:

$$I_{ion}(V, W) = g_{Ca}\, m_{\infty}(V)\,(V - V_{Ca}) + g_K\, W\,(V - V_K) + g_L\,(V - V_L) \qquad \text{(D-2)}$$
The additional current I_ext represents all the external current sources stimulating the neuron, such as signals received through its synapses, glia-derived currents, artificial stimulations, as well as any noise sources. In the absence of any such stimulation, the fraction of open potassium channels, W, relaxes towards its limiting curve W_∞(V), which is described by a sigmoid function. The limiting dynamics of the calcium channels are described by m_∞(V).
In our numerical simulations, we have used the following values: g_Ca=1.1 mS/cm², g_K=2.0 mS/cm², g_L=0.5 mS/cm², V_Ca=100 mV, V_K=-70 mV, V_L=-35 mV, V1=-1 mV, V2=15 mV, V3=10 mV, V4=14.5 mV, and a rate parameter of 0.3 for the W dynamics. With this choice of parameters, I_c = 0.
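A hedged sketch of a forward-Euler integration of Eqs. (D-1)-(D-2) with the parameter values quoted above. The sigmoid forms of m_inf, W_inf and the time scale tau_W in terms of V1-V4, and the interpretation of the 0.3 value as the rate factor phi of the W equation, are the standard Morris-Lecar choices and are assumptions here, since they are not written out in the text.

```python
import numpy as np

# Parameter values quoted in the text (mV, mS/cm^2).
gCa, gK, gL = 1.1, 2.0, 0.5
VCa, VK, VL = 100.0, -70.0, -35.0
V1, V2, V3, V4 = -1.0, 15.0, 10.0, 14.5
phi = 0.3                                   # assumed rate factor of the W equation

def m_inf(V):  return 0.5 * (1.0 + np.tanh((V - V1) / V2))
def W_inf(V):  return 0.5 * (1.0 + np.tanh((V - V3) / V4))
def tau_W(V):  return 1.0 / np.cosh((V - V3) / (2.0 * V4))

def I_ion(V, W):
    # Eq. (D-2)
    return gCa * m_inf(V) * (V - VCa) + gK * W * (V - VK) + gL * (V - VL)

def simulate(I_ext, dt=0.05, V0=-30.0, W0=0.0):
    """I_ext: array with one external-current value per time step."""
    V, W = V0, W0
    trace = np.empty(len(I_ext))
    for n, I in enumerate(I_ext):
        dV = -I_ion(V, W) + I                       # Eq. (D-1)
        dW = phi * (W_inf(V) - W) / tau_W(V)
        V, W = V + dt * dV, W + dt * dW
        trace[n] = V
    return trace
```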
According to the theory of neuronal group selection, the size of the brain's basic functional assemblies varies between 50 and 10^4 cells. Motivated by this, and by the notion of unitary networks (as explained in the text), we study the dynamics of networks composed of 20-60 cells. Following physiological data [31], 20% of the cells are usually set to be inhibitory.
The neurons in the model network exchange action potentials via multi-state dynamic synapses, as described by Tsodyks et al. [32]. In this model, the effective synaptic strength evolves according to the following equations:
$$\dot{x} = \frac{z}{\tau_{rec}} - u\,x\,\delta(t - t_{sp}), \qquad \dot{y} = -\frac{y}{\tau_{in}} + u\,x\,\delta(t - t_{sp}), \qquad \dot{z} = \frac{y}{\tau_{in}} - \frac{z}{\tau_{rec}} \qquad \text{(D-3)}$$
Here x, y, and z are the fractions of synaptic resources in the recovered, active and inactive states, respectively. The times t_sp denote the arrival times of pre-synaptic spikes, τ_in is the characteristic decay time of the post-synaptic currents (PSCs), and τ_rec is the recovery time from synaptic depression.
The variable u describes the effective use of synaptic resources by the incoming spike.
For facilitating synapses, it obeys the following dynamic equation:
$$\dot{u} = -\frac{u}{\tau_{facil}} + U_0\,(1 - u)\,\delta(t - t_{sp}) \qquad \text{(D-4)}$$
The parameter U_0 determines the increase in the value of u with each spike. If no spikes arrive, the facilitation parameter decays to its baseline value with the time constant τ_facil. For depressing synapses (as is the case when the post-synaptic neuron is excitatory) one has τ_facil → 0 and u → U_0 for each spike.
The effective synaptic current of neuron i is obtained by summing all of its synaptic input currents:

$$I_{syn}^{i} = \sum_{j \neq i} A_j\, y_j(t) \qquad \text{(D-5)}$$

The parameter A_j is the maximal value of the synaptic strength.
The values of the parameters control the ability of the system to exhibit modes of correlated activity. In our studies (unless indicated otherwise), we assigned to the network the parameters specified below, using the following notation: I indicates inhibitory neurons and E excitatory ones. For example, τ_rec(E=>I) refers to the recovery time of a synapse transmitting input to an inhibitory neuron from an excitatory one. Hence, we set: τ_rec(I=>I)=200 ms, τ_rec(E=>I)=200 ms, τ_rec(I=>E)=1200 ms, τ_rec(E=>E)=1200 ms, U_0(I=>I)=0.5, U_0(E=>I)=0.5, U_0(I=>E)=0.08, U_0(E=>E)=0.08, A(I=>I)=9, A(E=>I)=9, A(E=>E)=2.2, A(I=>E)=6.6. Actual values for each neuron were then generated as reported in [32]. We set τ_in=6 ms for all neurons. In addition, due to the small size of our simulated network, we chose τ_facil=2000 ms for all inhibitory neurons.
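A minimal sketch of a single dynamic synapse, Eqs. (D-3)-(D-5), integrated with forward Euler between pre-synaptic spikes. The default parameter values follow the I=>I entries above, while the spike train, time step and function name are illustrative placeholders.

```python
import numpy as np

def simulate_synapse(spike_times, t_end, dt=0.1,
                     tau_rec=200.0, tau_in=6.0, tau_facil=2000.0,
                     U0=0.5, A=9.0):
    x, y, z, u = 1.0, 0.0, 0.0, U0
    spikes = {int(round(t / dt)) for t in spike_times}
    n_steps = int(t_end / dt)
    I_syn = np.zeros(n_steps)
    for n in range(n_steps):
        # Continuous parts of Eqs. (D-3) and (D-4).
        dx = z / tau_rec
        dy = -y / tau_in
        dz = y / tau_in - z / tau_rec
        du = -u / tau_facil
        x, y, z, u = x + dt * dx, y + dt * dy, z + dt * dz, u + dt * du
        if n in spikes:
            # Delta-function terms: the spike facilitates u and transfers a
            # fraction u of the recovered resources x to the active state y.
            u += U0 * (1.0 - u)
            dr = u * x
            x -= dr
            y += dr
        I_syn[n] = A * y   # this synapse's contribution A_j * y_j(t) to Eq. (D-5)
    return I_syn
```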
To complete the picture, we need to provide a mechanism responsible for the generation of spontaneous activity in the isolated network. To simulate this, each neuron is subject to a fluctuating additional current:

$$I_{ad}(t+1) = I_{ad}(t) + \xi, \qquad \xi = \begin{cases} +\epsilon, & p \geq 0.5 \\ -\epsilon, & p < 0.5 \end{cases} \qquad \text{(D-6)}$$

where p is a random number drawn anew at each time step, so that the current performs a random walk.
The fluctuating current may drive the neuron beyond the firing threshold, thus enabling it to generate a spike and trigger an SBE. To keep a proper balance between this current and the inputs received from other neurons via the synaptic connections, the additional current I_ad is limited to the range I_low ≤ I_ad ≤ I_high. The total current seen by a neuron at any time is the sum of I_syn(t) and I_ad(t).
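A short sketch of the fluctuating background current of Eq. (D-6) as a bounded random walk; the step size, bounds and seed below are illustrative values, not taken from the text.

```python
import numpy as np

def background_current(n_steps, eps=0.01, I_low=-1.0, I_high=1.0, seed=0):
    rng = np.random.default_rng(seed)
    I = np.zeros(n_steps)
    for t in range(1, n_steps):
        p = rng.random()                                # fresh random number per step
        step = eps if p >= 0.5 else -eps                # Eq. (D-6)
        I[t] = np.clip(I[t - 1] + step, I_low, I_high)  # keep I_low <= I_ad <= I_high
    return I
```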
References
[1] Ihmels J.; Levy R.; Barkai N.; Principles of transcriptional control in the metabolic
network of Saccharomyces cerevisiae. Nature Biotechnology, doi:10.1038/nbt918
(2003).
[2] Vernon L. T.; Syed I.; Berger C.; Grzesczcuk R.; Milton J.; Erickson R. K.; Cogen
P.; Berkson E.; Spire J.P.; Identification of the sensory/motor area and pathologic
regions using ECoG coherence. Electroencephalography and clinical
Neurophysiology 1998,106: 30–39.
[3] Brazier M.A.B.; Cluster Analysis. London: Edward Arnold. 256, 1993.
[4] Baruchi I.; Ben-Jacob E.; Functional Holography of Recorded Neuronal Networks
Activity. Neuroinformatics, August 2004, vol. 2, no. 3, pp. 333-352(20).
[5] Hoffman P.E.; Grinstein G.G.; A survey of visualizations for high-dimensional
data mining. Morgan Kaufmann Publishers Inc.(2002) 47--82.
[6] Baruchi I.; Towle V.L.; Ben-Jacob E.; Functional Holography of Complex
Networks Activity From Cultures to the Human Brain
Complexity, Volume 10, No 3, p. 38-51,(2005).
[7] Weisstein E.W.; "Singular Value Decomposition." From MathWorld.
[8] Segev R.; Benveniste M.; Hulata E.; Cohen N.; Palevski A.; Kapon
E.; Shapira Y.; and Ben-Jacob E.; Long term behavior of lithographically prepared
in vitro neuronal networks. Phys. Rev. Lett., 88, 118102, 2002.
[9] Segev R.; Baruchi I.; Hulata E.; Ben-Jacob E.; Hidden Neuronal Correlations in
Cultured Networks. Phys. Rev. Lett. 92, 118102 (2004).
[10] Kamioka H.; Maeda E.; Jimbo Y.; Robinson H.P.C.; Kawana A.; Spontaneous
periodic synchronized bursting during formation of mature patterns of connections in
cortical cultures. Neuroscience Letters ,206 109-112, 1996.
[11] Raichman N.; Volman V.; Ben-Jacob E.; Collective plasticity and individual
stability in cultured neuronal networks. NeuroComputing, in press.
[12] Otnes R.K; Enochson L.; Applied Time Series Analysis. Wiley – Interscience
Publication. 1978.
[13] Volman V.; Baruchi I.; Persi E.; Ben-Jacob E.; Generative Modeling of
Regulated Activity in Cultured Neuronal Networks. Physica A, vol.335, pp.249-278,
2004.
Persi E., Horn D., Volman V., Segev R., Ben-Jacob E.; Modeling of Synchronized
Bursting Events -The Importance of Inhomogeneity. Neural Computation, vol.16(12),
pp.2577-2595, 2004.
[14] Volman V.; Baruchi I.; Ben-Jacob E.; Manifestation of function-follow-form in
cultured neuronal networks. Physical Biology, vol.2, pp.98-110, 2005.
[15] Theoden I. Netoff and Steven J. Schiff. Decreased Neuronal Synchronization
during Experimental Seizures. The Journal of Neuroscience,August 15,2002,22,
7297–7307.
[16] Towle V.L.; Ahmad F.; Kohrman M.; Hecox K. and Chkhenkeli S.,
Electrocorticographic Coherence Patterns of Epileptic Seizures in Epilepsy as a
dynamic disease Ed. by P. Jung and J. Milton 2002 Springer, Berlin.
[17] Doyle, W.K., Spencer, D.D. Anterior temporal resections. In: Epilepsy: A
Comprehensive Textbook. J. Engel, Jr), Pedley, T.A. (Eds.), V. 2. Lippincott-Raven,
1998: 1807-1817.
[18] Chkhenkeli S.A., Towle V.L., Milton J.G., Spire J.-P. Multitarget stereotactic
surgery of intractable epilepsy. In: Abstr. of the XIII Congress of ESSFN. Freiburg.
Germany. 1998, 4: 21.
[19] Bandettini P.A.; Moonen C.T.W.; Aguirre G.K.; Functional MRI. Springer-Verlag Telos; 1st edition (July 15, 1999).
[20] Volman V.; Baruchi I.; Ben-Jacob E.; Self-regulated homoclinic chaos in neural
networks activity. 8th Experimental Chaos Conference Edited by Boccaletti S. et al.,
American Institute of Physics, Melville NY,p.197-209, (2004)
[21] Hulata E.; Baruchi I.; Segev R.; Shapira Y.; Ben-Jacob E.; Self-Regulated
Complexity in Cultured Neuronal Networks. Phys. Rev. Lett. 92(19):198105(1)-198105(4) (2004).
[22] Ben-Jacob E.; Bacterial Self-Organization: Co-Enhancement of Complexification
and Adaptability in a Dynamic Environment. Phil. Trans. R. Soc. Lond. A, 361:1283-1312, (2003).
[23] Ayali A., Fuchs E., Zilberstein Y., Shefi O., Hulata E., Baruchi I., Ben Jacob E.
Contextual Regularity and Complexity of Neuronal Activity: from Stand-Alone
Cultures to Task-Performing Animals. Complexity, Vol 9, Issue 6, 25-32,
(July/August 2004) .
[24] Stevens B, Porta S, Haak LL, Gallo V, Fields RD (2002) Adenosine: A Neuron-Glial Transmitter Promoting Myelination in the CNS in Response to Action Potentials.
Neuron, 36, 855–868
[25] P. R. Laming, H. Kimelberg, S. Robinson, A. Salm, N. Hawrylak, C. Müller, B.
Roots and K. Ng., (2000) Neuronal-glial Interactions and Behavior Neuroscience and
Biobehavioral Reviews 24 295-340.
[26] Stout C.E., Constantin J.L., Naus C.G., Charles A.C.(2002) Intercellular calcium
signaling in astrocytes via ATP release through connexin hemichannels J. Biological
Chemistry 277: 10482-10488
[27] Izhikevich E.M.; Which model to use for cortical spiking neurons? IEEE
Transactions on Neural Networks, vol.15, pp.1063-1070, 2004.
[28] Ben-Jacob E., Levine H.; Physical schemata underlying biological pattern
formation - examples, issues and strategies. Physical Biology, vol.1, pp.14-22 , 2004.
[29] Morris C., Lecar H.; Voltage oscillations in the barnacle giant muscle fiber.
Biophysical Journal, vol.35, pp.193-213, 1981.
[30] Abbott L.F., Kepler T.; Model neurons: from Hodgkin-Huxley to Hopfield.
Statistical mechanics of neural networks. Springer-Verlag, Berlin, 1990.
[31] Abeles M.; Corticonics. Cambridge University Press, 1991.
[32] Tsodyks M., Uziel A., Markram H.; Synchrony generation in recurrent networks
with frequency-dependent synapses. J. Neurosci., vol.20, RC50, 2000.
[33] Mathworks. http://www.mathworks.com/.
[34] Krzanowski W. J.; Principles of Multivariate Analysis - A user's Perspective;
Pages: 89-102 ;Oxford University Press; 1990.