NEW MULTIVARIATE DEPENDENCE MEASURES AND APPLICATIONS TO NEURAL
ENSEMBLES
Ilan N. Goodman and Don H. Johnson
ECE Department, Rice University
Houston, TX 77251-1892
[email protected], [email protected]
ABSTRACT
We develop two new multivariate statistical dependence
measures. The first, based on the Kullback-Leibler distance,
results in a single value that indicates the general level of dependence among the random variables. The second, based
on an orthonormal series expansion of joint probability density functions, provides more detail about the nature of the
dependence. We apply these dependence measures to the
analysis of simultaneous recordings made from multiple
neurons, in which dependencies are time-varying and potentially information bearing.
1. INTRODUCTION
Quantifying the statistical dependencies among jointly distributed random variables has never been simple. The most
commonly used dependence measure, the correlation coefficient, only measures linear dependence between random
variables, and applies only to pairs of variables. Other dependence concepts apply only to specific families of distributions, and are not easily extended to ensembles of more
than two variables [1].
We develop here two new multivariate dependence measures that generalize easily to large ensembles, and have
stronger properties than correlation. The first measure uses
the Kullback-Leibler distance to quantify the difference between a measured joint probability function and its statistically independent variant. This measure results in a single
value that indicates the general level of dependence without
revealing any details about the nature of the dependencies
between the variables. In order to determine the interrelationships among simultaneously recorded neurons, such
detail is essential. Consequently, we developed a second,
more detailed measure that indicates how the random variables depend on each other. Simple examples show that, in
general, ensembles of random variables can express more
than just pairwise dependence. For example, consider N Bernoulli random variables. Fully specifying their joint distribution requires 2^N − 1 parameters; there are N(N − 1)/2 pairwise correlations, which, together with the N marginal probabilities, can only specify the joint distribution when N = 2. For larger ensembles, third- and higher-order dependencies also need to be determined. The challenge is to quantify these dependencies in a coherent, meaningful way.
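To make the insufficiency of pairwise statistics concrete, consider the classic construction in which a third binary variable is the exclusive-OR of two independent fair bits: every pairwise correlation is zero, yet the three variables are jointly dependent. The sketch below is our own illustration (Python with NumPy, not part of the original paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# X1, X2 independent fair Bernoulli bits; X3 = X1 XOR X2.
x1 = rng.integers(0, 2, n)
x2 = rng.integers(0, 2, n)
x3 = x1 ^ x2

# Every pairwise correlation is (close to) zero ...
for a, b in [(x1, x2), (x1, x3), (x2, x3)]:
    print(np.corrcoef(a, b)[0, 1])

# ... yet the joint distribution is far from the product of marginals:
# P(X1=X2=X3=0) = 1/4, while independence would give 1/8.
p_000 = np.mean((x1 == 0) & (x2 == 0) & (x3 == 0))
print(p_000)
```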
Our approach is based on an orthonormal series expansion
of joint probability functions.
2. DISTANCE FROM INDEPENDENCE
As a primary measure of the amount of statistical dependence in an ensemble, we use the Kullback-Leibler (K-L)
distance between the joint probability function and its corresponding “independent” variant, the product of the marginal
distribution functions of the N component variables:
$$ \nu = \int p_X(x) \log \frac{p_X(x)}{\prod_{n=1}^{N} p_{X_n}(x_n)} \, dx \qquad (1) $$
Because of the properties of the K-L distance, this dependence measure is non-negative, equalling zero if and only
if the variables are independent. This dependence measure is bounded when the component random variables are
discrete-valued (in which case the integrals in (1) are sums)
and can be unbounded when continuous random variables
are involved. This measure represents a generalization of
mutual information to the multivariate case, but does not
have information theoretic significance.
In the case of discrete random variables, we find an upper bound in terms of the entropies of the component variables:
$$ \nu \le \sum_{n=1}^{N} H(x_n) - \max_n H(x_n) \equiv \nu_{\max} \qquad (2) $$
Consequently, to analyze discrete data we use the normalized dependence measure ν̂ = ν/ν_max. Figure 1(a) shows an example of this dependence measure applied to the ensemble analysis of three time-varying discrete random variables. Starting from bin 25, the ensemble possesses a weak positive dependence, indicated by statistically significant non-zero values of ν̂.
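For discrete data, ν̂ can be computed directly from the type (empirical) estimates of the joint and marginal probability functions, evaluating (1) and (2) as sums. The sketch below is our own minimal Python illustration, not the authors' implementation; function and variable names (nu_hat, etc.) are ours:

```python
import numpy as np
from collections import Counter

def nu_hat(samples):
    """Normalized K-L dependence measure for an array of discrete
    vector samples with shape (num_samples, N)."""
    samples = np.asarray(samples)
    num, N = samples.shape

    # Type (empirical) estimate of the joint probability function.
    p_joint = {sym: cnt / num for sym, cnt in Counter(map(tuple, samples)).items()}

    # Marginal probability functions of the N components.
    marginals = []
    for n in range(N):
        vals, counts = np.unique(samples[:, n], return_counts=True)
        marginals.append(dict(zip(vals, counts / num)))

    # nu: K-L distance between the joint and the product of marginals (eq. 1).
    nu = 0.0
    for sym, p in p_joint.items():
        q = np.prod([marginals[n][sym[n]] for n in range(N)])
        nu += p * np.log(p / q)

    # nu_max: sum of marginal entropies minus the largest one (eq. 2).
    entropies = [-sum(p * np.log(p) for p in m.values()) for m in marginals]
    nu_max = sum(entropies) - max(entropies)
    return nu / nu_max if nu_max > 0 else 0.0

# Example: three Bernoulli variables where X3 copies X1 half of the time.
rng = np.random.default_rng(1)
x1 = rng.integers(0, 2, 1000)
x2 = rng.integers(0, 2, 1000)
x3 = np.where(rng.random(1000) < 0.5, x1, rng.integers(0, 2, 1000))
print(nu_hat(np.column_stack([x1, x2, x3])))
```

Note that plug-in type estimates bias ν upward for finite samples, which is one reason the paper pairs the measure with the bootstrap for bias removal and confidence intervals.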
[Figure 1 appears here. Panel (a): the dependence measure ν̂ (top) and the sample mean of each variable (bottom) versus bin number, bins 0 to 90. Panel (b): the excess probability η for each of the seven non-zero symbols (x1, x2, x3) versus bin number.]
Fig. 1. Analysis of three Bernoulli distributed random variables. The variables are time-varying to simulate a neural response; at each bin, 100 vector samples representing the joint response were simulated. The top plot in panel (a) shows the dependence measure ν̂ for the data, computed from the type estimates of the joint and marginal distributions. The shaded region highlights the 90% confidence interval for the measure, calculated using the bootstrap method [2]. The bottom plots show the time-varying average of each variable. Panel (b) shows the excess probability η(x1, x2, x3) for each possible outcome of the ensemble, where the labels (x̂1, x̂2, x̂3) on the vertical axis indicate the joint occurrence of x̂1 in the first random variable, x̂2 in the second, and x̂3 in the third. Since Σ_x η(x) = 0, we omit the symbol (0, 0, 0), which can be inferred from the remaining symbols. 90% confidence intervals were again computed using the bootstrap.
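The bootstrap confidence intervals described in the caption can be reproduced, at least in spirit, by resampling the vector observations in each bin with replacement and recomputing the measure. The following is our own minimal percentile-bootstrap sketch; it reuses the hypothetical nu_hat function from the earlier sketch and omits the bias correction the authors mention in the conclusion:

```python
import numpy as np

def bootstrap_ci(samples, stat, num_boot=1000, alpha=0.10, seed=0):
    """Percentile bootstrap confidence interval for a statistic of
    vector samples with shape (num_samples, N)."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples)
    num = samples.shape[0]
    reps = []
    for _ in range(num_boot):
        idx = rng.integers(0, num, num)      # resample rows with replacement
        reps.append(stat(samples[idx]))
    lo, hi = np.percentile(reps, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Usage (hypothetical): 90% interval for nu_hat in one bin of 100 samples.
# lo, hi = bootstrap_ci(bin_samples, nu_hat)
```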
3. EXCESS PROBABILITY
While the dependence measure ν quantifies the overall dependence in an ensemble, it provides no insight into how
the components of the ensemble individually contribute to
this dependence. We developed the excess probability measure η to provide such details. This measure is motivated by
an expansion of a multivariate probability density function.
In [3, 4, 5] it was shown that bivariate probability density
functions can be expanded in terms of their marginal distributions. We generalized the expansion for multiple variables:
$$ p_X(x) = \prod_{n=1}^{N} p_{X_n}(x_n) \cdot \left[\, 1 + \sum_{i_1, \dots, i_N} a_{i_1 \cdots i_N} \prod_{n=1}^{N} \psi_{i_n}(x_n) \right] \qquad (3) $$
The functions $\{\psi_{i_n}(x_n),\, i_n = 0, 1, \dots\}$ form an orthonormal basis with respect to a weighting function equal to the marginal probability function $p_{X_n}(x_n)$. These basis functions are chosen so that $\psi_0(x_n) = 1$. The coefficients $a_{i_1 \cdots i_N} = \int p_X(x) \prod_{n=1}^{N} \psi_{i_n}(x_n) \, dx$ are non-zero only when the component random variables are dependent. These coefficients are related to Pearson's φ² measure generalized for multivariate distributions:
$$ \sum_{i_1, \dots, i_N} a_{i_1 \cdots i_N}^2 = \int \frac{p_X^2(x)}{\prod_{n=1}^{N} p_{X_n}(x_n)} \, dx - 1 \equiv \phi^2 \qquad (4) $$
In addition, by substituting (3) into (1), simple bounding
arguments show that ν ≤ φ².
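For discrete data, φ² in (4) can likewise be evaluated from type estimates, and comparing it with ν computed from the same estimates gives a quick numerical check of the bound ν ≤ φ², which holds for the empirical distribution as well. A minimal sketch of our own (names and structure are ours, not the paper's):

```python
import numpy as np
from collections import Counter

def phi_squared(samples):
    """Pearson's phi^2 (eq. 4) from type estimates of discrete vector samples."""
    samples = np.asarray(samples)
    num, N = samples.shape
    p_joint = {s: c / num for s, c in Counter(map(tuple, samples)).items()}
    marginals = []
    for n in range(N):
        vals, counts = np.unique(samples[:, n], return_counts=True)
        marginals.append(dict(zip(vals, counts / num)))
    phi2 = 0.0
    for sym, p in p_joint.items():
        q = np.prod([marginals[n][sym[n]] for n in range(N)])
        phi2 += p * p / q          # unobserved symbols contribute zero
    # For any sample array, this value upper-bounds the nu of eq. (1)
    # computed from the same type estimates.
    return phi2 - 1.0
```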
We define η to be the difference between the joint probability function and the product of the marginals:
$$ \eta(x) = p_X(x) - \prod_{n=1}^{N} p_{X_n}(x_n) \qquad (5) $$
If for some value x0, η(x0) > 0, then x0 occurs more frequently than it would if the component variables were independent; at such x0 the random variables are positively dependent. Similarly, if η(x0) < 0, the variables are negatively dependent at x0. Clearly, when more than two variables are involved, the nature of statistical dependence can be quite intricate; some subsets of the constituent random variables can be positively or negatively dependent while being possibly independent of other subsets.
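For discrete data, η is simply the difference between the type estimate of the joint probability function and the product of the estimated marginals, one value per symbol. A minimal sketch of our own:

```python
import numpy as np
from collections import Counter
from itertools import product

def excess_probability(samples):
    """eta(x) (eq. 5): empirical joint minus product of empirical marginals,
    returned as a dict keyed by symbol, for discrete vector samples."""
    samples = np.asarray(samples)
    num, N = samples.shape
    p_joint = {s: c / num for s, c in Counter(map(tuple, samples)).items()}
    marginals = []
    for n in range(N):
        vals, counts = np.unique(samples[:, n], return_counts=True)
        marginals.append(dict(zip(vals, counts / num)))
    eta = {}
    # Enumerate every symbol in the product alphabet, including unobserved ones.
    for sym in product(*(m.keys() for m in marginals)):
        q = np.prod([marginals[n][sym[n]] for n in range(N)])
        eta[sym] = p_joint.get(sym, 0.0) - q
    return eta
```

Because the values of η sum to zero over the product alphabet, one symbol is always redundant, which is why Figure 1(b) omits (0, 0, 0).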
Figure 1(b) shows the excess probability measure η(x)
computed for our example. Beginning at bin 25 there is positive dependence between x1 and x2 , which is conditionally
more pronounced when x3 = 0.
[Figure 2 appears here. Panel (a): the dependence measure ν̂ (top) and the sample mean of each neuron (bottom) versus bin number, bins 0 to 450. Panel (b): the excess probability η for the symbols (x1, x2) with significant values, versus bin number.]
Fig. 2. Analysis of two simultaneously recorded motor neurons in the crayfish optomotor system. The stimulus was a sinusoidal light grating moving at a rate of 11°/s. At 2 s, the grating was displaced at a constant spatial frequency through an angle of 30°. After a short pause (7 s to 9 s), the grating was returned to its original position at the same frequency. The measured response in each 30 ms bin was a vector representing the number of times each neuron fired in the bin; 30 samples (per bin) of the joint response were obtained by repeated presentation of the stimulus. The top plot in panel (a) shows the dependence measure ν̂ for the data, computed from the type estimates of the joint and marginal distributions. The 90% confidence intervals, denoted by asterisks, were calculated using the bootstrap method [2], and are marked only where they do not enclose zero. The bottom plots show the time-varying average for each neuron. Panel (b) shows the excess probability η(x1, x2), where the labels (x̂1, x̂2) on the vertical axis indicate the simultaneous firing of x̂1 spikes by the first neuron and x̂2 by the second neuron. Only symbols containing significant values of η are shown. 90% confidence intervals were again computed using the bootstrap.
4. NEURAL ENSEMBLE ANALYSIS
It is well documented that neural ensembles exhibit dependencies that may contribute to stimulus encoding [6, 7]. We
applied our dependence analysis to neural recordings from
the crayfish optomotor system. Figure 2 shows an example of this analysis. Bins containing statistically significant
values of ν̂ are characterized by both positive and negative dependence in the corresponding values of η, and occur
without a concurrent change in the marginal probabilities.
Additionally, the onset of the stimulus at 2 s renders the neurons temporarily independent. Thus, the time-varying dependencies constitute an ensemble code that carries details of the stimulus.
5. CONCLUSION
We have presented two new dependence measures that we
have found useful for analyzing statistical dependencies among multiple variables. We use ν as the summary measure because it captures any departure from independence in a single number. However,
the key to analyzing multivariate dependence is not only the
ability to detect it, but also the ability to describe the de-
tails of the dependencies. We have found that the excess
probability measure η provides those details. We applied
our analysis to recordings of neural ensembles, and showed
how the dependencies revealed by the analysis contribute to
stimulus encoding.
What η does not quantify is the degree to which a particular dependence structure contributes to the measured value
of ν. In future work, we intend to elaborate on the geometric
approach to this problem taken by Amari [8].
Whatever approach to elucidating a random vector’s dependence is used, it is important to consider not only its
theoretical properties but also its statistical properties. We
have used the bootstrap method to remove bias and estimate
confidence intervals for our dependence measures; however,
this empirical approach needs complementary theoretical work to determine how much data are needed to achieve
a specified confidence level. Neuroscientists are currently
able to make recordings of tens to hundreds of neurons simultaneously. Data-efficient dependence assessment techniques that can be applied to such high-dimensional random
vectors are vital to understanding neural ensemble codes.
6. REFERENCES
[1] H. Joe, Multivariate Models and Dependence Concepts, Chapman & Hall, 1997.
[2] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, 1993.
[3] O.V. Sarmanov, “Maximum correlation coefficient
(nonsymmetric case),” in Selected Translations in
Mathematical Statistics and Probability, vol. 2, pp.
207–210. Amer. Math. Soc., 1962.
[4] H.O. Lancaster, “The structure of bivariate distributions,” Ann. Math. Statistics, vol. 29, no. 3, pp. 719–
736, September 1958.
[5] J.F. Barrett and D.G. Lampard, “An expansion for some
second-order probability distributions and its application to noise problems,” IRE Transactions - Information
Theory, vol. 1, pp. 10–15, 1955.
[6] J.M. Alonso, W. Usrey, and R. Reid, “Precisely correlated firing in cells of the lateral geniculate nucleus,”
Nature, vol. 383, pp. 815–818, Oct. 1996.
[7] M. Bezzi, M.E. Diamond, and A. Treves, “Redundancy
and synergy arising from pairwise correlations in neuronal ensembles,” Journal of Computational Neuroscience, vol. 12, no. 3, pp. 165–174, May-June 2002.
[8] S. Amari, “Information geometry on hierarchy of probability distributions,” IEEE Trans. Info. Theory, vol. 47,
no. 5, pp. 1701–1711, July 2001.