NEW MULTIVARIATE DEPENDENCE MEASURES AND APPLICATIONS TO NEURAL ENSEMBLES

Ilan N. Goodman and Don H. Johnson
ECE Department, Rice University, Houston, TX 77251-1892
[email protected], [email protected]

ABSTRACT

We develop two new multivariate statistical dependence measures. The first, based on the Kullback-Leibler distance, results in a single value that indicates the general level of dependence among the random variables. The second, based on an orthonormal series expansion of joint probability density functions, provides more detail about the nature of the dependence. We apply these dependence measures to the analysis of simultaneous recordings made from multiple neurons, in which dependencies are time-varying and potentially information bearing.

1. INTRODUCTION

Quantifying the statistical dependencies among jointly distributed random variables has never been simple. The most commonly used dependence measure, the correlation coefficient, captures only linear dependence and applies only to pairs of variables. Other dependence concepts apply only to specific families of distributions, and are not easily extended to ensembles of more than two variables [1]. We develop here two new multivariate dependence measures that generalize easily to large ensembles and have stronger properties than correlation. The first measure uses the Kullback-Leibler distance to quantify the difference between a measured joint probability function and its statistically independent variant. This measure results in a single value that indicates the general level of dependence without revealing any details about the nature of the dependencies between the variables. In order to determine the interrelationships among simultaneously recorded neurons, such detail is essential. Consequently, we developed a second, more detailed measure that indicates how the random variables depend on each other.
Simple examples show that, in general, ensembles of random variables can express more than just pairwise dependence. For example, consider N Bernoulli random variables. Fully specifying their joint distribution requires 2^N − 1 parameters; there are N(N − 1)/2 pairwise correlations, which, together with the N marginal probabilities, can fully specify the joint distribution only when N = 2. For larger ensembles, third and higher order dependencies must also be determined. The challenge is to quantify these dependencies in a coherent, meaningful way. Our approach is based on an orthonormal series expansion of joint probability functions.

2. DISTANCE FROM INDEPENDENCE

As a primary measure of the amount of statistical dependence in an ensemble, we use the Kullback-Leibler (K-L) distance between the joint probability function and its corresponding "independent" variant, the product of the marginal distribution functions of the N component variables:

    \nu = \int p_X(x) \log \frac{p_X(x)}{\prod_{n=1}^{N} p_{X_n}(x_n)} \, dx    (1)

Because of the properties of the K-L distance, this dependence measure is non-negative, equalling zero if and only if the variables are independent. This dependence measure is bounded when the component random variables are discrete-valued (in which case the integral in (1) becomes a sum) and can be unbounded when continuous random variables are involved. This measure represents a generalization of mutual information to the multivariate case, but does not have information-theoretic significance. In the case of discrete random variables, we find an upper bound in terms of the entropies of the component variables:

    \nu \le \sum_{n=1}^{N} H(x_n) - \max_n H(x_n) \equiv \nu_{\max}    (2)

Consequently, to analyze discrete data we use the normalized dependence measure \hat{\nu} = \nu / \nu_{\max}. Figure 1(a) shows an example of this dependence measure applied to the ensemble analysis of three time-varying discrete random variables.
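For discrete data, both ν and the normalization ν_max of (2) can be computed directly from an estimated joint probability mass function. The following is a minimal sketch, not the authors' code; the function name and array conventions are ours:

```python
import numpy as np

def dependence_measure(p_joint):
    """K-L distance between a joint pmf and the product of its marginals.

    p_joint: N-dimensional array summing to 1; axis n indexes variable X_n.
    Returns (nu, nu_hat), with nu_hat = nu / nu_max as in Eq. (2).
    """
    p_joint = np.asarray(p_joint, dtype=float)
    N = p_joint.ndim
    # Marginal pmf of each variable: sum the joint over all other axes.
    marginals = [p_joint.sum(axis=tuple(k for k in range(N) if k != n))
                 for n in range(N)]
    # Product of marginals, broadcast back up to the joint's shape.
    p_indep = np.ones_like(p_joint)
    for n, m in enumerate(marginals):
        shape = [1] * N
        shape[n] = m.size
        p_indep = p_indep * m.reshape(shape)
    # nu = sum p log(p / product of marginals), using the 0 log 0 = 0 convention.
    mask = p_joint > 0
    nu = np.sum(p_joint[mask] * np.log(p_joint[mask] / p_indep[mask]))
    # nu_max = sum of marginal entropies minus the largest one (Eq. (2)).
    entropies = [-np.sum(m[m > 0] * np.log(m[m > 0])) for m in marginals]
    nu_max = sum(entropies) - max(entropies)
    return nu, (nu / nu_max if nu_max > 0 else 0.0)
```

For two perfectly correlated fair Bernoulli variables (joint pmf [[0.5, 0], [0, 0.5]]), ν equals their mutual information log 2 and ν̂ = 1; for any product pmf, ν = 0.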
Starting from bin 25, the ensemble possesses a weak positive dependence, indicated by statistically significant non-zero values of ν̂.

[Figure 1 appears here; panels (a) and (b) are described in the caption below.]

Fig. 1. Analysis of three Bernoulli-distributed random variables. The variables are time-varying to simulate a neural response; at each bin, 100 vector samples representing the joint response were simulated. The top plot in panel (a) shows the dependence measure ν̂ for the data, computed from the type estimates of the joint and marginal distributions. The shaded region highlights the 90% confidence interval for the measure, calculated using the bootstrap method [2]. The bottom plots show the time-varying average of each variable. Panel (b) shows the excess probability η(x1, x2, x3) for each possible outcome of the ensemble, where the labels (x̂1, x̂2, x̂3) on the vertical axis indicate the joint occurrence of x̂1 in the first random variable, x̂2 in the second, and x̂3 in the third. Since Σ_x η(x) = 0, we omit the symbol (0, 0, 0), which can be inferred from the remaining symbols. 90% confidence intervals were again computed using the bootstrap.

3. EXCESS PROBABILITY

While the dependence measure ν quantifies the overall dependence in an ensemble, it provides no insight into how the components of the ensemble individually contribute to this dependence. We developed the excess probability measure η to provide such details. This measure is motivated by an expansion of a multivariate probability density function. In [3, 4, 5] it was shown that bivariate probability density functions can be expanded in terms of their marginal distributions.
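In the simplest discrete case, a pair of Bernoulli variables, such a bivariate expansion has a single non-constant basis function per variable, ψ_1(x) = (x − p)/√(p(1 − p)) for a variable with success probability p, and the lone expansion coefficient reduces to Pearson's correlation coefficient, whose square equals Pearson's φ². A numerical sketch under these assumptions (the function name and 2×2 pmf layout are ours, not from the paper):

```python
import numpy as np

def expansion_coefficient(p_joint):
    """Lone expansion coefficient a_11 and Pearson's phi^2 for a 2x2 pmf.

    For a Bernoulli variable with P(X = 1) = p, the basis orthonormal with
    respect to the marginal is psi_0 = 1, psi_1(x) = (x - p)/sqrt(p(1 - p)),
    so a_11 is exactly the Pearson correlation coefficient.
    """
    p_joint = np.asarray(p_joint, dtype=float)  # rows index x1, columns x2
    m1 = p_joint.sum(axis=1)                    # marginal pmf of X1
    m2 = p_joint.sum(axis=0)                    # marginal pmf of X2
    psi1 = lambda x, p: (x - p) / np.sqrt(p * (1 - p))
    xs = np.array([0.0, 1.0])
    # a_11 = E[psi_1(X1) psi_1(X2)], taken over the joint pmf.
    a11 = sum(p_joint[i, j] * psi1(xs[i], m1[1]) * psi1(xs[j], m2[1])
              for i in range(2) for j in range(2))
    # Pearson's phi^2 = sum of p^2 / (product of marginals) - 1.
    phi2 = np.sum(p_joint ** 2 / np.outer(m1, m2)) - 1.0
    return a11, phi2
```

For the pmf [[0.4, 0.1], [0.1, 0.4]], this gives a_11 = 0.6 and φ² = 0.36 = a_11², consistent with the correlation coefficient of that distribution.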
We generalized the expansion for multiple variables:

    p_X(x) = \prod_{n=1}^{N} p_{X_n}(x_n) \left[ 1 + \sum_{i_1, \dots, i_N} a_{i_1 \cdots i_N} \prod_{n=1}^{N} \psi_{i_n}(x_n) \right]    (3)

The functions {ψ_{i_n}(x_n), i_n = 0, 1, ...} form an orthonormal basis with respect to a weighting function equal to the marginal probability function p_{X_n}(x_n). These basis functions are chosen so that ψ_0(x_n) = 1. The coefficients

    a_{i_1 \cdots i_N} = \int p_X(x) \prod_{n=1}^{N} \psi_{i_n}(x_n) \, dx

are non-zero only when the component random variables are dependent. These coefficients are related to Pearson's φ² measure generalized for multivariate distributions:

    \sum_{i_1, \dots, i_N} a_{i_1 \cdots i_N}^2 = \int \frac{p_X^2(x)}{\prod_{n=1}^{N} p_{X_n}(x_n)} \, dx - 1 \equiv \phi^2    (4)

In addition, by substituting (3) into (1), simple bounding arguments show that ν ≤ φ².

We define η to be the difference between the joint probability function and the product of the marginals:

    \eta(x) = p_X(x) - \prod_{n=1}^{N} p_{X_n}(x_n)    (5)

If for some value x0, η(x0) > 0, then x0 occurs more frequently than it would if the component variables were independent; such x0 are values at which the random variables are positively dependent. Similarly, if η(x0) < 0, the variables are negatively dependent at x0. Clearly, when more than two variables are involved, the nature of statistical dependence can be quite intricate; some subsets of the constituent random variables can be positively or negatively dependent, and possibly independent of other subsets. Figure 1(b) shows the excess probability measure η(x) computed for our example. Beginning at bin 25 there is positive dependence between x1 and x2, which is conditionally more pronounced when x3 = 0.

[Figure 2 appears here; panels (a) and (b) are described in the caption below.]

Fig. 2. Analysis of two simultaneously recorded motor neurons in the crayfish optomotor system. The stimulus was a sinusoidal light grating moving at a rate of 11°/s. At 2 s, the grating was displaced at a constant spatial frequency through an angle of 30°. After a short pause (7 s to 9 s), the grating was returned to its original position at the same frequency. The measured response in each 30 ms bin was a vector representing the number of times each neuron fired in the bin; 30 samples (per bin) of the joint response were obtained by repeated presentation of the stimulus. The top plot in panel (a) shows the dependence measure ν̂ for the data, computed from the type estimates of the joint and marginal distributions. The 90% confidence intervals, denoted by asterisks, were calculated using the bootstrap method [2], and are marked only where they do not enclose zero. The bottom plots show the time-varying average for each neuron. Panel (b) shows the excess probability η(x1, x2), where the labels (x̂1, x̂2) on the vertical axis indicate the simultaneous firing of x̂1 spikes by the first neuron and x̂2 by the second. Only symbols containing significant values of η are shown. 90% confidence intervals were again computed using the bootstrap.

4. NEURAL ENSEMBLE ANALYSIS

It is well documented that neural ensembles exhibit dependencies that may contribute to stimulus encoding [6, 7]. We applied our dependence analysis to neural recordings from the crayfish optomotor system. Figure 2 shows an example of this analysis. Bins containing statistically significant values of ν̂ are characterized by both positive and negative dependence in the corresponding values of η, and occur without a concurrent change in the marginal probabilities. Additionally, the onset of the stimulus at 2 s makes the neurons temporarily independent. Thus, the time-varying dependencies represent an ensemble code encoding details of the stimulus.

5. CONCLUSION

We have presented two new dependence measures that we have found useful for analyzing statistical dependencies between multiple variables.
We use ν as the summary measure because of its strong expression of dependence. However, the key to analyzing multivariate dependence is not only the ability to detect it, but also the ability to describe the details of the dependencies. We have found that the excess probability measure η provides those details. We applied our analysis to recordings of neural ensembles, and showed how the dependencies revealed by the analysis contribute to stimulus encoding. What η does not quantify is the degree to which a particular dependence structure contributes to the measured value of ν. In future work, we intend to elaborate on the geometric approach to this problem taken by Amari [8]. Whatever approach to elucidating a random vector's dependence is used, it is important to consider not only its theoretical properties but also its statistical properties. We have used the bootstrap method to remove bias and estimate confidence intervals for our dependence measures; however, this empirical approach needs complementary theoretical work to determine how much data are needed to achieve a specified confidence level. Neuroscientists are currently able to record from tens to hundreds of neurons simultaneously. Data-efficient dependence assessment techniques that can be applied to such high-dimensional random vectors are vital to understanding neural ensemble codes.

6. REFERENCES

[1] H. Joe, Multivariate Models and Dependence Concepts, Chapman & Hall, 1997.
[2] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, 1993.
[3] O.V. Sarmanov, "Maximum correlation coefficient (nonsymmetric case)," in Selected Translations in Mathematical Statistics and Probability, vol. 2, pp. 207-210, Amer. Math. Soc., 1962.
[4] H.O. Lancaster, "The structure of bivariate distributions," Ann. Math. Statistics, vol. 29, no. 3, pp. 719-736, September 1958.
[5] J.F. Barrett and D.G. Lampard, "An expansion for some second-order probability distributions and its application to noise problems," IRE Transactions on Information Theory, vol. 1, pp. 10-15, 1955.
[6] J.M. Alonso, W. Usrey, and R. Reid, "Precisely correlated firing in cells of the lateral geniculate nucleus," Nature, vol. 383, pp. 815-818, Oct. 1996.
[7] M. Bezzi, M.E. Diamond, and A. Treves, "Redundancy and synergy arising from pairwise correlations in neuronal ensembles," Journal of Computational Neuroscience, vol. 12, no. 3, pp. 165-174, May-June 2002.
[8] S. Amari, "Information geometry on hierarchy of probability distributions," IEEE Trans. Info. Theory, vol. 47, no. 5, pp. 1701-1711, July 2001.