* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download perA= ]TY[aMi)` « P^X = ^ = xW - American Mathematical Society
Survey
Document related concepts
Capelli's identity wikipedia , lookup
Symmetric cone wikipedia , lookup
Rotation matrix wikipedia , lookup
Determinant wikipedia , lookup
Matrix (mathematics) wikipedia , lookup
Eigenvalues and eigenvectors wikipedia , lookup
Four-vector wikipedia , lookup
Jordan normal form wikipedia , lookup
Singular-value decomposition wikipedia , lookup
Non-negative matrix factorization wikipedia , lookup
Orthogonal matrix wikipedia , lookup
Gaussian elimination wikipedia , lookup
Matrix calculus wikipedia , lookup
Cayley–Hamilton theorem wikipedia , lookup
Transcript
PROCEEDINGS OF THE AMERICAN MATHEMATICAL Volume 102, Number 3, March SOCIETY 1988 MULTINOMIAL PROBABILITIES, PERMANENTS AND A CONJECTURE OF KARLIN AND RINOTT R. B. BAPAT (Communicated by Bert E. Fristedt) ABSTRACT. The probability density function of a multiparameter multinomial distribution can be expressed in terms of the permanent of a suitable matrix. This fact and certain results on conditionally negative definite matrices are used to prove a conjecture due to Karlin and Rinott. 1. Introduction. If A = ((a^-)) is an n x n matrix, the permanent of A, denoted by per A, is defined as n perA= ]T Y[aMi)' <r€Sn i=l where Sn is the group of permutations of 1,2,..., n. The probability density function of a multinomial distribution can be expressed in terms of the permanent of a suitable matrix. To begin with a simple example, suppose X denotes the number of heads resulting from n tosses of a coin, with probability of heads equal to p, 0 < p < 1, on a single toss. Then X has the Binomial distribution and its probability density function is given by (1) P(X = x)= n! .,p* g""*, x = 0,l,...,n; x\(n —x)\ where P(X = x) denotes the probability that X equals x, q = 1 —p, and, as usual, P(X = x) is understood to be zero for all values of x not specified in (1). The probability in (1) can be expressed in terms of a permanent as follows: P ••• p x times, x = 0,1,... ,n. P « P^X = ^ = xW^.^ a (n—x) lq times, ■■■ flJ Received by the editors July 11, 1986 and, in revised form, December 15, 1986. Presented at the Second Utah State Matrix Theory Conference held in Logan, Utah, January 29-31, 1987, sponsored by the Department of Mathematics, Utah State University. 1980 Mathematics Subject Classification (1985 Revision). Primary 15A15, 60E15; Secondary 15A48, 94A17. Key words and phrases. matrices. Permanents, multinomial distribution, conditionally ©1988 American 0002-9939/88 467 License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use negative Mathematical definite Society $1.00 + $.25 per page 468 R. B. BAPAT The expression in (2) admits generalizations. For example, suppose n different coins are tossed once and let X be the number of heads obtained. If p¿ is the probability of heads on a single toss of the ith coin and if ç, = 1 —p¿, i = 1,2,..., n, then it can be verified that Pi Pn x times, (3) P(X = x) = x\(n —x)\ per Pi Qi X = 0, 1, ... ,71. Pn <?n (n—x) qi times, Qn More generally, instead of tossing n coins, it may be an experiment of rolling n dice, differently loaded, and if A¿ denotes the number of times i spots are obtained, i = 1,2,..., 6, the density function of (Xi,..., Xq) can be written in terms of a permanent. To make the concepts precise, consider an experiment which can result in any of r possible outcomes and suppose n trials of the experiment are performed. Let ttí3 be the probability that the experiment results in the ith outcome at the jth trial, i — 1,2,..., r; j = 1,2,..., n. Let n denote the r x n matrix ((ttí3■)), which, of course, is column stochastic. Let Xi denote the number of times the z'th outcome is obtained in the n trials, i = l,2,...,r, and let X = (Xi,..., Xr). In this setup we will say that X has the multiparameter multinomial distribution with the r xn parameter matrix n. Let 7r*denote the ith row of n, i = 1,2,... ,r. The density function of X can again be expressed in terms of a permanent as -7T1 (4) P(X = t) = h\-tr per t G Kn, tr times, where Kn = \k — (ki,... ,kr): fc, nonnegative integers, V~^fc¿= n > . It seems that the representation (4) is important in understanding certain properties of the multinomial distribution. The purpose of this paper is to exploit (4) to settle a conjecture due to Karlin and Rinott [4]. 2. The problem. Before stating the main problem, we need some notation. Let Hr =\xGRr: ¿xt=o[. i=l License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use MULTINOMIAL PROBABILITIES 469 DEFINITION 1. A real, symmetric r x r matrix A is said to be conditionally negative definite (c.n.d.) if for any x G Hr, r r Y2^2ai]XiX3 <o. ¿=ij=i Now suppose X = (Xi,... ,Xr) follows the multiparameter multinomial distribution with the r x n parameter matrix n = ((ttí3)). We assume throughout that n is positive and that n > 2, r > 2. Fix k G Kn-2 and let (O) Kij = [fci, . . . , Ki—1, ft» T 1, Ät+1) ■• • »"j —li fcj T 1, Kj+1, ■• ■, "V)) 1 < » # 3 < r, Kii = \Kl, . . • , Ki—1, fci + ¿>, Ki+i, . . . , Kr), 1 S: t — T. Clearly, kij G Kn, i,j = 1,2,... ,r. Our main result is that ((logP(X = ki3))) is c.n.d. This appears as Conjecture 2.1 in Karlin and Rinott [4], where it has been confirmed for r = 2,3 and for any n. We refer to [4] for a discussion concerning the relevance of the problem to multivariate majorization and inequalities as well as for certain consequences of proving the conjecture. In particular, Conjecture 2.2 of [4] is also verified once Conjecture 2.1 is established. It must be remarked that we have used a notation different than that of [4] at various places in the paper. Also, we find it more convenient to work with conditionally negative definite matrices rather than conditionally positive definite matrices. 3. Permanents of positive matrices. In this section we give certain on permanents that will be used. Let A be a real matrix of order (n —2) x n, Let e3 denote the unit row vector (0,..., 0,1,0,..., 0), where the 1 occurs jth place, j = 1,2,..., n. Define an n x n matrix A as follows. The (i,j)th of A is the permanent of the augmented matrix results n > 2. at the entry i,3 = 1,2,..., n. Note that the diagonal entries of À are all zero and À is symmetric. if n = 2, then A is absent and A= Of course 0 1 1 0 The following theorem is a deep result in the theory of permanents. It was originally proved by Alexandroff [1] in a more general form. It serves as a crucial step in the proof of the well-known van der Waerden conjecture due to Egorychev [2, 10] and Falikman [3]. For a proof of the result, see Theorem 2.8 of [10]. Again, beware of differences in notation. THEOREM 2. If A is a positive (n —2)xn matrix, n > 2, then the matrix A is nonsingular and has exactly one, simple, positive eigenvalue. The following result will be deduced from Theorem 2. License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use 470 R. B. BAPAT THEOREM 3. Let A be a positive (n —2) x n matrix, n > 2, and let x1,..., be row vectors in Rn. Define the r x r matrix B = ((bi3)) as xr A bi3 = per »',¿= 1,2,. xi x> Then B has at most one, simple, positive eigenvalue. PROOF. The permanent admits a Laplace expansion in terms of a set of rows [7, p. 16]. Expand per in terms of the last two rows and observe that (6) 6,,-iA^), i,j = 1,2,...,r. Let X be the rxn matrix whose ¿th row is xx, i — 1,2,... ,r. Then it follows from (6) that B = XAX'. By Theorem 2, A has exactly one positive eigenvalue and hence (essentially) by Sylvester's law, B has at most one positive eigenvalue. This completes the proof. 4. Conditionally negative definite matrices. The definition of a c.n.d. matrix was given in §2. The following result will be useful (see, for example, Parthasarathy and Schmidt [8, p. 3]). LEMMA 4. A real, symmetric matrix A is c.n.d. if and only if for each a > 0, the matrix ((e~aaij)) is positive semidefinite. Recall that a function F: (0, oo) —» R is said to be completely it is in C°°(0,oo) and (-l)fcFW(x) definition, F^ = F. The following result appears included for completeness. > 0, x G (0,oo), k = 0,1,2,..., in a recent paper by Micchelli [6]. monotonie if where, by A proof is LEMMA 5. Let A be a c.n.d. nxn matrix with positive entries and let F: (0, oo) —► R be completely monotonie. Then the matrix ((F(üí3))) is positive semidefinite. PROOF. By a well-known theorem of Bernstein F admits the representation (see, for example, [11, p. 160]), oo F(t) = / „-ta Jo dp(o), t > 0, where dp(o) is a Borel measure on (0, oo). The result follows by Lemma 4. Now we have the following. LEMMA 6. Let A be a symmetric, positive rxr positive eigenvalue. Then, for any x G Hr, matrix with exactly one, simple, n c ^i»,j=i License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use MULTINOMIAL PROBABILITIES 471 PROOF. It is known that if A is a symmetric, positive r x r matrix, then there exists a positive vector z such that the matrix B = ((bi3)) = ((üí3zíz3)) is doubly stochastic (see, for example, [5, 9]). Clearly, if A satisfies the hypothesis of the lemma, then B also has only one positive eigenvalue by Sylvester's law, and, furthermore, il b*?> = [I (al3ZiZ3y*> r = n a*p »,i=i for any x G Hr. Thus, we may assume, without loss of generality, that A is doubly stochastic, so that the vector (1,..., 1) is an eigenvector of A corresponding to the eigenvalue 1. Since A has only one positive eigenvalue, any vector x G Hr lies in the span of eigenvectors of A corresponding to only nonpositive eigenvalues and, hence, r r ^2^a,i3xtx3 i=ij=i < 0. Thus, A is c.n.d. The function F(t) = t~a, t > 0, is completely any a > 0 and by Lemma 5, ((aZa)) is positive semidefinite. Hence, for any a > 0, ((e_Qlog"'■>)) is positive semidefinite ((logaij)) is c.n.d. This completes the proof. monotonie for and by Lemma 4, 5. The main result. We are now in a position to prove the following statement, conjectured by Karlin and Rinott [4]. THEOREM 7. Let X = (Xi,... tribution ,XT) have the multiparameter multinomial dis- with the r x n parameter matrix Tí. Let k G Kn-2, in (5), and let m,i3 = P(X = kij), i,j = 1,2,...,r. c.n.d. let kij be defined as Then the matrix ((logm,j)) is PROOF. We have to prove that for any x G Hr, (7) fl m*p< 1. »,i=i Let A be the (n —2) x n matrix formed by taking ki copies of 7r\ the ith row of n, i — 1,2,..., n. Define the r xr matrix B = ((hj)) as A i,j = 1,2, ...,r. bij = per TT3 By Theorem 3, B has at most one positive eigenvalue, and since B is a positive matrix, it must have exactly one positive eigenvalue. So by Lemma 6, r (8) Yl h*p <1 for any x G Hr. i,j=l License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use R. B. BAPAT 472 By the relationship (4), (9) \<i£j<r, (ki\,...,krl)(ki + l)(k3 + l) bu mi3 = P(X = kl3) = Ki<r. I (ki\---kr\)(kl + l)(ki+2Y Let D be the rxr t = 1,2,...,r, diagonal matrix with its ith diagonal entry equal to (fc¿+1)-1, and let M C=k^kl.DBDIt follows from (8) and (10) that for any x G Hr, r (il) n c*p < i. *ij=l From (9) and (10), (i2) 1 < « 7ej < r, m«={(;i:+i )/(ki + 2))di, l<i<r, Since (ki + l)/(/c¿ + 2) < 1, i = 1,2,... ,r, it follows from (12) and (11) that r r H m¡p < } [ c*p < 1 for any x G HT, i,j=l i,j=l and the proof is complete. References 1. A. D. Alexandroff, Zur théorie der gemischten Volumina von konvexen körpern. IV, Mat. Sb. 45 (1938), no. 3, 227-251. (Russian; German summary). 2. G. P. Egorychev, The solution of van der Waerden's problem ¡or permanents, Adv. in Math. 42 (1981), 299-305. 3. D. I. Falikman, A proof o¡ van der Waerden's conjecture on the permanent of a doubly stochastic matrix, Mat. Zametki 29 (1981), 931-938. (Russian) 4. S. Karlin and Y. Rinott, Entropy inequalities ¡or classes o¡ probability distributions. II, The multivariate case, Adv. in Appl. Probab. 13 (1981), 325-351. 5. M. Marcus and M. Newman, The permanent o¡ a symmetric matrix, Notices Amer. Math. Soc. 8 (1981), 595. 6. C. A. Micchelli, Interpolation o¡ scattered data: Distance matrices and conditionally positive definite matrices, Constr. Approx. 2 (1986), 11-22. 7. H. Mine, Permanents, Encyclopedia of Mathematics and its Applications, Vol. 6, Addison- Wesley, Reading, Mass., 1978. 8. K. R. Parthasarathy and K. Schmidt, Positive definite kernels, continuous tensor products and central limit theorems o¡ probability theory, Springer-Verlag, Berlin, 1972. 9. R. Sinkhorn, A relationship between arbitrary positive matrices and doubly stochastic matrices, Ann. Math. Statist. 35 (1964), 876-879. 10. J. H. van Lint, Notes on Egoritsjev's proof of the van der Waerden conjecture, Linear Algebra Appl. 39 (1981), 1-8. 11. D. V. Widder, The Laplace trans¡orm, indian statistical Delhi-110016, institute, Princeton Univ. Press, Princeton, delhi centre N.J., 1946. 7, s. j. s. sansanwal India License or copyright restrictions may apply to redistribution; see http://www.ams.org/journal-terms-of-use marc new