PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY
Volume 102, Number 3, March 1988
MULTINOMIAL PROBABILITIES, PERMANENTS
AND A CONJECTURE OF KARLIN AND RINOTT

R. B. BAPAT

(Communicated by Bert E. Fristedt)
ABSTRACT. The probability density function of a multiparameter multinomial distribution can be expressed in terms of the permanent of a suitable matrix. This fact and certain results on conditionally negative definite matrices are used to prove a conjecture due to Karlin and Rinott.
1. Introduction. If $A = ((a_{ij}))$ is an $n \times n$ matrix, the permanent of $A$, denoted by $\operatorname{per} A$, is defined as
$$\operatorname{per} A = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i\sigma(i)},$$
where $S_n$ is the group of permutations of $1, 2, \ldots, n$.
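The definition translates directly into a brute-force computation that sums over all $n!$ permutations; the following minimal Python sketch (the helper name permanent and the toy example are illustrative only, and the approach is practical only for small $n$) makes the definition concrete.

from itertools import permutations
from math import prod

def permanent(A):
    # per A = sum over sigma in S_n of the product a_{1,sigma(1)} ... a_{n,sigma(n)}
    n = len(A)
    return sum(prod(A[i][sigma[i]] for i in range(n))
               for sigma in permutations(range(n)))

# Example: for a 2 x 2 matrix, per [[a, b], [c, d]] = a*d + b*c.
assert permanent([[1, 2], [3, 4]]) == 1 * 4 + 2 * 3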
The probability density function of a multinomial distribution can be expressed in terms of the permanent of a suitable matrix. To begin with a simple example, suppose $X$ denotes the number of heads resulting from $n$ tosses of a coin, with probability of heads equal to $p$, $0 < p < 1$, on a single toss. Then $X$ has the binomial distribution and its probability density function is given by
$$(1)\qquad P(X = x) = \frac{n!}{x!\,(n-x)!}\,p^{x} q^{n-x}, \qquad x = 0, 1, \ldots, n;$$
where $P(X = x)$ denotes the probability that $X$ equals $x$, $q = 1 - p$, and, as usual, $P(X = x)$ is understood to be zero for all values of $x$ not specified in (1).
The probability in (1) can be expressed in terms of a permanent as follows:
$$(2)\qquad P(X = x) = \frac{1}{x!\,(n-x)!}\operatorname{per}\begin{bmatrix} p & \cdots & p\\ \vdots & & \vdots\\ p & \cdots & p\\ q & \cdots & q\\ \vdots & & \vdots\\ q & \cdots & q \end{bmatrix}, \qquad x = 0, 1, \ldots, n,$$
where the $n \times n$ matrix has $x$ rows equal to $(p, \ldots, p)$ and $n - x$ rows equal to $(q, \ldots, q)$.
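As a quick numerical check of (2), one can build the $n \times n$ matrix with $x$ rows of $p$'s and $n - x$ rows of $q$'s and compare against the binomial probabilities; the sketch below reuses the permanent helper above, and the values of $n$ and $p$ are arbitrary.

from math import comb, factorial, isclose

def binomial_via_permanent(n, x, p):
    # x rows equal to (p, ..., p) followed by n - x rows equal to (q, ..., q).
    q = 1 - p
    rows = [[p] * n for _ in range(x)] + [[q] * n for _ in range(n - x)]
    return permanent(rows) / (factorial(x) * factorial(n - x))

n, p = 5, 0.3
for x in range(n + 1):
    assert isclose(binomial_via_permanent(n, x, p),
                   comb(n, x) * p**x * (1 - p)**(n - x))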
Received by the editors July 11, 1986 and, in revised form, December 15, 1986. Presented at the Second Utah State Matrix Theory Conference held in Logan, Utah, January 29-31, 1987, sponsored by the Department of Mathematics, Utah State University.
1980 Mathematics Subject Classification (1985 Revision). Primary 15A15, 60E15; Secondary 15A48, 94A17.
Key words and phrases. Permanents, multinomial distribution, conditionally negative definite matrices.
The expression in (2) admits generalizations. For example, suppose $n$ different coins are tossed once and let $X$ be the number of heads obtained. If $p_i$ is the probability of heads on a single toss of the $i$th coin and if $q_i = 1 - p_i$, $i = 1, 2, \ldots, n$, then it can be verified that
$$(3)\qquad P(X = x) = \frac{1}{x!\,(n-x)!}\operatorname{per}\begin{bmatrix} p_1 & \cdots & p_n\\ \vdots & & \vdots\\ p_1 & \cdots & p_n\\ q_1 & \cdots & q_n\\ \vdots & & \vdots\\ q_1 & \cdots & q_n \end{bmatrix}, \qquad x = 0, 1, \ldots, n,$$
where the matrix has $x$ rows equal to $(p_1, \ldots, p_n)$ and $n - x$ rows equal to $(q_1, \ldots, q_n)$.
More generally, instead of tossing $n$ coins, it may be an experiment of rolling $n$ dice, differently loaded, and if $X_i$ denotes the number of times $i$ spots are obtained, $i = 1, 2, \ldots, 6$, the density function of $(X_1, \ldots, X_6)$ can be written in terms of a permanent.
To make the concepts precise, consider an experiment which can result in any of $r$ possible outcomes and suppose $n$ trials of the experiment are performed. Let $\pi_{ij}$ be the probability that the experiment results in the $i$th outcome at the $j$th trial, $i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, n$. Let $\Pi$ denote the $r \times n$ matrix $((\pi_{ij}))$, which, of course, is column stochastic. Let $X_i$ denote the number of times the $i$th outcome is obtained in the $n$ trials, $i = 1, 2, \ldots, r$, and let $X = (X_1, \ldots, X_r)$. In this setup we will say that $X$ has the multiparameter multinomial distribution with the $r \times n$ parameter matrix $\Pi$. Let $\pi^i$ denote the $i$th row of $\Pi$, $i = 1, 2, \ldots, r$. The density function of $X$ can again be expressed in terms of a permanent as
$$(4)\qquad P(X = t) = \frac{1}{t_1! \cdots t_r!}\operatorname{per}\begin{bmatrix} \pi^1\\ \vdots\\ \pi^1\\ \vdots\\ \pi^r\\ \vdots\\ \pi^r \end{bmatrix}, \qquad t \in K_n,$$
where the matrix contains $t_i$ copies of $\pi^i$, $i = 1, 2, \ldots, r$, and
$$K_n = \left\{ k = (k_1, \ldots, k_r) : k_i \text{ nonnegative integers},\ \sum_{i=1}^{r} k_i = n \right\}.$$
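The representation (4) can be verified numerically on a small instance by enumerating all $r^n$ outcome sequences of the $n$ independent trials and comparing the accumulated probabilities with the permanent formula; the sketch below again reuses the permanent helper, and the particular parameter matrix is arbitrary.

from itertools import product
from math import factorial, isclose, prod

def multinomial_via_permanent(Pi, t):
    # Pi is the r x n (column-stochastic) parameter matrix, t = (t_1, ..., t_r), sum(t) = n.
    rows = []
    for i, ti in enumerate(t):
        rows += [list(Pi[i])] * ti                 # t_i copies of the ith row of Pi
    denom = prod(factorial(ti) for ti in t)
    return permanent(rows) / denom

Pi = [[0.2, 0.5, 0.7],                             # r = 2 outcomes, n = 3 trials
      [0.8, 0.5, 0.3]]
r, n = 2, 3

def by_enumeration(t):
    # Sum the probability of every outcome sequence whose counts equal t.
    total = 0.0
    for seq in product(range(r), repeat=n):        # seq[j] = outcome of the jth trial
        if tuple(seq.count(i) for i in range(r)) == t:
            total += prod(Pi[seq[j]][j] for j in range(n))
    return total

for t1 in range(n + 1):
    t = (t1, n - t1)
    assert isclose(multinomial_via_permanent(Pi, t), by_enumeration(t))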
It seems that the representation (4) is important in understanding certain properties of the multinomial distribution. The purpose of this paper is to exploit (4)
to settle a conjecture due to Karlin and Rinott [4].
2. The problem. Before stating the main problem, we need some notation. Let
$$H_r = \left\{ x \in \mathbb{R}^r : \sum_{i=1}^{r} x_i = 0 \right\}.$$
DEFINITION 1. A real, symmetric $r \times r$ matrix $A$ is said to be conditionally negative definite (c.n.d.) if for any $x \in H_r$,
$$\sum_{i=1}^{r} \sum_{j=1}^{r} a_{ij} x_i x_j \le 0.$$
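Numerically, conditional negative definiteness can be tested by restricting the quadratic form to $H_r$: if the columns of a matrix $P$ form a basis of $H_r$, then $A$ is c.n.d. exactly when $P'AP$ has no positive eigenvalue. A minimal sketch, assuming numpy (the function name and the example are illustrative only):

import numpy as np

def is_cnd(A, tol=1e-10):
    # A real symmetric; test x'Ax <= 0 for all x with sum(x) = 0.
    r = A.shape[0]
    # Columns e_1 - e_r, ..., e_{r-1} - e_r form a basis of H_r.
    P = np.vstack([np.eye(r - 1), -np.ones((1, r - 1))])
    return bool(np.all(np.linalg.eigvalsh(P.T @ A @ P) <= tol))

# Example: the squared-distance matrix (( (i - j)^2 )) is a classical c.n.d. matrix.
t = np.arange(4.0)
print(is_cnd((t[:, None] - t[None, :]) ** 2))      # True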
Now suppose $X = (X_1, \ldots, X_r)$ follows the multiparameter multinomial distribution with the $r \times n$ parameter matrix $\Pi = ((\pi_{ij}))$. We assume throughout that $\Pi$ is positive and that $n \ge 2$, $r \ge 2$.
Fix $k \in K_{n-2}$ and let
$$(5)\qquad \begin{aligned} k_{ij} &= (k_1, \ldots, k_{i-1}, k_i + 1, k_{i+1}, \ldots, k_{j-1}, k_j + 1, k_{j+1}, \ldots, k_r), & 1 \le i \ne j \le r,\\ k_{ii} &= (k_1, \ldots, k_{i-1}, k_i + 2, k_{i+1}, \ldots, k_r), & 1 \le i \le r. \end{aligned}$$
Clearly, $k_{ij} \in K_n$, $i, j = 1, 2, \ldots, r$. Our main result is that $((\log P(X = k_{ij})))$ is c.n.d. This appears as Conjecture 2.1 in Karlin and Rinott [4], where it has been confirmed for $r = 2, 3$ and for any $n$. We refer to [4] for a discussion concerning the relevance of the problem to multivariate majorization and inequalities as well as for certain consequences of proving the conjecture. In particular, Conjecture 2.2 of [4] is also verified once Conjecture 2.1 is established. It must be remarked that we have used notation different from that of [4] at various places in the paper. Also, we find it more convenient to work with conditionally negative definite matrices rather than conditionally positive definite matrices.
3. Permanents of positive matrices. In this section we give certain results on permanents that will be used. Let $A$ be a real matrix of order $(n-2) \times n$, $n \ge 2$. Let $e_j$ denote the unit row vector $(0, \ldots, 0, 1, 0, \ldots, 0)$, where the 1 occurs at the $j$th place, $j = 1, 2, \ldots, n$. Define an $n \times n$ matrix $\tilde{A}$ as follows. The $(i,j)$th entry of $\tilde{A}$ is the permanent of the augmented matrix
$$\begin{bmatrix} A\\ e_i\\ e_j \end{bmatrix}, \qquad i, j = 1, 2, \ldots, n.$$
Note that the diagonal entries of $\tilde{A}$ are all zero and $\tilde{A}$ is symmetric. Of course, if $n = 2$, then $A$ is absent and
$$\tilde{A} = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}.$$
The following theorem is a deep result in the theory of permanents. It was originally proved by Alexandroff [1] in a more general form. It serves as a crucial step in the proof of the well-known van der Waerden conjecture due to Egorychev [2, 10] and Falikman [3]. For a proof of the result, see Theorem 2.8 of [10]. Again, beware of differences in notation.

THEOREM 2. If $A$ is a positive $(n-2) \times n$ matrix, $n \ge 2$, then the matrix $\tilde{A}$ is nonsingular and has exactly one, simple, positive eigenvalue.
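The conclusion of Theorem 2 can be probed numerically on small cases by building $\tilde{A}$ entrywise as the permanent of $A$ augmented with $e_i$ and $e_j$ and inspecting its spectrum; the sketch below reuses the permanent helper from the introduction, and the random positive matrix is an arbitrary test case.

import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.uniform(0.1, 1.0, (n - 2, n)).tolist()     # a positive (n - 2) x n matrix

def e(j):
    return [1.0 if k == j else 0.0 for k in range(n)]

A_tilde = np.array([[permanent(A + [e(i), e(j)]) for j in range(n)]
                    for i in range(n)])

eigs = np.linalg.eigvalsh(A_tilde)
assert sum(v > 0 for v in eigs) == 1               # exactly one positive eigenvalue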
The following result will be deduced from Theorem 2.
THEOREM 3. Let $A$ be a positive $(n-2) \times n$ matrix, $n \ge 2$, and let $x^1, \ldots, x^r$ be row vectors in $\mathbb{R}^n$. Define the $r \times r$ matrix $B = ((b_{ij}))$ as
$$b_{ij} = \operatorname{per}\begin{bmatrix} A\\ x^i\\ x^j \end{bmatrix}, \qquad i, j = 1, 2, \ldots, r.$$
Then $B$ has at most one, simple, positive eigenvalue.
PROOF. The permanent admits a Laplace expansion in terms of a set of rows [7, p. 16]. Expand
$$\operatorname{per}\begin{bmatrix} A\\ x^i\\ x^j \end{bmatrix}$$
in terms of the last two rows and observe that
$$(6)\qquad b_{ij} = x^i \tilde{A} (x^j)', \qquad i, j = 1, 2, \ldots, r.$$
Let $X$ be the $r \times n$ matrix whose $i$th row is $x^i$, $i = 1, 2, \ldots, r$. Then it follows from (6) that $B = X \tilde{A} X'$. By Theorem 2, $\tilde{A}$ has exactly one positive eigenvalue and hence (essentially) by Sylvester's law, $B$ has at most one positive eigenvalue. This completes the proof.
4. Conditionally negative definite matrices. The definition of a c.n.d. matrix was given in §2. The following result will be useful (see, for example, Parthasarathy and Schmidt [8, p. 3]).

LEMMA 4. A real, symmetric matrix $A$ is c.n.d. if and only if for each $\alpha > 0$, the matrix $((e^{-\alpha a_{ij}}))$ is positive semidefinite.
Recall that a function $F : (0, \infty) \to \mathbb{R}$ is said to be completely monotonic if it is in $C^{\infty}(0, \infty)$ and $(-1)^k F^{(k)}(x) \ge 0$, $x \in (0, \infty)$, $k = 0, 1, 2, \ldots$, where, by definition, $F^{(0)} = F$. The following result appears in a recent paper by Micchelli [6]. A proof is included for completeness.
LEMMA 5. Let $A$ be a c.n.d. $n \times n$ matrix with positive entries and let $F : (0, \infty) \to \mathbb{R}$ be completely monotonic. Then the matrix $((F(a_{ij})))$ is positive semidefinite.
PROOF. By a well-known theorem of Bernstein (see, for example, [11, p. 160]), $F$ admits the representation
$$F(t) = \int_0^{\infty} e^{-t\alpha} \, d\mu(\alpha), \qquad t > 0,$$
where $d\mu(\alpha)$ is a Borel measure on $(0, \infty)$. The result follows by Lemma 4.
Now we have the following.
LEMMA 6. Let $A$ be a symmetric, positive $r \times r$ matrix with exactly one, simple, positive eigenvalue. Then, for any $x \in H_r$,
$$\prod_{i,j=1}^{r} a_{ij}^{x_i x_j} \le 1.$$
PROOF. It is known that if $A$ is a symmetric, positive $r \times r$ matrix, then there exists a positive vector $z$ such that the matrix $B = ((b_{ij})) = ((a_{ij} z_i z_j))$ is doubly stochastic (see, for example, [5, 9]). Clearly, if $A$ satisfies the hypothesis of the lemma, then $B$ also has only one positive eigenvalue by Sylvester's law, and, furthermore,
$$\prod_{i,j=1}^{r} b_{ij}^{x_i x_j} = \prod_{i,j=1}^{r} (a_{ij} z_i z_j)^{x_i x_j} = \prod_{i,j=1}^{r} a_{ij}^{x_i x_j}$$
for any $x \in H_r$.
Thus, we may assume, without loss of generality, that $A$ is doubly stochastic, so that the vector $(1, \ldots, 1)$ is an eigenvector of $A$ corresponding to the eigenvalue 1. Since $A$ has only one positive eigenvalue, any vector $x \in H_r$ lies in the span of eigenvectors of $A$ corresponding to only nonpositive eigenvalues and, hence,
$$\sum_{i=1}^{r} \sum_{j=1}^{r} a_{ij} x_i x_j \le 0.$$
Thus, $A$ is c.n.d. The function $F(t) = t^{-\alpha}$, $t > 0$, is completely monotonic for any $\alpha > 0$ and by Lemma 5, $((a_{ij}^{-\alpha}))$ is positive semidefinite. Hence, for any $\alpha > 0$, $((e^{-\alpha \log a_{ij}}))$ is positive semidefinite and by Lemma 4, $((\log a_{ij}))$ is c.n.d., which is precisely the asserted inequality. This completes the proof.
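The scaling step in the proof, namely that replacing $a_{ij}$ by $a_{ij} z_i z_j$ leaves $\prod_{i,j} a_{ij}^{x_i x_j}$ unchanged whenever $\sum_i x_i = 0$, admits a one-line numerical check (a sketch only; numpy and the random data below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
r = 4
A = rng.uniform(0.5, 2.0, (r, r))
A = (A + A.T) / 2                                  # symmetric with positive entries
z = rng.uniform(0.5, 2.0, r)                       # positive scaling vector
x = rng.normal(size=r)
x -= x.mean()                                      # now x lies in H_r

f = lambda M: np.prod(M ** np.outer(x, x))         # the product of m_ij^(x_i x_j)
assert np.isclose(f(A), f(A * np.outer(z, z)))     # the z factors cancel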
5. The main result. We are now in a position to prove the following statement, conjectured by Karlin and Rinott [4].

THEOREM 7. Let $X = (X_1, \ldots, X_r)$ have the multiparameter multinomial distribution with the $r \times n$ parameter matrix $\Pi$. Let $k \in K_{n-2}$, let $k_{ij}$ be defined as in (5), and let $m_{ij} = P(X = k_{ij})$, $i, j = 1, 2, \ldots, r$. Then the matrix $((\log m_{ij}))$ is c.n.d.
PROOF. We have to prove that for any $x \in H_r$,
$$(7)\qquad \prod_{i,j=1}^{r} m_{ij}^{x_i x_j} \le 1.$$
Let $A$ be the $(n-2) \times n$ matrix formed by taking $k_i$ copies of $\pi^i$, the $i$th row of $\Pi$, $i = 1, 2, \ldots, r$. Define the $r \times r$ matrix $B = ((b_{ij}))$ as
$$b_{ij} = \operatorname{per}\begin{bmatrix} A\\ \pi^i\\ \pi^j \end{bmatrix}, \qquad i, j = 1, 2, \ldots, r.$$
By Theorem 3, $B$ has at most one positive eigenvalue, and since $B$ is a positive matrix, it must have exactly one positive eigenvalue. So by Lemma 6,
$$(8)\qquad \prod_{i,j=1}^{r} b_{ij}^{x_i x_j} \le 1 \qquad \text{for any } x \in H_r.$$
By the relationship (4),
$$(9)\qquad m_{ij} = P(X = k_{ij}) = \begin{cases} \dfrac{b_{ij}}{(k_1! \cdots k_r!)(k_i + 1)(k_j + 1)}, & 1 \le i \ne j \le r,\\[2ex] \dfrac{b_{ii}}{(k_1! \cdots k_r!)(k_i + 1)(k_i + 2)}, & 1 \le i \le r. \end{cases}$$
Let $D$ be the $r \times r$ diagonal matrix with its $i$th diagonal entry equal to $(k_i + 1)^{-1}$, $i = 1, 2, \ldots, r$, and let
$$(10)\qquad C = \frac{1}{k_1! \cdots k_r!}\, D B D.$$
It follows from (8) and (10) that for any $x \in H_r$,
$$(11)\qquad \prod_{i,j=1}^{r} c_{ij}^{x_i x_j} \le 1.$$
From (9) and (10),
$$(12)\qquad m_{ij} = c_{ij}, \quad 1 \le i \ne j \le r, \qquad m_{ii} = \frac{k_i + 1}{k_i + 2}\, c_{ii}, \quad 1 \le i \le r.$$
Since $(k_i + 1)/(k_i + 2) < 1$, $i = 1, 2, \ldots, r$, it follows from (12) and (11) that
$$\prod_{i,j=1}^{r} m_{ij}^{x_i x_j} \le \prod_{i,j=1}^{r} c_{ij}^{x_i x_j} \le 1 \qquad \text{for any } x \in H_r,$$
and the proof is complete.
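Theorem 7 also lends itself to a direct numerical spot check that combines the earlier sketches: compute $m_{ij} = P(X = k_{ij})$ from the representation (4) with multinomial_via_permanent and test $((\log m_{ij}))$ with is_cnd. The particular $\Pi$ and $k$ below are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(1)
r, n = 3, 4
Pi = rng.uniform(0.1, 1.0, (r, n))
Pi /= Pi.sum(axis=0)                               # make Pi column stochastic
k = (1, 1, 0)                                      # an element of K_{n-2}

def k_plus(i, j):
    # k_ij of (5): add 1 to entries i and j (so the diagonal case adds 2 to entry i).
    t = list(k)
    t[i] += 1
    t[j] += 1
    return tuple(t)

M = np.array([[multinomial_via_permanent(Pi.tolist(), k_plus(i, j))
               for j in range(r)] for i in range(r)])
assert is_cnd(np.log(M))                           # the conjectured c.n.d. property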
References

1. A. D. Alexandroff, Zur Theorie der gemischten Volumina von konvexen Körpern. IV, Mat. Sb. 45 (1938), no. 3, 227-251. (Russian; German summary)
2. G. P. Egorychev, The solution of van der Waerden's problem for permanents, Adv. in Math. 42 (1981), 299-305.
3. D. I. Falikman, A proof of van der Waerden's conjecture on the permanent of a doubly stochastic matrix, Mat. Zametki 29 (1981), 931-938. (Russian)
4. S. Karlin and Y. Rinott, Entropy inequalities for classes of probability distributions. II. The multivariate case, Adv. in Appl. Probab. 13 (1981), 325-351.
5. M. Marcus and M. Newman, The permanent of a symmetric matrix, Notices Amer. Math. Soc. 8 (1961), 595.
6. C. A. Micchelli, Interpolation of scattered data: Distance matrices and conditionally positive definite matrices, Constr. Approx. 2 (1986), 11-22.
7. H. Minc, Permanents, Encyclopedia of Mathematics and its Applications, Vol. 6, Addison-Wesley, Reading, Mass., 1978.
8. K. R. Parthasarathy and K. Schmidt, Positive definite kernels, continuous tensor products and central limit theorems of probability theory, Springer-Verlag, Berlin, 1972.
9. R. Sinkhorn, A relationship between arbitrary positive matrices and doubly stochastic matrices, Ann. Math. Statist. 35 (1964), 876-879.
10. J. H. van Lint, Notes on Egoritsjev's proof of the van der Waerden conjecture, Linear Algebra Appl. 39 (1981), 1-8.
11. D. V. Widder, The Laplace transform, Princeton Univ. Press, Princeton, N.J., 1946.

Indian Statistical Institute, Delhi Centre, 7, S. J. S. Sansanwal Marg, New Delhi-110016, India