Comptes rendus de l’Academie bulgare des Sciences, Tome 59, № 4, 2006, p. 353 – 360.
POSITIVE DEFINITE RANDOM MATRICES
Evelina Veleva
Abstract: The paper begins with necessary and sufficient conditions for positive
definiteness of a matrix. These are used to obtain a representation, equivalent in
distribution, of the elements of a Wishart matrix as algebraic functions of independent
random variables. The next section studies when sets of sample correlation coefficients of
the multivariate normal (Gaussian) distribution are dependent, independent or conditionally
independent. Some joint densities are obtained. The paper ends with an algorithm for
generating uniformly distributed positive definite matrices with diagonal elements fixed in
advance.
Key words: positive definite matrix, Wishart distribution, multivariate normal
(Gaussian) distribution, sample correlation coefficients, generating random matrices
2000 Mathematics Subject Classification: 62H10
1. Necessary and sufficient conditions for positive definiteness of a matrix
Definition. Suppose A is a real symmetric matrix. If, for every non-zero vector x, we
have that

$$x'Ax > 0,$$

then A is a positive definite matrix (here and subsequently x' denotes the transpose of x).
Let A = {a_{ij}}_{i,j=1}^n be a real matrix, and let N = {1, 2,…, n}. For any nonempty
α, β ⊂ N let A(α, β) be the submatrix of A obtained by deleting from A the rows
indexed by α and the columns indexed by β. To simplify notation we use A(α) instead of
A(α, α). We will write the complement of α in N as α^c. For any integers i and j,
1 ≤ i ≤ j ≤ n, we will denote by A(α)_{ij} and A(α)^0_{ij} the block matrices

$$A(\alpha)_{ij} = \begin{pmatrix} A(\alpha) & A(\alpha, \{j\}^c) \\ A'(\alpha, \{i\}^c) & a_{ij} \end{pmatrix}, \qquad A(\alpha)^0_{ij} = \begin{pmatrix} A(\alpha) & A(\alpha, \{j\}^c) \\ A'(\alpha, \{i\}^c) & 0 \end{pmatrix}.$$
Theorem 1. Let A be a real symmetric matrix of size n, and let i, j be fixed integers
such that 1 ≤ i < j ≤ n. The matrix A is positive definite if and only if the matrices A({i})
and A({j}) are both positive definite and the element a_{ij} satisfies the inequalities

$$\frac{-|A(\{i,j\})^0_{ij}| - \sqrt{|A(\{i\})|\,|A(\{j\})|}}{|A(\{i,j\})|} \;<\; a_{ij} \;<\; \frac{-|A(\{i,j\})^0_{ij}| + \sqrt{|A(\{i\})|\,|A(\{j\})|}}{|A(\{i,j\})|},$$

where |·| denotes the determinant of a matrix.
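To make the interval concrete, here is a small numerical sketch (ours, not part of the paper; NumPy is assumed and indices are 0-based, so i, j below correspond to i+1, j+1 in the theorem) that computes the admissible range for a_{ij} from the determinants above:

```python
import numpy as np

def aij_interval(A, i, j):
    """Range of A[i, j] keeping A positive definite (Theorem 1, 0-based i < j)."""
    keep = [k for k in range(A.shape[0]) if k not in (i, j)]
    det = np.linalg.det
    d_i = det(np.delete(np.delete(A, i, 0), i, 1))   # |A({i})|
    d_j = det(np.delete(np.delete(A, j, 0), j, 1))   # |A({j})|
    d_ij = det(A[np.ix_(keep, keep)])                # |A({i,j})|
    # |A({i,j})^0_{ij}|: A({i,j}) bordered by column j, row i and corner 0
    M = np.zeros((len(keep) + 1, len(keep) + 1))
    M[:-1, :-1] = A[np.ix_(keep, keep)]
    M[:-1, -1] = A[keep, j]
    M[-1, :-1] = A[i, keep]
    c = det(M)
    r = np.sqrt(d_i * d_j)
    return (-c - r) / d_ij, (-c + r) / d_ij

A = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.0, 0.0],
              [0.3, 0.0, 1.5]])
print(aij_interval(A, 1, 2))  # admissible range for a_23, about (-1.053, 1.203)
```

Theorem 2 below applies the same bounds with the trailing index sets {i,…, n}, which lets the entries of a matrix be checked or generated one by one.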
Theorem 2. A real symmetric matrix A = {a_{ij}}_{i,j=1}^n is positive definite if and only if
its elements satisfy the following conditions:

$$a_{ii} > 0, \quad i = 1,\dots,n;$$

$$-\sqrt{a_{11}a_{ii}} < a_{1i} < \sqrt{a_{11}a_{ii}}, \quad i = 2,\dots,n;$$

$$\frac{-|A(\{i,\dots,n\})^0_{ij}| - \sqrt{|A(\{i,\dots,n\})_{ii}|\,|A(\{i,\dots,n\})_{jj}|}}{|A(\{i,\dots,n\})|} \;<\; a_{ij} \;<\; \frac{-|A(\{i,\dots,n\})^0_{ij}| + \sqrt{|A(\{i,\dots,n\})_{ii}|\,|A(\{i,\dots,n\})_{jj}|}}{|A(\{i,\dots,n\})|},$$
$$i = 2,\dots,n-1, \quad j = i+1,\dots,n.$$
The proofs of Theorems 1 and 2 will appear in [1].
2. Wishart random matrices
Let X_1,…, X_n be independent, X_i ~ N_p(0, Σ), with n ≥ p, i.e., X_i has a
p-dimensional multivariate normal distribution with zero mean vector and
covariance matrix Σ, Σ > 0, and let W = \sum_{i=1}^n X_i X_i'. We say that W has a central Wishart
distribution with n degrees of freedom and covariance matrix Σ, and write W ~ W_p(n, Σ). It
is easy to see that W is a p × p matrix and that W > 0. From this definition, it is apparent
that EW = nΣ and

(1) AWA' ~ W_q(n, AΣA'),

where A is q × p of rank q.
It is known that, for an arbitrary positive definite p × p matrix Σ, there exists a
p × p matrix B such that Σ = BB'. Consequently, from (1) it follows that if W ~ W_p(n, I),
where I is the identity matrix of size p, then BWB' ~ W_p(n, Σ). Therefore from now on we
consider the case Σ = I.
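As a quick illustration (a sketch of ours, not from the paper; NumPy assumed), the definition and the reduction to Σ = I translate directly into code, taking B as the Cholesky factor of Σ:

```python
import numpy as np

rng = np.random.default_rng(0)

def wishart_by_definition(n, sigma):
    """Draw W ~ W_p(n, Sigma) as the sum of X_i X_i' over n normal vectors."""
    p = sigma.shape[0]
    B = np.linalg.cholesky(sigma)          # one matrix B with Sigma = B B'
    X = rng.standard_normal((n, p)) @ B.T  # rows X_i ~ N_p(0, Sigma)
    return X.T @ X                         # equals the sum of X_i X_i'

# Sanity check of E W = n Sigma: the average of W / n estimates Sigma.
sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
W_bar = sum(wishart_by_definition(50, sigma) for _ in range(2000)) / 2000
print(W_bar / 50)  # close to sigma
```

Equivalently, one can draw W0 ~ W_p(n, I) and return B W0 B', which is the reduction used in the rest of the section.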
Ignatov and Nikolova in [2] consider the distribution Ψ(n, p) with joint density

(2) $$C\,|D|^{(n-p-1)/2}\,I_E,$$

where
- C is a constant;
- D is the matrix

$$D = \begin{pmatrix} 1 & x_{12} & \dots & x_{1p} \\ x_{12} & 1 & \dots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1p} & x_{2p} & \dots & 1 \end{pmatrix};$$

- I_E is the indicator of the set E, containing all points (x_{ij}, 1 ≤ i < j ≤ p) in the real
space R^{p(p−1)/2}, such that the matrix D is positive definite.
They have proved the following Proposition:
Proposition. Let τ_1, τ_2,…, τ_p be independent and identically χ²-distributed
random variables with n degrees of freedom and let the random variables ν_{ij}, 1 ≤ i < j ≤ p, have
distribution Ψ(n, p). Let us assume that the set {τ_1, τ_2,…, τ_p} is independent of the set
{ν_{ij}, 1 ≤ i < j ≤ p}. Then the matrix V below has the Wishart distribution W_p(n, I):

$$V = \begin{pmatrix} \sqrt{\tau_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{\tau_p} \end{pmatrix} \begin{pmatrix} 1 & \nu_{12} & \dots & \nu_{1p} \\ \nu_{12} & 1 & \dots & \nu_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \nu_{1p} & \nu_{2p} & \dots & 1 \end{pmatrix} \begin{pmatrix} \sqrt{\tau_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{\tau_p} \end{pmatrix}.$$
Using the same variable transformation as in the proof of this Proposition, it can be
shown that the converse statement is also true. So, the density (2) is actually the joint
density of the entries of the sample correlation matrix in the case Σ = I.
The next Theorem gives a representation, equivalent in distribution, of the random
variables ν_{ij}, 1 ≤ i < j ≤ p, having distribution Ψ(n, p), as functions of independent
random variables. This can be used for generating Ψ(n, p)-distributed random matrices and
consequently Wishart ones (see [3]).
Theorem 3. Let the random variables η_{ij}, 1 ≤ i < j ≤ p, be mutually independent.
Suppose that η_{i,i+1},…, η_{ip} are identically Ψ(n − i + 1, 2)-distributed for i = 1,…, p − 1. The
random variables ν_{ij}, 1 ≤ i < j ≤ p, defined by

$$\nu_{1i} = \eta_{1i}, \quad i = 2,\dots,p,$$

$$\nu_{2i} = \eta_{12}\eta_{1i} + \eta_{2i}\sqrt{(1-\eta_{12}^2)(1-\eta_{1i}^2)}, \quad i = 3,\dots,p,$$

$$\nu_{ij} = \eta_{ij}\prod_{q=1}^{i-1}\sqrt{(1-\eta_{qi}^2)(1-\eta_{qj}^2)} \;+\; \sum_{t=1}^{i-1}\Bigl(\eta_{ti}\,\eta_{tj}\prod_{q=1}^{t-1}\sqrt{(1-\eta_{qi}^2)(1-\eta_{qj}^2)}\Bigr), \quad 3 \le i < j \le p,$$

have distribution Ψ(n, p).
To generate random Ψ(n, p) matrices using Theorem 3, we have to know how to
generate random variables with distribution Ψ(k, 2) for k ≥ 2. For p = 2 the density (2) of
the distribution Ψ(n, p) has the form

$$C\,(1-y^2)^{(n-3)/2}, \quad y \in (-1, 1),$$

and it is known as the Pearson type II probability distribution. Random variables
with this distribution can be easily generated (see [4], p. 481) using the quotient of the
difference and the sum of two gamma-distributed random variables.
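The following sketch (our illustration, not the paper's code; NumPy assumed, 0-based indices) puts the pieces together: Ψ(k, 2) variables via the gamma quotient, the recursion of Theorem 3 for a Ψ(n, p) matrix, and the Proposition's assembly of a W_p(n, I) matrix. The gamma shape parameter (k − 1)/2 is our inference from matching the density above with the symmetric beta law of (G_1 − G_2)/(G_1 + G_2); treat it as an assumption to be checked against [4].

```python
import numpy as np

rng = np.random.default_rng(0)

def pearson_type2(k):
    """Draw from the density C (1 - y^2)^((k-3)/2) on (-1, 1), i.e. Psi(k, 2)."""
    g1 = rng.gamma((k - 1) / 2)
    g2 = rng.gamma((k - 1) / 2)
    return (g1 - g2) / (g1 + g2)   # quotient of difference and sum

def psi_matrix(n, p):
    """Build a Psi(n, p) matrix via the recursion of Theorem 3 (0-based)."""
    eta = np.zeros((p, p))
    for i in range(p - 1):                     # paper's i = 1, ..., p-1
        for j in range(i + 1, p):
            eta[i, j] = pearson_type2(n - i)   # Psi(n - (i+1) + 1, 2)
    D = np.eye(p)
    for i in range(p - 1):
        for j in range(i + 1, p):
            total, prod = 0.0, 1.0
            for t in range(i):                 # sum over t < i, product over q < t
                total += eta[t, i] * eta[t, j] * prod
                prod *= np.sqrt((1 - eta[t, i] ** 2) * (1 - eta[t, j] ** 2))
            total += eta[i, j] * prod          # leading term, product over q < i
            D[i, j] = D[j, i] = total
    return D

def wishart_via_psi(n, p):
    """Assemble W ~ W_p(n, I) as in the Proposition: sqrt(tau) D sqrt(tau)."""
    tau = rng.chisquare(df=n, size=p)
    s = np.sqrt(tau)
    return s[:, None] * psi_matrix(n, p) * s[None, :]

print(wishart_via_psi(10, 4))
```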
3. Marginal densities of the distribution Ψ(n, p)
In recent years the Wishart distribution and distributions derived from the Wishart
one have received a lot of attention because of their use in graphical Gaussian models. The
essence of graphical models in multivariate analysis is to identify independences and
conditional independences between various groups of variables.
Empirical correlation matrices are of great importance for risk management and
asset allocation. The study of correlation matrices is one of the cornerstones of Markowitz's
theory of optimal portfolios.
Using Theorems 1 and 2 it is possible to get some marginal densities of the
distribution Ψ (n, p ) and to investigate when sets of sample correlation coefficients of the
multivariate normal (Gaussian) distribution are dependent, independent or conditionally
independent.
Let ν_{ij}, 1 ≤ i < j ≤ p, be random variables with distribution Ψ(n, p). Denote by V
the set of random variables V = {ν_{ij}, 1 ≤ i < j ≤ p}. In [5] we have proved the following
Theorem:
Theorem 4. Any p random variables from the set V are dependent.
Let us introduce the notation f_S for the joint density of the random variables from a
set S; if S = ∅, i.e. the set S is empty, we define f_S = 1.
Theorem 5. Let r and q be arbitrary integers such that 2 ≤ r ≤ p − 1 and
1 ≤ q ≤ p − 2. Let us denote by S_1 and S_2 the sets S_1 = {ν_{ij} | 1 ≤ i < j ≤ r} and
S_2 = {ν_{ij} | q + 1 ≤ i < j ≤ p}. Then

(3) $$f_{S_1 \cup S_2} = \frac{f_{S_1}\, f_{S_2}}{f_{S_1 \cap S_2}}.$$
The proof will appear in [6].
We denote by f_{S_1∪S_2 / S_1∩S_2}, f_{S_1 / S_1∩S_2} and f_{S_2 / S_1∩S_2} the densities of S_1 ∪ S_2, S_1
and S_2, conditioned on the random variables from the set S_1 ∩ S_2. If we divide both
sides of equality (3) by f_{S_1∩S_2} we get the relation

(4) $$f_{S_1 \cup S_2 / S_1 \cap S_2} = f_{S_1 / S_1 \cap S_2}\; f_{S_2 / S_1 \cap S_2}.$$

According to the definition of conditional independence, given in [7], equality (4) shows that
the random variables from the sets S_1 and S_2 are independent conditionally on the random
variables from the set S_1 ∩ S_2.
Theorem 6. Let q_1,…, q_k, r_1,…, r_k be integers such that 1 = q_1 ≤ q_2 ≤ … ≤ q_k ≤ p and
1 ≤ r_1 ≤ r_2 ≤ … ≤ r_k = p. Let us denote by S_l the set S_l = {ν_{ij} | q_l ≤ i < j ≤ r_l}, l = 1,…, k. Then

$$f_{S_1 \cup \dots \cup S_k} = \frac{f_{S_1}\, f_{S_2} \cdots f_{S_k}}{f_{S_1 \cap S_2}\, f_{S_2 \cap S_3} \cdots f_{S_{k-1} \cap S_k}}.$$
Let S = {ν_{i_s j_s}, s = 1,…, k} be an arbitrary subset of the set of random variables V. To
this subset S we can attach a graph G(S) with nodes {1, 2,…, p} and k undirected edges
{{i_s, j_s}, s = 1,…, k}.
Theorem 7. Let S_1 and S_2 be two subsets of the set V and G_1 and G_2 be their
corresponding graphs. Let us denote by K_1 the set of nodes which are vertices of an
edge from G_1. Analogously, for the graph G_2 let us denote the corresponding
set by K_2, and let K = K_1 ∩ K_2. If the set K contains at most one element then

$$f_{S_1 \cup S_2} = f_{S_1}\, f_{S_2},$$

i.e. the sets S_1 and S_2 are independent. If the set K contains at least two elements and, for
every two elements r and q from K, r < q, the random variable ν_{rq} belongs
simultaneously to S_1 and S_2, then

$$f_{S_1 \cup S_2} = \frac{f_{S_1}\, f_{S_2}}{f_{S_1 \cap S_2}},$$

i.e. the sets S_1 and S_2 are independent conditioned on the set S_1 ∩ S_2.
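A toy sketch (our own, with hypothetical helper names) of how Theorem 7's two conditions can be checked mechanically, representing each ν_{ij} by the pair (i, j):

```python
def vertices(S):
    """Nodes of G(S) incident to an edge; S holds pairs (i, j) with i < j."""
    return {node for edge in S for node in edge}

def classify(S1, S2):
    """Apply the two sufficient conditions of Theorem 7 to subsets S1, S2 of V."""
    K = vertices(S1) & vertices(S2)
    if len(K) <= 1:
        return "independent"
    all_pairs_in_K = {(r, q) for r in K for q in K if r < q}
    if all_pairs_in_K <= (S1 & S2):
        return "conditionally independent given S1 ∩ S2"
    return "Theorem 7 draws no conclusion"

# With p = 5: S1 = {nu_12, nu_23} and S2 = {nu_34, nu_45} share only node 3.
print(classify({(1, 2), (2, 3)}, {(3, 4), (4, 5)}))  # independent
# Here K = {1, 2, 3} and every pair from K lies in both sets.
print(classify({(1, 2), (2, 3), (1, 3)}, {(1, 2), (2, 3), (1, 3), (3, 4)}))
```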
Theorem 8. Let S be a subset of the set V and G be the corresponding graph. Let
us denote by S_+ the subset of S containing all variables whose edges in the graph G are
part of a closed path (circuit). If S_− is the set S_− = S \ S_+, then the sets S_+ and S_− are
independent.
The proofs of Theorems 6 - 8 will appear in [8].
In [9] we have proved the following two Theorems:
Theorem 9. Let r and q be arbitrary integers such that 1 ≤ r < q ≤ p. The random
variables from the set S = V \ {ν_{rq}} have joint density function of the form

$$C\,\frac{\Gamma\!\left(\frac{n-p+1}{2}\right)\Gamma\!\left(\frac{1}{2}\right)}{\Gamma\!\left(\frac{n-p+2}{2}\right)}\,\frac{\bigl(|D_{r;r}|\,|D_{q;q}|\bigr)^{(n-p)/2}}{|D_{r,q;r,q}|^{(n-p+1)/2}}\;I_E,$$

where
- I_E is the indicator of the set E, containing all points in R^{p(p−1)/2−1}, such that the
matrices D_{r;r} and D_{q;q} (obtained from D by deleting the row and column numbered r,
respectively q; D_{r,q;r,q} has both pairs deleted) are both positive definite;
- Γ(·) is the well-known Gamma function.
Theorem 10. Let r be an arbitrary integer such that 1 ≤ r ≤ p. The random
variables from the set S = {ν_{ij} | 1 ≤ i < j ≤ p, i ≠ r, j ≠ r} have joint density function of the
form

$$f_S = \frac{\left[\Gamma\!\left(\frac{n}{2}\right)\right]^{p-2}}{\Gamma\!\left(\frac{n-1}{2}\right)\Gamma\!\left(\frac{n-2}{2}\right)\cdots\Gamma\!\left(\frac{n-p+2}{2}\right)\left[\Gamma\!\left(\frac{1}{2}\right)\right]^{(p-1)(p-2)/2}}\;|D_{r;r}|^{(n-p)/2}$$

for all points in R^{(p−1)(p−2)/2} for which the matrix D_{r;r} is positive definite.
4. Uniformly distributed positive definite matrices
The positive definiteness of a matrix is a necessary condition for applying many
numerical algorithms. The diagonal elements of the matrix often have a specified significance.
The correctness of such a numerical algorithm can be verified if we are able to choose a
positive definite matrix at random with uniform distribution. The space of all positive
definite matrices is, however, a cone, and consequently a uniform distribution cannot be
defined over the whole cone because it has infinite volume. This forces us to introduce
additional restrictions on the matrices, which together with the positive definiteness confine
our choice to a set with finite volume. An algorithm for generating uniformly distributed
positive definite matrices with fixed diagonal elements is given below. If the user does not
want to fix concrete diagonal elements of the matrix in advance, he still has to assign
bounds for each diagonal element. For instance, he can choose all diagonal elements to be in
the interval (0, 100). Then a uniform random number within the chosen bounds has to be
generated for every diagonal element. The algorithm described below allows the user a
random choice among all positive definite matrices with concrete diagonal elements that are
either fixed in advance or randomly generated within the chosen bounds.
From now on we assume that a_11,…, a_pp are the diagonal elements of the matrix,
chosen in accordance with the user's preferences. They have to be positive so that at
least one positive definite matrix with such diagonal elements exists (see Theorem 1).
Theorem 11. Let the random variables η_{ij}, 1 ≤ i < j ≤ p, be mutually independent.
Suppose that η_{i,i+1},…, η_{ip} are identically Ψ(p − i + 2, 2)-distributed for i = 1,…, p − 1.
Consider the random variables ν_{ij}, 1 ≤ i < j ≤ p, defined by

$$\nu_{1i} = \eta_{1i}\sqrt{a_{11}a_{ii}}, \quad i = 2,\dots,p,$$

$$\nu_{ij} = \sqrt{a_{ii}a_{jj}}\left[\sum_{r=1}^{i-1}\Bigl(\eta_{ri}\,\eta_{rj}\prod_{q=1}^{r-1}\sqrt{(1-\eta_{qi}^2)(1-\eta_{qj}^2)}\Bigr) + \eta_{ij}\prod_{q=1}^{i-1}\sqrt{(1-\eta_{qi}^2)(1-\eta_{qj}^2)}\right],$$
$$i = 2,\dots,p-1, \quad j = i+1,\dots,p.$$

The joint density function of the random variables ν_{ij}, 1 ≤ i < j ≤ p, is of the form

$$f_{\nu_{ij},\,1\le i<j\le p}(x_{ij},\,1\le i<j\le p) = C\,I_E,$$

where
- C is the constant

$$C = \frac{\left[\Gamma\!\left(\frac{p+1}{2}\right)\right]^{p-1}}{\Gamma\!\left(\frac{p}{2}\right)\Gamma\!\left(\frac{p-1}{2}\right)\cdots\Gamma\!\left(\frac{3}{2}\right)\left[\Gamma\!\left(\frac{1}{2}\right)\right]^{p(p-1)/2}\,(a_{11}a_{22}\cdots a_{pp})^{(p-1)/2}};$$

- I_E is the indicator of the set E, consisting of all points (x_{ij}, 1 ≤ i < j ≤ p) in
R^{p(p−1)/2}, for which the matrix

(5) $$A = \begin{pmatrix} a_{11} & x_{12} & \dots & x_{1p} \\ x_{12} & a_{22} & \dots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1p} & x_{2p} & \dots & a_{pp} \end{pmatrix}$$

is positive definite.
Theorem 11 gives the following algorithm for generating uniformly distributed
positive definite matrices:
1) Generate p(p − 1)/2 random numbers y_{ij}, 1 ≤ i < j ≤ p, so that y_{ij} comes from
the distribution Ψ(p − i + 2, 2).
2) In order to reduce calculations, compute the auxiliary quantities z_{ij}, 1 ≤ i ≤ j ≤ p,
such that

$$z_{ij} = y_{ij}\sqrt{(1-y_{1j}^2)\cdots(1-y_{i-1,j}^2)}, \quad \text{with } y_{ii} = 1.$$

3) Calculate the desired matrix (5) with

$$x_{ij} = \sqrt{a_{ii}a_{jj}}\,(z_{1i}z_{1j} + z_{2i}z_{2j} + \dots + z_{ii}z_{ij}).$$
The obtained formulas show that it is possible to create a program which, without
any dialog with the user, generates a matrix U with units on the main diagonal. This is done
by steps 1)-3), substituting a_11 = ⋯ = a_pp = 1. Then the user, according to his preferences, forms
a diagonal matrix D = diag(√a_11,…, √a_pp), and the desired matrix A is A = DUD.
The details will appear in [10].
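For illustration, a possible transcription of steps 1)-3) (our sketch, not the paper's program; NumPy assumed, 0-based indices, with pearson_type2 as in the sketch of Section 2):

```python
import numpy as np

rng = np.random.default_rng(0)

def pearson_type2(k):
    """Psi(k, 2) via the gamma quotient, as in Section 2."""
    g1, g2 = rng.gamma((k - 1) / 2, size=2)
    return (g1 - g2) / (g1 + g2)

def uniform_pd(diag_elems):
    """Uniformly distributed positive definite matrix with the given diagonal."""
    a = np.asarray(diag_elems, dtype=float)    # a_11, ..., a_pp, all positive
    p = len(a)
    # Step 1: y_ij ~ Psi(p - i + 2, 2) for 1 <= i < j <= p (1-based i).
    y = np.eye(p)                              # convention y_ii = 1 from step 2
    for i in range(p - 1):
        for j in range(i + 1, p):
            y[i, j] = pearson_type2(p - i + 1)     # 0-based i: p - (i+1) + 2
    # Step 2: z_ij = y_ij * sqrt((1 - y_1j^2) ... (1 - y_{i-1,j}^2)).
    z = np.zeros((p, p))
    for j in range(p):
        prod = 1.0
        for i in range(j + 1):
            z[i, j] = y[i, j] * np.sqrt(prod)
            prod *= 1.0 - y[i, j] ** 2
    # Step 3: x_ij = sqrt(a_ii a_jj) * (z_1i z_1j + ... + z_ii z_ij).
    A = np.diag(a)
    for i in range(p - 1):
        for j in range(i + 1, p):
            A[i, j] = A[j, i] = np.sqrt(a[i] * a[j]) * (z[:i + 1, i] @ z[:i + 1, j])
    return A

U = uniform_pd(np.ones(4))                     # unit diagonal, as in the remark above
D = np.diag(np.sqrt([2.0, 5.0, 1.0, 3.0]))
A = D @ U @ D                                  # diagonal elements 2, 5, 1, 3
print(np.linalg.eigvalsh(A))                   # all eigenvalues positive
```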
References:
[1]. VELEVA E. Linear Algebra and Appl. (submitted).
[2]. IGNATOV T. G., A. D. NIKOLOVA. Annuaire Univ. Sofia, Fac. of Econ. & Buss.
Admin., 2004, v. 3, 79 – 94.
[3]. VELEVA E. Annuaire Univ. Sofia, Fac. of Econ. & Buss. Admin., 2007, v. 6, 57 – 66.
[4]. DEVROYE L. Non-Uniform Random Variate Generation, New York, Springer-Verlag,
1986, 843 p.
[5]. VELEVA E., T. IGNATOV. Advanced Studies in Contemporary Mathematics, 12 (2),
2006, 247 – 254.
[6]. VELEVA E. Math. and Education in Math., 35 (2006), 315 – 320.
[7]. DAWID A. P. J. Roy. Stat. Soc., Ser. B, 1979, 41 (1), 1 – 31.
[8]. VELEVA E. Pliska Stud. Math. Bulgar., 18 (2007), 379 – 386.
[9]. VELEVA E. Annuaire Univ. Sofia, Fac. of Econ. & Buss. Admin., 2006, v. 5, 353 – 369.
[10]. VELEVA E. Math. and Education in Math., 35 (2006), 321 – 326.
Evelina Ilieva Veleva
Department of Numerical Methods and Statistics,
Rouse University, Rouse, 7017, Studentska 8, Bulgaria
E-mail address: [email protected]