Linköping Studies in Science and Technology
Licentiate Thesis No. 1597
On the asymptotic spectral
distribution of random matrices
Closed form solutions using free independence
Jolanta Pielaszkiewicz
Department of Mathematics
Linköping University, SE–581 83 Linköping, Sweden
Linköping 2013
Linköping Studies in Science and Technology
Licentiate Thesis No. 1597
On the asymptotic spectral distribution of random matrices
Closed form solutions using free independence
Jolanta Pielaszkiewicz
[email protected]
www.mai.liu.se
Mathematical Statistics
Department of Mathematics
Linköping University
SE–581 83 Linköping
Sweden
LIU-TEK-LIC-2013:31
ISBN 978-91-7519-596-4
ISSN 0280-7971
Copyright © 2013 Jolanta Pielaszkiewicz
Printed by LiU-Tryck, Linköping, Sweden 2013
Abstract
The spectral distribution function of random matrices is an information-carrying object widely studied within Random matrix theory. In this thesis we combine the results of that theory with the idea of free independence introduced by Voiculescu (1985).
An important theoretical part of the thesis consists of an introduction to Free probability theory, which justifies the use of asymptotic freeness with respect to particular matrices as well as the use of the Stieltjes and R-transform. Both transforms are presented together with their properties.
The aim of the thesis is to point out characterizations of those classes of matrices which have closed form expressions for the asymptotic spectral distribution function. We consider all matrices which can be decomposed into a sum of asymptotically free independent summands.

In particular, explicit calculations are performed in order to illustrate the use of asymptotic free independence to obtain the asymptotic spectral distribution for a matrix Q and to generalize the Marčenko and Pastur (1967) theorem. The matrix Q is defined as

    Q = (1/n) X_1X_1' + ··· + (1/n) X_kX_k',

where X_i is a p × n matrix following a matrix normal distribution, X_i ∼ N_{p,n}(0, σ²I, I).
Finally, theorems pointing out classes of matrices Q which lead to a closed formula for the asymptotic spectral distribution are presented. In particular, results are given for matrices whose inverse Stieltjes transform, with respect to composition, is a ratio of polynomials of first and second degree.
Popular science summary (Populärvetenskaplig sammanfattning)

Random matrices, that is, matrices whose elements follow some stochastic distribution, appear in many applications. It is often of interest to know how the eigenvalues of these random matrices behave, that is, to compute the distribution of the eigenvalues, the so-called spectral distribution. The eigenvalues are information-carrying objects, as they provide information about, for example, the stability and the inverse of the random matrix through the smallest eigenvalue.

Telecommunications and theoretical physics are two areas where it is of interest to study random matrices, their eigenvalues and the distribution of these eigenvalues. This holds in particular for large random matrices, where one is interested in the asymptotic spectral distribution. One example is a channel matrix X for a multi-dimensional communication system, where the distribution of the eigenvalues of the matrix XX* determines the channel capacity and achievable transmission rate.

The spectral distribution has been studied extensively within the theory of random matrices. In this thesis we combine the results from the theory of random matrices with the idea of free independence, first discussed by Voiculescu (1985). The important theoretical framework of this thesis consists of an introduction to free probability theory, which motivates the use of asymptotic freeness with respect to certain matrices, as well as the use of the Stieltjes and R-transform. Both of these transforms are discussed together with their properties.

The aim of the thesis is to characterize classes of matrices that have a closed form expression for the asymptotic spectral distribution. In this way one avoids numerical approximations of the spectral distribution when drawing conclusions about the eigenvalues.
Acknowledgments
I would like to express my gratitude to my supervisor Dietrich von Rosen for all the advice, support, encouragement and guidance during my work. Thank you for all the discussions and for commenting on various drafts of the thesis. I am equally grateful for all the substantive and administrative help I received from Martin Singull, my co-supervisor. Special thanks for believing in my Swedish language skills.

I am thankful to all the members of the Department of Mathematics at LiU for creating a friendly working atmosphere, especially to the people I have had the opportunity to cooperate or study with. Here, I cannot go without mentioning my fellow PhD students, who have been a great source of friendship.

Finally, I want to thank my family for their love and faith in me, and my friends from all around the world.
Linköping, May 13, 2013
Jolanta Pielaszkiewicz
Contents

1 Introduction
1.1 Problem formulation
1.2 Background and some previous results
1.3 Outline

2 Free probability theory - background information
2.1 Non-commutative space and Freeness
2.1.1 Space (RM_p(C), τ)
2.1.2 Asymptotic Freeness
2.2 Combinatorial interpretation of freeness. Cumulants
2.2.1 Proof of asymptotic freeness between diagonal and Gaussian matrices

3 Stieltjes and R-transform
3.1 Stieltjes transform
3.2 R-transform

4 Marčenko-Pastur, Girko-von Rosen and Silverstein-Bai theorems
4.1 Statements of theorems
4.2 Comparison of results

5 Analytical form of the asymptotic spectral distribution function
5.1 Asymptotic spectral distribution through the R-transform
5.1.1 Asymptotic spectral distribution of the matrix Q_n when Σ = I and Ψ = I
5.2 Classes of matrices with closed formula for asymptotic spectral distribution function

6 Future Research

Notation
Bibliography
Appendix: Package 'WigWish' for Mathematica
1 Introduction

The focus of the thesis has been put on the study of the asymptotic behaviour of large random matrices, say Q ∈ Q, where Q is the set of all p × p positive definite, Hermitian matrices which can be written as a sum of asymptotically free independent matrices with known distributions. The aim is to obtain a closed form expression for the asymptotic spectral distribution function of the random matrix Q.
The concept of free independence, mentioned in the previous paragraph, was introduced within an operator-valued version of Free probability theory by Voiculescu (1985) and allows us to think about sums of asymptotically free independent matrices in a similar way as about sums of independent random variables.
The motivation for considering problems regarding the behaviour of the eigenvalues of large dimensional random matrices arises within, e.g., theoretical physics and wireless communication, where methods of Random matrix theory are commonly used; see the publications by Tulino and Verdú (2004) and Couillet and Debbah (2011). In particular, Random matrix theory started playing an important role in the analysis of communication systems as multiple antennas became commonly used, which implies an increase in the number of nodes. In application studies, numerical methods are often applied to obtain asymptotic eigenvalue distributions, due to the lack of closed form solutions, see e.g. Chen et al. (2012). The computation of the asymptotic spectral distribution for Q ∈ Q, given by Q = P_{p×n} P*_{n×p}, demands solving a non-linear system of p + n coupled functional equations, whose solution has the same asymptotic behaviour as the investigated spectral distribution function, see Girko (1990) and Hachem et al. (2007).
In general, there is a strong interest in finding information about the spectral behaviour of the eigenvalues of random matrices. In particular, the smallest eigenvalue provides knowledge about both the stability and the invertibility of a random positive definite matrix. For example, let X be a channel matrix for a multi-dimensional communication system; then the eigenvalue distribution function of XX* determines the channel capacity and the achievable transmission rate.
1.1 Problem formulation
We are interested in the asymptotic spectral distribution of a positive definite and Hermitian matrix Q ∈ Q, which can be written as a sum of asymptotically free independent matrices with known distributions. The assumption that the matrix is Hermitian is sufficient for the eigenvalues to be real. The Hermitian property together with positive definiteness ensures that all the eigenvalues are both real and positive.

In the thesis the concept of an "infinite matrix" is always realized by referring to a sequence of random matrices of increasing size.
Definition 1.1 (The normalized spectral distribution function). Let λ_i, i = 1, 2, ..., p, be the eigenvalues of a p × p matrix W with complex entries. The normalized spectral distribution function of the matrix W is defined by

    F_p^W(x) = (1/p) Σ_{k=1}^p 1{λ_k ≤ x},    x ≥ 0,

where 1{λ_k ≤ x} stands for the indicator function, i.e.

    1{λ_k ≤ x} = 1 if λ_k ≤ x, and 0 otherwise.
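The definition translates directly into a computation on a sample matrix. The following minimal sketch in Python with NumPy (all names and parameter choices are illustrative only) evaluates F_p^W at a point x for a Hermitian W:

    import numpy as np

    def F(W, x):
        """Normalized spectral distribution F_p^W(x) = (1/p) #{k : lambda_k <= x}."""
        lam = np.linalg.eigvalsh(W)    # real eigenvalues of a Hermitian matrix
        return np.mean(lam <= x)

    # Example: a 500 x 500 real symmetric matrix
    p = 500
    X = np.random.randn(p, p)
    W = (X + X.T) / 2                  # symmetrization guarantees real eigenvalues
    print(F(W, 0.0))                   # roughly 1/2, by symmetry of the spectrum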
Previous results obtained in the literature will be compared using distinguished elements of the class of matrices Q, denoted by Q_n, which can be written as a sum of p × p matrices of the form (1/n) X_iX_i', where X_i is a p × n matrix such that X_i and X_j are independent for all i ≠ j. We assume that the Kolmogorov condition p(n)/n → c ∈ (0, ∞) as n → ∞ holds. More precisely, let
    Q_n = A_nA_n' + (1/n) X_1X_1' + ··· + (1/n) X_kX_k',    (1.1)
where X_i has a matrix normal distribution, X_i ∼ N_{p,n}(0, Σ_i, Ψ_i), for all i = 1, ..., k, X_i and X_j are independent for all i ≠ j, and A_n is a non-random p × n matrix. The mean of the matrix normal distribution is a p × n zero matrix and the dispersion matrix of X_i has the Kronecker product structure, i.e. D[X_i] = D[vec X_i] = Ψ_i ⊗ Σ_i, where both Σ_i and Ψ_i are positive definite matrices. Here vec X denotes the vectorization of the matrix X. If the elements are standardized, one can interpret Σ_i and Ψ_i as the covariance matrices for the rows and the columns of X_i, respectively.
Q_n can also be rewritten in a more compact form as

    Q_n = P_nP_n',

where P_n = B_n + Y_n, B_n = (A_n, 0, ..., 0) and Y_n = (1/√n)(0, X_1, ..., X_k). This form shows immediately that Q_n in our problem formulation is a special case of the matrix ΩΩ*, whose spectral behaviour has been discussed by Hachem et al. (2007).
The use of the Kolmogorov condition is motivated by the observation that the limiting spectral distribution function of the matrix Q_n is not affected by the increase of the number of rows and columns as long as the speed of increase is the same, i.e. p = O(n).
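The matrix Q_n in (1.1) is straightforward to simulate. The sketch below (Python with NumPy; illustrative only, restricted to the special case Σ_i = σ²I, Ψ_i = I) draws one realization of Q_n:

    import numpy as np

    def sample_Qn(p, n, k, sigma2=1.0, A=None):
        """One draw of Q_n = A A' + (1/n) sum_i X_i X_i', X_i ~ N_{p,n}(0, sigma2*I, I)."""
        Q = np.zeros((p, p)) if A is None else A @ A.T
        for _ in range(k):
            X = np.sqrt(sigma2) * np.random.randn(p, n)
            Q += X @ X.T / n
        return Q

    p, n = 200, 400                   # Kolmogorov condition: p/n -> c = 0.5
    Q = sample_Qn(p, n, k=2)
    print(np.linalg.eigvalsh(Q)[:3])  # the smallest eigenvalues are positive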
1.2 Background and some previous results
Random matrix theory is the main field that has placed its research interest in the properties of such matrices, with a strong accent put on the eigenvalue distribution. The considered matrices, called random matrices, have entries following some probability distribution. An ensemble of random matrices is a family of random matrices with a density function that expresses the probability density p of any member of the family being observed. Let H_n → U H_nU^{−1} be a transformation which leaves p(H_n) invariant. Within Random matrix theory the most studied, classical cases are when U is an orthogonal or a unitary matrix. Then the matrices H_n are real symmetric matrices, which gives us the Gaussian Orthogonal Ensemble (GOE), or complex Hermitian matrices, which corresponds to the Gaussian Unitary Ensemble (GUE). All mentioned matrices have independent diagonal and off-diagonal entries, normally distributed with mean zero and variance 1/2.
    Ensemble    H_ij        U
    GOE         real        orthogonal
    GUE         complex     unitary

Table 1.1: Classification of Gaussian ensembles, the Hermitian matrix H_n = (H_ij) and its matrix of eigenvectors U.
One of the important elements of the GOE is the Wigner matrix. Let X be a p × p matrix with independent and identically distributed (i.i.d.) Gaussian entries; then the symmetric matrix Q defined as

    Q = (1/2)(X + X*)

is called a Wigner matrix.
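A Wigner matrix is generated in one line, and its rescaled spectrum illustrates the semicircle law. A minimal sketch (Python with NumPy, illustrative only): with off-diagonal entry variance 1/2, the spectrum of Q/√p approximately fills [−√2, √2].

    import numpy as np

    def wigner(p):
        """Q = (X + X')/2 for X with i.i.d. standard Gaussian entries."""
        X = np.random.randn(p, p)
        return (X + X.T) / 2           # off-diagonal variance 1/2, diagonal variance 1

    p = 1000
    lam = np.linalg.eigvalsh(wigner(p)) / np.sqrt(p)
    print(lam.min(), lam.max())        # approximately -sqrt(2) and sqrt(2)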
In this section we briefly discuss results obtained within Random matrix theory in research regarding the spectral distribution of the matrix Q_n = P_nP_n' defined in (1.1). Let us categorize the literature investigating the asymptotic behaviour of eigenvalues into three categories with respect to the assumptions used.

1) Independent and identically distributed entries of P_n with zero mean

The simplified version of the matrix Q_n, with k = 1 and A_n = 0, under the assumption of independent and identically distributed entries in the p × n random matrix X_1, was first considered by Marčenko and Pastur (1967). For some details see Chapter 4 in this thesis or consult the works by Yin (1986), Silverstein and Bai (1995) and Silverstein (1995).

2) Non-i.i.d. entries of P_n with zero mean

Studies of the asymptotic spectral distribution without the i.i.d. assumption on the entries have been performed by Girko (1990), Girko and von Rosen (1994) and Khorunzhy et al. (1996). In Section 4.2 a generalized version of the result presented by Girko and von Rosen (1994) is discussed in relation to the result of Silverstein and Bai (1995).
3) Independent and identically distributed entries of P_n with non-zero mean

Under restrictive assumptions on A_n in (1.1), the spectral asymptotic behaviour has been explained by the solution of a certain nonlinear system of n + p coupled functional equations, see Girko (2001). The product Q_n = P_nP_n' = (R_n + Y_n)(R_n + Y_n)', where R_n and Y_n are independent random matrices, Y_n has i.i.d. entries and the empirical distribution of R_nR_n' converges to some non-random distribution, has been discussed by Dozier and Silverstein (2007). In that work it has been shown that the distribution of the eigenvalues of Q_n converges almost surely to a deterministic distribution. The Stieltjes transform of that distribution, which by the Stieltjes inversion formula determines the spectral distribution function, is uniquely given by certain functional equations. For more details about the Stieltjes transform, see Section 3.1.
The case when R_n = (R_{ij}^n) is a deterministic and pseudo-diagonal matrix (i.e., R_{ij}^n = 0 for all i ≠ j, but R_n does not have to be square) and Y_n = (Y_{ij}^n) has independent but not i.i.d. entries (Y_{ij}^n = (σ_{ij}(n)/√n) X_{ij}^n, where the X_{ij}^n are i.i.d. and {σ_{ij}(n)} is a bounded real sequence of numbers) has been studied by Hachem et al. (2006), who proved that the empirical spectral distribution of Q_n converges almost surely to a non-random probability measure. In the theorem presented there, under a set of suitable assumptions, a system of equations describing the spectral distribution through its Stieltjes transform is given. A weaker version of these assumptions (given below as A1, A2, A3) for A_n in (1.1) is considered in the later paper by Hachem et al. (2007).
A1 The X_{ij}^n are real, i.i.d., E(X_{ij}^n)² = 1, and there exists ε > 0 such that E|X_{ij}^n|^{4+ε} < ∞;

A2 there exists σ_max such that the family (σ_{ij})_{1≤i≤p; 1≤j≤n}, called the variance profile, satisfies

    sup_{n≥1} max_{i,j} |σ_{ij}(n)| < σ_max;

A3 the supremum over all n ≥ 1 of the maximum of the Euclidean norms of all columns and rows of the matrix A_n is finite.
Under this set of assumptions it is proved by Hachem et al. (2007) that there exists a deterministic equivalent to the empirical Stieltjes transform of the distribution of the eigenvalues of Q_n. More precisely, there exists a deterministic p × p matrix valued function T_n(z), analytic in C \ R+, such that

    lim_{n→∞, p/n→c} ( (1/p) Tr(Q_n − zI_p)^{−1} − (1/p) Tr(T_n(z)) ) = 0    a.s. and for all c > 0.

Note that by n → ∞, p/n → c we indicate that the so-called Kolmogorov condition holds, and that (1/p) Tr(Q_n − zI_p)^{−1} is a version of the Stieltjes transform, see Section 3.1, of the spectral distribution of the matrix Q_n. Tr A denotes the trace of a square matrix, i.e. Tr A = Σ_i A_ii.
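The empirical quantity appearing above is easy to compute for a simulated Q_n. A minimal sketch (Python with NumPy; illustrative, with an arbitrary evaluation point z in the upper half plane):

    import numpy as np

    def empirical_stieltjes(Q, z):
        """(1/p) Tr (Q - z I)^(-1) for a Hermitian Q and complex z with Im z > 0."""
        p = Q.shape[0]
        return np.trace(np.linalg.inv(Q - z * np.eye(p))) / p

    p, n = 300, 600
    X = np.random.randn(p, n)
    Q = X @ X.T / n
    print(empirical_stieltjes(Q, 1.0 + 0.05j))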
Moreover, Example 1.1, given by Hachem et al. (2007), shows that in general the convergence of the empirical spectral density of Q_n can fail, despite the existence of a variance profile in some limit and despite the convergence of the spectral distribution of the non-random A_nA_n'.
Example 1.1: (Hachem et al., 2007)
This example motivates the additional assumptions A1, A2, A3 used by Hachem et al. (2007) to avoid lack of convergence for the spectral measure of Q_n.

Consider the 2n × 2n matrix Y_n of the form

    Y_n = ( W_n  0 )
          ( 0    0 ),

where W_n = (W_{ij}^n)_{i,j} is a square matrix of size n such that W_{ij}^n = X_{ij}/√n, where the X_{ij} are i.i.d. with mean 0 and variance 1. It is easy to see that Y_n is a matrix with a variance profile.

Next, two 2n × 2n deterministic matrices

    B_n = ( I_n  0 )    and    B̄_n = ( 0  0   )
          ( 0    0 )                  ( 0  I_n )

are considered. They are chosen so that both spectral distribution functions F_{2n}^{B_nB_n'} and F_{2n}^{B̄_nB̄_n'} converge to (1/2)δ_0 + (1/2)δ_1 as n → ∞.

It is shown that F_{2n}^{(Y_n+A_n)(Y_n+A_n)'}, where A_n is alternately equal to B_n and to B̄_n, does not admit a limiting distribution, as

    F_{2n}^{(Y_n+A_n)(Y_n+A_n)'} = (1/2)P_cub + (1/2)δ_0  if n is even,
                                   (1/2)P_MP + (1/2)δ_1   if n is odd.

Here P_MP and P_cub denote the distributions of W_nW_n' (called the Marčenko-Pastur distribution) and of (W_n + I_n)(W_n + I_n)', respectively.
The mentioned example points out that to ensure convergence of the spectral measure we must consider assumptions concerning the boundedness of the norms of the rows and columns of the matrix A_n, and the existence of at least four moments of the X_{ij}^n for the Y_i given with a variance profile, for all i = 1, ..., k. The question which arises is whether we are able to distinguish classes of matrices Y_i such that the obtained asymptotic spectral distribution function is given by a closed form expression.
1.3 Outline

This thesis starts with a short Introduction comprising the problem formulation and some basic literature. It is followed by the four main chapters of the work. All used symbols and operators are listed at the end of the work in the Notation. Then the Bibliography and the Appendix are placed. A more detailed outline of the main parts is presented below.

In Chapter 2, one can find an introduction to the most basic ideas and results of Free probability theory, where the main focus is put on the concept of free independence. The chapter includes a proof of asymptotic freeness between particular classes of matrices and introduces the free cumulants used later to define the R-transform. Then, in Chapter 3, the R- and Stieltjes transforms are discussed. Here, properties of the R-transform and its relation to the Stieltjes transform play the key role.
After the introduction to the theoretical tools used in the thesis, Chapter 4 presents the Marčenko-Pastur theorem and compares two theorems given by Girko and von Rosen (1994) with one formulated according to Silverstein and Bai (1995).

In Chapter 5, one is introduced to the ideas and results related to the research question concerning the finding of closed formulas for the asymptotic spectral distribution of Q ∈ Q. An illustrative example in Section 5.1.1 allows us to put up a theorem which generalizes an earlier result by Marčenko and Pastur (1967). In Section 5.2 the results point out classes of matrices Q ∈ Q which lead to a closed formula for the asymptotic spectral distribution of large random matrices. The results are given by stating the asymptotic spectral distribution for all matrices with a particular form of the inverse Stieltjes transform, with respect to composition. The thesis finishes with Chapter 6, where we put up some future research questions.
2 Free probability theory - background information

Free probability theory was established by Voiculescu in the middle of the 80's (Voiculescu, 1985) and, together with the result published in Voiculescu (1991) regarding asymptotic freeness of random matrices, it has established a new branch of theories and tools in Random matrix theory, such as the R-transform. Freeness can also be studied with the use of equivalent combinatorial definitions based on the idea of non-crossing partitions (see Section 2.2). The following chapter refers to both the Random matrix theory and the combinatorial points of view, while introducing the basic definitions and concepts of the theory. These are introduced in the general set-up of a non-commutative probability space and specified in Section 2.1.1 to the algebra of random matrices. The combinatorial approach is introduced mainly in order to present a proof of asymptotic freeness between Gaussian and diagonal matrices.
2.1 Non-commutative space and Freeness

In this section the goal is to present the concept of freeness in a non-commutative space, which is introduced following, among others, the books by Nica and Speicher (2006) and Voiculescu et al. (1992). Some properties of elements of a non-commutative space are also presented. For further reading, see Hiai and Petz (2000).
Definition 2.1 (Non-commutative probability space). A non-commutative probability
space is a pair (A, τ ), where A is a unital algebra over the field of complex numbers C
with identity element 1A and τ is a unital functional such that:
• τ : A → C is linear,
• τ (1A ) = 1.
Definition 2.2. The functional τ is called trace if τ (ab) = τ (ba) for all a, b ∈ A.
Note that the word trace is used in the thesis in two meanings: trace as the name of a functional fulfilling τ(ab) = τ(ba) for all a, b ∈ A, and the trace of a square matrix A = (A_ij), i.e. Tr A = Σ_i A_ii.
Definition 2.3. Let A in Definition 2.1 have a ∗-operation such that ∗ : A → A, (a∗ )∗ =
a and (ab)∗ = b∗ a∗ for all a, b ∈ A and let the functional τ satisfy
τ (a∗ a) ≥ 0
for all a ∈ A.
Then, we call τ positive and (A, τ ) a ∗-probability space.
Remark 2.1. If (A, τ) is a ∗-probability space, then τ(a*) = \overline{τ(a)} for all a ∈ A, where \overline{z} denotes the complex conjugate of z.
Proof of Remark 2.1: Let us take a ∈ A; then, as A is a ∗-algebra over C, we can uniquely write a = x + iy, where x = x* and y = y*.

Let us show that τ(x) ∈ R for all x such that x = x*. As x = x*, we can rewrite

    x = (½(x + 1_A))* (½(x + 1_A)) − (½(x − 1_A))* (½(x − 1_A)) = w*w − q*q.

Of course w, q ∈ A. Then, by linearity and positivity of the functional τ, we get that

    τ(x) = τ(w*w − q*q) ∈ R.

Using that τ(x), τ(y) ∈ R to obtain the equality (∗), we immediately prove the result:

    τ(a*) = τ(x − iy) = τ(x) − iτ(y) =(∗) \overline{τ(x) + iτ(y)} = \overline{τ(x + iy)} = \overline{τ(a)}.
Subsection 2.1.1 gives an example of a ∗-probability space with a positive tracial functional, namely the space (RM_p(C), τ) of p × p random matrices with entries being complex random variables with all moments finite, equipped with the functional τ(X) := E(Tr_p(X)) for all X ∈ RM_p(C), where Tr_p = (1/p)Tr is a weighted trace. Keeping for now the general set-up, we define random variables, moments and distributions for elements of a non-commutative space.
Definition 2.4 (Freeness of algebras). The subalgebras A_1, ..., A_m ⊂ A, where (A, τ) is a non-commutative probability space, are free if and only if for any (a_1, ..., a_n) with a_j ∈ A_{i_j},

    τ(a_j) = 0 for all j and i_j ≠ i_{j+1} for all j ∈ {1, 2, ..., n−1}  ⇒  τ(a_1 ··· a_n) = 0.

Note that i_j ≠ i_{j+1} for all j ∈ {1, 2, ..., n−1} means that all elements a_j and a_{j+1} with neighbouring indices belong to different subalgebras.

Denote the free algebra with generators a_1, ..., a_m by C⟨a_1, ..., a_m⟩, i.e. all polynomials in m non-commutative indeterminates.
Definition 2.5.
a) The element a ∈ A is called a non-commutative random variable and τ(a^j) is its jth moment, for all j ∈ N; a^j is well defined due to the fact that the algebra is closed under multiplication of elements.

b) Let a ∈ A, where (A, τ) denotes a ∗-probability space. The linear functional µ on the unital algebra C⟨X, X*⟩, which is freely generated by two non-commutative indeterminates X and X*, defined as

    µ : C⟨X, X*⟩ → C,    µ(X^{ψ_1} X^{ψ_2} ··· X^{ψ_k}) = τ(a^{ψ_1} a^{ψ_2} ··· a^{ψ_k}),

for all k ∈ N and all ψ_1, ψ_2, ..., ψ_k ∈ {1, ∗}, is called the ∗-distribution of a.

c) Let a ∈ A be normal, i.e. aa* = a*a, where (A, τ) denotes a ∗-probability space. If there exists a compactly supported probability measure µ on C such that

    ∫ z^n z̄^k dµ(z) = τ(a^n (a*)^k)

for all n, k ∈ N, then µ is called the ∗-distribution of a and is uniquely defined.

Assume that the support of the ∗-distribution of a is real and compact. Then the real probability measure µ given by Definition 2.5 is related to the moments by

    τ(a^k) = ∫_R x^k dµ(x)

and is called the distribution of a. The distribution of a ∈ A with compact support is characterized by its moments τ(a), τ(a²), ....
Definition 2.6 (Freeness). The variables (a_1, a_2, ..., a_m) and (b_1, ..., b_n) are said to be free if and only if for any (P_i, Q_i)_{1≤i≤p} ∈ (C⟨a_1, ..., a_m⟩ × C⟨b_1, ..., b_n⟩)^p such that

    τ(P_i(a_1, ..., a_m)) = 0,    τ(Q_i(b_1, ..., b_n)) = 0    for all i = 1, ..., p,

the following equation holds:

    τ( ∏_{1≤i≤p} P_i(a_1, ..., a_m) Q_i(b_1, ..., b_n) ) = 0.
To be able to show in Lemma 2.2, given below, that freeness does not go together with classical independence, we first state and prove Lemma 2.1.
Lemma 2.1
Let a and b be free elements of a non-commutative probability space (A, τ). Then we have:

    τ(ab) = τ(a)τ(b),    (2.1)
    τ(aba) = τ(a²)τ(b),
    τ(abab) = τ(a²)τ(b)² + τ(a)²τ(b²) − τ(a)²τ(b)².    (2.2)
Proof: For free a and b we have

    τ( (a − τ(a)1_A)(b − τ(b)1_A) ) = 0,
    τ( ab − aτ(b) − τ(a)b + τ(a)τ(b)1_A ) = 0,
    τ(ab) = τ(a)τ(b).

Then also

    τ( (a − τ(a)1_A)(b − τ(b)1_A)(a − τ(a)1_A) ) = 0,
    τ( (ab − aτ(b) − τ(a)b + τ(a)τ(b)1_A)(a − τ(a)1_A) ) = 0,
    τ( aba − aτ(b)a − τ(a)ba + τ(a)τ(b)a − abτ(a) + aτ(b)τ(a) + τ(a)bτ(a) − τ(a)τ(b)τ(a)1_A ) = 0,
    τ(aba) − τ(a²)τ(b) − τ(a)τ(ba) + τ(a)τ(b)τ(a) − τ(ab)τ(a) + τ(a)τ(b)τ(a) + τ(a)τ(b)τ(a) − τ(a)τ(b)τ(a) = 0,
    τ(aba) = τ(a²)τ(b).

Similar calculations show that

    τ(abab) = τ(a²)τ(b)² + τ(a)²τ(b²) − τ(a)²τ(b)².
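For matrices, free independence appears only asymptotically (see Section 2.1.2), but relation (2.2) can already be checked numerically on large independent Wigner-type matrices. A minimal sketch (Python with NumPy; illustrative only, with the functional τ approximated by the normalized trace of a single large sample):

    import numpy as np

    def tau(M):
        """Normalized trace Tr_p = (1/p) Tr, approximating the functional tau."""
        return np.trace(M).real / M.shape[0]

    p = 1000
    G1, G2 = np.random.randn(p, p), np.random.randn(p, p)
    S1 = (G1 + G1.T) / np.sqrt(2 * p)    # approximately semicircular elements,
    S2 = (G2 + G2.T) / np.sqrt(2 * p)    # independent, hence asymptotically free
    A = S1 + np.eye(p)                   # shifted so that tau(A) != 0
    B = S2 @ S2                          # a polynomial in S2

    lhs = tau(A @ B @ A @ B)
    rhs = tau(A @ A) * tau(B)**2 + tau(A)**2 * tau(B @ B) - tau(A)**2 * tau(B)**2
    print(lhs, rhs)                      # nearly equal for large p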
One can prove that freeness and commutativity cannot take place simultaneously, as stated in the next lemma.

Lemma 2.2
Let a and b be non-trivial elements of a ∗-algebra A, equipped with the functional τ, such that a and b commute, i.e. ab = ba. Then a and b are not free.
Proof by contradiction: Take two non-trivial elements a and b of the ∗-algebra A such that they are both free and commute. Then, by ab = ba and (2.1),

    τ(abab) = τ(a²b²) = τ(a²)τ(b²),

and by (2.2),

    τ(abab) = τ(a²)τ(b)² + τ(a)²τ(b²) − τ(a)²τ(b)².

These two equalities give

    τ(a²)τ(b)² + τ(a)²τ(b²) − τ(a)²τ(b)² − τ(a²)τ(b²) = 0.    (2.3)
Then, as a and b are free,

    τ( (a − τ(a)1_A)² ) τ( (b − τ(b)1_A)² )
      = τ( a² − 2aτ(a) + τ(a)²1_A ) τ( b² − 2bτ(b) + τ(b)²1_A )
      = ( τ(a²) − 2τ(a)² + τ(a)² )( τ(b²) − 2τ(b)² + τ(b)² )
      = ( τ(a²) − τ(a)² )( τ(b²) − τ(b)² )
      = τ(a²)τ(b²) − τ(a²)τ(b)² − τ(a)²τ(b²) + τ(a)²τ(b)²
      = 0  by (2.3).

Then either τ((a − τ(a)1_A)²) = 0 or τ((b − τ(b)1_A)²) = 0. As long as the functional τ is faithful, i.e. τ(a*a) = 0 ⇒ a = 0, the obtained equality implies that a = τ(a)1_A or b = τ(b)1_A. We get that at least one of the elements a or b is trivial, which contradicts the assumption that the equations hold for any non-trivial elements; this proves the statement.
2.1.1 Space (RM_p(C), τ)

In this subsection we consider a particular example of a non-commutative space, (RM_p(C), τ). Let (Ω, F, P) be a probability space; then RM_p(C) denotes the set of all p × p random matrices with entries belonging to ∩_{p=1,2,...} L_p(Ω, P), i.e. the entries are complex random variables with finite moments of any order. Defined in this way, RM_p(C) is a ∗-algebra with the classical matrix product as multiplication and the conjugate transpose as ∗-operation. The ∗-algebra is equipped with the tracial functional τ defined as the expectation of the normalized trace Tr_p in the following way:

    τ(X) := E(Tr_p(X)) = E( (1/p) Tr(X) ) = E( (1/p) Σ_{i=1}^p X_ii ) = (1/p) Σ_{i=1}^p E X_ii,    (2.4)

where X = (X_ij)_{i,j=1}^p ∈ RM_p(C).
The form of the chosen functional τ is determined by the fact that the distribution of the eigenvalues is especially interesting to us. Notice that for any normal matrix X ∈ (RM_p(C), τ) the eigenvalue distribution µ_X is the ∗-distribution with respect to the given functional τ defined in Definition 2.5.

First, consider a matrix X with eigenvalues denoted by λ_1, ..., λ_p. Then

    Tr_p(X^k (X*)^n) = (1/p) Σ_{i=1}^p λ_i^k λ̄_i^n = ∫_C z^k z̄^n dµ_X(z)

for all k, n ∈ N, where µ_X is the spectral probability measure corresponding to the normalized spectral distribution function defined in Definition 1.1. For

    µ_X(x) = ∫_Ω (1/p) Σ_{k=1}^p δ_{λ_k(ω)} dP(ω),

where δ_{λ_k(ω)} stands for the Dirac delta, we obtain a generalization of the above statement for X being a normal random matrix, so

    τ(X^k (X*)^n) = ∫_Ω (1/p) Σ_{i=1}^p λ_i^k(ω) λ̄_i^n(ω) dP(ω) = ∫_C z^k z̄^n dµ_X(z).
Hence, the trace τ defined in (2.4) is a ∗-distribution in the sense given by Definition 2.5c.

Remark 2.2. In general, for any random matrix X the measure µ_X does not have compact support.
2.1.2 Asymptotic Freeness

The concept of asymptotic freeness was established by Voiculescu (1991), where Gaussian random matrices together with constant unitary matrices were discussed.

Theorem 2.1 (Voiculescu's Asymptotic Freeness)
Let X_{p,1}, X_{p,2}, ... be independent (in the classical sense) p × p GUE matrices. Then there exists a functional φ on some non-commutative polynomial algebra C⟨X_1, X_2, ...⟩ such that

• (X_{p,1}, X_{p,2}, ...) has a limit distribution φ as p → ∞, i.e.

    φ(X_{i_1} X_{i_2} ··· X_{i_k}) = lim_{p→∞} τ_p(X_{p,i_1} X_{p,i_2} ··· X_{p,i_k})

for all i_j ∈ N, j ∈ N, where τ_p(X) = E(Tr_p X);

• X_1, X_2, ... are freely independent with respect to φ, see Definition 2.6.
The mentioned work was followed by Dykema (1993), who replaced the Gaussian entries of the matrices with more general non-Gaussian random variables. Furthermore, the constant diagonal matrices were generalized to constant block diagonal matrices, such that the block size remains constant. In general, random matrices of size p × p with independent entries tend to be asymptotically free as p → ∞, under certain conditions.

To give some additional examples, two unitary p × p matrices are asymptotically free, and two i.i.d. p × p Gaussian distributed random matrices are asymptotically free as p → ∞. For future use in Chapter 5 we mention here the asymptotic freeness between i.i.d. Wigner matrices; this fact has been proven by Dykema (1993). Asymptotic free independence holds also for Gaussian and Wishart random matrices and for Wigner and Wishart matrices, see Capitaine and Donati-Martin (2007).

Following Müller (2002), we want to point out that there exist matrices which are dependent in the classical sense and yet asymptotically free, as well as matrices with entries independent in the classical sense which are not asymptotically free.
Remark 2.3. Let D1 and D2 be independent diagonal random matrices and let matrices
H1 and H2 be Haar distributed, independent of each other and of the diagonal matrices
D1 , D2 . Then,
• D1 and D2 are not asymptotically free;
• but H1 D1 H1∗ and H2 D1 H2∗ are asymptotically free.
2.2 Combinatorial interpretation of freeness. Cumulants

The combinatorial interpretation of freeness, described using free cumulants (see Definition 2.9 given below), was established by Speicher (1994) and developed by Nica and Speicher (2006). The two main purposes of this section are to introduce the idea of the free cumulant and to present the steps of the proof of asymptotic free independence between some particular classes of matrices. Free cumulants play an important role in Chapter 3, where the R-transform is defined.
Definition 2.7 (Non-crossing partition). Let V = {V_1, ..., V_p} be a partition of the set {1, ..., r}, i.e. for all i = 1, ..., p the V_i are ordered and disjoint sets and ∪_{i=1}^p V_i = {1, ..., r}. The partition V is called non-crossing if for all i, j = 1, ..., p with V_i = (v_1, ..., v_n) (such that v_1 < ... < v_n) and V_j = (w_1, ..., w_m) (such that w_1 < ... < w_m) we have

    w_k < v_1 < w_{k+1}  ⇔  w_k < v_n < w_{k+1}    (k = 1, ..., m − 1).

The definition of a non-crossing partition presented here, following Speicher, can be given in an equivalent recursive form.

Definition 2.8 (Non-crossing partition. Recursive definition). The partition V = {V_1, ..., V_p} is non-crossing if at least one of the V_i is a segment of (1, ..., r), i.e. it has the form V_i = (k, k+1, ..., k+m), and {V_1, ..., V_{i−1}, V_{i+1}, ..., V_p} is a non-crossing partition of {1, ..., r} \ V_i.
Let the set of all non-crossing partitions over {1, . . . , r} be denoted by N C(r).
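The set NC(r) can be generated mechanically, which is useful for checking moment-cumulant computations by hand. The sketch below (Python; our own brute-force crossing test rather than the recursion of Definition 2.8, for illustration only) enumerates all partitions and filters out the crossing ones; the count |NC(r)| equals the Catalan number C_r.

    from itertools import combinations

    def all_partitions(s):
        """All set partitions of the list s, with blocks as tuples."""
        if not s:
            yield []
            return
        first, rest = s[0], s[1:]
        for k in range(len(rest) + 1):
            for block_rest in combinations(rest, k):
                remaining = [x for x in rest if x not in block_rest]
                for partition in all_partitions(remaining):
                    yield [(first,) + block_rest] + partition

    def crosses(V, W):
        """Blocks V, W cross if a < b < c < d with a, c in V and b, d in W."""
        return any(a < b < c < d
                   for a, c in combinations(sorted(V), 2)
                   for b, d in combinations(sorted(W), 2))

    def noncrossing_partitions(r):
        return [P for P in all_partitions(list(range(1, r + 1)))
                if not any(crosses(V, W) or crosses(W, V)
                           for V, W in combinations(P, 2))]

    for r in range(1, 6):
        print(r, len(noncrossing_partitions(r)))   # 1, 2, 5, 14, 42 (Catalan numbers)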
Definition 2.9 (Cumulant). Let (A, τ) be a non-commutative probability space. Then we define the cumulant functionals k_i : A^i → C, for all i ∈ N, by the moment-cumulant relation

    k_1(a) = τ(a),    τ(a_1 ··· a_k) = Σ_{π∈NC(k)} k_π[a_1, ..., a_k],

where the sum is taken over all non-crossing partitions of the set {a_1, a_2, ..., a_k} and

    k_π[a_1, ..., a_k] = ∏_{i=1}^r k_{V(i)}[a_1, ..., a_k],    π = {V(1), ..., V(r)},
    k_V[a_1, ..., a_k] = k_s(a_{v(1)}, ..., a_{v(s)}),    V = (v(1), ..., v(s)).

For an element X of a non-commutative algebra (A, τ) we define the cumulants of X as k_n^X = k_n(X, ..., X).
Note that square brackets are used to denote the cumulants with respect to partitions, while parentheses are used for the cumulants of some set of variables. To illustrate the difference, let us consider a two element set {a_1, a_2}, such that a_1, a_2 belong to a non-commutative probability space equipped with a tracial functional τ. Then k_1(a_i) = τ(a_i) for all i = 1, 2. The only non-crossing partitions of the two element set are the segment {a_1, a_2} and {{a_1}, {a_2}}, so

    τ(a_1a_2) = Σ_{π∈NC(2)} k_π[a_1, a_2] = k_2(a_1, a_2) + k_1(a_1)k_1(a_2) = k_2(a_1, a_2) + τ(a_1)τ(a_2).

Hence k_2(a_1, a_2) = τ(a_1a_2) − τ(a_1)τ(a_2) is the cumulant of the 2-element set {a_1, a_2}, while k_π[a_1, a_2] denotes the cumulant of the partition π.
Lemma 2.3
The cumulants given by Definition 2.9 are well defined.

Proof: Following the definition of a cumulant,

    τ(a_1 ··· a_n) = Σ_{π∈NC(n)} k_π[a_1, ..., a_n] = k_n(a_1, ..., a_n) + Σ_{π∈NC(n), π≠1_n} k_π[a_1, ..., a_n],

where π ≠ 1_n means that we consider partitions different from the n-element segment, i.e. π ≠ {1, 2, ..., n}. Now the lemma follows by induction.
To show the linearity of cumulants for the sum of free random variables, we need to state a theorem about vanishing mixed cumulants.

Theorem 2.2
Let a_1, a_2, ..., a_n ∈ A. Then the elements a_1, a_2, ..., a_n are freely independent if and only if all mixed cumulants vanish, i.e. for n ≥ 2 and any choice of i_1, ..., i_k ∈ {1, ..., n}: if there exist j, k such that i_j ≠ i_k, then

    k_n(a_{i_1}, ..., a_{i_n}) = 0.

Proof: The proof can be found in Nica and Speicher (2006).

Theorem 2.3
Let a, b ∈ A be free. Then

    k_n^{a+b} = k_n^a + k_n^b    for n ≥ 1.

Proof: The proof of the theorem follows from the fact that for free random variables the mixed cumulants are equal to zero, see Theorem 2.2:

    k_n^{a+b} := k_n(a + b, a + b, ..., a + b) = k_n(a, a, ..., a) + k_n(b, b, ..., b).
Definition 2.10 (Free additive convolution). Let a and b be elements of the non-commutative probability space (A, τ) with laws µ_a and µ_b, respectively. Then, if a and b are free and µ_a and µ_b have compact support, the distribution of a + b is denoted µ_a ⊞ µ_b, where ⊞ is called the free additive convolution.

The measure µ_a ⊞ µ_b is determined by the tracial functional φ ⋆ ψ, called the free product, on the free algebra generated by a and b, i.e. C⟨a, b⟩ = C⟨a⟩ ⋆ C⟨b⟩, by

    φ ⋆ ψ((a + b)^k) = ∫ x^k d(µ_a ⊞ µ_b)(x).

The distribution µ_a ⊞ µ_b given in the definition does not depend on the elements a and b themselves, but only on their distribution functions µ_a and µ_b. Moreover, Definition 2.10 can be extended to arbitrary probability measures on R, see Nica and Speicher (2006).
2.2.1 Proof of asymptotic freeness between diagonal and Gaussian matrices

Section 2.1.2 indicated asymptotic freeness between various classes of random matrices. Here the goal is to actually prove asymptotic freeness in one such case. Moreover, the proof illustrates the use of the combinatorial approach introduced in Section 2.2.
Theorem 2.4 (Speicher, 1993)
Let A = (A_n)_{n∈N} and B = (B_n)_{n∈N} be two sequences of self-adjoint (i.e. A_n = A_n*) n × n matrices A_n and B_n such that

    µ_{A_n} → µ_A and µ_{B_n} → µ_B weakly

for some spectral probability measures µ_A and µ_B on R. If µ_A and µ_B have compact support, then

    µ_{A_n + U_nB_nU_n*} → µ_A ⊞ µ_B weakly

for almost all random sequences of unitary matrices U = (U_n)_{n∈N}.
Steps of the proof: The theorem has been proved by Speicher in the following steps.

1) Firstly, the asymptotic freeness between the n × n Gaussian random matrices X (entries are independent complex valued random variables with mean zero and variance 1/n) and non-random diagonal matrices is shown. The free independence between Gaussian and non-random diagonal matrices is stated as Lemma 2.5 and proved below.

2) Then, the first step of the proof implies asymptotic freeness between polynomials in X and diagonal matrices.

3) Let U := X(X*X)^{−1/2}. Notice that

    UU* = X(X*X)^{−1/2} (X(X*X)^{−1/2})* = X(X*X)^{−1/2} (X*X)^{−1/2} X* = X(X*X)^{−1} X* = I

and similarly U*U = I. Hence U is a unitary matrix and X → X(X*X)^{−1/2} determines a measurable mapping from the space of Gaussian matrices into the space of unitary matrices, defined almost everywhere. Then the image measure under this mapping of the measure on the space of Gaussian matrices is a measure on the space of unitary matrices. This implies that it is enough to prove the statement for the matrices given by X(X*X)^{−1/2}, which can then be approximated by the polynomials Xg(X*X), where g is a polynomial with real coefficients chosen so that it is a good approximation of x^{−1/2}. This and 2) imply asymptotic freeness between unitary and diagonal matrices.
4) Finally, notice that both matrices A and B can be decomposed as A = W A^D W* and B = V B^D V*, respectively, where A^D, B^D are diagonal matrices. Hence

    µ_{A_n + U_nB_nU_n*} = µ_{W_nA_n^D W_n* + U_nV_nB_n^D V_n*U_n*} = µ_{A_n^D + W_n*U_nV_n B_n^D V_n*U_n*W_n} = µ_{A_n^D + (W_n*U_nV_n) B_n^D (W_n*U_nV_n)*};

we have obtained equivalence of measures, and we can simply prove the statement of the theorem for the diagonal matrices A^D and B^D, as the mapping U_n → W_n*U_nV_n preserves the measure. Then, by the third step, one obtains the asymptotic freeness between {A^D, B^D} and {U, U*}. Hence also A^D and U B^D U* are asymptotically free, which finishes the proof.
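Theorem 2.4 can also be illustrated by simulation. In the minimal sketch below (Python with NumPy; the helper haar_unitary and all parameter choices are ours, for illustration only), a Haar unitary is sampled via the QR decomposition of a complex Ginibre matrix, and the spectrum of A_n + U_nB_nU_n* is computed for two projection-type diagonal matrices; its histogram approximates the free additive convolution µ_A ⊞ µ_B.

    import numpy as np

    def haar_unitary(n):
        """Haar-distributed unitary from the QR decomposition of a Ginibre matrix."""
        Z = (np.random.randn(n, n) + 1j * np.random.randn(n, n)) / np.sqrt(2)
        Q, R = np.linalg.qr(Z)
        d = np.diag(R)
        return Q * (d / np.abs(d))     # fix column phases to get exact Haar measure

    n = 1000
    A = np.diag(np.random.choice([0.0, 1.0], size=n))   # spectral law near (delta_0 + delta_1)/2
    B = np.diag(np.random.choice([0.0, 1.0], size=n))
    U = haar_unitary(n)
    lam = np.linalg.eigvalsh(A + U @ B @ U.conj().T)
    hist, _ = np.histogram(lam, bins=20, range=(0, 2), density=True)
    print(hist)                        # approximates the density of mu_A boxplus mu_B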
Lemma 2.4 (Speicher, 1993)
For almost all sequences of Gaussian square matrices X = (X_n)_{n∈N} the following holds. Let P be a polynomial in two non-commuting indeterminates. Then

    lim_{n→∞} Tr_n(P(X_n, X_n*)) = φ(P(X, X*)),

where φ is a positive linear functional such that φ(1) = 1, φ(XX*) = φ(X*X) = 1 and φ(XX) = φ(X*X*) = 0.
Proof: The statement of the lemma follows from Lemma 2.5, which is proven below.

To show the use of the combinatorial interpretation of asymptotic freeness within Free probability theory, the proof of Lemma 2.5 is given. It is, simultaneously, the proof of the first step in Theorem 2.4.

Lemma 2.5 (Speicher, 1993)
For almost all sequences of Gaussian square matrices X = (X_n)_{n∈N} the following holds. Let D^1 = (D_n^1)_{n∈N}, ..., D^s = (D_n^s)_{n∈N} be s sequences of diagonal n × n matrices D_n^j such that

    lim_{n→∞} Tr_n(P(D_n^1, ..., D_n^s)) = ρ(P(D^1, ..., D^s))

for all polynomials P in s indeterminates, where ρ is a state on C⟨D^1, ..., D^s⟩, i.e. ρ is a positive linear functional and ρ(1) = 1. Then

    lim_{n→∞} Tr_n(P(X_n, D_n^1, ..., D_n^s, X_n*)) = ρ ⋆ φ(P(X, D^1, ..., D^s, X*))

for all polynomials P in (s + 2) non-commutative indeterminates, where ρ ⋆ φ is called the free product of ρ and φ and is a functional on the space C⟨X, D^1, ..., D^s, X*⟩. The functional φ is defined in the previous lemma.
Proof: As the proof is rather complex, we present it stepwise.

1. Notation. We consider monomials X_n^{i(1)} ··· X_n^{i(r)}, where

    X_n^0 := X_n,    X_n^j := D_n^j (1 ≤ j ≤ s),    X_n^{s+1} := X_n*,

for all choices of r ∈ N and i(k) ∈ {0, 1, ..., s+1}. Define

    S_n := Tr_n(X_n^{i(1)} ··· X_n^{i(r)}) = (1/n) Σ_{k_1=1}^n ··· Σ_{k_r=1}^n X_n^{i(1)}(k_1, k_2) X_n^{i(2)}(k_2, k_3) ··· X_n^{i(r)}(k_r, k_1).
2. Show the convergence of E[S_n] to ρ ⋆ φ(V).

2a) Define valid partitions and denote the set of all valid partitions of {1, ..., r} by P_v(1, ..., r). The non-crossing partition V = (V_1, ..., V_t) of {1, ..., r} is called valid if one of the following holds:

- t = 1 and i(v) ∈ {1, ..., s} for all v = 1, 2, ..., r. Then

    ρ ⋆ φ(V) = ρ(D^{i(1)} ··· D^{i(r)}).

- V contains a segment V_k = (m, m+1, ..., m+l) such that i(m), i(m+l) ∈ {0, s+1}, i(m) ≠ i(m+l) and i(m+1), ..., i(m+l−1) ∈ {1, ..., s}, and V \ {V_k} is a valid partition of {1, 2, ..., r} \ V_k. In other words, the segment V_k has to start and finish with X and X*, or with X* and X; the inner elements of the segment should consist only of the diagonal matrices, and the partition without this segment should remain valid. We define

    ρ ⋆ φ(V_k) = φ(X^{i(m)} X^{i(m+l)}) ρ(D^{i(m+1)} ··· D^{i(m+l−1)})

and

    ρ ⋆ φ(V) = ρ ⋆ φ(V_k) ρ ⋆ φ(V \ {V_k}).

2b) Consider the expectation

    E[S_n] = (1/n) Σ_{k_1=1}^n ··· Σ_{k_r=1}^n E[ X_n^{i(1)}(k_1, k_2) X_n^{i(2)}(k_2, k_3) ··· X_n^{i(r)}(k_r, k_1) ].

The entries of the matrices X_n are independent with mean zero and X_n*(l, k) = X̄_n(k, l). Hence E[X_n^{i(1)}(k_1, k_2) ··· X_n^{i(r)}(k_r, k_1)] is different from zero only if each matrix element of X_n occurs at least twice in the product.

2c) Now denote the set of all positions of the X_n and X_n* by

    I := {k : i(k) ∈ {0, s+1}}.

Then a pair (k_j, k_{j+1}) is called a free step if i(j) ∈ {0, s+1} and k_{j+1} has not appeared before, and a repetitive step if i(j) ∈ {0, s+1} and k_{j+1} has appeared before. We are interested only in tuples (k_1, ..., k_r), k_{r+1} = k_1, with the numbers of free steps and repetitive steps both equal to #I/2. We call such a tuple valid. It has been shown by Wigner (1955) that the contribution of non-valid tuples is at most o(1/n), so it vanishes in the limit.

2d) We have a one-to-one correspondence between valid partitions and valid tuples (in a recursive way). Hence

    lim_{n→∞} E[S_n] = Σ_{V∈P_v(1,...,r)} ρ ⋆ φ(V).
3. Almost sure convergence of S_n.

3a) We want to show that Var[S_n] ≤ c/n². Similarly to the case of the expected value, we rewrite Var[S_n] as a double sum over all tuples (k_1, ..., k_r) and (l_1, ..., l_r). Only those tuples contribute to the sum for which at least one step of the k-tuple agrees with one step of the l-tuple (in the opposite case we have independence and the terms vanish in the sum). Then one considers such tuples that i(m) = i(j), hence (k_m, k_{m+1}) = (l_j, l_{j+1}), and i(m) ≠ i(j), hence (k_m, k_{m+1}) = (l_{j+1}, l_j). In both cases we obtain invalid tuples. Hence the contribution of those cases is of order 1/n², as for E[S_n] we have had the factor 1/n.

3b) We show the sufficiency of point 3a) for showing almost sure convergence:

    E[ Σ_{n=1}^∞ (S_n − E[S_n])² ] = Σ_{n=1}^∞ Var[S_n] < ∞  ⇒  Σ_{n=1}^∞ (S_n − E[S_n])² < ∞ a.s.

Hence

    lim_{n→∞} (S_n − E[S_n]) = 0 a.s.
The combinatorial approach is often used in Free probability theory in order to prove asymptotic freeness between classes of matrices. Here, the freeness as p → ∞ between p × p diagonal and Gaussian matrices has been proven. The concept of free cumulants, introduced in connection with the idea of non-crossing partitions, will allow us to define one of the main tools for applying free additive convolution, namely the R-transform.
3 Stieltjes and R-transform

The Stieltjes transform is commonly used in research regarding the spectral measure of random matrices. It appears, among others, in formulations and proofs of a number of results published within Random matrix theory, e.g. Marčenko and Pastur (1967), Girko and von Rosen (1994), Silverstein and Bai (1995) and Hachem et al. (2007). Thanks to its good algebraic properties it simplifies the calculations carried out in order to obtain the limit of the spectral distributions of large dimensional random matrices.

The second section of this chapter presents the R-transform, introduced within Free probability theory and strongly related to the Stieltjes transform. The transformation provides a way to obtain an analytical form of the asymptotic distribution of eigenvalues for sums of certain random matrices.

Both the Stieltjes and R-transform and their properties are discussed to different extents by Nica and Speicher (2006), Couillet and Debbah (2011) and Speicher (2009), as well as within the lectures by Hiai and Petz (2000) and Krishnapur (2011). A version of the Stieltjes transform, the Cauchy transform, is also widely described in Cima et al. (2006).
3.1 Stieltjes transform

The literature studies show that the Stieltjes transform defined in this section, or a version of it, is often described using the terms Cauchy transform (e.g. Cima et al., 2006, Hiai and Petz, 2000, Nica and Speicher, 2006) or Stieltjes-Cauchy transform (e.g. Hasebe, 2012, Bożejko and Demni, 2009). In this thesis, in using the Stieltjes transform terminology we follow the work by Couillet and Debbah (2011).

Definition 3.1 (Stieltjes transform). Let µ be a non-negative, finite Borel measure on R. Then we define the Stieltjes transform of µ by

    G(z) = ∫_R 1/(z − x) dµ(x)

for all z ∈ {z : z ∈ C, ℑ(z) > 0}, where ℑ(z) denotes the imaginary part of the complex number z.
Remark 3.1. Note that for z ∈ {z : z ∈ C, ℑ(z) > 0} the Stieltjes transform is well defined and G(z) is analytic for all z ∈ {z : z ∈ C, ℑ(z) > 0}.

Proof of Remark 3.1: The fact that the Stieltjes transform is well defined follows from the fact that on this domain the function 1/(z − x) is bounded.

One can show that G(z) is analytic for all z ∈ {z : z ∈ C, ℑ(z) > 0} using Morera's theorem (see the work by Greene and Krantz, 2006). Then it is enough to show that the contour integral ∮_Γ G(z) dz = 0 for all closed contours Γ in {z : z ∈ C, ℑ(z) > 0}. We are allowed to interchange the integrals and obtain

    ∫_R ∮_Γ 1/(z − x) dz dµ(x) = ∫_R 0 dµ(x) = 0,

where the inner integral vanishes by Cauchy's integral theorem for any closed contour Γ, as 1/(z − x) is analytic.
Definition 3.1 above can be extended to all z ∈ C \ support(µ). Nevertheless, as we require G(z) to be analytic, our considerations are going to be restricted to the upper half plane of C as domain.

Now we introduce the Stieltjes inversion formula, which allows us to use the knowledge about the form of the transform G to derive the measure µ.
Theorem 3.1 (Stieltjes inversion formula)
For any open interval I = (a, b), such that neither a nor b are atoms of the probability measure µ, the inversion formula

    µ(I) = −(1/π) lim_{y→0} ∫_I ℑG(x + iy) dx

holds. Here convergence is with respect to the weak topology on the space of all real probability measures.
Proof: We have

    −(1/π) lim_{y→0} ∫_I ℑG(x + iy) dx
      = −(1/π) lim_{y→0} ∫_I ∫_R ℑ( 1/(x + iy − t) ) dµ(t) dx
      = (1/π) lim_{y→0} ∫_R ∫_a^b y/((t − x)² + y²) dx dµ(t)    (∗)
      = (1/π) lim_{y→0} ∫_R ( arctan((b − t)/y) − arctan((a − t)/y) ) dµ(t)
      = (1/π) ∫_R lim_{y→0} ( arctan((b − t)/y) − arctan((a − t)/y) ) dµ(t).    (∗∗)

The order of integration can be interchanged in (∗) due to the continuity of the function y/((t − x)² + y²). The interchange of the order of integration and taking the limit in (∗∗) follows by the Bounded convergence theorem, as µ(R) ≤ 1 < ∞ and there exists M such that

    | arctan((b − t)/y) − arctan((a − t)/y) | < M for all y and all t,

so the integrand is a uniformly bounded real-valued measurable function for all y.

Then, using that lim_{y→0} arctan(T/y) = (π/2) sgn(T) for T ∈ R, we get

    arctan((b − t)/y) − arctan((a − t)/y) → 0 if t < a or t > b,  and → π if t ∈ (a, b),

which by the Dominated convergence theorem completes the proof.
Remark 3.2. More generally, for any probability measure µ on R and any a < b,

    µ((a, b)) + (1/2)µ({a}) + (1/2)µ({b}) = −(1/π) lim_{y→0} ∫_I ℑG(x + iy) dx.

For further reading, see Krishnapur (2011).
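Numerically, the inversion formula can be applied with a small fixed y > 0. The sketch below (Python with NumPy; illustrative only) recovers the semicircle density from its Stieltjes transform; the product of square roots selects the branch that is analytic off [−2, 2] and behaves like 1/z at infinity.

    import numpy as np

    def density_from_G(G, x, y=1e-4):
        """Stieltjes inversion: f(x) approx -(1/pi) Im G(x + iy) for small y > 0."""
        return -np.imag(G(x + 1j * y)) / np.pi

    # Stieltjes transform of the semicircle law on [-2, 2]
    G_sc = lambda z: (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

    x = np.linspace(-1.5, 1.5, 4)
    print(density_from_G(G_sc, x))
    print(np.sqrt(4 - x**2) / (2 * np.pi))   # exact density, for comparison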
Theorem 3.2
Let µ_n be a sequence of probability measures on R and let G_{µ_n} denote the Stieltjes transform of µ_n. Then:

a) if µ_n → µ weakly, where µ is a measure on R, then G_{µ_n}(z) → G_µ(z) pointwise for any z ∈ {z : z ∈ C, ℑ(z) > 0};

b) if G_{µ_n}(z) → G(z) pointwise for all z ∈ {z : z ∈ C, ℑ(z) > 0}, then there exists a unique non-negative and finite measure µ such that G = G_µ and µ_n → µ weakly.

Proof:
a) We know that µ_n → µ; then for all bounded and continuous functions f the following holds:

    ∫ f dµ_n → ∫ f dµ.

As f(x) = 1/(z − x) is both bounded and continuous on R for every fixed z ∈ {z : z ∈ C, ℑ(z) > 0}, we conclude that

    G_{µ_n}(z) = ∫_R 1/(z − x) dµ_n(x) → ∫_R 1/(z − x) dµ(x) = G_µ(z)

pointwise.
b) Now assume that G_{µ_n}(z) → G(z) pointwise. As each µ_n is a probability measure (so the sequence is bounded and consists of positive measures with sup_n µ_n(R) < ∞), by Helly's selection principle µ_n has a weakly convergent subsequence. Denote that subsequence by µ_{n_k} and its limit by µ.

As f(x) = 1/(z − x) is bounded, continuous on R and f(x) → 0 as x → ±∞, by part a) G_{µ_{n_k}}(z) → G_µ(z) pointwise for all z ∈ {z : z ∈ C, ℑ(z) > 0}. Then G_µ = G, which, by the fact that the inverse Stieltjes transform is unique, means that for all converging subsequences µ_{n_k} we obtain the same limit µ. Hence µ_n → µ.
Last in this section we state a lemma, not restricted to the space of matrices, which relates the Stieltjes transform to the moment generating function. It will be used later for proving a relation between the Stieltjes and R-transform.

Lemma 3.1
Let µ be a probability measure on R and {m_k}_{k=1,...} a sequence of moments (i.e. m_k(µ) = ∫_R t^k dµ(t)). Then the moment generating function M^µ(z) = Σ_{k=0}^∞ m_k z^k converges to an analytic function in some neighborhood of 0. For sufficiently large |z|,

    G(z) = (1/z) M^µ(1/z)    (3.1)

holds.
Consider now an element of the non-commutative space of Hermitian random matrices over the complex plane, X ∈ RM_p(C), and note that analyzing the Stieltjes transform is actually simplified to the consideration of the diagonal elements of the matrix (zI_p − X)^{−1}, as

    G_µ(z) = ∫_R 1/(z − x) dµ(x) = (1/p) Tr(zI_p − Λ)^{−1} = (1/p) Tr(zI_p − X)^{−1},

where µ and Λ denote the empirical spectral distribution and the diagonal matrix of eigenvalues of the matrix X, respectively.
Lemma 3.2
For X ∈ M_{p×n}(C) and z ∈ {z : z ∈ C, ℑ(z) > 0},

    (n/p) G_{µ_{X*X}}(z) = G_{µ_{XX*}}(z) − (p − n)/(pz).

Proof: As X ∈ M_{p×n}(C), the matrix X*X is of size n × n and the matrix XX* is of size p × p. Assume that p > n. Then X*X has n eigenvalues, while the set of eigenvalues of the matrix XX* consists of the same n eigenvalues and additionally p − n zeros. Then we have p − n times the term 1/(z − 0) = 1/z, and

    G_{µ_{XX*}}(z) = (n/p) G_{µ_{X*X}}(z) + ((p − n)/p)(1/z).

If p < n, we have

    G_{µ_{X*X}}(z) = (p/n) G_{µ_{XX*}}(z) + ((n − p)/n)(1/z),

so

    (n/p) G_{µ_{X*X}}(z) = G_{µ_{XX*}}(z) + ((n − p)/p)(1/z) = G_{µ_{XX*}}(z) − ((p − n)/p)(1/z),

and the proof is complete.
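Lemma 3.2 is an exact identity rather than an asymptotic one, which makes it easy to verify numerically. A minimal sketch (Python with NumPy; illustrative only):

    import numpy as np

    def G_emp(M, z):
        """Empirical Stieltjes transform (1/dim) Tr (zI - M)^(-1)."""
        d = M.shape[0]
        return np.trace(np.linalg.inv(z * np.eye(d) - M)) / d

    p, n, z = 300, 200, 1.5 + 0.1j
    X = np.random.randn(p, n) / np.sqrt(n)
    lhs = (n / p) * G_emp(X.T @ X, z)             # (n/p) G_{X*X}(z)
    rhs = G_emp(X @ X.T, z) - (p - n) / (p * z)   # G_{XX*}(z) - (p-n)/(pz)
    print(lhs, rhs)                                # equal up to rounding errors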
3.2 R-transform

The R-transform plays the same role in Free probability theory as the Fourier transform in classical probability theory, and it is defined in the following way.

Definition 3.2 (R-transform). Let µ be a probability measure with compact support, with {k_i}_{i=1,...} as the sequence of cumulants, see Chapter 2, Definition 2.9. Then the R-transform is given by

    R^µ(z) = Σ_{i=0}^∞ k_{i+1} z^i.

Note that, defined in this way, the R-transform and the cumulants {k_i} give us essentially the same information. Moreover, if it does not introduce confusion, the upper index is skipped and the R-transform is simply denoted R(z).

There is a relation between the R-transform and the Stieltjes transform G, or more precisely G^{−1}, the inverse of G with respect to composition, which is often considered as an equivalent definition to Definition 3.2.
Theorem 3.3
Let µ be a probability measure with compact support, G(z) its Stieltjes transform and R(z) its R-transform. Then

    R(z) = G^{−1}(z) − 1/z.

The relation between the moment and cumulant generating functions is given by Lemma 3.3. This tool is stated here due to its use in the proof of Theorem 3.3.
Lemma 3.3
Let {m_i}_{i≥1} and {k_i}_{i≥1} be sequences of complex numbers, with corresponding formal power series

    M(z) = 1 + Σ_{i=1}^∞ m_i z^i,    C(z) = 1 + Σ_{i=1}^∞ k_i z^i

as generating functions, such that

    m_i = Σ_{π∈NC(i)} k_π.

Then

    C(zM(z)) = M(z).

The proof of the lemma has been presented by Nica and Speicher (2006), who used combinatorial tools.
Proof of Theorem 3.3: The statement given in the theorem is equivalent to

    R(G(z)) + 1/G(z) = z.

Let {m_k}_{k=1,...} and {k_i}_{i=1,...} denote the sequence of moments and the sequence of cumulants, respectively, and let M(z) = Σ_{k=0}^∞ m_k z^k and C(z) = Σ_{i=0}^∞ k_i z^i, with m_0 = k_0 = 1. Then

    R(z) = Σ_{i=0}^∞ k_{i+1} z^i = (1/z) Σ_{i=0}^∞ k_{i+1} z^{i+1} = (1/z) Σ_{i=1}^∞ k_i z^i = (1/z)( Σ_{i=0}^∞ k_i z^i − 1 ) = (1/z)(C(z) − 1)    (3.2)

holds. By Lemma 3.3 we get the relation between the moment and cumulant generating functions:

    M(z) = C(zM(z)).    (3.3)

Then

    R(G(z)) + 1/G(z) = (1/G(z))( C(G(z)) − 1 ) + 1/G(z)    (by (3.2))
      = C(G(z))/G(z)
      = C( (1/z) M(1/z) )/G(z)    (by (3.1))
      = M(1/z)/G(z)    (by (3.3))
      = z    (by (3.1)),

and the theorem is proved.
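Theorem 3.3 can be sanity-checked numerically on a distribution whose transforms are known in closed form. For the standard semicircle law the free cumulants are k_2 = 1 and k_i = 0 otherwise, so R(z) = z; the sketch below (Python with NumPy, illustrative only) verifies R(G(z)) + 1/G(z) = z at an arbitrary point of the upper half plane.

    import numpy as np

    R = lambda z: z                                          # R-transform of the semicircle law
    G = lambda z: (z - np.sqrt(z - 2) * np.sqrt(z + 2)) / 2  # its Stieltjes transform

    z = 0.7 + 0.3j
    print(R(G(z)) + 1 / G(z))   # reproduces z
    print(z)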
The R-transform will play an important role in the following chapter, so we prove here some of its properties. Especially the first two properties of Theorem 3.4 will be used frequently.

Theorem 3.4
Let (A, τ) be a non-commutative probability space, such that the distributions of X, Y, X_n ∈ A, for all n ∈ N, have compact support. The R-transform has the following properties:

a) Non-linearity: R_{αX}(z) = αR_X(αz) for every X ∈ A and α ∈ C;

b) for any two freely independent non-commutative random variables X, Y ∈ A,

    R_{X+Y}(z) = R_X(z) + R_Y(z)

as formal power series;

c) let X, X_n ∈ A, for n ∈ N. If

    lim_{n→∞} τ(X_n^k) = τ(X^k),    k = 1, 2, ...,

then

    lim_{n→∞} R_{X_n}(y) = R_X(y)

as formal power series (convergence of coefficients).
Proof:
a) Let us prove the lack of linearity of the R-transform. We notice first that

    G_{αX}(z) = ∫_R 1/(z − x) dµ_{αX}(x) = ∫_R 1/(z − αx) dµ_X(x) = (1/α) ∫_R 1/(z/α − x) dµ_X(x) = (1/α) G_X(z/α),

and then, as G_{αX}^{−1}(G_{αX}(z)) = z, we have

    z = G_{αX}( G_{αX}^{−1}(z) ) = (1/α) G_X( G_{αX}^{−1}(z)/α ),
    αz = G_X( G_{αX}^{−1}(z)/α ),
    G_X^{−1}(αz) = G_{αX}^{−1}(z)/α.

Hence G_{αX}^{−1}(z) = α G_X^{−1}(αz). Then

    R_{αX}(z) = G_{αX}^{−1}(z) − 1/z = α G_X^{−1}(αz) − 1/z = α( G_X^{−1}(αz) − 1/(αz) ) = α R_X(αz).

b) By the freeness of X and Y we have that k_i^{X+Y} = k_i^X + k_i^Y for i = 1, 2, ..., see Theorem 2.3. Then

    R_{X+Y}(z) = Σ_{i=0}^∞ k_{i+1}^{X+Y} z^i = Σ_{i=0}^∞ (k_{i+1}^X + k_{i+1}^Y) z^i = Σ_{i=0}^∞ k_{i+1}^X z^i + Σ_{i=0}^∞ k_{i+1}^Y z^i = R_X(z) + R_Y(z).

c) The last property follows directly from the definition of the R-transform. As the free cumulants converge, the R-transform also converges in each of its coefficients.
The linearization of the free convolution presented in Theorem 3.4b),

    R_{µ_{X+Y}}(z) = R_{µ_X ⊞ µ_Y}(z) = R_{µ_X}(z) + R_{µ_Y}(z),

where µ_X (µ_Y) stands for the distribution of X (Y, respectively), is for simplicity denoted by

    R_{X+Y}(z) = R_X(z) + R_Y(z).

Besides the asymptotic freeness of matrices, the results regarding the R-transform, Part b) of Theorem 3.4 and Theorem 3.3, are considered to be the two main achievements presented by Voiculescu in his early papers, see Voiculescu (1991).
4
Marčenko-Pastur, Girko-von Rosen and Silverstein-Bai theorems

The results discussed in this chapter are an illustration of the use of the Stieltjes transform
in Random matrix theory. The theorems show various methods of calculating
the asymptotic spectral distribution. Two of these theorems are compared in the case of
the Wishart matrix (1/n)XX′, where X ∼ N_{p,n}(0, σ²I, I).
4.1 Statements of theorems
In this section we recall some of the early results obtained by Marčenko and Pastur (1967),
Girko and von Rosen (1994) and Silverstein and Bai (1995).
Theorem 4.1 (Marčenko-Pastur Law)
Consider the matrix Qn defined by (1.1), when k = 1, An = 0 and X ∼ N_{p,n}(0, σ²I, I).
Then the asymptotic spectral distribution is given by:
If p/n → c ∈ (0, 1]

µ′(x) = √( [σ²(1 + √c)² − x][x − σ²(1 − √c)²] ) / (2πcσ²x) · 1_{((1−√c)²σ², (1+√c)²σ²)}(x).

If p/n → c ≥ 1

(1 − 1/c) δ_0 + µ,

where the asymptotic spectral density function

µ′(x) = √( [σ²(1 + √c)² − x][x − σ²(1 − √c)²] ) / (2πcσ²x) · 1_{((1−√c)²σ², (1+√c)²σ²)}(x).

In the special case, when c = 1, we obtain the spectral density

µ′(x) = (1/(2πx)) √(4x − x²),

which is a scaled β-distribution with α = 1/2 and β = 3/2.
Recently it has been proven that, for a class of random matrices with dependent entries, the
limiting empirical distribution of the eigenvalues is also given by the Marčenko-Pastur law.
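Theorem 4.1 is easy to check by simulation. A minimal Mathematica sketch in the spirit of the appendix package (the values p, n, σ and the helper name mp are illustrative; here c = 1 and σ = 1, so the support is (0, 4)):

(* Empirical spectrum of (1/n) X X' versus the Marčenko-Pastur density *)
p = 500; n = 500; sigma = 1.0;
X = RandomVariate[NormalDistribution[0, sigma], {p, n}];
Q = X.Transpose[X]/n;
mp[x_] := Sqrt[4 x - x^2]/(2 Pi x);
Show[Histogram[Eigenvalues[Q], Automatic, "PDF"],
  Plot[mp[x], {x, 0.001, 4}, PlotStyle -> Dashed]]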
Theorem 4.2 (Girko and von Rosen, 1994)
Let X ∼ N_{n,p}(0, Σ, Ψ), where the eigenvalues of Σ and Ψ are bounded by some constant.
Suppose that the Kolmogorov condition 0 < c = lim_{n→∞} p/n < ∞ holds and let µ_{n,p}(x)
be defined by Definition 1.1. Then for every x ≥ 0

µ_{n,p}(x) − F_n(x) →^p 0,    n → ∞,

where →^p denotes convergence in probability and where, for large n, {F_n(x)} are distribution
functions satisfying

∫_0^∞ dF_n(x)/(1 + tx) = (1/p) Tr( I + tA_nA_n′ + tΣ a(t) )^{-1},

where for all t > 0, a(t) is a unique nonnegative analytical function which exists and
which satisfies the nonlinear equation

a(t) = (1/n) Tr( Ψ( I + (t/n) Ψ Tr( Σ(I + tA_nA_n′ + tΣ a(t))^{-1} ) )^{-1} ).

Note that the Stieltjes transform G(z), defined according to Definition 3.1, is given
by G(z) = (1/z) g(−1/z), where g(t) = ∫_0^∞ dF_n(x)/(1 + tx) as in Theorem 4.2.
Theorem 4.3 (Girko and von Rosen, 1994)
Consider the modification of the matrix Qn defined by (1.1), when k = 2 and An = 0, to the
form Qn = (1/n1) X1X1′ + (1/n2) X2X2′, where the matrices X1 and X2 are independent. Let

a(t) = (1/n2) Tr Ψ2 ( I + (t/n2) Ψ2 b(t) )^{-1},
b(t) = (1/n2) Tr Σ2 ( I + tΣ2 a(t) + tΣ1 c(t) )^{-1},
c(t) = (1/n1) Tr Ψ1 ( I + (t/n1) Ψ1 Tr( Σ1 (I + tΣ2 a(t) + tΣ1 b(t))^{-1} ) )^{-1},
d(t) = (1/n1) Tr Ψ1 ( I + (t/n1) Ψ1 Tr( Σ1 (I + tΣ2 a(t) + tΣ1 d(t))^{-1} ) )^{-1}.

Put g(t) = (1/p) Tr( (I + (t/n1) X1X1′ + (t/n2) X2X2′)^{-1} ). If 0 < lim_{n1→∞} p/n1 < ∞ and
0 < lim_{n2→∞} p/n2 < ∞ it follows that

g(t) → (1/p) Tr( I + tΣ1 d(t) + tΣ2 a(t) )^{-1},    n → ∞.
Theorem 4.4 (Silverstein and Bai, 1995)
Assume that
• for p = 1, 2, . . ., Z_p = ( (1/√p) Z_ij^p )_{p×n}, Z_ij^p ∈ C are identically distributed for all
p, i, j, independent across i, j for each p, E|Z_11^1 − EZ_11^1|² = 1;
• n(p)/p → d > 0 as p → ∞;
• T_p = diag(τ_1^p, τ_2^p, . . . , τ_n^p), where τ_i^p ∈ R, and the empirical distribution function
of the eigenvalues of the matrix T_p, i.e. of {τ_1^p, τ_2^p, . . . , τ_n^p}, converges almost surely
in distribution to a probability distribution function H as p → ∞;
• B_p = A_p + Z_p T_p Z_p^*, where * stands for the conjugate transpose of a matrix and
A_p = (a_ij^p)_{i,j=1,2,...,p} is a Hermitian p × p matrix for which F^{A_p} converges vaguely
to A almost surely, A being a possibly defective (i.e. with discontinuities) non-random
distribution function;
• Z_p, T_p and A_p are independent.
Then, almost surely, F^{B_p}, the empirical distribution function of the eigenvalues of B_p,
converges vaguely, as p → ∞, to a (non-random) distribution function F whose Stieltjes
transform m(z), z ∈ C⁺, satisfies the canonical equation

m(z) = m_A( z − d ∫ τ dH(τ)/(1 + τ m(z)) ).    (4.1)

Note that Silverstein and Bai defined the Stieltjes transform as −G(z), where G(z)
is given by Definition 3.1. Hence, the Stieltjes inversion formula also appears with the
opposite sign, i.e.

µ(I) = (1/π) lim_{y→0} ∫_I ℑ m(x + iy) dx,

where I = (a, b) is such that a, b are not atoms of the measure µ.
4.2 Comparison of results
Theorem 4.2 can be extended to hold for matrices over C. Together with Theorem 4.4,
it then provides two computationally different ways to obtain the asymptotic spectral
distribution. In this section the aim is to illustrate those differences with a simple example:
the matrix Qn given by equation (1.1) with k = 1, An = 0 and X1 ∼ N_{p,n}(0, σ²I, I).
The following calculations show that the mentioned theorems give Stieltjes transforms
which differ by a vanishing term and therefore lead to the same asymptotic distribution
function. Note that here the notation is standardized to the one presented in Chapters 2 and 3.
Consider Theorem 4.2, when Σ = σ²I, Ψ = I. Then,

∫_0^∞ dF_n(x)/(1 + tx) = (1/p) Tr( I + tA_nA_n′ + tΣ a(t) )^{-1} = 1/(1 + tσ²a(t)),

where a(t) is a unique nonnegative analytical function which exists and which satisfies
the nonlinear equation

a(t) = (1/n) Tr( Ψ( I + (t/n) Ψ Tr( Σ(I + tA_nA_n′ + tΣ a(t))^{-1} ) )^{-1} ) = ( 1 + (tp/n) σ²/(1 + tσ²a(t)) )^{-1}.

Letting n → ∞ under the Kolmogorov condition p/n → c, the limit of a(t) satisfies

a^{-1}(t) = 1 + tc σ²/(1 + tσ²a(t)),

that is,

1 + tσ²a(t) = a(t)(1 + tσ²a(t)) + tcσ²a(t),
0 = a²(t)tσ² + a(t)(tσ²(c − 1) + 1) − 1,
a(t) = ( −1 + tσ²(1 − c) ± √( 4σ²t + (1 + tσ²(c − 1))² ) ) / (2σ²t).

Hence, writing b(t) := lim_{n→∞} ∫_0^∞ dF_n(x)/(1 + tx) = 1/(1 + tσ²a(t)),

b(t) = 2 / ( 1 + tσ²(1 − c) ± √( 4σ²t + (1 + tσ²(c − 1))² ) )
= 2( 1 + tσ²(1 − c) ∓ √( 4σ²t + (1 + tσ²(c − 1))² ) ) / ( 1 + 2tσ²(1 − c) + t²σ⁴(1 − c)² − ( 4σ²t + (1 + tσ²(c − 1))² ) )
= 2( 1 + tσ²(1 − c) ∓ √( 4σ²t + (1 + tσ²(c − 1))² ) ) / ( −4ctσ² ).

Since b(t) > 0,

b(t) = −( 1 + tσ²(1 − c) − √( 4σ²t + (1 + tσ²(c − 1))² ) ) / ( 2ctσ² ).

Denote f(u) = (1/(2uσ²)) √( −4σ²/z + (1 − σ²(u − 1)/z)² ). Then the Stieltjes transform equals

ĥ(z) = (1/z) b(−1/z) = ( 1 − σ²(1 − c)/z )/(2cσ²) − f(c).
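The chosen root can be confirmed symbolically; a short Mathematica check (asol is a hypothetical helper name and s2 stands for σ², with c and t kept symbolic):

(* The root with the + sign solves the limiting quadratic for a(t):
   t s2 a^2 + a (t s2 (c - 1) + 1) - 1 == 0 *)
asol[t_] := (-1 + t s2 (1 - c) + Sqrt[4 s2 t + (1 + t s2 (c - 1))^2])/(2 s2 t);
Simplify[t s2 asol[t]^2 + asol[t] (t s2 (c - 1) + 1) - 1]   (* returns 0 *)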
Now, following Theorem 4.4, we consider T_n = σ²I. This gives us the Dirac delta
function H(τ) = δ(τ − σ²) as the asymptotic spectral distribution function of T_n. As
A_p = 0 we obtain

m_A(z) = 1/(0 − z) = −1/z

and then, by (4.1), the Stieltjes transform G(z) = −m(z) in Theorem 4.4 satisfies the
canonical equation

m(z) = −1 / ( z − d σ²/(1 + σ²m(z)) ),

which is identical to

m²(z)σ²z + m(z)( z + σ²(1 − d) ) + 1 = 0.

Thus

m(z) = ( −1 + σ²(d − 1)/z + √( −4σ²/z + (1 − σ²(d − 1)/z)² ) ) / (2σ²) = ( −1 + σ²(d − 1)/z )/(2σ²) + d f(d).

Hence, the Stieltjes transform is given by

G(z) = ( 1 − σ²(d − 1)/z )/(2σ²) − d f(d).
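Again, the root can be verified symbolically in Mathematica (msol is a hypothetical helper name, s2 stands for σ² and d is kept symbolic):

(* m(z) above satisfies the quadratic form of the canonical equation (4.1) *)
msol[z_] := (-1 + s2 (d - 1)/z + Sqrt[-4 s2/z + (1 - s2 (d - 1)/z)^2])/(2 s2);
Simplify[msol[z]^2 s2 z + msol[z] (z + s2 (1 - d)) + 1]   (* returns 0 *)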
Using Theorem 4.2 and Theorem 4.4 we have obtained the Stieltjes transforms ĥ(z)
and G(z), respectively. Now, we are going to obtain the distribution function using the
Stieltjes inversion formula, see Section 3.1. One can easily notice that
ℑĥ(z) = ℑ( (1 − σ²(1 − c)/z)/(2cσ²) − f(c) ) = ℑ( −σ²(1 − c)/(2cσ²z) ) − ℑf(c)
= ℑ( −σ²(1 − c)(x − iy)/(2cσ²(x² + y²)) ) − ℑf(c)
= σ²(1 − c)y/(2cσ²(x² + y²)) − ℑf(c) = (1 − c)y/(2c(x² + y²)) − ℑf(c).

Similarly,

ℑG(z) = ℑ( (1 − σ²(d − 1)/(x + iy))/(2σ²) − d f(d) ) = (d − 1)y/(2(x² + y²)) − ℑ[d f(d)].
Then, the spectral density functions obtained thanks to Theorem 4.2 and Theorem 4.4
are equal to

µ′_{(c)}(x) = −(1/π) lim_{y→0} ℑĥ(z) = lim_{y→0} ( −(1 − c)y/(2πc(x² + y²)) + (1/π) ℑf(c) ) = (1/π) lim_{y→0} ℑf(c),

ν′_{(d)}(x) = −(1/π) lim_{y→0} ℑG(z) = lim_{y→0} ( −(d − 1)y/(2π(x² + y²)) + (1/π) ℑ[d f(d)] ) = (1/π) lim_{y→0} ℑ[d f(d)],

respectively. After some calculations we obtain

µ′_{(c)}(x) = √( [σ²(1 + √c)² − x][x − σ²(1 − √c)²] ) / (2πcσ²x) · 1_{((1−√c)²σ², (1+√c)²σ²)}(x).
Similarly, we apply the Stieltjes inversion formula to G(z):

ν′_{(d)}(x) = (1/π) lim_{y→0} ℑ[d f(d)] = d µ′_{(d)}(x)
= √( [σ²(1 + √(1/c))² − x][x − σ²(1 − √(1/c))²] ) / (2πσ²x) · 1_{((1−√(1/c))²σ², (1+√(1/c))²σ²)}(x)
= (1/c) √( [σ²(1 + √c)² − cx][cx − σ²(1 − √c)²] ) / (2πσ²x) · 1_{((1−√c)²σ², (1+√c)²σ²)}(cx)
= c µ′_{(c)}(cx),

where µ′_{(d)} denotes the Marčenko-Pastur density with ratio d = 1/c. Hence, up to the
change of variable t = cx, which compensates for the different normalizations of the
quadratic form in the two theorems, the continuous parts of the two spectral distributions
coincide: (1/c) ν′_{(d)}(t/c) = µ′_{(c)}(t). Let µ′_cont(x) := µ′_{(c)}(x). To finally obtain the
spectral distribution function we calculate the deterministic part
µ_mass = 1 − ∫_R µ′_cont(x) dx
= 1 − ∫_{(1−√c)²σ²}^{(1+√c)²σ²} √( [σ²(1 + √c)² − x][x − σ²(1 − √c)²] ) / (2πcσ²x) dx
= (1 − 1/c) δ_0 if c ≥ 1, and 0 if c < 1,

where D = ((1 − √c)²σ², (1 + √c)²σ²). Hence, the distribution function consists of a
continuous part and a mass point and is given by

µ_mass + µ_cont(x) = (1 − 1/c) δ_0 + µ_cont(x) if c ≥ 1, and µ_cont(x) if c < 1,

where

µ′_cont(x) = √( [σ²(1 + √c)² − x][x − σ²(1 − √c)²] ) / (2πcσ²x) · 1_D(x),

with D = ((1 − √c)²σ², (1 + √c)²σ²).
Hence, both theorems give the same asymptotic spectral distribution function of the
considered matrix Qn even though the obtained Stieltjes transforms differ.
5
Analytical form of the asymptotic spectral distribution function
5.1 Asymptotic spectral distribution through the R-transform
To obtain the asymptotic spectral distribution of the quadratic form Qn in (1.1), the
asymptotic free independence of the summands Xi Xi′ is used. The sum of the
R-transforms of the asymptotically free independent summands then gives the R-transform
of Qn. The difficulty here is to calculate analytically the inverse Stieltjes
transform. The general idea of the calculations is given in Figure 5.1.
[Figure 5.1: flow chart. Each summand distribution µ_X, µ_Y, . . . , µ_Z is transformed into its
Stieltjes transform G_X, G_Y, . . . , G_Z and further into its R-transform R_X, R_Y, . . . , R_Z;
the R-transforms are added to R_{X+Y+···+Z}, transformed back to G_{X+Y+···+Z}, and the
Stieltjes inversion formula yields µ_{X+Y+···+Z}.]
Figure 5.1: Graphical illustration of the procedure of calculating the asymptotic
spectral distribution function using the knowledge about the asymptotic spectral distributions
of its asymptotically free independent summands.
The distribution of each of the asymptotically free independent summands leads to
the corresponding Stieltjes transform. Using Theorem 3.3 we obtain the R-transforms,
which are then added. The form of the calculated Stieltjes transform G_{X+Y+···+Z} allows
us, in some cases, to obtain a closed form expression for the asymptotic spectral density
function. A particular class of matrices with a closed form solution is given by
Theorem 5.1.
In the next subsection we consider a particular form of the matrix quadratic form Qn,
when Xi ∼ N_{p,n}(0, I, I), i = 1, . . . , k, for which the above procedure allows us to
obtain the asymptotic spectral distribution.
5.1.1 Asymptotic spectral distribution of the matrix Qn when Σ = I and Ψ = I
If we consider Qn, the sum of matrices as in (1.1) with covariance matrices Σ = I,
Ψ = I and constant c = 1, then by the Marčenko-Pastur Law the spectral density for
Bi = (1/n) Xi Xi′ is given by

µ′(x) = (1/(2πx)) √(4x − x²).

Hence,

G_{Bi}(z) = (1/(2π)) ∫_0^4 √(4x − x²)/(x(z − x)) dx = 1/2 − √(1/4 − 1/z),
G_{Bi}^{-1}(z) = 1/((1 − z)z),
R_{Bi}(z) = 1/((1 − z)z) − 1/z = 1/(1 − z).

Then, as Bi and Bj are asymptotically free for i ≠ j (∗), see Dykema (1993), we obtain
the R-transform of Qn as

R_{Qn}(z) = R_{B1+···+Bk}(z) (∗)= k R_{Bi}(z) = k/(1 − z).

Hence,

G_{Qn}^{-1}(z) = k/(1 − z) + 1/z = ( z(k − 1) + 1 )/((1 − z)z)

and

G_{Qn}(z) = (1/2) ( 1 + (1 − k)/z − √( 1 − 2(k + 1)/z + (k − 1)²/z² ) ).    (5.1)
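Both inversion steps can be confirmed symbolically; a short Mathematica check (the assumption 0 < w < 1/2 selects the relevant branch of the square root; gB and gQ are illustrative helper names):

(* G_Bi has compositional inverse 1/((1 - w) w) ... *)
gB[z_] := 1/2 - Sqrt[1/4 - 1/z];
FullSimplify[gB[1/((1 - w) w)], 0 < w < 1/2]          (* returns w *)
(* ... and G_Qn in (5.1) satisfies z G^2 - (z - k + 1) G + 1 == 0, which is
   equivalent to G_Qn^{-1}(w) = (w (k - 1) + 1)/((1 - w) w) *)
gQ[z_, k_] := (1 + (1 - k)/z - Sqrt[1 - 2 (k + 1)/z + (k - 1)^2/z^2])/2;
Simplify[z gQ[z, k]^2 - (z - k + 1) gQ[z, k] + 1]      (* returns 0 *)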
To make the notation more compact, denote

M := 1 + (k − 1)²/z² − 2(1 + k)/z
= 1 + (k − 1)²(x² − y²)/(x² + y²)² − 2(k + 1)x/(x² + y²) + i ( 2(1 + k)y/(x² + y²) − 2(k − 1)²xy/(x² + y²)² )

for z = x + iy. Then,

|M| = √( ( 2(1 + k)y/(x² + y²) − 2(k − 1)²xy/(x² + y²)² )² + ( 1 + (k − 1)²(x² − y²)/(x² + y²)² − 2(k + 1)x/(x² + y²) )² ).

Using the formula for the inverse of the Stieltjes transform

µ′(x) = −(1/π) lim_{y→0} ℑ G_{Qn}(x + iy)

we obtain
G_{Qn}(x + iy) = 1/2 + x(1 − k)/(2(x² + y²)) − i y(1 − k)/(2(x² + y²)) − (1/2) √M
= 1/2 + x(1 − k)/(2(x² + y²)) − (1/2) √|M| cos(φ/2) − i ( y(1 − k)/(2(x² + y²)) + (1/2) √|M| sin(φ/2) ),

where φ = Arg(M). Then,

cos(φ/2) = ±√( (1 + cos φ)/2 ),    cos φ = cos Arg(M) = ℜM/|M|.

Hence,

cos(φ/2) = ±√( (1 + ℜM/|M|)/2 ) = √( (|M| + ℜM)/(2|M|) ).

Similarly,

sin(φ/2) = √( (|M| − ℜM)/(2|M|) ).

Thus, the Stieltjes transform is given by

G_{Qn}(x + iy) = 1/2 + x(1 − k)/(2(x² + y²)) − (1/(2√2)) √(|M| + ℜM) − i ( y(1 − k)/(2(x² + y²)) + (1/(2√2)) √(|M| − ℜM) ).
Calculate the asymptotic spectral density function:

µ′(x) = −(1/π) lim_{y→0} ℑ G_{Qn}(x + iy) = (1/π) lim_{y→0} ( y(1 − k)/(2(x² + y²)) + (1/(2√2)) √(|M| − ℜM) ),

where, as y → 0,

|M| → √( (1 + (k − 1)²/x² − 2(k + 1)/x)² ) = |1 + (k − 1)²/x² − 2(k + 1)/x|,
ℜM → 1 + (k − 1)²/x² − 2(k + 1)/x,

and hence

√(|M| − ℜM) → √( |1 + (k − 1)²/x² − 2(k + 1)/x| − ( 1 + (k − 1)²/x² − 2(k + 1)/x ) ).
Hence, we obtain a closed form expression for the asymptotic spectral density function:

µ′(x) = (1/(2√2 π)) √( |1 + (k − 1)²/x² − 2(k + 1)/x| − ( 1 + (k − 1)²/x² − 2(k + 1)/x ) )
= (1/(2π)) √( −1 − (k − 1)²/x² + 2(k + 1)/x ) · 1_{(k+1−2√k, k+1+2√k)}(x)
= (1/(2πx)) √( −x² − (k − 1)² + 2(k + 1)x ) · 1_{(k+1−2√k, k+1+2√k)}(x)
= √( [k + 1 + 2√k − x][x − (k + 1 − 2√k)] ) / (2πx) · 1_{(k+1−2√k, k+1+2√k)}(x),

since the expression under the modulus is negative exactly on (k + 1 − 2√k, k + 1 + 2√k).
For c ≠ 1 the matrix Qn has a spectral density function given by

µ′(x) = √( [(√k + √c)² − x][x − (√k − √c)²] ) / (2πcx) · 1_{((√k−√c)², (√k+√c)²)}(x).
Now the result is stated for Σ = σ²I and Ψ = I.
Theorem 5.1
Let Qn be a p × p dimensional matrix defined as

Qn = (1/n) X1X1′ + · · · + (1/n) XkXk′,

where Xi is a p × n matrix following a matrix normal distribution, Xi ∼ N_{p,n}(0, σ²I, I).
Then, the asymptotic spectral distribution of Qn, denoted by µ, is determined by the
spectral density function

µ′(x) = √( [σ²(√k + √c)² − x][x − σ²(√k − √c)²] ) / (2πcxσ²) · 1_M(x),    (5.2)

where M = (σ²(√k − √c)², σ²(√k + √c)²), n → ∞ and the Kolmogorov condition,
p(n)/n → c, holds.
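A minimal Mathematica sketch, in the spirit of the appendix package, comparing (5.2) with one simulated realization (the values of k, p, n, σ match Figure 5.2 and are illustrative; dens is a hypothetical helper name):

(* Empirical spectrum of Qn, a sum of k Wishart-type summands, vs. (5.2) *)
k = 5; p = 200; n = 200; sigma = 1.4; c = p/n;
Qn = Total[Table[
    With[{X = RandomVariate[NormalDistribution[0, sigma], {p, n}]},
     X.Transpose[X]/n], {k}]];
dens[x_] := Sqrt[(sigma^2 (Sqrt[k] + Sqrt[c])^2 - x)*
     (x - sigma^2 (Sqrt[k] - Sqrt[c])^2)]/(2 Pi c x sigma^2);
Show[Histogram[Eigenvalues[Qn], Automatic, "PDF"],
  Plot[dens[x], {x, sigma^2 (Sqrt[k] - Sqrt[c])^2,
    sigma^2 (Sqrt[k] + Sqrt[c])^2}, PlotStyle -> Dashed]]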
The spectral density function obtained using formula (5.2) in Theorem 5.1 and the
empirical spectral density function, in the form of a histogram for the generated random
matrices, are presented in Figure 5.2.
[Figure 5.2: four panels [a]-[d], histograms with overlaid dashed density curves.]
Figure 5.2: Comparison of the empirical spectral density function for 20 in [a,c]
and 200 in [b,d] realizations (histogram) and the theoretical asymptotic spectral density
function (dashed line), given in (5.2) with k = 5 in [a,b], k = 100 in [c,d]. In all
cases σ = 1.4 and c = p/n, where p = 200 and n = 200.
Figures 5.2[a,b] are for the sum of k = 5 p × p matrices, while for Figures 5.2[c,d]
the number of summands k has been increased 20 times. Together with the increase of
k, an increase of the eigenvalues and of the length of the support of the asymptotic
spectral distribution is observed. Figure 5.3 illustrates the described behaviour
with respect to k.
[Figure 5.3: a family of density curves.]
Figure 5.3: Theoretical asymptotic spectral density functions for an increasing number
of summands, given in (5.2). The curves, from left to right, are for k =
20, 50, 100, 200, . . . , 1200. In all cases σ = 1.4 and c = p/n, where p = n = 200.
5.2 Classes of matrices with closed formula for asymptotic spectral distribution function
In this section we point out classes of matrices for which it is possible to obtain a closed
form of the asymptotic spectral distribution. The classes are characterized by a general form
of the inverse, with respect to composition, of the Stieltjes transform (Theorem 5.2) or of
the R-transform, as presented in Example 5.1.
Theorem 5.2
Any p × p dimensional matrix Q ∈ Q whose Stieltjes transform has an inverse, with
respect to composition, of the form

G^{-1}(z) = (az + b)/(cz² + dz + e),    (5.3)

where a, b, c, d, e ∈ R, c ≠ 0, d² − 4ce ≠ 0, has the asymptotic spectral distribution

µ′(x) = (1/(2πx)) √( (−d² + 4ce)x² − a² − 2(2bc − ad)x ) · 1_D(x),

when the Kolmogorov condition holds, i.e. p(n)/n → c ∈ (0, ∞) for n → ∞, and

D := { x ∈ R : x ∈ ( (ad − 2bc − 2√(b²c² − abcd + a²ce))/(d² − 4ce), (ad − 2bc + 2√(b²c² − abcd + a²ce))/(d² − 4ce) ) }.
Proof: Let the inverse, with respect to composition, of the Stieltjes transform be as in
(5.3). Then,

z( cG²(z) + dG(z) + e ) = aG(z) + b,

so that

G(z) = −(1/(2cz)) ( zd − a − √( (zd − a)² − 4cz(ez − b) ) ) = −d/(2c) + a/(2cz) − (1/2) √( d² − 4ce + 2(2bc − ad)/z + a²/z² ),

where the branch of the square root is chosen so that G(z) → 0 as z → ∞; in particular,
for a = k − 1, b = 1, c = −1, d = 1 and e = 0 this leads to (5.1). To compute the
asymptotic spectral density function we express the Stieltjes transform in the complex
variable z = x + iy:

G(x + iy) = −d/(2c) + xa/(2c(x² + y²)) − i ya/(2c(x² + y²))
− (1/2) √( d² − 4ce + a²(x² − y²)/(x² + y²)² + 2(2bc − ad)x/(x² + y²) + i ( −2(2bc − ad)y/(x² + y²) − 2a²xy/(x² + y²)² ) )
= −d/(2c) + xa/(2c(x² + y²)) − (1/2) r cos(φ/2) − i ( ya/(2c(x² + y²)) + (1/2) r sin(φ/2) ),

where

φ = Arg( d² − 4ce + 2(2bc − ad)/z + a²/z² ),
r = ( ( −2(2bc − ad)y/(x² + y²) − 2a²xy/(x² + y²)² )² + ( d² − 4ce + a²(x² − y²)/(x² + y²)² + 2(2bc − ad)x/(x² + y²) )² )^{1/4}.

After calculations, which in a special case were carried out in Subsection 5.1.1, we obtain

G(x + iy) = −d/(2c) + xa/(2c(x² + y²)) − (1/(2√2)) √( r² + d² − 4ce + a²(x² − y²)/(x² + y²)² + 2(2bc − ad)x/(x² + y²) )
− i ( ya/(2c(x² + y²)) + (1/(2√2)) √( r² − (d² − 4ce) − a²(x² − y²)/(x² + y²)² − 2(2bc − ad)x/(x² + y²) ) )

and hence

µ′(x) = −(1/π) lim_{y→0} ℑG(x + iy)
= (1/π) lim_{y→0} ( ya/(2c(x² + y²)) + (1/(2√2)) √( r² − (d² − 4ce) − a²(x² − y²)/(x² + y²)² − 2(2bc − ad)x/(x² + y²) ) )
= (1/(2√2 π)) √( |d² − 4ce + a²/x² + 2(2bc − ad)/x| − ( d² − 4ce + a²/x² + 2(2bc − ad)/x ) )
= (1/(2π)) √( −d² + 4ce − a²/x² − 2(2bc − ad)/x ) · 1_D(x)
= (1/(2πx)) √( (−d² + 4ce)x² − a² − 2(2bc − ad)x ) · 1_D(x),

where D := { x ∈ R : x ∈ ( (ad − 2bc − 2√(b²c² − abcd + a²ce))/(d² − 4ce), (ad − 2bc + 2√(b²c² − abcd + a²ce))/(d² − 4ce) ) }.
Hence we have obtained an analytical solution for the whole class of matrices with

G^{-1}(z) = (az + b)/(cz² + dz + e).
Remark 5.1. The class of matrices with inverse, with respect to composition, of the
Stieltjes transform given by Theorem 5.2 is equivalent to the class of matrices with
R-transform given by the formula:

R(z) = (az + b)/(cz² + dz + e) − 1/z = ( (a − c)z² + (b − d)z − e ) / ( z(cz² + dz + e) ).
Theorem 5.2 applied to the formulation discussed in Subsection 5.1.1 gives directly
the asymptotic spectral distribution given by (5.2).
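The closed form of Theorem 5.2 is straightforward to implement; a Mathematica sketch (density52 is a hypothetical helper name), using the coefficients of Subsection 5.1.1 as a usage example:

(* Density of Theorem 5.2 from the coefficients a, b, c, d, e in (5.3) *)
density52[x_, a_, b_, c_, d_, e_] :=
  Sqrt[(-d^2 + 4 c e) x^2 - a^2 - 2 (2 b c - a d) x]/(2 Pi x);
(* a = k - 1, b = 1, c = -1, d = 1, e = 0 recovers the density of
   Subsection 5.1.1, supported on (k + 1 - 2 Sqrt[k], k + 1 + 2 Sqrt[k]) *)
With[{k = 5}, Plot[density52[x, k - 1, 1, -1, 1, 0],
  {x, k + 1 - 2 Sqrt[k], k + 1 + 2 Sqrt[k]}]]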
Let us now consider a p × p dimensional matrix Q ∈ Q with R-transform of the form

R(z) = (az² + bz + c)/(dz + e).

An example where analytic tools yield the asymptotic spectral distribution is then the
sum of a Wigner and a Wishart matrix.
Example 5.1: Asymptotic spectral distribution for Q = Wigner + Wishart
The density for the Wigner matrix is given by

µ′(x) = (1/(2π)) √(4 − x²),

so that

G_{Wigner}(z) = (1/(2π)) ∫_{−2}^{2} √(4 − x²)/(z − x) dx = z/2 − √(z² − 4)/2,
R_{Wigner}(z) = z,

while for the Wishart matrix R_{Wishart}(z) = 1/(1 − z). Then,

R_{Wigner+Wishart}(z) = z + 1/(1 − z) = (z − z² + 1)/(1 − z),
G^{-1}_{Wigner+Wishart}(z) = (z − z² + 1)/(1 − z) + 1/z = (z² − z³ + 1)/(z(1 − z)).

Hence the Stieltjes transform can be computed, and further the spectral density of a sum
of Wigner and Wishart matrices.
If one considers c as an argument of the distribution function, a 3-dimensional illustration
is given in Figure 5.4[a]. The 3D figure has then been projected to 2D for each of
c = 2, 4, 8 in Figures 5.4[b,c,d], respectively.
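For c = p/n = 1, the G^{-1} of Example 5.1 gives the cubic g³ − (1 + z)g² + zg − 1 = 0 for g = G(z), and the density can also be obtained by numerical Stieltjes inversion. A minimal Mathematica sketch (the appendix package instead encodes the explicit branch formulas for general c; density is a hypothetical helper name and y = 10^{-6} approximates the limit y → 0):

(* Numerical Stieltjes inversion for Q = Wigner + Wishart, c = 1 *)
density[x_?NumericQ] := Module[{g, roots},
  roots = g /. NSolve[
     g^3 - (1 + x + 10.^-6 I) g^2 + (x + 10.^-6 I) g - 1 == 0, g];
  -Im[First[SortBy[roots, Im]]]/Pi];    (* branch with Im G < 0 in C+ *)
Plot[density[x], {x, -3, 7}, PlotRange -> All]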
[Figure 5.4: panel [a] a 3D surface; panels [b]-[d] 2D sections.]
Figure 5.4: Asymptotic spectral density function for the sum of Wigner and Wishart
matrices.
[a] 3D plot of the density for c = p/n ∈ [2, 100], z ∈ [−3, 7] (created using Mathematica,
function DoDistrAbs3D[2, 1] from the package WigWish`, presented in the Appendix).
[b, c, d] 2D plots of the density for c = p/n = 2, 4, 8, respectively, and z ∈ [−3, 7]
(created using Mathematica, function DoDistrAbs[p, n, 1] from the package WigWish`).
The comparison of the empirical spectral density function and the theoretical asymptotic
spectral density function for the sum of Wigner and Wishart matrices with c = 2 is given
in Figure 5.5.
[Figure 5.5: histogram with overlaid dashed density curve.]
Figure 5.5: Comparison of the empirical spectral density function for a Wigner plus
Wishart matrix for 100 realizations (histogram) and its theoretical asymptotic
spectral density function (dashed line) with c = p/n = 220/110 = 2.
(Created using Mathematica, function DistrAndHist[220, 110, 1, 100] from the package
WigWish`.)
6
Future Research

The presented thesis analyzes the existence of a closed form asymptotic spectral distribution
for the matrices Q ∈ Q, i.e. the positive definite Hermitian matrices which can
be written as a sum of asymptotically free independent summands, depending on the form
of the R-transform or of the inverse, with respect to composition, of the Stieltjes transform.
The studies can be extended to matrices decomposable into free matrix products.
Then another tool of Free probability theory, the S-transform, will play the key role. The
transform of a free element a is defined as S_a(z) = (1/z) R_a^{-1}(z), and its main property is
that S_{ab} = S_a S_b holds for all free a and b belonging to a non-commutative probability space.
Another idea for obtaining a closed form expression for the asymptotic spectral distribution
function arises when actual measurements are available. Then, in some particular
cases, it is possible to manipulate the sample size in order to obtain the final result
in analytic form.
Notation
Note that all vectors are column vectors. In general, lower case letters are used to denote
vector valued and scalar variables and upper case letters are used for matrix valued variables.
However, there might be exceptions from these general rules due to conventions.
The same symbol can be used for different purposes. The principal notation is listed
below and any deviations from it are explained in the text.

Symbols and Operators

R                Real numbers
C                Complex numbers
i                Complex number √−1
z̄                Complex conjugate of z ∈ C
ℑz               Imaginary part of z ∈ C
ℜz               Real part of z ∈ C
EY               Expectation of a random variable Y
Var Y            Variance of a random variable Y
X                Random matrix
X′               Transpose of a matrix X
X*               Conjugate transpose of a matrix X
X^{-1}           Inverse of a matrix X
Tr X             Trace of a matrix X
Tr_p X           Weighted trace of a matrix X
‖X‖              Euclidean norm of a matrix X
(Ω, F, P)        Probability space
RM_p(C)          Set of all p × p random matrices, with entries which belong to
                 ⋂_{p=1,2,...} L^p(Ω, P), i.e. entries are complex random variables
                 with all finite moments
(RM_p(C), τ)     Non-commutative probability space, i.e. *-algebra RM_p(C)
                 and tracial functional τ
C⟨a1, . . . , am⟩  Free algebra with generators a1, . . . , am, i.e. all polynomials
                 in m non-commutative indeterminants
Σ                Covariance matrix between rows
Ψ                Covariance matrix between columns
Ψ ⊗ Σ            Kronecker product of the matrices Ψ and Σ
N_{p,n}(M, Σ, Ψ)  Matrix normal (Gaussian) distribution of a p × n matrix with
                 mean matrix M and dispersion matrix Ψ ⊗ Σ, which is equivalent
                 to N_{pn}(vec M, Ψ ⊗ Σ)
W_p(n, Ω)        Central Wishart distribution with n degrees of freedom and p ×
                 p positive definite matrix Ω
1_D              Indicator function of the set D
δ(x)             Dirac delta-function of x, i.e., δ(x) = 1 if x = 0, δ(x) = 0 if
                 x ≠ 0
1_A              One of the algebra A, i.e. 1_A a = a 1_A = a for all a ∈ A
F_p^X(x)         Normalized spectral distribution function for a p × p random matrix X
NC(r)            Non-crossing partitions over {1, . . . , r}
1_n              n-element segment, i.e. 1_n = {1, . . . , n}
G(z)             Stieltjes transform
G^{-1}(z)        Inverse, with respect to composition, of the Stieltjes transform,
                 i.e. G(G^{-1}(z)) = G^{-1}(G(z)) = z
R(z)             R-transform
⊞                Free additive convolution
∗                Free product
Bibliography
Bożejko, M. and Demni, N. (2009). Generating Functions of Cauchy-Stieltjes type for orthogonal polynomials. Infinite Dimensional Analysis, Quantum Probability & Related
Topics, 12(1):91 – 98.
Capitaine, M. and Donati-Martin, C. (2007). Strong asymptotic freeness for Wigner and
Wishart matrices. Indiana University Mathematics Journal, 56:767–804.
Chen, J., Hontz, E., Moix, J., Welborn, M., Van Voorhis, T., Suárez, A., Movassagh,
R., and Edelman, A. (2012). Error analysis of free probability approximations to the
density of states of disordered systems. Phys. Rev. Lett., 109:036403.
Cima, J., Matheson, A., and Ross, W. (2006). The Cauchy transform. Mathematical
Surveys and Monographs. American Mathematical Society.
Couillet, R. and Debbah, M. (2011). Random Matrix Methods for Wireless Communications. Cambridge University Press, Cambridge, United Kingdom.
Dozier, R. B. and Silverstein, J. W. (2007). On the empirical distribution of eigenvalues of large dimensional information-plus-noise-type matrices. Journal of Multivariate
Analysis, 98(4):678 – 694.
Dykema, K. (1993). On certain free product factors via an extended matrix model. Journal
of Functional Analysis, 112(1):31 – 60.
Girko, V. (1990). Theory of Random Determinants. Mathematics and its applications
(Soviet series). Kluwer Academic Publishers, Dordrecht, The Netherlands.
Girko, V. (2001). Theory of Stochastic Canonical Equations. Mathematics and Its Applications Series. Kluwer Academic Publishers, Dordrecht, The Netherlands.
Girko, V. and von Rosen, D. (1994). Asymptotics for the normalized spectral function of
matrix quadratic form. Random Operators and Stochastic Equations, 2(2):153–161.
Greene, R. and Krantz, S. (2006). Function Theory of One Complex Variable. Graduate
Studies in Mathematics Series. American Mathematical Society, Rhode Island, USA.
Hachem, W., Loubaton, P., and Najim, J. (2006). The empirical distribution of the eigenvalues of a Gram matrix with a given variance profile. Annales de l’Institut Henri
Poincaré. Probabilités et Statistiques, 42(6):649 – 670.
Hachem, W., Loubaton, P., and Najim, J. (2007). Deterministic equivalents for certain
functionals of large random matrices. Annals of Applied Probability, 17(3):875–930.
Hasebe, T. (2012). Fourier and Cauchy-Stieltjes transforms of power laws including stable
distributions. International Journal of Mathematics, 23(03):1250041.
Hiai, F. and Petz, D. (2000). The Semicircle Law, Free Random Variables and Entropy.
Mathematical surveys and monographs. American Mathematical Society, Rhode Island, USA.
Khorunzhy, A. M., Khoruzhenko, B. A., and Pastur, L. A. (1996). Asymptotic properties
of large random matrices with independent entries. Journal of Mathematical Physics,
37(10):5033–5060.
Krishnapur, M. (2011). Random matrix theory: Stieltjes' transform proof (cont'd).
Remarks on fourth moment assumption. Available May 2013 on
http://math.iisc.ernet.in/∼manju/rmt/lec 4.pdf.
Marčenko, V. A. and Pastur, L. A. (1967). Distribution of eigenvalues in certain sets of
random matrices. Mat. Sb. (N.S.), 72(114):507–536.
Müller, R. R. (2002). Lecture notes (2002-2007): Random matrix theory for wireless
communications. Available June 2011 on http://www.iet.ntnu.no/∼ralf/rmt.pdf.
Nica, A. and Speicher, R. (2006). Lectures on the Combinatorics of Free Probability.
Cambridge University Press, Cambridge, United Kingdom.
Silverstein, J. (1995). Strong convergence of the empirical distribution of eigenvalues of
large dimensional random matrices. Journal of Multivariate Analysis, 55(2):331 – 339.
Silverstein, J. and Bai, Z. (1995). On the empirical distribution of eigenvalues of a class
of large dimensional random matrices. Journal of Multivariate Analysis, 54(2):175 –
192.
Speicher, R. (1993). Free convolution and the random sum of matrices. RIMS, 29:731–
744.
Speicher, R. (1994). Multiplicative functions on the lattice of noncrossing partitions and
free convolution. Math. Ann., 298:611–628.
Speicher, R. (2009). Free Probability Theory. arXiv:0911.0087.
Tulino, A. M. and Verdú, S. (2004). Random matrix theory and wireless communications.
Commun. Inf. Theory, 1(1):1–182.
Voiculescu, D. (1985). Symmetries of some reduced free product C ∗ -algebras. Operator algebras and their connections with topology and ergodic theory, Proc. Conf.,
Buşteni/Rom. 1983, Lect. Notes Math. 1132, 556-588 (1985).
Voiculescu, D. (1991). Limit laws for random matrices and free products. Inventiones
mathematicae, 104(1):201–220.
Voiculescu, D., Dykema, K., and Nica, A. (1992). Free Random Variables. CRM Monographs. American Mathematical Society, Rhode Island, USA.
Wigner, E. P. (1955). Characteristic vectors of bordered matrices with infinite dimensions.
Annals of Mathematics, 62(3):548–564.
Yin, Y. (1986). Limiting spectral distribution for a class of random matrices. Journal of
Multivariate Analysis, 20(1):50 – 68.
Package WigWish’ for Mathematica
BeginPackage["WigWish‘"]
DoHistogram::usage = "to do histogram for Wigner+Wishart matrix"
DoDistr::usage = "to do plot of density measure obtained using
all G-transforms for Wigner+Wishart matrix"
DoDistrAbs::usage = "to do plot of density measure for
Wigner+Wishart matrix"
DoDistrAbs3D::usage = "to do plot 3D of density measure
depending on c=p/n for Wigner+Wishart matrix"
DistrAndHist::usage = "to do plot of both density and histogram
for Wigner+Wishart matrix"
DistrAndHistMovie::usage = "to do plot with possible change of
parameters n,p,sigma of both density and histogram for
Wigner+Wishart matrix"
DistrAndHistMovie3D::usage = "to do plot 3D with possible change
of ratio c=p/n of both density and histogram for
Wigner+Wishart matrix"
Begin["`Private`"]
<<Statistics`NormalDistribution`
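(* G2, G3 and G below are the three explicit branches (Cardano's formula) of
   the cubic equation satisfied by the Stieltjes transform of the Wigner plus
   Wishart sum; DoDistr plots all three branches, while DoDistrAbs and
   DoDistrAbs3D use the branch G3, whose imaginary part yields the density. *)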
G2[z_,a_,b_,c_,d_,e_]:=1/(12 a) (-4 b+4 d z-(2 I 2^(1/3)
(-I+Sqrt[3]) (b^2-2 b d z+d^2 z^2-3 a (c+d-e z)))/(-2 b^3+9 a
b c+9 a b d-27 a^2 e+6 b^2 d z-9 a c d z-9 a d^2 z-9 a b e
z-6 b d^2 z^2+9 a d e z^2+2 d^3 z^3+\[Sqrt](4 (-(b-d z)^2+3 a
(c+d-e z))^3+(2 b^3+27 a^2 e-6 b^2 d z-2 d^3 z^3+9 a d z
(c+d-e z)+b (6 d^2 z^2-9 a (c+d-e z)))^2))^(1/3)+I 2^(2/3)
(I+Sqrt[3]) (-2 b^3+9 a b c+9 a b d-27 a^2 e+6 b^2 d z-9 a c
d z-9 a d^2 z-9 a b e z-6 b d^2 z^2+9 a d e z^2+2 d^3
z^3+\[Sqrt](4 (-(b-d z)^2+3 a (c+d-e z))^3+(2 b^3+27 a^2 e-6
b^2 d z-2 d^3 z^3+9 a d z (c+d-e z)+b (6 d^2 z^2-9 a (c+d-e
z)))^2))^(1/3))
G3[z_,a_,b_,c_,d_,e_]:=1/(12 a) (-4 b+4 d z+(2 I 2^(1/3)
(I+Sqrt[3]) (b^2-2 b d z+d^2 z^2-3 a (c+d-e z)))/(-2 b^3+9 a
b c+9 a b d-27 a^2 e+6 b^2 d z-9 a c d z-9 a d^2 z-9 a b e
z-6 b d^2 z^2+9 a d e z^2+2 d^3 z^3+\[Sqrt](4 (-(b-d z)^2+3 a
(c+d-e z))^3+(2 b^3+27 a^2 e-6 b^2 d z-2 d^3 z^3+9 a d z
(c+d-e z)+b (6 d^2 z^2-9 a (c+d-e z)))^2))^(1/3)-2^(2/3) (1+I
Sqrt[3]) (-2 b^3+9 a b c+9 a b d-27 a^2 e+6 b^2 d z-9 a c d
z-9 a d^2 z-9 a b e z-6 b d^2 z^2+9 a d e z^2+2 d^3
z^3+\[Sqrt](4 (-(b-d z)^2+3 a (c+d-e z))^3+(2 b^3+27 a^2 e-6
b^2 d z-2 d^3 z^3+9 a d z (c+d-e z)+b (6 d^2 z^2-9 a (c+d-e
z)))^2))^(1/3))
G[z_,a_,b_,c_,d_,e_]:=Simplify[1/(6 a) (-2 b+2 d z+(2 2^(1/3)
(b^2-2 b d z+d^2 z^2-3 a (c+d-e z)))/(-2 b^3+9 a b c+9 a b
d-27 a^2 e+6 b^2 d z-9 a c d z-9 a d^2 z-9 a b e z-6 b d^2
z^2+9 a d e z^2+2 d^3 z^3+\[Sqrt](4 (-(b-d z)^2+3 a (c+d-e
z))^3+(2 b^3+27 a^2 e-6 b^2 d z-2 d^3 z^3+9 a d z (c+d-e z)+b
(6 d^2 z^2-9 a (c+d-e z)))^2))^(1/3)+2^(2/3) (-2 b^3+9 a b
c+9 a b d-27 a^2 e+6 b^2 d z-9 a c d z-9 a d^2 z-9 a b e z-6
b d^2 z^2+9 a d e z^2+2 d^3 z^3+\[Sqrt](4 (-(b-d z)^2+3 a
(c+d-e z))^3+(2 b^3+27 a^2 e-6 b^2 d z-2 d^3 z^3+9 a d z
(c+d-e z)+b (6 d^2 z^2-9 a (c+d-e z)))^2))^(1/3))]
DoDistr[p_,n_,sigma_]:=Function[param=p/n;a=-param*sigma^2;b=1;
c=sigma^2; d=-param*sigma^2; e=1; Plot[{-1/Pi
Im[G[z,a,b,c,d,e]], -1/Pi Im[G2[z,a,b,c,d,e]], -1/Pi
Im[G3[z,a,b,c,d,e]]}, {z,-4,30}, PlotRange->{Full,Full},
PlotStyle->{Red,Directive[Thick,Blue],
Directive[Black,Thick,Dashed]}]][p,n,sigma]
DoDistrAbs[p_,n_,sigma_]:=Function[param=p/n;a=-(param*sigma^2);
b=1; c=sigma^2;
d=-(param)*sigma^2; e=1; Plot[{Abs[-1/Pi Im[G3[z,a,b,c,d,e]]]},
{z,-4,30}, PlotRange->{Full,Full},
PlotStyle->{Directive[Black,Thick,Dashed]}]][p,n,sigma]
DoDistrAbs3D[ratio_,sigma_]:=Function[Plot3D[{Abs[-1/Pi
Im[G3[z,-param1*sigma^2,1,sigma^2,-param1*sigma^2,1]]]},
{z,-4,30}, {param1,ratio,100},
PlotRange->{Full,Full,Full}]][ratio,sigma]
Wigner[p_]:=Function[g1=RandomArray[NormalDistribution[0,1],{p,p}]
*(1+ I)/Sqrt[2]; (g1+ConjugateTranspose[g1])/(Sqrt[2p])][p]
Wishart[p_,n_,sigma_]:=Function[g2=sigma*
RandomArray[NormalDistribution[0,1],{p,n}] *(1+I)/Sqrt[2];
((g2.ConjugateTranspose[g2])/n)][p,n,sigma]
suma[p_,n_,sigma_]:=Wigner[p]+Wishart[p,n,sigma]
DoHistogram[p_,n_,sigma_,antal_]:=Function[ee[0]={};For[i=0,
i<antal, i++, ee[i+1]= Join[ee[i],
Re[Eigenvalues[suma[p,n,sigma]]]]]; Histogram[ee[antal],
Automatic, "PDF"]][p,n,sigma,antal]
DistrAndHist[p_,n_,sigma_,antal_]:=Function[Show[
DoHistogram[p,n,sigma,antal], DoDistrAbs[p,n,sigma]]]
[p,n,sigma,antal]
DistrAndHistMovie[p_,n_,sigma_,antal_]:=Function[ Manipulate[
{DoDistrAbs[p1,n1,sigma1]}, {{p1,p},1,500}, {{n1,n},1,500},
{{sigma1,sigma},0,3}]] [p,n,sigma,antal]
DistrAndHistMovie3D[ratio_,sigma_,antal_]:=Function[ Manipulate[
{DoDistrAbs3D[ratio1,sigma1]}, {{ratio1,ratio},0,50},
{{sigma1,sigma},0,3}]] [ratio,sigma,antal]
End[ ]
EndPackage[ ]