July 1, 2002 17:9 WSPC/139-IJMPA
01079
International Journal of Modern Physics A, Vol. 17, No. 18 (2002) 2413–2444
c World Scientific Publishing Company
ENTROPY OF OPERATOR-VALUED RANDOM VARIABLES:
A VARIATIONAL PRINCIPLE FOR LARGE N MATRIX MODELS
L. AKANT,∗ G. S. KRISHNASWAMI† and S. G. RAJEEV‡
Department of Physics and Astronomy, University of Rochester,
Rochester, New York 14627, USA
∗ [email protected][email protected][email protected]
Received 5 December 2001
We show that, in ’t Hooft’s large N limit, matrix models can be formulated as a classical
theory whose equations of motion are the factorized Schwinger–Dyson equations. We
discover an action principle for this classical theory. This action contains a universal
term describing the entropy of the noncommutative probability distributions. We show
that this entropy is a nontrivial one-cocycle of the noncommutative analog of the diffeomorphism group and derive an explicit formula for it. The action principle allows us to
solve matrix models using novel variational approximation methods; in the simple cases
where comparisons with other methods are possible, we get reasonable agreement.
Keywords: Large N matrix models; QCD; noncommutative probability theory; free
entropy; cohomology; variational principle.
PACS numbers: 11.15.Pg, 12.38.-t, 11.15.-q
1. Introduction
There are many physical theories in which random variables which are operators —
matrices of finite or infinite order — appear: For example, Yang–Mills theories,
models for random surfaces and M-theory (an approach to a string theory of quantum gravity). In all these theories, the observables are functions of the matrices
which are invariant under changes of basis; in many cases — as for Yang–Mills
theories — the invariance group is quite large since it contains changes of basis
that depend on position. We address the question of how to construct an effective
action (probability distribution) for these gauge-invariant observables induced by
the original probability distribution of the matrices.
Quantum chromodynamics (QCD) is the matrix model of greatest physical interest. QCD is the widely accepted theory of strong interactions. It is a Yang–Mills
theory with a non-Abelian gauge group SU(N ). Thus, the microscopic degrees of
freedom include a set of N × N Hermitian matrices at each point of space–time:
The components of the one-form that represents the gauge field. In addition there
are the quark fields that form an N -component complex vector at each point of
space–time. The number of “colors,” N , is equal to three in nature. Nevertheless,
it will be useful to study the theory for an arbitrary value of N . Also, it will be
convenient to regard U(N ) rather than SU(N ) as the gauge group.
The microscopic degrees of freedom — quarks and gluons — do not describe
the particles that we can directly observe.1 – 3 Only certain bound states called
hadrons — those that are invariant under the gauge group — are observable. This
phenomenon — called confinement — is one of the deepest mysteries of theoretical
physics.
In earlier papers we have postulated that there is a self-contained theory of color
invariant observables fully equivalent to QCD at all energies and all values of N .
We have called the theory we seek “quantum hadron dynamics”4 and have constructed it fully in the case of two-dimensional space–time. Also we have shown that this
theory is a good approximation to four-dimensional QCD applied to deep inelastic scattering: It predicts with good accuracy the parton distributions observed in
experiments.5
Certain simplifications of the two-dimensional theory allowed us to eliminate
all gluon (matrix-valued) degrees of freedom. This helped us to construct two-dimensional quantum hadron dynamics quite explicitly. To make further progress,
it is necessary to understand theories in which the degrees of freedom are N × N
matrices. Before studying a full-fledged matrix field theory we need to understand how to reformulate a theory of a finite number of matrices in terms of their
invariants.a This is the problem we will solve in this paper.
It is well known that matrix models simplify enormously in the limit N → ∞.1,2
The quantum fluctuations in the gauge-invariant observables in gauge-invariant
states can be shown to be of order $\frac{\hbar}{N}$. Thus, as long as we restrict to gauge-invariant
observables, in the limit N → ∞ QCD must tend to some classical theory. This
classical theory cannot be Yang–Mills theory, however, since the fluctuations in all
states (not just the gauge-invariant ones) would vanish in that limit. An important
clue to discovering quantum hadron dynamics would be to study its classical limit
first. This is the strategy that worked in the case of two dimensions.
The analog of the field equations of this “classical hadron dynamics” has been
known for a long time — they are the factorized Schwinger–Dyson equations.3 It
is natural to ask if there is a variational principle from which this equation can be
derived. Finding this action principle would be a major step forward in understanding hadronic physics: It would give a formulation of hadron dynamics in terms of
hadronic variables, entirely independent of Yang–Mills theory. A quantization of the theory based on this action principle would recover the corrections of order $\frac{1}{N}$.
Moreover, we would be able to derive approximate solutions of the large-N field
equations by the variational method.
a Since we understand by now how to deal with the quark degrees of freedom in terms of invariants, it is sufficient to consider toy models for pure gauge theory, without vectorial degrees of freedom.
Even after the simplifications of the large N limit, generic matrix models have proved not to be exactly solvable: The factorized Schwinger–Dyson equations are generally intractable. Diagrammatic methods have been pushed to
their limit.1 To make further progress, new approximation methods are needed —
based on algebraic, geometric and probabilistic ideas. Moreover, the entire theory
has to be reformulated in terms of manifestly U(N )-invariant observables. Thus, the
basic symmetry principle that determines the theory has to be something new —
the gauge group acts trivially on these observables. In previous papers6 we had
suggested that the group G of automorphisms of a free algebra — the noncommutative analog of the diffeomorphism group — plays this crucial role in such a
gauge-invariant reformulation of matrix models. In this paper we finally discover
this manifestly gauge-invariant formulation of finite dimensional matrix models. We
find that the configuration space of the theory is a coset space of G — justifying
our earlier anticipation.
If we restrict to observables which are invariant under the action of U(N ), we
should expect that the effective action should contain some kind of entropy. The
situation is analogous to that in statistical mechanics, with the gauge-invariant
observables playing the role of macroscopic variables. However, there are an infinite
number of such observables in our case. Moreover, there is no reason to expect that
the systems we are studying are in thermal equilibrium in any sense. The entropy
should be the logarithm of the volume of the set of all Hermitian matrices that
yield a given set of values for the U(N )-invariant observables. This physical idea,
motivated by Boltzmann’s notions in statistical mechanics, allows us to derive an
explicit formula for entropy.
Our approach continues the point of view in the physics literature on large
N limit of matrix models.8,10,1–3,11,6,16–18 It should not be surprising that our work has close relations to the theory of von Neumann algebras, nowadays called noncommutative probability theory: After all, operators are just matrices of large
dimension. Voiculescu12 – 15 has another, quite remarkable, approach to noncommutative probability distributions.b Our definition in terms of moments and the group
of automorphisms is closer in spirit to the physics literature. Also, the connection of
entropy to the cocycle of the automorphism group is not evident in that approach.
A closer connection between the mathematical and physical literature should enrich
both fields.
Although our primary motivation has been to study toy models of Yang–Mills
theory, the matrix models we study also arise in some other physical problems.
There are several recent reviews that establish these connections, so we make only
some brief comments. See e.g. Ref. 19.
In the language of string theory, what we seek is the action of closed string field
theory. We solve this problem in a “toy model” — for strings on a model of space–
time with a finite number of points. Closed string theory turns out to be a kind
b We thank Prof. Dan Voiculescu for bringing Refs. 13–15 to our attention.
of Wess–Zumino–Witten model on the coset spacec G/SG; we discover an explicit
formula for the classical action, including a term which represents an anomaly — a
nontrivial one-cocycle. Our work complements the other approaches to closed string
field theory.20
Random matrices also appear in another approach to quantum geometry.21 Our
variational method could be useful to approximately solve these matrix models for
Lorentzian geometry.
Quantum chaos is often modeled by matrix models.19 In that context the
focus is often on universal properties that are independent of the particular choice
of the matrix model action (which we call S below).27 These universal properties
are thus completely determined by the entropy. Our discovery of an explicit formula for entropy should help in deriving such universal properties for multi-matrix
models: So far results have been mainly about the one-matrix model. In the current paper our focus is on the joint probability distribution, which is definitely not
universal.
A preliminary version of this paper was presented at the MRST 2001 conference.7
2. Operator-Valued Random Variables
Let $\xi_i$, $i = 1,\ldots,M$, be a collection of operator-valued random variables. We can
assume without any loss of generality that they are Hermitian operators: We
can always split any operator into its “real” and “imaginary” parts. If the ξi
were real-valued, we could have described their joint probability distribution as
a density (more generally a measure) on RM . When ξi are operators that cannot be diagonalized simultaneously, this is not a meaningful notion; we must seek
another definition for the notion of joint probability distribution (jpd).
The quantities of physical interest are the expectation values of functions of
the basic variables (generators) ξi ; the jpd simply provides a rule to calculate these
expectation values. We will think of these functions as polynomials in the generators
(more precisely formal power series). Thus, each random variable will be determined
by a collection of tensors u = {u∅ , ui , ui1 i2 , . . .} which are the coefficients in its
expansion in terms of the generators:
$$u(\xi) = \sum_{m=0}^{\infty} u^{i_1\cdots i_m}\,\xi_{i_1}\cdots\xi_{i_m}\,. \qquad (1)$$
The constant term is just a complex number: The set of indices on it is empty.
If u is a polynomial, all except a finite number of the tensors will be zero. It
is inconvenient to restrict to polynomials: We would not be able to find inverses
of functions, for example, within the world of polynomials. The opposite extreme
would be to impose no restriction at all on the tensors u: Then the random variable
c G is a noncommutative analog of the diffeomorphism group; SG is the subgroup that preserves a noncommutative analog of volume. See below for precise definitions.
is thought of as a formal power series. We pay a price for this: It is no longer possible
to “evaluate” the above infinite series for any particular collection of operators ξi :
The series may not converge. Nevertheless, it makes sense to take linear combinations and to multiply such formal power series:
$$[\alpha u + \beta v]^{i_1\cdots i_m} = \alpha u^{i_1\cdots i_m} + \beta v^{i_1\cdots i_m}\,, \qquad [uv]^{i_1\cdots i_m} = \sum_{n=0}^{m} u^{i_1\cdots i_n}\, v^{i_{n+1}\cdots i_m}\,. \qquad (2)$$
Note that even if there is an infinite number of nonzero components in the tensors u or v, the sum and the product are always given by finite series: There are no issues of convergence in their definition. Thus the set of formal power series forms an associatived algebra; this is the free algebra $T_M$ on the generators $\xi_i$. Note that the multiplication is just the direct product of tensors.
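As a concrete illustration (ours, not the paper's), the operations in Eq. (2) can be realized by storing a formal power series as a map from index tuples (paths) to coefficients; the function names below are our own:

```python
# Sketch of the free algebra T_M: a (truncated) formal power series is a
# dict mapping index tuples I = (i1, ..., im) to coefficients u^I.
# Illustrative only; the names (fps_add, fps_mul) are not from the paper.

def fps_add(u, v, alpha=1, beta=1):
    """[alpha*u + beta*v]^I = alpha u^I + beta v^I, the first half of Eq. (2)."""
    out = {}
    for I in set(u) | set(v):
        c = alpha * u.get(I, 0) + beta * v.get(I, 0)
        if c:
            out[I] = c
    return out

def fps_mul(u, v):
    """[uv]^{i1...im} = sum_n u^{i1...in} v^{i(n+1)...im}: the direct
    (concatenation) product of tensors, the second half of Eq. (2)."""
    out = {}
    for I, a in u.items():
        for J, b in v.items():
            K = I + J  # concatenation of the two paths
            out[K] = out.get(K, 0) + a * b
    return out

# One generator (M = 1): (1 + xi)(1 + xi) = 1 + 2 xi + xi^2.
one_plus_xi = {(): 1, (0,): 1}
square = fps_mul(one_plus_xi, one_plus_xi)
```

With two generators the product is visibly noncommutative: `fps_mul({(0,): 1}, {(1,): 1})` is supported on the path (0, 1), while the opposite order is supported on (1, 0).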
As we noted above, the joint probability distribution of the ξi is just a rule
to calculate the expectation value of an arbitrary function of these generators. If
we restrict to functions of ξi that are polynomials,e such expectation values are
determined by the moments
$$G_{i_1 i_2\cdots i_n} = \langle \xi_{i_1}\xi_{i_2}\cdots\xi_{i_n}\rangle\,. \qquad (3)$$
If the variables commute among each other, these moments are symmetric
tensors. The most general situation that can arise in physics is that the ξi satisfy no
relations at all among each other, (in particular they do not commute) except the
associativity of multiplication. In this case the moments form tensors with no particular symmetry property. All other associative algebras can be obtained from this
“free algebra” by imposing relations (i.e. quotienting by some ideal of the free algebra.) Such relations can be expressed as conditions on the moments. For example,
if $\xi_i\xi_j = R_{ij}^{kl}\,\xi_k\xi_l$, the moment tensors will satisfy conditions
$$G_{i_1\cdots i_a i_{a+1}\cdots i_m} = R_{i_a i_{a+1}}^{kl}\, G_{i_1\cdots i_{a-1}\, kl\, i_{a+2}\cdots i_m} \qquad (4)$$
involving neighboring indices.
2.1. The space of paths
Thus, in our theory, a random variable is a tensor u = (u∅ , ui , ui1 i2 , . . .). We can
regard each sequence of indices I = i1 i2 · · · im as a path in the finite set 1, 2, . . . , M
in which the indices take their values; a tensor is then a function on this space of
paths. Now, given two paths I = i1 · · · im and J = j1 · · · jn , we can concatenate
them: Follow $J$ after traversing $I$: $IJ = i_1\cdots i_m j_1\cdots j_n$. This concatenation operation is associative but not in general commutative; it has the empty path $\emptyset$ as an
identity element. There is however no inverse operation, so the concatenation defines
the structure of a semi-group on the space of paths.
d This algebra is commutative only if the number of generators M is one.
e Not all formal power series may have finite expectation values: The series might diverge. This does not need to worry us: There is a sufficiently large family of “well-behaved” random variables, the polynomials.
We will use upper-case Latin indices to denote sequences of indices, or paths. Repeated indices are to be summed as usual; for example
$$u^I G_I = \sum_{m=0}^{\infty} u^{i_1\cdots i_m}\, G_{i_1\cdots i_m}\,. \qquad (5)$$
Also, define $\delta_I^{I_1 I_2}$ to be one if the paths $I_1$ and $I_2$ concatenate to give the path $I$, and zero otherwise. $\bar I = i_m i_{m-1}\cdots i_1$ denotes the reverse path. Now we can see that
the direct product on tensors is just the multiplication induced by concatenation
on the space of paths:
$$[uv]^I = \delta_I^{I_1 I_2}\, u^{I_1} v^{I_2}\,. \qquad (6)$$
A more refined notion of a path emerges if we regard the indices i as labelling
the edges of a directed graph; a sequence I = i1 i2 i3 · · · in is a path only when i1
is incident to $i_2$, $i_2$ incident to $i_3$, etc. The space of paths associated with a directed graph is still a semigroup; the associative algebra induced by concatenation is just the algebra of functions on this semigroup. Lattice gauge theory can
be interpreted as a matrix model (of unitary matrices) on a directed graphf that
approximates space–time; for example the cubic lattice.
The free algebra arises from the graph with one vertex, with every edge connecting that vertex to itself; that is why every edge is incident to every other edge.
This is the case we will mostly consider in this paper. Other cases can be developed
analogously.
2.2. Noncommutative probability distributions
We define the “noncommutative joint probability distribution” of the variables ξi
to be a collection of tensors G∅ , Gi , Gi1 i2 , . . . satisfying the normalization condition
$$\langle 1\rangle = 1\,, \qquad (7)$$
the hermiticity condition
$$G^{*}_{i_1 i_2\cdots i_m} = G_{i_m i_{m-1}\cdots i_1}\,, \quad\text{i.e.}\quad G^{*}_{I} = G_{\bar I}\,, \qquad (8)$$
as well as the positivity condition
$$\sum_{m,n=0}^{\infty} G_{i_1\cdots i_m j_1\cdots j_n}\, u^{i_m\cdots i_1\,*}\, u^{j_1\cdots j_n} \ge 0\,, \quad\text{i.e.}\quad G_{\bar I J}\, u^{I*} u^{J} \ge 0\,, \qquad (9)$$
for any polynomial $u(\xi) = u^{\emptyset} + u^{i}\xi_i + u^{i_1 i_2}\xi_{i_1}\xi_{i_2} + \cdots$. Denote by $P_M$ the space of
such noncommutative probability distributions. We define the expectation value of a polynomial $v(\xi) = \sum_m v^{i_1 i_2\cdots i_m}\,\xi_{i_1}\cdots\xi_{i_m}$ to be
$$\langle v(\xi)\rangle = \sum_m v^{i_1 i_2\cdots i_m}\, G_{i_1 i_2\cdots i_m}\,; \quad\text{i.e.}\quad \langle v(\xi)\rangle = v^I G_I\,. \qquad (10)$$
f Directed graphs approximating space–time are called “lattices” in physics terminology.
If the variables $\xi_i$ are commutative with joint pdf $p(x_1,\ldots,x_M)\,d^M x$, then
$$G_{i_1 i_2\cdots i_n} = \int x_{i_1}\cdots x_{i_n}\, p(x_1,\ldots,x_M)\, d^M x\,; \qquad (11)$$
it is clear then that the above conditions on G follow from the usual normalization, hermiticity and positivity conditions on $p(x)\,d^M x$. For example, the contravariant
tensors $u^{\emptyset}, u^{i}, u^{i_1 i_2}, \ldots$ define a polynomial
$$u(\xi) = \sum_m u^{i_1 i_2\cdots i_m}\,\xi_{i_1}\cdots\xi_{i_m} \qquad (12)$$
and the quantity on the lhs of the positivity condition above is just the expectation value of $u^{\dagger}(\xi)\,u(\xi)$. In this case, the moment tensors will be real and symmetric.
Up to some technical assumptions, the pdf p(x)dM x is determined uniquely by the
collection of moments Gi1 ···in . (In the case of a single variable, the reconstruction
of the pdf from the moments is the “moment problem” of classical analysis; it was solved at varying levels of generality by Markov, Chebyshev and others. See the excellent book by Akhiezer.22)
In the noncommutative case the pdf no longer makes sense, but the moments still allow us to calculate expectation values of arbitrary polynomials. This motivates our definition.
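In the one-variable commutative case the positivity condition (9) becomes positive semidefiniteness of the Hankel matrix of moments. A small numerical check of this for the standard Gaussian (our own illustration, assuming numpy is available):

```python
# Check (our illustration) that the Hankel matrix H_{mn} = G_{m+n} built
# from the moments of the standard Gaussian is positive definite, as the
# positivity condition (9) requires in the commutative one-variable case.
import numpy as np

def gaussian_moment(n):
    """n-th moment of the standard Gaussian: (n - 1)!! for even n, else 0."""
    if n % 2:
        return 0
    out = 1
    for k in range(1, n, 2):
        out *= k
    return out

size = 5
H = np.array([[gaussian_moment(m + n) for n in range(size)]
              for m in range(size)], dtype=float)
eigenvalues = np.linalg.eigvalsh(H)  # all strictly positive
```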
In the cases of interest to us in this paper, GI is cyclically symmetric. This
corresponds to closed string theory or to glueball states of QCD. Open strings, or
mesons would require us to study moments which are not cyclically symmetric. The
theory adapts with small changes; but we will not discuss these cases here.
3. Large N Matrix Models
The basic examples of such operator-valued random variables are random matrices
of large size.
A matrix model is a theory of random variables which are N × N Hermitian
matrices $A_i$, $i = 1,\ldots,M$. The matrix elements of each of the $A_i$ are complex-valued random variables, with the joint probability density function on $\mathbf{R}^{N^2 M}$
$$\frac{1}{Z(S)}\, e^{N\,\mathrm{tr}\, S(A)}\, d^{N^2 M}A\,, \qquad (13)$$
where $S(A) = \sum_n S^{i_1 i_2\cdots i_n} A_{i_1}\cdots A_{i_n}$ is a polynomial called the action. Also, $Z(S)$ is determined by the normalization condition:
$$Z(S) = \int e^{N\,\mathrm{tr}\, S(A)}\, d^{N^2 M}A\,. \qquad (14)$$
The expectation value of any function of the random variables is defined to be
$$\langle f(A)\rangle = \int f(A)\,\frac{1}{Z(S)}\, e^{N\,\mathrm{tr}\, S(A)}\, d^{N^2 M}A\,. \qquad (15)$$
The tensors S i1 ···in may be chosen to be cyclically symmetric. We assume they are
such that the integrals above converge: S(A) → −∞ as |A| → ∞. The interesting
observables are invariant under changes of bases. We can regard the indices i =
1, . . . , M as labelling the links in some graph; then a sequence i1 · · · in is a path
in this graph. Then $\Phi_{i_1\cdots i_n}(A) = \frac{1}{N}\,\mathrm{tr}[A_{i_1}\cdots A_{i_n}]$ is a random variable depending on a loop in this graph. For the moment we will consider every link in the graph to be incident with every other link, so that all sequences $i_1\cdots i_n$ are allowed loops. In
this case the loop variables are invariant under simultaneous changes of basis in the
basic variables:
$$A_i \to g A_i g^{\dagger}\,, \qquad g \in \mathrm{U}(N)\,. \qquad (16)$$
If we choose some other graph, a sequence of indices is a closed loop only if the
edge i2 is adjacent to i1 , i3 is adjacent to i2 and so on. The invariance group will
be larger; as a result, S I is nonzero only for closed loops I.
Given $S$, the moments of the loop variables satisfy the Schwinger–Dyson equations
$$S^{J_1 i J_2}\,\langle\Phi_{J_1 I J_2}\rangle + \delta_I^{I_1 i I_2}\,\langle\Phi_{I_1}\Phi_{I_2}\rangle = 0\,. \qquad (17)$$
(17)
This equation can be derived by considering the infinitesimal change of variables
$$[\delta_v A]_i = v_i^I\, A_I \qquad (18)$$
on the integral
$$Z(S) = \int e^{N\,\mathrm{tr}\, S(A)}\, dA\,. \qquad (19)$$
The variation of a product of A’s is easy to compute:
$$[\delta_v A]_J = v_i^I\,\delta_J^{J_1 i J_2}\, A_{J_1} A_I A_{J_2}\,. \qquad (20)$$
The first term in the Schwinger–Dyson equation follows from the variation of the
action under this change. The second term is more subtle as it is the change of the
measure of integration — the divergence of the vector field vi (A):
$$\delta_v(dA) = v_i^I\,\frac{\partial A_{I\,b}^{a}}{\partial A_{i\,b}^{a}}\, dA\,, \qquad \frac{\partial A_{I\,b}^{a}}{\partial A_{i\,b}^{a}} = \delta_I^{I_1 i I_2}\,\mathrm{tr}\, A_{I_1}\,\mathrm{tr}\, A_{I_2}\,. \qquad (21)$$
Returning to the Schwinger–Dyson equations, we see that they are not a closed
system of equations: The expectation value of the loop variables is related to that
of the product of two loop variables. However, there is a remarkable simplification
as N → ∞.
In the planar limit $N \to \infty$ keeping $S^{i_1\cdots i_n}$ fixed, the loop variables have no fluctuations:g
$$\langle f_1(\Phi)\, f_2(\Phi)\rangle = \langle f_1(\Phi)\rangle\,\langle f_2(\Phi)\rangle + O\!\left(\frac{1}{N^2}\right), \qquad (22)$$
g This is called the planar limit, since in perturbation theory, only Feynman diagrams of planar
topology contribute.1 In the matrix model of random surface theory, one is interested in another
large N limit, the double scaling limit. Here the coupling constants S I have to vary as N → ∞
and tend to certain critical values at a specified rate. The fluctuations are not small in the double
scaling limit.
where f1 (Φ), f2 (Φ) are polynomials of the loop variables. This means that the probability distribution of the loop variables is entirely determined by the expectation
values (moments)
$$G_{i_1\cdots i_n} = \lim_{N\to\infty}\frac{1}{N}\,\big\langle \mathrm{tr}\, A_{i_1}\cdots A_{i_n}\big\rangle\,. \qquad (23)$$
Thus we get the factorized Schwinger–Dyson equations:
$$S^{J_1 i J_2}\, G_{J_1 I J_2} + \delta_I^{I_1 i I_2}\, G_{I_1} G_{I_2} = 0\,. \qquad (24)$$
Since the fluctuations in the loop variables vanish in the planar limit, there must be some effective “classical theory” for these variables, of which these are the equations of motion. We now seek a variational principle from which these equations follow.
Matrix models arise as toy models of Yang–Mills theory as well as string theory.
The cyclically symmetric sequence of indices $I = i_1\cdots i_n$ should be interpreted as a closed curve in space–time. The observable $\Phi_I$ corresponds to the Wilson loop in Yang–Mills theory and to the closed string field.
To summarize, the most important examples of noncommutative probability
distributions are large N matrix models:
$$G_{i_1\cdots i_n} = \lim_{N\to\infty}\frac{1}{N}\int \mathrm{tr}[A_{i_1}\cdots A_{i_n}]\; e^{N\,\mathrm{tr}\, S^{J} A_J}\;\frac{dA}{Z(S)}\,. \qquad (25)$$
3.1. Example: The Wigner distribution
The most ubiquitous of all classical probability distributions is the Gaussian; the noncommutative analog of this is the Wigner distribution8 (also called the semicircular distribution).
We begin with the simplest case, in which we have just one generator ξ for our algebra of
random variables. The algebra of random variables is then necessarily commutative
and can be identified with the algebra of formal power series in one variable. The
simplest example of a matrix-valued random variable is this: ξ is an N ×N Hermitian
matrix whose entries are mutually independent random variables of zero mean and
unit variance. More precisely, the average of $\xi^n$ is
$$\langle \xi^n\rangle = \frac{1}{N}\int \mathrm{tr}\,\xi^n\; e^{-\frac{N}{2}\,\mathrm{tr}\,\xi^{\dagger}\xi}\;\frac{d^{N^2}\xi}{Z_N}\,. \qquad (26)$$
The normalization constant $Z_N$ is chosen such that $\langle 1\rangle = 1$.
The Wigner distribution with unit covariance is the limit as $N \to \infty$:
$$\Gamma_n = \lim_{N\to\infty}\frac{1}{N}\int \mathrm{tr}\,\xi^n\; e^{-\frac{N}{2}\,\mathrm{tr}\,\xi^{\dagger}\xi}\;\frac{d^{N^2}\xi}{Z_N}\,. \qquad (27)$$
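As a numerical illustration (ours, not from the paper), one can sample a single large Hermitian matrix from the Gaussian measure in Eq. (26) and check that the low moments approach the limiting Wigner values. A sketch assuming numpy and the convention that the entries of ξ have variance 1/N:

```python
# Monte Carlo sketch of Eq. (27): sample one large GUE-type Hermitian
# matrix with entry variance 1/N and compare (1/N) tr xi^n with the
# limiting Wigner moments. Our illustration; the paper only defines the limit.
import numpy as np

rng = np.random.default_rng(0)
N = 500

# Hermitize a complex Gaussian matrix; entries of xi then have variance 1/N.
X = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
xi = (X + X.conj().T) / (2 * np.sqrt(N))

def moment(n):
    """(1/N) tr xi^n, whose N -> infinity limit is Gamma_n."""
    return (np.trace(np.linalg.matrix_power(xi, n)) / N).real

g2, g4, g6 = moment(2), moment(4), moment(6)
```

At finite N the fluctuations are of order 1/N^2, so already at N = 500 the even moments are close to the Catalan values 1, 2, 5.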
The factorized Schwinger–Dyson equations reduce to the following recursion relations for the moments:
$$\Gamma_{k+1} = \sum_{m+n=k-1}\Gamma_m\,\Gamma_n\,. \qquad (28)$$
Clearly the odd moments vanish and $\Gamma_0 = 1$. Set $\Gamma_{2k} = C_k$. Then
$$C_{k+1} = \sum_{m+n=k} C_m\, C_n\,. \qquad (29)$$
The solution of the recursion relations gives the moments in terms of the Catalan numbers
$$\Gamma_{2k} = C_k = \frac{1}{k+1}\binom{2k}{k}\,. \qquad (30)$$
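The recursion (29) is easy to check against the closed form (30); a short sketch (our own):

```python
# Check (our illustration) that C_{k+1} = sum_{m+n=k} C_m C_n with C_0 = 1
# reproduces the closed form C_k = binom(2k, k) / (k + 1) of Eq. (30).
from math import comb

def catalan_by_recursion(kmax):
    C = [1]  # C_0 = 1
    for k in range(kmax):
        C.append(sum(C[m] * C[k - m] for m in range(k + 1)))
    return C

C = catalan_by_recursion(10)
closed_form = [comb(2 * k, k) // (k + 1) for k in range(11)]
```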
3.2. The multivariable Wigner distribution
Let $K^{ij}$ be a positive matrix; i.e., such that
$$K^{ij}\, u_i^* u_j \ge 0\,, \qquad (31)$$
for all vectors $u_i$, with zero occurring only for $u = 0$. Then the moments of the
Wigner distribution on the generators ξi , i = 1, 2, . . . , M are given by
$$\Gamma_{i_1\cdots i_n} = \lim_{N\to\infty}\frac{1}{N}\int \mathrm{tr}\,[\xi_{i_1}\cdots\xi_{i_n}]\; e^{-\frac{N}{2}K^{ij}\,\mathrm{tr}\,\xi_i\xi_j}\;\frac{d^{N^2 M}\xi}{Z_N}\,. \qquad (32)$$
(Again, $Z_N$ is chosen such that $\langle 1\rangle = 1$.) It is obvious that the moments of odd order vanish; also, that the second moment is
$$\Gamma_{ij} = \left(K^{-1}\right)_{ij}\,. \qquad (33)$$
The higher order moments are given by the recursion relations:
$$\Gamma_{iI} = \Gamma_{ij}\,\delta_I^{I_1 j I_2}\,\Gamma_{I_1}\Gamma_{I_2}\,. \qquad (34)$$
Note that each term on the rhs corresponds to a partition of the original path I
into subsequences that preserve the order. By repeated application of the recursion
the rhs can be written in terms of a sum over all such “non-crossing partitions” into
pairs. The Catalan number Ck is simply the number of such non-crossing partitions
into pairs of a sequence of length 2k.
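The recursion (34) can be run directly on a computer; the sketch below (ours) pairs the first letter of a path with each later letter through the covariance and sums over the resulting splittings, which is exactly the sum over non-crossing pair partitions:

```python
# Sketch (ours) of the recursion (34): Gamma_{iI} is obtained by pairing
# the first index i with some later index j through Gamma_{ij} = (K^{-1})_{ij}
# and multiplying the moments of the two enclosed subpaths.

def wigner_moment(I, cov):
    """Moment Gamma_I of the multivariable Wigner distribution;
    I is a tuple of generator indices, cov[i][j] is the covariance."""
    if len(I) == 0:
        return 1
    if len(I) % 2:
        return 0  # odd moments vanish
    i, rest = I[0], I[1:]
    total = 0
    for pos, j in enumerate(rest):  # split rest = I1 + (j,) + I2
        I1, I2 = rest[:pos], rest[pos + 1:]
        total += cov[i][j] * wigner_moment(I1, cov) * wigner_moment(I2, cov)
    return total

unit = [[1, 0], [0, 1]]  # unit covariance with M = 2 generators
```

For one generator this reproduces the Catalan numbers, and with unit covariance the “crossing” word ξ0ξ1ξ0ξ1 has vanishing moment while ξ0ξ0ξ1ξ1 does not.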
Our strategy for studying more general probability distributions will be to transform them to the Wigner distribution by a nonlinear change of variables. Hence the
group of such transformations is of importance to us. In the next section we will
study this group.
4. Automorphisms of the Free Algebra
The free algebra generated by ξi remains unchanged (is isomorphic) if we change
to a new set of generators,
$$\phi(\xi)_i = \sum_{m=0}^{\infty}\phi_i^{i_1\cdots i_m}\,\xi_{i_1}\cdots\xi_{i_m}\,, \qquad (35)$$
provided that this transformation is invertible. We will often abbreviate $\xi_I = \xi_{i_1}\cdots\xi_{i_m}$, so that the above equation reads $\phi(\xi)_i = \phi_i^{I}\xi_I$. The composition of two transformations $\psi$ and $\phi$ can be seen to be
$$[(\psi\circ\phi)]_i^K = \sum_{n\le |K|}\delta^{K}_{P_1\cdots P_n}\,\psi_i^{j_1\cdots j_n}\,\phi_{j_1}^{P_1}\cdots\phi_{j_n}^{P_n}\,. \qquad (36)$$
Note that the composition involves only finite series even when each of the series φ
and ψ is infinite.
The inverse, $(\phi^{-1})_i(\xi) \equiv \chi_i(\xi)$, is determined by the conditions
$$[(\chi\circ\phi)_i]^K = \delta^{K}_{P_1\cdots P_n}\,\chi_i^{j_1\cdots j_n}\,\phi_{j_1}^{P_1}\cdots\phi_{j_n}^{P_n} = \delta_i^K\,. \qquad (37)$$
They can be solved recursively for $\chi_i^J$:
$$\chi_i^j = (\phi^{-1})_i^j\,, \qquad \chi_i^{j_1 j_2} = -\chi_{k_1}^{j_1}\,\chi_{k_2}^{j_2}\,\chi_i^{l_1}\,\phi_{l_1}^{k_1 k_2}\,, \quad\ldots$$
$$\chi_i^{j_1\cdots j_n} = -\sum_{m<n}\delta^{P_1\cdots P_m}_{k_1\cdots k_n}\,\chi_{k_1}^{j_1}\cdots\chi_{k_n}^{j_n}\;\chi_i^{l_1\cdots l_m}\,\phi_{l_1}^{P_1}\cdots\phi_{l_m}^{P_m}\,. \qquad (38)$$
Thus an automorphism $\phi$ has an inverse as a formal power series if and only if the linear term $\phi_i^j\xi_j$ has an inverse; i.e., if the determinant of the matrix $\phi_i^j$ is nonzero. The set of such automorphisms forms a group $G_M = \mathrm{Aut}\, T_M$. This group plays a crucial role in our theory.
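In the one-generator case M = 1 the recursive inversion (38) reduces to the classical reversion of a power series, which can be sketched as follows (our own illustration, truncating at a fixed order):

```python
# Sketch (ours) of the inversion (37)-(38) for M = 1. A series
# phi(x) = sum_m phi[m] x^m is a coefficient list with phi[0] = 0;
# everything is truncated at x^ORDER.

ORDER = 6

def compose(psi, phi):
    """Coefficients of psi(phi(x)), truncated at x^ORDER (cf. Eq. (36))."""
    out = [0] * (ORDER + 1)
    power = [1] + [0] * ORDER  # phi(x)^0
    for n in range(ORDER + 1):
        for k, c in enumerate(power):
            out[k] += psi[n] * c
        # multiply power by phi, truncating at x^ORDER
        power = [sum(power[j] * phi[k - j] for j in range(k + 1))
                 for k in range(ORDER + 1)]
    return out

def invert(phi):
    """Reversion chi with chi(phi(x)) = x, solved order by order as in (38);
    it exists because the linear coefficient phi[1] is invertible."""
    chi = [0, 1 / phi[1]]
    for n in range(2, ORDER + 1):
        err = compose(chi + [0] * (ORDER + 1 - len(chi)), phi)[n]
        chi.append(-err / phi[1] ** n)
    return chi

phi = [0, 1, 1, 0, 0, 0, 0]  # phi(x) = x + x^2
chi = invert(phi)
identity = compose(chi, phi)  # x, up to order 6
```

The inverse of x + x^2 is x - x^2 + 2x^3 - 5x^4 + ..., with the Catalan numbers reappearing as coefficients.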
4.1. Transformation of moments under GM
Given an automorphism
$$[\phi(\xi)]_i = \phi_i^{I}\,\xi_I \qquad (39)$$
and a probability distribution with moments $G_I$, the expectation value of $[\phi(\xi)]_{i_1}\cdots[\phi(\xi)]_{i_n}$ is
$$[\phi_* G]_{i_1\cdots i_n} = \phi_{i_1}^{J_1}\cdots\phi_{i_n}^{J_n}\, G_{J_1\cdots J_n}\,, \qquad (40)$$
the sum running over the paths $J_1,\ldots,J_n$.
We might regard these as the moments of some new probability distribution. There is a technical problem, however: The sums may not converge, and the group $G_M$ of formal power series includes many transformations that may not map positive tensors to positive tensors: They may not preserve the “measure class” of the joint probability distribution $G_I$.
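When the sums do converge, e.g. for a polynomial automorphism of a single variable, the push-forward (40) can be computed directly; a sketch (our own illustration):

```python
# Sketch (ours) of the push-forward (40) for M = 1: the new moments are
# the old expectations of powers of phi(xi).

def poly_mul(p, q):
    """Multiply polynomials in xi given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pushforward_moment(n, phi, moments):
    """<phi(xi)^n> = sum_k (phi^n)_k G_k, Eq. (40) for one variable."""
    p = [1]
    for _ in range(n):
        p = poly_mul(p, phi)
    return sum(c * moments[k] for k, c in enumerate(p))

# Unit Wigner moments Gamma_n (Catalan numbers at even n, zero at odd n).
wigner = [1, 0, 1, 0, 2, 0, 5, 0, 14, 0, 42, 0, 132]
phi = [0, 1, 1]  # phi(xi) = xi + xi^2, a sample nonlinear change of variable

g2 = pushforward_moment(2, phi, wigner)  # <(xi + xi^2)^2> = Gamma_2 + Gamma_4 = 3
```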
Given some fixed jpd (say the Wigner distribution with unit covariance), there
is a subgroup G̃M that maps it to probability distributions; this is the open subset
of $G_M$ defined by the inequalities:h
$$\tilde G_M = \{\phi \in G_M \mid [\phi_* G]_{\bar I_1 I_2}\, u^{I_1 *}\, u^{I_2} \ge 0\} \qquad (41)$$
h $\bar I$ denotes the reverse of the sequence: $I = i_1 i_2\cdots i_n$, $\bar I = i_n i_{n-1}\cdots i_1$.
for all polynomials u. Thus in the neighborhood of the identity the two groups GM
and G̃M are the same; in particular, they have the same Lie algebra. The point is
that GM and G̃M are Lie groups under different topologies: The series φ ∈ G̃M have
to satisfy convergence conditions implied by the above inequalities.
It is plausible that any probability distribution can be obtained from a fixed
one by some automorphism; indeed there should be many such automorphisms. As
a simple example, the Wigner distribution with covariance $G_{ij}$ can be obtained from the one with covariance $\delta_{ij}$ by the linear automorphism $\phi_i^j\xi_j$, provided that $G_{ij} = \sum_k \phi_i^k\phi_j^k$. Thus the space of Wigner distributions (the space of positive covariance matrices) is the coset space $\mathrm{GL}_M/\mathrm{O}_M$.
In the same spirit, we will regard the space of all probability distributions as the coset space $\tilde G_M/SG_M$ of the group of automorphisms, where $SG_M$ is the subgroup of automorphisms that leave the Wigner distribution of unit covariance invariant. We
can parametrize an arbitrary distribution with moments GI by the transformation
that relates it to the unit Wigner distribution with moments ΓI :
$$G_{i_1\cdots i_n} = \phi_{i_1}^{J_1}\cdots\phi_{i_n}^{J_n}\,\Gamma_{J_1\cdots J_n}\,. \qquad (42)$$
Indeed, we will see below that all moments that differ infinitesimally from a given
one are obtained by such infinitesimal transformations. To rigorously justify our
point of view, we must prove that (in an appropriate topology) the Lie group G̃ is
the exponential of this Lie algebra. We will not address this somewhat technical
issue in this paper. In the next section we describe the Lie algebra in some more detail.
4.2. The Lie algebra of derivations
An automorphism that differs only infinitesimally from the identity is a derivation
of the tensor algebra. Any derivation of the free algebra is determined by its effect
on the generators $\xi_i$. They can be written as linear combinations $v_i^I L^i_I$, where the basis elements $L^i_I$ are defined by
$$[L^i_I\,\xi]_j = \delta_j^i\,\xi_I\,. \qquad (43)$$
The derivations form a Lie algebra with commutation relations
$$[L^i_I, L^j_J] = \delta_J^{J_1 i J_2}\, L^j_{J_1 I J_2} - \delta_I^{I_1 j I_2}\, L^i_{I_1 J I_2}\,. \qquad (44)$$
The change of the moments under such a derivation is
$$[L^i_I\, G]_J = \delta_J^{J_1 i J_2}\, G_{J_1 I J_2}\,. \qquad (45)$$
We already encountered these infinitesimal variations in the derivation of the
Schwinger–Dyson equation.
Let us consider the infinitesimal neighborhood of some reference distribution
ΓI — for example the unit Wigner distribution. We will assume that ΓI satisfies
the strict positivity condition, $\Gamma_{\bar I J}\, u^{I*} u^{J} > 0$; i.e., that this quadratic form vanishes only when the polynomial $u$ is identically zero. (This condition is satisfied by the unit Wigner distribution.) It is the analog of the condition in classical probability theory that the probability distribution does not vanish in some neighborhood of the origin. Then, the Hankel matrix $H_{I;J} = G_{\bar I J}$ is invertible on polynomials: It is an inner product.
The infinitesimal change of moments under a derivation $v = v_i^I L^i_I$ is
$$[L_v\Gamma]_{k_1\cdots k_n} = v_{k_1}^{I}\,\Gamma_{I k_2\cdots k_n} + \text{cyclic permutations in } (k_1\cdots k_n)\,. \qquad (46)$$
Now, it is clear that the addition of an arbitrary infinitesimal cyclically symmetric
tensor $g_I$ to $\Gamma_I$ can be achieved by some derivation: We just find some tensor $w_I$ of which $g_I$ is the cyclically symmetric part and put $w_{k k_1\cdots k_n} = v_k^J\,\Gamma_{JK}$, where $K = k_1\cdots k_n$. Since the Hankel matrix is invertible, we can always find such a $v$. Thus an arbitrary infinitesimal change in $\Gamma_I$ can be achieved by some $v_i^I$.
Indeed there will be many such derivations, differing by those that leave ΓI
invariant. The isotropy Lie algebra of ΓI is defined by
$$v_{k_1}^{I}\,\Gamma_{I k_2\cdots k_n} + \text{cyclic permutations in } (k_1\cdots k_n) = 0\,. \qquad (47)$$
We can simplify this condition for the choice where Γ_I is the Wigner distribution.
For n = 1 this is just
\[ v_k^I\, \Gamma_I = 0 \,; \tag{48} \]
for n = 2 we get, using the recursion relation for Wigner moments,
\[ v_{k_1}^{IjL}\, \Gamma_{k_2 j}\, \Gamma_I \Gamma_L + (k_1 \leftrightarrow k_2) = 0 \,. \tag{49} \]
In general, using the iterations of the Wigner recursion relation
\[ \Gamma_{I k_1 k_2} = \Gamma_{k_1 j_2} \Gamma_{k_2 j_1}\, \delta_I^{I_1 j_1 I_2 j_2 I_3}\, \Gamma_{I_1} \Gamma_{I_2} \Gamma_{I_3} \tag{50} \]
etc., we get
\[ \Gamma_{k_1 j_{n-1}} \Gamma_{k_2 j_{n-2}} \cdots \Gamma_{k_{n-1} j_1}\, v_{k_n}^{I_1 j_1 I_2 j_2 \cdots j_{n-1} I_n}\, \Gamma_{I_1} \cdots \Gamma_{I_n} + \text{cyclic permutations in } (k_1\cdots k_n) = 0 \,. \tag{51} \]
In other words, we should lower a certain number of indices on viI using the second
moment and contract away the rest; the resulting tensor should not have a cyclically
symmetric part. It would be interesting to find the solutions of these conditions more
explicitly. We will not need this for our present work.
5. The Action Principle and Cohomology
We seek an action Ω(G) such that its variation under an infinitesimal automorphism
[L^i_I G]_J = δ_J^{J_1 i J_2} G_{J_1 I J_2} is the factorized SD equation:
\[ L^i_I\, \Omega(G) = S^{J_1 i J_2}\, G_{J_1 I J_2} + \delta_I^{I_1 i I_2}\, G_{I_1} G_{I_2} \,. \tag{52} \]
It is easy to identify a quantity that will give the first term:
\[ L^i_I\, [S^J G_J] = S^{J_1 i J_2}\, G_{J_1 I J_2} \,. \tag{53} \]
This term is simply the expectation value of the matrix model action. So
\[ \Omega(G) = S^J G_J + \chi(G) \tag{54} \]
with
\[ L^i_I\, \chi(G) = \eta^i_I \equiv \delta_I^{I_1 i I_2}\, G_{I_1} G_{I_2} \,. \tag{55} \]
This term arises from the change in the measure of integration over matrices; hence
it is a kind of “anomaly.”
Now, in order for such a function χ(G) to exist, the anomaly η^i_I must satisfy an
integrability condition L^i_I(L^j_J χ) − L^j_J(L^i_I χ) = [L^i_I, L^j_J]χ; i.e.
\[ L^i_I\, \eta^j_J - L^j_J\, \eta^i_I - \delta_J^{J_1 i J_2}\, \eta^j_{J_1 I J_2} + \delta_I^{I_1 j I_2}\, \eta^i_{I_1 J I_2} = 0 \,. \tag{56} \]
A straightforward but tedious calculation shows that this is indeed satisfied.
Nevertheless, we were not able to find a formal power series in the moments
satisfying Eq. (55), even after many attempts.
Then we realized that, even in the case of a single matrix (treated in the
appendix), there is no solution of the above equation! The condition above is in
fact the statement that η^i_I(G) is a one-cocycle of the Lie algebra cohomology of G_M,
valued in the space of formal power series in G. (See the appendix.^24) Although
η itself is a quadratic polynomial in the G, there is no polynomial or even formal
power series of which it is a variation: It represents a nontrivial element of the
cohomology of G_M twisted by its representation on the space of formal power series
in the moments.
We need to look for χ in some larger class of functions on the space P_M of
probability distributions. Now P_M = G_M/SG_M, a coset space of the group of
automorphisms. We can parametrize the moments in terms of the automorphism
that brings them to a standard one: G_I = [φ_* Γ]_I. So another way of thinking of
functions on P_M is as functions on G_M invariant under the action^i of SG_M.
Thus, instead of power series in G_I, we will have power series in the coefficients φ_i^I
determining an automorphism. In order to stand in for a function on P_M, such a
power series would have to be invariant under the subgroup SG_M.
Clearly, any power series in G_I can be expressed as a power series in the φ_i^I:
Simply substitute [φ_* Γ]_I for G_I. But there could be a power series in φ that is
invariant under SG_M and still is not expressible as a power series in G. This subtle
distinction is the origin of the cohomology we are discussing.^j We can now guess
that the quantity we seek is a function of this type on P_M.
A hint is also provided by the origin of the term ηIi (G) in the Schwinger–Dyson
equations. It arises from the change of the measure of integration over matrices
^i Such an idea was used successfully to solve a similar problem on cohomologies.^25
^j We give a simple example in the appendix.
under an infinitesimal, but nonlinear, change of variables. Thus, it should be useful
to study this change of measure under a finite nonlinear change of variables — an
automorphism.
More precisely, let φ(A) be a nonlinear transformation
\[ A_i \mapsto \phi(A)_i = \sum_{n=1}^{\infty} \phi_i^I A_I \]
on the space of Hermitian N × N matrices. Also, let σ(φ, A) = (1/N²) log det J(φ, A),
where J(φ, A) is the Jacobian matrix of φ.
By the multiplicative property of Jacobians, we have
\[ \sigma(\phi_1\phi_2, A) = \sigma(\phi_1, \phi_2(A)) + \sigma(\phi_2, A) \,. \tag{57} \]
For example, σ(φ, A) = log det φ_0 if [φ(ξ)]_i = φ_{0i}^j ξ_j is a linear transformation:
The Jacobian matrix is then a constant. It is convenient to factor out this linear
transformation and write
\[ [\phi(A)]_i = \phi_{0i}^j\, \tilde\phi(A)_j \,, \qquad \tilde\phi(A)_i = A_i + \sum_{n=2}^{\infty} \tilde\phi_i^{\,i_1\cdots i_n}\, A_{i_1}\cdots A_{i_n} \,. \tag{58} \]
We will show in the appendix that σ(φ, A) can be written in terms of the traces
Φ_I = (1/N) tr A_I:
\[ \sigma(\phi, A) = \log\det\phi_0 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\, \tilde\phi_{i_1}^{\,K_1 i_2 L_1}\, \tilde\phi_{i_2}^{\,K_2 i_3 L_2} \cdots \tilde\phi_{i_n}^{\,K_n i_1 L_n}\, \Phi_{K_1\cdots K_n}\, \Phi_{L_n\cdots L_1} \,. \]
Thus, the expectation value of σ(φ, A) with respect to some distribution can be
expressed in terms of its moments G_I = ⟨Φ_I⟩ in the large N limit:
\[ \langle\sigma(\phi, A)\rangle = c(\phi, G) = \log\det\phi_0 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\, \tilde\phi_{i_1}^{\,J_1 i_2 K_1}\, \tilde\phi_{i_2}^{\,J_2 i_3 K_2} \cdots \tilde\phi_{i_n}^{\,J_n i_1 K_n}\, G_{J_1\cdots J_n}\, G_{K_n\cdots K_1} \,. \]
The above equation for σ(φ_1 φ_2, A) then shows that the expectation value c(φ, G)
satisfies the cocycle condition:
\[ c(\phi_1\phi_2, G) = c(\phi_1, \phi_{2*}(G)) + c(\phi_2, G) \,. \tag{59} \]
Moreover, if we now restrict to the case of infinitesimal transformations, φ(ξ)_i =
ξ_i + v_i^I ξ_I, this c(φ, G) reduces to η:
\[ c(\phi, G) = v_i^I\, \eta^i_I(G) + O(v^2) \,. \tag{60} \]
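For a single random variable (M = 1, treated in Appendix B), the nonlinear part of c(φ, G) is the double integral of log[(φ(x) − φ(y))/(x − y)] against the distribution, and the cocycle condition (59) can be verified numerically. The sketch below is our own illustration, not from the paper: the distribution is represented by sample points, and `phi1`, `phi2` are hypothetical maps chosen to be increasing on the sampled range so the logarithms are real.

```python
import numpy as np

def c(phi, xs):
    # Discrete M = 1 analog of c(phi, G): the average of
    # log[(phi(x) - phi(y))/(x - y)] over distinct pairs of sample points.
    x, y = np.meshgrid(xs, xs)
    m = ~np.eye(len(xs), dtype=bool)
    return np.log((phi(x[m]) - phi(y[m])) / (x[m] - y[m])).mean()

phi2 = lambda x: x + 0.05 * x**3       # hypothetical automorphisms, both
phi1 = lambda x: x + 0.10 * x**2       # increasing on the sampled range
composed = lambda x: phi1(phi2(x))

xs = np.linspace(-1.9, 1.9, 201)       # stand-in samples of a reference distribution
lhs = c(composed, xs)                  # c(phi1 phi2, G)
rhs = c(phi1, phi2(xs)) + c(phi2, xs)  # c(phi1, phi2* G) + c(phi2, G)
```

The two sides agree to machine precision because the defining identity holds pairwise under the logarithm.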
Let us look at it another way: Let G = φ_*Γ for some reference probability distribution Γ. Then the cocycle condition gives
\[ c(\phi_1, G) = c(\phi_1\phi, \Gamma) - c(\phi, \Gamma) \,. \tag{61} \]
Choosing φ_1 to be infinitesimal then gives
\[ \eta^i_I(G) = L^i_I\, c(\phi, \Gamma) \,. \tag{62} \]
Thus we have solved our problem: χ(φ) = c(φ, Γ) is a function on G_M whose
variation is η. But is it really a function on P_M? In other words, is χ(φ) invariant
under the right action of SG_M?
If φ_{2*}Γ = Γ, the cocycle condition reduces to
\[ c(\phi\phi_2, \Gamma) = c(\phi, \Gamma) + c(\phi_2, \Gamma) \,. \tag{63} \]
We need to show that the last term is zero.
We will only consider the case where the reference distribution Γ is the unit
Wignerian. If φ_2(ξ)_i = ξ_i + v_i^I ξ_I is infinitesimal, c(φ_2, Γ) is just v_i^I η^i_I(Γ) = v_i^{JiL} Γ_J Γ_L.
But since v must leave the Wigner moments Γ_I unchanged, it must satisfy (49).
If we contract that equation with Γ^{k_1 k_2} we get v_i^{JiL} Γ_J Γ_L = 0. Thus χ(φ) is
invariant at least under an infinitesimal φ_2 ∈ SG_M. Within the ultrametric topology
of formal power series, the group SG_M should be connected, so that any element can
be reached by a succession of infinitesimal transformations.
To summarize, we now have an action principle for matrix models,
\[ \Omega(\phi) = \sum_{n=1}^{\infty} S^{i_1\cdots i_n}\, \phi_{i_1}^{J_1} \cdots \phi_{i_n}^{J_n}\, \Gamma_{J_1\cdots J_n} + \log\det\phi_0 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\, \tilde\phi_{i_1}^{\,K_1 i_2 L_1}\, \tilde\phi_{i_2}^{\,K_2 i_3 L_2} \cdots \tilde\phi_{i_n}^{\,K_n i_1 L_n}\, \Gamma_{K_n\cdots K_1}\, \Gamma_{L_1\cdots L_n} \,. \]
The factorized Schwinger–Dyson equations follow from requiring that this action
be extremal under infinitesimal variations of φ. By choosing an ansatz for φ that
depends only on a few parameters and maximizing Ω with respect to them, we can
get approximate solutions to the factorized SD equations.
6. Entropy of Noncommutative Probability Distributions
Whenever we restrict the set of allowed observables of a system, some entropy is
created: It measures our ignorance of the variables we are not allowed to measure.
Familiar examples arise from thermodynamics, where only a finite number of macroscopic parameters are measured, and from black hole physics, where only the charge, mass
and angular momentum of a black hole can be measured by external particle scattering: The interior of a black hole is not observable to an outside observer.
There should be a similar entropy in the theory of strong interactions due to
confinement: Only “macroscopic” observables associated to hadrons are measurable
by a scattering of hadrons against each other. Quarks and gluons are observable only
in this indirect way. More precisely, only color invariant observables are measurable.
In this paper, we have a toy model of this entropy due to confinement of
gluons: We restrict to the gauge-invariant functions Φ_I = (1/N) tr A_I of the matrices
A_1, ..., A_M. It turns out that the term χ in the action principle above is just the
entropy caused by this restriction.
Let Q be some space of “microscopic” variables with a probability measure µ,
and Φ : Q → Q̄ some map to a space of “macroscopic” variables. We can now define
the volume of any subset of Q̄ to be the volume of its preimage in Q: This is the
induced measure µ̄ on Q̄.
In particular we can consider the volume of the pre-image of a point q̄ ∈ Q̄. It
is a measure of our ignorance of the microscopic variables, when q̄ is the result of
measuring the macroscopic ones. Any monotonic function of this volume is just as
good a measure of this ignorance. The best choice is the logarithm of this volume,
since then it would be additive for statistically independent systems. Let us denote
this function on Φ(Q) by
\[ \sigma(\bar q) = \log\big(\mu[\Phi^{-1}(\bar q)]\big) \,. \tag{64} \]
The average of this quantity over Q̄ is the entropy of the induced probability
distribution µ̄.
Let us apply this idea to the case where the “microscopic” observable is a
single Hermitian N × N matrix A; the “macroscopic” observable is the spectrum,
the set of eigenvalues. We disregard the information in the basis in which A is
presented. We do so even if this information is measurable in principle; e.g. by
interference experiments in quantum mechanics. The natural measure on the space
of matrices is the uniform (Lebesgue) measure dA on R^{N²}. Although the uniform
measure is not normalizable, the volume of the space of matrices with a given
spectrum {a_1, ..., a_N} is finite.^{10} Up to a constant (i.e. depending only on N), it
is ∏_{1≤i<j≤N} (a_i − a_j)². Thus the entropy is
\[ 2\int_{x<y} \rho(x)\rho(y)\, \log|x - y|\, dx\, dy \,, \]
where ρ(x) = (1/N) Σ_{i=1}^N δ(x − a_i). This expression makes sense even in the limit N → ∞:
We get a continuous distribution of eigenvalues ρ(x).
What is the “joint spectrum” of a collection of M Hermitian matrices A1 · · · AM ?
Clearly they cannot be simultaneously diagonalized, so a direct definition of this
concept is impossible. Now, recall that the set of eigenvalues {a_1, ..., a_N} can be
recovered from the moments G_n = (1/N) Σ_{i=1}^N a_i^n as the solutions of an algebraic
equation of the form
\[ \frac{1}{N} x^N = G_1 x^{N-1} - G_2 x^{N-2} + \cdots + (-1)^{N-1} G_N \,. \tag{65} \]
The moments G_n for n > N are not independent: They can be expressed as polynomials in the lower ones. Although the set {a_1, ..., a_N} is determined by the sequence
G_1, ..., G_N, there is no explicit algebraic formula: Galois theory shows that this
is impossible for N > 4. The theory of symmetric functions shows, moreover, that any
gauge-invariant polynomial of A can be expressed as a polynomial in G_1, ..., G_N. Thus we can regard
the sequence (1/N) tr A, (1/N) tr A², ... as the spectrum of the matrix A.
The volume ∏_{i<j}(a_i − a_j)² of the space of matrices with a given spectrum is
a symmetric polynomial of degree N(N − 1) in the eigenvalues. Hence, in principle, it
can be expressed as a polynomial in G_1, ..., G_N, although there does not appear to
be a simple universal formula.^k
^k It is possible to get a formula for the volume in terms of the first 2N moments. The complication
is that only the first N moments can be freely specified. The remaining moments are determined by
these, and yet there is no algebraic formula that expresses G_{N+1}, ..., G_{2N} in terms of G_1, ..., G_N.
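Although no closed algebraic formula exists, the passage from the moments G_1, ..., G_N back to the spectrum is easy to carry out numerically. The following sketch is our own illustration (the helper name `spectrum_from_moments` and the use of `numpy.roots` are illustrative choices): Newton's identities convert the power sums into elementary symmetric functions, and the eigenvalues are recovered as polynomial roots.

```python
import numpy as np

def spectrum_from_moments(G, N):
    # Newton's identities: convert power sums p_k = N*G_k into elementary
    # symmetric functions e_k, then find the roots of the characteristic
    # polynomial numerically (no algebraic formula exists for N > 4).
    p = [N * g for g in G]
    e = [1.0]
    for k in range(1, N + 1):
        e.append(sum((-1) ** (i - 1) * e[k - i] * p[i - 1]
                     for i in range(1, k + 1)) / k)
    coeffs = [(-1) ** k * e[k] for k in range(N + 1)]
    return np.sort(np.roots(coeffs).real)

a = np.array([1.0, 2.0, 3.0])
G = [np.mean(a ** n) for n in (1, 2, 3)]   # "measured" moments
recovered = spectrum_from_moments(G, 3)    # -> approximately [1, 2, 3]
```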
This point of view suggests a generalization to several matrices: We can define
the joint spectrum of a collection of matrices to be the quantities G_{i_1···i_n} =
(1/N) tr A_{i_1}···A_{i_n}. Again, there are relations among these quantities when I is longer
than N, but it is difficult to characterize these relations explicitly. Nevertheless, it
is meaningful to ask for the volume (with respect to the uniform measure on R^{MN²})
of the set of all matrices with a given value for the sequence G_{i_1···i_n}. Again, we will
not get any explicit formula for the entropy by pursuing this point of view.
So we look for yet another way to think of the joint spectrum of a collection of
matrices. We can ask how the entropy of a collection of matrices with joint spectrum
G_I changes if we transform them by some power series:
\[ A_i \mapsto \phi(A)_i = \phi_i^I A_I \,. \tag{66} \]
Let c(φ, G) be this change. Then, if we perform another transformation, we must
have
\[ c(\phi\phi_2, G) = c(\phi, \phi_{2*} G) + c(\phi_2, G) \,; \tag{67} \]
i.e. it must be a one-cocycle. Under infinitesimal variations it reduces to η, since it
is just the infinitesimal change in the uniform measure dA.
In the last section we obtained this c(φ, G) explicitly as a formal power series
in G. It can be written as the variation
\[ c(\phi, G) = \chi(\phi_*(G)) - \chi(G) \tag{68} \]
of some function χ of the joint spectrum G. However, this χ is not a formal power
series in G, so we cannot get an explicit formula for it. We can write it as an explicit
formal power series in φ which is invariant under the action of SG_M.
Thus we see the confluence of three apparently unrelated questions: an action
principle for the planar limit of matrix models (our main interest), the cohomology
of the automorphism group of formal power series, and the entropy of noncommutative
variables.
Voiculescu has a somewhat different approach^{12} to defining the entropy of noncommuting random variables. Up to an additive constant his definition seems to
agree with ours. However, the transformation property of entropy under a change of
variables is not discussed there. This transformation property is the cornerstone of
our approach.
7. Example: Two-Matrix Models
Let us consider a quartic multi-matrix model with the action
\[ S(A) = -\operatorname{tr}\Big[ \frac{1}{2} K^{ij} A_i A_j + \frac{1}{4} g^{ijkl} A_i A_j A_k A_l \Big] \,. \tag{69} \]
Our reference action is the Gaussian^l S_0(A) = −(1/2) tr δ^{ij} A_i A_j. We are interested in
estimating the Green functions and vacuum energy in the large N limit:
\[ E^{\rm exact} = -\lim_{N\to\infty} \frac{1}{N^2}\, \log\frac{Z}{Z_0} \,, \tag{70} \]
where Z and Z_0 are the partition functions for S and S_0. Choose the linear change of
variable A_i → φ_i(A) = φ_i^j A_j. The variational matrix φ_i^j that maximizes Ω determines
the multi-variable Wigner distribution that best approximates the quartic
matrix model. For a linear change of variables,
\[ \Omega[\phi] = \operatorname{tr}\log[\phi_i^j] - \frac{1}{2} K^{ij} G_{ij} - \frac{1}{4} g^{ijkl} G_{ijkl} \,. \tag{71} \]
Here G_{ij} = φ_i^k φ_j^k and G_{ijkl} = G_{ij} G_{kl} + G_{il} G_{jk} are the Green functions of
S_0(φ^{-1}(A)). Thus, the matrix elements of G may be regarded as the variational
parameters, and the condition for an extremum is
\[ \frac{1}{2} K^{pq} + \frac{1}{4}\big[ g^{pqkl} G_{kl} + g^{ijpq} G_{ij} + g^{pjqk} G_{jk} + g^{ipql} G_{il} \big] = \frac{1}{2}\, [G^{-1}]^{pq} \,. \tag{72} \]
This is a nonlinear equation for the variational matrix G, reminiscent of the self-consistent equation for a mean field. To test our variational approach, we specialize
to a two-matrix model for which some exact results are known from the work of
Mehta.^{28}
7.1. Mehta’s quartic two-matrix model
Consider the action
\[ S(A, B) = -\operatorname{tr}\Big[ \frac{1}{2}\big(A^2 + B^2 - cAB - cBA\big) + \frac{g}{4}\big(A^4 + B^4\big) \Big] \,, \tag{73} \]
which corresponds to the choices K^{ij} = \begin{pmatrix} 1 & -c \\ -c & 1 \end{pmatrix}, g^{1111} = g^{2222} = g and g^{ijkl} = 0
otherwise.^m We restrict to |c| < 1, where K^{ij} is a positive matrix. Since S(A, B) =
S(B, A) and G_{AB} = G_{BA}^*, we may take
\[ G_{ij} = \begin{pmatrix} \alpha & \beta \\ \beta & \alpha \end{pmatrix} \tag{74} \]
with α, β real. For g > 0, Ω is bounded above if G_{ij} is positive. Its maximum occurs
at (α, β) determined by β = cα/(1 + 2gα) and
\[ 4g^2\alpha^3 + 4g\alpha^2 + (1 - c^2 - 2g)\alpha - 1 = 0 \,. \tag{75} \]
^l In the language of noncommutative probability theory we used earlier, what we call the Gaussian
in this section is really the multivariate Wignerian distribution. There should be no confusion,
since the Wignerian moments are realized by a Gaussian distribution of matrices.
^m Kazakov relates this model to the Ising model on the collection of all planar lattices with coordination number four.^{29}
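Equation (75) is straightforward to solve numerically. The sketch below is our own (the helper name `gaussian_ansatz` is an illustrative choice): it finds the physical root and recovers the Green functions and vacuum energy of the Gaussian ansatz.

```python
import numpy as np

def gaussian_ansatz(g, c):
    # Physical root (alpha >= 0) of 4g^2 a^3 + 4g a^2 + (1 - c^2 - 2g) a - 1 = 0.
    roots = np.roots([4 * g**2, 4 * g, 1 - c**2 - 2 * g, -1.0])
    alpha = max(r.real for r in roots if abs(r.imag) < 1e-8 and r.real >= 0)
    beta = c * alpha / (1 + 2 * g * alpha)      # G_AB
    E = -0.5 * np.log(alpha**2 - beta**2)       # vacuum energy estimate
    return alpha, beta, E

g, c = 0.001, 0.5
alpha, beta, E = gaussian_ansatz(g, c)
residual = 4 * g**2 * alpha**3 + 4 * g * alpha**2 + (1 - c**2 - 2 * g) * alpha - 1
```

At small g and c = 1/2, the computed β approaches the weak-coupling value 2/3 quoted below.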
We must pick the real root α(g, c) that lies in the physical region α ≥ 0. Thus, the
Gaussian ansatz determines the vacuum energy, E(g, c) = −(1/2) log(α² − β²), and
all the Green functions (e.g. G_{AA} = α, G_{AB} = β, G_{A⁴} = 2α², etc.) approximately.
By contrast, only a few observables of this model have been calculated exactly.
Mehta^{28,n} obtains the exact vacuum energy E^{ex}(g, c) implicitly, as the solution
of a quintic equation. G^{ex}_{AB} and G^{ex}_{A⁴} may be obtained by differentiation. As an
illustration, we compare with Mehta's results in the weak and strong coupling
regions:
\[
\begin{aligned}
E^{\rm ex}\big(g, \tfrac12\big) &= -0.144 + 1.78\,g - 8.74\,g^2 + \cdots, \\
G^{\rm ex}_{AB}\big(g, \tfrac12\big) &= \tfrac{2}{3} - 4.74\,g + 53.33\,g^2 + \cdots, \\
G^{\rm ex}_{AAAA}\big(g, \tfrac12\big) &= \tfrac{32}{9} - 34.96\,g + \cdots, \\
E^{\rm var}\big(g, \tfrac12\big) &= -0.144 + 3.56\,g - 23.7\,g^2 + \cdots, \\
G^{\rm var}_{AB}\big(g, \tfrac12\big) &= \tfrac{2}{3} - 4.74\,g + 48.46\,g^2 + \cdots, \\
G^{\rm var}_{AAAA}\big(g, \tfrac12\big) &= \tfrac{32}{9} - 31.61\,g + 368.02\,g^2 + \cdots,
\end{aligned} \tag{76}
\]
and, at strong coupling,
\[
\begin{aligned}
E^{\rm ex}(g, c) &= \tfrac12 \log g + \tfrac12 \log 3 - \tfrac34 + \cdots, \\
G^{\rm ex}_{AB}(g, c) &\to 0 \quad \text{as } g \to \infty, \\
G^{\rm ex}_{A^4}(g, c) &= \frac{1}{g} + \cdots, \\
E^{\rm var}(g, c) &= \tfrac12 \log g + \tfrac12 \log 2 + \frac{1}{\sqrt{8g}} + O\Big(\frac{1}{g}\Big), \\
G^{\rm var}_{AB}(g, c) &= \frac{c}{2g} - \frac{c}{(2g)^{3/2}} + O\Big(\frac{1}{g^2}\Big), \\
G^{\rm var}_{A^4}(g, c) &= \frac{1}{g} - \frac{2}{(2g)^{3/2}} + O\Big(\frac{1}{g^2}\Big), \quad \text{etc.}
\end{aligned}
\]
We see that the Gaussian variational ansatz provides a reasonable first approximation in both the weak and strong coupling regions. The Gaussian variational
ansatz is not good near singularities of the free energy (phase transitions). As
|c| → 1⁻, the energy E^{ex} diverges; this is not captured well by the Gaussian ansatz.
This reinforces our view that the Gaussian variational ansatz is the analog of mean
field theory.
^n Some other special classes of Green functions are also accessible (see Refs. 30 and 31).
7.2. Two-matrix model with tr[A, B]2 interaction
The power of our variational method is its generality. We present an approximate
solution to a two-matrix model for which we could find no exact results in the
literature. The action we consider is
\[ S(A, B) = -\operatorname{tr}\Big[ \frac{m^2}{2}\big(A^2 + B^2\big) + \frac{c}{2}\big(AB + BA\big) - \frac{g}{4}\, [A, B]^2 \Big] \,. \tag{77} \]
This is a caricature of the Yang–Mills action. A supersymmetric version of this
model is also of interest (see Ref. 32). Consider the regime where K^{ij} = \begin{pmatrix} m^2 & c \\ c & m^2 \end{pmatrix}
is positive, k = (m⁴ − c²) ≥ 0. As before, we pick a Gaussian ansatz and maximize
Ω. We get β = −(c/m²)α and
\[ G^{AA} = G^{BB} = \alpha = \frac{m^2}{2g}\left[\sqrt{1 + \frac{4g}{k}} - 1\right] \,, \qquad E = -\frac{1}{2}\log\left[\frac{2g + k - \sqrt{k^2 + 4kg}}{2g^2}\right] \,. \tag{78} \]
All other mean field Green functions can be expressed in terms of α.
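As a numerical sanity check (ours, not the paper's), one can verify that the closed forms above are stationary points of the Gaussian-ansatz action. We assume the reconstruction Ω(α, β) = (1/2) log(α² − β²) − m²α − cβ + (g/2)(β² − α²), obtained by evaluating ⟨S⟩ with the planar Wigner factorization G_{ijkl} = G_{ij}G_{kl} + G_{il}G_{jk}; that assumption is flagged in the comments.

```python
import numpy as np

def closed_form(m2, c, g):
    # Eq. (78) with k = m^4 - c^2, and beta = -(c/m^2) alpha.
    k = m2**2 - c**2
    alpha = (m2 / (2 * g)) * (np.sqrt(1 + 4 * g / k) - 1)
    return alpha, -(c / m2) * alpha

def grad_Omega(alpha, beta, m2, c, g):
    # Gradient of the Gaussian-ansatz action (our reconstruction):
    # Omega = (1/2) log(alpha^2 - beta^2) - m^2 alpha - c beta + (g/2)(beta^2 - alpha^2)
    d = alpha**2 - beta**2
    return alpha / d - m2 - g * alpha, -beta / d - c + g * beta

m2, c, g = 1.0, 0.3, 0.5
alpha, beta = closed_form(m2, c, g)
dA, dB = grad_Omega(alpha, beta, m2, c, g)
k = m2**2 - c**2
E = -0.5 * np.log((2 * g + k - np.sqrt(k**2 + 4 * k * g)) / (2 * g**2))
```

Both gradient components vanish at the closed-form point, and (for m² = 1) the quoted E coincides with −(1/2) log(α² − β²).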
It is possible to improve on this Gaussian variational ansatz by using nonlinear
transformations. It is much easier to first find a Gaussian approximation and then
expand around it in a sort of loop expansion. This is the analog of the usual Goldstone methods of many-body theory. We have performed such calculations for these
multi-matrix models, but we will not report on them in this paper for the sake of
brevity. The results are qualitatively the same. In the next section (an appendix) we
give the departures from the Gaussian ansatz in the case of the single-matrix
models.
Appendix A. Group Cohomology
Given a group G and a G-module V (i.e. a representation of G on a vector space
V), we can define a cohomology theory.^{24} The r-cochains are functions
\[ f : G^r \to V \,. \tag{A.1} \]
The coboundary is
\[ df(g_1, g_2, \ldots, g_{r+1}) = g_1 f(g_2, \ldots, g_{r+1}) + \sum_{s=1}^{r} (-1)^s f(g_1, \ldots, g_{s-1}, g_s g_{s+1}, g_{s+2}, \ldots, g_{r+1}) + (-1)^{r+1} f(g_1, \ldots, g_r) \,. \tag{A.2} \]
It is straightforward to check that d²f = 0 for all f. A cochain f is a cocycle, or
closed, if df = 0; a cochain b is exact, or a coboundary, if b = df for some cochain f. The r-th
cohomology of G twisted by the module V, H^r(G, V), is the space of closed cochains
modulo exact cochains. H⁰(G, V) is the space of invariant elements in V; i.e. the
space of v satisfying gv − v = 0 for all g ∈ G. A one-cocycle is a function c : G → V
satisfying
\[ c(g_1 g_2) = g_1 c(g_2) + c(g_1) \,. \tag{A.3} \]
Solutions to this equation modulo one-coboundaries (which are of the form b(g) =
(g − 1)v for some v ∈ V) form the first cohomology H¹(G, V). If G acts trivially on
V, a cocycle is just a homomorphism of G to the additive group of V: c(g₁g₂) =
c(g₂) + c(g₁).
A one-cocycle gives a way of turning a representation on V into an affine action:
\[ (g, v) \mapsto gv + c(g) \,. \tag{A.4} \]
If c(g) is a coboundary (i.e. c(g) = (g − 1)u for some u), this affine action is really
a linear representation in disguise: If the origin is shifted by u, we can reduce it to a
linear representation. Thus the elements of H¹(G, V) describe “true” affine actions
on V. For example, let G be the loop group of a Lie group G₀, the space of smooth
functions from the circle to G₀: G = S¹G₀ = {g : S¹ → G₀}. Let V = S¹g₀ be the
corresponding loop algebra of the Lie algebra g₀ of G₀. Then there is an obvious adjoint
representation of G on V; a nontrivial one-cocycle is c(g) = g dg⁻¹, d being the
exterior derivative on the circle:
\[ c(g_1 g_2) = g_1 [g_2\, d(g_2^{-1})] g_1^{-1} + g_1\, dg_1^{-1} = \operatorname{ad}_{g_1} c(g_2) + c(g_1) \,. \tag{A.5} \]
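A coboundary automatically satisfies the one-cocycle identity, which is easy to verify concretely. A minimal sketch (our own choice of example: G = SO(2) acting on V = R²):

```python
import numpy as np

def R(t):
    # Rotation by angle t: an element of G = SO(2), acting on the module V = R^2.
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

v = np.array([1.0, 2.0])

def b(g):
    # Coboundary b(g) = (g - 1)v; it satisfies b(g1 g2) = g1 b(g2) + b(g1).
    return g @ v - v

g1, g2 = R(0.7), R(1.3)
lhs = b(g1 @ g2)
rhs = g1 @ b(g2) + b(g1)
```

The point of H¹ is precisely that cocycles like g dg⁻¹ above are *not* of this trivial form.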
Appendix B. A Single Random Matrix
In the special case where there is only one matrix (M = 1), there is a probability
distribution ρ(x)dx on the real line such that
\[ G_n = \int x^n \rho(x)\, dx \,. \tag{B.1} \]
This follows because the G_n satisfy the positivity condition
\[ \sum_{m,n=0}^{\infty} G_{n+m}\, u_m^* u_n \ge 0 \,; \tag{B.2} \]
up to technical conditions, any sequence of real numbers satisfying this condition
determines a probability distribution on the real line. (This is the classical moment
problem, solved in the nineteenth century.^{22}) There is an advantage to transforming
the factorized SD equations into an equation for ρ(x) — it becomes a linear integral
equation, the Mehta–Dyson equation:^{10}
\[ 2P\!\int_a^b \frac{\rho(y)\, dy}{x - y} + S'(x) = 0 \,. \tag{B.3} \]
Moreover, the solution^{26,27} to this equation can be expressed in purely algebraic
terms:^o
\[ \rho(x) = -\frac{1}{2\pi}\, \theta(a \le x \le b)\, \sqrt{(x-a)(b-x)}\, \left\lfloor \frac{S'(x)}{\sqrt{(x-a)(x-b)}} \right\rfloor \,. \tag{B.4} \]
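For the Gaussian action S(x) = −x²/2 on [a, b] = [−2, 2], Eq. (B.3) reduces to the identity 2P∫ρ₀(y)/(x − y) dy = x for the semicircle ρ₀ defined below. A small numerical check (our own regularization of the principal value, not from the paper):

```python
import numpy as np

def rho0(y):
    # Wigner semicircle on [-2, 2]
    return np.sqrt(np.clip(4.0 - np.asarray(y, dtype=float)**2, 0.0, None)) / (2 * np.pi)

def pv_integral(x, n=200001):
    # P ∫_{-2}^{2} rho0(y)/(x - y) dy: subtract rho0(x) so the integrand is
    # bounded at y = x, integrate that part by the trapezoid rule, and add
    # the exactly known P ∫ dy/(x - y) = log[(x+2)/(2-x)] times rho0(x).
    y = np.linspace(-2.0, 2.0, n)
    h = y[1] - y[0]
    f = np.zeros_like(y)
    m = np.abs(y - x) > 1e-12
    f[m] = (rho0(y[m]) - rho0(x)) / (x - y[m])
    reg = h * (f.sum() - 0.5 * (f[0] + f[-1]))
    return reg + rho0(x) * np.log((x + 2.0) / (2.0 - x))

lhs = 2 * pv_integral(1.0)   # Mehta-Dyson requires 2P∫... = -S'(x) = x
```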
The numbers a and b are solutions of the algebraic equations
\[ \sum_{r,s \ge 0} \frac{(\tfrac12)_r\, (\tfrac12)_s}{r!\, s!}\, [r+s+1]\, S_{r+s+1}\, a^r b^s = 0 \,, \qquad \sum_{r,s \ge 0} \frac{(\tfrac12)_r\, (\tfrac12)_s}{r!\, s!}\, [r+s]\, S_{r+s}\, a^r b^s = 2 \,, \tag{B.5} \]
where (\tfrac12)_r = \tfrac12 (\tfrac12 + 1) \cdots (\tfrac12 + r - 1). The simplest example is the case of the
Wigner distribution. It is the analog of the Gaussian in the world of noncommutative probability distributions. For, if we choose the matrix elements of A to be
independent Gaussians, S(A) = −(1/2) tr A², we get the distribution function for the
eigenvalues of A to be (in the large N limit)
\[ \rho_0(x) = \frac{1}{2\pi}\sqrt{4 - x^2}\;\, \theta(|x| < 2) \,. \tag{B.6} \]
The odd moments vanish; the even moments are then given by the Catalan numbers
\[ G_{2k} = C_k = \frac{1}{k+1}\binom{2k}{k} \,. \tag{B.7} \]
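A quick numerical check (ours) that quadrature against the semicircle reproduces the Catalan numbers:

```python
import numpy as np
from math import comb

x = np.linspace(-2.0, 2.0, 400001)
h = x[1] - x[0]
rho0 = np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2 * np.pi)

def moment(n):
    # ∫ x^n rho0(x) dx by the trapezoid rule
    f = x**n * rho0
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))

catalan = [comb(2 * k, k) // (k + 1) for k in range(1, 4)]   # [1, 2, 5]
even_moments = [moment(2 * k) for k in (1, 2, 3)]
```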
The Mehta–Dyson equation follows from maximizing the “action”
\[ \Omega(\rho) = \int \rho(x) S(x)\, dx + P\!\int \log|x - y|\, \rho(x)\rho(y)\, dx\, dy \tag{B.8} \]
with respect to ρ(x). The generating function log Z(S) is then the maximum of this
functional over all probability distributions ρ. The physical meaning of the first
term is clear: It is just the expectation value of the action of the original matrix
model:
\[ \int \rho(x) S(x)\, dx = \sum_n G_n S_n \,. \tag{B.9} \]
The second term can be thought of as the “entropy” which arises because we
have lost the information about the angular variables in the matrix variable: The
function ρ(x) is the density of the distribution of the eigenvalues of A. Indeed,
Σ_{i≠j} log|a_i − a_j| is (up to a constant depending only on N) the log of the volume
of the space of all Hermitian matrices with spectrum a_1, a_2, ..., a_N. The entropy
P∫ log|x − y| ρ(x)ρ(y) dx dy is the large N limit of this quantity. Note that the
^o For a Laurent series X(z) = Σ_{k=−∞}^m X_k z^k around infinity, ⌊X(z)⌋ = Σ_{k=0}^m X_k z^k denotes the
part that is a polynomial in z. This is analogous to the “integer part” of a real number, which
explains the notation.
entropy is independent of the choice of the matrix model action: It is a universal
property of all one-matrix models. The meaning of the variational principle is now
clear: We seek the probability distribution of maximum entropy that has a given
set of moments Gr for r = 1 · · · n. The coefficients of the polynomial S are just
the Lagrange multipliers that enforce this condition. Thus we found a variational
principle, but indirectly in terms of the function ρ(x) rather than the moments Gn
themselves. The entropy could not be expressed explicitly in terms of the moments.
Indeed, in a sense, this is impossible:
The entropy cannot be expressed as a formal power series in the G_n. This is surprising, since there appears to be a linear relation between ρ(x) and G_n, namely
G_n = ∫ x^n ρ(x) dx; also, the entropy is a quadratic function of ρ(x). So one might
think that the entropy is a quadratic function of the G_n as well. But if we try to compute this function we get a divergent answer. Indeed, we claim that even if we
do not require the series to converge, the entropy cannot be expressed as a power
series in the G_n.
By thinking in terms of the change of variables that brings the probability distribution we seek to a standard one, we can find an explicit formula for the entropy.
Since we are interested in polynomial actions S(A), which are modifications of the
quadratic action (1/2)A², the right choice of this reference distribution is the Wigner
distribution
\[ \rho_0(x) = \frac{1}{2\pi}\sqrt{4 - x^2}\;\, \theta(|x| < 2) \,. \tag{B.10} \]
There should thus be a change of variable φ(x) such that
\[ G_k = \int x^k \rho(x)\, dx = \int \phi(x)^k \rho_0(x)\, dx \,; \tag{B.11} \]
in other words,
\[ \rho(\phi(x))\, \phi'(x) = \rho_0(x) \,. \tag{B.12} \]
Then we get
\[ \Omega(\phi) = \int S(\phi(x))\, \rho_0(x)\, dx + \int \log\frac{\phi(x) - \phi(y)}{x - y}\, \rho_0(x)\, dx\, \rho_0(y)\, dy \,. \tag{B.13} \]
We have dropped a constant term — the entropy of the reference distribution itself.
Also it will be convenient to choose the constant of integration such that φ(0) = 0.
Now we can regard the diffeomorphism as parametrized by its Taylor coefficients
\[ \phi(x) = \sum_{n=1}^{\infty} \phi_n x^n \,, \qquad \phi_1 > 0 \,. \tag{B.14} \]
Although we cannot express the entropy in terms of the moments Gn themselves,
we will be able to express both the entropy and the moments in terms of the
parameters φn . Thus we have a “parametric form” of the variational problem. It
is this parametric form that we can extend to the case of multi-matrix models.
Indeed,
\[ \phi^k(x) = \sum_{n=1}^{\infty} x^n \sum_{l_1 + \cdots + l_k = n} \phi_{l_1} \cdots \phi_{l_k} \,, \tag{B.15} \]
so that
\[ G_k = \sum_{n=1}^{\infty} \Gamma_n \sum_{l_1 + \cdots + l_k = n} \phi_{l_1} \cdots \phi_{l_k} \,. \tag{B.16} \]
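Equation (B.16) can be checked directly for a polynomial φ, since then the sums are finite. A sketch (our own; the dictionary `phi` holds hypothetical Taylor coefficients) compares the series against direct quadrature via (B.11):

```python
import numpy as np
from math import comb
from itertools import product

def Gamma(n):
    # Wigner moments: odd moments vanish, even ones are Catalan numbers.
    return comb(n, n // 2) // (n // 2 + 1) if n % 2 == 0 else 0

phi = {1: 1.1, 3: 0.05}   # hypothetical polynomial change of variable

def G_series(k):
    # Eq. (B.16); for a polynomial phi the double sum is finite.
    return sum(Gamma(sum(ls)) * np.prod([phi[l] for l in ls])
               for ls in product(phi, repeat=k))

def G_quad(k):
    # Direct check against (B.11): G_k = ∫ phi(x)^k rho0(x) dx.
    x = np.linspace(-2.0, 2.0, 200001)
    rho0 = np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2 * np.pi)
    f = (phi[1] * x + phi[3] * x**3) ** k * rho0
    return (x[1] - x[0]) * (f.sum() - 0.5 * (f[0] + f[-1]))
```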
It is convenient to factor out the linear transformation: φ(x) = φ₁[x + φ̃(x)], where
φ̃(x) = Σ_{k=2}^∞ φ̃_k x^k, with φ̃_k = φ_k [φ_1]^{-1}. Then
\[ \log\frac{\phi(x) - \phi(y)}{x - y} = \log\phi_1 + \log\left[1 + \sum_{m=2}^{\infty} \tilde\phi_m\, \frac{x^m - y^m}{x - y}\right] \,. \tag{B.17} \]
Using
\[ \frac{x^m - y^m}{x - y} = \sum_{k+1+l = m;\; k,l \ge 0} x^k y^l \tag{B.18} \]
and expanding the logarithm we get
\[ \log\frac{\phi(x) - \phi(y)}{x - y} = \log\phi_1 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sum_{k_1, l_1, \ldots, k_n, l_n} \tilde\phi_{k_1+1+l_1} \cdots \tilde\phi_{k_n+1+l_n}\, x^{k_1+\cdots+k_n}\, y^{l_1+\cdots+l_n} \,. \tag{B.19} \]
It follows then that
\[ \Omega(\phi) = \sum_{k,n=1}^{\infty} S_k\, \Gamma_n \sum_{l_1+\cdots+l_k = n} \phi_{l_1} \cdots \phi_{l_k} + \log\phi_1 + \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sum_{k_1, l_1, \ldots, k_n, l_n} \tilde\phi_{k_1+1+l_1} \cdots \tilde\phi_{k_n+1+l_n}\, \Gamma_{k_1+\cdots+k_n}\, \Gamma_{l_1+\cdots+l_n} \,. \tag{B.20} \]
While this formula may not be particularly transparent, it does accomplish our goal
of finding a variational principle that determines the moments. The parameters φk
characterize the probability distribution of the eigenvalues: They determine the
moments Gn by the above series. By extremizing the action Ω as a function of
these φk , we can then determine the moments. We will be able to generalize this
version of the action principle to multi-matrix models. In practice we would choose
some simple function φ(x) such as a polynomial to get an approximate solution to
this variational problem. Since all one-matrix models are exactly solvable, we can
use them to test the accuracy of our variational approximation.
B.1. Explicit variational calculations
Consider the quartic one-matrix model. Its exact solution in the large N limit is
known from the work of Brezin et al.:^{26}
\[ Z(g) = \int dA\; e^{N \operatorname{tr}[-\frac{1}{2}A^2 - gA^4]} \,, \qquad E_{\rm exact}(g) = -\lim_{N\to\infty} \frac{1}{N^2} \log\frac{Z(g)}{Z(0)} \,. \tag{B.21} \]
The Gaussian with unit covariance is our reference action. Choose as a variational
ansatz the linear change of variable φ(x) = φ1 x, which merely scales the Wigner
distribution. The φ1 that maximizes Ω represents the Wigner distribution that best
approximates the quartic matrix model.
\[ \Omega(\phi_1) = \log\phi_1 - \frac{1}{2} G_2 - g G_4 \,. \tag{B.22} \]
Here G_{2k} = φ_1^{2k} Γ_{2k}. Letting α = φ_1², Ω(α) = (1/2) log α − α/2 − 2gα² is bounded above
only for g ≥ 0. Its maximum occurs at α(g) = (−1 + √(1 + 32g))/(16g). Notice that α is
determined by a nonlinear equation. This is reminiscent of mean field theory; we
will sometimes refer to the Gaussian ansatz as mean field theory. Our variational
estimates are
\[ E(g) = -\frac{1}{2}\log\left[\frac{-1 + \sqrt{1 + 32g}}{16g}\right] \,, \qquad G_{2k}(g) = \left[\frac{-1 + \sqrt{1 + 32g}}{16g}\right]^{k} C_k \,. \tag{B.23} \]
The exact results from Ref. 26 are
\[ E_{\rm ex}(g) = \frac{1}{24}\big(a^2(g) - 1\big)\big(9 - a^2(g)\big) - \frac{1}{2}\log\big(a^2(g)\big) \,, \qquad G^{2k}_{\rm ex}(g) = \frac{(2k)!}{k!\,(k+2)!}\, a^{2k}(g)\,\big[2k + 2 - k\, a^2(g)\big] \,, \tag{B.24} \]
where a²(g) = (1/24g)[−1 + √(1 + 48g)]. In both cases, the vacuum energy is analytic
at g = 0 with a square root branch point at a negative critical coupling. The mean
field critical coupling g_c^{MF} = −1/32 is 50% larger in magnitude than the exact value g_c^{ex} = −1/48.
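The comparison between the variational and exact answers can be sketched numerically (our own code; the formulas are (B.23) and (B.24), and the function names are illustrative):

```python
import numpy as np

def alpha(g):          # variational (mean field) alpha(g)
    return (-1 + np.sqrt(1 + 32 * g)) / (16 * g)

def a2(g):             # exact a^2(g) of Brezin et al.
    return (-1 + np.sqrt(1 + 48 * g)) / (24 * g)

def E_var(g):
    return -0.5 * np.log(alpha(g))

def E_ex(g):
    return (a2(g) - 1) * (9 - a2(g)) / 24 - 0.5 * np.log(a2(g))

def G2_var(g):
    return alpha(g)

def G2_ex(g):
    return a2(g) * (4 - a2(g)) / 3   # k = 1 term of (B.24): (2k)!/(k!(k+2)!) = 1/3

# the variational bound lies above the exact energy, and G2 stays within 10%
checks = [(E_var(g) >= E_ex(g), abs(G2_var(g) / G2_ex(g) - 1) < 0.1)
          for g in (0.1, 1.0, 10.0)]
```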
The distribution of eigenvalues of the best Gaussian approximation is given by
ρ_g(x) = φ_1^{-1} ρ_0(φ_1^{-1} x), where ρ_0(x) = (1/2π)√(4 − x²), |x| ≤ 2, is the standard Wigner
distribution. The exact distribution
\[ \rho_{\rm ex}(x, g) = \frac{1}{\pi}\Big(\frac{1}{2} + 4g\, a^2(g) + 2g x^2\Big)\sqrt{4 a^2(g) - x^2} \,, \qquad |x| \le 2a(g) \,, \tag{B.25} \]
is compared with the best Gaussian approximation in Fig. 1. The latter does not
capture the bimodal property of the former.
Fig. 1. Eigenvalue distribution. The dark curve is exact, the semicircle is mean field, and the
bimodal light curve is the cubic ansatz at one loop.
The vacuum energy estimate starts out, for small g, twice as big as the exact
value. But the estimate improves and becomes exact as g → ∞. Meanwhile, the
estimate for G₂(g) is within 10% of its exact value for all g. The G_{2k}, k ≥ 2, for the
Gaussian ansatz carry no new information; in particular, the higher cumulants
vanish for this ansatz.
We see that a Gaussian ansatz is a reasonable first approximation, and is not
restricted to small values of the coupling g. To improve on this, to get a nontrivial estimate for the higher cumulants, and to capture the bimodal distribution of eigenvalues,
we need to make a nonlinear change of variable.
B.2. Nonlinear variational change of variables
The simplest nonlinear ansatz for the quartic model is a cubic polynomial: φ(x) =
φ₁x + φ₃x³. A quadratic ansatz will not lower the energy, since S(A) is even. Our
reference distribution is still the standard Wigner distribution. φ_{1,3} are determined
by the condition that
\[ \Omega[\phi] = \left\langle \log\frac{\phi(x) - \phi(y)}{x - y} \right\rangle_0 - \frac{1}{2}\langle\phi^2(x)\rangle_0 - g\langle\phi^4(x)\rangle_0 \tag{B.26} \]
be a maximum. Considering the success of the linear change of variable, we expect
the deviations of φ_{1,3} from their mean field values (√α, 0) to be small, irrespective
of g. Within this approximation we get, with α = (−1 + √(1 + 32g))/(16g),
\[ \phi_1 = \sqrt{\alpha} - \frac{\sqrt{\alpha}\,\big({-3} + 2\alpha + (1 - 32g)\alpha^2 + 48g\alpha^3 + 144g^2\alpha^4\big)}{3 + 4\alpha + (1 + 96g)\alpha^2 + 48g\alpha^3 + 432g^2\alpha^4} \,, \qquad \phi_3 = \frac{8g\,\alpha^{5/2}\,(-2 + \alpha)}{3 + 4\alpha + (1 + 96g)\alpha^2 + 48g\alpha^3 + 432g^2\alpha^4} \,, \tag{B.27} \]
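A quick check (ours) that (B.27) reduces to the identity map in the free limit g → 0, where the mean field value α → 1 and the cubic correction should switch off:

```python
import numpy as np

def phi13(g):
    # Eq. (B.27); a is the mean-field alpha(g).
    a = (-1 + np.sqrt(1 + 32 * g)) / (16 * g)
    den = 3 + 4 * a + (1 + 96 * g) * a**2 + 48 * g * a**3 + 432 * g**2 * a**4
    p1 = np.sqrt(a) * (1 - (-3 + 2 * a + (1 - 32 * g) * a**2
                            + 48 * g * a**3 + 144 * g**2 * a**4) / den)
    p3 = 8 * g * a**2.5 * (-2 + a) / den
    return p1, p3

p1, p3 = phi13(1e-6)   # free limit: phi(x) -> x, i.e. phi1 -> 1, phi3 -> 0
```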
from which we calculate the variational Green functions and vacuum energy. The
procedure we have used to obtain φ1,3 can be thought of as a one-loop calculation
around mean field theory. Comparing with the exact results of Ref. 26, we find the
following qualitative improvements over the mean field ansatz.
In addition to the mean field branch cut from −∞ to g_c^{MF}, the vacuum energy now has a double pole at g_c^{MF} < g_c^{1-loop} = (−346 − 25√22)/15138 < g_c^{ex}. We can understand this double pole as a sort of Padé approximation to a branch cut that runs all the way up to g_c^{ex}. The vacuum energy variational estimate is lowered for all g.
Figure 1 demonstrates that the cubic ansatz is able to capture the bimodal nature of the exact eigenvalue distribution. If χ(x) = φ^{−1}(x), then ρ(x) = ρ_0(χ(x))χ′(x), where ρ_0(x) = (1/2π)√(4 − x^2), |x| ≤ 2.
The Green functions G2, G4 are now within a percent of their exact values for all g. More significantly, the connected four-point function G4^c = G4 − 2(G2)^2, which vanished for the Gaussian ansatz, is nontrivial and within 10% of its exact value across all values of g.
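For concreteness, (B.27) can be evaluated numerically. The sketch below is ours (the paper defines no such function); it returns the mean field parameter α and the one-loop coefficients φ1, φ3 for a coupling g > 0:

```python
import math

def quartic_variational_params(g):
    """One-loop variational parameters of Eq. (B.27) for the quartic model.
    Returns (alpha, phi1, phi3), with phi(x) = phi1*x + phi3*x^3 the
    change of variable. (Function name and structure are ours.)"""
    alpha = (-1.0 + math.sqrt(1.0 + 32.0 * g)) / (16.0 * g)
    denom = (3 + 4 * alpha + (1 + 96 * g) * alpha**2
             + 48 * g * alpha**3 + 432 * g**2 * alpha**4)
    phi1 = math.sqrt(alpha) - math.sqrt(alpha) * (
        -3 + 2 * alpha + (1 - 32 * g) * alpha**2
        + 48 * g * alpha**3 + 144 * g**2 * alpha**4) / denom
    phi3 = 8 * g * alpha**2.5 * (-2 + alpha) / denom
    return alpha, phi1, phi3
```

As g → 0 this reduces to the identity change of variable (φ1 → 1, φ3 → 0), and the closed form of α implies the mean field condition 8gα^2 + α = 1, a quick sanity check.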
B.3. Formal power series in one variable
Given a sequence of complex numbers (a0 , a1 , a2 , . . .), with only a finite number of
nonzero entries, we have a polynomial with these numbers as coefficients: 33
a(z) = Σ_{n=0}^∞ a_n z^n .  (B.28)
Note that all the information in a polynomial is in its coefficients: The variable
z is just a book-keeping device. In fact we could have defined a polynomial as a
sequence of complex numbers (a0 , a1 , . . .) with a finite number of nonzero elements.
The addition, multiplication, and differentiation of polynomials can be expressed directly in terms of these coefficients:

[a + b]_n = a_n + b_n ,  [ab]_n = Σ_{k+l=n} a_k b_l ,  [Da]_n = (n + 1)a_{n+1} .  (B.29)
A formal power series (a_0, a_1, . . .) is a sequence of complex numbers, with possibly an infinite number of nonzero terms. We define the sum, product and derivative as for polynomials above:

[a + b]_n = a_n + b_n ,  [ab]_n = Σ_{k+l=n} a_k b_l ,  [Da]_n = (n + 1)a_{n+1} .  (B.30)

The set of formal power series is a ring, indeed even an integral domain. (The proof is the same as for polynomials.) The operation D is a derivation on this ring. The ring of formal power series is often denoted by C[[z]]. The idea is that such a sequence can be thought of as the coefficients of a series Σ_{n=0}^∞ a_n z^n; the sum and product postulated are what you would get from this interpretation. However, the series may not converge if z is thought of as a complex number: hence the name formal power series.
The composition a ◦ b is well-defined whenever b_0 = 0:

[a ◦ b]_n = Σ_{k=0}^∞ a_k Σ_{l_1+···+l_k=n} b_{l_1} b_{l_2} ··· b_{l_k} .  (B.31)

The point is that, for each n, there are only a finite number of such l's, so that the series on the rhs is really a finite sum. In terms of series, this means we substitute one series into the other:

a ◦ b(z) = a(b(z)) .  (B.32)
B.4. The group of automorphisms
The set of formal power series

G = { φ(z) = Σ_{n=0}^∞ φ_n z^n | φ_0 = 0, φ_1 ≠ 0 }  (B.33)

is a group under composition: the group of automorphisms. The group law is

[φ̃ ◦ φ]_n = Σ_{k=1}^n φ̃_k Σ_{l_1+l_2+···+l_k=n} φ_{l_1} ··· φ_{l_k} .  (B.34)
The inverse of φ (say φ̃) is determined by the recursion relations of Lagrange:

φ̃_1 φ_1 = 1 ,  φ̃_n = −(1/φ_1^n) Σ_{k=1}^{n−1} φ̃_k Σ_{l_1+···+l_k=n} φ_{l_1} ··· φ_{l_k} .  (B.35)
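Lagrange's recursion (B.35) is easy to implement on truncated series: the inner sum Σ_{l_1+···+l_k=n} φ_{l_1}···φ_{l_k} is just the z^n coefficient of φ(z)^k. A sketch with our own helper names:

```python
# Compositional inverse via Lagrange's recursion (B.35), on series
# truncated at order N. phi is a list with phi[0] = 0 and phi[1] != 0.
N = 8  # truncation order

def power_coeff(phi, k, n):
    # sum over l1+...+lk = n of phi_{l1}...phi_{lk}: the z^n coefficient
    # of phi(z)^k, computed by repeated convolution
    p = [0] * N
    p[0] = 1
    for _ in range(k):
        q = [0] * N
        for i in range(N):
            for j in range(N - i):
                q[i + j] += p[i] * phi[j]
        p = q
    return p[n]

def lagrange_inverse(phi):
    inv = [0] * N
    inv[1] = 1 / phi[1]
    for n in range(2, N):
        s = sum(inv[k] * power_coeff(phi, k, n) for k in range(1, n))
        inv[n] = -s / phi[1] ** n
    return inv
```

For φ(z) = z + z^2 this yields z − z^2 + 2z^3 − 5z^4 + 14z^5 − ···, the alternating Catalan numbers, consistent with solving w + w^2 = z for w.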
G is a topological group with respect to the ultrametric topology. It can be
thought of as a Lie group, the coefficients φn being the coordinates. The group
multiplication law can now be studied in the case where the left or the right element
is infinitesimal, leading to two sets of vector fields on the group manifold. For
example, if φ̃(x) = x + x^{k+1}, the change it would induce on the coordinates of φ is

[L_k φ]_n = Σ_{l_1+l_2+···+l_{k+1}=n} φ_{l_1} ··· φ_{l_{k+1}}  (B.36)

or equivalently,

L_k φ(x) = φ(x)^{k+1} ,  for k = 0, 1, 2, . . . .  (B.37)
By choosing φ̃(x) = x + x^{k+1} in φ ◦ φ̃ we get the infinitesimal right action:

R_k φ(x) = x^{k+1} Dφ(x) ,  for k = 0, 1, 2, . . . .  (B.38)

Both sets satisfy the commutation relations of the Lie algebra G:

[L_m, L_n] = (n − m)L_{m+n} ,  [R_m, R_n] = (n − m)R_{m+n} .  (B.39)
This Lie algebra is also called the Virasoro algebra by some physicists and the Witt
algebra by some mathematicians.
There is a representation of this Lie algebra on the space of formal power series:

L_n a = x^{n+1} Da .  (B.40)
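In the representation (B.40) the action on coefficients is [L_n a]_p = (p − n)a_{p−n}, and the relation (B.39) can be verified exactly on truncated series. A small sketch (truncation order and test values are our choices):

```python
# Coefficient-level check of (B.39) in the representation (B.40):
# L_n a = x^{n+1} D a acts on coefficients as [L_n a]_p = (p - n) a_{p-n}.
N = 12  # truncation order

def L(n, a):
    return [(p - n) * a[p - n] if p >= n else 0 for p in range(N)]

a = list(range(1, N + 1))  # an arbitrary test series
for m, n in [(0, 1), (1, 2), (2, 3)]:
    lhs = [x - y for x, y in zip(L(m, L(n, a)), L(n, L(m, a)))]
    rhs = [(n - m) * c for c in L(m + n, a)]
    assert lhs == rhs  # [L_m, L_n] = (n - m) L_{m+n}
```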
B.5. Cohomology of G
Now let V be the space of formal power series with real coefficients. Then G, the group of automorphisms, has a representation on V:

G = {φ : Z_+ → R | φ_0 = 0, φ_1 > 0} ,  V = {a : Z_+ → R} ,  φ_∗ a(x) = a(φ^{−1}(x)) .  (B.41)
Now, log[φ(x)/x] is a power series in x because φ(x)/x is a formal power series with positive constant term: [φ(x)/x]_0 = φ_1 > 0. We see easily that c(φ, x) = log[φ(x)/x] is a one-cocycle of G twisted by the representation V:

c(φ_1 ◦ φ_2, x) = log[φ_1(φ_2(x))/x] = log[φ_1(φ_2(x))/φ_2(x)] + log[φ_2(x)/x] = c(φ_1, φ_2(x)) + c(φ_2, x) .  (B.42)
Of course, neither log φ(x) nor log x is a power series in x. So this cocycle is nontrivial.
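The identity (B.42) is also easy to spot-check numerically at a point; the sketch below uses two arbitrary polynomial elements of G of our own choosing:

```python
import math

# Numerical spot-check of the cocycle identity (B.42) at a point, with
# c(phi, x) = log(phi(x)/x). The maps below are arbitrary elements of G
# (phi(0) = 0, phi'(0) > 0), chosen only for illustration.
def phi_a(x): return 2 * x + 0.3 * x**2
def phi_b(x): return x + 0.1 * x**3

def c(phi, x):
    return math.log(phi(x) / x)

x = 0.7
lhs = c(lambda t: phi_a(phi_b(t)), x)        # c(phi_a ∘ phi_b, x)
rhs = c(phi_a, phi_b(x)) + c(phi_b, x)
```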
The space of formal power series in two commuting variables (Sym^2 V) also carries a representation of G. We again have a nontrivial one-cocycle^p on this representation:

c(φ, x, y) = log[(φ(x) − φ(y))/(x − y)] .  (B.44)

We recognize this as the entropy of the single matrix model. The same argument shows that this is a nontrivial cocycle of G.

Now we understand that the entropy of the single matrix model has the mathematical meaning of a nontrivial one-cocycle of the group of automorphisms. This explains why we could not express the entropy as a function of the moments. It also points to a solution of the difficulty: we must think of the automorphism φ, rather than the moments, as parametrizing the probability distribution.

^p The formula

(x^m − y^m)/(x − y) = Σ_{k=0}^{m−1} x^k y^{m−k−1}  (B.43)

can be used to show that c(φ, x, y) is a formal power series in x and y.
Appendix C. Formula for Cocycle
We will now get an explicit formula for σ(φ̃, A). The Jacobian matrix of φ̃ is obtained by differentiating the series φ̃(A)_i = A_i + Σ_{n=2}^∞ φ̃_i^{i_1···i_n} A_{i_1} ··· A_{i_n}:

J_{i b c}^{a j d}(A) = ∂[φ̃_i(A)]^a_b / ∂[A_j]^c_d
  = δ_i^j δ_c^a δ_b^d + Σ_{m+n≥1} φ̃_i^{i_1···i_m j j_1···j_n} [A_{i_1} ··· A_{i_m}]^a_c [A_{j_1} ··· A_{j_n}]^d_b
  := δ_i^j δ_c^a δ_b^d + K_{i b c}^{a j d}(A) .  (C.1)

If we suppress the color indices a, b, c, d,

J_i^j(A) = δ_i^j 1 ⊗ 1 + φ̃_i^{IjJ} A_I ⊗ A_J := δ_i^j 1 ⊗ 1 + K_i^j(A) .  (C.2)

We can now compute

(1/N^2) tr K^n(A) = (1/N^2) K_{i_1 b_1 b_2}^{a_1 i_2 a_2} K_{i_2 b_2 b_3}^{a_2 i_3 a_3} ··· K_{i_n b_n b_1}^{a_n i_1 a_1}
  = φ̃_{i_1}^{K_1 i_2 L_1} φ̃_{i_2}^{K_2 i_3 L_2} ··· φ̃_{i_n}^{K_n i_1 L_n} × (1/N)[A_{K_1}]^{a_1}_{a_2} ··· [A_{K_n}]^{a_n}_{a_1} (1/N)[A_{L_1}]^{b_2}_{b_1} ··· [A_{L_n}]^{b_1}_{b_n}
  = φ̃_{i_1}^{K_1 i_2 L_1} φ̃_{i_2}^{K_2 i_3 L_2} ··· φ̃_{i_n}^{K_n i_1 L_n} Φ_{K_1···K_n} Φ_{L_n···L_1} .  (C.3)

Thus,

σ(φ̃, A) = (1/N^2) log det[1 + K(A)]
  = Σ_{n=1}^∞ ((−1)^{n+1}/n) (1/N^2) tr K^n
  = Σ_{n=1}^∞ ((−1)^{n+1}/n) φ̃_{i_1}^{K_1 i_2 L_1} φ̃_{i_2}^{K_2 i_3 L_2} ··· φ̃_{i_n}^{K_n i_1 L_n} Φ_{K_1···K_n} Φ_{L_n···L_1} .
This is the formula we presented earlier.
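As a sanity check on the expansion of log det[1 + K] in traces, one can compare the two sides numerically for a small test matrix (our own example; it stands in for K(A) only to illustrate the convergence of the series):

```python
import math

# Spot-check: log det(1 + K) = sum_{n>=1} ((-1)^{n+1}/n) tr K^n,
# valid when K is small enough (spectral radius < 1). K here is a
# generic 2x2 test matrix, not the Jacobian kernel of the text.
K = [[0.10, 0.05],
     [-0.02, 0.08]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

det = (1 + K[0][0]) * (1 + K[1][1]) - K[0][1] * K[1][0]
lhs = math.log(det)

rhs, P = 0.0, [[1.0, 0.0], [0.0, 1.0]]
for n in range(1, 60):
    P = matmul(P, K)  # P = K^n
    rhs += (-1) ** (n + 1) / n * trace(P)
```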
References
1. G. ’t Hooft, Nucl. Phys. B75, 461 (1974); Under the Spell of the Gauge Principle
(World Scientific, Singapore, 1994).
2. E. Witten, Nucl. Phys. B160, 57 (1979).
3. Y. M. Makeenko and A. A. Migdal, Phys. Lett. B88, 135 (1979); erratum, ibid. B89,
437 (1980); Nucl. Phys. B188, 269 (1981).
4. S. G. Rajeev, Int. J. Mod. Phys. A9, 5583 (1994).
5. G. Krishnaswami and S. G. Rajeev, Phys. Lett. B441, 429 (1998); V. John, G. S. Krishnaswami and S. G. Rajeev, Phys. Lett. B487, 125 (2000); Phys. Lett. B492, 63 (2000); Theoretical High Energy Physics: MRST 2000, ed. C. R. Hagen (American Institute of Physics, Melville, New York, 2000), p. 220; S. G. Rajeev, Nucl. Phys. Proc. Suppl. 96, 487 (2001); hep-th/9905072.
6. S. G. Rajeev and O. T. Turgut, Commun. Math. Phys. 192, 493 (1998); J. Math.
Phys. 37, 637 (1996); Int. J. Mod. Phys. A10, 2479 (1995).
7. L. Akant, G. S. Krishnaswami and S. G. Rajeev, Theoretical High Energy Physics: MRST 2001, ed. V. Elias et al. (American Institute of Physics, Melville, New York, 2001), p. 267.
8. E. Wigner, Proc. Cambridge Philos. Soc. 47, 790 (1951); reprinted in C. E. Porter,
Statistical Theories of Spectra: Fluctuations (Academic, New York, 1965).
9. F. J. Dyson, J. Math. Phys. 3, 140, 157, 166 (1962).
10. M. L. Mehta, Random Matrices, 2nd ed. (Academic, New York, 1991).
11. P. Cvitanovic, Phys. Lett. B99, 49 (1981); P. Cvitanovic, P. G. Lauwers and P. N.
Scharbach, Nucl. Phys. B186, 165 (1981).
12. D. V. Voiculescu, K. J. Dykema and A. Nica, Free Random Variables (American
Mathematical Society, Providence, USA, 1992).
13. D. V. Voiculescu, Invent. Math. 118, 411 (1994).
14. D. V. Voiculescu, Invent. Math. 132, 189 (1998).
15. D. V. Voiculescu, Free Entropy, http://arXiv.org/abs/math.OA/0103168
16. L. G. Yaffe, Rev. Mod. Phys. 54, 407 (1982).
17. A. Jevicki and B. Sakita, Nucl. Phys. B165, 511 (1980).
18. F. A. Berezin, Commun. Math. Phys. 63, 131 (1978).
19. T. Guhr, A. Mueller-Groeling and H. A. Weidenmuller, Phys. Rep. 299, 189 (1998).
20. A. Sen, Nucl. Phys. Proc. Suppl. 94, 35 (2001).
21. J. Ambjorn, B. Durhuus and T. Jonsson, Quantum Geometry (Cambridge University
Press, Cambridge, 1997); J. Ambjorn, J. Jurkiewicz and R. Loll, Phys. Rev. D64,
044011 (2001).
22. N. I. Akhiezer, Classical Moment Problem (Hafner, New York, 1965).
23. J. Ambjorn, J. Jurkiewicz and Y. M. Makeenko, Phys. Lett. B251, 517 (1990).
24. L. Evens, The Cohomology of Groups (Oxford University Press, UK, 1991).
25. G. Ferretti and S. G. Rajeev, Phys. Rev. Lett. 69, 2033 (1992).
26. E. Brezin, C. Itzykson, G. Parisi and J. B. Zuber, Commun. Math. Phys. 59, 35
(1978).
27. E. Brezin and A. Zee, Nucl. Phys. B402, 613 (1993).
28. M. L. Mehta, Commun. Math. Phys. 79, 327 (1978).
29. V. A. Kazakov, Phys. Lett. A119, 140 (1986).
30. M. Staudacher, Phys. Lett. B305, 332 (1993).
31. M. R. Douglas and M. Li, Phys. Lett. B348, 360 (1995).
32. N. Ishibashi, H. Kawai, Y. Kitazawa and A. Tsuchiya, Nucl. Phys. B498, 467 (1997).
33. H. Cartan, Elementary Theory of Analytic Functions of One or Several Complex
Variables (Hermann, Paris, 1973).