Exact probabilities for typical ranks of 2 × 2 × 2
and 3 × 3 × 2 tensors
Göran Bergqvist
Linköping University Post Print
N.B.: When citing this work, cite the original article.
Original Publication:
Göran Bergqvist, Exact probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors,
2013, Linear Algebra and its Applications, (438), 2, 663-667.
http://dx.doi.org/10.1016/j.laa.2011.02.041
Copyright: Elsevier
http://www.elsevier.com/
Postprint available at: Linköping University Electronic Press
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-85549
Exact probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2
tensors
Göran Bergqvist
Matematiska institutionen, Linköpings universitet,
SE-581 83 Linköping, Sweden
[email protected]
May 12, 2010
Abstract
We show that the probability that a 2 × 2 × 2 tensor with elements from a standard normal
distribution has rank 2 is π/4, and that the probability that a 3 × 3 × 2 tensor has rank 3 is 1/2.
The proof applies results on the expected number of real generalized eigenvalues of random
matrices. For n × n × 2 tensors with n ≥ 4 we also present some new aspects of their rank.
Keywords: tensors, multi-way arrays, typical rank, random matrices
AMS classification codes: 15A69, 15B52
1 Introduction
The rank concept for multi-way arrays or tensors is not as simple as for matrices. If T is a real
m × n × p 3-way array (or 3-tensor), then it can be expanded as
T = Σ_{i=1}^{r} c_i u_i ⊗ v_i ⊗ w_i    (c_i ∈ R, u_i ∈ R^m, v_i ∈ R^n, w_i ∈ R^p)    (1)
[2, 8, 10], where ⊗ denotes the tensor (or outer) product. This is called the CP expansion (or
CANDECOMP for canonical decomposition, or PARAFAC for parallel factors) of T , and the
extension to higher order tensors or arrays is obvious.
There are several rank concepts for tensors [1, 3, 4, 7, 10, 11]. The rank of T is the minimal
possible value of r in the CP expansion (1) and is always well defined. The column (mode-1)
rank r_1 of T is the dimension of the subspace of R^m spanned by the np columns of T (for every
fixed pair of values (j, k) we have such a column). The row (mode-2) rank r_2 and the mode-3
rank r_3 are defined analogously. The triple (r_1, r_2, r_3) is called the multirank of T. A typical rank
of T is a rank which appears with nonzero probability when the elements T_{ijk} are randomly chosen
from a continuous probability distribution. A generic rank is a typical rank which appears with
probability 1. In the matrix case, the number of terms r in the singular value expansion is
always equal to the column and row ranks of the matrix. However, for tensors, r, r_1, r_2 and r_3
can all be different. The typical and generic ranks of an m × n matrix are always
min(m, n). However, for a higher order tensor a generic rank over the real numbers does not
necessarily exist (over the complex numbers a generic rank always exists, but in this paper we
only consider real tensors and real CP expansions). Both the typical and generic ranks of an
m × n × p tensor may be strictly greater than min(m, n, p), and are in general hard to calculate.
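As an illustration of these definitions, the multirank is easy to compute: each mode-k rank is the ordinary matrix rank of the corresponding unfolding of T. The following is a minimal NumPy sketch (the function name is my own); computing the CP rank r itself admits no comparably simple procedure.
```python
import numpy as np

def multirank(T):
    """Return the multirank (r1, r2, r3) of an m x n x p array T.

    Each mode-k rank is the ordinary matrix rank of the mode-k unfolding,
    whose columns are the mode-k fibres of T.
    """
    m, n, p = T.shape
    r1 = np.linalg.matrix_rank(T.reshape(m, n * p))                     # fibres T[:, j, k]
    r2 = np.linalg.matrix_rank(T.transpose(1, 0, 2).reshape(n, m * p))  # fibres T[i, :, k]
    r3 = np.linalg.matrix_rank(T.transpose(2, 0, 1).reshape(p, m * n))  # fibres T[i, j, :]
    return r1, r2, r3

# A random 2 x 2 x 2 tensor almost surely has multirank (2, 2, 2),
# even though its CP rank may be either 2 or 3.
print(multirank(np.random.standard_normal((2, 2, 2))))
```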
In 1989, using numerical simulations with each tensor element drawn from a normal distribution with zero mean, Kruskal [12] reported that the probabilities for a 2 × 2 × 2 tensor to have
rank 2 or 3 are approximately 79% and 21% respectively. Hence, ranks 2 and 3 are typical, and
no generic rank exists (over the complex numbers it is 2). It was later shown [15] for all n ≥ 2,
that for n × n × 2 tensors there is no generic rank and that the typical ranks are n and n + 1.
The generic and typical ranks of m × n × p tensors for several small values of m, n and p have
recently been determined [3, 16]. While a generic rank seems to exist for most (m, n, p), another
case with two typical ranks is 5 × 3 × 3 tensors for which 5 and 6 are typical ranks.
Until now, the probabilities for random tensors to be of the different possible typical ranks
have only been studied by numerical simulations. Below we derive the first exact values of such
probabilities, namely for 2 × 2 × 2 and 3 × 3 × 2 tensors.
For other aspects of the CP expansion, such as uniqueness of the expansion, estimates
of maximal rank, and low-rank approximations, we refer to the review paper [10] and the
references therein; other types of tensor decompositions (e.g., Tucker and higher-order singular
value decompositions) and applications are also described there.
2 A characterization of n × n × 2 tensors of rank n
We now assume that T is a real n × n × 2 tensor whose elements T_{ijk}, 1 ≤ i, j ≤ n, 1 ≤ k ≤ 2,
are picked from some continuous probability distribution. The theorem presented in this section
is essentially given by ten Berge [14], although the probabilistic view was not used there (but
see [13]). Define the n × n matrices T1 and T2, the frontal slices of T, by (T1)_{ij} = T_{ij1} and
(T2)_{ij} = T_{ij2}. We have
Theorem 1. With probability 1:
rank(T) = n ⇐⇒ det(T2 − λT1) = 0 has n real solutions.
Proof. Suppose first that rank(T ) = n. Then we can write
T = Σ_{i=1}^{n} u_i ⊗ v_i ⊗ (d_i, e_i)^T    (2)
Following [14], we define the n × n matrices U with columns u_i, V with columns v_i, D diagonal
with elements d_i on the diagonal, and E diagonal with elements e_i on the diagonal. Then
T1 = UDV^T and T2 = UEV^T. If we assume that T1 is invertible (true with probability 1),
then det U ≠ 0 ≠ det V and all d_i ≠ 0. Hence the generalized eigenvalue equation
0 = det(T2 − λT1) = det U det(E − λD) det V = det U det V ∏_{i=1}^{n} (e_i − λd_i) has n real
solutions, namely λ_i = e_i/d_i.
Conversely, if det(T2 − λT1) = 0 has n real solutions, then (again assuming T1 invertible)
det(T2 T1^{−1} − λI) = 0 has n real solutions. With probability 1, all eigenvalues of T2 T1^{−1} are
distinct and it can be diagonalized as T2 T1^{−1} = PΛP^{−1} (with coinciding eigenvalues there could
be non-trivial Jordan blocks). Define U = P and V^T = P^{−1}T1. Then T1 = PV^T = UIV^T
and T2 = PΛP^{−1}T1 = UΛV^T, and T has the expansion
T = Σ_{i=1}^{n} u_i ⊗ v_i ⊗ (1, λ_i)^T    (3)
and is therefore of rank n.
Notice that in the proof, the CP expansion of T is actually constructed. The exceptional
(probability 0) cases were discussed in some detail by ten Berge [14] but are not relevant for the
results of this paper.
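For concreteness, the construction in the proof can be carried out numerically. The following NumPy sketch (the function name and the tolerance are my own choices) assumes T1 is invertible and that the generalized eigenvalues are real and distinct; when an eigenvalue is complex it reports rank n + 1, in accordance with Theorem 1.
```python
import numpy as np

def rank_n_expansion(T1, T2, tol=1e-10):
    """Attempt the rank-n expansion of the n x n x 2 tensor with frontal
    slices T1, T2, following the proof of Theorem 1.

    Returns (U, V, lam) with T1 = U V^T and T2 = U diag(lam) V^T, so that
    T = sum_i u_i (x) v_i (x) (1, lam_i)^T.  Raises ValueError when some
    generalized eigenvalue is complex, i.e. when the rank is n + 1.
    """
    M = T2 @ np.linalg.inv(T1)          # T1 is invertible with probability 1
    lam, P = np.linalg.eig(M)           # T2 T1^{-1} = P diag(lam) P^{-1}
    if np.max(np.abs(lam.imag)) > tol:
        raise ValueError("complex generalized eigenvalues: rank is n + 1")
    lam, P = lam.real, P.real
    Vt = np.linalg.inv(P) @ T1          # V^T = P^{-1} T1
    return P, Vt.T, lam

# Example with a random 3 x 3 x 2 tensor, slices drawn from N(0, 1).
T1, T2 = np.random.standard_normal((2, 3, 3))
try:
    U, V, lam = rank_n_expansion(T1, T2)
    print(np.allclose(T1, U @ V.T), np.allclose(T2, U @ np.diag(lam) @ V.T))
except ValueError as err:
    print(err)
```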
3 The expected number of real generalized eigenvalues of random matrices
Consider random matrices whose elements are independent random variables and normally distributed with mean 0 and variance 1. Kanzieper and Akemann [9] (see also Edelman [6]) have
found the probabilities for such a random n × n matrix A to have k real eigenvalues, i.e., the
probability for having k real solutions to det(A − λI) = 0. Since T2 T1^{−1} is not normally
distributed we cannot apply their result to it. The same problem for the generalized eigenvalue
problem det(A − λB) = 0, with B having the same distribution as A, seems to be unsolved.
In [5], however, Edelman, Kostlan and Shub found both the expected number of real eigenvalues
and the expected number of real generalized eigenvalues for such random matrices. Let E_n
be the expected number of real solutions to det(A − λB) = 0 with A and B as above. Then [5]
E_n = √π Γ((n+1)/2) / Γ(n/2)    (4)
It turns out that this information is sufficient to solve our problem for n × n × 2 tensors with
n = 2 and n = 3.
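Formula (4) is straightforward to evaluate; a minimal sketch in Python (the function name is my own) gives E_2 = π/2 and E_3 = 2, which are used below, as well as E_4 = 3π/4 and E_5 = 8/3, which appear in (7) and (8).
```python
from math import gamma, pi, sqrt

def expected_real_generalized_eigenvalues(n):
    """E_n = sqrt(pi) * Gamma((n + 1)/2) / Gamma(n/2), formula (4) from [5]."""
    return sqrt(pi) * gamma((n + 1) / 2) / gamma(n / 2)

for n in range(2, 6):
    print(n, expected_real_generalized_eigenvalues(n))
# n = 2: pi/2 ~ 1.571,  n = 3: 2.0,  n = 4: 3*pi/4 ~ 2.356,  n = 5: 8/3 ~ 2.667
```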
4 Probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors
Now assume that T is an n × n × 2 tensor whose elements are independent random variables
and normally distributed with mean 0 and variance 1. Then the slices T1 and T2 are random
n × n matrices. The expected number E_n of real solutions to det(T2 − λT1) = 0 is given by (4).
We also have
E_n = Σ_{k=1}^{n} k p_n(k)    (5)
where p_n(k) is the probability for having k real generalized eigenvalues. Since complex eigenvalues
come in pairs we can write
E_n = Σ_{k=0}^{[(n−1)/2]} (n − 2k) p_n(n − 2k)    (6)
The following is our main theorem.
Theorem 2. Suppose that T is an n × n × 2 tensor whose elements are independent random
variables which are normally distributed with mean 0 and variance 1, and let P_n denote the
probability that T has rank n. Then P_2 = π/4 and P_3 = 1/2.
Proof. By Theorem 1, P_n = p_n(n) is equal to the probability that det(T2 − λT1) = 0 has n
real solutions. For n = 2, by (4), the expected number of real solutions is E_2 = √π Γ(3/2)/Γ(1).
Recall that Γ(m) = (m − 1)! if m is an integer and Γ(m/2) = √π (m − 2)!!/2^{(m−1)/2} if m is an
odd integer. Hence E_2 = √π · (√π/2)/1 = π/2. By (6), E_2 = 2p_2(2) and we conclude that
P_2 = p_2(2) = π/4.
For n = 3, again by (4), E_3 = √π Γ(2)/Γ(3/2) = √π · 1/(√π/2) = 2. By (6), E_3 = 3p_3(3) + 1·p_3(1) =
3p_3(3) + 1·(1 − p_3(3)) = 2p_3(3) + 1 = 2P_3 + 1, so we conclude that P_3 = 1/2.
Since the only other typical rank of an n × n × 2 tensor is n + 1 we immediately have:
Corollary 3. Suppose that T is an n × n × 2 tensor whose elements are independent random
variables which are normally distributed with mean 0 and variance 1, and let P̃_n denote the
probability that T has rank n + 1. Then P̃_2 = 1 − π/4 and P̃_3 = 1/2.
For n ≥ 4, there are at least three terms in (6), so the condition Σ_{k=0}^{[n/2]} p_n(n − 2k) = 1 is not
sufficient to determine P_n = p_n(n). For n = 4 we get
2p_4(2) + 4p_4(4) = E_4 = √π Γ(5/2)/Γ(2) = 3π/4;    p_4(0) + p_4(2) + p_4(4) = 1    (7)
and for n = 5
p_5(1) + 3p_5(3) + 5p_5(5) = E_5 = √π Γ(3)/Γ(5/2) = 8/3;    p_5(1) + p_5(3) + p_5(5) = 1    (8)
and so on for increasing values of n.
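Although (7), (8) and their analogues do not determine the individual probabilities, these can be estimated by simulation, as in [12]. The following is a minimal Monte Carlo sketch (the function name, the sample size and the tolerance for treating an eigenvalue as real are my own choices):
```python
import numpy as np
from collections import Counter

def estimate_pn(n, trials=50_000, tol=1e-9, seed=0):
    """Monte Carlo estimate of p_n(k): draw standard normal slices T1, T2
    and count the real solutions of det(T2 - lambda*T1) = 0."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(trials):
        T1, T2 = rng.standard_normal((2, n, n))
        lam = np.linalg.eigvals(T2 @ np.linalg.inv(T1))
        counts[int(np.sum(np.abs(lam.imag) < tol))] += 1
    return {k: c / trials for k, c in sorted(counts.items())}

# For n = 4 the estimates should satisfy (7): the probabilities sum to 1
# and 2*p_4(2) + 4*p_4(4) should be close to 3*pi/4.
print(estimate_pn(4))
```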
Remark. It is interesting to interpret the above results as statements about how often curves
intersect. For the case n = 2, expanding det(T2 − xT1) = 0 to an equation of the type
(a_1x − a_2)(a_3x − a_4) = (a_5x − a_6)(a_7x − a_8), we see that the probability for two parabolas
with real roots to intersect is π/4, when the coefficients a_i are chosen from a standard normal
distribution. Since the equation has real solutions if B^2 − 4AC ≥ 0, where A = a_1a_3 − a_5a_7,
B = a_5a_8 + a_6a_7 − a_1a_4 − a_2a_3, and C = a_2a_4 − a_6a_8 (for T, B^2 − 4AC is just its
hyperdeterminant [4]), it is very simple to verify that the value of P_2 must be close to π/4 by
running a large number of tests with each a_i drawn from the normal distribution N(0, 1) and checking
how often B^2 ≥ 4AC. Since π/4 ≈ 0.7854, the value 79% from Kruskal's original
simulation [12] is a good approximation. For n = 3 a corresponding result for cubic equations is obtained,
and simple simulations likewise show that the value of P_3 must be near the exact value 1/2
found above.
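The verification described in the remark takes only a few lines; a minimal sketch (the sample size is arbitrary) draws a_1, ..., a_8 from N(0, 1) and checks how often B^2 ≥ 4AC:
```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((1_000_000, 8))   # columns are a_1, ..., a_8 ~ N(0, 1)
A = a[:, 0] * a[:, 2] - a[:, 4] * a[:, 6]
B = a[:, 4] * a[:, 7] + a[:, 5] * a[:, 6] - a[:, 0] * a[:, 3] - a[:, 1] * a[:, 2]
C = a[:, 1] * a[:, 3] - a[:, 5] * a[:, 7]
# The empirical frequency of a non-negative discriminant should be close to pi/4.
print(np.mean(B**2 >= 4 * A * C), np.pi / 4)
```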
References
[1] G Bergqvist and E G Larsson ”The higher-order singular value decomposition: theory and
an application” IEEE Signal Proc. Mag. 27 (2010) 151–154
[2] J D Carroll and J J Chang ”Analysis of individual differences in multidimensional scaling via
N-way generalization of Eckart-Young decomposition” Psychometrika 35 (1970) 283–319
[3] P Comon, J M F ten Berge, L De Lathauwer and J Castaing ”Generic and typical ranks of
multi-way arrays” Lin. Alg. Appl. 430 (2009) 2997–3007
[4] V De Silva and L-H Lim ”Tensor rank and the ill-posedness of the best low-rank approximation problem” SIAM J. Matrix Anal. Appl. 30 (2008) 1084–1127
[5] A Edelman, E Kostlan and M Shub ”How many eigenvalues of a random matrix are real?”
J. Amer. Math. Soc. 7 (1994) 247–267
[6] A Edelman ”The probability that a random real Gaussian matrix has real eigenvalues,
related distributions, and the circular law” J. Multivariate Anal. 60 (1997) 203–232
[7] S Friedland ”On the generic rank of 3-tensors” preprint, 2009 (arXiv:0805.3777v3)
[8] R Harshman ”Foundations of the PARAFAC procedure: Models and conditions for an
explanatory multi-modal factor analysis” UCLA working papers in phonetics 16 (1970)
1–84
[9] E Kanzieper and G Akemann ”Statistics of real eigenvalues in Ginibre’s ensemble of random
real matrices.” Phys. Rev. Lett. 95 (2005) 230201
[10] T G Kolda and B W Bader ”Tensor decompositions and applications” SIAM Review 51
(2009) 455–500
[11] J B Kruskal ”Three-way arrays: rank and uniqueness of trilinear decompositions, with
applications to arithmetic complexity and statistics” Lin. Alg. Appl. 18 (1977) 95–138
[12] J B Kruskal ”Rank, decomposition, and uniqueness for 3-way and N-way arrays” in Multiway data analysis 7–18, North-Holland (Amsterdam), 1989
[13] A Stegeman ”Degeneracy in CANDECOMP/PARAFAC explained for p × p × 2 arrays of
rank p + 1 or higher” Psychometrika 71 (2006) 483–501
[14] J M F ten Berge ”Kruskal’s polynomial for 2 × 2 × 2 arrays and a generalization to 2 × n × n
arrays” Psychometrika 56 (1991) 631–636
[15] J M F ten Berge and H A L Kiers ”Simplicity of core arrays in three-way principal component analysis and the typical rank of p × q × 2 arrays” Lin. Alg. Appl. 294 (1999) 169–179
[16] J M F ten Berge and A Stegeman ”Symmetry transformations for squared sliced three-way
arrays, with applications to their typical rank” Lin. Alg. Appl. 418 (2006) 215–224