Going to the Pictures: Eigenvector as Fixed Point
by Mervyn Stone
Abstract: Section 1 of this short note reproduces the written version of my discussion of the RSS
Research Section paper [Tyler, 2009] read on 17th December 2008. Section 2 is the fuller explanation,
then promised, of the four slides shown at the meeting. The four Figures are here reproduced as JPEG
files.
1. Written version.
“This paper starts with Cartesian coordinates that come with any data, graduates to matrices and ends up
with affine invariance — in other words, next door to the open-air geometry of coordinate freedom!
I doubt whether the authors needed the algebra of sections 2, 3 and 4 to be confident that that would
happen — before writing the computer program that does have to use coordinates and matrices.
Readers of the paper might have been spared the algebra — if only that great exponent of
coordinate-freedom, Paul Halmos, had gone deeper into probability & statistics to wean us off
coordinates & matrices wherever and whenever these impede understanding.
It is not too late to supply the alternative thin gruel.
(a) A few concepts and terms from the thinnest and least influential books on multivariate analysis:
V is the vector space of variables (made out of p names) and E is its dual space of evaluators e whose
evaluation of variable v (a possible ‘observation’ if v is a name) is the bilinear product [e, v]. V1 and V2
are inner products on V and also so-called ‘covariance operators’ (linear V → E).
(b) Realization that fixed-point theory can open the door to a simplified equivalent eigenanalysis for V1
and V2 : S = {v : (V1 + V2 )(v, v) = 1 and (V1 + V2 )(v, u) ≥ 0 for some fixed u} is the closed surface
of a (V1 + V2 )-hemisphere in V. The transformation T : S → S defined by s → ρ(s) V2^{-1} V1 s is
continuous. So S has a fixed point h with V2 h = ρ(h)V1 h and, as a consequence, you can take it from
here with a willingness to ‘go to the pictures’.
(c) The pictures I will show here are already downloadable and more fully explained in a 2008 research
report (www.ucl.stats.uk/research). Their reassuring features are affine invariants as obvious as three
lines meeting in a point — and, in these troubled days, more liberating than Sudoku.
Research Report No. 299, Department of Statistical Science, UCL, December 2008.
References
Shashkin, Yu. A. (1991) Fixed Points. Providence: American Mathematical Society.
Stone, M. (1987) Coordinate-Free Multivariable Statistics: An Illustrated Geometric Progression from
Halmos to Gauss and Bayes. Oxford: Clarendon Press.”
and now
Tyler, D., Critchley, F., Dumbgen, L. and Oja, H. (2009) ‘Invariant coordinate selection’, J. Roy. Statist.
Soc. B, 71.
2. Fuller explanation of Figures 1-4.
Fig. 1 : Getting a fix (on an eigenvector as a ‘fixed point’ in S).
(i) V is the p-dimensional vector space of variables (linear combinations of the names of the p observed
variables) and E is the p-dimensional dual vector space of evaluators (linear functionals on V).
(ii) The evaluation of variable v by evaluator e is given by the real-valued bilinear product [e, v].
(iii) V1 and V2 are non-singular inner products on V that double as 1-1 linear transformations, V → E,
defined by [Vi v, u] ≡ Vi (v, u) (an identity in u), i = 1, 2.
(iv) S is the (p − 1)-dimensional hemisphere {v : (V1 + V2 )(v, v) = 1 and (V1 + V2 )(v, u) ≥ 0}, for some
fixed u, as in Section 1 (using V1 + V2 as the inner product preserves symmetry).
(v) Losing symmetry for a while, T s =def ρ(s) V2^{-1} V1 s (where ρ(s) is an obliging scalar function of s)
defines a continuous transformation of S onto S. (ρ(s) is continuous because the closed set V2^{-1} V1 S
does not include 0, the only point where ρ cannot be defined).
(vi) S is closed, bounded and simply connected. So fixed-point theory (Shashkin 1991, p.18) tells us
there is at least one point of S, h say, with T h = h ⇔ V2 h = ρV1 h (which restores essential symmetry).
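A coordinate sketch may help here, with the caveat that it is not part of the argument above: steps (i)-(vi) only assert that a fixed point exists, whereas the code below simply iterates T and watches it settle. Everything specific is an assumption made for illustration: p = 2, the matrices V1 and V2, the reference variable U, and the choice of the hemisphere side (V1 + V2)(v, u) ≥ 0 as in Section 1.

```python
import math

# Hypothetical coordinates for p = 2: the inner products become symmetric
# positive-definite 2x2 matrices; variables and evaluators become 2-vectors.
V1 = [[2.0, 0.5], [0.5, 1.0]]
V2 = [[1.0, 0.2], [0.2, 3.0]]
U = [1.0, 0.0]  # assumed fixed variable u that picks out the hemisphere


def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]


def solve(m, b):
    """Solve the 2x2 system m x = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - b[0] * m[1][0]) / det]


def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]


def W(x, y):
    """The symmetric inner product (V1 + V2)(x, y)."""
    return dot(matvec(V1, x), y) + dot(matvec(V2, x), y)


def T(s):
    """s -> rho(s) V2^{-1} V1 s, with rho(s) chosen so the image lies on S."""
    w = solve(V2, matvec(V1, s))
    scale = 1.0 / math.sqrt(W(w, w))  # unit (V1 + V2)-length
    if W(w, U) < 0:                   # flip sign to stay in the hemisphere
        scale = -scale
    return [scale * w[0], scale * w[1]]


# Start from u itself, scaled onto S, and iterate T.
norm_u = math.sqrt(W(U, U))
s = [U[0] / norm_u, U[1] / norm_u]
for _ in range(200):
    s = T(s)

h = s
V1h, V2h = matvec(V1, h), matvec(V2, h)
rho = dot(V2h, h) / dot(V1h, h)
# Size of V2 h - rho V1 h; close to zero when h is (numerically) a fixed point.
residual = max(abs(V2h[0] - rho * V1h[0]), abs(V2h[1] - rho * V1h[1]))
```

For these particular matrices the iteration settles quickly, and `residual` measures how nearly V2 h = ρ V1 h holds. Iteration is only one way of locating a fixed point, and it finds just one of them; the existence argument does not depend on it.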
Fig. 2 : An eigenline and its eigenvalue.
Fig. 2 then exhibits the coordinate-free eigenline H =def {ch : −∞ < c < ∞}. Its eigenvalue ρ is the
square of the ratio of the V2 and V1 lengths of any interval of H.
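The length-ratio reading of ρ can be checked numerically. The sketch below is an illustration in assumed coordinates (p = 2, with hypothetical matrices V1 and V2), and it finds the eigenlines from the quadratic det(V2 − ρ V1) = 0, which is exactly the matrix route the note is trying to avoid; it is used here only as a check.

```python
import math

# Hypothetical 2x2 symmetric positive-definite matrices standing in for V1, V2.
V1 = [[2.0, 0.5], [0.5, 1.0]]
V2 = [[1.0, 0.2], [0.2, 3.0]]


def form(m, x, y):
    """Inner product m(x, y) for a symmetric 2x2 matrix m."""
    return (x[0] * (m[0][0] * y[0] + m[0][1] * y[1])
            + x[1] * (m[1][0] * y[0] + m[1][1] * y[1]))


def eigenlines(v1, v2):
    """Return [(rho, h), ...] with v2 h = rho v1 h, via det(v2 - rho v1) = 0."""
    a = v1[0][0] * v1[1][1] - v1[0][1] ** 2
    b = v2[0][0] * v1[1][1] + v1[0][0] * v2[1][1] - 2 * v1[0][1] * v2[0][1]
    c = v2[0][0] * v2[1][1] - v2[0][1] ** 2
    root = math.sqrt(b * b - 4 * a * c)
    out = []
    for rho in ((b - root) / (2 * a), (b + root) / (2 * a)):
        # Null vector of (v2 - rho v1), read off its second row
        # (assumes that row is not identically zero).
        h = [v2[1][1] - rho * v1[1][1], -(v2[0][1] - rho * v1[0][1])]
        out.append((rho, h))
    return out


def squared_length_ratio(h, c_start, c_end):
    """(V2-length / V1-length)^2 of the interval from c_start*h to c_end*h."""
    d = [(c_end - c_start) * h[0], (c_end - c_start) * h[1]]
    return form(V2, d, d) / form(V1, d, d)


# For each eigenline, compare rho with the ratio on two different intervals.
results = [(rho,
            squared_length_ratio(h, 0.0, 1.0),
            squared_length_ratio(h, -2.0, 5.0))
           for rho, h in eigenlines(V1, V2)]
```

Each triple in `results` holds ρ followed by the squared length ratio for two different intervals of the same eigenline. The three numbers agree, because rescaling an interval of H multiplies its V1-length and its V2-length by the same factor.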
Fig. 3 : Reducing dimensionality—for a second fix.
K∇ =def K : {v : [e, v] = 0 for all e in K} is the (p − 1)-dimensional bi-orthogonal complement of the
1-dimensional K. As drawn, it is both V1 -orthogonal and V2 -orthogonal to H. For p = 2, K∇ is
automatically the only other eigenline. For p > 2, another eigenline can be proved to exist in K∇ by the
same semi-fixed point analysis that detected H in S. And so on — until the existence of p eigenlines and
their corresponding eigenvalues has been established.
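The p = 2 case of this reduction is small enough to check by hand or by machine. The sketch below rests on assumptions that are not in the text: coordinates with hypothetical matrices V1 and V2, and the reading that K is the line of evaluators spanned by V1 h (which is also the line spanned by V2 h, since V2 h = ρ V1 h). The figure itself defines K, so this reading is a guess.

```python
import math

# Hypothetical 2x2 symmetric positive-definite matrices standing in for V1, V2.
V1 = [[2.0, 0.5], [0.5, 1.0]]
V2 = [[1.0, 0.2], [0.2, 3.0]]


def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]


def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]


def form(m, x, y):
    """Inner product m(x, y) = [m x, y]."""
    return dot(matvec(m, x), y)


# First eigenline H: smaller root of det(V2 - rho V1) = 0 and a null vector h.
a = V1[0][0] * V1[1][1] - V1[0][1] ** 2
b = V2[0][0] * V1[1][1] + V1[0][0] * V2[1][1] - 2 * V1[0][1] * V2[0][1]
c = V2[0][0] * V2[1][1] - V2[0][1] ** 2
rho = (b - math.sqrt(b * b - 4 * a * c)) / (2 * a)
h = [V2[1][1] - rho * V1[1][1], -(V2[0][1] - rho * V1[0][1])]

# Assumed reading of K: the evaluator line spanned by k = V1 h.
k = matvec(V1, h)
# Its annihilator in V: every v with [k, v] = 0. For p = 2 this is one line.
v = [-k[1], k[0]]

v1_orth = form(V1, h, v)  # V1(h, v), expected to vanish
v2_orth = form(V2, h, v)  # V2(h, v), expected to vanish

# v should itself span an eigenline: V2 v = rho2 V1 v for some rho2.
rho2 = form(V2, v, v) / form(V1, v, v)
V1v, V2v = matvec(V1, v), matvec(V2, v)
residual = max(abs(V2v[0] - rho2 * V1v[0]), abs(V2v[1] - rho2 * V1v[1]))
```

Here `v1_orth` and `v2_orth` come out as zero to rounding error, and `residual` confirms that the annihilator is the second eigenline, with its own eigenvalue `rho2`. For p > 2 the same annihilator has dimension p − 1, and the search for a further eigenline would be repeated inside it.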
Fig. 4 : Invariance of ρ in E and ‘equivariance’ for eigenline evaluation.
(i) Fig. 4 relates to Theorem 1 of the paper’s Section 4. I have dropped the change of origin to b as a
trivial complication in the drawing of the figure, whose essential features do not implicate b. The figure
refers to a generic (jth!) eigenline in the generation of the p eigenlines (excluding the degenerate final
stage).
(ii) A′ : V → V is the dual of A : E → E. It is determined by the identity [e, A′ v] ≡ [Ae, v] (an identity in e).
(iii) When V is a variance inner product for a probability distribution P on E, it can be represented as
V = ∫ e² dP(e) in suggestive tensor operator language (Stone, 1987). The basic transformation A can be
interpreted as relocating and reshaping the distribution P : dP(e) at e is moved to be dP(e) at Ae in
order to define P*. It can be verified (within the language) that the variance inner product for the
transformed distribution P* is AV A′ : V → E (which is an inner product on V since
[AV A′ u, v] = [V A′ u, A′ v] = V (A′ u, A′ v)).
(iv) The identities [e, h] ≡ [e, A′ h*] ≡ [Ae, h*] ≡ [e*, h*] establish an equivariance when P* replaces P:
evaluations are invariant under the joint transformations e → e* = Ae and h → h* = (A′)^{-1} h. For
equation (18), Fig. 4 illustrates an invariance of the eigenvalue. The γ in the paper’s equation (18) arises
only when γ1 and γ2 (in the proof in Appendix A) are taken to be unequal. The βj in equation (19) is
absent when b = 0, while the αj is unity if we are content to scale h and h* so that h = A′ h*.
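The invariance and equivariance in (iii) and (iv) can be spot-checked in coordinates. This is a sketch under assumptions, not the paper's Theorem 1: p = 2, b = 0, hypothetical matrices V1, V2 and A, and the usual dot product for [e, v], under which the dual of A is represented by the transposed matrix.

```python
import math

# Hypothetical coordinates: [e, v] is the dot product, so the dual of A
# is represented by the transpose of the matrix of A.
V1 = [[2.0, 0.5], [0.5, 1.0]]
V2 = [[1.0, 0.2], [0.2, 3.0]]
A = [[1.0, 2.0], [0.5, 3.0]]  # any non-singular matrix would do


def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]


def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def solve(m, b):
    """Solve the 2x2 system m x = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - b[0] * m[1][0]) / det]


def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]


# One eigenline of the original pair: V2 h = rho V1 h.
a = V1[0][0] * V1[1][1] - V1[0][1] ** 2
b = V2[0][0] * V1[1][1] + V1[0][0] * V2[1][1] - 2 * V1[0][1] * V2[0][1]
c = V2[0][0] * V2[1][1] - V2[0][1] ** 2
rho = (b - math.sqrt(b * b - 4 * a * c)) / (2 * a)
h = [V2[1][1] - rho * V1[1][1], -(V2[0][1] - rho * V1[0][1])]

# Transformed inner products A V A' and transformed eigenvector h* = (A')^{-1} h.
At = transpose(A)
V1_star = matmul(matmul(A, V1), At)
V2_star = matmul(matmul(A, V2), At)
h_star = solve(At, h)

# Invariance of the eigenvalue: V2* h* = rho V1* h* with the SAME rho.
lhs, rhs = matvec(V2_star, h_star), matvec(V1_star, h_star)
residual = max(abs(lhs[0] - rho * rhs[0]), abs(lhs[1] - rho * rhs[1]))

# Equivariance of evaluation: [e, h] = [e*, h*] with e* = A e.
e = [0.3, -1.2]
e_star = matvec(A, e)
evaluation_gap = abs(dot(e, h) - dot(e_star, h_star))
```

Both `residual` and `evaluation_gap` vanish to rounding error. The eigenvalue survives the transformation unchanged, and the evaluation of the eigenline variable is the same before and after, provided h and h* are scaled so that h = A′ h*.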