A blitzkrieg through decompositions of linear transformations

Anirbit
Department of Theoretical Physics, Tata Institute of Fundamental Research, Mumbai 400005, India
(Dated: August 18, 2009)

The objective of this article is to indicate the arguments that go into understanding the basic kinds of decompositions of a linear transformation. The core ideas have been pointed out with the expectation that the reader can fill in the detailed proofs, but the two decomposition theorems mentioned next are proven in detail. The starting point is to understand the notion of diagonalizability of matrices, of which 3 versions will be explained: through matching of algebraic and geometric dimensions of eigenvalues, through existence of distinct roots of minimal polynomials, and through families of projection operators. The final goal is to understand the Primary Decomposition Theorem (Rational Form) and the Cyclic Decomposition Theorem, how the first alone gives the powerful Jordan-Chevalley Decomposition, and how both together give the miraculous Jordan Decomposition. En route we will need to understand the concepts of conductors and annihilators of vectors, cyclic vectors and cyclic vector spaces, minimal polynomials and characteristic polynomials, the Cayley-Hamilton Theorem, Lagrange Polynomials and the idea of companion matrices. Towards the end, ideas have been indicated about how these decomposition concepts generalize, e.g. to certain kinds of groups. The first appendix gives some elementary concepts about Lie Algebras and Representation Theory, and the second appendix gives some elementary concepts about Groups, Rings, Fields and Vector Spaces.

I. MOTIVATION

Given a matrix representation of a linear transformation on a finite dimensional vector space one can ask the following natural questions:

• Can it be decided whether the matrix is diagonalizable without finding the eigenvectors?

• Can any matrix be block-diagonalized?
• Can any matrix be brought to a form where there are entries only along the diagonal and the super/sub-diagonal, with the entries known from just the characteristic and the minimal polynomial?

The answer to the first two questions is affirmative, and over algebraically closed fields like C the answer to the last question is also affirmative. The point of this article is to indicate the key ideas that explain the above.

A. Decomposition of Linear Transformations

For the cause of Representation Theory it is important to understand the elementary ideas that go into decomposing a linear transformation into transformations on smaller dimensional vector spaces. It is desirable that a given linear transformation on an n-dimensional vector space be writable as a "direct sum" of n linear transformations on one-dimensional subspaces of the original space. This is the idea behind "diagonalization". But we know that not all linear transformations are diagonalizable, and then it becomes necessary to understand which transformations are diagonalizable and when. Further, if the matrix of the linear transformation is not diagonalizable, it might still be reducible to a form that is "block-diagonal", and we need to see what is the simplest block-diagonal form to which a matrix can be reduced. Here I shall list, in a logical sequence, the theorems which establish the above, and I shall indicate the basic idea behind them, omitting the detailed proofs. All vector spaces in the following section are finite dimensional. Many of the concepts might not naturally extend to infinite dimensional vector spaces.

Three elementary operations on vector spaces:

• Given two subspaces W1 and W2 of a vector space V, we denote by W1 + W2 the sum of the two subspaces, defined as the set {w1 + w2 | w1 ∈ W1 and w2 ∈ W2}. The notion extends naturally to an arbitrary number of subspaces. One notes that dim(W1) + dim(W2) = dim(W1 ∩ W2) + dim(W1 + W2).
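The dimension identity above can be machine-checked on a small example; a minimal sketch using sympy (the choice of tool and of the two subspaces is my own illustration, not from the text):

```python
from sympy import Matrix

# W1 = span{e1, e2}, W2 = span{e2, e3} inside R^3
M1 = Matrix([[1, 0], [0, 1], [0, 0]])   # columns span W1
M2 = Matrix([[0, 0], [1, 0], [0, 1]])   # columns span W2

dim_W1 = M1.rank()
dim_W2 = M2.rank()
dim_sum = Matrix.hstack(M1, M2).rank()   # dim(W1 + W2)

# solutions of M1*a = M2*b parametrize the intersection; since both
# matrices have full column rank, its dimension is the nullity of [M1 | -M2]
dim_cap = 4 - Matrix.hstack(M1, -M2).rank()

# dim(W1) + dim(W2) = dim(W1 ∩ W2) + dim(W1 + W2): here 2 + 2 = 1 + 3
print(dim_W1 + dim_W2 == dim_cap + dim_sum)
```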
• Given k subspaces of V, say W1, W2, ..., Wk, this set of subspaces is called independent if for any set of k vectors, one from each subspace (say wi from Wi, i ∈ {1, 2, ..., k}), the relation w1 + w2 + ... + wk = 0 implies each wi = 0.

• Given k subspaces of V, say W1, W2, ..., Wk, the subspace W defined as W = W1 + W2 + ... + Wk is said to be a direct sum of these k subspaces (denoted as W = W1 ⊕ W2 ⊕ ... ⊕ Wk) if these k subspaces are independent. One notes that W could be equal to V, and then one would say that the k subspaces Wi, i ∈ {1, 2, ..., k} of V form a direct sum decomposition of V, denoted V = W1 ⊕ W2 ⊕ ... ⊕ Wk. One then sees that an ordered basis of V is obtained by concatenating ordered bases, one from each Wi.

I shall soon show a natural example of a set of subspaces of V, attached to a given endomorphism of it, which will always be independent but in general will NOT form a direct sum decomposition of the full space V. Such an example comes from the following crucial concept. The following are equivalent ways of defining a Characteristic Value (also known as an "Eigenvalue") c of an endomorphism T of a finite dimensional vector space V:

• c is a characteristic value of T

• The operator (T − cI) is singular (not invertible)

• det(T − cI) = 0

If the field of the vector space is not algebraically closed, the existence of characteristic values isn't guaranteed. One can easily see that the matrix

( 0 −1 )
( 1  0 )

doesn't have any characteristic values in R.

• If c is a characteristic value of an endomorphism A acting on the vector space V, then a non-zero element v ∈ V is called a Characteristic Vector (or an "Eigenvector") for c if Av = cv.

• An endomorphism of a vector space V is called Diagonalizable if there is a basis of V consisting of eigenvectors. It is easy to see that in that basis the matrix representation of the map will be diagonal.

• One can see that for a characteristic value c of an endomorphism of V, the subset of V consisting of characteristic vectors for c (together with 0) forms a vector subspace of V.

The space spanned by all the characteristic vectors of a given eigenvalue is called the Characteristic Space or Eigenspace for that characteristic value. This subspace can also be thought of as the null space of the operator T − cI (where T is the linear map). The dimension of the eigenspace of an eigenvalue is called the Geometric Dimension of that eigenvalue, and its multiplicity in the characteristic polynomial is called its Algebraic Dimension.

Just by virtue of the above 3 equivalent definitions of an eigenvalue, the following observations and concepts follow:

• Linear transformations related by a similarity transformation have the same characteristic values. Being similar is an equivalence relation on the set of all linear transformations, and this splits the space into equivalence classes; the characteristic values give a multivalued function on each equivalence class. Such structures shall be ubiquitous in Representation Theory and are called Class Functions.

• The polynomial defined as f(x) = det(T − xI) is called the Characteristic Polynomial, and its roots are precisely the characteristic values.

1. Diagonalizability when algebraic dimension equals geometric dimension

We will realize the criteria of diagonalizability in multiple ways, some of which are conceptually cleaner and some of which are computationally efficient.

Diagonalizability (Version 1)
Let T be an endomorphism of a finite dimensional vector space V, let {c1, c2, ..., ck} be the set of characteristic values of T, and let Wi be the null space of (T − ci I). Then the following are equivalent:

• T is diagonalizable

• The characteristic polynomial of T is of the form f(x) = (x − c1)^dim(W1) (x − c2)^dim(W2) ... (x − ck)^dim(Wk). In other words, for each eigenvalue its Algebraic Dimension = Geometric Dimension.

• dim(V) = dim(W1) + dim(W2) + ... + dim(Wk)

The crucial ingredients that go into the above recipe are:

• The nullity of a diagonal matrix is equal to the number of 0s along its diagonal.

• The null spaces, one for each characteristic value of a given endomorphism of a finite dimensional vector space V, are always independent, and hence if W = W1 + W2 + ... + Wk then dim(W) = dim(W1) + dim(W2) + ... + dim(Wk). But in general dim(W) < dim(V), and hence in general the null spaces don't give a direct sum decomposition. When the third criterion above holds, i.e. dim(W) = dim(V), the null spaces of the distinct eigenvalues give a direct sum decomposition of the full space, and hence diagonalizability.

An example of the conflict between geometric dimension and algebraic dimension

Consider the 2 matrices

A =
( 3 1 −1 )
( 2 2 −1 )
( 2 2  0 )

B =
( 5 −6 −6 )
( −1 4  2 )
( 3 −6 −4 )

A simple calculation shows that both of them have the same characteristic polynomial (x − 1)(x − 2)^2. So both of them have eigenvalues 1 and 2. Now let us look at these operators:

A − I =
( 2 1 −1 )
( 2 1 −1 )
( 2 2 −1 )

A − 2I =
( 1 1 −1 )
( 2 0 −1 )
( 2 2 −2 )

B − I =
( 4 −6 −6 )
( −1 3  2 )
( 3 −6 −5 )

B − 2I =
( 3 −6 −6 )
( −1 2  2 )
( 3 −6 −6 )

In what follows we shall use the technique that the geometric dimension of an eigenvalue a of an operator T is the nullity of the operator T − aI, which can be deduced from the Rank-Nullity Theorem: Rank + Nullity of an operator = dimension of the vector space on which it acts. Further, rank is usually easier to see, either by just staring at the matrix hard enough or by solving a set of simultaneous equations. One can observe the following from the above:

• Rank of A − I = 2. So for A the geometric dimension of the eigenvalue 1 is 1 (same as its algebraic dimension).

• Rank of A − 2I = 2. So for A the geometric dimension of the eigenvalue 2 is 1 (whereas its algebraic dimension is 2).

• Rank of B − I = 2. So for B the geometric dimension of the eigenvalue 1 is 1 (same as its algebraic dimension).
• Rank of B − 2I = 1 (the first and third rows are equal, and the first row is −3 times the second). So for B the geometric dimension of the eigenvalue 2 is 2 (same as its algebraic dimension).

Hence A is NOT diagonalizable whereas B is. The similarity transformation by

( 3 2 2 )
( −1 1 0 )
( 3 0 1 )

diagonalizes B (its columns are eigenvectors of B). From this example we conclude that whether or not a matrix is diagonalizable is a deeper story than just the characteristic polynomial. To understand what really matters we have to look at some more subtle polynomials that capture this phenomenon.

An Annihilating Polynomial for a linear transformation is a polynomial p such that p(T) = 0, where T is a matrix representation of the linear transformation. The following concepts follow from the above construction:

• Since the vector space is finite dimensional, say of dimension n, the space L(V, V) is of dimension n^2, and hence there is an annihilating polynomial of degree at most n^2 (the n^2 + 1 powers I, T, ..., T^(n^2) must be linearly dependent). But something stronger is true, as follows.

• For a fixed linear transformation, the set of its annihilating polynomials forms an ideal in the ring of polynomials. If a ring admits Euclid's algorithm, then one knows that every ideal has a unique monic generator, namely the monic polynomial of lowest degree in the ideal. This unique monic generator of the ideal of annihilating polynomials is called the minimal polynomial.

• The characteristic polynomial and the minimal polynomial have exactly the same roots, though possibly with different multiplicities. This immediately shows that if the linear transformation has all distinct eigenvalues then its minimal and characteristic polynomials are the same.

• The Cayley-Hamilton Theorem (weak form) states that a linear transformation satisfies its own characteristic polynomial, and hence the minimal polynomial divides the characteristic polynomial.

• If W is an invariant subspace for a linear transformation T and TW is the restriction of T to W, then the characteristic polynomial and the minimal polynomial of TW divide the characteristic and the minimal polynomial of T respectively.
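The rank computations of this example are easy to machine-check; a minimal sketch using sympy (the choice of sympy is my own, any CAS would do):

```python
from sympy import Matrix, symbols, factor

x = symbols('x')
A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])
B = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])

# both share the characteristic polynomial (x - 1)(x - 2)^2
fA = factor(A.charpoly(x).as_expr())
fB = factor(B.charpoly(x).as_expr())

# geometric dimension of eigenvalue c = nullity of (T - cI) = 3 - rank
def gdim(M, c):
    return 3 - (M - c * Matrix.eye(3)).rank()

# gdim(A, 2) = 1 but gdim(B, 2) = 2: only B is diagonalizable
print(fA == fB, gdim(A, 2), gdim(B, 2))
print(A.is_diagonalizable(), B.is_diagonalizable())
```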
2. Conductors and Triangulability

If W is an invariant subspace for the linear transformation T on the vector space V and v ∈ V, then the T-conductor of v into W is the set of all polynomials g over the corresponding field such that g(T)v ∈ W. As for annihilators, the set of conductors also forms an ideal and has a unique monic generator. (By abuse of terminology, henceforth we shall call this unique monic generator the "T-conductor of v into W".) Since the minimal polynomial takes everything to 0, one notes that every T-conductor for a linear transformation T divides its minimal polynomial. Hence, as above, if the factorization of the minimal polynomial into irreducibles is known, then that highly constrains the form of all the conductors. The following concepts follow from the above construction:

• Let T be a linear operator acting on a finite dimensional vector space V with an invariant subspace W, and let the minimal polynomial of T factorize completely into linear polynomials. Then there exist a v ∈ V, v ∉ W, and a characteristic value c of T such that (T − cI)v ∈ W. This existence theorem is what shall crucially make the Cyclic Decomposition Theorem produce the Jordan Form.

• Call a linear transformation Triangulable if there is a basis in which its matrix is upper/lower triangular. A linear transformation on a finite dimensional vector space is triangulable iff its minimal polynomial is a product of linear polynomials with coefficients in the corresponding field. (Hence any linear transformation over an algebraically closed field like C is triangulable.)

The first concept is almost a tautology! The central idea is that one can pick an arbitrary vector β ∈ V \ W and look at its T-conductor s into W. Since s divides the minimal polynomial, which by assumption factorizes into linear polynomials, s has a root c, necessarily a characteristic value of T, so that s = (x − c)h for some polynomial h. Then v = h(T)β = [s/(x − c)](T) β does the job, since (T − cI)v = s(T)β ∈ W.
And by the minimality of the degree of the conductor s, we have v ∉ W. To get the second concept one just iterates the first, starting with W spanned by an eigenvector of T. Let the eigenvector be v1; then one is guaranteed that there is a v2 ∈ V \ W s.t. (T − cI)v2 = m v1 for some scalars c and m, and hence T v2 = m v1 + c v2. In the next iteration choose W as the subspace spanned by v1 and v2. Continuing, one generates the required ordered basis, in which the matrix of T is triangular. The converse is trivial.

3. Diagonalizability when the minimal polynomial has distinct roots

This leads us immediately to the computationally most efficient test of diagonalizability.

Diagonalizability (Version 2)
An endomorphism of a finite dimensional vector space is diagonalizable iff its minimal polynomial factorizes into a product of distinct linear factors with coefficients in the corresponding field. Stated otherwise, if {c1, c2, ..., ck} are the distinct eigenvalues of the linear operator T, then T is diagonalizable iff as operators (T − c1 I)(T − c2 I)...(T − ck I) = 0.

The central idea in the proof is to show that if the characteristic vectors can't span the whole space then the minimal polynomial must have a repeated root. Let W be the span of the characteristic vectors and suppose W ≠ V. By the existence theorem of the previous subsection there are a vector α ∉ W and an eigenvalue c such that (T − cI)α ∈ W. Write the minimal polynomial as (x − c)q, q being some polynomial. Then q(T)α ∈ W, since (T − cI)q(T)α = 0 makes q(T)α a characteristic vector of T with eigenvalue c (or the zero vector). Now look at the polynomial q(x) − q(c): since c is a root of it, it is divisible by (x − c), and hence (q(T) − q(c)I)α ∈ W (W being an invariant subspace of T and (T − cI)α ∈ W). Hence q(c)α = q(T)α − (q(T) − q(c)I)α ∈ W. But since α ∉ W by definition, we must have q(c) = 0, i.e. c is a repeated root of the minimal polynomial, contradicting the assumption that its roots are distinct. Hence proved. The converse is trivial.
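For the two matrices of the earlier example, the Version 2 test can be run directly; a sympy sketch (my illustration, not from the text):

```python
from sympy import Matrix, eye, zeros

A = Matrix([[3, 1, -1], [2, 2, -1], [2, 2, 0]])
B = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])
I3 = eye(3)
Z = zeros(3, 3)

# Version 2: T is diagonalizable iff (T - c1 I)...(T - ck I) = 0 over
# the distinct eigenvalues; here c1 = 1 and c2 = 2 for both A and B.
print((B - I3) * (B - 2 * I3) == Z)        # B passes the test
print((A - I3) * (A - 2 * I3) == Z)        # A fails it
print((A - I3) * (A - 2 * I3) ** 2 == Z)   # A's minimal polynomial is (x-1)(x-2)^2
```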
4. Diagonalizability through projection operators

Projection operators and direct sum decompositions are intimately related to each other, in the sense that giving one gives the other. This notion is made precise in the following way. If V = W1 ⊕ W2 ⊕ ... ⊕ Wk, then there exist k linear operators E1, E2, ..., Ek on V such that:

• Each Ei is a projection, i.e. Ei^2 = Ei

• Ei Ej = 0 if i ≠ j

• I = E1 + E2 + ... + Ek

• The range of Ei is Wi

Conversely, if E1, E2, ..., Ek are linear operators on V which satisfy the above conditions and Wi is the range of Ei, then V = W1 ⊕ W2 ⊕ ... ⊕ Wk.

The first part is trivially satisfied by choosing the Ei as the projection operators onto the Wi. To prove the converse, note that the condition I = E1 + E2 + ... + Ek ensures that V = W1 + W2 + ... + Wk. This splits any vector α ∈ V as α = α1 + α2 + ... + αk with αi ∈ Wi, and one can then see that Ej α = αj: writing αi = Ei βi for some βi (by the surjectivity of Ei onto Wi), one has Ej αi = Ej Ei βi = 0 for i ≠ j, while Ej αj = αj since Ej^2 = Ej (which follows on multiplying I = Σi Ei by Ej). This proves the uniqueness of the decomposition of α, and hence the claim.

The above idea, coupled with a concept we have seen earlier, that the eigenvectors of a given eigenvalue span an invariant subspace of the linear transformation and that these subspaces for different eigenvalues are independent, gives us another way to characterize diagonalizability.

Diagonalizability (Version 3)
Let T be a linear operator on a finite dimensional vector space V. T is diagonalizable if there exist k distinct scalars c1, c2, ..., ck and k non-zero linear operators E1, E2, ..., Ek such that:

• T = c1 E1 + c2 E2 + ... + ck Ek

• I = E1 + E2 + ... + Ek

• Ei Ej = 0, i ≠ j

It would also imply that:

• c1, c2, ..., ck are the eigenvalues of T

• Ei^2 = Ei (Ei is a projection)

• The range of Ei is the characteristic space of T associated with the eigenvalue ci

Conversely, if T is diagonalizable, then k non-zero linear operators E1, E2, ..., Ek are guaranteed to exist which satisfy all six criteria above.

The converse is easy to prove by choosing Ei to be the projection operator onto the characteristic space of the eigenvalue ci and by showing that for any v ∈ V, Tv = c1 E1 v + c2 E2 v + ... + ck Ek v, using the idea that since Ei is the projection onto an invariant space of T, it commutes with T, i.e. [T, Ei] = 0 ∀i. To prove the diagonalizability condition, one notes that T Ei = ci Ei, and hence the range of Ei lies in the null space of (T − ci I). Thus the ci are characteristic values of T and there is no other (for if c were another one, then T − cI = Σi (ci − c)Ei would be singular with every ci − c ≠ 0, which contradicts the independence of the ranges of the Ei). So the ranges of all the Ei together span the space spanned by all the characteristic vectors of T, and that is the whole space since I = Σi Ei. Hence T is diagonalizable. Further, for any v ∈ V with Tv = ci v, applying Ej and using Ej T = cj Ej gives (ci − cj)Ej v = 0, so Ej v = 0 for all j ≠ i and v = Ei v. So the range of Ei = the null space of (T − ci I).

One can make the following important observations about the projection operators:

• If g is any polynomial over the field F and T is a diagonalizable linear operator as above, with the Ei being the projection operators onto its eigenspaces, then g(T) = g(c1)E1 + g(c2)E2 + ... + g(ck)Ek.

• Thus if T is a diagonalizable operator then g(T) = 0 iff all g(ci) = 0, and hence the minimal polynomial is ∏_{i=1}^{k} (x − ci).

• Conversely, if ∏_{i=1}^{k} (x − ci) is the minimal polynomial for T, then consider the set of k Lagrange Polynomials defined by these ci:

pj = ∏_{i ≠ j} (x − ci)/(cj − ci)

One then observes that pj(ci) = δij, and hence if T is diagonalizable then pj(T) = Ej; so the projection operators onto the characteristic spaces of a diagonalizable operator are polynomials in the operator (given by the Lagrange Polynomials).
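For the diagonalizable matrix B of the running example, whose minimal polynomial is (x − 1)(x − 2), the Lagrange construction can be carried out explicitly; a sympy sketch (my illustration, not from the text):

```python
from sympy import Matrix, eye, zeros

B = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])
I3 = eye(3)

# Lagrange polynomials for the eigenvalues c1 = 1, c2 = 2:
#   p1(x) = (x - 2)/(1 - 2),  p2(x) = (x - 1)/(2 - 1)
E1 = (B - 2 * I3) / (1 - 2)   # E1 = p1(B)
E2 = (B - 1 * I3) / (2 - 1)   # E2 = p2(B)

# the criteria of Version 3, checked directly:
print(E1 * E1 == E1 and E2 * E2 == E2)   # projections
print(E1 * E2 == zeros(3, 3))            # mutually annihilating
print(E1 + E2 == I3)                     # resolution of the identity
print(1 * E1 + 2 * E2 == B)              # spectral decomposition of B
```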
Conversely, by defining Ej = pj(T) one can show that these Ej satisfy all the criteria of the third diagonalizability test, and hence T is diagonalizable. This gives an independent proof of the fact that a linear transformation is diagonalizable iff its minimal polynomial factorizes into distinct linear polynomials.

5. Primary Decomposition Theorem and Jordan-Chevalley Decomposition

We had earlier seen that for the matrix A, with minimal polynomial (x − 1)(x − 2)^2, the algebraic and the geometric dimensions of the eigenvalue 2 didn't match, and the characteristic spaces for 1 and 2 together spanned less than the full space. But in that example one can check that the null spaces of the operators (A − 2I)^2 and (A − I) do give a direct sum decomposition of the full space. This idea is captured in the following theorem:

Primary Decomposition Theorem
Let T be a linear operator on a finite dimensional vector space V over the field F, and let p be the minimal polynomial for T, where p = p1^r1 p2^r2 ... pk^rk, the pi being distinct irreducible monic polynomials over F and the ri positive integers. Let Wi be the null space of the operator pi(T)^ri. Then the following hold:

• V = W1 ⊕ W2 ⊕ ... ⊕ Wk

• Each Wi is invariant under T

• If Ti is the operator induced on Wi by T, then the minimal polynomial for Ti is pi^ri

The block diagonal form for any linear transformation that this theorem guarantees is called the Rational Form.

This theorem has a very powerful consequence which is crucial in representation theory. Let the Ei denote the projection operators onto the Wi defined above, and suppose each pi = x − ci, as happens over an algebraically closed field, where the irreducible monic polynomials are linear. Then one can show by construction that I = E1 + E2 + ... + Ek, and we can observe the following:

• The operator D defined as D = c1 E1 + c2 E2 + ... + ck Ek, along with the Ei and the ci, satisfies all the criteria of the projection version of diagonalizability. Hence D is diagonalizable.
In this context D is said to be semisimple.

• One can show that the operator N defined as N = T − D is nilpotent.

• N and D commute, and this decomposition of T is unique (as can be shown by direct evaluation).

• Since the projections are given by polynomials in the original linear transformation, we conclude that the diagonalizable and the nilpotent parts of the operator are also given as polynomials in the operator.

This unique decomposition of any linear operator over an algebraically closed field as a sum of a pair of commuting nilpotent and semisimple operators (which are polynomials in the original operator) is called the Jordan-Chevalley Decomposition.

The way these projection operators are constructed is as follows. Consider the polynomials fi = p / pi^ri. One can see that the fi are collectively relatively prime, hence their g.c.d. is 1, and hence there exist polynomials gi such that Σi fi gi = 1. One can then see that if Ei is defined as Ei = fi(T)gi(T), then the Ei form a set of projection operators (for i ≠ j, p | fi gi fj gj, and hence Ei Ej v = 0 ∀v ∈ V). Further, by definition, if v ∈ range(Ei) then Ei v = v, and since pi^ri fi = p we get pi(T)^ri v = pi(T)^ri Ei v = gi(T)p(T)v = 0. Conversely, for i ≠ j, pi^ri | fj, and hence Ej v = 0 whenever pi(T)^ri v = 0; since Σi Ei = I (by construction), we then have Ei v = v. So the range of Ei is equal to the null space of pi(T)^ri.

Over an algebraically closed field, as a consequence of the Jordan Form it shall become obvious that the dimension of Wi (the null space of the operator pi(T)^ri) is the algebraic dimension of the eigenvalue it corresponds to. Stated otherwise: the dimension of the null space of (T − cI)^r, where r is the multiplicity of c in the minimal polynomial, is the multiplicity of c in the characteristic polynomial.

Since pi(T)^ri = 0 on Wi, we have pi(Ti)^ri = 0. So the minimal polynomial for Ti divides pi^ri.
Further, if g is any other annihilating polynomial of Ti, then g(T)fi(T) = 0: after the direct sum decomposition is proven, one sees that the projection of any vector along Wj (j ≠ i) is annihilated by the factor pj^rj of fi, and the projection along Wi is killed by g. So trivially p | g fi, and so pi^ri | g. Thus the minimal polynomial of Ti divides pi^ri, and pi^ri divides any annihilating polynomial of Ti; hence pi^ri is the minimal polynomial of Ti.

6. Cyclic vectors, Cyclic subspaces and Companion Matrices

Let T be a linear operator on the vector space V over the field F. In our aim to understand the decomposition of linear transformations further, the following constructions shall be of interest to us now:

• Given a v ∈ V, one shall be interested in the polynomials g ∈ F[x] such that g(T)v = 0. These form a non-zero ideal in the ring of polynomials, since the ideal contains the minimal polynomial.

• Given a v ∈ V, one shall be interested in the smallest T-invariant subspace of V which contains v. This subspace is the intersection of all T-invariant subspaces containing v; it is also the space spanned by all vectors of the form g(T)v, g ∈ F[x].

• Given a T-invariant subspace W of V, one asks whether there is another T-invariant subspace W′ such that V = W ⊕ W′.

This leads us to define the following concepts:

• If v ∈ V, then the T-cyclic subspace generated by v is the subspace of all vectors of the form g(T)v, g ∈ F[x], denoted Z(v, T).

• If Z(v, T) = V, then v is called a cyclic vector for T.

• If v ∈ V, then the T-annihilator of v is the ideal in the ring of polynomials F[x] consisting of all g ∈ F[x] such that g(T)v = 0. This ideal is denoted M(v, T). The unique monic generator of this ideal will also be called the T-annihilator of v (by abuse of notation!).
• A T-invariant subspace W of V is T-admissible if for every v ∈ V and f ∈ F[x] with f(T)v ∈ W, there exists a w ∈ W such that f(T)v = f(T)w.

Let pv denote the unique monic generator of the T-annihilator M(v, T). One notes the following:

• The T-cyclic subspace generated by 0 is {0}.

• If deg(pv) = k, then the vectors v, Tv, T^2 v, ..., T^(k−1) v form a basis for Z(v, T); in particular, deg(pv) = dim(Z(v, T)).

• The minimal polynomial of T|Z(v,T) is pv.

The second point is seen by noting that the remainder obtained on dividing any polynomial by pv has degree less than deg(pv), and hence every g(T)v is a linear combination of v, Tv, T^2 v, ..., T^(k−1) v; these vectors are linearly independent, since any non-trivial relation between them would contradict the definition of pv as the lowest-degree monic annihilator of v. The last assertion is established by noting that for any g ∈ F[x], pv(T|Z(v,T)) g(T)v = 0, so pv(T|Z(v,T)) annihilates all of Z(v, T); and by definition there cannot be an annihilating polynomial of lower degree for Z(v, T), since it would in particular annihilate v and contradict the definition of pv. From the above assertions one can see that if T has a cyclic vector then its minimal polynomial and its characteristic polynomial are the same.

Let us for now look at a linear operator L on a vector space W of dimension k which has w ∈ W as a cyclic vector. Then the set {w, Lw, L^2 w, ..., L^(k−1) w} forms a basis for W. Defining wi = L^(i−1) w, in the basis {w1, w2, ..., wk} the operator L looks like

( 0 0 0 ... 0 −c0     )
( 1 0 0 ... 0 −c1     )
( 0 1 0 ... 0 −c2     )
( .             .     )
( 0 0 0 ... 1 −c(k−1) )

where the monic L-annihilator pw is c0 + c1 x + ... + c(k−1) x^(k−1) + x^k. This matrix is called the Companion Matrix of this monic polynomial. One notes the following about the Companion Matrix:

• If T is a linear operator on a vector space V, then T has a cyclic vector iff there is a basis of V in which T looks like the companion matrix of its minimal polynomial.

• The minimal and the characteristic polynomial of a companion matrix are both the monic polynomial from which it came.
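As a concrete check of the companion-matrix construction, take the monic polynomial p(x) = x^3 − 5x^2 + 8x − 4 = (x − 1)(x − 2)^2 from the running example; a sympy sketch (my illustration, not from the text):

```python
from sympy import Matrix, symbols, factor, zeros

x = symbols('x')

# companion matrix, in the convention above, of
# p(x) = x^3 + c2 x^2 + c1 x + c0 with (c0, c1, c2) = (-4, 8, -5)
c0, c1, c2 = -4, 8, -5
C = Matrix([[0, 0, -c0],
            [1, 0, -c1],
            [0, 1, -c2]])

# its characteristic polynomial is p itself, i.e. it factors as (x-1)(x-2)^2 ...
print(factor(C.charpoly(x).as_expr()))

# ... and so is its minimal polynomial: C satisfies p, but the proper
# divisor (x - 1)(x - 2) of p does not annihilate it
I3 = Matrix.eye(3)
print((C - I3) * (C - 2 * I3) ** 2 == zeros(3, 3))   # True
print((C - I3) * (C - 2 * I3) == zeros(3, 3))        # False
```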
These companion matrices on the cyclic subspaces shall be the building blocks from which we shall try to build the full linear transformation. Some further observations (pv as before denotes the unique monic generator of M(v, T), so that deg(pv) = dim(Z(v, T))):

• Z(v, T) is 1-dimensional iff v is an eigenvector of T.

• For the identity operator, every non-zero vector generates a 1-dimensional cyclic subspace. So for dim(V) > 1 the identity operator has no cyclic vector.

• If a T-invariant subspace has a complementary T-invariant subspace, then it is also T-admissible.

7. The idea behind cyclic decomposition (Part 1)

The argument for the Cyclic Decomposition Theorem needs a synthesis of many concepts from linear algebra, and hence we shall give the constituent ideas in instalments, in an effort to make it more palatable. Basically, if T is a linear operator on a finite dimensional vector space V, then we can show that there are vectors {v1, v2, ..., vr} such that V = Z(v1, T) ⊕ Z(v2, T) ⊕ ... ⊕ Z(vr, T).

Say one has found vectors {v1, v2, ..., vj} inductively, and the subspace

Wj = Z(v1, T) ⊕ Z(v2, T) ⊕ ... ⊕ Z(vj, T)

is proper. One now needs to see that there is another vector vj+1 ∈ V such that Wj ∩ Z(vj+1, T) = {0}. Take a vector v ∈ V, v ∉ Wj, and let f ∈ F[x] be the T-conductor of v into Wj. IF Wj is T-admissible, then (since f(T)v ∈ Wj) there is a w ∈ Wj s.t. f(T)v = f(T)w. Since v − (v − w) = w ∈ Wj, and Wj is T-invariant, g(T)v ∈ Wj iff g(T)(v − w) ∈ Wj for any g ∈ F[x]; i.e. the T-conductors of v and of v − w into Wj coincide, so f is also the T-conductor of v − w into Wj. But f(T)(v − w) = 0. So if g(T)(v − w) ∈ Wj for some g ∈ F[x], then by definition f | g, and hence g(T)(v − w) ∈ Wj iff g(T)(v − w) = 0.
Since by definition Z(v − w, T) is the space of all g(T)(v − w) for arbitrary g ∈ F[x], we see that Wj ∩ Z(v − w, T) = {0}, so v − w can serve as vj+1. So the thing to ensure, for the induction to continue, is that Wj is T-admissible.

8. Cyclic Decomposition Theorem and Jordan Form

The idea of the above section, decomposing a vector space into a direct sum of cyclic subspaces, is made precise in the following sense:

Cyclic Decomposition Theorem
Let T be a linear operator on a finite-dimensional vector space V and let W0 be a proper T-admissible subspace of V. Then there exists a set of non-zero vectors {v1, v2, ..., vr} in V with respective T-annihilators p1, p2, ..., pr such that

• V = W0 ⊕ Z(v1, T) ⊕ Z(v2, T) ⊕ Z(v3, T) ⊕ ... ⊕ Z(vr, T)

• pk | pk−1, k ∈ {2, 3, ..., r}

The integer r and the annihilators p1, p2, ..., pr are uniquely determined by the above 2 conditions and the fact that vk ≠ 0 ∀k.

(Note that one can always choose W0 as the zero subspace, since it is trivially admissible; further, recall that it was shown earlier that deg(pi) = dim(Z(vi, T)).)

Some of the consequences of the existence of the above cyclic decomposition are:

• Given a linear operator T on a vector space V, every T-admissible subspace of it has a complementary T-invariant subspace: the direct sum of the cyclic spaces given by the above decomposition gives the required complementary space.

• A linear transformation has a cyclic vector iff its minimal polynomial and its characteristic polynomial match. One way was shown earlier via the companion matrix. The other way is also quick: if the two polynomials match, then (choosing W0 = {0}) the first cyclic space already exhausts the dimension.

• A priori it may be the case that the minimal polynomial for the whole vector space is some polynomial, but each specific vector has a smaller minimal polynomial (the lcm of these smaller minimal polynomials being the minimal polynomial over the whole space). But one sees that there is a vector whose minimal polynomial is the minimal polynomial over the whole space: from the next part of the explanation of this theorem it shall be clear that if one chooses W0 as the zero subspace then this vector is v1.

• The Cayley-Hamilton Theorem (strong form) is a consequence of this theorem. It states that if a linear operator T on a vector space V has minimal polynomial p and characteristic polynomial f, then:

– p | f

– p and f have the same prime factors, except for multiplicities

– if the prime factorization of p is p = ∏_{i=1}^{k} fi^ki, then f = ∏_{i=1}^{k} fi^di, where di is the nullity of fi(T)^ki divided by deg(fi)

The central idea is that pi is the minimal polynomial of T|Z(vi,T), and since Z(vi, T) is a cyclic space, pi is also the characteristic polynomial of this restricted operator. Hence, by the block diagonal form given by the cyclic decomposition, f = ∏_{i=1}^{r} pi. If W0 is chosen as the zero subspace then p1 = p, and hence the first part is shown; by the fact that pi | pi−1 we get the next part. From the Primary Decomposition we know that if Ti is the operator induced on Vi, the null space of fi(T)^ki, then fi^ki is the minimal polynomial of Ti. So, by the fact just proven that the prime factors arising in the factorizations of the minimal and the characteristic polynomials are the same, we see that the characteristic polynomial of Ti is fi^di for some di ≥ ki. Since the degree of a characteristic polynomial is the dimensionality of the vector space, we automatically have di = dim(Vi)/deg(fi). And further, by the direct sum structure, it follows that f = ∏_{i=1}^{k} fi^di.

• The Jordan Form is obtainable over an algebraically closed field by doing the Cyclic Decomposition of the induced operator on every subspace given by the Primary Decomposition.

Before understanding how the proof of the above big result works, let us try to understand how it leads to the Jordan Form.
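Both parts of the strong Cayley-Hamilton statement can be checked on the matrix B from the earlier example, whose minimal polynomial (x − 1)(x − 2) is a proper divisor of its characteristic polynomial (x − 1)(x − 2)^2; a sympy sketch (my illustration, not from the text):

```python
from sympy import Matrix, symbols, div, eye, zeros, expand

x = symbols('x')
B = Matrix([[5, -6, -6], [-1, 4, 2], [3, -6, -4]])

f = B.charpoly(x).as_expr()      # characteristic polynomial of B
p = expand((x - 1) * (x - 2))    # minimal polynomial of B

# p | f: the remainder of the polynomial division is zero
q, r = div(f, p, x)
print(r)   # 0

# and B satisfies p (hence also f, as Cayley-Hamilton promises)
I3 = eye(3)
print((B - I3) * (B - 2 * I3) == zeros(3, 3))   # True
```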
Suppose the characteristic polynomial f of a linear operator T on a vector space V over an algebraically closed field F factors as

f = ∏_{i=1}^{k} (x − ci )^{di}

Hence the ci are its distinct eigenvalues with di ≥ 1 for all i, and hence its minimal polynomial p has to be of the form

p = ∏_{i=1}^{k} (x − ci )^{ri}

with 1 ≤ ri ≤ di . If Wi is the null space of (T − ci I)^{ri} then the Primary Decomposition theorem tells us that V = ⊕_{i=1}^{k} Wi and the operator Ti induced on Wi has as its minimal polynomial (x − ci )^{ri}.

Hence the operator Ni on Wi defined as Ni = Ti − ci I is nilpotent. So now we want to do a Cyclic Decomposition of Wi with respect to the nilpotent operator Ti − ci I on it. Hence we need to see how the Cyclic Decomposition works for an arbitrary nilpotent operator, say N , on some finite-dimensional vector space V .

The Cyclic Decomposition Theorem guarantees the existence of r non-zero vectors {v1 , v2 , ..., vr } with N -annihilators {p1 , p2 , ..., pr } such that

V = ⊕_{i=1}^{r} Z(vi , N )

and pi+1 | pi . Since N is nilpotent, its minimal polynomial is x^k for some k ≤ n, and hence each pi = x^{ki} where k1 = k, kr ≥ 1 and ki+1 ≤ ki .

Now the idea of the Companion Matrix guarantees that there exists a basis of the subspace Z(vi , N ) in which the induced operator is represented by the ki × ki matrix

Ai =
[0 1 0 ... 0]
[0 0 1 ... 0]
[. . . ... .]
[0 0 0 ... 1]
[0 0 0 ... 0]

So the Cyclic Decomposition Theorem says that a nilpotent operator on the space V has an ordered basis in which

N =
[A1 0  ... 0 ]
[0  A2 ... 0 ]
[.  .  ... . ]
[0  0  ... Ar]

where each Ai is a ki × ki companion matrix of the type just displayed. Hence, going back to Ti : since Ni can be written as above and Ti = Ni + ci I, we see that Ti can be written as a direct sum of matrices of the type

[ci 1  0  ... 0 ]
[0  ci 1  ... 0 ]
[.  .  .  ... . ]
[0  0  ... ci 1 ]
[0  0  ... 0  ci]

Matrices of the above kind are called Elementary Jordan Blocks of eigenvalue ci . By abuse of notation calling T the matrix representation of the operator T , and similarly for Ti , in the basis just generated, we see that the final form looks like

T =
[T1 0  ... 0 ]
[0  T2 ... 0 ]
[.  .  ... . ]
[0  0  ... Tk]

where

Ti =
[Ji1 0   ... 0   ]
[0   Ji2 ... 0   ]
[.   .   ... .   ]
[0   0   ... Jini]

with Jim the m-th elementary Jordan block for the eigenvalue ci (corresponding to the block for Ti ) and ni the number of cyclic spaces into which the null space of (T − ci I)^{ri} splits. The above is called the Jordan Form of the linear operator.

From the Jordan form three crucial observations immediately follow:

• One can see that the dimension of the null space of (T − ci I)^{ri} is di , i.e. dim(Wi ) = di , where di is the algebraic multiplicity of the eigenvalue ci (and ri , the multiplicity of ci in the minimal polynomial, is the size of the largest block).

• ni is the geometric multiplicity of the eigenvalue ci , since in the Cyclic Decomposition of Ni each cyclic subspace Z(vm , Ni ) gives exactly one vector, namely Ni^{km −1} vm , which is non-zero and lies in the null space of the operator Ni = Ti − ci I, where km = dim(Z(vm , Ni )). This again shows the diagonalizability criterion, that a linear transformation is diagonalizable iff ni = di for all i, where di and ni have just been argued to be the algebraic and the geometric multiplicities respectively of the eigenvalue ci .

• The dimension of the largest Elementary Jordan Block Ji1 , for each i, is the multiplicity ri of the eigenvalue ci in the minimal polynomial.

9. The idea behind cyclic decomposition (Part 2)

Starting with the T -admissible space W0 one looks for a vector w1 whose conductor p1 = s(w1 , W0 ) has maximal degree, i.e. deg(p1 ) = max_{w∈V} deg(s(w, W0 )), and the corresponding w is taken as w1 . One continues this induction so that after k steps one has Wk = W0 + ∑_{i=1}^{k} Z(wi , T ) and polynomials p1 , p2 , ..., pk such that wk ∈ V , wk ∉ Wk−1 , and among all T -conductors into Wk−1 , pk = s(wk , Wk−1 ) has maximal degree.
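Before turning to the proof details, the Jordan Form just described can be checked computationally. In the sketch below (using sympy; the 2 × 2 matrix is an illustrative choice, not from the text), the single eigenvalue 2 has algebraic multiplicity 2 but geometric multiplicity 1, so there is exactly one elementary Jordan block:

```python
import sympy as sp

# M has the single eigenvalue 2 with algebraic multiplicity 2 but only a
# 1-dimensional eigenspace, so its Jordan form is one 2x2 elementary block.
M = sp.Matrix([[3, 1],
               [-1, 1]])

P, J = M.jordan_form()
assert J == sp.Matrix([[2, 1], [0, 2]])    # one elementary Jordan block
assert M == P * J * P.inv()                # T = P J P^{-1}

# Algebraic multiplicity d = 2 (from the characteristic polynomial), while
# the geometric multiplicity n = dim Null(M - 2I) = 1 = number of blocks.
assert (M - 2 * sp.eye(2)).rank() == 1
assert len((M - 2 * sp.eye(2)).nullspace()) == 1
```

Here n ≠ d, so the operator is not diagonalizable, in line with the diagonalizability criterion above.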
So we see that if w ∈ V and f = s(w, Wk−1 ), then f (T )w = w0 + ∑_{1≤i≤k−1} gi (T )wi , where w0 ∈ W0 and the gi are some polynomials. One can then show that f | gi and that w0 = f z0 for some z0 ∈ W0 (call this the “Divisibility Claim”).

After k steps of the induction, call the w of above wk and f as pk , and we have, for some set of polynomials hi and with z0 as above, the relation

pk wk = pk z0 + ∑_{1≤i≤k−1} pk hi wi

and we define

vk = wk − z0 − ∑_{1≤i≤k−1} hi wi

Since wk − vk ∈ Wk−1 we have s(vk , Wk−1 ) = s(wk , Wk−1 ) = pk , and since pk vk = 0 we have Wk−1 ∩ Z(vk , T ) = {0}, together with the trivial relation

pk vk = 0 + p1 v1 + p2 v2 + ... + pk−1 vk−1

on which applying the Divisibility Claim we have that pi | pi−1 .

After getting Wk−1 one searches for the vector wk in the rest of the space which has a conductor into Wk−1 of maximal degree, to construct Wk = Wk−1 + Z(vk , T ). Since dim(Wk ) > dim(Wk−1 ), this induction will end after at most dim(V ) steps.

10. The idea behind cyclic decomposition (Part 3)

Let us now try to understand the basic idea behind how the Divisibility Claim works, now that we have a hang of how the cyclic decomposition inductively searches for the cyclic vectors which will exhaust the full space by their cyclic subspaces. The argument is initially almost the same as above. Let gi = f hi + ri with ri = 0 or deg(ri ) < deg(f ), and define z = w − ∑_{i=1}^{k−1} hi wi . Since z − w ∈ Wk−1 we have f = s(z, Wk−1 ) and

f z = w0 + ∑_{i=1}^{k−1} ri wi

Let j be the largest i for which ri ≠ 0. Since Wj−1 ⊂ Wk−1 , there exists a polynomial g such that p = gf , where p = s(z, Wj−1 ) and f = s(z, Wk−1 ). Then we have

pz = gf z = g rj wj + g w0 + ∑_{1≤i≤j−1} g ri wi

Since pz ∈ Wj−1 , it implies that g rj wj ∈ Wj−1 .

Now we remember that the monic generator of an ideal is the polynomial of least degree in that ideal, and also that the pi were chosen to be the maximal-degree polynomials among all the conductors into the respective Wi−1 ; combining these two we have the inequality

deg(g rj ) ≥ deg(s(wj , Wj−1 )) = deg(pj ) ≥ deg(s(z, Wj−1 )) = deg(p) = deg(f g)

Hence we have deg(rj ) ≥ deg(f ), which is absurd by the definition of j, and hence all the ri = 0. This also shows that gi = f hi and, with

z = w − ∑_{i=1}^{k−1} hi wi

as above, that f z = w0 . But since W0 is by definition an admissible space, we have some z0 ∈ W0 such that w0 = f z0 , and hence for any w ∈ V with f = s(w, Wk−1 ),

f (T )w = f (T )(z0 + ∑_{i=1}^{k−1} hi wi )

And we have the construction that

Wk = W0 ⊕ Z(v1 , T ) ⊕ Z(v2 , T ) ⊕ ... ⊕ Z(vk , T )

hence at every step of the induction each Wi is a T -admissible space! This completes the proof of the Cyclic Decomposition Theorem.

11. The larger picture

One notes that for the Cyclic Decomposition to work we crucially needed two things:

• That the ideals of conductors are principal, coupled with the non-existence of 0-divisors.
• The uniqueness coming from the existence of a notion of unique prime factorization.

The first property was coming from the fact that we were working in the polynomial ring, where Euclid's division algorithm gives the monic generators of the ideals. But in general we can carry all of this over to any Principal Ideal Domain (PID) (not necessarily a Euclidean Domain), since PIDs are also Unique Factorization Domains; and since we never needed any specific property of vector spaces for this, we can do the same Cyclic Decomposition on any module over a PID.
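The cyclic decomposition over F [x] has an exact analogue over Z. As a small consistency check of this module-over-a-PID viewpoint (a sketch; the groups Z/4 ⊕ Z/6 and Z/2 ⊕ Z/12 are an illustrative choice, and Python ≥ 3.9 is assumed for math.lcm), one can verify that a direct sum of cyclic groups is isomorphic to its invariant-factor form by comparing the multisets of element orders, which determine a finite abelian group up to isomorphism:

```python
from itertools import product
from math import gcd, lcm

def element_orders(moduli):
    """Sorted multiset of element orders in Z/m1 + ... + Z/mk."""
    orders = []
    for el in product(*[range(m) for m in moduli]):
        # order of (x1,...,xk) is lcm over i of mi / gcd(xi, mi)
        orders.append(lcm(*(m // gcd(x, m) for x, m in zip(el, moduli))))
    return sorted(orders)

# Z/4 + Z/6 in invariant-factor form is Z/2 + Z/12; note 2 | 12, matching
# the divisibility chain d1 | d2 | ... | dk of the structure theorem.
assert element_orders([4, 6]) == element_orders([2, 12])
```

The divisibility chain of the invariant factors here plays exactly the role that pk | pk−1 played for the T -annihilators above.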
This gives us the “Structure Theorem for modules over a PID”, and since any abelian group is a Z-module, where Z is a PID, we get the “Structure Theorem for Abelian Groups”, which is conventionally stated as:

Every finitely generated abelian group is a direct sum of cyclic groups of prime power order and of a free abelian group.

or

If G is a finitely generated abelian group then G = L ⊕ Cd1 ⊕ Cd2 ⊕ ... ⊕ Cdk , where L is a free abelian group, each Cdi is a cyclic group of order di with di > 1, and d1 | d2 | ... | dk . One can identify any Cn with Zn .

II. ACKNOWLEDGEMENT

Interspersed in this document are influences arising from the few hundred emails exchanged over the last 4 years with Vipul (formerly my college-mate at CMI (Chennai Mathematical Institute) and currently a mathematics graduate student at UChicago). A large part of the core content is from the algebra book by Hoffman and Kunze, and there is the undeniable influence of the algebra books by Herstein and Artin on me over the last 4 years, Herstein being my initiation into the subject. Some of the contents in the second appendix of this article are from this breathtaking repository of group theory created by Vipul: http://groupprops.subwiki.org/wiki/Main_Page

III. APPENDIX-1 (SOME ELEMENTARY IDEAS OF LIE ALGEBRA AND REPRESENTATION THEORY)

A. Lie Algebra

Definition 1 A Lie Algebra is a vector space V over a field F with a map V × V → V , denoted by (x, y) ↦ [x, y] and called the Lie Bracket of x and y, such that the following axioms are satisfied:

• The Lie Bracket is a bilinear map.
• [x, x] = 0 for all x ∈ V
• [x, [y, z]] + [z, [x, y]] + [y, [z, x]] = 0 for all x, y, z ∈ V

The third axiom can also be stated as: the Lie Bracket satisfies the Jacobi Identity. We note that the second axiom implies anticommutativity, i.e. [x, y] = −[y, x], but anticommutativity implies the second axiom only when Char(F ) ≠ 2.

We shall in general keep the bilinear map implicit and call V the Lie Algebra, by a somewhat of an abuse of notation. Further, by a subspace of the Lie Algebra V we shall mean a vector subspace of V inheriting the same bilinear form from the original space, but the subspace need not be closed under this.

1. Associated concepts of a Lie Algebra V

• Lie Algebras V1 and V2 over F are Isomorphic Lie Algebras if there exists a vector space isomorphism φ : V1 → V2 such that φ([x, y]) = [φ(x), φ(y)].
• A subspace W of a Lie Algebra V is said to be a Subalgebra of V if [x, y] ∈ W for all x, y ∈ W .
• A Linear Lie Algebra is a subalgebra of gl(V ), which is the Lie Algebra of GL(V ).
• An Ideal I of the Lie Algebra V is a vector subspace of V such that [x, y] ∈ I whenever x ∈ V and y ∈ I. Because of anticommutativity one does not need to distinguish between Left Ideals and Right Ideals.

B. Representation Theory

Definition 2 A Representation of a group G on a vector space V is either of the 2 equivalent data:

• A homomorphism ρ : G → GL(V )
• A group action π of G on V , π : G × V → V , such that π(g, v1 + v2 ) = π(g, v1 ) + π(g, v2 ) for all g ∈ G and v1 , v2 ∈ V .

The equivalence of the above 2 definitions is easy to see:

• In the first picture, if g ∈ G then ρg : V → V , and hence we have an action π : G × V → V such that π(g, v) = ρg (v) for all g ∈ G and v ∈ V . The linearity of the action follows from ρg ∈ GL(V ).
• In the second picture, if one fixes a g ∈ G then π(g) : V → V satisfies π(g) ∈ GL(V ), since π(g −1 ) = π(g)−1 (the existence of the inverse is guaranteed by the very definition of a group action). So it gives a map ρ such that ρg (v) = π(g)(v).

Generally the map ρ is kept implicit and we shall say that the vector space V is a representation of G.

1. Associated concepts of a Representation

• Given two representations ρv on V and ρw on W , a map φ : V → W is called a G-linear map if the following is satisfied: ρw(g) (φ(v)) = φ(ρv(g) (v)), where ρw(g) is the element of GL(W ) to which g is mapped by ρw , and similarly for ρv(g) .
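The equivalence of the two pictures can be made concrete in a minimal sketch (the group Z/2 acting on F² by swapping coordinates is an illustrative choice, not from the text):

```python
import sympy as sp

# G = Z/2 acting on F^2 by swapping coordinates.
I2 = sp.eye(2)
S = sp.Matrix([[0, 1], [1, 0]])    # rho(1) = swap, rho(0) = identity
rho = {0: I2, 1: S}

# Picture 1: rho is a homomorphism Z/2 -> GL(2).
for g in (0, 1):
    for h in (0, 1):
        assert rho[(g + h) % 2] == rho[g] * rho[h]

# Picture 2: the induced action pi(g, v) = rho(g) v is linear in v.
v, w = sp.Matrix([1, 2]), sp.Matrix([3, 5])
for g in (0, 1):
    assert rho[g] * (v + w) == rho[g] * v + rho[g] * w
```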
• Given a representation of the group G on V , a Subrepresentation of G is a proper subspace W of V which is invariant under the action of G on V .
• A representation V of G is said to be Irreducible if V has no proper invariant subspace under that action.
• If ρ is a representation of G on V with inner product H0 , then one can define on V a G-invariant inner product, i.e. an inner product H such that H(v1 , v2 ) = H(ρg (v1 ), ρg (v2 )) for all v1 , v2 ∈ V . Such an H can be defined as H(v1 , v2 ) = ∑_{g∈G} H0 (ρg (v1 ), ρg (v2 )). For compact Lie Groups we can extend this notion using the idea of Invariant Measures or Haar Measures.
• Under the inner product H, let W ⊥ be the orthogonal subspace of W , where W is an invariant subspace of the representation V ; then W ⊥ is also a representation.
• A G-linear map is a map between vector spaces V and W , φ : V → W , each carrying a representation of the group G, such that φ(g(v)) = g(φ(v)) for all g ∈ G and v ∈ V , where it is understood that the action of g on the LHS is through the representation on V and the action of g on the RHS is through the representation on W .
  – Given a G-linear map one can see that Im φ, Ker φ and Coker φ are also representations of G.
  – Given two vector spaces carrying representations of G, the set of all G-linear maps between them forms a vector space, and this space can be thought of as the subspace of Hom(V, W ) “fixed under the action of G”.
• Let G act from the left on a finite set X whose elements are used to label the elements of a chosen basis of a vector space V . So a basis of V is given as {ex | x ∈ X} and hence a natural action of G on V is given as g(∑ ax ex ) = ∑ ax egx .
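The averaging recipe for a G-invariant inner product can be spot-checked on a small example (a sketch; the group Z/2 acting by coordinate swap and the starting form H0 are illustrative choices). Writing H0 (v, w) = vᵀA0 w, the averaged form has matrix A = ∑_g gᵀA0 g, and invariance is the identity gᵀAg = A:

```python
import sympy as sp

S = sp.Matrix([[0, 1], [1, 0]])
group = [sp.eye(2), S]            # G = Z/2 acting by swapping coordinates

# A non-invariant starting inner product H0(v, w) = v^T A0 w.
A0 = sp.diag(1, 3)
# Averaged form: H(v, w) = sum_g H0(g v, g w), with matrix sum_g g^T A0 g.
A = sum((g.T * A0 * g for g in group), sp.zeros(2, 2))

# H is G-invariant: g^T A g = A for every g in G ...
for g in group:
    assert g.T * A * g == A
# ... and here the average works out to diag(1,3) + diag(3,1) = diag(4,4).
assert A == sp.diag(4, 4)
```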
This is called the Permutation Representation.

The following properties work equally well for finite groups and for compact topological groups, but for the latter the summations over the group have in general to be replaced by integrations with respect to the Haar Measure on the group.

• Let R be the space of all complex-valued functions on G; then there is a natural representation of G on R in the following way. Let f : G → C be an element of R and g, h be any two elements of G. Then the action is given by (gf )(h) = f (g −1 h). This is called the Regular Representation.

• Given a representation V of G which has W as a subrepresentation, one can construct another subrepresentation W 0 of V such that V = W ⊕ W 0 . There are 2 natural ways of constructing this W 0 :

  – On V one can define a G-invariant inner product and then look at the space W ⊥ which is normal to W under this inner product. One can easily see that W ⊥ is a subrepresentation of G. Further, V = W ⊕ W ⊥ .

  – Let π0 be some projection map from V onto W which is the identity on W . There is nothing canonical about this map, since choosing such a projection is equivalent to choosing some arbitrary subspace U of V such that V = W ⊕ U , and there are infinitely many ways in which U can be chosen. But having made a choice of π0 , one can define another map π from V onto W as follows (where we call the representation of G on V as ρ):

    π : V → W
    π(v) = ∑_{g∈G} ρg (π0 (ρg−1 (v)))

  – The above map π has the following properties:
    ∗ π is a G-linear map from V to W and hence Ker(π) is a subrepresentation of G.
    ∗ If G is a finite group then on W , π is a map that multiplies all vectors by |G|.
    ∗ One can easily show that V = W ⊕ Ker(π). Hence the decomposition is done! But one must note that the proof of the independence of W and Ker(π) works ONLY IF the characteristic of the field does not divide |G| (in particular it works in characteristic 0).
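The averaged projection can be computed explicitly in a minimal sketch (the group Z/2 acting by coordinate swap, the invariant line W = span{(1,1)}, and the particular non-canonical π0 are illustrative choices):

```python
import sympy as sp

S = sp.Matrix([[0, 1], [1, 0]])
group = [sp.eye(2), S]             # G = Z/2 acting on V = F^2 by swap
# W = span{(1,1)} is invariant; pi0 is an arbitrary (non-canonical)
# projection of V onto W that is the identity on W: (a, b) -> (a, a).
pi0 = sp.Matrix([[1, 0], [1, 0]])

# Averaged map: pi(v) = sum_g rho_g(pi0(rho_{g^-1}(v))).
pi = sum((g * pi0 * g.inv() for g in group), sp.zeros(2, 2))

for g in group:
    assert g * pi == pi * g                  # pi is G-linear
assert pi * sp.Matrix([1, 1]) == 2 * sp.Matrix([1, 1])  # = |G| . id on W
ker = pi.nullspace()
assert len(ker) == 1 and list(ker[0]) == [-1, 1]  # Ker(pi) = span{(1,-1)}
```

Here V = W ⊕ Ker(π), since (1, 1) and (−1, 1) are independent, exactly as claimed.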
• Continuing the above process inductively, one can see that for finite and compact groups any representation can be written as a direct sum of irreducible representations. This property is called Complete Reducibility. One can state this as: For any representation V of a finite group G, there is a direct sum decomposition V = V1^{⊕a1} ⊕ V2^{⊕a2} ⊕ ... ⊕ Vk^{⊕ak} where the Vi are distinct irreducible representations. The decomposition of V into a direct sum of the k factors is unique, as are the Vi and their multiplicities ai .

• To prove the above one needs the following fact, which is called Schur's Lemma. If V and W are two irreducible representations of a group G and φ : V → W is a G-linear map, then:
  – Either φ is an isomorphism or φ = 0.
  – If V = W and the underlying field is algebraically closed, then φ is a scalar multiple of the identity.

The key idea in understanding the above is that for a G-linear map the kernel and image are both representations of the group, and hence the irreducibility of the domain and the codomain forces them to be either 0 or the full space. For the second part, algebraic closure assures the existence of eigenvalues and eigenvectors; let c be some eigenvalue and then apply the first part to the G-linear map φ − cI.

One notes that there are representations for which the above fails, and even given an invariant subspace one may not find a complementary invariant one. For example, the representation of R sending

a ↦ [1 a]
    [0 1]

has the x-axis as an invariant subspace but no complementary invariant subspace. Further complications in reducibility proofs arise if the underlying field is of finite characteristic, as pointed out earlier. Later we shall see that if a group satisfies the property of being Semisimple then it is completely reducible.

IV. APPENDIX-2 (SOME ELEMENTARY DEFINITIONS IN ALGEBRA)

A.
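The second part of Schur's Lemma can be verified on a small example (a sketch; the two generating matrices of the standard 2-dimensional representation of S3 are an illustrative choice). A matrix commutes with the whole representation iff it commutes with the generators, and solving that linear system shows the commutant consists only of scalar matrices:

```python
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
M = sp.Matrix([[a, b], [c, d]])
# Generators of the (irreducible) standard 2-dim representation of S3.
A = sp.Matrix([[0, -1], [1, -1]])      # image of a 3-cycle (A**3 == I)
B = sp.Matrix([[0, 1], [1, 0]])        # image of a transposition
# M commutes with the representation iff it commutes with the generators.
eqs = list(M * A - A * M) + list(M * B - B * M)
coeff, _ = sp.linear_eq_to_matrix(eqs, [a, b, c, d])
ns = coeff.nullspace()
# The commutant is 1-dimensional: only scalars (a = d, b = c = 0).
assert len(ns) == 1 and list(ns[0]) == [1, 0, 0, 1]
```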
Group

The concept of a Group evolved from ideas of symmetry transformations, but the abstractions were started by Abel and Galois. Finally the formal definition was given by von Dyck in the 1880s. We take the following definition of a group for our purposes.

Definition 3 A Group is a set G equipped with the following 3 operations:

• A binary operation of multiplication, ∗ : G × G → G.
• A unary operation of inversion, −1 : G → G.
• A 0-ary operation which evaluates to a special element of the set, denoted by e and called the neutral or the identity element.

and the operations satisfy the following 3 compatibility conditions:

• Associativity: ∀a, b, c ∈ G, we have a ∗ (b ∗ c) = (a ∗ b) ∗ c.
• Neutrality of e: ∀a ∈ G, we have a ∗ e = e ∗ a = a.
• Annihilation by the inverse element: ∀a ∈ G, we have a ∗ a−1 = a−1 ∗ a = e.

• To put it very bluntly, the raison d'etre for Groups is that they act on other structures in various natural ways. Let us make this idea of acting more precise. Given a set S, an action of a group G on the set S is a map

G × S → S
(g, s) ↦ gs

such that the following 2 axioms are satisfied:
  – 1s = s for all s ∈ S, where 1 is the identity of G.
  – (gg 0 )s = g(g 0 s) for all g, g 0 ∈ G and s ∈ S.

Given such an action, S is called a G-set; sometimes the action is precisely called the Left Action, and the Right Action is defined in the obvious way. When a group G acts on a set X one can think of this as giving a map G → Aut(X).

Very often the set S is actually also a group and the role of G is played by some subgroup of it. There are many important and natural actions of a subgroup on the full group. But even in the general setting of just a G-set many of the important notions can be seen. Given this action and an element s ∈ S one has the following two natural sets:

  – Orb(s) (Orbit of s) = {s0 ∈ S | s0 = gs for some g ∈ G}
  – Stab(s) (Stabilizer of s) = {g ∈ G | gs = s}

1.
Associated Concepts of a Group

• It is good to be aware that there exist variants of a group, the simplest of which are:
  – Magma: a set equipped with a binary operation.
  – Monoid: a Magma with an identity element whose binary operation is associative.
  – Semigroup: a Magma whose binary operation is associative.
  – Inverse Semigroup: a Semigroup whose every element has an inverse.

• A Subgroup H of a group G is a non-empty subset of G which satisfies either of the following conditions:
  – It is closed under left quotient of elements, that is, for any a, b ∈ H, a−1 b ∈ H.
  – It is closed under right quotient of elements, that is, for any a, b ∈ H, ab−1 ∈ H.

Now we note the following important properties of orbits and stabilizers:

  – One can put the following relation on S: s ∼ s0 if s0 = gs for some g ∈ G. It is easy to see that this is an Equivalence Relation by checking the properties of Transitivity, Reflexivity and Symmetry; this check crucially needs the closure and the existence of the identity and of the inverse of each element of a Group. It does NOT need associativity. So it shows that {Orb(s) | s ∈ S} forms a Partition of the set S: since "being in an orbit" is an equivalence relation, 2 orbits can either be identical or disjoint but can't have a proper subset of each overlapping.
  – Hence S is a disjoint union of orbits.
  – Stab(s) is a subgroup of G.
  – The partitioning effect on S and the subgroup property of Stab(s) depend only on the group properties of G and don't demand anything special from S.

• A very important class of group actions is the Left and Right action of a subgroup H on the full group G by group multiplication, given by the maps G × H → G and H × G → G respectively. The same arguments that worked for the general action on a set will work here too, since the group G is also a set, to show that the group-multiplication action of H on G partitions the full group into equivalence classes. But here we change terminology slightly and call the Orbits Cosets.
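The partitioning facts above (orbits, cosets, conjugacy classes) can all be spot-checked on S3 in a short sketch (the realization of S3 as permutation tuples and the chosen subgroup H are illustrative choices):

```python
from itertools import permutations

# G = S3, realized as the 6 permutation tuples of {0, 1, 2};
# g acts on a point s by s -> g[s], and on permutations by composition.
G = list(permutations(range(3)))

def mul(g, h):                    # (g*h)(i) = g(h(i))
    return tuple(g[h[i]] for i in range(3))

def inv(g):                       # inverse permutation
    return tuple(g.index(i) for i in range(3))

# Orbit-stabilizer: |G| = |Orb(s)| * |Stab(s)| for the action on {0,1,2}.
for s in range(3):
    orbit = {g[s] for g in G}
    stab = [g for g in G if g[s] == s]
    assert len(G) == len(orbit) * len(stab)      # 6 = 3 * 2

# Cosets of H = {e, (0 1)} partition G into |G|/|H| classes of size |H|.
H = [(0, 1, 2), (1, 0, 2)]
cosets = {frozenset(mul(g, h) for h in H) for g in G}
assert len(cosets) == 3 and all(len(c) == 2 for c in cosets)  # 6 = 2*3

# Class equation: the conjugacy classes of S3 have sizes 1, 2, 3.
classes = {frozenset(mul(mul(g, x), inv(g)) for g in G) for x in G}
assert sorted(len(c) for c in classes) == [1, 2, 3]           # 6 = 1+2+3
```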
A Left Coset of g ∈ G = gH = {gh | h ∈ H}
A Right Coset of g ∈ G = Hg = {hg | h ∈ H}

So Left Cosets and Right Cosets both partition the group G. One defines the symbol G/H (read "G mod H") to be the set of all cosets of H. One doesn't need to distinguish between the Left and the Right Cosets while defining G/H, since there is a natural bijection between the set of all Left Cosets and the set of all Right Cosets.

• We note the following important thing: when a group acts on just a set S, the different Orb(s) as s varies over S can be of different sizes (for finite groups one can think of "size" as the cardinality of the set Orb(s); for others one can say that there need be no natural bijection from Orb(s) to Orb(r) if s ≠ r). But when a subgroup acts on the full group, all the orbits, i.e. the cosets of all elements of the group, are bijective to each other. The group structure of the set being acted on brings in this new feature, and it is easy to see that |gH| = |H| for all g ∈ G (stating it for finite groups). Since orbits partition the acted-on set, one trivially gets Lagrange's Theorem, that |H| divides |G|. Further, since the cosets partition the group, the number of cosets is precisely |G/H| and hence we have |G| = |H||G/H|.

• This brings us to the extremely important idea in Group Theory that if the group G acts on the set S then there is a natural bijection from G/Stab(s) to Orb(s) for any s ∈ S, which maps the coset gStab(s) to the element gs. This powerfully couples with the fact that the cardinality of the set S is the sum of the cardinalities of the orbits of the action, which are not all guaranteed to be of the same size unless S is a group.

• Another standard group action, which is important for Representation Theory, is the Conjugation Action, an action of a group G on itself given as:

G × G → G
(g, x) ↦ gxg −1

For this action Stab(x) is given a new name, the Centralizer of x, denoted Z(x), and Orb(x) is also given a new name, the Conjugacy Class of x, denoted C(x). Stated as equations, we have:

Z(x) = {g ∈ G | gxg −1 = x}
C(x) = {g ∈ G | g = g 0 xg 0−1 for some g 0 ∈ G}

Again from the idea of the partitioning of the acted-on set by Orbits, we have here what is called the Class Equation, stated as:

|G| = ∑_{conjugacy classes C} |C|

B. Ring

The concept of a Ring evolved from attempts to generalize the properties of the integers, in which one can add as well as multiply staying within the set but cannot divide. Interestingly, this generalization gives a powerful extension of the idea of a "prime number", and this lifting excitingly connects to geometry through what is called "Hilbert's Nullstellensatz". But just now we shall not get into these exciting ideas!

Definition 4 A Ring is a set R endowed with 2 binary operations + (addition) and × (multiplication) such that the following axioms are satisfied:

• R is an abelian group with respect to +.
• The × operation is commutative, associative and has an identity.
• The × operation is left-distributive as well as right-distributive over +, that is, ∀a, b, c ∈ R we have a × (b + c) = a × b + a × c and (b + c) × a = b × a + c × a.

Some standard notation:

• The identity for + is denoted by 0 and the identity of × is denoted by 1.
• Almost always a × (b + c) is written as a(b + c). For example, the first equation of the third condition above will read as a(b + c) = ab + ac.
• (R, +) will mean R thought of as an abelian group under addition.

1. Associated Concepts of a Ring

• A Subring is a subset of a Ring which satisfies all the axioms of being a Ring.
• If 1 = 0 then the Ring is a zero ring.
• An Ideal I of a Ring R is a subset of R such that:
  – I is a subgroup of (R, +).
  – If i ∈ I and r ∈ R then ri ∈ I.

If R is non-commutative then one needs to distinguish between Left Ideals and Right Ideals, depending on whether the r in the second axiom multiplies from the left or from the right.

An Ideal can also be defined as a non-empty subset I of R such that given a finite subset {a1 , ..., ak } ⊂ I and another finite subset {r1 , ..., rk } ⊂ R, the linear combination r1 a1 + r2 a2 + ... + rk ak ∈ I.

• A Principal Ideal generated by an element a ∈ R is the set of all multiples of a. It is denoted (a).

(a) = aR = Ra = {ra | r ∈ R}

• An Ideal generated by {a1 , a2 , ..., an } ⊂ R is the smallest Ideal containing these elements. It is denoted by (a1 , a2 , ..., an ).

(a1 , a2 , ..., an ) = {r1 a1 + r2 a2 + ... + rn an | ri ∈ R}

• A Maximal Ideal of a Ring is an Ideal which is not contained in any other ideal except itself and the whole Ring. One notes that there can be many maximal ideals in a Ring.

• All ideals in Z are principal ideals. All maximal ideals of Z are the principal ideals generated by the prime numbers, and the converse is also true.

• A Ring is an Integral Domain if 1 ≠ 0 and if, for a, b ∈ R such that ab = 0, either a = 0 or b = 0.

C. Field

The concept of a Field is modelled on the idea of the Real numbers, where one can do addition, multiplication, subtraction and division staying within the set.

Definition 5 A Field is a Ring in which every element except the additive identity has a multiplicative inverse. So a Field is an abelian group under addition, and the non-zero elements of the Field form an abelian group under multiplication. The left and the right distributive laws of the multiplication over the addition are inherited from its being a Ring to start off.

D. Vector Space

Definition 6 A Vector Space is a set of 3 data:

• An abelian group G.
• A field F .
• An action of F on G, i.e. a map F × G → G.

where

• The group operation in G is denoted as addition, by +.
• The action of an element a ∈ F on v ∈ G is denoted by av.

and the following consistency conditions are satisfied by the action, for a, b ∈ F and v, w ∈ G, with 1 the multiplicative identity of the Field:

• 1v = v
• (ab)v = a(bv)
• (a + b)v = av + bv
• a(v + w) = av + aw

We shall denote a vector space V as the triplet (G, F, F × G → G). And when it is said that v ∈ V , v will be meant to be an element of the implicit abelian group that is part of the data called "Vector Space".

1. Associated concepts of a Vector Space

• W is called a Subspace of V = (G, F, F × G → G) if:
  – W ⊂ G
  – If w1 , w2 ∈ W then w1 + w2 ∈ W
  – If w ∈ W and a ∈ F then aw ∈ W
  – 0 (the identity of G) is in W .

It is to be noted that the base field of the subspace W is F , the same as that of the total space V .

• A Homomorphism φ from a vector space V to a vector space W , both over the field F , is a map φ : V → W which satisfies the following axioms: if v1 , v2 ∈ V and c ∈ F then
  – φ(v1 + v2 ) = φ(v1 ) + φ(v2 )
  – φ(cv) = cφ(v)

If φ is also bijective then it is called an Isomorphism. If V = W then the isomorphism is called an Automorphism of V (or W ).

• If F is a Field then GL(n, F ) can be defined in any of the following equivalent ways:
  – The group of all automorphisms of an n-dimensional vector space over F .
  – The group of all invertible n × n matrices with entries in F .
  – The group of all n × n matrices with entries in F whose determinant is ≠ 0.
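For a finite field the group GL(n, F ) is finite and can be enumerated directly. A minimal sketch (the case n = 2, F = F2 is an illustrative choice) checks the count (q² − 1)(q² − q) and closure under multiplication:

```python
from itertools import product

# Count GL(2, F_2) by brute force: 2x2 matrices over F_2 with det != 0.
def det2(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % 2

mats = [((a, b), (c, d)) for a, b, c, d in product((0, 1), repeat=4)]
gl = [m for m in mats if det2(m) != 0]
assert len(gl) == (2**2 - 1) * (2**2 - 2)   # 6 invertible matrices

# The invertible matrices are closed under multiplication mod 2 (a group).
def mul(m, n):
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

assert all(mul(m, n) in gl for m in gl for n in gl)
```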