Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 10
Instructor: Farid Alizadeh    Scribe: Ge Zhang    11/19/2001

1 Overview

In this section we continue the study of Jordan algebras. We first study the notions of minimum and characteristic polynomials, and from there define eigenvalues. Next we introduce a central concept called the quadratic representation.

2 Two examples of Jordan algebra and the identity element

2.1 Two prototypical examples

In the last lecture we showed that the following examples are indeed Jordan algebras with identity. We use these two algebras as our prototypical examples throughout.

The Jordan algebra of square matrices. The set Mn of all n × n matrices under the binary operation "◦" defined by

X ◦ Y = (XY + YX)/2

forms the Jordan algebra (Mn, ◦). Clearly the identity matrix I is the identity element with respect to this Jordan algebra. The set Sn of all n × n symmetric matrices under the "◦" operation forms a subalgebra of (Mn, ◦): even though Sn is not closed under the ordinary (associative) multiplication, it is closed under the "◦" operation. Since the identity matrix is symmetric, it is the identity of Sn as well.

The algebra of quadratic forms. Define, for all x = (x0; x̄), y = (y0; ȳ) ∈ R^{n+1},

x ◦ y = (x0 y0 + x̄ᵀBȳ; x0 ȳ + y0 x̄),

where B is a symmetric n × n matrix. Then (R^{n+1}, ◦) is also a Jordan algebra. Clearly, under this multiplication e = (1; 0) is the identity element. Below we will always assume that B = I, the identity matrix, unless otherwise stated.

3 The minimum polynomial

In this section we start developing the concepts of eigenvalues, rank, inverse elements, and so on. In the algebra of matrices, these concepts are developed in a somewhat different order than the one we are about to present.
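As a quick numerical illustration (not part of the original notes), the two prototypical products can be sketched in NumPy; the function names jordan_matrix and jordan_spin are my own:

```python
import numpy as np

def jordan_matrix(X, Y):
    """Jordan product on M_n: X ◦ Y = (XY + YX)/2."""
    return (X @ Y + Y @ X) / 2

def jordan_spin(x, y, B=None):
    """Jordan product on R^{n+1}: x ◦ y = (x0*y0 + x̄ᵀB ȳ; x0*ȳ + y0*x̄)."""
    x0, xb = x[0], x[1:]
    y0, yb = y[0], y[1:]
    if B is None:               # default B = I, as assumed in the notes
        B = np.eye(len(xb))
    return np.concatenate(([x0 * y0 + xb @ B @ yb], x0 * yb + y0 * xb))

# check the identity elements: I for (M_n, ◦) and e = (1; 0) for (R^{n+1}, ◦)
X = np.array([[1.0, 2.0], [3.0, 4.0]])
I = np.eye(2)
assert np.allclose(jordan_matrix(X, I), X)

x = np.array([2.0, 1.0, -1.0])
e = np.array([1.0, 0.0, 0.0])
assert np.allclose(jordan_spin(x, e), x)
```

Both products are commutative by construction, which the reader can verify on random inputs.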
In fact, in Jordan algebras we start from minimum polynomials, then define the notion of the characteristic polynomial, and from there define eigenvalues, inverse, trace, determinant and other concepts. This somewhat reverse development happens to be more convenient for general power-associative algebras.

From Lecture 9 recall that in a power-associative algebra (A, ∗) the subalgebra generated by a single element x ∈ A, denoted by A(x), is associative. Also recall that A(x) is essentially the set of all power series in a single element, i.e. the set of Σᵢ fᵢ xⁱ. A subalgebra of this associative algebra is the set of polynomials A[x] = { Σ_{i=0}^{k} fᵢ xⁱ | k ∈ Z₊ }, where Z₊ = {0, 1, 2, . . .}.

3.1 Degree and rank

Definition 1 Let A be a power-associative algebra with identity element e such that dim(A) = n. For x ∈ A, let d be the smallest integer such that {e, x, · · · , x^d} is linearly dependent. Then d is called the degree of x and is denoted by deg(x). Obviously d ≤ n.

Definition 2 Let r = max_{x∈A} deg(x); then r is called the rank of A and is denoted by rank(A).

Definition 3 The vector x is regular if deg(x) = rank(A) = r.

It can be shown that if the underlying vector space is over the set of real numbers R, then the set of regular elements is dense in A. This means that for every element x ∈ A and any positive number ε there is a regular element in the ball of radius ε centered at x. Furthermore, almost all elements of A are regular. These statements will become clear shortly.

It is now easy to see that the set of power series A(x) and the set of polynomials A[x] are identical. Since x^d can be written as a linear combination of the x^j for j = 0, 1, . . . , d − 1, each power series Σ_{j=0}^{∞} α_j x^j can be written as a polynomial Σ_{j=0}^{d−1} β_j x^j, where each β_j is an infinite sum over an appropriate subsequence of the α_i.

3.2 Definition of minimum polynomials

Definition 4 Let x ∈ A have degree d. Then x^d, x^{d−1}, · · · , x, e are linearly dependent.
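The degree of an element can be computed numerically from Definition 1 by stacking vectorized powers until a rank deficiency appears. The following sketch (my own, not from the notes) does this in (S^n, ◦), where Jordan powers of a single element coincide with ordinary matrix powers:

```python
import numpy as np

def degree(x, tol=1e-9):
    """Smallest d such that {e, x, ..., x^d} is linearly dependent."""
    vecs = [np.eye(x.shape[0]).ravel()]     # e = identity matrix
    d = 0
    while True:
        d += 1
        vecs.append(np.linalg.matrix_power(x, d).ravel())
        if np.linalg.matrix_rank(np.stack(vecs), tol=tol) < len(vecs):
            return d

X = np.diag([1.0, 2.0, 3.0])   # distinct eigenvalues: regular in S^3
assert degree(X) == 3
Y = np.diag([1.0, 1.0, 2.0])   # repeated eigenvalue: degree 2 < rank
assert degree(Y) == 2
```

This also illustrates that rank(S^n) = n, attained exactly by the matrices with n distinct eigenvalues.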
Therefore there are functions a1(x), a2(x), · · · , ad(x) such that

x^d − a1(x) x^{d−1} + · · · + (−1)^d ad(x) e = 0.

The polynomial m_x(t) = t^d − a1(x) t^{d−1} + · · · + (−1)^d ad(x) is called the minimum polynomial of x.

Now suppose there are two minimum polynomials m1(t) and m2(t) for x. Then both are monic (i.e. the coefficient of the highest-degree term is 1) and of the same degree, say d. But this means that for m3(t) = m1(t) − m2(t) we must have m3(x) = 0, which is a contradiction since the degree of m3(t) is less than d. Thus,

Lemma 1 The minimum polynomial of an element x is unique.

Observe that deg(e) = 1, since e and e¹ are already linearly dependent, and therefore the minimum polynomial of e is m(t) = t − 1.

For each regular element x, there are r functions aj(x) that determine the coefficients of the minimum polynomial. These coefficients depend only on x and otherwise are fixed. Thus, if we know the functional form of these coefficients, we have in principle an algorithm that determines the minimum polynomial. We can show almost immediately that the coefficients of the minimum polynomial are in fact rational functions of x. To see this, note that, since {e, x, . . . , x^{d−1}} is a set of linearly independent vectors, we can extend it by a set of n − d other vectors to a basis of the vector space A. Let this basis be {e, x, . . . , x^{d−1}, e_d, . . . , e_n}. Then the equation

x^d − a1(x) x^{d−1} + · · · + (−1)^d ad(x) e = 0

can be thought of as a system of n linear equations in the d unknowns aj(x). By Cramer's rule, each aj(x) is the ratio of two determinants:

aj(x) = Det(e, x, . . . , x^{j−1}, x^d, x^{j+1}, . . . , x^{d−1}, e_d, . . . , e_n) / Det(e, x, . . . , x^{j−1}, x^j, x^{j+1}, . . . , x^{d−1}, e_d, . . . , e_n).

In fact it can be shown that the aj(x) are all polynomials in x; that is, in the ratio of determinants above, the denominator divides the numerator.
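In place of the determinant ratios, the coefficients a_j(x) can also be recovered numerically by solving the linear system of Definition 4 directly. A minimal sketch (mine, not from the notes), again in (S^n, ◦):

```python
import numpy as np

def min_poly_coeffs(X, d):
    """Solve x^d = c_{d-1} x^{d-1} + ... + c_0 e in the least-squares sense,
    returning the monic minimum-polynomial coefficients of X in (S^n, ◦)."""
    powers = [np.linalg.matrix_power(X, k).ravel() for k in range(d)]
    A = np.stack(powers, axis=1)               # n² × d linear system
    b = np.linalg.matrix_power(X, d).ravel()
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    # m_x(t) = t^d − c_{d−1} t^{d−1} − ... − c_0, highest degree first
    return np.concatenate(([1.0], -c[::-1]))

X = np.diag([1.0, 2.0, 3.0])                   # regular: min poly = char poly
m = min_poly_coeffs(X, 3)
# (t−1)(t−2)(t−3) = t³ − 6t² + 11t − 6
assert np.allclose(m, [1.0, -6.0, 11.0, -6.0])
```

Note that the recovered a_1 = 6 and a_3 = 6 are exactly the trace and determinant of X, in accordance with Definition 5 below.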
This is a consequence of a lemma in ring theory due to Gauss, which is beyond the scope of our course to state and prove.

Accepting that the aj are polynomials, we can actually prove that they are homogeneous polynomials. (Recall that f(x) is said to be homogeneous of degree p if f(αx) = α^p f(x).) To see this, suppose m1(t) is the minimum polynomial of x, so that m1(x) = x^d − a1(x) x^{d−1} + · · · + (−1)^d ad(x) e = 0. Clearly, for all α ∈ R, α ≠ 0, if deg(x) = d then deg(αx) = d as well, and the minimum polynomial of αx satisfies

(αx)^d − a1(αx)(αx)^{d−1} + · · · + (−1)^d ad(αx) e = 0.

Now define m2(t) = α^d m1(t/α). Then m2 is monic and of degree d, and m2(αx) = α^d m1(x) = 0; by Lemma 1 we conclude that m2(t) is identical to the minimum polynomial of αx. Comparing the coefficients of t^{d−j} gives α^d aj(x) = α^{d−j} aj(αx), which means that aj(αx) = α^j aj(x). Thus we have proven

Lemma 2 The coefficients aj(x) of the minimum polynomial of x are homogeneous polynomials of degree j in x.

It follows in particular that the coefficient a1(x) is a linear function of x. Using the notion of minimum polynomials we can now extend the usual notions of linear algebra, such as trace, determinant, and eigenvalues, at least for regular elements.

Definition 5 Let x be a regular element of the Jordan algebra J of rank r and m(t) its minimum polynomial. (Thus m(t) is of degree r.) Then

• the roots of m(t), λᵢ(x), are the eigenvalues of x,

• the trace of x is tr(x) = a1(x) = Σᵢ λᵢ,

• the determinant of x is det(x) = ar(x) = Πᵢ λᵢ.

Example 1 (Minimum polynomials in (Mn, ◦)) For (Mn, ◦), the concept of minimum polynomial coincides with the linear algebraic notion we are already familiar with. Also, a matrix X is regular if its characteristic polynomial (defined as F(t) = Det(tI − X)) coincides with its minimum polynomial. (From linear algebra we know that the minimum polynomial of a matrix divides its characteristic polynomial.)
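Lemma 2 can be sanity-checked numerically. In (S^n, ◦) the coefficients a_j are the elementary symmetric functions of the eigenvalues; in particular a_1 is the trace (degree 1) and a_n the determinant (degree n). A quick check of the homogeneity, using my own helper names:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = (A + A.T) / 2              # a random symmetric matrix in S^3
alpha = 2.5

# a_1(αX) = α a_1(X): the trace is homogeneous of degree 1
assert np.isclose(np.trace(alpha * X), alpha * np.trace(X))
# a_3(αX) = α³ a_3(X): the determinant is homogeneous of degree 3
assert np.isclose(np.linalg.det(alpha * X), alpha**3 * np.linalg.det(X))
```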
For regular matrices X, the notions of trace, determinant, and eigenvalues then coincide with the familiar ones. For symmetric matrices, that is the subalgebra (Sn, ◦), we know that each X has orthogonal eigenvectors; that is, there is an orthogonal matrix Q with columns qᵢ such that X = QΛQᵀ = Σᵢ λᵢ qᵢ qᵢᵀ. We also know that for a symmetric matrix X the characteristic and minimum polynomials coincide if, and only if, X has n distinct eigenvalues.

Example 2 (Minimum polynomials for (R^{n+1}, ◦)) It can be very easily verified that for each x = (x0; x̄),

x² − 2x0 x + (x0² − x̄ᵀBx̄) e = 0.

Thus (R^{n+1}, ◦) is a Jordan algebra of rank 2. The roots of this quadratic polynomial are x0 ± ‖x̄‖ (recall we assume B = I). If these two roots are equal then we must have ‖x̄‖ = 0, which implies x̄ = 0; that is, x is a multiple of the identity element e. Since deg(e) = 1, as in all Jordan algebras, it follows that the only non-regular elements of R^{n+1} are the multiples of the identity element. We also see that tr(x) = 2x0 and det(x) = x0² − x̄ᵀBx̄.

4 Characteristic polynomial and the inverse

4.1 Characteristic polynomial

To extend the notions of eigenvalues, trace and determinant from regular elements to all elements of the Jordan algebra J, we need to develop the notion of the characteristic polynomial (sometimes called the generic minimum polynomial). An easy way to do this extension is to first define, for regular elements, the characteristic polynomial to be the same as the minimum polynomial. Then, since we have already seen that the coefficients of this polynomial are polynomials in x, it follows that they are well-defined for all x. Thus,

Definition 6 For each x ∈ J, where (J, ◦) is a Jordan algebra of rank r, let the r polynomials aj(x) for j = 1, . . . , r be defined as above.
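The degree-2 relation of Example 2 is easy to verify numerically. A sketch (mine, with B = I; the helper jordan_spin is my own name):

```python
import numpy as np

def jordan_spin(x, y):
    """x ◦ y = (x0*y0 + x̄ᵀȳ; x0*ȳ + y0*x̄) on R^{n+1}, with B = I."""
    x0, xb, y0, yb = x[0], x[1:], y[0], y[1:]
    return np.concatenate(([x0 * y0 + xb @ yb], x0 * yb + y0 * xb))

x = np.array([3.0, 1.0, -2.0, 0.5])
e = np.array([1.0, 0.0, 0.0, 0.0])
x0, xb = x[0], x[1:]
detx = x0**2 - xb @ xb

# the minimum polynomial relation: x² − 2x₀x + det(x)e = 0
residual = jordan_spin(x, x) - 2 * x0 * x + detx * e
assert np.allclose(residual, 0)

# its roots, the eigenvalues, are x₀ ± ‖x̄‖
lams = np.roots([1.0, -2 * x0, detx])
expected = [x0 - np.linalg.norm(xb), x0 + np.linalg.norm(xb)]
assert np.allclose(sorted(lams), sorted(expected))
```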
Then,

i. The characteristic polynomial of x is

f_x(t) = t^r − a1(x) t^{r−1} + · · · + (−1)^r ar(x),

ii. The r roots of f_x are the eigenvalues of x, and the algebraic multiplicity of an eigenvalue λᵢ is its multiplicity as a root of the characteristic polynomial,

iii. tr(x) = a1(x) = Σ_{i=1}^{r} λᵢ and det(x) = ar(x) = Π_{i=1}^{r} λᵢ.

It can be shown that the minimum polynomial of x divides the characteristic polynomial, and that the roots of the minimum polynomial are the same as the roots of the characteristic polynomial, except that the multiplicity of each root may be larger in the characteristic polynomial. It also follows from the definition that an element x is regular if, and only if, its minimum and characteristic polynomials are identical. As an example, the characteristic polynomial of the identity element e is the polynomial (t − 1)^r, and thus tr(e) = r and det(e) = 1, as in the algebra of matrices. Also, one can show the following extension of the Cayley-Hamilton theorem of linear algebra:

Theorem 1 For each element x ∈ J, where (J, ◦) is a Jordan algebra, f_x(x) = 0.

4.2 The linear operator L₀

By the Cayley-Hamilton theorem f_x(x) = 0, that is,

x^r − a1(x) x^{r−1} + · · · + (−1)^r ar(x) e = 0.

Since L is linear in its argument, applying it to both sides gives

L(x^r) − a1(x) L(x^{r−1}) + · · · + (−1)^r ar(x) L(e) = L(0) = 0.

Moreover, on the associative subalgebra J[x] the operator L(x^k) acts as the k-th power of L(x), since for u ∈ J[x] associativity gives x^k ◦ u = x ◦ (x ◦ (· · · ◦ (x ◦ u))). Since L(e) = I, the identity matrix, we conclude that f_x(L(x)) u = 0 for every u ∈ J[x]; that is, the restriction of L(x) to J[x] is a zero of the characteristic polynomial f_x(t) of x. We show below that L(x) maps J[x] back to itself, so J[x] is an invariant subspace of L(x); from matrix algebra we know that the characteristic polynomial of the restriction of a matrix to an invariant subspace divides its full characteristic polynomial, and by Theorem 2 below the characteristic polynomial of this restriction is f_x(t) itself. Thus

F_{L(x)}(t) = Det(tI − L(x)) = f_x(t) q(t)

for some polynomial q(t). Since, for a regular element x, {e, x, · · · , x^{r−1}} is a basis of the subalgebra J[x], we have, for every u ∈ J[x], u = Σ_{i=0}^{r−1} αᵢ xⁱ. Therefore

x ◦ u = L(x)u = Σ_{i=0}^{r−1} αᵢ x^{i+1} = Σ_{i=0}^{r−1} βᵢ xⁱ,
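The divisibility F_{L(x)}(t) = f_x(t)q(t) can be observed concretely in (R^{n+1}, ◦), where L(x) is the arrow matrix Arw(x). A sketch (mine; arw is my own name):

```python
import numpy as np

def arw(x):
    """Arrow matrix: the operator L(x) for (R^{n+1}, ◦) with B = I."""
    x0, xb = x[0], x[1:]
    A = x0 * np.eye(len(x))
    A[0, 1:] = xb
    A[1:, 0] = xb
    return A

x = np.array([3.0, 1.0, -2.0])
x0, xb = x[0], x[1:]
fx = np.array([1.0, -2 * x0, x0**2 - xb @ xb])   # f_x(t) = t² − 2x₀t + det(x)

# characteristic polynomial of L(x), from its eigenvalues
F = np.poly(np.linalg.eigvalsh(arw(x)))

# f_x divides F_{L(x)}: polynomial division leaves (numerically) no remainder
q, rem = np.polydiv(F, fx)
assert np.allclose(rem, 0)
```

Here the eigenvalues of Arw(x) are x₀ ± ‖x̄‖ together with x₀ repeated n − 1 times, so q(t) = (t − x₀)^{n−1}.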
where the βᵢ are obtained by noting that x^r = Σ_{i=0}^{r−1} (−1)^{r−1−i} a_{r−i}(x) xⁱ; that is, βᵢ = α_{i−1} + (−1)^{r−1−i} a_{r−i}(x) α_{r−1} (with α_{−1} = 0). As a result, even though L(x) is an n × n matrix, it maps J[x] back to itself; therefore its restriction to the r-dimensional subspace J[x] may be considered as an r × r matrix, denoted by L₀(x). The matrix L₀(x) actually can be written explicitly with respect to the basis {e, x, . . . , x^{r−1}}: this is the matrix that maps the αᵢ to the βᵢ:

L₀(x) =
  [ 0  0  · · ·  0  (−1)^{r−1} a_r(x)     ]
  [ 1  0  · · ·  0  (−1)^{r−2} a_{r−1}(x) ]
  [ 0  1  · · ·  0  (−1)^{r−3} a_{r−2}(x) ]
  [ ⋮  ⋮   ⋱    ⋮   ⋮                     ]
  [ 0  0  · · ·  1  a_1(x)                ]

Then x ◦ u = L₀(x)u for all u ∈ J[x].

Lemma 3 If x is regular in the rank-r Jordan algebra (J, ◦), then for every polynomial p(t) we have p(x) = 0 if, and only if, p(L₀(x)) = 0.

Thus, remembering that the aj(x) are defined for all elements and not just the regular elements, we have

Theorem 2 The characteristic polynomial of x is identical to the characteristic polynomial of the matrix L₀(x).

4.3 Definition of the inverse

At first glance it seems natural to define the inverse of an element x in a Jordan algebra (J, ◦) with an identity element e to be an element y (if it exists) such that y ◦ x = e. The problem is that, unlike in associative algebras, this y may not be unique. For instance, in the Jordan algebra (M2, ◦), define

X = [ 1 0 ; 0 −1 ],    T = { [ 1 t ; t −1 ] | t ∈ R }.

Then every element Y of T has the property that Y ◦ X = (YX + XY)/2 = I.

Remember that in associative algebras, for each x there is at most one element y such that xy = yx = e. This is easy to see. Let y1 and y2 be two elements such that

x y1 = y1 x = e    and    x y2 = y2 x = e.

Multiply both sides of the first equation by y2 to get y2 (x y1) = y2. By associativity, (y2 x) y1 = y2, and since y2 x = e, we get y1 = y2.
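The companion-type matrix L₀(x) and Theorem 2 can be checked directly; here is a small sketch (mine, with my own helper name L0) for the rank-2 spin algebra, where a₁ = 2x₀ and a₂ = det(x):

```python
import numpy as np

def L0(a):
    """Build L₀(x) from the coefficients a = [a_1, ..., a_r] of
    f_x(t) = t^r − a_1 t^{r−1} + ... + (−1)^r a_r."""
    r = len(a)
    M = np.zeros((r, r))
    M[1:, :-1] = np.eye(r - 1)               # subdiagonal of ones
    for i in range(r):                       # last column: (−1)^{r−1−i} a_{r−i}
        M[i, -1] = (-1) ** (r - 1 - i) * a[r - 1 - i]
    return M

# spin algebra, rank 2: a_1 = 2x₀, a_2 = det(x) = x₀² − ‖x̄‖²
x0, xb = 3.0, np.array([1.0, -2.0])
a = [2 * x0, x0**2 - xb @ xb]
M = L0(a)

# Theorem 2: the characteristic polynomial of L₀(x) is t² − a₁t + a₂ = f_x(t)
assert np.allclose(np.poly(M), [1.0, -a[0], a[1]])
```

For r = 2 this gives L₀(x) = [0 −a₂; 1 a₁], whose trace and determinant are a₁ and a₂, matching tr(x) and det(x).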
To define an appropriate notion of inverse, we insist that the inverse of x also lie in the (associative) subalgebra J[x]; this requirement ensures uniqueness of the inverse element. In this subalgebra it turns out that x⁻¹ can actually be expressed as a polynomial in x: since x^r − a1(x)x^{r−1} + · · · + (−1)^r ar(x)e = 0, we can give

Definition 7 x⁻¹ = (x^{r−1} − a1(x) x^{r−2} + · · · + (−1)^{r−1} a_{r−1}(x) e) / ((−1)^{r−1} ar(x)).

A simple calculation shows that x⁻¹ ◦ x = e.

Example 3 (Inverse in (Mn, ◦) and (R^{n+1}, ◦))

• For the Jordan algebra (Mn, ◦), the inverse X⁻¹ coincides with the usual matrix-theoretic inverse of the matrix X.

• For (R^{n+1}, ◦),

x⁻¹ = −(x − 2x0 e)/(x0² − ‖x̄‖²) = (1/(x0² − ‖x̄‖²)) (x0; −x1; . . . ; −xn) = Rx/det(x),

where R = Diag(1, −I). For an arbitrary symmetric matrix B only the last equality holds.

5 The quadratic representation

A fundamental concept in Jordan algebras is the notion of the quadratic representation, which is a linear transformation associated to each element x of a Jordan algebra J. First let us give the

Definition 8 The quadratic representation of x ∈ J is Q_x = 2L²(x) − L(x²), and thus Q_x y = 2x ◦ (x ◦ y) − x² ◦ y.

At first glance this seems a bit arbitrary. However, this matrix is the generalization of the operation on square matrices that sends a matrix Y to XYX:

Q_X Y = 2X ◦ (X ◦ Y) − X² ◦ Y
      = (X(XY + YX) + (XY + YX)X)/2 − (X²Y + YX²)/2
      = (X²Y + 2XYX + YX² − X²Y − YX²)/2
      = XYX.

Therefore vec(XYX) = (Xᵀ ⊗ X) vec(Y); that is, Q_X = Xᵀ ⊗ X.

Example 4 (Q_x in (R^{n+1}, ◦)) For (R^{n+1}, ◦),

Q_x = 2 Arw²(x) − Arw(x²) =
  [ ‖x‖²    2x0 x̄ᵀ           ]
  [ 2x0 x̄   det(x) I + 2x̄x̄ᵀ ]
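The identity Q_X Y = XYX and its Kronecker-product form can be confirmed numerically; a short sketch (mine, not part of the notes):

```python
import numpy as np

def jordan(X, Y):
    """Jordan product X ◦ Y = (XY + YX)/2 on M_n."""
    return (X @ Y + Y @ X) / 2

rng = np.random.default_rng(1)
X, Y = rng.standard_normal((2, 3, 3))

# Q_X Y = 2 X◦(X◦Y) − X²◦Y equals XYX
QXY = 2 * jordan(X, jordan(X, Y)) - jordan(X @ X, Y)
assert np.allclose(QXY, X @ Y @ X)

# as a matrix on vec(Y) (column-major stacking): Q_X = Xᵀ ⊗ X
vec = lambda M: M.flatten(order="F")
assert np.allclose(np.kron(X.T, X) @ vec(Y), vec(X @ Y @ X))
```

The column-major (Fortran-order) flattening matters here: the identity vec(AYB) = (Bᵀ ⊗ A) vec(Y) is stated for column-stacking vec.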