Math 125,
Fall 2005
N. Kerzman
QUADRATIC FORMS AND SYMMETRIC MATRICES.
EIGENVALUES, EIGENVECTORS, DIAGONALIZATION, AND
REDUCTION OF QUADRICS TO PRINCIPAL AXES. A ROAD MAP
INVOLVING MATLAB.
We use MatLab notation without warning.
Sometimes we use A*B for matrix multiplication (in MatLab style)
and sometimes we just write AB (in math style).
All matrices have real entries.
All symmetric matrices A in this write-up are 3 by 3 unless
otherwise stated. Statements easily extend to any n by n
symmetric A.
References:
Linear Algebra by Otto Bretscher, Prentice Hall. A nice
undergraduate textbook.
Matrix Theory, Vols. I, II, by F. Gantmacher, Chelsea. A
thorough graduate-level classic and reference. Very useful.
--------
1. Motivation. Quadratic functions in two variables.
The goal is to plot and understand the plane curve C
given by x^2 + 2 y^2 + xy = 1. Think of C as the level set of
q(x,y) = x^2 + 2 y^2 + xy where q(x,y) = 1.
Thanks to MatLab, it is easy to plot C using meshgrid and
the contour commands. Try it and see what you get.
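For instance, here is one possible version of those commands (a
minimal sketch; the plotting range is a guess, adjust it as needed):
x = linspace(-3,3,200); y = linspace(-3,3,200);
[X,Y] = meshgrid(x,y);
Q = X.^2 + 2*(Y.^2) + X.*Y;    % q(x,y)
contour(X,Y,Q,[1 1],'b')       % the level set q(x,y)=1, i.e., the curve C
axis equal
xlabel('x'), ylabel('y')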
Now you want to better understand what you plotted. A
wonderful and far-reaching train of thought starts at this point.
Introduce the column vector [x;y], its transpose [x;y]' (which is
the row vector [x,y]), and a newly created matrix
A = [ 1    1/2
      1/2   2  ].
You can rewrite
q(x,y) = [x;y]' * A * [x;y], or, skipping the matrix product
symbol * as is done in mathematics--but not in MatLab!--
(1.1)    q(x,y) = [x;y]' A [x;y]
Verify that (1.1) is true. A is a 2 by 2 matrix. It is
symmetric, i.e., A(i,j)=A(j,i). A was concocted by putting in
its main diagonal the coefficients of the "pure" quadratic terms
in q(x,y), and in the off-diagonal entries the coefficient of the
"mixed" term xy divided by 2.
The same is true for any function q(x,y) = a x^2 + b y^2 + c xy.
The symmetric matrix
A = [ a     c/2
      c/2    b  ],
constructed like the previous A, does the trick. You can write
q(x,y) via (1.1), exactly as in the example. Please check it!
Functions q(x,y) = a x^2 + b y^2 + c xy, with a, b, c real, are
called quadratic forms, or just quadratic functions in two
variables.
Notice the interplay:
PING: Any quadratic form q(x,y) gives rise to a 2 by 2 symmetric
matrix A such that (1.1) is true.
PONG: Any symmetric 2 by 2 matrix A creates a quadratic form
q(x,y) defined by (1.1).
--------
2. Quadratic forms in three variables.
Getting more ambitious, let's plot and understand now the
surface S in 3-d space given by
9 x^2 + 7 y^2 + 3 z^2 + 2 xy + 4 xz - 6 yz = 1
S is the level surface of the quadratic function in three
variables q(x,y,z) = 9 x^2 + 7 y^2 + 3 z^2 + 2 xy + 4 xz - 6 yz
for the level q(x,y,z) = 1. MatLab plots S using meshgrid for
3-d arrays and isosurface. Try it: Success! Now we'll try to
understand S.
Change of notation. We change to a more systematic notation
because what we are up to works just as well in 2, 3, 4, 5, or in
any number of variables. Here is what we do: instead of x, y, z,
we write x1, x2, x3 and reserve the name x for the column vector
x = [x1; x2; x3].
A new 3 by 3 matrix A is created as before: in the
main diagonal are 9, 7, and 3, i.e., the coefficients of the
"pure" quadratic terms. In the off-diagonal position
A(i,j) = coefficient of x_i x_j divided by 2. I.e.,
A = [ 9   1   2
      1   7  -3
      2  -3   3 ]
A is a 3 by 3 symmetric matrix.
In the new notation, the same easy verifications done in
the two-variables case show that
(2.1)    q(x1,x2,x3) = x' A x
What was said in our example extends word for word to any
quadratic form
q(x1,x2,x3) = a11 x1^2 + a22 x2^2 + a33 x3^2 + a12 x1 x2 + a13 x1 x3 + a23 x2 x3
The symmetric matrix A arising from q has A(i,i) = aii
and A(i,j) = aij/2 if i ≠ j.
The same PING-PONG still goes on: any q(x1,x2,x3) produces a
symmetric A for which (2.1) holds. And any symmetric A generates
a q(x1,x2,x3) defined by (2.1).
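For instance, a quick numerical spot-check of (2.1) for the A of
our example (the test point is arbitrary, chosen here just for
illustration):
A = [9 1 2; 1 7 -3; 2 -3 3];
x = [1; -2; 0.5];
9*x(1)^2 + 7*x(2)^2 + 3*x(3)^2 + 2*x(1)*x(2) + 4*x(1)*x(3) - 6*x(2)*x(3)
x'*A*x     % the two numbers printed should coincide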
--------
3. When A is diagonal,
1. the surface S given by q(x,y,z) = 1 is very easy to
understand, and
2. the aii are precisely the eigenvalues λi of A.
Reasons:
1. If all A(i,j)=0 off the main diagonal, then, in the old
notation, the equation of the surface S is
q(x,y,z) = a11 x^2 + a22 y^2 + a33 z^2 = 1. Our study of quadrics
started with examples of this type.
E.g., if all the aii > 0, S is an ellipsoid.
If two of the aii are > 0 and one is < 0, S is a hyperboloid of one
sheet. While if two are < 0 and one is > 0 we'll get a hyperboloid
of two sheets. (And if all three are < 0 what do we get? And if
some aii = 0? Answer please!)
2. Basic linear algebra says that λ is an eigenvalue of A
exactly when p(λ) = det(λI - A) = 0. Write the matrix λI - A.
You will see that p(λ) = (λ - a11)(λ - a22)(λ - a33). This makes
it obvious which are the roots of p(λ), i.e., the eigenvalues of
A.
The eigenvectors of a diagonal A are simply the famous I,
J, K, along the coordinate axes. E.g., v1 = [1; 0; 0], etc. Please
check it.
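For instance, a quick check in MatLab (this small diagonal matrix
is just an illustration, not from the text):
A = diag([2 -1 3]);
[V,R] = eig(A)    % V should come out as the identity matrix, R as A itself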
--------
4. Eigenvalues and eigenvectors of a symmetric matrix A
THEOREM. For any SYMMETRIC 3 by 3 matrix A:
* the eigenvalues are necessarily real;
* for different eigenvalues λ1 ≠ λ2 of A, the corresponding
  eigenvectors v1 and v2 are automatically orthogonal
  (i.e., v1 ⊥ v2 in the sense of space geometry or of
  multivariable calculus);
* there are ALWAYS three eigenvectors v1, v2, v3 of A that
  are orthogonal to each other and normalized to length = 1;
  v1 corresponds to λ1, etc.
The theorem is valid because A is symmetric. It does not
matter whether the eigenvalues of A are repeated or not.
Our
references have the proofs-- but you don’t need to study them
for this class.
--------
5. The heart of the matter.
A crucial change of variables.
Let V be the 3 by 3 matrix whose columns are the v1, v2, v3
above. Then
(5.1)    V'*V = I
is an immediate consequence of the definition of matrix product
and of the orthonormality of v1, v2, v3. Please make sure you
verify and understand this. Hence V has inv(V) and
(5.2)    inv(V) = V'
Any square matrix V satisfying (5.2) is called orthogonal.
We get another ping-pong important for our business.
PING: start with any three vectors v1,v2,v3 ; place them as
columns of a matrix V. If v1,v2,v3 are orthonormal, then V
satisfies (5.1), i.e., V is orthogonal.
PONG: A way to check whether three arbitrary vectors are
orthonormal is to check whether the matrix V that has them as
columns is orthogonal, i.e., you should check whether V satisfies
(5.1) or not. This is very easy to do with MatLab.
Next: Define R as the diagonal 3 by 3 matrix
(5.3)    R = [ λ1   0    0
               0    λ2   0
               0    0    λ3 ]
A direct consequence of the matrix multiplication
definition and of what it means to be an eigenvector is
(5.4) A*V=V*R
or, equivalently,
(5.5) A=V*R*inv(V)=V*R*V’
Please make sure you verify and understand (5.4) and (5.5).
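A quick numerical way to see (5.1), (5.4), and (5.5) at work
(using, as an assumption, the A of section 2):
A = [9 1 2; 1 7 -3; 2 -3 3];
[V,R] = eig(A);    % columns of V are orthonormal eigenvectors, R is diagonal
V'*V               % should be (numerically) the identity, i.e., (5.1)
A*V - V*R          % should be (numerically) the zero matrix, i.e., (5.4)
V*R*V' - A         % should be (numerically) zero, i.e., (5.5)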
NOW DO A CHANGE OF VARIABLES. Write
(5.6)    q(x1,x2,x3) = x' A x = x' V R V' x
and define a column vector
(5.7)    y = V' x
Here y = [y1; y2; y3] appears in our story for the first time.
Magic now sets in. Remember that for any two matrices A and B,
(A*B)' = B'*A'. Hence
(5.8)    y' = x' V
and you can rewrite (5.6) as
(5.9)    q(x1,x2,x3) = x' A x = y' R y
BUT R IS DIAGONAL AND SECTION 3 ABOVE APPLIES. THE SURFACE
S SATISFYING q(x1,x2,x3) = x' A x = 1 IS HENCE VERY EASY TO ANALYZE IN
TERMS OF y. NAMELY, S IS THE SET OF POINTS x = (x1,x2,x3) IN
SPACE SUCH THAT y' R y = 1, i.e., S IS DEFINED BY
(5.10)    λ1 y1^2 + λ2 y2^2 + λ3 y3^2 = 1
WHERE y IS THE VECTOR ASSOCIATED WITH x VIA (5.7).
(5.10) has no mixed terms. Hence it will be an ellipsoid
if all three eigenvalues are >0, an…, etc. in other cases.
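A minimal spot-check of (5.9), again assuming the A of section 2
and an arbitrary test point:
A = [9 1 2; 1 7 -3; 2 -3 3];
[V,R] = eig(A);
x = [0.3; -1.2; 2];    % any test point
y = V'*x;              % the change of variables (5.7)
x'*A*x
y'*R*y                 % the two numbers printed should agree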
--------
6. The geometric meaning of y is fundamental: y is
associated to x = [x1; x2; x3] via (5.7), i.e., y = V' x. What does
this y mean? What are these three numbers y1, y2, y3?
Now, x1 , x2 , x3 are the components of a point P in space.
According to what the word “component” means,
(5.11)    P = x1 I + x2 J + x3 K
where I, J, K are the famous unit vectors along the coordinate
system axes. We are talking about the basic system of reference
used from the beginning to do analytic geometry.
By (5.7),
(5.12)    x = V * y
The definition of matrix multiplication shows immediately that
(5.13)    [x1; x2; x3] = y1 v1 + y2 v2 + y3 v3
Since [x1; x2; x3] = x1 I + x2 J + x3 K, we get
(5.14)    x1 I + x2 J + x3 K = y1 v1 + y2 v2 + y3 v3
CONCLUSION: ANY POINT P WITH COMPONENTS x1 , x2 , x3 IN THE
ORIGINAL SYSTEM OF REFERENCE BASED ON I, J, K HAS
COMPONENTS y1 , y2 , y3 IN THE SYSTEM OF REFERENCE DETERMINED BY
v1,v2,v3.
--------
THE SURFACE S IS MADE UP OF THE POINTS P WHOSE COMPONENTS
y1, y2, y3 IN THE EIGENVECTORS v1, v2, v3 REFERENCE SYSTEM
SATISFY THE SIMPLIFIED "DIAGONALIZED" FORMULA (5.10). THE
AXES OF SUCH A COORDINATE SYSTEM ARE CALLED THE "PRINCIPAL
AXES" OF S.
THE ORIGINAL EQUATION IN TERMS OF x1 , x2 , x3 , IS LESS
TRANSPARENT. IT HAS MIXED TERMS. THE ORIGINAL I,J,K
SYSTEM MAY NOT BE TOO GOOD TO DESCRIBE S. IT MAY NOT BE
THE MOST NATURAL GIVEN THE POSITION OF THE QUADRIC S.
WE SEE THAT THE APPEARANCE OF MIXED TERMS IN THE EQUATION
FOR S COMES EXACTLY FROM THE ROTATED POSITION OF S. A
ROTATION OF THE COORDINATE AXES IS THUS CAPABLE OF GETTING
RID OF THEM.
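A short sketch of this change of reference system (same assumed A
and test point as above):
A = [9 1 2; 1 7 -3; 2 -3 3];
[V,R] = eig(A);
x = [0.3; -1.2; 2];    % components of P in the I, J, K system
y = V'*x;              % components of P in the v1, v2, v3 system
y(1)*V(:,1) + y(2)*V(:,2) + y(3)*V(:,3)    % reproduces x, as in (5.14)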
--------
7. Using MatLab to plot S and superimposing its principal axes.
The strategy.
a) Plot S using meshgrid and isosurface. Play with the
scale until you get a nice informative figure. Don’t use
too many points in meshgrid if you don’t want the computer
to be excruciatingly slow. (100 points in each variable
will produce 10^6 points in the 3-d array. That’s still
ok.) Use an m-file for flexibility and for saving your
work. For an example see (9) below.
b) Put the equation of S in the form q(x) = 1 (or q(x) = 0).
Rewrite it as x' A x = 1 (or x' A x = 0).
The familiar command
(7.1)    [V,R] = eig(A)
provides the eigenvectors v1=V(:,1), v2=V(:,2), v3=V(:,3).
What was stated in section 4 is perfectly fine also in the
case when some eigenvalue of A is repeated, i.e., when
p(λ) = det(λI - A), a third order polynomial, has a double or
triple root. But we want to examine what’s peculiar in that
case, both mathematically and in MatLab, compared to the
most frequent situation in which all three eigenvalues are
different.
--------
8. When all three eigenvalues of A are different. There is
no choice for v1 (except, obviously, for multiples), nor
for v2, nor for v3. This is an easy exercise in linear
algebra that students may safely attempt. In producing V,
MatLab normalizes v1, v2, and v3 to length =1. It could
choose –v1 instead of v1, or –v2 instead of v2, etc. But,
other than similar obvious options, it has no choice. Life
without choices is supposed to be easy. In this case it
is. The three axes along v1, v2, v3 are the principal axes
of S.
Use hold on to preserve your previous figure of S.
The axis corresponding to v1 goes through the origin and is
directed by v1. Hence its parametric equation is
(8.1)    x = t v1(1),  y = t v1(2),  z = t v1(3)
To prevent overwriting the variables x,y,z --which you
might need later on-- use, e.g.,
t=linspace(-20,20,100)
(the 20 is guessed, e.g., from playing around with scales in (a))
xx1=t*v1(1); yy1=t*v1(2); zz1=t*v1(3);
This gives you axis1. Similarly you get axis2, etc. Finally,
(8.2)    plot3(xx1,yy1,zz1,'r',xx2,yy2,zz2,'b',xx3,yy3,zz3,'g')
This plots the three axes in red, blue and green to help
you track them down. You may also use dotted lines along
with color, or other variations you like.
--------
9. Example. Here is an m-file:
%quadric 2. x^2 +2*y^2 +3 z^2 +xy +2 xz +6 yz=1;
%hyperboloid of one sheet
%matrix of quadric A=[1 .5 1;.5 2 3;1 3 3];
x=linspace(-20,20,100);
y=linspace(-20,20,100);
z=linspace(-20,20,100);
[X,Y,Z]=meshgrid(x,y,z);
Q=X.^2 +2*(Y.^2) +3*(Z.^2) +X.*Y +2*(X.*Z) +6*(Y.*Z);
isosurface(X,Y,Z,Q,1);
axis equal
xlabel('x')
ylabel('y')
zlabel('z')
title ('hyperboloid of one sheet')
hold on
%Plot the principal axes.
A=[1 .5 1;.5 2 3;1 3 3];
[U,R]=eig(A);
%cols of U hold evecs; diag of R holds evals;
%AU=UR; A=U*R*inv(U);
v1=U(:,1); r1=R(1,1); %1st evec and its eval
v2=U(:,2); r2=R(2,2); %etc.
v3=U(:,3); r3=R(3,3);
t=(-30:.1:30);
xx1=v1(1).*t; yy1=v1(2).*t; zz1=v1(3).*t; %parametric eq of axis1 (for v1), i.e., line through origin directed by v1
xx2=v2(1).*t; yy2=v2(2).*t; zz2=v2(3).*t; %etc
xx3=v3(1).*t; yy3=v3(2).*t; zz3=v3(3).*t; %used xx (rather than x) to prevent overwriting some other x that might be around, etc.
plot3(xx1,yy1,zz1,'b',xx2,yy2,zz2,'r',xx3,yy3,zz3,'g');
grid
box
% R =
%   -0.5915         0         0
%         0    0.8032         0
%         0         0    5.7884
--------
10. When A has repeated eigenvalues.
The goal is to obtain three orthonormal eigenvectors
v1,v2,v3 of A and use them as columns of V. They exist by
the theorem in Section (4). Once V is constructed, we
proceed exactly as in (8). This is almost the end of the
story. (But read (11)!)
Example:
A =
     1     1     1
     1     1     1
     1     1     1
>> [V,R]=eig(A)
V =
    0.4082    0.7071    0.5774
    0.4082   -0.7071    0.5774
   -0.8165         0    0.5774
R =
   -0.0000         0         0
         0         0         0
         0         0    3.0000
Here λ = 0 is a repeated eigenvalue with multiplicity = 2.
MatLab came up with eigenvectors v1, v2, v3, the columns of V.
Are they orthonormal? Section 5 explains how to check for this:
you compute V'*V. If you get the identity I, they are. If you
don't, they aren't. In this example,
V'*V
ans =
    1.0000    0.0000   -0.0000
    0.0000    1.0000    0.0000
   -0.0000    0.0000    1.0000
Hence it works. The principal axes of the quadric
q(x,y,z) = x^2+y^2+z^2+2xy+2xz+2yz = 1 associated with A are along
v1=V(:,1), v2=V(:,2), v3=V(:,3). You plot the axes exactly as you
did in (8).
The novelty is mathematical (and not a MatLab quirk):
namely, two eigenvectors v1 and v2 correspond to λ = 0. They are
not multiples of each other (look at the 0 in v2: obviously v2 is
not a multiple of v1 or vice versa). Hence all linear
combinations v = c1 v1 + c2 v2 will also be eigenvectors with
eigenvalue 0 (do you see why?). Hence the eigenvectors
corresponding to λ = 0 fill up a whole plane. Within that
plane, MatLab could have chosen a different pair of orthonormal
vectors. Why it chose these ones and not others is a bit of a
mystery. But who cares? Having the orthogonal matrix V is all
you need to proceed as in (8). Follow the steps and see what you
get for S. This example makes very clear the role of repeated
eigenvalues. Instead of the three unique principal axes you got
in (8), you have infinitely many. One is a line along v3, the
eigenvector for the non-repeated eigenvalue λ = 3. Here there
is no choice. But any two lines through the origin that are
orthogonal to each other and to v3 will do as principal axes. S
is perfectly rotationally symmetric around the axis of v3.
--------
11. When A has repeated eigenvalues and MatLab has a
stability quirk.
Occasionally, when there are repeated eigenvalues, (7.1)
furnishes a V whose columns are NOT orthogonal. This is rare,
especially if you use decimals: it is an unlikely event for A to
have repeated eigenvalues.
Example of what can happen and what to do about it. Construct
exactly as follows. DON'T USE DECIMALS!
>> B=[2,0,0;0,2,0;0,0,3];        %B is symmetric
>> w1=(1/sqrt(3))*[1;1;1];
>> w2=(1/sqrt(2))*[1;-1;0];
>> w3=(1/sqrt(6))*[1;1;-2];
>> T=[w1,w2,w3];                 % T'= inv(T). Check T*T'=I
>> A=T*B*T';                     %A defined. It's symmetric.
>> A
A =
    2.1667    0.1667   -0.3333
    0.1667    2.1667   -0.3333
   -0.3333   -0.3333    2.6667
>> [V,R]=eig(A)
V =
   -0.9129    0.4082    0.5435
    0.1826    0.4082    0.6100
   -0.3651   -0.8165    0.5767
R =                              %R has repeated eval 2
    2.0000         0         0
         0    3.0000         0
         0         0    2.0000
>> V'*V
ans =
    1.0000   -0.0000   -0.5953
   -0.0000    1.0000   -0.0000
   -0.5953   -0.0000    1.0000
% Hence v1 is not ⊥ v3. Notice that v1, v3 correspond to the
same eigenvalue λ = 2. On the other hand, v1 ⊥ v2 and v3 ⊥ v2. They
have no choice because they correspond to the different
eigenvalues 2 and 3.
This V is useless to carry out the program in (8).
What to do. Change the given matrix A in an imperceptible way,
keeping it symmetric. Let's add, e.g., 10^(-10) to A(1,1):
Anew=A;
Anew(1,1)=A(1,1)+10^(-10);
Here's what MatLab returns:
Anew =
    2.1667    0.1667   -0.3333
    0.1667    2.1667   -0.3333
   -0.3333   -0.3333    2.6667
>> [Vnew,Rnew]=eig(Anew)
Vnew =
   -0.9129    0.4082    0.0000
    0.1826    0.4082    0.8944
   -0.3651   -0.8165    0.4472
Rnew =
    2.0000         0         0
         0    3.0000         0
         0         0    2.0000
>> Vnew'*Vnew
ans =
    1.0000   -0.0000   -0.0000
   -0.0000    1.0000   -0.0000
   -0.0000   -0.0000    1.0000
% The vectors v1new, v2new, v3new are orthonormal and all is now
fine. We proceed as in (8). Notice: Anew looks like A, Rnew
looks like R, but Vnew looks very different from V in its third
column. Adding the little 10^(-10) has changed A into a symmetric
matrix Anew that is A for our intents and purposes. But Anew has
three different eigenvalues. Hence the corresponding eigenvectors
are orthogonal. Two eigenvalues are very, very close to each
other. Check it using format long. This method invites a lot
of thinking about instabilities and other matters. Some reflection
shows that it is quite legitimate. For those who do not like
what we did, a totally different method follows.
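For instance (assuming Anew from above is still in the workspace):
format long
eig(Anew)    % two of the three eigenvalues should agree to many decimal places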
--------
12. The orth command.
12.1. Orthonormalizing two vectors in R^3. Start with any two
vectors w1 and w2 in R^3. Think of them as column vectors. Let L be
the subspace of R^3 of all linear combinations of w1 and w2.
L can have dimension 2 (a plane), 1 (a line), or 0 (L is just the
origin).
Form the matrix W=[w1,w2]. It is 3 by 2. Then
V=orth(W)
produces a matrix V whose columns are an orthonormal basis of L.
Example 1
w1=[1;2;3], w2=[4;5;6], W=[w1,w2],
W =
     1     4
     2     5
     3     6
>> V=orth(W)
V =
   -0.4287    0.8060
   -0.5663    0.1124
   -0.7039   -0.5812
MatLab returned a V with two columns v1=V(:,1), v2=V(:,2), showing
that L has dimension 2 and providing an orthonormal basis v1, v2
of L. The command has "orthonormalized the vectors w1, w2".
Checking that v1, v2 are orthonormal:
V'*V
ans =
    1.0000         0
         0    1.0000
Example 2
w1=[1;2;3], w2=[-2;-4;-6], W=[w1,w2],
W =
     1    -2
     2    -4
     3    -6
>> V=orth(W)
V =
   -0.2673
   -0.5345
   -0.8018
V has only one column. Hence L has dimension 1: it's a line. The
column matrix V provides an orthonormal basis of L. This
only says that V has length = 1.
Check it: V'*V
ans =
1
12.2. Orthonormalizing k vectors in R^n.
The command orth works in R^n in full generality.
Example 3
>> w1=[1;2;3;4;5]; w2=w1.^2; w3=w1.^3; W=[w1,w2,w3];
>> W
W =
     1     1     1
     2     4     8
     3     9    27
     4    16    64
     5    25   125
>> V=orth(W)
V =
   -0.0084    0.2035    0.7219
   -0.0596    0.4878    0.4539
   -0.1936    0.6313   -0.1524
   -0.4503    0.4127   -0.4454
   -0.8696   -0.3897    0.2265
Since V has three columns, the dimension of L is 3. Recall that L
is the subspace in R^5 of all linear combinations of w1, w2, w3.
Furthermore, the columns of V provide an orthonormal basis of L.
Checking the orthonormality:
V'*V
ans =
    1.0000   -0.0000    0.0000
   -0.0000    1.0000    0.0000
    0.0000    0.0000    1.0000
12.3. Rephrasing what orth does in terms of matrices.
For any
arbitrary n by k matrix W, V=orth(W) is an n by r matrix. Its
columns are an orthonormal basis of the range of W and r is the
dimension of that range.
Reason: the range of W is precisely the subspace L in R^n
generated by all the linear combinations of the columns of W.
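In other words, a small illustration of the same statement:
W = [1 4; 2 5; 3 6];    % the W of Example 1
size(orth(W),2)         % number of columns of orth(W): 2
rank(W)                 % the dimension of the range of W: also 2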
--------
13. Using orth for quadrics with repeated eigenvalues.
Back to (11). [V,R]=eig(A) produced a useless V. Its first and
third columns v1, v3 are not orthogonal. But v1 and v3 are
eigenvectors of A corresponding to the eigenvalue λ = 2. Hence all
vectors in L, the subspace of R^3 generated by v1 and v3, are
eigenvectors for λ = 2.
Now
>> VV=orth([v1,v3])
VV =
   -0.8153    0.4106
   -0.2393   -0.8810
   -0.5273   -0.2352
shows that L has dimension 2 (it's a plane) and provides an
orthonormal basis vv1=VV(:,1), vv3=VV(:,2) of L; vv1 and vv3 are
still eigenvectors of A with eigenvalue λ = 2 because they lie in
L. Both are orthogonal to v2 (which has a different eigenvalue,
λ = 3).
Conclusion: vv1, v2, vv3 are orthonormal eigenvectors of A with
eigenvalues 2, 3, 2 respectively. You can use them to build a new
matrix Vnew that will diagonalize the quadric exactly as V did in
(8).
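Here is a minimal sketch of that assembly (the name Vnew is ours; it
assumes the A of section 11 is still in the workspace):
[V,R] = eig(A);                  % the V of section 11, with non-orthogonal v1, v3
v1 = V(:,1); v2 = V(:,2); v3 = V(:,3);
VV = orth([v1,v3]);              % orthonormal basis of the eigenspace for lambda = 2
Vnew = [VV(:,1), v2, VV(:,2)];
Vnew'*Vnew                       % should be (numerically) the identity
A*Vnew - Vnew*diag([2 3 2])      % should be (numerically) the zero matrix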
--------
14. Quadrics with additional linear terms.
No paraboloids or saddles appeared in what we did so far.
That's because they have equations like
x^2 + y^2 - z = 0
x^2 - y^2 - z = 0
These equations are not just quadratic. They have an additional
linear term (power = 1).
Let S be described by the equation
(14.1)    f(x) = x'*A*x + b*x = c
where c is a given number and b=[b1,b2,b3] is a given row vector.
Here is what you do to analyze and understand S. We have already
done most of the work. An example that may help our discussion
is f(x) = x'*A*x + x1 + 2 x2 + 3 x3 = 0 with
A = [ 1    0     0
      0   -1.5   1.5
      0    1.5  -1.5 ]
So, here b=[1,2,3] and c=0.
a) First focus on x'Ax. Find V and R and write x'Ax = y'Ry, with
V'x = y, x = Vy, exactly as before. Hence
f(x) = y'*R*y + b*V*y
Calling d = b*V we rewrite
f(x) = y'*R*y + d*y = c
The term y'*R*y is already diagonalized and we get for the
equation of S
(14.2)    λ1 y1^2 + λ2 y2^2 + λ3 y3^2 + d1 y1 + d2 y2 + d3 y3 = c
b) Now comes the only novelty: complete the squares whenever
you can. If, e.g., λ1 ≠ 0 you can associate the 1st and 4th terms.
Doing the high school algebra,
(14.3)    λ1 y1^2 + d1 y1 = λ1 (y1 - p1)^2 - λ1 p1^2   with   p1 = -d1/(2 λ1)
But if λ1 = 0 you cannot complete that square.
In our example, it turns out that λ1 = -3, λ2 = 0, λ3 = 1 and you can
hence do (14.3) for y1 and y3 but not for y2.
Changing variables once more, call Y1 = y1 - p1, Y3 = y3 - p3 (as in
(14.3)) and do nothing to y2. Now (14.2) looks like
λ1 Y1^2 + λ3 Y3^2 + d2 y2 = c + λ1 p1^2 + λ3 p3^2
Students should compute all these numbers, including d2, p1, p3;
d2 = 3.5355 is thus different from 0, and S is, …is what? Please
complete Exercise 3 in hwk 10. It clinches quite a bit of the
material we saw over the last few weeks.
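Here is a sketch of that computation (note: eig may order the
eigenvalues differently from the λ1 = -3, λ2 = 0, λ3 = 1 used above,
and the signs of d depend on the signs eig picks for the
eigenvectors):
A = [1 0 0; 0 -1.5 1.5; 0 1.5 -1.5];
b = [1 2 3]; c = 0;
[V,R] = eig(A);
d = b*V                       % the row vector d in (14.2)
lam = diag(R)';               % the eigenvalues, as a row
nz = find(abs(lam) > 1e-12);  % indices with a nonzero eigenvalue
p = -d(nz)./(2*lam(nz))       % the p's of (14.3) for those indices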
--------