Transcript
CBrayMath216-1-3-b.mp4 So before we go on with inverse matrices, we're going to need to develop a tool, a powerful tool. The idea is that of an elementary matrix. So I'm going to start with some definitions. Now let me also note-- I'm going to make the same definition of an elementary matrix that the book does. But I am going to make an additional definition that the book does not make at all. This additional definition of a different item is a very convenient idea for proving results about elementary matrices. And, well, anyway, we'll see that momentarily. So first, the definition: an elementary matrix is the result of applying a row operation to the identity matrix. So, in particular, you start with the identity matrix and apply one row operation. Critically, you start with the identity matrix only, and you do a single row operation only-- not a single step in which you do multiple operations-- just a single, strictly defined row operation. The result of that is called an elementary matrix. It turns out that these elementary matrices are really powerful. We'll see that develop over the next little while. OK, so here's the additional definition that I want to make. This is a related idea, and that is the idea of a row operation matrix-- I remind you, this is not in the book, but it's a very useful idea, and I encourage you all to make use of it. So a row operation matrix is a matrix that executes a row operation by left multiplication. So, in particular, if you have some matrix A and you apply a row operation, you get some result. Well, that row operation, we will see momentarily, could have been executed simply by left multiplying that matrix A by some matrix F. And that matrix F is called a row operation matrix. Now a very reasonable complaint that a student might make at a time like this is: how do you know that such a thing exists? We can't just say, oh yeah, here's the thing that does this powerful thing-- maybe there is no such matrix.
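The definition above is easy to try out numerically. Here is a minimal NumPy sketch of building an elementary matrix; the 3x3 size and the particular row operation (multiply row 2 by 3) are illustrative choices, not taken from the lecture's board:

```python
import numpy as np

# An elementary matrix is the result of applying a SINGLE row
# operation to the identity matrix. Illustrative choice of
# operation: R2 -> 3*R2.
E = np.eye(3)
E[1] = 3 * E[1]

print(E)
# [[1. 0. 0.]
#  [0. 3. 0.]
#  [0. 0. 1.]]
```

Any of the three operation types (swap two rows, scale a row, add a multiple of one row to another) applied once to the identity produces an elementary matrix in the same way.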
How do I know that left multiplication can always be done in such a way as to execute a row operation? It's really not immediately obvious. So I want to give you some examples. Here is a matrix A. And let me first execute a row operation. Now, for the purposes of the row operation, let's scratch this off. Let's just take that matrix, and let's do this row operation to it. And now you'll notice that there are some given rules. This first row-- the formula for it is: take the first row. So this goes straight over to that. Now we have this rule here. That's a formula for this row. And it tells me that I am to take the second row of the previous matrix and then add 5 times the first row of the previous matrix. And that's the formula for how we compute that second row. OK, et cetera. All right, so that's just the row operation. Now I've got that row operation; let's just look at the resulting matrix. The assertion I've made is that that row operation could be executed, in effect, by left multiplication by some matrix-- in this case, this matrix. So let's see how that works. Well, there are different points of view that you can take-- I'll remind you of the different points of view on matrix multiplication. The one I'm going to use here is that I can view the product by rows: its rows are linear combinations of the rows of the right matrix, where you use the corresponding row of the left matrix for the coefficients in that linear combination. So with this idea in mind, this is a linear combination of these three rows-- well, of course, that's just 1 times the first row, because it's 0 times the second and 0 times the third. And so I get the first row. I get exactly what the row operation had me get in that position. OK, what does this tell me? Well, that says to take 5 times that row plus 1 times that row plus 0 times that row.
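The board isn't visible in the transcript, so here is a NumPy sketch of the computation being described: the row operation R2 -> R2 + 5*R1 done directly, and then the same operation done by left multiplication. The entries of A are illustrative; only the operation (add 5 times row 1 to row 2) comes from the lecture.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

# The row operation R2 -> R2 + 5*R1, done directly.
B = A.copy()
B[1] = B[1] + 5 * B[0]

# The same operation as left multiplication by a row operation matrix F.
# Each row of F holds the coefficients of the linear combination of the
# rows of A that builds the corresponding row of the result:
#   new row 1 = 1*R1, new row 2 = 5*R1 + 1*R2, new row 3 = 1*R3.
F = np.array([[1., 0., 0.],
              [5., 1., 0.],
              [0., 0., 1.]])

assert np.allclose(F @ A, B)  # left multiplication executes the row op
```

Note that `F` is exactly what you get by doing this same row operation to the 3x3 identity matrix, which foreshadows the theorem below.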
Well gosh, look, that is in fact exactly the instruction for this row from the row operation. And you see here 5 times the first row plus 1 times the second row, and then plus 0 times the third row. There is no third row in that part of the instruction; that is, we're taking zero times it. So you can see this is always going to work. In fact, for any row operation, and for any row, the instruction creating that row as part of the row operation is always a linear combination of the rows of the original matrix. And a linear combination of those rows can always be represented by coefficients in the corresponding row of what we're creating as a new matrix. So in particular, for this one, I take that and interpret it as a linear combination of rows 1, 2, and 3: these are those coefficients. And if I view this as a linear combination of rows 1, 2, and 3, these are those coefficients. And, lastly, if I take this as a linear combination of rows 1, 2, and 3, these are those coefficients. So, for every row operation, there is a corresponding matrix that will execute that row operation. And again, you can generate that matrix just by looking, one row at a time, at the instructions in the row operation, and that will tell you, one row at a time, how to compute your row operation matrix. Let's see another quick example. Here's a row operation-- in particular, a row operation where I just switch rows 1 and 3. You can see that row 1 goes to the third position; row 3 goes to the first position. Well, how do you express "row 3 only" as a linear combination of rows? Well, with these coefficients: 1 times row 3, none of row 1, none of row 2-- just 1 times row 3. Likewise, how do you express row 2 as a linear combination of rows 1, 2, and 3? And how do you express row 1 as a linear combination of rows 1, 2, and 3? Well, it's 1 times row 1 plus no row 2 plus no row 3. All right. So these row operation matrices always exist.
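The second example, swapping rows 1 and 3, can be sketched the same way; the entries of A are again illustrative:

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])

# Row operation: swap rows 1 and 3. Row by row, the coefficients say:
#   new row 1 = 1*R3, new row 2 = 1*R2, new row 3 = 1*R1.
F = np.array([[0., 0., 1.],
              [0., 1., 0.],
              [1., 0., 0.]])

S = F @ A
print(S)
# [[5. 6.]
#  [3. 4.]
#  [1. 2.]]
```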
Row operations make new rows as linear combinations of the rows of A. Left multiplying makes the rows of the product as linear combinations of the rows of A. Said differently, row operations and left multiplying are doing pretty much the same thing. That's the conclusion. Here is a sort of summary. Previously, we have thought of a row operation as sort of a process, an action-- it's an algorithm, however you want to call it, that you do to one matrix, and that then results in another matrix. And what we have now is that we don't have to think of this as a process or an action, or some vaguely defined thing. That row operation is entirely reflected in some matrix, and I can view it simply as an algebraic process: I can view that row operation as left multiplication by some matrix F. Now again, what are the benefits? Why is this useful? We're going to see several powerful uses in the next few moments. At this point, what's our instinct that this might be useful? Well, this whole process thing-- gosh, it's a way of thinking about it, but I don't have any way of really working with it. Matrix multiplication is algebra. I can do algebra with these kinds of equations. And the fact that this algebra has properties that we know are always true gives us the potential to make additional, more powerful interpretations of these row operations. And that's exactly what's going to happen. So here's the big theorem. This theorem is in the book. The proof that I'm going to show you here is different from the proof that's in the book by quite a bit. I think my proof is a lot easier. So here it is. Suppose you have an elementary matrix E that's obtained by a certain row operation acting on the identity. And let's remember that this is what an elementary matrix is. An elementary matrix is what you get when you take the identity and you do a certain row operation.
Whatever comes out of that single row operation applied to just the identity matrix, that's called an elementary matrix. So this E here is an elementary matrix. OK. That being said, if you were to do the same row operation to some other matrix A-- so let's draw a little diagram of that here: take some other matrix A, any matrix A of the appropriate shape, and apply the exact same row operation-- the assertion of the theorem is that the result of doing that same row operation would be EA. Which is a very interesting assertion. This is saying that a matrix multiplication-- in particular, multiplication by a matrix that is itself the result of a row operation-- executes that row operation. All right, so here's a really nice way to see this result. Let's take what we're given here, our definition of E, and let's write that in terms of row operation matrices. Keep in mind that execution of the row operation can be thought of as left multiplication by a row operation matrix. So, well, you start with the identity matrix. We do a row operation to it. Row operations, you'll note here, I'm representing as left multiplication by that row operation matrix F. And the result is this matrix E. All right, so what we have here, then, this little bit of algebra, E = FI, is just an algebraic way of representing what we are given: that E is obtained by a row operation on the identity matrix. Well, if that's what we're given, notice that means E equals F. This is immediate from the algebra. In particular, this is the identity matrix-- I can scratch it right off. So E is equal to F. And this is most of what we needed. Now, in order to get to our conclusion, I'm just going to multiply both sides of the equation by A. See my original equation here, E is equal to F: I'm just multiplying by A on the right on both sides, giving EA = FA.
And now, let's interpret what this says. What this says is that if you take a matrix A and apply this same row operation to it-- the same row operation because, you'll see, I'm left multiplying by the exact same row operation matrix that generated E in the first place-- so you take the matrix A, apply the same row operation, and, sure enough, the result is EA, as required. And that proves our result. So it's interesting how easily this fell out. And, at a glance, the theorem looks like it's almost hocus pocus: the matrix that results from a row operation effectively executes the very row operation that generated it. Seems like a big statement. Really, a different point of view on this theorem is what we've got right here: the statement that elementary matrices are row operation matrices, and row operation matrices are elementary matrices. So with this idea in mind-- with the understanding that elementary matrices and row operation matrices, even though they are defined differently, are ultimately the same thing-- the theorem is actually not so surprising at all, right? If E is obtained by a row operation, then E is a row operation matrix. And it all comes out of the fact that that I just cancels. OK, so with this in mind, keep in mind that E and F are the same: every elementary matrix is a row operation matrix and vice versa. The question now comes up: do we really need these two different terms? Do I need the term elementary matrix and the distinct term row operation matrix? Going forward-- certainly for work that we'll do after this section-- no, we don't need two separate terms. We see that they are exactly the same thing.
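The theorem itself is easy to check numerically. Here is a small NumPy sketch: a helper that applies one fixed row operation, used first on the identity to produce E, and then we verify that EA equals the operation applied directly to A. The particular operation (R3 -> R3 - 2*R1) and the random matrix A are illustrative choices, not from the lecture:

```python
import numpy as np

def row_op(M):
    """Apply the row operation R3 -> R3 - 2*R1 to a copy of M."""
    M = M.copy()
    M[2] = M[2] - 2 * M[0]
    return M

# The elementary matrix: the row operation applied to the identity.
E = row_op(np.eye(3))

# Any matrix A of compatible shape.
rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)

# The theorem: left multiplying by E executes that same row operation.
assert np.allclose(E @ A, row_op(A))
```

Swapping in any other single row operation for `row_op` gives the same equality, which is the content of the theorem.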
And certainly, the common terminology is to use only the one term, elementary matrices, and it's just understood, as a result of what we've just proved, that elementary matrices have this feature that they execute row operations. So why use a separate term? There's no need for it. On the other hand, we do need this term while we are proving this fact about elementary matrices, because, as definitions, they are different. Look back at the definitions of these two items, and you see an elementary matrix is defined as the result of applying a row operation, whereas a row operation matrix executes a row operation. They are defined in very different ways. And we needed these different definitions because we needed to make this interpretation of a row operation-- I needed to have something that would do what row operation matrices do, so that I could conclude that, in fact, it's the same as an elementary matrix. So, in some sense, we had to go to some trouble: we needed the definition in order to prove that we didn't, ultimately, need the definition. All right, so going forward we will use the term elementary matrices. But when we're proving things about elementary matrices, and about how elementary matrices multiply, in those situations we might have need of row operation matrices.