Chapter 0 - Review of Linear Algebra

Objectives
• Basic definitions on matrices
• Matrix multiplications
• Addition and subtraction of matrices
• Computing the determinant of a matrix
• Scaling of a matrix
• Matrix transformation
• Transpose and Inverse of a matrix
• DOT and CROSS Products of vectors

What is a matrix?

A matrix is a two-dimensional array that stores elements (numbers, or symbols representing numbers) in m rows and n columns. A matrix is usually denoted by a letter such as A and is said to be m-by-n (m×n) in size. Here is an example of a 4-by-4 matrix:

\[ A = \begin{bmatrix} 16 & 2 & 3 & 13 \\ 5 & 11 & 10 & 8 \\ 9 & 7 & 6 & 12 \\ 4 & 14 & 15 & 1 \end{bmatrix} \]

This 4-by-4 matrix can be created in MATLAB using:

    A = [16 2 3 13; 5 11 10 8; 9 7 6 12; 4 14 15 1]

Note that indices start from 1 in MATLAB. Also, each row is separated by a ";" and each element within a row by a blank space. If you type the above on the MATLAB command line you will get:

    A =
        16     2     3    13
         5    11    10     8
         9     7     6    12
         4    14    15     1

Multiplying two matrices

Let B and C be m-by-n and q-by-p matrices respectively. Here is what we can say about the product of these two:

• B * C is possible iff n = q
• C * B is possible iff p = m

Thus, the product of two matrices is possible when the number of columns of the left matrix is the same as the number of rows of the right matrix.

How do we multiply two matrices? We show this in an example:

\[ B * C = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix} * \begin{bmatrix} 7 & 10 & 13 \\ 8 & 11 & 14 \\ 9 & 12 & 15 \end{bmatrix} \]

\[ = \begin{bmatrix} 1*7 + 3*8 + 5*9 & 1*10 + 3*11 + 5*12 & 1*13 + 3*14 + 5*15 \\ 2*7 + 4*8 + 6*9 & 2*10 + 4*11 + 6*12 & 2*13 + 4*14 + 6*15 \end{bmatrix} = \begin{bmatrix} 76 & 103 & 130 \\ 100 & 136 & 172 \end{bmatrix} \]

Going through this example, you can see how each row of the first matrix is multiplied by each column of the second one, element by element, and the products are summed. Note that we couldn't multiply C by B. Why? (Hint: compare the number of columns of C with the number of rows of B.)

Adding or Subtracting two matrices

To add or subtract two matrices, they must be of exactly the same type and size.
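The size rules above can be checked directly in MATLAB. The following is a minimal sketch that reuses B and C from the multiplication example; the variable names canBC, canCB and canAdd are illustrative choices, not part of the original slides.

```matlab
% Matrices from the multiplication example.
B = [1 3 5; 2 4 6];               % 2-by-3
C = [7 10 13; 8 11 14; 9 12 15];  % 3-by-3

% B*C is defined because the columns of B match the rows of C.
canBC = size(B, 2) == size(C, 1);

% C*B is not defined: C has 3 columns but B has only 2 rows.
canCB = size(C, 2) == size(B, 1);

% Addition and subtraction need identical sizes, which B and C do not have.
canAdd = isequal(size(B), size(C));

% The product itself: each entry is a row of B times a column of C.
P = B * C;
disp(P)
```

Here canBC is true while canCB and canAdd are false, and P reproduces the 2-by-3 result worked out by hand.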
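Example (Addition)

\[ B + C = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} + \begin{bmatrix} 10 & 13 & 16 \\ 11 & 14 & 17 \\ 12 & 15 & 18 \end{bmatrix} = \begin{bmatrix} 1+10 & 4+13 & 7+16 \\ 2+11 & 5+14 & 8+17 \\ 3+12 & 6+15 & 9+18 \end{bmatrix} = \begin{bmatrix} 11 & 17 & 23 \\ 13 & 19 & 25 \\ 15 & 21 & 27 \end{bmatrix} \]

Example (Subtraction)

\[ B - C = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} - \begin{bmatrix} 10 & 13 & 16 \\ 11 & 14 & 17 \\ 12 & 15 & 18 \end{bmatrix} = \begin{bmatrix} 1-10 & 4-13 & 7-16 \\ 2-11 & 5-14 & 8-17 \\ 3-12 & 6-15 & 9-18 \end{bmatrix} = \begin{bmatrix} -9 & -9 & -9 \\ -9 & -9 & -9 \\ -9 & -9 & -9 \end{bmatrix} \]

In this case, it is possible to do B+C and C+B and they both produce the same result. B-C and C-B are also both possible, but they do not produce the same result.

Multiplying a matrix by a constant or Identity Matrix

When you multiply a matrix by a constant, all elements of the matrix are multiplied by that constant. This is referred to as scaling. Example (const = 4):

\[ \text{Const} * A = 4 * \begin{bmatrix} 16 & 5 & 9 & 4 \\ 2 & 11 & 7 & 14 \\ 3 & 10 & 6 & 15 \\ 13 & 8 & 12 & 1 \end{bmatrix} = \begin{bmatrix} 64 & 20 & 36 & 16 \\ 8 & 44 & 28 & 56 \\ 12 & 40 & 24 & 60 \\ 52 & 32 & 48 & 4 \end{bmatrix} \]

The identity matrix is a square matrix (the number of rows and columns are the same) in which the diagonal values are all 1 and the rest of the elements are 0. When multiplied by another matrix of the same size, the identity matrix produces the original matrix:

\[ I * A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} * \begin{bmatrix} 16 & 5 & 9 & 4 \\ 2 & 11 & 7 & 14 \\ 3 & 10 & 6 & 15 \\ 13 & 8 & 12 & 1 \end{bmatrix} = \begin{bmatrix} 16 & 5 & 9 & 4 \\ 2 & 11 & 7 & 14 \\ 3 & 10 & 6 & 15 \\ 13 & 8 & 12 & 1 \end{bmatrix} \]

That is, Identity * Matrix A = Matrix A.

Computing the determinant of a matrix

Every square matrix, of any size, has an associated number called the determinant of the matrix. This number is computed through a step-by-step process. Let's try a 2-by-2 matrix first:

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad \text{the determinant of this matrix is: } |A| = a*d - b*c \]

This is a bit more complicated when the matrix is of larger size. Let's try a 3-by-3 matrix:

\[ A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}, \quad \text{the determinant must be computed in two steps:} \]

\[ |A| = a * \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b * \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c * \begin{vmatrix} d & e \\ g & h \end{vmatrix} = a(e*i - f*h) - b(d*i - f*g) + c(d*h - e*g) \]

Similarly, for a 4-by-4 matrix, you will go through three steps:
1) Take each element of the first row as a header, with alternating signs, together with the corresponding 3-by-3 matrix underneath it (the matrix left after removing that element's row and column),
2) Process the 3-by-3 matrices by repeating the steps in the above example,
3) Compute the determinants of the 2-by-2 matrices and unfold them to find the final result.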
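The first-row expansion for a 3-by-3 determinant can be sketched in MATLAB as follows. The matrix values and the helper name det2 are hypothetical, chosen only for illustration; the built-in det function is used as a cross-check.

```matlab
% Hypothetical 3-by-3 matrix, chosen only for illustration.
A = [2 1 3; 0 4 1; 5 2 6];

% Determinant of a 2-by-2 matrix: a*d - b*c.
det2 = @(M) M(1,1)*M(2,2) - M(1,2)*M(2,1);

% Expansion along the first row with alternating signs (+, -, +).
% Each 2-by-2 minor drops row 1 and the column of the header element.
detA = A(1,1) * det2(A(2:3, [2 3])) ...
     - A(1,2) * det2(A(2:3, [1 3])) ...
     + A(1,3) * det2(A(2:3, [1 2]));

% Compare with MATLAB's built-in determinant.
fprintf('first-row expansion: %g, built-in det: %g\n', detA, det(A));
```

For this matrix the three minors evaluate to 22, -5 and -20, so the expansion gives 2*22 - 1*(-5) + 3*(-20) = -11.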
Transpose and Inverse of a Matrix

The transpose of a matrix is the same matrix with rows and columns switched. The transpose of A is:

\[ A = \begin{bmatrix} 16 & 2 & 3 & 13 \\ 5 & 11 & 10 & 8 \\ 9 & 7 & 6 & 12 \\ 4 & 14 & 15 & 1 \end{bmatrix}, \quad A^T = \begin{bmatrix} 16 & 5 & 9 & 4 \\ 2 & 11 & 7 & 14 \\ 3 & 10 & 6 & 15 \\ 13 & 8 & 12 & 1 \end{bmatrix} \]

The inverse of a matrix, \(A^{-1}\), is the matrix that produces the unit (identity) matrix I when it is multiplied by the matrix itself: \(I = A^{-1} * A\). There are several ways to compute the inverse of a matrix. We will introduce the most common one here. First, we need to find out whether a matrix has an inverse (is invertible). If a matrix is not invertible, it is singular. An n-by-n matrix A is invertible if there exists an n-by-n matrix C such that AC = CA = I, where I is the identity matrix.

Inverse of a Matrix

How do we find the inverse of a matrix? See the example below. We wish to compute \(A^{-1}\) for:

\[ A = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}, \quad [A \mid I] = \left[\begin{array}{cc|cc} 2 & 4 & 1 & 0 \\ 6 & 8 & 0 & 1 \end{array}\right] \]

First place the identity matrix of the same size on the right-hand side of the original matrix. Then, work through row operations to move the identity matrix to the left-hand side. Once that is accomplished, the matrix on the right-hand side will be the inverse.

Step 1: we need to get rid of the 4 in the first row. Thus, we will use row1 = row1 - (0.5) row2:

\[ \left[\begin{array}{cc|cc} 2 - 0.5*6 & 4 - 0.5*8 & 1 - 0.5*0 & 0 - 0.5*1 \\ 6 & 8 & 0 & 1 \end{array}\right] = \left[\begin{array}{cc|cc} -1 & 0 & 1 & -0.5 \\ 6 & 8 & 0 & 1 \end{array}\right] \]

Step 2: we need to get rid of the 6 in the second row. Thus, row2 = row2 + 6 row1:

\[ \left[\begin{array}{cc|cc} -1 & 0 & 1 & -0.5 \\ 6 + 6(-1) & 8 + 6(0) & 0 + 6(1) & 1 + 6(-0.5) \end{array}\right] = \left[\begin{array}{cc|cc} -1 & 0 & 1 & -0.5 \\ 0 & 8 & 6 & -2 \end{array}\right] \]

Step 3: The matrix on the left is almost ready. The next thing we need to do is to scale each row by a number to produce the identity matrix on the left: row1 = row1 * (-1), row2 = row2 / 8:

\[ \left[\begin{array}{cc|cc} -1*(-1) & 0*(-1) & 1*(-1) & -0.5*(-1) \\ 0/8 & 8/8 & 6/8 & -2/8 \end{array}\right] = \left[\begin{array}{cc|cc} 1 & 0 & -1 & 0.5 \\ 0 & 1 & 0.75 & -0.25 \end{array}\right] \]

As you notice, the identity matrix has moved to the left-hand side, and thus the right-hand-side matrix is the inverse:

\[ A^{-1} = \begin{bmatrix} -1 & 0.5 \\ 0.75 & -0.25 \end{bmatrix} \]
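The three row-operation steps can be replayed in MATLAB on the augmented matrix. This is a minimal sketch of the hand calculation, not a general-purpose routine; the names M and Ainv are illustrative, and the built-in inv function is used only as a cross-check.

```matlab
A = [2 4; 6 8];
M = [A eye(2)];                    % augmented matrix [A | I]

% Step 1: row1 = row1 - 0.5*row2 removes the 4 in the first row.
M(1,:) = M(1,:) - 0.5 * M(2,:);

% Step 2: row2 = row2 + 6*row1 removes the 6 in the second row.
M(2,:) = M(2,:) + 6 * M(1,:);

% Step 3: scale each row so the left half becomes the identity.
M(1,:) = M(1,:) * (-1);
M(2,:) = M(2,:) / 8;

% The right half of the augmented matrix is now the inverse.
Ainv = M(:, 3:4);
disp(Ainv)

% Cross-checks: A*Ainv should be the identity, and Ainv should match inv(A).
disp(A * Ainv)
disp(max(max(abs(Ainv - inv(A)))))
```

The fixed sequence of steps only works for this particular A. A general routine would also have to choose pivots and detect singular matrices.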
DOT Product

The DOT product of

\[ v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \quad \text{and} \quad u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} \]

is defined as:

\[ u \cdot v = v_1 u_1 + v_2 u_2 + v_3 u_3 \]

The DOT product of two vectors is a scalar and can also be defined as:

\[ u \cdot v = |u| \, |v| \cos(\theta) \]

where \(\theta\) is the angle between the two vectors and \(|u|\) and \(|v|\) denote the magnitudes of the vectors u and v respectively. The DOT product of two orthogonal (perpendicular) vectors is 0.

Example: v = 2i + 3j - 4k and u = 2i - 3j + 2k

\[ u \cdot v = (2)(2) + (3)(-3) + (-4)(2) = -13 \]

What is the angle between these two vectors? First we need to compute the magnitude (length) of each vector:

\[ |v| = \sqrt{2^2 + 3^2 + (-4)^2} = \sqrt{29} = 5.3852 \quad \text{and} \quad |u| = \sqrt{2^2 + (-3)^2 + 2^2} = \sqrt{17} = 4.1231 \]

\[ \cos(\theta) = \frac{u \cdot v}{|u| \, |v|} = \frac{-13}{(4.1231)(5.3852)} = -0.5855, \quad \text{which results in } \theta = \arccos(-0.5855) = 125.84^\circ \]

CROSS Product

The CROSS product of

\[ v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \quad \text{and} \quad u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} \]

is defined as:

\[ v \times u = \begin{vmatrix} i & j & k \\ v_1 & v_2 & v_3 \\ u_1 & u_2 & u_3 \end{vmatrix} = i(v_2 u_3 - v_3 u_2) - j(v_1 u_3 - v_3 u_1) + k(v_1 u_2 - v_2 u_1) \]

The magnitude of the CROSS product can be defined as:

\[ |u \times v| = |u| \, |v| \sin(\theta) \]

where \(\theta\) is the angle between the two vectors and \(|u|\) and \(|v|\) denote the magnitudes of the vectors u and v respectively.

Example: v = 2i + 3j - 4k and u = 2i - 3j + 2k

\[ v \times u = \begin{vmatrix} i & j & k \\ 2 & 3 & -4 \\ 2 & -3 & 2 \end{vmatrix} = i((3)(2) - (-4)(-3)) - j((2)(2) - (-4)(2)) + k((2)(-3) - (3)(2)) = -6i - 12j - 12k \]

Note: The result is a vector.

Chapter 0 - Review of Statistics

Objectives
• Computing the mean (average)
• Finding the median
• Computing the variance
• Computing standard deviation
• Probability and random numbers
• Probability distribution function
• Cumulative distribution function

Computing mean and median

The mean of a set of values is the weighted average of the possible values in that set. In its basic form it is the sum of all values divided by the number of values, N. It can also be written as a frequency-weighted sum, where \(f_j\) denotes the number of occurrences of the j-th distinct value observed in the set and m is the number of distinct values:

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{\sum_{j=1}^{m} f_j * x_j}{N} \]

The median is the point in the sorted sequence where the set is divided into two equal parts, each containing ½ of the values.
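Both forms of the mean, and the median, can be sketched in MATLAB as follows. The data set x is hypothetical and chosen only for illustration; the names muDirect, muWeighted, vals, f and med are likewise illustrative. The even-length branch uses the usual convention of averaging the two middle values, which is an assumption not spelled out in the text above.

```matlab
% Hypothetical data set, chosen only for illustration.
x = [5 3 8 3 9 3 8];
N = numel(x);

% Mean as the plain sum of all values divided by N.
muDirect = sum(x) / N;

% Mean as a frequency-weighted sum over the distinct values.
vals = unique(x);                      % sorted distinct values
f = arrayfun(@(v) sum(x == v), vals);  % number of occurrences of each value
muWeighted = sum(f .* vals) / N;

% Median: middle element of the sorted data (N odd), or the average
% of the two middle elements (N even, assumed convention).
xs = sort(x);
if mod(N, 2) == 1
    med = xs((N + 1) / 2);
else
    med = (xs(N / 2) + xs(N / 2 + 1)) / 2;
end

fprintf('mean = %.4f (direct), %.4f (weighted), median = %g\n', ...
        muDirect, muWeighted, med);
```

The two means agree because the frequency-weighted sum only regroups the same terms. MATLAB's built-in mean and median functions return the same values.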
Example: X = {2, 12, 4, 12, 7, 3, 5, 12, 4, 8, 2}

Sorted X = {2, 2, 3, 4, 4, 5, 7, 8, 12, 12, 12}

The number in the middle of this sorted set is 5; that is the median. What if we had an even number of values? (In that case the median is usually taken as the average of the two middle values.)

Values    f (frequency)
2         2
3         1
4         2
5         1
7         1
8         1
12        3

What is the average?

\[ \mu = \frac{\sum_{i=1}^{11} x_i}{11} = \frac{2 + 12 + 4 + 12 + 7 + 3 + 5 + 12 + 4 + 8 + 2}{11} = 6.454545 \]

or

\[ \mu = \frac{\sum_{j=1}^{7} f_j * x_j}{11} = \frac{2*2 + 1*3 + 2*4 + 1*5 + 1*7 + 1*8 + 3*12}{11} = 6.454545 \]

Measure of Variation

The mean and median do not describe the amount of dispersion or variation among the observed values. See the examples below:

[Figure: three bar charts, each plotted over the categories 1 to 5]

All three have the same median and mean.

The first measure of variation is the range. Once again let's consider X = {2, 12, 4, 12, 7, 3, 5, 12, 4, 8, 2}. The range is the difference between the largest and the smallest values. Range = 12 - 2 = 10.

The second measure of variation is the mean deviation, which is defined as:

\[ \text{Mean deviation} = \frac{\sum_{i=1}^{n} |x_i - \mu|}{n} \]

where n is the number of values. For X this gives:

\[ \frac{2|2 - 6.4545| + 3|12 - 6.4545| + 2|4 - 6.4545| + |3 - 6.4545| + |5 - 6.4545| + |7 - 6.4545| + |8 - 6.4545|}{11} \]

\[ = \frac{2(4.4545) + 3(5.5455) + 2(2.4545) + 3.4545 + 1.4545 + 0.5455 + 1.5455}{11} = 3.405 \]

Measure of Variation

The most commonly used measure of variation is the sample variance, referred to simply as the variance:

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1} \]

For X = {2, 12, 4, 12, 7, 3, 5, 12, 4, 8, 2}, this will be:

\[ s^2 = \frac{2(2 - 6.4545)^2 + 3(12 - 6.4545)^2 + 2(4 - 6.4545)^2 + (3 - 6.4545)^2 + (5 - 6.4545)^2 + (7 - 6.4545)^2 + (8 - 6.4545)^2}{10} \]

\[ = \frac{2(4.4545)^2 + 3(5.5455)^2 + 2(2.4545)^2 + (3.4545)^2 + (1.4545)^2 + (0.5455)^2 + (1.5455)^2}{10} = 16.0727 \]

The standard deviation is an important measure of variation, usually referred to as the error, and is defined as:

\[ s = \sqrt{s^2} = \sqrt{16.0727} \approx 4.0091 \]

MATLAB Examples
• Let the columns represent heart rate, weight and hours of exercise per week

Random numbers and probability

The probability of an event, \(x_i\), is the chance of observing that event when a large number of observations have been made.
This is defined as:

\[ P(x_i) = \frac{N_{x_i}}{N} \]

where \(N_{x_i}\) represents the number of times that particular event has been observed and N represents the total number of observations. Of course, the sum of all probabilities must be 1, i.e., if you try all the possibilities, then you must observe all events. For example, in the set X = {2, 12, 4, 12, 7, 3, 5, 12, 4, 8, 2}:

\[ P(2) = P(4) = \frac{N_2}{N} = \frac{N_4}{N} = \frac{2}{11}, \quad P(12) = \frac{N_{12}}{N} = \frac{3}{11}, \quad P(3) = P(5) = P(7) = P(8) = \frac{1}{11} \]

Note: The sum of all these probabilities is 1:

\[ \frac{2}{11} + \frac{2}{11} + \frac{3}{11} + \frac{1}{11} + \frac{1}{11} + \frac{1}{11} + \frac{1}{11} = 1 \]

A random set is a set in which the probability of a value appearing is the same as that of all other values. Since we cannot create perfectly random values, we use pseudo-random generators to produce a set of values.

Probability Distribution and Cumulative Distribution

In the previous example we had:

\[ P(2) = P(4) = \frac{2}{11}, \quad P(12) = \frac{3}{11}, \quad P(3) = P(5) = P(7) = P(8) = \frac{1}{11} \]

[Figure: two bar charts over the values 1 to 12, one labeled PDF and one labeled CDF]

PDF: This is simply a plot of the probability of each value as it is observed. Using this type of distribution we can easily tell which of the values was seen most often.

CDF: This is the cumulative probability as we move to each next value. The final result is always 1; that is where all possible values have been observed.
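The probabilities and the two distributions for the example set can be computed with a few MATLAB lines. This is a minimal sketch; the names vals, counts, P and cdfP are illustrative and not from the original slides.

```matlab
X = [2 12 4 12 7 3 5 12 4 8 2];
N = numel(X);

% Distinct values and how many times each one was observed.
vals = unique(X);
counts = arrayfun(@(v) sum(X == v), vals);

% Probability of each value: occurrences divided by total observations.
P = counts / N;

% Cumulative distribution: running sum of the probabilities.
cdfP = cumsum(P);

% One row per distinct value: value, P(value), cumulative probability.
disp([vals' P' cdfP'])
```

The last entry of cdfP is 1 up to rounding, matching the statement that the probabilities sum to 1. To reproduce plots like the ones described above, bar(vals, P) and bar(vals, cdfP) could be used.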