2-6 Measures of Spread

Remember from yesterday: the mean, median, and mode are valuable to us in a few ways.
1. They help us get a sense of the typical value a set of data will take.
2. They allow us to reduce a great number of values down to a single value.
3. Measures of central tendency are easy to calculate.

There are limitations, however, that we have to deal with.

Find the mean of the following two sets of data:

Set 1: 1, 2, 3, 4, 5
$\bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$

Set 2: -44, -7, 15, 22, 29
$\bar{x} = \frac{-44 - 7 + 15 + 22 + 29}{5} = \frac{15}{5} = 3$

The means are the same for these 2 very different sets of data... hmmm.

Weather

As a simple example, consider average temperatures for cities. Two cities may each have an average temperature of 15 °C, but the range for cities near the coast is smaller than for cities inland. So while the averages are similar, the chance of variation is greater inland than near the coast. An average of 15 °C occurs for one city with highs of 25 °C and lows of 5 °C, and also for another city with highs of 18 °C and lows of 12 °C. The standard deviation allows us to recognize that the average for the city with the wider variation, and thus the higher standard deviation, will not offer as reliable a prediction of temperature as the average for the city with the smaller variation and lower standard deviation.

While the mean is useful for determining the "middle" of a set of data, we need a way to distinguish between sets of data such as these:

1, 2, 3, 4, 5
-44, -7, 15, 22, 29

A measure that allows us to examine the difference between these data sets is called the spread. A common measure of the spread of a set of data is the standard deviation. We need one preliminary definition to understand standard deviation.

Variance: a measure of dispersion found by averaging the squares of the deviations of each piece of data from the mean.

Standard Deviation: a measure of dispersion found by taking the square root of the variance.
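As a quick check of the comparison above, here is a minimal Python sketch (not part of the original lesson) that computes the mean and the range of each set. The helper name `value_range` is made up for illustration; the range, the largest value minus the smallest, is the simplest measure of spread and is the quantity the weather example refers to.

```python
from statistics import mean

# The two data sets from the lesson.
set_1 = [1, 2, 3, 4, 5]
set_2 = [-44, -7, 15, 22, 29]


def value_range(data):
    """Return the largest value minus the smallest value (hypothetical helper)."""
    return max(data) - min(data)


# Both sets share the same mean, but their ranges differ widely.
for name, data in (("Set 1", set_1), ("Set 2", set_2)):
    print(f"{name}: mean = {mean(data)}, range = {value_range(data)}")
```

Both sets report a mean of 3, while the ranges are 4 and 73, which is the difference the mean alone cannot show.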
The square root brings the scale of the measure back down to the scale of the raw data.

(Do not copy this page.)

Population Standard Deviation:
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$

Sample Standard Deviation:
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

For our standard deviation calculations, we will use the sample standard deviation version.

Calculate the SD for the data sets: 1, 2, 3, 4, 5 and -44, -7, 15, 22, 29

Set 1: 1, 2, 3, 4, 5 (mean = 3)

Data (x)    (x - x̄)    (x - x̄)²
   1          -2           4
   2          -1           1
   3           0           0
   4           1           1
   5           2           4

The first column holds the data values. For the second column, subtract the mean from each value. For the third column, square each result.

Add up all the squared values:
$\sum (x - \bar{x})^2 = 4 + 1 + 0 + 1 + 4 = 10$

$\frac{10}{n - 1} = \frac{10}{4} = 2.5$ (variance)

$\sqrt{2.5} \approx 1.58$ (standard deviation)

Set 2: -44, -7, 15, 22, 29 (mean = 3)

Data (x)    (x - x̄)    (x - x̄)²
  -44        -47         2209
   -7        -10          100
   15         12          144
   22         19          361
   29         26          676

$\sum (x - \bar{x})^2 = 2209 + 100 + 144 + 361 + 676 = 3490$

$\frac{3490}{n - 1} = \frac{3490}{4} = 872.5$ (variance)

$\sqrt{872.5} \approx 29.54$ (standard deviation)

Summary:
1, 2, 3, 4, 5: s = 1.58
-44, -7, 15, 22, 29: s = 29.54

From this we can see that the greater the standard deviation, the greater the spread of the data.

Homework
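The worked calculations in this lesson can be reproduced with a short Python sketch. It follows the same steps as the tables: subtract the mean from each value, square each deviation, add the squares, divide by n - 1, and take the square root. The function name `sample_standard_deviation` is an assumption chosen for illustration and does not come from the lesson.

```python
import math


def sample_standard_deviation(data):
    """Sample standard deviation, following the lesson's table step by step."""
    n = len(data)
    x_bar = sum(data) / n                             # mean of the data
    deviations = [x - x_bar for x in data]            # (x - x_bar) column
    squared = [d ** 2 for d in deviations]            # (x - x_bar)^2 column
    variance = sum(squared) / (n - 1)                 # divide by n - 1 (sample version)
    return math.sqrt(variance)                        # back to the scale of the raw data


set_1 = [1, 2, 3, 4, 5]
set_2 = [-44, -7, 15, 22, 29]

for name, data in (("Set 1", set_1), ("Set 2", set_2)):
    print(f"{name}: s = {sample_standard_deviation(data):.2f}")
```

Python's standard library function `statistics.stdev` uses the same n - 1 divisor, so it can serve as an independent check on a hand calculation.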