Standard Deviation and Interpreting Standard Deviation

The Mean
• Mean: the sum of the data items divided by the number of items.
\[ \text{Mean} = \frac{\sum x}{n} \]
where \( \sum x \) represents the sum of all the data items and \( n \) represents the number of items.

The Median
• The median is the data item in the middle of a set of ranked, or ordered, data.
• To find the median of a group of data items:
1. Arrange the data items in order, from smallest to largest.
2. If the number of data items is odd, the median is the data item in the middle of the list.
3. If the number of data items is even, the median is the mean of the two middle data items.

Ex: Five employees in a manufacturing plant earn salaries of $19,700, $20,400, $21,500, $22,600 and $23,000 annually. The section manager has an annual salary of $95,000.
a. Find the median annual salary for the six people.
b. Find the mean annual salary for the six people.

Note:
• In the last example, the median annual salary is $22,050 (the mean of the two middle salaries, $21,500 and $22,600) and the mean annual salary is $33,700 ($202,200 divided by 6). Why such a big difference between these two measures of central tendency?
• The relatively high annual salary of the section manager, $95,000, pulls the mean salary to a value considerably higher than the median salary.
• When one or more data items are much greater than the other items, these extreme values can greatly influence the mean. In cases like this, the median, rather than the mean, is used to summarize the incomes.

Calculate the mean and median for birth weights and mothers' ages.

The Mode
• The mode is the data value that occurs most often in a data set.
• If more than one data value has the highest frequency, then each of these data values is a mode.
• If no data items are repeated, then the data set has no mode.

Range
• Used to describe the spread of data items in a data set.
• Range: the difference between the highest and the lowest data values in a data set:
Range = highest data value − lowest data value

Ex: Honolulu's hottest day is 89° and its coldest day is 61°.
The range in temperature is 89 − 61 = 28°.

Standard Deviation
• A second measure of dispersion, and one that depends on all of the data items, is called the standard deviation.
• The standard deviation is found by determining how much each data item differs from the mean.
• Population standard deviation:
\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]
where the \( x \)'s represent the data values, \( \mu \) represents the mean, and \( N \) represents the total number of data values.

Steps to Calculate the Population Standard Deviation:
1. Find the mean, \( \mu \), of the data values.
2. Find the difference, \( x - \mu \), between each data value \( x \) and the mean \( \mu \).
3. Square each difference, \( (x - \mu)^2 \).
4. Sum these values.
5. Divide by the total number of data values, \( N \).
6. Take the square root.

Ex: Ms. Mosier measured the heights of the trees growing at her home. The heights of the 5 trees are listed below, in inches:
45, 60, 67, 83, 95
Find the standard deviation of the heights of Ms. Mosier's trees.

Step 1: Find the mean, \( \mu \), of the data values.
\[ \mu = \frac{45 + 60 + 67 + 83 + 95}{5} = \frac{350}{5} = 70 \]

Step 2: Find the difference, \( x - \mu \), between each data value \( x \) and the mean \( \mu \).
45 − 70 = −25
60 − 70 = −10
67 − 70 = −3
83 − 70 = 13
95 − 70 = 25

Step 3: Square each difference, \( (x - \mu)^2 \).
\( (-25)^2 = 625 \)
\( (-10)^2 = 100 \)
\( (-3)^2 = 9 \)
\( (13)^2 = 169 \)
\( (25)^2 = 625 \)

Step 4: Sum these values.
625 + 100 + 9 + 169 + 625 = 1528

Step 5: Divide by the total number of data values.
\[ \frac{1528}{5} = 305.6 \]

Step 6: Take the square root.
\[ \sigma = \sqrt{305.6} \approx 17.48 \]

On Your Own: The table displays the number of hurricanes in the Atlantic Ocean from 1992 to 1997. What are the mean and standard deviation?
Answer: Mean = \( \mu = 35/6 \approx 5.83 \); standard deviation = \( \sigma \approx 3.34 \)

Sampling
• A sample is part of a population.
• If you choose a sample carefully, the statistics for the sample can be used to make general conclusions about the larger population.
• Suppose you want to know what percent of high school students in the US use Twitter every day.
It likely would be impossible to get that answer from every student. So instead you select a sample of the students (for example, at South Miami High) to estimate the percentage who use Twitter every day.

Sample Standard Deviation
• If you are given only a sample of a population, you can no longer compute the population standard deviation. You have only a part of the population and therefore have to calculate the sample standard deviation instead. Dividing by \( n - 1 \) rather than \( n \) makes the sample variance \( s^2 \) an unbiased estimator of the population variance.
• Sample standard deviation:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
where the \( x \)'s represent the data values, \( \bar{x} \) represents the sample mean, and \( n \) represents the number of data values in the sample.

Steps to Calculate the Sample Standard Deviation:
1. Find the sample mean, \( \bar{x} \), of the sample data values.
2. Find the difference, \( x - \bar{x} \), between each data value \( x \) and the sample mean \( \bar{x} \).
3. Square each difference, \( (x - \bar{x})^2 \).
4. Sum these values.
5. Divide by the number of data values in the sample minus 1, \( n - 1 \).
6. Take the square root.

Ex: The blood alcohol concentrations of a sample of drivers involved in fatal crashes and then convicted with jail sentences are given below (based on data from the U.S. Department of Justice):
0.27, 0.17, 0.29
Find the sample mean and sample standard deviation.

Step 1: Find the sample mean, \( \bar{x} \), of the sample data values.
\[ \bar{x} = \frac{0.27 + 0.17 + 0.29}{3} = \frac{0.73}{3} \approx 0.24 \]

Step 2: Find the difference, \( x - \bar{x} \), between each data value \( x \) and the sample mean \( \bar{x} \).
0.27 − 0.24 = 0.03
0.17 − 0.24 = −0.07
0.29 − 0.24 = 0.05

Step 3: Square each difference, \( (x - \bar{x})^2 \).
\( (0.03)^2 = 0.0009 \)
\( (-0.07)^2 = 0.0049 \)
\( (0.05)^2 = 0.0025 \)

Step 4: Sum these values.
0.0009 + 0.0049 + 0.0025 = 0.0083

Step 5: Divide by the number of data values given minus 1.
\[ \frac{0.0083}{2} = 0.00415 \]

Step 6: Take the square root.
\[ s = \sqrt{0.00415} \approx 0.06 \]

Calculate the standard deviation for birth weights.

1. The standard deviation for the skewed distribution is 2.6. This is significantly greater than the symmetric distribution's.
Explain why this makes sense.
_____________________________________________________________
_____________________________________________________________

2. Which measures of center and spread would you report for the symmetric distribution? For the skewed distribution? Explain your reasoning.
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________

Ex: Two fifth-grade classes have nearly identical mean scores on an aptitude test, but one class has a standard deviation three times that of the other. All other factors being equal, which class is easier to teach, and why?

Ex: Shown below are the means and standard deviations of the yearly returns on two investments from 1926 through 2004.
a. Use the means to determine which investment provided the greater yearly return.
b. Use the standard deviations to determine which investment has the greater risk. Explain your answer.

Interpreting and Understanding Standard Deviation
• As stated earlier, standard deviation measures the variation among values. Values close together will yield a small standard deviation, whereas values spread farther apart will yield a larger standard deviation.
• Many common statistics (such as human height, weight, or blood pressure) gathered from samples in the natural world tend to have a normal distribution about their mean. A normal distribution has a symmetric bell shape centered on the mean.
• We will develop a sense for values of standard deviations using the Empirical Rule: in a normal distribution, about 68% of the data fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. This rule only works for normal distributions!

Interactive
• http://www.shodor.org/interactivate/activities/NormalDistribution/
• Questions to ask:
• What happens to the curve as the standard deviation gets larger?
• What happens to the data as the number of trials increases?
• What does this remind you of that we discussed?
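The interactive questions above can also be explored offline. The following is a minimal Python sketch, not part of the original lesson, that draws simulated values from a normal distribution and reports the fraction falling within 1, 2, and 3 standard deviations of the mean (roughly 68%, 95%, and 99.7% for a normal distribution). The function names `simulate` and `fraction_within`, the fixed seed, and the mean of 100 with standard deviation 15 are illustrative assumptions.

```python
import random


def simulate(mean, sd, trials, seed=0):
    """Draw `trials` values from a normal distribution (seeded so runs repeat)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(trials)]


def fraction_within(samples, mean, sd, k):
    """Fraction of samples lying within k standard deviations of the mean."""
    inside = sum(1 for x in samples if abs(x - mean) <= k * sd)
    return inside / len(samples)


if __name__ == "__main__":
    # Illustrative parameters (assumed): mean 100, standard deviation 15.
    for trials in (100, 10_000, 100_000):
        samples = simulate(100, 15, trials)
        fractions = [fraction_within(samples, 100, 15, k) for k in (1, 2, 3)]
        print(trials, [round(f, 3) for f in fractions])
```

With few trials the fractions fluctuate noticeably; as the number of trials grows they settle near the Empirical Rule percentages. Changing `sd` widens or narrows the spread of the simulated values but leaves these fractions essentially unchanged.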
Ex: IQ scores of normal adults on the Wechsler test have a bell-shaped distribution with a mean of 100 and a standard deviation of 15. What percentage of adults have IQ scores between 55 and 145?

Let's first draw the distribution. Since the mean is 100, we put 100 in the center.

[Figure: bell curve centered on the mean, with the regions within 1, 2, and 3 standard deviations marked]

With a standard deviation of 15, we know that between 85 and 115, we are within 1 standard deviation of the mean. Similarly, between 70 and 130, we are within 2 standard deviations of the mean. Finally, between 55 and 145, we are within 3 standard deviations of the mean.

Now to answer the question. If we want to know the percentage of adults who have IQ scores between 55 and 145, we need to remember that, by the Empirical Rule, 99.7% of the data fall within 3 standard deviations of the mean. Therefore, the answer is 99.7%.

Ex: The table displays the number of U.S. hurricane strikes by decade from the years 1851 to 2000. Let's say we're given that the mean is 17.6 and the standard deviation is 3.5. How many standard deviations from the mean do all the values fall?

Ex: On Your Own: For an English class, the average score on a research project was 82 and the standard deviation of the normally distributed scores was 5. Sketch a normal curve showing the project scores and three standard deviations from the mean.
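The worked examples in these notes can be checked with a short Python sketch. It is an illustration rather than part of the original lesson: the function names are hypothetical, and the data are the tree heights, blood alcohol concentrations, and IQ parameters from the examples above. `population_sd` and `sample_sd` follow the six-step procedures (dividing by N and by n − 1 respectively), `empirical_rule_intervals` lists the ranges within 1, 2, and 3 standard deviations of the mean, and `sds_from_mean` reports how many standard deviations a value lies from the mean.

```python
import math


def population_sd(data):
    """Population standard deviation: squared deviations divided by N."""
    mu = sum(data) / len(data)
    return math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))


def sample_sd(data):
    """Sample standard deviation: squared deviations divided by n - 1."""
    x_bar = sum(data) / len(data)
    return math.sqrt(sum((x - x_bar) ** 2 for x in data) / (len(data) - 1))


def empirical_rule_intervals(mean, sd):
    """Map k = 1, 2, 3 to the interval within k standard deviations of the mean."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


def sds_from_mean(x, mean, sd):
    """Signed number of standard deviations between x and the mean."""
    return (x - mean) / sd


if __name__ == "__main__":
    print(round(population_sd([45, 60, 67, 83, 95]), 2))  # tree heights
    print(round(sample_sd([0.27, 0.17, 0.29]), 2))        # blood alcohol sample
    print(empirical_rule_intervals(100, 15))              # IQ example
    print(sds_from_mean(145, 100, 15))
```

Python's standard `statistics` module offers `pstdev` and `stdev` for the same two calculations; the hand-written versions here are only meant to mirror the steps listed in the notes.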