Mathematics IV Frameworks
Student Edition
Unit 1: How Confident Are You?
1st Edition, June 2010
Georgia Department of Education
Kathy Cox, State Superintendent of Schools
Copyright 2010 © All Rights Reserved

Table of Contents
INTRODUCTION
Colors of Reese’s Pieces Candies Learning Task
Pennies Learning Task
Gettysburg Address Learning Task
Confidence Intervals Learning Tasks

Mathematics IV – Unit 1
How Confident Are You?
Student Edition

INTRODUCTION:
In Mathematics III, students began examining sampling distributions and sampling variability, laying the groundwork for the Central Limit Theorem, a major topic in this unit. Students then looked at a number of discrete probability distributions and at one particular continuous probability distribution, the normal distribution. This unit continues to link the two. The Central Limit Theorem allows us to use a normal distribution to approximate the distribution of sample means and proportions, even when the original population distribution is not normal. The second major topic is constructing confidence intervals and understanding margin of error.
These ideas are prevalent in the media and, therefore, important for all students to understand. Much of the content in this unit is beyond what was traditionally taught in Georgia and may be beyond the mathematical and statistical training received by many teachers. Some general references that may be helpful to teachers are the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report, which can be found online at www.amstat.org/education/gaise, NCTM’s Navigating through Data Analysis in Grades 9 – 12, and “Statistics in the High School Mathematics Curriculum: Building Sound Reasoning under Uncertain Conditions” by Richard Scheaffer and Josh Tabor in the August 2008 Mathematics Teacher. Also, the school’s AP Statistics teacher or local RESA math specialist may be able to provide assistance.

It is assumed throughout the unit that students already know how to use normal distribution tables or how to calculate normal probabilities on their calculators. These skills were taught in Mathematics III and should be maintained in the present unit. Other previous topics, e.g. sampling distributions, will be reviewed throughout the unit.

ENDURING UNDERSTANDINGS:
As the sample size increases, the value of a statistic approaches the true value of the population parameter.
The standard deviation of the sample means is $\sigma/\sqrt{n}$, and the standard deviation of the sample proportions is $\sqrt{p(1-p)/n}$. These formulas for the standard deviation are valid as long as the population is at least 10 times the size of the sample.
The Central Limit Theorem allows us to use normal probability calculations, provided certain conditions are met, for sample means and proportions even if the original distributions are non-normal.
Confidence intervals provide a range of values that estimate the population parameter.
The margin of error, often cited in media articles, tells how accurate we believe our estimate of the parameter to be.
To decrease the margin of error, we could increase the sample size or decrease how confident we need the result to be.

KEY STANDARDS ADDRESSED:
MM4D1. Using simulation, students will develop the idea of the central limit theorem.
MM4D2. Using student-generated data from random samples of at least 30 members, students will determine the margin of error and confidence interval for a specified level of confidence.
MM4D3. Students will use confidence intervals and margin of error to make inferences from data about a population. Technology is used to evaluate confidence intervals, but students will be aware of the ideas involved.

RELATED STANDARDS ADDRESSED:
MM4P1. Students will solve problems (using appropriate technology).
a. Build new mathematical knowledge through problem solving.
b. Solve problems that arise in mathematics and in other contexts.
c. Apply and adapt a variety of appropriate strategies to solve problems.
d. Monitor and reflect on the process of mathematical problem solving.
MM4P2. Students will reason and evaluate mathematical arguments.
a. Recognize reasoning and proof as fundamental aspects of mathematics.
b. Make and investigate mathematical conjectures.
c. Develop and evaluate mathematical arguments and proofs.
d. Select and use various types of reasoning and methods of proof.
MM4P3. Students will communicate mathematically.
a. Organize and consolidate their mathematical thinking through communication.
b. Communicate their mathematical thinking coherently and clearly to peers, teachers, and others.
c. Analyze and evaluate the mathematical thinking and strategies of others.
d. Use the language of mathematics to express mathematical ideas precisely.
MM4P4.
Students will make connections among mathematical ideas and to other disciplines.
a. Recognize and use connections among mathematical ideas.
b. Understand how mathematical ideas interconnect and build on one another to produce a coherent whole.
c. Recognize and apply mathematics in contexts outside of mathematics.
MM4P5. Students will represent mathematics in multiple ways.
a. Create and use representations to organize, record, and communicate mathematical ideas.
b. Select, apply, and translate among mathematical representations to solve problems.
c. Use representations to model and interpret physical, social, and mathematical phenomena.

UNIT OVERVIEW:
The unit begins with students reading and discussing an internet article on presidential approval ratings, employing percents and margins of error. This motivates the unit and will be revisited in the Confidence Intervals Tasks.

The Reese’s Pieces Candies task reviews ideas about sampling distributions of sample proportions and develops this knowledge into the Central Limit Theorem (CLT) for Proportions. In addition to collecting data with actual candy, students will use simulation to fully develop the CLT. The task concludes with a problem intended to synthesize the ideas from the simulation.

The next two tasks, Pennies and Gettysburg Address, are nearly identical. Both tasks review sampling distributions of sample means from normal populations, a topic from Acc. Math II. Then, either with pennies or words from the Gettysburg Address, students will draw samples of various sizes, create whole class plots, and discuss the results. This activity leads to the CLT for Means. The tasks conclude with a problem intended to synthesize the ideas from the activity.
The final series of learning tasks builds on students’ understanding of the empirical rule to develop confidence intervals and margin of error. The article from the beginning of the unit will serve as a catalyst and basis for the investigation of margin of error and confidence intervals for proportions. Students will then simulate additional samples in an effort to understand what it means to say that one is 95% confident. The last part of this series addresses confidence intervals for sample means, including the use of data collection and simulation.

The culminating task requires students to synthesize the statistical knowledge gained throughout high school. They must design, implement, and analyze the results of a survey or experiment, paying particular attention to the use of statistical inference.

VOCABULARY AND FORMULAS
Central Limit Theorem:
Choose a simple random sample of size n from any population with mean $\mu$ and standard deviation $\sigma$. When n is large (at least 30), the sampling distribution of the sample mean $\bar{x}$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
Choose a simple random sample of size n from a large population in which a proportion p has some characteristic of interest. Then the sampling distribution of the sample proportion p̂ is approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$. This approximation becomes more and more accurate as the sample size n increases, and it is generally considered valid if the population is much larger than the sample and if $np \ge 10$ and $n(1-p) \ge 10$.
The CLT allows us to use normal calculations to determine probabilities about sample proportions and sample means obtained from populations that are not normally distributed.
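The statement of the CLT for proportions can be checked numerically. The following is a minimal Python sketch and is not part of the original unit: the function name `simulate_sample_proportions`, the fixed seed, and the values p = 0.3, n = 100, and 2000 repetitions are assumptions chosen only for illustration. It draws repeated samples and compares the mean and standard deviation of the simulated sample proportions with p and $\sqrt{p(1-p)/n}$.

```python
import math
import random
import statistics


def simulate_sample_proportions(p, n, repetitions, seed=0):
    """Return a list of sample proportions from repeated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(repetitions):
        # Count "successes": each draw succeeds with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# Illustrative values only (assumed, not data from the unit).
p, n = 0.3, 100
p_hats = simulate_sample_proportions(p, n, repetitions=2000)

simulated_mean = statistics.mean(p_hats)
simulated_sd = statistics.pstdev(p_hats)
theoretical_sd = math.sqrt(p * (1 - p) / n)

# Share of simulated proportions within two theoretical SDs of p.
within_two_sd = sum(abs(ph - p) <= 2 * theoretical_sd for ph in p_hats) / len(p_hats)

print(f"mean of p-hat: {simulated_mean:.3f} (theory: {p})")
print(f"SD of p-hat:   {simulated_sd:.3f} (theory: {theoretical_sd:.3f})")
print(f"within 2 SD:   {within_two_sd:.1%}")
```

Here np = 30 and n(1 - p) = 70, so both conditions in the theorem are met. Changing n in the sketch shows how the spread of the sample proportions shrinks as the sample size grows.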
Confidence Interval is an interval for a parameter, calculated from the data, usually in the form estimate ± margin of error. The confidence level gives the probability that the interval will capture the true parameter value in repeated samples.
Margin of Error is the value in the confidence interval that says how accurate we believe our estimate of the parameter to be. The margin of error is the product of the z-score and the standard deviation (or standard error) of the estimate. The margin of error can be decreased by increasing the sample size or decreasing the confidence level.
Parameter is a number that describes the population. A parameter is a fixed number, but in practice we do not know its value because we cannot examine the entire population.
Sample Mean is a statistic measuring the average of the observations in the sample. It is written as $\bar{x}$. The mean of the population, a parameter, is written as $\mu$.
Sample Proportion is a statistic indicating the proportion of successes in a particular sample. It is written as p̂. The population proportion, a parameter, is written as p.
Sampling Distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
Sampling Variability refers to the fact that the value of a statistic varies in repeated random sampling.
Statistic is a number that describes a sample. The value of the statistic is known when we have taken a sample, but it can change from sample to sample. We often use a statistic to estimate an unknown parameter.

COLORS OF REESE’S PIECES CANDIES¹ LEARNING TASK:
1. Why do I need to learn more about statistics?
a. Read the following article. What do the numbers in the article represent?
b. How reliable do you think the ratings are?
c.
How do you think pollsters determine approval ratings such as these?
d. What do you think a margin of error is? Why is that important?
During this unit, you will learn the answers to these questions and how those answers are important.

Excerpt from http://www.cnn.com/2005/POLITICS/12/19/bush.poll/
Poll: Iraq speeches, election don't help Bush
Tuesday, December 20, 2005; Posted: 12:56 a.m. EST (05:56 GMT)
CNN -- President Bush's approval ratings do not appear to have changed significantly, despite a number of recent speeches he's given to shore up public support for the war in Iraq and its historic elections on Thursday.
A CNN/USA Today Gallup poll conducted over the weekend found his approval rating stood at 41 percent, while more than half, or 56 percent, disapprove of how the president is handling his job. A majority, or 52 percent, say it was a mistake to send troops to Iraq, and 61 percent say they disapprove of how he is handling Iraq specifically. The margin of error was plus or minus 3 percentage points. ...
The poll was nearly split, 49 percent to 47 percent, between those who thought the U.S. will either "definitely" or "probably" win, and those who said the U.S. will lose. That said, 69 percent of those polled expressed optimism that the U.S. can win the war. The margin of error for how respondents assessed the war was plus or minus 4.5 percentage points. ...
Although half those polled said that a stable government in Iraq was likely within a year, 62 percent said Iraqi forces were unlikely to ensure security without U.S. assistance. And 63 percent said Iraq was unlikely to prevent terrorists from using Iraq as a base. The margin of error on questions pertaining to troop duration in Iraq, as well as the country's future, was plus or minus 3 percentage points.
The poll interviewed 1,003 adult Americans and found that the public has also grown more skeptical about Bush's key arguments in favor of the war.
Compared with two years ago, when 57 percent considered Iraq a part of the war on terrorism, 43 percent think so now. In the weekend poll, 55 percent said they view the war in Iraq as separate from the war on terror. The margin of error on this line of questioning was plus or minus 3 percentage points.
On the domestic front, 56 percent of those polled say they disapprove of how Bush is handling the economy; by contrast, 41 percent approve. The margin of error was plus or minus 3 percentage points.
The president may find support for his call to renew the Patriot Act. Forty-four percent said they felt the Patriot Act is about right, and 18 percent said it doesn't go far enough. A third of respondents say they believe the Patriot Act has gone too far in restricting people's civil liberties to investigate suspected terrorism. Nearly two-thirds said they are not willing to sacrifice civil liberties to prevent terrorism, as compared to 49 percent saying so in 2002. The margin of error was plus or minus 4.5 percentage points for those questions.

¹ This task is based on chapter 16 from Rossman, A. J., Chance, B. L., & VonOehsen, J. B. (2002). Workshop statistics: Discovery with data and the graphing calculator (2nd edition). Emeryville, CA: Key Curriculum Press.

2. Reviewing some basics:
a. Think about a single bag of Reese’s Pieces. Does this single bag represent a sample of Reese’s Pieces or the population of Reese’s Pieces?
b. We use the term statistic to refer to measures based on samples and the term parameter to refer to measures of the entire population. If there are 62 Reese’s Pieces in your bag, is 62 a statistic or a parameter?
If Hershey claims that 25% of all Reese’s Pieces are brown, is 25% a statistic or a parameter?
We also use different symbols to represent statistics and parameters. The following table will be very useful as we continue through this unit.

             Proportion       Mean                   Standard deviation     Number
Parameter    p                $\mu$ (“mu”)           $\sigma$ (“sigma”)     N
Statistic    p̂ (“p-hat”)     $\bar{x}$ (“x-bar”)    s                      n

3. How many orange candies should I expect in a bag of Reese’s Pieces?
a. From your bag of Reese’s Pieces, take a random sample of 10 candies. Record the count and proportion of each color in your sample.

              Orange    Yellow    Brown
Count
Proportion

b. Do you know the value of the proportion of orange candies manufactured by Hershey?
c. Do you know the value of the proportion of orange candies among the 10 that you selected?
d. Do you think that every student in the class obtained the same proportion of orange candies in his or her sample? Why or why not?
e. Combine your results with the rest of the class and produce a dotplot for the distribution of sample proportions of orange candies (out of a sample of 10 candies) obtained by the class members.
f. What is the average of the sample proportions obtained by your class?
g. Put the Reese’s Pieces back in the bag and take a random sample of 25 candies. Record the count and proportion of each color in your sample.

              Orange    Yellow    Brown
Count
Proportion

h. Combine your results with the rest of the class and produce a dotplot for the distribution of sample proportions of orange candies (out of a sample of 25 candies) obtained by the class members. Is there more or less variability than when you sampled 10 candies? Is this what you expected? Explain.
i. What is the average of the sample proportions (from the samples of 25) obtained by your class?
Do you think this is closer to or farther from the true proportion of oranges than the value you found in f? Explain.
j. This time, take a random sample of 40 candies. Record the count and proportion of each color in your sample.

              Orange    Yellow    Brown
Count
Proportion

k. Combine your results with the rest of the class and produce a dotplot for the distribution of sample proportions of orange candies (out of a sample of 40 candies) obtained by the class members. Is there more or less variability than in the previous two samples? Is this what you expected? Explain.
l. What is the average of the sample proportions (from the samples of 40) obtained by your class? Do you think this is closer to or farther from the true proportion of oranges than the values you found in f and i? Explain.

4. Sampling Distribution of p̂
We have been looking at a number of different sampling distributions of p̂, but we have seen that there is great variability in the distributions. We would like to know that p̂ is a good estimate for the true proportion of orange Reese’s Pieces. However, there are guidelines for when we can use the statistic to estimate the parameter. This is what we will investigate in the next section.
First, however, we need to understand the center, shape, and spread of the sampling distribution of p̂. We know that if we are counting the number of Reese’s Pieces that are orange and comparing with those that are not orange, then the counts of oranges follow a binomial distribution (given that the population is much larger than our sample size).
a. Recall from Math III the formulas for the mean and standard deviation of a binomial distribution.
b. Given that p̂ = X/n, where X is the count of oranges and n is the total in the sample, how might we find $\mu_{\hat{p}}$ and $\sigma_{\hat{p}}$? Find formulas for each.
c.
This leads to the statement of the characteristics of the sampling distribution of a sample proportion.
The Sampling Distribution of a Sample Proportion:
Choose a simple random sample of size n from a large population in which a proportion p has some characteristic of interest. Let p̂ be the proportion of the sample having that characteristic. Then:
o The mean of the sampling distribution is ____.
o The standard deviation of the sampling distribution is ___________.
d. Let’s look at the standard deviation a bit more. What happens to the standard deviation as the sample size increases? Try a few examples to verify your conclusion. Then use the formula to explain why your conjecture is true. If we wanted to cut the standard deviation in half, thus decreasing the variability of p̂, what would we need to do in terms of our sample size?
e. Caution: We can only use the formula for the standard deviation of p̂ when the population is at least 10 times as large as the sample. For each of the samples taken in part 3, determine how large the population of Reese’s Pieces must be for us to use the standard deviation formula derived above. Is it safe to assume that the population is at least as large as these amounts? Explain.

5. Simulating the Selection of Orange Reese’s Pieces
As you saw above, there is variation in the distributions depending on the size of your sample and which sample is chosen. To better investigate the distribution of the sample proportions, we need more samples and we need samples of larger size. We will turn to technology to help with this sampling. For this simulation, we need to assume a value for the true proportion of orange candies. Let’s assume p = 0.45.
a. First, let’s imagine that there are 100 students in the class and each takes a sample of 50 Reese’s Pieces.
We can simulate this situation with your calculator. Type randBin(50,0.45) in your calculator. (randBin is found in the following way: Math → PROB → 6.) What number did you get? Compare with a neighbor. What do you think this command does? How could you obtain the proportion that are orange rather than the count?
b. Now, we want to generate 100 samples of size 50. This time, input randBin(50,0.45,100)/50→L1. The latter part (store in L1) puts all of the outputs into List 1. Using your Stat Plots, create a histogram or stem-and-leaf plot of the proportions of orange candies. Sketch the graph below. (If you are using a program that creates dotplots, create a dotplot instead.) Do you notice a pattern in the distribution of the sample proportions? Explain.
c. Find the mean and standard deviation of the output using 1-Var Stats. How do these compare with the theoretical mean and standard deviation for a sampling distribution of a sample proportion from part 4?
Mean: ______________ Standard Deviation: ______________
d. Use the TRACE button on the calculator to count how many of the 100 sample proportions are within 0.07 of 0.45. Note: 0.07 is close to the standard deviation you found above, so we are going about one standard deviation on each side of the mean. Then repeat for within 0.14 and for within 0.21. Record the results below:

                       Number of the 100        Percentage of the 100
                       Sample Proportions       Sample Proportions
Within 0.07 of 0.45
Within 0.14 of 0.45
Within 0.21 of 0.45

e. If each of the 100 students who sampled Reese’s Pieces were to estimate the population proportion of orange candies by going a distance of 0.14 on either side of his or her sample proportion, what percentage of the 100 students would capture the actual proportion (0.45) within this interval?
f.
If you did not know the actual proportion of oranges, would the simulation above provide you with a definitive way of knowing whether your sample was within 0.14 of the mean? Explain.
g. Simulate drawing out 200 Reese’s Pieces 100 times. Find the mean and standard deviation of the set of sample proportions in this simulation. Compare with the theoretical mean and standard deviation of the sampling distribution with sample size 200.
h. How does the plot of the sampling distribution differ from the above plot? How do the mean and standard deviation compare? What percentage of the 100 sample proportions fall within 0.07 of 0.45 (or approximately 2 standard deviations)? How does this compare with the answer to part e?
i. You should notice that these distributions follow an approximately normal distribution, a topic you learned about in Math 2. You also learned the Empirical Rule that states how much of the data will fall within 1, 2, and 3 standard deviations of the mean. Restate the rule:
In a normal distribution with mean $\mu$ and standard deviation $\sigma$:
o ____% of the observations fall within 1 standard deviation ($1\sigma$) of the mean ($\mu$).
o ____% of the observations fall within 2 standard deviations ($2\sigma$) of the mean ($\mu$).
o ____% of the observations fall within 3 standard deviations ($3\sigma$) of the mean ($\mu$).
j. Do your answers to parts e and h agree with the Empirical Rule? Explain.

This leads us to an important result in statistics: the Central Limit Theorem (CLT) for a Sample Proportion:
Choose a simple random sample of size n from a large population in which a proportion p has some characteristic of interest. Then the sampling distribution of the sample proportion p̂ is approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$. This approximation becomes more and more accurate as the sample size n increases, and it is generally considered valid if the population is much larger than the sample and if $np \ge 10$ and $n(1-p) \ge 10$.
k. How might this theorem be helpful? What advantage does this theorem provide in determining the likelihood of events?

6. Applying the CLT for Sample Proportions²
A USA Today poll asked a random sample of 1012 U.S. adults what they do with their cereal milk after they have eaten the cereal. Of the respondents, 67% said that they drink the milk.
a. Is it possible to know for certain what percentage of U.S. adults drink their cereal milk? Explain.
b. Suppose we know that 70% of U.S. adults drink the cereal milk. Find the mean and standard deviation of the proportion p̂ of the sample that say they drink the cereal milk.
c. Explain why you can use the formula for the standard deviation of p̂ in this situation.
d. What sample size would be required to reduce the standard deviation of the sample proportion to one-third the value you found in part b?
e. Check that you can use the normal approximation for the distribution of p̂, i.e. the Central Limit Theorem for Sample Proportions.
f. In Math III, you learned to calculate z-scores using the following formula: $z = \frac{X - \mu}{\sigma}$. Note that X is a statistic, $\mu$ is the parameter for the mean, and $\sigma$ is the parameter for the standard deviation. More specifically, whenever we standardize values, we take the estimate (or statistic) minus the corresponding parameter and divide the difference by the corresponding standard deviation. We will standardize proportions in the same way. Substitute the statistics and parameters for proportions into the z-score formula to obtain our standardization formula for proportions.
g.
Find the probability of obtaining a sample of 1012 adults in which 67% or fewer say they drink the cereal milk. (Use the standard normal distribution.)
h. Find the probability that p̂ takes a value between 0.67 and 0.73. This will tell us if a simple random sample of 1012 adults will usually give a result p̂ within 3 percentage points of the true population proportion.

² Problem adapted from Yates, D. S., Moore, D. S., & Starnes, D. S. (2003). The Practice of Statistics: TI-83/89 Graphing Calculator Enhanced (2nd ed.). New York: W.H. Freeman.

PENNIES LEARNING TASK:
1. Sampling Distribution of a Sample Mean from a Normal Population
The scores of individual students on the ACT entrance exam have a normal distribution with mean 18.6 and standard deviation 5.9.
a. Use your calculator to simulate the scores of 25 randomly selected students who took the ACT. Record the mean and standard deviation of these 25 scores in the table below. Repeat, simulating the scores of 100 people. (To do this, use the following command: randNorm($\mu$, $\sigma$, n)→L1.)

                      Population    25 people    100 people
Mean                  18.6
Standard Deviation    5.9

b. As a class, compile the means for the sample of 25 people. Determine the mean and standard deviation of this set of means. That is, calculate $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$. How does the mean of the sample means compare with the population mean? How does the standard deviation of the sample means compare with the population standard deviation?
c. Describe the plot of this set of means. How does the plot compare with the normal distribution?
d. As a class, compile the means for the sample of 100 people. Determine the mean and standard deviation of this set of means. That is, calculate $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$. How does the mean of the sample means compare with the population mean?
How does the standard deviation of the sample means compare with the population standard deviation?
e. Describe the plot of this set of means. How does the plot compare with the normal distribution?
f. Determine formulas for the mean of the sample means, $\mu_{\bar{x}}$, and the standard deviation of the sample means, $\sigma_{\bar{x}}$. Compare with a neighbor.
g. Just as we saw with proportions, the sample mean is an unbiased estimator of the population mean.
The Sampling Distribution of a Sample Mean:
Choose a simple random sample of size n from a large population with mean $\mu$ and standard deviation $\sigma$. Then:
o The mean of the sampling distribution of $\bar{x}$ is ____.
o The standard deviation of the sampling distribution of $\bar{x}$ is ___________.
h. Again, we must be cautious about when we use the formula for the standard deviation of $\bar{x}$. What was the rule when we looked at proportions? It is the same here.
i. Put these latter facts together with your response to part b to complete the following statement:
Choose a simple random sample of size n from a population that has a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then the sample mean $\bar{x}$ has a ____________ distribution with mean ________ and standard deviation ____________.
This problem centered on a population that was known to be normally distributed. What about populations that are not normally distributed? Can we still use the facts above? Let’s investigate!

2. How Old are Your Pennies?³
a. Make a frequency table of the year and the age of the 25 pennies you brought to class. Find the average age of the 25 pennies. Record the mean age as $\bar{x}_{25}$.
$\bar{x}_{25}$ = ___________
Example (if the year is 2009):

Year    Age    Frequency
2009    0      3
2008    1      6
2007    2      3
...     ...    ...

b. Put your 25 pennies in a cup or bag and randomly select 5 pennies. Find the average age of the 5 pennies in your sample, and record the mean age as $\bar{x}_5$. Replace the pennies in the cup, and repeat.
Sample 1: $\bar{x}_5$ = ________________
Sample 2: $\bar{x}_5$ = ________________
c. Repeat the process two more times, this time removing 10 pennies at a time. Calculate the average age of the sample of 10 pennies and record as $\bar{x}_{10}$.
Sample 1: $\bar{x}_{10}$ = ________________
Sample 2: $\bar{x}_{10}$ = ________________
e. Clear a space on the floor and use masking tape to make a number line (horizontal axis) with ages marked from 0 to about 30 on the axis. Each interval should be a little more than the width of a penny. Place the pennies on the axis according to age, making a penny dotplot on the floor. Look at the shape of the final dotplot. Describe the distribution of the pennies’ ages.

³ The Workshop Statistics books provide an alternative to actually working with pennies; however, it is not possible to recreate the methods here. It requires assigning three-digit numbers to penny ages and using a random number generator to “sample” pennies.

f. Make a second axis on the floor and label it with increments of 0.5. Each interval should be a little more than the width of a nickel. Use the nickels to plot the means for the sample size 5. What is the shape of the dotplot for the distribution of $\bar{x}_5$? How does it compare with the original distribution of pennies’ ages?
g. Make a third axis on which to create a dotplot of the means for the sample size 10. Use dimes for this plot. What is the shape of the dotplot for the distribution of $\bar{x}_{10}$? How does it compare with the previous plots?
h. Finally, make a fourth axis.
On this axis, use the quarters to record the means for the sample size of 25. Describe the shape of the dotplot.
i. Divide into groups for the next part of this task. There should be at least 4 groups. Each group will take one of the dotplots above and find the mean and standard deviation of the sample means. (Because there are so many pennies, two groups should separately determine the overall mean and standard deviation and check each other.) Post a table similar to the one below on the board so that groups can post their results. Record all groups’ results here.
                  Mean    Standard Deviation    Shape of the Distribution
“Population”
Samples of 5
Samples of 10
Samples of 25
j. Previously, we stated that if samples were taken from a normal distribution, the sampling distribution of sample means was also normal, with μ_x̄ = μ and σ_x̄ = σ/√n. In this activity, we did not begin with a normal distribution. However, compare the means for the samples of 5, 10, and 25 with the overall mean of the pennies. Then compare the standard deviations with the standard deviation of all the pennies. Do these formulas appear to hold despite the population of penny ages being obviously non-normal? Explain.
k. Suppose that the U.S. Department of Treasury estimated that the average age of pennies presently in circulation is 12.25 years with a standard deviation of 9.5. Determine the theoretical means and standard deviations for the sampling distributions of sample size 5, 10, and 25.
                      Population    Samples of 5    Samples of 10    Samples of 25
Mean                  12.25
Standard Deviation    9.5
l. This brings us to the Central Limit Theorem (CLT) for Sample Means: Choose a simple random sample of size n from any population, regardless of the original shape of the distribution, with mean μ and finite standard deviation σ.
When n is large, the sampling distribution of the sample mean x̄ is approximately normal with mean ______ and standard deviation ________.
Note: The statement “when n is large” seems a bit ambiguous. A good rule of thumb is that the sample size should be at least 30, as we can see in the dotplots above. When the sample sizes were small, i.e. 5 and 10, the plots were still quite right skewed.

3. Applying the CLT for Sample Means
Whenever we have a normal distribution, we are able to use normal tables and normal calculations to find probabilities. Due to the CLT, we can use normal calculations to determine probabilities about sample means drawn from large samples.
a. First, we need to recall how to standardize scores, that is, find z-scores. State the formula for z-scores that you learned in Math 3. Recall that whenever we standardize values, we take the estimate (or statistic) minus the corresponding parameter and divide that difference by the corresponding standard deviation. We will standardize sample means in the same way. Substitute the statistics and parameters for sample means into the z-score formula to obtain our standardization formula for sample means.
b. Consider a candy bar whose weight varies according to a normal distribution with a mean of 2.23 ounces and a standard deviation of 0.05 ounces. Find the probability that a single candy bar weighs less than 2.2 ounces.
c. Suppose you take a sample of 20 candy bars. What are the mean and standard deviation of the sampling distribution of the average weight of this size sample?
d. Find the probability that the mean weight of these 20 candy bars is less than 2.2 ounces.
e. Find the probability that the mean weight of these candy bars is between 2.21 and 2.25 ounces.
f. How would you expect your answers to parts d and e to differ if the sample size were 50 instead of 20?
g. Calculate these probabilities and comment on your conjecture.
h. Which of your answers, if any, to parts b, c, d, e, and g would be affected if the distribution of the candy bar weights was not normally distributed? Explain.

GETTYSBURG ADDRESS LEARNING TASK:

1. Sampling Distribution of a Sample Mean from a Normal Population
The scores of individual students on the ACT entrance exam have a normal distribution with mean 18.6 and standard deviation 5.9.
a. Use your calculator to simulate the scores of 25 randomly selected students who took the ACT. Record the mean and standard deviation for these 25 people in the table below. Repeat, simulating the scores of 100 people. (To do this, use the following command: randNorm(μ, σ, n)→L1.)
                      Population    25 people    100 people
Mean                  18.6
Standard Deviation    5.9
b. As a class, compile the means for the sample of 25 people. Determine the mean and standard deviation of this set of means. That is, calculate μ_x̄ and σ_x̄. How does the mean of the sample means compare with the population mean? How does the standard deviation of the sample means compare with the population standard deviation?
c. Describe the plot of this set of means. How does the plot compare with the normal distribution?
d. As a class, compile the means for the sample of 100 people. Determine the mean and standard deviation of this set of means. That is, calculate μ_x̄ and σ_x̄. How does the mean of the sample means compare with the population mean? How does the standard deviation of the sample means compare with the population standard deviation?
e. Describe the plot of this set of means. How does the plot compare with the normal distribution?
f.
Determine formulas for the mean of the sample means, μ_x̄, and the standard deviation of the sample means, σ_x̄. Compare with a neighbor.
g. Just as we saw with proportions, the sample mean is an unbiased estimator of the population mean.
The Sampling Distribution of a Sample Mean: Choose a simple random sample of size n from a large population with mean μ and standard deviation σ. Then:
o The mean of the sampling distribution of x̄ is ____.
o The standard deviation of the sampling distribution of x̄ is ___________.
h. Again, we must be cautious about when we use the formula for the standard deviation of x̄. What was the rule when we looked at proportions? It is the same here.
i. Put these latter facts together with your response to part b to complete the following statement: Choose a simple random sample of size n from a population that has a normal distribution with mean μ and standard deviation σ. Then the sample mean x̄ has a ____________ distribution with mean ________ and standard deviation ____________.
This problem centered on a population that was known to be normally distributed. What about populations that are not normally distributed? Can we still use the facts above? Let’s investigate!

2. How Long are the Words in the Gettysburg Address?
(Note: The Gettysburg Address and Word List are at the end of this task.)
a. To answer the question of how long the words in the Gettysburg Address are, we want to take a sample of the words. Propose different ways you might take a random sample of 5 words from the Gettysburg Address.
b. The last page of this activity lists the words from the Gettysburg Address.
There are 268 words, and each word is assigned a number from 1 (001) to 268. To select a simple random sample of 5 words, we need to generate 5 distinct random integers between 1 and 268. (Use the following command: randInt(1, 268, 5). If any of the numbers repeat, repeat the command until you have 5 distinct integers.) Find the words on the list that correspond to these integers. List the word lengths. Find the average length of the 5 words in your sample, and record the mean length as x̄_5. Generate a new set of 5 distinct integers, and repeat. Also record each mean length on a separate post-it note. Make sure the post-it is labeled, e.g. x̄_5!
              Random Integers    Word Lengths    Average Length
Sample One                                       x̄_5 =
Sample Two                                       x̄_5 =
c. Repeat the process two more times, this time choosing 10 words at a time. Calculate the average length of the sample of 10 words and record as x̄_10. Also record each mean length on a separate post-it note. Make sure the post-it is labeled!
              Random Integers    Word Lengths    Average Length
Sample One                                       x̄_10 =
Sample Two                                       x̄_10 =
d. Repeat the process one more time, this time choosing 25 words. Calculate the average length of the sample of 25 words and record as x̄_25. Also record the mean length on a post-it note. Make sure the post-it is labeled!
Random Integers    Word Lengths    Average Length
                                   x̄_25 =
Clear a space on the floor, the wall, or the board. Use masking tape to make a number line (horizontal axis) with average lengths marked from 2 to about 6 on the axis, with tick marks every 0.2. Each interval should be a little more than the width of a post-it note.
e. Place the post-its with the average length of the samples of 5 words on the axis according to the average length, making a post-it dotplot on the floor/wall/board. Look at the shape of the final dotplot.
Describe the distribution of the words’ lengths.
Make a second axis on the floor and label it with increments of 0.1. Again, each interval should be a little more than the width of a post-it note.
f. Plot the means for the sample size of 10. What is the shape of the dotplot for the distribution of x̄_10? How does it compare with the previous distribution?
Make a third axis on which to create a dotplot of the means for the sample size 25.
g. Plot the means for the sample size of 25. What is the shape of the dotplot for the distribution of x̄_25? How does it compare with the previous distribution?
h. Divide into groups for the next part of this task. There should be at least 4 groups. Each group will take one of the dotplots above, as well as the entire list of words, and find the mean and standard deviation of the sample means. (Because there are so many words, two groups should separately determine the overall mean and standard deviation and check each other.) Post a table similar to the one below on the board so that groups can post their results. Record all groups’ results here.
                 Mean    Standard Deviation    Shape of the Distribution
Population
Samples of 5
Samples of 10
Samples of 25
i. Previously, we stated that if samples were taken from a normal distribution, the sampling distribution of sample means was also normal, with μ_x̄ = μ and σ_x̄ = σ/√n. In this activity, we did not begin with a normal distribution. However, compare the means for the samples of 5, 10, and 25 with the overall mean of the word lengths. Then compare the standard deviations with the standard deviation of all the word lengths. Do these formulas appear to hold despite the population of word lengths being obviously non-normal? Explain.
j.
This brings us to the Central Limit Theorem (CLT) for Sample Means: Choose a simple random sample of size n from any population, regardless of the original shape of the distribution, with mean μ and finite standard deviation σ. When n is large, the sampling distribution of the sample mean x̄ is approximately normal with mean ______ and standard deviation ________.
Note: The statement “when n is large” seems a bit ambiguous. A good rule of thumb is that the sample size should be at least 30, as we can see in the dotplots above. When the sample sizes were small, i.e. 5 and 10, the plots were still quite right skewed.

3. Applying the CLT for Sample Means
Whenever we have a normal distribution, we are able to use normal tables and normal calculations to find probabilities. Due to the CLT, we can use normal calculations to determine probabilities about sample means drawn from large samples.
a. First, we need to recall how to standardize scores, that is, find z-scores. State the formula for z-scores that you learned in Math 3. Recall that whenever we standardize values, we take the estimate (or statistic) minus the corresponding parameter and divide that difference by the corresponding standard deviation. We will standardize sample means in the same way. Substitute the statistics and parameters for sample means into the z-score formula to obtain our standardization formula for sample means.
b. Consider an IQ test with scores that vary according to a normal distribution with a mean of 100 and a standard deviation of 15. Find the probability that a single person scores higher than 110.
c. Suppose you take a sample of 20 individuals who took this IQ test. What are the mean and standard deviation of the sampling distribution of the average score of this size sample?
d.
Find the probability that the mean score of these 20 people is greater than 110.
e. Find the probability that the mean score of these 20 people is between 95 and 110.
f. How would you expect your answers to parts d and e to differ if the sample size were 50 instead of 20?
g. Calculate these probabilities and comment on your conjecture.
h. Which of your answers, if any, to parts b, c, d, e, and g would be affected if the IQ scores were not normally distributed? Explain.

The Gettysburg Address

Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.

But, in a larger sense, we cannot dedicate -- we can not consecrate -- we can not hallow -- this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced.
It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government of the people, by the people, for the people, shall not perish from the earth.

Gettysburg Address Word List

Number  Word  Length
001  Four  4
002  score  5
003  and  3
004  seven  5
005  years  5
006  ago.  3
007  our  3
008  fathers  7
009  brought  7
010  forth  5
011  upon  4
012  this  4
013  continent  9
014  a  1
015  new  3
016  nation:  6
017  conceived  9
018  in  2
019  Liberty,  7
020  and  3
021  dedicated  9
022  to  2
023  the  3
024  proposition  11
025  that  4
026  all  3
027  men  3
028  are  3
029  created  7
030  equal.  5
031  Now  3
032  we  2
033  are  3
034  engaged  7
035  in  2
036  a  1
037  great  5
038  civil  5
039  war,  3
040  testing  7
041  whether  7
042  that  4
043  nation,  6
044  or  2
045  any  3
046  nation  6
047  so  2
048  conceived  9
049  and  3
050  so  2
051  dedicated,  9
052  can  3
053  long  4
054  endure.  6
055  We  2
056  are  3
057  met  3
058  on  2
059  a  1
060  great  5
061  battlefield  11
062  of  2
063  that  4
064  war.  3
065  We  2
066  have  4
067  come  4
068  to  2
069  dedicate  8
070  a  1
071  portion  7
072  of  2
073  that  4
074  field  5
075  as  2
076  a  1
077  final  5
078  resting  7
079  place  5
080  for  3
081  those  5
082  who  3
083  here  4
084  gave  4
085  their  5
086  lives  5
087  that  4
088  that  4
089  nation  6
090  might  5
091  live.  4
092  It  2
093  is  2
094  altogether  10
095  fitting  7
096  and  3
097  proper  6
098  that  4
099  we  2
100  should  6
101  do  2
102  this.  4
103  But  3
104  in  2
105  a  1
106  larger  6
107  sense,  5
108  we  2
109  cannot  6
110  dedicate,  8
111  we  2
112  cannot  6
113  consecrate,  10
114  we  2
115  cannot  6
116  hallow  6
117  this  4
118  ground.  6
119  The  3
120  brave  5
121  men,  3
122  living  6
123  and  3
124  dead,  4
125  who  3
126  struggled  9
127  here  4
128  have  4
129  consecrated  11
130  it,  2
131  far  3
132  above  5
133  our  3
134  poor  4
135  power  5
136  to  2
137  add  3
138  or  2
139  detract.  7
140  The  3
141  world  5
142  will  4
143  little  6
144  note,  4
145  nor  3
146  long  4
147  remember  8
148  what  4
149  we  2
150  say  3
151  here,  4
152  but  3
153  it  2
154  can  3
155  never  5
156  forget  6
157  what  4
158  they  4
159  did  3
160  here.  4
161  It  2
162  is  2
163  for  3
164  us  2
165  the  3
166  living,  6
167  rather,  6
168  to  2
169  be  2
170  dedicated  9
171  here  4
172  to  2
173  the  3
174  unfinished  10
175  work  4
176  which  5
177  they  4
178  who  3
179  fought  6
180  here  4
181  have  4
182  thus  4
183  far  3
184  so  2
185  nobly  5
186  advanced.  8
187  It  2
188  is  2
189  rather  6
190  for  3
191  us  2
192  here  4
193  to  2
194  be  2
195  dedicated  9
196  to  2
197  the  3
198  great  5
199  task  4
200  remaining  9
201  before  6
202  us,  2
203  that  4
204  from  4
205  these  5
206  honored  7
207  dead  4
208  we  2
209  take  4
210  increased  9
211  devotion  8
212  to  2
213  that  4
214  cause  5
215  to  2
216  which  5
217  they  4
218  gave  4
219  the  3
220  last  4
221  full  4
222  measure  7
223  of  2
224  devotion,  8
225  that  4
226  we  2
227  here  4
228  highly  6
229  resolve  7
230  that  4
231  these  5
232  dead  4
233  shall  5
234  not  3
235  have  4
236  died  4
237  in  2
238  vain,  4
239  that  4
240  this  4
241  nation,  6
242  under  5
243  God,  3
244  shall  5
245  have  4
246  a  1
247  new  3
248  birth  5
249  of  2
250  freedom,  7
251  and  3
252  that  4
253  government  10
254  of  2
255  the  3
256  people,  6
257  by  2
258  the  3
259  people,  6
260  for  3
261  the  3
262  people,  6
263  shall  5
264  not  3
265  perish  6
266  from  4
267  the  3
268  earth.  5

CONFIDENCE INTERVALS LEARNING TASKS:
Statistical inference provides methods for drawing conclusions about a population from sample data. The two major types of statistical inference are confidence intervals and tests of significance. We will focus only on understanding and using confidence intervals in Math 4. Tests of significance are a major focus of Statistics courses.

1. The Empirical Rule
a. Suppose that the mean SATM score for seniors in Georgia was 550 with a standard deviation of 50 points. Consider a simple random sample of 100 Georgia seniors who take the SAT. Describe the distribution of the sample mean scores.
b. What are the mean and standard deviation of this sampling distribution?
c. Use the Empirical Rule to determine between what two scores 68% of the data falls, 95% of the data falls, and 99.7% of the data falls.
For the 95% interval, this means that in 95% of all samples of 100 students from this population, the mean score for the sample will fall within ___ standard deviations of the true population mean or ____ points from the mean.

2. Confidence Intervals
In the above problem, we took the mean and added/subtracted a certain number of standard deviations. That is, we calculated x̄ ± 2σ_x̄ = x̄ ± 2(σ/√n) for the 95% interval and x̄ ± 3σ_x̄ = x̄ ± 3(σ/√n) for the 99.7% interval. The interval of numbers found, e.g. (540, 560), is called a 95% confidence interval for the population mean.
Above, we knew the population mean, but in practice, we often do not. So we take samples and create confidence intervals as a method of estimating the true value of the parameter. When we find a 95% confidence interval, we believe with 95% confidence that the true parameter falls within our interval. However, we must accept that 5% of all samples will give intervals which do not include the parameter.
Every confidence interval takes the same shape: estimate ± margin of error. In the interval (540, 560), the margin of error is 10.
The margin of error has two main components: the number of standard deviations from the mean (i.e. the z-score) and the standard deviation. (Margin of error = z·σ.) Because we do not usually know the details of a population parameter (e.g. mean and standard deviation), we must use estimates of these values. So our margin of error becomes m = z(σ_estimate). Therefore, the confidence interval becomes estimate ± margin of error = estimate ± z(σ_estimate).
The z-score used in the confidence interval depends on how confident one wants to be. There are a few common levels of confidence used in practice: 90%, 95%, and 99%.
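The arithmetic behind the (540, 560) interval can be checked with a short script. The following is a minimal Python sketch, not part of the original task. The function name z_interval is an illustrative assumption, and the inputs are the SATM values from the Empirical Rule problem (mean 550, standard deviation 50, samples of 100) together with the Empirical Rule multipliers 2 and 3.

```python
from math import sqrt


def z_interval(estimate, sigma, n, z):
    """Return (low, high, margin) for estimate ± z * sigma / sqrt(n).

    Hypothetical helper for illustration; z is the number of standard
    deviations of the sampling distribution to go out from the estimate.
    """
    margin = z * sigma / sqrt(n)
    return estimate - margin, estimate + margin, margin


# Values taken from the text: mean 550, sigma 50, n = 100.
print(z_interval(550, 50, 100, 2))  # (540.0, 560.0, 10.0)  about 95%
print(z_interval(550, 50, 100, 3))  # (535.0, 565.0, 15.0)  about 99.7%
```

Changing n in the call shows how the margin of error shrinks as the sample size grows while the multiplier stays fixed.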
The Empirical Rule provides estimates for the amount of data within specified numbers of standard deviations, and therefore, can help us find approximate intervals for being 68%, 95%, and 99.7% confident that we have included the true population parameter. Let’s find closer estimates for the number of standard deviations from the mean within which certain percentages of data lie.
a. Within how many standard deviations of the mean would one locate the middle 95% of the data? (Hint: Draw a picture and use the normal table or invNorm on your calculator.)
b. Within how many standard deviations of the mean would one locate the middle 90% of the data? (Hint: Draw a picture and use the normal table or invNorm on your calculator.)
c. For the common confidence levels, then, we have the following z-scores, called z*. Complete the following table using your answers above.
Confidence level    90%      95%      99%
z*                  _____    _____    2.576
For any confidence intervals you are expected to compute by hand in Math 4, you will use these z* values. Thus, our final form of the confidence interval is estimate ± z*(σ_estimate).
You will continue investigating confidence intervals and, specifically, margin of error, through the next two activities.

3. President’s Approval Ratings: Part 1 (Confidence Intervals for Proportions)
a. In the article at the beginning of this unit, Bush’s approval rating was 41 percent with a margin of error of plus or minus 3 points. Write this statement as a confidence interval.
b. The article does not state the confidence level for the ratings.
Use the margin of error to determine the standard error of the estimate (standard deviation) if a 95% confidence level was used. What is the standard error if a 99% confidence level was used?
c. What is the formula for calculating the standard deviation of sample proportions? In practice, we do not know the true population parameter. Instead, we must estimate using the sample proportion. Rewrite this equation using the symbol for the sample proportion. This is called the standard error of the sample proportion.
SE_p̂ = ____________
d. If the Gallup poll in the article used a 95% confidence level, what was the sample size? What if the poll used a 99% confidence level? (Use the answers to b and c. Recall that the sample proportion was 41%.) Refer to the article. How many people were polled? What does that tell you about the confidence level employed?
e. Assume the pollsters were undecided about how many citizens to include in their poll. However, they knew that they wanted the margin of error to be 3 percent or less and they wanted to be 95% confident in their results. That is, they wanted z*·√(p̂(1 − p̂)/n) ≤ m, or 1.96·√(p̂(1 − p̂)/n) ≤ 0.03. What value of p will always give the greatest margin of error? Make a conjecture. Explain your reasoning. Compare with a neighbor. Call this value p*. Use the value of p* you found to determine the sample size needed by the pollsters.
f. Increase the sample sizes you found in part e by 500. Still using your p*, determine the margin of error in both the 95% and 99% cases for intervals with these new sample sizes. How different is the margin of error? (Margin of error = z*·SE_estimate.) What do you notice?
g. Decrease the sample sizes you found in part e by 500. Still using your p*, determine the margin of error in both the 95% and 99% cases for intervals with these new sample sizes. How different is the margin of error?
h. Finally, still using your p*, determine the margin of error in both the 95% and 99% cases for intervals with sample sizes of 200 people.
How different is the margin of error? (Margin of error = z*·SE_estimate.) What do you notice?
i. Based on your answers to f, g, and h, how does the margin of error behave when the sample size changes? Use a graph of y = 1/√x to support your answer. (How does y = 1/√x relate to our problem?)
j. Look back through these examples. How does the confidence level affect the margin of error? Why does this make sense in the context of confidence intervals?

4. President’s Approval Ratings: Part 2 (Confidence Intervals for Proportions)
As we saw in the last problem, the form of a confidence interval for the population proportion p is estimate ± z*(σ_estimate), or p̂ ± z*·√(p̂(1 − p̂)/n).
a. Suppose you are planning to conduct a presidential approval survey in your school. You are planning to construct a 95% confidence interval of the proportion of students who approve of the job the president is doing and you want a margin of error of no more than 5 points. Determine the minimum number of people you need to survey.
b. Suppose you surveyed 50 students and 34 said that they approve of how the president is doing. Construct 90%, 95%, and 99% confidence intervals for the true proportion of students in your school who approve of how the president is doing. Be sure to specifically calculate the margin of error.
c. Explain, in context, what it means to be 95% confident. Are you sure that the true proportion is within this interval? Explain.
d. Based on the results above, write a sentence or two about how the students at the school view how well the president is doing his job. Back up any conclusions.
e. We can also use a calculator to construct confidence intervals.
On your TI calculator, use the following keystrokes: STAT – TESTS – 1-PropZInt. What do you think 1-PropZInt means? Using the information from b, find the 95% confidence interval. How does this interval compare with the one you found previously?
f. Suppose you know that a sample of 350 students was collected and that 58% gave the president a positive approval rating. Explain how to use the calculator to find a 90% confidence interval for the true proportion of students at the school who would give a positive approval rating. Find the confidence interval.
g. The calculator makes finding confidence intervals quite easy; however, you do not always obtain all the information you want from the calculator output. For example, you are not given the margin of error in the output. How can you use the output to determine the margin of error? Illustrate, using an 88% confidence interval for the survey in this problem (x = 34, n = 50).
h. Let’s do a simulation. Suppose that 55% of the students in your school have a positive opinion of the president. We are going to generate a sample of 50 of those students. Input the following into your calculator: randBin(1, .55, 50)→L1:sum(L1). Why is it appropriate to simulate this situation with a binomial distribution?
The 1 in the command is the number of trials for each simulated student, so each value stored is either 1 (a success) or 0. The 55% is the probability of success. 50 is the sample size. The values will be stored in List 1, but the calculator will also sum the values in List 1, returning only this sum. What does the sum represent?
i. Now, construct a 95% confidence interval for the true proportion of positive approval ratings. Does 55% fall in your interval? Did you expect this? Explain.
j. Repeat simulating and constructing confidence intervals four more times. List your confidence intervals below.
k.
As a class, make a tally of how many confidence intervals contained 55% and how many did not. Are you surprised at the result? Explain.
l. The margin of error in the above confidence intervals is quite large. Name two ways we could reduce the margin of error.

5. Racing Hearts (Confidence Intervals for Means)
For this next activity, we will turn our attention to finding confidence intervals for population means.
a. State the estimate and standard deviation for estimating population means from sample means.
b. Recall that the form of a confidence interval is estimate ± z*(σ_estimate). Write the statement of the confidence intervals for means by substituting the values from above.
c. Because we generally do not know the true mean and standard deviation, we must estimate them. What is the estimate for μ? For the standard deviation, instead of σ, we use the sample standard deviation, s. Make the appropriate substitution in the confidence interval from above.
d. When we do not know the population standard deviation and instead use s/√n, the distribution is no longer exactly a normal distribution. Instead, it takes on a t-distribution. The t-distribution is similar to the normal. To see this, graph the following. (The commands are under the DISTR menu.)
Set your window to be the following: X[-5, 5] and Y[0, .5].
Y1 = normalpdf(X)
Y2 = tpdf(X, 2)
Y3 = tpdf(X, 30)
Y2 would be the t-distribution for a small sample size, whereas Y3 is the t-distribution for a sample size of 31. What do you notice about the graphs of Y1, Y2, and Y3?
For the purpose of this course, we need to recognize that we are using a t-distribution. This alters our confidence interval slightly to be x̄ ± t*·(s/√n). We will use technology to calculate these confidence intervals.
e. Let’s collect some data!
We want to estimate the average resting heart rate of a typical senior. Take your pulse for a full 60 seconds. Write it down. Form a group of five students and record each student's pulse. Input the data into List 1 of your calculator.
f. Using the calculator, construct a 95% confidence interval for the average resting heart rate. To do this, use the following keystrokes: STAT – TESTS – TInterval. Because we have data, choose Data. Make sure that List 1 is indicated, with a frequency of 1 and a C-Level of 95. Record the mean, the standard deviation, and the confidence interval.
g. Now, combine with another group of 5. Repeat part f with all 10 data values.
h. As a class, collect at least 30 resting heart rates. Enter them into L1 and calculate the confidence interval. Write a sentence or two reporting what you believe to be true about the average resting heart rate of seniors in your school.
i. Suppose you were planning a larger study of heart rates. Based on these results, you estimated the standard deviation to be 4.9. You would like to form a 99% confidence interval and have a margin of error of no more than 3 beats from the true mean. How many students must you include in your study? (Use z* instead of t*.)
j. Suppose you did collect this data from 200 students. The average resting heart rate was 58.6 beats per minute and the standard deviation was 4.8. Use the calculator to find a 99% confidence interval for the mean resting heart rate of students at your school. Interpret your results in the context of the problem. (Note: For this problem, we have Stats instead of Data. Make the appropriate change in the TInterval screen.)

6.
Synthesis of Confidence Intervals
Write a brief review sheet or notes that you would give to an absent classmate outlining the major ideas about confidence intervals and how to construct them.
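
The calculator steps in tasks 4 and 5 can also be cross-checked away from the calculator. What follows is a minimal Python sketch, not part of the original tasks, that uses only the standard library. The function names (`z_star`, `prop_z_interval`, `margin_from_interval`, `sample_size_for_mean`, `coverage`) are hypothetical and chosen for illustration. It assumes the usual one-proportion z interval, which is what 1-PropZInt is understood to compute, and it uses z* for the sample-size calculation as the task directs. The standard library has no t critical value, so TInterval is not reproduced here.

```python
import math
import random
from statistics import NormalDist


def z_star(conf_level):
    """Critical value z* that leaves (1 - conf_level) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)


def prop_z_interval(x, n, conf_level):
    """One-proportion z interval: p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    margin = z_star(conf_level) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def margin_from_interval(lower, upper):
    """The margin of error is half the width of the interval."""
    return (upper - lower) / 2


def sample_size_for_mean(sigma, margin, conf_level):
    """Smallest n satisfying z* * sigma / sqrt(n) <= margin."""
    return math.ceil((z_star(conf_level) * sigma / margin) ** 2)


def coverage(p_true, n, conf_level, trials, seed=0):
    """Fraction of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each simulated student is a success with probability p_true,
        # mirroring randBin(1, p_true, n) followed by sum(L1).
        x = sum(rng.random() < p_true for _ in range(n))
        lower, upper = prop_z_interval(x, n, conf_level)
        hits += lower <= p_true <= upper
    return hits / trials


if __name__ == "__main__":
    lower, upper = prop_z_interval(34, 50, 0.95)
    print("95% interval:", round(lower, 4), round(upper, 4))
    print("margin of error:", round(margin_from_interval(lower, upper), 4))
    print("students needed:", sample_size_for_mean(4.9, 3, 0.99))
    print("share of intervals containing 0.55:", coverage(0.55, 50, 0.95, 1000))
```

Changing the `conf_level` argument (for example to 0.88 or 0.90) gives the other intervals the tasks ask about, and the `coverage` result can be compared with the class tally of intervals that contained 55%.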