2016 version
AP Statistics Chapter 2 – Modeling Distributions of Data
Chapter 2 Test Standards: Here’s what’s on the test, question by question:

Multiple Choice Section:
#1. I can use percentiles to locate individual values within distributions of data.
#2. I can correctly interpret a percentile rank in the context of a given situation.
#3. I can use z-scores to make comparisons and form conclusions based on those z-score values.
#4. I can interpret a Normal probability plot to assess normality of a distribution of data.
#5. I can describe the effect of adding, subtracting, multiplying by, or dividing by a constant on the shape, center, and spread of a distribution of data.
#6. I can look at different dot plots displaying distributions of data and determine which one is best approximated by a normal distribution.
#7. I can use the standard Normal distribution (z-scores) to calculate the proportion of values in a specified interval (< or >).
#8. I can use the standard Normal distribution (z-scores) to calculate the proportion of values in a specified interval (< or >).
#9. I can interpret a cumulative relative frequency graph and calculate the IQR for a distribution using that graph.
#10. I can describe the effect of adding, subtracting, multiplying by, or dividing by a constant on the values from a distribution of data and determine which of those values change and which do not.
#11. I can use the standard Normal distribution to determine a z-score from a percentile or quartile value (via INVNORM).
#12. I can use the standard Normal distribution to determine a z-score from a percentile or quartile value (via INVNORM).

Free Response Section:
Free Response Question #1.
I can use z-scores to make comparisons and form conclusions based on those z-score values.
I can use the standard Normal distribution to determine a z-score from a percentile or quartile value (via INVNORM).
Free Response Question #2.
I can use the standard Normal distribution (z-scores) to calculate the proportion of values in a specified interval (< or >).
I can use the standard Normal distribution (z-scores) to calculate the proportion of values in a specified interval (for values ‘between’).
I can use the standard Normal distribution (z-scores) to calculate the probability of an event for values in a specified interval (< or >).

Day #1: Introduction to Percentile; Describing Location in a Distribution
I can use percentiles to locate individual values within distributions of data.
During this lesson, we will learn how to describe the ‘location’ of a value in a distribution. Another way to say this is that we are going to ‘measure position’.
Vocabulary: Percentile (measuring relative location/position of data) – the ‘pth’ percentile of a distribution is the value with ‘p’ percent of the observations less than it.

Example: Wins in Major League Baseball
The stemplot below shows the number of wins for each of the 30 Major League Baseball teams in 2009.
 5 | 9
 6 | 2455
 7 | 00455589
 8 | 0345667778
 9 | 123557
10 | 3
Key: 5|9 represents a team with 59 wins.

Problem: Find the percentiles for the following teams:
(a) The Colorado Rockies, who won 92 games.
(b) The New York Yankees, who won 103 games.
(c) The Kansas City Royals and Cleveland Indians, who both won 65 games.

Practice Multiple Choice Question:
#1. The dotplot below displays the total number of miles that the 28 residents of one street in a certain community traveled to work in one five-day work week. Which of the following is closest to the percentile rank of a resident from this street who traveled 85 miles to work that week?
a. 60
b. 70
c. 75
d. 80
e. 85

EXAMPLE: Percentiles… continued
#1. Using the definition of percentile that says “the percent of observations strictly below the given value”, what percentile rank is a female who has a temperature of 98.0 degrees F?
#2. Using the definition of percentile that says “the percent of observations at or below the given value”, what percentile rank is a female who has a temperature of 98.0 degrees F?
#3. Using the definition of percentile that says “the percent of observations strictly below the given value”, what percentile rank is a male who has a temperature of 97.4 degrees F?
#4. Using the definition of percentile that says “the percent of observations at or below the given value”, what percentile rank is a male who has a temperature of 97.4 degrees F?
#5. State the temperature of a female that would rank at the 64th percentile using the definition: “the percent of observations strictly below the given value.”
#6. State the temperature of a male that would rank at the 93rd percentile using the definition: “the percent of observations at or below the given value.”

I can interpret a cumulative relative frequency graph.
We will now look at a graph that helps us determine percentile ranks. This graph is known as a ‘cumulative relative frequency graph.’ Some textbooks refer to it as an ‘ogive.’

Example: State Median Household Incomes
Here is a table showing the distribution of median household incomes for the 50 states and the District of Columbia.

Median Income                Relative          Cumulative   Cumulative Relative
($1000s)        Frequency    Frequency         Frequency    Frequency
35 to < 40       1            1/51 = 0.020      1            1/51 = 0.020
40 to < 45      10           10/51 = 0.196     11           11/51 = 0.216
45 to < 50      14           14/51 = 0.275     25           25/51 = 0.490
50 to < 55      12           12/51 = 0.236     37           37/51 = 0.725
55 to < 60       5            5/51 = 0.098     42           42/51 = 0.824
60 to < 65       6            6/51 = 0.118     48           48/51 = 0.941
65 to < 70       3            3/51 = 0.059     51           51/51 = 1.000

Here is the cumulative relative frequency graph for the income data. The point at (50, 0.49) means that 49% of the states had median household incomes less than $50,000. The point at (55, 0.725) means that 72.5% of the states had median household incomes less than $55,000.
Thus, 72.5% - 49% = 23.5% of the states had median household incomes between $50,000 and $55,000, since the cumulative relative frequency increased by 0.235. Due to rounding error, this value is slightly different from the relative frequency for the 50 to < 55 category.

Problem: Use the cumulative relative frequency graph for the state income data to answer each question.
(a) At what percentile is California, with a median income of $57,445?
(b) Estimate and interpret the first quartile of this distribution.
(c) Between what two values would the MIDDLE 50% lie?
(d) What is the IQR of the data?
(e) What percentile rank would a median household income of $45,000 be?

Day #2: Review of Percentiles and Describing Location in a Distribution; Introduction of Z-scores
I can use percentiles to locate individual values within distributions of data.
A small AP class has a test and the test scores are the following:
54 82 58 84 60 89 68 94 72 96 78 97
What is a percentile rank a measure of?
What is the percentile rank of the person who scored 82%? Interpret this rank.
Which number(s) are at the 25th, 50th, and 75th percentile ranks?

I can interpret a cumulative relative frequency graph (ogive).
Below is a cumulative relative frequency graph for the lengths, in minutes, of 200 songs recorded by the Rolling Stones.
What are the median and interquartile range of song lengths? Draw lines on the graph to show how you arrived at your answers.
What is the 90th percentile?
What percentile would 5.5 minutes rank at?

Introduction of Z-Scores: How to compare an ‘apple’ to an ‘orange’: z-scores!
I can find the standardized value (z-score) of an observation and interpret z-scores in context.
If ‘x’ is an observation from a distribution that has a mean µ and standard deviation σ, the standardized value of x is:
z = (x - µ) / σ
A standardized value is often called a ________________.
A _________________ tells us how many standard deviations the original observation falls away from the mean, and in which direction.
When we ‘standardize’ a variable with a normal distribution, it produces a new variable that has what we call the:

Example: Wins in Major League Baseball
In 2009, the mean number of wins was 81 with a standard deviation of 11.4 wins.
Problem: Find and interpret the z-scores for the following teams.
(a) The New York Yankees, with 103 wins.
(b) The New York Mets, with 70 wins.

We also use z-scores to give data a common scale. Without looking back at the previous page, try to recall the formula for a z-score:

EXAMPLE: Home run kings
The single-season home run record for Major League Baseball has been set just three times since Babe Ruth hit 60 home runs in 1927. Roger Maris hit 61 in 1961, Mark McGwire hit 70 in 1998, and Barry Bonds hit 73 in 2001. In an absolute sense, Barry Bonds had the best performance of these four players, since he hit the most home runs in a single season. However, in a relative sense this may not be true. Baseball historians suggest that hitting a home run has been easier in some eras than others. This is due to many factors, including quality of batters, quality of pitchers, hardness of the baseball, dimensions of ballparks, and possible use of performance-enhancing drugs. To make a fair comparison, we should see how these performances rate relative to other hitters during the same year.
Problem: Compute the standardized scores for each performance. Which player had the most outstanding performance relative to his peers?

Year   Player         HR   Mean   SD     z-score
1927   Babe Ruth      60    7.2    9.7
1961   Roger Maris    61   18.8   13.4
1998   Mark McGwire   70   20.7   12.7
2001   Barry Bonds    73   21.4   13.2

EXAMPLE #1: SAT vs. ACT
Student A takes the SAT in math and scores a 680, while Student B takes the ACT and scores a 27 on the same portion of the test.
The scores of the SAT math test are N(500, 100) and the scores of the ACT math test are N(18, 6). Which student (A or B) has the higher score?

EXAMPLE #2: SAT vs. ACT
Student A takes the SAT in reading and scores a 645, while Student B takes the ACT and scores a 26 on the same portion of the test. The scores of the SAT reading test are N(510, 85) and the scores of the ACT reading test are N(19, 5). Which student (A or B) has the higher score?

MULTIPLE CHOICE PRACTICE:
#1. One of the values in a normal distribution is 58 and its corresponding z-score is 2.08. If the mean of the distribution is 53, what is the standard deviation of the distribution?
a. 5
b. 0.416
c. 2.40
d. -2.40
e. -0.416
#2. The weight of adult male grizzly bears living in the wild in the continental United States is approximately normally distributed with a mean of 500 pounds and a standard deviation of 50 pounds. The weight of adult female grizzly bears is approximately normally distributed with a mean of 300 pounds and a standard deviation of 40 pounds. Approximately what would be the weight of a female grizzly bear with the same standardized score (z-score) as a male grizzly bear with a weight of 530 pounds?
a. 276 pounds
b. 324 pounds
c. 330 pounds
d. 340 pounds
e. 530 pounds

Day #3: Transforming Data
I can describe the effect of adding, subtracting, multiplying by, or dividing by a constant on the shape, center, and spread of a distribution of data.

EXAMPLE: Test Scores
Here are a graph and table of summary statistics for a sample of 30 test scores. The maximum possible score on the test was 50 points.

         n    x̄      sx     Min   Q1   M    Q3   Max   IQR   Range
Score    30   35.8   8.17   12    32   37   41   48    9     36

Suppose that the teacher was nice and added 5 points to each test score. How would this change the shape, center, and spread of the distribution?
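One way to explore this kind of question is a small Python sketch. The nine scores below are hypothetical (the 30 raw scores from the example are not listed in this packet), so only the pattern matters: adding a constant should shift the mean but leave the standard deviation and range alone, while multiplying by a constant should scale all three.

```python
from statistics import mean, stdev

# Hypothetical scores out of 50 (NOT the 30 scores from the worksheet).
scores = [12, 28, 32, 35, 37, 38, 41, 44, 48]

shifted = [x + 5 for x in scores]   # add a constant to every score
doubled = [x * 2 for x in scores]   # multiply every score by a constant

for label, data in [("original", scores), ("+5", shifted), ("x2", doubled)]:
    spread = max(data) - min(data)
    print(f"{label:>8}: mean = {mean(data):.2f}, "
          f"sd = {stdev(data):.2f}, range = {spread}")
```

Comparing the three printed rows shows which summary statistics move with each transformation and which stay put.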
Here are graphs and summary statistics for the original scores and the +5 scores:

             n    x̄      sx     Min   Q1   M    Q3   Max   IQR   Range
Score        30   35.8   8.17   12    32   37   41   48    9     36
Score + 5    30   40.8   8.17   17    37   42   46   53    9     36

From both the graph and summary statistics, we can see that the measures of center and measures of position all increased by 5. However, neither the shape nor the spread of the distribution changed.

Suppose that the teacher wanted to convert the original test scores to percents. Since the test was out of 50 points, he should multiply each score by 2 to make them out of 100. Here are graphs and summary statistics for the original scores and the doubled scores.

             n    x̄      sx      Min   Q1   M    Q3   Max   IQR   Range
Score        30   35.8    8.17   12    32   37   41   48     9    36
Score x 2    30   71.6   16.34   24    64   74   82   96    18    72

From the graphs and summary statistics we can see that the measures of center, location, and spread have all doubled, just like the individual observations. But even though the distribution is more spread out, the shape hasn’t changed. It is still skewed to the left with the same clusters and gaps.

EXAMPLE: Taxi Cabs
In 2010, taxi cabs in New York City charged an initial fee of $2.50 plus $2 per mile. In equation form, fare = 2.50 + 2(miles). At the end of a month a businessman collects all of his taxi cab receipts and calculates some numerical summaries. The mean fare he paid was $15.45 with a standard deviation of $10.20. What are the mean and standard deviation of the lengths of his cab rides in miles?

EXAMPLE: Song Lengths
According to these data, the mean song length was 4.23 minutes, and the standard deviation was 1.38 minutes. A music lover who wants to create a mix of songs wants to have 5 seconds of silence between songs, so he needs to add five seconds to the length of each song. He also wants to express the times in seconds, rather than minutes.
Find the mean and standard deviation of the transformed data.
What are the mean and standard deviation of the z-scores of song lengths? Justify your answer.

EXAMPLE: Height, in meters, is measured for each person in a sample. After the data are collected, all the height measurements are converted from meters to centimeters by multiplying each measurement by 100. What statistics will remain the same for both units of measure? Which ones will change?

MULTIPLE CHOICE PRACTICE QUESTION:
Suppose the distribution of a set of scores has a mean of 28 and a standard deviation of 6. If 3 is added to each score, what will be the mean and the standard deviation of the distribution of new scores?
     Mean   Standard Deviation
a.   31     10
b.   31     6
c.   28     10
d.   28     18
e.   28     6

EXAMPLE: FREE RESPONSE QUESTION (from 2007 AP Statistics Exam (Form B) #1)
The Better Business Council of a large city has concluded that students in the city’s schools are not learning enough about economics to function in the modern world. These findings were based on test results from a random sample of 20 twelfth-grade students who completed a 46-question multiple-choice test on basic economic concepts. The data set below shows the number of questions that each of the 20 students in the sample answered correctly.
12 16 18 17 18 33 41 44 38 35
19 36 19 13 43 8 16 14 10 9
a. Display these data in a stemplot.
b. Use your stemplot from part (a) to describe the main features of this score distribution.
c. Why would it be misleading to report only a measure of center for this score distribution?

Day #4: Intro to Density Curves
I can approximately locate the median (equal-areas point) and the mean (balance point) on a density curve.
Density Curves – 3 things to know about them:
A density curve describes the overall pattern of a distribution.
It is always on or above the horizontal axis.
The area underneath it is exactly 1.
EXAMPLE OF A DENSITY CURVE: Imagine a histogram underneath the density curve shown:

EXAMPLE #1: “Finding Means and Medians.”
The figures below display three density curves, each with three points indicated. At which of these points on each curve do the mean and the median fall?
a. Graph A:
   i. Describe the shape:
   ii. Mode:
   iii. Median:
   iv. Mean:
b. Graph B:
   i. Describe the shape:
   ii. Mode:
   iii. Median:
   iv. Mean:
c. Graph C:
   i. Describe the shape:
   ii. Mode:
   iii. Median:
   iv. Mean:

EXAMPLE: Unusual distribution
Describe the SHAPE of this density curve. Mark with vertical lines the mean, median, and mode.

MULTIPLE CHOICE PRACTICE:
For the following histogram, what is the proper ordering of the mean, median, and mode? Note that the graph is NOT numerically precise – only the relative positions are important.
a. I = mean, II = median, III = mode
b. I = mode, II = median, III = mean
c. I = median, II = mean, III = mode
d. I = mode, II = mean, III = median
e. I = mean, II = mode, III = median

HOW TO THINK OF THE MEAN & MEDIAN:
Median of density curves –
Mean of density curves –

EXAMPLE: “Biking Accidents”
Accidents on a level, 3 mile bike path occur uniformly along the length of the path. The figure below displays the density curve that describes the uniform distribution of accidents.
a. Explain why this curve satisfies the two requirements for a density curve.
b. The proportion of accidents that occur in the first mile of the path is the area under the density curve between 0 miles and 1 mile. What is the area?
c. Sue’s property adjoins the bike path between the 0.8 mile mark and the 1.1 mile mark. What proportion of accidents happen in front of Sue’s property? Explain.
d. What is the mean of the density curve pictured above?
e. What is the median?
f. What proportion of accidents occur between the 0.5 mile mark and 1.2 mile mark OR between the 2.7 mile mark and 3.0 mile mark?
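The area calculations in the bike path example can be sketched in Python. This is only an illustration under a stated assumption: since the figure is not reproduced here, the density is taken to be a flat line at height 1/3 from mile 0 to mile 3, which is the height that makes the total area equal 1. The helper name `uniform_area` is hypothetical.

```python
from fractions import Fraction

LOW, HIGH = 0, 3                   # bike path runs from mile 0 to mile 3
HEIGHT = Fraction(1, HIGH - LOW)   # assumed constant height: area must be 1

def uniform_area(a, b):
    """Area under the uniform density between a and b (clipped to the path)."""
    a, b = max(a, LOW), min(b, HIGH)
    return max(b - a, 0) * HEIGHT

total = uniform_area(0, 3)                                  # whole path
first_mile = uniform_area(0, 1)                             # part (b)
sue = uniform_area(Fraction(8, 10), Fraction(11, 10))       # part (c)
either = (uniform_area(Fraction(1, 2), Fraction(12, 10))    # part (f): the two
          + uniform_area(Fraction(27, 10), 3))              # intervals do not overlap

for name, area in [("total", total), ("first mile", first_mile),
                   ("Sue's property", sue), ("either stretch", either)]:
    print(f"{name}: {area} = {float(area):.3f}")
```

Because the curve is a rectangle, every area is just width times height, and exact fractions avoid rounding noise. The areas for the two intervals in part (f) can be added only because the intervals do not overlap.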
Examples of how density curves would model various distributions:

REVIEW EXAMPLE: 400 Meter Sprint
Both male and female athletes competed in a 400 meter sprint. The male times were N(51.02, 0.25) and the female times were N(51.95, 0.35). (All times are in seconds.) Sarah ran the sprint in 51.89. Josh ran the sprint in 51.53. Which athlete did better?

Day #5: Normal Curves & Distributions
Symbols for mean and standard deviation:
Normal Curves -- _________________ in shape.
A ‘normal curve’ describes a ________________ distribution.
Characteristics:
______________________________
______________________________
______________________________
The mean is located at the same place on the curve as the _________________.

Normal curves can come in different shapes & sizes, as shown below. These two normal curves show the mean and standard deviation. The standard deviation σ controls the spread of a Normal curve. Curves with larger standard deviations are MORE spread out.
Normal Distribution: is described by a ‘normal’ curve. Any particular Normal distribution is specified by TWO numbers:
The mean of a Normal distribution is at the center of the symmetric Normal curve.
The standard deviation is the distance from the center to the inflection points on either side.
We abbreviate the Normal distribution as follows:

Let’s take a look at the standard normal distribution:

I can use the 68–95–99.7 rule to estimate the percent of observations from a Normal distribution that fall in an interval involving points one, two, or three standard deviations on either side of the mean.
68-95-99.7% Rule, AKA the ‘Empirical Rule’:
_____ of all observations fall within _______ from ______.
_____ of all observations fall within _______ from ______.
_____ of all observations fall within _______ from ______.

EXAMPLE: Potato Chips
The distribution of weights of 9 ounce bags of a particular brand of potato chips is approximately Normal with a mean µ = 9.12 ounces and a standard deviation of σ = 0.05 ounces.
a. Shade the region that is TWO standard deviations above the mean.
b. What percent of bags weigh less than 9.02 ounces?
c. Between what weights do the middle 68% of bags fall?
d. What percent of 9 ounce bags of this brand of potato chips weigh between 8.97 and 9.17 ounces?
e. A bag that weighs 9.07 ounces is at what percentile in this distribution?

EXAMPLE: Batting Averages
The histogram below shows the distribution of batting average (proportion of hits) for the 432 Major League Baseball players with at least 100 plate appearances in the 2009 season. The smooth curve shows the overall shape of the distribution.
Describe the distribution above.
Describe the shape of the histogram.

Example: Batting Averages
In the previous alternate example about batting averages for Major League Baseball players in 2009, the mean of the 432 batting averages was 0.261 with a standard deviation of 0.034. Suppose that the distribution is exactly Normal with μ = 0.261 and σ = 0.034.
Problem:
(a) Sketch a Normal density curve for this distribution of batting averages. Label the points that are 1, 2, and 3 standard deviations from the mean.
(b) What percent of the batting averages are above 0.329? Show your work.
(c) What percent of the batting averages are between 0.227 and 0.295?
(d) What percent of the batting averages are greater than 0.159?
(e) What percent of the batting averages are between 0.193 and 0.295?

Day #6: Review of Density Curves / Empirical Rule / Standard Normal Calculations
I can approximately locate the median (equal-areas point) and the mean (balance point) on a density curve.
On the density curve below, draw two vertical lines where you think the median and the mean of the distribution are. Label each line, and describe in words what feature of the curve you are using to locate each measure.

MULTIPLE CHOICE Practice: “Cockroaches”
The weights of laboratory cockroaches follow a Normal distribution with mean 80 grams and standard deviation of 2 grams. The figure below is the Normal curve for this distribution of weights.
#1. Point C on this Normal curve corresponds to
a. 84 g
b. 82 g
c. 78 g
d. 76 g
e. 74 g
#2. About what percent of the cockroaches have weights between 76 and 84 grams?
a. 99.7%
b. 95%
c. 68%
d. 47.5%
e. 34%
#3. About what percent of the cockroaches have weights less than 78 grams?
a. 34%
b. 32%
c. 16%
d. 2.5%
e. none of these

I can use the standard Normal distribution to calculate the proportion of values in a specified interval.
Finding Areas under the Standard Normal Curve
#1. How do you use the standard normal table (Table A) to find the area under the standard normal curve to the left of a given z-value?
#2. How do you use the standard normal table (Table A) to find the area under the standard normal curve to the right of a given z-value?
#3. How do you use the standard normal table (Table A) to find the area under the standard normal curve between two given z-values?

MULTIPLE CHOICE PRACTICE:
A normal density curve has which of the following properties?
a. It is symmetric.
b. It has a peak centered above its mean.
c. The spread of the curve is proportional to its standard deviation.
d. All of the properties, (a) to (c), are correct.
e. None of the properties, (a) to (c), is correct.

Examples: Use the “Standard Normal Probabilities Table” (Table A) to find the proportion of observations from a standard normal distribution that satisfies each of the following statements.
#1. z < 1.24
#2. z > -0.23
#3. –1.84 < z < 0.02
#4. –0.34 < z < 0.93

EXTRA PRACTICE: Using Table A (table of standard normal probabilities) or your TI-83 calculator, find the proportion of observations from a standard normal distribution that satisfies each of the following statements. In each case, shade the area under the standard normal curve that is the answer to the question.
a. Z < -2.25
b. -2.25 < Z < 1.77
c. Z > 0.83
d. 2.25 < Z < 1.77

Example: Serving Speed
In the 2008 Wimbledon tennis tournament, Rafael Nadal averaged 115 miles per hour (mph) on his first serves. Assume that the distribution of his first serve speeds is Normal with a mean of 115 mph and a standard deviation of 6 mph.
a. About what proportion of his first serves would you expect to exceed 120 mph?
b. What percent of Rafael Nadal’s first serves are between 100 and 110 mph?

Day #7: Review of Standard Normal Calculations
Multiple Choice Practice
#1. Suppose the distribution of a set of scores has a mean of 28 and a standard deviation of 6. If 3 is added to each score, what will be the mean and the standard deviation of the distribution of new scores?
     Mean   Standard Deviation
a.   31     10
b.   31     6
c.   28     10
d.   28     18
e.   28     6
#2. Two measures of center are marked on the density curve shown.
a. The median is at the solid line and the mean is at the dotted line.
b. The median is at the dotted line and the mean is at the solid line.
c. The mode is at the solid line and the median is at the solid line.
d. The mode is at the solid line and the median is at the dotted line.
e. The mode is at the dotted line and the mean is at the solid line.
#3. A market research company employs a large number of typists to enter data into a computer. The time taken for new typists to learn the computer system is known to have a normal distribution with a mean of 90 minutes and a standard deviation of 18 minutes.
The proportion of typists that take more than two hours to learn the computer system is
a. 0.952
b. 0.548
c. 0.048
d. 0.452
#4. A company produces packets of soap powder labeled “Giant Size 32 Ounces.” The actual weight of soap powder in a box has a normal distribution with a mean of 33 ounces and a standard deviation of 0.7 ounces. What proportion of packets is underweight (i.e., weighs less than 32 ounces)?
a. 0.0764
b. 0.2420
c. 0.7580
d. 0.9236

EXAMPLE: 2002 AP STATISTICS EXAM #3:
There are 4 runners on the New High School team. The team is planning to participate in a race in which each runner runs a mile. The team time is the sum of the individual times for the 4 runners. Assume that the individual times of the 4 runners are all independent of each other. The individual times, in minutes, of the runners in similar races are approximately normally distributed with the following means and standard deviations.

           Mean   Standard Deviation
Runner 1   4.9    0.15
Runner 2   4.7    0.16
Runner 3   4.5    0.14
Runner 4          0.15

Runner 3 thinks that he can run a mile in less than 4.2 minutes in the next race. Is this likely to happen? Explain.

EXAMPLE: 2006 AP STATISTICS EXAM #3:
Golf balls must meet a set of five standards in order to be used in professional tournaments. One of these standards is distance traveled. When a ball is hit by a mechanical device, Iron Byron, with a 10-degree angle of launch, a backspin of 42 revolutions per second, and a ball velocity of 235 feet per second, the distance the ball travels may not exceed 290.7 yards. Manufacturers want to develop balls that will travel as close to the 290.7 yards as possible without exceeding that distance. A particular manufacturer has determined that the distances traveled for the balls it produces are normally distributed with a standard deviation of 2.6 yards. This manufacturer has a new process that allows it to set the mean distance the ball will travel.
If the manufacturer sets the mean distance traveled to 288.5 yards, what is the probability that a ball that is randomly selected for testing will travel too far?

EXAMPLE: GRE EXAMS:
The Graduate Record Examinations are widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the scores of its applicants on the quantitative GRE are approximately Normal with mean = 544 and standard deviation = 103.
#1. Make an accurate sketch of the distribution of these applicants’ GRE scores. Be sure to provide a scale on the horizontal axis.
#2. Use the 68–95–99.7 rule to find the proportion of applicants whose score is between 338 and 853.
#3. What proportion of GRE scores are below 500?
#4. What proportion of GRE scores are above 800?
#5. What proportion of GRE scores are between 558 and 757?

EXAMPLE: IQ SCORES:
Scores on the Wechsler Adult Intelligence Scale (WAIS, a standard ‘IQ test’) for the 20 to 34 age group are approximately normally distributed with μ = 110 and σ = 25. Use the 68-95-99.7% rule to answer these questions.
a. About what percent of people in this age group have scores above 110?
b. About what percent have scores above 160?
c. In what range do the middle 95% of all IQ scores lie?
d. Consider percentile ranks:
   i. If someone’s score were reported as the 16th percentile, about what score would that individual have?
   ii. 84th percentile?
   iii. 97.5th percentile?
e. What percent of people age 20 to 34 have IQ scores above 100?
f. What percent have scores above 150?
g. What percent have a score BETWEEN 68 and 115?

EXAMPLE: 1999 AP Statistics Exam Question #4
A company is considering implementing one of two quality control plans for monitoring the weights of automobile batteries that it manufactures.
If the manufacturing process is working properly, the battery weights are approximately normally distributed with a specified mean and standard deviation.
Quality control plan A calls for rejecting a battery as defective if its weight falls more than 2 standard deviations below the specified mean.
Quality control plan B calls for rejecting a battery as defective if its weight falls more than 1.5 interquartile ranges below the lower quartile of the specified population.
a. What proportion of batteries will be rejected by plan A?
b. What proportion of batteries will be rejected by plan B?

Day #8: ‘Working Backwards’ with Normal Calculations (using INVNorm)
I can use the standard Normal distribution to determine a z-score from a percentile.
I can use Table A to find the percentile of a value from any Normal distribution and the value that corresponds to a given percentile.
Use the “Standard Normal Probabilities Table” (Table A) to find the value z of a standard normal variable that satisfies each of the following conditions. (Use the value of z from Table A that comes closest to satisfying the condition.)
#1. The point z with 30% of the observations falling below it.
#2. The point z with 74% of the observations falling above it.
#3. The 40th percentile.

EXAMPLE: GRE EXAMS:
The Graduate Record Examinations are widely used to help predict the performance of applicants to graduate schools. The range of possible scores on a GRE is 200 to 900. The psychology department at a university finds that the scores of its applicants on the quantitative GRE are approximately Normal with mean = 544 and standard deviation = 103.
Calculate and interpret the 34th percentile of the distribution of applicants’ GRE scores.

EXAMPLE: IQ SCORES:
Scores on the Wechsler Adult Intelligence Scale (WAIS, a standard ‘IQ test’) for the 20 to 34 age group are approximately normally distributed with μ = 110 and σ = 25.
How high an IQ score is needed to be in the highest 25%?

EXAMPLE: Heights of three-year-old females
According to http://www.cdc.gov/growthcharts/, the heights of 3 year old females are approximately Normally distributed with a mean of 94.5 cm and a standard deviation of 4 cm. What is the third quartile of this distribution?

MULTIPLE CHOICE:
The time to complete a standardized exam is approximately normal with a mean of 70 minutes and a standard deviation of 10 minutes. How much time should be given to complete the exam so that 80% of the students will complete the exam in the time given?
a. 84 minutes
b. 78.4 minutes
c. 92.8 minutes
d. 79.8 minutes

EXAMPLE: Basketball
During the 2009-2010 basketball season, the number of points scored in each game by the Boston Celtics was approximately normally distributed with a mean of 99.2 points and a standard deviation of 10.5 points.
a. What is the 33rd percentile of points scored by the Celtics?
b. The mean number of points scored by the Los Angeles Lakers was 101.7. In what proportion of their games did the Celtics score more than the Lakers’ mean score?

EXAMPLE:
#1. Find the proportion of observations from a standard Normal distribution that satisfies -1.51 < z < 0.84. Sketch the Normal curve and shade the area under the curve that is the answer to the question.
#2. What z-score in a Normal distribution has 58% of all scores below it?
#3. What z-score in a Normal distribution has 33% of all scores above it?

EXAMPLE: Hand Sanitizer
A company produces packets of hand sanitizer that are deemed to be underweight if less than 31.5 ounces. The actual weight of the packets of sanitizer has a normal distribution with a mean of 32.8 oz. and a standard deviation of 0.9 oz. What proportion of packets are underweight (i.e., weigh less than 31.5 oz.)?
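For readers without Table A or a TI calculator at hand, the two kinds of Normal calculation in this lesson can be sketched with Python's standard library. This is a minimal illustration rather than the packet's method: `NormalDist.cdf` is used in place of a table lookup or normalcdf (value to proportion), and `NormalDist.inv_cdf` in place of INVNorm (proportion to value). The variable names are arbitrary.

```python
from statistics import NormalDist

# Forward calculation (value -> proportion), using the hand sanitizer example:
# weights are Normal with mean 32.8 oz and standard deviation 0.9 oz.
weights = NormalDist(mu=32.8, sigma=0.9)
z = (31.5 - 32.8) / 0.9            # standardize the cutoff
p_under = weights.cdf(31.5)        # area to the left of 31.5 oz
print(f"z = {z:.2f}, proportion underweight = {p_under:.4f}")

# Backward calculation (proportion -> value), using the exam time question:
# times are Normal with mean 70 minutes and standard deviation 10 minutes.
exam = NormalDist(mu=70, sigma=10)
z80 = NormalDist().inv_cdf(0.80)   # z-score with 80% of the area below it
t80 = exam.inv_cdf(0.80)           # same as 70 + z80 * 10
print(f"z = {z80:.2f}, time needed = {t80:.1f} minutes")
```

Table A answers may differ slightly in the last decimal place because the table rounds z to two decimals before the lookup.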
EXAMPLE: Inflection Points
Consider the ‘inflection points’ of this normal curve and approximate the mean and standard deviation values.

Day #9: Review of ‘Invnorm’/ Assessing Normality

#1. In a large set of data that are approximately normally distributed, d is the value in the data set that has a z-score of 1.50, e is the value of the third quartile, and f is the value of the 80th percentile. What is the correct order from least to greatest for the values of e, f and d?

#2. A distribution of scores is approximately normal with a mean of 82 and standard deviation of 4.6. What equation can be used to find the score ‘x’ above which 35 percent of the scores fall?

#3. The weights of laboratory cockroaches follow a Normal distribution with mean 80 grams and standard deviation of 2 grams. How much would a cockroach weigh if it ranks at the 80th percentile?

#4. The weights of Russet potatoes are normally distributed with a mean of 1.3 pounds and a standard deviation of 0.37 pounds. What is the probability that a randomly selected potato will weigh more than 1.65 pounds?

Assessing Normality

I can make an appropriate graph to determine if a distribution is bell-shaped.

Normal Probability Plot – a graph that provides a good assessment of how _____________ a distribution is.

Example: No Space in the Fridge?
The measurements listed below describe the usable capacity (in cubic feet) of a sample of 36 side-by-side refrigerators. (Source: Consumer Reports, May 2010) Are the data close to Normal?

12.9 13.7 14.1 14.2 14.5 14.5 14.6 14.7 15.1
15.2 15.3 15.3 15.3 15.3 15.5 15.6 15.6 15.8
16.0 16.0 16.2 16.2 16.3 16.4 16.5 16.6 16.6
16.6 16.8 17.0 17.0 17.2 17.4 17.4 17.9 18.4

Here is a histogram of these data. It seems roughly symmetric and bell shaped.

I can use the 68-95-99.7 rule to assess Normality of a data set.

The mean and standard deviation of these data are x̄ = 15.825 and s_x = 1.217.
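The 68-95-99.7 check on the refrigerator data can be sketched in Python using only the standard library. This is an illustrative sketch, not part of the original notes; the helper name `count_within` is hypothetical, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

# Usable capacity (cubic feet) of the 36 refrigerators listed above
capacities = [
    12.9, 13.7, 14.1, 14.2, 14.5, 14.5, 14.6, 14.7, 15.1,
    15.2, 15.3, 15.3, 15.3, 15.3, 15.5, 15.6, 15.6, 15.8,
    16.0, 16.0, 16.2, 16.2, 16.3, 16.4, 16.5, 16.6, 16.6,
    16.6, 16.8, 17.0, 17.0, 17.2, 17.4, 17.4, 17.9, 18.4,
]

x_bar = mean(capacities)
s_x = stdev(capacities)  # sample standard deviation (assumed, divides by n - 1)

def count_within(data, center, spread, k):
    """Count observations within k spreads of the center."""
    low, high = center - k * spread, center + k * spread
    return sum(low <= value <= high for value in data)

for k in (1, 2, 3):
    n = count_within(capacities, x_bar, s_x, k)
    share = n / len(capacities)
    print(f"within {k} SD of the mean: {n} of {len(capacities)} = {share:.1%}")
```

The printed percentages are the ones to compare against 68%, 95%, and 99.7%.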
x̄ ± 1s_x = (14.608, 17.042)    24 of 36 = 66.7%
x̄ ± 2s_x = (13.391, 18.259)    34 of 36 = 94.4%
x̄ ± 3s_x = (12.174, 19.476)    36 of 36 = 100%

These percentages are quite close to what we would expect based on the 68-95-99.7 rule. Combined with the graph, this gives good evidence that this distribution is close to Normal.

I can interpret a Normal probability plot

EXAMPLE: Here is a Normal probability plot (also called a Normal quantile plot) of the refrigerator data from the previous page. Interpret this normal probability plot.

INTERPRET:

EXAMPLE: State land areas
Problem: The histogram and Normal probability plot below display the land areas for the 50 states. Is this distribution approximately Normal?

EXAMPLE: NBA free throw percentage
This is an example of a distribution that is skewed to the left. Notice that the lowest free throw percentages are to the left of what we would expect and the highest free throw percentages are not as far to the right as we would expect. Interpret this normal probability plot.

EXAMPLE: A normal probability plot of the survival times of the guinea pigs in a medical experiment is shown below. Use this plot to describe the shape of the distribution of survival times. Then explain carefully how this shape is seen in the normal probability plot.

#2. Multiple Choice Question: The plot shown is a normal probability plot for a set of data. The data value is plotted on the x-axis, and the standardized value is plotted on the y-axis. Which statement is true for this data set?
a. The data are clearly normally distributed.
b. The data are approximately normally distributed.
c. The data are clearly skewed to the right.
d. The data are clearly skewed to the left.
e. There is insufficient information to determine the shape of the distribution.

#3. EXAMPLE: A Normal probability plot for the weights of 40 squirrels trapped and released on a college campus is shown below.
Is the distribution of squirrel weights approximately Normal? Justify your answer.

Day #10: FRAPPY PRACTICE/ Test Review

AP Statistics – Chapter 2 Free Response Question

The Dow Jones Industrial Average (“The Dow”) is an index measuring the stock performance of 30 large American companies, and is often used as a measure of overall economic growth in the United States. Below is Minitab output describing the daily percentage changes in the Dow for the first three months of 2009 and the first three months of 2010. (Note that the market was open for 61 days during the first three months of each year. A negative value indicates a percentage decrease in the index for that day.)

Descriptive Statistics: Dow 2009, Dow 2010

Variable    N    Mean     StDev    Min      Q1       Median   Q3      Max
Dow 2009    61   -0.198   2.331    -4.660   -1.530   -0.310   1.150   6.820
Dow 2010    61    0.078   0.821    -2.640   -0.270    0.110   0.465   1.660

Both distributions are approximately Normally distributed.

#1. Consider a day when the Dow increased by 1%. In which year, 2009 or 2010, would such a day be considered a better day for the stock market, relative to other days in that year? Provide appropriate statistical calculations to support your answer.

#2. Based on these data, estimate the number of days in 2009 that the Dow decreased by more than 1% in these 61 days.

#3. Estimate the 19th percentile of daily change for the first three months of 2010.

AP Statistics – Chapter 2 Free Response Question RUBRIC/Scoring Criteria

The Dow Jones Industrial Average (“The Dow”) is an index measuring the stock performance of 30 large American companies, and is often used as a measure of overall economic growth in the United States. Below is Minitab output describing the daily percentage changes in the Dow for the first three months of 2009 and the first three months of 2010. (Note that the market was open for 61 days during the first three months of each year.
A negative value indicates a percentage decrease in the index for that day.)

Descriptive Statistics: Dow 2009, Dow 2010

Variable    N    Mean     StDev    Min      Q1       Median   Q3      Max
Dow 2009    61   -0.198   2.331    -4.660   -1.530   -0.310   1.150   6.820
Dow 2010    61    0.078   0.821    -2.640   -0.270    0.110   0.465   1.660

Both distributions are approximately Normally distributed.

#1. Consider a day when the Dow increased by 1%. In which year, 2009 or 2010, would such a day be considered a better day for the stock market, relative to other days in that year? Provide appropriate statistical calculations to support your answer.

To get an ‘E’, student must do BOTH of the following correctly:
Student must properly calculate the two z-scores.
Student must properly decide which day is ‘considered a better day for the stock market.’
To get a ‘P’, student must do ONE of the components correctly.
A student gets an ‘I’ if they do NEITHER of the components correctly.

#2. Based on these data, estimate the number of days in 2009 that the Dow decreased by more than 1% in these 61 days.

To get an ‘E’, student must do ALL THREE of the following correctly:
Student must draw a normal curve correctly.
Student must show a reasonable amount of ‘work’.
Student must get the correct ANSWER.
To get a ‘P’, student must do TWO of the THREE components correctly.
A student gets an ‘I’ if they do EITHER ONE or NONE of the components correctly.

#3. Estimate the 19th percentile of daily change for the first three months of 2010.

To get an ‘E’, student must do BOTH of the following correctly:
Student must show a reasonable amount of ‘work’.
Student must get the correct ANSWER.
To get a ‘P’, student must do ONE of the components correctly.
A student gets an ‘I’ if they do NEITHER of the components correctly.
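One possible way to check the calculations behind the three questions is sketched below in Python. It is not an official scoring key. It assumes the standard-library `statistics.NormalDist` as a stand-in for normalcdf/invNorm, takes the means and standard deviations from the Minitab output, and uses illustrative variable names.

```python
from statistics import NormalDist

# Means and standard deviations taken from the Minitab output
dow_2009 = NormalDist(mu=-0.198, sigma=2.331)
dow_2010 = NormalDist(mu=0.078, sigma=0.821)
trading_days = 61

# #1: z-score of a +1% day relative to each year's distribution
z_2009 = (1.0 - dow_2009.mean) / dow_2009.stdev
z_2010 = (1.0 - dow_2010.mean) / dow_2010.stdev

# #2: estimated number of 2009 days with a decrease of more than 1%,
# i.e. a daily change below -1
prop_below = dow_2009.cdf(-1.0)
expected_days = trading_days * prop_below

# #3: 19th percentile of daily change in 2010
p19_2010 = dow_2010.inv_cdf(0.19)

print(f"z-score of a +1% day: 2009 = {z_2009:.3f}, 2010 = {z_2010:.3f}")
print(f"Estimated 2009 days down more than 1%: {expected_days:.1f}")
print(f"19th percentile of 2010 daily change: {p19_2010:.3f}%")
```

The larger z-score in #1 identifies the year in which a 1% gain stands out more relative to the other days in that year.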