Chapter 6
Modeling Random Events: The Normal and Binomial Models

Copyright © 2014 Pearson Education, Inc. All rights reserved

Learning Objectives
Be able to distinguish between discrete and continuous-valued variables.
Know when a Normal model is appropriate and be able to apply the model to find probabilities.
Know when the binomial model is appropriate and be able to apply the model to find probabilities.

6.1 Probability Distributions Are Models of Random Experiments

Probability Models and Distributions
A Probability Model is a description of how a statistician thinks data are produced.
Examples of models: Uniform, Linear, Normal, Other.
A Probability Distribution or Probability Distribution Function (pdf) is a table, graph, or formula that gives all the outcomes of an experiment and their probabilities.

Discrete vs. Continuous
A random variable is called Discrete if the outcomes are values that can be listed or counted. Examples: number of classes taken, the roll of a die.
A random variable is called Continuous if the outcomes cannot be listed because they occur over a range. Examples: time to finish the exam, exact weight.

Discrete or Continuous
Classify the following as discrete or continuous:
Length of the left thumb → Continuous
Number of children in the family → Discrete
Number of devices in the house that connect to the Internet → Discrete
Sodium concentration in the bloodstream → Continuous

Discrete Probability Distributions
The most common way to display a pdf for discrete data is with a table.
The probability distribution table always has two columns (or rows).
The first, x, displays all the possible outcomes.
The second, P(x), displays the probabilities for these outcomes.
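One way to picture the two-column table in code is as a mapping from each outcome x to its probability P(x). The sketch below is only an illustration: the fair-die example and the helper name is_valid_distribution are assumptions, not part of the slides. It checks the usual requirements that each probability lies between 0 and 1 and that the probabilities sum to 1.

```python
from fractions import Fraction


def is_valid_distribution(table):
    """Return True if every P(x) is in [0, 1] and the probabilities sum to 1."""
    probs = list(table.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1


# Assumed example: a fair six-sided die, keyed by outcome x with value P(x).
# Fractions keep the arithmetic exact, so the sum is exactly 1.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(is_valid_distribution(die))
```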

Examples of Probability Distribution Tables

Die Roll
x      P(x)
1      1/6
2      1/6
3      1/6
4      1/6
5      1/6
6      1/6

Raffle
Prize x    P(x)
95         0.01
995        0.005
-5         0.985

The sum of all the probabilities must equal 1.

Examples of Probability Distribution Graphs
[Figure: probability distribution graphs]

Examples of Probability Distribution Functions
If 1/3 of the population is Hispanic and 20 randomly selected people from the population are chosen, then the probability distribution function for the number of Hispanics selected is
$$P(x) = \frac{20!}{x!\,(20-x)!}\left(\frac{1}{3}\right)^{x}\left(\frac{2}{3}\right)^{20-x}$$
Probability distribution functions can be quite complicated!

Continuous Data and Probability Distribution Functions
Often represented as a curve.
The area under the curve between two values of x represents the probability of x being between these two values.
The total area under the curve must equal 1.
The curve cannot lie below the x-axis.

Finding Probabilities for Uniform Distributions
[Figure: uniform probability distribution for the wait time, from 0 to 12 minutes]
The curve shows the probability distribution function for the time to wait for a bus that comes every 12 minutes. Because the total area must equal 1, the height of the rectangle is 1/12 ≈ 0.08333.
Find the probability that you will wait between 5 and 10 minutes.
Shade in the area.
Find the area of the rectangle.
P(5 < x < 10) = Base × Height = 5 × 0.08333 ≈ 0.4167

6.2 The Normal Model

The Normal Model
The Normal Model is a good model if:
The distribution is unimodal.
The distribution is approximately symmetric.
The distribution is approximately bell shaped.
The Normal Distribution is also called Gaussian.
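
Looking back at the Section 6.1 examples, the bus-wait area and the formula for the number of Hispanics selected can both be checked with a few lines of code. This is a minimal sketch, assuming a uniform model on 0 to 12 minutes (so the height is 1/12); the function names uniform_prob and binomial_pmf are illustrative choices, not names from the slides.

```python
from math import comb


def uniform_prob(a, b, low, high):
    """P(a < x < b) for a uniform model on [low, high]: base times height."""
    height = 1 / (high - low)            # total area must be 1
    base = min(b, high) - max(a, low)    # clip the interval to the model's range
    return max(base, 0) * height


def binomial_pmf(x, n, p):
    """n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Bus that comes every 12 minutes: wait between 5 and 10 minutes.
bus = uniform_prob(5, 10, 0, 12)
print(round(bus, 4))  # 0.4167

# Number selected out of 20 when 1/3 of the population has the trait.
hispanic = [binomial_pmf(x, 20, 1 / 3) for x in range(21)]
print(round(sum(hispanic), 6))  # the 21 probabilities sum to 1 (up to rounding)
```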

Center and Spread of the Normal Distribution
μ stands for the center or mean of a distribution.
σ stands for the standard deviation of a distribution.
Note that the Greek letters μ and σ are used for distributions and x̄ and s are used for sample data.

Notation and Area
N(6,2) means the normal distribution with mean μ = 6 and standard deviation σ = 2.
The area under the normal curve, above the x-axis, and to the left of x = 4 represents P(x < 4).
P(x < 4) = P(x ≤ 4) for a continuous variable.

Probability and StatCrunch
1. Draw a rough sketch of the normal curve along with the mean and area to be found.
2. Go to Stat→Calculators→Normal.
3. Type in the mean and the Std. Dev.
4. Choose <= or >=.
5. Type in the value of x.
6. Click on “Compute”.

Using StatCrunch for Probability
[Figure: StatCrunch Normal calculator]

Example: Baby Seals
Research has shown that the mean length of a newborn Pacific harbor seal is 29.5 in. and that σ = 1.2 in. Suppose that the lengths follow the Normal model. Find the probability that a randomly selected pup will be more than 32 in.
P(x > 32) ≈ 0.019

Finding the Probability Between Two Values
Since StatCrunch can only handle probabilities involving “≤” and “≥”, use geometry to find the probability of x falling between two values.
P(3 < x < 6) = P(x < 6) – P(x < 3) ≈ 0.9772 – 0.1587 = 0.8185

Finding Probabilities from Percentiles
Newborn seals in the bottom 10th percentile will probably not survive. Given that newborn seals are N(29.5,1.2), find the 10th percentile.
With StatCrunch put in 0.1 for the probability and click on Compute.
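
The Normal-model calculations in this section can also be reproduced outside StatCrunch. The sketch below uses Python's standard-library statistics.NormalDist as a stand-in for the StatCrunch calculator. The between-two-values check is an assumption: the slide does not state the model behind its figure, but its areas 0.9772 and 0.1587 match z-scores of 2 and -1, so the standard Normal with those z-scores is used here.

```python
from statistics import NormalDist

# Newborn Pacific harbor seal lengths, N(29.5, 1.2), in inches.
seals = NormalDist(mu=29.5, sigma=1.2)

# Upper tail: P(x > 32) is 1 minus the area to the left of 32.
p_longer_than_32 = 1 - seals.cdf(32)
print(round(p_longer_than_32, 3))  # 0.019

# Percentile: the length with 10% of the area to its left.
tenth_percentile = seals.inv_cdf(0.10)
print(round(tenth_percentile, 2))  # 27.96


def prob_between(dist, a, b):
    """P(a < x < b) as the difference of two left-hand areas."""
    return dist.cdf(b) - dist.cdf(a)


# Assumed stand-in for the slide's figure: z-scores -1 and 2 on the standard Normal.
print(round(prob_between(NormalDist(), -1, 2), 4))
```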
The 10th percentile is about 27.96 in.

The Normal Model and the Empirical Rule
The Empirical Rule told us that if a distribution is approximately normal, then 68% of the data will fall within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
If the distribution is exactly normal, then these numbers are just the corresponding areas under the normal curve.

6.3 The Binomial Model

The Binomial Model
The Binomial Model applies if:
1. There are a fixed number of trials.
2. Only two outcomes are possible for each trial: Yes or No, Success or Failure, Heads or Tails, etc.
3. The probability of success, p, is the same for each trial.
4. The trials are independent.

Binomial or Not?
40 randomly selected college students were asked if they selected their major in order to get a good job. → Binomial
35 randomly selected Americans were asked in what country their mothers were born. → Not Binomial, more than two possible answers per trial.
To estimate the probability that students will pass an exam, the professor records a study group’s success on the exam. → Not Binomial, since the outcomes are not independent.

Surveys and Independence
The Binomial Model may be used if the respondents of a survey are selected with replacement.
If the selection is done without replacement and the population size is at least 10 times larger than the sample size, then we may still use the Binomial Model as an approximation.

Visualizing the Binomial Distribution
If n is large and p is close to 0.5, then the binomial distribution is approximately normal.
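
To see the claim about the binomial shape numerically, the sketch below compares binomial probabilities with the height of a Normal curve that has the same mean np and standard deviation sqrt(np(1 - p)). The choices n = 100, p = 0.5 and n = 10, p = 0.1, and the helper name max_gap_to_normal, are illustrative assumptions rather than values from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_gap_to_normal(n, p):
    """Largest gap between binomial probabilities and the matching Normal curve."""
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return max(abs(binomial_pmf(x, n, p) - normal.pdf(x)) for x in range(n + 1))


# Large n with p = 0.5: the two shapes nearly coincide.
print(max_gap_to_normal(100, 0.5))

# Small n with p far from 0.5: the binomial is skewed and the gap is much larger.
print(max_gap_to_normal(10, 0.1))
```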

Words and Inequalities
Exactly → =
Less Than → <
At Least → =>
More Than → >
At Most → <=
Notice that “Less Than” and “At Least” are complements and “More Than” and “At Most” are complements.

Finding a Binomial Probability
12% of all US women will eventually develop breast cancer. If 30 women are randomly selected, what is the probability that exactly 4 of them will eventually develop breast cancer?
b(30, 0.12, 4) = ?
StatCrunch: Stat→Calculators→Binomial, n = 30, p = 0.12, P(x = 4)
The probability that exactly 4 of the 30 women will develop breast cancer is about 0.20.

Finding a Binomial Probability
14% of all clothing bought online is returned. If an online retailer sells 35 items of clothing, what is the probability that:
At least 5 will be returned? n = 35, p = 0.14, P(x => 5) ≈ 0.55
Fewer than 7 will be returned? n = 35, p = 0.14, P(x < 7) ≈ 0.79
More than 6 will be returned? n = 35, p = 0.14, P(x > 6) ≈ 0.21
At most 4 will be returned? n = 35, p = 0.14, P(x <= 4) ≈ 0.45

The Expected Value
If we roll a six-sided die 30 times then we would expect to roll a two 30 × 1/6 = 5 times.
For a Binomial Distribution, μ = np is called the mean or the expected value.
A Binomial Distribution with n trials and probability of success p has standard deviation
$$\sigma = \sqrt{np(1-p)}$$

The Standard Deviation
On any particular day, there is a 6% chance of a fatal accident in the city. Find the mean and standard deviation for the number of accidents in a (365 day) year.
μ = np = 365 × 0.06 = 21.9
$$\sigma = \sqrt{365 \times 0.06 \times 0.94} \approx 4.5$$
We expect about 22 fatal accidents per year, give or take four or five accidents.
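
The binomial examples above can be reproduced with the binomial formula directly. This is a minimal sketch, not the StatCrunch procedure: binomial_pmf and binomial_cdf are illustrative helper names, and "at least" and "more than" are computed as complements of "at most" and "fewer than".

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(exactly x successes) in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(at most x successes): sum of the probabilities for 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Exactly 4 of 30 women, p = 0.12.
exactly_4 = binomial_pmf(4, 30, 0.12)

# Clothing returns: n = 35, p = 0.14.
at_most_4 = binomial_cdf(4, 35, 0.14)
at_least_5 = 1 - at_most_4            # complement of "at most 4"
fewer_than_7 = binomial_cdf(6, 35, 0.14)
more_than_6 = 1 - fewer_than_7        # complement of "fewer than 7"

# Mean and standard deviation for n = 365, p = 0.06.
mean = 365 * 0.06
sd = sqrt(365 * 0.06 * 0.94)

for name, value in [("exactly 4", exactly_4), ("at least 5", at_least_5),
                    ("fewer than 7", fewer_than_7), ("more than 6", more_than_6),
                    ("at most 4", at_most_4), ("mean", mean), ("sd", sd)]:
    print(f"{name}: {value:.2f}")
```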

Chapter 6 Case Study

Too Heavy or Not?
McDonald’s claims that its ice cream cones weigh 3.18 ounces. However, one of the authors bought five cones and found that all five weighed more than that. Is this surprising?
Assume the ice cream weights are normally distributed or at least symmetric about the mean.
n = 5, p = 0.5
P(x = 5) ≈ 0.031
If the mean is 3.18 ounces as McDonald’s claims, then there is only a 3.1% chance that out of 5 randomly chosen cones, they will all weigh above 3.18 ounces. With such a small probability, suspicion is raised.

Chapter 6 Guided Exercise 1

SAT Scores
According to data from the College Board, the mean quantitative SAT score for female college-bound high school seniors in 2009 was 500. SAT scores are approximately Normally distributed with a population standard deviation of 100. What percentage of the female college-bound high school seniors had scores above 675?

N(500,100)
To find the z-score for 675, subtract the mean and divide by the standard deviation. Report the z-score.
$$z = \frac{x - \mu}{\sigma} = \frac{675 - 500}{100} = 1.75$$

N(500,100)
Refer to the Normal curve. Explain why the SAT score of 500 is right below the z-score of 0.
The dots on the axis mark the location of z-scores that are integers from -3 to 3.
The mean is always at the center of the Normal curve. The mean SAT score is 500 and the mean z-score is 0 by definition.
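
Both calculations above, the case-study probability and the z-score, are short enough to check by hand or in code. In the sketch below, p = 0.5 reflects the case study's symmetry assumption (each cone is equally likely to weigh above or below the claimed mean); the function names are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(exactly x successes) in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above the mean."""
    return (x - mu) / sigma


# Case study: all 5 of 5 cones weigh above the claimed mean, with p = 0.5 per cone.
p_all_five = binomial_pmf(5, 5, 0.5)
print(p_all_five)  # 0.03125

# Guided Exercise 1: z-score of an SAT score of 675 under N(500, 100).
print(z_score(675, 500, 100))  # 1.75
```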

N(500,100)
Carefully sketch a copy of the curve. Pencil in the SAT scores of 200, 300, 400, 600, and 700 in the correct places.
Notice that the mean is 500 and the standard deviation is 100. The corresponding z-scores are -3, -2, -1, 1, and 2.

N(500,100)
Draw a vertical line through the curve at 675. Just above the 675 (indicated on the graph with “???”), put in the corresponding z-score.
We want to find what percentage of students had scores above 675. Shade the area to the right of this boundary, because numbers to the right are larger.

N(500,100)
Use StatCrunch to find the area to the right of this z-score.
P(z > 1.75) ≈ 0.04

N(500,100)
P(z > 1.75) ≈ 0.04
Finally, write a sentence telling what you found.
About 4% of the female college-bound high school seniors had scores above 675.

Chapter 6 Guided Exercise 2

SAT Scores
According to data from the College Board, the mean quantitative SAT score for female college-bound high school seniors in 2009 was 500. SAT scores are approximately Normally distributed with a population standard deviation of 100. A scholarship committee wants to give awards to college-bound women who score at the 96th percentile or above on the SAT. What score does an applicant need?

SAT Scores: N(500,100)
Will the SAT test score be above the mean or below it? Explain.
The 96th percentile is the score such that 96% of all scores are at or below this score.
The 50th percentile is the mean, so the 96th percentile is above the mean.

SAT Scores: N(500,100)
Label the curve with integer z-scores.
The dots represent the position of integer z-scores from -3 to 3.

SAT Scores: N(500,100)
Use StatCrunch to find the z-score that has area to the left 0.96.
z ≈ 1.75

SAT Scores: N(500,100), z ≈ 1.75
Add that z-score to the sketch and draw a vertical line above it through the curve.
Shade the left side because the area to the left is what is given.

SAT Scores: N(500,100), z ≈ 1.75
Find the SAT score that corresponds to the z-score.
The score should be z standard deviations above the mean, so
x = μ + zσ = 500 + (1.75)(100) = 675

SAT Scores: N(500,100), z ≈ 1.75, x = 675
Finally, write a sentence stating what you found.
The applicant needs a score of at least 675 to receive the scholarship.
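
The two guided exercises are inverses of each other: one goes from a score to an area, the other from an area back to a score. A minimal sketch of both directions follows, using Python's standard-library statistics.NormalDist in place of StatCrunch (an assumption of this example, not the slides' tool).

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)   # SAT scores, N(500, 100)
standard = NormalDist()               # z-scores, N(0, 1)

# Guided Exercise 1: area to the right of 675 (equivalently, of z = 1.75).
p_above_675 = 1 - sat.cdf(675)
print(round(p_above_675, 2))  # 0.04

# Guided Exercise 2: z-score with area 0.96 to its left, then x = mu + z * sigma.
z_96 = standard.inv_cdf(0.96)
cutoff = 500 + z_96 * 100
print(round(z_96, 2), round(cutoff))  # 1.75 675
```

Asking the N(500, 100) model for its 96th percentile directly, with sat.inv_cdf(0.96), gives the same cutoff without the intermediate z-score.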