Download X - jmullenkhs

+ Chapter 6: Random Variables Section 6.1 Discrete and Continuous Random Variables The Practice of Statistics, 4th edition – For AP* STARNES, YATES, MOORE + Chapter 6 Random Variables  6.1 Discrete and Continuous Random Variables  6.2 Transforming and Combining Random Variables  6.3 Binomial and Geometric Random Variables + Section 6.1 Discrete and Continuous Random Variables Learning Objectives After this section, you should be able to…  APPLY the concept of discrete random variables to a variety of statistical settings  CALCULATE and INTERPRET the mean (expected value) of a discrete random variable  CALCULATE and INTERPRET the standard deviation (and variance) of a discrete random variable  DESCRIBE continuous random variables Pig +  Greedy The Game: The match consists of 5 rounds. Each round consists of a number of games. Before each game, each person decides whether to stop (and retain the points earned in that round) or continue to play (and win more points, or lost all points gained in the round, depending on the outcome. •All players get the first two throws for free. A die is tossed twice. The points are added together. •All players stand up. •Each player has two options: quit, sit down, and keep the score that is the total of the two dice for Round 1; or continue to play and remain standing. Round 1 2 3 4 5 TOTAL Points Randomness, Probability, and Simulation How can we use probability to make and justify decisions? Pig The Game: The match consists of 5 rounds. Each round consists of a number of games. Before each game, each person decides whether to stop (and retain the points earned in that round) or continue to play (and win more points, or lost all points gained in the round, depending on the outcome. •If players continue to play, they Round Points remain standing. The die is 1 tossed. If the number 1, 3, 4, 5, 2 or 6 is rolled, the player adds this number to his or her total for that 3 round. But if the number is 2, the 4 students that are still standing 5 lose all points for that round, and record a score of 0 for that round. TOTAL •This continues until all players The object of the game is to determine have sat down, or a 2 is rolled. a strategy that in the long term will Then that round is over. maximize the total points. Randomness, Probability, and Simulation How can we use probability to make and justify decisions? +  Greedy Pig Randomness, Probability, and Simulation •What strategies did you use to decide when to stop? •Choose a strategy to use to play again. Choose this strategy and stick to it. Play the game again. How did this strategy work for you? •Record your results in a back-to-back stem-and-leaf plot. Compare the results. +  Greedy Water vs. Tap Water +  Bottled Station Bottled Water Cup? Truth Discrete and Continuous Random Variables 1) Look at your index card with your station written on it. 2) Go to the corresponding station. Pick up three cups (one each A, B, and C) and take them back to your seat. 3) Your task is to determine which one of the three cups contains the bottled water. Drink all the water in Cup A first, then the water in Cup B, and finally the water in Cup C. Write down the letter of the cup that you think held the bottled water. Do not discuss your results with any of your classmates. 4) When Ms. Raskin tells you to do so, go to the board and record your station number and the letter of the cup you identified as containing bottled water. Water vs. Tap Water When my Nerd Camp class did this activity, 13 out of the 21 teachers made the correct identifications. If you assume that the teachers can’t tell bottled water from tap water, you would assume that 7 teachers would guess correctly (1/3 of the total). How likely is it that 13 of the 21 teachers would guess correctly? To answer this, we will need a different kind of probability model than what we have been using. Discrete and Continuous Random Variables 5) Let’s assume that no one in the class can distinguish tap water from bottled water. In that case, students would just be guessing which cup of water tastes different. If so, what’s the probability that an individual student would guess correctly? 6) How many correct identifications would you need to see to be convinced that the students in your class aren’t just guessing? Choose a partner and design a simulation to answer this question. What do you conclude about your class’s ability to distinguish tap water from bottled water? +  Bottled Water vs. Tap Water X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X 2 3 4 5 6 7 8 9 10 11 12 13 14 How likely is it that 13 of the 21 teachers would guess correctly? To answer this, we will need a different kind of probability model than what we have been using. Discrete and Continuous Random Variables When my Nerd Camp class did this activity, 13 out of the 21 teachers made the correct identifications. If you assume that the teachers can’t tell bottled water from tap water, you would assume that 7 teachers would guess correctly (1/3 of the total). Here is a dotplot showing 100 trials of the simulation to see how often there are 13 or more correct guesses (using RandInt; letting 1 be a correct guess and 2 & 3 be incorrect; looking in groups of 21; in the simulation below, there were 4 trials in which three teachers guessed correctly). +  Bottled Variable and Probability Distribution A numerical variable that describes the outcomes of a chance process is called a random variable. The probability model for a random variable is its probability distribution Definition: A random variable takes numerical values that describe the outcomes of some chance process. The probability distribution of a random variable gives its possible values and their probabilities. Example: Consider tossing a fair coin 3 times. Define X = the number of heads obtained X = 0: TTT X = 1: HTT THT TTH X = 2: HHT HTH THH X = 3: HHH Value 0 1 2 3 Probability 1/8 3/8 3/8 1/8 What’s the probability that you will get at least one “heads” in three tosses? How would you interpret that probability? Discrete and Continuous Random Variables A probability model describes the possible outcomes of a chance process and the likelihood that those outcomes will occur. +  Random Random Variables Shoe size (discrete) Foot length (continuous) Discrete and Continuous Random Variables There are two main types of random variables: discrete and continuous. If we can find a way to list all possible outcomes for a random variable and assign probabilities to each one, we have a discrete random variable. +  Discrete Random Variables +  Discrete Discrete Random Variables and Their Probability Distributions A discrete random variable X takes a fixed set of possible values with gaps between. The probability distribution of a discrete random variable X lists the values xi and their probabilities pi: Value: x1 Probability: p1 x2 p2 x3 p3 … … The probabilities pi must satisfy two requirements: 1. Every probability pi is a number between 0 and 1. 2. The sum of the probabilities is 1. To find the probability of any event, add the probabilities pi of the particular values xi that make up the event. Discrete and Continuous Random Variables There are two main types of random variables: discrete and continuous. If we can find a way to list all possible outcomes for a random variable and assign probabilities to each one, we have a discrete random variable. Babies’ Health at Birth (Apgar Scores) +  Example: In 1952, Dr. Virginia Apgar suggested five criteria for measuring a baby’s health at birth: skin color, heart rate, muscle tone, breathing, and response to stimuli. She developed a 0 – 1 – 2 scale to rate a newborn on each of the five criteria. A baby’s Apgar score is the sum of the ratings on the five scales, which gives a whole-number value from 0 to 10. What Apgar scores are typical? To find out, researchers recorded the Apgar scores of over 2 million newborn babies in a single year. Imagine selecting one of these newborns at random. (That’s our chance process.) Define X as the Apgar score of a randomly-selected baby one minute after birth. The table on the next slide gives the probability description for X. Babies’ Health at Birth (Apgar Scores) +  Example: (a) Show that the probability distribution for X is legitimate. (b) Make a histogram of the probability distribution. Describe what you see. (c) Apgar scores of 7 or higher indicate a healthy baby. What is P(X ≥ 7)? Value: 0 1 2 3 4 5 6 7 8 9 10 Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053 (a) All probabilities are between 0 and 1 and they add up to 1. This is a legitimate probability distribution. (c) P(X ≥ 7) = .908 We’d have a 91 % chance of randomly choosing a healthy baby. Notice the (b) The left-skewed shape of the distribution suggests a randomly selected newborn will have an Apgar score at the high end of the scale. There is a small chance of getting a baby with a score of 5 or lower. difference between > and ≥. With discrete variables, these are different but not with continuous variables. NHL Goals +  Example: (a) Show that the probability distribution for X is legitimate. (b) Make a histogram of the probability distribution. Describe what you see. (c) What is the probability that the number of goals scored by a randomlyselected game is at least 6? Goals: 0 1 2 3 4 5 6 7 8 9 Probability: 0.061 0.154 0.228 0.229 0.173 0.094 0.041 0.015 0.004 0.001 (a) All probabilities are between 0 and 1 and they add up to 1. This is a legitimate probability distribution. (c) P(X ≥ 6) = .041 + .015 + .004 + .001 = 0.061. We’d have a 6.1 % chance that the number of goals scored is at least 6. (b) The histogram is right-skewed, which means that the majority of games are relatively low scoring. It is pretty unusual for a team to score 6 or more goals. YOUR UNDERSTANDING +  CHECK Value of X 0 1 2 3 4 Probability 0.02 0.10 0.20 0.42 0.26 1) Say in words what the meaning of P(X ≥ 3) is. What is this probability? 2) Write the event “the student got a grade worse than C” in terms of values of the random variable X. What is the probability of this event? 3) Sketch a graph of the probability distribution. Describe what you see. Discrete and Continuous Random Variables North Carolina State University posts the grade distribution for its courses online. Students in Statistics 101 in a recent semester received 26% As, 42% Bs, 20% Cs, 10% Ds, and 2% Fs. Choose a Statistics student at random. The student’s grade on a four-point scale is a discrete random variable X with this probability distribution: YOUR UNDERSTANDING +  CHECK Value of X 0 1 2 3 4 Probability 0.02 0.10 0.20 0.42 0.26 1) Say in words what the meaning of P(X ≥ 3) is. What is this probability? The probability that the student gets either an A or a B is 0.68. 2) Write the event “the student got a grade worse than C” in terms of values of the random variable X. What is the probability of this event? P(X < 2) = 0.12 3) Sketch a graph of the probability distribution. Describe what you see. The histogram is left-skewed. Higher grades are more likely, but there are a few lower grades. Discrete and Continuous Random Variables North Carolina State University posts the grade distribution for its courses online. Students in Statistics 101 in a recent semester received 26% As, 42% Bs, 20% Cs, 10% Ds, and 2% Fs. Choose a Statistics student at random. The student’s grade on a four-point scale is a discrete random variable X with this probability distribution: of a Discrete Random Variable The mean of any discrete random variable is an average of the possible outcomes, with each outcome weighted by its probability. Definition: Suppose that X is a discrete random variable whose probability distribution is Value: x1 x2 x3 … Probability: p1 p2 p3 … To find the mean (expected value) of X, multiply each possible value by its probability, then add all the products:  x  E(X)  x1 p1  x 2 p2  x 3 p3  ...   x i pi Discrete and Continuous Random Variables When analyzing discrete random variables, we’ll follow the same strategy we used with quantitative data – describe the shape, center, and spread, and identify any outliers. +  Mean Apgar Scores – What’s Typical? +  Example: Consider the random variable X = Apgar Score Compute the mean of the random variable X and interpret it in context. Value: 0 1 2 3 4 5 6 7 8 9 10 Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053 x  E(X)   xi pi  (0)(0.001)  (1)(0.006)  (2)(0.007)  ... (10)(0.053)  8.128 The mean Apgar score of a randomly selected newborn is 8.128. This is the longterm average Agar score of many, many randomly chosen babies.  Note: The expected value does not need to be a possible value of X or an integer! It is a long-term average over many repetitions. Winning (and Losing) at Roulette Consider the random variable X = net gain from a single $1 bet on red. Compute the mean of the random variable X and interpret it in context. Value: -1 1 Probability: 20/38 18/38 x  E(X)   xi pi  ($1)(20 /38)  ($1)(18 /38)  $0.05 In the long run, the player loses (and the casino gains) five cents per bet. The ordinary average of -1 and 1 is 0, but $0 isn’t the average winnings because the player is less likely to win $1 than to lose $1. +  Example: More Roulette +  Example: Another wager players can make in roulette is called a “corner bet.” To make this bet, a player places his chips on the intersection of four numbered squares on the roulette table. If one of these numbers comes up on the wheel and the player bet $1, the player gets his $1 back plus $8 more. Otherwise, the casino keeps the original $1 bet. Consider the random variable X = net gain from a single $1 corner bet Compute the mean of the random variable X and interpret it in context. Value: -1 8 Probability: 34/38 4/38 x  E(X)   xi pi  ($1)(34 /38)  ($8)(4 /38)  $0.05 If a player were to make $1 corner bets many, many times, the average gain would be about -$0.05 per bet.  NHL Goals +  Example: Consider the random variable X = Number of goals scored by a randomlyselected team in a randomly-selected game Goals: 0 1 2 3 4 5 6 7 8 9 Probability: 0.061 0.154 0.228 0.229 0.173 0.094 0.041 0.015 0.004 0.001 x  E(X)   xi pi  (0)(0.061)  (1)(0.154)  ... (9)(0.001)  2.851 The number of goals for a randomly-selected team in a randomly-selected game is 2.851. If you were to repeat the random sampling process over and over again, the mean number of goals scored would be about 2.851 in the long run.  ERROR ALERT! Discrete and Continuous Random Variables Many students incorrectly believe that the expected value of a random variable must be equal to one of the possible values of the variable. This is not the case. In the roulette examples, the expected value was -$0.05, even though this was not a possible gain from a $1 single bet. In the Apgar example, the average score 8.128, is not a possible value of the random variable. If you think of the mean as a long-run average over many repetitions, these facts make sense. +  AP Deviation of a Discrete Random Variable Definition: Suppose that X is a discrete random variable whose probability distribution is Value: x1 x2 x3 … Probability: p1 p2 p3 … and that µX is the mean of X. The variance of X is Var(X)   X2  (x1   X ) 2 p1  (x 2   X ) 2 p2  (x 3   X ) 2 p3  ...   (x i   X ) 2 pi This formula is given to you on the formula sheet of the AP exam. To get the standard deviation of a random variable, take the square root of the variance. Discrete and Continuous Random Variables Since we use the mean as the measure of center for a discrete random variable, we’ll use the standard deviation as our measure of spread. The definition of the variance of a random variable is similar to the definition of the variance for a set of quantitative data. +  Standard Apgar Scores – How Variable Are They? +  Example: Consider the random variable X = Apgar Score Compute the standard deviation of the random variable X and interpret it in context. Recall that the mean Apgar score was 8.128. Value: 0 1 2 3 4 5 6 7 8 9 10 Probability: 0.001 0.006 0.007 0.008 0.012 0.020 0.038 0.099 0.319 0.437 0.053   (x i X ) pi 2 X 2  (0  8.128)2 (0.001)  (1 8.128)2 (0.006)  ... (10  8.128)2 (0.053) Variance  2.066  X  2.066 1.437 The standard deviation of X is 1.437. On average, a randomly selected baby’s Apgar score will differ from the mean 8.128 by about 1.4 units. Analyzing Random Variables on the 1) Entering the values of the random variable in L1 and the corresponding probabilities in L2. (Practice by using the Apgar Scores) 2) To graph a histogram of the probability distribution: 1) Set up a statistics plot with Xlist: L1 and Freq: L2 2) Adjust your window settings to: 1) Xmin = -1 When you are done 2) Xmax = 11 with the histogram, 3) Xscl = 1 remember to reset Freq 4) Ymin = -0.1 back to 1. 5) Ymax = 0.5 6) Yscl = 0.1 3) Press GRAPH (F3 on TI-89) 3) To calculate the mean and standard deviation of the random variable, use one-variable statistics with the values in L1 and the probabilities in L2: 1) TI-83/84: Execute the command 1-Var Stats L1, L2 2) TI-89: In the Statistics/List Editor, press F4 (Calc) and choose 1: 1-Var Stats. Use the inputs List: list1 and Freq: list 2 Discrete and Continuous Random Variables Calculator +  Technology: Discrete and Continuous Random Variables Calculator Analyzing Random Variables on the +  Technology: Discrete and Continuous Random Variables Calculator Analyzing Random Variables on the +  Technology: NHL Goals +  Example: Consider the random variable X = goals scored Compute the standard deviation of the random variable X and interpret it in context. Recall that the mean goals scored was 2.851. Goals 0 1 2 3 4 5 6 7 8 9 Probability: 0.061 0.154 0.228 0.229 0.173 0.094 0.041 0.015 0.004 0.001   (x i X ) pi 2 X 2  (0  2.851)2 (0.061)  (1 2.851)2 (0.154)  ... (9 1.54)2 (0.001)  2.66 X  2.66 1.63 The standard deviation of X is 1.63. On average, a randomly-selected team’s number of goals in a randomly-selected game will differ from the mean by about 1.63 goals. YOUR UNDERSTANDING +  CHECK Cars Sold 0 1 2 3 Probability 0.3 0.4 0.2 0.1 1) Compute and interpret the meaning of X. 2) Compute and interpret the standard deviation of X. Discrete and Continuous Random Variables A large auto dealership keeps track of sales made during each hour of the day. Let X = the number of cars sold during the first hour our of business on a randomly-selected Friday. Based on previous records, the probability distribution of X is as follows: YOUR UNDERSTANDING +  CHECK Cars Sold 0 1 2 3 Probability 0.3 0.4 0.2 0.1 1) Compute and interpret the meaning of X. μx = 1.1. The long-run average, over many Friday mornings, will be about 1.1 cars sold. 2) Compute and interpret the standard deviation of X. σx = 0.943. On average, the number of cars sold on a randomly-selected Friday will differ from the mean (1.1) by about 0.943 cars sold. Discrete and Continuous Random Variables A large auto dealership keeps track of sales made during each hour of the day. Let X = the number of cars sold during the first hour our of business on a randomly-selected Friday. Based on previous records, the probability distribution of X is as follows: ERROR ALERT! Discrete and Continuous Random Variables In many cases where you are asked to calculate the mean (expected value) or standard deviation of a random variable, you will lose credit for not showing adequate work – or for not showing work at all. To get full credit, you need to go beyond just reporting your commands used on your graphing calculator. Show the first couple of terms with an ellipsis (. . .) for the calculation of the mean or standard deviation. +  AP Random Variables Definition: A continuous random variable X takes on all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event. The probability model of a discrete random variable X assigns a probability between 0 and 1 to each possible value of X. A continuous random variable Y has infinitely many possible values. All continuous probability models assign probability 0 to every individual outcome. Only intervals of values have positive probability. Discrete and Continuous Random Variables Discrete random variables commonly arise from situations that involve counting something. Situations that involve measuring something often result in a continuous random variable. +  Continuous Random Variables The command 2rand will generate a random number from 0 to 2. To generate a random number from 1 to 2, use the command rand+1. You can combine these commands to produce other intervals. Discrete and Continuous Random Variables Calculator Commands: The calculator command rand will generate a random number from 0 to 1. +  Continuous Random Variables The random number generator will spread its output uniformly across the entire interval from 0 to 1 as we allow it to generate a long sequence of random numbers. The results of many trials are represented by the density curve of a uniform distribution. This density curve appears at right. It has height 1 over the interval from 0 to 1. The area under the density curve is 1, and the probability of any event is the area under the density curve and above the event in question. Remember that each individual probability is 0. Discrete and Continuous Random Variables Random Numbers: +  Continuous Young Women’s Heights +  Example: The heights of young women closely follow the Normal distribution with mean = 64 inches and standard deviation = 2.7 inches. This is a distribution for a large set of data. Choose one young woman at random. Call her height Y. If we repeat the random choice many, many times, the distribution of values of Y is the same Normal distribution that describes the heights of all young women. Find the probability that the chosen woman is between 68 and 70 inches tall. Young Women’s Heights +  Example: Define Y as the height of a randomly chosen young woman. Y is a continuous random variable whose probability distribution is N(64, 2.7). What is the probability that a randomly chosen young woman has height between 68 and 70 inches? P(68 ≤ Y ≤ 70) = ??? 68  64 2.7  1.48 z 70  64 2.7  2.22 z P(1.48 ≤ Z ≤ 2.22) = P(Z ≤ 2.22) – P(Z ≤ 1.48)  = 0.9868 – 0.9306  = 0.0562 There is about a 5.6% chance that a randomly chosen young woman has a height between 68 and 70 inches. Young Women’s Heights – Using a Calculator 2nd VARS (gives you DISTR) Use normalcdf(68, 70, 64, 2.7) +  Example: Weights of 3-yr.-old Females +  Example: Define Y as the weight of a randomly-chosen 3-yr.-old female. Y is a continuous random variable whose probability distribution is N(30.7, 3.6). What is the probability that a randomly chosen 3-yr.-old female weighs at least 30 pounds? P(Y ≥ 30) = ??? 30  30.7 3.6  0.19 z P(Z ≥ -0.19) = 1 – P(Z < -0.19)  = 1 – 0.4247 = 0.5753 There is about a 58% chance that the randomly selected 3-yr.-old female will weigh at least 30 pounds. ERROR ALERT! Discrete and Continuous Random Variables When you solve problems involving random variables, start by defining the random variable of interest. For example, let X = the Apgar score of a randomly-selected baby or let Y = the height of a randomly-selected young woman. Then state the probability you’re trying to find in terms of the random variable P(68 ≤ Y ≤ 70) or P(X ≥ 7). +  AP + Section 6.1 Discrete and Continuous Random Variables Summary In this section, we learned that…  A random variable is a variable taking numerical values determined by the outcome of a chance process. The probability distribution of a random variable X tells us what the possible values of X are and how probabilities are assigned to those values.  A discrete random variable has a fixed set of possible values with gaps between them. The probability distribution assigns each of these values a probability between 0 and 1 such that the sum of all the probabilities is exactly 1.  A continuous random variable takes all values in some interval of numbers. A density curve describes the probability distribution of a continuous random variable. + Section 6.1 Discrete and Continuous Random Variables Summary In this section, we learned that…  The mean of a random variable is the long-run average value of the variable after many repetitions of the chance process. It is also known as the expected value of the random variable.  The expected value of a discrete random variable X is  The variance of a random variable is the average squared deviation of the values of the variable from their mean. The standard deviation is the square root of the variance. For a discrete random variable X, x   xi pi  x1 p1  x2 p2  x3 p3  ... 2  X  (xi X )2 pi  (x1 X )2 p1  (x2 X )2 p2  (x3 X )2 p3  ... + Looking Ahead… In the next Section… We’ll learn how to determine the mean and standard deviation when we transform or combine random variables. We’ll learn about  Linear Transformations  Combining Random Variables  Combining Normal Random Variables

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download X - jmullenkhs