7.1 Discrete and Continuous Random Variables

A random variable X is a numerical outcome of a random phenomenon. For example, when a fair die is rolled, the possible values of the random variable are X = 1, 2, 3, 4, 5, or 6.

A discrete random variable has a countable number of possible values. The probability distribution of X lists the values and their probabilities. This information can also be displayed as a probability histogram.

Example: What is the probability distribution for the discrete random variable X that counts the number of boys in a three-child family? Display it in a histogram.

X:
P(X):

Find P(X < 2).
Find P(X ≤ 2).

A continuous random variable X can take all the values in a specified interval. The probability distribution of X is described by a density curve.

Example: Does the random number generator, rand, give a discrete or continuous random variable? What is its distribution?

The probability of an event is given by the area under the density curve.

Example: What is the probability of the random number being between 0.3 and 0.7?
What is the probability of X < 0.5?
What is the probability of X = 0.5?
What is the probability of X ≤ 0.5?

Random variables often behave differently from algebraic variables. For instance, in algebra, x + x = 2x. For random variables, does X + X = 2X?

Consider the random phenomenon of rolling one die. What is the probability model?

X:
P(X):

What is the probability model for 2X (one roll, doubled)?

2X:
P(2X):

What is the model for X + X (the sum of two separate rolls)?

X + X:
P(X + X):

Find the mean and standard deviation for X, 2X, and X + X. What do you notice about the means? Why is there less variability when we add X + X than for 2X?

Normal Probability Distributions

Recall that the Normal curve we learned about in Chapter 2 is a density curve (meaning the area under it is 1). So the Normal distribution is a continuous probability distribution. The standard Normal random variable is given by

$Z = \frac{X - \mu}{\sigma}$

Example: Suppose that the true proportion of adults who jog is 0.15.
If we asked an SRS of 150 adults “Do you jog?”, would we expect exactly 15% to say yes? Over repeated samples of 150 adults, the proportion who say yes will be approximately Normally distributed with mean $\mu = 0.15$ and standard deviation $\sigma = 0.029$. What is the probability that 20% or more of a survey’s respondents say they jog?

Follow up: What’s the probability that between 14% and 16% of the respondents say they jog?

To simulate the jogging survey, we can use randNorm(.15, .029). In 100 trials, how often do you get $\hat{p} > 0.20$? Hint: Store the 100 trials in L1 and then use SortD(L1).

The Greed Game

The goal of this game is to be the person with the most points at the end of 5 rounds. All students should stand up. You will start each round with 5 points. A single die will be rolled. If the roll is a “1”, you lose all your points for that round. If the roll is any other number, you add that number to your score. You may sit down at any time to keep your points. The round ends when a “1” is rolled.

7.2 Means and Variances of Random Variables

The mean of a random variable X is the weighted average of the possible values of X. It takes into account that not all outcomes of X are equally likely.

Example: In the Pick 3 Lottery, you choose a 3-digit number. If your number matches the number chosen by the lottery board, your payoff is $500. Otherwise you win nothing. What is the mean payoff?

Payoff X:
Probability:
Mean Payoff =

If a lottery ticket costs $1, how much does the state profit on average?

Note: In this example, the mean payoff is not a possible value of X. The mean is also called the expected value, but it is the value expected on average over the long run, not in any single play.

We use the symbol $\mu_X$ or $E(X)$ to represent the mean value of the random variable X. For a discrete random variable X with possible outcomes $x_1, x_2, \ldots, x_k$ and probabilities $p_1, p_2, \ldots, p_k$,

$\mu_X = E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$

For a continuous random variable, $\mu_X$ lies at the balance point of the probability distribution curve.

The variance of a discrete random variable is $\sigma_X^2$:

$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k$

Example: Find the standard deviation of X for the Pick 3 game.

The Law of Large Numbers

LoLN: As the size of an SRS increases, the sample mean $\bar{x}$ approaches the true mean $\mu$ of the population.

Example: Suppose that the SAT Math scores of AP Statistics students are distributed Normally with a mean of 620 and a standard deviation of 70. Simulate drawing a sample of 100 students and calculate their running average SAT score.

    seq(X, X, 1, 100) → L1
    randNorm(620, 70, 100) → L2
    cumSum(L2) → L3
    L3/L1 → L4

Examine a scatter plot of L1 and L4 and interpret it.

Remember:
* The Law of Large Numbers holds true regardless of the shape of the probability distribution (provided the distribution has a finite mean).
* There is no “law of small numbers”: short sequences of random events do not follow the type of average behavior that occurs in the long run.
* LoLN is very important for inferential statistics.

Rules for Combining Means

If X is a random variable and a and b are constants, then

$\mu_{a+bX} = a + b\mu_X$

If X and Y are random variables, then

$\mu_{X+Y} = \mu_X + \mu_Y$

Example: We know that the probability distribution for rolling one die is

X:     1    2    3    4    5    6
P(X): 1/6  1/6  1/6  1/6  1/6  1/6

What is the mean of this distribution? $\mu_X =$

Suppose we create a new random phenomenon that doubles each roll of the die and adds 3 to it.

X:
P(X):

What is the mean of this new distribution?

Rules for Combining Variances

Rule 1: If X is a random variable and a and b are constants, then

$\sigma_{a+bX}^2 = b^2 \sigma_X^2$

Rule 2: If X and Y are independent random variables, then

$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$
$\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2$

Rule 3: If X and Y are dependent random variables with correlation $\rho$, then

$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y$

Example: Find the mean and variance for rolling one die and doubling the pips.

Example: Find the mean and variance for rolling two dice and counting the pips.

Use the above example to explain why, in Statistics, $X + X \neq 2X$.

Why do variances add even when we are finding the difference between two variables?
To answer this question, we are going to substitute range for variance as our measure of variability.

Imagine you have a basket of grapefruit weighing 14 to 22 ounces and a basket of oranges weighing 7 to 10 ounces.

What is the range of grapefruit weights?
What is the range of orange weights?

Now suppose that you are going to randomly pick one fruit from each basket. What is the range of the possible sum of the two weights?

Max:
Min:
Range:

What could the range of the difference in the two weights be?

Max:
Min:
Range:
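
One way to check the grapefruit-and-orange reasoning is with a short script. The sketch below is illustrative only. It assumes the weight intervals given above, and the variable names and the `variance` helper are made up for this example. It computes the range of the sum and of the difference, then repeats the comparison with exact variances for two independent fair dice.

```python
from fractions import Fraction
from itertools import product

# Weight intervals in ounces, as stated in the text: (lightest, heaviest).
grapefruit = (14, 22)
orange = (7, 10)

# Sum of the two weights: lightest + lightest up to heaviest + heaviest.
sum_min = grapefruit[0] + orange[0]
sum_max = grapefruit[1] + orange[1]
sum_range = sum_max - sum_min

# Difference (grapefruit - orange): the smallest gap pairs the lightest
# grapefruit with the heaviest orange; the largest gap does the opposite.
diff_min = grapefruit[0] - orange[1]
diff_max = grapefruit[1] - orange[0]
diff_range = diff_max - diff_min

print("range of sum:", sum_range)
print("range of difference:", diff_range)


def variance(values):
    """Exact variance of a list of equally likely outcomes (hypothetical helper)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n


faces = range(1, 7)  # one fair die

var_x = variance(list(faces))                                    # one roll
var_sum = variance([a + b for a, b in product(faces, repeat=2)])   # X + Y
var_diff = variance([a - b for a, b in product(faces, repeat=2)])  # X - Y
var_2x = variance([2 * a for a in faces])                        # 2X

print("Var(X)   =", var_x)
print("Var(X+Y) =", var_sum)
print("Var(X-Y) =", var_diff)
print("Var(2X)  =", var_2x)
```

In both parts the spread of the difference matches the spread of the sum, because subtracting an uncertain quantity adds just as much uncertainty as adding one. The last line also shows that doubling a single roll spreads the values out more than adding two independent rolls.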