Chapter 6: Random Variables

Random Variables and Expected Value

Betting on Death!

Many people in America have life insurance policies. Although you might not want to think of it this way, when you purchase a life insurance policy, you're betting that you will die sooner rather than later… Although this is a bet that people really don't want to win, it is a bet that they are willing to take just to be sure that their families are financially secure in the event of death.

Most families depend on the income of one or more people in the household. What would happen if that income suddenly disappeared? Life insurance helps us handle such disasters.

When you purchase a life insurance policy, it's in your best interests that the company makes a profit and does well; why do you think that is?

Question: You purchase a policy that charges only $50 a year. If it pays $10,000 for death and $5,000 for a permanent disability, is the company likely to make a profit?

Actuaries for the company have determined the following probabilities in any given year:

P(Death) = 1/1000
P(Permanently disabled) = 2/1000
P(Healthy) = 997/1000

We'll come back to this problem later on…

Random Variables

A random variable is a variable whose values are numbers that are determined by the outcome of a random event.

Note: Random variables are denoted by capital letters, while the values of random variables are denoted by lowercase letters.

Discrete Random Variables and Expected Value

A discrete random variable has a countable number of outcomes. In other words, it is possible for you to count and make a list of all of the possible outcomes. The discrete random variables in this chapter take on only integer values.

Suppose, for example, that we flip a coin and count the number of heads. The number of heads results from a random process: flipping a coin. And the number of heads is represented by an integer value.
The mean of the discrete random variable X is also called the expected value of X. Notationally, the expected value of X is denoted by E(X). It is what we expect to happen on average. The formula for expected value is:

\[ \mu_X = E(X) = \sum x \cdot P(x) \]

Examples

In the experiment of flipping three coins, consider the outcomes and define the random variable X as the number of heads that appear.
The outcomes are {no heads, 1 head, 2 heads, 3 heads}.
X has values in the set {0, 1, 2, 3}.

When rolling two dice and finding the sum, determine the outcomes and the random variable Y.
The outcomes are {(1,1), (1,2), (1,3), (1,4), (1,5), etc…}.
Y has values in the set {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.

In our life insurance example, what are the outcomes and the random variable Z if we define it as the possible payment?
The outcomes are {die, disabled, healthy}.
Z has values in the set {$10,000, $5,000, $0}.

Back to Betting on Death

So, will the company make a profit in any given year? How much will they make or lose? These questions are answered by finding the expected value.

Policyholder outcome    Payout x    Probability P(X = x)
Die                     $10,000     1/1000
Disability              $5,000      2/1000
Healthy                 $0          997/1000

The expected value is:

\[ \mu_X = E(X) = \sum x \cdot P(x) = \$10{,}000\left(\tfrac{1}{1000}\right) + \$5{,}000\left(\tfrac{2}{1000}\right) + \$0\left(\tfrac{997}{1000}\right) = \$10 + \$10 + \$0 = \$20 \]

So, what does this mean? The expected value for the company is a payout, on average, of $20 per customer per year. Since each customer pays $50 per year, the company expects to make a profit of $30 per customer per year.

It's important to note that the insurance company will never really pay anyone $20; it only pays $10,000, $5,000, or $0. By the Law of Large Numbers (LLN), $20 is the expected average payout over a large number of policyholders.

Labor Costs

A car's air conditioner recently needed to be repaired at the auto shop. The mechanic said that he could fix it for $60 in 75% of the cases by drawing down and recharging the coolant. If that fails, it will cost an additional $140 to replace the unit.
What are the outcomes, the random variable, and the probability distribution?

Outcome            Cost x    Probability P(X = x)
Quick fix works    $60       3/4 = 0.75
Replace unit       $200      1/4 = 0.25

What is the expected value of the cost of this repair?

\[ \mu_X = E(X) = \sum x \cdot P(x) = \$60(0.75) + \$200(0.25) = \$45 + \$50 = \$95 \]

What does this mean in the context of this problem? Car owners with this problem will spend an average of $95 to get their car fixed at this auto shop.

Got to Love Those Aces

It costs $5 to play a game in which you draw a card from a standard 52-card deck. If you get the ace of hearts, you get $100. If you get any other ace, you get $10. If you get any other heart, you get your $5 back. If you get any other card, you lose.

Make a probability distribution for this game. Make a histogram of the probability distribution. What is the expected value of this game, and is it worth it to play this game?

First, you want to determine your possible winnings (let's include the $5 cost) and probabilities:

Outcome          X = Payout    Probability P(X = x)
Ace of hearts    $95           1/52 = 0.0192
Other aces       $5            3/52 = 0.0577
Other hearts     $0            12/52 = 0.2308
Other cards      -$5           36/52 = 0.6923

Now, we can find the expected value, E(X):

\[ \mu_X = E(X) = \sum x \cdot P(x) = \$95(0.0192) + \$5(0.0577) + \$0(0.2308) + (-\$5)(0.6923) = \$1.82 + \$0.29 + \$0 - \$3.46 = -\$1.35 \]

Is this game worth playing?

Probability Histogram

We can use histograms to display probability distributions as well as distributions of data.
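
The card-game calculation above can be double-checked with a few lines of Python. This is only a sketch: the names `card_game` and `expected_value` are made up for illustration, and exact fractions are used so that rounding does not hide the sign of the answer. The last loop prints a rough text version of the probability histogram.

```python
from fractions import Fraction

# Net winnings (payout minus the $5 entry fee) mapped to their probabilities.
card_game = {
    95: Fraction(1, 52),   # ace of hearts
    5: Fraction(3, 52),    # any other ace
    0: Fraction(12, 52),   # any other heart
    -5: Fraction(36, 52),  # any other card
}

def expected_value(dist):
    """Return the sum of x * P(x) over a {value: probability} mapping."""
    return sum(x * p for x, p in dist.items())

# A probability distribution must sum to 1.
assert sum(card_game.values()) == 1

ev = expected_value(card_game)
print(ev, round(float(ev), 2))  # exact fraction, then dollars rounded to cents

# Rough text histogram: one '#' per 2% of probability (rounded).
for x, p in sorted(card_game.items()):
    print(f"{x:>4} {'#' * round(float(p) * 50)}")
```

The exact expected value is -35/26, about -$1.35 per game, which matches the hand calculation.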
Continuous Random Variable

Continuous random variables, in contrast, can take on any value within a range of values. A continuous random variable is not countable. In other words, you cannot list every single possible outcome. For example, consider the amount of water you can put into a 5-gallon container: there are an infinite number of possibilities.

Example

Which of the following is a discrete random variable?

I. The average height of a randomly selected group of boys.
II. The annual number of sweepstakes winners from New York City.
III. The number of presidential elections in the 20th century.

(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III

Solution

The correct answer is B. The annual number of sweepstakes winners is an integer value and it results from a random process, so it is a discrete random variable. The average height of a group of boys can take any value within a range, so it is not a discrete variable. And the number of presidential elections in the 20th century is an integer, but it does not vary and it does not result from a random process, so it is not a random variable.

Means and Variances of Random Variables

Recall that the mean, \(\bar{x}\), of a set of observations is our ordinary average. The mean of a random variable X is a weighted average: it takes into account the fact that not all outcomes need be equally likely.

The mean of a random variable X is also called the expected value of X. The expected value takes probability into account.

The Variance of a Random Variable

If \(\mu_X\) is the mean of X, then the variance of X is

\[ \sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i \]

The standard deviation \(\sigma_X\) of X is the square root of the variance.

Let's go back and try to determine the variance and standard deviation of our insurance policy problem.
Recall: \(\mu_X = E(X) = \$20\)

Policyholder outcome    Payout x    Probability P(X = x)    Deviation (x - μ)
Death                   $10,000     1/1000                  (10,000 - 20) = 9,980
Disability              $5,000      2/1000                  (5,000 - 20) = 4,980
Neither                 $0          997/1000                (0 - 20) = -20

The variance is the expected value of those squared deviations:

\[ Var(X) = 9980^2\left(\tfrac{1}{1000}\right) + 4980^2\left(\tfrac{2}{1000}\right) + (-20)^2\left(\tfrac{997}{1000}\right) = 149{,}600 \]

To find the standard deviation of our problem, we find the square root of our variance:

\[ SD(X) = \sqrt{Var(X)} = \sqrt{149{,}600} \approx \$386.78 \]

So the insurance company can expect an average payout of $20 per policy with a standard deviation of about $386.78.

What does this mean? The company charges $50 for each policy and expects to pay $20 per policy, so there is a $30 profit for each policy (on average). However, a standard deviation of $386.78 is very large compared with an expected profit of just $30. Remember, in a normal distribution, about 68% of the values fall within one SD of the mean.

Another Example: Playing a game

Let's say that you want to play a spinner game that costs $5 to play. The spinner lands on each region with the following probabilities:

Spin    A      B      C
p       1/3    1/6    1/2

The payouts are as follows:
If it lands on A, you get $5.
If it lands on B, you get $12.
If it lands on C, you get $0.

After subtracting the $5 cost to play, you end up with the following random variable X:

Spin    X     p
A       0     1/3
B       7     1/6
C       -5    1/2

What is the expected value of the game? Using the distribution and random variable in the table, we get the following:

\[ \mu_X = E(X) = 0\left(\tfrac{1}{3}\right) + 7\left(\tfrac{1}{6}\right) + (-5)\left(\tfrac{1}{2}\right) = 0 + \tfrac{7}{6} - \tfrac{5}{2} = -\tfrac{8}{6} \approx -1.33 \]

What does this mean? This means that in the very long run, we can expect to lose about $1.33 per game on average. It is important to note that we will never actually lose $1.33 on a single game (because the net payouts are $0, $7, and -$5), but a loss of $1.33 is our average long-term result per game.
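
The insurance and spinner figures above can be reproduced with a short Python sketch. The helper names `mean` and `variance` are illustrative only; they simply apply the definitions of expected value and variance from this chapter to a table of values and probabilities, using exact fractions.

```python
from fractions import Fraction
from math import sqrt

def mean(dist):
    """Expected value: the sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Expected squared deviation from the mean: the sum of (x - mu)^2 * P(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Yearly payout per policy, from the insurance table.
insurance = {
    10_000: Fraction(1, 1000),  # death
    5_000: Fraction(2, 1000),   # permanent disability
    0: Fraction(997, 1000),     # healthy
}

# Net winnings per spin, after the $5 cost to play.
spinner = {0: Fraction(1, 3), 7: Fraction(1, 6), -5: Fraction(1, 2)}

ins_var = variance(insurance)
ins_sd = sqrt(float(ins_var))

print(mean(insurance), ins_var, round(ins_sd, 2))
print(mean(spinner), round(float(mean(spinner)), 2))
```

Working in fractions shows that the insurance variance is exactly 149,600 and that the spinner's expected value is exactly -4/3, which the slides round to -1.33.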
Now let's calculate the variance and standard deviation for the payouts of this game:

Spin    x_i    μ_X      (x_i - μ_X)    (x_i - μ_X)^2    (x_i - μ_X)^2 p_i
A       0      -1.33    1.33           1.7689           0.5896
B       7      -1.33    8.33           69.3889          11.5648
C       -5     -1.33    -3.67          13.4689          6.7344

In order to get the variance, we add up all the numbers in the last column:

\[ \sigma_X^2 = Var(X) = \sum (x_i - \mu_X)^2 p_i \approx 18.8889 \]

\[ \sigma_X = SD(X) = \sqrt{Var(X)} = \sqrt{18.8889} \approx \$4.35 \]

The Law of Large Numbers

The Law of Large Numbers states that the long-run relative frequency of repeated independent events gets closer and closer to the true probability (the true relative frequency) as the number of trials increases.

The LLN assures us that, in the long run, we can find an average value that we expect to happen, namely, the expected value.
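
To see the Law of Large Numbers at work on the spinner game, one option is a small simulation. The sketch below is an illustration, not part of the original slides: the function `simulate_average` and the fixed seed are assumptions chosen so the run is repeatable. It also computes the exact mean and variance for comparison with the table above.

```python
import random
from fractions import Fraction

# Net winnings per spin (after the $5 cost) and their probabilities.
spinner = {0: Fraction(1, 3), 7: Fraction(1, 6), -5: Fraction(1, 2)}

# Exact mean and variance from the definitions: -4/3 and 170/9.
mu = sum(x * p for x, p in spinner.items())
var = sum((x - mu) ** 2 * p for x, p in spinner.items())

def simulate_average(n_games, seed=0):
    """Play n_games independent spins and return the average net winnings."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = list(spinner)
    weights = [float(p) for p in spinner.values()]
    results = rng.choices(values, weights=weights, k=n_games)
    return sum(results) / n_games

# The average should drift toward the expected value as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(simulate_average(n), 3))

print("exact mean:", mu, "exact variance:", var)
```

With only a handful of games the average can be far from -1.33, but with many games it settles close to the expected value, which is the behavior the LLN describes. The exact variance, 170/9, agrees with the 18.8889 found by adding the last column of the table.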