Worksheet on Random Variables (no computer needed)

Topic: Introduction to random variables and the idea of distribution.
(Material needed: two distinguishable dice)

1. A random experiment

Take two dice. If they are the same color, use a marker to color one of them differently. This worksheet calls them 'green' and 'red', but you can use any two colors.

a) You will roll the two dice. Before you roll them, recall the definition of a random experiment and explain why rolling these two dice is a random experiment.

b) Think: what are the possible values for the 'green' die? ______________________
What are the possible values for the 'red' die? ______________________________

Each 'outcome' of this experiment is a pair of values (one for the 'green' die and one for the 'red' die). For example, (3,5) is a possible outcome; another one is (2,2). Recall the definition of 'sample space'. How many outcomes do you think the sample space will have? ______

Write the sample space here:



2. Introducing the notion of random variable

Maybe I am not interested in exactly which outcome occurs, but only in a certain aspect of the outcome, such as 'what is the difference between the two values?', 'what is the sum of the two values?' or 'which is the highest of the two values?'. In that case we define a random variable: a variable whose value depends on the outcome of a random experiment.
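The worksheet itself needs no computer, but readers who want to check their answers afterwards could use a minimal Python sketch like the one below. It lists the sample space and treats each random variable as a function of the outcome. The names `sample_space`, `total`, `abs_difference` and `highest` are illustrative choices, not part of the worksheet, and each outcome is assumed to be written as (green, red).

```python
from itertools import product

# Each die shows a face from 1 to 6; an outcome is the pair (green, red).
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

# A random variable assigns one number to every outcome.
def total(outcome):
    green, red = outcome
    return green + red

def abs_difference(outcome):
    green, red = outcome
    return abs(green - red)

def highest(outcome):
    return max(outcome)

# Show the value of each variable for a few example outcomes.
for outcome in [(3, 5), (2, 2), (6, 1)]:
    print(outcome, total(outcome), abs_difference(outcome), highest(outcome))

# Number of outcomes in the sample space.
print(len(sample_space))
```

Writing the variables as functions makes the definition concrete: the randomness lives in the outcome, and the variable only reads a number off it.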
Below you see the list of possible outcomes and the values that two random variables (X, the sum, and Z, the absolute difference) take for each outcome.

outcome #       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
green           1  1  1  1  1  1  2  2  2  2  2  2  3  3  3  3  3  3
red             1  2  3  4  5  6  1  2  3  4  5  6  1  2  3  4  5  6
x=sum           2  3  4  5  6  7  3  4  5  6  7  8  4  5  6  7  8  9
z=difference    0  1  2  3  4  5  1  0  1  2  3  4  2  1  0  1  2  3

outcome #      19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
green           4  4  4  4  4  4  5  5  5  5  5  5  6  6  6  6  6  6
red             1  2  3  4  5  6  1  2  3  4  5  6  1  2  3  4  5  6
x=sum           5  6  7  8  9 10  6  7  8  9 10 11  7  8  9 10 11 12
z=difference    3  2  1  0  1  2  4  3  2  1  0  1  5  4  3  2  1  0

If both dice are fair, the probability of each outcome is 1/36. Now count in how many of those outcomes the sum of the two results is 2, 3, 4, ... We can list all the values that the variable X (sum of the green and red dice values) can take and write next to each value its probability.

X      P(x)
2      ______
3      ______
4      ______
5      ______
6      ______
7      ______
8      ______
9      ______
10     ______
11     ______
12     ______

So, you have come up with a table that describes the 'distribution' of the random variable (the values of the variable and the probability of each value). This is a 'discrete' random variable, because it takes only a list of separate values (here the integers 2 to 12). There are other random variables that are continuous.

X      2     3     4     5     6     7     8     9     10    11    12
P(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

We can plot the values of the probability and get a visual representation of the distribution. The distribution looks symmetric.

[Figure: plot of P(x) against X, with X from 2 to 12 and P(x) from 0.00 to 0.15]

In the same way, we could think of the distribution of the variable Z: the absolute value of the difference of the two results.

Z      P(z)
0      6/36
1      10/36
2      8/36
3      6/36
4      4/36
5      2/36

Make a graph to display the distribution of Z.



Probability distributions, like the frequency distributions we learned about before, can be symmetric or skewed.

Now, use the same sample space to find the distribution of the variable H = 'the highest of the two values'.

h      P(h)



Obtain a graphical representation of the distribution of H.



3. Mean or 'expected value' of a random variable
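Before turning to the mean, the distribution tables of section 2 can be double-checked by counting outcomes, since the expected value is built from these same probabilities. The following is a minimal Python sketch, assuming fair dice and outcomes written as (green, red). The function name `distribution` is an illustrative choice, not part of the worksheet.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def distribution(variable):
    """Return {value: probability} for a function of the (green, red) outcome."""
    outcomes = list(product(range(1, 7), repeat=2))
    # Count how many of the equally likely outcomes give each value.
    counts = Counter(variable(outcome) for outcome in outcomes)
    return {value: Fraction(count, len(outcomes))
            for value, count in sorted(counts.items())}

dist_x = distribution(lambda o: o[0] + o[1])       # X: sum
dist_z = distribution(lambda o: abs(o[0] - o[1]))  # Z: absolute difference

# Print the probabilities over the common denominator 36, as in the worksheet.
for name, dist in (("X", dist_x), ("Z", dist_z)):
    for value, prob in dist.items():
        print(name, value, f"{int(prob * 36)}/36")
```

The same function could be applied to H by passing Python's built-in `max`, which is one way to check the table you filled in by hand.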
Roll the two dice 10 times, record the result for each die and calculate the sum of the two outcomes each time:

Roll #     1    2    3    4    5    6    7    8    9    10
'green'
'red'
sum

You have done 10 'realizations' of a random experiment; this is equivalent to taking a sample of size 10 from a population. Now calculate the mean of the variable 'sum' for these 10 realizations. Report the mean here ____________ (this is equivalent to a 'sample mean').

Suppose that we continue rolling the two dice for ever and ever. What would be the mean (average) value of the sum of the two dice in the very long run? This would be equivalent to the mean of a population. There is a formula that allows us to calculate that mean without having to actually roll the dice for ever and ever.

For discrete random variables the 'mean' or 'expected value' of a random variable is

$E(X) = \sum_i x_i \, p(x_i)$

(each possible value of the variable is multiplied by its probability and then we add those products).

For the example of the sum of the two dice:

X      2     3     4     5     6     7     8     9     10    11    12
P(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

$E(X) = \sum_i x_i \, p(x_i)$ = 2*1/36 + 3*2/36 + 4*3/36 + 5*4/36 + 6*5/36 + 7*6/36 + 8*5/36 + 9*4/36 + 10*3/36 + 11*2/36 + 12*1/36 = 7

So if we repeat the random experiment of rolling two dice, in the very long run the average value of the sum of the values on the two dice would be 7.

Note: Not only the mean or expected value of a variable can be calculated; the variance Var(X) can be calculated too. An easy way of calculating the variance is:

$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$

In the example:

$E(X^2) = \sum_i x_i^2 \, p(x_i)$ = 4*1/36 + 9*2/36 + 16*3/36 + 25*4/36 + 36*5/36 + 49*6/36 + 64*5/36 + 81*4/36 + 100*3/36 + 121*2/36 + 144*1/36 = ?

Complete the calculation and calculate $\mathrm{Var}(X) = E(X^2) - 7^2$ =

Exercise:

1) Calculate the mean value E(Z) for the variable Z: difference (in absolute value) of the two outcomes. Show your work:

E(Z) =

2) Calculate the mean or expected value E(H) of the variable H: 'the largest of the two numbers'.
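The formulas for E(X) and Var(X) in this section can be checked after working them by hand with a short Python sketch. It is only an illustration under the worksheet's assumption of fair dice. The helper names `expected_value`, `variance` and `sample_mean_of_sums` are hypothetical, and the simulated rolls use Python's pseudo-random generator with an arbitrary seed, so the sample means will differ from those of your own dice.

```python
import random
from fractions import Fraction

# Distribution of X = sum of two fair dice, as in the worksheet table:
# 1/36, 2/36, ..., 6/36, ..., 2/36, 1/36 for x = 2, 3, ..., 12.
counts = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
dist_x = {x: Fraction(c, 36) for x, c in zip(range(2, 13), counts)}

def expected_value(dist, g=lambda x: x):
    """E[g(X)]: multiply g(x) by p(x) for each value x and add the products."""
    return sum(g(x) * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expected_value(dist)
    return expected_value(dist, lambda x: x * x) - mean ** 2

def sample_mean_of_sums(n_rolls, rng):
    """Roll two simulated dice n_rolls times and average the sums."""
    sums = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]
    return sum(sums) / n_rolls

print("E(X)   =", expected_value(dist_x))
print("Var(X) =", variance(dist_x))

# The sample mean settles near E(X) as the number of rolls grows.
rng = random.Random(0)  # arbitrary seed, chosen only for repeatability
for n in (10, 1000, 100000):
    print(n, "rolls: sample mean =", sample_mean_of_sums(n, rng))
```

Passing a different distribution dictionary (for example one built from your table for Z or H) to `expected_value` gives a way to check the exercise answers once you have worked them out.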