Download Introducing the notion of random variable:

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Worksheet on Random Variables (no computer needed)
Topic: Introduction to random variables and the idea of distribution.
(Material needed : two distinguishable dice)
1. A random experiment.
Take two dice (if they are the same color, use a marker to color one of a different color, in this
worksheet we will call them ‘green’ and ‘red’ but you can use any other colors).
a) You will roll the two dice. Before you roll those dice, remember the definition of random
experiment and explain why rolling these two dice is a random experiment.
b) Think: what are the possible values for the ‘green die’ ? ______________________
What are the possible values for the ‘red’ die? ______________________________
Each ‘outcome’ of these experiment is a pair of values (one of the ‘red’ and one of the
‘green’), for example (3,5) is a possible outcome, another one is (2,2).
Remember the definition of ‘sample space’.
How many ‘outcomes’ you think the ‘sample space’ will have? ______
Write the sample space here:
2. Introducing the notion of random variable
Maybe I am not interested in exactly which is the outcome that occurs, maybe I am interested
only in certain aspect of the outcome, like ‘what is the difference between the two values’ or
‘what is the sum of the two values?’ or ‘which is the highest of the two values’? In that case
we define a random variable, a variable which value depends on the outcome of a random
experiment. Below you see the list of possible outcomes and the values the two random
variables (X and Z) take for each outcome
green
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
1
1
1
1
1
1
2
2
2
2
2
2
3
3
3
3
3
3
red
1
2
3
4
5
6
1
2
3
4
5
6
1
2
3
4
5
6
x=sum
2
3
4
5
6
7
3
4
5
6
7
8
4
5
6
7
8
9
z=difference
0
1
2
3
4
5
1
0
1
2
3
4
2
1
0
1
2
3
first
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
4
4
4
4
4
4
5
5
5
5
5
5
6
6
6
6
6
6
second
1
2
3
4
5
6
1
2
3
4
5
6
1
2
3
4
5
6
x=sum
5
6
7
8
9
10
6
7
8
9
10
11
7
8
9
10
11
12
z=difference
3
2
1
0
1
2
4
3
2
1
0
1
5
4
3
2
1
0
If both dice are fair, the probability of each outcome is 1/36. Now count in how many of those
outcomes the sum of the two results is 2, 3,4,……
We can make a list of all the values that the variable X (sum of the green and the red dice values) and
write next to them the probability of each value.
X
P(x)
2
3
4
5
6
7
8
9
10
11
12
So, you have come up with a table that describes the ‘distribution’ of the random variable (the values
of the variable and the probability of each value). This is a ‘discrete’ random variable, because it only
takes integer values. There are other random variables that are continuous.
X
2
3
4
5
6
7
8
9
10
11
12
P(x)
1/36
2/36
3/36
4/36
5/36
6/36
5/36
4/36
3/36
2/36
1/36
We can plot the values of the probability and get a visual representation of the distribution. The
distribution looks symmetric.
0.15
P(x)
0.10
0.05
0.00
2
7
12
X
In the same way, we could think of the distribution of the variable Z : absolute value of the
difference of the two results
Make a graph to display the distribution of Z
Z
P(z)
0
6/36
1
10/36
2
8/36
3
6/36
4
4/36
5
2/36
Probability distributions, same as the
frequency distributions we learned before,
can be symmetric or skewed.
Now, use the same sample space to find the distribution of the variable, H=’the highest of the
two values’
h
P(h)
Obtain a graphical representation of the distribution of h.
3.Mean or ‘expected value’ of a random variable.
Roll the two dice 10 times, report the result for each dice and calculate the sum of the two
outcomes each time:
Roll # 1
‘green’
‘red’
sum
2
3
4
5
6
7
8
9
10
You have done 10 ‘realizations’ of a random experiment, this is equivalent to taking a sample
of size 10 from a population. Now calculate the mean of the variable ‘sum’ for these 10
realizations. Report the mean here ____________ (this would be equivalent to a ‘sample
mean’).
Think that we continue rolling the two dice for ever and ever, what would be the mean
(average) value of the sum of the two dice in the very long run? This would be equivalent to
the mean of a population. There is a formula that allows us to calculate that mean without
having to actually roll the dice for ever and ever.
For discrete random variables the ‘mean’ or ‘expected value’ of a random variable is
E ( X )   xi p ( xi ) (each possible value of the variable is multiplied by its probability and then
we add those products).
For the example of the sum of the two dice:
E ( X )   xi p ( xi ) = 2*1/36+3*2/36+4*3/36+5*4/36+6*5/36+7*6/36+8*5/36+9*4/36+10*3/36+11*2/36+12*1/36=7
2
3
4
5
6
7
8
9
10
11
12
1/36
2/36
3/36
4/36
5/36
6/36
5/36
4/36
3/36
2/36
1/36
So if we repeat the random experiment of tossing two dice, in the very long run, the average
value of the sum of the values in the two dice would be 7.
Note.- Not only the mean or expected value of a variable can be calculated, also the variance
Var(X) could be calculated . An easy way of calculating the variance is :
Var(X)=E(X2)- [E(X)]2
In the example:
E ( X 2 )   x 2 i p( xi ) =
4*1/36+9*2/36+16*3/36+25*4/36+36*5/36+49*6/36+64*5/36+81*4/36+100*3/36+121*2/36+144*1/36= ?
Complete the calculation and calculate Var(x)=E(X2)-72 =
Exercise :
1) Calculate the mean value E(Z) for the variable Z: difference (in absolute value) of the two
outcomes. Show your work:
E(Z)=
2) Calculate the mean or expected value E(H) of the variable H: ‘the largest of the two
numbers”
Related documents