RANDOM VARIABLES and
PROBABILITY DISTRIBUTIONS
A random variable X is a function that assigns (real) numbers to the elements of the sample
space S of a random experiment.
The value space V of a random variable X is the set of all possible values of X.
A discrete random variable is one whose value space is finite or countably infinite.
A continuous r.v. is one whose value space is an interval of real numbers.
The probability function of a discrete random variable is the function that specifies the
probabilities of the r.v. assuming each of the various values in the value space.
That is, $p(x) = P[X = x]$.
The probability distribution of a discrete random variable consists of its value space together
with its probability function.
That is, the "possibilities" together with their "probabilities".
It may be specified in function form or may be presented as a table of values or as a probability
histogram.
EXAMPLE: A "hand" of 5 cards is dealt from a thoroughly shuffled deck of cards. There are
$\binom{52}{5} = 2{,}598{,}960$ different possible hands in the sample space S.
Let X denote the number of "Hearts" in the hand so dealt. Then the value space is
V = {0, 1, 2, 3, 4, 5}. (V lists the possibilities)
The probabilities are:
$$p(0) = P[X = 0] = \frac{\binom{13}{0}\binom{39}{5}}{\binom{52}{5}} = 0.2215$$
$$p(1) = P[X = 1] = \frac{\binom{13}{1}\binom{39}{4}}{\binom{52}{5}} = 0.4114$$
$$p(2) = P[X = 2] = \frac{\binom{13}{2}\binom{39}{3}}{\binom{52}{5}} = 0.2743$$
$$p(3) = P[X = 3] = \frac{\binom{13}{3}\binom{39}{2}}{\binom{52}{5}} = 0.0815$$
$$p(4) = P[X = 4] = \frac{\binom{13}{4}\binom{39}{1}}{\binom{52}{5}} = 0.0107$$
$$p(5) = P[X = 5] = \frac{\binom{13}{5}\binom{39}{0}}{\binom{52}{5}} = 0.0005$$
The probability function for this r.v. X is
$$p(x) = P[X = x] = \frac{\binom{13}{x}\binom{39}{5-x}}{\binom{52}{5}}$$
The probability distribution of this r.v. X is
$$p(x) = P[X = x] = \frac{\binom{13}{x}\binom{39}{5-x}}{\binom{52}{5}}$$
for $x \in V$; i.e. for $x = 0, 1, 2, 3, 4, 5$.
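The table values that follow can be reproduced directly from this formula. The sketch below is one way to do so in Python; the function name `hearts_pmf` and its parameter names are illustrative choices, not part of the notes.

```python
from math import comb

def hearts_pmf(x, hearts=13, others=39, hand=5):
    """P[X = x]: probability of exactly x Hearts in a 5-card hand."""
    # Ways to pick x Hearts and (hand - x) non-Hearts, over all possible hands
    return comb(hearts, x) * comb(others, hand - x) / comb(hearts + others, hand)

total_hands = comb(52, 5)  # number of hands in the sample space S

for x in range(6):
    print(x, round(hearts_pmf(x), 4))
```

Rounded to 4 decimal places, the printed values should agree with the probabilities listed above.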
Probability Distribution in table form
Possible Value: x =    0       1       2       3       4       5       Total
Probability: p(x) =    0.2215  0.4114  0.2743  0.0815  0.0107  0.0005  0.9999 or 1.0000
Basic Requirements of p(x)
a. $0 \le p(x) \le 1$ for each $x \in V$
b. $\sum_{\text{all } x \in V} p(x) = 1$
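Both requirements can be checked for the Hearts example. The sketch below uses exact fractions, so that requirement b holds exactly, and then shows that summing the 4-decimal table entries gives 0.9999 instead; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Exact probabilities p(x) = C(13,x) C(39,5-x) / C(52,5) for x = 0..5
pmf = {x: Fraction(comb(13, x) * comb(39, 5 - x), comb(52, 5)) for x in range(6)}

# Requirement a: every probability lies between 0 and 1
requirement_a = all(0 <= p <= 1 for p in pmf.values())

# Requirement b: the exact probabilities sum to 1
exact_total = sum(pmf.values())

# Summing the probabilities after rounding each to 4 decimals
rounded_total = sum(round(float(p), 4) for p in pmf.values())

print(requirement_a, exact_total, round(rounded_total, 4))
```

The gap between the exact total and the rounded total is why the table reports "0.9999 or 1.0000".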
Histogram of Probability Distribution
[Figure: Probability Histogram for Number of Hearts in 5-Card Hand. Horizontal axis: Number of Hearts (Value of X), 0 to 5. Vertical axis: Probability, 0.00 to 0.45.]
Question: What is the average number of Hearts in a 5-card hand selected this way?
Definition: The mean or expected value of a discrete random variable X is defined as
$$\mu = E[X] = \sum_{\text{all } x \in V} x \cdot p(x).$$
Example:
$$\begin{aligned}
\mu = E[X] &= \sum_{\text{all } x \in V} x \cdot p(x) = \sum_{x=0}^{5} x \cdot p(x) \\
&= 0(0.2215) + 1(0.4114) + 2(0.2743) + 3(0.0815) + 4(0.0107) + 5(0.0005) \\
&= 0 + 0.4114 + 0.5486 + 0.2445 + 0.0428 + 0.0025 \\
&= 1.2498
\end{aligned}$$
The exact value is 1.25; the computed value is slightly off because the probabilities were rounded to 4 decimal places.
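The effect of rounding on the mean can be seen by computing it twice: once with exact fractions and once with the 4-decimal table values. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Exact probabilities for the number of Hearts in a 5-card hand
pmf = {x: Fraction(comb(13, x) * comb(39, 5 - x), comb(52, 5)) for x in range(6)}

# Mean from the definition: sum of x * p(x) over the value space
mean_exact = sum(x * p for x, p in pmf.items())

# Same sum using the probabilities rounded to 4 decimal places
table = {0: 0.2215, 1: 0.4114, 2: 0.2743, 3: 0.0815, 4: 0.0107, 5: 0.0005}
mean_rounded = sum(x * p for x, p in table.items())

print(mean_exact, round(mean_rounded, 4))
```

The exact result is the fraction 5/4, while the rounded table gives 1.2498.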
Calculating the Mean in Table Form
x        p(x)      x p(x)
0        0.2215    0.0000
1        0.4114    0.4114
2        0.2743    0.5486
3        0.0815    0.2445
4        0.0107    0.0428
5        0.0005    0.0025
Total    0.9999    μ = 1.2498
Definition: The Variance of a discrete r.v. X is
$$\sigma^2 = Var[X] = E\left[(X - \mu)^2\right] = \sum_{\text{all } x \in V} (x - \mu)^2 \cdot p(x)$$
Example:
$$\begin{aligned}
\sigma^2 = Var[X] = E\left[(X - \mu)^2\right] &= \sum_{x=0}^{5} (x - 1.25)^2 \cdot p(x) \\
&= (0 - 1.25)^2 (0.2215) + (1 - 1.25)^2 (0.4114) + (2 - 1.25)^2 (0.2743) \\
&\quad + (3 - 1.25)^2 (0.0815) + (4 - 1.25)^2 (0.0107) + (5 - 1.25)^2 (0.0005) \\
&= 0.34609375 + 0.0257125 + 0.15429375 + 0.24959375 + 0.08091875 + 0.00703125 \\
&= 0.86364375
\end{aligned}$$
[Based on other information, the correct value without rounding error is 0.86397]
Comment: The variance is the expected value of the quantity $(X - \mu)^2$. It is a measure of the
amount of variability or variation to be expected among the possible values of a random variable.
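As a check on the worked example, the variance can be computed from the definition with both the rounded table values and the exact probabilities. This is a sketch under the same setup as the Hearts example; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Variance from the definition, using the 4-decimal table and mu = 1.25
table = {0: 0.2215, 1: 0.4114, 2: 0.2743, 3: 0.0815, 4: 0.0107, 5: 0.0005}
var_rounded = sum((x - 1.25) ** 2 * p for x, p in table.items())

# Same definition with exact probabilities, which avoids rounding error
pmf = {x: Fraction(comb(13, x) * comb(39, 5 - x), comb(52, 5)) for x in range(6)}
mu = sum(x * p for x, p in pmf.items())
var_exact = sum((x - mu) ** 2 * p for x, p in pmf.items())

print(round(var_rounded, 8), var_exact, round(float(var_exact), 5))
```

The exact variance comes out as the fraction 235/272, which rounds to the value 0.86397 quoted above.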
Definition: The expected value of $X^2$, the square of X, is
$$E[X^2] = \sum_{\text{all } x \in V} x^2 \cdot p(x)$$
Example:
$$E[X^2] = \sum_{\text{all } x \in V} x^2 \cdot p(x) = \sum_{x=0}^{5} x^2 \cdot p(x)$$
Calculating $E[X^2]$ in Table Form
x        p(x)      x^2 p(x)
0        0.2215    0.0000
1        0.4114    0.4114
2        0.2743    1.0972
3        0.0815    0.7335
4        0.0107    0.1712
5        0.0005    0.0125
Total    0.9999    E[X^2] = 2.4258
Theorem: The variance of any random variable X can be determined by
$$\begin{aligned}
\sigma^2 = Var[X] &= E\left[(X - \mu)^2\right] && \text{(the definition)} \\
&= E[X^2] - \mu^2 && \text{(a useful calculation method)}
\end{aligned}$$
Example:
$$E[X^2] - \mu^2 = 2.4258 - 1.2498^2 = 2.4258 - 1.5620 = 0.8638$$
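Actually, $E[X^2] = \frac{165}{68} = 2.426470588$ so that
$$\sigma^2 = E[X^2] - \mu^2 = \frac{165}{68} - \left(\frac{5}{4}\right)^2 = 2.4265 - 1.5625 = 0.86397$$
which is what we had stated earlier. The rounding error in $E[X^2]$ (2.4258 versus 2.4265) is larger than the rounding error in the mean because the values of x are squared before being multiplied by the rounded probabilities.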
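The theorem can be confirmed for the Hearts example by computing the variance both ways with exact fractions. The following is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Exact probabilities for the number of Hearts in a 5-card hand
pmf = {x: Fraction(comb(13, x) * comb(39, 5 - x), comb(52, 5)) for x in range(6)}

mu = sum(x * p for x, p in pmf.items())        # E[X]
ex2 = sum(x ** 2 * p for x, p in pmf.items())  # E[X^2]

var_shortcut = ex2 - mu ** 2                                     # E[X^2] - mu^2
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]

print(ex2, var_shortcut, var_shortcut == var_definition)
```

With exact arithmetic the two methods agree, so any difference seen in the hand calculations comes from rounding alone.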
Definition: The Standard Deviation of a r.v. X is the square root of its variance.
$$\sigma = \sqrt{\sigma^2} = \sqrt{Var[X]}.$$
Example:
$$\sigma = \sqrt{\sigma^2} = \sqrt{0.86397} = 0.9295$$
Comment: The standard deviation is the most commonly used measure of variation or
variability of the values of a random variable.
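A short sketch of the standard deviation calculation for the Hearts example follows. It recomputes the variance from the probability function so that it stands alone; the names are illustrative.

```python
from math import comb, sqrt

# Probabilities for the number of Hearts in a 5-card hand
pmf = {x: comb(13, x) * comb(39, 5 - x) / comb(52, 5) for x in range(6)}

mu = sum(x * p for x, p in pmf.items())
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())

# The standard deviation is the square root of the variance
sigma = sqrt(variance)

print(round(variance, 5), round(sigma, 4))
```

Unlike the variance, the standard deviation is in the same units as X (here, number of Hearts), which is one reason it is the more commonly quoted measure.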
Empirical Rule
If the shape of a probability distribution is mound-shaped and fairly symmetric, then the
amount of probability between:
a. $\mu - \sigma$ and $\mu + \sigma$ is about 0.68
b. $\mu - 2\sigma$ and $\mu + 2\sigma$ is about 0.95
c. $\mu - 3\sigma$ and $\mu + 3\sigma$ is almost 1.00
For a discrete random variable, look at its histogram. The histogram for the above example of
Hearts in a 5-card hand is not symmetric but it does have a "mound" or high region. How well
does the Empirical Rule apply in this case?
Here $\mu = 1.25$ and $\sigma = 0.93$. The interval between $\mu - \sigma$ and $\mu + \sigma$ is from
$\mu - \sigma = 1.25 - 0.93 = 0.32$ to $\mu + \sigma = 1.25 + 0.93 = 2.18$.
Thus,
$$P[\mu - \sigma < X < \mu + \sigma] = P[0.32 < X < 2.18] = P[X = 1 \text{ or } 2] = 0.4114 + 0.2743 = 0.6857$$
which is quite close to the predicted value of 0.68, even though the histogram is not symmetric and we would therefore not expect the Empirical Rule to work very well.
Similarly, the interval between $\mu - 2\sigma$ and $\mu + 2\sigma$ is from $\mu - 2\sigma = 1.25 - 1.86 = -0.61$ to
$\mu + 2\sigma = 1.25 + 1.86 = 3.11$ and
$$P[\mu - 2\sigma < X < \mu + 2\sigma] = P[-0.61 < X < 3.11] = P[X = 0 \text{ or } 1 \text{ or } 2 \text{ or } 3]$$
$$= 0.2215 + 0.4114 + 0.2743 + 0.0815 = 0.9887$$
This value is considerably higher than the Empirical Rule value of 0.95.
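The two interval calculations above follow the same pattern, which the sketch below expresses as a small helper. It uses the rounded table probabilities with $\mu = 1.25$ and $\sigma = 0.93$ as in the text; the function name `prob_within` is an illustrative choice.

```python
# Rounded probabilities from the table for the number of Hearts
table = {0: 0.2215, 1: 0.4114, 2: 0.2743, 3: 0.0815, 4: 0.0107, 5: 0.0005}
mu, sigma = 1.25, 0.93

def prob_within(k):
    """Total probability of values x with mu - k*sigma < x < mu + k*sigma."""
    low, high = mu - k * sigma, mu + k * sigma
    return sum(p for x, p in table.items() if low < x < high)

# Compare each result with the Empirical Rule values 0.68, 0.95, and almost 1.00
for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

For k = 1 and k = 2 this reproduces 0.6857 and 0.9887; for k = 3 the interval runs from -1.54 to 4.04, so only x = 5 is excluded.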
Another example with a perfectly symmetric and quite mound-shaped distribution follows.
Example: If one tosses a coin 10 times and counts the number of Heads observed in the 10
tosses, the probability distribution of the random variable Y = number of Heads in 10 tosses
has the probability histogram given below. For this random variable, $\mu = 5.00$ and $\sigma = 1.58$.
[Figure: Probability histogram, "Number of Heads in 10 Tosses of a Coin". Horizontal axis: Number of Heads, 0 to 10. Vertical axis: Probability, 0.00 to 0.25.]
Reading probabilities from the histogram as accurately as you can, check to see how well the
Empirical Rule works in this case.
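Readings taken from the histogram can be compared against computed values. The sketch below assumes the coin is fair, so that each of the $2^{10}$ toss sequences is equally likely; the notes do not state this explicitly, but it is consistent with $\mu = 5.00$ and $\sigma = 1.58$. The names used are illustrative.

```python
from math import comb, sqrt

n = 10  # number of tosses; a fair coin is assumed here

# P[Y = y] = C(10, y) / 2^10 for y = 0..10
pmf = {y: comb(n, y) / 2 ** n for y in range(n + 1)}

mu = sum(y * p for y, p in pmf.items())
sigma = sqrt(sum((y - mu) ** 2 * p for y, p in pmf.items()))

def prob_within(k):
    """Total probability of values y with mu - k*sigma < y < mu + k*sigma."""
    return sum(p for y, p in pmf.items() if mu - k * sigma < y < mu + k * sigma)

print(round(mu, 2), round(sigma, 2))
for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

Under the fair-coin assumption the three probabilities are 672/1024, 1002/1024, and 1022/1024, which can be set beside the Empirical Rule values of 0.68, 0.95, and almost 1.00.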
Example: A random variable X is defined as the number of accidents a randomly chosen
Saskatchewan driver has in a one-year period. Using accident records maintained by SGI
over the past ten years, the probability distribution for r.v. X was determined to be as follows.
Number: x =                0     1     2     3     4
Probability: P[X = x] =    0.58  0.24  0.13  0.04  0.01
How many accidents does one expect a typical Saskatchewan driver to have in a 12-month
period?
How much variability does one expect to observe about this expected number?
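One way to approach these two questions is to apply the definitions of the mean, variance, and standard deviation to the table above. The sketch below does this, reading "variability" as the standard deviation; that reading and the variable names are assumptions made for illustration.

```python
from math import sqrt

# Probability distribution of the number of accidents, from the table
pmf = {0: 0.58, 1: 0.24, 2: 0.13, 3: 0.04, 4: 0.01}

# Expected number of accidents: sum of x * p(x)
mean = sum(x * p for x, p in pmf.items())

# Variance by the shortcut formula E[X^2] - mu^2
ex2 = sum(x ** 2 * p for x, p in pmf.items())
variance = ex2 - mean ** 2

# Standard deviation as the measure of variability about the mean
std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 4), round(std_dev, 4))
```

Under these assumptions the expected number is 0.66 accidents per year, with a variance of 0.8444 and a standard deviation of about 0.92.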