STA 2023
Chapter 4 – Discrete Random Variables

Two Types of Random Variables (4.1)
o Random Variable – a variable or process that assigns each outcome of an
experiment to exactly one numerical value
o Discrete Random Variable – random variable whose range is finite or countably
infinite
 Binomial (4.4)
 Poisson (4.5)
 Hypergeometric (4.6)
o Continuous Random Variable – random variable whose range is infinite and not
countable
 Uniform (5.2)
 Normal (5.3)
 Exponential (5.6)
o Mixed – random variable whose range combines discrete and continuous parts
 We will not discuss mixed random variables in this course
o Example – Classify each random variable as discrete or continuous
 Number of U.S. earthquakes in 2002 – Discrete
 Length of Northern Pike in Lake Ontario – Continuous
 Speed of automobiles on I-4 – Continuous
 Cost of tuition – Discrete

Probability Distributions for Discrete Random Variables (4.2)
o Requirements for the Probability Distribution of a Discrete Random Variable x
 $p(x) \ge 0$ for every value of x
 $\sum_x p(x) = 1$, where the sum runs over all possible values of x
o Example – Sum of two rolls of a six-sided die
 Define the distribution of x
 Number of total elements in S: 6*6 = 36
 # of ways to get 2: 1; (1+1)
 # of ways to get 3: 2; (1+2), (2+1)
 # of ways to get 4: 3; (1+3), (2+2), (3+1)
 # of ways to get 5: 4; (1+4), (2+3), (3+2), (4+1)
 # of ways to get 6: 5; (1+5), (2+4), (3+3), (4+2), (5+1)
 # of ways to get 7: 6; (1+6), (2+5), (3+4), (4+3), (5+2), (6+1)
 # of ways to get 8: 5; (2+6), (3+5), (4+4), (5+3), (6+2)
 # of ways to get 9: 4; (3+6), (4+5), (5+4), (6+3)
 # of ways to get 10: 3; (4+6), (5+5), (6+4)
 # of ways to get 11: 2; (5+6), (6+5)
 # of ways to get 12: 1; (6+6)
x      2     3     4     5     6     7     8     9     10    11    12
p(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
 Verify that the derived distribution is valid
 All $p(x) \ge 0$
 $\sum_x p(x) = \frac{1}{36}(1+2+3+4+5+6+5+4+3+2+1) = \frac{36}{36} = 1$
 Find $P(x=7)$, $P(x \le 4)$, $P(x>4)$, $P(x=2 \cap x=3)$, and $P(x=2 \cup x=3)$
 $P(x=7) = 6/36$
 $P(x \le 4) = P(x=2) + P(x=3) + P(x=4) = 1/36 + 2/36 + 3/36 = 6/36$
 $P(x>4) = 1 - P(x \le 4) = 1 - 6/36 = 30/36$
 $P(x=2 \cap x=3) = 0$, since a single sum cannot equal both 2 and 3
 $P(x=2 \cup x=3) = P(x=2) + P(x=3) - P(x=2 \cap x=3) = 3/36$
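The counting argument in this example can be checked by brute force. The following is a minimal sketch, assuming fair dice; the function name two_dice_distribution and the variable names are illustrative and not part of the course material.

```python
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for the sum of two fair six-sided dice."""
    counts = {}
    # Enumerate all 6*6 = 36 equally likely (first roll, second roll) pairs.
    for a, b in product(range(1, 7), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    total = sum(counts.values())
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


dist = two_dice_distribution()

# The two requirements: every p(x) >= 0 and the probabilities sum to 1.
is_valid = all(p >= 0 for p in dist.values()) and sum(dist.values()) == 1

p_seven = dist[7]
p_at_most_4 = sum(p for s, p in dist.items() if s <= 4)
p_more_than_4 = 1 - p_at_most_4
# The events x=2 and x=3 cannot both occur, so their probabilities simply add.
p_2_or_3 = dist[2] + dist[3]
```

Exact fractions are used so that the results can be compared directly with the hand calculations (Fraction reduces 6/36 to 1/6, which is the same value).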
Expected Values of Discrete Random Variables (4.3)
o Expected Value – mean or average of a random variable
 Use E(x) to denote expected value
 $\mu = E(x) = \sum_x x \, p(x)$
o Variance – expected squared distance from the mean
 $\sigma^2 = E[(x-\mu)^2] = \sum_x (x-\mu)^2 \, p(x)$
o Standard Deviation – “spread” of the distribution
 $\sigma = \sqrt{\sigma^2}$
o Example – Sum of two rolls of a six-sided die (continued)
x      2     3     4     5     6     7     8     9     10    11    12
p(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
 Calculate the expected value of x.
$E(x) = \mu = \sum_x x \, p(x) = \frac{1}{36}[(2 \cdot 1)+(3 \cdot 2)+(4 \cdot 3)+(5 \cdot 4)+(6 \cdot 5)+(7 \cdot 6)+(8 \cdot 5)+(9 \cdot 4)+(10 \cdot 3)+(11 \cdot 2)+(12 \cdot 1)] = \frac{252}{36} = 7$
 Calculate the variance of x.
$\sigma^2 = \sum_x (x-\mu)^2 \, p(x) = \frac{1}{36}[(-5)^2 \cdot 1+(-4)^2 \cdot 2+(-3)^2 \cdot 3+(-2)^2 \cdot 4+(-1)^2 \cdot 5+0^2 \cdot 6+1^2 \cdot 5+2^2 \cdot 4+3^2 \cdot 3+4^2 \cdot 2+5^2 \cdot 1] = \frac{210}{36} = 5.8333$
 Calculate the standard deviation of x. $\sigma = \sqrt{5.8333} = 2.415$
 Calculate the proportion of the distribution falling within one, two, and three
standard deviations of the mean. Using $\mu = 7$ and $\sigma = 2.415$, we have the
following table:
Number of S.D.’s   Lower Limit   Upper Limit   Prop. of x   Empirical Rule   Chebyshev’s Rule
1                  4.585         9.415         .667         .68              at least 0
2                  2.17          11.83         .944         .95              at least .75
3                  -.245         14.245        1.00         .997             at least .89

The Binomial Random Variable (4.4)
o Characteristics of a Binomial Random Variable
 n identical observations
 Two possible outcomes for each observation (“success” or “failure”)
 The probability of success, p, is the same for every observation
 Each observation is independent of the others
 The binomial random variable x represents the number of “successes” among the n observations
o The Binomial Probability Distribution
 $p(x) = \binom{n}{x} p^x q^{n-x}$ for x = 0, 1, …, n, where p is the probability of success, q is the
probability of failure, n is the total number of observations, and x is the
number of successes. Note that $q = 1 - p$.
o Mean, Variance, and Standard Deviation of the Binomial Distribution
 $\mu = np$
 $\sigma^2 = npq$
 $\sigma = \sqrt{npq}$
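The shortcut formulas for the binomial mean and variance can be compared against the general definitions of expected value and variance from Section 4.3. The sketch below does this for one illustrative choice of n and p (chosen here, not taken from the notes); the function name binomial_moments is hypothetical.

```python
from math import comb, isclose, sqrt


def binomial_moments(n, p):
    """Return (mean, variance) computed by summing over the whole distribution."""
    q = 1 - p
    pmf = [comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(pmf))
    variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
    return mean, variance


n, p = 5, 1 / 6  # illustrative values only
mean, variance = binomial_moments(n, p)

# The direct sums should match np and npq up to floating-point error.
shortcuts_agree = isclose(mean, n * p) and isclose(variance, n * p * (1 - p))
std_dev = sqrt(variance)
```

This is a numerical check for one case, not a proof; it only illustrates that the shortcuts and the definitions give the same numbers.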
o Example – Yahtzee!
 Suppose that we are seeking a four-of-a-kind with the number “3”. What
is the probability that, in rolling five dice, we get exactly four “3”s? Here, n=5,
since we are rolling five dice (our observations). Also, p=1/6,
since the probability of getting a “3” on any one die is 1/6, so q = 1-p = 5/6 (this is our
probability of “failure”). So $p(x=4) = \binom{5}{4} (1/6)^4 (5/6)^1 = .0032$.
Furthermore, since there are six possible die values, the probability of any
four-of-a-kind in Yahtzee! is 6(.0032) = .0192.
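The Yahtzee calculation can be reproduced with a small binomial function. This is a minimal sketch, assuming fair dice and a single roll of all five; binomial_pmf is an illustrative name, and math.comb requires Python 3.8 or later.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Exactly four "3"s when rolling five dice.
p_four_threes = binomial_pmf(4, 5, 1 / 6)

# Any face value can form the four-of-a-kind, and these six events are disjoint.
p_any_four_of_a_kind = 6 * p_four_threes
```

Carrying full precision, p_four_threes is 25/7776 (about .0032) and the six-fold total is 150/7776 (about .0193); the .0192 in the notes comes from multiplying the already rounded .0032 by 6.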
o Another Method for Calculating Binomial Probabilities
 Problem: Suppose we are doing a binomial experiment where ten people each
toss a quarter, and x represents the number of heads tossed. If we want the
probability that we get exactly five heads, this is an easy calculation:
$p(x=5) = \binom{10}{5} (.5)^5 (.5)^5 = .246$.
However, suppose we wanted to calculate the probability that we get at
most five heads. This calculation is much more involved using the
formula; in fact, we must use it six times (for x = 0, 1, …, 5) to get our answer!
 Solution #1: The binomial tables given in Table II of Appendix A give
cumulative probabilities for selected values of n and p. To solve the
problem presented here, we would look up n=10, which is found in sub-table f,
and then find the column with p = .5. Looking up k=5 in the far
left column, we see that the value in the table is .623. This is not p(x=5)!
This is $p(x \le 5)$. So, the answer to our question is .623. Note that the
tables cover only selected values of n and p.
 Solution #2: Use a scientific calculator. Many Texas Instruments models
of the type TI-8x are capable of this.
 Solution #3: Use the Normal Approximation to the Binomial (Section 5.5).
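As a further alternative to the three solutions listed, the cumulative probability can be computed by summing the binomial formula in a short program. The sketch below is one way to do so; binomial_cdf is an illustrative name and is not a function from the textbook or from any particular library.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(x <= k): add up the binomial formula for x = 0, 1, ..., k."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


# Ten tosses of a fair coin.
p_exactly_5 = comb(10, 5) * 0.5**5 * 0.5**5
p_at_most_5 = binomial_cdf(5, 10, 0.5)
```

Rounded to three decimals, these agree with the hand calculation (.246) and with the table lookup (.623). Unlike the printed tables, the function accepts any n and p.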
o Using Binomial Tables
 “At most k”, “no more than k”: $P(x \le k)$
 “Less than k”: $P(x < k) = P(x \le k-1)$
 “At least k”, “no less than k”: $P(x \ge k) = 1 - P(x < k) = 1 - P(x \le k-1)$
 “More than k”: $P(x > k) = 1 - P(x \le k)$
 “Equals k”, “is k”: $P(x = k) = P(x \le k) - P(x \le k-1)$
 “Is not k”, “does not equal k”: $P(x \ne k) = 1 - [P(x \le k) - P(x \le k-1)]$
 $P(l < x < k) = P(x \le k-1) - P(x \le l)$
 $P(l \le x \le k) = P(x \le k) - P(x \le l-1)$
o Example – 25 tosses of a quarter
 Calculate the mean and standard deviation of x, where x represents the
number of heads. $\mu = np = (25)(.5) = 12.5$, and
$\sigma = \sqrt{npq} = \sqrt{(25)(.5)(.5)} = \sqrt{6.25} = 2.5$.
 Find the probability that we observe at most 13 heads. $P(x \le 13) = .655$
 Find the probability that we observe fewer than 12 heads.
$P(x<12) = P(x \le 11) = .345$
 Find the probability that we observe more than 15 heads.
$P(x>15) = 1 - P(x \le 15) = 1 - .885 = .115$
 Find the probability that we observe at least 18 heads.
$P(x \ge 18) = 1 - P(x \le 17) = 1 - .978 = .022$
 Find the probability that we observe exactly 14 heads.
$P(x=14) = P(x \le 14) - P(x \le 13) = .788 - .655 = .133$
 Find $P(4<x<8)$. $P(4<x<8) = P(x \le 7) - P(x \le 4) = .022 - .000 = .022$
 Find $P(6<x \le 11)$. $P(6<x \le 11) = P(x \le 11) - P(x \le 6) = .345 - .007 = .338$
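Each table rule used in this example reduces to one or two cumulative probabilities, so the answers can be cross-checked in code. This is a minimal sketch under the example's assumptions (n = 25, p = .5); the function name cdf and the variable names are illustrative.

```python
from math import comb


def cdf(k, n=25, p=0.5):
    """P(x <= k) for a binomial random variable; zero when k is negative."""
    if k < 0:
        return 0.0
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


at_most_13 = cdf(13)                 # P(x <= 13)
fewer_than_12 = cdf(11)              # P(x < 12) = P(x <= 11)
more_than_15 = 1 - cdf(15)           # P(x > 15)
at_least_18 = 1 - cdf(17)            # P(x >= 18)
exactly_14 = cdf(14) - cdf(13)       # P(x = 14)
between_4_and_8 = cdf(7) - cdf(4)    # P(4 < x < 8)
from_7_to_11 = cdf(11) - cdf(6)      # P(6 < x <= 11)
```

The computed values should agree with the table-based answers to within about .001. Small differences are expected because the printed table entries are rounded to three decimals before being subtracted.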

The Poisson Random Variable (4.5)
o Probability Distribution, Mean, and Variance for a Poisson Random Variable
 $p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$, for x = 0, 1, 2, …
 $\mu = \lambda$
 $\sigma^2 = \lambda$ (so $\sigma = \sqrt{\lambda}$)
o Example – Game-ending injuries per game in the NFL
 Suppose that x represents the number of game-ending injuries that occur
each game, and x is distributed as a Poisson random variable with $\lambda = 2.2$.
Find the mean and standard deviation of x. $\mu = \lambda = 2.2$, and
$\sigma = \sqrt{\lambda} = \sqrt{2.2} = 1.48$
 Find the probability that exactly 2 game-ending injuries occur in the next
NFL game, using the mass function. $P(x=2) = \frac{2.2^2 e^{-2.2}}{2!} = .2681$
 Find the probability that exactly 4 game-ending injuries occur in the next
NFL game, using the mass function. $P(x=4) = \frac{2.2^4 e^{-2.2}}{4!} = .1082$
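The Poisson mass function is short enough to evaluate directly. The sketch below reproduces the numbers in this example; poisson_pmf is an illustrative name and lam stands for the rate parameter (lambda is a reserved word in Python).

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean number of events is lam."""
    return lam**x * exp(-lam) / factorial(x)


lam = 2.2            # mean number of game-ending injuries per game
mu = lam             # mean of a Poisson random variable
sigma = sqrt(lam)    # standard deviation

p_exactly_2 = poisson_pmf(2, lam)
p_exactly_4 = poisson_pmf(4, lam)
```

Rounded to the precision used in the notes, these give 1.48 for the standard deviation and .2681 and .1082 for the two probabilities.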
o Using Poisson Tables
 Cumulative Poisson probabilities are given in Table III of Appendix A, for
selected values of $\lambda$. Thus, these tables operate in much the same fashion
as the binomial tables.
o Example – Game-ending injuries per game in the NFL (continued)
 Find the probability that no more than 2 game-ending injuries will occur in
the next NFL game, using Table III. $P(x \le 2) = .623$
 Find the probability that at least 4 game-ending injuries will occur in the
next NFL game, using Table III. $P(x \ge 4) = 1 - P(x \le 3) = 1 - .819 = .181$
 Find the probability that exactly 2 game-ending injuries will occur in the
next NFL game, using Table III. $P(x=2) = P(x \le 2) - P(x \le 1) = .623 - .355 = .268$
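The Table III lookups can be cross-checked by summing the Poisson mass function. This is a minimal sketch, not a substitute for the tables used in the course; poisson_cdf is an illustrative name.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(x <= k): add up the Poisson mass function for x = 0, 1, ..., k."""
    return sum(lam**x * exp(-lam) / factorial(x) for x in range(k + 1))


lam = 2.2
no_more_than_2 = poisson_cdf(2, lam)                 # P(x <= 2)
at_least_4 = 1 - poisson_cdf(3, lam)                 # P(x >= 4)
exactly_2 = poisson_cdf(2, lam) - poisson_cdf(1, lam)  # P(x = 2)
```

The last line uses the same "equals k" rule as the binomial tables, and its result matches the value obtained earlier from the mass function.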

The Hypergeometric Random Variable (4.6)
o Probability Distribution, Mean, and Variance of Hypergeometric Random Var.
 $p(x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$, where N = total number of elements, n = total number
sampled, r = total number of “successes”, and x = “successes” drawn
 $\mu = \frac{nr}{N}$
 $\sigma^2 = \frac{r(N-r)\,n(N-n)}{N^2(N-1)}$
o Example – Blackjack
 What is the probability that, if your initial hand is dealt before anyone else’s,
both of your 2 cards are face cards? With any Hypergeometric
problem, we want to initially partition the entire set (a deck of cards) into
two groups: “successes” and “failures” (face cards and non-face cards).
In a standard deck of cards, we have N=52 total cards, of which we are
being dealt n=2. Of these 52 cards, there are r=12 face cards and N-r=40
non-face cards. We want to find the probability of getting x=2 face
cards and n-x=0 non-face cards. Substituting these values into our
formula, we have $P(\text{2 face cards}) = \frac{\binom{12}{2}\binom{40}{0}}{\binom{52}{2}} = \frac{66 \cdot 1}{1326} = .0498$.
 Let x represent the number of face cards drawn. Calculate the mean and
standard deviation of x. From above we have N=52, n=2, and r=12, so
$\mu = \frac{nr}{N} = \frac{2 \cdot 12}{52} = .4615$,
$\sigma^2 = \frac{r(N-r)\,n(N-n)}{N^2(N-1)} = \frac{12(52-12) \cdot 2(52-2)}{52^2(52-1)} = .3481$,
and $\sigma = \sqrt{.3481} = .5900$.
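The Blackjack numbers can be verified with a direct implementation of the hypergeometric formula. This is a minimal sketch using the notation of the notes (N, n, r, x); hypergeom_pmf is an illustrative name rather than a library function.

```python
from math import comb, sqrt


def hypergeom_pmf(x, N, n, r):
    """P(x successes) when n items are drawn without replacement from N items,
    r of which are successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


N, n, r = 52, 2, 12   # deck size, cards dealt, face cards in the deck

p_two_face_cards = hypergeom_pmf(2, N, n, r)

# Mean and variance from the shortcut formulas.
mu = n * r / N
var = r * (N - r) * n * (N - n) / (N**2 * (N - 1))
sigma = sqrt(var)

# Cross-check: the mean computed from the definition, summing x * p(x).
mu_from_definition = sum(x * hypergeom_pmf(x, N, n, r) for x in range(n + 1))
```

The cross-check at the end ties the shortcut for the mean back to the general definition of expected value from Section 4.3.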
o Example – “The Price Is Right” – modified
 Among 8 total tiles (3 strikes and 5 numbers), what is the probability that,
in drawing 6 tiles, you win (that is, all 5 numbered tiles are among those drawn)?
$P(5) = \frac{\binom{5}{5}\binom{3}{1}}{\binom{8}{6}} = \frac{3}{28} = .1071$.
 Repeat the procedure, but draw 7 tiles instead of 6. What is the
probability that you win now?
$P(5) = \frac{\binom{5}{5}\binom{3}{2}}{\binom{8}{7}} = \frac{3}{8} = .375$.
 In the experiment of drawing 7 tiles, what is the expected number of
numbered tiles drawn, where x represents the number of numbered tiles
drawn? Using N=8, n=7, and r=5, we have $\mu = \frac{nr}{N} = \frac{7 \cdot 5}{8} = 4.375$.
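The tile-drawing probabilities are small enough to compute exactly. The sketch below assumes, as in the example, that winning means drawing all 5 numbered tiles; the function and variable names are illustrative.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, n, r):
    """Exact P(x successes) when drawing n of N items, r of which are successes."""
    return Fraction(comb(r, x) * comb(N - r, n - x), comb(N, n))


N, r = 8, 5   # 8 tiles in total, 5 of them numbered (the other 3 are strikes)

p_win_drawing_6 = hypergeom_pmf(5, N, 6, r)
p_win_drawing_7 = hypergeom_pmf(5, N, 7, r)

# Expected number of numbered tiles among 7 drawn, from mu = n*r/N.
expected_numbers_in_7 = Fraction(7 * r, N)
```

Converting the fractions with float() gives approximately .1071, .375, and 4.375, matching the worked answers.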