Discrete Random Variables
For a discrete random variable X the probability
distribution is described by the probability function,
p(x), which has the following properties :
1. $0 \le p(x) \le 1$
2. $\sum_{x} p(x) = 1$
3. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$
Comment:
• For a discrete random variable the number of
possible values (i.e. x such that p(x) > 0) is
either finite or countably infinite (in a 1-1
correspondence with positive integers.)
Recall
p(x) = P[X = x] = the probability function of X.
This can be defined for any random variable X.
For a continuous random variable
p(x) = 0 for all values of x.
Let $S_X = \{x \mid p(x) > 0\}$. This set is countable (i.e. it is finite or can be put into a 1-1 correspondence with the positive integers):
$S_X = \{x \mid p(x) > 0\} = \{x_1, x_2, x_3, x_4, \ldots\}$
Thus let
$\sum_{x} p(x) = \sum_{i=1}^{\infty} p(x_i)$
Proof: (that the set $S_X = \{x \mid p(x) > 0\}$ is countable, i.e. can be put into a 1-1 correspondence with the integers)
$S_X = S_1 \cup S_2 \cup S_3 \cup S_4 \cup \ldots$
where
$S_i = \{x \mid \frac{1}{i+1} < p(x) \le \frac{1}{i}\}$
i.e.
$S_1 = \{x \mid \frac{1}{2} < p(x) \le 1\}$   Note: $n(S_1) < 2$
$S_2 = \{x \mid \frac{1}{3} < p(x) \le \frac{1}{2}\}$   Note: $n(S_2) < 3$
$S_3 = \{x \mid \frac{1}{4} < p(x) \le \frac{1}{3}\}$   Note: $n(S_3) < 4$
Each element of $S_i$ carries probability greater than $\frac{1}{i+1}$ and the total probability is at most 1.
Thus the number of elements of $S_i$ satisfies $n(S_i) < i + 1$ (is finite).
Thus the elements of $S_X = S_1 \cup S_2 \cup S_3 \cup S_4 \cup \ldots$
can be arranged {x1, x2, x3, x4, … }
by choosing the first elements to be the elements of S1 ,
the next elements to be the elements of S2 ,
the next elements to be the elements of S3 ,
the next elements to be the elements of S4 ,
etc.
This allows us to write
$\sum_{x} p(x)$
for
$\sum_{i=1}^{\infty} p(x_i)$
A Discrete Random Variable
A random variable X is called discrete if
$\sum_{x} p(x) = \sum_{i=1}^{\infty} p(x_i) = 1$
That is all the probability is accounted for by values, x,
such that p(x) > 0.
Discrete Random Variables
For a discrete random variable X the probability
distribution is described by the probability
function p(x), which has the following properties
1. $0 \le p(x) \le 1$
2. $\sum_{x} p(x) = \sum_{i=1}^{\infty} p(x_i) = 1$
3. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$
Graph: Discrete Random Variable
[Figure: bar chart of p(x) against x. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$ is the total height of the bars between a and b.]
Some Important Discrete Distributions
The Bernoulli distribution
Suppose that we have an experiment that has two outcomes:
1. Success (S)
2. Failure (F)
These terms are used in reliability testing.
Suppose that p is the probability of success (S) and
q = 1 – p is the probability of failure (F).
This experiment is sometimes called a Bernoulli trial.
Let
$X = \begin{cases} 0 & \text{if the outcome is F} \\ 1 & \text{if the outcome is S} \end{cases}$
Then the probability distribution with probability function
$p(x) = P[X = x] = \begin{cases} q & x = 0 \\ p & x = 1 \end{cases}$
is called the Bernoulli distribution.
[Figure: bar chart of the Bernoulli probability function, with a bar of height q = 1 – p at x = 0 and a bar of height p at x = 1.]
The Binomial distribution
Suppose that we have an experiment that has two
outcomes (a Bernoulli trial):
1. Success (S)
2. Failure (F)
Suppose that p is the probability of success (S) and
q = 1 – p is the probability of failure (F).
Now assume that the Bernoulli trial is repeated
independently n times.
Let
X = the number of successes occurring in the n trials.
Note: the possible values of X are {0, 1, 2, …, n}
For n = 5 the outcomes, together with the values of X and the
probability of each outcome, are given in the table below:
X   Probability of each outcome   Outcomes
0   $q^5$                         FFFFF
1   $p q^4$                       SFFFF FSFFF FFSFF FFFSF FFFFS
2   $p^2 q^3$                     SSFFF SFSFF SFFSF SFFFS FSSFF FSFSF FSFFS FFSSF FFSFS FFFSS
3   $p^3 q^2$                     SSSFF SSFSF SSFFS SFSSF SFSFS SFFSS FSSSF FSSFS FSFSS FFSSS
4   $p^4 q$                       SSSSF SSSFS SSFSS SFSSS FSSSS
5   $p^5$                         SSSSS
For n = 5 the following table gives the different
possible values of X, x, and p(x) = P[X = x]:
x                 0       1         2             3             4         5
p(x) = P[X = x]   $q^5$   $5pq^4$   $10p^2q^3$    $10p^3q^2$    $5p^4q$   $p^5$
For general n, the outcome of the sequence of n
Bernoulli trials is a sequence of S’s and F’s of length
n.
SSFSFFSFFF…FSSSFFSFSFFS
• The value of X for such a sequence is k = the number
of S’s in the sequence.
• The probability of such a sequence is $p^k q^{n-k}$ (a p for
each S and a q for each F).
• There are $\binom{n}{k}$ such sequences containing exactly
k S’s.
• $\binom{n}{k}$ is the number of ways of selecting the k
positions for the S’s (the remaining n – k positions
are for the F’s).
Thus
$p(k) = P[X = k] = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, 3, \ldots, n-1, n$
These are the terms in the expansion of $(p + q)^n$
using the Binomial Theorem:
$(p + q)^n = \binom{n}{0} p^0 q^n + \binom{n}{1} p^1 q^{n-1} + \binom{n}{2} p^2 q^{n-2} + \cdots + \binom{n}{n} p^n q^0$
For this reason the probability function
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
is called the probability function for the Binomial
distribution
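As a rough check on the formula, the binomial probability function can be sketched in a few lines of Python. The function name `binomial_pmf` and the values n = 5, p = 0.3 are illustrative choices, not part of the slides.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P[X = x] for the number of successes X in n independent Bernoulli trials."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)

# Illustrative values (assumed for this sketch)
n, p = 5, 0.3
coefficients = [comb(n, x) for x in range(n + 1)]    # number of sequences with x S's
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(coefficients)   # binomial coefficients for n = 5
print(sum(pmf))       # the terms of (p + q)^n add up to 1, up to rounding
```

The coefficients printed for n = 5 are the counts of outcomes with 0, 1, …, 5 successes, and the probabilities sum to 1 because they are the terms of the expansion of (p + q)^n with p + q = 1.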
Summary
We observe a Bernoulli trial (S,F) n times.
Let X denote the number of successes in the n trials.
Then X has a binomial distribution, i.e.
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
where
1. p = the probability of success (S), and
2. q = 1 – p = the probability of failure (F)
Example
A coin is tossed n = 7 times.
Let X denote the number of heads (H) in the n = 7
trials.
Then X has a binomial distribution, with p = ½ and
n = 7.
Thus
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x} = \binom{7}{x} \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{7-x} = \binom{7}{x} \left(\tfrac{1}{2}\right)^7, \quad x = 0, 1, 2, \ldots, 7$
x      0       1       2        3        4        5        6       7
p(x)   1/128   7/128   21/128   35/128   35/128   21/128   7/128   1/128
[Figure: bar chart of p(x) against x for x = 0, 1, …, 7.]
Example
If a surgeon performs “eye surgery” the chance of “success”
is 85%. Suppose that the surgery is performed n = 20 times.
Let X denote the number of successful surgeries in the n =
20 trials.
Then X has a binomial distribution, with p = 0.85 and n =
20.
Thus
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x} = \binom{20}{x} (0.85)^x (0.15)^{20-x}, \quad x = 0, 1, 2, \ldots, 20$
x    p(x)      x    p(x)      x    p(x)      x    p(x)
0    0.0000    6    0.0000    12   0.0046    18   0.2293
1    0.0000    7    0.0000    13   0.0160    19   0.1368
2    0.0000    8    0.0000    14   0.0454    20   0.0388
3    0.0000    9    0.0000    15   0.1028
4    0.0000    10   0.0002    16   0.1821
5    0.0000    11   0.0011    17   0.2428
[Figure: bar chart of p(x) against x for x = 0, 1, …, 20.]
The probability that at least sixteen operations are
successful
= P[X ≥ 16]
= p(16) + p(17) + p(18) + p(19) + p(20)
= 0.1821 + 0.2428 + 0.2293 + 0.1368 + 0.0388
= 0.8298
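The tail probability can be recomputed directly from the binomial probability function. The sketch below assumes only the values n = 20 and p = 0.85 from the example; the helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P[X = x] for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.85
# P[X >= 16] = p(16) + p(17) + p(18) + p(19) + p(20)
p_at_least_16 = sum(binomial_pmf(x, n, p) for x in range(16, n + 1))
print(round(p_at_least_16, 4))
```

Summing unrounded terms can differ from the hand calculation in the last decimal place, because the table entries are rounded to four decimals before being added.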
Other discrete distributions
• Poisson distribution
• Geometric distribution
• Negative Binomial distribution
• Hypergeometric distribution
The Poisson distribution
• Suppose events are occurring randomly and
uniformly in time.
• Let X be the number of events occurring in a
fixed period of time. Then X will have a
Poisson distribution with parameter λ:
$p(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, 3, 4, \ldots$
Some properties of the probability function for
the Poisson distribution with parameter λ.
1. $\sum_{x=0}^{\infty} p(x) = \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} e^{-\lambda} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!}$
$= e^{-\lambda} \left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \frac{\lambda^4}{4!} + \cdots\right) = e^{-\lambda} e^{\lambda} = 1$
using $e^u = 1 + u + \frac{u^2}{2!} + \frac{u^3}{3!} + \frac{u^4}{4!} + \cdots$
2. If $p_{Bin}(x \mid p, n) = \binom{n}{x} p^x (1-p)^{n-x}$
is the probability function for the Binomial
distribution with parameters n and p, and we
allow n → ∞ and p → 0 such that np = a
constant (= λ say) then
$\lim_{n \to \infty,\ p \to 0} p_{Bin}(x \mid p, n) = p_{Poisson}(x \mid \lambda) = \frac{\lambda^x}{x!} e^{-\lambda}$
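Proof: $p_{Bin}(x \mid p, n) = \binom{n}{x} p^x (1-p)^{n-x}$
Suppose $np = \lambda$ or $p = \frac{\lambda}{n}$. Then
$p_{Bin}(x \mid p, n) = p_{Bin}(x \mid \lambda, n) = \frac{n!}{x!(n-x)!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}$
$= \frac{\lambda^x}{x!} \cdot \frac{n!}{n^x (n-x)!} \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}$
$= \frac{\lambda^x}{x!} \cdot \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{n-x+1}{n} \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}$
$= \frac{\lambda^x}{x!} (1)\left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}$
Now
$\lim_{n \to \infty} p_{Bin}(x \mid \lambda, n) = \frac{\lambda^x}{x!} \lim_{n \to \infty} \left[ (1)\left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n} \right]$
$= \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n}$
since, for fixed x, each of the other factors tends to 1.
Now using the classic limit
$\lim_{n \to \infty} \left(1 + \frac{u}{n}\right)^n = e^u$
$\lim_{n \to \infty} p_{Bin}(x \mid \lambda, n) = \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = \frac{\lambda^x}{x!} e^{-\lambda} = p_{Poisson}(x \mid \lambda)$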
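The limit can also be observed numerically. The following sketch holds np = λ fixed and lets n grow; the values λ = 2 and x = 3 are arbitrary choices for illustration, and the function names are not from the slides.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P[X = x] for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """P[X = x] for X ~ Poisson(lam)."""
    return lam**x / factorial(x) * exp(-lam)

lam, x = 2.0, 3           # assumed values for the illustration
errors = []
for n in (10, 100, 1000, 10000):
    p = lam / n            # keep n * p = lam while n grows
    err = abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
    errors.append(err)
    print(n, err)
```

The absolute difference shrinks roughly in proportion to 1/n, which is consistent with the factors in the proof that tend to 1 as n → ∞.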
Graphical Illustration
Suppose a time interval is divided into n equal parts and
that one event may or may not occur in each subinterval.
[Figure: a time interval divided into n subintervals, each marked either "event occurs" or "event does not occur".]
X = # of events is Bin(n,p)
As n→∞, events can occur over the continuous time
interval.
X = # of events is Poisson(λ)
Example
The number of hurricanes over a period of a
year in the Caribbean is known to have a Poisson
distribution with λ = 13.1.
Determine the probability function of X.
Compute the probability that X is at most 8.
Compute the probability that X is at least 10.
Given that at least 10 hurricanes occur, what is
the probability that X is at most 15?
Solution
• X will have a Poisson distribution with
parameter λ = 13.1, i.e.
$p(x) = \frac{\lambda^x}{x!} e^{-\lambda} = \frac{13.1^x}{x!} e^{-13.1}, \quad x = 0, 1, 2, 3, 4, \ldots$
Table of p(x)
x    p(x)        x    p(x)
0    0.000002    10   0.083887
1    0.000027    11   0.099901
2    0.000175    12   0.109059
3    0.000766    13   0.109898
4    0.002510    14   0.102833
5    0.006575    15   0.089807
6    0.014356    16   0.073530
7    0.026866    17   0.056661
8    0.043994    18   0.041237
9    0.064036    19   0.028432
$P[\text{at most } 8] = P[X \le 8] = p(0) + p(1) + \cdots + p(8) = 0.09527$
$P[\text{at least } 10] = P[X \ge 10] = 1 - P[X \le 9] = 1 - (p(0) + p(1) + \cdots + p(9)) = 0.8407$
$P[\text{at most } 15 \mid \text{at least } 10] = P[X \le 15 \mid X \ge 10] = \frac{P[\{X \le 15\} \cap \{X \ge 10\}]}{P[X \ge 10]} = \frac{P[10 \le X \le 15]}{P[X \ge 10]}$
$= \frac{p(10) + p(11) + \cdots + p(15)}{0.8407} = 0.708$
The Geometric distribution
Suppose a Bernoulli trial (S,F) is repeated until a
success occurs.
Let X = the trial on which the first success (S) occurs.
Find the probability distribution of X.
Note: the possible values of X are {1, 2, 3, 4, 5, … }
The sample space for the experiment (repeating a
Bernoulli trial until a success occurs) is:
S = {S, FS, FFS, FFFS, FFFFS, … , FFF…FFFS, …}
An outcome with X = x consists of (x – 1) F’s followed by an S, so
$p(x) = P[X = x] = P[\{FFF \ldots FFFS\}] = (1 - p)^{x-1} p$
Thus the probability function of X is:
$P[X = x] = p(x) = p(1 - p)^{x-1} = p q^{x-1}, \quad x = 1, 2, 3, \ldots$
A random variable X that has this distribution is said to
have the Geometric distribution.
Reason: $p(1) = p,\ p(2) = pq,\ p(3) = pq^2,\ p(4) = pq^3, \ldots$
forms a geometric series
$\sum_{x=1}^{\infty} p(x) = p(1) + p(2) + p(3) + \cdots = p + pq + pq^2 + pq^3 + \cdots = \frac{p}{1 - q} = \frac{p}{p} = 1$
The Negative Binomial distribution
Suppose a Bernoulli trial (S,F) is repeated until k
successes occur.
Let X = the trial on which the kth success (S) occurs.
Find the probability distribution of X.
Note: the possible values of X are
{k, k + 1, k + 2, k + 3, … }
The sample space for the experiment (repeating a
Bernoulli trial until k successes occur) consists of
sequences of S’s and F’s having the following properties:
1. Each sequence will contain k S’s.
2. The last outcome in the sequence will be an S.
A sequence of length x containing exactly k S’s:
SFSFSFFFFS FFFSF … FFFFFFS
The # of S’s in the first x – 1 trials is k – 1. The last outcome is an S.
$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \ldots$
where
• $\binom{x-1}{k-1}$ is the # of ways of choosing, from the first x – 1 trials, the positions for the first k – 1 S’s.
• $p^k q^{x-k}$ is the probability of a sequence containing k S’s and x – k F’s.
The Hypergeometric distribution
Suppose we have a population containing N objects.
Suppose the elements of the population are partitioned
into two groups. Let a = the number of elements in group
A and let b = the number of elements in the other group
(group B). Note N = a + b.
Now suppose that n elements are selected from the
population at random. Let X denote the number of elements
in the sample from group A. (n – X will be the number of
elements from group B.)
Find the probability distribution of X.
[Figure: the population split into Group A (a elements) and Group B (b elements); the sample of n elements contains x elements from Group A and n – x elements from Group B.]
Thus the probability function of X is:
$p(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}}$
where
• $\binom{a}{x}$ is the number of ways x elements can be chosen from Group A.
• $\binom{b}{n-x}$ is the number of ways n – x elements can be chosen from Group B.
• $\binom{N}{n}$ is the total number of ways n elements can be chosen from N = a + b elements.
A random variable X that has this distribution is said to have the
Hypergeometric distribution.
The possible values of X are integer values that range from
max(0,n – b) to min(n,a)
Discrete distributions
The Bernoulli distribution
$p(x) = P[X = x] = \begin{cases} q = 1 - p & x = 0 \\ p & x = 1 \end{cases}$
X = 1 if the Bernoulli trial = S, X = 0 if the Bernoulli trial = F
[Figure: bar chart of the Bernoulli probability function at x = 0 and x = 1.]
The Binomial distribution
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
X = the number of successes in n repetitions of
a Bernoulli trial
p = the probability of success
[Figure: bar chart of a binomial probability function p(x) for x = 0, 1, …, 20.]
The Poisson distribution
$p(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, 3, 4, \ldots$
Events are occurring randomly and uniformly in time.
X = the number of events occurring in a fixed period of
time.
[Figure: bar chart of a Poisson probability function for x from 0 to 40.]
The Geometric Distribution and the Negative Binomial Distribution
The Binomial distribution, the Geometric distribution and the
Negative Binomial distribution each arise when independently
repeating Bernoulli trials.
The Binomial distribution
The Bernoulli trials are repeated independently a fixed number of
times, n, and X = the number of successes.
The Negative Binomial distribution
The Bernoulli trials are repeated independently until a fixed
number, k, of successes has occurred and X = the trial on which
the kth success occurred.
The Geometric distribution
The Bernoulli trials are repeated independently until the first success
occurs (k = 1) and X = the trial on which the 1st success
occurred.
Example
Suppose a die is rolled until a six occurs
Success = S = {six} , p = 1/6.
Failure = F = {no six} q = 1 – p = 5/6.
1. What is the probability that it took at most 5 rolls of a die to
roll a six?
2. What is the probability that it took at least 10 rolls of a die to
roll a six?
3. What is the probability that the “first six” occurred on an
even number toss?
4. What is the probability that the “first six” occurred on a toss
divisible by 3 given that the “first six” occurred on an even
number toss?
Solution
Let X denote the toss on which the first six occurs.
Then X has a geometric distribution with p = 1/6, q = 1 – p = 5/6.
$P[X = x] = p(x) = p q^{x-1} = \frac{1}{6} \left(\frac{5}{6}\right)^{x-1}, \quad x = 1, 2, 3, \ldots$
The four probabilities to compute are:
1. P[X ≤ 5]
2. P[X ≥ 10]
3. P[X is divisible by 2]
4. P[X is divisible by 3 | X is divisible by 2]
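1. P[X ≤ 5]?
$P[X \le 5] = p(1) + p(2) + p(3) + p(4) + p(5)$
$= \frac{1}{6}\left(\frac{5}{6}\right)^0 + \frac{1}{6}\left(\frac{5}{6}\right)^1 + \frac{1}{6}\left(\frac{5}{6}\right)^2 + \frac{1}{6}\left(\frac{5}{6}\right)^3 + \frac{1}{6}\left(\frac{5}{6}\right)^4$
$= \frac{1}{6}\left[1 + \frac{5}{6} + \left(\frac{5}{6}\right)^2 + \left(\frac{5}{6}\right)^3 + \left(\frac{5}{6}\right)^4\right]$
$= \frac{1}{6} \cdot \frac{1 - \left(\frac{5}{6}\right)^5}{1 - \frac{5}{6}} = 1 - \left(\frac{5}{6}\right)^5 \approx 0.5981$
Using
$a + ar + ar^2 + \cdots + ar^{n-1} = a \frac{1 - r^n}{1 - r}$
Note also
$a + ar + ar^2 + \cdots = \frac{a}{1 - r}$ for |r| < 1.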
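2. P[X ≥ 10]?
$P[X \ge 10] = p(10) + p(11) + p(12) + \cdots$
$= \frac{1}{6}\left(\frac{5}{6}\right)^9 + \frac{1}{6}\left(\frac{5}{6}\right)^{10} + \frac{1}{6}\left(\frac{5}{6}\right)^{11} + \cdots$
$= \frac{1}{6}\left(\frac{5}{6}\right)^9 \left[1 + \frac{5}{6} + \left(\frac{5}{6}\right)^2 + \cdots\right]$
$= \frac{1}{6}\left(\frac{5}{6}\right)^9 \cdot \frac{1}{1 - \frac{5}{6}} = \left(\frac{5}{6}\right)^9 \approx 0.1938$
Using
$a + ar + ar^2 + \cdots = \frac{a}{1 - r}$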
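3. P[X is divisible by 2]?
$P[X \text{ is divisible by } 2] = p(2) + p(4) + p(6) + \cdots$
$= \frac{1}{6}\left(\frac{5}{6}\right)^1 + \frac{1}{6}\left(\frac{5}{6}\right)^3 + \frac{1}{6}\left(\frac{5}{6}\right)^5 + \cdots$
$= \frac{1}{6} \cdot \frac{5}{6} \left[1 + \left(\frac{5}{6}\right)^2 + \left(\frac{5}{6}\right)^4 + \cdots\right]$
$= \frac{5}{36} \cdot \frac{1}{1 - \left(\frac{5}{6}\right)^2} = \frac{5}{36} \cdot \frac{1}{1 - \frac{25}{36}} = \frac{5}{36} \cdot \frac{36}{11} = \frac{5}{11}$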
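4. P[X is divisible by 3 | X is divisible by 2]?
$P[X \text{ is divisible by } 3 \mid X \text{ is divisible by } 2] = \frac{P[\{X \text{ is divisible by } 3\} \cap \{X \text{ is divisible by } 2\}]}{P[X \text{ is divisible by } 2]} = \frac{P[X \text{ is divisible by } 6]}{P[X \text{ is divisible by } 2]}$
$P[X \text{ is divisible by } 6] = p(6) + p(12) + p(18) + \cdots$
$= \frac{1}{6}\left(\frac{5}{6}\right)^5 + \frac{1}{6}\left(\frac{5}{6}\right)^{11} + \frac{1}{6}\left(\frac{5}{6}\right)^{17} + \cdots$
$= \frac{1}{6}\left(\frac{5}{6}\right)^5 \left[1 + \left(\frac{5}{6}\right)^6 + \left(\frac{5}{6}\right)^{12} + \cdots\right]$
$= \frac{1}{6}\left(\frac{5}{6}\right)^5 \cdot \frac{1}{1 - \left(\frac{5}{6}\right)^6} = \frac{5^5}{6^6 - 5^6} = \frac{3125}{46656 - 15625} = \frac{3125}{31031}$
Hence, using P[X is divisible by 2] = 5/11,
$P[X \text{ is divisible by } 3 \mid X \text{ is divisible by } 2] = \frac{P[X \text{ is divisible by } 6]}{P[X \text{ is divisible by } 2]} = \frac{3125/31031}{5/11} = \frac{6875}{31031} = \frac{625}{2821} \approx 0.2216$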
The Negative Binomial distribution
Example
Suppose the chance of winning any prize in a lottery is
3%. Suppose that I play the lottery until I have won k =
5 times.
Let X denote the number of times that I play the lottery.
Find the probability function, p(x), of X.
X has a negative binomial distribution with k = 5, p = 0.03 and q = 1 – p = 0.97:
$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \ldots$
$= \binom{x-1}{4} (0.03)^5 (0.97)^{x-5}, \quad x = 5, 6, 7, \ldots$
Graph of p(x)
[Figure: plot of p(x) against x for x from 0 to 600; the vertical axis runs from 0 to 0.007.]
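The shape of this graph can be reproduced numerically. The sketch below assumes q = 1 – p = 0.97 (the complement of the 3% chance of winning) and truncates the infinite range of x at 3000, an arbitrary cut-off beyond which the probabilities are negligible; the names are illustrative.

```python
from math import comb

def negative_binomial_pmf(x: int, k: int, p: float) -> float:
    """P[X = x], where X is the trial on which the k-th success occurs."""
    if x < k:
        return 0.0
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

k, p = 5, 0.03
xs = range(k, 3001)   # truncated support (assumption: the tail beyond 3000 is negligible)
pmf = [negative_binomial_pmf(x, k, p) for x in xs]

total = sum(pmf)                                                  # should be close to 1
mode = max(xs, key=lambda x: negative_binomial_pmf(x, k, p))      # most likely number of plays
print(total, mode)
```

The ratio p(x)/p(x – 1) = q(x – 1)/(x – k) exceeds 1 only while x < (k – q)/p, which is why the probability function rises to a single peak and then decays.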
The Hypergeometric distribution
Example: Estimating the size of a wildlife population
• Suppose that N (unknown) is the size of a wildlife
population.
• To estimate N, T animals are caught, tagged and
released back into the population. (T is known)
• A second sample of n animals is caught and the
number, t, of tagged animals is noted. (n is known
and t is the observation that will be used to estimate
N).
Note
• The observation, t, will have a hypergeometric
distribution
$f(t; N) = \frac{\binom{T}{t} \binom{N-T}{n-t}}{\binom{N}{n}} = L(N)$
To estimate N we find the value $\hat{N}$ that maximizes
$L(N) = \frac{\binom{T}{t} \binom{N-T}{n-t}}{\binom{N}{n}}$
To determine when L(N) is maximized, compute the ratio
$\frac{L(N+1)}{L(N)}$
and determine when it is greater than 1 and when it is less than 1.
Now
$\frac{L(N+1)}{L(N)} = \frac{\binom{T}{t} \binom{N+1-T}{n-t} \Big/ \binom{N+1}{n}}{\binom{T}{t} \binom{N-T}{n-t} \Big/ \binom{N}{n}} = \frac{\binom{N+1-T}{n-t} \binom{N}{n}}{\binom{N-T}{n-t} \binom{N+1}{n}}$
$= \frac{(N+1-T)!\,(N-T-n+t)!\,N!\,(N+1-n)!}{(N-T)!\,(N+1-T-n+t)!\,(N+1)!\,(N-n)!}$
$= \frac{(N+1-T)(N+1-n)}{(N+1-T-n+t)(N+1)}$
Now
$\frac{L(N+1)}{L(N)} \ge 1$
if
$\frac{(N+1-T)(N+1-n)}{(N+1-T-n+t)(N+1)} \ge 1$
or
$(N+1-T)(N+1-n) \ge (N+1-T-n+t)(N+1)$
or
$(N+1)^2 - (n+T)(N+1) + nT \ge (N+1)^2 + (t-n-T)(N+1)$
or
$nT \ge t(N+1)$
and
$N \le \frac{nT}{t} - 1$
Hence
$\frac{L(N+1)}{L(N)} > 1$ if $N < \frac{nT}{t} - 1$
and
$\frac{L(N+1)}{L(N)} < 1$ if $N > \frac{nT}{t} - 1$
also
$\frac{L(N+1)}{L(N)} = 1$ if $N = \frac{nT}{t} - 1$
Let [x] = greatest integer less than or equal to x. Then
$\frac{nT}{t} - 1 < \left[\frac{nT}{t}\right] \le \frac{nT}{t}$
Thus
$\hat{N} = \left[\frac{nT}{t}\right]$ = greatest integer less than or equal to $\frac{nT}{t}$
If $\frac{nT}{t}$ is an integer then $\hat{N} = \frac{nT}{t} - 1$ or $\hat{N} = \frac{nT}{t}$ (both values maximize L(N)).
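The estimate can be checked by brute force: evaluate L(N) over a range of N and compare the maximizer with the greatest integer less than or equal to nT/t. The sketch below uses exact fractions; the sample values T = 100, n = 50, t = 7 and the search limit of 2000 are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def likelihood(N: int, T: int, n: int, t: int) -> Fraction:
    """L(N): hypergeometric probability of seeing t tagged animals in a sample of n."""
    return Fraction(comb(T, t) * comb(N - T, n - t), comb(N, n))

def mle_population_size(T: int, n: int, t: int, search_limit: int = 2000) -> int:
    """Brute-force maximizer of L(N); returns the smallest maximizer if there is a tie."""
    smallest_N = T + (n - t)   # the population holds at least T tagged and n - t untagged animals
    return max(range(smallest_N, search_limit + 1),
               key=lambda N: likelihood(N, T, n, t))

# Hypothetical data: 100 tagged, second sample of 50, of which 7 are tagged
T, n, t = 100, 50, 7
N_hat = mle_population_size(T, n, t)
print(N_hat, (n * T) // t)

# When nT/t is an integer (here 6 * 10 / 3 = 20), L(N) takes the same value at nT/t - 1 and nT/t
tie = likelihood(19, 10, 6, 3) == likelihood(20, 10, 6, 3)
print(tie)
```

The brute-force search is only a check on the derivation; in practice the closed form [nT/t] is used directly.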
Example: Hyper-geometric distribution
Suppose that N = 10 automobiles have just come off the
production line. Also assume that a = 3 are defective
(have serious defects). Thus b = 7 are defect-free.
A sample of n = 4 are selected and tested to see if they
are defective. Let X = the number in the sample that are
defective. Find the probability function of X.
From the above discussion X will have a hypergeometric distribution, i.e.
$p(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}} = \frac{\binom{3}{x} \binom{7}{4-x}}{\binom{10}{4}}, \quad x = 0, 1, 2, 3$
Table and Graph of p(x)
x    p(x)
0    0.1667
1    0.5000
2    0.3000
3    0.0333
[Figure: bar chart of p(x) against x for x = 0, 1, 2, 3.]
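The table can be reproduced from the hypergeometric probability function. This is a small sketch using the values a = 3, b = 7, n = 4 from the example; the function name and the dictionary layout are illustrative.

```python
from math import comb

def hypergeometric_pmf(x: int, a: int, b: int, n: int) -> float:
    """P[X = x]: x of the n sampled items come from group A (a items); the rest from group B (b items)."""
    N = a + b
    return comb(a, x) * comb(b, n - x) / comb(N, n)

a, b, n = 3, 7, 4                                   # 3 defective, 7 defect-free, sample of 4
support = range(max(0, n - b), min(n, a) + 1)       # possible values of X
table = {x: round(hypergeometric_pmf(x, a, b, n), 4) for x in support}
print(table)
```

The support is computed as max(0, n – b) to min(n, a), which for these values gives x = 0, 1, 2, 3.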
Sampling with and without replacement
Suppose we have a population containing N objects.
Suppose the elements of the population are partitioned
into two groups. Let a = the number of elements in group
A and let b = the number of elements in the other group
(group B). Note N = a + b.
Now suppose that n elements are selected from the
population at random. Let X denote the number of elements
in the sample from group A. (n – X will be the number of
elements from group B.)
Find the probability distribution of X.
1. If the sampling was done with replacement.
2. If the sampling was done without replacement
Solution:
1. If the sampling was done with replacement.
Then the distribution of X is the Binomial distribution
with
$p = \frac{a}{N}$ and $q = 1 - p = \frac{b}{N}$
i.e. $p_{Binom}(x) = P[X = x] = \binom{n}{x} \left(\frac{a}{N}\right)^x \left(\frac{b}{N}\right)^{n-x}$
2. If the sampling was done without replacement.
Then the distribution of X is the hypergeometric distribution
i.e. $p_{Hyper}(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}}$
Note:
$\frac{p_{Hyper}(x)}{p_{Binom}(x)} = \frac{\binom{a}{x} \binom{b}{n-x} \Big/ \binom{N}{n}}{\binom{n}{x} \left(\frac{a}{N}\right)^x \left(\frac{b}{N}\right)^{n-x}}$
$= \frac{a!}{x!\,(a-x)!} \cdot \frac{b!}{(n-x)!\,(b-n+x)!} \cdot \frac{(N-n)!\,n!}{N!} \cdot \frac{x!\,(n-x)!}{n!} \cdot \frac{N^n}{a^x b^{n-x}}$
$= \frac{a(a-1)\cdots(a-x+1)}{a^x} \cdot \frac{b(b-1)\cdots(b-n+x+1)}{b^{n-x}} \cdot \frac{N^n}{N(N-1)\cdots(N-n+1)}$
$= \frac{(1)\left(1 - \frac{1}{a}\right) \cdots \left(1 - \frac{x-1}{a}\right) \cdot (1)\left(1 - \frac{1}{b}\right) \cdots \left(1 - \frac{n-x-1}{b}\right)}{(1)\left(1 - \frac{1}{N}\right) \cdots \left(1 - \frac{n-1}{N}\right)} \to 1$ as $N, a, b \to \infty$
Thus
$p_{Hyper}(x) = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}} \approx p_{Binom}(x) = \binom{n}{x} \left(\frac{a}{N}\right)^x \left(\frac{b}{N}\right)^{n-x}$
for large values of N, a and b.
Thus for large values of N, a and b (with n fixed) sampling with
replacement is approximately equivalent to sampling without
replacement.
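A numerical comparison makes the approximation concrete. The sketch below compares the two probability functions for a small population and a large one; the population sizes and group proportions are assumed values chosen for illustration, and the helper names are not from the slides.

```python
from math import comb

def hypergeometric_pmf(x: int, a: int, b: int, n: int) -> float:
    """Sampling without replacement: P[X = x]."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Sampling with replacement: P[X = x] with p = a / N."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_difference(a: int, b: int, n: int) -> float:
    """Largest absolute gap between the two probability functions over the support."""
    p = a / (a + b)
    support = range(max(0, n - b), min(n, a) + 1)
    return max(abs(hypergeometric_pmf(x, a, b, n) - binomial_pmf(x, n, p))
               for x in support)

small = max_difference(a=3, b=7, n=4)                  # N = 10: replacement matters
large = max_difference(a=300_000, b=700_000, n=10)     # N = 1,000,000: it barely matters
print(small, large)
```

With N = 10 the two distributions differ noticeably, while with N = 1,000,000 and the same proportion a/N the gap is far smaller, in line with the ratio above tending to 1.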
Summary
Discrete distributions
The Bernoulli distribution
$p(x) = P[X = x] = \begin{cases} q = 1 - p & x = 0 \\ p & x = 1 \end{cases}$
X = 1 if the Bernoulli trial = S, X = 0 if the Bernoulli trial = F
[Figure: bar chart of the Bernoulli probability function at x = 0 and x = 1.]
The Binomial distribution
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
X = the number of successes in n repetitions of
a Bernoulli trial
p = the probability of success
[Figure: bar chart of a binomial probability function p(x) for x = 0, 1, …, 20.]
The Poisson distribution
$p(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, 3, 4, \ldots$
Events are occurring randomly and uniformly in time.
X = the number of events occurring in a fixed period of
time.
[Figure: bar chart of a Poisson probability function for x from 0 to 40.]
The Geometric distribution
The Bernoulli trials are repeated independently until the first success
occurs (k = 1) and X = the trial on which the 1st success
occurred.
$P[X = x] = p(x) = p(1 - p)^{x-1} = p q^{x-1}, \quad x = 1, 2, 3, \ldots$
The Negative Binomial distribution
The Bernoulli trials are repeated independently until a fixed
number, k, of successes has occurred and X = the trial on which
the kth success occurred.
$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \ldots$
Geometric ≡ Negative Binomial with k = 1
The Hypergeometric distribution
Suppose we have a population containing N objects.
The population is partitioned into two groups.
• a = the number of elements in group A
• b = the number of elements in the other group (group B).
Note N = a + b.
• n elements are selected from the population at random.
• X = the number of elements in the sample from group A. (n – X will be the number of
elements from group B.)
$p(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}}$