Download geometric distribution

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Randomness wikipedia , lookup

Inductive probability wikipedia , lookup

Birthday problem wikipedia , lookup

Probability interpretations wikipedia , lookup

Conditioning (probability) wikipedia , lookup

Law of large numbers wikipedia , lookup

Ars Conjectandi wikipedia , lookup

Transcript
OpenStax-CNX module: m13124
1
GEOMETRIC DISTRIBUTION
∗
Ewa Paszek
This work is produced by OpenStax-CNX and licensed under the
†
Creative Commons Attribution License 2.0
Abstract
This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the
Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel. The development of
this course has been supported by NSF 0203396 grant.
1 GEOMETRIC DISTRIBUTION
To obtain a binomial random variable, we observed a sequence of n Bernoulli trials and counted the number
of successes. Suppose now that we do not x the number of Bernoulli trials in advance but instead continue
to observe the sequence of Bernoulli trials until a certain number r, of successes occurs. The random
variable of interest is the number of trials needed to observe the rth success.
1.1
Let rst discuss the problem when r =1. That is, consider a sequence of Bernoulli trials with probability p
of success. This sequence is observed until the rst success occurs. Let X denot the trial number on which
the rst success occurs.
For example, if F and S represent failure and success, respectively, and the sequence starts with
F,F,F,S,. . ., then X =4. Moreover, because the trials are independent, the probability of such sequence
is
3
P (X = 4) = (q) (q) (q) (p) = q 3 p = (1 − p) p.
In general, the p.d.f. f (x) = P (X = x) , of X is given by f (x) = (1 − p)x−1 p,x = 1, 2, ..., because
there must be x -1 failures before the rst success that occurs on trail x. We say that X has a geometric
distribution.
note:
for a geometric series, the sum is given by
∞
X
k=0
when |r| < 1.
∗ Version
1.3: Oct 8, 2007 3:25 pm -0500
† http://creativecommons.org/licenses/by/2.0/
http://cnx.org/content/m13124/1.3/
ark =
∞
X
k=1
ark−1 =
a
,
1−r
OpenStax-CNX module: m13124
2
Thus,
∞
X
f (x) =
x=1
∞
X
k−1
(1 − p)
p=
x=1
p
= 1,
1 − (1 − p)
so that f (x) does satisfy the properties of a p.d.f..
From the sum of geometric series we also note that, when k is an integer,
∞
X
P (X > k) =
k
x−1
(1 − p)
p=
x=k+1
(1 − p) p
k
= (1 − p) = q k ,
1 − (1 − p)
and thus the value of the distribution function at a positive integer
P (X ≤ k) =
∞
X
x−1
(1 − p)
k
is
k
p = 1 − P (X > k) = 1 − (1 − p) = 1 − q k .
x=k+1
Example 1
Some biology students were checking the eye color for a large number of fruit ies. For the
individual y, suppose that the probability of white eyes is 14 and the probability of red eyes is 34
, and that we may treat these ies as independent Bernoulli trials. The probability that at least
four ies have to be checked for eye color to observe a white-eyed y is given by
P (X ≥ 4) = P (X > 3) = q 3 =
3
3
= 0.422.
4
The probability that at most four ies have to be checked for eye color to observe a white-eyed
y is given by
P (X ≤ 4) = 1 − q 4 = 1 −
4
3
= 0.684.
4
The probability that the rst y with white eyes is the fourth y that is checked is
P (X = 4) = q
4−1
3 3
1
p=
= 0.105.
4
4
It is also true that
"
3 # 3 4 # "
3
3
1
3
− 1−
=
.
P (X = 4) = P (X ≤ 4) − P (X ≤ 3) = 1 −
4
4
4
4
In general,
f (x) = P (X = x) =
http://cnx.org/content/m13124/1.3/
x−1 3
1
, x = 1, 2, 3, ...
4
4
OpenStax-CNX module: m13124
3
1.2
To nd a mean and variance for the geometric distribution, let use the following results about the sum and
the rst and second derivatives of a geometric series. For −1 < r < 1 , let
g (r) =
∞
X
a
.
1−r
ark =
k=0
Then
g ' (r) =
∞
X
akrk−1 =
k=1
and
g '' (r) =
∞
X
a
(1 − r)
ak (k − 1) rk−2 =
k=2
If
X
2,
2a
3.
(1 − r)
has a geometric distribution and 0 < p < 1 , then the mean of
E (X) =
∞
X
xq x−1 p =
x=1
p
2
(1 − q)
=
X
is given by
1
,
p
(1)
using the formula for g ' (x) with a = p and r = q .
note:
for example, that if
p
=1/4 is the probability of success, then
E (X) = 1/ (1/4) = 4
trials are needed on the average to observe a success.
1.3
To nd the variance of
, let rst nd the second factorial moment E [X (X − 1)]. We have
X
E [X (X − 1)] =
∞
X
x (x − 1) q x−1 p =
∞
X
pqx (x − 1) q x−2 =
x=1
x=1
Using formula for g '' (x) with a = pq and r = q . Thus the variance of
X
2pq
3
(1 − q)
=
2q
.
p2
is
2
2
V ar (X) = E X 2 − [E (X)] = {E [X (X − 1)] + E (X)} − [E (X)] =
=
2q
p2
+
1
p
−
The standard deviation of
1
p2
=
X
is
2q+p−1
p2
=
1−p
p2 .
σ=
p
(1 − p) /p2 .
Example 2
Continuing with example 1 (Example 1), with p =1/4, we obtain
1
1/4 = 4,
3/4
σ 2 = (1/4)
2 =
µ=
and
σ=
http://cnx.org/content/m13124/1.3/
√
12,
12 = 3.464.
OpenStax-CNX module: m13124
note:
Binomial Distribution1
note:
Poisson Distribution2
1 "BERNOULLI TRIALS and the BINOMIAL DISTRIBUTION", see section at
<http://cnx.org/content/m13123/latest/#sec_3>
2 "POISSON DISTRIBUTION": Section POISSON DISTRIBUTION <http://cnx.org/content/m13125/latest/#sec_1>
http://cnx.org/content/m13124/1.3/
4