STAT 111
Chapter 5
Special Probability Distributions

Often, the observations generated by different
statistical experiments have the same general
type of behavior. Consequently, the discrete random
variables associated with these experiments can
be described by essentially the same probability
distribution and therefore can be represented by
a single formula. In fact, one needs only a
handful of important probability distributions to
describe many of the discrete random variables
encountered in practice.
The Discrete Uniform Distribution

The simplest of all discrete probability
distributions is one in which the random
variable assumes each of its values with
an equal probability. Such a probability distribution is
called the discrete uniform distribution.
Definition
If the random variable X assumes the values x_1, x_2, ..., x_n
(a finite number of values) with equal probabilities, then the
discrete uniform distribution is given by
$f(x) = P(X = x) = \frac{1}{n}, \quad x = x_1, x_2, \ldots, x_n$
Example
When a fair die is tossed, each element of
the sample space {1, 2, ..., 6} occurs with
probability 1/6. Therefore, we have a
uniform distribution with
$f(x) = \frac{1}{6}, \quad x = 1, 2, \ldots, 6$
Example
Let X be uniformly distributed on 0, 1, ..., 9. Calculate:
1. P(X ≤ 2) = P(X=2) + P(X=1) + P(X=0) = 1/10 + 1/10 + 1/10 = 3/10
2. P(1 < X < 4) = P(X=2) + P(X=3) = 1/10 + 1/10 = 2/10 = 1/5
3. P(4 < X < 6 or 8 < X ≤ 12) = P(4 < X < 6) + P(8 < X ≤ 12) = 1/10 + 1/10 = 1/5
4. P(5 ≤ X ≤ 9) = 5/10 = 1/2
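The four probabilities above can be checked by counting favourable values, since every value has probability 1/n. The following is a minimal sketch; the helper name `uniform_prob` is an illustrative choice, not something from the slides.

```python
from fractions import Fraction

def uniform_prob(values, event):
    """P(X in event) for X uniform on `values`; `event` is a predicate on x."""
    values = list(values)
    favourable = sum(1 for x in values if event(x))
    return Fraction(favourable, len(values))

support = range(10)  # X takes the values 0, 1, ..., 9

p1 = uniform_prob(support, lambda x: x <= 2)
p2 = uniform_prob(support, lambda x: 1 < x < 4)
p3 = uniform_prob(support, lambda x: 4 < x < 6 or 8 < x <= 12)
p4 = uniform_prob(support, lambda x: 5 <= x <= 9)
print(p1, p2, p3, p4)  # 3/10 1/5 1/5 1/2
```

Exact fractions are used so the results can be compared directly with the hand calculation.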
The Bernoulli Distribution
An experiment of a particularly simple type
is one in which there are only two possible
outcomes, such as head or tail, success
or failure, defective or not defective. It is
convenient to designate the two possible
outcomes of such an experiment as
0 and 1.
The following definition can then be applied to any
experiment of this type.
A random variable X has a Bernoulli distribution with
parameter p (0 ≤ p ≤ 1) if X can take only the values
0 and 1 and the probabilities are
P(X = 1) = p and P(X = 0) = 1 - p
If we let q = 1 - p, then the p.m.f. of X
can be written as follows:
$f(x) = \begin{cases} p^x q^{1-x}, & x = 0, 1 \\ 0, & \text{otherwise} \end{cases}$
Note that the parameter of the Bernoulli distribution is p.
Binomial Distribution
A random variable X has a binomial distribution with
parameters n and p if X has a discrete distribution
for which the p.m.f. is as follows:
$f(x) = \begin{cases} \binom{n}{x} p^x q^{n-x}, & x = 0, 1, \ldots, n \\ 0, & \text{otherwise} \end{cases} \qquad q = 1 - p$
In this distribution n must be a positive integer and p must lie in the interval 0 < p < 1.
The binomial distribution is of fundamental importance in
probability and statistics because of the following result:
Suppose that the outcome of an experiment can be
either success or failure; that the experiment is
performed n times independently; and that the
probability of success on any given trial is constant
and equal to p.
If X denotes the number of successes among the n
trials, then X has a binomial distribution with
parameters n and p. We will often write X ~ bin(n, p) to
indicate that X is a binomial random variable based on
n trials with success probability p. Note that the
binomial distribution reduces to the Bernoulli
distribution when n = 1.
In summary, if
1. The experiment is performed n times independently
2. The outcome of each trial can be either success (S) or failure (F),
the outcome of interest being called a success, with
probability of success = p,
probability of failure = q = 1 - p,
If these conditions are satisfied then
$f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, \ldots, n$
Example
An agricultural scientist plants 10 seeds of a certain variety
of hybrid corn. Past experience leads him to
believe that the probability of a given seed germinating
is 0.9.
1. What is the probability that exactly 6
seeds germinate?
X = number of seeds that germinate,
n = 10,
S = seed germinates,
F = seed does not germinate,
p = 0.9,
q = 1 - 0.9 = 0.1
$f(6) = \binom{10}{6} (0.9)^6 (0.1)^4 = 0.0112$
2. What is the probability that all the seeds germinate?
$f(10) = \binom{10}{10} (0.9)^{10} (0.1)^0 = 0.349$
3. What is the probability that 4 seeds do not germinate?
X = number of seeds that do not germinate,
n = 10,
S = seed does not germinate,
F = seed germinates, p = 0.1, q = 0.9
$f(4) = \binom{10}{4} (0.1)^4 (0.9)^6 = 0.0112$
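The three answers in the seed example can be reproduced directly from the binomial p.m.f. The sketch below uses only the Python standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ bin(n, p); zero outside x = 0, 1, ..., n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Seed example: n = 10 seeds, germination probability p = 0.9.
exactly_six = binomial_pmf(6, 10, 0.9)
all_ten = binomial_pmf(10, 10, 0.9)
# Part 3 counts non-germinating seeds, so "success" has p = 0.1.
four_fail = binomial_pmf(4, 10, 0.1)
print(round(exactly_six, 4), round(all_ten, 3), round(four_fail, 4))
```

Parts 1 and 3 give the same number because "6 germinate" and "4 do not germinate" describe the same event.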
Remark
Consider sampling with replacement from an urn
containing N balls, K of which are defective (S) and
N - K of which are not defective (F). Let
X represent the number of defective balls in a
sample of size n. The experiment of taking a
sample of size n with replacement consists of n
repeated independent trials where
p = P(success) = K/N; so X has the binomial
distribution
$f(x) = \binom{n}{x} \left(\frac{K}{N}\right)^x \left(1 - \frac{K}{N}\right)^{n-x}, \quad x = 0, 1, 2, \ldots, n$
The probability that a binomial random variable X is less
than or equal to some specified value x is given by the
cumulative distribution function
$F(x) = P(X \le x) = \sum_{i=0}^{x} \binom{n}{i} p^i q^{n-i}$
The binomial distribution has been tabulated (Table 1 in the Appendix)
for various values of n and p by using the cumulative probability for
samples of size n = 1, 2, ..., 20 and selected values of p from 0.1 to 0.9.
We can also determine individual probabilities using this table, since the
binomial random variable is integer-valued and thus has the property
f(x) = P(X = x) = F(x) - F(x - 1)
Example
The probability that a patient recovers from a rare blood disease is
0.4. If 15 people are known to have contracted this disease, what is
the probability that
1. At least 10 survive?
X = number of people that survive, X ~ bin(15, 0.4)
P(X ≥ 10) = 1 - P(X < 10) = 1 - P(X ≤ 9) = 1 - F(9) = 1 - 0.9662 = 0.0338
2. From 3 to 8 survive?
P(3 ≤ X ≤ 8) = F(8) - F(2) = 0.9050 - 0.0271 = 0.8779
3. Exactly 5 survive?
P(X = 5) = F(5) - F(4) = 0.4032 - 0.2173 = 0.1859
or
$P(X = 5) = f(5) = \binom{15}{5} (0.4)^5 (0.6)^{10} = 0.1859$
Example
Cars coming to a dead-end intersection can turn either left or
right. Assume that successive cars choose a turning direction
independently of one another and that P(left) = 0.7.
1. Among the next 15 cars, what is the probability that at least 10 turn left?
X = number of cars which turn left,
n = 15, S = car turns left,
p = 0.7,
q = 0.3
P(X ≥ 10) = 1 - P(X < 10) = 1 - P(X ≤ 9) = 1 - F(9) = 1 - 0.278 = 0.722
2. Among the next 15 cars, what is the probability that at least 10
turn in the same direction?
Let X" = number of cars which turn right,
n = 15, S = car turns right, p = 0.3, q = 0.7
P(at least 10 turn in the same direction) = P(at least 10 turn left or at least 10 turn
right) = P(X ≥ 10 or X" ≥ 10)
Among 15 cars the two events cannot both occur, so the probabilities add:
= P(X ≥ 10) + P(X" ≥ 10) = 0.722 + 0.004 = 0.726
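The table lookups in the blood disease and intersection examples can be replaced by summing the p.m.f. term by term. This is a minimal sketch of that idea; `binomial_cdf` is an illustrative name, and the unrounded results may differ from the table-based answers in the third decimal place.

```python
from math import comb

def binomial_cdf(x, n, p):
    """F(x) = P(X <= x) for X ~ bin(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, x + 1))

# Blood disease example: n = 15, p = 0.4.
at_least_10 = 1 - binomial_cdf(9, 15, 0.4)
from_3_to_8 = binomial_cdf(8, 15, 0.4) - binomial_cdf(2, 15, 0.4)
exactly_5 = binomial_cdf(5, 15, 0.4) - binomial_cdf(4, 15, 0.4)

# Intersection example: n = 15, P(left) = 0.7, P(right) = 0.3.
left_10 = 1 - binomial_cdf(9, 15, 0.7)
right_10 = 1 - binomial_cdf(9, 15, 0.3)
same_direction = left_10 + right_10  # the two events are mutually exclusive

print(round(at_least_10, 4), round(from_3_to_8, 4), round(exactly_5, 4))
print(round(left_10, 3), round(right_10, 3), round(same_direction, 3))
```

Adding the rounded values 0.722 and 0.004 gives 0.726, while the unrounded sum is closer to 0.725; the gap is only rounding error.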
The binomial experiment becomes a
multinomial experiment if we let each
trial have more than two possible
outcomes. Hence the classification of a
manufactured product as being light,
heavy, or acceptable constitutes a
multinomial experiment.
Definition
If a given trial can result in the k outcomes E_1, E_2, ..., E_k with
probabilities p_1, p_2, ..., p_k, then the probability distribution of
the random variables X_1, X_2, ..., X_k, representing the number
of occurrences for E_1, E_2, ..., E_k in n independent trials, is
$f(x_1, x_2, \ldots, x_k) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$
with
$\sum_{i=1}^{k} x_i = n$
and
$\sum_{i=1}^{k} p_i = 1$
Example
A balanced die has two faces colored red, three
faces colored white, and one colored blue. If it
is thrown 9 times, what is the probability that
each color appears 3 times?
$f(3, 3, 3) = \binom{9}{3, 3, 3} \left(\frac{2}{6}\right)^3 \left(\frac{3}{6}\right)^3 \left(\frac{1}{6}\right)^3$
Exercise
A pair of dice is tossed 6 times. What is the
probability of obtaining a total of 7 or 11
twice, a matching pair once, and any other
combination 3 times?
Answer
f(2, 1, 3) = 0.1127
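The multinomial formula can be evaluated numerically for the colored-die example and for the exercise. The sketch below is one way to do it; `multinomial_pmf` is an illustrative name, and the cell probabilities 8/36, 6/36 and 22/36 for the dice are obtained by counting the 36 equally likely outcomes of a pair of dice.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """f(x_1, ..., x_k) with n = sum(counts) trials and cell probabilities probs."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)  # exact division: builds n! / (x_1! ... x_k!)
    return coeff * prod(p**x for p, x in zip(probs, counts))

# Colored die thrown 9 times: red 2/6, white 3/6, blue 1/6.
each_color_3 = multinomial_pmf((3, 3, 3), (2 / 6, 3 / 6, 1 / 6))

# Pair of dice tossed 6 times: total of 7 or 11 (8 of 36 outcomes),
# matching pair (6 of 36), anything else (22 of 36).
exercise = multinomial_pmf((2, 1, 3), (8 / 36, 6 / 36, 22 / 36))

print(round(each_color_3, 4), round(exercise, 4))
```

With two cells, the same function reduces to the binomial p.m.f.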
The Hypergeometric Distribution
The types of applications of the hypergeometric are
very similar to those of the binomial distribution.
The difference between the binomial distribution
and the hypergeometric distribution lies in the way
the sampling is done. In the case of the binomial,
independence among trials is required (sampling
with replacement). On the other hand, the
hypergeometric distribution does not require
independence and, therefore, is based on
sampling done without replacement.
Consider a set of N elements of which K are
successes (S) and the other N - K are failures (F). We
are interested in the probability of getting x
successes in a sample of n elements drawn without replacement.
X = number of successes
Definition
A random variable X has a hypergeometric
distribution if and only if its probability
distribution is given by
$f(x) = P(X = x) = \begin{cases} \dfrac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}}, & x = 0, 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases}$
Note that
f(x) is equal to zero when x > K or when
(n - x) > N - K. Applications for the
hypergeometric distribution are found in
many areas where testing is done at the
expense of the item being tested. The
item is destroyed and hence cannot be
replaced in the sample. Thus sampling
without replacement is necessary.
Example
A jar contains 5 red and 10 green marbles.
Seven are selected without replacement. What is
the probability that exactly 3 red are selected?
That exactly 6 red are selected?
$f(3) = \dfrac{\binom{5}{3} \binom{10}{4}}{\binom{15}{7}} = 0.326$, $\quad f(6) = \dfrac{\binom{5}{6} \binom{10}{1}}{\binom{15}{7}} = 0$
Example
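A committee of size 5 is to be selected at random from 3
chemists and 5 physicists. Find the probability distribution
for the number of chemists on the committee.
$f(0) = \dfrac{\binom{3}{0} \binom{5}{5}}{\binom{8}{5}} = \dfrac{1}{56}$, $\quad f(1) = \dfrac{\binom{3}{1} \binom{5}{4}}{\binom{8}{5}} = \dfrac{15}{56}$,
$f(2) = \dfrac{\binom{3}{2} \binom{5}{3}}{\binom{8}{5}} = \dfrac{30}{56}$, $\quad f(3) = \dfrac{\binom{3}{3} \binom{5}{2}}{\binom{8}{5}} = \dfrac{10}{56}$
The probability distribution for X
x       0       1       2       3
f(x)    1/56    15/56   30/56   10/56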
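The marble and committee examples can be verified with exact fractions. The following sketch assumes the parameter order (x, N, K, n); the name `hypergeom_pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, K, n):
    """P(X = x): x successes in n draws without replacement from N items, K successes."""
    if x < 0 or x > n:
        return Fraction(0)
    # comb returns 0 when x > K or n - x > N - K, matching the note on zero cases.
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

# Jar with 5 red and 10 green marbles, 7 drawn.
three_red = hypergeom_pmf(3, 15, 5, 7)
six_red = hypergeom_pmf(6, 15, 5, 7)

# Committee of 5 from 3 chemists and 5 physicists.
committee = [hypergeom_pmf(x, 8, 3, 5) for x in range(4)]

print(float(three_red), six_red)
print(committee, sum(committee))
```

`Fraction` reduces automatically, so 30/56 and 10/56 print as 15/28 and 5/28.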
When N is large and n is relatively small
compared to N, the probability for each drawing
will change only slightly and hence there is not
much difference between sampling with
replacement and sampling without replacement,
and the formula for the binomial distribution with
the parameters n and p = K/N
may be used to approximate hypergeometric
probabilities.
Example
Among the 120 applicants for a job only 80 are actually qualified. If
5 of these applicants are randomly selected for an interview, find the
probability that only 2 of the 5 will be qualified for the job by using
1. The hypergeometric distribution
x = 2, n = 5, N = 120, K = 80
$f(2) = \dfrac{\binom{80}{2} \binom{40}{3}}{\binom{120}{5}} = 0.164$
2. The binomial distribution as an approximation, with p = 80/120 = 2/3
$f(2) = \binom{5}{2} \left(\frac{2}{3}\right)^2 \left(\frac{1}{3}\right)^3 = 0.165$
As can be seen, these results are close.
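The comparison in this example can be reproduced numerically. The sketch below also repeats the calculation for a population ten times larger (N = 1200, K = 800), a hypothetical case added only to illustrate that the approximation improves as N grows with K/N fixed.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    """Sampling without replacement: x successes among n draws."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def binomial_pmf(x, n, p):
    """Sampling with replacement: x successes among n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

exact = hypergeom_pmf(2, 120, 80, 5)
approx = binomial_pmf(2, 5, 80 / 120)

# Hypothetical larger population with the same proportion K/N = 2/3.
exact_large = hypergeom_pmf(2, 1200, 800, 5)

print(round(exact, 3), round(approx, 3))
print(abs(exact - approx), abs(exact_large - approx))
```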
The hypergeometric distribution can be
extended to treat the case where the N
items can be partitioned into k cells
A_1, A_2, ..., A_k, with a_1 elements in the
first cell, a_2 elements in the second cell,
..., a_k elements in the kth cell.
Definition (Multivariate Hypergeometric Distribution)
If N items can be partitioned into the k cells A_1, A_2, ..., A_k
with a_1, a_2, ..., a_k elements, respectively,
then the probability distribution of the random variables
X_1, X_2, ..., X_k, representing the number of elements selected
from A_1, A_2, ..., A_k in a random sample of size n, is
$f(x_1, x_2, \ldots, x_k) = \dfrac{\binom{a_1}{x_1} \binom{a_2}{x_2} \cdots \binom{a_k}{x_k}}{\binom{N}{n}}$
with
$\sum_{i=1}^{k} x_i = n$
and
$\sum_{i=1}^{k} a_i = N$
Example
A group of 10 individuals is being used in a biological case
study. The group contains 3 people with blood type O,
4 with blood type A, and 3 with blood type B. What is the
probability that a random sample of 5 will contain 1 person
with blood type O, 2 people with blood type A, and 2 people
with blood type B?
$f(1, 2, 2) = \dfrac{\binom{3}{1} \binom{4}{2} \binom{3}{2}}{\binom{10}{5}} = \dfrac{3}{14}$
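The blood-type result can be checked by multiplying the binomial coefficients cell by cell. This is a minimal sketch; `multivariate_hypergeom_pmf` is an illustrative name, and N and n are taken to be the sums of the cell sizes and of the sampled counts.

```python
from fractions import Fraction
from math import comb, prod

def multivariate_hypergeom_pmf(xs, cells):
    """f(x_1, ..., x_k) for cell sizes a_1, ..., a_k given in `cells`."""
    N, n = sum(cells), sum(xs)
    return Fraction(prod(comb(a, x) for a, x in zip(cells, xs)), comb(N, n))

# Blood types O, A, B with 3, 4, 3 people; sample of 5 with counts 1, 2, 2.
blood = multivariate_hypergeom_pmf((1, 2, 2), (3, 4, 3))
print(blood)  # 3/14
```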
The Geometric Distribution
Suppose that independent trials, each having a probability p, 0 < p < 1,
of being a success, are performed until the first success occurs. If we
let X equal the number of
trials required, then
sample space = {S, FS, FFS, ...}, X = number of trials, and
P(X = 1) = P(S) = p
P(X = 2) = P(FS) = P(F)P(S) = (1 - p)p
$P(X = 3) = P(FFS) = (1 - p)^2 p$
In general,
$P(X = k) = P(FF \ldots FS) = (1 - p)^{k-1} p$
The probability of this intersection of k independent events gives
the geometric probability distribution (note that the geometric
distribution is well named, since the values of the geometric
probability mass function are the terms of a geometric series).
Definition
Let 0 < p < 1. Then the real-valued function f defined on R by
$f(x) = P(X = x) = \begin{cases} p (1 - p)^{x-1}, & x = 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$
is called the geometric probability distribution. X ~ G(p)
will mean that X has the geometric distribution with
parameter p.
A comparison of the binomial and geometric
distributions will show that, although both serve
as models for experiments involving
independent, repeated Bernoulli trials, in the
binomial case there is a fixed number of trials,
and the random variable represents the number
of successes which occur. In the case of the
geometric, there is no fixed number of trials.
The random variable represents the number of
trials until the first success.
In summary
The Binomial Distribution
1. Independent trials
2. The number of trials is fixed at n
3. Two outcomes (success or failure)
4. X = number of successes among the n trials
The Geometric Distribution
1. Independent trials
2. There is no fixed number of trials
3. Two outcomes (success or failure)
4. X = number of trials until the first success
Example
Find the probability that a person flipping a balanced coin
requires 4 tosses to get a head.
X ~ G(0.5)
$f(4) = 0.5 (0.5)^3 = \frac{1}{16}$
Example
In a certain manufacturing process it is known that 1 in every
100 items is defective. What is the probability that the fifth
item inspected is the first defective item found?
X ~ G(0.01)
$f(5) = 0.01 (0.99)^4 = 0.0096$
Theorem
Let X be G(p) and k a nonnegative integer. Then
$P(X > k) = (1 - p)^k$   (1)
from which, for k ≥ 1,
$P(X \ge k) = P(X > k - 1) = (1 - p)^{k-1}$   (2)
From (1),
$F(k) = P(X \le k) = 1 - P(X > k) = 1 - (1 - p)^k$
From (2),
$P(X < k) = 1 - P(X \ge k) = 1 - (1 - p)^{k-1}$
Example
An urn contains 10 white and 8 black balls. Balls are randomly
selected, one at a time, until a black one is obtained. If we assume that
each selected ball is replaced before the next one is drawn, what is the
probability that
1. Exactly 5 draws are needed?
X = number of draws needed to select a black ball
p = 8/18, X ~ G(8/18)
$P(X = 5) = \left(\frac{10}{18}\right)^4 \frac{8}{18} \approx 0.0423$
2. At least 4 draws are needed?
$P(X \ge 4) = \left(1 - \frac{8}{18}\right)^3 = \left(\frac{10}{18}\right)^3 \approx 0.1715$
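The urn example and result (1) of the theorem can be checked with exact fractions. The sketch below is illustrative; `geometric_pmf` and `geometric_tail` are names chosen here, not notation from the slides.

```python
from fractions import Fraction

def geometric_pmf(x, p):
    """P(X = x) = p (1 - p)^(x - 1) for x = 1, 2, ...; zero otherwise."""
    if x < 1:
        return 0 * p
    return p * (1 - p) ** (x - 1)

def geometric_tail(k, p):
    """P(X > k) = (1 - p)^k for a nonnegative integer k."""
    return (1 - p) ** k

p = Fraction(8, 18)                # probability of drawing a black ball
exactly_5 = geometric_pmf(5, p)    # (10/18)^4 * (8/18)
at_least_4 = geometric_tail(3, p)  # P(X >= 4) = P(X > 3) = (10/18)^3

# Check of result (1): summing the p.m.f. over x = 1..k gives 1 - (1 - p)^k.
cdf_3 = sum(geometric_pmf(x, p) for x in range(1, 4))

print(exactly_5, at_least_4, cdf_3 == 1 - geometric_tail(3, p))
```

Passing a float such as 0.5 instead of a `Fraction` also works and gives the coin result f(4) = 1/16 = 0.0625.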
The Negative Binomial Distribution
A random variable with a negative binomial distribution
originates from a context that is very similar to the one
that leads to the geometric distribution. Again, we focus
on independent and identical trials, each of which results
in one of two outcomes, success or failure. The
probability of success is p and stays the same from trial to
trial. The geometric distribution handles the case where
we are interested in the number of the trial on which the
first success occurs. The negative binomial distribution
is used when we are interested in the number of the trial
on which the rth success occurs (r = 2, 3, ...).
Definition
A random variable X is said to have a negative binomial
probability distribution if and only if
$f(x) = P(X = x) = \begin{cases} \binom{x-1}{r-1} p^r (1 - p)^{x-r}, & x = r, r+1, r+2, \ldots \\ 0, & \text{otherwise} \end{cases} \qquad 0 \le p \le 1$
X ~ b*(r, p) will mean that X has the negative binomial distribution
with parameters r and p.
Example
The probability that a person living in a certain city owns a car is
0.3. Find the probability that the 10th person randomly
interviewed in this city is the 5th one to own a car.
X ~ b*(5, 0.3)
$f(10) = \binom{9}{4} (0.3)^5 (0.7)^5 = 0.0515$
Example
If the probability is 0.40 that a child exposed to a certain
disease will catch it, what is the probability that the tenth
child exposed to the disease will be the third to catch it?
X ~ b*(3, 0.4)
$f(10) = \binom{9}{2} (0.4)^3 (0.6)^7 = 0.0645$
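Both negative binomial examples follow directly from the definition. The sketch below evaluates them and also checks that r = 1 gives back the geometric p.m.f.; the function name `negative_binomial_pmf` is an illustrative choice.

```python
from math import comb

def negative_binomial_pmf(x, r, p):
    """P(the r-th success occurs on trial x); zero for x < r."""
    if x < r:
        return 0.0
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

car_owner = negative_binomial_pmf(10, 5, 0.3)  # 10th person is the 5th car owner
disease = negative_binomial_pmf(10, 3, 0.4)    # 10th child is the 3rd to catch it

# With r = 1 the formula reduces to the geometric p.m.f. p (1 - p)^(x - 1).
first_defective = negative_binomial_pmf(5, 1, 0.01)

print(round(car_owner, 4), round(disease, 4), round(first_defective, 4))
```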
The Poisson Distribution
The Poisson distribution (named after Siméon Poisson, the
nineteenth-century French probabilist who described it) is another
extremely useful discrete probability distribution in which the
random variable represents the number of occurrences of
independent events that take place over a given time interval or in a
specified region. The given time interval may be of any length, such
as a minute, a day, a week, a month, or even a year. Typical
examples are the number of arrivals to a service facility in a given
time, the number of telephone calls per hour received by an office,
and the number of days school is closed due to snow during the winter.
The specified region could be a line segment, an area, a volume, or
perhaps a piece of material. In this case X might represent the
number of bacteria in a given culture, or the number of typing errors
per page.
Definition
Let X be a random variable with a discrete distribution, and
suppose that the value of X must be a nonnegative integer. It is said
that X has a Poisson distribution with mean λ (λ > 0) if the p.m.f. of
X is as follows:
$f(x) = P(X = x) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$
The parameter of the Poisson distribution is λ, the average number
of occurrences of the random event per unit of time. X ~ P(λ) will
mean that X has the Poisson distribution with parameter λ.
Example
The average number of radioactive particles
passing through a counter during 1 millisecond in a
laboratory experiment is 4. What is the probability that 6
particles enter the counter in a given millisecond?
X ~ P(4)
$f(6) = \dfrac{e^{-4} 4^6}{6!} = 0.1042$
In Table 2 of the Appendix, the cumulative distribution
function of the Poisson random variable is tabulated for
selected values of x and λ.
Several illustrations of the use of Table 2 now follow.
Let λ = 2.5.
The probability that X is less than 3 is
P(X < 3) = P(X ≤ 2) = F(2) = 0.5438
The probability that X is no less than 4 is
P(X ≥ 4) = 1 - P(X ≤ 3) = 1 - F(3) = 0.2424
The probability that X is exactly 2 is
P(X = 2) = F(2) - F(1) = 0.2565
Example
The average number of oil tankers arriving each day at a
certain port city is known to be 10. The facilities at the port can
handle at most 15 tankers per day. What is the probability that on a
given day tankers will have to be sent away?
X = number of tankers arriving each day, X ~ P(10)
P(X > 15) = 1 - P(X ≤ 15)
= 1 - F(15)
= 1 - 0.9513
= 0.0487
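The Poisson values quoted from Table 2, together with the particle and tanker answers, can be recomputed by summing the p.m.f. The following is a minimal sketch with illustrative function names; it is not a substitute for the table, only a cross-check.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ P(lam); zero for negative x."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """F(x) = P(X <= x), summed term by term."""
    return sum(poisson_pmf(i, lam) for i in range(0, x + 1))

# Table 2 illustrations with lam = 2.5.
less_than_3 = poisson_cdf(2, 2.5)
no_less_than_4 = 1 - poisson_cdf(3, 2.5)
exactly_2 = poisson_cdf(2, 2.5) - poisson_cdf(1, 2.5)

particles = poisson_pmf(6, 4)            # radioactive particle example
tankers_away = 1 - poisson_cdf(15, 10)   # oil tanker example

print(round(less_than_3, 4), round(no_less_than_4, 4), round(exactly_2, 4))
print(round(particles, 4), round(tankers_away, 4))
```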
Example
Suppose that the average number of telephone calls arriving
at the switchboard of a small corporation is 30 calls per hour.
1. What is the probability that 3 calls will arrive in a 1-minute period?
30 calls per 60 minutes gives λ = 30/60 = 0.5 calls per minute
$f(3) = \dfrac{e^{-0.5} (0.5)^3}{3!} = 0.0126$
2. What is the probability that no calls will arrive in a 3-minute period?
0.5 calls per minute gives λ = 0.5 × 3 = 1.5 calls per 3 minutes
$f(0) = \dfrac{e^{-1.5} (1.5)^0}{0!} = 0.223$
3. What is the probability that more than five calls will arrive
in a 5-minute interval?
(Answer: 0.042)
The Poisson Approximation to the Binomial
Distribution
The Poisson random variable can be used as an
approximation to the binomial random variable with
parameters (n, p). When n is large and p is small
enough so that np is of moderate size, the value of the p.m.f.
of the binomial distribution can be approximated, for
x = 0, 1, 2, ..., by the value of the p.m.f. of the Poisson
distribution for which λ = np.
Example
Suppose that in a large population the proportion of people
that have a certain disease is 0.01. What is the probability that
in a random group of 200 people at least four people will
have the disease?
X = number of people having the disease among the 200 people
X ~ bin(200, 0.01) can be approximated by a Poisson distribution with λ = np = 2
P(X ≥ 4) = 1 - P(X < 4) = 1 - P(X ≤ 3) = 1 - 0.8571 = 0.1429
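To see how close the approximation is in this example, the exact binomial tail can be computed alongside the Poisson tail. The sketch below does so with the standard library; the function names are illustrative, and the exact binomial figure does not appear in the slides.

```python
from math import comb, exp, factorial

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, x + 1))

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ P(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(0, x + 1))

n, p = 200, 0.01
exact = 1 - binomial_cdf(3, n, p)    # exact binomial P(X >= 4)
approx = 1 - poisson_cdf(3, n * p)   # Poisson approximation with lam = np = 2

print(round(exact, 4), round(approx, 4))
```

The two values agree to about two decimal places, which is the kind of agreement the approximation is meant to deliver when n is large and p is small.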