Download Lecture 5, May 24

Lecture 5. Binomial Distribution
“Bernoulli trials” – experiments satisfying 3 conditions:
1. Experiment has only 2 possible outcomes: Success (S) and Failure (F).
2. The probability of S is fixed (does not change) from trial to trial.
P(S)=p, 0<p<1, P(F)= 1- P(S)=1-p=q.
3. n independent trials of the experiment are performed.
Let X= # of S (successes) in n Bernoulli trials. X has a Binomial distribution
with number of trials n and probability of success p.
X~Bin(n, p)
X is a discrete r.v. with 2 parameters: n and p.
X counts number of S (successes) in n Bernoulli trials (Binomial type
experiment).
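The three conditions can be illustrated with a short simulation. This is only a sketch: the helper name count_successes and the fixed seed are assumptions made for the illustration, not part of the lecture.

```python
import random

def count_successes(n, p, rng):
    """Run n independent Bernoulli trials, each with success probability p,
    and return X = the number of successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)             # fixed seed so the run is repeatable
x = count_successes(10, 0.5, rng)  # e.g. number of heads in 10 fair coin tosses
print(x)                           # some value between 0 and 10
```

Each call produces one observed value of a Bin(n, p) random variable.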
EXAMPLES
1. Toss a coin 10 times. Record the number of H. Is this a Binomial type experiment?
2. Toss a coin until you get a T. Record the number of tosses. Is this a Binomial type experiment?
3. Toss a die 20 times. Record the number of “5”. Is this a Binomial type experiment?
4. Toss a die 20 times. Record the number of times an “even” face comes up. Is this a Binomial type experiment?
Probability distribution of Bin(n,p) r.v.
n = # of trials. An outcome of the experiment with k successes (0 ≤ k ≤ n) and n-k failures, for example
$$\underbrace{S S \cdots S}_{k \text{ times}} \; \underbrace{F F \cdots F}_{n-k \text{ times}}$$
has probability
$$\underbrace{p \, p \cdots p}_{k \text{ times}} \; \underbrace{(1-p)(1-p) \cdots (1-p)}_{n-k \text{ times}} = p^k (1-p)^{n-k}.$$
P(k successes out of n trials) = (# ways to place k S among n trials) × $p^k (1-p)^{n-k}$, where
$$\#\text{ ways to place } k \text{ S among } n \text{ trials} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!},$$
and n! = "n factorial" = 1 ⋅ 2 ⋯ (n − 2) ⋅ (n − 1) ⋅ n.
Finally,
$$P(k \text{ successes out of } n \text{ trials}) = \binom{n}{k} p^k (1-p)^{n-k}.$$
Probability distribution of a Bin(n,p) r.v.
X~Bin(n, p), n = number of trials, p = probability of S, 0 < p < 1.
Values of X: 0, 1, …, n.
n k
n−k
P(k successes out of n trials) = P(X= k) =   p (1 − p )
k 
Example:
$$\binom{5}{2} = \frac{5!}{2!\,3!} = \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5}{(1 \cdot 2) \cdot (1 \cdot 2 \cdot 3)} = 10.$$
NOTE: Table A-1 in the Appendix lists Binomial probabilities for
n=2, 3, …, 15 and p=0.1, 0.2, 0.3, …, 0.99.
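When the table does not cover the needed n or p, the binomial formula can be evaluated directly. A minimal Python sketch, assuming only the standard library; the function name binomial_pmf is illustrative and not part of the lecture.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Number of ways to place 2 successes among 5 trials.
ways = comb(5, 2)

# The probabilities over k = 0, 1, ..., n must add up to 1.
n, p = 5, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(probs)
```

Here ways reproduces the factorial computation of "5 choose 2", and total is a quick sanity check that the values form a probability distribution.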
EXAMPLE
What is the probability that in a family of 5 children 2 are girls?
What is the probability of having all girls?
Solution.
Trial/experiment: parents have a child: Girl (S) or Boy (F).
Our family has 5 children i.e. n=5 trials, p=P(S)=P( girl )=0.5.
X= # girls among 5 children; X~Bin(5, 0.5).
$$P(X = 2) = \binom{5}{2} 0.5^2 (1 - 0.5)^{5-2} = \frac{10}{2^5} = \frac{10}{32} = \frac{5}{16}.$$
Probability of having all girls?
$$P(X = 5) = \binom{5}{5} 0.5^5 (1 - 0.5)^{5-5} = \frac{1}{2^5} = \frac{1}{32}.$$
EXAMPLE
A commuter plane has 10 seats. The airline books 12 people on the flight.
Suppose the chance of a person who makes a reservation of actually
showing up is 0.8. Find the probability that someone is bumped and the
probability that at least one seat is empty.
Solution. Trial/experiment:
A person with a reservation decides: Show up (S) OR Do not show up (F) for the flight.
Total # of people with reservations =12. Total # of trials= 12. P(S) = 0.8
X= # people who show up for the flight; X~Bin(12, 0.8).
I used Table A-1 for binomial probabilities.
P( someone is bumped) = P( more than 10 people show up)=
P(11 or 12 people show up)=P(X=11 or X=12) = P(X=11) + P(X=12) =
= 0.206 + 0.069 = 0.275.
P(at least one seat is empty) = P(at most 9 people show up) =
1 - P(X=10 or X=11 or X=12) = 1 - (0.283 + 0.206 + 0.069) = 0.442.
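The table lookups for the overbooking example can be reproduced from the binomial formula. A minimal sketch, assuming X ~ Bin(12, 0.8) as in the solution; binomial_pmf is an illustrative helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p, seats = 12, 0.8, 10

# Someone is bumped when more people show up than there are seats.
p_bumped = sum(binomial_pmf(k, n, p) for k in range(seats + 1, n + 1))

# At least one seat is empty when at most 9 people show up.
p_empty_seat = sum(binomial_pmf(k, n, p) for k in range(0, seats))

print(round(p_bumped, 3), round(p_empty_seat, 3))
```

The two results agree with the table-based answers 0.275 and 0.442 to three decimal places.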
Mean and variance of a binomial random variable
If X~Bin(n, p), then the mean of X is
µ_X = EX = np,
and the standard deviation of X is
$\sigma_X = \sqrt{np(1-p)}$.
NOTES: 1. Variance of X is σ² = np(1-p).
2. The mean of a binomial r.v. (mean number of successes) is the
number of trials x the probability of success.
EXAMPLE
1. A fair coin was tossed 3 times. Let X = # of heads in the 3 tosses. What
are the mean and standard deviation of X?
Solution. X~Bin(3, 0.5). Mean of X: µ_X = EX = np = 3(0.5) = 1.5.
Standard deviation of X: $\sigma_X = \sqrt{np(1-p)} = \sqrt{3(0.5)(1-0.5)} = \sqrt{3/4} = 0.866$.
2. The overbooking airline example. What are the mean and standard
deviation of the number of passengers that show up for the flight?
Solution. X~Bin(12, 0.8). Mean of X: µ_X = EX = np = 12(0.8) = 9.6.
Standard deviation of X: $\sigma_X = \sqrt{np(1-p)} = \sqrt{12(0.8)(1-0.8)} = \sqrt{1.92} \approx 1.39$.
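The shortcut formulas np and sqrt(np(1-p)) can be compared with the mean and standard deviation computed straight from the probability distribution. A sketch for the airline example; the variable and function names are illustrative.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 12, 0.8  # overbooking example

# Shortcut formulas for a binomial random variable.
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

# The same quantities from the definitions: sum of k * P(X = k), etc.
mean_direct = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
var_direct = sum((k - mean_direct) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
sd_direct = sqrt(var_direct)
```

Both routes give a mean of 9.6 and a standard deviation of about 1.386, up to floating-point rounding.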
NORMAL DISTRIBUTION
Normal distribution: a continuous distribution.
Normal curve: bell shaped, unimodal (single peak at the center), symmetric.
Completely described by its center of symmetry (mean µ) and spread (standard deviation σ).
Random variable with normal distribution: normal random variable with mean µ and st. dev. σ: X~N(µ, σ).
Standard normal random variable: mean 0 and st. dev. 1: Z~N(0, 1).
NORMAL DISTRIBUTION - CHANGING LOCATION AND SCALE
[Figure: two panels of normal density curves plotted for x from -4 to 4.]
CHANGING LOCATION µ (0 = µ1 < µ2 = 1): Changes in mean/location cause shifts in the density curve along the x-axis.
CHANGING SCALE σ (1 = σ1 < σ2 = 2): Changes in spread/standard deviation cause changes in the shape of the density curve. The smaller σ gives a “peaky” density, the larger σ a “flatter” density.
Why Bother with Normal Distributions?
Normal distributions are great descriptions/models/approximations for many data sets such as weights,
heights, exam scores, experimental errors, etc.
Great descriptions of results/outcomes of many
chance driven experiments.
Statistical inference based on normal distribution
works well for many (approximately) symmetric
distributions.
HOWEVER, remember that not everything or
everybody is normal!
AREAS UNDER THE NORMAL CURVE
Normal probabilities = areas under the normal curve are tabulated
for the standard normal distribution (table A-2 in the Appendix).
In looking for probabilities keep in mind:
Symmetry of the normal curve and P(Z=a)=0 for any a.
FIND:
P(Z < 0.01) = 0.504
P(Z ≤ - 0.01 ) = 0.496
P(Z < 0) = 0.5
P( Z < 2.92)= 0.9982
P(Z>2.92)=1-0.9982=0.0018 or, by symmetry =P(Z< - 2.92)=0.0018
P(-1.32< Z <1.2)=0.8849 – 0.0934=0.7915
SUMMARY OF RULES we used above: P(Z > a) = 1 - P(Z < a) = P(Z < -a)
P(a < Z < b) = P(Z < b) - P(Z < a)
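The table lookups can be cross-checked in software. A minimal sketch, assuming Python 3.8 or later, where the standard library provides statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal

p_below = Z.cdf(2.92)                 # P(Z < 2.92)
p_above = 1 - Z.cdf(2.92)             # P(Z > 2.92) by the complement rule
p_above_sym = Z.cdf(-2.92)            # the same area by symmetry
p_between = Z.cdf(1.2) - Z.cdf(-1.32) # P(-1.32 < Z < 1.2)

print(round(p_below, 4), round(p_above, 4), round(p_between, 4))
```

The values match the table answers 0.9982, 0.0018 and 0.7915 up to the rounding used in the table.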
NORMAL PERCENTILES
Given that P(Z < z) = 0.95, find z. Here z is called the 95th percentile of Z.
Inside the table I looked for 0.95.
Found 0.9495 and 0.9505.
Used the z-value corresponding to the midpoint (0.95) between the two available probabilities: z = 1.645.
If an available probability is closer to the one we need, use the z-value corresponding to that probability.
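A percentile is the inverse of the area lookup. As a sketch, assuming Python 3.8+ and its statistics.NormalDist class, the 95th percentile of Z can be computed without interpolating in the table:

```python
from statistics import NormalDist

Z = NormalDist()       # standard normal: mean 0, st. dev. 1
z95 = Z.inv_cdf(0.95)  # the z with P(Z < z) = 0.95

print(round(z95, 3))   # 1.645
```

Feeding the result back into Z.cdf returns 0.95, which confirms that inv_cdf and cdf undo each other.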
Example
Scientific thermometers should give a reading of 0°C at the freezing point of water.
However, due to the usual random variability of the readings, the actual readings (in
°C) are normally distributed with mean 0 and standard deviation 1. If a thermometer
is randomly selected, find the probability that the reading is
A. below 1.58°.
B. above -1.23°.
C. between -2.00° and 1.50°.
D. Also, find the temp. corresponding to the 90th percentile of the temp readings.
E. Find the temps separating the bottom 2.5% and the top 2.5% readings.
Solution. T=temp reading at freezing point of water on a randomly chosen thermometer.
T ~N(0, 1).
A. P(T < 1.58) = 0.9429
B. P(T > -1.23) = 1 - 0.1093 = 0.8907
C. P(-2.00 < T < 1.50) = 0.9332 - 0.0228 = 0.9104
D. P(T < 1.28) = 0.8997, closest to 0.9, so use z=1.28 as 90th percentile
E. P(T < -1.96) = 0.025, by symmetry P(T > 1.96)=0.025. Thus, T=-1.96 separates the
bottom 2.5% and T=1.96 separates top 2.5% of temperatures.
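All five parts of the thermometer example can be reproduced in a few lines. A sketch assuming T ~ N(0, 1) as in the solution and Python 3.8+ for statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

T = NormalDist(mu=0, sigma=1)  # thermometer reading at the freezing point

p_a = T.cdf(1.58)                 # A. below 1.58 degrees
p_b = 1 - T.cdf(-1.23)            # B. above -1.23 degrees
p_c = T.cdf(1.50) - T.cdf(-2.00)  # C. between -2.00 and 1.50 degrees
t_90 = T.inv_cdf(0.90)            # D. 90th percentile
t_low = T.inv_cdf(0.025)          # E. cutoff for the bottom 2.5%
t_high = T.inv_cdf(0.975)         # E. cutoff for the top 2.5%
```

The exact percentiles (about 1.2816 and ±1.96) differ slightly from table answers only because the table is read to two decimals.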
GENERAL NORMAL DISTRIBUTION
IF X~N(µ, σ) then $Z = \frac{X - \mu}{\sigma}$ ~N(0, 1), standard normal (standardization).
Example. Suppose that the weight of people in NV follows a normal
distribution with mean 150 lb and standard deviation 20 lb. Find
the probability that a randomly selected Nevadan weighs
a) at most 160 lb; b) over 160 lb.
Solution. Let X= weight of a randomly selected Nevadan. X~N(150, 20).
a) $P(X \le 160) = P\left(\frac{X - 150}{20} \le \frac{160 - 150}{20}\right) = P(Z < 0.5) = 0.6915.$
b) P(X > 160) = 1 - P(X ≤ 160) = 1 - 0.6915 = 0.3085.
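Standardization can be checked numerically: the area below 160 under N(150, 20) should equal the area below z = 0.5 under N(0, 1). A sketch assuming Python 3.8+ (statistics.NormalDist); the names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 150, 20           # weight distribution from the example
X = NormalDist(mu, sigma)
Z = NormalDist()              # standard normal

z = (160 - mu) / sigma        # standardized value, 0.5
p_at_most = Z.cdf(z)          # a) P(X <= 160) via standardization
p_direct = X.cdf(160)         # the same probability without standardizing
p_over = 1 - p_at_most        # b) P(X > 160)

print(round(p_at_most, 4), round(p_over, 4))  # 0.6915 0.3085
```

Both routes agree, which is the point of standardization: one table for Z serves every normal distribution.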
NORMAL PERCENTILES
EXAMPLE. Suppose scores X on a test follow a normal distribution with mean
430 and standard deviation 100. Find the 90th percentile of the scores, that is, the
score x such that P(X ≤ x)=0.9.
Solution. Since we start with a normal but NOT STANDARD normal distribution,
we have to standardize at some point:
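$$0.9 = P(X \le x) = P\Big(\underbrace{\tfrac{X - 430}{100}}_{Z} \le \underbrace{\tfrac{x - 430}{100}}_{z}\Big) = P\Big(Z < \frac{x - 430}{100}\Big).$$
From the table, P(Z < z) = 0.90 gives z = 1.28, so we get the equation:
$$\frac{x - 430}{100} = 1.28$$
x - 430 = 128
x = 558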
90% of students scored 558 or less.
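The same percentile can be obtained by "un-standardizing": x = µ + zσ. A sketch assuming Python 3.8+ (statistics.NormalDist); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 430, 100                 # test score distribution

z90 = NormalDist().inv_cdf(0.90)     # 90th percentile of Z, about 1.28
x_manual = mu + z90 * sigma          # solve (x - 430) / 100 = z for x
x_direct = NormalDist(mu, sigma).inv_cdf(0.90)

print(round(x_manual))               # 558
```

The unrounded z (about 1.2816) gives roughly 558.2, so the table-based answer of 558 is accurate to the nearest point.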
EXAMPLE
Height of women follows normal distribution with mean 64.5 and
standard deviation of 2.5 inches. Find
a) The probability that a woman is shorter than 70 in.
b) The probability that a woman is between 60 and 70 in tall.
c) What is the height 10% of women are shorter than, i.e. what is the
10th percentile of women heights?
SOLUTION. X= women height; X~N(64.5, 2.5).
a) P(X <70)=P(Z< (70-64.5)/2.5)=P(Z<2.2)=0.9861
b) P(60 < X < 70) = P((60-64.5)/2.5 < Z < (70-64.5)/2.5) = P(-1.8 < Z <
2.2) = P(Z < 2.2) - P(Z < -1.8) = 0.9861 - 0.0359 = 0.9502.
c) 10th percentile of X =?
0.1 = P(X < x) = P(Z < (x-64.5)/2.5), so -1.28 = (x-64.5)/2.5; x = 61.3.
10% of women are shorter than 61.3 in.
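A quick numerical check of the three parts, as a sketch assuming X ~ N(64.5, 2.5) from the problem statement and Python 3.8+ for statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=64.5, sigma=2.5)  # women's heights in inches

p_a = X.cdf(70)                     # a) shorter than 70 in
p_b = X.cdf(70) - X.cdf(60)         # b) between 60 and 70 in
h_10 = X.inv_cdf(0.10)              # c) 10th percentile of heights

print(round(p_a, 4), round(p_b, 4), round(h_10, 1))
```

With mean 64.5 and the 10th-percentile z-value of about -1.28, the cutoff height comes out near 61.3 in.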