8 Probability Distributions and Statistics
• Distributions of Random Variables
• Expected Value
• Variance and Standard Deviation
• Binomial Distribution
• Normal Distribution
• Applications of the Normal Distribution
Copyright © 2006 Brooks/Cole, a division of Thomson Learning, Inc.
Random Variable
A random variable is a rule that assigns a
number to each outcome of a chance experiment.
• Finite discrete – variable can assume only
finitely many values.
• Infinite discrete – variable can assume
infinitely many values that may be arranged in
a sequence.
• Continuous – variable can assume values
that make up an interval of real numbers.
Probability Distribution for the Random Variable X
A probability distribution for a random variable X:

x          –8     –3     –1     0      1      4      6
P(X = x)   0.13   0.15   0.17   0.20   0.15   0.11   0.09

Find
a. P(X ≤ 0) = 0.65
b. P(−3 ≤ X ≤ 1) = 0.67
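As a quick check of the two answers above, here is a minimal Python sketch that sums the matching table entries. The names `dist` and `prob` are illustrative and not part of the slides.

```python
# Probability distribution from the table: value of X -> P(X = x)
dist = {-8: 0.13, -3: 0.15, -1: 0.17, 0: 0.20, 1: 0.15, 4: 0.11, 6: 0.09}

def prob(dist, lo=float("-inf"), hi=float("inf")):
    """Return P(lo <= X <= hi) by adding the probabilities of the values in range."""
    return sum(p for x, p in dist.items() if lo <= x <= hi)

print(round(prob(dist, hi=0), 2))    # P(X <= 0)       -> 0.65
print(round(prob(dist, -3, 1), 2))   # P(-3 <= X <= 1) -> 0.67
```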
Ex. Students from a small college were asked how
many charge cards they carry. X is the random
variable representing the number of cards and the
results are below.

x          0      1      2      3      4      5      6
# people   12     42     57     24     9      4      2
P(X = x)   0.08   0.28   0.38   0.16   0.06   0.03   0.01

The row P(X = x) is the probability distribution of X: each entry is
the number of people divided by the 150 students surveyed, rounded to
two decimal places.
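The probability row can be reproduced from the counts as relative frequencies. This is a small sketch; the variable names are assumptions made for illustration.

```python
# Survey counts: number of cards -> number of students
counts = {0: 12, 1: 42, 2: 57, 3: 24, 4: 9, 5: 4, 6: 2}

total = sum(counts.values())  # number of students surveyed

# Relative frequency of each value, rounded to two decimals as on the slide
dist = {x: round(k / total, 2) for x, k in counts.items()}

print(total)  # 150
print(dist)
```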
Histograms
A way to represent a probability distribution of
a random variable graphically.
Credit card results:

x          0      1      2      3      4      5      6
P(X = x)   0.08   0.28   0.38   0.16   0.06   0.03   0.01

[Histogram: P(X = x) on the vertical axis (0 to 0.4) against Number of Cards (0 to 6) on the horizontal axis.]
Mean
The average (mean) of the n numbers x1, x2, ..., xn is x̄, where
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
Median
The median is the middle value in a set of data that
is arranged in increasing or decreasing order. For
an even number of data points the median is the
average of the middle two.
Mode
The mode is the number that occurs most frequently
in a set of data.
Ex. The quiz scores for a particular student are
given below:
22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18
Find the mean, median and mode.
Mean:
\[ \frac{\text{sum of entries}}{\text{number of data points}} = \frac{273}{13} = 21 \]
Median:
Sorted data: 12, 18, 18, 20, 20, 20, 20, 22, 24, 24, 25, 25, 25
Middle number = 20
Mode (most frequent):
20 (occurs 4 times)
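The same three summaries can be checked with Python's standard `statistics` module. This is only a sketch; the list name `scores` is an assumption.

```python
from statistics import mean, median, mode

# Quiz scores from the example
scores = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]

print(sum(scores), len(scores))  # 273 13
print(mean(scores))              # mean: 273 / 13 = 21
print(median(scores))            # median: 7th value of the sorted list = 20
print(mode(scores))              # mode: most frequent value = 20
```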
Expected Value of a Random
Variable X
Let X denote a random variable that assumes the values
x1, x2, …,xn with associated probabilities p1, p2, …, pn,
respectively. Then the expected value of X, E(X), is
given by
E ( X ) = x1 p1 + x2 p2 + ... + xn pn
Ex. Use the data below to find the expected
number of credit cards that a student will possess.
X = number of credit cards

x          0      1      2      3      4      5      6
P(X = x)   0.08   0.28   0.38   0.16   0.06   0.03   0.01

E(X) = x1 p1 + x2 p2 + ... + xn pn
     = 0(0.08) + 1(0.28) + 2(0.38) + 3(0.16) + 4(0.06) + 5(0.03) + 6(0.01)
     = 1.97
About 2 credit cards
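The expected-value formula is a weighted sum, which can be sketched in a few lines of Python. The helper name `expected_value` and the dictionary `cards` are illustrative choices.

```python
def expected_value(dist):
    """E(X) = x1*p1 + x2*p2 + ... + xn*pn for a dict mapping x -> P(X = x)."""
    return sum(x * p for x, p in dist.items())

# Credit card distribution from the example
cards = {0: 0.08, 1: 0.28, 2: 0.38, 3: 0.16, 4: 0.06, 5: 0.03, 6: 0.01}

print(round(expected_value(cards), 2))  # 1.97
```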
Ex. Jackson and Max are playing a dice game where
a single die is rolled. Jackson pays Max $2 for
rolling a 1, 2, 3, or 4 and Max pays Jackson $D for a
5 or 6. Determine the value of D if the game is to be
fair.
\[ P(\text{Jackson loses}) = \frac{4}{6} = \frac{2}{3} \]
We want the expected value of the game to be
zero for it to be fair. Counting Jackson's loss as −2:
\[ (-2)\left(\frac{2}{3}\right) + (D)\left(\frac{1}{3}\right) = 0 \]
\[ -4 + D = 0 \]
\[ D = 4 \]
D should be $4.
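The fair-game condition E = 0 can be solved for D with exact fractions. The sketch below assumes the payoffs are taken from Jackson's point of view; the variable names are illustrative.

```python
from fractions import Fraction

p_lose = Fraction(4, 6)   # Jackson pays on a roll of 1, 2, 3, or 4
p_win = 1 - p_lose        # Jackson is paid on a roll of 5 or 6
loss = 2                  # dollars Jackson pays when he loses

# Fair game: (-loss) * p_lose + D * p_win = 0, so D = loss * p_lose / p_win
D = loss * p_lose / p_win

print(D)                            # 4
print(-loss * p_lose + D * p_win)   # expected value of the game: 0
```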
Odds
If P(E) is the probability of an event E occurring,
then
1. The odds in favor of E occurring are given by
the ratio (E occurs to E doesn't occur)
\[ \frac{P(E)}{1 - P(E)} = \frac{P(E)}{P(E^C)} \]
2. The odds against E occurring are given by the
ratio (E doesn't occur to E occurs)
\[ \frac{1 - P(E)}{P(E)} = \frac{P(E^C)}{P(E)} \]
Ex. If the news has just announced that the probability
of rain is 0.65 (65%), find
a. the odds in favor of rain
\[ \frac{P(E)}{1 - P(E)} = \frac{0.65}{1 - 0.65} = \frac{0.65}{0.35} = \frac{13}{7} \]
b. the odds against rain
\[ \frac{1 - P(E)}{P(E)} = \frac{0.35}{0.65} = \frac{7}{13} \]
Probability of an Event (Given Odds)
If the odds in favor of an event E occurring are
a to b, then the probability of E occurring is
\[ P(E) = \frac{a}{a + b} \]
Ex. The odds that the horse Gluebound will win a
particular race are 2 to 16. Find the probability that
Gluebound wins the race.
\[ P(\text{win}) = \frac{a}{a + b} = \frac{2}{2 + 16} = \frac{2}{18} = \frac{1}{9} \]
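Both conversions (probability to odds, and odds to probability) can be sketched with exact fractions. The function names `odds_in_favor` and `prob_from_odds` are hypothetical helpers, not notation from the slides.

```python
from fractions import Fraction

def odds_in_favor(p):
    """Return (a, b) such that the odds in favor are a to b, given P(E) = p."""
    ratio = Fraction(p) / (1 - Fraction(p))
    return ratio.numerator, ratio.denominator

def prob_from_odds(a, b):
    """Return P(E) when the odds in favor of E are a to b."""
    return Fraction(a, a + b)

print(odds_in_favor(Fraction(65, 100)))  # rain example: (13, 7)
print(prob_from_odds(2, 16))             # Gluebound example: 1/9
```

Passing the probability as a `Fraction` rather than a float keeps the ratio in lowest terms without rounding noise.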
Variance
Variance is a measure of the spread of the data. The
larger the variance, the larger the spread.
Suppose a random variable has the probability
distribution

x          x1    x2    x3    …    xn
P(X = x)   p1    p2    p3    …    pn

and expected value E(X) = µ.
The variance of a random variable X is defined by:
\[ \mathrm{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]
Standard Deviation
Standard deviation is a measure of the spread of the
data using the same units as the data.
The standard deviation of a random variable X is
defined by:
\[ \sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2} \]
where each xi denotes the value assumed by the
random variable X and pi is the probability associated
with xi.
Ex. The quiz scores for a particular student are
given below:
22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18
Find the variance and standard deviation.

Value         12     18     20     22     24     25
Frequency     1      2      4      1      2      3
Probability   0.08   0.15   0.31   0.08   0.15   0.23

The expected value µ ≈ 21
\[ \mathrm{Var}(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2 \]
\[ \sigma = \sqrt{\mathrm{Var}(X)} \]
\[ \mathrm{Var}(X) = 0.08(12 - 21)^2 + 0.15(18 - 21)^2 + 0.31(20 - 21)^2 + 0.08(22 - 21)^2 + 0.15(24 - 21)^2 + 0.23(25 - 21)^2 \]
\[ \mathrm{Var}(X) = 13.25 \]
\[ \sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{13.25} \approx 3.64 \]
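The computation above can be reproduced with a short Python sketch. The helper `variance` is an illustrative name. The value 13.25 depends on the probabilities being rounded to two decimals; using the exact frequencies k/13 instead gives a slightly smaller figure, shown for comparison.

```python
from math import sqrt

def variance(dist, mu):
    """Var(X) = sum of p * (x - mu)^2 over a dict mapping x -> P(X = x)."""
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

# Probabilities rounded to two decimals, as in the worked example
rounded = {12: 0.08, 18: 0.15, 20: 0.31, 22: 0.08, 24: 0.15, 25: 0.23}
var_rounded = variance(rounded, 21)
print(round(var_rounded, 2), round(sqrt(var_rounded), 2))  # 13.25 3.64

# Exact probabilities k/13 from the frequency row, for comparison
freq = {12: 1, 18: 2, 20: 4, 22: 1, 24: 2, 25: 3}
n = sum(freq.values())
exact = {x: k / n for x, k in freq.items()}
var_exact = variance(exact, 21)
print(round(var_exact, 2))  # 13.08
```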
Chebychev’s Inequality
Let X be a random variable,
\[ P(\mu - k\sigma \le X \le \mu + k\sigma) \ge 1 - \frac{1}{k^2} \]
where µ = the expected value
σ = the standard deviation
Ex. A probability distribution has a mean of 40
and a standard deviation of 12. Use Chebychev’s
inequality to estimate the probability that an
outcome of the experiment lies between 10 and 70.
We need a lower bound for P(10 ≤ X ≤ 70).
Notice that µ − kσ = 10 and µ + kσ = 70, so 12k = 30 and k = 2.5.
\[ P(10 \le X \le 70) \ge 1 - \frac{1}{k^2} = 1 - \frac{1}{(5/2)^2} = \frac{21}{25} \]
So the probability is at least 84%.
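The bound can be evaluated mechanically once k is recovered from the interval. The sketch below assumes the interval is symmetric about the mean; the function name `chebychev_lower_bound` is hypothetical.

```python
def chebychev_lower_bound(mu, sigma, lo, hi):
    """Lower bound 1 - 1/k^2 for P(lo <= X <= hi), where lo = mu - k*sigma
    and hi = mu + k*sigma (the interval must be symmetric about mu)."""
    k = (hi - mu) / sigma
    if abs((mu - lo) / sigma - k) > 1e-12:
        raise ValueError("interval is not symmetric about the mean")
    return 1 - 1 / k ** 2

# mu = 40, sigma = 12, interval 10 to 70 gives k = 2.5
print(round(chebychev_lower_bound(40, 12, 10, 70), 2))  # 0.84
```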
Binomial (Bernoulli) Trials
A binomial experiment has the properties:
1. The number of trials in the experiment is fixed.
2. The only outcomes are “success” and “failure.”
3. The probability of success in each trial is the
same.
4. The trials are independent of each other.
Probabilities in Bernoulli Trials
In a binomial experiment in which the probability
of success in any trial is p, the probability of
exactly x successes in n independent trials is given
by
\[ C(n, x)\, p^x q^{n - x} \]
where q = 1 − p is the probability of failure.
Ex. A card is drawn from a standard 52-card
deck. If drawing a club is considered a success,
find the probability of
a. exactly one success in 4 draws (with replacement).
\[ C(n, x)\, p^x q^{n - x} \quad \text{where } p = \frac{1}{4},\ q = 1 - \frac{1}{4} = \frac{3}{4} \]
\[ C(4, 1) \left(\frac{1}{4}\right)^1 \left(\frac{3}{4}\right)^3 \approx 0.422 \]
b. no successes in 5 draws (with replacement).
\[ C(5, 0) \left(\frac{1}{4}\right)^0 \left(\frac{3}{4}\right)^5 \approx 0.237 \]
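The formula C(n, x) p^x q^(n − x) translates directly into Python using `math.comb` for C(n, x). The function name `binom_pmf` is an assumption made for this sketch.

```python
from math import comb

def binom_pmf(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

print(round(binom_pmf(4, 1, 0.25), 3))  # a. one club in 4 draws -> 0.422
print(round(binom_pmf(5, 0, 0.25), 3))  # b. no clubs in 5 draws -> 0.237
```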
Mean, Variance, and Standard Deviation of a Random Variable X
If X is a binomial random variable associated with
a binomial experiment consisting of n trials with
probability of success p and probability of failure
q, then the mean, variance, and standard deviation
of X are
\[ \mu = E(X) = np \]
\[ \mathrm{Var}(X) = npq \]
\[ \sigma_X = \sqrt{npq} \]
Ex. 5 cards are drawn, with replacement, from a
standard 52-card deck. If drawing a club is
considered a success, find the mean, variance, and
standard deviation of X (where X is the number of
successes).
\[ p = \frac{1}{4}, \quad q = 1 - \frac{1}{4} = \frac{3}{4} \]
\[ \mu = np = 5\left(\frac{1}{4}\right) = 1.25 \]
\[ \mathrm{Var}(X) = npq = 5\left(\frac{1}{4}\right)\left(\frac{3}{4}\right) = 0.9375 \]
\[ \sigma_X = \sqrt{npq} = \sqrt{0.9375} \approx 0.968 \]
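The three binomial summaries follow directly from n and p. A minimal sketch, with `binomial_stats` as an illustrative helper name:

```python
from math import sqrt

def binomial_stats(n, p):
    """Return (mean, variance, standard deviation) of a binomial variable."""
    q = 1 - p
    var = n * p * q
    return n * p, var, sqrt(var)

mu, var, sigma = binomial_stats(5, 0.25)
print(mu, var, round(sigma, 3))  # 1.25 0.9375 0.968
```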
Ex. If the probability of a student successfully
passing this course (C or better) is 0.82, find the
probability that, given 8 students,
a. all 8 pass.
\[ C(8, 8)(0.82)^8 (0.18)^0 \approx 0.2044 \]
b. none pass.
\[ C(8, 0)(0.82)^0 (0.18)^8 \approx 0.0000011 \]
c. at least 6 pass, so 6, 7, or 8 successes.
\[ C(8, 6)(0.82)^6 (0.18)^2 + C(8, 7)(0.82)^7 (0.18)^1 + C(8, 8)(0.82)^8 (0.18)^0 \]
\[ \approx 0.2758 + 0.3590 + 0.2044 = 0.8392 \]
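An "at least" probability is a sum of single-count binomial probabilities. The sketch below uses hypothetical helpers `binom_pmf` and `prob_at_least` to reproduce parts a through c.

```python
from math import comb

def binom_pmf(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def prob_at_least(n, k, p):
    """Probability of k or more successes: add the terms for k, k+1, ..., n."""
    return sum(binom_pmf(n, x, p) for x in range(k, n + 1))

print(round(binom_pmf(8, 8, 0.82), 4))      # a. all 8 pass  -> 0.2044
print(binom_pmf(8, 0, 0.82))                # b. none pass, about 0.0000011
print(round(prob_at_least(8, 6, 0.82), 4))  # c. at least 6  -> 0.8392
```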
Probability Density Function
A probability density function, f, defines a continuous
probability distribution. Its domain coincides with the interval of
values taken on by the random variable associated with an
experiment, and it has two properties:
1. f(x) is nonnegative for all values of x.
2. The area of the region between the graph of f and the x-axis
is equal to 1.
[Figure: the curve y = f(x), with the region under it labeled Area = 1.]
Probability Density Function
P(a < X < b) is given by the area of the shaded
region.
[Figure: the curve y = f(x), with the region under it between x = a and x = b shaded.]
Normal Distributions
Normal distributions are a special class of
continuous probability density functions. Many
phenomena have probability density functions that
are normal.
The graph of this distribution is called a normal
curve.
The probability density function associated with the
normal curve:
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(1/2)[(x - \mu)/\sigma]^2} \]
Normal Curve Properties
1. The peak is at x = µ .
2. There is symmetry with respect to the line x = µ .
3. The curve lies above the x-axis and approaches it as x extends indefinitely in either direction.
4. The area under the curve is 1.
5. 68.27% of the area lies within 1 standard
deviation of the mean, 95.45% within 2, and
99.73% within 3 (see curve on next slide).
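The percentages in property 5 can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) on the standard normal curve; the same percentages hold for any µ and σ. The helper name `area_within` is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))  # 68.27, 95.45, 99.73
```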
Normal Curve
Percentage of area within given standard deviations.
[Figure: normal curve showing 68.27% of the area within 1 standard deviation of the mean, 95.45% within 2, and 99.73% within 3.]
[Figure: normal curves with the same standard deviation but different means.]
[Figure: normal curves with the same mean but different standard deviations.]
Standard Normal Distribution
Denoted by the variable Z, with
µ = 0 and σ = 1.
Ex. Let Z be the standard normal variable. Find
(from table)
a. P(Z < 0.85)
This is the area to the left of 0.85:
0.8023
b. P(Z > 1.32)
By symmetry, this area equals P(Z < –1.32):
0.0934
c. P(–2.1 < Z < 1.78)
Find the area to the left of 1.78 then subtract the
area to the left of –2.1.
P(Z < 1.78) – P(Z < –2.1)
0.9625 – 0.0179
= 0.9446
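The table lookups in parts a through c can be cross-checked with the cumulative distribution function of `statistics.NormalDist` (Python 3.8 or later). This is a sketch; the name `Z` simply mirrors the slide's notation.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# a. area to the left of 0.85
print(round(Z.cdf(0.85), 4))                # 0.8023
# b. area to the right of 1.32, equal to the area to the left of -1.32
print(round(1 - Z.cdf(1.32), 4))            # 0.0934
# c. area between -2.1 and 1.78
print(round(Z.cdf(1.78) - Z.cdf(-2.1), 4))  # 0.9446
```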
Ex. Let Z be the standard normal variable. Find z if
a. P(Z < z) = 0.9278.
Look at the table and find an entry
= 0.9278 then read back to find
z = 1.46.
b. P(–z < Z < z) = 0.8132
P(–z < Z < z) = 2P(0 < Z < z)
= 2[P(Z < z) – ½]
= 2P(Z < z) – 1 = 0.8132
P(Z < z) = 0.9066
z = 1.32
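Reading the table "backwards" corresponds to the inverse cumulative distribution function. A minimal sketch using `NormalDist.inv_cdf` from the standard library; the helper name `z_for_central_area` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# a. find z with P(Z < z) = 0.9278
print(round(Z.inv_cdf(0.9278), 2))  # 1.46

def z_for_central_area(area):
    """Find z with P(-z < Z < z) = area, using 2 P(Z < z) - 1 = area."""
    return Z.inv_cdf((1 + area) / 2)

# b. find z with P(-z < Z < z) = 0.8132
print(round(z_for_central_area(0.8132), 2))  # 1.32
```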
Transforming Other Normal Distributions into a Standard Normal Distribution
Given X, a normal random variable
with mean µ and standard deviation σ,
we can transform X to Z using:
\[ Z = \frac{X - \mu}{\sigma} \]
Ex. Let X be a normal random variable with
µ = 80 and σ = 20. Find
a. P(X < 65)
b. P(X > 60)
a. P(X < 65)
Convert to standard normal
\[ P(X < 65) = P\left(Z < \frac{65 - 80}{20}\right) = P(Z < -0.75) = 0.2266 \]
b. P(X > 60)
Convert to standard normal
\[ P(X > 60) = P\left(Z > \frac{60 - 80}{20}\right) = P(Z > -1) = 1 - P(Z < -1) = 1 - 0.1587 = 0.8413 \]
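The standardization step can be sketched in Python: convert the X value to a z-score, then use the standard normal cumulative distribution. The helper `z_score` is an illustrative name, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Transform a value of X into the corresponding value of Z."""
    return (x - mu) / sigma

Z = NormalDist()  # standard normal
mu, sigma = 80, 20

# a. P(X < 65) = P(Z < -0.75)
print(z_score(65, mu, sigma), round(Z.cdf(z_score(65, mu, sigma)), 4))      # -0.75 0.2266
# b. P(X > 60) = 1 - P(Z < -1)
print(z_score(60, mu, sigma), round(1 - Z.cdf(z_score(60, mu, sigma)), 4))  # -1.0 0.8413
```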
Ex. A particular rash has shown up at an elementary
school. It has been determined that the length of time
that the rash will last is normally distributed with
µ = 6 days and σ = 1.5 days.
a. Find the probability that for a student selected at
random, the rash will last for less than 3 days.
b. Find the probability that for a student selected at
random, the rash will last for between 3.75 and 9
days.
a. Find the probability that for a student selected at
random, the rash will last for less than 3 days.
\[ P(X < 3) = P\left(Z < \frac{3 - 6}{1.5}\right) = P(Z < -2) = 0.0228 \]
b. Find the probability that for a student selected at
random, the rash will last for between 3.75 and 9
days.
\[ P(3.75 < X < 9) = P\left(\frac{3.75 - 6}{1.5} < Z < \frac{9 - 6}{1.5}\right) = P(-1.5 < Z < 2) \]
\[ = P(Z < 2) - P(Z < -1.5) = 0.9772 - 0.0668 = 0.9104 \]
Approximating Binomial Distributions
Suppose we are given a binomial distribution
associated with a binomial experiment involving n
trials, each with probability of success p and
failure q. If n is large and p is not close to 1 or 0,
the binomial distribution may be approximated by
a normal distribution with:
\[ \mu = np \]
\[ \sigma = \sqrt{npq} \]
Ex 1. PAR Bearings manufactures ball bearings
packaged in lots of 100 each. The company’s quality-control
department has determined that 2% of the ball
bearings manufactured do not meet the specifications
imposed by a buyer. Find the average number of ball
bearings per package that fail to meet the
specifications imposed by the buyer.
The experiment under consideration is binomial. The
average number of ball bearings per package that fail
to meet the specifications is therefore given by the
expected value of the associated binomial random
variable.
\[ \mu = E(X) = np = 100(0.02) = 2 \]
Ex 2. At a particular small college the pass rate of
Intermediate Algebra is 72%. If 500 students enroll
in a semester, determine the probability that
a. at most 375 students pass.
\[ \mu = np = 500(0.72) = 360 \]
\[ \sigma = \sqrt{npq} = \sqrt{500(0.72)(0.28)} \approx 10 \]
Approximate X by a continuous normal variable Y, then convert to Z:
\[ P(X \le 375) \approx P(Y < 375.5) = P\left(Z < \frac{375.5 - 360}{10}\right) = P(Z < 1.55) = 0.9394 \]
b. between 355 and 390, inclusive, of the students
pass.
\[ P(355 \le X \le 390) \approx P(354.5 < Y < 390.5) = P\left(\frac{354.5 - 360}{10} < Z < \frac{390.5 - 360}{10}\right) \]
\[ = P(-0.55 < Z < 3.05) = P(Z < 3.05) - P(Z < -0.55) = 0.9989 - 0.2912 = 0.7077 \]
or about 71%
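The normal approximation in parts a and b can be sketched as follows. The helper `approx_between` is a hypothetical name; the half-unit widening of the interval is the same adjustment the worked solution makes when it passes from X to the continuous variable Y. The sketch uses the rounded value σ = 10 from the solution, so results computed with the unrounded σ (about 10.04) will differ slightly.

```python
from math import sqrt
from statistics import NormalDist

n, p = 500, 0.72
mu = n * p                       # 360
sigma = sqrt(n * p * (1 - p))    # about 10.04; the worked solution rounds to 10

def approx_between(lo, hi, mu, sigma):
    """Normal approximation to P(lo <= X <= hi) for an integer-valued X,
    widening the interval by 0.5 on each side."""
    y = NormalDist(mu, sigma)
    return y.cdf(hi + 0.5) - y.cdf(lo - 0.5)

# a. at most 375 pass: P(Y < 375.5) with sigma rounded to 10
print(round(NormalDist(mu, 10).cdf(375.5), 4))     # 0.9394
# b. between 355 and 390 inclusive, with sigma rounded to 10
print(round(approx_between(355, 390, mu, 10), 4))  # 0.7077
```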