Week 5 - Seminar

5 Normal Probability Distributions
Elementary Statistics
Larson
Farber
Properties of a Normal Distribution
• The mean, median, and mode are equal.
• The curve is bell-shaped and symmetric about the mean.
• The total area under the curve is 1 (100%).
Properties of a Normal Distribution
[Figure: normal curve with an inflection point marked on each side of the mean]
• As the curve extends farther and farther away from the
mean, it gets closer and closer to the x-axis but never
touches it.
• The points at which the curvature changes are called
inflection points. The graph curves downward between the
inflection points and curves upward past the inflection
points to the left and to the right.
Means and Standard Deviations
[Figure: curves with different means, same standard deviation]
[Figure: curves with different means, different standard deviations]
Empirical Rule
• About 68% of the area lies within 1 standard deviation of the mean.
• About 95% of the area lies within 2 standard deviations of the mean.
• About 99.7% of the area lies within 3 standard deviations of the mean.
Determining Intervals
An instruction manual claims that the assembly time for a
product is normally distributed with a mean of 4.2 hours
and a standard deviation of 0.3 hour. Determine the
interval in which 95% of the assembly times fall.
[Figure: normal curve; x-axis marked at 3.3, 3.6, 3.9, 4.2, 4.5, 4.8, 5.1]
About 95% of the data fall within 2 standard deviations of the mean.
4.2 – 2(0.3) = 3.6 and 4.2 + 2(0.3) = 4.8.
About 95% of the assembly times will be between 3.6 and 4.8 hours.
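The Empirical Rule percentages and the assembly-time interval can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later (for statistics.NormalDist); the function names area_within and interval_within are illustrative and do not come from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def interval_within(mean, sd, k):
    """Endpoints of the interval mean - k*sd to mean + k*sd."""
    return (mean - k * sd, mean + k * sd)


# Areas within 1, 2, and 3 standard deviations (about 68%, 95%, 99.7%).
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))

# Assembly-time example: mean 4.2 hours, standard deviation 0.3 hour.
low, high = interval_within(4.2, 0.3, 2)
print(round(low, 1), round(high, 1))
```

The computed areas are slightly different from the rounded 68%, 95%, and 99.7% figures, which is why the rule is stated with "about".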
The Standard Normal Distribution
The Standard Score
The standard score, or z-score, represents the number of
standard deviations a random variable x falls from the
mean:
z = (value – mean) / (standard deviation) = (x – μ)/σ
The test scores for a civil service exam are normally
distributed with a mean of 152 and a standard deviation of
7. Find the z-score for a person with a score of:
(a) 161
(b) 148
(c) 152
(a) z = (161 – 152)/7 ≈ 1.29
(b) z = (148 – 152)/7 ≈ –0.57
(c) z = (152 – 152)/7 = 0
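The z-score calculation for the civil service exam can be sketched in a few lines of Python. The helper name z_score is an assumption made for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Civil service exam: mean 152, standard deviation 7.
for score in (161, 148, 152):
    print(score, round(z_score(score, 152, 7), 2))
```

A positive result means the score is above the mean, a negative result means it is below, and zero means it equals the mean.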
The Standard Normal Distribution
The standard normal distribution has a mean of 0 and a
standard deviation of 1.
Using z-scores, any normal distribution can be
transformed into the standard normal distribution.
[Figure: standard normal curve; z-axis from –4 to 4]
Cumulative Areas
The total area under the curve is one.
[Figure: standard normal curve; z-axis from –3 to 3]
The cumulative area for z = 0 is 0.5000, indicating
that the probability of getting a z-value of 0 or less is 0.5.
Cumulative Areas - Finding
Find the cumulative area for a z-score of –1.25.
[Figure: standard normal curve with the area 0.1056 shaded to the left of z = –1.25]
Ways to find a cumulative area:
• Table in Appendix
• Table on fold-out card
• Excel
• Internet calculator
Using the table …
Find the cumulative area that corresponds to a z-score of -0.24.
Solution:
Find -0.2 in the left-hand column.
Move across the row to the column under 0.04.
The area to the left of z = -0.24 is 0.4052.
Using Excel
[Screenshot: Excel normal-distribution function with its arguments labeled: x value of interest, mean, standard deviation]
Online: Rossman-Chance
http://www.rossmanchance.com/applets/NormalCalcs/NormalCalculations.html
Online: SurfStat
http://surfstat.anu.edu.au/surfstat-home/tables/normal.php
Cumulative Areas
Find the cumulative area for a z-score of –1.25.
[Figure: standard normal curve with the area 0.1056 shaded to the left of z = –1.25]
Read down the z column on the left to –1.2 and across to
the column under .05. The value in the cell is 0.1056, the
cumulative area.
The probability that z is at most –1.25 is 0.1056.
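As an alternative to the printed table, a cumulative area can be computed directly. This is a minimal sketch, assuming Python 3.8 or later; cumulative_area is an illustrative name, and rounding to four decimals is chosen only to match the table's precision.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def cumulative_area(z):
    """Area to the left of z, rounded to 4 decimals like the table."""
    return round(std_normal.cdf(z), 4)


# The z-scores looked up in the table on the slides above.
for z in (-1.25, -0.24, 0):
    print(z, cumulative_area(z))
```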
“Less than”
To find the probability that z is less than a given value,
read the cumulative area in the table corresponding to
that z-score.
Find P(z < –1.45).
[Figure: standard normal curve with the area to the left of z = –1.45 shaded]
Read down the z-column to –1.4 and across to .05. The
cumulative area is 0.0735.
P(z < –1.45) = 0.0735
“Greater than”
To find the probability that z is greater than a given
value, subtract the cumulative area in the table
from 1.
Find P(z > –1.24).
[Figure: standard normal curve; area 0.1075 to the left of z = –1.24 and area 0.8925 to the right]
The cumulative area (area to the left) is 0.1075. So the area
to the right is 1 – 0.1075 = 0.8925.
P(z > –1.24) = 0.8925
“Between”
To find the probability that z is between two given values, find the
cumulative areas for each and subtract the smaller area from
the larger.
Find P(–1.25 < z < 1.17).
[Figure: standard normal curve with the area between z = –1.25 and z = 1.17 shaded]
1. P(z < 1.17) = 0.8790
2. P(z < –1.25) = 0.1056
3. P(–1.25 < z < 1.17) = 0.8790 – 0.1056 = 0.7734
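The three cases ("less than", "greater than", "between") can each be written as a one-line function of the cumulative area. The sketch below assumes Python 3.8 or later; the names p_less, p_greater, and p_between are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()


def p_less(z):
    """P(Z < z): the cumulative area itself."""
    return std_normal.cdf(z)


def p_greater(z):
    """P(Z > z): one minus the cumulative area."""
    return 1 - std_normal.cdf(z)


def p_between(z_low, z_high):
    """P(z_low < Z < z_high): larger cumulative area minus the smaller."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


print(p_less(-1.45))
print(p_greater(-1.24))
print(p_between(-1.25, 1.17))
```

Because the table rounds each area to four decimals before subtracting, a computed value can differ from the table-based answer in the last digit.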
Summary
• To find the probability that z is less than a given value, read the
corresponding cumulative area.
• To find the probability that z is greater than a given value, subtract the
cumulative area in the table from 1.
• To find the probability that z is between two given values, find the
cumulative areas for each and subtract the smaller area from the
larger.
Normal Distributions: Finding Probabilities
Probability and Normal Distributions
• If a random variable x is normally distributed, you can
find the probability that x will fall in a given interval by
calculating the area under the normal curve for that
interval.
[Figure: normal curve with μ = 500 and σ = 100; the shaded area to the left of x = 600 is P(x < 600)]
Remember that the total area under the curve is 1.0
(equal to 100%).
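Probability and Normal Distributions
Normal Distribution: μ = 500, σ = 100
Standard Normal Distribution: μ = 0, σ = 1
z = (x – μ)/σ = (600 – 500)/100 = 1
[Figure: the area to the left of x = 600 under the normal curve and the area to the left of z = 1 under the standard normal curve are the same]
P(x < 600) = P(z < 1)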
Example
A survey indicates that people use their
computers an average of 2.4 years before
upgrading to a new machine. The standard
deviation is 0.5 year. A computer owner is
selected at random. Find the probability that
he or she will use it for fewer than 2 years
before upgrading. Assume that the variable
x is normally distributed.
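Solution
Normal Distribution: μ = 2.4, σ = 0.5
Standard Normal Distribution: μ = 0, σ = 1
z = (x – μ)/σ = (2 – 2.4)/0.5 = –0.80
[Figure: the area to the left of x = 2 under the normal curve equals the area 0.2119 to the left of z = –0.80 under the standard normal curve]
P(x < 2) = P(z < –0.80) = 0.2119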
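The two-step method used here (standardize x, then read the cumulative area for z) can be sketched as follows, assuming Python 3.8 or later. The function name prob_less_than is illustrative. The sketch also computes the same probability directly on the original scale to show that the two areas agree.

```python
from statistics import NormalDist


def prob_less_than(x, mean, sd):
    """Standardize x, then return (z, area to the left of z)."""
    z = (x - mean) / sd
    return z, NormalDist().cdf(z)


# Computer-upgrade example: mean 2.4 years, standard deviation 0.5 year.
z, p = prob_less_than(2, 2.4, 0.5)

# Same area computed without standardizing first.
direct = NormalDist(2.4, 0.5).cdf(2)

print(round(z, 2), round(p, 4), round(direct, 4))
```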
Example:
A survey indicates that for each trip to the
supermarket, a shopper spends an average
of 45 minutes with a standard deviation of
12 minutes in the store. The length of time
spent in the store is normally distributed and
is represented by the variable x. A shopper
enters the store. Find the probability that the
shopper will be in the store for between 24
and 54 minutes.
Solution: Finding Probabilities for Normal Distributions
Normal Distribution: μ = 45, σ = 12
Standard Normal Distribution: μ = 0, σ = 1
z1 = (x – μ)/σ = (24 – 45)/12 = –1.75
z2 = (x – μ)/σ = (54 – 45)/12 = 0.75
[Figure: the area between x = 24 and x = 54 under the normal curve equals the area between z = –1.75 and z = 0.75; the cumulative areas are 0.0401 and 0.7734]
P(24 < x < 54) = P(–1.75 < z < 0.75)
= 0.7734 – 0.0401 = 0.7333
Example:
Find the probability that the shopper will be
in the store more than 39 minutes. (Recall μ
= 45 minutes and σ = 12 minutes)
Solution: Finding Probabilities for Normal Distributions
Normal Distribution: μ = 45, σ = 12
Standard Normal Distribution: μ = 0, σ = 1
z = (x – μ)/σ = (39 – 45)/12 = –0.50
[Figure: the area to the right of x = 39 equals the area to the right of z = –0.50; the cumulative area to the left is 0.3085]
P(x > 39) = P(z > –0.50) = 1 – 0.3085 = 0.6915
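Both supermarket probabilities can be checked without a table. This is a minimal sketch, assuming Python 3.8 or later; the variable names are illustrative.

```python
from statistics import NormalDist

# Time in the store: mean 45 minutes, standard deviation 12 minutes.
store_time = NormalDist(mu=45, sigma=12)

# P(24 < x < 54): difference of the two cumulative areas.
p_between = store_time.cdf(54) - store_time.cdf(24)

# P(x > 39): one minus the cumulative area at 39.
p_more_than_39 = 1 - store_time.cdf(39)

print(round(p_between, 4))
print(round(p_more_than_39, 4))
```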
Section 5.4 Normal Distributions: Finding Values
From Areas to z-Scores
Find the z-score corresponding to a cumulative area of 0.9803.
[Figure: standard normal curve with cumulative area 0.9803 shaded to the left of z]
Locate 0.9803 in the area portion of the table. Read the
values at the beginning of the corresponding row and at
the top of the column. The z-score is 2.06.
z = 2.06 corresponds roughly to the 98th percentile.
Finding z-Scores from Areas
Find the z-score corresponding to the 90th percentile.
[Figure: standard normal curve with cumulative area .90 shaded to the left of z]
The closest table area is .8997. The row heading is
1.2 and the column heading is .08. This corresponds to
z = 1.28.
A z-score of 1.28 corresponds to the 90th percentile.
Finding z-Scores from Areas
Find the z-score with an area of .60 falling to its right.
[Figure: standard normal curve with area .40 to the left of z and area .60 to its right]
With .60 to the right, the cumulative area is
.40. The closest area is .4013. The row
heading is –0.2 and the column heading is .05.
The z-score is –0.25.
A z-score of –0.25 has an area of .60 to its right. It
also corresponds to the 40th percentile.
Finding z-Scores from Areas
Find the z-score such that 45% of the area under the
curve falls between –z and z.
[Figure: standard normal curve with area .45 between –z and z and area .275 in each tail]
The area remaining in the tails is .55. Half this area is
in each tail, so .55/2 = .275 is the cumulative area
for the negative z value and .275 + .45 = .725 is the
cumulative area for the positive z. The closest table
area to .275 is .2743, which gives z = –0.60. The positive
z-score is 0.60.
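Going from an area back to a z-score is the inverse of the cumulative-area lookup. The sketch below assumes Python 3.8 or later and uses NormalDist.inv_cdf; the three helper names are illustrative and not from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()


def z_for_cumulative_area(area):
    """z-score with the given area to its left."""
    return std_normal.inv_cdf(area)


def z_for_area_to_right(area):
    """z-score with the given area to its right."""
    return std_normal.inv_cdf(1 - area)


def z_for_central_area(area):
    """Positive z such that the given area lies between -z and z."""
    tail = (1 - area) / 2
    return std_normal.inv_cdf(1 - tail)


print(round(z_for_cumulative_area(0.9803), 2))
print(round(z_for_cumulative_area(0.90), 2))
print(round(z_for_area_to_right(0.60), 2))
print(round(z_for_central_area(0.45), 2))
```

The table method picks the closest tabulated area, so it agrees with these values only after rounding to two decimals.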
From z-Scores to Raw Scores
To find the data value x when given a standard score z, use x = μ + zσ.
The test scores for a civil service exam are normally
distributed with a mean of 152 and a standard deviation of 7.
Find the test score for a person with a standard z-score of:
(a) 2.33
(b) –1.75
(c) 0
(a) x = 152 + (2.33)(7) = 168.31
(b) x = 152 + (–1.75)(7) = 139.75
(c) x = 152 + (0)(7) = 152
Finding Percentiles or Cut-off Values
Monthly utility bills in a certain city are normally distributed
with a mean of $100 and a standard deviation of $12. What is
the smallest utility bill that can be in the top 10% of the bills?
[Figure: normal curve with 90% of the area to the left of the cut-off value and 10% to the right]
Find the cumulative area in the table that is closest to
0.9000 (the 90th percentile). The area 0.8997 corresponds
to a z-score of 1.28.
To find the corresponding x-value, use
x = μ + zσ = 100 + 1.28(12) = 115.36.
$115.36 is the smallest value for the top 10%.
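Converting a z-score back to a data value, and finding a cut-off value from a percentile, can be sketched as follows. This assumes Python 3.8 or later; raw_score is an illustrative name. The unrounded cut-off is shown next to the table-based one because the table rounds z to 1.28.

```python
from statistics import NormalDist


def raw_score(z, mean, sd):
    """Data value x for a given z-score: x = mean + z * sd."""
    return mean + z * sd


# Civil service exam: mean 152, standard deviation 7.
for z in (2.33, -1.75, 0):
    print(z, round(raw_score(z, 152, 7), 2))

# Utility bills: mean $100, standard deviation $12, top 10% cut-off.
table_cutoff = raw_score(1.28, 100, 12)           # z taken from the table
exact_cutoff = NormalDist(100, 12).inv_cdf(0.90)  # z not rounded
print(round(table_cutoff, 2), round(exact_cutoff, 2))
```

The two cut-off values differ by about two cents, which comes only from rounding z to two decimals in the table.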
Sampling Distributions & The Central Limit Theorem
Sampling Distributions
A sampling distribution is the probability distribution of a
sample statistic that is formed when samples of size n are
repeatedly taken from a population. If the sample statistic is
the sample mean, then the distribution is the sampling
distribution of sample means.
Sampling Distribution of x-bar
[Figure: population with μ, σ; Samples 1 through 5, each with its sample mean x̄1, x̄2, x̄3, x̄4, x̄5]
The sampling distribution consists of the values
of the sample means,
x̄1, x̄2, x̄3, x̄4, x̄5, ...
The Central Limit Theorem
If samples of size n >= 30 are taken from a population with
any type of distribution that has mean μ
and standard deviation σ,
the sample means will have an approximately normal distribution
with mean μ_x̄ = μ
and standard deviation σ_x̄ = σ/√n.
The Central Limit Theorem
If samples of any size n are taken from a population with a
normal distribution with mean μ and standard
deviation σ,
the distribution of the sample means will be normal
with mean μ_x̄ = μ
and standard deviation σ_x̄ = σ/√n.
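A small simulation can illustrate the theorem. The sketch below is an assumption-laden example, not part of the slides: it uses an exponential population (skewed, with mean 1 and standard deviation 1), a sample size of 30, 5000 repeated samples, and a fixed random seed so the run is repeatable.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 30              # size of each sample
num_samples = 5000  # number of samples drawn

# Assumed population: exponential with rate 1 (mean 1, standard deviation 1).
pop_mean, pop_sd = 1.0, 1.0

# Draw many samples and record the mean of each one.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
expected_sd = pop_sd / sqrt(n)  # sigma / sqrt(n) from the theorem

print(round(mean_of_means, 3), round(sd_of_means, 3), round(expected_sd, 3))
```

The mean of the sample means should land close to the population mean, and their standard deviation close to σ/√n, even though the population itself is not normal.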
Application
The mean height of American men (ages 20-29) is μ = 69.2
inches. Random samples of 60 such men are selected. Find the mean and
standard deviation (standard error) of the sampling distribution.
The distribution of means of samples of size 60
will be normal.
Mean: μ_x̄ = μ = 69.2
Standard deviation: σ_x̄ = σ/√n = σ/√60
Interpreting the Central Limit Theorem
The mean height of American men (ages 20-29) is μ =
69.2”. If a random sample of 60 men in this age group
is selected, what is the probability the mean height for
the sample is greater than 70”? Assume the standard
deviation is σ = 2.9”.
Since n > 30, the sampling distribution of x̄ will be normal, with
mean μ_x̄ = μ = 69.2
standard deviation σ_x̄ = σ/√n = 2.9/√60 ≈ 0.3744
Find the z-score for a sample mean of 70:
z = (x̄ – μ_x̄)/σ_x̄ = (70 – 69.2)/0.3744 ≈ 2.14
[Figure: standard normal curve with the area to the right of z = 2.14 shaded]
P(x̄ > 70) = P(z > 2.14) = 1 – 0.9838 = 0.0162
There is a 0.0162 probability that a sample of 60
men will have a mean height greater than 70”.
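The height calculation can be sketched in code by building the sampling distribution of the mean first. This assumes Python 3.8 or later; sampling_distribution is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def sampling_distribution(pop_mean, pop_sd, n):
    """Normal model for the sample mean: mean mu, standard error sigma / sqrt(n)."""
    return NormalDist(pop_mean, pop_sd / sqrt(n))


# Heights: population mean 69.2 in., standard deviation 2.9 in., n = 60.
heights = sampling_distribution(69.2, 2.9, 60)

z = (70 - heights.mean) / heights.stdev  # z-score of a sample mean of 70
p_greater = 1 - heights.cdf(70)          # P(sample mean > 70)

print(round(heights.stdev, 4), round(z, 2), round(p_greater, 4))
```

The probability computed this way uses the unrounded z-score, so it can differ from the table-based 0.0162 in the fourth decimal place.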
Application Central Limit Theorem
During a certain week the mean price of gasoline in California was
μ = $1.164 per gallon. What is the probability that the mean price for
a sample of 38 gas stations in California is between $1.169 and
$1.179? Assume the standard deviation is σ = $0.049.
Since n > 30, the sampling distribution of x̄
will be normal, with
mean μ_x̄ = μ = 1.164
standard deviation σ_x̄ = σ/√n = 0.049/√38 ≈ 0.0079
Calculate the z-scores for sample means of $1.169 and
$1.179:
z1 = (1.169 – 1.164)/0.0079 ≈ 0.63
z2 = (1.179 – 1.164)/0.0079 ≈ 1.90
[Figure: standard normal curve with the area between z = 0.63 and z = 1.90 shaded]
P(0.63 < z < 1.90)
= 0.9713 – 0.7357
= 0.2356
The probability is 0.2356 that the mean for the
sample is between $1.169 and $1.179.
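The gasoline example can be checked the same way. This is a minimal sketch, assuming Python 3.8 or later; variable names are illustrative. It keeps the standard error unrounded, whereas the slide rounds it to 0.0079 before computing z, so the results agree only to about three decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.164, 0.049, 38  # values given in the problem
std_error = sigma / sqrt(n)      # standard deviation of the sample mean

sample_mean_dist = NormalDist(mu, std_error)

# P(1.169 < sample mean < 1.179): difference of cumulative areas.
p_between = sample_mean_dist.cdf(1.179) - sample_mean_dist.cdf(1.169)

print(round(std_error, 4), round(p_between, 4))
```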