6.2: The normal distribution
Normal Distributions
• Characterized by a symmetric, bell-shaped (mound-shaped) curve.
  ◦ Examples: heights, weights, standardized test scores
• A particular normal distribution is determined by
  ◦ The mean, µ
  ◦ The standard deviation, σ
Normal Distribution and Deviation from the Mean
Example: Adult Heights
• 95% of female adult heights are between 58 and 72 inches
• 95% of male adult heights are between 62 and 78 inches
Z-scores (revisited)
• The multiples 1, 2, and 3, or in general the number of standard
  deviations from the mean, are denoted by z.
• For a particular observation, x, its z-score is computed by
  z = (x - µ) / σ
• For each fixed number z, the probability within z standard
  deviations of the mean is the area under the normal curve
  between µ - zσ and µ + zσ
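The z-score formula and the "probability within z standard deviations" idea can be sketched in Python. This is only an illustration: the helper names `z_score` and `prob_within` are made up for this sketch, and it uses the standard library's `statistics.NormalDist` in place of Table A. The height values (mean 65, standard deviation 3.5) are the female adult heights used later in these notes.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean (hypothetical helper)."""
    return (x - mu) / sigma

def prob_within(z):
    """Area under the normal curve between mu - z*sigma and mu + z*sigma."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)

# A height of 72 inches with mean 65 and standard deviation 3.5
print(z_score(72, 65, 3.5))

# Probability within 1, 2 and 3 standard deviations of the mean
# (roughly 0.68, 0.95 and 0.997)
for z in (1, 2, 3):
    print(z, round(prob_within(z), 4))
```

The probability depends only on z, not on µ or σ, which is why a single table can serve every normal distribution.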
Finding Probabilities for the Normal Distribution
• What if we want the probability within 1.43 standard
  deviations of the mean?
• For normal distributions there is a table we can use (Table A
  in the back of the book). It tabulates the normal cumulative
  probability falling below the point µ + zσ.
Finding Probabilities for the Normal Distribution: P(-1.43 < z < 1.43)
To find P(-1.43 < z < 1.43) we do it in 3 steps:
1)  We first find P(z < 1.43), using Table A or a calculator.
2)  We then also know P(z > 1.43) = 1 - P(z < 1.43), which by
    symmetry equals P(z < -1.43).
3)  P(-1.43 < z < 1.43) = P(z < 1.43) - P(z < -1.43)
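The three steps can be followed in Python as a rough check on a table lookup. This sketch assumes `statistics.NormalDist` from the standard library as a stand-in for Table A; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()            # standard normal: mean 0, standard deviation 1

below = std_normal.cdf(1.43)         # step 1: P(z < 1.43)
above = 1 - below                    # step 2: P(z > 1.43) ...
below_neg = std_normal.cdf(-1.43)    # ... which equals P(z < -1.43) by symmetry
between = below - below_neg          # step 3: P(-1.43 < z < 1.43)

print(round(below, 4))      # 0.9236
print(round(below_neg, 4))  # 0.0764
print(round(between, 4))    # 0.8473
```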
To use Table A:
• Find the corresponding z-score.
• Look up the closest standardized score (z) in the table.
  ◦ First column gives z to the first decimal place.
  ◦ First row gives the second decimal place of z.
• The corresponding probability found in the body of the table
  gives the probability of falling below the z-score.
Part of Table A
The Probability Less Than 1.43 Standard Deviations
P(height < 70) = P(z < 1.43) = 0.9236
Example 1: Mensa
Mensa is a society of high-IQ people with IQ test scores at the
98th percentile or higher. The Stanford-Binet IQ test scores that
are used for admission are approximately normally distributed
with a mean of 100 and a standard deviation of 16.
• How many standard deviations above the mean is the 98th
  percentile?
• What is the IQ score for that percentile?
Example 1: Mensa Solution
• The 98th percentile corresponds to a z-score of 2.05.
• So a person needs an IQ score of at least
  100 + 2.05(16) = 132.8, or about 133.
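As a check on the Mensa solution, the percentile can be turned into a z-score and then into an IQ score in Python. This is a minimal sketch that assumes `statistics.NormalDist` in place of Table A; the variable names are illustrative.

```python
from statistics import NormalDist

# z-score with 98% of the standard normal area below it
z_98 = NormalDist().inv_cdf(0.98)

# Convert back to the IQ scale: x = mu + z * sigma, with mu = 100 and sigma = 16
iq_cutoff = 100 + z_98 * 16

# The same cutoff, asking the N(100, 16) distribution directly
direct = NormalDist(mu=100, sigma=16).inv_cdf(0.98)

print(round(z_98, 2))       # 2.05
print(round(iq_cutoff, 1))  # 132.9
```

The unrounded z-score gives about 132.9 rather than 132.8, which still rounds to the same cutoff of 133.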
Example 2: SAT and ACT scores
• SAT and ACT exams are the two primary college entrance exams.
  Both have a mathematics component.
  ◦ The scores for the SAT range from 200 to 800 and are normally
    distributed with a mean of 500 and a standard deviation of 100.
  ◦ The scores for the ACT range from 1 to 36 and are normally
    distributed with a mean of 21 and a standard deviation of 4.7.
• Which is better, a 650 on the SAT or a 30 on the ACT?
We will answer by looking at percentiles. Begin by finding z-scores!
Example 2: SAT and ACT Solution
• 650 on the SAT (mean is 500, std. dev. is 100)
  ◦ z-score is (650-500)/100 = 1.50. From Table A, this is in the
    93rd percentile. In other words, 7% of people scored above 650.
• 30 on the ACT (mean is 21, std. dev. is 4.7)
  ◦ z-score is (30-21)/4.7 = 1.91. From Table A, this is in the
    97th percentile. In other words, 3% of people scored above 30.
• Thus, a 30 on the ACT is better than a 650 on the SAT.
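The SAT and ACT comparison can be reproduced in Python. This sketch assumes `statistics.NormalDist` as a substitute for Table A and takes the means and standard deviations from the example; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)
act = NormalDist(mu=21, sigma=4.7)

# z-scores: how many standard deviations each score lies above its mean
z_sat = (650 - sat.mean) / sat.stdev
z_act = (30 - act.mean) / act.stdev

# Percentiles: the proportion of scores falling below each score
pct_sat = sat.cdf(650)
pct_act = act.cdf(30)

print(round(z_sat, 2), round(pct_sat, 2))  # 1.5 0.93
print(round(z_act, 2), round(pct_act, 2))  # 1.91 0.97
print("ACT 30 is better" if pct_act > pct_sat else "SAT 650 is better")
```

Comparing percentiles and comparing z-scores give the same ranking here, because both scores are measured against normal distributions.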
Finding Probabilities on TI-83/84
• Normalcdf(low,high,mean,std. dev.)
  ◦ For calculating P(a < X < b) when X has a normal distribution
    with mean mu and standard deviation sigma.
  ◦ On Calculator: “2nd” “DISTR” “2” “a,b,mu,sigma) ENTER”
• Invnorm(% to left,mean,std. dev.)
  ◦ For finding the value a so that P(X ≤ a) = p, when X has a
    normal distribution with mean mu and standard deviation sigma.
  ◦ On Calculator: “2nd” “DISTR” “3” “p,mu,sigma) ENTER”
Finding Probabilities on TI-83/84
• Normalcdf(low,high,mean,std. dev.)
  ◦ What percent of women are between 65 and 70 inches?
  ◦ On Calculator: “2nd” “DISTR” “2” “65,70,65,3.5) ENTER”
  ◦ Answer: 42.34%
• Invnorm(% to left,mean,std. dev.)
  ◦ How tall does a woman need to be to be in the top 10%?
  ◦ On Calculator: “2nd” “DISTR” “3” “.9,65,3.5) ENTER”
  ◦ Answer: 69.5 inches, or 5’9.5”
Building an Interval that Contains a Certain Percentage of the Data
• Suppose we have a normal distribution.
• We want the interval that contains 95% of the data (in terms of
  z values, i.e., between -z* and z*). The Empirical Rule told us
  “about 2 standard deviations” but we want to be more precise.
• This means that 5% of the data must fall outside the interval,
  and by symmetry half of this amount, 2.5%, will be to the left
  of -z*. Since 2.5% is 0.0250, we look in Table A for a z-score
  with an entry of 0.0250. This gives us -1.96.
• We conclude that 95% of the data lies between z = -1.96 and
  z = 1.96.
Building an Interval that Contains a Certain Percentage of the Data (cont.)
• For normal distributions, 95% of the data has a z-score
  between -1.96 and 1.96.
• Recall that female adult heights are normally distributed
  with a mean of 65 inches and a standard deviation of 3.5 inches.
• We can convert the z-scores into heights:
  65 - 1.96(3.5) = 58.14 and 65 + 1.96(3.5) = 71.86.
• We conclude that 95% of adult women have a height between
  58.14 inches and 71.86 inches.
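The interval construction generalizes to any central percentage. The sketch below is one way to write it in Python, assuming `statistics.NormalDist` in place of Table A; the function name `central_interval` is hypothetical.

```python
from statistics import NormalDist

def central_interval(coverage, mu=0.0, sigma=1.0):
    """Interval centred on the mean that contains `coverage` of a normal distribution."""
    tail = (1 - coverage) / 2                 # area left over in each tail, e.g. 0.025
    z_star = NormalDist().inv_cdf(1 - tail)   # positive z-score cutting off the upper tail
    return mu - z_star * sigma, mu + z_star * sigma

# In terms of z-scores
z_low, z_high = central_interval(0.95)
print(round(z_low, 2), round(z_high, 2))  # -1.96 1.96

# Female adult heights: mean 65 inches, standard deviation 3.5 inches
low, high = central_interval(0.95, mu=65, sigma=3.5)
print(round(low, 2), round(high, 2))      # 58.14 71.86
```

Changing `coverage` to 0.90 or 0.99 gives the corresponding intervals without a new table lookup.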
“Unusual” Observations
Adult male heights are normally distributed with a mean of 70 inches
and a standard deviation of 4 inches.
• Consider these two men:
  ◦ Sam is 79 inches tall (z-score is 2.25; corresponds to 0.9878
    in Table A)
  ◦ Joe is 61 inches tall (z-score is -2.25; corresponds to 0.0122
    in Table A)
• For a given person, we can think of “unusual” in two ways:
  ◦ Sam is unusually tall: he is in the rarest 1.22% of tall people.
  ◦ Joe is unusually short: he is in the rarest 1.22% of short
    people.
  ◦ Both have unusual height: they are in the rarest 2.44%.
P-Values
• The P-value is a measure of just how unusual the data is, in terms
  of what percentage of the data is even more unusual than the
  given data.
• Recall that
  ◦ Sam is unusually tall: he is in the rarest 1.22% of tall people.
  ◦ Joe is unusually short: he is in the rarest 1.22% of short
    people.
  ◦ Both have unusual height: they are in the rarest 2.44%.
• This can be restated as
  ◦ Sam’s one-tail (right-tail) P-value is 0.0122
  ◦ Joe’s one-tail (left-tail) P-value is 0.0122
  ◦ Both have a two-tail P-value of 0.0244
Graphical Depiction of P-Values
Other Types of Distributions
We will also work with other distributions. Some will not be
symmetric. For a distribution like this, we are only interested in
one-tail (right-tail) P-values.
• Commuting time of 45 minutes has a P-value of 0.15.
Finding Probabilities on TI-83/84
• Normalcdf(low,high)
  ◦ What is the percentage of data between 1 and 1.75 standard
    deviations? Normalcdf(1,1.75)
• Normalcdf(low,high,mean,std. dev.)
  ◦ What is the percentage of women between 62 and 70 inches?
    Normalcdf(62,70,65,3.5)
• Invnorm(% to left)
  ◦ What is the z-score for data in the top 10%? Invnorm(0.9)
• Invnorm(% to left,mean,std. dev.)
  ◦ How tall does a woman need to be to be in the top 10%?
    Invnorm(0.9,65,3.5)
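The short forms without a mean and standard deviation work on the z-score scale, and the long forms give the same answers on the original scale. The sketch below shows that link in Python, assuming `statistics.NormalDist` as a stand-in for the calculator; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()              # mean 0, standard deviation 1
women = NormalDist(mu=65, sigma=3.5)   # female adult heights

p_z = std_normal.cdf(1.75) - std_normal.cdf(1)   # like Normalcdf(1,1.75)
p_height = women.cdf(70) - women.cdf(62)         # like Normalcdf(62,70,65,3.5)
z_top10 = std_normal.inv_cdf(0.9)                # like Invnorm(0.9)
height_top10 = women.inv_cdf(0.9)                # like Invnorm(0.9,65,3.5)

print(round(p_z, 4), round(p_height, 4))
print(round(z_top10, 2), round(height_top10, 1))

# The z-score answer converts to the height answer via x = mu + z * sigma
print(abs(height_top10 - (65 + z_top10 * 3.5)) < 1e-9)
```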