Chapter 2
The Normal Distribution
Density Curves
A density curve is a model we will use
for probability problems.
Two characteristics
– All area is positive (above the x-axis)
– Area = 1
It is because Area = 1 that we can use
a density curve to model probability,
where P(Sample Space) = 1
Density Curves
Think of a density curve as a relative
frequency histogram with tiny class
width—so tiny that the width of each
class approaches zero.
When you connect the tops of all the
bars, a continuous line forms.
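A minimal Python sketch of this idea, assuming the standard normal curve as the density. The function name `area_under_bars` and the range from -6 to 6 are illustrative choices, not from the slides. It adds up thin histogram-style bars under the curve and checks that the total area stays at about 1 as the class width shrinks.

```python
from statistics import NormalDist


def area_under_bars(pdf, lo, hi, class_width):
    """Add up the areas of thin bars whose tops touch the density curve."""
    n_bars = int(round((hi - lo) / class_width))
    total = 0.0
    for i in range(n_bars):
        midpoint = lo + (i + 0.5) * class_width  # center of this class
        total += pdf(midpoint) * class_width     # bar height * bar width
    return total


curve = NormalDist(mu=0, sigma=1)  # assumed density curve for the demo
for width in (1.0, 0.1, 0.01):
    area = area_under_bars(curve.pdf, -6, 6, width)
    print(f"class width {width}: total area is about {area:.6f}")
```

The range -6 to 6 leaves out only a negligible sliver of each tail, which is why the total is so close to 1.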
The Normal Distribution
The most famous density curve models
the Normal Distribution. The normal
distribution is a distribution of many
commonly-found phenomena: test
scores, physical properties of various
creatures (humans included),
measurements in scientific experiments,
and many more.
The Normal Distribution will be your
constant friend until the end of the course!
The Normal Distribution
Identify a normal distribution using two
things
– Shape: Symmetric, Unimodal, Bell-shaped
– The Empirical Rule:
About 68% of all data fall within 1 standard deviation of the mean
About 95% are within 2 standard deviations of the mean
About 99.7% are within 3 sd’s of the mean.
There are many, many normal
distributions out there, but they all follow
the empirical rule.
From your textbook, Yates, Moore, Starnes, p. 87
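The empirical rule can be checked numerically. This is a small sketch using Python's standard-library `statistics.NormalDist`. The helper name `proportion_within` is made up for illustration.

```python
from statistics import NormalDist


def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")

# The proportions do not depend on which normal distribution is used:
print(f"N(69, 2.5), within 1 SD: {proportion_within(1, 69, 2.5):.4f}")
```

The three proportions come out to roughly 0.6827, 0.9545, and 0.9973, which is where "about 68%, 95%, 99.7%" comes from.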
The Normal Distribution
The points of inflection (where the curve switches
from bending downward to bending upward) sit exactly one
standard deviation away from the mean on either side.
A Normal Distribution is completely described by
mean and standard deviation. For instance, the
heights of men are N(69”, 2.5”). This means
“normally distributed with a mean of 69 inches
and standard deviation of 2.5 inches.”
Those three facts: Normal, Mean=, SD = tell you
everything you need to know to solve problems
using the normal distribution.
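As a quick illustration of how far the mean and SD alone can take you, here is a sketch that turns N(69, 2.5) into the three empirical-rule intervals. The function name `empirical_rule_intervals` is hypothetical.

```python
def empirical_rule_intervals(mean, sd):
    """Return the ranges holding about 68%, 95%, and 99.7% of a normal distribution."""
    labels = {1: "68%", 2: "95%", 3: "99.7%"}
    return {labels[k]: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


# Heights of men from the slide: N(69", 2.5")
for share, (low, high) in empirical_rule_intervals(69, 2.5).items():
    print(f"about {share} of men are between {low} and {high} inches tall")
```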
Solving Problems with the Normal
Distribution
#7, page 89
ALWAYS draw the curve.
Standardizing—Speaking a
common language
How would you compare a score of
600 on an SAT with a score of 27 on
an ACT?
The secret: Express both scores as
“the number of standard deviations
above or below the mean.”
Standardized Scores
AKA: “Z-scores”
$z = \frac{x - \mu}{\sigma}$
Standardized Scores
AKA: “Z-scores”
In other words—What is the distance from the
score to the mean, compared to the size of the
standard deviation?
This gives an answer to the question, “How many
standard deviations from the middle is this score?”
Scores expressed as “standard deviations from the
mean” are called Standardized Scores, or Z-Scores
Z-scores can be positive or negative!
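A minimal sketch of the calculation, using the men's heights N(69, 2.5) from earlier in the chapter. The function name `z_score` and the two example heights are illustrative.

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


print(z_score(74, 69, 2.5))    # a tall man: positive z-score
print(z_score(66.5, 69, 2.5))  # a shorter man: negative z-score
```

A 74-inch man is 2 standard deviations above the mean, and a 66.5-inch man is 1 standard deviation below it.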
Standardizing—Speaking a
common language
How would you compare a score of 650 on
an SAT with a score of 25 on an ACT?
The mean SAT score is 500, while the
mean ACT score is 18. Standard deviation
for SAT is 100. Standard deviation for
ACT is 6
Express both as Z-scores, then compare:
Standardizing—Speaking a
common language
How would you compare a score of 650 on an SAT
with a score of 25 on an ACT?
SAT: $z = \frac{650 - 500}{100} = \frac{150}{100} = 1.5$
ACT: $z = \frac{25 - 18}{6} = \frac{7}{6} \approx 1.17$
SAT guy wins.
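The same comparison as a short sketch in code, using the means and standard deviations given on the slide. The variable names are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / sd


z_sat = z_score(650, mean=500, sd=100)
z_act = z_score(25, mean=18, sd=6)

print(f"SAT z-score: {z_sat:.2f}")
print(f"ACT z-score: {z_act:.2f}")
print("Better relative score:", "SAT" if z_sat > z_act else "ACT")
```

Once both scores are in standard-deviation units, comparing them is just comparing two numbers.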
The STANDARD Normal
Distribution
If you were to create a distribution of
z-scores, it would look like this:
The STANDARD Normal
Distribution
This Distribution is N(0,1): It, too, is a Normal
Distribution, whose Mean is 0 and whose SD is 1
Or if you prefer: μ=0, and σ = 1
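To see why the mean is 0 and the SD is 1, here is a sketch that standardizes a small made-up list of scores using that list's own mean and standard deviation. The data and the name `standardize` are hypothetical.

```python
from statistics import mean, stdev


def standardize(data):
    """Convert every value to a z-score using the data's own mean and SD."""
    m = mean(data)
    s = stdev(data)
    return [(x - m) / s for x in data]


scores = [410, 450, 480, 500, 520, 560, 610, 650]  # hypothetical scores
z = standardize(scores)

print("mean of z-scores:", round(mean(z), 10))
print("SD of z-scores:  ", round(stdev(z), 10))
```

Standardizing only shifts and rescales the data. The z-scores follow N(0,1) only when the original data were normal to begin with.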
Normal Distribution Calculations
The Empirical Rule: All fine and good until
someone’s Z-score isn’t an integer!
Easy: What percentile is a score of 600
for an SAT if μ=500, and σ = 100? It’s
the 84th percentile, because a score 1 SD
above the mean is better than 68% +
16%, or 84%.
Hard: What percentile is a score of 550?
– Uhhh… z = .5
Now what?
First, The Three P’s
Percent (or percentile)
Proportion
Probability
– The problems may differ, but the
procedures are the same.
– It all comes down to figuring out how
much area is involved.
Table A: Gives you Area to the left of your z-score.
In this case, the area to the left of .5
Using Table A
Along the edge, “assemble” your z-score: whole units and the tenths
place along the side, intersected with
the hundredths place along the top.
Where the two intersect, that’s the area
to the left.
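A rough software stand-in for Table A, assuming Python's `statistics.NormalDist` in place of the printed table. The names `table_a` and `table_a_row` are made up, and the row helper only handles non-negative rows.

```python
from statistics import NormalDist


def table_a(z):
    """Area to the left of z under N(0,1), rounded to 4 places like a printed table."""
    return round(NormalDist().cdf(round(z, 2)), 4)


def table_a_row(row_value):
    """One table row: row_value (units and tenths, non-negative) crossed with .00 to .09."""
    return [table_a(row_value + col / 100) for col in range(10)]


print(table_a(0.5))      # area to the left of z = .5
print(table_a_row(0.5))  # the whole 0.5 row, columns .00 through .09
```

For z = .5 the area to the left is 0.6915, so a score of 550 on this SAT scale is at about the 69th percentile.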
Example (using a positive z-score)
Say a person scores a 695 on a math
SAT, where scores are ~N(500, 100). What is that
person’s percentile?
Calculate z-score: Z = (695 − 500)/100 = 1.95
Normal Calculation Toolbox
(Use it without fail: get all the points.)
1. Draw and label the curve, shading
the area of interest. (Don’t forget
“N(500, 100),” e.g.)
2. Create a probability statement,
leaving blanks for the unknowns
3. Show all calculations (z-score
calculations, etc.) and complete the
probability statement.
4. Answer the question in context
using words.
SAT Example
1. (Sketch: normal curve ~N(500, 100), centered at 500 with SD 100, shaded to the left of 695)
2. P(x ≤ 695) = ________
3. $z = \frac{695 - 500}{100} = \frac{195}{100} = 1.95$
From Table A, area to left
of z = 1.95 is .9744
4. A score of 695 on the SAT is approximately the 97.4th
percentile.
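Steps 3 and 4 can be double-checked in code. This sketch assumes `statistics.NormalDist` as a replacement for the Table A lookup, and the function name `percentile_of` is hypothetical.

```python
from statistics import NormalDist


def percentile_of(score, mean, sd):
    """Return (z-score, area to the left) for a score from N(mean, sd)."""
    z = (score - mean) / sd
    area_left = NormalDist().cdf(z)  # same area Table A reports for this z
    return z, area_left


z, area = percentile_of(695, mean=500, sd=100)
print(f"z = {z:.2f}")
print(f"P(x <= 695) = {area:.4f}")
print(f"A score of 695 is at about the {100 * area:.1f}th percentile.")
```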
Assessing Normality
“Are these data normally distributed?”
1. Is the shape of the histogram unimodal, bell-shaped,
approximately symmetric?
2. Are the mean & median close? (This suggests symmetry)
3. Does the Empirical Rule apply? In other words, are about
68% of the numbers within 1 SD of the mean? Are about
95% within 2? Are just about all of them within 3 SD’s of the
mean?
4. Is a normal probability plot (NPP) of the data approximately straight? If yes, the data are
approximately normal.
Be somewhat forgiving in your judgment. On the test, as long as
you check all of these things (climb the mountain!) your judgment
is valid. The phrase “approximately normal” is open to
interpretation, and data is never perfect, so feel free to have an
opinion. 
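Checks 2 and 3 are easy to automate. The sketch below is one possible way to do it, run on hypothetical, idealized data (1000 evenly spaced quantiles of N(500, 100)) rather than real scores. The function name `normality_checks` is made up.

```python
from statistics import NormalDist, mean, median, stdev


def normality_checks(data):
    """Report mean vs. median and the share of data within 1, 2, and 3 SDs."""
    m, s = mean(data), stdev(data)

    def share_within(k):
        return sum(1 for x in data if abs(x - m) <= k * s) / len(data)

    return {
        "mean": m,
        "median": median(data),
        "within_1_sd": share_within(1),  # compare with about 68%
        "within_2_sd": share_within(2),  # compare with about 95%
        "within_3_sd": share_within(3),  # compare with about 99.7%
    }


# Hypothetical, idealized data: evenly spaced quantiles of N(500, 100)
ideal = NormalDist(500, 100)
data = [ideal.inv_cdf((i + 0.5) / 1000) for i in range(1000)]

report = normality_checks(data)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Real data will never match the percentages exactly, so treat the output as evidence for a judgment call, not a pass/fail verdict.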
NPP—Normal Probability Plot
Read the book (end of 2.2)
To create an NPP with the calculator
– Put data into a list
– Go to Stat Plot. Choose the 6th option
(after the boxplots)
– Let the data axis remain “x”
– Do the usual thing: Zoom…Zoomstat.
– If what you see is pretty much straight,
the distribution is approximately
normal.
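For readers without the calculator, here is a sketch of what an NPP computes. It pairs each sorted data value (on the x axis) with the z-score expected for its rank, then measures straightness with a correlation coefficient. The plotting positions (i + 0.5)/n are one common convention and an assumption here; the calculator may use a slightly different one. Both data sets are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def npp_points(data):
    """Pair each sorted data value (x) with its expected z-score (y)."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    return [(x, std_normal.inv_cdf((i + 0.5) / n)) for i, x in enumerate(ordered)]


def straightness(points):
    """Pearson correlation of the plotted points; near 1 means nearly straight."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


heights = [64.5, 66, 67, 68, 68.5, 69, 69.5, 70, 71, 72, 73.5]  # roughly bell-shaped
skewed = [1, 1, 1, 2, 2, 3, 5, 9, 20, 60]                       # strongly right-skewed

r_heights = straightness(npp_points(heights))
r_skewed = straightness(npp_points(skewed))
print(f"heights: r = {r_heights:.3f}")
print(f"skewed:  r = {r_skewed:.3f}")
```

The roughly bell-shaped data give a correlation close to 1, while the skewed data bend away from a line and score noticeably lower.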