The Normal Curve
June 20, 2006
Bryan T. Karazsia, M.A.
Overview
• Hand-in Homework
• Why are distributions so important (particularly the normal distribution)?
• What is the normal distribution?
• Z-scores
• Using z-scores
• Setting Probable Limits
Distributions
• What’s so important about distributions?
  • Link with probabilities: the shape of the distribution for a given behavior is linked to the probability of that behavior occurring
  • Can translate area under the curve to a probability
  • Doing so helps us to describe behavior better…
    • Ex. – What’s the probability of _____?
• We can find the area under the curve for any portion of the distribution and convert it to a probability
Distributions
• What’s so important about distributions?
  • Areas under a curve are additive.
  • So, we can explore grouping/addition of 2+ areas
  • Can combine areas into one, single larger area
• We can combine any areas under the curve with addition
• Above, we can combine the areas shaded BLUE
• Together they add up to 68.2%
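As a rough check on the 68.2% figure, the two areas can be added numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative, and it assumes the blue regions are the two areas between the mean and ±1 SD (inferred from the 68.2% total, not from the slide image).

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal curve

# Area from -1 SD up to the mean, and from the mean up to +1 SD
left_area = std_normal.cdf(0) - std_normal.cdf(-1)
right_area = std_normal.cdf(1) - std_normal.cdf(0)

# Areas under a curve are additive, so the combined area is the sum
combined_area = left_area + right_area
print(f"left = {left_area:.4f}, right = {right_area:.4f}, combined = {combined_area:.4f}")
```

The unrounded total is about 68.3%; adding two halves that were each rounded to 34.1% gives the 68.2% shown on the slide.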
Normal Distribution
• Characteristics of a normal distribution?
  • Symmetric
  • Uni-modal
  • Limits of infinity
  • Mathematically exact: can find the percentage between any 2 points on the curve (it’s actually already been done for us: Table E.10)
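Normal Distribution
• Characteristics of a normal distribution?
  • 50-34-14 Rule: 50% of scores fall on each side of the mean, about 34% fall between the mean and 1 SD, and about 14% fall between 1 and 2 SDs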
Normal Distribution
• Why so important?
  • We usually assume that our DVs are normally distributed
    • Many phenomena are normally distributed (or close to it) in large populations
    • Assumption of normality is required for most statistical tests
  • Relatively easy to work with
    • Can calculate probabilities easily (in tables!)
    • Note: can be done with non-normal distributions, but that would require extensive mathematics (calculus)
Normal Distribution
• When we calculate probabilities for any point on the curve, our results will depend on the curve’s mean and SD
  • Why? → Because the area under the curve will change
  • So, anytime these attributes change, we need to calculate new probabilities…
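To see why the probabilities depend on the mean and SD, the sketch below computes the area above the same raw score under two different normal curves. The mean of 500 and SD of 100 match the GRE values used later in these slides. The second curve (SD = 50) and the helper name `prob_above` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def prob_above(score, mean, sd):
    """Area under a normal curve (given mean and SD) that lies above `score`."""
    return 1 - NormalDist(mu=mean, sigma=sd).cdf(score)

p_wide = prob_above(600, mean=500, sd=100)   # 600 is 1 SD above the mean
p_narrow = prob_above(600, mean=500, sd=50)  # 600 is now 2 SDs above the mean

print(f"SD = 100: P(score > 600) = {p_wide:.4f}")
print(f"SD = 50:  P(score > 600) = {p_narrow:.4f}")
```

The raw score is the same in both calls, but the area above it changes because the curve itself changed.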
Standard Normal Distribution
• To make life easier, we base everything on one single normal distribution, with predetermined attributes:
  • µ = 0
  • σ = 1
Standard Normal Distribution
• Real world: we very rarely (if ever) have a real-world distribution that matches this standard normal distribution exactly
• BUT, we can mathematically manipulate, or transform, our data…
Enter Math---but remember why!
Key = Concepts…the math will fall into place
What are we doing again???
Z-scores
• How do we transform data?
  • Convert our sample distribution such that…
    • Mean = 0
    • SD = 1
  • Subtract the mean of the sample from EACH data pt.
  • Divide each resulting difference by the SD of the sample
• When we do this, we are converting raw scores to standardized scores (a.k.a., z-scores)
Z-scores
• How do we transform data?
  • Formula: z-score for any particular observation (score, x)

    $$z = \frac{x - \mu}{\sigma}$$

    x = indiv. score
    µ = population mean
    σ = population SD
    z = standardized score
Z-scores
µ_x = 3.20
σ_x = 1.79

Part    X     Y     (X – µ)    z
1       2     1     -1.2       -0.67
2       1     4     -2.2       -1.23
3       3     8     -0.2       -0.11
4       5     9      1.8        1.01
5       5     7      1.8        1.01
Σ       16    29

Worked example for Part 1:

$$z_1 = \frac{x - \mu}{\sigma} = \frac{2 - 3.20}{1.79} = \frac{-1.2}{1.79} = -.67$$
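The table's z column can be reproduced with a few lines of Python. This is a minimal sketch using the standard library. It assumes the SD of 1.79 was computed with the n − 1 (sample) formula, because that is the formula that reproduces 1.79 for these five X scores. The variable names are illustrative.

```python
from statistics import mean, stdev

x_scores = [2, 1, 3, 5, 5]  # X column from the slide

mu = mean(x_scores)      # 3.2
sigma = stdev(x_scores)  # sample SD (n - 1 denominator), about 1.79

# Subtract the mean from each score, then divide the difference by the SD
z_scores = [(x - mu) / sigma for x in x_scores]

for x, z in zip(x_scores, z_scores):
    print(f"x = {x}: deviation = {x - mu:+.1f}, z = {z:+.2f}")
```

Rounded to two decimals, the results match the slide's values for all five parts.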
Z-scores
World Cup Data (hypothetical)
[Figure: two panels, ORIGINAL DATA and Z-SCORES]
Z-scores
• Why are the distributions the same???
  • Z-score transformation is a Linear Transformation
    • Add, subtract, multiply, and/or divide by a constant
    • Relationship among values does NOT change (shape of distribution remains unaffected)
  • Common examples:
    • Distance from KSU to Panama City, FL:
      • 985 miles or 62,409,600 inches
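The miles-to-inches example can be carried one step further: because converting units is just multiplication by a constant, the z-scores come out the same in either unit. The sketch below uses a hypothetical list of distances, in which only the 985-mile figure comes from the slide. The helper `z_scores` is an illustrative name.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, then divide by the sample SD."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

INCHES_PER_MILE = 63_360

# Hypothetical trip distances in miles; only 985 comes from the slide
miles = [120, 450, 985, 1500, 2300]
inches = [m * INCHES_PER_MILE for m in miles]  # multiply by a constant

z_miles = z_scores(miles)
z_inches = z_scores(inches)

print([round(z, 3) for z in z_miles])
print([round(z, 3) for z in z_inches])
```

Both lines print the same list, which is the sense in which the relationship among the values does not change.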
Z-scores
• Common Misconception:
  • Converting raw scores to z-scores will normalize the data (will give us a normal curve)
• FALSE
  • Shape of distribution will remain unaffected b/c relationships among values do not change
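One way to check this is to measure skewness before and after standardizing. The sketch below uses a small, hypothetical set of positively skewed scores and a hand-written `skewness` helper (the average cubed z-score). Neither comes from the slides.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric distribution)."""
    mu, sigma = mean(values), pstdev(values)
    return mean([((v - mu) / sigma) ** 3 for v in values])

raw = [1, 1, 1, 2, 2, 3, 10]  # hypothetical, positively skewed scores

mu, sigma = mean(raw), pstdev(raw)
z = [(v - mu) / sigma for v in raw]  # standardized scores

print(f"mean of z = {mean(z):.3f}, SD of z = {pstdev(z):.3f}")
print(f"skewness of raw scores: {skewness(raw):.3f}")
print(f"skewness of z-scores:   {skewness(z):.3f}")
```

The z-scores have a mean of 0 and an SD of 1, but the skewness is identical to that of the raw scores, so the distribution is no more normal than it was before.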
Quick Review
• What have we accomplished by converting raw scores to z-scores???
• What good is it? (GRE example)
• If you still don’t think they are very helpful, just hang on…
Normal Distributions (cont.)
• Link to probabilities: (Example)
  • Suppose we want to know the probability that any random person will score more than 1 standard deviation above the mean (above 600) on the GRE Verbal
  • Stated differently, what is the area under the normal curve that lies more than 1 SD above the mean?
  • Notes:
    • Standard normal curve: total area = 1.0 (100% of observations)
    • We are assuming GRE scores are normally distributed
Normal Distributions (cont.)
• Link to probabilities: (Example)
  • We already know that z-scores represent SDs from the mean, so we want to find the area above z = 1.
Normal Distributions (cont.)
• Link to probabilities: (Example)
  • Here is where we utilize Table E.10
    • Note: only the positive half of the normal distribution is in the Table (z > 0)
  • We want z = 1
Normal Distributions (cont.)
• Table E.10 (Normal Distribution; z)
• We see…
  • “mean to z” = .3413
  • “Larger portion” = .8413
  • “Smaller portion” = .1587
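The three table entries for z = 1 can also be computed directly. This is a minimal sketch with Python's `statistics.NormalDist`. It assumes a GRE Verbal mean of 500 and SD of 100 (the values used later in these slides), and the variable names simply mirror the table's column labels.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Convert the raw GRE score to a z-score (assumed mean 500, SD 100)
z = (600 - 500) / 100  # 1.0

larger_portion = std_normal.cdf(z)       # everything below z
smaller_portion = 1 - std_normal.cdf(z)  # everything above z
mean_to_z = std_normal.cdf(z) - 0.5      # from the mean (z = 0) up to z

print(f"mean to z       = {mean_to_z:.4f}")
print(f"larger portion  = {larger_portion:.4f}")
print(f"smaller portion = {smaller_portion:.4f}")
```

The smaller portion, about .1587, is the probability asked for in the example: scoring more than 1 SD above the mean.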
Normal Distributions (cont.)
• What if we wanted to know if the random person will score > 1 SD from the mean in either direction?
Normal Distributions (cont.)
• Can now also calculate area (probability) between z = -1 & z = +1
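Both questions follow from the symmetry of the curve: the two tails are equal, and whatever is not in the tails lies between z = -1 and z = +1. A short sketch, with illustrative variable names:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

smaller_portion = 1 - std_normal.cdf(1)  # area above z = +1

# By symmetry, the area below z = -1 is the same, so double it
either_direction = 2 * smaller_portion

# Area between z = -1 and z = +1
between = std_normal.cdf(1) - std_normal.cdf(-1)

print(f"P(more than 1 SD from the mean) = {either_direction:.4f}")
print(f"P(within 1 SD of the mean)      = {between:.4f}")
```

The two results add up to 1.0, the total area under the curve.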
Normal Distributions (cont.)
• Let’s go one step further…
  • What is the probability a person will fall between 1.5 (650) & 2.5 (750) SDs above the mean on the GRE Verbal?
  • When doing such calculations, it is usually wise to draw the diagram (will eliminate many errors)
Normal Distributions (cont.)
• Let’s go one step further…
  • What is the probability a person will fall between 1.5 (650) & 2.5 (750) SDs above the mean on the GRE Verbal?
  • From Table E.10…
    • z = 1.5 → “mean to z” = .4332
    • z = 2.5 → “mean to z” = .4938
  • We want the difference between these…
    • .4938 - .4332 = .0606 (.06)
  • Probability of falling between z = 1.5 & z = 2.5 is .06
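The subtraction above can be checked in code, either on the z scale or directly on the raw-score scale. The sketch assumes a GRE Verbal mean of 500 and SD of 100, which is what makes 650 and 750 correspond to z = 1.5 and z = 2.5. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

# "Mean to z" areas, as read from the table
mean_to_z_low = std_normal.cdf(1.5) - 0.5
mean_to_z_high = std_normal.cdf(2.5) - 0.5

# The area between the two z-scores is the difference
p_between = mean_to_z_high - mean_to_z_low

# Same question asked on the raw-score scale (assumed mean 500, SD 100)
gre = NormalDist(mu=500, sigma=100)
p_between_raw = gre.cdf(750) - gre.cdf(650)

print(f"z scale:   {p_between:.4f}")
print(f"raw scale: {p_between_raw:.4f}")
```

Both routes give the same area, which is the practical payoff of standardizing: one table serves every normal distribution.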
Normal Distributions (cont.)
• Setting Probable Limits on Observations:
  • “If I draw an adult at random, 95% of the time his/her score will lie between ____ & ____”
  • To answer this question, we need to find the corresponding z-scores…
  • Remember: Table is only ½ of the distribution…
Normal Distributions (cont.)
• Setting Probable Limits on Observations:
  • The middle 95% leaves 2.5% in each tail; a “smaller portion” of .0250 corresponds to z = 1.96
  • So, 95% of the time, any random individual’s score will fall between -1.96 & +1.96 SDs from the mean
Normal Distributions (cont.)
• Setting Probable Limits on Observations:
  • If we want to hear some music, we need to convert these z-scores back to raw scores…

    $$x = \mu \pm (z \cdot \sigma)$$
    $$x = 500 \pm (1.96 \cdot 100)$$
    $$x = 500 \pm 196 = 696 \text{ and } 304$$
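The same limits can be obtained without a table by asking for the z-score that cuts off 2.5% in each tail. This is a minimal sketch using `NormalDist.inv_cdf` from the standard library. The mean and SD are the values used in the calculation above, and the other names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 500, 100  # mean and SD from the calculation above
coverage = 0.95       # middle 95% of the distribution

# The remaining 5% is split evenly between the two tails
tail = (1 - coverage) / 2

# z-score with 97.5% of the area below it (about 1.96)
z_crit = NormalDist().inv_cdf(1 - tail)

# Convert back to raw scores: x = mu ± z * sigma
lower = mu - z_crit * sigma
upper = mu + z_crit * sigma

print(f"z = ±{z_crit:.2f}")
print(f"95% of scores fall between {lower:.0f} and {upper:.0f}")
```

Changing `coverage` to another value, such as 0.99, gives the corresponding limits in the same way.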
Normal Distributions (cont.)
• Is anything we did today useful???
  • We see them all the time…
    • IQ, GREs, SATs, etc…
    • They all use standard scores…and perform another linear transformation to derive the scores we hear so often (IQ = 102, SAT = 1600, GRE Verbal = 520)
    • All easy too…just convert raw scores to z-scores, then you can convert to whatever system you like…
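The two-step conversion described above (raw score to z-score, then z-score to a new scale) can be sketched as a single helper. The function name `rescale` and the raw test (mean 40, SD 8) are hypothetical. The GRE-style scale of mean 500 and SD 100 matches the values used in these slides, while the IQ-style scale of mean 100 and SD 15 is a commonly used convention assumed here for illustration.

```python
def rescale(raw, raw_mean, raw_sd, new_mean, new_sd):
    """Convert a raw score to a z-score, then to a scale with new_mean and new_sd."""
    z = (raw - raw_mean) / raw_sd  # step 1: standardize
    return new_mean + z * new_sd   # step 2: linear transformation to the new scale

# Hypothetical raw test: mean 40, SD 8; a score of 44 is half an SD above the mean
gre_style = rescale(44, raw_mean=40, raw_sd=8, new_mean=500, new_sd=100)
iq_style = rescale(44, raw_mean=40, raw_sd=8, new_mean=100, new_sd=15)

print(f"GRE-style score: {gre_style:.1f}")
print(f"IQ-style score:  {iq_style:.1f}")
```

Because both steps are linear transformations, the person's standing relative to everyone else is the same on every scale.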
Review
• What is Variability?
• Measures of Variability
  • Range
  • Variance
  • Standard Deviation
  • Definitional vs. Computational formulae
• Practice before next class!!!