(Chapter 6)
The Standard Deviation as a Ruler + The Normal Model
Reading Quiz (10 points):
• When we rescale data, how are measures of center and spread affected?
• Why do we use z-scores?
• What does a z-score measure?
• To use a Normal model, what shape must our data be?
• If a distribution is roughly Normal, a Normal probability plot shows what kind of line?

What do we use standard deviations for?
• To compare different values (like cm and seconds in a heptathlon)
• To compare an individual value to the group
• To measure how far a value is from the mean

Standardizing Results

Z-Scores (Standardized Values):
$$z = \frac{x - \bar{x}}{s}$$

No units because we’re measuring
distance from the mean in standard
deviations
What would these z-scores mean?
• 2
• -1.6
• 0
• -3


Which of these values is the most
statistically surprising?
Your Statistics teacher has announced that
the lower of your two tests will be dropped.
You got a 90 on test 1 and an 80 on test 2.
You’re all set to drop the 80 until she
announces that she grades “on a curve.” She
standardized the scores in order to decide
which is the lower one. The mean on the
first test was 88 with a standard deviation of 4,
and the mean on the second was 75 with
a standard deviation of 5.
a. Which one will be dropped?
b. Does this seem fair?
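
The standardization in this problem can be checked with a short Python sketch. The helper name `z_score` is hypothetical (it is not from the slides); the scores, means, and standard deviations are the ones given above.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

# Scores, means, and standard deviations from the problem statement.
z_test1 = z_score(90, 88, 4)   # (90 - 88) / 4
z_test2 = z_score(80, 75, 5)   # (80 - 75) / 5

print("Test 1 z-score:", z_test1)
print("Test 2 z-score:", z_test2)
```

The test with the smaller z-score is the one that is lower relative to the rest of the class, even if its raw score is higher.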
Shifting Data: Remember?

When we add (or subtract) a constant to
each value, all measures of position
(center, percentiles, min, max)
increase (or decrease) by the same
constant, but the measures of spread
do not change
Rescaling Data

When we multiply (or divide) by a
constant, our measures of position get
multiplied (or divided) by the same
constant, as do our measures of spread
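
A quick numerical check of shifting and rescaling, as a sketch. The data values are the sample list that appears on the Normal probability plot slide; the constants 10 and 2 are arbitrary choices made for illustration.

```python
import statistics

# Sample values borrowed from the Normal probability plot slide; any data would do.
data = [22, 17, 18, 29, 22, 23, 24, 23, 17, 21]

shifted = [x + 10 for x in data]   # shifting: add a constant to every value
rescaled = [x * 2 for x in data]   # rescaling: multiply every value by a constant

for label, values in [("original", data), ("shifted +10", shifted), ("rescaled x2", rescaled)]:
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    print(f"{label:12s} mean = {center:.2f}  sd = {spread:.2f}")
```

Shifting moves the mean and leaves the standard deviation alone; rescaling multiplies both.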
Z-Scores

What are we really doing in terms of shifting
and rescaling?
$$z = \frac{x - \bar{x}}{s}$$


What will the new value of the original mean
be?
What happens to the standard deviation
when we divide by s?
Standardizing:

Does not change the shape of the
distribution of a variable

The center (mean) becomes:_____

The spread (standard deviation)
becomes:______
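
To check the blanks above numerically, here is a minimal sketch that standardizes a small data set and then recomputes the mean and standard deviation of the z-scores. The data values are the sample list from the Normal probability plot slide, used here only as an example.

```python
import statistics

data = [22, 17, 18, 29, 22, 23, 24, 23, 17, 21]
mean = statistics.mean(data)
sd = statistics.stdev(data)

# Standardize: subtract the mean (a shift), then divide by s (a rescale).
z_scores = [(x - mean) / sd for x in data]

z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)
print("mean of z-scores:", round(z_mean, 10))
print("sd of z-scores:  ", round(z_sd, 10))
```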
How do we know if a z-score is
interesting?
• A z-score of 3 or more (positive or negative) is rare
• A z-score of 6 or 7 calls for attention

Homework: page 123
◦ 1–11 (odd)
Normal Models

Appropriate for unimodal, roughly
symmetric distributions
N ( , )

Why do we have new notation for mean,
standard deviation?
◦ These are the parameters for our model rather
than numerical summaries of the data
If we standardize with a Normal
model…

$$z = \frac{x - \mu}{\sigma}$$

This is called the standard Normal model (or standard Normal
distribution)
Normality Assumption
• When we apply the Normal model, we assume the distribution is Normal
• There is no way to check this
• And most likely, it’s not true


Nearly Normal Condition: the shape of
the data’s distribution is unimodal and
symmetric
68-95-99.7 Rule
Suppose it takes you 20 minutes, on average, to
drive to school, with a standard deviation of 2
minutes. Suppose a Normal model is
appropriate for the distribution of driving
times.
a. How often will you arrive at school in less
than 22 minutes?
b. How often will it take you more than 24
minutes?
c. Do you think the distribution of your driving
times is unimodal and symmetric?
d. What does this say about the accuracy of
your predictions? Explain.
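
Parts (a) and (b) can be compared against the Normal model directly. The sketch below uses `NormalDist` from Python's standard `statistics` module, with the mean and standard deviation from the problem. The 68-95-99.7 Rule gives about 84% for (a) and about 2.5% for (b); the model values differ slightly because the rule's percentages are rounded.

```python
from statistics import NormalDist

# Normal model from the problem: mean 20 minutes, standard deviation 2 minutes.
drive = NormalDist(mu=20, sigma=2)

p_under_22 = drive.cdf(22)       # 22 is one standard deviation above the mean
p_over_24 = 1 - drive.cdf(24)    # 24 is two standard deviations above the mean

print(f"P(time < 22) = {p_under_22:.4f}")
print(f"P(time > 24) = {p_over_24:.4f}")
```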
Normal Models:

Make a picture!
Homework: page 124
◦ 15–27 (odd)
The SAT has 3 parts: Writing, Math, and Critical
Reading (verbal). Each part has a distribution
that is roughly unimodal and symmetric and
designed to have an overall mean of about 500
and a standard deviation of 100 for all test
takers. In any one year the mean and standard
deviation may differ from the target by a small
amount, but they’re a good overall
approximation.
a. Suppose you score 600 on one part; where do
you stand among all students?
b. What if you scored 200? 800?
What about this data?
• The 2007 freshman class at UConn had an average score of 1192
• The 2007 freshman class at UMass had an average math score of 559 and an average verbal score of 561
• The 2008 class at URI has an average SAT score of 1659
• At NYU, to take Calculus you must score at least a 750 on math

What if we’re not exactly 1,2,3 etc.
standard deviations away? How can
we find our percentile?
• Find the z-score
• Use Table Z to find the percentage of individuals in a standard Normal distribution falling below that score
• These are called Normal percentiles
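
The same two steps (find the z-score, then look up the area below it) can be sketched in Python, with `NormalDist().cdf` standing in for Table Z. The helper name `normal_percentile` and the score 680 are made up for illustration; 680 was chosen because it is not a whole number of standard deviations from the SAT mean of 500.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def normal_percentile(value, mean, sd):
    """Fraction of the Normal model N(mean, sd) that falls below `value`."""
    z = (value - mean) / sd          # step 1: find the z-score
    return standard_normal.cdf(z)    # step 2: area below z (what Table Z lists)

# Hypothetical score of 680 on a part with mean 500 and standard deviation 100.
print(f"{normal_percentile(680, 500, 100):.4f}")
```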

With technology…

Go to the distribution menu
◦ normalpdf: used for graphing
◦ normalcdf: finds the area between two z-score cut points
What proportion of SAT scores fall
between 450 and 600?
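
For readers without the calculator, the area between two cut points can be sketched in Python as a difference of two cumulative areas. This assumes the SAT model stated earlier (mean 500, standard deviation 100) and uses `NormalDist` from the standard library in place of normalcdf.

```python
from statistics import NormalDist

# SAT model from the earlier slide: mean 500, standard deviation 100.
sat = NormalDist(mu=500, sigma=100)

# Area between the cut points = area below 600 minus area below 450.
proportion = sat.cdf(600) - sat.cdf(450)
print(f"Proportion between 450 and 600: {proportion:.4f}")
```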
What is the z-score cut point for
the 25th percentile?
• Make a picture
• Look in the table
• With your calculator: invNorm


What z-score cuts off the highest 10% of
the data?
Suppose a college only admits those
with verbal SAT scores in the top 10
percent. What do you need?
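
Going from a percentile back to a z-score can be sketched with `NormalDist.inv_cdf`, which plays the role of invNorm here. The SAT mean of 500 and standard deviation of 100 come from the earlier slide; the variable names are illustrative only.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

z_25th = standard_normal.inv_cdf(0.25)    # cut point with 25% of the area below it
z_top10 = standard_normal.inv_cdf(0.90)   # top 10% means 90% of the area is below

# Convert the z-score back to the SAT scale: score = mean + z * sd.
sat_cutoff = 500 + z_top10 * 100

print(f"z for the 25th percentile: {z_25th:.3f}")
print(f"z cutting off the top 10%: {z_top10:.3f}")
print(f"Verbal SAT score needed:   {sat_cutoff:.0f}")
```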
Normal Probability Plot
• If the distribution of data is roughly Normal, this plot is roughly a diagonal straight line
• Use this data:
◦ 22, 17, 18, 29, 22, 23, 24, 23, 17, 21
◦ Statplot: the last one!
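
The points behind a Normal probability plot can be computed by hand, as in this sketch: sort the data, then pair each value with the z-score a standard Normal model would expect at that rank. The plotting positions `(i - 0.5) / n` are an assumption; calculators and textbooks may use slightly different positions, so the numbers may not match the calculator's plot exactly.

```python
from statistics import NormalDist

data = [22, 17, 18, 29, 22, 23, 24, 23, 17, 21]
n = len(data)
sorted_data = sorted(data)

# Assumed plotting positions (i - 0.5) / n for ranks i = 1..n.
normal_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each (normal score, data value) pair is one point on the plot.
for z, x in zip(normal_scores, sorted_data):
    print(f"{z:6.2f}  {x}")
```

If the printed pairs rise at a roughly steady rate, the plot is roughly a straight line.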
What can go wrong?
• Only use a Normal model when the distribution is symmetric and unimodal!
• Don’t use the mean and standard deviation when outliers are present!
• Don’t round too soon; be as precise as possible
• Don’t round any results in the middle of a calculation
• Don’t worry about minor differences in results (just like with quartiles and median!)

Homework: page 126
◦ 29–37 (odd)
◦ 41, 43, 45, 47