Normal Models - math-b
Z-Scores, Shifting and Scaling
 A z-score is calculated as
follows:
$z = \frac{y - \bar{y}}{s}$
where $s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$
 Shifting:
 Adding/Subtracting
 Changes Measures of
Center
 DOES NOT change
measures of spread
 Scaling
 Multiplying/Dividing
 Changes Measures of
Center
 Changes measures of
Spread
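The effect of shifting and scaling can be checked numerically. The sketch below uses a small made-up data set (the numbers are an assumption for illustration, not from the slides) and also shows that a z-score is just a shift by the mean followed by a scale by the standard deviation.

```python
from statistics import mean, stdev

data = [10.0, 12.0, 15.0, 19.0]  # made-up values for illustration

shifted = [y + 5 for y in data]  # shifting: add a constant to every value
scaled = [y * 3 for y in data]   # scaling: multiply every value by a constant

# A z-score subtracts the mean (a shift) and divides by s (a scale).
y_bar, s = mean(data), stdev(data)  # stdev divides by n - 1
z = [(y - y_bar) / s for y in data]

print("original:", mean(data), round(stdev(data), 3))
print("shifted: ", mean(shifted), round(stdev(shifted), 3))
print("scaled:  ", mean(scaled), round(stdev(scaled), 3))
print("z-scores:", round(mean(z), 3), round(stdev(z), 3))
```

Shifting moves the mean but leaves the standard deviation alone, scaling multiplies both, and the z-scores end up with mean 0 and standard deviation 1.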
Normal Models
 “bell-shaped curves” are
called Normal Models
 Appropriate for
distributions whose
shapes are unimodal and
roughly symmetric
Normal Models
 Each is a model
 For symmetric, unimodal
distributions a normal model
provides a measure of how
extreme a z-score is
 There is a normal model for
every possible combination
of mean and standard
deviation
Notation
 $N(\mu, \sigma)$
 This represents a normal
model with a mean of $\mu$ and
a standard deviation of $\sigma$.
 Why the Greek?
 This mean and standard
deviation are not numerical
summaries of data.
 They are part of a model.
 They don’t come from data.
 They are numbers that we
choose to specify our model
 They are called parameters.
Notation Continued
 $N(\mu, \sigma)$
 We don’t want to confuse
our parameters with
summaries of the data
such as $\bar{y}$ and $s$
 Summaries of the data are
called statistics
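One way to see the difference between parameters and statistics is to simulate data from a model. In this sketch the values of `mu` and `sigma`, the sample size, and the seed are all assumptions chosen for illustration. The parameters are picked by us, while the statistics are computed from the simulated data and come out close to, but not equal to, the parameters.

```python
from statistics import NormalDist, mean, stdev

# Parameters: numbers we choose to specify the model (hypothetical values).
mu, sigma = 100.0, 15.0
model = NormalDist(mu, sigma)

# Simulated data drawn from the model; the seed only makes the run repeatable.
sample = model.samples(200, seed=1)

# Statistics: numerical summaries computed from the data.
y_bar, s = mean(sample), stdev(sample)

print("parameters:", mu, sigma)
print("statistics:", round(y_bar, 2), round(s, 2))
```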
Z-Scores and Normal Models
 If we model data with a
Normal Model and
standardize them using
the corresponding
$\mu$ and $\sigma$, we still call the
standardized value a z-score and we write:
$z = \frac{y - \mu}{\sigma}$
Z-Scores and Normal Models
 It is usually easier to
standardize data first
(using its mean and
standard deviation)
 Then we need only model
N(0,1)
 N(0,1) is called the
standard normal model or
standard normal distribution
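A short check of why standardizing lets us work with N(0,1) alone: the sketch below, with hypothetical values for `mu`, `sigma`, and `y`, compares the proportion below `y` under N(mu, sigma) with the proportion below the corresponding z-score under N(0,1).

```python
from statistics import NormalDist

mu, sigma = 50.0, 4.0  # hypothetical model parameters
y = 58.0               # hypothetical value to look up

z = (y - mu) / sigma   # standardize with the model's parameters

p_direct = NormalDist(mu, sigma).cdf(y)  # proportion below y under N(mu, sigma)
p_standard = NormalDist(0, 1).cdf(z)     # proportion below z under N(0, 1)

print(z, p_direct, p_standard)
```

The two proportions agree, so one standard Normal model is enough once the data are expressed as z-scores.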
Normality Assumption
 In using the Normal Model
to model our data, we
must have a unimodal and
symmetric distribution
 The Normality Assumption
is that the data is
unimodal and symmetric
 But it probably isn’t
exactly that…
Nearly Normal Condition
 The shape of the data’s
distribution is unimodal
and symmetric.
 Check this by making a
histogram
 All models make
assumptions – always
point out the assumption
you make for your model.
 Must also check the
conditions in the data to
make sure that those
assumptions are
reasonable.
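A histogram does not need plotting software. The sketch below prints a rough sideways histogram as text; `text_histogram` is a hypothetical helper and the data are made up for illustration. The point is only to eyeball whether the shape looks unimodal and roughly symmetric.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return one text row per bin, with one star per value in that bin."""
    counts = Counter(int(v // bin_width) for v in values)
    rows = []
    for b in range(min(counts), max(counts) + 1):
        low = b * bin_width  # lower edge of the bin
        rows.append(f"{low:>5} | " + "*" * counts.get(b, 0))
    return rows

# Made-up data: one peak in the middle, tapering evenly on both sides.
data = [17, 18, 18, 19, 19, 19, 20, 20, 20, 20, 21, 21, 21, 22, 22, 23]

for row in text_histogram(data, 1):
    print(row)
```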
Normal Models
 Normal models tell us
how extreme a value is by
telling us how likely it is to
find one that far from the
mean.
68-95-99.7 Rule
 In a Normal Model
about 68% of values
fall within 1 SD of the
mean
 About 95% of values
fall within 2 SD of the
mean
 About 99.7% of
values fall within 3 SD
of the mean
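The three percentages in the rule are rounded values. As a minimal sketch, they can be recovered from the standard Normal model using Python's `statistics.NormalDist`. The helper name `within` is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)  # the standard Normal model N(0, 1)

def within(k):
    """Proportion of values within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * within(k):.2f}%")
```

The computed values are close to 68%, 95%, and 99.7%, which is why the rule is stated with "about".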
Sample Problems
 Jean-Baptiste Grange of
France skied the slalom in
88.46sec, approximately 1
SD faster than the mean.
If a Normal Model is useful
in describing these slalom
times, about how many of
the 35 skiers finishing the
event would you expect
to have skied the slalom faster
than Jean-Baptiste?
 We expect 68% of skiers
to be within 1 SD of the
mean. Of the remaining
32%, we expect half on
the high end and half on
the low end.
 16% of 35 is 5.6, so
conservatively, we’d
expect about 5 skiers to
do better than Jean-Baptiste
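The arithmetic in this answer can be sketched in a few lines, alongside the value the standard Normal model gives without rounding to 68%. The variable names are assumptions for illustration.

```python
from statistics import NormalDist

n_skiers = 35

# 68-95-99.7 rule: 32% lie outside 1 SD, half of them on the fast side.
rule_fraction = (1 - 0.68) / 2
expected_by_rule = rule_fraction * n_skiers

# Same question using the standard Normal model: proportion with z < -1.
exact_fraction = NormalDist(0, 1).cdf(-1)
expected_exact = exact_fraction * n_skiers

print(round(expected_by_rule, 1), round(expected_exact, 1))
```

Both approaches give between 5 and 6 skiers, consistent with the estimate of about 5.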
The Dutch
 The Dutch are among the
tallest people in the
world: The average Dutch
man is 184cm tall, just
over six feet. The average
Dutch woman is just over
5’ 7’’ tall.
 If the Normal Model is
appropriate and the SD for
men is about 8cm, what
percentage of Dutch men
will be over 2 meters (6’
6’’) tall?
The Dutch
 Mean = 184 cm
 SD = 8 cm
 2 meters = 200cm
 200cm = 2 SD above mean
 We expect 5% of men to
be more than two
standard deviations below
or above the mean
 2.5% are likely to be above
2 meters
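As a check on this answer, the sketch below computes the z-score for 200 cm and compares the rule-based 2.5% with the proportion given by the Normal model N(184, 8). The variable names are assumptions for illustration.

```python
from statistics import NormalDist

mean_cm, sd_cm = 184.0, 8.0  # values used in the solution above
height_cm = 200.0            # 2 meters

z = (height_cm - mean_cm) / sd_cm  # how many SDs above the mean

rule_fraction = (1 - 0.95) / 2     # 95% rule: half of the remaining 5%
model_fraction = 1 - NormalDist(mean_cm, sd_cm).cdf(height_cm)

print(z, round(100 * rule_fraction, 1), round(100 * model_fraction, 1))
```

The model gives about 2.3%, slightly less than the 2.5% from the rule, because 95% is itself a rounded figure.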
Driving
 It takes you 20 minutes, on
average, to drive to school with
a standard deviation of 2
minutes
 Suppose a Normal Model is
appropriate for the distribution
of driving times
 A) How often will you arrive at
school in less than 22 minutes?
 Answer:
68% of the time we’ll be within 1
SD, or two minutes, of the
average 20 minutes.
So 32% of the time we’ll arrive in
less than 18 minutes or in more
than 22 minutes.
Half of those times (16%) will be
greater than 22 minutes, so 84%
will be less than 22 minutes
Driving
 It takes you 20 minutes,
on average, to drive to
school with a standard
deviation of 2 minutes
 Suppose a Normal Model
is appropriate for the
distribution of driving
times
 B) How often will it take
you more than 24
minutes?
 Answer: 24 minutes is 2
SD above the mean. By
the 95% rule, we know
2.5% of the times will be
more than 24 minutes
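Parts A and B can both be checked against the Normal model N(20, 2). This is a minimal sketch, and the variable names are assumptions for illustration.

```python
from statistics import NormalDist

drive = NormalDist(20, 2)  # mean 20 minutes, SD 2 minutes

# Part A: proportion of trips shorter than 22 minutes (1 SD above the mean).
p_under_22 = drive.cdf(22)

# Part B: proportion of trips longer than 24 minutes (2 SD above the mean).
p_over_24 = 1 - drive.cdf(24)

print(round(100 * p_under_22, 1), round(100 * p_over_24, 1))
```

The model values are close to the 84% and 2.5% obtained from the 68-95-99.7 rule.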
Driving
 It takes you 20 minutes, on
average, to drive to school
with a standard deviation of
2 minutes
 Suppose a Normal Model is
appropriate for the
distribution of driving times
 C) Do you think the
distribution of your driving
times is unimodal and
symmetric?
 Answer: “Good” traffic will
speed up your time by a bit
but traffic incidents may
occasionally increase the
time it takes so times may
be skewed to the right and
there may be outliers.
Driving
 It takes you 20 minutes,
on average, to drive to
school with a standard
deviation of 2 minutes
 Suppose a Normal Model
is appropriate for the
distribution of driving
times
 D) What does the shape of
the distribution then say
about the accuracy of
your predictions?
 Answer: If this is the case
the Normal Model is not
appropriate and the
percentages we predict
would not be accurate.
pg 129, # 1, 2, 3, 5, 7, 9, 24
(Handed in Tomorrow for Real)
Working With Normal Models
1. Make a Picture
2. Make a Picture
3. Make a Picture
How to Draw a Normal Curve:
- Bell shaped, symmetric about
mean: start at the middle and
sketch the left and right
- Only need to draw out to 3SD
- The place where the bell shape
changes from curving downward
to curving back up – the
inflection point – is located
exactly one standard deviation
from the mean
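The location of the inflection points can be verified from the formula for the Normal curve. The density function below is the standard one for a Normal model and is not given in the slides, so treat this as a supplementary sketch: the curve changes from curving downward to curving upward where its second derivative is zero.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

f'(x) = -\frac{x-\mu}{\sigma^2}\, f(x)

f''(x) = \left(\frac{(x-\mu)^2}{\sigma^4} - \frac{1}{\sigma^2}\right) f(x)
       = \frac{(x-\mu)^2 - \sigma^2}{\sigma^4}\, f(x)

f''(x) = 0 \iff (x-\mu)^2 = \sigma^2 \iff x = \mu \pm \sigma
```

Since $f(x)$ is never zero, the only sign changes of $f''$ are at $\mu - \sigma$ and $\mu + \sigma$, one standard deviation on either side of the mean.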