Welcome to
Week 08, Tues.
MAT135 Statistics
Non-normal Distributions
Last class we studied a lot
about the normal distribution
Some distributions are not
normal …
Non-normal Distributions
Skewness – the data are
“bunched” to one side vs a
normal curve
Non-normal Distributions
Scores that are "bunched" at
the right or high end of the
scale are said to have a
“negative skew”
Non-normal Distributions
In a “positive skew”, scores are
bunched near the left or low
end of a scale
Non-normal Distributions
Note: this is exactly the
opposite of how most people use
the terms!
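The direction of skew can also be checked numerically: the skewness coefficient is positive when the long tail points right (scores bunched low) and negative when it points left (scores bunched high). A minimal sketch using only the Python standard library; the two score lists are made-up illustrative data, not class data.

```python
from statistics import fmean

def skewness(data):
    """Skewness coefficient: mean cubed deviation divided by the cubed SD."""
    n = len(data)
    m = fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical scores bunched at the low end, with a long tail to the right
bunched_low = [1, 1, 2, 2, 2, 3, 3, 4, 6, 10]
# Mirror image: bunched at the high end, with a long tail to the left
bunched_high = [11 - x for x in bunched_low]

print(skewness(bunched_low))   # positive skew
print(skewness(bunched_high))  # negative skew
```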
NON-NORMAL DISTRIBUTIONS
PROJECT QUESTIONS 1,2
Which is positively skewed?
Which is negatively skewed?
Non-normal Distributions
Kurtosis - how tall or flat your
curve is compared to a normal
curve
Non-normal Distributions
Curves taller than a normal
curve are called “Leptokurtic”
Curves that are flatter than a
normal curve are called
“Platykurtic”
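A common numeric version of this comparison is excess kurtosis, which is 0 for a normal curve, positive for leptokurtic data, and negative for platykurtic data. The sketch below is only an illustration; both data lists are invented for the example.

```python
from statistics import fmean

def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (a normal curve scores 0)."""
    n = len(data)
    m = fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3

# Hypothetical flat data: every value equally common
flat = list(range(1, 11))
# Hypothetical peaked data: a tall center with a few far-out values
peaked = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]

print(excess_kurtosis(flat))    # negative: platykurtic
print(excess_kurtosis(peaked))  # positive: leptokurtic
```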
Non-normal Distributions
[Figure: platykurtic and leptokurtic curves, credited to W.S. Gosset, 1908]
NON-NORMAL DISTRIBUTIONS
PROJECT QUESTIONS 3,4
Which is platykurtic?
Which is leptokurtic?
Questions?
Normal Distributions
We use normal distributions a
lot in statistics because lots of
things have graphs this shape!
-heights
-weights
-IQ test scores
-bull’s eyes
Normal Distributions
Also, even data which are not
normally distributed
have averages which DO have
normal distributions
Normal Distributions
If you take a gazillion samples
and find the means for each of
the gazillion samples
You would have a new
population:
the gazillion means
Normal Distributions
If you plotted the frequency of
the gazillion mean values, it is
called a
SAMPLING DISTRIBUTION
Sampling Distributions
The shape of the plot of the
gazillion sample means would
have a normal-ish distribution
NO MATTER WHAT THE
ORIGINAL DATA LOOKED
LIKE
Sampling Distributions
But … the shape of the
distribution of your gazillion
means changes with the size “n”
of the samples you took
Sampling Distributions
Graphs of a gazillion means for
different n values
Sampling Distributions
As “n” increases, the
distributions of the means
become closer and closer
to normal
Sampling Distributions
This also
works for
discrete data
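A small simulation can make this concrete for discrete data. The sketch below rolls a fair die (a flat, non-normal population), takes many samples of each size n, and measures how far the shape of the sample means is from normal using excess kurtosis (0 for a normal curve). The seed, the number of repetitions, and the choice of a die are all assumptions made for illustration.

```python
import random
from statistics import fmean

def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (a normal curve scores 0)."""
    n = len(data)
    m = fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3

rng = random.Random(135)  # fixed seed so the run is repeatable
REPS = 5000               # stand-in for "a gazillion" samples

shape = {}
for n in (1, 5, 30):
    # Each entry is the mean of one sample of n die rolls
    means = [fmean(rng.randint(1, 6) for _ in range(n)) for _ in range(REPS)]
    shape[n] = excess_kurtosis(means)
    print(n, round(shape[n], 2))
```

The printed value should move toward 0 as n grows, which is the "closer and closer to normal" behavior described on the slides.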
Sampling Distributions
as “n” increases,
variability (spread) also decreases
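The shrinking spread can be seen in a quick simulation as well. This sketch again uses die rolls as an assumed population and compares the standard deviation of the sample means for three sample sizes; each fourfold increase in n should roughly halve the spread.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(8)  # fixed seed so the run is repeatable
REPS = 4000             # number of samples drawn for each n

def spread_of_means(n):
    """Standard deviation of REPS sample means, each from n die rolls."""
    means = [fmean(rng.randint(1, 6) for _ in range(n)) for _ in range(REPS)]
    return pstdev(means)

spread = {n: spread_of_means(n) for n in (4, 16, 64)}
for n, sd in spread.items():
    print(n, round(sd, 3))
```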
Sampling Distributions
We usually say the sample mean
will be normally distributed if n
is ≥ 20 (or 30…)
(the “good-enuff” value)
Sampling Distributions
I call it:
20or30
Sampling Distributions
The statistical principle that
allows us to conclude that
sample means have a normal
distribution if the sample size
is 20or30 or more is called the
Central Limit Theorem
Sampling Distributions
If you can assume the
distribution of the sample
means is normal, you can use
the normal distribution
probabilities for making
probability statements about µ
Sampling Distributions
Sample means from platykurtic,
leptokurtic, and bimodal
distributions become “normal
enough” when your sample size
n is 20or30 or more
Sampling Distributions
Means from samples of skewed
populations do not become
“normal enough” very easily
You sometimes need a
mega-huge
sample size to “normalize” a
badly skewed distribution
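To illustrate, the sketch below draws samples of size 30 from a symmetric population and from a badly skewed one, then measures the skewness of the resulting sample means. The two populations (uniform and lognormal), the seed, and the repetition count are assumptions chosen for the example.

```python
import random
from statistics import fmean

def skewness(data):
    """Skewness coefficient: mean cubed deviation divided by the cubed SD."""
    n = len(data)
    m = fmean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

rng = random.Random(30)  # fixed seed so the run is repeatable
REPS, N = 3000, 30       # 3000 samples, each of size 30

populations = {
    "symmetric": lambda: rng.uniform(0, 1),
    "badly skewed": lambda: rng.lognormvariate(0, 1.5),
}

skew_of_means = {}
for name, draw in populations.items():
    means = [fmean(draw() for _ in range(N)) for _ in range(REPS)]
    skew_of_means[name] = skewness(means)
    print(name, round(skew_of_means[name], 2))
```

The means from the symmetric population should show skewness near 0, while the means from the badly skewed population stay clearly skewed even at n = 30.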
Sampling Distributions
A wild outlier might indicate a
badly skewed distribution
SAMPLING DISTRIBUTIONS
PROJECT QUESTION 5
From which of these would you
expect the distribution of the
sample means to be normal?
-Original population normal
-Samples taken of size 10
-Samples taken of size 50
-Highly skewed population
Questions?
Graphs of 𝒙
[Figure: graph of 𝒙 values]
Graphs of 𝒙
Averages (measures of central
tendency) show where the data
tend to pile up
Graphs of 𝒙
The place where 𝒙 tends to pile
up is at μ
Graphs of 𝒙
So, the most likely value for 𝒙
is μ
Graphs of 𝒙
As you move away from μ on
the graph, 𝒙 is less likely to
have these values
Graphs of 𝒙
GRAPHS OF 𝒙
PROJECT QUESTION 6
Population mean: μ
Sample mean: 𝒙
What is the best estimate
we have for the unknown
population mean µ ?
𝒙
is the best estimate
we have for the unknown
population mean µ
Graphs of 𝒙
The mean of all of the
gazillion 𝒙 values
will be µ
GRAPHS OF 𝒙
PROJECT QUESTION 7
Graph of likely values for µ:
[Figure: curve of likely values for µ, centered at 𝒙]
Questions?
Estimation
We will use the sample mean 𝒙
to estimate the unknown
population mean µ
Estimation
Using the sample mean 𝒙 to
estimate the unknown population
mean µ is called “making
inferences”
Estimation
The sample standard deviation
“s” is the best estimate we have
for the unknown population
standard deviation “σ”
Estimation
Using s to estimate σ
is also
an inference
Estimation
You would think, since we use 𝒙
to estimate µ and s to estimate
σ that the graph of 𝒙 would be:
𝒙-3s 𝒙-2s 𝒙-s 𝒙 𝒙+s 𝒙+2s 𝒙+3s
Estimation
It’s not…
Estimation
Remember, as “n” increases,
the variability decreases:
Estimation
While s is a good estimate for
the original population standard
deviation σ, s IS NOT the
measure of variability in the
new population of 𝒙s
Estimation
It needs to be decreased to take
sample size into account!
Estimation
We use:
s/√n
for the measure of variability
in the new population of 𝒙s
Estimation
The standard deviation
of the 𝒙s: s/√n
is called the “standard error”
abbreviated “se”
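As a small illustration, the standard error can be computed from raw data by dividing the sample standard deviation s by the square root of the sample size n. The function name and the nine measurements below are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """se = s / sqrt(n), where s is the sample standard deviation."""
    return stdev(sample) / sqrt(len(sample))

# Hypothetical sample of 9 measurements (not class data)
sample = [12, 15, 11, 14, 13, 16, 10, 14, 12]

print(stdev(sample))           # s, the estimate of sigma
print(standard_error(sample))  # se, smaller than s by a factor of sqrt(9) = 3
```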
Estimation
BTW: you now know ALL of the
items on the Descriptive
Statistics list in Excel !
Estimation
So our curve is:
𝒙-3se 𝒙-2se 𝒙-se 𝒙 𝒙+se 𝒙+2se 𝒙+3se
ESTIMATION
PROJECT QUESTION 8
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
Can we assume the population of
𝒙s forms a normal distribution?
Yes. Because the sample size 49 is
above the usual “good-enuff”
value of 20or30, the distribution
of the 𝒙s will be close to normal
unless the original distribution
is very skewed
ESTIMATION
PROJECT QUESTION 9
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is our best estimate of
the original population mean?
150
ESTIMATION
PROJECT QUESTION 10
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is our best estimate of
the original population standard
deviation?
56
ESTIMATION
PROJECT QUESTION 11
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is our estimate of the
standard error?
56/√49 = 56/7 = 8
ESTIMATION
PROJECT QUESTION 12
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What will be the normal curve
for the 𝒙s ?
Our curve is:
126 134 142 150 158 166 174
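The tick marks on this curve are the mean of the 𝒙s plus or minus 1, 2, and 3 standard errors. A minimal sketch of that arithmetic, using the numbers from the question (the variable names are illustrative):

```python
from math import sqrt

mean_of_means = 150  # mean of the sample means, from the question
s = 56               # standard deviation, from the question
n = 49               # sample size, from the question

se = s / sqrt(n)  # 56 / 7 = 8.0
ticks = [mean_of_means + k * se for k in range(-3, 4)]
print(ticks)  # [126.0, 134.0, 142.0, 150.0, 158.0, 166.0, 174.0]
```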
ESTIMATION
PROJECT QUESTION 13
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is the probability that the
true population mean is between
142 and 158?
68%
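The 68% figure comes from the area under a normal curve within one standard error of the center. As a rough check, the sketch below computes the areas within 1, 2, and 3 standard errors using `statistics.NormalDist` from the Python standard library, assuming a normal curve with mean 150 and standard error 8.

```python
from statistics import NormalDist

x_bar_dist = NormalDist(mu=150, sigma=8)  # sigma here is the standard error

coverage = {}
for k in (1, 2, 3):
    low, high = 150 - k * 8, 150 + k * 8
    # Area under the curve between low and high
    coverage[k] = x_bar_dist.cdf(high) - x_bar_dist.cdf(low)
    print(f"{low} to {high}: {coverage[k]:.1%}")
```

The three areas are approximately 68%, 95%, and 99.7%.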
ESTIMATION
PROJECT QUESTION 14
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is the probability that the
true population mean is 150?
0%
ESTIMATION
PROJECT QUESTION 15
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is the range of values that
would ensure with 95%
probability that we include the
mean?
134-166 (𝒙 ± 2se)
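Under a normal curve with mean 150 and standard error 8, a 95% range is the mean plus or minus about 2 standard errors. The sketch below compares that rule of thumb with the more exact multiplier from the normal curve, using `NormalDist.inv_cdf`; the variable names are illustrative.

```python
from statistics import NormalDist

x_bar, se = 150, 8

# The middle 95% leaves 2.5% in each tail, so look up the 97.5th percentile
z = NormalDist().inv_cdf(0.975)  # about 1.96; the rule of thumb rounds to 2

exact = (x_bar - z * se, x_bar + z * se)
rule_of_thumb = (x_bar - 2 * se, x_bar + 2 * se)

print(exact)          # roughly (134.3, 165.7)
print(rule_of_thumb)  # (134, 166)
```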
ESTIMATION
PROJECT QUESTION 16
Suppose we have a population of
𝒙s from samples of size 49
The mean of the 𝒙s is 150
The standard deviation is 56
What is the probability the true
µ lies between 130 and 145?
26%
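For a range that does not line up with whole standard errors, one approach is to convert each endpoint to a z-score and take the difference of the normal-curve areas. A minimal sketch, assuming mean 150 and standard error 8 as in the question:

```python
from statistics import NormalDist

x_bar, se = 150, 8
low, high = 130, 145

# Distance of each endpoint from the center, in standard errors
z_low = (low - x_bar) / se    # -2.5
z_high = (high - x_bar) / se  # -0.625

std_normal = NormalDist()  # mean 0, standard deviation 1
p = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(round(p, 2))  # 0.26
```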
Questions?
You survived!
Turn in your homework!
Don’t forget
your homework
due next class!
See you Thursday!