Statistical Foundations:
The Normal Distribution
The Central Limit Theorem
Other Fun Things...
Lecture 7
September 14, 2006
Psychology 790
Lecture #7 - 9/14/2006
Slide 1 of 32
Today’s Lecture
● Homework questions?
  ✦ Any questions on homework #2?
● The Normal Distribution (Chapter 6.1-6.6).
● The Central Limit Theorem (Chapter 6.7).
● Information about Maximum Likelihood Estimators.
● Biasedness and Unbiasedness.
Slide 2 of 32
Univariate Normal Distribution
● The univariate normal distribution function is:
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
● The mean is µ.
  ✦ The mean sets the center of the distribution on the x axis.
● The variance is σ².
  ✦ The standard deviation is σ.
  ✦ The variance sets the spread/dispersion/width of the distribution.
● Standard notation for normal distributions is N(µ, σ²).
  ✦ Hence, the letter N from Wonder Showzen.
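The density formula above can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function name normal_pdf and its argument names are illustrative assumptions, and the second parameter is the variance σ², matching the N(µ, σ²) notation.

```python
import math

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2) at x. Note sigma2 is the variance, not the SD."""
    exponent = -((x - mu) ** 2) / (2.0 * sigma2)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * sigma2)

# Height of each curve at its own mean.
peak_standard = normal_pdf(0.0)             # N(0, 1)
peak_wide = normal_pdf(0.0, 0.0, 2.0)       # N(0, 2): same center, lower peak
peak_shifted = normal_pdf(3.0, 3.0, 1.0)    # N(3, 1): same shape, centered at 3
```

Changing µ only slides the curve along the x axis, while increasing σ² lowers and widens it.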
Slide 3 of 32
Univariate Normal Distribution
[Figure: density f(x) of N(0, 1) plotted against x, for x from −6 to 6.]
Slide 4 of 32
Univariate Normal Distribution
[Figure: density f(x) of N(0, 2) plotted against x, for x from −6 to 6.]
Slide 5 of 32
Univariate Normal Distribution
[Figure: density f(x) of N(3, 1) plotted against x, for x from −6 to 6.]
Slide 6 of 32
Normal Distribution Notes
● The area under the curve for the normal distribution is equal to one (recall our probability lecture about P(S)).
● Furthermore, we know that:
$$P(\mu - \sigma \le X \le \mu + \sigma) = 0.683$$
$$P(\mu - 2\sigma \le X \le \mu + 2\sigma) = 0.954$$
● Also note the term in the exponent:
$$-\frac{(x-\mu)^2}{2\sigma^2}$$
● Apart from the factor of −1/2, this is the square of the distance from x to µ in standard deviation units, ((x − µ)/σ)².
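The two probabilities quoted above can be checked numerically. This is a small sketch using only the Python standard library; the helper name prob_within is an assumption made for illustration. It relies on the identity P(|Z| ≤ k) = erf(k/√2) for a standard normal Z, which applies to any normal X after standardizing.

```python
import math

def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normally distributed X."""
    return math.erf(k / math.sqrt(2.0))

within_one_sd = prob_within(1.0)   # about 0.683
within_two_sd = prob_within(2.0)   # about 0.954
```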
Slide 7 of 32
Cumulative Normal Distribution
● The cumulative normal distribution (denoted F(x)) gives the probability an observation falls at or below x.
  ✦ In probability notation: F(x) = p(a ≤ x), where a denotes the observation.
  ✦ This probability is formed by taking the area under the curve to the left of the point (found by integration).
● For instance, the probability of finding a point less than or equal to the mean would be given by:
$$F(\mu) = p(a \le \mu) = \int_{-\infty}^{\mu} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx = 0.5$$
● You can replace the integral ∫ with what you find from a table in the book (or the =normdist function in Excel).
● Another name for the cumulative distribution is the cumulative distribution function (abbreviated CDF); these slides also label it the cumulative density function.
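Instead of a table, the integral can be evaluated with the error function. The sketch below is an illustration rather than the course's own method; normal_cdf is a hypothetical helper name, and its second parameter is the variance σ².

```python
import math

def normal_cdf(x, mu=0.0, sigma2=1.0):
    """F(x) for N(mu, sigma2): area under the density to the left of x."""
    z = (x - mu) / math.sqrt(sigma2)          # distance from the mean in SD units
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Half of the area lies at or below the mean, whatever mu and sigma2 are.
half = normal_cdf(3.0, mu=3.0, sigma2=1.0)
```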
Slide 8 of 32
Normal Distribution Functions
[Figure: N(0, 1). Left panel: Probability Density Function, f(x) against x. Right panel: Cumulative Density Function, F(x) against x.]
Slide 9 of 32
Normal Distribution Functions
[Figure: N(0, 2). Left panel: Probability Density Function, f(x) against x. Right panel: Cumulative Density Function, F(x) against x.]
Slide 10 of 32
Normal Distribution Functions
[Figure: N(3, 1). Left panel: Probability Density Function, f(x) against x. Right panel: Cumulative Density Function, F(x) against x.]
Slide 11 of 32
Finding Probabilities
● We can use the Normal CDF to assess the probability an observation falls within a given range.
● For instance, imagine we had a variable X that we knew was distributed N(0,1) - standard normal.
● What is p(−0.75 ≤ X ≤ 1.0)?
● How would you figure that out?
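One way to answer the question is to subtract two CDF values, F(1.0) − F(−0.75). The sketch below uses statistics.NormalDist from the Python standard library (Python 3.8 or later) in place of the book's table; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Area to the left of 1.0 minus area to the left of -0.75.
p_range = standard_normal.cdf(1.0) - standard_normal.cdf(-0.75)
print(round(p_range, 4))
```

The result is roughly 0.61, so about 61% of standard normal observations fall in this range.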
Slide 12 of 32
More Normal Notes
● The normal distribution is frequently used in mathematical statistics because it is very flexible.
● We will assume normality of many things in this course.
● For instance, we will come to say the amount of error in prediction we have in a regression will be normally distributed.
  ✦ This leads to hypothesis tests.
● Then there is the Central Limit Theorem...
Slide 13 of 32
Assessing Normality
● We will find that there are two ways to assess normality/MVN.
1. By comparing the distribution of your observations (or some transformation of your observations) to some known distribution. (These are commonly called Q-Q plots.)
2. By computing some set of statistics and obtaining a p-value (i.e., compute a statistic with a known distribution and determine how extreme the statistic is compared to a null hypothesis).
Slide 14 of 32
Assessing Univariate Normality
● There are situations where we would like to assess whether a variable is normally distributed using a Q-Q plot.
● A Q-Q plot is a plot that matches the Quantiles of the observed data with the Quantiles of a specific distribution.
● A Quantile (commonly called a percentile) is that value such that a specific proportion p of the population will score at or below.
  ✦ For example the .5 quantile of a N(0,1) is 0.
● In our case the Quantiles of a specific distribution will be a normal, N(0, 1).
  ✦ It could be a N(x̄, s²ₓ), if preferred.
● There should be a linear relationship between the quantiles of the observed data and their theoretical quantiles (assuming the distribution) if they follow the same distribution.
Slide 15 of 32
Constructing a Q-Q plot
Let’s assume that we have n observations x1, x2, . . . , xn. To construct a Q-Q plot we:
1. Order the observations from smallest to largest (i.e., x(1) ≤ x(2) ≤ . . . ≤ x(n)).
2. Next we define the ith point, x(i), as the (i − .5)/n quantile.
  ● We could use i/n, but that can cause problems: the largest observation would be the 1.0 quantile, whose N(0, 1) value is infinite.
3. Based on a N(0, 1) distribution we compute the quantile values q1, q2, . . . , qn (this is typically done using a table or computer).
4. Finally plot (x(i), qi), and if they follow the same distribution (Normal) they should form a line.
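The four steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's software: qq_points is a hypothetical helper, and the N(0, 1) quantiles come from statistics.NormalDist.inv_cdf (Python 3.8 or later) rather than a table. The five observations are the ones used in the example slides.

```python
from statistics import NormalDist

def qq_points(observations):
    """Return (x_(i), q_i) pairs for a normal Q-Q plot."""
    ordered = sorted(observations)              # step 1: order the data
    n = len(ordered)
    std_normal = NormalDist()                   # N(0, 1)
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.5) / n                       # step 2: quantile level of x_(i)
        q = std_normal.inv_cdf(p)               # step 3: matching N(0, 1) quantile
        points.append((x, q))
    return points                               # step 4: plot these pairs

points = qq_points([3, 6, 4, 5, 2])
for x, q in points:
    print(x, round(q, 2))
```

Passing the pairs to any plotting tool gives the Q-Q plot; the points should fall close to a line if the data are normal.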
Slide 16 of 32
Example Q-Q plot
Let’s assume that we have 5 observations: 3, 6, 4, 5, 2:
First we order them

y(i)    (i − .5)/n         qi
2
3
4
5
6
Slide 17 of 32
Example Q-Q plot
Let’s assume that we have 5 observations: 3, 6, 4, 5, 2:
Next compute quantiles

y(i)    (i − .5)/n         qi
2       (1 − .5)/5 = .1
3       (2 − .5)/5 = .3
4       (3 − .5)/5 = .5
5       (4 − .5)/5 = .7
6       (5 − .5)/5 = .9
Slide 18 of 32
Example Q-Q plot
Let’s assume that we have 5 observations: 3, 6, 4, 5, 2:
Finally compute quantile values assuming N(0, 1) (i.e., this is a z-score)

y(i)    (i − .5)/n         qi
2       (1 − .5)/5 = .1    -1.28
3       (2 − .5)/5 = .3    -0.52
4       (3 − .5)/5 = .5     0.00
5       (4 − .5)/5 = .7     0.52
6       (5 − .5)/5 = .9     1.28

and plot
Slide 19 of 32
Example Q-Q plot
Notice how it follows nearly a straight line.
[Figure 1: Q-Q plot of the observed values Y (vertical axis) against the N(0, 1) quantiles Q (horizontal axis).]
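One rough way to put a number on "nearly a straight line" is the correlation between the ordered observations and their theoretical quantiles. This check is an added illustration, not something the slides prescribe, and pearson_r is a hypothetical helper written out by hand so the sketch needs only the standard library.

```python
import math
from statistics import NormalDist

observed = sorted([3, 6, 4, 5, 2])
n = len(observed)
# N(0, 1) quantiles at the levels (i - .5)/n, as in the example table.
theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

qq_correlation = pearson_r(theoretical, observed)
print(round(qq_correlation, 3))
```

A value close to 1 agrees with the visual impression; with only five points it is a description, not a formal test.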
Slide 20 of 32
Other Tests
Other tests can be gathered in SAS (note the null hypothesis is always that the data are normally distributed).
We will discuss this in later classes (especially in regression diagnostics).
/* NORMAL requests tests for normality; PLOT requests descriptive plots, */
/* including a normal probability plot, for the variables x1 through x5. */
proc univariate data=mydata normal plot;
  var x1-x5;
run;
Slide 21 of 32
Central Limit Theorem
From Hays:
  If a population has a finite variance σ² and a finite mean µ, the distribution of sample means of N independent observations approaches the form of a normal distribution with variance σ²/N and mean µ as the sample size N increases. When N is very large, the sampling distribution of x̄ is approximately N(µ, σ²/N). (p. 251)
● So...The distribution of x̄ converges to normal with mean equal to µ and variance σ²/N.
  ✦ This is true no matter how X is distributed, as long as its mean and variance are finite.
● If X is normal, (N − 1)s²/σ² has a Chi-Square distribution with N − 1 degrees of freedom.
● Note: We will end up using these pieces of information for hypothesis testing such as t-test and ANOVA.
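A short simulation can make the theorem concrete. The sketch below is an added illustration with assumed settings: it draws from an exponential distribution with µ = 1 and σ² = 1 (clearly not normal), uses N = 50 observations per sample and 5000 replications, and fixes an arbitrary seed for reproducibility.

```python
import random
import statistics

rng = random.Random(790)     # arbitrary seed, chosen only for reproducibility
N = 50                       # observations per sample
REPLICATIONS = 5000          # number of sample means to collect

sample_means = []
for _ in range(REPLICATIONS):
    sample = [rng.expovariate(1.0) for _ in range(N)]   # skewed population
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)          # should be near mu = 1
variance_of_means = statistics.variance(sample_means)   # should be near 1/N = 0.02
print(mean_of_means, variance_of_means)
```

A histogram of sample_means would look close to N(1, 0.02) even though the individual observations are strongly skewed.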
Slide 22 of 32
Maximum Likelihood
● Now that we know a thing or two about the PDF of the normal distribution, it makes sense to talk about maximum likelihood.
● As we said last time, an MLE is an estimator which has a value that maximizes something called a likelihood function.
● In the statistics you will encounter in your career, MLEs will frequently be used, so I present this to give you background.
  ✦ In this class, you will never have to find what an MLE is, only to know what types of properties MLEs have.
● We said last time that x̄ and S² were MLEs - just know that.
Slide 23 of 32
Likelihood Functions
● Let’s start our discussion by talking about a likelihood function.
● A likelihood function is the statistical model for the data formed by the distribution the data follows.
● Let’s imagine we have a sample of independent observations x1, x2, . . . , xN we know come from N(0,1).
  ✦ Statisticians would say that our sample is iid or Independent and Identically Distributed.
● Because of independence, the joint distribution of the data is formed by taking the product of the individual distribution functions of the data (like independence in our probability chapter):
$$L(x_1, x_2, \ldots, x_N) = f(x_1) \times f(x_2) \times \ldots \times f(x_N) = \prod_{i=1}^{N} f(x_i)$$
Slide 24 of 32
Likelihood Functions
$$L(x_1, x_2, \ldots, x_N) = f(x_1) \times f(x_2) \times \ldots \times f(x_N) = \prod_{i=1}^{N} f(x_i)$$
$$L(x_1, x_2, \ldots, x_N) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$$
● The likelihood function above is the function of the data that needs to be maximized with respect to µ or σ².
  ✦ By calculus, we know that:
    ■ x̄ happens to be where L is maximized for µ.
    ■ S² happens to be where L is maximized for σ² (taking S² as the version that divides by N).
● Let’s have an example...we have five observations: 2,3,4,5,6, which are iid N(µ, 2.5).
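The example can be explored numerically by evaluating L(µ) over a grid of candidate values and picking the largest. This is a brute-force sketch for illustration only (the calculus result makes a search unnecessary); the grid from 3.0 to 5.0 in steps of 0.01 and the names likelihood and best_mu are assumptions.

```python
import math

data = [2, 3, 4, 5, 6]
SIGMA2 = 2.5                  # variance treated as known in the example

def likelihood(mu):
    """Product of N(mu, SIGMA2) densities over the five observations."""
    value = 1.0
    for x in data:
        density = math.exp(-((x - mu) ** 2) / (2.0 * SIGMA2))
        density /= math.sqrt(2.0 * math.pi * SIGMA2)
        value *= density
    return value

grid = [3.0 + i / 100 for i in range(201)]    # candidate values of mu
best_mu = max(grid, key=likelihood)           # grid point with the largest L(mu)
sample_mean = sum(data) / len(data)
print(best_mu, sample_mean)
```

The grid maximum lands on the sample mean, as the calculus result says it should.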
Slide 25 of 32
Likelihood Functions
Our likelihood function:
$$L(\mu) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi \times 2.5}} e^{-\frac{(x_i-\mu)^2}{2 \times 2.5}}$$
[Figure: the likelihood L(mu) plotted against mu, for mu from 3.6 to 4.4.]
Slide 26 of 32
MLE Properties
● MLEs are used frequently because of their properties:
  ✦ Functional invariance: any function of an MLE is itself the MLE of that function of the parameter.
  ✦ Asymptotic behavior: when N is very large, the variance of the MLE reaches an important lower limit, the Cramér-Rao lower bound (implications for consistency and relative efficiency).
● What does all of this mean to you?
  ✦ MLEs are your friends.
  ✦ You now know the basics, so do not shy from talking about MLEs (or reading about them).
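Functional invariance can be illustrated with the five observations from the likelihood example. The sketch below rests on assumptions worth stating: it uses the divide-by-N variance (statistics.pvariance) as the MLE of σ², fixes µ at x̄, and compares the square root of that estimate with a direct grid search over σ. The grid and variable names are illustrative.

```python
import math
import statistics

data = [2.0, 3.0, 4.0, 5.0, 6.0]
n = len(data)
x_bar = statistics.fmean(data)
sum_sq = sum((x - x_bar) ** 2 for x in data)

sigma2_mle = statistics.pvariance(data)   # divides by N
sigma_mle = math.sqrt(sigma2_mle)         # invariance: sqrt of the variance MLE

def log_likelihood(sigma):
    """Normal log-likelihood in sigma with mu fixed at x_bar (constants dropped)."""
    return -n * math.log(sigma) - sum_sq / (2.0 * sigma ** 2)

grid = [0.5 + i / 1000 for i in range(2501)]    # sigma from 0.5 to 3.0
best_sigma = max(grid, key=log_likelihood)
print(sigma_mle, best_sigma)
```

Maximizing directly over σ gives, up to the grid spacing, the same value as taking the square root of the estimate for σ².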
Slide 27 of 32
Bias Recap
● Last time we said that x̄ is unbiased for µ.
  ✦ This means that E(x̄) = µ.
● We also said that if we use:
$$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}$$
For variance, then:
$$E(s^2) = \sigma^2$$
● For your information, these slides show you how...
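Unbiasedness can also be seen by simulation: average s² over many samples and compare with σ². The settings below are assumptions chosen for illustration (normal data with σ = 2, so σ² = 4, N = 5 per sample, 20000 replications, an arbitrary seed). The divide-by-N version is included for contrast.

```python
import random

rng = random.Random(2006)    # arbitrary seed for reproducibility
N = 5
SIGMA = 2.0                  # population SD, so sigma^2 = 4
REPLICATIONS = 20000

total_unbiased = 0.0
total_biased = 0.0
for _ in range(REPLICATIONS):
    sample = [rng.gauss(0.0, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    sum_sq = sum((x - x_bar) ** 2 for x in sample)
    total_unbiased += sum_sq / (N - 1)    # s^2 with divisor N - 1
    total_biased += sum_sq / N            # divisor N

mean_unbiased = total_unbiased / REPLICATIONS   # should be near 4.0
mean_biased = total_biased / REPLICATIONS       # should be near 4.0 * (N - 1) / N = 3.2
print(mean_unbiased, mean_biased)
```

The divide-by-N average falls short of σ² by the factor (N − 1)/N, which is why N − 1 appears in the denominator of s².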
Slide 28 of 32
Mean
$$E(\bar{x}) = E\left(\sum_{i=1}^{N} \frac{x_i}{N}\right) = E\left(\frac{x_1 + x_2 + \ldots + x_N}{N}\right)$$
$$= \frac{E(x_1) + E(x_2) + \ldots + E(x_N)}{N} = \frac{\mu + \mu + \ldots + \mu}{N} = \frac{N\mu}{N} = \mu$$
Slide 29 of 32
Variance
From http://en.wikipedia.org/wiki/Variance
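A standard argument for E(s²) = σ² is sketched here under the assumption that the x_i are independent with mean µ and variance σ²; it is not necessarily the exact presentation on the cited page.

```latex
\begin{align*}
\sum_{i=1}^{N} (x_i - \bar{x})^2
  &= \sum_{i=1}^{N} (x_i - \mu)^2 - N(\bar{x} - \mu)^2 \\
E\left[\sum_{i=1}^{N} (x_i - \mu)^2\right]
  &= N\sigma^2 \\
E\left[N(\bar{x} - \mu)^2\right]
  &= N \operatorname{Var}(\bar{x}) = N \cdot \frac{\sigma^2}{N} = \sigma^2 \\
E\left[\sum_{i=1}^{N} (x_i - \bar{x})^2\right]
  &= N\sigma^2 - \sigma^2 = (N - 1)\sigma^2 \\
E(s^2)
  &= \frac{(N - 1)\sigma^2}{N - 1} = \sigma^2
\end{align*}
```

The third line uses Var(x̄) = σ²/N, the same variance that appears in the Central Limit Theorem statement.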
Slide 30 of 32
Final Thought
● The normal distribution is something we will come to use quite often.
  ✦ Part of that frequency is due to the CLT.
● MLEs are something to understand but their determination is not the focus of this course.
● Biased and unbiased estimators are not talked about much after this lecture.
Slide 31 of 32
Next Time
● Hypothesis testing, Part I (Chapter 7.1 to 7.10)
● We will recap this week’s Wonder Showzen: MTV2 Friday at 8:30pm.
Slide 32 of 32