Econ 140
More on Univariate Populations
Lecture 4
Today’s Plan
• Examining known distributions:
– Normal distribution & standard normal curve
– Student’s t distribution
– F distribution & \(\chi^2\) distribution
• Note: you should have a handout for today’s lecture with all tables and a cartoon
Standard Normal Curve
• We need to calculate something other than our PDF, using
the sample mean, the sample variance, and an assumption
about the shape of the distribution function
• We will examine this assumption later
• The standard normal curve (tabulated in the Z table) approximates the probability distribution of many continuous variables; by the central limit theorem, the standardized sample mean of almost any variable approaches it as the number of observations approaches infinity
Standard Normal Curve (2)
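• The standard deviation (which measures distance from the mean) is the square root of the variance: \(\sigma = \sqrt{\sigma^2}\)
[Figure: normal curve centered on \(\mu_Y\), with the horizontal axis marked at \(\pm\sigma\), \(\pm 2\sigma\), and \(\pm 3\sigma\). Area under the curve: 68% within \(\pm\sigma\), 95% within \(\pm 2\sigma\), 99.7% within \(\pm 3\sigma\).]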
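The 68%, 95%, and 99.7% areas on the curve can be checked numerically. The following is a minimal sketch, not part of the original slides; it assumes Python 3.8+ for `statistics.NormalDist`, and the helper name `area_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

# Print the area within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {area_within(k):.4f}")
```

The printed areas should come out close to 0.68, 0.95, and 0.997 respectively.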
Standard Normal Curve (3)
• Properties of the normal curve
– The curve is centered around \(\mu_Y\)
– The curve reaches its highest value at \(\mu_Y\) and tails off symmetrically at both ends
– The distribution is fully described by the expected value and the variance
• You can convert any approximately normal distribution for which you have estimates of \(\mu_Y\) and \(\sigma^2\) to a standard normal distribution
Standard Normal Curve (4)
• A distribution only needs to be approximately normal for
us to convert it to the standardized normal.
• The mass of the distribution must fall in the center, but the
shape of the tails can be different
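[Figure: two example curves, labeled 1 and 2, centered on \(\mu_Y\), each with its mass in the center but with differently shaped tails.]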
Standard Normal Curve (5)
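• If we want to know the probability that someone earns at most $C, we are asking: \(P(Y \le C) = ?\)
• We can rearrange terms to get:
\[ Y - \mu \le C - \mu \]
\[ \frac{Y - \mu}{\sigma} \le \frac{C - \mu}{\sigma} \]
\[ P(Z \le C^*) = ?, \quad \text{where } Z = \frac{Y - \mu}{\sigma} \text{ and } C^* = \frac{C - \mu}{\sigma} \]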
• Properties for the standard normal variate Z:
– It is normally distributed with a mean of zero and a
variance of 1, written in shorthand as Z~N(0,1)
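The standardization step can be sketched in a few lines of Python. This is an illustration rather than anything from the slides: the function names `standardize` and `prob_at_most` are hypothetical, the normal CDF comes from `statistics.NormalDist` (Python 3.8+), and the mean 316.6 and standard deviation 160 are the earnings figures used in this lecture's worked example.

```python
from statistics import NormalDist

def standardize(y: float, mu: float, sigma: float) -> float:
    """Convert a value of Y into a standard normal score Z = (Y - mu) / sigma."""
    return (y - mu) / sigma

def prob_at_most(c: float, mu: float, sigma: float) -> float:
    """P(Y <= C), under the assumption that Y ~ N(mu, sigma^2)."""
    c_star = standardize(c, mu, sigma)   # C* = (C - mu) / sigma
    return NormalDist().cdf(c_star)      # P(Z <= C*) with Z ~ N(0, 1)

# Earnings example from this lecture: mu = 316.6, sigma = 160.
print(prob_at_most(316.6, mu=316.6, sigma=160.0))  # at the mean, half the mass lies below
print(prob_at_most(400.0, mu=316.6, sigma=160.0))
```

Because the whole calculation runs through Z, the same `NormalDist()` object with mean 0 and variance 1 serves for any choice of mean and standard deviation.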
Standard Normal Curve (5)
• If we have some variable Y, we can assume that Y is normally distributed, written in shorthand as \(Y \sim N(\mu, \sigma^2)\)
• We can use Z to convert Y to a standard normal distribution
• Look at the Z standardized normal distribution handout
– You can calculate the area under the Z curve from the mean of zero to the value of interest
– For example: read down the left-hand column to 1.6 and along the top row to .04; you’ll find that the area under the curve between Z=0 and Z=1.64 is 0.4495
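The table lookup can be reproduced in code. The sketch below assumes the handout's table reports the area between 0 and z, as the example describes; the function name `z_table_area` is hypothetical.

```python
from statistics import NormalDist

def z_table_area(z: float) -> float:
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return NormalDist().cdf(z) - 0.5

# Row 1.6, column .04 of the table corresponds to z = 1.64.
print(round(z_table_area(1.64), 4))  # 0.4495
```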
Standard Normal Curve (6)
• Going back to our earlier question: What is the probability that someone earns between $300 and $400, \(P(300 \le Y \le 400)\)?
\[ \mu = 316.6, \qquad \sigma^2 = 25608, \qquad \sigma = \sqrt{25608} \approx 160 \]
\[ Z_{300} = \frac{300 - 316.6}{160} = -0.104 \]
\[ Z_{400} = \frac{400 - 316.6}{160} = 0.52 \]
\[ P(-0.104 \le Z \le 0) = 0.0418 \]
\[ P(0 \le Z \le 0.52) = 0.1985 \]
\[ P(-0.104 \le Z \le 0.52) = 0.0418 + 0.1985 = 0.2403 \]
[Figure: normal curve showing the area for \(P(300 \le Y \le 400)\), from 300 (Z1) through the mean of 316.6 to 400 (Z2).]
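The same interval probability can be computed directly from the normal CDF instead of the table. This is a sketch under the slide's own assumptions (earnings normally distributed with mean 316.6 and standard deviation rounded to 160); `prob_between` is a hypothetical helper name.

```python
from statistics import NormalDist

MU = 316.6      # mean earnings, from the slide
SIGMA = 160.0   # standard deviation, rounded from sqrt(25608) as on the slide

def prob_between(low: float, high: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """P(low <= Y <= high), assuming Y ~ N(mu, sigma^2)."""
    z_low = (low - mu) / sigma
    z_high = (high - mu) / sigma
    std_normal = NormalDist()
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

p = prob_between(300, 400)
print(f"P(300 <= Y <= 400) is about {p:.3f}")
```

The result should agree with the table-based answer to about three decimal places; small differences come from rounding Z before the table lookup.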
Standard Normal Curve (7)
• We know from using our PDF that the chance of someone earning between $300 and $400 is around 23%, so 0.24 is a good approximation
• Now we can ask: What is the probability that someone earns between $253 and $316, \(P(253 \le Y \le 316)\)?
\[ Z_1 = \frac{253 - 316.6}{160} = -0.3975 \]
\[ Z_2 = \frac{316 - 316.6}{160} = -0.0038 \]
\[ P(-0.3975 \le Z \le 0) = 0.1554 \]
\[ P(-0.0038 \le Z \le 0) = 0.0020 \]
\[ P(-0.3975 \le Z \le -0.0038) = 0.1554 - 0.0020 = 0.1534 \approx 15.3\% \]
[Figure: normal curve showing the area for \(P(253 \le Y \le 316)\), from 253 (Z1) to 316 (Z2), just below the mean of 316.6.]
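The two worked examples combine table areas differently: when the bounds lie on opposite sides of the mean the areas are added, and when they lie on the same side the smaller area is subtracted from the larger. A minimal sketch of that rule follows; the function names are hypothetical and the table areas are computed from `statistics.NormalDist` rather than read off the handout.

```python
from statistics import NormalDist

def area_from_mean(z: float) -> float:
    """Table-style area between 0 and |z| under the standard normal curve."""
    return NormalDist().cdf(abs(z)) - 0.5

def interval_prob(z1: float, z2: float) -> float:
    """P(z1 <= Z <= z2) built from table-style areas, assuming z1 <= z2."""
    a1, a2 = area_from_mean(z1), area_from_mean(z2)
    if z1 <= 0 <= z2:
        return a1 + a2      # bounds straddle the mean: add the two areas
    return abs(a1 - a2)     # bounds on the same side: subtract the smaller area

# $253 to $316, both below the mean of 316.6 (sigma rounded to 160).
z1 = (253 - 316.6) / 160
z2 = (316 - 316.6) / 160
p_same_side = interval_prob(z1, z2)
print(f"P(253 <= Y <= 316) is about {p_same_side:.3f}")

# $300 to $400 straddles the mean, so the areas add.
p_straddle = interval_prob((300 - 316.6) / 160, (400 - 316.6) / 160)
print(f"P(300 <= Y <= 400) is about {p_straddle:.3f}")
```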
Standard Normal Curve (8)
• There are instructions for how you can do this using Excel:
L4_1.xls. Note how to use STANDARDIZE and
NORMDIST and what they represent
• Our spreadsheet example has 3 examples of different
earnings intervals, using the same distribution that we used
today
• Testing the normality assumption: we know the approximate shape of the earnings distribution (L3_79.xls), and it is slightly skewed. Is normality a good assumption? L4_2.xls shows how to check this in Excel using NORMSINV
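For readers without Excel, the three functions have close analogues in Python's standard library. The sketch below is an assumption-laden illustration: it does not reproduce the contents of L4_1.xls or L4_2.xls, and `normal_scores` shows only one common way NORMSINV is used to check normality (comparing sorted data against expected normal quantiles), which may differ from what the spreadsheet does.

```python
from statistics import NormalDist

def standardize(x: float, mean: float, standard_dev: float) -> float:
    """Analogue of Excel's STANDARDIZE(x, mean, standard_dev)."""
    return (x - mean) / standard_dev

def normdist_cumulative(x: float, mean: float, standard_dev: float) -> float:
    """Analogue of Excel's NORMDIST(x, mean, standard_dev, TRUE)."""
    return NormalDist(mean, standard_dev).cdf(x)

def normsinv(probability: float) -> float:
    """Analogue of Excel's NORMSINV(probability): inverse standard normal CDF."""
    return NormalDist().inv_cdf(probability)

def normal_scores(n: int) -> list:
    """Expected standard normal quantiles at plotting positions (i - 0.5) / n."""
    return [normsinv((i - 0.5) / n) for i in range(1, n + 1)]

# Same earnings distribution as in the lecture: mean 316.6, sd 160.
print(standardize(400, 316.6, 160))
print(normdist_cumulative(400, 316.6, 160) - normdist_cumulative(300, 316.6, 160))
print(normal_scores(5))
```

Plotting sorted, standardized observations against `normal_scores(n)` should give roughly a straight line if the data are close to normal; skewness shows up as curvature.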
Student’s T-Distribution
• Starting next week, we’ll be looking more closely at
sample statistics
• In sample statistics, we have a sample that is small relative
to the population size
• We do not know the true population mean and variance
– So, we take samples, and from those samples we estimate a mean \(\bar{Y}\) and a variance \(S_Y^2\)
T-Distribution Properties
• Fatter tails than the Z distribution
• Variance is \(\nu/(\nu-2)\) for \(\nu > 2\), where \(\nu\) is the degrees of freedom (n-1 when a single mean is estimated from n observations)
• When n is large (usually over 30), the t closely approximates the normal curve
• The t-distribution is also centered on a mean of zero
• The t lets us approximate probabilities for small samples
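A small simulation can illustrate the fatter tails and the convergence toward the normal. This is a sketch, not part of the slides: it builds t draws from the standard construction of a standard normal divided by the square root of an independent chi-squared over its degrees of freedom, and it reads the variance formula in terms of degrees of freedom. The seed, sample size, and names are arbitrary choices.

```python
import random

def t_draw(df: int, rng: random.Random) -> float:
    """One draw from Student's t with df degrees of freedom: Z / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / (chi2 / df) ** 0.5

def sample_variance(values: list) -> float:
    """Average squared deviation from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(140)   # fixed seed so the run is repeatable
n_draws = 50_000
simulated_var = {}
for df in (5, 30):
    draws = [t_draw(df, rng) for _ in range(n_draws)]
    simulated_var[df] = sample_variance(draws)
    print(f"df={df}: simulated variance {simulated_var[df]:.3f}, "
          f"formula df/(df-2) = {df / (df - 2):.3f}")
```

With 5 degrees of freedom the variance is well above 1; with 30 it is already close to the standard normal's variance of 1.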
F and \(\chi^2\) Distributions
• Chi-squared distribution: the square of a standard normal (Z) variable is distributed \(\chi^2\) with one degree of freedom (df).
• Chi-squared is skewed. As df increases, the \(\chi^2\) approximates a normal.
• F-distribution: deals with sample data. F stands for R.A. Fisher, who derived the distribution. The F test checks whether variances are equal.
• F is skewed and positive. As sample sizes grow infinitely large, the F approximates a normal. F has two parameters: degrees of freedom in the numerator and in the denominator.
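Both distributions can be simulated from standard normal draws. The sketch below is illustrative only: it uses the standard constructions (a chi-squared with k degrees of freedom as a sum of k squared standard normals, and an F as the ratio of two independent chi-squareds each divided by its degrees of freedom), which go slightly beyond what the slide states. The seed, sample size, and degrees of freedom are arbitrary.

```python
import random

rng = random.Random(4)   # fixed seed so the run is repeatable

def chi2_draw(df: int) -> float:
    """Sum of df squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(df_num: int, df_den: int) -> float:
    """Ratio of two independent chi-squared draws, each divided by its df."""
    return (chi2_draw(df_num) / df_num) / (chi2_draw(df_den) / df_den)

n = 20_000
chi2_1 = [chi2_draw(1) for _ in range(n)]    # squares of a standard normal
f_5_20 = [f_draw(5, 20) for _ in range(n)]   # F with 5 and 20 degrees of freedom

mean_chi2 = sum(chi2_1) / n   # theoretical mean of chi-squared(1) is 1
mean_f = sum(f_5_20) / n      # theoretical mean of F(5, 20) is 20 / 18

print(f"chi-squared(1): mean {mean_chi2:.3f}, minimum {min(chi2_1):.5f}")
print(f"F(5, 20): mean {mean_f:.3f}, minimum {min(f_5_20):.5f}")
```

Every draw is positive, and a histogram of either list would show the right skew the slide describes.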
What we’ve done
• The probability of earning particular amounts
– Relationship between a sample and population
– Using standard normal tables
• Introduction to the t-distribution
• Introduction to the F and \(\chi^2\) distributions
• In the next lectures we’ll move on to bivariate populations,
which will be important for computing conditional
probability examples such as P(Y|X)