Econometrics unit 1
Econometrics ECM712s
P. Nangula
Tel: 2072492
[email protected]
What is Econometrics?
Literally interpreted, econometrics means “economic
measurement”. Econometrics may be defined as the
social science in which the tools of economic theory,
mathematics, and statistical inference are applied to the
analysis of economic variables. Econometrics can also be
defined as “statistical observation of theoretically formed
concepts or alternatively as mathematical economics
working with measured data”.
Economic theory attempts to define the relationships
among different economic variables.
WHY A SEPARATE DISCIPLINE?
Economic theory makes statements or hypotheses that
are mostly qualitative in nature. For example,
microeconomic theory states that, other things remaining
the same, a reduction in the price of a commodity is
expected to increase the quantity demanded of that
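commodity. The theory itself, however, gives no numerical
measure of this relationship: it does not say by how much
the quantity demanded will change for a given change in price.
It is the job of the econometrician to provide such
numerical estimates.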
The main concern of mathematical economics is to
express economic theory in mathematical form
(equations) without regard to measurability or empirical
verification of the theory.
Economic statistics is mainly concerned with
collecting, processing, and presenting economic
data in the form of charts and tables. It does not
go any further and use the collected data to test
economic theories. The one who does that is the
econometrician.
METHODOLOGY OF ECONOMETRICS
How do econometricians proceed in their analysis of an
economic problem?
The main steps in the methodology of econometrics
are as follows:
1. Statement of theory or hypothesis.
2. Specification of the mathematical model of the theory
3. Specification of the statistical, or econometric, model
4. Obtaining the data
5. Estimation of the parameters of the econometric model
6. Hypothesis testing
7. Forecasting or prediction
8. Using the model for control or policy purposes.
1. Statement of theory or hypothesis.
Men and women, on average, increase their consumption as their
income increases, but not by as much as the increase in their income.
That is, the marginal propensity to consume (MPC), the change in
consumption per unit change in income, is greater than zero but less than 1.
2. Specification of the mathematical model of the theory
𝑌 = 𝛽1 + 𝛽2 𝑋
0 < 𝛽2 < 1
where Y = consumption expenditure and X = income, and
where β1 and β2, known as the parameters of the model, are,
respectively, the intercept and slope coefficients.
This is a single-equation model: Y is the dependent variable
and X is the independent, or explanatory, variable.
[Figure: the consumption function 𝑌 = 𝛽1 + 𝛽2 𝑋, with income (X) on the horizontal axis and consumption (Y) on the vertical axis; the slope 𝛽2 is the MPC.]
3. Specification of the statistical, or econometric, model
The purely mathematical model of the consumption function,
𝑌 = 𝛽1 + 𝛽2 𝑋, is of limited interest to
the econometrician, for it assumes that there is an exact
or deterministic relationship between consumption and
income. To allow for an inexact relationship, the model is written as
𝑌 = 𝛽1 + 𝛽2 𝑋 + 𝑢
where u, known as the disturbance, or error, term, is a random
(stochastic) variable that has well-defined probabilistic
properties. The disturbance term u may
well represent all those factors that affect consumption but
are not taken into account explicitly.
4. Obtaining the data
Year    Y (consumption)    X (income)
1982    3081.5             4620.3
1983    3240.6             4803.7
1984    3407.6             5140.1
1985    3566.5             5323.5
1986    3708.7             5487.7
1987    3822.3             5649.5
1988    3972.7             5865.2
1989    4064.6             6062.0
1990    4132.2             6136.3
1991    4105.8             6079.4
1992    4219.8             6244.4
1993    4343.6             6389.6
1994    4486.0             6610.7
1995    4595.3             6742.1
Source: Economic Report of the President, 1998, Table B–2, p. 282
5. Estimation of the Econometric Model
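Now that we have the data, our next task is to
estimate the parameters of the consumption
function. The estimated consumption function is
Ŷ = −184.08 + 0.7064 X
The hat on the Y indicates that it is an estimate.
Interpretation: the slope coefficient, which is the estimated
MPC, is about 0.70, so a one-unit increase in income is
associated, on average, with an increase of about 0.70 units
in consumption.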
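The estimation step can be sketched in a few lines of Python. This is a minimal illustration that assumes ordinary least squares (the slides do not name the estimator) and uses only the 14 observations listed in the table above; the function name `ols_fit` is made up for this example. The estimates it prints come from those 14 rows only and need not coincide exactly with the reported equation, which may have been estimated on a slightly different sample.

```python
# Data transcribed from the table in step 4 (Y = consumption, X = income).
X = [4620.3, 4803.7, 5140.1, 5323.5, 5487.7, 5649.5, 5865.2,
     6062.0, 6136.3, 6079.4, 6244.4, 6389.6, 6610.7, 6742.1]
Y = [3081.5, 3240.6, 3407.6, 3566.5, 3708.7, 3822.3, 3972.7,
     4064.6, 4132.2, 4105.8, 4219.8, 4343.6, 4486.0, 4595.3]


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x.

    Hypothetical helper for illustration; assumes simple OLS.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares in deviation-from-mean form.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx           # slope: the estimated MPC
    b1 = y_bar - b2 * x_bar    # intercept
    return b1, b2


b1_hat, b2_hat = ols_fit(X, Y)
print(f"Y_hat = {b1_hat:.2f} + {b2_hat:.4f} X")
```

The slope is the quantity of interest here: it should fall between 0 and 1 if the data are consistent with the hypothesis stated in step 1.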
6. Hypothesis Testing
Is the estimated MPC of about 0.70 statistically less than 1?
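One way to approach this question is a one-sided t-test of H0: β2 = 1 against H1: β2 < 1. The sketch below assumes the classical OLS standard-error formula and a 5% significance level, neither of which is stated in the slides. It refits the line on the 14 tabulated observations, so it has n − 2 = 12 degrees of freedom; the critical value 1.782 is the standard one-sided 5% table value for 12 degrees of freedom.

```python
import math

# Same data as the table in step 4 (Y = consumption, X = income).
X = [4620.3, 4803.7, 5140.1, 5323.5, 5487.7, 5649.5, 5865.2,
     6062.0, 6136.3, 6079.4, 6244.4, 6389.6, 6610.7, 6742.1]
Y = [3081.5, 3240.6, 3407.6, 3566.5, 3708.7, 3822.3, 3972.7,
     4064.6, 4132.2, 4105.8, 4219.8, 4343.6, 4486.0, 4595.3]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

b2 = s_xy / s_xx            # OLS slope
b1 = y_bar - b2 * x_bar     # OLS intercept

# Residual variance with n - 2 degrees of freedom, then se of the slope.
residuals = [y - (b1 + b2 * x) for x, y in zip(X, Y)]
sigma2_hat = sum(e * e for e in residuals) / (n - 2)
se_b2 = math.sqrt(sigma2_hat / s_xx)

# Test statistic for H0: beta2 = 1 against H1: beta2 < 1.
t_stat = (b2 - 1.0) / se_b2

T_CRIT_5PCT_12DF = 1.782    # one-sided 5% critical value, t with 12 df
reject_h0 = t_stat < -T_CRIT_5PCT_12DF

print(f"t = {t_stat:.2f}, reject H0 (beta2 = 1): {reject_h0}")
```

If the statistic falls below the negative critical value, the hypothesis that the MPC equals 1 is rejected in favour of an MPC below 1.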
7. Forecasting or Prediction
• If the chosen model does not refute the
hypothesis or theory under consideration, we
may use it to predict the future value(s) of the
dependent, or forecast, variable Y on the basis
of known or expected future value(s) of the
explanatory, or predictor, variable X.
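As a small illustration of this step, the reported consumption function can be evaluated at an assumed future income. The income value 7000 below is purely hypothetical and chosen for the example; the coefficients are those reported in the estimation step, and `predict_consumption` is an illustrative name.

```python
# Coefficients of the estimated consumption function reported in step 5.
B1_HAT = -184.08
B2_HAT = 0.7064


def predict_consumption(income):
    """Point forecast of mean consumption for a given income level."""
    return B1_HAT + B2_HAT * income


future_income = 7000.0   # hypothetical value, for illustration only
forecast = predict_consumption(future_income)
print(f"Forecast consumption at X = {future_income:.0f}: {forecast:.2f}")
```

This is a point forecast of the average value of Y only; it says nothing about the forecast error, which would need the standard error of the prediction.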
Review of probability distributions
• The normal distribution
• The t-distribution
• The Chi-square distribution
• The F-distribution
Regression Analysis
The term regression was introduced by Francis Galton.
Regression analysis is concerned with the
study of the dependence of one variable, the
dependent variable, on one or more other
variables, the explanatory variables, with a view
to estimating and/or predicting the (population)
mean or average value of the former in terms of
the known or fixed (in repeated sampling)
values of the latter.
Regression analysis is a technique for studying
the dependence of one variable (called the
dependent variable) on one or more other variables
(called explanatory variables), with a view to
estimating or predicting the average value of the
dependent variable in terms of the known or
fixed values of the explanatory variables.
THE REGRESSION TECHNIQUE IS PRIMARILY USED TO:
• Estimate the relationship that exists, on the
average, between the dependent variable and
the explanatory variable
• Determine the effect of each of the explanatory
variables on the dependent variable, controlling
the effects of all other explanatory variables
• Predict the value of dependent variable for a
given value of the explanatory variable
STATISTICAL VERSUS DETERMINISTIC
RELATIONSHIPS
In regression analysis we are concerned with what is
known as statistical dependence among variables, not the
deterministic (functional) dependence found in classical physics.
In statistical relationships among variables we essentially
deal with random or stochastic variables, that is, variables
that have probability distributions.
In deterministic dependency, on the other hand, we also
deal with variables, but these variables are not random
or stochastic.
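The contrast can be made concrete with a short simulation. In this sketch the coefficients are those of the estimated consumption function reported earlier, while the normal distribution of the disturbance and its standard deviation of 50 are assumptions made only for illustration.

```python
import random

BETA1, BETA2 = -184.08, 0.7064   # coefficients reported in the estimation step
SIGMA_U = 50.0                   # hypothetical standard deviation of u


def deterministic_y(x):
    """Deterministic relationship: the same X always gives the same Y."""
    return BETA1 + BETA2 * x


def statistical_y(x, rng):
    """Statistical relationship: Y also depends on a random disturbance u."""
    u = rng.gauss(0.0, SIGMA_U)
    return BETA1 + BETA2 * x + u


rng = random.Random(0)           # fixed seed so the run is repeatable
x = 6000.0
det_draws = [deterministic_y(x) for _ in range(5)]
stat_draws = [statistical_y(x, rng) for _ in range(5)]

print("deterministic:", det_draws)
print("statistical:  ", [round(y, 1) for y in stat_draws])
```

For a fixed X the deterministic model returns one value every time, whereas the statistical model returns a different Y on each draw, so Y has a probability distribution.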
REGRESSION VERSUS CAUSATION
Although regression analysis deals with the
dependence of one variable on other variables,
it does not necessarily imply causation.
Causation indicates that one event is the result
of the occurrence of the other event; i.e. there is
a causal relationship between the two events.
A statistical relationship in itself cannot
logically imply causation.
REGRESSION VERSUS CORRELATION
Closely related to but conceptually very much different from
regression analysis is correlation analysis.
The primary objective of correlation is to measure the
strength or degree of linear association between two
variables. The correlation coefficient measures this strength of
(linear) association. For example, we may be interested in
finding the correlation (coefficient) between smoking and lung
cancer, between scores on statistics and mathematics
examinations, between high school grades and college grades,
and so on. In regression analysis, as already noted, we are not
primarily interested in such a measure. Instead, we try to
estimate or predict the average value of one variable on the
basis of the fixed values of other variables.
Regression and correlation have some fundamental
differences that are worth mentioning.
• In regression analysis there is an asymmetry in
the way the dependent and explanatory variables
are treated.
• The dependent variable is assumed to be
statistical, random, or stochastic, that is, to have
a probability distribution.
• The explanatory variables, on the other hand, are
assumed to have fixed values.
In correlation analysis, on the other hand, we treat any (two)
variables symmetrically; there is no distinction between the
dependent and explanatory variables. After all, the correlation
between scores on mathematics and statistics examinations is
the same as that between scores on statistics and
mathematics examinations.
Moreover, both variables are assumed to be random. As we
shall see, most of the correlation theory is based on the
assumption of randomness of variables, whereas most of the
regression theory to be expounded in this book is conditional
upon the assumption that the dependent variable is stochastic
but the explanatory variables are fixed or nonstochastic.
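The symmetry of correlation and the asymmetry of regression can be checked numerically. The sketch below uses made-up examination scores for five students (hypothetical data, not from the slides) and hand-written helper functions whose names are illustrative.

```python
import math

# Hypothetical examination scores for five students (illustrative only).
maths = [55.0, 62.0, 70.0, 81.0, 90.0]
stats = [50.0, 65.0, 68.0, 85.0, 88.0]


def mean(values):
    return sum(values) / len(values)


def cross_dev(a, b):
    """Sum of cross-products of deviations from the means."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def correlation(a, b):
    """Pearson correlation coefficient; symmetric in its arguments."""
    return cross_dev(a, b) / math.sqrt(cross_dev(a, a) * cross_dev(b, b))


def slope(y, x):
    """Least-squares slope from regressing y on x; not symmetric."""
    return cross_dev(x, y) / cross_dev(x, x)


r_ms = correlation(maths, stats)
r_sm = correlation(stats, maths)
b_stats_on_maths = slope(stats, maths)
b_maths_on_stats = slope(maths, stats)

print(f"r(maths, stats) = {r_ms:.4f}, r(stats, maths) = {r_sm:.4f}")
print(f"slope stats on maths = {b_stats_on_maths:.4f}")
print(f"slope maths on stats = {b_maths_on_stats:.4f}")
```

Swapping the two variables leaves the correlation unchanged but changes the regression slope; the product of the two slopes equals the squared correlation coefficient.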
CAUSATION VERSUS CORRELATION
Unfortunately, intuition can lead us astray when it comes
to distinguishing between causation and correlation. For example,
eating breakfast has long been correlated with success in school
for elementary school children. It would be easy to
conclude that eating breakfast causes students to be
better learners. But is this a causal relationship: does
breakfast by itself create better students? Or is it only a
correlation? Perhaps not having breakfast correlates
highly with other challenges in children's lives that make them
poorer students, such as less educated parents, lower
socio-economic status, less focus on school at home, and
lower expectations.
Stochastic Specification of PRF
Normal distribution
• What is a normal distribution curve?
• All normal distributions are symmetric and
have bell-shaped density curves with a single
peak. To specify a particular normal
distribution, two quantities have to be
given: the mean μ, where the peak of the
density occurs, and the standard deviation σ,
which indicates the spread or girth of the bell
curve.
• What is the mean of the standard normal
distribution?
• The mean of the standard normal distribution is 0.
The standard normal distribution is the bell-shaped
normal distribution with mean 0 and standard
deviation 1. Standard normal tables typically give the
proportion of the population lying between 0 and a given value Z.
• What is the first quartile of the standard
normal distribution?
• It is readily calculated that for the
standard normal distribution the first quartile
is −0.67 (using the tabulated area 0.2514 in place of 0.25)
and the third quartile is 0.67. This means that for normally
distributed data, about one-half of the data lie within
2/3 of a standard deviation of the mean.
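These figures can be checked with Python's standard-library `statistics.NormalDist`. This is only a quick numerical check of the quartiles quoted above, not part of the original slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Quartiles are the 25% and 75% points of the distribution.
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)

# Share of the population within 2/3 of a standard deviation of the mean.
share_within = std_normal.cdf(2 / 3) - std_normal.cdf(-2 / 3)

print(f"first quartile = {q1:.4f}, third quartile = {q3:.4f}")
print(f"share within 2/3 sd of the mean = {share_within:.4f}")
```

The quartiles come out at roughly ±0.6745, which rounds to the ±0.67 quoted above, and the share within 2/3 of a standard deviation is a little under one-half.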
T distribution
• What is a t distribution?
• In probability and statistics, Student's t-distribution
(or simply the t-distribution) is
any member of a family of continuous
probability distributions that arises when
estimating the mean of a normally distributed
population in situations where the sample size
is small and the population standard deviation is
unknown.
• Why is it called Student's t distribution?
• The t distributions were discovered by William
S. Gosset in 1908. Gosset was a statistician
employed by the Guinness brewing company
which had stipulated that he not publish
under his own name. He therefore wrote
under the pen name ``Student.''
• What is a t-stat?
• The test statistic in the t-test is known as the
t-statistic. The t-test uses the t-statistic, the
t-distribution, and the degrees of freedom to
determine a p-value (probability) that can be
used to decide whether the population
means differ. The t-test is one of a number of
hypothesis tests.
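As an illustration of how a t-statistic for a difference in means is computed, the sketch below implements the pooled (equal-variance) two-sample version. The choice of the pooled form, the function name `pooled_t`, and the two small samples are assumptions made for this example; the p-value would then be read from the t-distribution with the returned degrees of freedom.

```python
import math


def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled two-sample t-test.

    Assumes both samples come from populations with equal variance.
    """
    n1, n2 = len(sample_a), len(sample_b)
    m1 = sum(sample_a) / n1
    m2 = sum(sample_b) / n2
    # Unbiased sample variances.
    v1 = sum((x - m1) ** 2 for x in sample_a) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample_b) / (n2 - 1)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, df


# Hypothetical samples, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A t-statistic this close to zero relative to its degrees of freedom would not indicate a difference in population means at conventional significance levels.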
Chi square distribution
• What is the shape of the chi square
distribution?
• The mean of a Chi Square distribution is its
degrees of freedom. Chi Square distributions
are positively skewed, with the degree of skew
decreasing with increasing degrees of
freedom. As the degrees of freedom
increase, the Chi Square distribution
approaches a normal distribution.
• What does DF mean in Chi Square?
• DF stands for degrees of freedom. For example, if two
scores are sampled from a standard normal distribution,
squared, and summed, the sum follows the Chi Square distribution
with two degrees of freedom. A Chi Square
calculator can be used to find that the
probability of a Chi Square (with 2 df) being
six or higher is 0.050.
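The 0.050 figure can be reproduced without a calculator, because for two degrees of freedom the upper-tail probability of the chi-square distribution has the closed form exp(−x/2). The sketch below computes it and, as a cross-check, simulates sums of two squared standard normal draws; the seed and the number of draws are arbitrary choices.

```python
import math
import random


def chi2_upper_tail_2df(x):
    """P(chi-square with 2 df >= x); for 2 df this equals exp(-x / 2)."""
    return math.exp(-x / 2.0)


p_exact = chi2_upper_tail_2df(6.0)

# Cross-check by simulation: sum of two squared standard normal draws.
rng = random.Random(42)
N = 50_000
draws = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(N)]
sim_mean = sum(draws) / N                      # should be close to df = 2
sim_tail = sum(d >= 6.0 for d in draws) / N    # should be close to p_exact

print(f"exact P(chi2_2 >= 6) = {p_exact:.4f}")
print(f"simulated mean = {sim_mean:.3f}, simulated tail = {sim_tail:.4f}")
```

The simulated mean also illustrates that the mean of a chi-square distribution equals its degrees of freedom.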
• What is a chi square test in statistics?
• A chi square (χ²) statistic is used to investigate
whether distributions of categorical variables
differ from one another. Categorical
variables yield data in categories, whereas
numerical variables yield data in numerical
form.
F distribution
• What is the mean of the F distribution?
• The F distribution is a right-skewed
distribution used most commonly in Analysis
of Variance. For an F distribution with d1 numerator and
d2 denominator degrees of freedom, the mean is
d2 / (d2 − 2), provided d2 > 2. When referencing the F
distribution, the numerator degrees of
freedom are always given first, as switching
the order of degrees of freedom changes the
distribution (e.g., F(10,12) does not equal
F(12,10)).
• What is the F ratio?
• The F ratio is the ratio of the variance
between groups to the variance within groups,
i.e. the ratio of the explained variance to the
unexplained variance. The F ratio is used to
test whether or not two variances are equal.
The F ratio is calculated as follows:
F = (mean square between groups) / (mean square within groups)
• What is the f value in an ANOVA?
• The F ratio is the ratio of two mean square
values. If the null hypothesis is true, you
expect F to have a value close to 1.0 most of
the time. A large F ratio means that the
variation among group means is more than
you'd expect to see by chance.
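A small worked example may help fix the idea of the F ratio as a ratio of two mean squares. The sketch below assumes a one-way ANOVA layout and uses three made-up groups of three observations each; the function name `f_ratio` is illustrative.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: observations around their group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical data, for illustration only.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_value = f_ratio(groups)
print(f"F = {f_value:.2f}")   # F = 3.00 for these groups
```

Here the mean square between groups is 3 and the mean square within groups is 1, giving F = 3 on (2, 6) degrees of freedom; whether that is large enough to reject the null hypothesis depends on the critical value of the F(2, 6) distribution.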