Econometrics unit 1

Econometrics ECM712s
P. Nangula Tel: 2072492 [email protected]

What is Econometrics?
Literally interpreted, econometrics means "economic measurement". Econometrics may be defined as the social science in which the tools of economic theory, mathematics, and statistical inference are applied to the analysis of economic variables. Econometrics can also be defined as "statistical observation of theoretically formed concepts or alternatively as mathematical economics working with measured data". Economic theory attempts to define the relationships among different economic variables.

WHY A SEPARATE DISCIPLINE?
Economic theory makes statements or hypotheses that are mostly qualitative in nature. For example, microeconomic theory states that, other things remaining the same, a reduction in the price of a commodity is expected to increase the quantity demanded of that commodity. The theory itself does not say by how much the quantity demanded will change. It is the job of the econometrician to provide such numerical estimates.
The main concern of mathematical economics is to express economic theory in mathematical form (equations) without regard to measurability or empirical verification of the theory.
Economic statistics is mainly concerned with collecting, processing, and presenting economic data in the form of charts and tables. It does not go further and use the collected data to test economic theories. That is the work of the econometrician.

METHODOLOGY OF ECONOMETRICS
How do econometricians proceed in their analysis of an economic problem? The following are the main steps in the methodology of econometrics:
1. Statement of theory or hypothesis
2. Specification of the mathematical model of the theory
3. Specification of the statistical, or econometric, model
4. Obtaining the data
5. Estimation of the parameters of the econometric model
6. Hypothesis testing
7. Forecasting or prediction
8. Using the model for control or policy purposes

1.
Statement of theory or hypothesis.
Men or women increase their consumption as their income increases, but not by as much as the increase in their income.

2. Specification of the mathematical model of the theory
Y = β1 + β2X, 0 < β2 < 1
where Y = consumption expenditure and X = income, and where β1 and β2, known as the parameters of the model, are, respectively, the intercept and slope coefficients. This is a single-equation model: Y is the dependent variable and X is the independent, or explanatory, variable.
[Figure: the consumption function Y = β1 + β2X, with consumption on the vertical axis and income on the horizontal axis; the slope is β2 = MPC.]

3. Specification of the statistical, or econometric, model
The purely mathematical model of the consumption function, Y = β1 + β2X, is of limited interest to the econometrician, for it assumes that there is an exact or deterministic relationship between consumption and income. To allow for an inexact relationship, the model is modified to
Y = β1 + β2X + u
where u, known as the disturbance, or error, term, is a random (stochastic) variable that has well-defined probabilistic properties. The disturbance term u may well represent all those factors that affect consumption but are not taken into account explicitly.

4. Obtaining the data
Year    Y (consumption)    X (income)
1982    3081.5             4620.3
1983    3240.6             4803.7
1984    3407.6             5140.1
1985    3566.5             5323.5
1986    3708.7             5487.7
1987    3822.3             5649.5
1988    3972.7             5865.2
1989    4064.6             6062.0
1990    4132.2             6136.3
1991    4105.8             6079.4
1992    4219.8             6244.4
1993    4343.6             6389.6
1994    4486.0             6610.7
1995    4595.3             6742.1
Source: Economic Report of the President, 1998, Table B–2, p. 282

5. Estimation of the econometric model
Now that we have the data, our next task is to estimate the parameters of the consumption function. The estimated consumption function is
Ŷ = −184.08 + 0.7064X
The hat on the Y indicates that it is an estimate. Interpretation?

6. Hypothesis testing
Is 0.70 statistically less than 1?

7.
Forecasting or Prediction
• If the chosen model does not refute the hypothesis or theory under consideration, we may use it to predict the future value(s) of the dependent, or forecast, variable Y on the basis of known or expected future value(s) of the explanatory, or predictor, variable X.

Review of probability distributions
• The normal distribution
• The t-distribution
• The Chi-square distribution
• The F-distribution

Regression Analysis
The term regression was introduced by Francis Galton. Regression analysis is concerned with the study of the dependence of one variable, the dependent variable, on one or more other variables, the explanatory variables, with a view to estimating and/or predicting the (population) mean or average value of the former in terms of the known or fixed (in repeated sampling) values of the latter.
Put differently, regression analysis is a technique for studying the dependence of one variable (called the dependent variable) on one or more other variables (called explanatory variables), with a view to estimating or predicting the average value of the dependent variable in terms of the known or fixed values of the explanatory variables.

THE REGRESSION TECHNIQUE IS PRIMARILY USED TO:
• Estimate the relationship that exists, on average, between the dependent variable and the explanatory variables
• Determine the effect of each explanatory variable on the dependent variable, controlling for the effects of all other explanatory variables
• Predict the value of the dependent variable for a given value of the explanatory variable

STATISTICAL VERSUS DETERMINISTIC RELATIONSHIPS
In regression analysis we are concerned with the statistical dependence among variables, not the functional or deterministic dependence found, for example, in classical physics. In statistical relationships among variables we essentially deal with random or stochastic variables, that is, variables that have probability distributions.
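As a rough illustration of the estimation and prediction steps and of what a statistical (inexact) relationship looks like, the sketch below fits a straight line to the 14 consumption and income observations listed in the data table. It assumes ordinary least squares, which the slides do not name at this point; `ols_fit`, `income`, `consumption`, and the income level of 7000 are illustrative choices, not part of the original material. Because the table reproduced here stops at 1995, the coefficients from these rows need not match the quoted estimate to the last digit.

```python
# Income (X) and consumption (Y), 1982-1995, copied from the data table above.
income = [4620.3, 4803.7, 5140.1, 5323.5, 5487.7, 5649.5, 5865.2,
          6062.0, 6136.3, 6079.4, 6244.4, 6389.6, 6610.7, 6742.1]
consumption = [3081.5, 3240.6, 3407.6, 3566.5, 3708.7, 3822.3, 3972.7,
               4064.6, 4132.2, 4105.8, 4219.8, 4343.6, 4486.0, 4595.3]


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


beta1_hat, beta2_hat = ols_fit(income, consumption)

# Residuals: the part of each observed Y that the fitted line does not explain.
residuals = [yi - (beta1_hat + beta2_hat * xi)
             for xi, yi in zip(income, consumption)]

# Prediction at a hypothetical income level (an assumption for illustration).
hypothetical_income = 7000.0
predicted_consumption = beta1_hat + beta2_hat * hypothetical_income

print(f"intercept = {beta1_hat:.2f}, slope (MPC) = {beta2_hat:.4f}")
print(f"largest absolute residual = {max(abs(r) for r in residuals):.1f}")
print(f"predicted consumption at X = 7000: {predicted_consumption:.1f}")
```

The nonzero residuals are the point: consumption is not determined exactly by income, which is why a disturbance term is needed in the econometric model.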
In deterministic dependency, on the other hand, we also deal with variables, but these variables are not random or stochastic.

REGRESSION VERSUS CAUSATION
Although regression analysis deals with the dependence of one variable on other variables, it does not necessarily imply causation. Causation indicates that one event is the result of the occurrence of the other event; that is, there is a causal relationship between the two events. A statistical relationship in itself cannot logically imply causation.

REGRESSION VERSUS CORRELATION
Closely related to but conceptually very much different from regression analysis is correlation analysis. The primary objective of correlation is to measure the strength or degree of linear association between two variables. The correlation coefficient measures this strength of (linear) association. For example, we may be interested in finding the correlation (coefficient) between smoking and lung cancer, between scores on statistics and mathematics examinations, between high school grades and college grades, and so on. In regression analysis, as already noted, we are not primarily interested in such a measure. Instead, we try to estimate or predict the average value of one variable on the basis of the fixed values of other variables.
Regression and correlation have some fundamental differences that are worth mentioning.
• In regression analysis there is an asymmetry in the way the dependent and explanatory variables are treated.
• The dependent variable is assumed to be statistical, random, or stochastic, that is, to have a probability distribution.
• The explanatory variables, on the other hand, are assumed to have fixed values.
In correlation analysis, on the other hand, we treat any (two) variables symmetrically; there is no distinction between the dependent and explanatory variables.
After all, the correlation between scores on mathematics and statistics examinations is the same as that between scores on statistics and mathematics examinations. Moreover, both variables are assumed to be random. As we shall see, most of correlation theory is based on the assumption of randomness of variables, whereas most of the regression theory to be expounded in this book is conditional upon the assumption that the dependent variable is stochastic but the explanatory variables are fixed or nonstochastic.

CAUSATION VERSUS CORRELATION
Unfortunately, intuition can lead us astray when it comes to distinguishing between causation and correlation. For example, eating breakfast has long been correlated with success in school for elementary school children. It would be easy to conclude that eating breakfast causes students to be better learners. Is this a causal relationship: does breakfast by itself create better students? Or is it only a correlation: perhaps not having breakfast correlates highly with other challenges in children's lives that make them poorer students, such as less educated parents, lower socio-economic status, less focus on school at home, and lower expectations.

Stochastic Specification of PRF

Normal distribution
• What is a normal distribution curve?
• All normal distributions are symmetric and have bell-shaped density curves with a single peak. To speak specifically of any normal distribution, two quantities have to be specified: the mean μ, where the peak of the density occurs, and the standard deviation σ, which indicates the spread or girth of the bell curve.
• What is the mean of the standard normal distribution?
• The standard normal distribution is the bell-shaped normal distribution with mean 0 and standard deviation 1.
• What is the first quartile of the standard normal distribution?
It is readily calculated that for the standard normal distribution the first quartile is −0.67 (using 0.2514 for 0.25) and the third quartile is 0.67. This means that for normally distributed data, one-half of the data lies within about 2/3 of a standard deviation of the mean.

T distribution
• What is a t distribution?
• In probability and statistics, Student's t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown.
• Why is it called Student's t distribution?
• The t distributions were discovered by William S. Gosset in 1908. Gosset was a statistician employed by the Guinness brewing company, which had stipulated that he not publish under his own name. He therefore wrote under the pen name "Student".
• What is a t stat?
• The test statistic in the t-test is known as the t-statistic. The t-test uses the t-statistic, the t-distribution, and the degrees of freedom to determine a p-value (probability) that can be used to decide whether the population means differ. The t-test is one of a number of hypothesis tests.

Chi square distribution
• What is the shape of the chi square distribution?
• The mean of a Chi Square distribution is its degrees of freedom. Chi Square distributions are positively skewed, with the degree of skew decreasing with increasing degrees of freedom. As the degrees of freedom increase, the Chi Square distribution approaches a normal distribution.
• What does DF mean in Chi Square?
• DF stands for degrees of freedom. For example, when two scores are sampled, the answer can be found using the Chi Square distribution with two degrees of freedom. A Chi Square calculator can be used to find that the probability of a Chi Square (with 2 df) being six or higher is 0.050.
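The quartile and chi-square figures quoted above can be checked with a few lines of Python. This is a minimal sketch using only the standard library. It relies on the fact that a chi-square variable with 2 degrees of freedom has upper-tail probability exp(−x/2); the helper name `chi2_upper_tail_2df` is an illustrative assumption, and the shortcut does not apply to other degrees of freedom.

```python
import math
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# First and third quartiles: the points with 25% and 75% of the area to the left.
first_quartile = std_normal.inv_cdf(0.25)
third_quartile = std_normal.inv_cdf(0.75)

# Share of the distribution lying between the two quartiles.
middle_share = std_normal.cdf(third_quartile) - std_normal.cdf(first_quartile)


def chi2_upper_tail_2df(x):
    """P(chi-square with 2 df >= x); valid only for 2 degrees of freedom."""
    return math.exp(-x / 2.0)


p_six_or_higher = chi2_upper_tail_2df(6.0)

print(f"quartiles: {first_quartile:.2f}, {third_quartile:.2f}")
print(f"share between quartiles: {middle_share:.2f}")
print(f"P(chi-square(2) >= 6) = {p_six_or_higher:.3f}")
```

The quartiles come out at roughly two-thirds of a standard deviation on either side of the mean, and the chi-square tail probability rounds to the 0.050 quoted above.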
Chi square distribution
• What is a chi square test in statistics?
• A chi-square (χ²) statistic is used to investigate whether distributions of categorical variables differ from one another. Categorical variables yield data in categories, whereas numerical variables yield data in numerical form.

F distribution
• What is the mean of the F distribution?
• The F distribution is a right-skewed distribution used most commonly in Analysis of Variance. When referencing the F distribution, the numerator degrees of freedom are always given first, as switching the order of the degrees of freedom changes the distribution (e.g., F(10,12) does not equal F(12,10)).
• What is the F ratio?
• The F ratio is the ratio of the variance between groups to the variance within groups, i.e. the ratio of the explained variance to the unexplained variance. The F ratio is used to test whether or not two variances are equal. It is calculated as the between-group mean square divided by the within-group mean square.
• What is the F value in an ANOVA?
• The F ratio is the ratio of two mean square values. If the null hypothesis is true, you expect F to have a value close to 1.0 most of the time. A large F ratio means that the variation among group means is more than you'd expect to see by chance.
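To make the F ratio concrete, the following sketch computes it for a one-way analysis of variance as the between-group mean square divided by the within-group mean square. The three groups of numbers are made up purely for illustration, and `one_way_f_ratio` is a hypothetical helper name, not something taken from the slides.

```python
def one_way_f_ratio(groups):
    """Return the one-way ANOVA F ratio: between-group MS / within-group MS."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1               # numerator degrees of freedom
    df_within = len(all_values) - len(groups)  # denominator degrees of freedom

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


# Hypothetical data: three groups of three observations each.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio = one_way_f_ratio(example_groups)
print(f"F ratio = {f_ratio:.2f} with (2, 6) degrees of freedom")
```

In this made-up example the group means differ far more than the spread within each group would suggest, so the F ratio is well above 1.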
