EC339: Applied Econometrics
Introduction

What is Econometrics?
Literal definition: measurement in economics.
Working definition: the application of statistical methods to problems that are of concern to economists.
The scope of application is large: econometrics has wide applications beyond the scope of economics.

What is Econometrics?
Econometrics is primarily interested in:
Quantifying economic relationships
Testing competing hypotheses
Forecasting

Quantifying Economic Relationships
The outcomes of many policies are tied to the magnitude of the slopes of supply and demand curves.
We often need to know elasticities before we can begin practical analysis.
For example, if the minimum wage is raised, unemployment may rise as more workers enter the labor force.
However, the size of this effect depends on the slopes of the labor supply and labor demand curves.
Econometric analysis attempts to determine this answer.
It allows us to quantify causal relationships when the luxury of a formal experiment is not available.

Testing Competing Hypotheses
Econometrics helps fill the gap between the theoretical world and the real world.
For instance, will a tax cut impact consumer spending?
Keynesian models relate consumer spending to annual disposable income, suggesting that a cut in taxes will change consumer spending.
Other theories relate consumer spending to lifetime income, suggesting that a tax cut (especially a "one-shot deal") will have little impact on consumer spending.

Forecasting
Econometrics attempts to provide the information needed to forecast future values, such as inflation, unemployment, and stock market levels.

The Use of Models
Economists use models to describe real-world processes.
Models are simplified depictions of reality, usually an equation or set of equations.
Economic theories are usually deterministic, while the world is characterized by randomness.
Empirical models include a random component known as the error term, or $\varepsilon_i$.
We typically assume that the mean of the error term is zero.

Types of Data
Data provide the raw material needed to:
Quantify economic relationships
Test competing theories
Construct forecasts
Data can be described as a set of observations on items such as income, age, grade.
Each occurrence is called an observation.
Data come in different formats:
Cross-sectional
Time series
Panel data

Cross-Sectional Data
Provide information on a variety of entities at the same point in time.

Time Series Data
Provide information for the same entity at different points in time.

Panel (or Longitudinal) Data
Represent a combination of cross-sectional and time series data.
Provide information on a variety of entities at different periods in time.

Conducting an Empirical Project
How to Write an Empirical Paper
Select a topic: textbooks, JSTOR, news sources (for ideas), "pop-econ".
Learn what others have learned about this topic: spend time researching what others have done, and conduct an extensive literature review.

Conducting an Empirical Project
Theoretical foundation.
Have an empirical strategy: existing literature may help, and you would apply the methods you learn in this book.
Gather data and apply appropriate econometric techniques.
Interpret your results.
Write it up: build it like a court case or newspaper article.

Where to obtain data
How to use DataFerrett CPS.doc
Files for the course will be stored on datastor: \\datastor\courses\economic\ec339
You can download all files from the book: http://caleb.wabash.edu/econometrics/index.htm

Web Links
Resources for Economists on the Internet are available at www.rfe.org
www.freelunch.com
www.bea.gov, www.census.gov, www.bls.gov

Math Review
There is
much more to it… but these are the basics you must know.

Math Review
$y = f(x) = a + bx$
Differentiation expresses the rate at which a quantity, y, changes with respect to the change in another quantity, x, on which it has a functional relationship.
Using the symbol Δ to refer to a change in a quantity, the slope is $b = \Delta y / \Delta x$.
For $f(x) = 3 + 2x$: $\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0} = \frac{7 - 3}{2 - 0} = \frac{4}{2} = 2 = b$
A linear relationship (i.e., a straight line) has a specific equation. As x changes, how does y change?
Directly related (x increases, y increases)
Inversely related (x increases, y decreases)
Points on the graph: x=0, y=3, or (0, 3); x=2, y=3+2(2), or (2, 7).

Math Review
$y = f(x) = a + bx$, with slope $b = \Delta y / \Delta x$
Derivatives are essentially the same thing. Instead of looking at the difference in y as x goes from 0 to 2, look at a very small interval, say changing x from 0 to 0.0001. For a straight line, the slope does not change.
For $f(x) = 3 + 2x$: $\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0} = \frac{3.0002 - 3}{0.0001 - 0} = \frac{0.0002}{0.0001} = 2 = b$
The basic rule for derivatives is that the distance between the initial x and the new x approaches zero (in what is called the limit).
Points on the graph: x=0, y=3, or (0, 3); x=0.0001, y=3+2(0.0001), or (0.0001, 3.0002).

Math Review
$y = f(x) = a + bx^c$
Derivatives have a slightly different notation than delta-y/delta-x, namely dy/dx or f'(x).
Constants, such as the y-intercept, do not change as x changes, and thus are dropped when taking derivatives.
$\frac{dy}{dx} = f'(x) = c \cdot b \cdot x^{c-1}$
For $f(x) = 3 + 2x$: $f'(x) = (1) \cdot 2 \cdot x^{1-1} = 2(x^0) = 2$
Derivatives represent the general formula to find the slope of a function when evaluated at a particular point. For straight lines, this value is fixed.
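The slope and derivative calculations above can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper names `slope` and `power_rule` are made up for illustration, and the curve `g` is a hypothetical example added only to show a case where the slope does depend on x.

```python
def f(x):
    """The straight line from the slides: f(x) = 3 + 2x."""
    return 3 + 2 * x


def g(x):
    """Hypothetical curve (not from the slides): g(x) = 3 + 2x^2."""
    return 3 + 2 * x ** 2


def slope(func, x0, x1):
    """Rise over run between x0 and x1, i.e. delta-y / delta-x."""
    return (func(x1) - func(x0)) / (x1 - x0)


def power_rule(b, c, x):
    """Derivative of a + b*x**c at x: c*b*x**(c-1). The constant a drops out."""
    return c * b * x ** (c - 1)


# Straight line: the slope is 2 whether the interval is wide or tiny.
slope_wide = slope(f, 0, 2)          # (7 - 3) / (2 - 0)
slope_narrow = slope(f, 0, 0.0001)   # (3.0002 - 3) / (0.0001 - 0)
line_derivative = power_rule(2, 1, 0.0)

# Curve: the slope over a shrinking interval approaches the derivative at x = 1.
curve_derivative = power_rule(2, 2, 1.0)
for h in (1.0, 0.1, 0.0001):
    print(f"h = {h}: slope of g from 1 to {1 + h} is {slope(g, 1, 1 + h):.4f}")

print(slope_wide, round(slope_narrow, 6), line_derivative, curve_derivative)
```

For the line, every interval gives the same slope, which is why the derivative is a fixed value. For the hypothetical curve, the interval slope only settles on the power-rule value as the interval shrinks toward zero.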

Math Review
Integration (or reverse differentiation) is just the opposite of taking a derivative. You have to remember to add back in C (for constant), since you may not know the "primitive" equation.
$F(x) = \int y\,dx = \int f(x)\,dx = \int (a + bx^c)\,dx = ax + \frac{b}{c+1}x^{c+1} + C$
For $f(x) = 3 + 2x$: $F(x) = \int (3 + 2x)\,dx = 3x + \frac{2}{1+1}x^{1+1} + C = 3x + x^2 + C$
There are indefinite integrals (over no specified region) and definite integrals (where the region of integration is specified).
$\int_0^{10} (3 + 2x)\,dx = [3x + x^2 + C]_0^{10} = [3(10) + (10)^2] - [3(0) + (0)^2] = 130$
Also, the result of integration should be the function you would HAVE TO TAKE the derivative of to get the initial function.
Check against the graph, where the line rises from y=3 at x=0 to y=23 at x=10: Area = [3*(10-0)] + [1/2*(10-0)*(23-3)] = 30 + 100 = 130

Basic Definitions
Random variable: a function or rule that assigns a real number to each basic outcome in the sample space.
The domain of random variable X is the sample space.
The range of X is the real number line.
Its value changes from trial to trial.
Uncertainty prevails in advance of the trial as to the outcome.

Case Study: Weight Data
Introductory Statistics class
Spring, 1997
Virginia Commonwealth University

Weight Data
192 152 135 110 128 180 260 170 165 150
110 120 185 165 212 119 165 210 186 100
195 170 120 185 175 203 185 123 139 106
180 130 155 220 140 157 150 172 175 133
170 130 101 180 187 148 106 180 127 124
215 125 194

Weight Data: Frequency Table
Weight Group    Count
100 - <120        7
120 - <140       12
140 - <160        7
160 - <180        9
180 - <200       12
200 - <220        4
220 - <240        1
240 - <260        0
260 - <280        1
sqrt(53) ≈ 7.3, rounded up to 8 intervals; range (260 - 100 = 160) / 8 = 20 = class width

Weight Data: Histogram
[Histogram: number of students (frequency) by weight, in classes of width 20 from 100 to 280.]
* Left endpoint is included in the group, right endpoint is not.
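The frequency table can be rebuilt from the 53 listed weights. This is a minimal sketch under the rule of thumb stated on the slide (number of intervals is the square root of n, rounded up); the variable names are illustrative only. Tallying the listed values this way puts 9 weights in the 160 to <180 group.

```python
import math

weights = [
    192, 152, 135, 110, 128, 180, 260, 170, 165, 150,
    110, 120, 185, 165, 212, 119, 165, 210, 186, 100,
    195, 170, 120, 185, 175, 203, 185, 123, 139, 106,
    180, 130, 155, 220, 140, 157, 150, 172, 175, 133,
    170, 130, 101, 180, 187, 148, 106, 180, 127, 124,
    215, 125, 194,
]

n = len(weights)                               # 53 students
num_intervals = math.ceil(math.sqrt(n))        # sqrt(53) is about 7.28, rounded up to 8
low, high = min(weights), max(weights)         # 100 and 260
class_width = (high - low) // num_intervals    # 160 // 8 = 20

# Each class includes its left endpoint and excludes its right endpoint,
# so the maximum (260) needs a ninth class, 260 - <280.
class_starts = list(range(low, high + 1, class_width))
counts = [
    sum(start <= w < start + class_width for w in weights)
    for start in class_starts
]

for start, count in zip(class_starts, counts):
    print(f"{start} - <{start + class_width}: {count:2d}  {'*' * count}")
```

The row of asterisks gives a rough text version of the histogram, with one mark per student.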

Numerical Summaries
Center of the data: mean, median
Variation: range, quartiles (interquartile range), variance, standard deviation

Mean or Average
Traditional measure of center.
Sum the values and divide by the number of values:
$\bar{x} = \frac{1}{n}(x_1 + x_2 + x_3 + \cdots + x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$

Median (M)
A resistant measure of the data's center.
At least half of the ordered values are less than or equal to the median value.
At least half of the ordered values are greater than or equal to the median value.
If n is odd, the median is the middle ordered value.
If n is even, the median is the average of the two middle ordered values.

Median (M)
Location of the median: L(M) = (n+1)/2, where n = sample size.
Example: If 25 data values are recorded, the median would be the (25+1)/2 = 13th ordered value.

Median
Example 1 data: 2 4 6. Median (M) = 4.
Example 2 data: 2 4 6 8. Median = 5 (average of 4 and 6).
Example 3 data: 6 2 4. Median ≠ 2 (order the values: 2 4 6, so Median = 4).

Comparing the Mean & Median
The mean and median of data from a symmetric distribution should be close together. The actual (true) mean and median of a symmetric distribution are exactly the same.
In a skewed distribution, the mean is farther out in the long tail than is the median [the mean is 'pulled' in the direction of the possible outlier(s)].

Quartiles
Three numbers which divide the ordered data into four equal-sized groups.
Q1 has 25% of the data below it.
Q2 has 50% of the data below it. (Median)
Q3 has 75% of the data below it.

Weight Data: Sorted
L(M) = (53+1)/2 = 27
L(Q1) = (26+1)/2 = 13.5
100 101 106 106 110 110 119 120 120 123
124 125 127 128 130 130 133 135 139 140
148 150 150 152 155 157 165 165 165 170
170 170 172 175 175 180 180 180 180 185
185 185 186 187 192 194 195 203 210 212
215 220 260

Variance and Standard Deviation
Recall that variability exists when some values are different from (above or below) the mean.
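The location rules for the median and first quartile can be sketched in a few lines. This is an illustration rather than a definitive method: the helper names are hypothetical, and treating Q3 as the median of the 26 values above the median location is an assumption made by symmetry with the L(Q1) = (26+1)/2 rule on the slide. Other textbooks and software use slightly different quartile conventions.

```python
import math

weights = [
    192, 152, 135, 110, 128, 180, 260, 170, 165, 150,
    110, 120, 185, 165, 212, 119, 165, 210, 186, 100,
    195, 170, 120, 185, 175, 203, 185, 123, 139, 106,
    180, 130, 155, 220, 140, 157, 150, 172, 175, 133,
    170, 130, 101, 180, 187, 148, 106, 180, 127, 124,
    215, 125, 194,
]


def value_at(ordered, location):
    """Value at a 1-based location; a location ending in .5 averages its two neighbors."""
    below = ordered[math.floor(location) - 1]
    above = ordered[math.ceil(location) - 1]
    return (below + above) / 2


def median(values):
    """Median via the location rule L(M) = (n + 1) / 2."""
    ordered = sorted(values)
    return value_at(ordered, (len(ordered) + 1) / 2)


def quartiles(values):
    """Q1, median, Q3, taking Q1 and Q3 as medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2          # 26 values on each side when n = 53
    return median(ordered[:half]), median(ordered), median(ordered[-half:])


q1, m, q3 = quartiles(weights)
print("Q1 =", q1, " M =", m, " Q3 =", q3, " IQR =", q3 - q1)

# The three small examples from the Median slide
print(median([2, 4, 6]), median([2, 4, 6, 8]), median([6, 2, 4]))
```

With the weight data, the median is the 27th ordered value and Q1 is the average of the 13th and 14th ordered values.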
Each data value has an associated deviation from the mean: $x_i - \bar{x}$

Deviations
What is a typical deviation from the mean? (standard deviation)
Small values of this typical deviation indicate small variability in the data.
Large values of this typical deviation indicate large variability in the data.

Variance
Find the mean.
Find the deviation of each value from the mean.
Square the deviations.
Sum the squared deviations.
Divide the sum by n-1 (gives the typical squared deviation from the mean).

Variance Formula
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Remember that you must find the deviations of EACH x, square the deviations, THEN add them up!

Standard Deviation Formula
Typical deviation from the mean:
$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
[ standard deviation = square root of the variance ]

Variance and Standard Deviation
Example from Text
Metabolic rates of 7 men (cal./24hr.): 1792 1666 1362 1614 1460 1867 1439
$\bar{x} = \frac{1792 + 1666 + 1362 + 1614 + 1460 + 1867 + 1439}{7} = \frac{11{,}200}{7} = 1600$

Variance and Standard Deviation Example
Observations $x_i$    Deviations $x_i - \bar{x}$    Squared deviations $(x_i - \bar{x})^2$
1792                  1792 - 1600 = 192             (192)^2 = 36,864
1666                  1666 - 1600 = 66              (66)^2 = 4,356
1362                  1362 - 1600 = -238            (-238)^2 = 56,644
1614                  1614 - 1600 = 14              (14)^2 = 196
1460                  1460 - 1600 = -140            (-140)^2 = 19,600
1867                  1867 - 1600 = 267             (267)^2 = 71,289
1439                  1439 - 1600 = -161            (-161)^2 = 25,921
                      sum = 0                       sum = 214,870
Notice the deviations add to zero, so each deviation must be squared.

Variance versus Standard Deviation
$s^2 = \frac{1}{7-1}(214{,}870) = \frac{1}{6}(214{,}870) = 35{,}811.67$
$s = \sqrt{35{,}811.67} = 189.24$
Note: Standard deviation is in the same units as the original data (cal/24 hours), while variance is in those units squared, (cal/24 hours)^2. Thus variance is not easily comparable to the original data.
Spreadsheet check:
Observation       Value
1                 1,792
2                 1,666
3                 1,362
4                 1,614
5                 1,460
6                 1,867
7                 1,439
=sum(B1:B7)       11,200
=stdevp(B1:B7)    175
=stdev(B1:B7)     189
=var(B1:B7)       35,812

Density Curves
Example: here is a histogram of vocabulary scores of 947 seventh graders.
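Returning to the metabolic-rate example, the variance steps (find the mean, take deviations, square, sum, divide by n-1) can be followed directly in code. This is a minimal sketch with illustrative variable names; the cross-check uses Python's standard `statistics` module, whose `variance` and `stdev` divide by n-1 and whose `pstdev` divides by n, matching the stdev and stdevp spreadsheet results in the example.

```python
import math
import statistics

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]   # cal / 24 hr, 7 men

n = len(rates)
mean = sum(rates) / n                                 # 11,200 / 7 = 1600
deviations = [x - mean for x in rates]                # these add up to zero
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)              # 214,870

variance = sum_of_squares / (n - 1)                   # divide by n - 1 = 6
std_dev = math.sqrt(variance)                         # same units as the data
population_std_dev = math.sqrt(sum_of_squares / n)    # divide by n instead

print(f"mean = {mean:.0f}")
print(f"sum of deviations = {sum(deviations):.0f}")
print(f"variance (n-1) = {variance:,.2f}")
print(f"standard deviation (n-1) = {std_dev:.2f}")
print(f"standard deviation (n) = {population_std_dev:.2f}")

# Cross-check against the standard library
assert math.isclose(variance, statistics.variance(rates))
assert math.isclose(std_dev, statistics.stdev(rates))
assert math.isclose(population_std_dev, statistics.pstdev(rates))
```

The gap between 189 and 175 comes only from the divisor: n-1 for a sample versus n when the seven values are treated as the whole population.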
The smooth curve drawn over the histogram is a mathematical model for the distribution. This is typically written as f(x), also known as the PROBABILITY DENSITY FUNCTION (PDF).

Density Curves
Example: the areas of the shaded bars in this histogram represent the proportion of scores in the observed data that are less than or equal to 6.0. This proportion is equal to 0.303.
The area underneath the curve to the left of a given value is called the CUMULATIVE DISTRIBUTION FUNCTION (CDF), denoted F(x).

Density Curves
Example: now the area under the smooth curve to the left of 6.0 is shaded. If the scale is adjusted so the total area under the curve is exactly 1, then this curve is called a density curve. The proportion of the area to the left of 6.0 is now equal to 0.293.
$F(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^2} dt$, with $F(6.0) = 0.293$

Density Curves
Always on or above the horizontal axis.
Have area exactly 1 underneath the curve.
The area under the curve and above any range of values is the proportion of all observations that fall in that range.

Density Curves
The median of a density curve is the equal-areas point, the point that divides the area under the curve in half.
The mean of a density curve is the balance point, at which the curve would balance if made of solid material.

Density Curves
The mean and standard deviation computed from actual observations (data) are denoted by $\bar{x}$ and $s$, respectively.
The mean and standard deviation of the actual distribution represented by the density curve are denoted by µ ("mu") and σ ("sigma"), respectively.

Question
Data sets consisting of physical measurements (heights, weights, lengths of bones, and so on) for adults of the same species and sex tend to follow a similar pattern. The pattern is that most individuals are clumped around the average, with numbers decreasing the farther values are from the average in either direction. Describe what shape a histogram (or density curve) of such measurements would have.

Bell-Shaped Curve: The Normal Distribution
[Figure: bell-shaped curve with the mean and standard deviation labeled.]

The Normal Distribution
Knowing the mean (µ) and standard deviation (σ) allows us to make various conclusions about Normal distributions. Notation: N(µ, σ).

68-95-99.7 Rule for Any Normal Curve
68% of the observations fall within (meaning above and below) one standard deviation of the mean.
95% of the observations fall within two standard deviations (actually 1.96) of the mean.
99.7% of the observations fall within three standard deviations of the mean.

68-95-99.7 Rule: Approximations for Any Normal Curve
[Figure: normal curves with 68% of the area between µ - σ and µ + σ, 95% between µ - 2σ and µ + 2σ, and 99.7% between µ - 3σ and µ + 3σ.]
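The 68-95-99.7 percentages can be reproduced from the Normal CDF. The sketch below is an illustration, not part of the slides: it relies on the standard identity that the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)), and the function name `share_within` is made up for this example.

```python
import math


def share_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 2, 3):
    print(f"within {k} standard deviations: {share_within(k):.4f}")
```

Because the result depends only on k, the same proportions hold for any µ and σ, which is why the rule applies to any Normal curve. The output also shows why the slide notes "actually 1.96": two standard deviations cover slightly more than 95%.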