Econ 240A Power 17

Outline
• Review
• Projects

Review: Big Picture 1
• #1 Descriptive Statistics
  – Numerical
    • central tendency: mean, median, mode
    • dispersion: std. dev., IQR, max-min
    • skewness
    • kurtosis
  – Graphical
    • Bar plots
    • Histograms
    • Scatter plots: y vs. x
    • Plots of a series against time (traces)
Question: Is (are) the variable(s) normal?

Review: Big Picture 2
• #2 Exploratory Data Analysis
  – Graphical
    • Stem and leaf diagrams
    • Box plots
    • 3-D plots

Review: Big Picture 3
• #3 Inferential statistics
  – Random variables
  – Probability
  – Distributions
    • Discrete: Equi-probable (uniform), binomial, Poisson
      – Probability density, Cumulative Distribution Function
    • Continuous: normal, uniform, exponential
      – Density, CDF
    • Standardized Normal, $z \sim N(0,1)$
      – Density and CDF are tabulated
    • Bivariate normal
      – Joint density, marginal distributions, conditional distributions
      – Pearson correlation coefficient, iso-probability contours
  – Applications: sample proportions from polls
    • $\hat{p} = \text{successes}/n = x/n$, where $x \sim B(p, n)$

Review: Big Picture 4
• Inferential Statistics, Cont.
  – The distribution of the sample mean is different from the distribution of the random variable
    • Central limit theorem: $z = [\bar{x} - E\bar{x}]/\sigma_{\bar{x}} = [\bar{x} - \mu]/(\sigma/\sqrt{n})$
  – Confidence intervals for the unknown population mean
    • $p[\bar{x} - 1.96\,\sigma/\sqrt{n} \le \mu \le \bar{x} + 1.96\,\sigma/\sqrt{n}] = 0.95$

Review: Big Picture 5
• Inferential Statistics
  – If population variance is unknown, use sample standard deviation s, and Student's t-distribution
    • $p[\bar{x} - t_{0.025}\,s/\sqrt{n} \le \mu \le \bar{x} + t_{0.025}\,s/\sqrt{n}] = 0.95$
  – Hypothesis tests
    • $H_0: \mu = 0$, $H_A: \mu \neq 0$, $t = [\bar{x} - E\bar{x}]/(s/\sqrt{n})$
  – Decision theory: minimize the expected costs of errors
    • Type I error, Type II error
  – Non-parametric statistics
    • techniques of inference if variable is not normally distributed

Review: Big Picture 6
• Regression, Bivariate and Multivariate
  – Time series
    • Linear trend: y(t) = a + b*t + e(t)
    • Exponential trend: ln y(t) = a + b*t + e(t)
    • Quadratic trend: y(t) = a + b*t + c*t^2 + e(t)
    • Elasticity estimation: ln y(t) = a + b*ln x(t) + e(t)
    • Returns Generating Process: ri(t) = c + b*rM(t) + e(t)
    • Problem: autocorrelation
      – Diagnostic: Durbin-Watson statistic
      – Diagnostic: inertial pattern in plot (trace) of residual
      – Fix-up: Cochrane-Orcutt
      – Fix-up: First difference equation

Review: Big Picture 7
• Regression, Bivariate and Multivariate
  – Cross-section
    • Linear: y(i) = a + b*x(i) + e(i), i = 1, ..., n; b = dy/dx
    • Elasticity or log-log: ln y(i) = a + b*ln x(i) + e(i); b = (dy/dx)/(y/x)
    • Linear probability model: y = 1 for yes, y = 0 for no; y = a + b*x + e
    • Probit or Logit probability model
    • Problem: heteroskedasticity
    • Diagnostic: pattern of residual (or residual squared) with y and/or x
    • Diagnostic: White heteroskedasticity test
    • Fix-up: transform equation, for example, divide by x
  – Table of ANOVA
    • Source of variation: explained, unexplained, total
    • Sum of squares, degrees of freedom, mean square, F test

Review: Big Picture 8
• Questions: quantitative dependent, qualitative explanatory variables
  – Null: No difference in means between two or more populations (groups), One Factor
    • Graph
    • Table of ANOVA
    • Regression Using Dummies
  – Null: No difference in means between two or more populations (groups), Two Factors
    • Graph
    • Table of ANOVA
    • Comparing Regressions Using Dummies

Review: Big Picture 9
• Cross-classification: nominal categories, e.g. male or female; ordinal categories, e.g. better or worse; or quantitative intervals, e.g. 13-19, 20-29
  – Two Factors, m x n; (m-1) x (n-1) degrees of freedom
  – Null: independence between factors; expected number in cell (i, j) = p(i)*p(j)*n
  – Pearson Chi-square statistic = sum over all i, j of [observed(i, j) - expected(i, j)]^2 / expected(i, j)

Summary
• Is there any relationship between 2 or more variables?
  – Quantitative y and x: graphs and regression
  – Qualitative binary y and quantitative x: probability model, linear or non-linear
  – Quantitative y and qualitative x: graphs and Tables of ANOVA, and regressions with indicator variables
  – Qualitative y and x: Contingency Tables
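
As a rough illustration of the Pearson Chi-square statistic reviewed under cross-classification, the Python sketch below computes the expected count in each cell under independence, p(i)*p(j)*n, and then sums [observed - expected]^2 / expected over all cells. The function name `pearson_chi_square` and the 2 x 2 counts are hypothetical, made up for this example; they are not taken from the course materials.

```python
def pearson_chi_square(observed):
    """Return (statistic, degrees_of_freedom, expected) for an m x n table of counts."""
    m = len(observed)
    n_cols = len(observed[0])
    total = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(m)) for j in range(n_cols)]

    # Under the null of independence: expected(i, j) = p(i) * p(j) * n,
    # where p(i) and p(j) are the marginal row and column proportions.
    expected = [
        [(row_totals[i] / total) * (col_totals[j] / total) * total
         for j in range(n_cols)]
        for i in range(m)
    ]

    # Sum over all cells of [observed - expected]^2 / expected.
    statistic = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(m)
        for j in range(n_cols)
    )

    # (m - 1) x (n - 1) degrees of freedom for an m x n table.
    dof = (m - 1) * (n_cols - 1)
    return statistic, dof, expected


# Hypothetical 2 x 2 table of counts (illustrative data only).
table = [[30, 20], [20, 30]]
stat, dof, expected = pearson_chi_square(table)
print(stat, dof)  # prints: 4.0 1
```

With one degree of freedom the 5% critical value of the Chi-square distribution is about 3.84, so in this made-up table the statistic of 4.0 would lead to rejecting the null of independence at that level.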