Download May 5

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Forecasting wikipedia , lookup

Data assimilation wikipedia , lookup

Choice modelling wikipedia , lookup

Regression analysis wikipedia , lookup

Linear regression wikipedia , lookup

Coefficient of determination wikipedia , lookup

Transcript
Statistics
Diagnostics
May 2, 2008
1. Plot the standardized residuals: e∗i = ei /sei .
2. Leverage: points far away from x pull the regression line toward them. A way to measure
leverage is by the hat values
1 (xi − x)2
.
hi = +
n
Sxx
3. Influence: the amount that a point affects the regression line.
(a) One way to measure influence is to refit the line with the point removed. (R function
dfbeta().)
(b) A summary statistic is Cook’s distance:
Di =
e∗i hi
.
2 1 − hi
4. Multiple regression: the standard linear model.
Y = β0 + β1 x1 + · · · + βk xk + 5. Assumptions: has mean 0, variance σ 2 and is normally distributed.
6. Data: (x1,1 , x1,2 , . . . , x1,k , y1 ), . . . , (xn,1 , xn,2 , . . . , xn,k , yn )
7. Do all the same stuff! Example.
Homework - due Tuesday, May 6, 2008
1. Read Section 8.5.
Useful R
>
>
>
>
>
>
>
>
l=lm(ohms~juice,data=fo)
plot(residuals(l)~fitted(l))
library(MASS)
l=lm(brain~body,data=mammals)
plot(rstandard(l)~fitted(l))
dfbeta(l)
library(faraway)
l=lm(brozek~weight+abdom,data=fat)