Simple Linear Regression – Introduction
• Simple linear regression studies the relationship between a
quantitative response variable Y and a single explanatory variable
X.
• We expect that different values of x will produce different mean
responses.
• In linear regression the explanatory variable x is quantitative and can
have many different values. We can think of these values as defining
different subpopulations, one for each possible value of x. Each
subpopulation consists of all individuals in the population having
the same value of x. For example, if we give five different amounts x of
calcium to different groups of subjects, we can view these amounts
as defining five different subpopulations.
STA286 week 12
Simple Linear Regression – Model
• The statistical model for simple linear regression assumes that for
each value of x the observed values of the response variable y are
Normally distributed with a mean that depends on x. We use μy to
represent these means.
• In general, the means μy can change as x changes according to any
sort of pattern.
• In simple linear regression we assume the means all lie on a line
when plotted against x.
Simple Linear Regression – Model assumptions
• The mean of the response variable y changes as x changes. The
means all lie on a straight line. The equation of the line is
$$\mu_y = \beta_0 + \beta_1 x$$
with intercept β0 and slope β1. This is the population regression
line.
• Individual responses of y with the same x vary according to a
Normal distribution. These Normal distributions all have the same
standard deviation.
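The two assumptions above can be illustrated with a small simulation. This is only a sketch: the parameter values, the five x values, and the names BETA0, BETA1, SIGMA, mean_response and sample_subpopulation are made up for illustration and do not come from the course. Each subpopulation is Normal with its mean on the line and with the same standard deviation, so the sample means should fall close to the line and the sample standard deviations should be roughly equal.

```python
import random
import statistics

# Hypothetical parameter values, chosen only for illustration.
BETA0 = 10.0   # intercept of the population regression line
BETA1 = 0.5    # slope of the population regression line
SIGMA = 2.0    # common standard deviation of every subpopulation


def mean_response(x):
    """Population mean of y for the subpopulation with this value of x."""
    return BETA0 + BETA1 * x


def sample_subpopulation(x, size, rng):
    """Draw `size` responses from the Normal subpopulation at x."""
    return [rng.gauss(mean_response(x), SIGMA) for _ in range(size)]


rng = random.Random(286)          # fixed seed so the run is reproducible
x_values = [0, 5, 10, 15, 20]     # five hypothetical amounts, one per subpopulation

summary = {}
for x in x_values:
    ys = sample_subpopulation(x, 20000, rng)
    summary[x] = (statistics.mean(ys), statistics.stdev(ys))
    print(f"x={x:>2}: line gives {mean_response(x):5.2f}, "
          f"sample mean {summary[x][0]:5.2f}, sample SD {summary[x][1]:4.2f}")
```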
Data for Simple Linear Regression
• The data for linear regression are observed values of y and x.
• The model treats each x as a fixed, known quantity.
• The response y to a given x is a random variable.
• The linear regression model describes the mean and standard
deviation of this random variable y. These unknown parameters
must be estimated from the data.
Simple Linear Regression Model
• Given n observations of the explanatory variable x and the response
variable y, (x1, y1), (x2, y2), …, (xn, yn), the statistical model for
simple linear regression states that the observed response yi when
the explanatory variable takes the value xi is
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
• Here β0 + β1xi is the mean response when x = xi. The deviations εi
are assumed to be independent and Normally distributed with mean
0 and standard deviation σ.
• The parameters of the model are β0, β1 and σ.
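To see how the three parameters generate one data set, here is a minimal sketch that treats the x values as fixed and draws each deviation independently from a Normal distribution with mean 0 and standard deviation σ. The function name simulate_sample, the x values, and the parameter values are assumptions made for this example only.

```python
import random


def simulate_sample(xs, beta0, beta1, sigma, seed=0):
    """Return pairs (x_i, y_i) with y_i = beta0 + beta1 * x_i + eps_i.

    The x values are treated as fixed; each eps_i is drawn independently
    from a Normal distribution with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    data = []
    for x in xs:
        eps = rng.gauss(0.0, sigma)
        data.append((x, beta0 + beta1 * x + eps))
    return data


# Hypothetical values, for illustration only.
BETA0, BETA1, SIGMA = 3.0, 1.5, 0.8
xs = [1, 2, 3, 4, 5, 6, 7, 8]

sample = simulate_sample(xs, BETA0, BETA1, SIGMA)

# With the true parameters known, each response splits into
# its mean response and its deviation from that mean.
for x, y in sample:
    mean = BETA0 + BETA1 * x
    print(f"x={x}: mean response {mean:5.2f}, deviation {y - mean:+.2f}, y {y:5.2f}")
```

In practice β0, β1 and σ are unknown, so the deviations cannot be computed this way; the sketch only shows what the model says about how the data arise.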
Use of the Linear Regression Model for Inference
• Once we have estimates of β0 and β1, the linear relationship
determines the estimates of μy for all values of x.
• Linear regression allows us to do inference not only for
subpopulations for which we have data but also for those
corresponding to values of x not present in the data.
• We will learn how to make inferences about
  ◦ the slope β1 and the intercept β0 of the population regression line,
  ◦ the mean response μy for a given x, and
  ◦ an individual future response y for a given value of x.
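As a small illustration of the first point, once estimates of the intercept and slope are available, the estimated mean response can be computed for any x, including a value at which no data were collected. The estimates and x values below are hypothetical, and estimated_mean_response is a name chosen for this sketch.

```python
def estimated_mean_response(b0, b1, x):
    """Estimate of the mean response at x, using estimates b0 and b1."""
    return b0 + b1 * x


# Hypothetical estimates and x values, for illustration only.
b0, b1 = 2.0, 0.75
observed_x = [1, 2, 4, 5]   # values of x that appear in the data
new_x = 3                   # a value of x with no observations

for x in observed_x + [new_x]:
    label = "observed" if x in observed_x else "not in data"
    print(f"x={x} ({label}): estimated mean response "
          f"{estimated_mean_response(b0, b1, x):.2f}")
```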
Least-Squares Method
• We use the method of least-squares to fit a line to summarize a
relationship between the observed values of an explanatory variable
and a response variable.
• The least-squares method estimates β0 and β1 by minimizing
the sum of squares of the errors…
• We use the least-squares line as a basis for inference about a
population from which our observations are sampled.
• We can do this only when the statistical model described above
holds.
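The slide does not write out the quantity being minimized. In the usual formulation, which is assumed here, the errors are the vertical deviations of the observed responses from a candidate line with intercept b0 and slope b1, and the criterion can be written as follows (the label Q is introduced only for this note):

```latex
Q(b_0, b_1) = \sum_{i=1}^{n} \bigl( y_i - b_0 - b_1 x_i \bigr)^2
```

The least-squares estimates are the values of b0 and b1 that make Q as small as possible.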
Estimation of the Regression Parameters
• The least-squares estimate of the slope β1 is
$$b_1 = \frac{s_{xy}}{s_{xx}}$$
where sxy is ….
and sxx is ...
• The least-squares estimate of the intercept β0 is
$$b_0 = \bar{y} - b_1 \bar{x}$$
• The least-squares line is then given by
$$\hat{y} = b_0 + b_1 x$$
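The estimates can be computed directly from data. The sketch below assumes the common definitions sxy = Σ(xi − x̄)(yi − ȳ) and sxx = Σ(xi − x̄)², which the slide leaves unstated; dividing both by n − 1 instead would give the same ratio. The function name least_squares_line and the data values are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1), the least-squares intercept and slope.

    Assumes s_xy = sum((x - x_bar) * (y - y_bar)) and
    s_xx = sum((x - x_bar) ** 2); these definitions are an assumption,
    since the source does not spell them out.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

b0, b1 = least_squares_line(xs, ys)
print(f"least-squares line: y-hat = {b0:.2f} + {b1:.2f} x")

# Fitted values on the line at the observed x values.
fitted = [b0 + b1 * x for x in xs]
```

Because b0 is defined as ȳ − b1x̄, the fitted line always passes through the point (x̄, ȳ).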