Chapter 1 Simple Linear Regression
Y: force, X: mass, then Newton’s famous second law, with gravitational acceleration
9.8 m/s², gives
Y = 9.8X
Y: consumption of chocolate in pounds, X: population, then the relationship between
the consumption of chocolate and the population of a country might be
Y = X² + 3
Ideally, we would hope that the equation describing the relationship between two
variables X and Y in science is fixed. A commonly used form is Y = f(X), where f
is some function. Given the value of X, the exact value of Y can then be obtained
from this equation. In the real world, however, it is almost impossible to recover
a fixed equation like those in the previous examples from observed data, because
the data collection process has many sources of unpredictable error. In addition, a
fixed or exact equation might not describe the natural phenomenon accurately.
It is therefore sensible to take random error into account. This motivates
us to consider the statistical model Y = f(X) + ε, where ε is a random variable
referred to as the random error or random variation. The function f(X) that
describes the relationship between the variables Y and X might be very complicated.
In this course, we consider the simplest equation, the linear equation. The simplest
linear regression model is
Y   0  1 X   ,
where  0 and  1 are unknown parameters. We will refer the above model as the
simple linear regression model. The word “regression” is associated with Sir
Francis Galton (1822-1911), who studied the relationship between the heights of
parents and the heights of their children. He found that very tall parents tend
to have children who are shorter than themselves. The heights seem to “reverse”.
Therefore, he used the word “reversion” to describe the relationship. We will
discuss simple linear regression in the next section.
Y is often called the response, the dependent variable, or the output.
X is often called the predictor, the independent variable, the input, or the regressor.
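The difference between an exact equation and a statistical model with random error
can be seen in a small simulation. The sketch below is only an illustration under
stated assumptions: it takes β₀ = 0 and β₁ = 9.8 from the force example, and it
assumes the random error ε is normally distributed with mean 0 and standard
deviation 0.5. The notes only say that ε is some random variable, so the normal
distribution, the value of sigma, the seed, and the function names are choices made
here for illustration.

```python
import random


def exact_response(x, beta0=0.0, beta1=9.8):
    """Deterministic relationship Y = beta0 + beta1 * X (no random error)."""
    return beta0 + beta1 * x


def noisy_response(x, rng, beta0=0.0, beta1=9.8, sigma=0.5):
    """Statistical model Y = beta0 + beta1 * X + eps.

    Assumption for illustration: eps ~ Normal(0, sigma).
    """
    return exact_response(x, beta0, beta1) + rng.gauss(0.0, sigma)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
masses = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values of the predictor X

exact = [exact_response(x) for x in masses]
observed = [noisy_response(x, rng) for x in masses]
errors = [y_obs - y for y_obs, y in zip(observed, exact)]

for x, y, y_obs, e in zip(masses, exact, observed, errors):
    print(f"X={x:.1f}  exact Y={y:.2f}  observed Y={y_obs:.2f}  error={e:+.2f}")
```

Each observed Y differs from the exact value by the realized error, so observing
the same X twice would generally give two different values of Y. In practice only
X and the observed Y are available, and β₀ and β₁ have to be estimated from them.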
Question: Why did Sir Francis Galton use the word “reversion”? Please explain based
on the linear equation he found for the heights of the parents and the children.