3.5 Shape-Changing Transformations
As you saw in the last section, not all relationships between two quantitative variables are
________. When ____________ data shows a nonlinear relationship, a new technique can
be used to find an appropriate model. This technique requires you to _____________ (re-express) the data to make it linear. This allows you to use linear regression techniques with
nonlinear data.
We will look at two different ways of transforming data.
1. Logarithmic Transformations of Exponential Functions
An exponential relationship has an equation of the form y = ab^x as the underlying model.
The data can be “linearized” (straightened) by taking the logarithm (base 10 or base e*) of
each y-value. The result will be a linear equation of the form
log ŷ = a + bx
The logarithmic transformation works well for variables like population growth (or the
growth of many other phenomena).
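To see why taking logarithms straightens an exponential curve, the step can be written out as below. This is a short derivation rather than part of the original worksheet; it assumes y, a and b are all positive, and the primed symbols a' and b' are introduced here only to distinguish the intercept and slope of the linear form from the a and b of the exponential model (the worksheet reuses the same letters for both).

```latex
\begin{align*}
y &= a\,b^{x} \\
\log y &= \log a + x \log b \\
\log \hat{y} &= a' + b'x, \qquad a' = \log a,\quad b' = \log b
\end{align*}
```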
* The values of a and b in the regression will change depending on whether you use log base 10 or ln, but the
back-transformed ŷ and the r^2 values will be identical.
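The footnote can be checked numerically. The sketch below is illustrative only: the data are hypothetical (not from the worksheet), and `fit_line` is a small helper written here for ordinary least squares, not a library function. It fits the same data once with log base 10 and once with ln, then compares the coefficients, r^2, and the back-transformed prediction.

```python
import math

def fit_line(xs, ys):
    """Least-squares line ys ~ a + b*xs; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b, sxy ** 2 / (sxx * syy)

# Hypothetical, roughly exponential data (an assumption for illustration).
x = [0, 1, 2, 3, 4, 5]
y = [2.1, 2.9, 4.6, 6.4, 9.8, 14.1]

a10, b10, r2_10 = fit_line(x, [math.log10(v) for v in y])
ae, be, r2_e = fit_line(x, [math.log(v) for v in y])

# Back-transform each fit to the original units at x = 2.5.
pred_10 = 10 ** (a10 + b10 * 2.5)
pred_e = math.exp(ae + be * 2.5)

print(f"log10 fit: a={a10:.4f}, b={b10:.4f}, r^2={r2_10:.4f}, prediction={pred_10:.4f}")
print(f"ln fit:    a={ae:.4f}, b={be:.4f}, r^2={r2_e:.4f}, prediction={pred_e:.4f}")
```

The two sets of coefficients differ by a factor of ln 10, while r^2 and the prediction in original units agree up to rounding.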
Example:
The number of a certain type of bacteria present (in thousands) after a certain number of
hours is given in the following chart.
Hours    1.0  1.5  2.0  2.5  3.0  3.5  4.0   4.5   5.0
Number   1.8  2.4  3.1  4.3  5.8  8.0  10.6  14.0  18.0
What would be the predicted quantity of bacteria after 3.75 hours?
Solution:
A scatterplot of the data and a residual plot show that a line is not a good model for this
data.
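The plots themselves are not reproduced here, but the same conclusion can be read from the residuals of a straight-line fit to the untransformed data. The following is a minimal sketch using only the bacteria data from the chart; `fit_line` is a helper defined here, not something the worksheet prescribes.

```python
def fit_line(xs, ys):
    """Least-squares line ys ~ a + b*xs; returns (intercept a, slope b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

hours = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
number = [1.8, 2.4, 3.1, 4.3, 5.8, 8.0, 10.6, 14.0, 18.0]  # in thousands

a, b = fit_line(hours, number)
residuals = [y - (a + b * x) for x, y in zip(hours, number)]

# Print hours next to the signed residual from the straight-line fit.
for x, r in zip(hours, residuals):
    print(f"{x:>4}  {r:+.2f}")
```

The residuals are positive at both ends and negative in the middle, a U-shaped pattern that signals curvature a straight line cannot capture.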
Now take the log (base e) of the y-values (Number):
Hours        1.0   1.5   2.0   2.5   3.0   3.5   4.0    4.5    5.0
Number       1.8   2.4   3.1   4.3   5.8   8.0   10.6   14.0   18.0
ln(Number)   ____  ____  ____  ____  ____  ____  ____   ____   ____
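One way to fill in the ln(Number) row is to compute the natural logs directly. This sketch assumes nothing beyond the data in the chart and rounds to three decimal places for display.

```python
import math

hours = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
number = [1.8, 2.4, 3.1, 4.3, 5.8, 8.0, 10.6, 14.0, 18.0]  # in thousands

# Natural log (base e) of each y-value.
ln_number = [math.log(n) for n in number]

print("Hours  Number  ln(Number)")
for h, n, ln_n in zip(hours, number, ln_number):
    print(f"{h:5.1f}  {n:6.1f}  {ln_n:10.3f}")
```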
Sketch the scatterplot of Hours vs. ln(Number) and the residual plot.
What do you notice about these plots compared to the plots before the data was
transformed?
Find the regression equation for the transformed data:
Use your equation to predict how many bacteria will be present after 3.75 hours. You will
need to “back-transform” this answer to the original units by taking the exponential of the
answer.
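The regression and back-transformation described above can be sketched as follows. This is one possible way to carry out the steps, not the worksheet's official solution; `fit_line` is a helper written for this example.

```python
import math

def fit_line(xs, ys):
    """Least-squares line ys ~ a + b*xs; returns (intercept a, slope b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

hours = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
number = [1.8, 2.4, 3.1, 4.3, 5.8, 8.0, 10.6, 14.0, 18.0]  # in thousands

# Regress ln(Number) on Hours.
ln_number = [math.log(n) for n in number]
a, b = fit_line(hours, ln_number)

# Predict on the log scale, then back-transform with the exponential.
ln_pred = a + b * 3.75
pred = math.exp(ln_pred)

print(f"ln(Number-hat) = {a:.4f} + {b:.4f} * Hours")
print(f"Predicted at 3.75 h: {pred:.2f} thousand")
```

With these data the slope is close to 0.586 per hour and the back-transformed prediction comes out near 9 thousand bacteria.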
2. Log-Log Transformations of Power Functions
A power relationship has an equation of the form y = ax^b as the underlying model. The data
can be “linearized” by taking the logarithm (base 10 or base e) of both the values of x and
the values of y. The result will be a linear equation of the form:
log ŷ = a + b log x
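The log-log form follows from the power model in one step, as sketched below. The derivation assumes x, y and a are positive; the symbol a' is introduced here only to mark the intercept of the linear form, which is the log of the original coefficient a. The slope of the straightened line is the exponent b itself.

```latex
\begin{align*}
y &= a\,x^{b} \\
\log y &= \log a + b \log x \\
\log \hat{y} &= a' + b \log x, \qquad a' = \log a
\end{align*}
```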
Example: page 196 P34
Scientists want to find a good model of the relationship between tidal velocity (the speed at
which water depth increases) and the depth of the water. The table below shows some data
for certain locations in an estuary.
Depth (m)   Velocity (m/s)
0.02        0.6
0.02        0.55
0.2         0.88
0.2         0.85
0.4         1.1
0.4         1.05
0.6         1.0
0.6         1.2
0.8         1.15
0.8         0.95
1.0         1.2
1.0         1.05
Fit an appropriate model that would allow the prediction of velocity from depth.
Solution:
A scatterplot of the data and a residual plot show that a line is not a good model for this
data.
Now take the log (base e) of the x-values (Depth) and the y-values (Velocity):
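Depth (m)   Velocity (m/s)   ln(Depth)   ln(Velocity)
0.02        0.6              ______      ______
0.02        0.55             ______      ______
0.2         0.88             ______      ______
0.2         0.85             ______      ______
0.4         1.1              ______      ______
0.4         1.05             ______      ______
0.6         1.0              ______      ______
0.6         1.2              ______      ______
0.8         1.15             ______      ______
0.8         0.95             ______      ______
1.0         1.2              ______      ______
1.0         1.05             ______      ______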
Sketch the scatterplot of ln(Depth) vs. ln(Velocity) and the residual plot.
What do you notice about these plots compared to the plots before the data was
transformed?
Find the regression equation for the transformed data:
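A possible way to obtain that equation is sketched below. It assumes the depth and velocity values pair up in the order they are listed in the table, and `fit_line` and `predict_velocity` are helper names chosen for this example. The fitted line on the log-log scale is then back-transformed into a power model.

```python
import math

def fit_line(xs, ys):
    """Least-squares line ys ~ a + b*xs; returns (intercept a, slope b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Assumed pairing: i-th depth goes with i-th velocity, as listed.
depth = [0.02, 0.02, 0.2, 0.2, 0.4, 0.4, 0.6, 0.6, 0.8, 0.8, 1.0, 1.0]
velocity = [0.6, 0.55, 0.88, 0.85, 1.1, 1.05, 1.0, 1.2, 1.15, 0.95, 1.2, 1.05]

ln_depth = [math.log(d) for d in depth]
ln_velocity = [math.log(v) for v in velocity]

# Fit ln(Velocity) = a + b * ln(Depth).
a, b = fit_line(ln_depth, ln_velocity)

# Back-transform: Velocity = exp(a) * Depth**b.
scale = math.exp(a)

def predict_velocity(d):
    """Predicted velocity (m/s) at depth d (m) from the fitted power model."""
    return scale * d ** b

print(f"ln(Velocity-hat) = {a:.4f} + {b:.4f} * ln(Depth)")
print(f"Velocity-hat = {scale:.4f} * Depth^{b:.4f}")
print(f"Predicted velocity at 0.5 m: {predict_velocity(0.5):.3f} m/s")
```

The 0.5 m depth in the last line is only an illustrative input; any depth within the observed range of 0.02 m to 1.0 m could be used instead.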
3. Power Transformations
Transforming a data set to achieve linearity is a multi-step, trial-and-error process.
1. Choose a transformation method (see the table below).
2. Transform the explanatory variable, the response variable, or both.
3. Plot the explanatory variable against the response variable, using the transformed data.
4. If the scatterplot is linear, proceed to the next step.
5. If the plot is not linear, return to Step 1 and try a different approach. Choose a different transformation method and/or transform a different variable.
6. Conduct a regression analysis, using the transformed variables.
7. Create a residual plot, based on regression results.
8. If the residual plot shows a random pattern (no curve or trend), the transformation was successful. Congratulations!
9. If the plot is not random, return to Step 1 and try a different approach.
The best transformation method (exponential model or power model) will depend on the
nature of the original data. Sometimes the only way to determine which method is best is to
try each and compare the results (e.g., residual plots, correlation coefficients).
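As a rough illustration of "try each and compare", the sketch below computes r^2 for the untransformed, exponential, and power versions of the bacteria data from the first example. The `r_squared` helper and the dictionary labels are names chosen here, and r^2 alone is only one piece of evidence: the residual plots should still be checked, as the steps above describe.

```python
import math

def r_squared(xs, ys):
    """Squared correlation between xs and ys (r^2 of a least-squares line)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

hours = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
number = [1.8, 2.4, 3.1, 4.3, 5.8, 8.0, 10.6, 14.0, 18.0]  # in thousands

ln_x = [math.log(x) for x in hours]
ln_y = [math.log(y) for y in number]

# One r^2 per candidate method from the table of transformations.
candidates = {
    "linear (y vs x)": r_squared(hours, number),
    "exponential (ln y vs x)": r_squared(hours, ln_y),
    "power (ln y vs ln x)": r_squared(ln_x, ln_y),
}

best = max(candidates, key=candidates.get)
for name, r2 in candidates.items():
    print(f"{name:26s} r^2 = {r2:.4f}")
print("Highest r^2:", best)
```

For these data the exponential model gives the highest r^2, consistent with the choice made in the bacteria example.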
Method                       Transformation(s)                Regression equation
Standard linear regression   None                             ŷ = a + bx
Exponential model            Response variable = log(y)       log(ŷ) = a + bx
Power model                  Response variable = log(y),      log(ŷ) = a + b log(x)
                             Explanatory variable = log(x)