3.5 Shape-Changing Transformations

As you saw in the last section, not all relationships between two quantitative variables are ________. When ____________ data shows a nonlinear relationship, a new technique can be used to find an appropriate model. This technique requires you to _____________ (re-express) the data to make it linear. This allows you to use linear regression techniques with nonlinear data. We will look at two different ways of transforming data.

1. Logarithmic Transformations of Exponential Functions

An exponential relationship has an equation of the form y = ab^x as the underlying model. The data can be “linearized” (straightened) by taking the logarithm (base 10 or base e*) of each y-value. The result will be a linear equation of the form

    log(ŷ) = a + bx

The logarithmic transformation works well for variables like population growth (or the growth of many other phenomena).

* The values of a and b in the regression will change depending on whether you use log base 10 or ln, but the back-transformed ŷ values and the r^2 value will be identical.

Example: The number of a certain type of bacteria present (in thousands) after a certain number of hours is given in the following chart.

    Hours    1.0   1.5   2.0   2.5   3.0   3.5   4.0    4.5    5.0
    Number   1.8   2.4   3.1   4.3   5.8   8.0   10.6   14.0   18.0

What would be the predicted quantity of bacteria after 3.75 hours?

Solution: A scatterplot of the data and a residual plot show that a line is not a good model for this data.

Now take the log (base e) of the y-values (Number):

    Hours   Number   ln(Number)
    1.0     1.8
    1.5     2.4
    2.0     3.1
    2.5     4.3
    3.0     5.8
    3.5     8.0
    4.0     10.6
    4.5     14.0
    5.0     18.0

Sketch the scatterplot of Hours vs. ln(Number) and the residual plot. What do you notice about these plots compared to the plots before the data was transformed?

Find the regression equation for the transformed data:

Use your equation to predict how many bacteria will be present after 3.75 hours.
You will need to “back-transform” this answer to the original units by taking the exponential of the answer (e raised to the predicted value, since the natural log was used).

2. Log-Log Transformations of Power Functions

A power relationship has an equation of the form y = ax^b as the underlying model. The data can be “linearized” by taking the logarithm (base 10 or base e) of both the values of x and the values of y. The result will be a linear equation of the form:

    log(ŷ) = a + b log(x)

Example: page 196 P34

Scientists want to find a good model of the relationship between tidal velocity (the speed at which water depth increases) and the depth of the water. The table below shows some data for certain locations in an estuary.

    Depth (m)        0.02   0.02   0.2    0.2    0.4   0.4    0.6   0.6   0.8    0.8    1.0   1.0
    Velocity (m/s)   0.6    0.55   0.88   0.85   1.1   1.05   1.0   1.2   1.15   0.95   1.2   1.05

Fit an appropriate model that would allow the prediction of velocity from depth.

Solution: A scatterplot of the data and a residual plot show that a line is not a good model for this data.

Now take the log (base e) of the x-values (Depth) and the y-values (Velocity):

    Depth (m)        0.02   0.02   0.2    0.2    0.4   0.4    0.6   0.6   0.8    0.8    1.0   1.0
    Velocity (m/s)   0.6    0.55   0.88   0.85   1.1   1.05   1.0   1.2   1.15   0.95   1.2   1.05
    ln(Depth)
    ln(Velocity)

Sketch the scatterplot of ln(Depth) vs. ln(Velocity) and the residual plot. What do you notice about these plots compared to the plots before the data was transformed?

Find the regression equation for the transformed data:

3. Power Transformations

Transforming a data set to achieve linearity is a multi-step, trial-and-error process.

1. Choose a transformation method (see the table below).
2. Transform the explanatory variable, response variable, or both.
3. Plot the explanatory variable against the response variable, using the transformed data.
4. If the scatterplot is linear, proceed to the next step.
5. If the plot is not linear, return to Step 1 and try a different approach. Choose a different transformation method and/or transform a different variable.
6. Conduct a regression analysis, using the transformed variables.
7. Create a residual plot, based on regression results.
8. If the residual plot shows a random pattern (no curve or trend), the transformation was successful. Congratulations!
9. If the plot is not random, return to Step 1 and try a different approach.

The best transformation method (exponential model or power model) will depend on the nature of the original data. Sometimes the only way to determine which method is best is to try each and compare the results (i.e. residual plots, correlation coefficients).

    Method                       Transformation(s)                                           Regression equation
    Standard linear regression   None                                                        ŷ = a + bx
    Exponential model            Response variable = log(y)                                  log(ŷ) = a + bx
    Power model                  Response variable = log(y); Explanatory variable = log(x)   log(ŷ) = a + b log(x)
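If you want to check your calculator work, the two worked examples (bacteria counts and tidal velocity) can be sketched in Python. This is only one possible way to do it, using the standard library and the natural log. The helper `fit_line` and all variable names are made up for this illustration and are not part of the handout.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = a + b*x. Returns (a, b, r_squared).
    Hypothetical helper written for this sketch."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b, sxy ** 2 / (sxx * syy)

# Exponential model: regress ln(Number) on Hours.
hours = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
number = [1.8, 2.4, 3.1, 4.3, 5.8, 8.0, 10.6, 14.0, 18.0]
a_exp, b_exp, r2_exp = fit_line(hours, [math.log(y) for y in number])

# Predict in log units, then back-transform with e^(...) to get thousands.
pred_375 = math.exp(a_exp + b_exp * 3.75)

# Power model: regress ln(Velocity) on ln(Depth).
depth = [0.02, 0.02, 0.2, 0.2, 0.4, 0.4, 0.6, 0.6, 0.8, 0.8, 1.0, 1.0]
velocity = [0.6, 0.55, 0.88, 0.85, 1.1, 1.05, 1.0, 1.2, 1.15, 0.95, 1.2, 1.05]
a_pow, b_pow, r2_pow = fit_line([math.log(x) for x in depth],
                                [math.log(y) for y in velocity])

# Back-transforming ln(v) = a + b*ln(d) gives v = e^a * d^b.
coef_pow = math.exp(a_pow)

def predict_velocity(d):
    """Predicted velocity (m/s) at depth d (m) from the fitted power model."""
    return coef_pow * d ** b_pow

print(f"ln(Number) = {a_exp:.4f} + {b_exp:.4f} * Hours, r^2 = {r2_exp:.4f}")
print(f"Predicted bacteria at 3.75 h (thousands): {pred_375:.2f}")
print(f"ln(Velocity) = {a_pow:.4f} + {b_pow:.4f} * ln(Depth), r^2 = {r2_pow:.4f}")
print(f"Predicted velocity at 0.5 m: {predict_velocity(0.5):.2f} m/s")
```

The fit is done on the transformed values, and the exponential is applied only at the end. That is the “back-transform” step. Using log base 10 instead of ln would change a and b but not the back-transformed predictions.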