1.3 Approximate Linear Models
Question 1: Which linear model is best?
Question 2: How do you use linear regression functions?
Question 3: How good is the linear model?
In section 1.2, we found the demand function D(Q) for a dairy. The scatter plot and
corresponding linear function are pictured in Figure 1.
Figure 1 – The demand function D(Q) = -0.05Q + 7.75 for the data in Example 7 of section 1.2.
In this scatter plot, the function passes through each point perfectly. In other words, the
price at each quantity in the data matches the price from the function at the
corresponding quantity. This occurred because the slopes between adjacent ordered
pairs were exactly the same. In this section we’ll find linear models for data that do not
lie along a straight line.
Question 1: Which linear model is best?
In real life, data are not so perfect. A more common situation might be the data in Table
1.
Table 1
Weekly Demand for Milk      Average Price Per Gallon
(thousands of gallons)      (dollars)
85                          3.89
95                          2.78
105                         3.01
115                         3.23
125                         1.69
If we were to calculate the slope between adjacent points, we would find that the slope
is not constant. This is easy to see if we graph the ordered pairs from the table in a
scatter plot.
Figure 2 – A scatter plot of the data in Table 1. The slope between adjacent ordered pairs is not constant.
Although we cannot find a linear function that passes through all of the points, we can
find a linear function that passes close to the points. This function is said to model the
data. A mathematical model is a representation of a real world system. In this case, the
model is a representation of the relationship between the average price of a gallon of
milk and the quantity of milk sold per week.
Figure 3 – Two different linear functions that model the relationship between the average price of a gallon
of milk and the number of gallons sold at that price.
The linear functions in Figure 3 both pass close to the points, but which function passes
“closest” to the points?
Closeness is measured vertically from the data point to the line. For instance, let’s draw
a vertical dashed line from each point to the linear function in Figure 3a.
Figure 4 – The dashed red lines represent the vertical distances between the data values and the linear
function.
The vertical distance between any data value and the line can be computed by
subtracting the corresponding prices. The largest vertical distance occurs at Q = 115. To find it, we
need the average price from the data and from the linear model:
Price at Q = 115 from data: P = 3.23
Price at Q = 115 from linear function: P̂ = -0.05(115) + 8 = 2.25
The symbol P̂ indicates that the price has been estimated from the model whereas P
indicates a data price. The difference between the data price and the linear model price
is called the error or residual. For this price, the error is
E = P - P̂ = 3.23 - 2.25 = 0.98
Since the error is positive, we know that the data price is higher than the model’s price.
If the error is negative, the model’s price is higher than the data price. This is the case
at Q = 95:
Price at Q = 95 from data: P = 2.78
Price at Q = 95 from linear function: P̂ = -0.05(95) + 8 = 3.25
E = P - P̂ = 2.78 - 3.25 = -0.47
We can carry out this process at each of the quantities in Table 1 and label the errors on a scatter
plot.
Figure 5 – A scatter plot of the data and the linear model. The dashed red line and red numbers indicate
the error between the data and the model.
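The error calculation above can be repeated at every quantity with a few lines of code. The following is a minimal sketch, assuming the linear function used in the worked calculations is P̂ = -0.05Q + 8; the names model_a and residuals are our own and do not come from the text.

```python
# Data from Table 1: quantity (thousands of gallons) and average price (dollars).
quantities = [85, 95, 105, 115, 125]
prices = [3.89, 2.78, 3.01, 3.23, 1.69]


def model_a(q):
    """Linear model assumed from the worked calculations: P-hat = -0.05Q + 8."""
    return -0.05 * q + 8


def residuals(model, qs, ps):
    """Return the error P - P-hat at each quantity."""
    return [p - model(q) for q, p in zip(qs, ps)]


errors_a = residuals(model_a, quantities, prices)
for q, e in zip(quantities, errors_a):
    # A positive error means the data price is above the model's price.
    print(f"Q = {q}: error = {e:+.2f}")
```

The errors at Q = 115 and Q = 95 should agree with the values 0.98 and -0.47 found by hand.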
If we want a linear function to go as close to the ordered pairs as possible, we need to
collectively make these errors as small as possible. This is done by summing the errors
at each quantity. If the sum of the errors is zero, the linear function might go through
each point.
Figure 6 – In a, each data point coincides with the linear model so the sum of the errors is zero. In b, the
sum of the errors is also zero, but the data points do not lie on the graph of the linear model.
But we can’t simply sum the errors as is since some of the errors are positive and some
are negative. By adding the errors together we might have some cancellation and be
deceived into thinking that a linear model coincides with the data when it does not. This
situation is illustrated in Figure 6. In graph a of Figure 6, each ordered pair of the data
lies on the linear model so there is no error between the model and the data. The sum
of the errors is zero. In this case, the linear model fits the data perfectly.
But simply adding the errors is inadequate. In graph b of Figure 6, the sum of the errors
is also zero. However the model does not fit the data perfectly. Some ordered pairs are
above the linear model and others are below the linear model. For each positive error,
there is a negative error that cancels it. Even though the sum is zero, the linear model is
not a very good fit for the data.
A better criterion for determining the fit is needed. Instead of simply summing the errors,
sum the squares of the errors. This eliminates the potential cancellation of the positive
and negative errors.
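To see the cancellation concretely, consider a small sketch with made-up numbers. The data and the model below are hypothetical and chosen only for illustration; they are not the data behind Figure 6.

```python
# Hypothetical data chosen for illustration only; it is not from the text.
xs = [1, 2, 3, 4]
ys = [2, 1, 4, 3]


def model(x):
    """Hypothetical linear model y-hat = x."""
    return x


errors = [y - model(x) for x, y in zip(xs, ys)]
sum_of_errors = sum(errors)
sum_of_squared_errors = sum(e ** 2 for e in errors)

print(errors)                 # [1, -1, 1, -1]
print(sum_of_errors)          # 0
print(sum_of_squared_errors)  # 4
```

The errors sum to zero even though no point lies on the line, while the sum of the squared errors is positive and reveals the poor fit.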
Table 2
Q      P       P̂         P - P̂      (P - P̂)²
85     3.89    3.7500    0.1400     0.0196
95     2.78    3.2500    -0.4700    0.2209
105    3.01    2.7500    0.2600     0.0676
115    3.23    2.2500    0.9800     0.9604
125    1.69    1.7500    -0.0600    0.0036
                                    Sum = 1.2721
In Table 2, the first three columns describe the quantity Q, the price P and the estimate
of the price P̂ based on the linear model. The fourth column describes the error in the
estimate and the last column corresponds to the squared error. For this model, the sum
of the squared errors is 1.2721.
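The bookkeeping in Table 2 can be checked with a short sketch. It assumes the model behind Table 2 is P̂ = -0.05Q + 8, which matches the estimates in the third column; the function name sum_squared_errors is our own.

```python
# Data from Table 1: quantity (thousands of gallons) and average price (dollars).
quantities = [85, 95, 105, 115, 125]
prices = [3.89, 2.78, 3.01, 3.23, 1.69]


def sum_squared_errors(slope, intercept, qs, ps):
    """Add up (P - P-hat)^2 over all data points for the line slope*Q + intercept."""
    return sum((p - (slope * q + intercept)) ** 2 for q, p in zip(qs, ps))


sse_a = sum_squared_errors(-0.05, 8, quantities, prices)
print(round(sse_a, 4))  # 1.2721
```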
A model that has a lower sum of squared errors would be considered a better model.
Let’s look at the model in Figure 3b, P = -0.0395Q + 7.0675.
Table 3
Q      P       P̂         P - P̂      (P - P̂)²
85     3.89    3.7100    0.1800     0.0324
95     2.78    3.3150    -0.5350    0.2862
105    3.01    2.9200    0.0900     0.0081
115    3.23    2.5250    0.7050     0.4970
125    1.69    2.1300    -0.4400    0.1936
                                    Sum = 1.0174
This model, rounded to four decimal places, is better since the sum of the squared
errors is lower.
The model P = -0.0395Q + 7.0675 in Table 3 is the very best model for the data. This
means that it has the lowest sum of squared errors. If we were to vary the slope and
intercept in the model, no other combination would lead to a sum lower than 1.0174.
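A quick numerical check can make this claim plausible, although it is not a proof. The sketch below nudges the slope and intercept by small amounts that we picked arbitrarily and compares each sum of squared errors with the sum for the best model.

```python
# Data from Table 1: quantity (thousands of gallons) and average price (dollars).
quantities = [85, 95, 105, 115, 125]
prices = [3.89, 2.78, 3.01, 3.23, 1.69]


def sum_squared_errors(slope, intercept):
    """Sum of (P - P-hat)^2 for the line P-hat = slope*Q + intercept."""
    return sum((p - (slope * q + intercept)) ** 2
               for q, p in zip(quantities, prices))


best_slope, best_intercept = -0.0395, 7.0675
best_sse = sum_squared_errors(best_slope, best_intercept)

# Nudge the slope and intercept by small, arbitrarily chosen amounts.
nudged = []
for d_slope in (-0.001, 0.0, 0.001):
    for d_intercept in (-0.05, 0.0, 0.05):
        if d_slope == 0.0 and d_intercept == 0.0:
            continue  # skip the best model itself
        nudged.append(sum_squared_errors(best_slope + d_slope,
                                         best_intercept + d_intercept))

print(f"best sum of squared errors = {best_sse:.4f}")
print(f"smallest nudged sum        = {min(nudged):.4f}")
```

Every nudged line in this small grid has a larger sum of squared errors than the best model.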
The process of calculating the linear model with the smallest sum of squared errors is
called linear regression. The model obtained through linear regression goes by several
different names. Best linear model, least squares linear model and best line are all
terms that can be used when referring to the model. The model is typically obtained
using technology like a graphing calculator or Excel. The formulas required for
calculating the slope and intercept of the best linear model without technology use
calculus and are beyond the scope of this text.
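A short program can also play the role of the graphing calculator or Excel. The sketch below assumes the standard closed-form least squares formulas, which are not derived in this text; the function name least_squares_line is our own.

```python
# Data from Table 1: quantity (thousands of gallons) and average price (dollars).
quantities = [85, 95, 105, 115, 125]
prices = [3.89, 2.78, 3.01, 3.23, 1.69]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line with the smallest sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(quantities, prices)
print(round(slope, 4), round(intercept, 4))  # -0.0395 7.0675
```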
Question 2: How do you use linear regression functions?
Once you have found a linear model using linear regression, you can use it to analyze
problems. These problems may involve questions about the slope or the variables.
Example 1
Use a Linear Model
The least squares linear model for the data in Table 1 is
P = -0.0395Q + 7.0675.
a. Use this model to determine the average price when 100 thousand
gallons of milk are sold.
Solution This equation relates Q, the quantity of milk sold in thousands
of gallons, to the average price P. Since we are given a value of
Q = 100, substitute this into the model:
P = -0.0395(100) + 7.0675 ≈ 3.12
At a price of $3.12, 100 thousand gallons of milk are demanded per
week.
b. Use this model to determine the amount of milk sold per week
when the average price per gallon is $2.90.
Solution In this part, a price of P = 2.90 is specified. We can substitute
this value into the model and solve for Q to get the quantity.
2.90 = -0.0395Q + 7.0675
-4.1675 = -0.0395Q               Subtract 7.0675 from both sides
Q = -4.1675 / -0.0395            Divide both sides by -0.0395
This fraction is approximately 105.506 thousand gallons of milk or about
105,506 gallons of milk per week.
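Both parts of this example can be reproduced with a short sketch. The function names price_at and quantity_at are our own; the second function simply carries out the same algebra used to solve for Q.

```python
SLOPE = -0.0395      # dollars per thousand gallons
INTERCEPT = 7.0675   # dollars


def price_at(quantity):
    """Average price (dollars) when `quantity` thousand gallons are sold."""
    return SLOPE * quantity + INTERCEPT


def quantity_at(price):
    """Quantity (thousands of gallons) at a given price: subtract, then divide."""
    return (price - INTERCEPT) / SLOPE


print(round(price_at(100), 2))      # 3.12
print(round(quantity_at(2.90), 3))  # 105.506
```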
Example 2
Interpret the Slope of the Model
The least squares linear model for the data in Table 1 is
P = -0.0395Q + 7.0675. Use this model to determine the rate at which the
prices are changing with respect to quantity.
Solution The rate at which the variables change with respect to each
other is the slope of the model.
Figure 7 – A graph of the model with the slope labeled as a ratio.
The slope, -0.0395, is the coefficient of the independent variable Q.
However, giving the number as the rate is not complete. A rate should
include units. This allows you to understand what the rate means.
In Figure 7, the slope of the model is labeled so that the ratio of the
vertical change to the horizontal change is
-0.0395 = -0.395 / 10
This tells us that for every 10,000 gallon increase in demand, the price
per gallon drops 0.395 dollars or 39.5 cents.
We could also write the ratio as
-0.0395 = 0.395 / -10.
In this case, we would interpret the ratio to mean that a decrease of 10,000 gallons in
demand leads to an increase in price of 0.395 dollars or 39.5 cents.
In general, rates are interpreted in 1 unit increments of the independent
variable. Writing the ratio as
-0.0395 = -0.0395 / 1
means that for every 1 thousand gallon increase in demand, the price
drops by 0.0395 dollars or 3.95 cents. We can state this easily by
saying that the rate is -0.0395 dollars per thousand gallons. The units
include the term “per” meaning that the units are a ratio of dollars to
thousands of gallons.
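Because the rate is constant, the change in price for any change in quantity is the slope times that change. A minimal sketch, with the function name price_change chosen by us:

```python
SLOPE = -0.0395  # dollars per thousand gallons


def price_change(quantity_change):
    """Change in price (dollars) for a change in quantity (thousands of gallons)."""
    return SLOPE * quantity_change


print(round(price_change(10), 4))   # -0.395
print(round(price_change(-10), 4))  # 0.395
print(round(price_change(1), 4))    # -0.0395
```

The three results correspond to the three ways of writing the ratio above.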
Question 3: How good is the linear model?
The linear model is not complete without an indication of how good the fit is to the data.
We can examine the scatter plot with the model and data and get a qualitative idea of
the fit, but this can be deceiving.
Figure 8 – Two scatter plots of the data in Table 1. The model of each scatter plot is P = -0.0395Q + 7.0675,
but the horizontal and vertical scales are different.
Which linear model in Figure 8 appears to be a better fit? On the surface, you would
probably say that the model on the right is a better fit. But in fact, both scatter plots
depict the exact same model with different scales. The vertical scale for the scatter plot
on the right is larger and makes any gaps between the model and the data seem small.
This makes the points appear to be closer to the line. In fact, the vertical distances
between the data points and the line are exactly the same and there is no difference in
the fit.
To remedy this and other difficulties in determining goodness of fit, two indicators are
used. The correlation coefficient and coefficient of determination are commonly used to
compare the fit of regression models.
The correlation coefficient r is a number from -1 to 1 that indicates
how well the linear model fits a set of data. If r is closer to 1 or -1, the
relationship between the data is more linear. If r is closer to 0, the
data are not linearly related.
A positive correlation coefficient indicates that the data are positively correlated. For
linear models with a positive correlation coefficient, the slope of the model will be
positive. A negative correlation coefficient indicates that the data are negatively
correlated. For linear models with a negative correlation coefficient, the slope of the
model will be negative.
Figure 9 – Three different sets of data and the corresponding linear models. The worst fit is in graph a
with an absolute value of the correlation coefficient closest to 0. The best fit is graph c with an absolute
value of the correlation coefficient closest to 1. Each model is decreasing so the correlation coefficient is
negative.
As the absolute value of r gets closer to 1, the data points get closer to the linear model.
Another measure of fit is the coefficient of determination, r². For a linear model, the
coefficient of determination is the square of the correlation coefficient r. Since r is a
number from -1 to 1, r² is a number from 0 to 1. The closer the coefficient of
determination is to 1, the more linearly related the data are. If the coefficient of
determination is close to 0, the data are not linearly related.
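Both indicators can be computed directly from the data in Table 1. The sketch below assumes the standard Pearson formula for the correlation coefficient, which this text does not state; the function name correlation is our own.

```python
import math

# Data from Table 1: quantity (thousands of gallons) and average price (dollars).
quantities = [85, 95, 105, 115, 125]
prices = [3.89, 2.78, 3.01, 3.23, 1.69]


def correlation(xs, ys):
    """Pearson correlation coefficient r (assumed formula, not given in the text)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(quantities, prices)
r_squared = r ** 2
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

For the milk data, r comes out negative, which agrees with the negative slope of the model.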
Figure 10 – A graphing calculator or Excel can be used to calculate the correlation coefficient or
coefficient of determination. A graphing calculator (left) returns both values. Excel (right) can return the
coefficient of determination (it is written as R² instead of r²).
Another measure of how close the data lie with respect to the linear model is the
percent error. The percent error at any value of the independent variable is found by
dividing the size of the error, |P - P̂|, by the data value P. We can find the percent error by
calculating
Percent Error = |P - P̂| / P
at each data value where P is the price data and P̂ is the model’s estimate of the price.
At Q = 115, the percent error is
Percent Error = (3.23 - 2.5250) / 3.23 ≈ 0.218
or approximately 21.8%. Since the error at Q = 115 is positive, this means that as a proportion of the price, the price at
Q = 115 is 21.8% above the model’s estimate of the price.
Example 3
Find the Largest Percent Error
For the model P = -0.0395Q + 7.0675 and the data in Table 3,
Q      P       P̂         P - P̂
85     3.89    3.7100    0.1800
95     2.78    3.3150    -0.5350
105    3.01    2.9200    0.0900
115    3.23    2.5250    0.7050
125    1.69    2.1300    -0.4400
Find the quantity that yields the largest percent error.
Solution Add a column to the table and calculate the percent error at
each quantity.
Q      P       P̂         P - P̂      |P - P̂| / P
85     3.89    3.7100    0.1800     0.1800 / 3.89 ≈ 0.046
95     2.78    3.3150    -0.5350    0.5350 / 2.78 ≈ 0.192
105    3.01    2.9200    0.0900     0.0900 / 3.01 ≈ 0.030
115    3.23    2.5250    0.7050     0.7050 / 3.23 ≈ 0.218
125    1.69    2.1300    -0.4400    0.4400 / 1.69 ≈ 0.260
The largest percent error is 26% and occurs at Q = 125. Since the error there is
negative, the data price lies below the linear model’s estimate by 26% of the data price.
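The search for the largest percent error can also be sketched in code. This version assumes the percent error is taken as the size of the error divided by the data price, as in the last column of the table; the names model and percent_errors are our own.

```python
# Data from Table 1: quantity (thousands of gallons) and average price (dollars).
quantities = [85, 95, 105, 115, 125]
prices = [3.89, 2.78, 3.01, 3.23, 1.69]


def model(q):
    """Least squares model from the text: P-hat = -0.0395Q + 7.0675."""
    return -0.0395 * q + 7.0675


# Size of the error as a proportion of the data price, keyed by quantity.
percent_errors = {q: abs(p - model(q)) / p for q, p in zip(quantities, prices)}

for q, pe in percent_errors.items():
    print(f"Q = {q}: percent error = {pe:.1%}")

worst_q = max(percent_errors, key=percent_errors.get)
print(worst_q)  # 125
```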