Regression Basics For Business Analysis
If you've ever wondered how two or more things relate to
each other, or if you've ever had your boss ask you to create a
forecast or analyze relationships between variables, then
learning regression would be worth your time.
In this article, you'll learn the basics of simple linear regression
- a tool commonly used in forecasting and financial analysis.
We will begin by learning the core principles of regression, first
learning about covariance and correlation, and then move on
to building and interpreting a regression output.
A lot of software such as Microsoft Excel can do all the
regression calculations and outputs for you, but it is still
important to learn the underlying mechanics.
Variables
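At the center of regression is the relationship between two variables,
called the dependent and independent variables. For instance, suppose
you want to forecast sales for your company and you've concluded that
your company's sales go up and down depending on changes in GDP. Here
sales is the dependent variable (the one you want to predict) and the
change in GDP is the independent variable (the one used to explain it).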
Covariance
Covariance is a measure of how two variables move together. Its sign
shows you the direction of the relationship, but its size depends on the
units of the two variables, so it is hard to judge strength from it alone.
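As a rough illustration, the sample covariance can be computed by hand in a few lines of Python. The sketch below borrows the sales and GDP figures from the Excel example later in this article; the function name `sample_covariance` and the choice of the n - 1 divisor are assumptions made for this example.

```python
# Example data from the article: GDP change in percentage points, sales in units.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]
sales = [100, 250, 275, 200, 300]


def sample_covariance(xs, ys):
    """Sum of the products of deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


cov = sample_covariance(gdp_change, sales)
print(f"sample covariance: {cov:.2f}")
```

The result is positive (about 48.75), which says sales and GDP tend to move in the same direction. The number itself would change if sales were measured in thousands of units, which is why it is standardized in the next step.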
Correlation Coefficient
We need to standardize the covariance in order to allow us to better
interpret and use it in forecasting, and the result is the correlation
coefficient.
The correlation calculation simply takes the covariance and divides it by the
product of the standard deviations of the two variables. This bounds the
correlation between -1 and +1.
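A minimal Python sketch of that standardization, again using the sales and GDP figures from the Excel example later in this article. The function name `correlation` is hypothetical; the n - 1 divisors in the covariance and the standard deviations cancel, so the code works with raw sums of deviations.

```python
import math

# Example data from the article: GDP change in percentage points, sales in units.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]
sales = [100, 250, 275, 200, 300]


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # The common 1 / (n - 1) factors cancel between numerator and denominator.
    return sxy / math.sqrt(sxx * syy)


r = correlation(gdp_change, sales)
print(f"correlation: {r:.4f}")
```

For this data the value comes out at roughly 0.83, which matches the "Multiple R" figure in the Excel output shown later.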
Regression Equation
Now that we know how the relationship between the two variables
is measured, we can develop a regression equation to forecast or predict
the variable we desire. The formula for a simple linear regression is:
y = a + bx
The "y" is the value we are trying to forecast, the "b" is the slope of the
regression, the "x" is the value of our independent variable, and the "a"
represents the y-intercept. The regression equation simply describes the
relationship between the dependent variable (y) and the independent
variable (x).
The intercept, or "a", is the value of y (dependent variable) if the value of x (independent
variable) is zero. So if there was no change in GDP, your company would still make some
sales - this value, when the change in GDP is zero, is the intercept.
Linear regression attempts to estimate a line that best fits the data, and the
equation of that line results in the regression equation.
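For readers who want the mechanics behind the best-fit line, the usual least-squares estimates of the slope "b" and intercept "a" can be written as below. This is the standard textbook result rather than something stated in the article; the bars (sample means) and the index i over the n observations are notation introduced here.

```latex
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
```

The slope is the covariance of x and y divided by the variance of x, and the intercept is chosen so that the line passes through the point of means.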
Excel
Now that you understand some of the background that goes into regression
analysis, let's do a simple example using Excel's regression tools. We'll build
on the previous example of trying to forecast next year's sales based on
changes in GDP. The next table lists some artificial data points, but
comparable real-world figures are easy to obtain.
Year    Sales   GDP
2005    100     1.00%
2006    250     1.90%
2007    275     2.40%
2008    200     2.60%
2009    300     2.90%
Just eyeballing the table, you can see that there is going to be a positive
correlation between sales and GDP. Both tend to go up together. Using
Excel, all you have to do is click the Tools drop-down menu, select Data
Analysis, and from there choose Regression. (In Excel 2007 and later, Data
Analysis is on the Data tab once the Analysis ToolPak add-in is enabled.)
The popup box is easy to fill in from there; your Input Y Range is your "Sales"
column and your Input X Range is the change in GDP column; choose the
output range for where you want the data to show up on your spreadsheet
and press OK. You should see something similar to what is given in the table
below.
Regression Statistics               Coefficients
Multiple R          0.8292243       Intercept   34.58409
R Square            0.687613        GDP         88.15552
Adjusted R Square   0.583484        -           -
Standard Error      51.021807       -           -
Observations        5               -           -
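The Excel output above can be cross-checked with a short script. The sketch below recomputes each reported figure from the five data points using the standard least-squares formulas; the variable names are chosen for this example, and GDP is entered in percentage points (1.0 for 1.00%), which is the scaling that reproduces the reported coefficients.

```python
import math

# Example data from the article: GDP change in percentage points, sales in units.
gdp_change = [1.0, 1.9, 2.4, 2.6, 2.9]
sales = [100, 250, 275, 200, 300]

n = len(sales)
mean_x = sum(gdp_change) / n
mean_y = sum(sales) / n

# Sums of squared deviations and of cross-products.
sxx = sum((x - mean_x) ** 2 for x in gdp_change)
syy = sum((y - mean_y) ** 2 for y in sales)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gdp_change, sales))

slope = sxy / sxx                       # "GDP" coefficient
intercept = mean_y - slope * mean_x     # "Intercept"
r_squared = sxy ** 2 / (sxx * syy)      # "R Square"
multiple_r = math.sqrt(r_squared)       # "Multiple R"
# Adjusted R Square with one explanatory variable: (n - 1) / (n - 2).
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
# Standard error of the regression: residual sum of squares over n - 2.
sse = syy - slope * sxy
standard_error = math.sqrt(sse / (n - 2))

print(f"intercept          {intercept:.5f}")
print(f"slope (GDP)        {slope:.5f}")
print(f"R Square           {r_squared:.6f}")
print(f"Adjusted R Square  {adjusted_r_squared:.6f}")
print(f"Standard Error     {standard_error:.6f}")
```

With only five observations the adjusted R Square is noticeably lower than the plain R Square, a reminder that the fit rests on very little data.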
Interpretation
The major outputs you need to be concerned about for simple linear
regression are the R-squared, the intercept and the GDP coefficient.
The R-squared number in this example is about 68.8% - this is the share of
the variation in sales that the model explains, so it tells us how well the
line fits the data.
Next we have an intercept of 34.58, which tells us that if the change in GDP
was forecast to be zero, our sales would be about 35 units.
And lastly, the GDP coefficient (the slope) of 88.16 tells us that if the change
in GDP is one percentage point higher, sales will likely be about 88 units higher.
So how would you use this simple model in your business? Well, if your research leads you
to believe that the next GDP change will be a certain percentage, you can plug that
percentage into the model and generate a sales forecast.
This can help you develop a more objective plan and budget for the upcoming year.
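Plugging a number into the model is a one-line calculation. The sketch below wraps it in a small function using the intercept and GDP coefficient from the Excel output; the function name `forecast_sales` and the 2.0% GDP change are hypothetical values chosen only to show the arithmetic.

```python
# Coefficients taken from the Excel regression output in this article.
INTERCEPT = 34.58409
GDP_COEFFICIENT = 88.15552


def forecast_sales(gdp_change_pct):
    """Return forecast sales (units) for a GDP change given in percentage points."""
    return INTERCEPT + GDP_COEFFICIENT * gdp_change_pct


# Hypothetical input: suppose research points to a 2.0% change in GDP.
expected_gdp_change = 2.0
forecast = forecast_sales(expected_gdp_change)
print(f"forecast sales: {forecast:.1f} units")
```

For a 2.0% change the model gives roughly 211 units. A forecast of this kind is most trustworthy for GDP changes inside the range the model was fitted on (1.0% to 2.9% here).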
EXERCISE
In this case, you would plot last year's data for monthly sales and advertising
expenditures as shown on the scatter plot below. (Data for independent and
dependent variables must be from the same period of time.)
Did you get this?
Scatter plots are effective in visually identifying relationships between
variables. These relationships can be expressed mathematically in
terms of a correlation coefficient, which is commonly referred to as a
correlation.
Regression Line – May you try?
• The figure below is the same as the scatter plot above, with the addition of
a regression line fitted to the historical data.
The regression line is the line with the smallest possible sum of squared
vertical distances between itself and the data points.
As you can see, the regression line touches some data points, but not others.
The vertical distances of the data points from the regression line are
called error terms (or residuals).
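To make the idea of error terms concrete, the sketch below fits a line and lists the distance of each point from it. The advertising data for this exercise is not reproduced in the text, so the example reuses the sales and GDP figures from the earlier Excel example as a stand-in; the helper name `fit_line` is hypothetical.

```python
# Stand-in data (the article's GDP example): x in percentage points, y in units.
xs = [1.0, 1.9, 2.4, 2.6, 2.9]
ys = [100, 250, 275, 200, 300]


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single explanatory variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(xs, ys)

# Error term for each point: actual value minus the value on the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
sum_of_squares = sum(e ** 2 for e in residuals)

for x, e in zip(xs, residuals):
    print(f"x = {x:.1f}  error = {e:+.2f}")
print(f"sum of squared errors: {sum_of_squares:.2f}")
```

Positive and negative errors cancel out exactly for a least-squares line with an intercept, which is why the fit is judged by the sum of squared errors rather than their plain sum.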
Regression analysis – Excel Formula
You use the LINEST function to perform a regression analysis.
And you perform a regression analysis when you need to
know, for example, how an athlete's performance is affected
by age, height, and weight. You can then use the results to
predict the performance of a new, untested athlete. In other
words, you're estimating likelihood.
As an example, say you have sales data from January to June,
and you want to predict sales for September. You'd use the
LINEST function like this:
Remember to enter LINEST as an array formula
(press Ctrl+Shift+Enter instead of just Enter).
Give it a try
• The sample data shown here uses the LINEST function to estimate future sales.
• Copy all the cells in the table below and paste them into cell A1 in a new
worksheet in Excel. Then, select cell B9 and press Ctrl+Shift+Enter to enter it as
an array formula. The result in B9 should be 11,000.
Month   Sales
1       $3,100
2       $4,500
3       $4,400
4       $5,400
5       $7,500
6       $8,100
Formula
=SUM(LINEST(B1:B6, A1:A6)*{9,1})
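The formula above works because LINEST returns the slope followed by the intercept; multiplying that pair by {9,1} and summing gives 9 x slope + 1 x intercept, which is the fitted line evaluated at month 9. A rough Python equivalent is sketched below, with `fit_line` as a hypothetical helper standing in for LINEST.

```python
# Monthly sales from the table above (January = 1 ... June = 6).
months = [1, 2, 3, 4, 5, 6]
sales = [3100, 4500, 4400, 5400, 7500, 8100]


def fit_line(xs, ys):
    """Least-squares fit returning (slope, intercept), the order LINEST uses."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(months, sales)

# Equivalent of SUM(LINEST(...) * {9,1}): weight the slope by 9, the intercept by 1.
coefficients = [slope, intercept]
weights = [9, 1]
september_sales = sum(c * w for c, w in zip(coefficients, weights))

print(f"slope = {slope:.0f}, intercept = {intercept:.0f}")
print(f"predicted sales for month 9: {september_sales:,.0f}")
```

The fitted line rises by 1,000 per month from an intercept of 2,000, so the estimate for month 9 is 11,000, in line with the result quoted for the worksheet.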