Regression Analysis
Gordon Stringer, UCCS
Regression Analysis
• Regression Analysis: the study of the relationship between variables
• Regression Analysis: one of the most commonly used tools for business analysis
• Easy to use and applies to many situations
Regression Analysis
• Simple Regression: a single explanatory variable
• Multiple Regression: includes any number of explanatory variables
Regression Analysis
• Dependent variable: the single variable being explained or predicted by the regression model (response variable)
• Independent variable: the explanatory variable(s) used to predict the dependent variable (predictor variable)
Regression Analysis
• Linear Regression: straight-line relationship
  Form: y = mx + b
• Non-linear: implies curved relationships, for example logarithmic relationships
Data Types
• Cross Sectional: data gathered from the same time period
• Time Series: data observed over equally spaced points in time
Graphing Relationships
• Highlight your data, use the Chart Wizard, and choose XY (Scatter) to make a scatter plot
Scatter Plot and Trend line
• Click on a data point and add a trend line
Scatter Plot and Trend line
• Now you can see whether there is a relationship between the variables. TREND uses the least squares method.
Correlation
• CORREL will calculate the correlation between the variables
• =CORREL(array x, array y)
or…
• Tools > Data Analysis > Correlation
Correlation
• Correlation describes the strength of a linear relationship
• It ranges from –1 to +1
• –1: strongest negative
• +1: strongest positive
• 0: no apparent linear relationship exists
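As a rough illustration of what CORREL computes, the sketch below implements the Pearson correlation coefficient directly. The function name `pearson_r` and the toy data are assumptions made for this example and are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and the variation of each on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data, not taken from the slides.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # points on an increasing line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # points on a decreasing line
```

The two calls sit at the extremes of the scale: points lying exactly on a rising line give the strongest positive value, and points on a falling line give the strongest negative value.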
Simple Regression Model
• Best fit using the least squares method
• Can be used to explain or forecast
Simple Regression Model
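• y = a + bx + e
  (Note: compare y = mx + b; here a plays the role of the intercept and b the role of the slope)
• Coefficients: a and b
• Coefficient a is the y-intercept
• Coefficient b is the slope of the line
• e is the error term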
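The coefficients a and b of the least-squares line have a closed form, which the following minimal sketch computes. The function name `least_squares_line` and the toy data are assumptions for illustration only; Excel's regression tool reports the same two coefficients for a single explanatory variable.

```python
def least_squares_line(xs, ys):
    """Return (a, b): intercept and slope of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through the means
    return a, b

# Hypothetical toy data, not taken from the slides.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```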
Simple Regression Model
• Precision: the accepted measure of accuracy is mean squared error
• Mean squared error: the average squared difference between actual and forecast values
Simple Regression Model
• Average squared difference between actual and forecast values
• Squaring makes each difference positive and emphasizes the severity of large errors
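A small sketch of the mean squared error as described above: a plain average of squared differences. The function name and the numbers are hypothetical. Note that regression software often divides by the degrees of freedom rather than by the number of observations, so its reported value can differ from this simple average.

```python
def mean_squared_error(actual, forecast):
    """Average of the squared differences between actual and forecast values."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical values: the errors are -1, 0 and +2.
actual = [10, 12, 15]
forecast = [11, 12, 13]
mse = mean_squared_error(actual, forecast)
print(mse)
```

The error of 2 contributes 4 to the sum while the error of 1 contributes only 1, which is how squaring emphasizes large misses.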
Simple Regression Model
• Error (residual): the difference between an actual data point and the forecasted value of the dependent variable y, given the explanatory variable x
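To make the definition concrete, the sketch below computes one residual using the fitted line quoted later in this deck (intercept 56,104 and slope 63.11 per square foot). The observed cost of $220,000 is a hypothetical value chosen only for illustration.

```python
# Fitted coefficients quoted in the housing example of this deck.
INTERCEPT = 56_104
SLOPE = 63.11

def residual(actual_cost, square_feet):
    """Actual value minus the value forecast by the fitted line."""
    forecast = INTERCEPT + SLOPE * square_feet
    return actual_cost - forecast

# Hypothetical observation: a 2,500 sq ft house that actually cost $220,000.
print(round(residual(220_000, 2_500)))
```

A positive residual means the actual point lies above the fitted line; a negative one means it lies below.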
Simple Regression Model
• Run the regression tool
• Tools > Data Analysis > Regression
Simple Regression Model
• Enter the variable data
• y is dependent, x is independent
Simple Regression Model
• Check Labels if the input range includes column labels
• Check Residuals and Confidence Level to display them in the output
Simple Regression Model
• The SUMMARY OUTPUT is displayed below
Simple Regression Model
• Multiple R is the correlation coefficient (reported as an absolute value)
• =CORREL
Simple Regression Model
• R Square: Coefficient of Determination
• =RSQ
• Goodness of fit, or percentage of variation in y explained by the model
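One common way to compute R Square is as one minus the ratio of unexplained variation to total variation. The sketch below follows that definition; the function name and the actual/predicted values are hypothetical and are not output from the deck's regression.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions account for."""
    mean_y = sum(actual) / len(actual)
    # Unexplained variation: squared residuals around the fitted values.
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total variation: squared deviations around the mean of y.
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_residual / ss_total

# Hypothetical actual values and model predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(actual, predicted), 2))
```

For a least-squares line with one explanatory variable, this quantity equals the square of the correlation coefficient.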
Simple Regression Model
• Adjusted R Square = 1 - (Standard Error of Estimate)^2 / (Standard Dev Y)^2
• Adjusts “R Square” downward to account for the number of independent variables used in the model
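The adjustment can also be written in terms of R Square itself, which makes the penalty for extra variables easier to see. The sketch below uses the algebraically equivalent form 1 - (1 - R²)(n - 1)/(n - k - 1); the symbols n (number of observations) and k (number of independent variables) and the input values are assumptions introduced for this example.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R Square for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical inputs: R Square of 0.98 from 4 observations.
print(round(adjusted_r_squared(0.98, n=4, k=1), 4))  # one explanatory variable
print(round(adjusted_r_squared(0.98, n=4, k=2), 4))  # same fit, two variables
```

Holding R Square fixed, adding a variable lowers the adjusted value, which is the downward adjustment the slide describes.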
Simple Regression Model
• Standard Error of the Estimate
• Defines the uncertainty in estimating y with the regression model
• =STEYX
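A minimal sketch of the calculation behind the standard error of the estimate for simple regression: fit the least-squares line, then take the square root of the sum of squared residuals divided by n - 2. The function name and the toy data are hypothetical.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Standard error of the estimate for a least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Two degrees of freedom are used up by the intercept and the slope.
    return math.sqrt(sse / (n - 2))

# Hypothetical toy data, not taken from the slides.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
se = standard_error_of_estimate(xs, ys)
print(round(se, 3))
```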
Simple Regression Model
• Coefficients:
– Slope
– Standard Error
– t-Stat, P-value
Simple Regression Model
• Coefficients:
– Slope = 63.11
– Standard Error = 15.94
– t-Stat = 63.11/15.94 = 3.96; P-value = 0.0005
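The t-Stat is the coefficient divided by its standard error, which this short check reproduces from the two figures on the slide. The P-value is not recomputed here because it depends on the degrees of freedom, which the slides do not state.

```python
# Values quoted on the slide.
slope = 63.11
standard_error = 15.94

# t-Stat: how many standard errors the slope lies away from zero.
t_stat = slope / standard_error
print(round(t_stat, 2))
```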
Simple Regression Model
• y = mx + b
• Y = a + bX + e
• Ŷ = 56,104 + 63.11(Sq ft)
• If X = 2,500 square feet, then
• Ŷ = 56,104 + 63.11(2,500) = $213,879
Simple Regression Model
Assumptions:
• Linearity
• Independence
• Homoscedasticity
• Normality
Simple Regression Model
• Linearity
[Figure: Square Feet Line Fit Plot. Cost and Predicted Cost (y-axis, 0 to 350,000) plotted against Square Feet (x-axis, 1,500 to 4,000).]
Simple Regression Model
• Linearity
[Figure: Square Feet Residual Plot. Residuals (y-axis, -100,000 to 100,000) plotted against Square Feet (x-axis, 1,500 to 4,000).]
Simple Regression Model
• Independence:
– Errors must not correlate
– Trials must be independent
Simple Regression Model
• Homoscedasticity:
– Constant variance
– Scatter of errors does not change from trial to trial
– Violation (heteroscedasticity) leads to misspecification of the uncertainty in the model, specifically with a forecast
– Possible to underestimate the uncertainty
– Remedy: try the square root, logarithm, or reciprocal of y
Simple Regression Model
• Normality:
– Errors should be normally distributed
– Plot a histogram of the residuals
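As a rough sketch of the histogram check, the code below groups residuals into equal-width bins and prints a text histogram. The function name, the bin width, and the residual values are all hypothetical; in practice the residuals would come from the regression output.

```python
from collections import Counter

def residual_histogram(residuals, bin_width):
    """Count residuals per bin; each key is the lower edge of its bin."""
    counts = Counter((r // bin_width) * bin_width for r in residuals)
    return dict(sorted(counts.items()))

# Hypothetical residuals, not taken from the slides.
residuals = [-30_000, -12_000, -5_000, 0, 4_000, 8_000, 15_000, 32_000]
hist = residual_histogram(residuals, bin_width=25_000)

# One row per bin, with one '#' per residual that falls in it.
for lower_edge, count in hist.items():
    print(f"{lower_edge:>8,} {'#' * count}")
```

A roughly bell-shaped pattern, with most residuals near zero and few in the tails, is consistent with the normality assumption; strong skew or several far-out bins suggests otherwise.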
Multiple Regression Model
• Y = α + β₁X₁ + … + βₖXₖ + ε
• Bendrix Case
Regression Modeling Philosophy
• Nature of the relationships
• Model Building Procedure
– Determine the dependent variable (y)
– Determine potential independent variables (x)
– Collect relevant data
– Hypothesize the model form
– Fit the model
– Diagnostic check: test for significance