Logistic Regression
Saed Sayad
www.ismartsoft.com
Definition
Logistic Regression is a type of regression
model where the dependent variable
(target) has just two values, such as:
0, 1
Y, N
F, T
Sample Dataset
Months in Business    Balance     Default
189                   $429,916    0
170                   $240,319    1
166                   $231,327    0
423                   $196,105    0
145                   $193,907    1
60                    $190,944    0
97                    $184,333    0
354                   $152,126    0
99                    $151,061    1
80                    $135,885    0
25                    $119,751    1
118                   $116,578    1
74                    $123,864    0
...                   ...         ...
Linear Regression (Continuous Dependent Variable)
[Figure: Balance ($0 to $500,000) plotted against Months in Business (0 to 600), with fitted line Y = 47.92X + 13916]
Linear Regression (Binary Dependent Variable)
[Figure: Default (0 or 1) plotted against Months in Business (0 to 600), with fitted line Y = -0.000X + 0.373]
Linear Regression Model – Binary Target
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
• If the actual Y is a binary variable, then the predicted
Y can be less than zero or greater than 1.
• If the actual Y is a binary variable, then the error is not
normally distributed.
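
The first point can be checked with a minimal sketch. It assumes only the 13 rows visible in the sample dataset (the remaining rows are elided in the slides), so the fitted coefficients will not match the equation shown on the earlier plot. The names `months`, `default`, and `linear_prediction` are illustrative and not from the original deck.

```python
import numpy as np

# The 13 rows visible in the Sample Dataset slide (the rest is elided there).
months = np.array([189, 170, 166, 423, 145, 60, 97, 354, 99, 80, 25, 118, 74],
                  dtype=float)
default = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0], dtype=float)

# Ordinary least squares fit of a straight line: default = b0 + b1 * months.
# np.polyfit returns the highest-degree coefficient first.
b1, b0 = np.polyfit(months, default, deg=1)


def linear_prediction(x):
    """Predicted 'default' value from the fitted straight line."""
    return b0 + b1 * x


# Nothing constrains the line to [0, 1]: at the right edge of the plotted
# range the prediction drops below zero, which cannot be a probability.
pred_600 = linear_prediction(600.0)
print(f"slope={b1:.5f}, intercept={b0:.3f}, prediction at 600 months={pred_600:.3f}")
```

With these rows the slope is negative, so the fitted line crosses zero inside the range of the horizontal axis used in the slides.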
Linear Regression Model
[Figure: linear model, Y (0 to 1) against X]
Frequency Table
Months in Business    Count    Default Count    Default Frequency
<50                   4        0                0
50-100                12       1                0.083
100-150               4        1                0.25
150-200               4        2                0.5
200-250               4        3                0.75
250-300               1        1                1
>300                  4        4                1
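
The Default Frequency column is the default count divided by the bin count. A short sketch of that arithmetic, using the counts from the table above (the variable names are illustrative):

```python
# Bin labels and counts copied from the frequency table.
bins = ["<50", "50-100", "100-150", "150-200", "200-250", "250-300", ">300"]
counts = [4, 12, 4, 4, 4, 1, 4]
default_counts = [0, 1, 1, 2, 3, 1, 4]

# Default frequency per bin = defaults in the bin / observations in the bin.
default_frequency = [d / n for d, n in zip(default_counts, counts)]

for label, freq in zip(bins, default_frequency):
    print(f"{label:>8}  {freq:.3f}")
```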
Frequency Plot
[Figure: Default Probability (0 to 1) against Months in Business bins (1 to 7)]
Logistic Function
$f(z) = \frac{1}{1 + e^{-z}}$
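
A minimal sketch of this function in Python. The branch for negative z is an implementation choice made here to avoid overflow in `math.exp`; it is algebraically the same formula.

```python
import math


def logistic(z: float) -> float:
    """Logistic function f(z) = 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z) so that
    # exp is never called with a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


# The output always lies strictly between 0 and 1 and equals 0.5 at z = 0.
for z in (-5.0, 0.0, 5.0):
    print(z, logistic(z))
```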
Logistic Regression
$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$
• The logistic function constrains the estimated
probabilities to lie between 0 and 1.
• Maximum Likelihood Estimation is a statistical
method for estimating the coefficients of a model.
Logistic Regression Model
[Figure: Linear Model and Logistic Model compared, Y (0 to 1) against X]
Maximum Likelihood Estimation (MLE)
• MLE chooses the coefficients that maximize the log likelihood (LL),
which measures how probable the observed values of the dependent
variable are given the independent variables.
• For logistic regression, MLE is computed iteratively, starting
from arbitrary initial values for the coefficients.
• After each estimate, the coefficients are updated and the process is
repeated until LL does not change significantly.
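
The iteration described above can be sketched with Newton-Raphson updates. This is one common way to maximize LL, not necessarily the algorithm the author had in mind. The sketch assumes only the 13 rows visible in the sample dataset, and it rescales months to units of 100 purely for numerical convenience.

```python
import numpy as np

# The 13 rows visible in the Sample Dataset slide (the rest is elided there).
months = np.array([189, 170, 166, 423, 145, 60, 97, 354, 99, 80, 25, 118, 74],
                  dtype=float)
y = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0], dtype=float)

# Design matrix: intercept column plus months in units of 100 (an assumption).
X = np.column_stack([np.ones_like(months), months / 100.0])


def log_likelihood(beta):
    """LL = sum(y*z - log(1 + e^z)) with z = X @ beta."""
    z = X @ beta
    return float(np.sum(y * z - np.logaddexp(0.0, z)))


beta = np.zeros(2)                       # arbitrary starting coefficients
ll_history = [log_likelihood(beta)]

for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))        # current probabilities
    gradient = X.T @ (y - p)                     # first derivative of LL
    hessian = -(X.T * (p * (1.0 - p))) @ X       # second derivative of LL
    beta = beta - np.linalg.solve(hessian, gradient)   # Newton-Raphson step
    ll_history.append(log_likelihood(beta))
    # Stop once LL no longer changes significantly.
    if abs(ll_history[-1] - ll_history[-2]) < 1e-10:
        break

print("coefficients (intercept, per 100 months):", beta)
print("log likelihood per iteration:", ll_history)
```

Starting from all-zero coefficients, every predicted probability is 0.5, so the first LL value is 13 times ln(0.5); later values are higher as the fit improves.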
Copyright iSmartsoft Inc. 2008
Log Likelihood (LL)
• Likelihood is the probability of the observed values of the
dependent variable, given the independent variables and
the model coefficients.
• LL is calculated through iteration, using
maximum likelihood estimation (MLE).
• Log likelihood is the basis for tests of a logistic
model.
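
For reference, the standard form of the log likelihood for a binary target can be written with the symbols of the logistic regression formula. The observation index i and the sample size n are notation introduced here, not taken from the slides.

```latex
LL(\beta_0, \beta_1)
  = \sum_{i=1}^{n} \left[ Y_i \ln p_i + (1 - Y_i) \ln (1 - p_i) \right],
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_i)}}
```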
Log Likelihood Test (-2LL)
• The log likelihood test is a test of the
significance of the difference between
-2LL for a reduced model and
-2LL for the full model.
• This difference is called the "model chi-square".
• It is also called the Likelihood Ratio test.
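
A minimal sketch of the arithmetic, assuming the full model adds exactly one coefficient to the reduced model (one degree of freedom). The function name and the two log likelihood values are hypothetical and chosen only for illustration.

```python
import math


def likelihood_ratio_test(ll_reduced: float, ll_full: float):
    """Return (model chi-square, p-value), assuming 1 degree of freedom."""
    # Difference between -2LL of the reduced model and -2LL of the full model.
    chi_square = (-2.0 * ll_reduced) - (-2.0 * ll_full)
    # Guard against tiny negative values caused by rounding.
    chi_square = max(chi_square, 0.0)
    # Upper tail of a chi-square with 1 df: P(X > c) = erfc(sqrt(c / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical log likelihoods, not taken from the slides.
chi_square, p_value = likelihood_ratio_test(ll_reduced=-9.0, ll_full=-7.0)
print(f"model chi-square={chi_square:.2f}, p-value={p_value:.4f}")
```

For more than one added coefficient, the chi-square distribution with the matching degrees of freedom would be needed instead of the one-degree-of-freedom shortcut used here.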
Wald Test
• A Wald test is used to test the statistical significance
of each coefficient ($\beta$) in the model.
• A Wald test calculates a Z statistic, which is:
$Z = \frac{\hat{\beta}}{SE}$
• This Z value is then squared, yielding a Wald statistic
with a chi-square distribution.
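
The calculation can be sketched as follows. The coefficient estimate and standard error are hypothetical numbers, and the function name is illustrative. The p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def wald_test(beta_hat: float, se: float):
    """Return (Z, Wald statistic, p-value) for a single coefficient."""
    z = beta_hat / se          # Z statistic: estimate over its standard error
    wald = z ** 2              # squared Z, chi-square distributed with 1 df
    # Upper tail of a chi-square with 1 df: P(X > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return z, wald, p_value


# Hypothetical estimate and standard error, not taken from the slides.
z, wald, p_value = wald_test(beta_hat=-0.70, se=0.35)
print(f"Z={z:.2f}, Wald={wald:.2f}, p-value={p_value:.4f}")
```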
Summary
• Logistic Regression is a classification method.
• It returns the probability that the binary dependent variable
takes one of its two values, given the independent variables.
• Maximum Likelihood Estimation is a statistical method for
estimating the coefficients of the model.
• The Likelihood Ratio test is used to test the statistical significance
of the difference between the full model and the simpler model.
• The Wald test is used to test the statistical significance of each
coefficient in the model.
Questions?