Topic 12. Binary Logistic Regression
Binary logistic regression is a type of regression analysis that is used to estimate the relationship between a
dichotomous dependent variable and dichotomous-, interval-, and ratio-level independent variables.
Many different variables of interest are dichotomous – e.g., whether or not someone voted in the last election,
whether or not someone is a smoker, whether or not one has a child, whether or not one is unemployed, etc.
These types of variables are often referred to as discrete or qualitative. Many discrete or qualitative variables
can be thought of as events.
Dichotomous or dummy variables are usually coded 1, indicating “success” or “yes,” and 0, indicating “failure”
or “no.” The mean of a dichotomous variable coded 1 and 0 is equal to the proportion of cases coded as 1,
which can also be interpreted as a probability.
1 1 1 1 1 1 0 0 0 0
mean = 6 / 10 = .6 = the probability that any one of these 10 cases, chosen at random, has a score of 1
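As a quick illustration, here is a minimal Python sketch of the arithmetic above (mine, not part of the original example):

# The mean of a 0/1 variable equals the proportion (and probability) of 1s.
scores = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
print(sum(scores) / len(scores))  # 0.6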
For quite a while, researchers used OLS regression to analyze dichotomous outcomes. This was based on the
idea that predicted values (ŷ) – based on the regression results – generally range from 0 to 1 and are equivalent
to predicted probabilities, predicted proportions, and predicted percents of “success” given values on the
independent variables.
In other words, if we regressed a dummy variable, voted or not, on education and got the estimate b = .25, then
we could say that a one-unit increase in education increases the probability of voting by .25. Equivalently, a
one-unit increase in education increases the proportion voting by .25. Finally, a one-unit increase in education
increases the percent voting by 25 percentage points.
Due to a number of conceptual and statistical problems, however, people no longer use OLS regression to
analyze dichotomous dependent variables. There are a number of alternative approaches to modeling
dichotomous outcomes including logistic regression, probit analysis, and discriminant function analysis.
Logistic regression is by far the most common, so that will be our main focus. Additionally, we will focus on
binary logistic regression as opposed to multinomial logistic regression – used for nominal variables with more
than 2 categories.
OLS Regression with a Dichotomous Dependent Variable
What is wrong with using OLS regression with dichotomous dependent variables? There are a number of
problems.
1. One of the regression assumptions that we discussed is that the dependent variable is quantitative (at least at
the interval level), continuous (can take on any numerical value), and unbounded.
A person’s score on the dependent variable is assumed to be a function of their score on each independent
variable. Therefore, the dependent variable must be free to take on any value that is predicted by the
combination of independent variables. If the dependent variable does not meet these requirements (e.g., it is
dichotomous), then predicted scores on the dependent variable may lie outside possible limits.
When you use OLS regression with a dichotomous dependent variable, predicted probabilities (based on the
estimated OLS regression equation) are not bounded by the values of 0 and 1.
Why is this a problem? In the real world, probabilities can never be less than 0 and can never be greater than 1
(and percents can never be below 0% or above 100%). For example, the chance of rain can’t be less than 0% and
can’t be more than 100%: 0% means that it’s impossible and 100% means it’s guaranteed. You can’t get any
more or less likely than that!
With dummy dependent variables and OLS regression, it is not uncommon for predicted probabilities to be less
than 0 and greater than 1. The likelihood of this increases as the difference between the number of successes
and failures increases. In other words, if the split is 90% have a score of 1 and 10% have a score of 0, then you
will probably experience impossible predicted probabilities.
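To see the problem concretely, here is a small Python sketch with made-up data (not the voting example) in which an OLS line fit to a 0/1 outcome produces predicted probabilities below 0 and above 1:

# OLS fit to a dichotomous outcome; the fitted line strays outside [0, 1].
import numpy as np

x = np.arange(10, dtype=float)                       # made-up predictor
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1], float)  # made-up 0/1 outcome

X = np.column_stack([np.ones_like(x), x])            # add an intercept column
b, *_ = np.linalg.lstsq(X, y, rcond=None)            # OLS estimates
y_hat = X @ b                                        # "predicted probabilities"

print(np.round(y_hat, 2))                 # first value is below 0, last is above 1
print(((y_hat < 0) | (y_hat > 1)).any())  # True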
2. Another OLS multiple regression assumption is that the relationship between Y and X is linear and additive
in the population. Our estimates cannot be very good if we assume that the true relationship is linear and
additive and we specify a linear and additive relationship when, in fact, in the population, the relationship is
non-linear and/or non-additive.
In many cases, it is not unreasonable to assume that the relationship between two variables is non-linear. The
example that Pampel uses in the book is that of income and home ownership. A $10,000 increase in income
probably increases the probability of owning a home more for someone with an initial income of $40,000 than
someone with an initial income of $0.
Also an additional $10,000 probably does not have much of an influence on the likelihood that a very rich
person owns a home – e.g., earning $1,001,000 versus earning $1,000,000.
So a more appropriate functional form (rather than a line) might be an s-shaped curve:
[Figure: an s-shaped curve, with Probability (ranging from 0 to 1) on the vertical axis and X on the horizontal axis.]
This type of a curve suggests that one-unit changes in the independent variable have different effects on the
dependent variable at different levels of the independent variable. It takes a much larger increase in X to have
the same effect on Y at extreme ends of the curve. One way to think about this is to consider the fact that the
slope of this curve changes at different values of Y. In essence, there are a number of different tangent lines,
one at each point along the curve, that one can draw.
3. Another regression assumption is that the error term is normally distributed. Remember that the error term
summarizes all of the causes of the dependent variable not included in the model as well as errors in the
functional form of the equation, measurement error, and the randomness in human behavior.
The assumption of normality allows you to do hypothesis testing. If the error term is not normally distributed,
then we cannot use z (t) to find the probability under the curve.
The error term is not normally distributed when you use OLS regression with a dichotomous dependent variable
because, for any value of X, there are only two possible values that the residuals can take. A residual is defined
as the observed value on the dependent variable minus the predicted value given X.
Here’s an example:
Consider 10 people with a value of 2 on the independent variable and an estimated regression equation of:
ŷi = .03 + .48 * xi
The residual is equal to yi – ŷi, where ŷi = a + (b * xi).
So after substituting, the residual is equal to yi – (a + (b * xi)).
For the value X = 2, there are only 2 possible values for the residual because there are only two possible
observed values for Y (1 and 0).
1 – (.03 + .48 * 2) = .01
0 – (.03 + .48 * 2) = -.99
Thus, for any value of x, there are only two possible residuals so the distribution is not normal.
4. Another assumption is that of homoskedasticity – that the variance of the error term is constant across all
values of the independent variables. Homoskedasticity means that the predicted values of the dependent
variable are as good (or as bad) at all levels of the independent variable.
This is violated because the residuals vary with the value of x. The linear OLS regression consistently
underestimates the slope at moderate levels of x and consistently overestimates the slope at extreme levels of x.
Heteroskedasticity leads to biased estimates of the standard errors, which we use in our t tests. Poor estimates
increase the chance of drawing incorrect conclusions in hypothesis testing.
The Logit Transformation
So what can we do? As I mentioned earlier, many topics of interest are dichotomous. The “miracle” of logistic
regression is that, through the logit transformation, it linearizes the non-linear relationship between X and the
probability of Y. It does this through the use of odds and logarithms. So, the logit is a nonlinear function that
represents the s-shaped curve. Let’s look more closely at how this works.
Probabilities express the likelihood of an event as a proportion of both occurrences and non-occurrences. In
other words, probabilities are defined as the number of occurrences divided by the number of occurrences plus
the number of non-occurrences. So, if you have a sample of 4,000 people and 3000 are married, the probability
of being married is .75 (there is a 75% chance of being married): 3000 / (3000 + 1000) = .75.
Probabilities cannot be less than 0 and cannot be greater than 1. In other words, they are bounded by 0 and 1.
Odds, by contrast, are defined as the likelihood of occurrence divided by the likelihood of non-occurrence.
Thus, the odds of being married for our example is: 3000 / 1000 = 3.
What difference does dividing only by the number of non-occurrences make? It removes the upper limit of 1.
But wait…that’s not all…odds are also non-linear. Consider the examples in the Pampel text (p. 11).
P      1 - P     Odds
.1     .9        .111
.2     .8        .250
.3     .7        .429
.4     .6        .667
.5     .5        1.000
.6     .4        1.500
.7     .3        2.333
.8     .2        4.000
.9     .1        9.000
The same change (a .1 increase in P) leads to increasingly large increases in the odds. Notice that the odds is
still bounded at the lower end. It is impossible for the odds to fall below 0.
So, transforming the probabilities into odds has removed the upper limit. We are half way there.
The next step is to take the natural logarithm of the odds. Here is the example from Pampel:
P      1 - P     Odds      Ln odds
.1     .9        .111      -2.198
.2     .8        .250      -1.386
.3     .7        .429      -.846
.4     .6        .667      -.405
.5     .5        1.000     0
.6     .4        1.500     .405
.7     .3        2.333     .847
.8     .2        4.000     1.386
.9     .1        9.000     2.197
Taking the natural log of the odds eliminates the floor of zero. Taking the natural log of a number above 0 and
below 1 yields a negative number. Odds cannot be less than zero, but all odds less than 1 yield natural logs that
are negative…the floor is gone.
Taking the natural log of the number 1 yields 0.
Finally, taking the natural logarithm of a number greater than 1 yields a positive number. However, notice that
the distribution of logged odds is symmetrical around 0. The same size increase or decrease in the probabilities
has the same absolute value in logged odds.
It is important to point out that the difference between the logged odds is not constant. For example,
2.197 - 1.386 = .811 while 1.386 - .847 = .539. What does this mean? The logit transformation is stretching the
distribution at the extreme ends, so the same one-unit change in X (the independent variable) leads to increasingly
smaller gains in Y. In essence, it yields the s-shaped curve.
The linear relationship between X and the log odds is given by the following formula – cumulative logistic
distribution function (you will see this in some statistical output):
Logged odds: Ln( Pi / (1 - Pi) ) = b0 + b1x1i + b2x2i
Notice that there is no error term in the model. The error term is not necessary. “The random component of the
model is inherent in the modeling itself – the logit equation, for example, provides the expression for the
probability that an event will occur. For each observation the occurrence or non-occurrence of that event comes
about through a chance mechanism determined by this probability, rather than by a draw from a bowl of error
terms” (Kennedy 1998, p. 234).
The right hand side of the equation looks like a normal linear regression equation, but the left hand side is the
log odds rather than a probability. This equation can be re-written as follows:
Odds: Pi / (1 - Pi) = e^b0 * e^(b1x1i) * e^(b2x2i)

Probabilities: Pi = E(Yi = 1 | X1i, X2i) = 1 / (1 + e^-(b0 + b1x1i + b2x2i))
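As a quick illustration of how these three forms fit together, here is a minimal Python sketch using the marriage example’s probability of .75 (my calculation, not from the text):

# Probability -> odds -> logged odds, and back again with the inverse logit.
import math

p = 0.75
odds = p / (1 - p)                       # 3.0
log_odds = math.log(odds)                # about 1.099

p_back = 1 / (1 + math.exp(-log_odds))   # recovers 0.75
print(odds, round(log_odds, 3), round(p_back, 3))   # 3.0 1.099 0.75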
Estimation
Fantastic…logistic regression allows us to estimate the relationship between dichotomous dependent variables
and dichotomous, interval, and ratio independent variables. But…how does it work? Where do the estimates
for β come from?
Logistic regression uses maximum likelihood estimation to generate estimates of β. Specifically, ML uses the
observed data and probability theory to find the most likely or the most probable population value given the
sample data (observations).
In logistic regression, the formula that is used to determine the population value most likely to yield the sample
data is given by the likelihood function:
LF = Π [ Pi^yi * (1 - Pi)^(1 - yi) ]
The likelihood function is an expression for the likelihood of observing the pattern of occurrences (y=1) and
non-occurrences (y=0) of an event in a given sample. In other words, it tells us the probability of getting our
sample data from a population with probabilities equal to Pi.
The yis above can be either 1 or 0, depending on the score of person i on the dependent variable. The Pis refer
to predicted probabilities on the dependent variable (there is one for every person in the sample) given the
person’s scores on the independent variable(s). You plug predicted probabilities into this equation just like we
plugged possible population proportions into the binomial probability distribution. The predicted probabilities
are from a logistic regression model.
Example 1:
yi     Pi      Pi^yi * (1 - Pi)^(1 - yi)
1      0.1     0.100000
1      0.2     0.200000
0      0.8     0.200000
0      0.9     0.100000
                        LF = 0.000400

Example 2:
yi     Pi      Pi^yi * (1 - Pi)^(1 - yi)
1      0.8     0.800000
1      0.9     0.900000
0      0.1     0.900000
0      0.2     0.800000
                        LF = 0.518400
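A small Python sketch of the same arithmetic (mirroring the two examples above):

# Likelihood of the observed pattern of 1s and 0s given a set of
# predicted probabilities: LF = product of Pi^yi * (1 - Pi)^(1 - yi).
def likelihood(y, p):
    lf = 1.0
    for yi, pi in zip(y, p):
        lf *= (pi ** yi) * ((1 - pi) ** (1 - yi))
    return lf

y = [1, 1, 0, 0]
print(likelihood(y, [0.1, 0.2, 0.8, 0.9]))  # 0.0004
print(likelihood(y, [0.8, 0.9, 0.1, 0.2]))  # 0.5184

The second set of predicted probabilities makes the observed data far more likely, which is exactly the kind of comparison maximum likelihood makes.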
How does the software package do a logistic regression to get the predicted probabilities? It does this by using
OLS regression to get a first set of estimates. These estimates yield predicted probabilities for each case, which
are plugged into the likelihood function and generate a score for those estimates. The score is equal to the
likelihood of obtaining the observed sample data (the combination of 1s and 0s) from a population with that
particular set of slope estimates.
Next, based on sophisticated computer algorithms, it chooses a new estimate for β, which is used to generate a
new set of predicted probabilities for each case. These are plugged into the likelihood function and generate a
new score (a new probability) for this new estimate.
This process is repeated over and over until the likelihood function is maximized – until the improvements in the
likelihood become extremely small with successive iterations. Believe me, you don’t want to do this by hand.
In the book, Pampel talks about the log likelihood function. The log likelihood function is another version of
the likelihood function – it is simply the log of the likelihood function. Why does the computer program use
this instead of the likelihood function? Because it’s easier to do addition than multiplication. By taking the log
of the likelihood function, the different components of the equation can be added together instead of multiplied
together. One rule of logarithms states that log (a*b) = log a + log b.
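For example, the log of the second likelihood above can be written as a sum (a minimal sketch, not how SPSS implements it):

# Log likelihood: logging turns the product into a sum of logged terms.
import math

def log_likelihood(y, p):
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

y = [1, 1, 0, 0]
ll = log_likelihood(y, [0.8, 0.9, 0.1, 0.2])
print(round(ll, 3), round(math.exp(ll), 4))  # -0.657, which exponentiates back to 0.5184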
Summary
So to summarize, when we are interested in examining the relationship between a dichotomous dependent
variable and dichotomous, interval, and ratio independent variables, we can use logistic regression. Logistic
regression uses maximum likelihood to generate the estimates of the slopes.
Logistic regression yields unbiased and efficient estimates of β and OLS regression does not.
Logistic regression is now available in most statistical packages. It is fairly simple to run in SPSS.
Getting Results from SPSS and Interpretation
For our example, we will examine people’s attitudes toward women in politics. Our dependent variable is
dichotomous: whether or not the respondent agrees that men are better suited emotionally for politics (1 = yes).
Independent variables include: sex (male), race (white), age, marital status (married), the number of children
(childs), and education (educ).
We will run two logistic regressions. In the first, we will control for all variables except for education. In the
second, we will control for all variables.
Here is the process for using the menu system: analyze, regression, binary logistic. That’s all there is to it.
Once you have entered the dependent and independent variables, click on paste and run the command from the
syntax window. Here is the command syntax for SPSS for the first model:
LOGISTIC REGRESSION VAR=menpol2
/METHOD=ENTER male white age married childs
/CRITERIA PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
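As an aside for readers who work in Python rather than SPSS, a roughly comparable model could be fit with the statsmodels package. This is only a sketch: the tiny DataFrame below is made-up stand-in data, not the GSS variables analyzed here.

# Rough Python analogue of a binary logistic regression (made-up data).
import pandas as pd
import statsmodels.formula.api as smf

gss = pd.DataFrame({                                    # hypothetical stand-in
    "menpol2": [1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0],
    "male":    [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    "age":     [25, 34, 58, 41, 67, 29, 50, 45, 72, 38, 60, 22],
})

result = smf.logit("menpol2 ~ male + age", data=gss).fit()
print(result.params)          # logged odds (B); exponentiate for Exp(B)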
Here is the SPSS output for the first model:
Logistic Regression
Case Processing Summary
Unweighted Cases(a)                           N       Percent
Selected Cases    Included in Analysis        1436    93.9
                  Missing Cases               94      6.1
                  Total                       1530    100.0
Unselected Cases                              0       .0
Total                                         1530    100.0
a. If weight is in effect, see classification table for the total number of cases.
Dependent Variable Encoding
Original Value     Internal Value
.00 no             0
1.00 yes           1
SPSS uses a listwise deletion of missing cases procedure for logistic regression – cases with missing data on
any variable in the model are dropped from the analysis. In this first model, we lose 94 cases (or 6.1% of our
sample).
Block 0: Beginning Block
Classification Table(a,b)
                                       Predicted
                                       no      yes     Percentage Correct
Step 0   Observed    no                729     0       100.0
                     yes               707     0       .0
         Overall Percentage                            50.8
a. Constant is included in the model.
b. The cut value is .500
Notice that the n = 1436 (729+707). 50.8% is the percent that SPSS would guess correctly if it guessed “no” for
every case (729/1436=.508). Without any additional information (e.g., independent variables) this would be the
best guess because no is the modal category.
Variables in the Equation
                      B       S.E.    Wald    df    Sig.    Exp(B)
Step 0    Constant    -.031   .053    .337    1     .562    .970

Variables not in the Equation
                                    Score     df    Sig.
Step 0    Variables    MALE         .090      1     .764
                       WHITE        .089      1     .766
                       AGE          59.009    1     .000
                       MARRIED      5.646     1     .017
                       CHILDS       14.435    1     .000
          Overall Statistics        67.258    5     .000
Nothing terribly important here. This first model or baseline model is just used to calculate the –2 log
likelihood value for the fully unconditional or null model (the model with no variables).
LF = Π [ Pi^yi * (1 - Pi)^(1 - yi) ]

I believe that the “scores” are chi-square statistics that test the null hypothesis that the variable is not associated
with the dependent variable.
Block 1: Method = Enter
Omnibus Tests of Model Coefficients
                    Chi-square    df    Sig.
Step 1    Step      68.454        5     .000
          Block     68.454        5     .000
          Model     68.454        5     .000

Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       1921.927             .047                    .062
SPSS does not actually give you the baseline –2 log likelihood value. You have to calculate it based on the
chi-square and –2 log likelihood values listed above. 1921.927 is the –2 log likelihood for the model with
independent variables. 68.454 is the improvement in this model over the baseline –2 log likelihood value. So
the baseline value is 1990.381 (1921.927 + 68.454 = 1990.381). A –2 log likelihood of 0 would indicate a perfect
fit; the closer to 0, the better the fit.
These numbers do not have any useful meaning in and of themselves. The larger the improvement in the
model, the better able are the independent variables to predict the outcome. More correctly, the smaller the –2
log likelihood, the more probable the population estimates given the sample data (smaller because it is
multiplied by negative 2).
The difference between these two –2 log likelihood values can be thought of as a test of the significance of the
whole model. If the chi-square value is significant, then it means that at least one of the independent variables
has a significant effect on the dependent variable. It is, therefore, roughly equivalent to the F test in OLS
multiple regression.
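For example, the p-value for this omnibus chi-square can be checked with scipy (a quick sketch of the test described above):

# P-value of the model chi-square: 68.454 on 5 degrees of freedom.
from scipy.stats import chi2

print(chi2.sf(68.454, df=5))   # well below .05, so the model as a whole is significant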
Another way to evaluate the fit of the model is to examine the pseudo r-squared values reported in the output.
As Pampel mentions in the Sage text, the various pseudo r-squareds are not equivalent to the r-squared test in
OLS multiple regression.
In OLS multiple regression, the r-square indicates the amount of variation in the dependent variable that is
explained by the independent variables. In other words, the degree to which we are able to accurately predict
the observed values on the dependent variable over guessing the mean.
The total variation in the dependent variable is defined as the sum of squared deviations between the mean of
the dependent variable and each case’s observed score. The explained variation is defined as the sum of
squared deviations between the predicted score on the dependent variable and the mean. Thus, the r-squared is
defined as the explained variation divided by the total variation. This type of summary measure of the goodness
of fit is commonly referred to as a proportional reduction of error measure.
The pseudo r-squareds that are available in logistic regression use the same logic. The simplest pseudo
r-squared is simply the improvement in the –2 log likelihood divided by the –2 log likelihood of the baseline
model. This indicates the proportional improvement in the model, much like the r-squared in linear regression.
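As a quick sketch of that calculation with the values from this example:

# Simplest pseudo r-squared: improvement in -2LL divided by the baseline -2LL.
baseline = 1990.381   # -2 log likelihood of the null model
fitted = 1921.927     # -2 log likelihood of the fitted model

print(round((baseline - fitted) / baseline, 3))   # about .034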
A number of different alternative pseudo r-squareds are available. Two included in the SPSS output are the
Cox and Snell pseudo r-squared and the Nagelkerke pseudo r-squared.
Others discussed in the Pampel book include: Aldrich and Nelson, Hagle and Mitchell, and McKelvey and
Zavoina. All of these pseudo r-squareds indicate the goodness of fit of the model – that is, the proportional
reduction in the chi square value from the baseline to the fitted model.
What is the difference between all of the different pseudo r-squareds? The –2 log likelihood value is partially
determined by sample size and the number of variables. Also, some pseudo r-squareds are not bounded by 0
and 1. Different versions of the pseudo r-squared correct for one or both of these problems. I would suggest
presenting several in your output. In terms of those available from SPSS, the Nagelkerke version is based on
the Cox and Snell pseudo r squared, but has a value of 1 for the upper limit.
Classification Table(a)
                                       Predicted
                                       no      yes     Percentage Correct
Step 1   Observed    no                441     288     60.5
                     yes               296     411     58.1
         Overall Percentage                            59.3
a. The cut value is .500
Any case with a predicted probability greater than .5 (the cut value) is assigned as yes and any case with a
predicted probability less than .5 is assigned as no. Here you have the percent correctly classified. Earlier, we
learned that if SPSS guessed the modal category (no) for each case, it would have been correct 50.8% of the
time. So, by taking into account the independent variables, it is now able to be correct 59.3% of the time – an
improvement of about 8.5 percentage points. Obviously, the bigger the difference, the better the model.
Variables in the Equation
                         B        S.E.    Wald      df    Sig.    Exp(B)
Step 1(a)    MALE        .065     .110    .353      1     .552    1.067
             WHITE       -.169    .167    1.019     1     .313    .845
             AGE         .024     .003    48.850    1     .000    1.024
             MARRIED     .259     .117    4.911     1     .027    1.296
             CHILDS      .034     .031    1.247     1     .264    1.035
             Constant    -1.220   .221    30.597    1     .000    .295
a. Variable(s) entered on step 1: MALE, WHITE, AGE, MARRIED, CHILDS.
Here are the coefficients. The B column presents the logged odds. These do not have an intuitive
interpretation. For example, the logged odds of agreeing that men are better emotionally suited for politics is
.065 higher for men than women. Each additional child increases the logged odds of agreeing that men are
better emotionally suited for politics by .034.
The SE column provides the standard error for the estimated slopes. Remember, this is an estimate of how
much variability there is in the estimated slopes across all possible samples. Dividing B by the SE yields a z
score, which you can use to test whether the slope is significantly different from zero. If you square the z value
(multiply it by itself), you get the Wald statistic that is presented in the fourth column. The Wald statistic has a
chi-square distribution, so you can also use the Wald statistic and its degrees of freedom to test whether the slope
is different from zero. The probability of obtaining a slope this far from zero if the true slope were zero (the
p-value) is given in the next column.
Finally, the Exp(B) column presents the odds of agreeing. This is a simple transformation of the B. To
calculate the exponent of B, raise e (the base of the natural logarithm, about equal to 2.718) to a power equal to
B. For example, e^.065 = 1.067.
The odds has a more intuitive interpretation. The odds will be greater than 1 if the B is positive. The odds will
be less than 1 if the B is negative. If the odds is equal to 1, there is no relationship. Here is the interpretation:
the odds of agreeing that men are better emotionally suited for politics are 6.7% higher for men than for women.
Each additional child increases the odds of agreeing by 3.5%. Odds are different than probabilities!
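As a quick check of the arithmetic behind these columns, here is a sketch using the MALE row (small differences from the output are due to rounding in the displayed B and S.E.):

# From B and S.E. to z, Wald, and Exp(B) for the MALE coefficient.
import math

b, se = 0.065, 0.110
z = b / se                 # about 0.59
wald = z ** 2              # about 0.35 (SPSS reports .353 from unrounded values)
exp_b = math.exp(b)        # 1.067

print(round(z, 3), round(wald, 3), round(exp_b, 3))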
Let’s run a second model controlling for education. Here is the output.
Case Processing Summary
Unweighted Cases(a)                           N       Percent
Selected Cases    Included in Analysis        1428    93.3
                  Missing Cases               102     6.7
                  Total                       1530    100.0
Unselected Cases                              0       .0
Total                                         1530    100.0
a. If weight is in effect, see classification table for the total number of cases.
Notice that we have fewer cases. We lost a few cases when we entered the education variable (n=1,428).
Block 0: Beginning Block
Dependent Variable Encoding
Original Value     Internal Value
.00 no             0
1.00 yes           1

Classification Table(a,b)
                                       Predicted
                                       no      yes     Percentage Correct
Step 0   Observed    no                724     0       100.0
                     yes               704     0       .0
         Overall Percentage                            50.7
a. Constant is included in the model.
b. The cut value is .500
Variables in the Equation
                      B       S.E.    Wald    df    Sig.    Exp(B)
Step 0    Constant    -.028   .053    .280    1     .597    .972

Variables not in the Equation
                                    Score     df    Sig.
Step 0    Variables    MALE         .065      1     .798
                       WHITE        .039      1     .844
                       AGE          60.445    1     .000
                       MARRIED      5.108     1     .024
                       CHILDS       14.686    1     .000
                       EDUC         54.437    1     .000
          Overall Statistics        94.604    6     .000
Block 1: Method = Enter
Omnibus Tests of Model Coefficients
                    Chi-square    df    Sig.
Step 1    Step      97.391        6     .000
          Block     97.391        6     .000
          Model     97.391        6     .000

Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       1881.958             .066                    .088

Classification Table(a)
                                       Predicted
                                       no      yes     Percentage Correct
Step 1   Observed    no                479     245     66.2
                     yes               311     393     55.8
         Overall Percentage                            61.1
a. The cut value is .500
What I want to point out is that you are able to compare the fit of two models with a different set of independent
variables. The logic is the same as before when we compared the null model to the first fitted model. Now we
can compare the first fitted model (without education) to the second fitted model (with education). The
improvement in the chi square value is: 1921.927 - 1881.958 = 39.969. With 1 degree of freedom, the new chi
square value of 39.969 is significant. So we can conclude that adding the education variable significantly
improves the fit of the model.
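A quick sketch of this comparison in Python, using the two –2 log likelihood values reported above:

# Likelihood-ratio test for adding education: difference in -2 log likelihoods.
from scipy.stats import chi2

improvement = 1921.927 - 1881.958         # 39.969
print(round(improvement, 3), chi2.sf(improvement, df=1))  # the p-value is tiny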
Variables in the Equation
                         B        S.E.    Wald      df    Sig.    Exp(B)
Step 1(a)    MALE        .085     .111    .586      1     .444    1.089
             WHITE       -.051    .172    .088      1     .766    .950
             AGE         .020     .004    30.337    1     .000    1.020
             MARRIED     .298     .119    6.222     1     .013    1.347
             CHILDS      .008     .032    .061      1     .805    1.008
             EDUC        -.099    .019    27.306    1     .000    .906
             Constant    .054     .332    .026      1     .872    1.055
a. Variable(s) entered on step 1: MALE, WHITE, AGE, MARRIED, CHILDS, EDUC.
Bells and Whistles
You have probably noticed that I have not said much about standardized or semi-standardized coefficients.
Pampel discusses both in reference to logistic regression. I’ll show my bias…in general, I am not a big fan of
standardized coefficients. I think it is much more meaningful to describe the relationships between variables in
their natural metric.
That being said, only semi-standardized coefficients are available in logistic regression. Semi-standardized
coefficients are obtained by standardizing the independent variables (converting them to z scores) and using the
new standardized variables in the logistic regression. Alternatively, you can multiply the logged odds of each
independent variable (from your logistic regression) by each variable’s standard deviation.
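A minimal sketch of that second approach, using the AGE and EDUC coefficients from the final model and made-up standard deviations (the real standard deviations would come from your data):

# Semi-standardized coefficients: multiply each logged-odds coefficient (B)
# by the standard deviation of its independent variable.
b  = {"age": 0.020, "educ": -0.099}    # B values from the table above
sd = {"age": 17.0,  "educ": 3.0}       # hypothetical standard deviations

semi_std = {var: b[var] * sd[var] for var in b}
print(semi_std)   # about {'age': 0.34, 'educ': -0.297}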
It is impossible to have a truly standardized coefficient in logistic regression because both the independent and
dependent variables must be standardized to have a standardized coefficient that can be compared across
different regression models and you cannot convert a dichotomous variable to a standardized variable.
Standardizing a dummy variable only changes the values from 0 and 1 to two other values.
If you get semi-standardized coefficients, they allow you to compare the strength of different independent
variables when they are in the same model. Dichotomous independent variables can be standardized to
compare their relative strength to that of interval and ratio level variables even though a one standard deviation
unit increase in a dummy variable doesn’t make any sense.
Pampel says that semi-standardized coefficients are available in SPSS…if they are, I can’t find them anywhere.