A Brief Introduction to Multiple Correlation/Regression
as a Simplification of the Multivariate General Linear Model
In its most general form, the GLM (General Linear Model) relates a set of p predictor variables
(X1 through Xp) to a set of q criterion variables (Y1 through Yq). We shall now briefly survey three special cases of the GLM: the univariate mean, bivariate correlation/regression, and multiple correlation/regression.
The Univariate Mean: A One Parameter (a) Model
If there is only one Y and no X, then the GLM simplifies to the computation of a mean. We
apply the least squares criterion to reduce the squared deviations between Y and predicted Y to the
smallest value possible for a linear model. The prediction equation is $\hat{Y} = \bar{Y}$. Error in prediction is estimated by $s = \sqrt{\sum (Y - \bar{Y})^2 / (n - 1)}$.
Bivariate Regression: A Two Parameter (a and b) Model
If there is only one X and only one Y, then the GLM simplifies to the simple bivariate linear
correlation/regression with which you are familiar. We apply the least squares criterion to reduce
the squared deviations between Y and predicted Y to the smallest value possible for a linear model.
That is, we find a and b such that, for $\hat{Y} = a + bX$, the quantity $\sum (Y - \hat{Y})^2$ is minimal. The GLM is reduced to $Y = a + bX + e = \hat{Y} + e$, where e is the "error" term, the deviation of Y from predicted Y. The coefficient "a" is the Y-intercept, the value of Y when X = 0 (the intercept was the mean of Y in the one-parameter model above), and "b" is the slope, the average amount of change in Y per unit change in X. Error in prediction is estimated by $s_{est\,Y} = \sqrt{\sum (Y - \hat{Y})^2 / (n - 1)}$.
Although the model is linear, that is, specifies a straight line relationship between X and Y, it
may be modified to test nonlinear models. For example, if you think that the function relating Y to X is
quadratic, you employ the model $Y = a + b_1 X + b_2 X^2 + e$.
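Here is a minimal Python sketch, using simulated (made-up) data, that fits both the two-parameter linear model and the quadratic model by least squares; note that the quadratic model is still linear in its parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=50)   # simulated data for illustration

# Two-parameter linear model, Y-hat = a + bX, fit by least squares.
b, a = np.polyfit(x, y, deg=1)                       # polyfit returns highest power first
y_hat = a + b * x
s_est = np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - 1))   # error in prediction, as defined above

# Quadratic model, Y-hat = a + b1*X + b2*X^2, still linear in its parameters.
b2, b1, a_quad = np.polyfit(x, y, deg=2)

print(f"a = {a:.2f}, b = {b:.2f}, s_est_Y = {s_est:.2f}")
print(f"quadratic: a = {a_quad:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
```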
It is often more convenient to work with variables that have all been standardized to some
common mean and some common SD (standard deviation) such as 0, 1 (Z-scores). If scores are so
standardized, the intercept, "a," drops out (becomes zero) and the standardized slope, the number of
standard deviations that predicted Y changes for each change of one SD in X, is commonly referred to as β (beta). In a bivariate regression, β is the Pearson r. If r = 1, then each change in X of one SD is associated with a one SD change in predicted Y.
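You can verify that claim with a short sketch (again with simulated data): after converting both variables to Z-scores, the fitted intercept is essentially zero and the fitted slope equals the Pearson r.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=50)   # simulated data again

# Standardize both variables to mean 0, SD 1 (Z-scores).
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

beta, intercept = np.polyfit(zx, zy, deg=1)
r = np.corrcoef(x, y)[0, 1]

# The intercept is (numerically) zero and the standardized slope equals Pearson r.
print(f"intercept = {intercept:.6f}, beta = {beta:.3f}, r = {r:.3f}")
```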
The variables X and Y may be both continuous (Pearson r), one continuous and one dichotomous (point biserial r), or both dichotomous (φ).
Multiple Correlation/Regression
In multiple correlation/regression, one has two or more predictor variables but only one
criterion variable. The basic model is $\hat{Y} = a + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p$ or, employing standardized scores, $\hat{Z}_Y = \beta_1 Z_1 + \beta_2 Z_2 + \cdots + \beta_p Z_p$. Again, we wish to find regression coefficients that produce a
predicted Y that is minimally deviant from observed Y, by the least squares criterion. We are
creating a linear combination of the X variables, $a + b_1 X_1 + b_2 X_2 + \cdots + b_p X_p$, that is maximally
correlated with Y. That is, we are creating a superordinate predictor variable that is a linear
combination of the individual predictor variables, with the weighting coefficients ($b_1$ through $b_p$) chosen
such that the Pearson r between the criterion variable and the linear combination is maximal. The
value of this r between Y and the best linear combination of X’s is called R, the multiple correlation
coefficient. Note that the GLM is not only linear, but additive. That is, we assume that the weighted
effect of X1 combines additively with the weighted effect of X2 to determine their joint effect,
a  b1 X 1  b2 X 2 , on predicted Y.
As a simple example of multiple regression, consider using high school GPA and SAT scores
to predict college GPA. R would give us an indication of the strength of the association between
college GPA and the best linear combination of high school GPA and SAT scores. We could
additionally look at the β weights (also called standardized partial regression coefficients) to
determine the relative contribution of each predictor variable towards predicting Y. These coefficients
are called partial coefficients to emphasize that they reflect the contribution of a single X in predicting
Y in the context of the other predictor variables in the model. That is, they tell us how much predicted Y changes per unit change in Xi when we partial out (remove, hold constant) the effects of all the other predictor variables. The weight applied to Xi can change dramatically if we change the context (add
one or more additional X or delete one or more of the X variables currently in the model). An X which
is highly correlated with Y could have a low weight simply because it is redundant with another X in
the model.
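The following sketch, again with simulated data (the GPA and SAT values are fabricated for illustration only), shows one way to obtain R and the β weights for a two-predictor model:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
hs_gpa = rng.normal(3.0, 0.5, n)                               # simulated high school GPA
sat = 900 + 150 * hs_gpa + rng.normal(0, 80, n)                # simulated SAT, correlated with GPA
col_gpa = 0.4 * hs_gpa + 0.001 * sat + rng.normal(0, 0.3, n)   # simulated college GPA

# Simultaneous multiple regression: Y-hat = a + b1*X1 + b2*X2, fit by least squares.
X = np.column_stack([np.ones(n), hs_gpa, sat])
coefs = np.linalg.lstsq(X, col_gpa, rcond=None)[0]
a, b1, b2 = coefs

y_hat = X @ coefs
R = np.corrcoef(col_gpa, y_hat)[0, 1]          # multiple correlation coefficient

# Standardized partial (beta) weights: each b scaled by SD(X)/SD(Y).
beta1 = b1 * hs_gpa.std(ddof=1) / col_gpa.std(ddof=1)
beta2 = b2 * sat.std(ddof=1) / col_gpa.std(ddof=1)

print(f"R = {R:.3f}, R² = {R**2:.3f}, β(HS GPA) = {beta1:.3f}, β(SAT) = {beta2:.3f}")
```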
Rather than throwing in all of the independent variables at once (a simultaneous multiple regression), we may enter them sequentially. With an a priori sequential analysis (also called a hierarchical analysis), we would enter the predictor variables in some a priori order. For example,
for predicting college GPA, we might first enter high school GPA, a predictor we consider “high priority” because it is cheap (all applicants can provide it at low cost). We would compute r² and interpret it as the proportion of variance in Y that is “explained” by high school GPA. Our next step
might be to add SAT-V and SAT-Q to the model and compute the multiple regression for
Yˆ  a  b1 X1  b2 X 2  b3 X 3 . We entered SAT scores with a lower priority because they are more
expensive to obtain: not all high school students have them, and they cost money. We enter them together because you get both for one price. This is called setwise entry. We now
compare the R² (squared multiple correlation coefficient) with the r² previously obtained to see how much additional variance in Y is explained by adding X2 and X3 to the X1 already in the model. If the increase in R² seems large enough to justify the additional expense involved in obtaining the X2 and X3 information, we retain X2 and X3 in the model. We might then add a yet lower priority predictor, such as X4, the result of an on-campus interview (costly), and see how much further the R² is increased, etc.
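Here is a sketch of such a sequential (hierarchical) analysis with simulated data; the particular R² values are meaningless, but the logic of comparing the step 2 R² with the step 1 r² is the same as described above:

```python
import numpy as np

def r_squared(predictors, y):
    """Fit Y-hat = a + b1*X1 + ... by least squares and return R^2."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    coefs = np.linalg.lstsq(X, y, rcond=None)[0]
    y_hat = X @ coefs
    return np.corrcoef(y, y_hat)[0, 1] ** 2

rng = np.random.default_rng(3)
n = 200
hs_gpa = rng.normal(3.0, 0.5, n)
sat_v = 500 + 60 * hs_gpa + rng.normal(0, 70, n)
sat_q = 480 + 70 * hs_gpa + rng.normal(0, 70, n)
col_gpa = 0.5 * hs_gpa + 0.001 * sat_v + 0.001 * sat_q + rng.normal(0, 0.3, n)

# Step 1: enter the high-priority (cheap) predictor alone.
r2_step1 = r_squared([hs_gpa], col_gpa)

# Step 2: setwise entry of SAT-V and SAT-Q; examine the increase in R².
r2_step2 = r_squared([hs_gpa, sat_v, sat_q], col_gpa)

print(f"step 1 r² = {r2_step1:.3f}, step 2 R² = {r2_step2:.3f}, "
      f"increment = {r2_step2 - r2_step1:.3f}")
```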
In other cases we might first enter nuisance variables (covariates) for which we wish to
achieve “statistical control” and then enter our predictor variable(s) of primary interest later. For
example, we might be interested in the association between the amount of paternal care a youngster
has received (Xp) and how healthy the youngster is (Y). Some of the correlation between Xp and Y might be due
to the fact that youngsters from “good” families get lots of maternal (Xm) care and lots of paternal
care, but it is the maternal care that causes the youngsters' good health. That is, Xp is correlated with
Y mostly because it is correlated with Xm which is in turn causing Y. If we want to find the effect of Xp
on Y we could first enter Xm and compute r² and then enter Xp and see how much R² increases. By
first entering the covariate, we have statistically removed (part of) its effect on Y and obtained a
clearer picture of the effect of Xp on Y (after removing the confounded nuisance variable’s effect).
This is, however, very risky business, because this adjustment may actually remove part (or all) of
the actual causal effect of Xp on Y. For example, it may be that good fathers give their youngsters
lots of care, causing them to be healthy, and that mothers simply passively respond, spending more
time with (paternally caused) healthy youngsters than with unhealthy youngsters. By first removing
the noncausal “effect” of Xm on Y we, with our maternal bias, would have eliminated part of the truly
causal effect of Xp on Y. Clearly our a priori biases can affect the results of such sequential analyses.
Stepwise multiple regression analysis employs one of several available statistical
algorithms to order the entry (and/or deletion) of predictors from the model being constructed. I opine
that stepwise analysis is one of the most misunderstood and abused statistical procedures employed
by psychologists. Many psychologists mistakenly believe that such an analysis will tell you which
predictors are importantly related to Y and which are not. That is a very dangerous delusion.
Imagine that among your predictors are two, let us just call them A and B, each of which is well
correlated with the criterion variable, Y. If A and B are redundant (explain essentially the same
portion of the variance in Y), then one, but not both, of A and B will be retained in the final model
constructed by the stepwise technique. Whether it is A or B that is retained will be due to sampling
error. In some samples A will, by chance, be just a little better correlated with Y than is B, while in
other samples B will be, by chance, just a little better correlated with Y than is A. With your sample,
whether it is A or B that is retained in the model does not tell you which of A and B is more
importantly related to Y. I strongly recommend that persons not use stepwise techniques until they have received advanced instruction in their use and interpretation. See this warning.
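The following simulation sketch illustrates the point about redundant predictors: A and B are constructed to be nearly redundant, and which of them shows the stronger correlation with Y (and so would be entered first by a forward stepwise algorithm) flips from sample to sample, essentially by chance.

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_samples = 100, 1000
picked_A = 0

for _ in range(n_samples):
    common = rng.normal(size=n)
    A = common + rng.normal(scale=0.3, size=n)   # A and B are largely redundant
    B = common + rng.normal(scale=0.3, size=n)
    Y = common + rng.normal(scale=1.0, size=n)   # Y depends on what A and B share

    r_A = abs(np.corrcoef(A, Y)[0, 1])
    r_B = abs(np.corrcoef(B, Y)[0, 1])
    picked_A += r_A > r_B                        # the "stronger" predictor would enter first

print(f"A would be entered first in {picked_A / n_samples:.0%} of samples")   # roughly 50%
```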
Assumptions
There are no assumptions involved in computing point estimates of R, a, $b_i$, or $s_{est\,Y}$, but as soon as you use t or F to put a confidence interval on one of these estimates, or to test a hypothesis about one of them, there are assumptions. Exactly what the assumptions are
depends on whether you have adopted a correlation model or a regression model, which depends on
whether you treat the X variable(s) as fixed (regression) or random (correlation). Review this
distinction between regression and correlation in the document Bivariate Linear Correlation and then
work through my lesson on Producing and Interpreting Residuals Plots in SAS.
Return to Wuensch’s Stats Lessons Page
Copyright 2012, Karl L. Wuensch - All rights reserved.