
Regression

Outline of Today’s Discussion
1. Coefficient of Determination
2. Regression Analysis: Introduction
3. Regression Analysis: SPSS
4. Regression Analysis: Excel
5. Independent Predictors

Part 1 Coefficient of Determination

In correlational research, researchers often use the “r-squared” statistic, also called the “coefficient of determination”, to describe the proportion of Y variability explained by X.

What range of values is possible for the coefficient of determination (the r-squared statistic)?

Example: What is the evidence that IQ is heritable?

The r-value for the IQ of identical twins reared apart = 0.6. What is the value of r-squared in this case?

So what proportion of the variability in IQ is unexplained (unaccounted for) by genetics?

Different sciences are characterized by the r-squared values that are deemed impressive. (Chemists might expect r-squared to be > 0.99.)

As we have already seen, r-squared is the same as “eta-squared”.

Part 2 Regression Analysis: Introduction

• Correlation is the process of finding a relationship between variables.
• Regression is the process of finding the best-fitting trend (line) that describes the relationship between variables.
• So, correlation and regression are very similar!

• The ‘r’ statistic can be tested for statistical significance!
• Potential Pop Quiz Question: What two factors determine the critical value (i.e., the number to beat) when we engage in hypothesis testing?

DF for Correlation & Regression
df = n - 2. Here n stands for the number of pairs of scores. Why would this be n-2, rather than the usual n-1?
• In general, the formula for the degrees of freedom is the number of observations minus the number of parameters estimated.
• For correlation, we have one estimate for the mean of X, and another estimate for the mean of Y.
• For regression, we have one estimate for the slope, and another estimate for the y-intercept.

Slope can also be thought of as “rise over run”.
The “rise” on the ordinate = Y2 - Y1. The “run” on the abscissa = X2 - X1.
“Rise over run” in pictures.
Here, the regression is “linear”…
Here, the regression is non-linear! What would the equation look like for this trend?

• Let’s now return to linear regression, and learn how to manually compute the slope and y-intercept.
• To compute the slope, we need two quantities that we have already learned. These are SPxy (sum of products) and SSx (sum of squares for X)…

$$\text{slope} = \frac{SP_{xy}}{SS_x}, \qquad SP_{xy} = \sum (X - \bar{X})(Y - \bar{Y})$$

Once we have the slope, it’s easy to get the y-intercept!

$$\text{intercept} = \bar{Y} - (\text{slope} \times \bar{X})$$

Part 3 Regression Analysis: SPSS

• Later we’ll go to SPSS and get some practice with regression.
• The steps in SPSS will be Analyze --> Regression --> Linear.
• We will place the criterion (i.e., the Y-axis variable) in the “Dependent” box, and the predictor (i.e., the X-axis variable) in the “Independent(s)” box.
• Click the “Statistics” box, and check “estimates”, “model fit”, and “descriptives”.

The “Coefficients” Section in SPSS Output
The “Coefficients” section in the SPSS output contains all the info needed for the regression equation, the r statistic, and the evaluation of H0 (retain or reject).
The constant is the “b” in Y = mX + b. Here, b = -9923.665.
The slope is the “m” in Y = mX + b. Here, m = 1807.836.
So, our regression equation is Y = mX + b, or Y = 1807.836X - 9923.665.
The r statistic is the standardized coefficient, Beta (this holds when there is a single predictor). r = .705
Lastly, we look at the ‘sig’ value for the predictor (which is “EDU” in this case) to determine whether the predictor (x-axis variable) is significantly correlated with the criterion (y-axis variable). Evaluate H0: …do we retain or reject?

Part 4 Regression Analysis: Excel

• Correlation and regression are very similar.
• If we have a significant correlation, the best-fitting regression line is said to have a slope significantly different from zero.
• Sometimes it is stated that “the slope departs significantly from zero”.

• Note: A slope can be very modestly different from zero, and still be “statistically significant” if all data points fall very close to the line.
• In correlation and regression, statistical significance is determined by the strength of the correlation between two variables (the r-value), and NOT by the slope of the regression line.
• The significance of the r-value, as always, depends on the alpha level, and the df (which is n-2). Take a peek at the r-value table.

• Remember: The regression line (equation) can help us predict one score, given another score, but only if there is a significant r-value.
• The terminology would be… “the regression line explains (or accounts for) 42% of the variability in the scores” (if r-squared = .42).
• To “explain” or “account for” does NOT mean “to cause”. Correlation does not imply causation!

Regression Analysis Continued

• A synonym for regression is prediction! Recall that prediction is one of the four goals of the scientific method. What were the others?
• A significant correlation implies a significant capacity for prediction, i.e., a prediction that is reliably better than chance!

• The equation for a straight line, again, is: y = mx + b, or Criterion = (slope * Predictor) + Intercept
• How many “parameters” in a linear equation?
• How about a quadratic equation?

Part 5 Independent Predictors

• So far, we’ve attempted to use regression for prediction.
• Specifically, we’ve tried to predict one variable Y (called the criterion), using one other variable (called the predictor).
• Multiple Regression - the process by which one variable Y (called the criterion) is predicted on the basis of more than one variable (say, X1, X2, X3…).

Here’s the simple case of one predictor variable. The overlap (in gray) indicates the predictive strength.
If the overlap in the Venn diagram were to grow, the r-value would grow, too!

[Venn diagram: Variable X1, Criterion (Y)]
Here’s the same thing again… but we’ll call the predictor variable X1.

[Venn diagram: Variable X1, Variable X2, Criterion (Y)]
By adding another predictor variable X2, we could sharpen our predictions. Why?
Unfortunately, X1 and X2 provide some redundant information about Y, so the predictive increase is small.
[Venn diagram: Variable X1, Variable X2, Variable X3, Criterion (Y)]
By contrast, variable X3 has no overlap with either X1 or X2, so it would add the most new information.
In short, since all three predictors provide some unique information, predictions would be best when using all three.
If you wanted to be more parsimonious and use only two of the three, which two would you pick, and why?

• That was a conceptual introduction to Multiple Regression (predicting Y scores from more than one variable).
• We will not learn about the computations for multiple regression in this course (but you will if you take the PSYCH 370 course).
• For our purposes, simply know that predictions improve to the extent that the various predictors are independent of each other.
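
Worked example (not part of the original slides): the manual computation from Part 2, slope = SPxy / SSx and intercept = mean of Y minus slope times mean of X, can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `fit_line` and the five (X, Y) pairs are made up for illustration, and the r-value is computed with the standard Pearson formula r = SPxy / sqrt(SSx * SSy), which the slides use but do not write out.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return (slope, intercept, r) for paired scores xs and ys.

    Hypothetical helper for illustration; it follows the SPxy / SSx
    recipe from Part 2 of the lecture.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of the deviations from each mean (SPxy).
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for X and for Y (SSx, SSy).
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    slope = sp_xy / ss_x
    intercept = mean_y - slope * mean_x
    # Standard Pearson correlation (an assumption; not spelled out on the slides).
    r = sp_xy / sqrt(ss_x * ss_y)
    return slope, intercept, r


# Made-up data: five pairs of scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r = fit_line(xs, ys)
r_squared = r ** 2   # coefficient of determination
df = len(xs) - 2     # degrees of freedom: n pairs minus 2 estimated parameters

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"r = {r:.3f}, r-squared = {r_squared:.3f}, df = {df}")
```

For these made-up scores, SPxy = 6 and SSx = 10, so the slope is 0.6 and the intercept is 4 - (0.6 * 3) = 2.2. Whether the resulting r is significant would still have to be checked against the r-value table at df = 3.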
