Unit 4: Inferences about a Single Quantitative Predictor

Unit Organization
First consider details of the simplest model (one parameter estimate; mean-only model; no X's). Next examine simple regression (two parameter estimates; one quantitative predictor variable X). These provide a critical foundation for all linear models. Subsequent units will generalize to one dichotomous predictor variable (Unit 5; Markus), multiple predictor variables (Units 6-7), and beyond.

Linear Models as Models
Linear models (including regression) are 'models': DATA = MODEL + ERROR
Three general uses for models:
• Describe and summarize DATA (Ys) in a simpler form using the MODEL.
• Predict DATA (Ys) from the MODEL. We will want to know the precision of prediction: how big is the error? Better prediction means less error.
• Understand (test inferences about) complex relationships between individual regressors (Xs) in the MODEL and the DATA (Ys). How precise are the estimates of these relationships?
MODELS are simplifications of reality. As such, there is ERROR. Models also make assumptions that must be evaluated.

Fear-Potentiated Startle (FPS)
We are interested in producing anxiety in the laboratory. To do this, we develop a procedure in which we expose people to periods of unpredictable electric shock administration alternating with periods of safety. We measure their startle response in the shock and safe periods, and we use the difference between their startle during shock vs. safe periods to determine if they are anxious. This difference is called fear-potentiated startle (FPS). Our procedure works if FPS > 0. We need a model of FPS scores to determine if FPS > 0.

Fear-Potentiated Startle: One-Parameter Model
A very simple model for the population of FPS scores would predict the same value for everyone in the population:
Ŷᵢ = β₀
We would like this value to be the "best" prediction. In the context of DATA = MODEL + ERROR, how can we quantify "best"?
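The notion of "best" can be made concrete numerically: different error criteria pick different single-value predictions. A minimal Python sketch on made-up toy scores (not the study data), showing that the mean minimizes squared error while the median minimizes absolute error:

```python
# Hypothetical toy scores; illustration only, not the FPS dataset.
scores = [-5.0, 1.0, 2.0, 3.0, 40.0]

def sse(pred):
    """Sum of squared errors when predicting a single value for everyone."""
    return sum((y - pred) ** 2 for y in scores)

def sae(pred):
    """Sum of absolute errors when predicting a single value for everyone."""
    return sum(abs(y - pred) for y in scores)

mean = sum(scores) / len(scores)            # 8.2
median = sorted(scores)[len(scores) // 2]   # 2.0

# Grid search over candidate predictions: the mean wins under squared
# error, the median wins under absolute error.
assert sse(mean) <= min(sse(c / 10) for c in range(-100, 500))
assert sae(median) <= min(sae(c / 10) for c in range(-100, 500))
```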
We want to predict some characteristic of the population of FPS scores that minimizes the ERROR from our model:
ERROR = DATA − MODEL
εᵢ = Yᵢ − Ŷᵢ
There is an error (εᵢ) for each population score. How can we quantify total model error?

Total Error
The sum of errors across all scores in the population isn't ideal because positive and negative errors will tend to cancel each other out:
Σ(Yᵢ − Ŷᵢ)
The sum of the absolute value of errors could work. If we selected β₀ to minimize the sum of the absolute value of errors, β₀ would equal the median of the population:
Σ|Yᵢ − Ŷᵢ|
The sum of squared errors (SSE) could work. If we selected β₀ to minimize the sum of squared errors, β₀ would equal the mean of the population:
Σ(Yᵢ − Ŷᵢ)²

One-Parameter Model for FPS
For the moment, let's assume we prefer to minimize SSE (more on that in a moment). You should predict the population mean FPS for everyone:
Ŷᵢ = β₀ where β₀ = μ
What is the problem with this model, and how can we fix it? We don't know the population mean for FPS scores (μ). We can collect a sample from the population and use the sample mean (X̄) as an estimate of the population mean (μ). X̄ is an unbiased estimate of μ.

Model Parameter Estimation
Population model:
Ŷᵢ = β₀ where β₀ = μ
Yᵢ = β₀ + εᵢ
Estimate population parameters from the sample:
Ŷᵢ = b₀ where b₀ = X̄
Yᵢ = b₀ + eᵢ

Least Squares Criterion
In ordinary least squares (OLS) regression and other least squares linear models, the model parameter estimates (e.g., b₀) are calculated such that they minimize the sum of squared errors (SSE) in the sample in which you estimate the model:
SSE = Σ(Yᵢ − Ŷᵢ)² = Σeᵢ²

Properties of Parameter Estimates
There are three properties that make a parameter estimate attractive:
Unbiased: The mean of the sampling distribution for the parameter estimate is equal to the value of that parameter in the population.
Efficient: The sample estimates are close to the population parameter.
In other words, the narrower the sampling distribution for any specific sample size N, the more efficient the estimator. Efficient means a small SE for the parameter estimate.
Consistent: As the sample size increases, the sampling distribution becomes narrower (more efficient). Consistent means that as N increases, the SE for the parameter estimate decreases.

Least Squares Criterion
If the εᵢ are normally distributed, both the median and the mean are unbiased and consistent estimators. The variance of the sampling distribution for the mean is:
σ² / N
The variance of the sampling distribution for the median is (approximately):
πσ² / 2N
Therefore the mean is the more efficient parameter estimate. For this reason, we tend to prefer to estimate our models by minimizing the sum of squared errors.

Fear-Potentiated Startle during Threat of Shock
> setwd("C:/Users/LocalUser/Desktop/GLM")
> d = lm.readDat('4_SingleQuantitative_FPS.dat')
> str(d)
'data.frame': 96 obs. of 2 variables:
 $ BAC: num 0 0 0 0 0 0 0 0 0 0 ...
 $ FPS: num -98.098 -22.529 0.463 1.194 2.728 ...
> head(d)
     BAC         FPS
0125   0 -98.0977778
0013   0 -22.5285000
0113   0   0.4632944
0116   0   1.1943667
0111   0   2.7280444
0014   0   6.7237833

> some(d)
        BAC        FPS
0111 0.0000   2.728044
1121 0.0235  43.901667
1126 0.0395  14.181344
1113 0.0495  53.176722
1124 0.0580  11.859050
1112 0.0605  45.181778
2112 0.0730 162.736611
2016 0.0750  30.453111
2023 0.0925  19.598722
3112 0.1085  14.603611

Descriptives and Univariate Plots
> lm.describeData(d)
    var  n  mean    sd median   min    max  skew kurtosis
BAC   1 96  0.06  0.04   0.06   0.0   0.14 -0.09    -1.09
FPS   2 96 32.19 37.54  19.46 -98.1 162.74  0.62     1.93
> windows()  # on a Mac, use quartz()
> par('cex' = 1.5, 'lwd' = 2)
> hist(d$FPS)

FPS Experiment: The Inference Details
Goal: Determine if our shock threat procedure is effective at potentiating startle (increasing startle during threat relative to safe).
• Create a simple model of FPS scores in the population: FPSᵢ = β₀
• Collect a sample of N = 96 to estimate β₀
• Calculate the sample parameter estimate (b₀) that minimizes SSE in the sample
• Use b₀ to test the hypotheses H0: β₀ = 0 vs. Ha: β₀ ≠ 0

Estimating a One-Parameter Model in R
> m = lm(FPS ~ 1, data = d)
> summary(m)

Call:
lm(formula = FPS ~ 1, data = d)

Residuals:
    Min      1Q  Median      3Q     Max
-130.29  -25.40  -12.73   18.27  130.55

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   32.191      3.832   8.402 4.26e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 37.54 on 95 degrees of freedom

Errors/Residuals
The Residuals section of the summary(m) output is simple descriptive information about the errors (eᵢ):
eᵢ = Yᵢ − Ŷᵢ
R can report the error for each individual in the sample:
> residuals(m)
        0125         0013         0113         0116         0111         0014
-130.2886183  -54.7193405  -31.7275460  -30.9964738  -29.4627960  -25.4670572
[output truncated; one residual for each of the 96 participants]
You can get the SSE easily:
> sum(residuals(m)^2)
[1] 133888.3

Standard Error of Estimate
Residual standard error: 37.54 on 95 degrees of freedom
This line of the summary(m) output is the standard error of estimate. It is an estimate of the standard deviation of εᵢ:
standard error of estimate = √( Σ(Yᵢ − Ŷᵢ)² / (N − P) ) = √( SSE / (N − P) )
NOTE: for the mean-only model, this is s_Y.

Coefficients (Parameter Estimates)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   32.191      3.832   8.402 4.26e-13 ***
This row of the summary(m) output is b₀, the unbiased sample estimate of β₀, and its standard error. It is also called the intercept in regression (more on this later).
Ŷᵢ = b₀ = 32.2
> coef(m)
(Intercept)
   32.19084

Predicted Values
Ŷᵢ = 32.19
You can get the predicted value for each individual in the sample using this model:
> fitted.values(m)
    0125     0013     0113     0116     0111     0014
32.19084 32.19084 32.19084 32.19084 32.19084 32.19084
[output truncated; the mean-only model predicts the same value, 32.19084, for all 96 participants]

Testing Inferences about β₀
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   32.191      3.832   8.402 4.26e-13 ***
This is the t-statistic to test the H0 that β₀ = 0. The probability (p-value) of obtaining a sample b₀ = 32.2 if H0 is true (β₀ = 0) is < .0001. Describe the logic of how this was determined.

Sampling Distribution: Testing Inferences about β₀
H0: β₀ = 0; Ha: β₀ ≠ 0
If H0 is true, the sampling distribution for b₀ will have a mean of 0. We can estimate the standard deviation of this sampling distribution with the SE for b₀.
t(df = N − P) = (b₀ − 0) / SE(b₀) = (32.2 − 0) / 3.8 = 8.40
b₀ is approximately 8 standard deviations above the expected mean of the distribution if H0 is true.
> pt(8.40, 95, lower.tail = FALSE) * 2
[1] 4.293253e-13
The probability of obtaining a sample b₀ = 32.2 if H0 is true is very low (< .05). Therefore we reject H0 and conclude that β₀ ≠ 0, and that b₀ is our best (unbiased) estimate of it.

Statistical Inference and Model Comparisons
Statistical inference about parameters is fundamentally about model comparisons. You are implicitly (t-test of a parameter estimate) or explicitly (F-test of a model comparison) comparing two different models of your data. We follow Judd et al. and call these two models the compact model and the augmented model.
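The t arithmetic from the sampling-distribution slide can be checked by hand. A short Python sketch using the values printed by summary(m), including the link SE(b₀) = s / √N for the mean-only model:

```python
import math

# Values reported by summary(m) for the mean-only model.
b0 = 32.191      # parameter estimate (sample mean of FPS)
se_b0 = 3.832    # standard error of b0
s = 37.54        # residual standard error (standard error of estimate)
n, p = 96, 1     # sample size and number of estimated parameters

t = (b0 - 0) / se_b0              # t-statistic under H0: beta0 = 0
df = n - p                        # degrees of freedom = 95
print(round(t, 2), df)            # ~8.40 with 95 df

# For the mean-only model, SE(b0) is just s / sqrt(N).
print(round(s / math.sqrt(n), 3)) # ~3.832, matching the printed SE
```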
The compact model represents reality as the null hypothesis predicts. The augmented model represents reality as the alternative hypothesis predicts. The compact model is simpler (fewer parameters) than the augmented model. It is also nested in the augmented model (i.e., its parameters are a subset of the augmented model's).

Model Comparisons: Testing Inferences about β₀
H0: β₀ = 0; Ha: β₀ ≠ 0
Compact model: Ŷᵢ = 0. We estimate no parameters (P = 0) in this compact model.
Augmented model: Ŷᵢ = β₀ (estimated by b₀). We estimate one parameter (P = 1) in this augmented model.
Choosing between these two models is equivalent to testing if β₀ = 0, as you did with the t-test.

Model Comparisons: Testing Inferences about β₀
We can compare (and choose between) these two models by comparing their total error (SSE) in our sample:
SSE = Σ(Yᵢ − Ŷᵢ)²
SSE(C) = Σ(Yᵢ − 0)²
> sum((d$FPS - 0)^2)
[1] 233368.3
SSE(A) = Σ(Yᵢ − 32.19)²
> sum((d$FPS - coef(m)[1])^2)  # same as sum(residuals(m)^2)
[1] 133888.3

Model Comparisons: Testing Inferences about β₀
Compact model: Ŷᵢ = 0; SSE(C) = 233,368.3; P(C) = 0
Augmented model: Ŷᵢ = β₀ (b₀); SSE(A) = 133,888.3; P(A) = 1
F(P_A − P_C, N − P_A) = [ (SSE(C) − SSE(A)) / (P_A − P_C) ] / [ SSE(A) / (N − P_A) ]
F(1 − 0, 96 − 1) = [ (233368.3 − 133888.3) / (1 − 0) ] / [ 133888.3 / (96 − 1) ]
F(1, 95) = 70.59, p < .0001
> pf(70.58573, 1, 95, lower.tail = FALSE)
[1] 4.261256e-13

Effect Sizes
Your parameter estimates are descriptive. They describe effects in the original units of the predictors (IVs) and DV. Report them in your paper. There are many other effect size estimates available. You will learn two that we prefer:
• Partial eta² (ηp²): Judd et al. call this PRE (proportional reduction in error)
• Eta² (η²): This is also commonly referred to as R² in regression

Sampling Distribution vs. Model Comparison
The two approaches to testing H0 about parameters (β₀, βⱼ) are statistically equivalent. They are complementary approaches with respect to conceptual understanding of GLMs.
Sampling distribution:
• Focus on population parameters and their estimates
• Tight connection to sampling and probability distributions
• Understanding of SE (sampling error/power; confidence intervals; graphic displays)
Model comparison:
• Focus on the models themselves
• Highlights model fit (SSE) and model parsimony (P)
• Clearer link to PRE (ηp²)
• Can test comparisons that differ by more than one parameter (discouraged)

Partial Eta² or PRE
Compact model: Ŷᵢ = 0; SSE = 233,368.3; P = 0
Augmented model: Ŷᵢ = β₀ (b₀); SSE = 133,888.3; P = 1
How much was the error reduced in the augmented model relative to the compact model?
PRE = (SSE(C) − SSE(A)) / SSE(C) = (233,368.3 − 133,888.3) / 233,368.3 = .426
Our more complex model that includes β₀ reduces prediction error (SSE) by approximately 43%. Not bad!

Confidence Interval for b₀
A confidence interval (CI) is an interval for a parameter estimate in which you can be fairly confident that you will capture the true population parameter (in this case, β₀). Most commonly reported is the 95% CI. Across repeated samples, 95% of the calculated CIs will include the population parameter.
> confint(m)
               2.5 %   97.5 %
(Intercept) 24.58426 39.79742
Given what you now know about confidence intervals and sampling distributions, what should the formula be?
CI(b₀) = b₀ ± t(α; N − P) * SE(b₀)
For the 95% confidence interval this is approximately ± 2 SEs around our unbiased estimate of β₀.

Confidence Interval for b₀
How can we tell if a parameter is "significant" from the confidence interval? If a parameter ≠ 0 at α = .05, then the 95% confidence interval for its parameter estimate should not include 0.
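The PRE and CI arithmetic can be reproduced by hand. A short Python sketch using the slide values; the two-tailed critical t of 1.985 for 95 df is an assumed value here, not computed:

```python
# SSEs from the compact (beta0 fixed at 0) and augmented models.
sse_c, sse_a = 233368.3, 133888.3
pre = (sse_c - sse_a) / sse_c         # proportional reduction in error
print(round(pre, 3))                  # ~0.426

b0, se_b0 = 32.191, 3.832             # estimate and SE from summary(m)
t_crit = 1.985                        # assumed critical t(.05 two-tailed, df = 95)
lo = b0 - t_crit * se_b0
hi = b0 + t_crit * se_b0
print(round(lo, 2), round(hi, 2))     # close to confint(m): 24.58, 39.80
```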
This is also true for testing whether the parameter is equal to any other non-zero value: the confidence interval should not include that value.

The One-Parameter (Mean-Only) Model: Special Case
What special case (specific analytic test) is statistically equivalent to the test of the null hypothesis β₀ = 0 in the one-parameter model? The one-sample t-test testing whether a population mean = 0.
> t.test(d$FPS)

        One Sample t-test

data:  d$FPS
t = 8.4015, df = 95, p-value = 4.261e-13
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 24.58426 39.79742
sample estimates:
mean of x
 32.19084

Testing β₀ = Non-Zero Values
How could you test an H0 that β₀ equals some value other than 0 (e.g., 10)? HINT: There are at least three methods.
• Option 1: Compare SSE for the augmented model (Ŷᵢ = β₀) to SSE from a different compact model for this new H0 (Ŷᵢ = 10).
• Option 2: Recalculate the t-statistic using this new H0: t = (b₀ − 10) / SE(b₀)
• Option 3: Check whether the confidence interval for the parameter estimate contains this other value. (No p-value provided.)
> confint(m)
               2.5 %   97.5 %
(Intercept) 24.58426 39.79742

Intermission…
One-parameter (β₀) "mean-only" model:
• Description: b₀ describes the mean of Y
• Prediction: b₀ is the predicted value that minimizes sample SSE
• Inference: Use b₀ to test if β₀ = 0 (default) or any other value. One-sample t-test.
Two-parameter (β₀, β₁) model:
• Description: b₁ describes how Y changes as a function of X₁. b₀ describes the expected value of Y at a specific value (0) of X₁.
• Prediction: b₀ and b₁ yield predicted values that vary by X₁ and minimize SSE in the sample.
• Inference: Test if β₁ = 0 (Pearson's r; independent-sample t-test). Test if β₀ = 0 (analogous to a one-sample t-test controlling for X₁, if X₁ is mean-centered). Very flexible!

Two-Parameter (One-Predictor) Models
We started with a very simple model of FPS: FPSᵢ = β₀. What if some participants were drunk and we knew their blood alcohol concentrations (BAC)? Would it help?
What would the model look like? What question(s) does this model allow us to test? Think about it.

The Two-Parameter Model
DATA = MODEL + ERROR
Yᵢ = β₀ + β₁X₁ᵢ + εᵢ
Ŷᵢ = β₀ + β₁X₁ᵢ
εᵢ = Yᵢ − Ŷᵢ
For our example (Y = FPS): Ŷᵢ = β₀ + β₁ * BACᵢ

The Two-Parameter Model
Ŷᵢ = β₀ + β₁X₁ᵢ
As before, the population parameters in the model (β₀, β₁) are estimated by b₀ and b₁, calculated from sample data based on the least squares criterion such that they minimize SSE in the sample data.
Sample model: Ŷᵢ = b₀ + b₁X₁ᵢ
To derive these parameter estimates you must solve a series of simultaneous equations using linear algebra and matrices (see supplemental reading). Or use R!

Least Squares Criterion
eᵢ = Yᵢ − Ŷᵢ
SSE = Σeᵢ²

Interpretation of b₀ in the Two-Parameter Model
Ŷᵢ = b₀ + b₁X₁ᵢ
b₀ is the predicted value for Y when X₁ = 0. Graphically, this is the Y-intercept for the regression line (the value of Y where the regression line crosses the Y-axis at X₁ = 0). Approximately what is b₀ in this example? 42.5.

Interpretation of b₀ in the Two-Parameter Model
IMPORTANT: Notice that b₀ is very different in the two-parameter model (42.5) than in the previous one-parameter model (32.2). WHY? In the one-parameter model, b₀ was our sample estimate of the mean FPS score for everyone. In the two-parameter model, b₀ is our sample estimate of the mean FPS score for people with BAC = 0, not everyone.

Interpretation of b₁ in the Two-Parameter Model
Ŷᵢ = b₀ + b₁X₁ᵢ
b₁ is the predicted change in Y for every one-unit change in X₁. Graphically, it is represented by the slope of the regression line. If you understand the units of your predictor and DV, this is an attractive description of their relationship.
Ŷᵢ = 42.5 + (−184.1) * BACᵢ
For every 1% increase in BAC, FPS decreases by 184.1 microvolts. For every .01% increase in BAC, FPS decreases by 1.841 microvolts.

Testing Inferences about β₁
Does alcohol affect people's anxiety?
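Before turning to inference about β₁, note that for the one-predictor case the least-squares estimates have a simple closed form: b₁ = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b₀ = ȳ − b₁x̄. A Python sketch on made-up numbers (not the study dataset):

```python
# Hypothetical (x, y) pairs standing in for (BAC, FPS); illustration only.
xs = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10]
ys = [40.0, 38.5, 35.0, 30.0, 28.5, 22.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Closed-form least-squares estimates for simple regression.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
b1 = sxy / sxx             # slope: predicted change in y per unit x
b0 = y_bar - b1 * x_bar    # intercept: predicted y at x = 0

# Fitted values and SSE, as in DATA = MODEL + ERROR.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
print(round(b0, 2), round(b1, 2), round(sse, 2))
```

These are the same estimates lm(y ~ x) would return; for this toy data the slope comes out negative, mirroring the BAC example.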
Ŷᵢ = β₀ + β₁ * BACᵢ
What are your null and alternative hypotheses about the model parameter to evaluate this question?
H0: β₁ = 0; Ha: β₁ ≠ 0
If β₁ = 0, FPS does not change with changes in BAC; in other words, there is no effect of BAC on FPS. If β₁ < 0, FPS decreases with increasing BAC (people are less anxious when drunk). If β₁ > 0, FPS increases with increasing BAC (people are more anxious when drunk).

Estimating a Two-Parameter Model in R
> m2 = lm(FPS ~ BAC, data = d)
> summary(m2)

Call:
lm(formula = FPS ~ BAC, data = d)

Residuals:
     Min       1Q   Median       3Q      Max
-140.555  -21.565   -8.289   15.638  133.718

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 37.02 on 94 degrees of freedom
Multiple R-squared: 0.03773, Adjusted R-squared: 0.02749
F-statistic: 3.685 on 1 and 94 DF, p-value: 0.05792

Testing Inferences about β₁
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .
Does BAC affect FPS? Explain this conclusion in terms of the parameter estimate b₁ and its standard error. Under the H0: β₁ = 0, the sampling distribution for b₁ will have a mean of 0 with an estimated standard deviation of 95.894.
t(df = N − P = 96 − 2 = 94) = (−184.092 − 0) / 95.894 = −1.92
This value of the parameter estimate, b₁, is 1.92 standard deviations below the expected mean of the sampling distribution under H0.
> pt(-1.92, 94, lower.tail = TRUE) * 2
[1] 0.05788984
A b₁ of this size is not sufficiently unlikely under the null; therefore you fail to reject the null and conclude that BAC has no demonstrated effect on FPS.

Testing Inferences about β₁
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .
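The BAC row of this table can be checked by hand, and for a single-df test the model F is simply the square of the t. A quick Python check of the printed values:

```python
est, se = -184.092, 95.894     # BAC row of the summary(m2) coefficient table
t = (est - 0) / se             # t-statistic under H0: beta1 = 0
print(round(t, 2))             # ~ -1.92

# For a test of one parameter, the model-comparison F equals t squared.
f = t ** 2
print(round(f, 2))             # ~3.69, matching the reported F-statistic
```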
One-tailed p-value:
> pt(-1.92, 94, lower.tail = TRUE)
[1] 0.02894492
Two-tailed p-value (for H0: β₁ = 0; Ha: β₁ ≠ 0):
> pt(-1.92, 94, lower.tail = TRUE) * 2
[1] 0.05788984

Model Comparison: Testing Inferences about β₁
H0: β₁ = 0; Ha: β₁ ≠ 0
What two models are you comparing when you test hypotheses about β₁? Describe the logic.
Compact model: Ŷᵢ = β₀ + 0 * BACᵢ; P(C) = 1; SSE(C) = 133888.3
Augmented model: Ŷᵢ = β₀ + β₁ * BACᵢ; P(A) = 2; SSE(A) = 128837.1
F(P_A − P_C, N − P_A) = [ (SSE(C) − SSE(A)) / (P_A − P_C) ] / [ SSE(A) / (N − P_A) ]
F(1, 94) = 3.685383, p = 0.05792374

Sum of Squared Errors
If there is a perfect relationship between X₁ and Y in your sample, what will the SSE be in the two-parameter model, and why? SSE(A) = 0: all data points will fall perfectly on the regression line, so all errors will be 0.
If there is no relationship at all between X₁ and Y in your sample (b₁ = 0), what will the SSE be in the two-parameter model, and why? SSE(A) = SSE of the mean-only model: X₁ provides no additional information about the DV, so your best prediction will still be the mean of the DV.

Testing Inferences about β₀
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   42.457      6.548   6.484 4.11e-09 ***
BAC         -184.092     95.894  -1.920   0.0579 .
What is the interpretation of b₀ in this two-parameter model? It is the predicted FPS for a person with BAC = 0 (sober). The test of this parameter estimate could inform us whether the shock procedure worked among our sober participants. This is probably a more appropriate manipulation check than testing whether it worked in everyone, including drunk people, given that alcohol could have reduced FPS.
What two models are being compared?
Compact model: Ŷᵢ = 0 + β₁ * BACᵢ
Augmented model: Ŷᵢ = β₀ + β₁ * BACᵢ

Confidence Interval for bⱼ or b₀
You can provide confidence intervals for each parameter estimate in your model.
> confint(m2)
                 2.5 %    97.5 %
(Intercept)   29.45597 55.457721
BAC         -374.49261  6.308724
The underlying logic from your understanding of sampling distributions remains the same:
CI(b) = b ± t(α; N − P) * SE(b), where P = the total number of parameters.
How can we tell if a parameter is "significant" from the confidence interval? If a parameter ≠ 0 at α = .05, then the 95% confidence interval for its estimate should not include 0. The same holds for testing against any other value of b.

Partial Eta² or PRE for β₁
How can you calculate the effect size estimate partial eta² (PRE) for β₁? Compare the SSE across the two relevant models.
Compact model: Ŷᵢ = β₀ + 0 * BACᵢ; SSE(C) = 133888.3
Augmented model: Ŷᵢ = β₀ + β₁ * BACᵢ; SSE(A) = 128837.1
PRE = (SSE(C) − SSE(A)) / SSE(C) = (133888.3 − 128837.1) / 133888.3 = 0.038
Our augmented model that includes a non-zero effect for BAC reduces prediction error (SSE) by only 3.8% over the compact model that fixes this parameter at 0.

Partial Eta² or PRE for β₀
How can you calculate the effect size estimate partial eta² (PRE) for β₀? Compare the SSE across the two relevant models.
Compact model: Ŷᵢ = 0 + β₁ * BACᵢ; SSE(C) = 186462.4
Augmented model: Ŷᵢ = β₀ + β₁ * BACᵢ; SSE(A) = 128837.1
PRE = (SSE(C) − SSE(A)) / SSE(C) = (186462.4 − 128837.1) / 186462.4 = 0.309
Our augmented model that allows predicted FPS to be non-zero for people with BAC = 0 (sober people) reduces prediction error (SSE) by 30.9% relative to the model that fixes predicted FPS at 0 when BAC = 0!

Coefficient of Determination (R²)
Coefficient of Determination (R²): the proportion of explained variance (i.e., the proportion of variance in Y accounted for by all Xs in the model).
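The two PRE calculations follow one formula; a small Python helper makes them mechanical, using the SSE values from the slides:

```python
def pre(sse_compact, sse_augmented):
    """Proportional reduction in error (partial eta-squared / PRE)."""
    return (sse_compact - sse_augmented) / sse_compact

# PRE for beta1: compact model fixes beta1 = 0.
print(round(pre(133888.3, 128837.1), 3))   # ~0.038

# PRE for beta0: compact model fixes beta0 = 0.
print(round(pre(186462.4, 128837.1), 3))   # ~0.309
```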
DATA = MODEL + ERROR
For individuals: Yᵢ = Ŷᵢ + eᵢ
With respect to variances: s²(Yᵢ) = s²(Ŷᵢ) + s²(eᵢ)
R² = s²(Ŷᵢ) / s²(Yᵢ)
> var(fitted.values(m2)) / var(d$FPS)
[1] 0.03772707

Coefficient of Determination (R²)
> summary(m2)
...
Residual standard error: 37.02 on 94 degrees of freedom
Multiple R-squared: 0.03773, Adjusted R-squared: 0.02749
F-statistic: 3.685 on 1 and 94 DF, p-value: 0.05792

R² and the Mean-Only Model
Why did the mean-only model not have an R²? It explained no variance in Yᵢ because it predicted the same value (the mean) for every person; the variance of the predicted values is 0 in the mean-only model. In fact, the SSE for the mean-only model is the numerator of the formula for the variance of Yᵢ:
SSE = Σ(Yᵢ − Ȳ)²
s² = Σ(Yᵢ − Ȳ)² / (N − 1)

R² and the Mean-Only Model
The mean-only model is used in an alternative conceptualization of R² for any augmented model:
R² = (SSE(mean-only) − SSE(A)) / SSE(mean-only)
Mean-only model: Ŷᵢ = β₀; SSE(mean-only) = 133888.3
Augmented model: Ŷᵢ = β₀ + β₁ * BACᵢ; SSE(A) = 128837.1
R² = (133888.3 − 128837.1) / 133888.3 = 0.03773
In this augmented model, R² is fully accounted for by BAC. In more complex models, R² will be the aggregate of multiple predictors.

Test of β₁ in the Two-Parameter Model: Special Case
When both the predictor variable and the dependent variable are quantitative, the test of β₁ = 0 is statistically equivalent to what other common statistical test? The test of the Pearson correlation coefficient, r.
> corr.test(d$BAC, d$FPS)
Correlation matrix
[1] -0.194
Sample Size
[,1]
[1,] 96
Probability values adjusted for multiple tests.
[,1]
[1,] 0.058
Furthermore, r² = R² for this model only.
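Both routes to R² (the variance ratio and the SSE reduction relative to the mean-only model), along with the r² equivalence for simple regression, can be checked numerically with the slide values; a short Python sketch:

```python
# SSEs from the mean-only and augmented (BAC) models.
sse_mean_only, sse_aug = 133888.3, 128837.1
r2 = (sse_mean_only - sse_aug) / sse_mean_only
print(round(r2, 5))       # ~0.03773, matching Multiple R-squared

r = -0.194                # Pearson correlation of BAC and FPS
print(round(r ** 2, 3))   # ~0.038; r^2 equals R^2 only in simple regression
```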
(−.194)² = .038

Visualizing the Model
e = effect('BAC', m2)
plot(e)

Displaying Model Results: Error Bars/Bands
You are predicting the mean Y for any X. There is a sampling distribution around this mean, so the true population mean of Y at any X is uncertain. You can display this uncertainty by displaying information about the sampling distribution at any/every X. This is equivalent to error bars in ANOVA.

Error Band for Ŷᵢ
plot(d$BAC, d$FPS, xlim = c(0, .15), xlab = 'Blood Alcohol Concentration',
     ylab = 'Fear-potentiated startle')
dNew = data.frame(BAC = seq(0, .14, .0001))
pY = lm.pointEstimates(m2, dNew)
lines(dNew$BAC, pY[,1], col = 'red', lwd = 2)
lines(dNew$BAC, pY[,2], col = 'gray', lwd = .5)
lines(dNew$BAC, pY[,3], col = 'gray', lwd = .5)
effect() displays the 95% CI for Ŷᵢ. However, I prefer ± 1 SE.

Error Band for Ŷᵢ
Why are the error bands not linear? Model predictions are better (less error) near the center of your data. The regression line will always go through the mean of X and the mean of Y, so small changes in b₁ across samples will produce bigger variation in Ŷᵢ at the edges of the model (far from the mean of X).
Ŷᵢ = 42.5 + (−184.1) * BACᵢ
> confint(m2)
                 2.5 %    97.5 %
(Intercept)   29.45597 55.457721
BAC         -374.49261  6.308724

Error Band for Ŷᵢ: Compare to the SE for b₀
b₀ is simply the predicted value for Y when X = 0. We can use additive transformations of X to make tests of the predicted value at X = 0. This is most common in repeated-measures designs but is used elsewhere as well.

Publication-Quality Figure