Download Normal - Madison Public Schools

AP Statistics
Notes 12.1
The population (true) regression line is µy = α + βx where µy is the mean response for any fixed value of
x, α is the intercept, and β is the slope. (α and β in this setting are in no way related to Type I and Type II
probabilities!)
As we’ve been doing in previous chapters, we will construct a confidence interval to estimate the slope β
of the population (true) regression line. This is what you need to know:
t Confidence Interval for the Slope of an LSRL:
b ± t*SEb
Standard error of the slope:
$SE_b = \dfrac{s}{s_x\sqrt{n-1}}$
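The standard error formula, SE_b = s / (s_x · sqrt(n − 1)), can be sketched in Python as follows. This is only an illustration: the function name `slope_standard_error` and the small data set are hypothetical and do not come from these notes. Here s is the standard deviation of the residuals (computed with n − 2 in the denominator) and s_x is the sample standard deviation of the x-values.

```python
import math
import statistics

def slope_standard_error(x, y):
    """Return (b, s, SE_b) for the least-squares line of y on x."""
    n = len(x)
    x_bar = statistics.mean(x)
    y_bar = statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope of the LSRL
    a = y_bar - b * x_bar              # y-intercept of the LSRL
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Standard deviation of the residuals uses n - 2 degrees of freedom.
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    s_x = statistics.stdev(x)          # sample standard deviation of x
    se_b = s / (s_x * math.sqrt(n - 1))
    return b, s, se_b

# Hypothetical data, invented only to exercise the function.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
b, s, se_b = slope_standard_error(x, y)
print(f"b = {b:.4f}, s = {s:.4f}, SE_b = {se_b:.4f}")
```

Because s_x · sqrt(n − 1) equals the square root of the sum of squared deviations of x, the same value can be obtained as s divided by that square root.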
Conditions* (LINER):
Linear Examine the scatterplot to check that the overall pattern is roughly linear. Look for
curved patterns in the residual plot. Check to see that the residuals center on the
“residual = 0” line at each x-value in the residual plot.
Independent Look at how the data were produced. Random sampling and random assignment
help ensure the independence of individual observations. If sampling is done without
replacement, remember to check the 10% condition: 10n < population size.
Normal Make a stemplot, histogram, or Normal probability plot of the residuals and check for
clear skewness or other major departures from Normality.
Equal variance Look at the scatter of the residuals above and below the “residual = 0” line in
the residual plot. The amount of scatter should be roughly the same from the smallest to
the largest x-value.
Random See if the data were produced by random sampling or a randomized experiment.
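As a rough numeric companion to the residual plot described in the conditions above, the sketch below computes the residuals of a least-squares line and compares their spread for the smaller and larger x-values. It is not a formal test and does not replace looking at the plot. The helper names `lsrl_residuals` and `spread_by_half` and the data are hypothetical.

```python
import statistics

def lsrl_residuals(x, y):
    """Residuals (observed y minus predicted y) from the LSRL of y on x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def spread_by_half(x, residuals):
    """Standard deviation of the residuals for the lower and upper half of x."""
    ordered = [r for _, r in sorted(zip(x, residuals))]
    mid = len(ordered) // 2
    return statistics.stdev(ordered[:mid]), statistics.stdev(ordered[mid:])

# Hypothetical data, invented only for illustration.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
y = [80, 86, 78, 88, 75, 85, 79, 83, 74, 84, 72, 82]

res = lsrl_residuals(x, y)
low_sd, high_sd = spread_by_half(x, res)
# Similar values are consistent with the Equal variance condition;
# a large difference is a reason to look harder at the residual plot.
print(f"spread for smaller x: {low_sd:.2f}, spread for larger x: {high_sd:.2f}")
```

The residuals of a least-squares line always sum to zero overall, so the overall mean of the residuals says nothing about linearity; the pattern across x-values is what matters.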
Degrees of freedom:
n–2
* Because checking the conditions is so time-consuming, the statement “all conditions for inference have
been met” is usually given on the exam.
Examples:
Many people believe that students learn better if they sit closer to the front of the classroom. To
investigate, an AP Statistics teacher randomly assigned students to seat locations in his classroom for a
particular chapter and recorded the test score for each student at the end of the chapter. The explanatory
variable in this experiment is which row the students were assigned (row 1 is closest to the front and row
7 is the farthest away). Because there are no serious violations of the conditions, we should be safe
performing inference about the regression model in this setting.
Here is computer output for the least-squares regression analysis on the seating chart data.
Regression Analysis: Score versus Row
Predictor    Coef       SE Coef    T        P
Constant     85.706     4.239      20.22    0.000
Row          -1.1171    0.9472     -1.18    0.248

S = 10.0673    R-Sq = 4.7%    R-Sq(adj) = 1.3%
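Two quick consistency checks on this output can be sketched in Python. The first recomputes each T value as the coefficient divided by its standard error. The second is an assumption-laden aside: the notes do not state the sample size, so the sketch searches for values of n that are consistent with the reported R-Sq and R-Sq(adj), assuming the usual adjusted formula for one predictor, R-Sq(adj) = 1 − (1 − R-Sq)(n − 1)/(n − 2).

```python
# Values copied from the computer output.
coef = {"Constant": 85.706, "Row": -1.1171}
se_coef = {"Constant": 4.239, "Row": 0.9472}

# Each T entry is the coefficient divided by its standard error.
t_stats = {name: round(coef[name] / se_coef[name], 2) for name in coef}
print(t_stats)

# Assumption: adjusted R-Sq for one predictor is
# 1 - (1 - R-Sq) * (n - 1) / (n - 2). Search for sample sizes that
# reproduce the reported 1.3% after rounding to one decimal place.
r_sq = 0.047
candidates = [
    n for n in range(4, 201)
    if round(100 * (1 - (1 - r_sq) * (n - 1) / (n - 2)), 1) == 1.3
]
print(candidates)
```

If the search returns a single sample size, that value minus 2 gives the degrees of freedom to use with this output.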
1. State the equation of the least-squares regression line. Define any variables you use.
2. Interpret the slope, y-intercept (if possible), and standard deviation of the residuals.
3. Identify the standard error of the slope SEb from the computer output. Interpret this value in
context.
4. Calculate the 95% confidence interval for the true slope. Show your work.
5. Based on your interval, is there convincing evidence that seat location affects scores?
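For questions 4 and 5, the interval b ± t*SEb can be sketched in Python. The notes do not state the sample size, so the sketch assumes df = 28 (that is, n = 30), which is consistent with the reported R-Sq and R-Sq(adj) but remains an assumption. In class, t* would normally come from a t table or a calculator; the helper functions `t_pdf`, `t_cdf`, and `t_critical` below are hypothetical stand-ins that approximate it numerically with only the standard library.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf_level, df):
    """Critical value t* with central area conf_level, found by bisection."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

b = -1.1171      # slope, from the "Row" line of the computer output
se_b = 0.9472    # standard error of the slope, from the same line
df = 28          # ASSUMPTION: n = 30 students, so df = n - 2 = 28

t_star = t_critical(0.95, df)
margin = t_star * se_b
lower, upper = b - margin, b + margin
contains_zero = lower < 0 < upper

print(f"t* = {t_star:.3f}")
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
print("Interval contains 0:", contains_zero)
```

An interval that contains 0 means a slope of 0 is a plausible value for β, so the data would not give convincing evidence that seat location affects scores. That agrees with the P-value of 0.248 reported for Row.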