Chapter 14 Review Questions
1. Which of the following is NOT one of the basic assumptions that must be satisfied in order to
perform inference for regression of y on x?
(a) For each value of x, the corresponding population of y-values is normally distributed.
(b) The standard deviation σ of the population of y-values corresponding to a particular value of
x is always the same regardless of the specific value of x.
(c) The sample size (the number of paired observations (x, y) in the sample data) exceeds 30.
(d) There exists a straight line y = α + βx such that for each value of x, the mean µy of the
corresponding population of y-values lies on that straight line.
2. If the assumptions for regression inference are met, then a normal probability plot of the residuals
should be
(a) Bell shaped
(b) A group of randomly scattered points
(c) Roughly linear
(d) Clearly curved
3. If a test of hypotheses rejects H0: β = 0 in favor of the alternative hypothesis Ha: β > 0, where β is
the population regression slope, then the least-squares regression line
(a) Slopes downward and to the right when plotted on the scatterplot of paired observations (x, y)
(b) Is useful for predicting y given x (within the limits of x-values covered by the data)
(c) Can be extrapolated beyond the limits of the x-values covered by the data to predict y at any
possible x
(d) Is not useful for predicting y given x
4. Inference for regression on the population regression slope β is based on which of the following
distributions?
(a) The t distribution with n – 1 degrees of freedom
(b) The standard normal distribution
(c) The chi-square distribution with n – 1 degrees of freedom
(d) The t distribution with n – 2 degrees of freedom
5. Suppose that inference for regression is conducted on the following small data set:
x    12   14   16   18
y     2    3    5    6
The number of degrees of freedom for our test statistic is
(a) 4
(b) 3
(c) 2
(d) Inference cannot be conducted on this data set because it is too small.
(e) The answer cannot be determined from the information given.
6. In inference for regression, the statistic s represents
(a) the estimate of the standard deviation σ in the regression model.
(b) the standard deviation of the x-values in the paired observations (x, y).
(c) the estimate of the y-intercept.
(d) the standard deviation of the y-values in the paired observations (x, y).
The following information is used for questions 7-10.
The effects of a toxic pollutant on fish were examined by placing fish in a two-liter solution of water
with various concentrations of the pollutant. The time (in minutes) until the fish showed distress was
recorded, at which point the fish were removed from the container. A total of 18 different experiments
were performed. Note that the pollutant is measured on a logarithmic scale, where a change of one unit
represents a tenfold increase in the pollutant concentration. A preliminary plot of the data showed
that the relationship of time vs. log(pollution) was approximately linear. The output appears below:
SOURCE        DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PR > F
MODEL          1        2.21459712     2.21459712       5.49    0.0324
ERROR         16        6.45556062     0.40347254
CORR. TOTAL   17        8.67015774

PARAMETER     ESTIMATE    T FOR H0: PARAMETER=0    PR > |T|    STD ERROR OF ESTIMATE
INTERCEPT       7.5641                     3.82      0.0015                    1.978
LOGPOLLUT      -1.0269                    -2.34      0.0324                    0.438
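The entries in this output are tied together by a few standard identities: each mean square is its sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and each t value is an estimate divided by its standard error. The following is a minimal sketch that recomputes those quantities from the printed numbers. The variable names are illustrative only and are not part of the original output.

```python
import math

# Values copied from the regression output (variable names are illustrative).
ss_model, ss_error, ss_total = 2.21459712, 6.45556062, 8.67015774
df_model, df_error = 1, 16
est_intercept, se_intercept = 7.5641, 1.978
est_logpollut, se_logpollut = -1.0269, 0.438

# Mean square = sum of squares / degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# F statistic = model mean square / error mean square.
f_value = ms_model / ms_error

# t statistic = estimate / standard error of the estimate.
t_intercept = est_intercept / se_intercept
t_logpollut = est_logpollut / se_logpollut

# Root mean square error, the usual estimate of sigma in the regression model.
s = math.sqrt(ms_error)

print(f"MS error    = {ms_error:.8f}")
print(f"F value     = {f_value:.2f}")
print(f"t intercept = {t_intercept:.2f}")
print(f"t logpollut = {t_logpollut:.2f}")
print(f"s           = {s:.4f}")
```

With one predictor, the square of the t value for LOGPOLLUT should agree with the F value up to rounding of the printed estimate and standard error, which is why both rows report the same p-value.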
7. The fitted regression line is:
(a) ŷ = –1.03 + 7.56 x
(b) ŷ = 7.56 – 1.03 x
(c) ŷ = 3.28 – 2.34 x
(d) ŷ = 7.56 – 10.27 x
(e) ŷ = –1.03 + 75.64 x
8. A 95% confidence interval for the slope is:
(a) 7.56 ± 1.96 (1.978)
(b) –1.03 ± 1.96 (0.438)
(c) 7.56 ± 2.110 (1.978)
(d) –1.03 ± 2.110 (.438)
(e) –1.03 ± 2.120 (.438)
9. Appropriate null and alternative hypotheses to test the slope, the test statistic, and the p-value are:
(a) H0: β = 0, Ha: β ≠ 0, t = –2.34, and p-value = .0324
(b) H0: β = 0, Ha: β ≠ 0, t = 3.82, and p-value = .0007
(c) H0: β = 0, Ha: β < 0, t = –2.34, and p-value = .0324
(d) H0: β = 0, Ha: β ≠ 0, t = 3.82, and p-value = .0015
(e) H0: β = 0, Ha: β < 0, t = –2.34, and p-value = .0162
10. Which of the following is a reasonable conclusion?
(a) There is a positive linear relationship between log(pollutant) concentration and time to distress.
(b) There is a negative linear relationship between log(pollutant) concentration and time to distress.
(c) There is no identifiable relationship between log(pollutant) concentration and time to distress.
(d) The variation in time to distress is due to sampling variability and is independent of the pollutant.
(e) The sample size is too small to allow reasonable statistical inference.
Answers: 1. C, 2. C, 3. B, 4. D, 5. C, 6. A, 7. B, 8. E, 9. E, 10. B
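
The numerical answers to questions 5, 8, and 9 can be checked with a short calculation. The sketch below is one way to do it using only the Python standard library. The helpers `t_pdf`, `t_cdf`, and `t_quantile` are hypothetical names written for this illustration (numerical integration plus bisection); in practice a statistics library would normally supply these functions.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value q with P(T <= q) = p, by bisection (assumes 0.5 <= p < 1)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Question 5: four paired observations, two estimated parameters.
n_small = 4
df_small = n_small - 2

# Questions 8 and 9: values copied from the regression output (18 experiments).
n, slope, se_slope = 18, -1.0269, 0.438
df = n - 2

# 95% confidence interval for the slope: estimate +/- t* x standard error.
t_star = t_quantile(0.975, df)
ci_low = slope - t_star * se_slope
ci_high = slope + t_star * se_slope

# One-sided test of H0: beta = 0 against Ha: beta < 0 uses the lower tail.
t_stat = slope / se_slope
p_one_sided = t_cdf(t_stat, df)

print(f"Q5 degrees of freedom: {df_small}")
print(f"Q8 t* with {df} df: {t_star:.3f}")
print(f"Q8 interval: ({ci_low:.3f}, {ci_high:.3f})")
print(f"Q9 t = {t_stat:.2f}, one-sided p-value = {p_one_sided:.4f}")
```

The one-sided p-value is half of the two-sided value printed under PR > |T|, because the observed t statistic falls in the direction specified by the alternative hypothesis.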