Name:
Math 17 Section 02/ Enst 24 – Introduction to Statistics
Third Midterm Exam
PRACTICE 1
Instructions:
1. Show all work. You may receive partial credit for partially completed problems.
2. You may use calculators and a one-sided sheet of reference notes, as well as the provided tables
(t, chi-square). You may not use any other references or any texts.
3. You may not discuss the exam with anyone but me.
4. Suggestion: Read all questions before beginning and complete the ones you know best first.
Point values per problem are displayed below if that helps you allocate your time among
problems.
5. Use 4 decimal places for calculations involving proportions.
6. You MAY NOT use a calculator to do more than the standard arithmetic functions, exponents,
and square roots; i.e., you may not use t-test functions, regression functions, and the like.
7. Good luck!
Problem            1     2     3     4     Total
Points Earned
Possible Points                            50
1. A student who attends college in Atlanta,
Georgia, and flies home for the holidays decides
to investigate an airline's claim that "as distance
to the destination increases, our fare increases".
The student collects distance and fare data from
one-way flights from Atlanta to many other cities,
and then analyzes the data, generating the graph
and output shown. Use the student's work to
answer the questions below.
a. In order to investigate the airline's claim, which
variable should be the response variable?
Partial Rcmdr Output:
Coefficients:
              Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)  177.21452     19.99315     8.864   1.43e-07
distance       0.07862      0.02037     3.859    0.00139
Residual standard error: 41.82 on 16 degrees of freedom
Multiple R-squared: 0.482, Adjusted R-squared: 0.4496
b. What is the numerical value of the correlation between distance and fare? Interpret this value.
c. What is the equation of the least squares line generated by the student?
d. If reasonable, predict the fare for a flight from Atlanta to a city that is 1000 miles away. If not
reasonable, explain why not in one sentence.
e. Does the regression line fit well? Explain briefly.
f. Is there evidence to support the airline's claim that "as distance to the destination increases, our fare
increases"? Perform an appropriate test at a .01 significance level, reporting your hypotheses, test
statistic, p-value, and conclusion in context. (Assumptions will be checked below).
Null:
Alternative:
Test statistic:
p-value:
Conclusion:
g. The student generates basic diagnostic plots to help with checking the regression assumptions. For
each graph, state what assumption(s) it can be used to check, then comment on whether that
assumption checks out.
Used to check:
Comment:
Used to check:
Comment:
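Study note for Problem 1: the coefficient table can be cross-checked by hand, since each t value is the estimate divided by its standard error, and in simple linear regression the correlation has the sign of the slope and magnitude equal to the square root of R-squared. The sketch below is only an illustration in Python using the rounded values printed in the output, so results agree with the output only up to rounding. The helper name predict_fare is hypothetical, and whether a given distance falls within the observed data cannot be judged from the numbers alone.

```python
import math

# Rounded values copied from the partial Rcmdr output in Problem 1.
intercept, intercept_se = 177.21452, 19.99315
slope, slope_se = 0.07862, 0.02037
r_squared = 0.482
residual_df = 16

# Each t value is the estimate divided by its standard error.
t_intercept = intercept / intercept_se
t_slope = slope / slope_se

# In simple linear regression, r takes the sign of the slope
# and has magnitude sqrt(R-squared).
r = math.copysign(math.sqrt(r_squared), slope)

# Residual degrees of freedom are n - 2, so n can be recovered.
n = residual_df + 2


def predict_fare(distance_miles):
    """Hypothetical helper: fitted fare from the least squares line.

    Only meaningful for distances inside the range of the observed data.
    """
    return intercept + slope * distance_miles


print("t (intercept):", round(t_intercept, 3))
print("t (distance): ", round(t_slope, 3))
print("correlation r:", round(r, 4))
print("sample size n:", n)
```

Because the inputs are already rounded, the recomputed t value for distance can differ from the printed 3.859 in the last digit.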
2. A study was conducted to assess the effectiveness of a new antibiotic treatment for strep throat in
children. Children 6 to 14 who met the entry criteria were randomized to one of three treatment
groups. Group 1 was given standard treatment 1, group 2 was given standard treatment 2, and the last
group (group 3) was given the new antibiotic treatment. (The 2 standard treatments are different.) The
response measured on each child was the number of days to cure the strep infection. Use the partial
ANOVA table provided to answer the questions below.
Source        SS        Df     MS       F      p-value
Treatment     38.364    ____   19.182   ____   .027
Residuals     141.273   30     4.709    -      -
Total         179.636   32     -        -      -
a. This output would be generated in order to test what set of hypotheses?
Null:
Alternative:
Assume for the parts below that the ANOVA conditions are satisfied.
b. Provide the missing value of the treatment df and the F test statistic.
c. What is your best estimate of the common population variance assumed by the ANOVA? (Provide a
numerical value.)
d. Using a .05 significance level, what is your conclusion (in context) for this ANOVA?
e. The following pairwise confidence intervals were generated using Tukey's multiple comparisons
methods. If appropriate, use the intervals to summarize the differences. If not appropriate, explain why
not.
        Estimate    Lwr      Upper
2-1       1.45      -.83      3.74
3-1      -1.18     -3.46      1.10
3-2      -2.64     -4.92      -.36
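Study note for Problem 2: the entries of a one-way ANOVA table are tied together by simple arithmetic (degrees of freedom add up to the total, each mean square is a sum of squares divided by its df, and F is the ratio of the two mean squares). The sketch below is only an illustration in Python using the rounded values from the partial table, so the checks hold only up to rounding. It also flags which Tukey intervals exclude zero, using the interval endpoints as given.

```python
# Rounded values copied from the partial ANOVA table in Problem 2.
ss_treatment, ss_residual, ss_total = 38.364, 141.273, 179.636
df_residual, df_total = 30, 32
ms_treatment, ms_residual = 19.182, 4.709

# Degrees of freedom add: df_treatment + df_residual = df_total.
df_treatment = df_total - df_residual

# F is the treatment mean square over the residual mean square.
f_stat = ms_treatment / ms_residual

# Consistency checks (up to rounding in the printed table).
ss_gap = abs(ss_treatment + ss_residual - ss_total)
ms_treatment_check = ss_treatment / df_treatment
ms_residual_check = ss_residual / df_residual

# Tukey intervals from part e: (lower, upper) for each pairwise difference.
tukey = {"2-1": (-0.83, 3.74), "3-1": (-3.46, 1.10), "3-2": (-4.92, -0.36)}
excludes_zero = {pair: not (lo <= 0 <= hi) for pair, (lo, hi) in tukey.items()}

print("treatment df:", df_treatment)
print("F statistic: ", round(f_stat, 4))
print("intervals excluding zero:", excludes_zero)
```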
3. A random sample of 337 college students was asked whether or not they were registered to vote. We
wonder if there is an association between a student's sex and whether the student is registered to vote.
The data collected is provided in the table below. Use the table to address the following questions.
                  Men    Women    Total
Registered        104    147      251
Not Registered    33     53       86
Total             137    200      337
a. If a student is chosen at random from this sample, what is the probability that the student is
male?
b. If a female student is chosen at random from this sample, what is the probability that the
student is not registered to vote?
c. What test should you perform to determine if there is an association between sex and whether or not
a student is registered to vote? (Be specific.)
d. Determine the expected counts for your chosen test and write them in the table in parentheses after
the observed counts.
e. State and check the conditions necessary for your chosen test.
f. The chi-square test statistic is .249. What distribution does the test statistic have assuming the null
hypothesis is true?
g. What can you say about the p-value for your test?
h. State the conclusion for your test (in context).
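Study note for Problem 3: for a two-way table, each expected count is (row total × column total) / grand total, and the chi-square statistic sums (observed − expected)² / expected over all cells. The sketch below is only an illustration in Python using the observed counts from the table; the variable names are chosen for this example.

```python
# Observed counts copied from the table in Problem 3.
# Rows: Registered, Not Registered. Columns: Men, Women.
observed = [[104, 147],
            [33, 53]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

for row in expected:
    print([round(e, 4) for e in row])
print("chi-square:", round(chi_square, 3), "on", df, "df")
```

The recomputed statistic agrees with the value .249 quoted in part f.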
4. A student wants to investigate the "famous" Fisher
iris data set and determine whether or not petal
lengths of three different iris species differ on
average. The data set contains 50 observations from
each of 3 species of iris. A boxplot of the data is
shown at right.
a. If the student performed an ANOVA, would it be
balanced or unbalanced?
b. Should the student perform an ANOVA? Explain
why or why not.
General true/false or fill-in-the-blank questions.
c. F distributions are skewed right.
True
False
d. Correlation implies causation.
True
False
e. Assume that you will perform better on an exam if you get more sleep. Then, the random variables –
exam score and sleep time – are independent.
True
False
f. One of the ANOVA assumptions is that the population of all the responses is normally distributed.
True
False
g. Correlation detects all forms of association between 2 quantitative variables.
True
False
h. If the distribution-related assumptions for ANOVA are not met, you can use a Kruskal-Wallis test,
which is an example of a __________________________ test.