PubH 6414 Fall 2011 Homework 7 (20 points)
We encourage you to work together in computing and discussing the problems.
However, each student is expected to independently write up the submitted
assignment using her or his own computing and giving explanations in her or his
own words. Identical or nearly identical homework submissions will not receive
credit.


Turn in this completed Word document in class by the homework due date.
You may use R Commander to do the calculations needed for each question. Paste in ONLY the
parts of the output needed to answer the question. (You may use another statistical software
package to do the calculations, if you prefer, but the instructor and TAs cannot provide
assistance with other packages.)
Data needed for this homework assignment are available at the following link:
http://www.biostat.umn.edu/~susant/FALL11PH6414HMK.html
Problem 1: Multiple Choice Questions. (2 points)
1. A study based on a sample of size 25 reported a mean of 76 with a margin of error of 12 for 95%
confidence. If you wanted 99% confidence, how would your margin of error change? (Underline or
highlight one.)
a. It would increase.
b. It would decrease.
c. It would stay the same.
d. Cannot be determined without more information.
2. A random sample of n= 40 from a certain population was used to estimate the population mean
weight. The 95% confidence interval limits for the population mean weight based on this sample were
(134, 160). Which of the following statements are correct (true) interpretations of this confidence
interval? (Underline or highlight all that apply.)
a. The probability that the true population mean is between 134 and 160 is 0.95.
b. Out of all possible 95% confidence intervals constructed for this population mean from
random samples of n= 40, 95% will include the true mean.
c. We are 95% confident that the true confidence interval limits of the population mean are 134
and 160.
d. We are 95% confident that the interval (134, 160) contains the true population mean.
3. 95% confidence intervals for the mean birth weight are constructed separately for a sample of boy
babies and a sample of girl babies with the following results:
95% CI for mean weight (in lbs.) of boys: (6.82, 10.84)
95% CI for mean weight (in lbs.) of girls: (5.71, 11.39)
(This is made-up data.) Which mean birth weight estimate is more precise? (Underline or highlight
one.)
a. Unable to determine since the mean and standard deviation of the birth weights for boys and
for girls are unknown.
b. Unable to determine since the sample size for each group is unknown.
c. The mean birth weight estimate for girls.
d. The mean birth weight estimate for boys.
4. If the rejection region has been chosen so that alpha = 0.05 and the test statistic is in the rejection
region then what is the p-value for the test? (Underline or highlight one.)
a. >0.05
b. <0.05
c. =0.05
d. >0.10
e. <0.10
f. =0.10
5. If we reject the null hypothesis at the 5% level, then the probability that the null hypothesis is true is
5%.
a. True
b. False
Problem 2. Quetelet Index. (5 points)
Data on the Quetelet index (or Body Mass Index, BMI) for 315 people in a study on dietary retinol and
carotene levels are provided in the ‘PlasmaRetinol.txt’ file in the Homework 7 assignment.
[Ref: Taken from http://lib.stat.cmu.edu/datasets/Plasma_Retinol (accessed 03-Mar-2009).
See also: Nierenberg DW, Stukel TA, Baron JA, Dain BJ, Greenberg ER. Determinants of plasma
levels of beta-carotene and retinol. American Journal of Epidemiology 1989;130:511-521.]
Description: This datafile contains 315 observations on 14 variables.
The data are coded as follows:
AGE: Age (years)
SEX: Sex (1=Male, 2=Female).
SMOKSTAT: Smoking status (1=Never, 2=Former, 3=Current Smoker)
QUETELET: Quetelet (weight/(height^2))
VITUSE: Vitamin Use (1=Yes, fairly often, 2=Yes, not often, 3=No)
CALORIES: Number of calories consumed per day.
FAT: Grams of fat consumed per day.
FIBER: Grams of fiber consumed per day.
ALCOHOL: Number of alcoholic drinks consumed per week.
CHOLESTEROL: Cholesterol consumed (mg per day).
BETADIET: Dietary beta-carotene consumed (mcg per day).
RETDIET: Dietary retinol consumed (mcg per day)
BETAPLASMA: Plasma beta-carotene (ng/ml)
RETPLASMA: Plasma Retinol (ng/ml)
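For reference, the QUETELET coding above (weight/(height^2)) can be sketched as a short calculation. This is only an illustration: the function name quetelet_index and the example person are hypothetical, and the units (kilograms and meters) are an assumption based on the kg/m^2 threshold used in this problem. The data file already contains the QUETELET values, so no such calculation is needed for the homework itself.

```python
def quetelet_index(weight_kg, height_m):
    """Quetelet index (BMI): weight divided by height squared, in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2

# Hypothetical person (not from the data file): 70 kg and 1.75 m tall.
print(round(quetelet_index(70.0, 1.75), 2))  # -> 22.86
```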
A person with a BMI value above 25 kg/m^2 is considered overweight. Carry out a hypothesis test to
test whether mean body mass index in the population sampled for this study is greater than 25. Use
alpha = 0.05 for this test.
Your answer should include the following:
1. Your hypotheses (null and alternative) and whether the alternative is one- or two-sided.
2. The name of the appropriate test statistic.
3. The critical value(s) for the test.
4. The calculated test statistic and the p-value for the test. (Please show all your work.)
5. Your conclusions. (Do you reject your null hypothesis? Why or why not? If you reject it and
conclude a difference exists, in which direction does the difference lie?)
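The mechanics behind the test statistic and p-value requested above can be sketched as follows. This is a minimal illustration in Python rather than R Commander, and it uses a small made-up sample, not the PlasmaRetinol data. The helper names (t_pdf, t_cdf, one_sample_t) are hypothetical, and the p-value is obtained here by numerically integrating the Student's t density, which is an assumption of this sketch and not necessarily how a statistical package computes it.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean = mu0."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return (statistics.mean(values) - mu0) / sem, n - 1

# Made-up BMI values for illustration only (NOT the PlasmaRetinol data).
sample = [22.1, 27.4, 25.3, 30.2, 24.8, 26.9, 23.5, 28.7, 25.9, 29.1]
t_stat, df = one_sample_t(sample, mu0=25.0)
p_one_sided = 1.0 - t_cdf(t_stat, df)  # upper tail, for H1: mean > 25
print(f"t = {t_stat:.3f}, df = {df}, one-sided p = {p_one_sided:.4f}")
```

The upper-tail p-value matches a one-sided "greater than" alternative; a "less than" alternative would use t_cdf(t_stat, df) instead.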
Problem 3. Quetelet Index, again. (4 points)
Part A. Data on the Quetelet index (or Body Mass Index, BMI) for 315 people in a study on dietary
retinol and carotene levels are provided in the ‘PlasmaRetinol.txt’ file in the Homework 7 assignment.
A person with a BMI value above 25 kg/m^2 is considered overweight. Construct a 95% confidence
interval for mean body mass index for the population sampled for this study.
Your answer should include the following:
1. The calculated SEM. (Please show your work.)
2. The appropriate t-coefficient.
3. The upper and lower confidence limits. (Please show your work.)
4. The interpretation of the confidence interval. (Remember that in statistics, ‘interpreting’ a result
simply means restating it in words.)
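The SEM and confidence limits listed above follow the usual pattern of mean plus or minus (t-coefficient times SEM). A minimal sketch in Python, under stated assumptions: the sample is made up (not the PlasmaRetinol data), the function name t_confidence_interval is hypothetical, and the t-coefficient is supplied by hand from a t table (2.262 is the standard two-sided 95% value for 9 degrees of freedom, which fits this 10-value example only).

```python
import math
import statistics

def t_confidence_interval(values, t_coef):
    """Return (lower, upper, sem) for the interval mean +/- t_coef * SEM."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    margin = t_coef * sem
    return mean - margin, mean + margin, sem

# Made-up BMI values for illustration only (NOT the PlasmaRetinol data).
sample = [22.1, 27.4, 25.3, 30.2, 24.8, 26.9, 23.5, 28.7, 25.9, 29.1]
# Assumed t-coefficient: two-sided 95%, df = 10 - 1 = 9, taken from a t table.
lower, upper, sem = t_confidence_interval(sample, t_coef=2.262)
print(f"SEM = {sem:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

For the actual data set the t-coefficient must be looked up for the correct degrees of freedom (sample size minus one).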
Part B. Does your calculated confidence interval include 25? What inference can you make from this
observation?
Part C. Can the results from this confidence interval be compared to those from the hypothesis test in
the previous problem? Why or why not?
Problem 4. Time to Infection. (5 points)
Data on time to infection (months) for 119 kidney dialysis patients are provided in the
‘TimeToInfection.txt’ file in the Homework 7 assignment. Before conducting the study, the principal
investigator hypothesized based on her clinical experience that the mean time to infection in this
patient population would be 10 months. Do the study results support her hypothesis? Carry out a
hypothesis test to test whether mean time to infection in this patient population is different from 10
months. Use alpha = 0.05 for this test.
Your answer should include the following:
1. Your hypotheses (null and alternative) and whether the alternative is one- or two-sided.
2. The name of the appropriate test statistic.
3. The critical value(s) for the test.
4. The calculated test statistic and the p-value for the test. (Please show all your work.)
5. Your conclusions. (Do you reject your null hypothesis? Why or why not? If you reject it and
conclude a difference exists, in which direction does the difference lie?)
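When the alternative is "different from", the p-value counts both tails of the distribution. With 119 patients the t distribution has 118 degrees of freedom and is close to the standard normal, so a rough sanity check on a software-reported p-value can be sketched with a normal approximation. This is an approximation chosen for this sketch, not a substitute for the t-based p-value the assignment asks for. The function name two_sided_p_normal and the example statistic are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_normal(t_stat):
    """Approximate two-sided p-value, treating the statistic as standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))

# Hypothetical test statistic, NOT computed from the TimeToInfection data.
print(round(two_sided_p_normal(2.1), 3))
```

Taking the absolute value makes the result the same whether the sample mean falls above or below the hypothesized 10 months.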
Problem 5. Time to Infection, again. (4 points)
Part A. Data on time to infection (months) for 119 kidney dialysis patients are provided in the
‘TimeToInfection.txt’ file in the Homework 7 assignment. Before conducting the study, the principal
investigator hypothesized based on her clinical experience that the mean time to infection in this
patient population would be 10 months. Do the study results support her hypothesis? Construct a 95%
confidence interval for the mean time to infection in this patient population.
Your answer should include the following:
1. The calculated SEM. (Please show your work.)
2. The appropriate t-coefficient.
3. The upper and lower confidence limits. (Please show your work.)
4. The interpretation of the confidence interval.
Part B. Does your calculated confidence interval include 10 months? What inference can you make
from this observation?
Part C. Can the results from this confidence interval be compared to those from the hypothesis test in
the previous problem? Why or why not?