CHAPTER 3: SAMPLE PROBLEMS FOR HOMEWORK, CLASS OR EXAMS
These problems are designed to be done without access to a computer, but they may require
a calculator.
1. A. A university administrator states ‘there is no evidence, at α = 1%, that aliens have
taken over the university’. You are interested in this conclusion, but prefer to use α = 5%.
(1) You would also say there is no significant evidence of an alien takeover.
(2) You would say there is significant evidence of an alien takeover.
(3) You do not know whether there is significant evidence or not, until you know the p value.
B. You are testing the null hypothesis ‘mean reading scores are not changing in my school
district’ using a significance level of 5%. Which statement below corresponds to a Type II
error?
(1) In truth, mean reading scores have not changed. Sample data yields a p value of 0.03.
(2) In truth, mean reading scores have not changed. Sample data yields a p value of 0.30.
(3) In truth, mean reading scores have changed. Sample data yields a p value of 0.03.
(4) In truth, mean reading scores have changed. Sample data yields a p value of 0.30.
C. For a hypothesis test at a fixed significance level of 5%, power will increase if
(1) σ increases
(2) sample size decreases
(3) the standard error increases
(4) the discrepancy between the true parameter value and the hypothesized value increases
D. The margin of error will generally decrease if
(1) the confidence coefficient decreases
(2) σ increases
(3) sample size decreases
(4) the standard error increases
2. You are reading a research article that states ‘there is significant evidence that the
distribution of debt ratios is changing, z = 2.14’.
a) Calculate a p value for this test statistic, assuming a two-tailed alternative.
b) The authors were using α = 5%. Would you agree with their conclusion, if you use α =
1%?
3. From past experience, the cognition scores on a certain test are known to have a mean of
60 and σ = 12. You have tested a random sample of 40 subjects who have a history of
depression and found a sample mean of 58.
a) Is there significant evidence that the mean score in this population differs from 60? Use
α = 5% and assume that the standard deviation is still 12. In your conclusion, identify the
relevant population.
b) Is it necessary to assume the data come from a normal distribution? Why or why not?
4. In a survey of household saving rates, a sample of 42 households randomly selected from
County X showed a mean saving rate of 1.45 (as a percentage of gross income). Assuming
that the population standard deviation is still at its historical value of 0.79, give a 90%
confidence interval for the mean saving rate in this population.
5. A technician is trying to use data to show that the mean pH readings from an instrument
are biased. The technician takes 8 independent readings of pH from a neutral test solution
with known pH of 7.0.
Ho: µ = 7 and H1: µ ≠ 7.
Not knowing any formal statistics, the technician intuitively understands that a sensible rule
is ‘decide the instrument is biased if the sample mean is less than 6.9 or greater than 7.1’.
Suppose the standard deviation of individual readings from the instrument is known from
technical specifications to be σ = 0.15.
Calculate β if the instrument really is biased with a true mean in this situation of 6.85.
6. You are studying sizes for single-family houses in Gainesville, FL. You believe that the
standard deviation in sizes will be about 400 square feet. How large a sample size will you
need if you want to estimate the mean size in this population using a 90% confidence interval
with a margin of error of 50 square feet?
7. From past experience, the cognition scores on a certain test are known to have a mean of
60 and σ = 12. You wish to show that mean scores among people with a history of
depression will differ from 60. Your plan is to randomly select a sample of 40 such people,
and claim evidence for your hypothesis if the sample mean exceeds 62. What significance
level are you using?
Problems 8 through 10 use distributions other than the normal in applying concepts of
inference.
8. In the past, the success rate (proportion of students earning a C or better) in an elementary
statistics course has been 0.65. The instructor increases the quantity of homework in the
hopes of increasing the success rate. In the semester after the change, 22 of the 30 students
initially enrolled in the course succeed in earning a C or better.
a) State the null and alternative hypotheses in terms of the probability of succeeding after the
change in homework (p).
b) Calculate the p value for the observed data, using the binomial distribution.
c) Using a significance level of 5%, what conclusion would you reach?
9. A nutritionist tests 200 foods to see if they affect the risk of heart disease. In each
hypothesis test of the null hypothesis “Food X has no effect on heart disease”, he uses α = 5%.
Though the nutritionist doesn’t know it, none of the foods has any relation to heart disease so
that the null hypothesis is true in every case. Assume all the foods (and tests) are
independent.
a) What is the expected number of Type I errors, in which the nutritionist claims that there is
evidence the food has an effect on heart disease (even though it truly does not)?
b) What is the probability that there is at least 1 Type I error in the list of 200 hypothesis
tests?
10. In the past, a company’s number of computer network outages in a month has followed a
Poisson distribution with a mean of 1.5. The company has undertaken steps to reduce the
number of network outages. Their assessment plan states ‘we will have evidence our steps
have been effective if there are no network outages in either of the two months following
implementation’.
a) If the plan has not been effective, so that the mean is still 1.5, what is the probability that a
single month will have 0 network outages?
b) What significance level is the company using, assuming months are independent?
SOLUTIONS
1. a) #3
b) #4
c) #4
d) #1
2. a) p value = P(Z < -2.14 or Z > 2.14) = 2*0.0162 = 0.0324
b) No, using α = 1%, there is no significant evidence that the distribution of debt ratios is
changing.
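The p value in 2(a) can be checked numerically. The sketch below is one way to do so, assuming Python 3.8 or later and the standard-library `statistics.NormalDist`; the helper name `two_tailed_p_value` is illustrative and not part of the original solution.

```python
from statistics import NormalDist


def two_tailed_p_value(z: float) -> float:
    """Return P(|Z| >= |z|) for a standard normal Z (hypothetical helper)."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * upper_tail


p_value = two_tailed_p_value(2.14)
print(f"p value = {p_value:.4f}")
# Compare against both significance levels discussed in the problem.
print("significant at 5%:", p_value < 0.05)
print("significant at 1%:", p_value < 0.01)
```

The result should agree with the table-based value above, up to rounding.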
3. a) µ = mean cognition scores among people with a history of depression
Ho: µ = 60
H1: µ ≠ 60
Reject Ho if |z| > 1.96 or p value less than 0.05.
Z = -1.054 and p value = 0.2918
There is no significant evidence that the mean cognition score among people with a history
of depression differs from 60.
b) No, it is not necessary. Since the sample size is large (at least 30), the Central Limit
Theorem implies that the sample mean will be approximately normally distributed.
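The test statistic and p value in 3(a) can be reproduced with a short script. This is a minimal sketch, assuming a known population standard deviation (as the problem states) and Python's standard-library `statistics.NormalDist`; the function name `z_test_two_sided` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu0, sigma, n):
    """Two-sided z test with known sigma; returns (z, p value)."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Values from problem 3: sample mean 58, hypothesized mean 60, sigma 12, n 40.
z, p_value = z_test_two_sided(58, 60, 12, 40)
print(f"z = {z:.3f}, p value = {p_value:.4f}")
print("reject Ho at 5%:", abs(z) > 1.96)
```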
4. 1.45 ± 1.645 * (0.79/√42) = 1.45 ± 0.2005. With 90% confidence, the mean saving rate for
households in County X is between 1.2495 and 1.6505.
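The interval in problem 4 can be verified as follows. This sketch assumes the z-based interval with known σ that the problem describes; `z_confidence_interval` is a hypothetical helper name, and the critical value comes from `NormalDist.inv_cdf` rather than a printed table, so the last digit may differ slightly.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.645 for 90%
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = z_confidence_interval(1.45, 0.79, 42, 0.90)
print(f"90% CI: ({lower:.4f}, {upper:.4f})")
```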
5. The standard error is 0.15/√8 ≈ 0.053. Assuming µ = 6.85, then
β = P(6.9 ≤ x̄ ≤ 7.1) = P(0.943 ≤ Z ≤ 4.717) = 0.1728
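The Type II error probability in problem 5 is the chance that the sample mean lands in the "do not reject" region when the true mean is 6.85. A minimal sketch of that calculation, assuming normally distributed sample means and using the illustrative name `type_ii_error`:

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(lower, upper, true_mean, sigma, n):
    """P(lower <= sample mean <= upper) when the true mean is true_mean."""
    standard_error = sigma / sqrt(n)
    sampling_dist = NormalDist(mu=true_mean, sigma=standard_error)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Technician's rule: fail to reject when 6.9 <= sample mean <= 7.1.
beta = type_ii_error(6.9, 7.1, 6.85, 0.15, 8)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

Because the script does not round the standard error to 0.053, its result may differ from the hand calculation in the fourth decimal place.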
6. n = (1.645 * 400 / 50)^2 ≈ 173.2
Rounding up so that the margin of error does not exceed 50, the sample size should be at least 174.
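The sample-size formula in problem 6 can be explored numerically. The sketch below assumes the z-based margin of error with σ taken as 400; the helper names `required_sample_size` and `margin_of_error` are illustrative. It also prints the margin actually achieved at n = 173 and n = 174, which shows the effect of rounding.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_z(confidence):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)


def required_sample_size(sigma, margin, confidence):
    """Unrounded n solving z * sigma / sqrt(n) = margin."""
    return (critical_z(confidence) * sigma / margin) ** 2


def margin_of_error(sigma, n, confidence):
    """Margin of error achieved by a sample of size n."""
    return critical_z(confidence) * sigma / sqrt(n)


n_raw = required_sample_size(400, 50, 0.90)
n_needed = ceil(n_raw)  # round up so the margin does not exceed the target
print(f"unrounded n = {n_raw:.1f}, rounded up = {n_needed}")
for n in (173, 174):
    print(n, round(margin_of_error(400, n, 0.90), 2))
```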
7. The standard error is 12/√40 ≈ 1.897. Assuming µ = 60, then
α = P(x̄ ≥ 62) = P(Z ≥ 1.054) = 0.1459
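In problem 7 the significance level is the probability of landing in the rejection region when the null hypothesis is true. A short check, assuming the one-sided rule "reject if the sample mean exceeds 62" exactly as stated (the function name `alpha_for_cutoff` is hypothetical):

```python
from math import sqrt
from statistics import NormalDist


def alpha_for_cutoff(cutoff, mu0, sigma, n):
    """P(sample mean >= cutoff) when the true mean equals mu0."""
    standard_error = sigma / sqrt(n)
    null_dist = NormalDist(mu=mu0, sigma=standard_error)
    return 1.0 - null_dist.cdf(cutoff)


alpha = alpha_for_cutoff(62, 60, 12, 40)
print(f"alpha = {alpha:.4f}")
```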
8. a) Ho: p ≤ 0.65 versus H1: p > 0.65
b) p value = P(X ≥ 22) assuming X has binomial distribution with n = 30 and p = 0.65.
p value = 0.2247
c) There is no significant evidence that increasing the quantity of homework increased the
success rate.
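The binomial p value in 8(b) is tedious by hand but simple to sum directly. This sketch uses only `math.comb` from the standard library (Python 3.8 or later); `binomial_upper_tail` is an illustrative name.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by summing the exact pmf."""
    return sum(
        comb(n, x) * p**x * (1.0 - p) ** (n - x)
        for x in range(k, n + 1)
    )


# 22 or more successes out of 30 when the success probability is 0.65.
p_value = binomial_upper_tail(22, 30, 0.65)
print(f"p value = {p_value:.4f}")
print("reject Ho at 5%:", p_value < 0.05)
```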
9. The number of Type I errors among the 200 hypothesis tests will follow a binomial
distribution with n = 200 and p = 0.05
a) µ = np = 10
b) 1 – (0.95)^200 = 0.99996
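Both parts of problem 9 follow from the binomial model and can be checked in a few lines. The variable names below are illustrative, and the calculation assumes independent tests as the problem states.

```python
n_tests = 200
alpha = 0.05

# a) Expected number of Type I errors: mean of Binomial(n, alpha).
expected_type_i = n_tests * alpha

# b) P(at least one Type I error) = 1 - P(no Type I errors in any test).
prob_at_least_one = 1.0 - (1.0 - alpha) ** n_tests

print(f"expected Type I errors = {expected_type_i:.1f}")
print(f"P(at least one) = {prob_at_least_one:.5f}")
```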
10. a) Using the Poisson distribution, the probability of no outage in a single month is
e^(-1.5) ≈ 0.2231
b) The significance level equals the probability of neither month having a network outage,
assuming µ has not changed. If months are independent, this is 0.2231^2 ≈ 0.0498.
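The Poisson calculations in problem 10 can be confirmed with the standard library alone. This is a minimal sketch; `poisson_pmf` is a hypothetical helper, and independence between the two months is assumed as in the problem.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """P(X = k) for X ~ Poisson(mean)."""
    return exp(-mean) * mean**k / factorial(k)


# a) Probability of zero outages in one month when the mean is still 1.5.
p_zero = poisson_pmf(0, 1.5)

# b) Significance level: zero outages in both of two independent months.
significance_level = p_zero**2

print(f"P(0 outages in a month) = {p_zero:.4f}")
print(f"significance level = {significance_level:.4f}")
```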