Statistics Chapter 19: Review B – KEY
1. Here is a histogram of waiting times of the 1,159 calls to a customer service center last month.
The mean wait was 47.7 minutes with a standard deviation of 33.14 minutes. Sketch the
sampling distribution of means of random samples of 75 calls taken from this population and
describe its shape, center, and spread in context.
The shape will be a t-model with 74 degrees of freedom (unimodal and symmetric) with a mean of
47.7 minutes and standard deviation of 33.14/√75 ≈ 3.8267 minutes.
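The spread above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and the inputs are the values given in the problem.

```python
import math

# Values taken from the problem statement.
sigma = 33.14   # standard deviation of the waiting times, in minutes
n = 75          # number of calls in each random sample

# Standard deviation of the sampling distribution of the sample mean.
sd_of_mean = sigma / math.sqrt(n)
print(round(sd_of_mean, 4))
```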
2. A government report on housing costs says that single-family home prices nationwide are
skewed to the right, with a mean of $235,700.
a. We collect price data from a random sample of 50 homes in Orange County, California.
Why is it okay to use these data for inference even though the population is skewed?
The Central Limit Theorem guarantees that, as long as the sample size is large enough (here n = 50), the
sampling distribution of the sample mean will be approximately Normal, so a t-model is appropriate.
This allows us to use the inference procedures.
b. The standard deviation of the 50 homes in our sample was $25,500. Specify the sampling
model (shape, center, spread) for the mean price of such samples.
Because the sample is fairly large, the shape will be a t-model with 49 degrees of freedom (unimodal and
symmetric). The center will be at $235,700 and the standard deviation will be
s/√n = 25,500/√50 ≈ $3606.
c. Find a 90% confidence interval for the mean price in Orange County. Does this interval
provide evidence that single-family home prices are unusually high in this county? Explain
briefly.
The conditions have been met, so we can create a one-sample t-interval, with 90% confidence.
ȳ ± t49* ⋅ SE(ȳ) = ȳ ± t49* ⋅ 25500/√50 = ( … , … )
[The sample mean and the interval endpoints are illegible in the source.]
This interval does provide evidence that single-family home prices are unusually high in this county
because the nationwide mean of $235,700 is below the interval.
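As a rough check on the size of this interval, the standard error and margin of error can be computed from the values stated in parts b and c. The critical value 1.677 is an assumption here: it is the approximate table value of t* for 90% confidence with 49 degrees of freedom. The sample mean is not used, so this sketch gives only the half-width of the interval, not its endpoints.

```python
import math

s = 25_500      # sample standard deviation of the 50 home prices, in dollars
n = 50          # sample size
t_star = 1.677  # assumed: approximate table value of t* (90% confidence, 49 df)

# Standard error of the sample mean and margin of error of the t-interval.
standard_error = s / math.sqrt(n)
margin_of_error = t_star * standard_error

print(f"SE = {standard_error:.2f}")
print(f"margin of error = {margin_of_error:.2f}")
```

The interval is then the sample mean plus or minus this margin of error.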
Copyright © 2012 Pearson Education, Inc. Publishing as Addison-Wesley.
3. List and explain the assumptions and conditions you must check before using a one-sample t-test.
Independence Condition: We must check to make certain individual cases within a data set do not have an
effect on each other.
Randomization Condition: The data needs to be collected randomly.
10% Condition: We need to make certain that the sample size is less than 10% of the population.
Nearly Normal Condition: The mechanics involved require the sample distribution to be somewhat Normal.
4. A hypothesis test of whether the mean number of hours adults spend on their cell phones is
more than 30 minutes per day produces a P-value of 0.112. Explain what this means in context.
If the mean amount of time adults spend on their cell phones is 30 minutes per day, we would expect about
11.2% of samples of the same size to have a sample mean as high as or higher than the one we saw in our
sample.
5. Professional home stagers claim to increase the amount of money a homeowner can make
selling their house by making the house look more attractive to prospective buyers. The highest
offers in thousands of dollars on 12 houses are shown before and after a professional home
stager worked on them.
a. Write appropriate hypotheses in words and symbols.
The null hypothesis is that the mean difference is zero, or that there is no difference between before and
after. The alternative hypothesis is that the mean difference is positive, or that the home stager increases
the value. In symbols: H0: µd = 0, HA: µd > 0.
b. Do these data satisfy the assumptions for inference? Explain.
Paired Data Assumption: The data are paired by home.
Randomization Condition: We are not given that the data was not obtained randomly.
10% Condition: 12 houses are certainly less than 10% of all houses.
Independence Condition: These price differences may be assumed to be independent of each other as long as
all the houses are not in the same neighborhood.
Nearly Normal Condition: The distribution does not appear to be approximately Normal.
[Figure: Histogram of After-Before. Horizontal axis: After-Before, from -22.0 to 21.0; vertical axis:
Frequency, from 0 to 4.]
c. Find the mean and standard deviation of these differences.
d̄ = 2.1667, s = 15.3317 (in thousands of dollars)
d. Find the t-value and P-value for the hypothesis test.
t = (2.1667 − 0) / (15.3317/√12) ≈ 0.490
P = P(t11 > 0.490) ≈ 0.3170
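The t-value and P-value can be reproduced without a statistics library. The sketch below is one possible approach, not the method used in the key: it integrates the Student's t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and the inputs are the mean and standard deviation from part c.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), computed as 0.5 minus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

d_bar = 2.1667   # mean of the After-Before differences (thousands of dollars)
s_d = 15.3317    # standard deviation of the differences
n = 12           # number of houses

# One-sample t statistic for H0: mu_d = 0, and the one-sided P-value.
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)

print(f"t = {t_stat:.3f}, P = {p_value:.4f}")
```

If a statistics package is available, its t-distribution survival function would normally replace the hand-rolled integration.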
e. Explain what the P-value means in this context and state an appropriate conclusion.
If the mean difference in highest offer before and after staging is zero, we would expect 31.7% of
samples of 12 homes to have a mean difference as great or greater than $2,167. Since the P-value is so
high, we fail to reject the null hypothesis. There is no evidence that average highest offers are higher
after home staging. We should be cautious with this conclusion, however, since the Nearly Normal
condition was not met.
f. Find and interpret a 95% confidence interval for the change in house offers.
The conditions have been met, so we can create a one-sample t-interval, with 95% confidence.
d̄ ± t11* ⋅ SE(d̄) = 2.1667 ± t11* ⋅ 15.3317/√12 = (−7.575, 11.908).
I am 95% confident that the mean highest offer after staging is between $7,575 lower and $11,908 higher
than the mean highest offer before staging.
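The endpoints of this interval can be checked with a short calculation. In this sketch the function name `t_interval` is hypothetical, and the critical value 2.201 is an assumed table value of t* for 95% confidence with 11 degrees of freedom.

```python
import math

def t_interval(mean, sd, n, t_star):
    """Return (lower, upper) for a one-sample t-interval: mean ± t* · sd/√n."""
    se = sd / math.sqrt(n)
    margin = t_star * se
    return mean - margin, mean + margin

# Mean and SD of the After-Before differences, in thousands of dollars.
lower, upper = t_interval(mean=2.1667, sd=15.3317, n=12, t_star=2.201)
print(f"({lower:.3f}, {upper:.3f})")
```

Because the interval contains zero, it agrees with the failure to reject the null hypothesis in part e.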