Statistics Chapter 19: Review B – KEY

1. Here is a histogram of waiting times of the 1,159 calls to a customer service center last month. The mean wait was 47.7 minutes with a standard deviation of 33.14 minutes. Sketch the sampling distribution of means of random samples of 75 calls taken from this population and describe its shape, center, and spread in context.

The shape will be a t-model with 74 degrees of freedom (unimodal and symmetric) with a mean of 47.7 minutes and a standard deviation of \( 33.14/\sqrt{75} = 3.8267 \) minutes.

2. A government report on housing costs says that single-family home prices nationwide are skewed to the right, with a mean of $235,700.

a. We collect price data from a random sample of 50 homes in Orange County, California. Why is it okay to use these data for inference even though the population is skewed?

The Central Limit Theorem guarantees that, as long as the sample size is large enough (here n = 50), the distribution of sample means will be a t-model. This allows us to use the inference procedures.

b. The standard deviation of the 50 homes in our sample was $25,500. Specify the sampling model (shape, center, spread) for the mean price of such samples.

Because the sample is fairly large, the shape will be a t-model with 49 degrees of freedom (unimodal and symmetric). The center will be at $235,700 and the standard deviation will be \( SE(\bar{y}) = s/\sqrt{n} = 25{,}500/\sqrt{50} \), or about $3606.

c. Find a 90% confidence interval for the mean price in Orange County. Does this interval provide evidence that single-family home prices are unusually high in this county? Explain briefly.

The conditions have been met, so we can create a one-sample t-interval, with 90% confidence:

\( \bar{y} \pm t^*_{49} \cdot SE(\bar{y}) = \bar{y} \pm t^*_{49} \cdot \frac{25{,}500}{\sqrt{50}} = (\ldots, \ldots) \)

This interval does provide evidence that single-family home prices are unusually high in this county, because the nationwide mean of $235,700 is below the interval.

Copyright © 2012 Pearson Education, Inc. Publishing as Addison-Wesley.

3.
List and explain the assumptions and conditions you must check before using a one-sample t-test.

Independence Condition: We must check that individual cases within a data set do not have an effect on each other.
Randomization Condition: The data need to be collected randomly.
10% Condition: We need to make certain that the sample size is less than 10% of the population.
Nearly Normal Condition: The mechanics involved require the sample distribution to be somewhat Normal.

4. A hypothesis test of whether the mean amount of time adults spend on their cell phones is more than 30 minutes per day produces a P-value of 0.112. Explain what this means in context.

If the mean amount of time adults spend on their phones is 30 minutes per day, we would expect about 11.2% of samples of the same size to have a sample mean as high as or higher than the one we saw in our sample.

5. Professional home stagers claim to increase the amount of money a homeowner can make selling their house by making the house look more attractive to prospective buyers. The highest offers in thousands of dollars on 12 houses are shown before and after a professional home stager worked on them.

a. Write appropriate hypotheses in words and symbols.

The null hypothesis is that the mean difference is zero, or that there is no difference between before and after. The alternative hypothesis is that the mean difference is positive, or that the home stager increases the value. In symbols: \( H_0\colon \mu_d = 0 \), \( H_A\colon \mu_d > 0 \).

b. Do these data satisfy the assumptions for inference? Explain.

Paired Data Assumption: The data are paired by home.
Randomization Condition: We are not told that the data were obtained randomly.
10% Condition: 12 houses are certainly less than 10% of all houses.
Independence Condition: These price differences may be assumed to be independent of each other as long as all the houses are not in the same neighborhood.
Nearly Normal Condition: The distribution does not appear to be approximately Normal.

[Figure: Histogram of After-Before. Horizontal axis: After-Before, from -22.0 to 21.0. Vertical axis: Frequency, from 0 to 4.]

c. Find the mean and standard deviation of these differences.

\( \bar{d} = 2.1667 \), \( s_d = 15.3317 \)

d. Find the t-value and P-value for the hypothesis test.

\( t = \frac{2.1667 - 0}{15.3317/\sqrt{12}} = 0.490 \)

\( P = P(t_{11} > 0.490) = 0.3170 \)

e. Explain what the P-value means in this context and state an appropriate conclusion.

If the mean difference in highest offer before and after staging is zero, we would expect 31.7% of samples of 12 homes to have a mean difference as great as or greater than $2,167. Since the P-value is so high, we fail to reject the null hypothesis. There is no evidence that average highest offers are higher after home staging. We should be cautious with this conclusion, however, since the Nearly Normal Condition was not met.

f. Find and interpret a 95% confidence interval for the change in house offers.

With the same caution about the Nearly Normal Condition, we can create a one-sample t-interval for the mean difference, with 95% confidence.

\( \bar{d} \pm t^*_{11} \cdot SE(\bar{d}) = 2.1667 \pm t^*_{11} \cdot \frac{15.3317}{\sqrt{12}} = (-7.575, 11.908) \)

I am 95% confident that the mean highest offer after staging is between $7,575 lower and $11,908 higher than the mean highest offer before staging.
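
The arithmetic in questions 1, 2b, and 5d through 5f can be checked from the summary statistics alone. The following is a minimal Python sketch and is not part of the original key. The function names are made up for illustration, the critical value 2.201 for 11 degrees of freedom is assumed from a standard t-table, and the P-value is obtained by numerically integrating the t density rather than by calling a statistics library.

```python
import math


def standard_error(s, n):
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on the density over [0, t].

    steps must be even. By symmetry P(T > 0) = 0.5, so the upper tail is
    0.5 minus the area between 0 and t.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Question 1: spread of the sampling distribution for samples of 75 calls.
se_calls = standard_error(33.14, 75)

# Question 2b: standard error of the mean price for samples of 50 homes.
se_homes = standard_error(25_500, 50)

# Question 5: paired differences (After - Before), in thousands of dollars.
d_bar, s_d, n = 2.1667, 15.3317, 12
se_d = standard_error(s_d, n)
t_stat = (d_bar - 0) / se_d              # test statistic under H0: mu_d = 0
p_value = t_upper_tail(t_stat, n - 1)    # one-sided, HA: mu_d > 0

T_STAR_11 = 2.201  # assumed: 95% two-sided critical value, 11 df, from a t-table
ci_95 = (d_bar - T_STAR_11 * se_d, d_bar + T_STAR_11 * se_d)

print(f"Q1  SE = {se_calls:.4f} minutes")
print(f"Q2b SE = {se_homes:.0f} dollars")
print(f"Q5d t = {t_stat:.3f}, P = {p_value:.4f}")
print(f"Q5f 95% CI = ({ci_95[0]:.3f}, {ci_95[1]:.3f}) thousand dollars")
```

If SciPy is available, `scipy.stats.t.sf` and `scipy.stats.t.ppf` would give the tail probability and the critical value directly; the hand-rolled integration is used here only to keep the sketch free of third-party dependencies.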