STATISTICS 200 Lecture #20
Thursday, October 27, 2016
Textbook: Sections 9.6, 11.1, 11.2

Objectives:
• Apply sampling distribution for one sample mean to confidence intervals and hypothesis tests.
• Identify situations in which t-multipliers and t-tests should be used instead of z-multipliers and z-tests.

We have begun a strong focus on Inference

Proportions: categorical data (2 categories)
• One population proportion
• Two population proportions
parameter: ____ statistic: ____

Means: quantitative data
• One population mean (this week)
• Difference between means
• Mean difference
parameter: ____ statistic: ____

Clicker Question: Consider the following three survey questions:
1. Do you plan to vote in the upcoming presidential election?
2. How old are you?
3. Which candidate do you most support?
How many of these questions will produce quantitative data?
A. 0
B. 1
C. 2
D. 3

Example 1: The population is normally distributed.
[Figure: the population distribution of IQ and two sampling distributions of X-bar, each drawn on a horizontal axis from 52 to 148]

Clicker Question: Which statement(s) are false when comparing the original distribution to the two sampling distributions?
a. All three distributions have the same value for the mean.
b. As the sample size increases, the standard deviation for the sampling distribution decreases.
c. The original distribution will always have a smaller standard deviation than what is found with either of the two sampling distributions.
d. The sampling distributions suggest possible values for the population mean.

Sampling Distribution: Standard Deviation & Standard Error

Statistic $\hat{p}$: standard deviation $\text{s.d.}(\hat{p}) = \sqrt{\dfrac{p(1-p)}{n}}$
Statistic $\bar{x}$: standard deviation $\text{s.d.}(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$

We generally do not know p. Thus, we don’t know s.d.(p-hat).
Similarly: We generally do not know σ. Thus, we don’t know s.d.(x-bar).

Sampling Distribution: Standard Deviation & Standard Error

Statistic $\hat{p}$
  Standard Deviation: $\text{s.d.}(\hat{p}) = \sqrt{\dfrac{p(1-p)}{n}}$
  Standard Error (estimated st. dev.); if p is unknown, use: $\text{s.e.}(\hat{p}) = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Statistic $\bar{x}$
  Standard Deviation: $\text{s.d.}(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$
  Standard Error (estimated st. dev.); if σ is unknown, use: $\text{s.e.}(\bar{x}) = \dfrac{s}{\sqrt{n}}$

Here, s is the sample standard deviation.

Example 1: The population is normally distributed. Substitute s for σ:
[Figure: distribution drawn on the IQ axis from 52 to 148]

Standard normal (dotted red) vs. t (solid black)
Degrees of freedom for t distribution: 1, 5, and 20 (as d.f. increases, the t looks more like the standard normal.)
[Figure: density curves on a horizontal axis from -3 to 3]
We worry a lot about teaching t vs. z, but the difference is tiny for degrees of freedom usually seen in practice.

Confidence Interval Formula
Generic Formula:
  sample estimate ± (margin of error)
  sample estimate ± (multiplier × standard error)
Specific for Population Mean µ:
  $\bar{x} \pm t^{*} \times \dfrac{s}{\sqrt{n}}$
Here, t* depends on confidence level and df = (n – 1).

Multipliers: from the t table (not a complete list, obviously)

Conf. level:    0.90     0.95     0.98     0.99
1 df            6.31     12.71    31.82    63.66
2 df            2.92     4.30     6.96     9.92
3 df            2.35     3.18     4.54     5.84
9 df            1.83     2.26     2.82     3.25
20 df           1.72     2.09     2.53     2.85
30 df           1.70     2.04     2.46     2.75
Infinite df     1.645    1.96     2.326    2.576

Example 2: We ask each of 31 students “how many regular ‘text’ friends do you have?”

Survey results:
  n = 31
  X-bar = 6 friends
  s = 2.0 friends

Clicker Question: What kind of variable is this?
A. Categorical
B. Quantitative

Calculate a 95% Confidence Interval: How can we estimate the population mean number of regular “text” friends for all STAT 200 students using these data?
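One way to work this question is to plug the survey results into the formula for a population mean. The sketch below assumes the multiplier t* = 2.04, read from the t table at 95% confidence with df = n – 1 = 30; the function name `t_confidence_interval` is made up for this illustration and is not from the textbook.

```python
import math

def t_confidence_interval(x_bar, s, n, t_star):
    """Return (low, high, margin) for x_bar ± t_star * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)   # s.e.(x-bar) = s / sqrt(n)
    margin = t_star * standard_error    # margin of error
    return x_bar - margin, x_bar + margin, margin

# Example 2 survey results from the slide
n = 31
x_bar = 6.0
s = 2.0
t_star = 2.04  # assumption: t table value for 95% confidence, df = 30

low, high, margin = t_confidence_interval(x_bar, s, n, t_star)
print(f"margin of error: {margin:.2f} friends")
print(f"95% CI: {low:.1f} to {high:.1f} friends")
```

Using the z-multiplier 1.96 instead of 2.04 would change the margin only slightly, which matches the remark that t and z differ little at this sample size.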
Confidence Interval Formula
Generic Formula:
  sample estimate ± (margin of error)
  sample estimate ± (multiplier × standard error)

Survey results:
  n = 31
  X-bar = 6 friends
  s = 2.0 friends

Thus, with t* = 2.04 (95% confidence, df = 30), the 95% CI is
  $6.0 \pm 2.04 \times \dfrac{2.0}{\sqrt{31}} \approx 6.0 \pm 0.7$

Confidence Interval Interpretation
Calculated Interval: 6.0 ± 0.7 friends (5.3 to 6.7 friends)
We are 95% confident that the ____ number of regular “text” friends for STAT 200 students is between 5.3 and 6.7 friends.
a. sample mean
b. sample proportion
c. population mean
d. population proportion
e. range of values for the

Confidence Interval Conclusion
95% C.I.: 5.3 to 6.7 friends
In the population, we may conclude, with 95% confidence, that on average, STAT 200 students have
A. more than 6 friends.
B. more than 4 friends.
C. fewer than 5 friends.
D. fewer than 6 friends.

Example 3: The population is NOT normally distributed.
[Figure: population distribution of X, with values 1 through 6 each having probability 1/6]
[Figure: Histogram of 100,000 samples (n=25)]
[Figure: Histogram of 100,000 samples (n=100)]

Are all sampling distributions normal? No

When do we have to be cautious?
1. with small sample sizes
2. where the original population is not normal in shape

One-Sample t procedure is valid if one of the conditions for normality is met:
• Sample data suggest a normal shape, or
• We have a large sample size (n ≥ 30)
In either case, the sampling distribution will look normal in shape.

If you understand today’s lecture…
9.61, 9.62, 9.64, 9.65, 11.25, 11.30, 11.32, 11.33
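Example 3 can be reproduced approximately by simulation. The sketch below assumes the population is a fair six-sided die (values 1 through 6, each with probability 1/6, as the population figure suggests). It uses 2,000 repeated samples rather than the 100,000 in the slides to keep the run short, and an arbitrary fixed seed. The helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from a fair die; return the sample means."""
    faces = range(1, 7)
    return [statistics.fmean(rng.choices(faces, k=n)) for _ in range(reps)]

rng = random.Random(200)  # assumption: arbitrary seed, only for repeatability
reps = 2000               # fewer than the slides' 100,000, for speed

# Population standard deviation of a fair die (sigma)
population_sd = statistics.pstdev(range(1, 7))

results = {}
for n in (25, 100):
    means = simulate_sample_means(n, reps, rng)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    theoretical_sd = population_sd / n ** 0.5  # s.d.(x-bar) = sigma / sqrt(n)
    print(f"n={n}: mean of sample means = {results[n][0]:.3f}, "
          f"s.d. of sample means = {results[n][1]:.3f}, "
          f"sigma/sqrt(n) = {theoretical_sd:.3f}")
```

The sample means should center near the population mean of 3.5, and their spread should shrink roughly in line with σ/√n as n goes from 25 to 100, even though the population itself is flat rather than bell-shaped.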