Chapter 23: Inferences About Means

Central Limit Theorem: (pg. 521, Ch. 18)
No matter what population the random sample comes from, the shape of the sampling distribution of the sample mean is approximately Normal as long as the sample size is large enough. The larger the sample used, the more closely the Normal approximates the sampling distribution.

When creating a sampling distribution, we need:
1. a random sample of quantitative data
2. the true population standard deviation, $\sigma$

If we don’t have $\sigma$ (which is almost always!), we have to estimate it, using the standard error:

$SE(\bar{y}) = \frac{s}{\sqrt{n}}$

HOWEVER, there is additional variance between samples that must be accounted for…

Gosset’s t: (pg. 522)
- accounts for additional variance based on sample size
- use the t model when you only know s, the sample standard deviation (not $\sigma$, the population standard deviation)…this is almost always true!
- t distributions are always bell shaped, but change with sample size
- t model is often denoted: $t_{df}$
- degrees of freedom: df = n - 1

TI Tips: pg. 524-525

Confidence Intervals: Assumptions and Conditions
o Independence Assumption
    - Randomization Condition
    - 10% Condition
o Normal Population Assumption (NEW!)
    - Nearly Normal Condition: the data come from a distribution that is unimodal and symmetric
      *make a histogram or Normal probability plot to check*
    - for small sample sizes (n < 15): the data should follow the Normal model pretty closely
    - for moderate sample sizes (n between 15 and 40): the t methods will work well as long as the data are unimodal and reasonably symmetric
    - for large sample sizes (n > 40): t methods are safe to use even if the data are skewed
      *make a histogram anyway to check for outliers
      *if the data have multiple modes, they may need to be separated into different categories

One-Sample t-interval
When the conditions are met, we are ready to find the confidence interval for the population mean, $\mu$. The confidence interval is:

$\bar{y} \pm t^*_{n-1} \times SE(\bar{y})$

where the standard error of the mean is $SE(\bar{y}) = \frac{s}{\sqrt{n}}$.

The critical value $t^*_{n-1}$ depends on the particular confidence level, C, that you specify and on the number of degrees of freedom, n - 1, which we get from the sample size.

**Just Checking: pg. 526-527
**Step-by-Step: pg. 527-529
**TI Tips: pg. 529-530

One-sample t-test for the mean
The conditions for the one-sample t-test for the mean are the same as for the one-sample t-interval. We test the hypothesis $H_0: \mu = \mu_0$ using the statistic

$t_{n-1} = \frac{\bar{y} - \mu_0}{SE(\bar{y})}$

The standard error of $\bar{y}$ is $SE(\bar{y}) = \frac{s}{\sqrt{n}}$.

When the conditions are met and the null hypothesis is true, this statistic follows a Student’s t-model with n - 1 degrees of freedom. We use that model to obtain a P-value.

Steps for a One-sample t-test:
1) Hypotheses: This time use $\mu$, not $\bar{y}$. (Remember, for proportions we used $p$, not $\hat{p}$.)
2) Model: Check the conditions.
3) Mechanics: With conditions complete, calculate t, draw the curve (indicating degrees of freedom), shade the region representing the P-value, and find P (using technology or a t table for critical values).
4) Conclusion: Link the P-value to the decision in context. Be sure the conclusion talks about the mean of a population.
*If you have an outlier, perform the analysis twice (once with the outlier and once without it).

**Step-by-Step: pg. 531-533
**TI Tips: pg. 533

“Significance and Importance”: When performing a hypothesis test, use the CI to determine possible values and see if the conclusion is important (“statistically significant” does not necessarily mean important).

**Just Checking: pg. 534

Intervals and Tests: A level C confidence interval contains all of the plausible null hypothesis values that would not be rejected by a two-sided hypothesis test at alpha level 1 - C.

Sample Size calculations:

$ME = t^*_{n-1} \times \frac{s}{\sqrt{n}}$

- s is usually unknown
- n is unknown (it’s what we’re looking for)
- s can be estimated (if we have no idea, we can take a small “pilot study”)
- use z* instead of t* to get an estimate
- your sample size calculations (margin of error) won’t be exact

HW: #2, 4, 5, 6, 10, 11, 12, 17
    #19, 21, 26, 29, 33, 34
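Worked sketch: the formulas in these notes (standard error, one-sample t-interval, t statistic, and the z*-based sample size estimate) can be tried out in a few lines of Python. This is only an illustration, not part of the textbook: the function names and the data are made up, the critical value t* = 2.306 (8 degrees of freedom, 95% confidence) is assumed to come from a t table, and rounding the sample size up to the next whole number is a common convention assumed here. Only the standard library is used, so no P-value is computed; that still needs a t table or technology.

```python
import math
import statistics


def standard_error(data):
    """SE(y-bar) = s / sqrt(n), where s is the sample standard deviation."""
    return statistics.stdev(data) / math.sqrt(len(data))


def t_interval(data, t_star):
    """One-sample t-interval: y-bar +/- t* x SE(y-bar).

    t_star is the critical value for n - 1 degrees of freedom; the caller
    looks it up in a t table (it is not computed here).
    """
    y_bar = statistics.mean(data)
    margin = t_star * standard_error(data)
    return y_bar - margin, y_bar + margin


def t_statistic(data, mu_0):
    """Test statistic for H0: mu = mu_0, i.e. (y-bar - mu_0) / SE(y-bar)."""
    return (statistics.mean(data) - mu_0) / standard_error(data)


def sample_size(margin_of_error, s_estimate, confidence=0.95):
    """Rough n from ME = z* x s / sqrt(n), solved for n.

    Uses z* in place of t* (as the notes suggest) because the degrees of
    freedom are unknown until n is known. Rounds up (assumed convention).
    """
    z_star = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star * s_estimate / margin_of_error) ** 2)


# Hypothetical sample of n = 9 measurements (invented for illustration).
data = [29, 31, 30, 32, 28, 30, 31, 29, 30]

# t* for df = 8 at 95% confidence, taken from a t table (assumption).
low, high = t_interval(data, t_star=2.306)
print("SE:", standard_error(data))
print("95% CI:", (low, high))
print("t for H0: mu = 31:", t_statistic(data, mu_0=31))
print("n for ME = 0.5 with s ~ 1.2:", sample_size(0.5, 1.2))
```

Because z* is slightly smaller than t*, the sample size returned is an estimate on the low side, which matches the note that these calculations won't be exact.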