Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Inference Procedure Summary – AP Statistics Procedure Formula Confidence Interval for mean µ when given σ One sample z-interval for the mean Hypothesis Test for mean µ when given σ (Ho: µ = µ o) x ± z* Conditions One Sample Mean and Proportion σ n SRS from population of interest, normality of x ( CLT if n>30 or norm pop via plots), “essentially” ind. observations (samp < 10% of pop); Know value of population stand. deviation σ One sample z-test for mean z= x − µo σ x ± tdf * SAME AS ABOVE CI n One sample t-interval for the mean CI for mean µ when σ is unknown Calculator Options s n (with df = n – 1) *Can also find p-value using 2nd-Distr normalcdf(lower, upper, mean, sd) SRS from population of interest, normality of x (CLT if n>30 or norm pop via plots, or robustness of t-procedures (see below)), “essentially” independent observations (samp <10% of pop) The t procedures are applicable if a) n > 40, even if skewness and outliers exist; b) 15 < n < 40, with normal probability plot showing little skewness and no extreme outliers; and c) n < 15 with npp showing no outliers and no skewness Inference Procedure Summary – AP Statistics One sample t-test for the mean Test for mean µ when σ is unknown (Ho: µ = µ o) x − µo s n t= SAME AS ABOVE CI (with df = n – 1) One sample z-interval for a proportion CI for proportion p pˆ ± z * pˆ (1 − pˆ ) n One sample z-test for a proportion Test for proportion p (Ho: p = po) z= pˆ − p o po (1 − po ) n *Can also find p-value using 2nd-Distr tcdf(lower, upper, df) SRS from population of interest, normality of p̂ (successes npˆ ≥ 10 and failures n(1 − pˆ ) ≥ 10 ), “essentially” ind. observations (samp <10% of pop) SRS from population of interest, normality of p̂ (successes np0 ≥ 10 and failures n(1 − p0 ) ≥ 10 ), “essentially” ind. observations (samp <10% of pop) *Can also find p-value using 2nd-Distr normalcdf(lower, upper, mean, sd) Inference Procedure Summary – AP Statistics Two Sample Means and Proportions Both samples are independent SRSs from distinct populations, normality* of x1 − x2 , (requires x1 and x2 to Two-sample t-interval for difference in two means CI for mean µ 1-µ 2 when σ is unknown s12 s22 ( x1 − x2 ) ± tdf * + n1 n2 with conservative df = n – 1 of smaller sample be approx. norm., (use CLT or plots of x1 and x2), or see note below), obs. within each sample are independent (each samp <10% of its pop) * The t procedures are applicable if a) n1 + n2 > 40, even if skewness and outliers exist; b) 15 < n1 + n2 < 40, with normal probability plots showing little skewness and no extreme outliers; and c) n1 + n2 < 15 with npp showing no outliers and no skewness Two-sample t-test for difference in two means Test for mean µ 1-µ 2 when σ is unknown (Ho: µ 1 = µ 2) t= usually = 0 ( x1 − x2 ) − ( µ1 − µ2 ) s12 s22 + n1 n2 with conservative df = n – 1 of smaller sample SAME AS ABOVE CI *Can also find p-value using 2nd-Distr tcdf(lower, upper, df) where df is either conservative estimate or value using long formula that calculator does automatically! Inference Procedure Summary – AP Statistics Two-sample z-interval for difference in two proportions CI for proportion p1 – p2 ( pˆ 1 − pˆ 2 ) ± z * n1 p̂1 and n2 p̂ 2 and failures ˆp1 (1 − pˆ 1 ) pˆ 2 (1 − pˆ 2 ) n (1 − pˆ ) and n (1 − pˆ ) are all 1 1 2 2 + n1 n2 at least 5), obs. within each sample are independent (each samp <10% of its pop) Two-sample z-test for difference in two proportions Test for proportion p1 – p2 z= Both samples are independent SRSs from distinct populations, normality of pˆ1 − pˆ 2 (successes ( pˆ 1 − pˆ 2 ) 1 1 pˆ (1 − pˆ ) + n1 n2 where pˆ = X1 + X 2 n1 + n2 SAME AS ABOVE CI, except when assessing normality, we use the pooled proportion p̂ (see at left) to check that the counts of success n1 pˆ and n2 pˆ and failures n1 (1 − pˆ ) and n2 (1 − pˆ ) are all at least 5 *Can also find p-value using 2nd-Distr normalcdf(lower, upper, mean, sd) where mean and sd are values from numerator and denominator of the formula for the test statistic Inference Procedure Summary – AP Statistics Categorical Distributions χ2 = ∑ Chi Square Test (O − E ) 2 E G. of Fit – 1 sample, 1 variable; df = categories – 1 Independence – 1 sample, 2 variables; df = ( r − 1)( c − 1) Homogeneity – 2 or more samples, 2 variables; df = ( r − 1)( c − 1) 1. SRS from population of interest (independent SRSs for homogeneity) 2. All expected counts are at least 1 3. No more than 20% of expected counts are less than 5 *Can also find p-value using 2nd-Distr x2cdf(lower, upper, df) Slope t-interval for regression slope CI for β b ± tdf *SEb where df = n – 2: s SEb = ∑ ( x − x )2 and s = 1 ( y − yˆ ) 2 ∑ n−2 1. Independent observations 2. Linear relationship between x and y 3. For any fixed x, y varies according to a normal distribution 4. Standard deviation of y is same for all x values t-test for regression slope Test for β t= b with df = n – 2 SE b SAME AS ABOVE CI *You will typically be given computer output for inference for regression Inference Procedure Summary – AP Statistics Variable Legend – here are a few of the most commonly used variables Variable µ σ x s Meaning population mean “mu” population standard deviation “sigma” sample mean “x-bar” sample standard deviation Variable CLT SRS npp p z test statistic using normal distribution p̂ z* t critical value representing confidence level C test statistic using t distribution Meaning Central Limit Theorem Simple Random Sample Normal Probability Plot (last option on stat plot) population proportion sample proportion p-hat or “pooled” proportion p-hat for two sample procedures t* critical value representing confidence level C n sample size Matched Pairs – same as one sample procedures but one set of data is created from the difference of two matched lists (i.e. pre and post test scores of left and right hand measurements). The differences are what need to meet the inference conditions, and notation reflects this, i.e., we use µdiff , xdiff , σ diff , sdiff , etc. . Conditions – show that they are met (i.e. substitute values in and show sketch of npp) ... don’t just list them