Chapter 4: Sampling and Statistical Inference
Part 2: Estimation

Types of Estimates
Point estimate – a single number used to estimate a population parameter
Interval estimate – a range of values within which a population parameter is believed to lie

Common Point Estimates

Theoretical Issues
Unbiased estimator – one for which the expected value equals the population parameter it is intended to estimate
The sample variance is an unbiased estimator for the population variance:
    $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$        $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Confidence Intervals
Confidence interval (CI) – an interval estimate that specifies the likelihood that the interval contains the true population parameter
Level of confidence ($1 - \alpha$) – the probability that the CI contains the true population parameter, usually expressed as a percentage (90%, 95%, and 99% are most common)

Confidence Intervals for the Mean - Rationale

Confidence Interval for the Mean – $\sigma$ Known
A $100(1 - \alpha)\%$ CI is:
    $\bar{x} \pm z_{\alpha/2} (\sigma / \sqrt{n})$
$z_{\alpha/2}$ may be found from Table A.1 or using the Excel function NORMSINV(1 - α/2)

Confidence Interval for the Mean, $\sigma$ Unknown
A $100(1 - \alpha)\%$ CI is:
    $\bar{x} \pm t_{\alpha/2, n-1} (s / \sqrt{n})$
$t_{\alpha/2, n-1}$ is the value from a t-distribution with $n - 1$ degrees of freedom, from Table A.3 or the Excel function TINV(α, n-1)

Relationship Between Normal Distribution and t-distribution
The t-distribution yields wider confidence intervals than the normal distribution; the difference is largest for small sample sizes.
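The known-sigma interval for the mean can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name mean_ci_known_sigma and the sample numbers are hypothetical and do not come from the slides. The call NormalDist().inv_cdf(1 - alpha/2) from Python's standard library plays the role of NORMSINV(1 - α/2).

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for the population mean when sigma is known.

    Hypothetical helper for illustration; implements
    xbar +/- z_{alpha/2} * sigma / sqrt(n).
    """
    alpha = 1.0 - confidence
    # z_{alpha/2}: the standard normal value with alpha/2 in the upper tail
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Made-up example values: sample mean 50, known sigma 10, n = 25
lower, upper = mean_ci_known_sigma(xbar=50.0, sigma=10.0, n=25, confidence=0.95)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # 95% CI: (46.080, 53.920)
```

The sigma-unknown case has the same shape, with s in place of sigma and a t quantile in place of z. The Python standard library has no t-distribution, so that quantile would have to come from a table or a statistics package.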
PHStat Tool: Confidence Intervals for the Mean
PHStat menu > Confidence Intervals > Estimate for the mean, sigma known…, or Estimate for the mean, sigma unknown…

PHStat Tool: Confidence Intervals for the Mean - Dialog
Enter the confidence level
Choose specification of sample statistics
Check Finite Population Correction box if appropriate

PHStat Tool: Confidence Intervals for the Mean - Results

Confidence Intervals for Proportions
Sample proportion: $\hat{p} = x/n$
$x$ = number in sample having desired characteristic
$n$ = sample size
The sampling distribution of $\hat{p}$ has mean $\pi$ and variance $\pi(1 - \pi)/n$, where $\pi$ is the population proportion
When $n\pi$ and $n(1 - \pi)$ are at least 5, the sampling distribution of $\hat{p}$ approaches a normal distribution

Confidence Intervals for Proportions
A $100(1 - \alpha)\%$ CI is:
    $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
PHStat tool is available under Confidence Intervals option

Confidence Intervals and Sample Size
CI for the mean, $\sigma$ known
Sample size needed for half-width of at most $E$ is
    $n \ge (z_{\alpha/2})^2 \sigma^2 / E^2$
CI for a proportion
Sample size needed for half-width of at most $E$ is
    $n \ge \frac{(z_{\alpha/2})^2 \, \pi(1 - \pi)}{E^2}$
Use $\hat{p}$ as an estimate of $\pi$, or 0.5 for the most conservative estimate

PHStat Tool: Sample Size Determination
PHStat menu > Sample Size > Determination for the Mean or Determination for the Proportion
Enter s, E, and confidence level
Check Finite Population Correction box if appropriate

Confidence Intervals for Population Total
A $100(1 - \alpha)\%$ CI is:
    $N\bar{x} \pm t_{n-1, \alpha/2} \, N \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
PHStat tool is available under Confidence Intervals option

Confidence Intervals for Differences Between Means
                      Population 1    Population 2
Mean                  $\mu_1$         $\mu_2$
Standard deviation    $\sigma_1$      $\sigma_2$
Point estimate        $\bar{x}_1$     $\bar{x}_2$
Sample size           $n_1$           $n_2$
Point estimate for the difference in means, $\mu_1 - \mu_2$, is given by $\bar{x}_1 - \bar{x}_2$

Independent Samples With Unequal Variances
A $100(1 - \alpha)\%$ CI is:
    $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2, df^*} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
    $df^* = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
Fractional values of $df^*$ are rounded down

Independent Samples With Equal Variances
A $100(1 - \alpha)\%$ CI is:
    $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2, n_1 + n_2 - 2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
    $s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
where $s_p$ is a common "pooled" standard deviation. Must assume the variances of the two populations are equal.

Paired Samples
A $100(1 - \alpha)\%$ CI is:
    $\bar{D} \pm t_{n-1, \alpha/2} \, s_D / \sqrt{n}$
$D_i$ = difference for each pair of observations
$\bar{D}$ = average of differences
    $s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$
PHStat tool available in the Confidence Intervals menu

Differences Between Proportions
A $100(1 - \alpha)\%$ CI is:
    $\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
Applies when $n_i \hat{p}_i$ and $n_i(1 - \hat{p}_i)$ are greater than 5

Sampling Distribution of s
The sample standard deviation, $s$, is a point estimate for the population standard deviation, $\sigma$
The sampling distribution of $s$ is described through the chi-square ($\chi^2$) distribution: for a normal population, $(n - 1)s^2/\sigma^2$ has a chi-square distribution with $n - 1$ df
See Table A.4
CHIDIST(x, deg_freedom) returns probability to the right of x
CHIINV(probability, deg_freedom) returns the value of x for a specified right-tail probability

Confidence Intervals for the Variance
A $100(1 - \alpha)\%$ CI is:
    $\left[ \frac{(n - 1) s^2}{\chi^2_{n-1, \alpha/2}}, \; \frac{(n - 1) s^2}{\chi^2_{n-1, 1 - \alpha/2}} \right]$

PHStat Tool: Confidence Intervals for Variance - Dialog
PHStat menu > Confidence Intervals > Estimate for the Population Variance
Enter sample size, standard deviation, and confidence level

PHStat Tool: Confidence Intervals for Variance - Results

Time Series Data
Confidence intervals only make sense for stationary time series data

Probability Intervals
A $100(1 - \alpha)\%$ probability interval for a random variable $X$ is an interval $[A, B]$ such that $P(A \le X \le B) = 1 - \alpha$
Do not confuse a confidence interval with a probability interval; confidence intervals are probability intervals for sampling distributions, not for the distribution of the random variable.
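
The proportion interval and the matching sample-size formula from earlier in this chapter can be sketched together in Python. This is a minimal illustration under stated assumptions: the function names z_value, proportion_ci, and sample_size_for_proportion and the example numbers are hypothetical and do not appear in the slides. The sample-size function rounds up to the next whole observation so that the half-width does not exceed E.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_value(confidence):
    """z_{alpha/2} for a given confidence level, e.g. 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def proportion_ci(x, n, confidence=0.95):
    """CI for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half_width = z_value(confidence) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def sample_size_for_proportion(E, confidence=0.95, p=0.5):
    """Smallest n with half-width at most E; p = 0.5 is the conservative choice."""
    return ceil(z_value(confidence) ** 2 * p * (1.0 - p) / E ** 2)


# Made-up example: 40 of 100 sampled items have the characteristic
lower, upper = proportion_ci(x=40, n=100, confidence=0.95)
print(f"95% CI for the proportion: ({lower:.4f}, {upper:.4f})")

# Sample size for a half-width of 0.05 at 95% confidence, conservative p = 0.5
print(sample_size_for_proportion(E=0.05))  # 385
```

Using p = 0.5 maximizes p(1 - p), so it gives the largest and therefore safest sample size when no prior estimate of the proportion is available.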