Sampling Distributions, Z-Tests, Power

• We draw inferences about population parameters from sample statistics
  – Sample proportion approximates population proportion
  – Sample mean approximates population mean
  – Sample variance (using n − 1) approximates population variance
  – Etc.
• Statistics vary from one sample to the next
  – If a statistic is unbiased, then on average over many samples, it will equal the population parameter
  – There will be variability around that average
  – The distribution of a statistic is the sampling distribution
• We will consider sampling distributions for
  – Sample means
  – Sample proportions assuming Bernoulli processes
    • I.e., 2 possible outcomes, and the binomial distribution applies
  – Note that for a Bernoulli process, if we use the random variable X = 0 to denote a failure and X = 1 to denote a success, then the sample proportion is the mean of X
    • Thus, the sample proportion is also a sample mean
• The shape of the sampling distribution depends on the population distribution and on the sample size n

• For the sampling distribution of the mean
  – Sampling distribution mean = population mean: $E(\bar{X}) = \mu_{\bar{X}} = \mu$
  – Sampling distribution variance = (population variance)/(sample size): $\sigma_{\bar{X}}^2 = \sigma^2 / n$
  – Standard deviation of the sampling distribution is called the standard error of the mean: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• The Central Limit Theorem
  – For samples of size n taken from a population with mean $\mu$ and standard deviation $\sigma$
    • $E(\bar{X}) = \mu_{\bar{X}} = \mu$
    • $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
    • As n increases, the sampling distribution approaches the normal distribution
    • Holds for any population distribution with finite variance
• The fact that $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ decreases with n is very important and useful

• Apply this reasoning to the binomial distribution
  – Recall $\mu = E(s) = np$ and $\sigma^2 = np(1 - p)$, where s is the number of successes in n trials
• Instead of number of successes, let's work with the proportion of successes
  – Divide s by n to get $\hat{p} = s / n$
  – Divide the expected value by n and divide the variance by $n^2$ to get
    • $E(s / n) = E(\hat{p}) = p$
    • $\sigma_{\hat{p}}^2 = p(1 - p) / n$
    • $\sigma_{\hat{p}} = \sqrt{p(1 - p) / n}$
• The Central Limit Theorem applies to the sampling distribution for the proportion of successes in n Bernoulli trials

• The binomial distribution applied to sample proportions for samples of size n
  – $E(s / n) = E(\hat{p}) = p$
  – $\sigma_{\hat{p}}^2 = p(1 - p) / n$
  – $\sigma_{\hat{p}} = \sqrt{p(1 - p) / n}$
  – As n increases, the distribution of sample proportions approaches the normal
• Knowing or assuming the population proportion, we can use the table of the standard normal distribution to determine the probability of obtaining sample proportions within any interval or beyond any point

• The sampling distribution of the mean (again)
  – $E(\bar{X}) = \mu_{\bar{X}} = \mu$
  – $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
  – As n increases, the distribution approaches the normal
• Knowing or assuming the population mean and standard deviation, we can use the table of the standard normal distribution to determine the probability of obtaining sample means within any interval or beyond any point
• Thus, we can test simple hypotheses about sample means (called the z-test)
  – Calculate the z-score for the sample mean, $z_{obt}$
  – Compare to the critical z-score, $z_{crit}$

[Figure: Sampling distribution for large n (distribution is normal). Horizontal axis: sample mean $\bar{X}$, centered at $\mu$; vertical axis: f(mean); spread given by $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.]

• Convert the sample mean to $z_{obt} = (\bar{X} - \mu) / \sigma_{\bar{X}}$, where $\mu$ is the population mean stated by H0
• Compare to $z_{crit}$, which is either $z_{\alpha}$ for a 1-tailed test or $z_{\alpha/2}$ for a 2-tailed test
• Decide whether or not to reject H0

[Figure: Two-tailed test. Standard normal density f(z) against z, with central area .95 between $-z_{.025} = -1.96$ and $z_{.025} = 1.96$.]

• Two reasons large sample sizes are important
  – Sampling distribution approaches normal
  – Power of the test increases
    • Note the square root of n in the denominator of the standard error
• To calculate power for a simple z-test
  – State H0 ($\mu_{null}$) and H1 ($\mu_{real} \neq \mu_{null}$, $\mu_{real} > \mu_{null}$, or $\mu_{real} < \mu_{null}$)
  – Determine $z_{crit}$, which depends on 1-tailed versus 2-tailed test and on alpha
  – Determine sample size, n
  – Determine the sample outcome(s) that would reject H0
  – Assume a particular H1
  – Convert the sample outcome(s) leading to rejection of H0 to z-scores under the assumed H1
  – Use the table of the standard normal distribution to determine the probability of the outcome(s) under H1
  – That is the power of the test
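
The power steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `z_test_power`, its parameters, and the numbers in the usage note are hypothetical, and the standard library's `statistics.NormalDist` stands in for the printed table of the standard normal distribution. It assumes the population standard deviation is known and, for a 1-tailed test, that the rejection region lies on the same side as the assumed real mean.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; plays the role of the z table


def z_test_power(mu_null, mu_real, sigma, n, alpha=0.05, tails=2):
    """Power of a simple z-test for a mean with known sigma (illustrative).

    mu_null: mean stated by H0; mu_real: mean assumed under H1.
    For tails=1 the rejection region is placed on the side of mu_real.
    """
    se = sigma / sqrt(n)  # standard error of the mean

    # Critical z: z_alpha for a 1-tailed test, z_(alpha/2) for a 2-tailed test.
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / tails)

    # Sample means beyond these cutoffs would reject H0.
    upper_cutoff = mu_null + z_crit * se
    lower_cutoff = mu_null - z_crit * se

    # Convert the cutoffs to z-scores under the assumed H1.
    z_upper = (upper_cutoff - mu_real) / se
    z_lower = (lower_cutoff - mu_real) / se

    # Probability of landing in each rejection region when H1 is true.
    p_upper = 1 - STD_NORMAL.cdf(z_upper)
    p_lower = STD_NORMAL.cdf(z_lower)

    if tails == 2:
        return p_upper + p_lower
    return p_upper if mu_real > mu_null else p_lower
```

For example, with the made-up values `z_test_power(100, 105, 15, 36)` the 2-tailed power comes out at roughly 0.52. Raising n shrinks the standard error and raises the power, which is the point of the square root of n in the denominator.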