Sampling Distributions, Z-Tests, Power
• We draw inferences about population parameters
from sample statistics
– Sample proportion approximates population proportion
– Sample mean approximates population mean
– Sample variance (using n-1) approximates population
variance
– Etc.
• Statistics vary from one sample to the next
– If a statistic is unbiased, then on average over many
samples, it will equal the population parameter
– There will be variability around that average
– The distribution of a statistic is the sampling distribution
• We will consider sampling distributions for
– Sample means
– Sample proportions assuming Bernoulli processes
• I.e., 2 possible outcomes, and the binomial distribution applies
– Note that for a Bernoulli process, if we use the random
variable X = (0, 1) to denote a failure and a success,
respectively, then the sample proportion is the mean of X
• Thus, the sample proportion is also a sample mean
• The shape of the sampling distribution depends on
the population distribution and on the sample size n
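The points above can be checked with a small simulation. This is only a sketch: the values p = 0.3, n = 50, the number of repeated samples, and the function name `sample_proportion` are illustrative assumptions, not values from the slides. It codes each Bernoulli trial as 0 or 1, so the sample proportion is literally the mean of X, and it shows that the sample proportions vary around the population proportion.

```python
import random
import statistics

def sample_proportion(p, n, rng):
    """Draw n Bernoulli(p) trials coded 0/1 and return their mean."""
    xs = [1 if rng.random() < p else 0 for _ in range(n)]
    return sum(xs) / n  # proportion of successes is the mean of X

rng = random.Random(0)               # fixed seed so the run is repeatable
p, n, n_samples = 0.3, 50, 10_000    # hypothetical population and sample size

# Each entry is the statistic from one sample; together they
# approximate the sampling distribution of the sample proportion.
p_hats = [sample_proportion(p, n, rng) for _ in range(n_samples)]

mean_of_p_hats = statistics.fmean(p_hats)  # close to p if unbiased
sd_of_p_hats = statistics.pstdev(p_hats)   # variability around that average

print(f"average sample proportion: {mean_of_p_hats:.4f} (population p = {p})")
print(f"standard deviation across samples: {sd_of_p_hats:.4f}")
```

The average of the simulated proportions should land close to p, while individual samples scatter around it.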
• For the sampling distribution of the mean
– Sampling distribution mean = population mean
$E(\bar{X}) = \mu_{\bar{X}} = \mu$
– Sampling distribution variance =
(population variance)/sample size
$\sigma_{\bar{X}}^2 = \sigma^2 / n$
– Standard deviation of the sampling distribution is
called the standard error of the mean
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• The Central Limit Theorem
– For samples of size n taken from a population
with mean µ and standard deviation σ
• $E(\bar{X}) = \mu_{\bar{X}} = \mu$
• $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• As n increases, the sampling distribution approaches
the normal distribution
• Holds for any population distribution with a finite variance
• The fact that $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ decreases with n is
very important and useful
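A rough simulation can illustrate how the standard error shrinks with n even when the population is far from normal. This sketch assumes an exponential population with mean 1 and standard deviation 1; the sample sizes, seed, and the helper name `simulate_sample_means` are hypothetical choices for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(n, n_samples, rng):
    """Return n_samples sample means, each computed from n exponential draws."""
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

rng = random.Random(42)
mu, sigma = 1.0, 1.0   # mean and sd of the assumed exponential(rate=1) population
n_samples = 5_000

observed_se = {}
for n in (4, 16, 64):
    means = simulate_sample_means(n, n_samples, rng)
    observed_se[n] = statistics.pstdev(means)   # spread of the sample means
    predicted = sigma / math.sqrt(n)            # standard error of the mean
    print(f"n={n:3d}  observed SE={observed_se[n]:.4f}  sigma/sqrt(n)={predicted:.4f}")
```

Each fourfold increase in n should roughly halve the observed spread of the sample means, matching the square root in the formula.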
• Apply this reasoning to the binomial distribution
– Recall $\mu = E(s) = np$ and $\sigma^2 = np(1 - p)$
• Instead of number of successes, let’s work
with the proportion of successes
– Divide s by n to get
$\hat{p} = s / n$
– Divide expected mean by n and divide variance by
$n^2$ to get
$E(s/n) = E(\hat{p}) = p$
$\sigma_{\hat{p}}^2 = p(1 - p) / n$
$\sigma_{\hat{p}} = \sqrt{p(1 - p) / n}$
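Written out step by step, the division by n and by n squared goes as follows. The only facts used beyond the slide are the standard rules E(cY) = cE(Y) and Var(cY) = c²Var(Y) for a constant c; the notation Var(·) is introduced here for illustration.

```latex
\begin{align*}
E(\hat{p}) &= E\!\left(\frac{s}{n}\right) = \frac{E(s)}{n} = \frac{np}{n} = p \\
\sigma_{\hat{p}}^2 &= \operatorname{Var}\!\left(\frac{s}{n}\right)
  = \frac{\operatorname{Var}(s)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n} \\
\sigma_{\hat{p}} &= \sqrt{\frac{p(1-p)}{n}}
\end{align*}
```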
• The Central Limit Theorem applies to the sampling
distribution for the proportion of successes in n
Bernoulli trials
• The binomial distribution applied to sample
proportions for samples of size n
$E(s/n) = E(\hat{p}) = p$
$\sigma_{\hat{p}}^2 = p(1 - p) / n$
$\sigma_{\hat{p}} = \sqrt{p(1 - p) / n}$
– As n increases, the distribution of sample
proportions approaches the normal
• Knowing or assuming the population proportion, we
can use the table of the standard normal distribution
to determine the probability of obtaining sample
proportions within any interval or beyond any point
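As a worked illustration, Python's `statistics.NormalDist` can stand in for the printed table of the standard normal distribution. The numbers p = 0.5 and n = 100 and the function names are assumptions chosen for the example, and the sketch uses the plain normal approximation without a continuity correction.

```python
import math
from statistics import NormalDist

def proportion_standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def prob_proportion_between(p, n, lo, hi):
    """Normal approximation to P(lo <= p_hat <= hi)."""
    se = proportion_standard_error(p, n)
    std_normal = NormalDist()            # mean 0, standard deviation 1
    z_lo = (lo - p) / se                 # interval endpoints as z-scores
    z_hi = (hi - p) / se
    return std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

p, n = 0.5, 100                          # hypothetical population and sample size
se = proportion_standard_error(p, n)     # about 0.05

within = prob_proportion_between(p, n, 0.40, 0.60)   # z from about -2 to 2
beyond = 1 - NormalDist().cdf((0.60 - p) / se)       # P(p_hat > 0.60)

print(f"standard error: {se:.4f}")
print(f"P(0.40 <= p_hat <= 0.60) is about {within:.4f}")
print(f"P(p_hat > 0.60) is about {beyond:.4f}")
```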
• The sampling distribution of the mean (again)
$E(\bar{X}) = \mu_{\bar{X}} = \mu$
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
– As n increases, the distribution approaches the
normal
• Knowing or assuming the population mean and
standard deviation, we can use the table of the standard
normal distribution to determine the probability of
obtaining sample means within any interval or beyond
any point
• Thus, we can test simple hypotheses about
sample means (called the z-test)
– Calculate z-score for sample mean, $z_{obt}$
– Compare to critical z-score, $z_{crit}$
[Figure: Sampling Distribution - Large n (Distribution is Normal). Density f(Mean) plotted against the sample mean $\bar{X}$ (axis from 40 to 160), centered at µ with spread $\sigma_{\bar{X}} = \sigma / \sqrt{n}$]
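• Convert sample mean to $z_{obt} = (\bar{X} - \mu) / \sigma_{\bar{X}}$, with µ the population mean assumed under H0
• Compare to $z_{crit}$, which is either $z_{\alpha}$ for a 1-tailed test
or $z_{\alpha/2}$ for a 2-tailed test
• Decide whether or not to reject H0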
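The three steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `z_test`, its `tail` argument, and the example values (null mean 100, σ = 15, n = 36, sample mean 106) are all hypothetical, and the population standard deviation is assumed known.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu_null, sigma, n, alpha=0.05, tail="two"):
    """Return (z_obt, z_crit, reject) for a z-test on a sample mean.

    tail is "two", "upper" (H1: mean > mu_null) or "lower" (H1: mean < mu_null).
    """
    se = sigma / math.sqrt(n)                # standard error of the mean
    z_obt = (sample_mean - mu_null) / se     # sample mean as a z-score under H0
    std_normal = NormalDist()
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
        reject = abs(z_obt) > z_crit
    elif tail == "upper":
        z_crit = std_normal.inv_cdf(1 - alpha)       # z_{alpha}
        reject = z_obt > z_crit
    elif tail == "lower":
        z_crit = std_normal.inv_cdf(1 - alpha)       # z_{alpha}, compared in the lower tail
        reject = z_obt < -z_crit
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z_obt, z_crit, reject

# Hypothetical example values
z_obt, z_crit, reject = z_test(sample_mean=106, mu_null=100, sigma=15, n=36)
print(f"z_obt = {z_obt:.2f}, z_crit = {z_crit:.2f}, reject H0: {reject}")
```

With these example numbers the standard error is 2.5, so the sample mean sits about 2.4 standard errors above the null mean, beyond the two-tailed critical value of about 1.96.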
[Figure: Two-tailed test. Standard normal density f(z) plotted against z from -5 to 5, with the central .95 of the area between the critical values marked z.025 = 1.96 in each tail]
• Two reasons large sample sizes are important
– Sampling distribution approaches normal
– Power of the test increases
• Note square root of n in denominator of standard error
• To calculate power for simple z-test
– State H0 ($\mu_{null}$) and H1 ($\mu_{real} \neq \mu_{null}$, $\mu_{real} > \mu_{null}$, or $\mu_{real} < \mu_{null}$)
– Determine $z_{crit}$, which depends on 1-tailed versus 2-tailed
test and on alpha
– Determine sample size, n
– Determine sample outcome(s) that would reject H0
– Assume a particular H1
– Convert the sample outcome(s) leading to rejection of H0
to z-scores under the assumed H1
– Use the table of the standard normal distribution to determine
the probability of the outcome(s) under H1
– That is the power of the test
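The power procedure listed above can be sketched as follows, with `statistics.NormalDist` replacing the printed table. The function name `z_test_power` and the example values (null mean 100, real mean 105, σ = 15) are hypothetical, and σ is assumed known. The rejection cutoffs on the sample-mean scale are the null mean plus or minus z_crit standard errors; re-expressing them as z-scores under the assumed H1 subtracts the distance between the two means measured in standard errors.

```python
import math
from statistics import NormalDist

def z_test_power(mu_null, mu_real, sigma, n, alpha=0.05, tail="two"):
    """Probability of rejecting H0 when the true mean is mu_real."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    # Distance between the H1 and H0 means, in standard errors
    shift = (mu_real - mu_null) / se
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        # Cutoffs mu_null +/- z_crit*se become (+/- z_crit - shift) under H1
        upper = 1 - std_normal.cdf(z_crit - shift)
        lower = std_normal.cdf(-z_crit - shift)
        return upper + lower
    z_crit = std_normal.inv_cdf(1 - alpha)
    if tail == "upper":
        return 1 - std_normal.cdf(z_crit - shift)
    if tail == "lower":
        return std_normal.cdf(-z_crit - shift)
    raise ValueError("tail must be 'two', 'upper' or 'lower'")

# Hypothetical example: power grows with n because se shrinks like 1/sqrt(n)
power_by_n = {n: z_test_power(100, 105, 15, n) for n in (9, 36, 100)}
for n, power in power_by_n.items():
    print(f"n={n:3d}  power is about {power:.3f}")
```

When the assumed real mean equals the null mean, the function returns alpha, which is a useful sanity check on the logic.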