Statistical Test for Population Mean
• In statistical testing, we use deductive reasoning to specify what should happen if the conjecture or null hypothesis is true.
• A study is designed to collect data to challenge this hypothesis.
• We determine if what we expected to happen is supported by the data.
  • If it is not supported, we reject the conjecture.
  • If it cannot be rejected, we (tentatively) fail to reject the conjecture.
• We have not proven the conjecture; we have simply not been able to disprove it with these data.
One Sample Inf-1
Logic Behind Statistical Tests
Statistical tests are based on the concept of Proof by Contradiction.
If P then Q = If NOT Q then NOT P
Analogy with justice system
Court of Law
1. Start with premise that person is innocent. (Opposite is guilty.)
2. If enough evidence is found to show beyond reasonable doubt that person committed crime, reject premise. (Enough evidence that person is guilty.)
3. If not enough evidence found, we don't reject the premise. (Not enough evidence to conclude person guilty.)

Statistical Hypothesis Test
1. Start with null hypothesis, status quo. (Opposite is alternative hypothesis.)
2. If a significant amount of evidence is found to refute null hypothesis, reject it. (Enough evidence to conclude alternative is true.)
3. If not enough evidence found, we don't reject the null. (Not enough evidence to disprove null.)
One Sample Inf-2
Examples in Testing Logic
CLAIM: A new variety of turf grass has been developed that is claimed to resist drought better than currently used varieties.
CONJECTURE: The new variety resists drought no better than currently used varieties.
DEDUCTION: If the new variety is no better than other varieties (P), then areas planted with the new variety should display the same number of surviving individuals (Q) after a fixed period without water as areas planted with other varieties.
CONCLUSION: If more surviving individuals are observed for the new variety than the other varieties (NOT Q), then we conclude the new variety is indeed not the same as the other varieties, but in fact is better (NOT P).
One Sample Inf-3
Five Parts of a Statistical Test
1. Null Hypothesis (H0)
2. Alternative Hypothesis (HA)
3. Test Statistic (T.S.): computed from sample data; its sampling distribution is known if the Null Hypothesis is true.
4. Rejection Region (R.R.): reject H0 if the test statistic computed with the sample data is unlikely to come from the sampling distribution under the assumption that the Null Hypothesis is true.
5. Conclusion: Reject H0 or Do Not Reject H0.
Example:
H0: μ ≤ 530
HA: μ > 530
T.S.:  z* = (ȳ − μ0) / (σ/√n) ~ N(0, 1)
Other possible HA:  μ < 530  or  μ ≠ 530
R.R.:  z* > zα,  or  z* < −zα,  or  |z*| > zα/2
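A minimal Python sketch of this five-part test for the slide's example (the data values and σ below are hypothetical; scipy is assumed to be available):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical data for the slide's example (mu0 = 530, upper-tailed test)
y = np.array([525.0, 541.2, 538.7, 529.3, 547.9, 533.4, 530.8, 544.1])
mu0 = 530.0
sigma = 8.0          # assumed known population standard deviation (hypothetical)
alpha = 0.05

# Test statistic: z* = (ybar - mu0) / (sigma / sqrt(n))
n = len(y)
z_star = (y.mean() - mu0) / (sigma / np.sqrt(n))

# Rejection region for HA: mu > mu0 (upper tail); z_alpha from the standard normal
z_crit = norm.ppf(1 - alpha)
print(f"z* = {z_star:.3f}, z_alpha = {z_crit:.3f}, reject H0: {z_star > z_crit}")
```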
One Sample Inf-4
Hypotheses
Research Hypotheses: The thing we are primarily interested in “proving”.
H1A: μ > μ0        H1A: μ > 1.6 m
H2A: μ < μ0        H2A: μ < 1.6 m      (example: μ = average height of the class)
H3A: μ ≠ μ0        H3A: μ ≠ 1.6 m

Null Hypothesis: Things are what they say they are, status quo.
H0: μ = μ0         H0: μ = 1.6 m
(It's common practice to always write H0 in this way, even though what is meant is the opposite of HA in each case.)
One Sample Inf-5
Test Statistic
Some function of the data that uses estimates of the parameters we are interested in and whose sampling distribution is known when we assume the null hypothesis is true.

Most good test statistics are constructed using some form of a sample mean. Why? The Central Limit Theorem of Statistics.
One Sample Inf-6
Developing a Test Statistic for the Population Mean

Hypotheses of interest:  H1A: μ > μ0,   H2A: μ < μ0,   or   H3A: μ ≠ μ0

Test Statistic: the sample mean,  ȳ = (1/n) Σ yi

Under H0: the sample mean has a sampling distribution that is normal with mean μ0:   ȳ ~ N(μ0, σ/√n)
Under HA: the sample mean has a sampling distribution that is also normal but with a mean μ1 that is different from μ0:   ȳ ~ N(μ1, σ/√n),  μ1 ≠ μ0
One Sample Inf-7
Graphical View
[Figure: the sampling distributions of ȳ when H0 is true (centered at μ0) and when HA1 is true (centered at μ1), with three possible sample-mean locations marked (1), (2), and (3).]
What would you conclude if the sample mean fell in location (1)? How about location (2)? Location (3)? Which is most likely, 1, 2, or 3, when H0 is true?
One Sample Inf-8
Rejection Region
[Figure: the sampling distribution N(μ0, σ/√n), with H0 assumed true, testing HA1: μ > μ0; the rejection region is shaded in the upper tail beyond the critical value C1.]

Reject H0 if the sample mean is in the upper tail of the sampling distribution: if ȳ ≥ C1, reject H0.
How do we determine C1?
One Sample Inf-9
Determining the Critical Value for the Rejection Region

Reject H0 if the sample mean is larger than “expected”. If H0 were true, we would expect 95% of sample means to be less than the upper limit of a 90% CI for μ.

C1 = μ0 + 1.645 · σ/√n      (1.645 comes from the standard normal table)

In this case, if we use this critical value, then in 5 out of 100 repetitions of the study we would reject H0 incorrectly. That is, we would make an error:

P( (ȳ − μ0) / (σ/√n) ≥ 1.645 ) = 0.05

But suppose HA1 is the true situation. Then most sample means will be greater than C1, and we will be making the correct decision to reject more often.
One Sample Inf-10
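A short Python sketch of the critical-value calculation above (the values of μ0, σ, and n here are hypothetical):

```python
import numpy as np
from scipy.stats import norm

mu0, sigma, n, alpha = 530.0, 8.0, 25, 0.05    # hypothetical inputs
z_alpha = norm.ppf(1 - alpha)                  # about 1.645 for alpha = 0.05

# Critical value on the scale of the sample mean: C1 = mu0 + z_alpha * sigma / sqrt(n)
C1 = mu0 + z_alpha * sigma / np.sqrt(n)

# Probability that ybar exceeds C1 when H0 is true (should equal alpha)
p_exceed = norm.sf(C1, loc=mu0, scale=sigma / np.sqrt(n))
print(f"C1 = {C1:.2f},  P(ybar >= C1 | H0) = {p_exceed:.3f}")
```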
Rejection Regions for Different Alternative Hypotheses
[Figure: three panels of the sampling distribution N(μ0, σ/√n), with H0: μ = μ0 assumed true in each.]

Testing HA1: μ > μ0 — the rejection region is the upper 5% tail, beyond C1.
Testing HA2: μ < μ0 — the rejection region is the lower 5% tail, beyond C2.
Testing HA3: μ ≠ μ0 — the rejection region is split into the lower and upper 2.5% tails, beyond C3L and C3U.
One Sample Inf-11
Type I Error
[Figure: the sampling distribution with H0 true, centered at μ0, with the rejection region beyond C1 and three possible sample-mean locations marked (1), (2), and (3).]

If the sample mean is at location (1) or (3), we make the correct decision.
If the sample mean is at location (2) and H0 is actually true, we make the wrong decision.

This is called making a TYPE I error, and the probability of making this error is usually denoted by the Greek letter α.

α = P(Reject H0 when H0 is the true condition)

If C1 = μ0 + 1.645 · σ/√n, then α = 1/20 = 5/100 or .05.
If C1 = μ0 + zα · σ/√n, then the Type I error is α.
One Sample Inf-12
Type II Error
[Figure: the sampling distribution with HA1 true, centered at μ1, with the rejection region beyond C1 and three possible sample-mean locations marked (1), (2), and (3).]

If the sample mean is at location (1) or (3), we make the wrong decision.
If the sample mean is at location (2), we make the correct decision.

This is called making a TYPE II error, and the probability of making this type of error is usually denoted by the Greek letter β.

β = P(Do Not Reject H0 when HA is the true condition)
We will come back to the Type II error later.
One Sample Inf-13
Type I and II Errors
[Figure: the H0 sampling distribution (centered at μ0) and the HA1 sampling distribution (centered at μ1) with the rejection region beyond C1; the Type I error region has area α and the Type II error region has area β.]

• We want to minimize the Type I error, just in case H0 is true, and we wish to minimize the Type II error, just in case HA is true.
• Problem: A decrease in one causes an increase in the other.
• Also: We can never have a Type I or II error equal to zero!
One Sample Inf-14
Setting the Type I Error Rate
The solution to our quandary is to set the Type I Error Rate to be small, and hope for a small Type II error also. The logic is as follows:
1. Assume the data come from a population with unknown distribution but which has mean μ and standard deviation σ.
2. For any sample of size n from this population we can compute a sample mean, ȳ.
3. Sample means from repeated studies having samples of size n will have the sampling distribution of ȳ following a normal distribution with mean μ and a standard deviation of σ/√n (the standard error of the mean). This is the Central Limit Theorem.
4. If the Null Hypothesis is true and μ = μ0, then we deduce that with probability α we will observe a sample mean greater than μ0 + zα·σ/√n. (For example, for α = 0.05, zα = 1.645.) A brief simulation sketch of this deduction appears below.
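To illustrate steps 3 and 4, here is a minimal Python sketch (all numbers hypothetical) that draws repeated samples from a skewed, non-normal population whose mean equals μ0, and checks that the sample mean lands above μ0 + zα·σ/√n roughly α of the time:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
alpha, n, reps = 0.05, 40, 100_000

# Hypothetical skewed population: exponential with mean 10 (so sigma = 10 as well)
mu0, sigma = 10.0, 10.0                        # H0 is true: the population mean equals mu0
cutoff = mu0 + norm.ppf(1 - alpha) * sigma / np.sqrt(n)

# Repeated studies of size n: by the CLT, ybar is approximately N(mu0, sigma/sqrt(n))
ybars = rng.exponential(scale=mu0, size=(reps, n)).mean(axis=1)
print(f"cutoff mu0 + z_alpha*sigma/sqrt(n) = {cutoff:.2f}")
print(f"fraction of sample means above the cutoff: {(ybars > cutoff).mean():.3f} (close to {alpha})")
```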
One Sample Inf-15
Setting Rejection Regions for Given Type I Error
Following this same logic, rejection rules for all three possible alternative hypotheses can be derived.

Reject H0 if:
H1A: μ > μ0:   (ȳ − μ0) / (σ/√n) ≥ zα
H2A: μ < μ0:   (ȳ − μ0) / (σ/√n) ≤ −zα
H3A: μ ≠ μ0:   |ȳ − μ0| / (σ/√n) ≥ zα/2

Note: It is just easier to work with a test statistic that is standardized. We only need the standard normal table to find critical values.

For (1 − α)·100% of repeated studies, if the true population mean is μ0 as conjectured by H0, then the decision concluded from the statistical test will be correct. On the other hand, in α·100% of studies, the wrong decision (a Type I error) will be made.
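These three rejection rules can be packaged as a small Python helper (the function name is my own, and the example call uses illustrative numbers):

```python
import numpy as np
from scipy.stats import norm

def z_test_reject(ybar, mu0, sigma, n, alpha=0.05, alternative="greater"):
    """Return (z statistic, reject H0?) for a one-sample z test with known sigma."""
    z = (ybar - mu0) / (sigma / np.sqrt(n))
    if alternative == "greater":      # H1A: mu > mu0
        return z, z >= norm.ppf(1 - alpha)
    if alternative == "less":         # H2A: mu < mu0
        return z, z <= -norm.ppf(1 - alpha)
    if alternative == "two-sided":    # H3A: mu != mu0
        return z, abs(z) >= norm.ppf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Example call with made-up numbers
print(z_test_reject(ybar=41.2, mu0=38, sigma=5.6, n=50, alternative="greater"))
```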
One Sample Inf-16
Risk
The value 0 < α < 1 represents the risk we are willing to take of making the wrong decision (e.g., α = 0.05, α = 0.01, α = 0.001).

But what if I don't wish to take any risk? Why not set α = 0? What if the true situation is expressed by the alternative hypothesis and not the null hypothesis?

Suppose HA1 is really the true situation. Then the samples come from the distribution described by the alternative hypothesis, and the sampling distribution of the mean will have mean μ1, not μ0:

ȳ ~ N(μ1, σ/√n)

[Figure: the assumed H0 sampling distribution centered at μ0 versus the actual sampling distribution centered at μ1.]
One Sample Inf-17
The Rejection Rule
(at a Type I error probability of α)

Rule: Reject H0 if  (ȳ − μ0) / (σ/√n) ≥ zα,  or equivalently if  ȳ ≥ C1 = μ0 + zα·σ/√n,

where  Pr( (ȳ − μ0) / (σ/√n) ≥ zα ) = α.

[Figure: the "do not reject H0" region below C1 and the "reject H0" region above C1, with μ0 and μ1 marked.]

Pretty clear cut when μ1 is much greater than μ0.
One Sample Inf-18
Error Probabilities
When H0 is true:    Pr( (ȳ − μ0) / (σ/√n) ≥ zα | H0 ) = α

When HA is true:    Pr( (ȳ − μ0) / (σ/√n) ≥ zα | HA ) = 1 − β
                    Pr( (ȳ − μ0) / (σ/√n) < zα | HA ) = β

If HA is the true situation, then any sample whose mean ȳ is larger than μ0 + zα·σ/√n will lead to the correct decision (reject H0, accept HA).
One Sample Inf-19
If HA is the true situation, then any sample such that its sample mean ȳ is less than μ0 + zα·σ/√n will lead to the wrong decision (do not reject H0, i.e., reject HA).

                              True Situation
Decision                      H0: True             HA: True
Reject H0                     Type I Error (α)     Correct (1 − β)
Accept H0 (do not reject H0)  Correct (1 − α)      Type II Error (β)
One Sample Inf-20
Computing Error Probabilities
Type I error probability (Reject H0 when H0 true):
    Pr( (ȳ − μ0) / (σ/√n) ≥ zα | H0 ) = α

Power of the test (Reject H0 when HA true):
    Pr( (ȳ − μ0) / (σ/√n) ≥ zα | HA ) = 1 − β

Type II error probability (Reject HA when HA true):
    Pr( (ȳ − μ0) / (σ/√n) < zα | HA ) = β
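A short Python sketch of how these three probabilities can be evaluated for a specific alternative μ1; it uses the numbers from the worked example on the following slides (μ0 = 38, μ1 = 40, n = 50, and σ taken equal to s = 5.6, as the slides do):

```python
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma, n, alpha = 38.0, 40.0, 5.6, 50, 0.05
z_alpha = norm.ppf(1 - alpha)
shift = (mu1 - mu0) / (sigma / np.sqrt(n))   # mean of the standardized statistic under HA

type_I = norm.sf(z_alpha)                    # = alpha by construction
power  = norm.sf(z_alpha - shift)            # Pr(reject H0 | HA true) = 1 - beta
beta   = norm.cdf(z_alpha - shift)           # Pr(do not reject H0 | HA true)
print(f"alpha = {type_I:.3f}, power = {power:.3f}, beta = {beta:.3f}")
```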
One Sample Inf-21
Example
Sample Size: n = 50
Sample Mean: ȳ = 40.1
Sample Standard Deviation: s = 5.6

H0: μ = μ0 = 38
HA: μ > 38

What risk are we willing to take that we reject H0 when in fact H0 is true?
P(Type I Error) = α = .05      Critical value: z.05 = 1.645

Rejection region:  (ȳ − μ0) / (σ/√n) ≥ zα

(40.1 − 38) / (5.6/√50) = 2.651 > 1.645
Conclusion: Reject H0
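The arithmetic can be checked with a few lines of Python (scipy assumed available; s is used in place of σ, as on the slide):

```python
import numpy as np
from scipy.stats import norm

n, ybar, s, mu0, alpha = 50, 40.1, 5.6, 38.0, 0.05

z = (ybar - mu0) / (s / np.sqrt(n))   # test statistic, using s in place of sigma
z_crit = norm.ppf(1 - alpha)          # about 1.645
p_value = norm.sf(z)                  # upper-tail p-value, for reference
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {z > z_crit}, p = {p_value:.4f}")
```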
One Sample Inf-22
Type II Error.
To compute the Type II error for a hypothesis test, we need to have a specific alternative hypothesis stated, rather than a vague alternative.

Vague                 Specific
HA: μ > μ0            HA: μ = μ1 > μ0
HA: μ < μ0            HA: μ = μ1 < μ0
HA: μ ≠ μ0
HA: μ ≠ 10            HA: μ = 5 < 10   or   HA: μ = 20 > 10

Note: As the difference between μ0 and μ1 gets larger, the probability of committing a Type II error (β) decreases.

Significant Difference: Δ = μ0 − μ1
One Sample Inf-23
Computing the probability of a Type II Error (β)

H0: μ = μ0  vs.  HA: μ = μ1 > μ0:     β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) )
H0: μ = μ0  vs.  HA: μ = μ1 < μ0:     β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) )
H0: μ = μ0  vs.  HA: μ = μ1 ≠ μ0:     β = Pr( Z ≤ zα/2 − |μ0 − μ1| / (σ/√n) )

Δ = |μ0 − μ1| = critical difference
1 − β = Power of the test

[Figure: sampling distributions under H0 (centered at μ0) and HA (centered at μ1), illustrating β for the one-sided and two-sided cases.]
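These formulas can be wrapped in a small Python helper (the function name and the trial numbers are my own; the two-sided case uses the approximation shown above, which ignores the far tail):

```python
import numpy as np
from scipy.stats import norm

def type_II_error(mu0, mu1, sigma, n, alpha=0.05, two_sided=False):
    """beta = Pr(Z <= z_crit - |mu0 - mu1| / (sigma/sqrt(n))), per the formulas above."""
    z_crit = norm.ppf(1 - alpha / 2) if two_sided else norm.ppf(1 - alpha)
    return norm.cdf(z_crit - abs(mu0 - mu1) / (sigma / np.sqrt(n)))

# Hypothetical numbers: H0: mu = 100 against the specific alternative mu1 = 103
print(type_II_error(mu0=100, mu1=103, sigma=10, n=30))
print(type_II_error(mu0=100, mu1=103, sigma=10, n=30, two_sided=True))
```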
One Sample Inf-24
Example: Power
H0: μ = μ0 = 38        n = 50
HA: μ = μ1 = 40        ȳ = 40.1,  s = 5.6
Pr(Type I error) = α = .05,   zα = 1.645

β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) )
  = Pr( Z ≤ 1.645 − Δ / (σ/√50) )

Assuming s is actually equal to σ (usually what is done):

β = Pr( Z ≤ 1.645 − (2/5.6)·√50 ) = Pr( Z ≤ −0.88 ) = 0.1894
One Sample Inf-25
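The final probability above can be checked in a couple of lines of Python:

```python
import numpy as np
from scipy.stats import norm

beta = norm.cdf(1.645 - (2 / 5.6) * np.sqrt(50))    # Pr(Z <= -0.88)
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")  # beta is roughly 0.19
```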
Power versus Δ (fixed n and σ)

Type II error:   β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) ) = Pr( Z ≤ zα − Δ / (σ/√n) )

As Δ gets larger, it is less likely that a Type II error will be made.

[Figure: β decreasing from 1 − α toward 0 as Δ increases.]
One Sample Inf-26
Power versus n (fixed Δ and σ)

Type II error:   β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) ) = Pr( Z ≤ zα − Δ / (σ/√n) )

As n gets larger, it is less likely that a Type II error will be made.

[Figure: β decreasing from 1 − α toward 0 as n increases.]

What happens for larger σ?
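A small sketch of how power changes with n for a fixed Δ and σ (the Δ and σ values here are hypothetical):

```python
import numpy as np
from scipy.stats import norm

def power(delta, sigma, n, alpha=0.05):
    """Power = 1 - beta = Pr(Z > z_alpha - delta / (sigma/sqrt(n))) for a one-sided test."""
    return norm.sf(norm.ppf(1 - alpha) - delta / (sigma / np.sqrt(n)))

delta, sigma = 3.0, 10.0                 # hypothetical critical difference and sigma
for n in (10, 25, 50, 100):
    print(f"n = {n:3d}: power = {power(delta, sigma, n):.3f}")
```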
One Sample Inf-27
Power vs. Sample Size

β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) ) = Pr( Z ≤ 1.645 − Δ / (σ/√n) )

Power = 1 − β

Δ    n     β       power
1    50    .755    .245
2    50    .286    .714
4    50    .001    .999
1    25    .857    .143
2    25    .565    .435
4    25    .054    .946
One Sample Inf-28
Power Curve
See Table 4, Ott and Longnecker.

[Figure: (top) Pr(Type II error), β, plotted against Δ, falling from 1 − α toward 0; (bottom) power, 1 − β, plotted against Δ for n = 10 and n = 50.]
One Sample Inf-29
Summary

β = Pr( Z ≤ zα − |μ0 − μ1| / (σ/√n) ) = Pr( Z ≤ zα − Δ / (σ/√n) )

1) For fixed σ and n, β decreases (power increases) as Δ increases.
2) For fixed σ and Δ, β decreases (power increases) as n increases.
3) For fixed Δ and n, β decreases (power increases) as σ decreases.
One Sample Inf-30
Increasing Precision for fixed Δ increases Power

Decreasing σ (to σ1 < σ) decreases the spread in the sampling distribution of ȳ: N(μ0, σ/√n) becomes N(μ0, σ1/√n).

[Figure: the H0 and HA sampling distributions, separated by Δ, drawn with the original spread σ/√n and with the smaller spread σ1/√n.]

Note: zα·σ/√n changes. The same thing happens if you increase n. The same thing happens if Δ is increased.
One Sample Inf-31
Sample Size Determination
1) Specify the critical difference, Δ (assume σ is known).
2) Choose P(Type I error) = α and Pr(Type II error) = β based on traditional and/or personal levels of risk.
3) One-sided tests:   n = σ²(zα + zβ)² / Δ²
4) Two-sided tests:   n = σ²(zα/2 + zβ)² / Δ²

Example: One-sided test, σ = 5.6, and we wish to show a difference of Δ = .5 as significant (i.e., reject H0: μ = μ0 in favor of HA: μ = μ1 = μ0 + Δ) with α = .05 and β = .2.

n = (5.6)²(1.65 + 0.84)² / (.5)² = 777.7 ≈ 778
One Sample Inf-32