Chapter 8
Introduction to Hypothesis Testing
Hypothesis Testing
Hypothesis testing is a statistical
procedure
Allows researchers to use sample data to
draw inferences about the population of
interest
Although the details of a hypothesis test
will change from one situation to another,
the general process will remain constant
Hypothesis Testing (cont.)
For this chapter, we have to understand z-scores, probability, and the distribution of
sample means to create a new statistical
procedure known as a hypothesis test.
Hypothesis Test
A hypothesis test is a statistical method that
uses sample data to evaluate a hypothesis
about a population mean.
Underlying logic




State a hypothesis about a population
Use the hypothesis to predict the characteristics that
the sample should have
Obtain a random sample from the population
Compare the obtained sample data with the
prediction that was made from the hypothesis
A hypothesis test is typically used in the
context of a research study
Once a researcher completes a research
study, a hypothesis test is used to
evaluate the results

Details of the hypothesis test will change from
one situation to another
For now, we will focus on the most
common hypothesis tests
Situation: A researcher is using one sample
to examine one unknown population
The purpose of the research is to
determine the effect of the treatment on
the individuals in the population.
The goal is to determine what happens to
the population after the treatment is
administered.
Figure 8.1
The research situation of hypothesis testing
Copyright © 2002 Wadsworth Group. Wadsworth is an imprint of the
Wadsworth Group, a division of Thomson Learning
Assume the population forms a
normal distribution
Begin with a known population
(before the treatment)
Purpose is to determine
the effect of the treatment
on the individuals in the
population
What happens to the population
after the treatment is administered?
The basic research situation for hypothesis testing
The problem is to determine whether or
not the treatment has an effect;
The parameters are known for the
population before treatment;
The question is whether or not the
population mean is changed by the
treatment;
To help answer the question, the
researcher obtains a sample of individuals
who have received the treatment.
To simplify the hypothesis-testing situation,
one basic assumption is made about the
effect of the treatment
If the treatment has any effect, it is simply
to add a constant amount to (or subtract a
constant amount from) each individual’s
score.
Remember a constant will not change the
shape of the population, nor will it change
the standard deviation
The population after the treatment will also
have the same shape as the original
population and the same s.d.
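A quick numeric check of this assumption, as a minimal sketch. The scores below are made-up illustrative values (not data from the chapter), and the constant of 3 is an arbitrary, hypothetical treatment effect.

```python
import math
import statistics

# Hypothetical scores before treatment (illustrative values only)
before = [22, 24, 26, 28, 30]

# Assumed treatment effect: add the same constant to every score
effect = 3
after = [score + effect for score in before]

mean_before = statistics.mean(before)
mean_after = statistics.mean(after)
sd_before = statistics.pstdev(before)  # population standard deviation
sd_after = statistics.pstdev(after)

print(mean_before, mean_after)  # the mean shifts by the constant
print(sd_before, sd_after)      # the standard deviation is unchanged
```

Only the location of the distribution moves; the spread, and therefore the shape, stays the same.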
The sample in the research study
The goal of the hypothesis test is to
determine whether or not the treatment
has any effect on the individuals in the
population
Because the populations are usually too
big, we use a sample.
The hypothesis test will use the sample to
test a hypothesis about the unknown
population mean.
Because a hypothesis test is a formalized
procedure that follows a standard series of
operations,
Researchers have a standardized method
for evaluating the results of their research
studies;
Other researchers will understand how the
data were evaluated and how conclusions
were reached.
Hypothesis test formal structure
Will use a four-step process
Will be used throughout the rest of the
book
Example 8.1
Psychologists note that stimulation during
infancy can have profound effects on the
development of infant rats.
Based on data, one might theorize that
increased stimulation early in life can be
beneficial.
Could this theory be applied to human infants?
Mean weight of 2-year-olds is μ = 26 lbs,
with σ = 4 lbs
n = 16
Sample parents given instructions for
working with their infants
At age 2, will weigh the children
We do not know what will happen to the
mean weight for 2-year old children
Do have a sample of 16 infants that we
can be sure about.
Can use this sample to draw inferences
about the unknown population
Follow the four steps
Steps
1. State the hypothesis
2. Set the criteria for a decision
3. Collect data and compute sample
statistics
4. Make a decision
Four Steps
Step 1
State the hypotheses


Actually state two hypotheses
Both in terms of population parameters
Null hypothesis
States that the treatment has no effect.
Identified by the symbol H0
H stands for hypothesis
0 indicates that this is the zero-effect hypothesis
H0: μ infants handled = 26 pounds
Step 1
The second hypothesis is the opposite of
the null hypothesis
Called the scientific or alternative
hypothesis (H1)
States that the treatment has an effect on
the dependent variable
H1: μ infants handled ≠ 26 pounds
An alternative hypothesis simply states that
there will be some type of change
It might be necessary to specify the direction of
the effect in H1

μ > 26 pounds
This is called a directional hypothesis test
Note that both hypotheses refer to a population
whose mean is unknown

The population of infants who receive extra handling
early in life
Step 2
Set the Criteria for a Decision
Will eventually use the data from the sample to
evaluate the credibility of the null hypothesis
Will use the null hypothesis to predict the kind of
sample mean that ought to be obtained
We will determine exactly what sample means
are consistent with the null hypothesis and what
sample means are at odds with the null
hypothesis
Begin by examining all the possible
sample means that could be obtained if
the null hypothesis is true
Distribution of sample means should be
centered at μ = 26
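The simulation below is a rough sketch of what "all the possible sample means" looks like for Example 8.1. It assumes a normal population with μ = 26 and σ = 4 and samples of n = 16, as stated in the example. The standard error formula σ/√n is carried over from the earlier material on the distribution of sample means, and the number of simulated samples and the random seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

MU = 26               # population mean if H0 is true (pounds)
SIGMA = 4             # population standard deviation (pounds)
N = 16                # sample size
NUM_SAMPLES = 10_000  # number of simulated samples (arbitrary choice)

# Theoretical standard error of the mean: sigma / sqrt(n)
standard_error = SIGMA / N ** 0.5

# Draw many samples of size N and record each sample mean
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

print(f"theoretical standard error: {standard_error}")
print(f"simulated center: {center:.2f}, simulated spread: {spread:.2f}")
```

The simulated sample means should pile up around 26 with a spread close to the theoretical standard error of 1 pound.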
The distribution of sample means is then
divided into two sections.
1. Sample means that are likely to be
obtained if H0 is true

Those close to the null hypothesis
2. Sample means that are very unlikely to
be obtained if H0 is true

Those that are very different from the null
hypothesis
The high-probability samples are located in the
center of the distribution and have sample
means close to the value specified in the null
hypothesis.
The low-probability samples are located in the
extreme tails of the distribution.
After the distribution has been divided in this
way, we can compare our sample data with the
values in the distribution
We can determine whether our sample mean is
consistent with the null hypothesis
Figure 8.2
The set of potential samples is divided into
those that are likely to be obtained and
those that are very unlikely if the null
hypothesis is true.
Figure 8.2
The distribution of sample means if the null hypothesis is true
Alpha Level
To find the boundaries that separate the
high-probability samples from the low-probability samples, we must define
exactly what is meant by “low” probability
and “high” probability.
This is accomplished by selecting a
specific probability value, which is known
as the level of significance or the alpha
level for the hypothesis test.
The alpha (α) value is a small probability
that is used to identify the low-probability
samples.

α = .05 (5%)
α = .01 (1%)
α = .001 (0.1%)
With α = .05 (5%), we will separate the
most likely 95% of the sample means (the
central values) from the most unlikely 5%
(the extreme values)
Extreme Values
The extremely unlikely values, as defined
by the alpha level, make up what is called
the critical region
Extreme values are inconsistent with the
null hypothesis
If data produce a sample mean that is
located in the critical region, we will
conclude that the data are inconsistent
with the null hypothesis
Technically, the critical region is defined by
sample outcomes that are very unlikely to
occur if the treatment has no effect
That is, if the null hypothesis is true
Such outcomes are almost impossible if there is no
treatment effect
The boundaries for the critical region
To determine the exact location for the
boundaries that define the critical region


Use the alpha-level probability
Unit normal table
α = .05 (5%)
Find the boundaries that separate the extreme
5% from the middle 95%
Split the 5%

2.5% (or 0.0250) in each tail
z = ±1.96
Thus, for any normal distribution, the extreme
5% is in the tails of the distribution beyond z =
1.96 and z = -1.96
The values define the boundaries of the critical
region for a hypothesis test using α = .05.
Figure 8.3
The critical region for an alpha of .05
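For Example 8.1, the z boundaries can be translated back into pounds. This is a minimal sketch, assuming μ = 26, σ = 4, and n = 16 from the example, z = 1.96 from the unit normal table, and the standard error formula σ/√n from the earlier material on sample means. It also checks how much probability lies beyond the boundaries if the null hypothesis is true.

```python
from statistics import NormalDist

MU, SIGMA, N = 26, 4, 16
Z_CRITICAL = 1.96  # unit normal table value for alpha = .05, two tails

standard_error = SIGMA / N ** 0.5

# Boundaries of the critical region, expressed as sample means (pounds)
lower_bound = MU - Z_CRITICAL * standard_error
upper_bound = MU + Z_CRITICAL * standard_error

# Probability that a sample mean lands in the critical region if H0 is true
sampling_dist = NormalDist(MU, standard_error)
prob_critical = sampling_dist.cdf(lower_bound) + (1 - sampling_dist.cdf(upper_bound))

print(f"critical region: M < {lower_bound:.2f} or M > {upper_bound:.2f}")
print(f"probability in critical region: {prob_critical:.3f}")
```

Under these assumptions, a sample mean below about 24.04 pounds or above about 27.96 pounds would fall in the critical region.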
α = .01
1% or .0100 is split between the two tails
The proportion in each tail is .0050
z = ±2.58
α = .001
0.1% or .0010 is split between the two tails
The proportion in each tail is .0005
z = ±3.30
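The table lookups for the three alpha levels can be reproduced with Python's standard library. This is only a sketch: critical_z is a hypothetical helper name, and it assumes a two-tailed test in which alpha is split evenly between the tails.

```python
from statistics import NormalDist

def critical_z(alpha):
    """Two-tailed critical z: the point with alpha / 2 in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01, 0.001):
    per_tail = alpha / 2
    print(f"alpha = {alpha}: {per_tail:.4f} per tail, z = +/-{critical_z(alpha):.2f}")
```

The computed values agree with 1.96 and 2.58. For the smallest alpha level the exact value is about 3.29, so the 3.30 listed here presumably reflects rounding in the unit normal table.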
Step 3
Collect Data and Compute Sample Statistics
Collect sample data
Collect the data after the sample has been
selected

Assures an honest objective evaluation of data
Raw data are summarized with the appropriate
statistics


Compute the sample mean (in this example)
Compare the sample mean with the null hypothesis
To compare the sample mean with the null
hypothesis, compute a z-score that describes
exactly where the sample mean is located
relative to the hypothesized population mean
from H0
z = (sample mean - hypothesized population mean) / (standard error between M and μ)
z = (M - μ) / σM
M = sample mean
μ = hypothesized population mean from H0
σM = standard error between M and μ
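The z-score computation can be sketched as a small function. The values μ = 26, σ = 4, and n = 16 come from Example 8.1, but the transcript does not report the sample mean, so the value of 30 pounds used below is purely hypothetical. The function name z_score is also just an illustrative choice.

```python
def z_score(sample_mean, mu, sigma, n):
    """Locate a sample mean relative to mu, in standard-error units."""
    standard_error = sigma / n ** 0.5  # standard error of the mean
    return (sample_mean - mu) / standard_error

# Hypothetical sample mean (not given in the transcript)
HYPOTHETICAL_SAMPLE_MEAN = 30

z = z_score(HYPOTHETICAL_SAMPLE_MEAN, mu=26, sigma=4, n=16)
print(z)  # 4.0
```

With a standard error of 1 pound, a sample mean 4 pounds above the hypothesized mean sits 4 standard errors away from it.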
Step 4
Make a Decision
Use the z-score value obtained in Step 3
to make a decision about the null
hypothesis according to the criteria
established in Step 2
Two possible decisions
Reject the null hypothesis
Sample data fall into the critical region
Fail to reject the null hypothesis
Sample data do not fall into the critical region
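The decision rule can be sketched in code as follows. This is an illustration under stated assumptions, not a definitive implementation: z_test_decision is a hypothetical helper, the test is two-tailed, the sample means of 30 and 27 pounds are invented for demonstration, and "fail to reject H0" is this sketch's label for the outcome in which the sample data do not fall in the critical region.

```python
from statistics import NormalDist

def z_test_decision(sample_mean, mu, sigma, n, alpha=0.05):
    """Return (z, decision) for a two-tailed z-test of H0: mean equals mu."""
    standard_error = sigma / n ** 0.5
    z = (sample_mean - mu) / standard_error
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    if abs(z) > z_critical:
        return z, "reject H0"      # z is in the critical region
    return z, "fail to reject H0"  # z is not in the critical region

# Hypothetical sample means for the Example 8.1 setup (mu = 26, sigma = 4, n = 16)
print(z_test_decision(30, mu=26, sigma=4, n=16))
print(z_test_decision(27, mu=26, sigma=4, n=16))
```

The first hypothetical sample gives z = 4.0, which lies beyond the critical boundary, so the null hypothesis is rejected. The second gives z = 1.0, which lies in the central region, so it is not.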
Rejecting the Null Hypothesis vs.
Proving the Alternative Hypothesis
The reason for focusing on the null
hypothesis as compared to the alternative
hypothesis comes from the limitations of
inferential logic
Remember that we want to use the
sample data to draw conclusions, or
inferences, about a population
Logically, it is easier to demonstrate that a
universal (population) hypothesis is false
than to demonstrate that it is true
It would be difficult to state “the treatment
has an effect” as the hypothesis and then
try to prove that this is true
Therefore, we state the null hypothesis
“the treatment has no effect” and try to
show that it is false
In the end, we still demonstrate that the
treatment does have an effect.
We find support for the alternative
hypothesis by disproving (rejecting) the
null hypothesis