Hypothesis Testing
Chapter 13
Hypothesis Testing

Decision-making process
 Statistics used as a tool to assist with
decision-making
 Scientific hypothesis is a statement of the
predicted relationship amongst the variables
 Null hypothesis is a statement of no
relationship amongst the variables
[Figure: Null hypothesis not rejected: the sample reared in a sterile environment and the sample reared in an enriched environment belong to the same total population.]
[Figure: Null hypothesis rejected: the samples used in the study come from two different populations, the total population of rats reared in a sterile environment and the total population of rats reared in an enriched environment.]
Hypothesis Testing
In Experimental Studies

Your research design determines the kind of
statistical test you will use.
 Experimental studies test hypotheses while
quasi-experimental studies tend to focus
more on generating hypotheses.
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Examples
Experimental | Test for cause/effect relationships | Current | High | Comparing two types of treatments for anxiety.
Quasi-experimental | Test for cause/effect relationships without full control | Current or past | Moderate to high | Gender differences in visual/spatial abilities.
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Examples
Nonexperimental correlational | Examine relationship between two variables | Current (cross-sectional) or past | Low to medium | Relationship between studying style and grade point average.
Ex post facto | Examine the effect of past event on current functioning. | Past & current | Low to medium | Relationship between history of child abuse & depression.
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Examples
Nonexperimental correlational | Examine relationship between 2 variables where 1 is measured later. | Future (predictive) | Low to moderate | Relationship between history of depression & development of cancer.
Cohort-sequential | Examine change in a variable over time in overlapping groups. | Future | Low to moderate | How mother-child negativity changed over adolescence.
Research Designs/Approaches
Type | Purpose | Time frame | Degree of control | Examples
Survey | Assess opinions or characteristics that exist at a given time. | Current | None or low | Voting preferences before an election.
Qualitative | Discover potential relationships; descriptive. | Past or current | None or low | People’s experiences of quitting smoking.
Tests of Significance
The Question | Null Hypothesis | Statistical Test
Group difference: between means of 2 different groups | H0: μg1 = μg2 | t-independent
Difference between 2 means of related groups | H0: μg1a = μg1b | t-dependent
Difference between means of 3 groups | H0: μg1 = μg2 = μg3 | ANOVA
Group relationships: between 2 variables | H0: ρxy = 0 | t-test for significance of correlation
Group relationships: between 2 correlations | H0: ρab = ρcd | t-test for significance of difference between 2 correlations
Experimental Designs

Examines differences between experimentally
manipulated groups or variables (e.g., one
group gets a certain drug and the other gets a
placebo).
 At minimum, experimental (independent)
variable has two levels (e.g., drug vs.
placebo).
– Advantage is that you can determine causality.
– Disadvantage is cost and many variables cannot
be experimentally manipulated (e.g., smoke
exposure over time).
Null Hypothesis
Significance Testing

Null hypothesis
– Results are due to “chance” (H0)

Alternative (scientific) hypothesis
– Results are due to a true “effect” (H1)

Assess
– Assuming H0 is true, what is the probability or
“chance” of obtaining the data we did?

Decide
– If the chance is small enough, reject H0 and infer
the “effect” is real.
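The assess and decide steps above can be sketched with a permutation test. It estimates how often a mean difference at least as large as the observed one would arise if H0 were true, that is, if the group labels were interchangeable. This is only an illustration, not a test named in these slides; the scores, the function name `permutation_p_value`, and the cutoff of .05 are assumptions made for the example.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate P(|mean difference| >= observed) when H0 (no effect) is true."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under H0 the group labels are interchangeable
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical scores, invented for illustration only.
enriched = [12, 14, 15, 16, 18, 19]
sterile = [8, 9, 11, 12, 13, 10]

alpha = 0.05  # assumed cutoff for "small enough"
p_value = permutation_p_value(enriched, sterile)
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"p = {p_value:.4f}: {decision}")
```

With these invented scores very few reshuffles reproduce a gap as large as the observed one, so the chance explanation is rejected. Two identical groups would give p = 1.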
Experimental Designs:
Hypothesis Testing
Type of Experimental Research Design
Number of independent variables | Number of groups or levels of the independent variable | Between subject | Within subject
One independent variable | Two groups | Independent samples t-test | Correlated t-tests
One independent variable | More than two groups | One-way ANOVA | Repeated measures ANOVA
Two independent variables | Two groups or two levels of the independent variable | Two-way ANOVA |
Two independent variables | More than two groups or more than two levels of the independent variable | Two-way ANOVA |
Parametric Vs. Non-Parametric
Statistics: Two-Sample Cases
Level of measurement | Related Samples | Independent Samples
Nominal | McNemar test | Fisher exact test; X2 test
Ordinal | Sign test; Wilcoxon matched-pairs signed-ranks test | Median test; Mann-Whitney U test
Interval | T-test for matched pairs | T-independent test
Parametric Vs. Non-Parametric
Statistics: > 2-Sample Cases
Level of measurement | Related Samples | Independent Samples
Nominal | Cochran Q test | X2 test
Ordinal | Friedman 2-way ANOVA | Kruskal-Wallis one-way ANOVA
Interval | Repeated measures ANOVA | ANOVA
Parametric Vs. Non-Parametric
Statistics: > 2-Sample Cases
Level of measurement | Correlation
Nominal | Contingency coefficient
Ordinal | Spearman rank correlation, Kendall rank correlation, etc.
Interval | Pearson’s correlation coefficient
Sampling Distribution of Mean
Difference Scores
[Figure: normal curve of the sampling distribution of mean difference scores, with the regions containing 95% of all cases and 99% of all cases marked.]
Critical Values of T

Need to determine the degrees of freedom
– df = N-2

Need to determine the p value for rejecting
the null hypothesis (alpha)
 Need to determine if this is a 1-tailed or 2-tailed level of significance.
T-Values

t(120) = 2.00, p < 0.05
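As a rough sketch of where a t value and its degrees of freedom come from, the code below computes an independent-samples t statistic with df = N - 2 and compares it with a critical value. The scores and the function name `independent_t` are invented for illustration, and the critical value 2.306 (df = 8, two-tailed, alpha = .05) is read from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2  # df = N - 2, where N is the total number of scores
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical scores, invented for illustration only.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]

t_value, df = independent_t(treatment, control)
critical_t = 2.306  # two-tailed critical value for df = 8, alpha = .05 (t table)
reject_h0 = abs(t_value) > critical_t
print(f"t({df}) = {t_value:.2f}; reject H0: {reject_h0}")
```

For a 1-tailed test the critical value would be smaller at the same alpha, which is why the number of tails has to be fixed before the comparison.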
What is one of the major
criticisms of employing
statistical tests of the null
hypothesis to determine if
effects are true?
Limitations of Statistical Tests
of the Null Hypothesis

Does not take into account the size of the
difference between means (effect size)
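One way to report the size of a difference alongside the significance test is a standardized mean difference. The sketch below uses Cohen's d (mean difference divided by the pooled standard deviation), which is a common choice but is not named in these slides; the scores and the function name `cohens_d` are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scores, invented for illustration only.
small_gap = cohens_d([5, 6, 7, 8, 9], [4, 5, 6, 7, 8])  # means differ by 1
large_gap = cohens_d([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])  # means differ by 4
print(f"d (small gap) = {small_gap:.2f}, d (large gap) = {large_gap:.2f}")
```

Unlike a p value, d does not grow just because the sample gets larger, so it describes how big the difference is rather than how unlikely it would be under H0.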
Analysis of Variance (ANOVA)

F-ratio = MSbet / MSwithin
 Essentially is the between group variance
divided by the within group variance.
 If the groups come from similar
populations, the variances between the
groups will be similar to the variance within
groups (null hypothesis is not rejected).
ANOVA

Between group variance consists of:
– Variability due to the effect of the independent
variable (treatment effect)
– Variability due to chance factors

Within group variance consists of:
– Variability in data within the treatment groups,
which is attributed to chance: if the treatment effect were
consistent, all subjects within a treatment group
would experience a similar magnitude of effect.
Analysis of Variance (ANOVA)

F-ratio = MSbet / MSwithin
 The MS refers to the mean square and is the
sums of squares divided by the appropriate
degrees of freedom.
 Df for MSbet is the number of groups minus 1.
 Df for MSwithin is the total number of scores in
the experiment minus the number of groups.
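The mean squares and degrees of freedom described above can be computed directly. This is a minimal sketch for a one-way between-subjects design; the three groups of scores and the function name `one_way_anova` are invented for illustration.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Sum of squares between groups: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares within groups: scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1               # number of groups minus 1
    df_within = len(all_scores) - len(groups)  # total scores minus number of groups
    ms_between = ss_between / df_between       # MSbet
    ms_within = ss_within / df_within          # MSwithin
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores, invented for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_bet, df_within = one_way_anova(groups)
print(f"F({df_bet},{df_within}) = {f_ratio:.2f}")
```

Here the sums of squares are 26 between and 6 within, so MSbet = 26/2 = 13, MSwithin = 6/6 = 1, and the F-ratio is 13 on 2 and 6 degrees of freedom.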
ANOVA
MSbet = treatment effect + chance variability
MSwithin = chance variability

Ratio will be approximately 1 if there is no treatment effect
 F(2,144) = 5.56, p < 0.05.
Two-Way ANOVA

Where you have 2 independent variables,
each having at least 2 levels. For example,
– Drug dose (none vs. 5 mg)
– Delivery mode (intravenous vs. oral)

Factorial design so you can test both main
effects and interaction effects
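The difference between main effects and an interaction can be seen in the cell means of a 2 x 2 design. The sketch below is descriptive only: it computes the contrasts that a two-way ANOVA would then test for significance. The cell means and the helper name `marginal_mean` are invented for illustration.

```python
from statistics import mean

# Hypothetical cell means for a 2 x 2 design, invented for illustration only.
cell_means = {
    ("none", "intravenous"): 10.0,
    ("none", "oral"): 10.0,
    ("5 mg", "intravenous"): 18.0,
    ("5 mg", "oral"): 12.0,
}

def marginal_mean(factor_index, level):
    """Average the cell means over the levels of the other factor."""
    return mean(v for key, v in cell_means.items() if key[factor_index] == level)

# Main effect: difference between the marginal means of one factor.
dose_effect = marginal_mean(0, "5 mg") - marginal_mean(0, "none")
mode_effect = marginal_mean(1, "intravenous") - marginal_mean(1, "oral")

# Interaction: does the effect of dose differ across delivery modes?
dose_effect_iv = cell_means[("5 mg", "intravenous")] - cell_means[("none", "intravenous")]
dose_effect_oral = cell_means[("5 mg", "oral")] - cell_means[("none", "oral")]
interaction = dose_effect_iv - dose_effect_oral

print(dose_effect, mode_effect, interaction)
```

In this invented example the drug raises scores by 8 points when given intravenously but only by 2 points when given orally, so the interaction contrast is 6; a value of 0 would mean the dose effect is the same for both delivery modes.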
Mixed Model:
2 Between Subject Factors
1 within Subject Factor

Where you have 2 independent variables, each having
at least 2 levels. For example,
– Drug dose (none vs. 5 mg)
– Delivery mode (intravenous vs. oral)

One within-subject factor with, for example, 3 levels
– Pre-treatment, 3-month follow-up, and 6-month follow-up

Factorial design so you can test main effects and
interaction effects (including 3-way interaction effects)
Rejecting the Null Hypothesis

Null hypothesis can be rejected but not
accepted
 Arguments have been made for allowing some
flexibility to conclude that the null hypothesis is
true when:
– No other studies of the phenomenon have
rejected the null hypothesis
– P value for the test of the null hypothesis is
large (e.g., > .20 or .40).
– Research design is sufficiently powerful
Errors in Statistical
Decision-Making

Type I error – falsely rejecting the null
hypothesis when it is true
– With alpha set at .05, there is a 5% chance (5 in 100)
of falsely rejecting a true null hypothesis

Type II error – failing to reject the null
hypothesis when it is false
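The 5-in-100 figure for Type I errors can be checked by simulation: draw both groups from the same population many times, so that H0 is true by construction, and count how often the test rejects anyway. As a simplifying assumption, this sketch uses a two-tailed z-test with a known population standard deviation of 1 rather than a t-test; the function name `type_i_error_rate` and all settings are illustrative.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(n_experiments=2000, n_per_group=20, alpha=0.05, seed=1):
    """Share of experiments that reject H0 although both groups share one population."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    standard_error = (2 / n_per_group) ** 0.5         # population SD assumed known (1)
    false_rejections = 0
    for _ in range(n_experiments):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / standard_error
        if abs(z) > critical_z:
            false_rejections += 1  # H0 is true here, so every rejection is a Type I error
    return false_rejections / n_experiments

rate = type_i_error_rate()
print(f"False rejection rate with H0 true: {rate:.3f}")
```

The simulated rate should land close to alpha. Estimating the Type II error rate would additionally require assuming a specific true effect size.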
External Validity
Chapter 14
Goals of Psychology
Research

Goal is to understand the underlying laws
governing the behaviour of organisms.
 The more the results of your study inform us
about these underlying laws, the more valuable
the findings.
 The importance of the findings is limited by the
study’s internal and external validity.
External Validity

Extent to which the results of the study can
be generalized across different persons,
settings, and times.
 We typically think of generalizing to specific
populations (e.g., North American
elementary school students) rather than to the
world at large.
 Best safeguard is random selection but not
usually feasible.
Threats to External Validity

Lack of population validity
 Lack of ecological validity
 Lack of time validity
Population Validity

Generalizing to the defined population (i.e.,
target population) from which the sample
was drawn.
 Sample is drawn from the experimentally
accessible population.
Population Validity
[Figure: nested diagram in which the sample lies within the experimentally accessible population, which in turn lies within the target population.]
Population Validity

Threatened by a selection by treatment
interaction:
– Treatment results may not be exactly
reproducible in target population.

Even willingness to volunteer for studies
has been shown to result in a selection by
treatment interaction effect.
Ecological Validity

Extent to which the results can be
generalized across settings or environmental
conditions.
– E.g., would the treatment effect observed in
patients recruited from a first-class medical
centre be the same as the treatment effect
observed in patients recruited from a local
community hospital?
Ecological Validity

Multiple-Treatment Interference
– Sequencing effect whereby exposure to one
treatment influences responses to another
treatment; or
– Exposure to one experiment influences
response in another experiment (e.g.,
sophisticated participants).
Ecological Validity

Hawthorne Effect
– Knowing one is in a study can affect one’s
behaviour
– Participant bias effects (e.g., social
acceptability, compliance)

Novelty or Disruption Effect
– Effects are simply due to novelty and wear off
once novelty diminishes.
Ecological Validity

Experimenter Effect
– Enthusiastic experimenter/clinician may get
different effects than a clinician who is
implementing the treatment in routine care.

Pre-testing Effect
– Administering a pre-test may sensitize the
participant in such a way that he/she responds
differently to the experiment than he/she
would have without a pre-test.
Temporal Validity

Extent to which the results would generalize
to other times
– Results might vary depending on the time
elapsed between presentation of the
independent variable and the measurement of
the dependent variable.
Temporal Validity

Seasonal Variation
– Variation that appears regularly over time (e.g.,
change in traffic accident rates between
daylight savings time and non-daylight savings
time).
– Fixed-time variation – variation at specific,
predictable time points
– Variable-time variation – don’t know when
variation will occur but when it occurs, there
are predictable responses.
Temporal Validity

Cyclical Variation
– Predictable variation within people or other
organisms

Personological Variation
– Variation in the characteristics of the individual
over time
Internal Vs. External Validity

Tends to be an inverse relationship
– As internal validity increases, external validity tends to decrease

In testing for between group differences,
you want to minimize within group
variability and maximize between group
differences
 To do so you want to ensure high control
over factors that could confound the results
but this often results in increasingly
artificial experimental conditions.
When Is External Validity Less
Important?
When you don’t need to demonstrate that
“X” will happen but rather “X” can happen.
 Sometimes the main goal is to test a theory,
and the extent to which the study reflects “real life” is
less important.
