Confidence Intervals
Studies to Review
• Loftus and Masson (1994)
• Masson and Loftus (2003)
• Jarmasz and Hollands (2009)
Using Confidence Intervals in
within-subject designs
Loftus and Masson (1994)
Intro
• Bayesian Technique
• Null Hypothesis and Significance Testing
• Competing Hypotheses
• Using Confidence Intervals
Confidence Intervals
• “How well does an observed pattern of sample means represent the
underlying pattern of population means?”
• Confidence intervals have been proposed as a replacement for formal statistical analysis
• Hypothesis testing is designed to address a restricted, convoluted, and usually
uninteresting question
• A finding of statistical significance implies that the experiment has enough
statistical power
Confidence Intervals
• Confidence intervals can provide
• An initial estimate and intuitive assessment
• The best estimate of the underlying pattern of population means
• The degree to which the observed pattern should be taken seriously
Confidence Intervals for Within Subject
Design
• Current textbooks describe CIs mainly for between-subject designs
• A CI used in a between-subject design has two useful properties
• It is determined by the same error term as the ANOVA
• A confidence interval around a sample mean and one around the difference
between 2 means are related by a factor of √2
• Confidence in patterns of means can be judged on the basis of
confidence intervals plotted around the individual means
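A minimal sketch of the two properties above: the half-width of a between-subject CI is computed from the ANOVA error term, and the half-width for a difference between two means comes out larger by a factor of √2. The function names, the MS_within value, and the group size are illustrative assumptions, not values from the studies; the critical t is a tabled two-tailed 95% value for df = 27.

```python
import math

def ci_half_width_mean(ms_within, n_per_group, t_crit):
    """Half-width of a CI around one group mean, using the ANOVA error term."""
    return t_crit * math.sqrt(ms_within / n_per_group)

def ci_half_width_difference(ms_within, n_per_group, t_crit):
    """Half-width of a CI around the difference between two group means."""
    return t_crit * math.sqrt(2 * ms_within / n_per_group)

# Hypothetical values: 3 groups of 10 subjects, so df = 27.
MS_WITHIN = 36.0    # made-up error term (assumption)
N_PER_GROUP = 10
T_CRIT = 2.052      # tabled two-tailed 95% t for df = 27

half_mean = ci_half_width_mean(MS_WITHIN, N_PER_GROUP, T_CRIT)
half_diff = ci_half_width_difference(MS_WITHIN, N_PER_GROUP, T_CRIT)
ratio = half_diff / half_mean   # sqrt(2), whatever the inputs
```

Because both intervals share one error term, the ratio does not depend on the data, which is why intervals plotted around individual means can be used to judge differences between them.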
A Hypothetical Experiment
• Measures the effects of study time on free recall
• Participants are presented a list of 20 words
• 1, 2, or 5s per word
• Interested in the relationship between study time and words recalled
Between-Subject Data
• Analyzing data as if it were between subjects
• N=30, 10 subjects per group in 3 groups
• Each group participates in 1 of the 3 time conditions with recall
recorded
Between Subject Confidence Intervals
Within-Subject Data
• Same study run with 10 subjects
• Each subject completed every condition
Creating a Confidence Interval
• Ex. Creating a confidence interval for the 1s condition
• A standard CI would include the same inter-subject variability that makes up
the between-subject error variance
• Such a CI would give a different impression than the within-subject
ANOVA
• This mismatch occurs because inter-subject variance is irrelevant to a
within-subject ANOVA but determines the size of the confidence interval
• Generally speaking, the ANOVA and the confidence interval are based on
different error terms, so they can convey conflicting information
A Within Subject Confidence Interval
• Ignore the inter-subject variance
• Normalize participant scores by subtracting a subject-deviation score from
each original score
• Subject-deviation score = Mi (rightmost column, Table 2) − the grand mean
M = 12.73 (bottom right, Table 2)
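The normalization step can be sketched in a few lines of Python. The function name and the small data table are made up for illustration and are not the values from Table 2.

```python
from statistics import mean

def normalize_scores(scores):
    """Remove between-subject variability from a subjects x conditions table.

    scores: list of rows, one row per subject, one column per condition.
    Each score has (subject mean - grand mean) subtracted from it.
    """
    subject_means = [mean(row) for row in scores]
    # Mean of subject means equals the grand mean because all rows have equal length.
    grand_mean = mean(subject_means)
    return [
        [x - (m_i - grand_mean) for x in row]
        for row, m_i in zip(scores, subject_means)
    ]

# Made-up recall scores: 4 subjects (rows) x 3 study-time conditions (columns).
raw = [
    [10, 13, 13],
    [6, 8, 10],
    [22, 23, 24],
    [2, 4, 9],
]
normalized = normalize_scores(raw)
```

After normalization every row has the same mean as the grand mean, while each column (condition) mean is the same as in the raw table.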
A Within Subject Confidence Interval
• Each subject's pattern of scores and the condition means remain unchanged
• Each subject now has the same mean, equal to the grand mean
• Graphing the normalized data now shows the pattern without between-subject variability
• 2 sources of variability left
• Condition Variance
• Interaction Variance
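As a sketch of how the remaining interaction variance feeds the interval: a within-subject CI of the form M_j ± t · sqrt(MS_SxC / n) uses the subject-by-condition interaction mean square as its error term. The function names and data below are hypothetical, and the critical t is a tabled two-tailed 95% value for df = (4 − 1)(3 − 1) = 6.

```python
import math
from statistics import mean

def interaction_mean_square(scores):
    """MS for the subject x condition interaction (one within-subject factor)."""
    n = len(scores)        # number of subjects
    j = len(scores[0])     # number of conditions
    grand = mean([x for row in scores for x in row])
    subj_means = [mean(row) for row in scores]
    cond_means = [mean(col) for col in zip(*scores)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subj = j * sum((m - grand) ** 2 for m in subj_means)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_inter = ss_total - ss_subj - ss_cond   # what is left is the interaction
    return ss_inter / ((n - 1) * (j - 1))

def within_subject_half_width(scores, t_crit):
    """Half-width of the CI plotted around each condition mean."""
    return t_crit * math.sqrt(interaction_mean_square(scores) / len(scores))

# Made-up scores: 4 subjects x 3 conditions (not the paper's data).
raw = [
    [10, 13, 13],
    [6, 8, 10],
    [22, 23, 24],
    [2, 4, 9],
]
T_CRIT = 2.447   # tabled two-tailed 95% t for df = 6
half_width = within_subject_half_width(raw, T_CRIT)
```

The large differences between subjects in this table never enter the half-width, which matches the error term a within-subject ANOVA would use.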
Conclusion
• Between and within subject confidence intervals both provide:
• Information consistent with an ANOVA
• A clear picture of underlying patterns between means
• A picture of statistical power
• Each experiment has its own challenges
• Specific hypothesis to be evaluated
• Assumptions being met or violated
• Sources of variance that vary in importance
• “No one set of algorithmic rules can appropriately cover all
situations” or “it depends”
Using Confidence Intervals for
Graphically Based Data
Interpretation
Masson and Loftus (2003)
Intro
• Null hypothesis significance testing(NHST) has been heavily debated
• The goal is to enhance alternatives to NHST for data interpretation
• Prior research has tried to address between/within subject design
and confidence intervals
• This paper considers how to apply a graphical approach to mixed
designs
• “A rule of thumb: plotted means whose confidence intervals overlap
by no more than about half the distance of one side of an interval can
be deemed to differ under NHST”
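The rule of thumb can be written as a small check. This sketch assumes both means carry the same half-width, as they would when the intervals come from a pooled error term; the function names and the 0.5 default are illustrative readings of "about half the distance of one side".

```python
def overlap_fraction(mean_a, mean_b, half_width):
    """Overlap of two equal-width CIs, as a fraction of one side of an interval."""
    overlap = 2 * half_width - abs(mean_a - mean_b)
    return max(overlap, 0.0) / half_width

def differ_by_rule_of_thumb(mean_a, mean_b, half_width, max_overlap=0.5):
    """True when the intervals overlap by no more than half of one side."""
    return overlap_fraction(mean_a, mean_b, half_width) <= max_overlap

# Hypothetical means with a half-width of 2.
close_call = differ_by_rule_of_thumb(10, 13, 2)   # overlap is exactly half a side
too_close = differ_by_rule_of_thumb(10, 12, 2)    # overlap is a full side
```

With equal half-widths h, the rule amounts to requiring a gap of at least 1.5h between the means, close to the √2 · h half-width of a CI on the difference between two means.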
An Example from prior research
• Similar explanation to Loftus & Masson 1994
• Example of using the Stroop task
• Emphasize that confidence intervals for within-subject research only
support inferences about patterns of means across conditions, not
inferences about population means
Assumptions
• Between subject – homogeneity of variance
• Within-subject – sphericity assumption
• An ANOVA would calculate an ε (epsilon) value to correct the degrees of
freedom if sphericity is violated
• Researchers suggest calculating individual confidence intervals if ε < .75;
however, a negative variance estimate can result
• To avoid this, a researcher could construct confidence intervals on
single-degree-of-freedom contrasts
Example of individual CIs
• Table 2 data produces an ε of < .6 and violates sphericity
Multifactor Designs
Jarmasz and Hollands (2009)
Multifactor Designs
• Major considerations when trying to graphically present multifactor
designs
• How to illustrate main effects
• How to illustrate interactions
• How to handle violations of homogeneity and sphericity
• If violations occur, transform data to reduce heterogeneity of variance
Designs with 2 levels
• Factorial designs raise the question of which mean square error (MSE)
should be used for confidence intervals
• A purely between-subject design has a single error term, MSwithin
• This gives a researcher a single confidence interval
• A violation of homogeneity of variance could call for separate intervals
Contrasts
• “Researchers consider an effect produced by a linear combination of
means where that combination is defined by a set of weights applied
to the means.”
• These weights must sum to 0
• Contrast effect
• A difference between 2 means
• A main effect
• An interaction effect
• Computing a confidence interval for this
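A minimal sketch of a contrast and its interval, using the textbook form estimate ± t · sqrt(MSE · Σw² / n) for equal cell sizes. The function name, means, error term, and n are hypothetical; which MSE and df apply depends on the design.

```python
import math

def contrast_ci(means, weights, ms_error, n, t_crit):
    """CI for a linear contrast of means with equal n per mean.

    Returns (lower, upper). Weights must sum to zero.
    """
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to 0")
    estimate = sum(w * m for w, m in zip(weights, means))
    half = t_crit * math.sqrt(ms_error * sum(w * w for w in weights) / n)
    return estimate - half, estimate + half

# Hypothetical example: compare the first and last of three condition means.
lower, upper = contrast_ci(
    means=[10, 12, 14],
    weights=[-1, 0, 1],
    ms_error=2.0,     # made-up error term
    n=4,              # observations per mean
    t_crit=2.447,     # tabled two-tailed 95% t for df = 6
)
excludes_zero = lower > 0 or upper < 0
```

A difference between two means, a main effect, and an interaction can all be expressed this way by changing only the weights.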
Graphing Contrasts
• After calculating weights and creating confidence intervals, graphs would be
plotted normally
• If a confidence interval does not include 0, it is thought to be a significant
interaction
• E.g. Encoding interaction with condition A2
Designs with 2 levels – Within Subjects
• Unlike a between-subject study, a within-subject design includes
multiple MS error terms
• One way to calculate a confidence interval is to pool the error terms: sum
their sums of squares and divide by the sum of their degrees of freedom
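Pooling can be sketched as follows, assuming the usual definition of a pooled mean square as summed sums of squares over summed degrees of freedom. The function name and the three error terms are made up for illustration.

```python
def pooled_mean_square(error_terms):
    """Pool several error terms given as (sum_of_squares, df) pairs."""
    total_ss = sum(ss for ss, _ in error_terms)
    total_df = sum(df for _, df in error_terms)
    return total_ss / total_df

# Hypothetical error terms from a two-factor within-subject design:
# subjects x A, subjects x B, and subjects x A x B.
terms = [(30.0, 9), (24.0, 9), (18.0, 9)]
pooled = pooled_mean_square(terms)   # 72 / 27
```

The pooled value and the pooled degrees of freedom would then stand in for the single error term when the interval is computed.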
Designs with 2 levels – Within Subjects
• What if you should not pool all error terms?
• Pool the error terms that are within a factor of about 2 of one another and
compute an individual CI for each of the others
• The individual CI only applies to the effect it is associated with
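One way to read the pooling guideline is as a ratio check on the mean squares. The helper below is a hypothetical sketch of that reading, not a procedure taken from the paper; the threshold of 2 comes from the slide.

```python
def within_factor(mean_squares, max_ratio=2.0):
    """True when the largest MS is at most max_ratio times the smallest."""
    return max(mean_squares) <= max_ratio * min(mean_squares)

# Hypothetical MS error terms.
poolable = within_factor([2.0, 3.1, 3.9])       # largest is under twice the smallest
not_poolable = within_factor([2.0, 3.0, 5.0])   # 5.0 is more than twice 2.0
```

When the check fails, the outlying term would get its own interval, which then applies only to the effect that term tests.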
Mixed Designs
• One between subject and one within subject factor
• This means pooling MS error terms would not be possible
• Separate confidence intervals must be constructed for each
Designs with 3 or more Factors
• First plot means with CIs based on pooled MS error estimates, then
plot normally
Confidence Intervals in
Repeated-Measures Designs:
The Number of Observations
Principle
Intro
• CIs are a well established statistical method
• They support inferences well in between-subject designs
• Using CIs in repeated-measures designs has been proposed and
accepted
• The size of a CI is affected by the number of observations that make up
the mean
• The Loftus and Masson papers do not clearly describe how to calculate n
• Most readers assume n is the total number of participants
How to Calculate n
• N or number of observations
• Number of participants
• Multiplied by the product of the number of levels of all factors
• Divided by the number of levels for the effect of interest
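The counting rule above can be expressed as a one-line calculation. The function and parameter names are illustrative, and the design in the example is hypothetical.

```python
from math import prod

def number_of_observations(n_participants, levels_all_factors, levels_effect):
    """Observations behind each plotted mean for a repeated-measures effect.

    levels_all_factors: number of levels of every factor in the analysis.
    levels_effect: number of levels (cells) of the effect of interest.
    """
    return n_participants * prod(levels_all_factors) // levels_effect

# Hypothetical design: 12 participants, two RM factors with 2 and 3 levels.
n_main_effect = number_of_observations(12, [2, 3], 3)   # main effect of the 3-level factor
n_interaction = number_of_observations(12, [2, 3], 6)   # 2 x 3 interaction cells
```

Here each mean of the 3-level factor averages over 12 participants and the 2 levels of the other factor, giving 24 observations, while each interaction cell rests on 12.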
New CI Formula
• MSRxS = the MSE for the RM factor R (the R × subjects interaction)
• L = product of the levels of all RM factors in the analysis
• r = the number of levels of R
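The formula itself did not survive in this transcript. A reconstruction consistent with the definitions above and with the rule for calculating n would be the following, where n (number of participants), M_j (mean of level j of R), and t_crit (critical t on the degrees of freedom of the R × S term) are labels assumed here rather than taken from the slide:

```latex
\mathrm{CI} = M_j \pm t_{\mathrm{crit}} \sqrt{\frac{MS_{R \times S}}{n \cdot L / r}}
```

The denominator n · L / r is the number of observations behind each mean of R: participants times the product of the levels of all RM factors, divided by the levels of R.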
Special Scenarios
• IM Factors in Mixed IM-RM Designs
• Mixed IM-RM Interactions
• Repeated Measures Main Effect
• CI(Direction)
Ex. IM Factors in Mixed IM-RM Designs
• Multiple IM factors have no multiplying effect on the MSE
• There is only a single MSE (Mean Square within, or MSw)
• Adding RM factors inflates the MSw for any IM factor
• This changes the formula used to calculate a CI
IM-RM Confidence Interval Formula
• Mi is the mean for the relevant
level of Factor A
• ni is the number of participants
in each level of Factor A
• L is the product of the number
of levels of all RM factors
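The formula image is missing from this transcript. A form consistent with the symbols defined above, offered as a reconstruction rather than a quotation, is shown below; t_crit (the critical t on the degrees of freedom of MSw) is a label assumed here:

```latex
\mathrm{CI} = M_i \pm t_{\mathrm{crit}} \sqrt{\frac{MS_w}{n_i \cdot L}}
```

Dividing by n_i · L rather than n_i alone offsets the inflation of MSw that comes from each participant contributing L repeated observations.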
Generalizing the Number of Observations
Principle
• Applying the principle involves 2 steps
• First – Determine how many participants contribute to the effect
• Total participants for pure RM effects
• Participants per condition for IM effects
• Participants for IM-RM interactions
• Second – Multiply the number of participants by the product of the levels of
the remaining RM factors
• In-Article examples