Download 10-18 lecture

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Misuse of statistics wikipedia , lookup

Transcript
Review
• Social facilitation effect:
People ride stationary bikes alone or in groups.
Measure each person’s power in both conditions.
Power (W)
Subject Alone
Group
Diff
A
224
245
21
B
167
160
-7
C
289
301
12
D
257
264
7
E
348
360
12
F
207
205
-2
Mdiff = 7.2
1.
2.
Calculate difference scores
Calculate mean difference score
Effect Size
10/18
Effect Size
• If there's an effect, how big is it?
– How different is m from m0, or mA from mB, etc.?
• Separate from reliability
– Inferential statistics measure effect relative to standard error
– Tiny effects can be reliable, with enough power
– Danger of forgetting about practical importance
• Estimation vs. inference
– Inferential statistics convey confidence
– Estimation conveys actual, physical values
• Ways of estimating effect size
– Raw difference in means
– Relative to standard deviation of raw scores
Direct Estimates of Effect Size
• Goal: estimate difference in population means
– One sample: m - m0
– Independent samples: mA – mB
– Paired samples: mdiff
• Solution: use M as estimate of m
– One sample: M – m0
– Independent samples: MA – MB
– Paired samples: Mdiff
• Point vs. interval estimates
– We don't know exact effect size; samples just provide an estimate
– Better to report a range that reflects our uncertainty
• Confidence Interval
– Range of effect sizes that are consistent with the data
– Values that would not be rejected as H0
• CI is range of values for m or mA – mB consistent with data
– Values that, if chosen as null hypothesis, would lead to |t| < tcrit
• One-sample t-test (or paired samples):
– Retain H0 if t =
M - m0
< tcrit i.e.
SE
M - m0 < tcrit × SE
-4
p
p
0.0 0.4 0.8
0.0 0.4 0.8
– Therefore any value of m0 within tcritSE of M would not be rejected
0.0 0.4 0.8
0.0 0.4p 0.8
Computing Confidence Intervals
-4 -2
tcritSE
M M
m
m
m
-2 00 -4 -400 2 -2 -22 4 00
M
z
– tcritSE
z
m
004
M +ztcritSE
z
2
2
4
4
Formulas for Confidence Intervals
• Mean of a single population (or of difference scores)
M ± tcritSE
• Difference between two means
(MA – MB) ± tcritSE
• Effect of sample size
– Increasing n decreases standard error
– Confidence interval becomes narrower
– More data means more precise estimate
• Always use two-tailed critical value
– p(|tdf| > tcrit) = a/2
– Confidence interval has upper and lower bounds
– Need a/2 probability of falling outside either end
Interpretation of Confidence Interval
• Pick any possible value for m (or mA – mB)
• IF this were true population value
– 5% chance of getting data that would lead us to falsely reject that value
– 95% chance we don’t reject that value
• For 95% of experiments, CI will contain true population value
– "95% confidence"
–
–
–
–
0.0 0.4 0.8
p
• Other levels of confidence
Can calculate 90% CI, 99% CI, etc.
Correspond to different alpha levels: confidence = 1 – a
Leads to different tcrit: t.crit = qt(1-alpha/2,df,low=FALSE)
Higher confidence requires wider intervals (tcrit increases)
tcritSE
• Relationship to hypothesis testing
– If m0 (or 0) is not in the confidence
interval,
M
M then we reject H0
m
-4
-2
00
M – tcritSE
z
2
4
M + tcritSE
Standardized Effect Size
• Interpreting effect size depends on variable being measured
– Improving digit span by 2 more important than for IQ
• Solution: measure effect size relative to variability in raw scores
• Cohen's d
– Effect size divided by standard deviation of raw scores
– Like a z-score for means
Samples
d
dTrue
Estimated
One
Independent
Paired
m - m0
s
m A - mB
s
m diff
s
M - m0
M A - MB
M diff
MS
MS
MS
Meaning of Cohen's d
• How many standard deviations does the mean change by?
• Gives z-score of one mean within the other population
– (negative) z-score of m0 within population
– z-score of mA within Population B
• pnorm(d) tells how many scores are above other mean
(if population is Normal)
-4
m
-2 0
m
0
z
2
4
0.0 0.4 0.8
ds
0.0 0.4 0.8
p
p
0.0 0.4 0.8
– Fraction of scores in population that are greater than m0
– Fraction of scores in Population A that are greater than mB
-4
ds
-4-2
m
m
0 2A
z
z
-20 B
24
4
Cohen's d vs. t
Samples
Statistic
d
t
One
Independent
Paired
M - m0
M A - MB
M diff
MS
M - m0
MS n
MS
M A - MB
æ
ö
MSç n1 + n1 ÷
è A
Bø
MS
M diff
MS n
• t depends on n; d does not
• Bigger n makes you more confident in the effect,
but it doesn't change the size of the effect