INF397C: Introduction to Research in Information Studies
Spring, 2005
Day 13
R. G. Bias | School of Information | SZB 562BB | Phone: 512 471 7046 | [email protected]
More resources -- SE
• Dr. Phil Doty’s 26-minute online tutorial on standard error of the mean. Very helpful:
  http://cobra.gslis.utexas.edu:8080/ramgen/Content2/faculty/doty/research/dsmbb.rm
  – Note, I had not talked of “expected value.” When you hear that “M is the expected value of µ,” you can substitute “M is used to estimate µ.”
  – Note also we have not talked about “CV,” though last week I did say to expect that S is kind of the same general order of magnitude as, but smaller than, M. Same idea.
  – Don’t forget Dr. Doty’s page of tutorials,
    http://www.gslis.utexas.edu/~lis397pd/fa2002/tutorials.html
    where you will also find an eight-minute introduction to inferential statistics, two tutorials on confidence intervals, and one on Chi squared.
  I think one thing you’ll find interesting in these tutorials is that here is a second professor, using a different textbook (Spatz), who studied at a different school, who’s never heard me lecture (nor I him) . . . and we use much the same language to describe things. The point is, this stuff (descriptive and inferential statistics) is universal.
• Two pages of explanation of standard error of the mean:
  http://davidmlane.com/hyperstat/A103735.html
• http://research.med.umkc.edu/tlwbiostats/stnderrmean.html
More resources - Probability
• Jim asked for a visual demonstration of probability and outliers (or some such). Here’s a good one:
  http://www.eskimo.com/~cdickens/applets/pDemo/pDemoApplet.html
• Here’s a better one:
  http://www.ms.uky.edu/~mai/java/stat/GaltonMachine.html
• Here’s another:
  http://www.stattucino.com/berrie/dsl/Galton.html
• http://www.mathgoodies.com/lessons/vol6/intro_probability.html
More resources – Assorted things
• http://www.stat.berkeley.edu/~stark/Java/Html/NormHiLite.htm
  Use the slider bars! Man, don’t you wish you had had access to this tool for the last question on the midterm?!
• http://www.stat.berkeley.edu/~stark/Java/Html/StandardNormal.htm
  Play with the standard deviation slider bar!
• http://wwwstat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html
t tests
• Go to the McGraw-Hill statistics primer and read the subsections on Inferential Statistics. Not a lot of meat there, but it will help you to hear it stated in a slightly different way.
• For some examples of the use of t tests . . .
  – http://www.yogapoint.com/info/research.htm for an example of some t tests.
  – http://www.main.nc.us/bcsc/Chess_Research_Study_I.htm
    Notice how they continually say “p>.05” rather than “p<.05”! Do NOT, as they suggest at the end, “send a check for $39.95 payable to the American Chess School.”
Go find more examples, just for yourself.
Confidence Intervals
• We calculate a confidence interval for a population parameter.
• The mean of a random sample from a population is a point estimate of the population mean.
• But there’s variability! (SE tells us how much.)
• What is the range of scores between which we’re 95% confident that the population mean falls?
• Think about it – the larger the interval we select, the larger the likelihood it will “capture” the true (population) mean.
• CI = M +/- (t.05)(SE)
• See Box 12.2 on “margin of error.” NOTE: In the box they arrive at a 95% confidence that the poll has a margin of error of 5%. It is just coincidence that these two numbers add up to 100%.
CI about a mean -- example
• CI = M +/- (t.05)(SE)
• Establish the level of α (two-tailed) for the CI. (.05)
• M = 15.0, s = 5.0, N = 25
• Use Table A.2 to find the critical value associated with the df.
  – t.05(24) = 2.064
• CI = 15.0 +/- 2.064(5.0/SQRT(25))
     = 15.0 +/- 2.064
     = 12.936 – 17.064
“The odds are 95 out of 100 that the population mean falls between 12.936 and 17.064.”
(NOTE: This is NOT the same as “95% of the scores fall within this range”!)
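Here is a minimal Python sketch of the same calculation (added here, not part of the original slides; it assumes scipy is available):

```python
# Sketch: the CI-about-a-mean example, done in code (assumes scipy).
import math
from scipy import stats

M, s, N = 15.0, 5.0, 25
SE = s / math.sqrt(N)                      # standard error of the mean = 1.0
t_crit = stats.t.ppf(1 - 0.05 / 2, N - 1)  # two-tailed critical t, df = 24, ~2.064
lower, upper = M - t_crit * SE, M + t_crit * SE
print(f"95% CI: {lower:.3f} to {upper:.3f}")  # about 12.936 to 17.064
```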
Another CI example
• Hinton, p. 89.
• t test not sig.
• What if we did this via confidence intervals?
Limitations of t tests
• Can compare only two samples at a time
• Only one IV at a time (with two levels)
• But you say, “Why don’t I just run a bunch of t tests?”
  a) It’s a pain in the butt.
  b) You multiply your chances of making a Type I error. (See the simulation sketch below.)
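A small simulation sketch of point (b), added here (not from the slides; it assumes numpy and scipy): four groups are drawn from the same population, so every “significant” pairwise t test is a Type I error, and the chance of at least one ends up well above .05.

```python
# Sketch: how running many t tests inflates the familywise Type I error rate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments, n_groups, n_per_group, alpha = 2000, 4, 20, 0.05

false_alarms = 0
for _ in range(n_experiments):
    # All four groups come from the SAME population (the null is true).
    groups = [rng.normal(50, 10, n_per_group) for _ in range(n_groups)]
    # All pairwise t tests: 4 groups -> 6 comparisons.
    pvals = [stats.ttest_ind(groups[i], groups[j]).pvalue
             for i in range(n_groups) for j in range(i + 1, n_groups)]
    if min(pvals) < alpha:      # at least one false "significant" result
        false_alarms += 1

print("Familywise Type I error rate:", false_alarms / n_experiments)  # well above .05
print("If the 6 tests were independent:", 1 - (1 - alpha) ** 6)       # about .26
```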
ANOVA
• Analysis of variance, or ANOVA, or F tests, were designed to overcome these shortcomings of the t test.
• An ANOVA with ONE IV with only two levels is the same as a t test. (See the sketch below.)
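A quick sketch of that equivalence (added; assumes scipy): for two independent groups, the one-way ANOVA F is exactly the square of the t statistic, and the p values match.

```python
# Sketch: for two groups, one-way ANOVA and the independent-groups t test agree.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(50, 10, 15)   # group 1
b = rng.normal(55, 10, 15)   # group 2

t, p_t = stats.ttest_ind(a, b)   # pooled-variance t test
F, p_F = stats.f_oneway(a, b)    # one-way ANOVA with two levels

print(f"t = {t:.3f}, t squared = {t**2:.3f}, F = {F:.3f}")   # t**2 == F
print(f"p (t test) = {p_t:.4f}, p (ANOVA) = {p_F:.4f}")      # identical
```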
ANOVA (cont’d.)
• Remember back to when we first busted out some scary formulas, and we calculated the standard deviation.
• We subtracted the mean from each score, to get a feel for how spread out a distribution was – how DEVIANT each score was from the mean. How VARIABLE the distribution was.
• Then we realized if we added up all these deviation scores, they necessarily added up to zero.
• So we had two choices: we coulda taken the absolute value, or we coulda squared ’em. And we squared ’em.
Σ(X – M)²
ANOVA (cont’d.)
• Σ(X – M)²
• This is called the Sum of the Squares (SS). And when we add ’em all up and average them (well – divide by N-1), we get S² (the “variance”).
• We take the square root of that and we have S (the “standard deviation”). (See the sketch below.)
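A plain-Python sketch of those two slides, with made-up scores (added; not an example from the readings):

```python
# Sketch: deviations sum to zero, so we square them -> SS -> variance -> SD.
import math

X = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up scores
N = len(X)
M = sum(X) / N                        # the mean

deviations = [x - M for x in X]
print(sum(deviations))                # 0 (why we square instead of just summing)

SS = sum((x - M) ** 2 for x in X)     # Sum of Squares, Σ(X – M)²
variance = SS / (N - 1)               # S², the variance
S = math.sqrt(variance)               # S, the standard deviation
print(SS, variance, S)
```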
ANOVA (cont’d.)
• Let’s work through the Hinton example on p. 111.
F is . . .
• F is the variance ratio.
• F is
  – between-conditions variance / error variance
  – (systematic differences + error variance) / error variance
  – between-conditions variance / within-conditions variance
(This from Hinton, p. 112, p. 119.)
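Here is a hand-worked sketch of that ratio on made-up data (added; not Hinton’s example), just to show where the two variances come from:

```python
# Sketch: a one-way ANOVA F computed by hand on made-up data.
groups = [
    [4, 5, 6, 5],    # condition 1
    [7, 8, 6, 7],    # condition 2
    [9, 9, 8, 10],   # condition 3
]

k = len(groups)                         # number of conditions
n = len(groups[0])                      # scores per condition (equal n here)
N = k * n
grand_mean = sum(x for g in groups for x in g) / N
group_means = [sum(g) / n for g in groups]

SS_between = sum(n * (gm - grand_mean) ** 2 for gm in group_means)
SS_within = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)

MS_between = SS_between / (k - 1)       # between-conditions variance
MS_within = SS_within / (N - k)         # within-conditions (error) variance
F = MS_between / MS_within              # the variance ratio
print(F)
```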
Check out . . .
• ANOVA summary table on p. 120. This is for a ONE FACTOR ANOVA (i.e., one IV). (Maybe MANY levels.)
• Sample ANOVA summary table on p. 124.
• Don’t worry about unequal sample sizes – interpretation of the summary table is the same.
• The only thing you need to realize in Chapter 13 is that for repeated measures ANOVA, we also tease out the between-subjects variation from the error variance. (See p. 146 and 150.)
• Note, in Chapter 15, that as factors (IVs) increase, the comparisons (the number of F ratios) multiply. See p. 167, 174.
• What happens when you have 3 levels of an IV, and you get a significant F?
• Memorize the table on p. 177. (No, I’m only kidding.)
Interaction effects
• Here’s what I want you to understand about interaction effects:
  – They’re WHY we run studies with multiple IVs.
  – A significant interaction effect means the effect of one IV on the DV differs across the levels of the other IV. (See the sketch below.)
  – You can have significant main effects and insignificant interactions, or vice versa (or both sig., or both not sig.). (See p. 157, 158.)
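A tiny sketch with made-up cell means (added, not from the slides) of what that looks like: the simple effect of IV B is +2 at one level of IV A and +10 at the other, and that non-parallelism is what the interaction term tests.

```python
# Sketch: an interaction as "the effect of B depends on the level of A."
# Rows are levels of IV A (a1, a2); columns are levels of IV B (b1, b2).
cell_means = {
    ("a1", "b1"): 10, ("a1", "b2"): 12,   # effect of B at a1: +2
    ("a2", "b1"): 10, ("a2", "b2"): 20,   # effect of B at a2: +10
}

effect_of_B_at_a1 = cell_means[("a1", "b2")] - cell_means[("a1", "b1")]
effect_of_B_at_a2 = cell_means[("a2", "b2")] - cell_means[("a2", "b1")]

# If these simple effects differ, the lines in an interaction plot are not
# parallel; that difference is what a significant interaction reflects.
print(effect_of_B_at_a1, effect_of_B_at_a2)
```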
Correlation
• With correlation, we return to DESCRIPTIVE statistics. (This is counterintuitive. To me.) (Well, it’s BOTH descriptive and inferential.)
• We are describing the strength and direction of the relationship between two variables.
• And how much one variable predicts the other.
Correlation
• Formula –
  – Hinton, p. 259, or
  – S, Z, & Z, p. 393
• Two key points:
  – How much predictability does one variable provide for another. (See the sketch below.)
  – NOT causation.
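A short sketch of both points (added; made-up data, and it assumes scipy): Pearson’s r gives the strength and direction, r² (the coefficient of determination) gives how much of the variance in one variable the other predicts, and none of it says anything about causation.

```python
# Sketch: Pearson's r and the coefficient of determination on made-up data.
from scipy import stats

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score    = [52, 55, 61, 60, 68, 70, 75, 79]

r, p = stats.pearsonr(hours_studied, exam_score)
print(f"r = {r:.3f}")        # strength and direction of the linear relationship
print(f"r^2 = {r**2:.3f}")   # proportion of variance in one variable predicted by the other
print(f"p = {p:.4f}")        # r can also be tested inferentially
```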
Correlation (cont’d.)
• Go to the McGraw-Hill statistics primer
  http://highered.mcgrawhill.com/sites/0072494468/student_view0/statistics_primer.html
  and click on “Correlational Statistics.” Read the three sub-sections. I will NOT ask you to calculate a product moment correlation nor a coefficient of determination, but these are good concepts to know, and these few pages describe them well.
Let’s talk about the final
• Here’s what you’ve read:
  – Huff (How to Lie with Statistics)
  – Dethier (To Know a Fly)
  – Hinton: Ch. 1 – 15, 20
  – S, Z, & Z: Ch. 1-8, 10-13
  – Several other articles
For the final, EMPHASIZE…
• Descriptive stat
  – Measures of central tendency, dispersion
  – Z scores (both ways!)
  – Frequency distributions, tables, graphs
  – Correlation (interpret, not calculate)
• Inferential stat
  – Hypothesis testing
  – Standard error of the mean
  – t test (calculate one, for one sample; interpret others)
  – Confidence intervals (maybe calculate one)
  – Chi square (maybe one)
  – ANOVA – interpret summary table
  – Type I and II errors
Emphasize . . .
• Experimental design
  – IV, DV, controls, confounds, counterbalancing
  – Repeated measures, independent groups
  – Sampling
  – Operational definitions
  – Individual differences variable
  – Ethics of human study
  – Possible sources of bias and error variance and how to minimize/eliminate them
• Qualitative methods
  – Per Rice Lively, Gracy, Doty
  – Survey generation (from SZZ, Ch. 5)
De-emphasize
• Complicated probability calculations
• APA ethical standard (S, Z, & Z, Ch. 3)
• Content analysis (SZZ, Ch. 6)
• Calculating an ANOVA
• Nonequivalent control group design (SZZ, Ch. 11) (Indeed, de-emphasize all of Ch. 11)
• Hinton, Ch. 12