Some basic statistical tests & more
on basic statistical analysis
Communication Research
Week 11
with help from:
Carey, J & Dimmitt, C. (2003) Statistical Analysis: Is Change Real? www.umass.edu/schoolcounseling/
WelcometoAmherstMassachusetts/StatisticalAnalysis.ppt [accessed 10 Oct 2006]
http://www.statsoft.com/textbook/stathome.html
Why Statistical Analysis?




After we gather and compute our data, we want to be
sure that the scores of two groups really are different.
We want to be sure that the differences we see are
not just due to chance.
If we are basing decisions on real differences our
behavior is directed and purposeful.
If we are basing decisions on differences that are
only due to chance our behavior is random and
chaotic.
Statistical Tests


Allow us to estimate the likelihood that the
apparent differences between groups are real
and not due to chance.
These tests have the built-in capacity to take
the number of people per group and the
variability of the data into account when
making these estimates.
Measuring Variables






Variables – the things we measure, control or manipulate
Independent variables (IV) are usually those that are
manipulated
Dependent variables (DV) are only measured or registered
They differ in how well they can be measured and the type of
measurement scale used
Two or more variables are related if, in a sample of
observations, the values systematically correspond to each
other for these observations
eg height is considered related to weight because typically tall
people are heavier than short ones; IQ is related to the number
of errors in a test, if people with higher IQs make fewer errors
Why are relations between
variables considered important?



The philosophy of science holds that there is no other
way of representing “meaning” except in terms of
relations between some quantities or qualities
Statistical significance (p-value) of a result is the
probability of observing a relationship (eg between the
variables) or a difference (eg between the means) at least
as large as the one in the sample by pure chance (“luck of
the draw”) if, in the populations from which the sample was
drawn, no such relationship or difference exists
In other words, the statistical significance of a result tells
us how far we can trust that the result is not a chance
finding (ie that it is representative of the population)
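One way to make the idea of a p-value concrete is an exact permutation test. It reshuffles the scores between the two groups in every possible way and counts how often a difference in means at least as large as the observed one turns up when group membership is arbitrary. The sketch below is only an illustration: the scores are invented, and the function name `permutation_p_value` is a hypothetical helper, not something SPSS provides.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    indices = range(len(pooled))
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Try every way of choosing which scores are labelled "group A".
    for chosen in combinations(indices, len(group_a)):
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical scores, invented for illustration only.
group_a = [1, 2, 3, 4]
group_b = [5, 6, 7, 8]
p_value = permutation_p_value(group_a, group_b)
print(f"p = {p_value:.4f}")
```

Only 2 of the 70 possible splits are as extreme as the observed one, so a difference this large would be rare if the group labels meant nothing.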
Example – "Baby boys to baby girls
ratio."



Consider the following example from research on statistical
reasoning (Nisbett, et al., 1987). There are two hospitals: in the first
one, 120 babies are born every day, in the other, only 12. On
average, the ratio of baby boys to baby girls born every day in each
hospital is 50/50. However, one day, in one of those hospitals twice
as many baby girls were born as baby boys. In which hospital was it
more likely to happen?
The answer is obvious for a statistician, but as research shows, not
so obvious for a lay person: It is much more likely to happen in the
small hospital.
The reason for this is that technically speaking, the probability of a
random deviation of a particular size (from the population mean),
decreases with the increase in the sample size.
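The hospital example can be checked with the binomial distribution. The sketch below assumes each birth is independently a girl with probability 0.5, and reads "twice as many girls as boys" as "at least two thirds of the day's births are girls"; both are simplifying assumptions made for illustration.

```python
from math import comb

def prob_at_least(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# At least two thirds girls: 8 or more of 12, and 80 or more of 120.
small_hospital = prob_at_least(12, 8)
large_hospital = prob_at_least(120, 80)

print(f"small hospital (12 births):  {small_hospital:.4f}")
print(f"large hospital (120 births): {large_hospital:.6f}")
```

Under these assumptions the small hospital sees such a day roughly one day in five, while in the large hospital it is a very rare event.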
Data characteristics that help
determine the statistical test used





Type of data used – nominal, ordinal, interval,
ratio
Two groups vs more than two groups
Whether groups are matched (“paired”) or
unmatched
Whether groups are small or large
Whether the data are normally distributed
(continuous data)
Different data/variable types
Data type: Nominal
Description: Allow for only qualitative classification
– they can be measured only in terms
of whether the individual items belong
to some distinctively different
categories, but we cannot quantify or
even rank order those categories.
Example: All we can say is that two (2) individuals
are different in terms of variable A (eg of a different
race), but we cannot say which one "has more" of the
quality represented by the variable. Typical examples
of nominal variables are gender, race, color, city, etc

Data type: Ordinal
Description: Allow us to rank order the items we
measure in terms of which has less
and which has more of the quality
represented by the variable, but still
they do not allow us to say "how
much more."
Example: Eg socioeconomic status of families. For example,
we know that upper-middle is higher than middle but
we cannot say that it is, for example, 18% higher.
Also this very distinction between nominal, ordinal,
and interval scales itself represents a good example
of an ordinal variable. For example, we can say that
nominal measurement provides less information than
ordinal measurement, but we cannot say "how much
less" or how this difference compares to the
difference between ordinal and interval scales.
Different data/variable types
Data type: Interval
Description: Allow us not only to rank order the
items that are measured, but also to
quantify and compare the sizes of
differences between them. For
example, temperature, as measured
in degrees Fahrenheit or Celsius,
constitutes an interval scale.
Example: Eg we can say that a temperature of 40 degrees is
higher than a temperature of 30 degrees, and that an
increase from 20 to 40 degrees is twice as much as
an increase from 30 to 40 degrees

Data type: Ratio
Description: Are very similar to interval variables;
in addition to all the properties of
interval variables, they feature an
identifiable absolute zero point, thus
they allow for statements such as x is
two times more than y.
Example: Typical examples of ratio scales are measures of
time or space. For example, as the Kelvin
temperature scale is a ratio scale, not only can we
say that a temperature of 200 degrees is higher than
one of 100 degrees, we can correctly state that it is
twice as high. Interval scales do not have the ratio
property. Most statistical data analysis procedures do
not distinguish between the interval and ratio
properties of the measurement scales.
Example 2: After implementation of a family math
education intervention, Latino/a students’ average 4th
Grade MCAS scaled score increased from 206 to
215.

Hypothesis (Ha). The two groups are really
different.
Null hypothesis (H0). The two groups are
not different; the apparent difference is due to
chance.
Example 2: After implementation of a family math
education intervention, Latino/a Students’ average 4th
Grade MCAS score increased from 206 to 215.


The variability of the outcome data is a major factor in
determining whether the differences are real or due
to chance.
At the TAB, in a straight bet, how much would you be
willing to wager that Ha is true if you knew that, when
students retake the MCAS within a month:

90% of the time their two scores differ by less than 2
points.
90% of the time their two scores differ by less than 10
points.
90% of the time their two scores differ by less than 50
points.
Example 1: 70% of White and 40% of African
American 3rd graders score Advanced or Proficient on
the MCAS Reading Test.


The number of people in two groups is a major factor in
determining whether differences are real or due to
chance.
At the TAB, in a straight bet, how much would you be
willing to wager the hypothesis is true if you knew that:



The percentages are based on 10 students from each
group.
The percentages are based on 50 students from each
group.
The percentages are based on 100 students from each
group.
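To see how much group size matters for the 70% versus 40% comparison, here is a rough sketch using a pooled two-proportion z-test with a normal approximation. The choice of test and the helper name `two_proportion_p_value` are assumptions made for illustration, and the approximation is crude for groups as small as 10.

```python
from math import erfc, sqrt

def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value from a pooled two-proportion z-test (equal group sizes)."""
    pooled = (p1 + p2) / 2
    standard_error = sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = abs(p1 - p2) / standard_error
    # erfc(z / sqrt(2)) is the two-sided tail area of the standard normal.
    return erfc(z / sqrt(2))

p_values = {n: two_proportion_p_value(0.70, 0.40, n) for n in (10, 50, 100)}
for n, p in p_values.items():
    print(f"n = {n:3d} per group: p = {p:.5f}")
```

The same 30-point gap is unconvincing with 10 students per group but very hard to attribute to chance with 50 or 100 per group.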
Parametric vs non parametric tests




Since we have two types of data we need two types of
statistical tests.
Parametric – the DV is a continuous variable (eg age in
years) so it makes sense to calculate the mean and SD
Non-parametric – the DV is a count (nominal data) or a
ranking (ordinal data) and so it makes no sense to
calculate a mean, eg “the average gender of Australians
is 1.5”
Parametric tests are generally more powerful when their
assumptions are met, meaning that if there is a real
difference between the groups it's easier to find it with
a parametric test
Examples of parametric tests





Independent t-test, or a comparison of two means
Looks for a difference between two groups (eg men and
women) on a particular variable (eg a rating of how willing
they are to kiss on the first date)
Paired t-test compares the same people measured twice, eg
how someone feels about drink driving before they get
caught, and how they feel about it afterwards.
In an SPSS output table, the Sig (2-tailed) value is the
significance value (p-value) – the likelihood that a
difference at least this large could happen by pure chance
if there were no real difference.
If the value is less than 0.05, that chance is less than 5%,
so the difference is statistically significant at the 5%
level
Choosing a significance level



Statistical Tests do not give us information that allows
us to definitively say whether an observed difference
between groups is real or just due to chance.
Statistical Tests do give us an estimate of the
likelihood that a difference this large between groups
would result from chance alone.
We must decide what criteria we will use for deciding
whether a difference is real.
Choosing a Significance Level

We do this by choosing a Significance Level




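.25  we accept up to a 25% risk of treating a chance difference as real
.10  we accept up to a 10% risk of treating a chance difference as real
.05  we accept up to a 5% risk of treating a chance difference as real
.01  we accept up to a 1% risk of treating a chance difference as real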
T-Test in SPSS
SPSS will allow you to do all of these
tests quite easily
If the value is less than 0.05,
the chance of a difference this large
arising when there is no real
difference is less than 5%, so the
difference is statistically significant
at the 5% level.
Levene’s test checks whether
the variances of the two groups are
relatively similar. If the significance for
Levene's test is 0.05 or below, then the
‘Equal Variances Not Assumed’
t-test result (the one on the bottom) is
used. Otherwise you use the ‘Equal
Variances Assumed’ test
(the one on the top)
T-Test for Independent Samples

Remember, we need to know two other things
in order to ascertain the likelihood of chance
creating this size of a difference:



The number of people in each group
The variability of the scores
The number of people is easy, and is counted
in the frequency table (n)
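A small sketch of how group size and variability feed into the t statistic. It assumes two equal-sized groups that share the same standard deviation, and all the means, SDs and group sizes are invented for illustration; SPSS reports the same kind of statistic along with its significance value.

```python
from math import sqrt

def t_statistic(mean_a, mean_b, sd, n_per_group):
    """Independent-samples t for two equal-sized groups sharing one SD."""
    standard_error = sd * sqrt(2 / n_per_group)
    return (mean_b - mean_a) / standard_error

# The same 5-point difference under different (hypothetical) conditions.
t_baseline = t_statistic(50, 55, sd=10, n_per_group=10)
t_more_people = t_statistic(50, 55, sd=10, n_per_group=100)
t_less_spread = t_statistic(50, 55, sd=2, n_per_group=10)

print(f"baseline:     t = {t_baseline:.2f}")
print(f"more people:  t = {t_more_people:.2f}")
print(f"less spread:  t = {t_less_spread:.2f}")
```

A larger t means the difference is harder to explain by chance, so both more people per group and less variable scores make the same difference more convincing.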
Variability

In order to know
whether a difference
between two means is
important, we need to
know how much the
scores vary around the
means.
Variability



Holding the difference
between the means
constant
With High Variability the
two groups nearly
overlap
With Low Variability the
two groups show very
little overlap
Measuring Variability
[Figure: panels labelled Medium Variance, High Variance, Low Variance]
Measuring Variability



Usually it’s easier to work with the square root of the
variance.
This statistic is called the Standard Deviation.
SPSS statistical tests will calculate the SD for you
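For readers who want to see the arithmetic, the sketch below computes the sample variance and its square root, the standard deviation, with Python's standard `statistics` module. The scores are invented for illustration.

```python
from statistics import mean, stdev, variance

scores = [4, 8, 6, 5, 3, 7, 9, 6]  # hypothetical scores

score_mean = mean(scores)
score_variance = variance(scores)  # sample variance (divides by n - 1)
score_sd = stdev(scores)           # square root of the sample variance

print(f"mean = {score_mean}, variance = {score_variance}, SD = {score_sd}")
```

The SD is in the same units as the original scores, which is why it is usually easier to work with than the variance.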
ANOVA





ANOVA is an acronym: ANalysis Of VAriance.
It is an extension of the two-sample t-test, and is generally used to test
for significant differences between means.
The name is derived from the fact that in order to test for statistical
significance between means, we actually compare (or analyse)
variances.
For two-group comparisons, ANOVA will give results identical to a
t-test, but when the design is more complex, ANOVA offers numerous
advantages that t-tests cannot provide (even if you run a series of
t-tests comparing various cells of the design).
For example, it often happens in research practice that you need to
compare more than two groups (e.g., drug 1, drug 2, and placebo), or
compare groups created by more than one independent variable while
controlling for the separate influence of each of them (such as
Gender, Type of Drug, and Size of Dose).
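The sketch below works out a one-way ANOVA F statistic by hand, as the ratio of between-group to within-group variance, and checks the claim that with only two groups F equals the square of the equal-variance t statistic. The scores for the three conditions are invented, and `one_way_f` and `pooled_t` are hypothetical helper names.

```python
from statistics import mean

def one_way_f(*groups):
    """F statistic for a one-way ANOVA: between-group over within-group variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def pooled_t(a, b):
    """Equal-variance independent-samples t statistic."""
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5

# Hypothetical outcome scores for three conditions.
drug_1 = [6, 8, 7, 9, 10]
drug_2 = [5, 7, 6, 8, 9]
placebo = [3, 4, 5, 4, 4]

f_three = one_way_f(drug_1, drug_2, placebo)
f_two = one_way_f(drug_1, placebo)
t_two = pooled_t(drug_1, placebo)

print(f"three groups: F = {f_three:.3f}")
print(f"two groups:   F = {f_two:.3f}, t squared = {t_two ** 2:.3f}")
```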
Independent t-test
[Diagram: John, Robert]
Dependent t-test
[Diagram: John, John]
One-way ANOVA
[Diagram: John, Robert, Kevin, Tom]
Repeated measures ANOVA
[Diagram: John, John, John, John]
Factorial ANOVA
[Diagram: John, Robert, Kevin, Tom, Janine, Roberta, Katie, Teresa]
Mixed ANOVA
[Diagram: John, John, John, John, Janine, Janine, Janine, Janine]
Chi2 Test




Chi-square (χ²) is a non-parametric test
It uses nominal data and checks to see if there are
significant differences between/among groups compared
to what would be expected.
The crosstabulation table shows how selected variables
are related to other selected variables; the chi-square
table tells you how statistically significant that
relationship is.
Chi-square is based on the fact that for a two-way table,
we can compute the frequencies that we would expect if
there was no relationship between the variables.
Chi2 Example


Suppose we ask 20 men and 20 women to choose
between two brands of soft drink - brands A and B. If
there is no relationship between preference and
gender, then we would expect about an equal
number of choices of brand A and brand B for each
gender.
The chi-square test becomes increasingly significant
as the numbers deviate further from this expected
pattern; that is, the more this pattern of choices for
men and women differs.
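The soft drink example can be sketched as a by-hand Pearson chi-square calculation: expected counts come from the row and column totals, and the statistic sums the squared gaps between observed and expected counts. The choice counts below are invented for illustration; 3.841 is the standard critical value for 1 degree of freedom at the .05 level.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected if gender and preference were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic

# Rows: men, women. Columns: brand A, brand B. Hypothetical counts.
no_relationship = [[10, 10], [10, 10]]
strong_relationship = [[15, 5], [5, 15]]

chi_none = chi_square(no_relationship)
chi_strong = chi_square(strong_relationship)
CRITICAL_05_DF1 = 3.841

print(f"even split:   chi-square = {chi_none:.2f}")
print(f"uneven split: chi-square = {chi_strong:.2f}")
print("significant at .05:", chi_strong > CRITICAL_05_DF1)
```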
If our Chi2 test statistic has exceeded
the critical value for:

The .10 significance level it would mean that, if there
were no real relationship, there would be less than a
10% chance of seeing a difference that large.
The .05 significance level it would mean that, if there
were no real relationship, there would be less than a
5% chance of seeing a difference that large.
The .025 significance level it would mean that, if there
were no real relationship, there would be less than a
2.5% chance of seeing a difference that large.
If our Chi2 test statistic has exceeded
the critical value for:

The .01 significance level it would mean that,
if there were no real relationship, there would
be less than a 1% chance of seeing a
difference that large.
The .005 significance level it would mean that,
if there were no real relationship, there would
be less than a 0.5% chance of seeing a
difference that large.
Pearson correlation




The Pearson correlation looks for a relationship between
two variables and generates a mathematical index of the
relationship between them.
The value lies between -1.00 and +1.00, and the further the
number is from zero (in either direction) the stronger the
relationship.
Pearson correlation indicates trends - one thing increases
(or decreases) as another thing increases (or decreases).
A negative value indicates that low scores on one variable
go with high scores on the other variable, while a positive
value indicates that a high score of one variable goes with
a high score on the other variable.
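As a rough illustration, the Pearson correlation can be computed by hand from the deviations of each variable around its mean. The data below are invented to echo the earlier height and weight and IQ and errors examples, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

# Hypothetical data, invented for illustration only.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 87]
iq_scores = [90, 100, 110, 120, 130]
test_errors = [12, 10, 7, 5, 2]

r_height_weight = pearson_r(heights, weights)    # strong positive
r_iq_errors = pearson_r(iq_scores, test_errors)  # strong negative

print(f"height vs weight: r = {r_height_weight:.3f}")
print(f"IQ vs errors:     r = {r_iq_errors:.3f}")
```

Both relationships are strong; the sign only says whether the two variables move in the same or in opposite directions.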
Note that SPSS will allow
you to calculate different
tests of significance as well
as note significant
correlations