BS900 Research Methods
Introductory Guide to Statistics
and SPSS (Part 1)
Revised September 2010
CONTENTS
Part 1:
Section 1: Describing data
Section 2: Significance
Section 3: Non-parametric statistics
Section 4: Parametric statistics

Part 2:
Section 5: Correlations
Section 6: ANOVA
Section 7: Multiple regression

Appendix: Saving and presenting data; Stats map; Worksheets
Excellent reference books are:
Coolican, H. (2004) Research Methods and Statistics in Psychology. Hodder and Stoughton.*
Howell, D.C. Statistical Methods for Psychology. Duxbury.
Kinnear, P.R. and Gray, C.D. (2007) SPSS 15 Made Simple. Psychology Press, Taylor and Francis.**
Tabachnick, B.G. and Fidell, L.S. (1996) Using Multivariate Statistics. Harper Collins.
* referred to as Coolican in the following text
** referred to as K&G in the following text
Section 1: Describing data
Data is a set of numbers. Before we get too excited about any data set, group or population, it's a good idea to be able to describe the sort of data we have.
Details of this are in the books (Coolican and K&G) and you should look them up. For types of data, see Coolican p244-255.
We are primarily concerned with 1) LEVELS of data, and 2) DISTRIBUTION of data.
The following are known as LEVELS:

Nominal: where information is put into categories, like sex, ethnic group, type of sport, etc.
Ordinal: where cases are arranged in rank positions, like 1st or 2nd.
Interval (Scale): where measurements are on a scale, like height, weight, heart rate, etc., with equal distance between intervals.
Ratio: interval data which has an absolute and necessary zero. So time can be ratio (0 seconds really does mean no time), whereas a temperature of zero doesn't mean that there is no temperature.

Interval (scale)/ratio is the highest level of measurement, with nominal the lowest.
So, Nominal data just puts things into a category (footballers, rugby players, gymnasts).
Ordinal tells you where you are in the ranks, e.g. first in the race, last in the race.
Interval/ratio data is about measurement on a scale. Because you know about this scale, you can say more sophisticated things about this data, and you can do a lot more tests with it. This makes it the highest level of measurement (highest = fanciest).
Exercise 1: Collect the data
1) Measure and record the heart rate of everyone in the class
2) Create a list of all the heart rates
3) Organize the results from slowest to fastest and assign each person a rank
4) Put all the "male heart rates" into one group and all the "female heart rates" into
another.
Which is which type of data?
OK, so now we know what kind of data we have got. This is important because it determines what
kind of statistics we are able to perform on the data.
The next thing we might want to know is what to make of a particular individual's measurement, i.e. how
does their heart rate compare to other members of the population?
There are 3 ways to look at this:
Mean: the average
Mode: the most frequently occurring value
Median: the middle value when the data are in order
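These three can be checked by hand or, if you prefer, with a few lines from Python's standard library. The heart-rate values below are invented for illustration; yours will differ:

```python
import statistics

# Invented resting heart rates (beats per minute) for a small class
heart_rates = [62, 75, 68, 75, 80, 58, 75, 68, 90, 71]

print(statistics.mean(heart_rates))    # mean (the average): 72.2
print(statistics.median(heart_rates))  # median (middle value when sorted): 73.0
print(statistics.mode(heart_rates))    # mode (most often occurring): 75
```

Note that with an even number of values the median is the average of the two middle values, which is why it need not be a value anyone actually recorded.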
Exercise 2: Data distribution
The next thing we might like to know is something about how our data is distributed.
Normal Distribution (see Coolican 283 -288)
What most researchers, statisticians and (if they were animate) statistics want to see is a normally
distributed population. This is based on the idea that in any population there will be lots of "typical"
people (these are known as "normal" in the mathematical sense, and the term means just that) and a few "atypical" ones.
If we think about height, most adults would be between 5 foot and 6 foot tall. These are the “typical”
ones (normal in maths).
The further we go above or below these heights the fewer people there are and these are “atypical”.
In other words it is more unusual to be 7 foot or 4 foot tall, than say 5 foot 6 inches.
Because we want to be able to say most about most people (i.e. in this case the ones between 5 and 6
foot), we want to be sure that our population is “normally distributed”. That is, it has most people in the
middle and a few either side.
Statistical analysis works best with these normally distributed populations.
Of course, in sport or health this can be a bit of a worry, since we are often dealing with abnormal or
unusual populations (for example elite athletes or patients post-MI)... but we'll think about that another
day.
It is easier to see distributions if we plot them on a graph.
A normal distribution looks like this... (draw it here; Coolican p284)
A skewed distribution looks like this (a positive skew, i.e. a disproportionate number of tall people) (draw it here; Coolican p291)
or like this (a negative skew, i.e. a disproportionate number of short people) (draw it here; Coolican p291)
If you have a skewed population and want to do statistical tests on the data, you have a couple of
different options: use a non-parametric test, or adjust the data using an appropriate conversion (I will
cover this another time).
Standard Deviation (see Coolican p283-287)
This applies only to interval/ratio data. If you look this up in a statistics book (including Coolican) you
will be faced with mathematics and will be told that a standard deviation is the square root of the
variance. This is the point when many people panic and run away. In practice the maths involved is not
very scary, but on this course you will not have to worry about it at all because we'll let the
computers do that! You do, though, need to understand what a standard deviation is, and why it is
important.
Go to PASW 18.0 for windows
Click on type in data
Click on ok
Click on the ‘VARIABLE view’ tab at the bottom of the page
Click on variable name box and type in data_1
Click on ‘DATA view’ tab at the bottom of the page
Click on the box with the yellow fill
Type in the first value, enter, continue until all data_1 values are entered
Go back to ‘VARIABLE view’ tab at the bottom of the page
Click on variable name box in row 2 and type in data_2
Click on ‘DATA view’ tab at the bottom of the page
Input your data
Check your data is inputted correctly
Click on analyze /descriptive statistics/descriptive
Select both variables (data_1 and data_2) and move them to the right-hand box by clicking on the
central arrow
Click on OK
Data 1    Data 2
6.00      7.00
3.00      9.00
6.00      5.00
5.00      4.00
9.00      7.00
5.00      1.00
1.00      5.00
4.00      9.00
7.00      4.00
3.00      3.00
5.00      1.00
6.00      8.00
5.00      2.00
6.00      2.00
4.00      8.00
In short, a standard deviation tells you how far scores tend to be from the mean. It also gives you an idea
of the dispersion or spread of the sample.
It can be shown mathematically that:
within one standard deviation either side of the mean, 68.26% of all scores will be found,
within 2 standard deviations either side of the mean, 95.44% of all scores will be found, and
within 3 standard deviations either side of the mean, 99.74% of all scores will be found.
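As a cross-check on the SPSS output, the two columns of data above can be summarised in a few lines of Python. Notice that Data 1 and Data 2 have exactly the same mean, so only the standard deviation reveals that Data 2 is more spread out:

```python
import statistics

data_1 = [6, 3, 6, 5, 9, 5, 1, 4, 7, 3, 5, 6, 5, 6, 4]
data_2 = [7, 9, 5, 4, 7, 1, 5, 9, 4, 3, 1, 8, 2, 2, 8]

# Both samples have the same mean of 5...
print(statistics.mean(data_1), statistics.mean(data_2))

# ...but Data 2 has the larger sample standard deviation (~2.85 vs ~1.89),
# i.e. its scores are more widely dispersed around that mean
print(round(statistics.stdev(data_1), 2))
print(round(statistics.stdev(data_2), 2))
```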
So it is a really useful thing to know, because you can tell if (to return to our example) your heart rate is:
Remarkable
Anything to be pleased about
Nothing to be pleased about
Something to worry about
Nothing to worry about
etc., etc.
Draw a normal distribution curve with standard deviations shown, and write in what percent of
the population are contained within each SD.
Exercise 3: entering more complex data
Next we are going to input some more data into the computer. What we hope to achieve here is to get the
computer to work out the mean, median, mode and standard deviation of your data, according to
group.
Go to PASW 18.0 for windows
Click on type in data
Click on ok
Click on VARIABLE view
Click on variable name box and type in hrt_rate
Click on DATA view
Click on the box with the yellow fill
Type in the first value, enter, Type in the 2nd value…and so on until you have typed in all the data
Check your data is inputted correctly
In the same way, in the ‘VARIABLE view’ window, create a column named ‘gender’
- You can label these columns M and F if you change the ‘type’ to ‘string’ or you can label them 1
and 2 and create labels or ‘values’ that will appear on your output. This really helps if you have a
lot of columns all labelled 1 or 2 etc
Click on analyze /descriptive statistics/ frequencies*
Click on the central arrow (this moves hrt_rate to the variables box). If you have input more than one
variable you will need to select the variable(s) you wish to analyse.
Click on statistics
Click on mean, mode, median, standard deviation
Click on continue
Click on ok
See what you get!
*Note: you can do broadly similar basic activities using the ‘Frequencies’ or the ‘Descriptives’
icons. For more complex data analysis use the ‘Frequencies’ option.
To look at the mean etc ACCORDING TO GROUP, you need to
Click on Analyze / Descriptive Statistics / Explore
In this scenario, ‘hrt_rate’ is the dependent variable and ‘gender’ is the factor.
See what you get!
For the standard deviation (SD):
the bigger the number (in relation to the size of the population), the more variability in the data, i.e. the
wider the distribution of the scores;
the lower the number, the less variability in the data, i.e. the narrower the distribution of the scores.
You can also gain information about the shape of the distribution from the kurtosis (shape of the curve)
and the skewness. Check the boxes for these components when you are in the ‘Frequencies’ box.
Kurtosis refers to the ‘peakedness’ of the distribution, whilst skew refers to the ‘asymmetry’ of the
distribution.
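Outside SPSS, the same "according to group" summary can be sketched with standard-library Python. The heart rates and gender codes below are invented for illustration, using 1 and 2 as the group labels suggested above:

```python
import statistics

# Invented (gender, heart rate) pairs: 1 = male, 2 = female
data = [(1, 72), (1, 65), (1, 80), (1, 68), (1, 75),
        (2, 75), (2, 82), (2, 70), (2, 78), (2, 85)]

for group in (1, 2):
    # Pull out just the heart rates belonging to this group
    rates = [hr for g, hr in data if g == group]
    print("group", group,
          "mean", round(statistics.mean(rates), 1),
          "median", statistics.median(rates),
          "SD", round(statistics.stdev(rates), 2))
```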
Section 2: Significance
(Please refer to Coolican chapter 11, p313 on)
This concept is at the centre of statistical analysis. Let us take a typical sports science experiment.
We think (hypothesise) that our new training programme benefits fitness
We take 2 groups and measure their fitness
Group 1 continues to pursue its ‘regular’ fitness programme
Group 2 follows our ‘new’ fitness programme
3 months later we measure fitness again
We want to know if the 2 groups are different (i.e. does our new programme work? Does our new
programme improve fitness more than the old programme?). Clearly there will be all sorts of other
variables sloshing around, but let's just stick with the numbers.
We can see that the individual and average fitness scores have changed, but:
Are these changes meaningful?
Should we take any notice of them?
In other words
Are these differences statistically significant?
Being able to see that there is a difference is nothing like the same as saying that the differences
between them are statistically significant. NB this is not the same as saying a change is clinically
significant!
We get at this question of significance by considering the notion of probability.
We end up saying that these differences are probably significant. The good news is that we can say
how probable it is that these differences are significant.
Often we can be very confident that we are correct.
You might find it interesting to note that statistics came from the law courts of 19th century France,
where mathematicians were asked to say how likely it was that a given casino gambling game was fair.
The critical question to resolve was the part played by chance.
When we perform statistical analysis today we are trying to do the same thing. That is, decide whether
our results are due to our experimental manipulation OR are due to chance. When you do anything
with science (especially if people are involved) it is always as well to watch out for chance!
Obviously there are all sorts of issues about which statistical test to use.
This is about choosing the correct tool for the job (e.g. there is little point using a lawn mower to clean
your car) but first we need to step back and think about the concepts involved in these ideas about
significance.
In order to decide whether our findings are statistically significant, first we have to know what it is that we are
trying to find out. In other words we need a hypothesis: we state what we think will happen as a result
of our experimental intervention.
For our training experiment the hypothesis was
…………………………………………………………………………………………………………
…………………………………………………………………………………………………………
We also need a NULL hypothesis which states the opposite of the experimental hypothesis. The null
hypothesis states that any difference between the 2 groups is due to chance (i.e. nothing we did in the
experiment will have any effect on fitness)
For our training experiment the null hypothesis was
…………………………………………………………………………………………………………
…………………………………………………………………………………………………………
It is the NULL hypothesis that we accept or reject according to our statistical test.
Accepting the NULL hypothesis means that to some extent and/or at some point we were wrong, i.e.
our plan to increase fitness using our new training programme was not successful. There was NO
EFFECT on fitness due to our intervention.
If we reject the NULL hypothesis (and accept the experimental hypothesis) then we were correct, i.e. our
new training programme was more successful.
We accept or reject these hypotheses on the basis of significance which in turn is related to probability.
There is a convention about this which puts significance at the 5% level (or 0.05). In
other words, we accept something as being statistically significant if the probability of it occurring by
chance alone is 5 in 100 or less.
In the case of things like drug trials or invasive surgery we ask for a more stringent level of significance,
often 1%, since we cannot afford to be killing or harming people!
Exercise:
Remember, the significance level tells us whether to accept or reject the null hypothesis.
If the level is ABOVE 0.05 we ACCEPT the null hypothesis (which usually implies therefore that we
have to accept the fact that our experiment or intervention did not work, or that there was no statistical
difference between our groups).
If the level is BELOW 0.05, we reject the null hypothesis and can generally conclude that the groups
are different and our experiment was successful, and so on.
SO: imagine you have conducted some statistical tests (it does not matter at the moment which tests
you would do)
Given the p values shown, decide what you would do with your null hypothesis and what your
conclusion would be regarding your experiment.
1. You conduct a test to determine whether men are taller than women. Think about what your null
hypothesis and your experimental hypothesis would be. Your statistical test gives you a p value
of 0.03.
2. You conduct a study to see if fast walking has a different effect on weight loss than slow
walking, over the same distance. Statistical tests provide a p value of 0.08.
3. You conduct a test to see if heart rate is lower in the morning than in the afternoon. Your
statistical test gives a p value of 0.56
4. You conduct a test to see if patients have higher self esteem at the end of a 12 week exercise
programme compared with before the programme. Your statistical test gives a p value of 0.048
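The decision rule in this exercise is mechanical enough to write down as code. A small sketch applying the conventional 0.05 cut-off to the four p values above:

```python
def decide(p, alpha=0.05):
    """Reject the null hypothesis when p falls below the conventional cut-off."""
    if p < alpha:
        return "reject the null hypothesis (significant)"
    return "accept the null hypothesis (not significant)"

# The four p values from the exercise
for p in (0.03, 0.08, 0.56, 0.048):
    print(p, "->", decide(p))
# 0.03 and 0.048 lead to rejection; 0.08 and 0.56 do not
```

By this rule a p value of exactly 0.05 would not count as significant; conventions differ on that borderline case, so check the one your module expects.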
Type 1 and Type 2 errors
These kinds of errors are common but should be avoided
Type 1 (Type I)
Where we reject the null hypothesis when we should accept it. In other words, we say we have a
statistically significant result when we do not. There are lots of ways of making this mistake, e.g. setting
our significance level too leniently, performing the wrong stats, having a flawed methodology, wanting to be
correct too badly, etc.
Type 2 (Type II)
Where we accept the null hypothesis when we should have rejected it. In other words, we say we do not
have a statistically significant result when we do. In other, other words, we say we are wrong when we are
correct (dumb or what!). Again this is easy to do, but it is understandable that a researcher would want
to be cautious.
The Null Hypothesis is…

                          True              False
What we do    Accept      …………              …………
              Reject      …………              …………

(Fill in each cell with ‘correct decision’, ‘Type 1 error’ or ‘Type 2 error’.)
The last piece of this jigsaw is the idea of One- and Two-tailed tests.
It is VERY important to get this right, as getting it wrong makes it easy to make a type 1 or a type 2
error.
If we return to our training example, we hoped that our new training program would improve the fitness
of our players.
If our hypothesis was
1) “Our new training program will (statistically) significantly improve fitness”
then because we are making a prediction about the type of change we expect (i.e. an improvement),
we call this a one-tailed hypothesis. When we do this we say that we are predicting a direction; in
this case the direction is an improvement. If we were predicting a decrement in performance then I
hope you can see that this is also a one-tailed hypothesis, because we are still predicting a direction.
If our hypothesis was
2)“Our new training program will (statistically) significantly change fitness”
then because we are predicting that a change will occur, but we are not sure whether it will be
an improvement or a decrement, we call this a two-tailed hypothesis. When we do this we say that
we are not predicting a direction; we are just predicting a change.
If you were wondering where the ‘tail’ bit comes in, look at these graphs.
This is a One tail hypothesis graph. It shows you that we are interested in just one end of the graph.
We call this end a tail.
This is a Two tail hypothesis graph. It shows you that we are interested in both ends of the graph.
Again we call these ends tails.
Both of these graphs refer to the theoretical distribution of a statistic. As you can see, the distribution
has 2 tails and a central hump. In order to reject the null hypothesis we must have some value which is
so unusual that it is unlikely to be due to chance. Thus we only consider the tails of our distribution,
as the more common values occur in the centre (remember?).
We also know that only values which can occur by chance 5% of the time or less are extreme
enough to be deemed significant, i.e. extreme enough to let us reject the null hypothesis. So whether
we consider a one- or a two-tailed hypothesis, we are only considering 5% of the distribution.
A two-tailed hypothesis is in effect saying that “the effect of the independent variable will be to
produce a value which is extreme, but which could be located at either end of the distribution”,
whereas a one-tailed hypothesis is in effect saying “the effect of the independent variable will
be to produce a value which is extreme and which is located at one end of the distribution”.
In effect it is easier to gain a significant result with a one-tailed hypothesis because you have (as it
were) the whole 5% of the distribution in one place, and that is why it is relevant for type 1 and 2 errors.
However
DO NOT CHEAT!
When formulating a one-tailed hypothesis it is important to identify the “region of acceptance” first. In
other words, say whether you think you are going to get an improvement or a decrement. You must not
add it later or, worse still, change it because you got it backwards!
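One way to see the one- versus two-tailed difference numerically is with the normal distribution in scipy. The z value here is invented for illustration; the point is that the same statistic can be significant one-tailed but not two-tailed:

```python
from scipy.stats import norm

z = 1.80  # an invented test statistic, in the predicted direction

p_one_tailed = 1 - norm.cdf(z)        # all 5% sits in one tail
p_two_tailed = 2 * (1 - norm.cdf(z))  # the 5% is split between both tails

print(round(p_one_tailed, 3))  # 0.036 -> significant at 0.05 one-tailed
print(round(p_two_tailed, 3))  # 0.072 -> not significant two-tailed
```

This is exactly why the direction must be stated in advance: halving a p value after seeing the data is the "cheating" warned against above.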
Section 3: Non parametric Statistics
Choosing a statistical test.
Before we can decide on which statistical test to use we need to ask ourselves a number of questions:
1. Am I looking for a difference or a relationship?
2. What level is the data?
3. Are the data parametric or non-parametric?
4. How many groups / tests-retests are involved?
(see Coolican p363-373 and chapter 13 p395 on)
1. Most statistical significance tests fall into 2 types:
Tests of difference, i.e. tests which address the question
“Are these two groups statistically significantly different to one another?”
and
Correlations, i.e. tests which address the question
“To what extent are these two groups related?” (Correlations come later in the course.)
2. Hopefully you will remember from earlier lectures that there are different types/levels of data. If
you remember, these were: Nominal, Ordinal and Interval/Ratio (scale) data.
You can perform statistics on all three but you cannot perform all statistics on all types of data.
3. Parametric statistics can only be performed on interval/ratio data which meet certain criteria
(see later lecture).
Non-parametric statistics can be used on the other two.
YOU MUST NOT PERFORM PARAMETRIC STATISTICS ON NON-PARAMETRIC DATA
4. Are you testing 2 groups or more than 2 groups?
There is a further consideration within this category, and this is the idea of a related (within-subjects)
and an unrelated (between-subjects) design.
A related (within-subjects) design is where all the measurements were taken from the SAME subjects,
on more than one occasion.
e.g. I measure 10 students’ resting heart rate, get them to train with Reebok step for 3 weeks and then
measure their resting heart rate again. I then compare the 2 sets of data (before and after) to see
if there is a statistically significant difference between them.
An unrelated (between-subjects) design is where the measurements are from 2 different groups.
e.g. I measure 10 non-sporty students’ resting heart rate. Then I take a different 10 subjects who
regularly train with Reebok step and measure their resting heart rate too. I then compare the 2
groups to see if there is a statistically significant difference between them.
Your choice of design may be dependent on all sorts of factors, e.g. practice effects, comparing athletes
with non-athletes, comparing the effects of different fitness programmes, etc.
Please check that you understand this idea now. If you don’t then do ask!
1. Wilcoxon Test (differences, non-parametric – ordinal or above, 2 groups, within-subjects)
The Wilcoxon Signed Rank Test is used to test for differences when subjects have been monitored on
two different occasions or under two different conditions and is a non-parametric test.
You have performed an experiment to see whether exercise intensity affected limb pain. Ten subjects
cycled at 50% VO2max for 30 minutes and then rated their lower limb discomfort on a scale of 1-10 (1
being no pain and 10 being very painful). The test was repeated one week later, but this time cycling at
an intensity of 70% VO2max. The scores for each subject are given below.
Leg discomfort score
50% VO2max    70% VO2max
4             7
5             8
2             5
6             9
3             5
4             6
5             8
2             6
6             9
5             6
The dependent variable is probably not normally distributed. Enter the data in two columns (one for the
data with intensity set at 50% VO2max and one for the data with the intensity at 70% VO2max)
Select Analyze/Nonparametric Tests/Legacy dialogue/2 Related samples: Highlight each condition
and click the arrow to the left of the "Test Pair(s) List" window. You should see in the "Test Pair(s) List"
window the way the comparison between conditions will be made. Make sure that Wilcoxon is selected
for Test Type and then select Options. In the options window, select Descriptive, Exclude cases
test-by-test, and click the Continue button. You should be back to the original window. Click OK to
run the analyses.
The Wilcoxon output provides us with descriptive statistics, information about ranks, and finally the test
statistic.
The first sub-table, Ranks, shows the sums of the positive and negative ranks; the second sub-table,
Test Statistics, shows the value of the test statistic, whose p-value (Asymp. Sig. (2-tailed)) is less than 0.05,
confirming that there is a significant difference between the leg pain scores for the two exercise
intensities. We can report these results as:
Z = -2.871; p < 0.01
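If you want to check the SPSS result outside the package, scipy's implementation of the same test can be run on the scores above. This is a sketch; the exact p value can differ slightly from SPSS depending on how ties are handled:

```python
from scipy.stats import wilcoxon

pain_50 = [4, 5, 2, 6, 3, 4, 5, 2, 6, 5]  # discomfort at 50% VO2max
pain_70 = [7, 8, 5, 9, 5, 6, 8, 6, 9, 6]  # discomfort at 70% VO2max

stat, p = wilcoxon(pain_50, pain_70)
# Every subject scored higher at 70%, so the smaller rank sum is 0 and
# p comes out well below 0.05, agreeing with Z = -2.871, p < 0.01
print(stat, p)
```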
Additional Reading
Pallant – Chapter 16 (pgs 220-225) and chapter 17 (pgs 232-241)
What sort of test do we use if we have unrelated data?
2. The Chi-square test (aka the χ² test) (differences, non-parametric, 2 groups,
between-subjects)
This is for unrelated data at the nominal level and the data must be in the form of frequencies. Again it
measures the difference between groups.
In the following experiment we recorded how many men and how many women bought or did not buy a
drink after a visit to the gym.
Data table 1
Sex    Bought a drink    Frequency
M      Y                 3
M      N                 7
F      Y                 6
F      N                 4
We want to know if there is a significant difference in drink buying between the 2 groups.
We are comparing the ‘expected’ behaviour with the ‘observed’ behaviour.
We expect (i.e. the null hypothesis) that there will be no difference in the drink buying behaviour of men
and women.
In VARIABLE view, add ‘Sex’ in row one, and choose ‘string’ as the data type
Add a second row called ‘Drink’. Add ‘values’ with 1 = yes and 2 = no
In DATA view add the data using the information in the summary table above as a guide.
Click on Analyze / Descriptive Statistics / Crosstabs
Move ‘Sex’ into the ‘Row’ box
Move ‘Drink’ into the ‘Column’ box
Click on ‘Statistics’
Select ‘Chi-square’
Click OK
Your data should look something like this…
gender * drink Crosstabulation
Count
                   drink
                   yes    no     Total
gender    f        6      4      10
          m        3      7      10
Total              9      11     20
Chi-Square Tests
                          Value     df    Asymp. Sig.   Exact Sig.   Exact Sig.
                                          (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square        1.818a    1     .178
Continuity Correctionb    .808      1     .369
Likelihood Ratio          1.848     1     .174
Fisher's Exact Test                                     .370         .185
N of Valid Cases          20
a. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 4.50.
b. Computed only for a 2x2 table
The important figures here are the Pearson Chi-Square value and the Asymp. Sig. (2-sided).
Do these results show no significant difference, or a significant difference, between the drink
buying behaviour of men and women?
How would we report this data?
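The same table can be fed to scipy to reproduce the Pearson row of the SPSS output. Setting correction=False turns off the continuity correction, so the result matches the Pearson Chi-Square line rather than the corrected one:

```python
from scipy.stats import chi2_contingency

# Observed frequencies: rows = gender (f, m), columns = bought a drink (yes, no)
observed = [[6, 4],
            [3, 7]]

chi2, p, df, expected = chi2_contingency(observed, correction=False)
print(round(chi2, 3), df, round(p, 3))  # 1.818, 1, 0.178 - as in the SPSS table
```

The `expected` array it returns is the "expected behaviour" under the null hypothesis (4.5 and 5.5 in each row here), which is exactly what the footnote about expected counts refers to.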
3. The Mann-Whitney (U) test (differences, non-parametric (ordinal or above), 2
groups, between-subjects)
This is a test of difference for ordinal, unrelated, data.
So it’s like the Wilcoxon but it works for unrelated data.
It looks at the differences between the sums of two sets of ranks.
See Coolican p367-371 for a summary and 212-214 in K&G.
In this experiment we have measured athletes’ speed of reaction in milliseconds in a catching task
either to their right or their left side.
Case    Side    Reaction Time (ms)
1       1       500
2       1       513
3       1       300
4       1       561
5       1       483
6       1       502
7       1       439
8       1       467
9       1       420
10      1       480
11      2       392
12      2       445
13      2       271
14      2       523
15      2       421
16      2       489
17      2       501
18      2       388
19      2       411
20      2       467

(1 = L, 2 = R)
Go to PASW 18.0 for windows
Click on type in data
Click on ok
Select ‘VARIABLE view’
Click on variable name row 1 and type in “side”
Click on measure
Click on ordinal
Click on variable name row 2 and type in “reactime”
Click on measure
Click on scale
Switch to ‘DATA view’
Click on the first box in the first column
Type in the first value from the above table
Click on the box below…..and so on until you have typed in all the data
Check your data is inputted correctly
Click on analyze /non-parametric/legacy dialogues/ 2 independent-samples
Click on ‘reactime’
Click on the top arrow (this moves it to the test variables list)
Click on ‘side’
Click on the lower arrow (this moves L/R to the grouping variables box)
Click on ‘define groups’
Type the value 1 into the group 1 box
Type the value 2 into the group 2 box
Click on continue
Click on ok
See what you get!
You should have something like this…
Ranks
           side     N     Mean Rank    Sum of Ranks
speed      1        10    12.15        121.50
           2        10    8.85         88.50
           Total    20

Test Statisticsb
                                   speed
Mann-Whitney U                     33.500
Wilcoxon W                         88.500
Z                                  -1.248
Asymp. Sig. (2-tailed)             .212
Exact Sig. [2*(1-tailed Sig.)]     .218a
Does this show a significant difference between reactions to the left and to the right side, or not?
Can we reject the null hypothesis?
How would we report this data?
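As a cross-check, scipy's version of the test can be run on the reaction times. One caution: scipy reports U for the first group, whereas SPSS reports the smaller of the two U values, and small p-value differences can arise from tie and continuity corrections:

```python
from scipy.stats import mannwhitneyu

side_1 = [500, 513, 300, 561, 483, 502, 439, 467, 420, 480]  # left
side_2 = [392, 445, 271, 523, 421, 489, 501, 388, 411, 467]  # right

u, p = mannwhitneyu(side_1, side_2, alternative="two-sided")
u_spss = min(u, len(side_1) * len(side_2) - u)  # SPSS's convention: the smaller U

# U = 33.5 with p above 0.05, so no significant difference, as in the SPSS table
print(u_spss, round(p, 2))
```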
Section 4: Parametric Statistics
(See Coolican p359 on)
Somewhat unsurprisingly, parametric statistics deal with parameters, which are measures of populations,
in particular the mean and variance. Mathematically, variance is the square of the standard deviation.
It’s a kind of measurement of all that is going on in your sample.
As it says in Coolican
“Parametric tests are so called because their calculation involves an estimate of population parameters
made on the basis of sample statistics”.
The bigger the sample, the more accurate the estimate will be; this is because in a small sample the
statistics are distorted by an odd or extreme value (I suppose they are more likely to stick out like a
sore thumb!).
Parametric statistics are “good things” because with the correct data (see below) they are said to have
more power. In these circumstances power is defined as the likelihood of the test detecting a
significant difference when the null hypothesis is false. The other good thing is that, when compared to
non-parametric tests, one needs fewer data to reach the same level of power. Because parametric tests
use all the information available they are more sensitive; this means that if there is a difference
between groups, the test is good at finding it.
But remember:
More subjects are better.
Parametric statistics can’t make up for poor and/or poorly collected data.
Parametric statistics can only be used on certain types of data (see below).
ASSUMPTIONS FOR PARAMETRIC TESTS
1. Interval data
2. Sample is drawn from a normally distributed population
3. Homogeneity of variance (that is, the variances of the two samples are not significantly different)
In fact many parametric tests are sufficiently robust to cope with quite untidy data.
Please read p362 on in Coolican about this.
Please also read the comparison of parametric and non-parametric tests on p363.
4. The t-Test for related (within subjects) data
This is a test of difference for interval or ratio data for a related design where this data meets the
assumptions previously discussed.
Other terms used for this same type of test are ‘paired’ or ‘within-subjects’.
22
BS900 Research Methods
In this experiment a group of archers used imagery as their preparation for their competition (condition A)
and, on another occasion, rehearsal of archery for their preparation (condition B). So the
archers used both training strategies and we want to know which is the best.
i.e. it is within-subjects: the same subjects do both tests, one week and the next week. The score can
range between 0 and 20.
Number of points scored in competition
Score Week 1 (A)    Score Week 2 (B)
6                   6
15                  10
13                  7
14                  8
12                  8
16                  12
14                  10
15                  10
18                  11
17                  9
12                  8
7                   8
15                  8
(There is practical information about this in K&G p203-208.)
So now we are going to perform a related/paired t-test in SPSS. Coolican explains what the test is
doing on pages 203 -208
Go to PASW 18.0 for windows
Click on type in data
Click on ok
Go to VARIABLE view
Click on variable name box 1 and type in “week_1”
Click on variable name box 2 and type in “week_2”
Go to DATA view
Enter the data in the above table in 2 separate columns**
Click on analyze/ compare means / paired sample t test
Move week_1 and week_2 to the ‘paired variables’ box
Click on ok
See what you get! –
** a good tip to remember is that each PERSON has their OWN row in SPSS
Paired Samples Statistics
Pair 1            Mean      N     Std. Deviation    Std. Error Mean
  week_1          13.3846   13    3.52464           .97756
  week_2          8.8462    13    1.67562           .46473

Paired Samples Test
Pair 1: week_1 - week_2
Paired Differences:
  Mean               4.53846
  Std. Deviation     2.60177
  Std. Error Mean    .72160
  95% Confidence Interval of the Difference: Lower 2.96622, Upper 6.11070
t = 6.289, df = 12, Sig. (2-tailed) = .000
What are the important values to look at here?
How would we report this data?
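The same paired comparison can be cross-checked in scipy; the archery scores below are taken from the table above:

```python
from scipy.stats import ttest_rel

week_1 = [6, 15, 13, 14, 12, 16, 14, 15, 18, 17, 12, 7, 15]  # imagery (A)
week_2 = [6, 10, 7, 8, 8, 12, 10, 10, 11, 9, 8, 8, 8]        # rehearsal (B)

# A paired (related) t-test, since the same archers did both conditions
t, p = ttest_rel(week_1, week_2)
print(round(t, 3), p)  # t = 6.289 with p < 0.001 (df = 12), matching the SPSS table
```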
What sort of test do we use if we have unrelated or non–paired data?
5. The t-test for unrelated data (between subjects)
This is a test of difference for interval or ratio data for an unrelated design where the data meets the
assumptions previously discussed. Other terms used for this same type of test are ‘independent’ or
‘between-subjects’.
We want to know if there is a significant difference in speed of reaction in a visual field task between 2
groups.
See Coolican p196-203 for a summary and p191-203 in K&G.
In this experiment we have measured male and female athletes’ speed of reaction in milliseconds in a
recognition task.
Athletes therefore can only be in one group (male or female). It is a between-subjects, or unrelated, or
independent test.
You do not need to have the same number of subjects in each group.
Case    Gender    Reaction Time (ms)
1       m         500
2       m         513
3       m         300
4       m         561
5       m         483
6       m         502
7       m         439
8       m         467
9       m         420
10      m         480
11      f         392
12      f         445
13      f         271
14      f         523
15      f         421
16      f         489
17      f         501
18      f         388
19      f         411
Go to PASW 18.0 for windows
Click on type in data
Click on ok
Go to VARIABLE view
Click on variable name box 1 and type in “sex”
Change the data type to ‘string’
Click on variable name box 2 and type in “time”
Go to DATA view
Enter the data. Remember each subject has their own row……
Click on analyze /compare means / independent-samples t-test
Move ‘time’ to the ‘test variable’ box.
Move ‘sex’ to the ‘grouping variable’ box. We are grouping peoples scores according to their sex.
Click on ‘define groups’
The group names you enter here must match exactly how you typed them in your data sheet, e.g.
capitals, words or numbers must be the same.
Click on continue
Click on ok
See what you get!
Your data should look something like this…
Group Statistics
        sex    N     Mean       Std. Deviation    Std. Error Mean
time    m      10    466.5000   70.37242          22.25371
        f      9     426.7778   76.02101          25.34034

Independent Samples Test
Levene's Test for Equality of Variances: F = .119, Sig. = .735

t-test for Equality of Means:
                               t       df       Sig.         Mean         Std. Error   95% CI of the Difference
                                                (2-tailed)   Difference   Difference   Lower        Upper
Equal variances assumed        1.183   17       .253         39.72222     33.58023     -31.12588    110.57032
Equal variances not assumed    1.178   16.418   .256         39.72222     33.72478     -31.62353    111.06797
What are the important values here?
Is there a significant difference between male and female reaction times?
How do we report these data?
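As before, the SPSS result can be cross-checked with scipy. ttest_ind with its default settings corresponds to the "equal variances assumed" row of the SPSS table:

```python
from scipy.stats import ttest_ind

males   = [500, 513, 300, 561, 483, 502, 439, 467, 420, 480]
females = [392, 445, 271, 523, 421, 489, 501, 388, 411]

# An independent (unrelated) t-test; note the groups need not be the same size
t, p = ttest_ind(males, females)  # equal variances assumed
print(round(t, 3), round(p, 3))  # 1.183 and 0.253, matching the SPSS output (df = 17)
```

If Levene's test had been significant, the Welch version (equal_var=False) would correspond to the "equal variances not assumed" row instead.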
What do you want to look for?

Differences:
    2 groups:
        Between
        Within
    >2 groups:
        Between
        Within
        Between AND Within
Correlations
Regression