Workshop 4. Hypothesis Testing part 1: t-tests
1. The effect of images on computer game player movement
An experiment is run to see the effect of adding visual imagery to rooms
within a computer game level. Room A is full of images while room B is bare.
Players are placed between the rooms and observed for 60 minutes. The time
spent in the bare room B is recorded.
For a particular experiment a sample n = 16 is chosen. The results show that
the mean time spent in B is 39 minutes with SS = 540.
The question is: does this data give us believable evidence that visuals have
an effect on the players, and if so, how?
STAGE 1. (a) Write down the null hypothesis. What does this tell you about
the population average $\mu$? Write down the alternative hypothesis. What does
this tell you about the population average?
STAGE 2. (b) What is the number of degrees of freedom? For an alpha of
0.05, find the t-values from the t-table.
(c) Sketch the distribution, and indicate where the null hypothesis is to be
rejected.
STAGE 3. Calculate the test statistic. First, (d) Calculate the sample sd
$s = \sqrt{\frac{SS}{df}}$
(e) Calculate the standard error
$s_{\bar{X}} = \frac{s}{\sqrt{n}}$
(f) Calculate the t-statistic
$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$
STAGE 4. Decision. Does the t-statistic fall in the critical region? Do we reject
the null hypothesis or not? Can we deduce whether players prefer visuals or
not? How?
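The Stage 3 arithmetic for this exercise can be checked with a short script. This is a minimal sketch rather than a model answer. It assumes the null hypothesis value is $\mu = 30$ minutes (half of the 60-minute session, i.e. no preference between the rooms), and it hard-codes the two-tailed critical value 2.131 for df = 15 and alpha = 0.05 as read from a standard t-table. The variable names are illustrative only.

```python
import math

# Summary statistics given in the exercise
n = 16          # sample size
mean_b = 39.0   # sample mean time spent in room B (minutes)
ss = 540.0      # sum of squared deviations (SS)

# Assumption: under H0 players split the 60 minutes evenly, so mu = 30
mu_0 = 30.0

df = n - 1                       # degrees of freedom
s = math.sqrt(ss / df)           # sample standard deviation
se = s / math.sqrt(n)            # standard error of the mean
t_stat = (mean_b - mu_0) / se    # t-statistic

# Two-tailed critical value for df = 15, alpha = 0.05 (read from a t-table)
t_crit = 2.131
reject_h0 = abs(t_stat) > t_crit

print(f"df={df}, s={s:.3f}, se={se:.3f}, t={t_stat:.3f}, reject H0: {reject_h0}")
```

The sign of the t-statistic shows whether the sample mean lies above or below the assumed 30 minutes, which is the directional information Stage 4 asks about.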
2. Hypothesis testing with two Independent Samples
In this experiment, two sample groups of students were taken and each asked
to memorize a number of noun-noun pairs (e.g. dog / bike). Then one group
was asked to form images of the pairs, e.g. a dog riding a bike. They were
given a memory test and the number of correctly recalled pairs was noted.
Here is the summary of the results:
Group 1 (No imagery)   n = 10, sample mean = 19, SS = 40
Group 2 (Imagery)      n = 10, sample mean = 26, SS = 50
STAGE 1. (a) What is the null hypothesis? What does it tell us about the
difference between the population means? What is the alternative hypothesis?
STAGE 2. (b) Find the degrees of freedom (df) for this data. Remember there
are two independent samples.
(c) Choose an alpha of 0.05 and look up the t-value from the table. Sketch the
t distribution.
STAGE 3. Calculate the test statistic
(d) Calculate the pooled variance
$s_p^2 = \frac{SS_1 + SS_2}{df_1 + df_2}$
(e) Calculate the standard error
$s_{\bar{X}_1 - \bar{X}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
(f) Calculate the t-statistic
$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$
STAGE 4. Is the t statistic in the critical region? Is the null hypothesis rejected
or accepted? If it turns out that the use of imagery has an effect, can we
deduce whether this is beneficial or not?
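The pooled-variance calculation for this exercise can be sketched in a few lines. This is an illustration under stated assumptions, not a model answer. It takes the null hypothesis to be $\mu_1 - \mu_2 = 0$, and it hard-codes the two-tailed critical value 2.101 for df = 18 and alpha = 0.05 as read from a standard t-table. The variable names are illustrative only.

```python
import math

# Summary statistics given in the exercise
n1, mean1, ss1 = 10, 19.0, 40.0   # Group 1 (no imagery)
n2, mean2, ss2 = 10, 26.0, 50.0   # Group 2 (imagery)

df1, df2 = n1 - 1, n2 - 1
df = df1 + df2                    # total degrees of freedom

# Pooled variance: combined SS over combined df
pooled_var = (ss1 + ss2) / (df1 + df2)

# Standard error of the difference between the two sample means
se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)

# Assumption: H0 states mu1 - mu2 = 0
t_stat = ((mean1 - mean2) - 0.0) / se_diff

# Two-tailed critical value for df = 18, alpha = 0.05 (read from a t-table)
t_crit = 2.101
reject_h0 = abs(t_stat) > t_crit

print(f"df={df}, pooled variance={pooled_var:.3f}, se={se_diff:.3f}, t={t_stat:.3f}")
print(f"reject H0: {reject_h0}")
```

Because the statistic is computed as Group 1 minus Group 2, a negative t means the imagery group recalled more pairs on average.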
3. Hypothesis testing with Repeated Measures
Here an experiment is run to test whether computer games can reduce
anxiety. Five subjects were tested for anxiety before and after playing a
computer game. Here are the results:
Player   Before   After   D     D-squared
A        9        4       -5    25
B        4        1
C        5        4
D        4        0
E        5        1
(a) Complete the difference (D) and the D-squared columns.
STAGE 1. (b) State the null hypothesis. What does this say about the
population mean $\mu_D$?
STAGE 2. (c) What is the sample size? What are the degrees of freedom (df)?
Hint: the same sample is used twice. For an alpha of 0.05, find the t-value
from the table. Plot the t distribution.
STAGE 3. (d) Calculate the sd for the sample
$s = \sqrt{\frac{SS}{df}}$
(e) Calculate the standard error
$s_{\bar{D}} = \frac{s}{\sqrt{n}}$
(f) Now find the t statistic
$t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}$
STAGE 4. Is the null hypothesis rejected or not? Can we say whether
computer game play reduces anxiety?
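The repeated-measures calculation can be checked against the table for this exercise with the sketch below. It is an illustration under stated assumptions, not a model answer. D is taken as After minus Before (matching the -5 already entered for player A), the null hypothesis is taken to be $\mu_D = 0$, and the two-tailed critical value 2.776 for df = 4 and alpha = 0.05 is hard-coded as read from a standard t-table. The variable names are illustrative only.

```python
import math

# Anxiety scores from the table (players A to E)
before = [9, 4, 5, 4, 5]
after = [4, 1, 4, 0, 1]

# Difference scores, D = After - Before (player A gives 4 - 9 = -5)
d = [a - b for b, a in zip(before, after)]

n = len(d)
mean_d = sum(d) / n
ss = sum((x - mean_d) ** 2 for x in d)   # sum of squared deviations of D
df = n - 1
s = math.sqrt(ss / df)                   # sample sd of the differences
se = s / math.sqrt(n)                    # standard error of the mean difference

# Assumption: H0 states mu_D = 0
t_stat = (mean_d - 0.0) / se

# Two-tailed critical value for df = 4, alpha = 0.05 (read from a t-table)
t_crit = 2.776
reject_h0 = abs(t_stat) > t_crit

print(f"D={d}, mean D={mean_d:.2f}, SS={ss:.2f}, s={s:.3f}, se={se:.3f}")
print(f"t={t_stat:.3f}, reject H0: {reject_h0}")
```

The same SS can be obtained from the D-squared column as the sum of D-squared minus (sum of D) squared over n, which is a useful cross-check when working by hand.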
Using a spreadsheet to perform t-tests
Enter your data into an Excel spreadsheet. You will need the raw measurements.
Your two sets of data will both need to be in rectangular arrays: it is easiest to use
two adjacent columns.
Enter the following command (preceded by a ‘=’ sign) into a vacant cell.
t-Test:
TTEST(range1,range2,tails,type)
Notes:
1. ‘tails’ is the number of tails in the test. It should be either 1 or 2.
2. ‘type’ indicates the type of t-test.
type = 1: the test is to be conducted on paired data, e.g. before and
after for the same sample. In this case, range 1 and range 2 must
contain the same number of data items, in the same order.
type = 2: the test assumes that the two samples have equal variance.
type = 3: the test does not assume that the samples have equal
variance.
3. The number that the function produces is between 0 and 1. It is the p-value:
the probability of seeing a difference at least as large as the one observed if
the two data sets came from populations with the same mean.
Unless you have paired data, it is best to use a type 3 test, because this does
not assume that the variances are equal. (Variance = square of standard deviation)
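To make the unequal-variance option less of a black box, the sketch below computes the conventional Welch t statistic and its approximate degrees of freedom (the Welch-Satterthwaite formula) by hand. This is an assumption about what an unequal-variance test conventionally computes, not a description of the spreadsheet's internals, and it stops short of the p-value that the spreadsheet function returns. The function name `welch_t` is hypothetical. The sample values are the wild-type growth amounts from Data Set #5 later in this handout.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t, df) for an unequal-variance (Welch) t-test."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the sample variance (divides by n - 1)
    v1, v2 = variance(sample1), variance(sample2)
    se_sq = v1 / n1 + v2 / n2            # squared standard error of the difference
    t = (mean(sample1) - mean(sample2)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Wild-type growth amounts from Data Set #5 in this handout
flooded_wild = [2.25, 0.75, 1.75, 2.5]
non_flooded_wild = [3.0, 2.5, 2.5, 3.25]

t, df = welch_t(flooded_wild, non_flooded_wild)
print(f"t = {t:.3f}, approximate df = {df:.2f}")
```

The degrees of freedom are generally not a whole number here, which is one visible difference from the equal-variance test, where df is simply n1 + n2 - 2.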
--------------------------------------------------------------
1. Repeat exercise 3 above using the spreadsheet. Hopefully you will get the same
answer.
2. Here is some data about the heights of men and women, from Josh Deutsch (2010),
online at http://www.statistics-helponline.com/node65.html, accessed 10.10.12.
What would you criticise about the way in which this data has been recorded (apart
from the fact that it is in inches, which is because the data is from a US site)?
Use a t-test to determine the significance of any difference between the heights of
men and women.
Height of men
(inches)
67.489439
69.483160
70.561353
74.846320
69.469678
71.959434
68.360909
70.582437
72.777127
73.612962
74.591664
65.933320
70.154467
73.060535
66.321518
72.125492
72.615020
67.630836
70.996237
70.616807
69.491898
69.044748
69.113072
71.566874
63.306848
Height of women
(inches)
63.463062
63.880407
64.539034
63.841551
65.692283
64.963393
66.325883
65.102038
66.229205
62.041943
63.663395
67.989878
69.852506
69.211567
63.448222
58.165974
61.652194
64.821550
63.396557
63.592375
63.476537
64.693599
65.660290
64.927502
66.915061
3. The following data are from TIEE (Teaching Issues and Experiments in Ecology)
(2005), online at
http://tiee.ecoed.net/vol/v1/experiments/fastplants/fastplants_student_data.html,
accessed 10.10.12. This is data that was obtained by students.
(If you go to the web page, you will find that data set 4 tells a rather sad little story.)
Analyse the data using t-tests to determine the significance of any differences
caused by the different treatments. You may be able to do more than one t-test on
each set of data.
(I don’t know what rapid cycling Brassica is, but I haven’t
seen it in the Tour de France.)
Data Set #3. Effects of insect herbivory on rapid cycling Brassica (height in cm;
4 plants/tmt)
Wild (before): height (28, 22, 26, 21)
Dwarf (before): height (7, 7, 6, 4)
Control - wild (before): height (29, 30, 24, 25)
Control - dwarf (before): height (6, 10, 9, 5)
Wild (after): height (28, 60, 59, 52)
Dwarf (after): height (7, 8, 6, 7)
Control - wild (after): height (35, 50, 48, 47)
Control - dwarf (after): height (22, dead, 14, 12)
______________________________________________________________
Data Set #5. How does flooding affect rapid-cycling Brassica (height in cm; 4
plants/tmt)
Flooded (wild type) growth amount: height (2.25, 0.75, 1.75, 2.5)
Flooded (dwarf) growth amount: height (1.5, 1.0, 1.0, 1.0)
Non-flooded (wild type) growth amount: height (3.0, 2.5, 2.5, 3.25)
Non-flooded (dwarf) growth amount: height (1.5, 1.25, 0, 0.5)
______________________________________________________________
Data Set #6. Effects of sugar on the growth of rapid-cycling Brassica (leaf
length in mm; 8 plants/tmt)
Control (before): leaf number (2, 2, 2, 3, 2, 2, 3, 2); leaf length (5, 7, 5, 6, 9, 7,
8, 8)
Sugar (before): leaf number (2, 1, 3, 2, 3, 2, 2, 2); leaf length (7, 6, 8, 8, 7, 6,
7, 6)
Control (after): leaf number (4, 3, 4, 4, 4, 4, 4, 3); leaf length (17, 17, 13, 15,
17, 17, 17, 17)
Sugar (after): leaf number (4, 3, 4, 4, 4, 4, 3, 4); leaf length (17, 22, 20, 15,
10, 17, 22, 16)