Laboratory 4 - School of Computer Science and Statistics

Trinity College, Dublin
Generic Skills Programme
Statistics for Research Students
Laboratory 4:
Comparing Two Independent Samples
To complete the laboratory exercise, work your way through this handout, which is
self-contained and self-explanatory. Work in pairs (two per machine), and learn from each other.
Keep separate logs of your work. The tutor is available to help with technicalities and discuss
substantive issues if necessary.
Invitations to consider the results of Minitab analysis and their statistical and substantive
interpretations are printed in italics. Take some time for this; consult your neighbour or
tutor. Enter your responses in a Word document, as if draft contributions to a report on
the experiment and its analysis.
Topics:
1. Comparing two independent samples
   initial data analysis
   2-sample t
   checking assumptions
2. When are perceived sample differences statistically significant?
3. A comprehensive exercise
The final part of Laboratory 3 showed how to use a one-sample test to compare two samples of
measurements when the study design ensures that measurements in the two samples have
been appropriately matched, so that the difference between the samples is adequately
represented by the differences between matched pairs of measurements. The test of
difference then becomes a test of the deviation from 0 of the average of the single "sample" of
paired differences.
When the matched pairs design is appropriate, the variation between measurements made on
different subjects is removed from consideration when the paired differences are calculated,
thus reducing the standard error to which the mean difference is referred via the t-statistic.
However, pairing may not always be possible.
Part 1 deals with a case where matching is not possible and the standard "2-sample t test" is
appropriate. Initial data analysis is followed by application of the test and then diagnostic
analysis to validate the standard assumptions underlying the test.
Part 2 addresses the question of when a perceived difference, perhaps established through initial data
analysis, corresponds to a statistically significant difference. The effect of a sequence of
increasing hypothetical differences on both the perception and the fact is studied. Using
simulated data, the effect of changing sample sizes on these issues may also be investigated.
In Part 3, students are invited to apply the approach to analysis used in Part 1 to another case
study and bring the study to a conclusion in the form of a client report.
Learning Objectives:
Be able to

make and interpret dotplots, boxplots and numerical summaries of two samples of
data as part of an initial data analysis, with reference to spread, level, patterns and
exceptions

compare and contrast horizontal and vertical boxplots

implement and interpret a 2-sample t test

explain key aspects of the test

compare and contrast versions of the test assuming and not assuming equal
standard deviations, using Minitab Help to seek relevant information

explain the assumptions underlying the validity of the 2-sample t test and the
consequences of their not being valid

calculate residuals and make and interpret a Normal diagnostic plot of the residuals

make and use Normal reference plots to assist with interpretation

check the equal standard deviations assumption

discuss the implications of exceptional cases in inference regarding standard
deviations

set up a procedure to check the effect of increasing mean difference between
samples on the graphical impact of the difference and the statistical significance of
the difference

use computer simulation to assess the effect for varying sample sizes

implement a comprehensive analysis of the differences between two independent
samples and prepare a comprehensive report on the analysis.
1.
Comparing two samples
As part of a larger study of academic progress by males and females, IQ scores of samples of
seventh grade boys and girls in a Mid-West USA school district were measured. Assuming that
these samples were representative of all the seventh graders, male and female, in the school
district, a basic question is: Is there evidence of a difference in IQ scores for boys and girls? If
so, a supplementary question is: What is the size of this difference?
The data are available in IQ Scores.xls in the GenericSkillsData folder; copy to Minitab.
To facilitate brushing and identification of individual cases later, stack the data in a single
column, with group identifiers in another column:
name C3 "IQ", name C4 "Group",
from the Data menu, select Stack, then Columns,
enter Boys and Girls as the columns to stack,
check "Column of current worksheet" and enter IQ,
store subscripts in Group,
ensure "Use variable names in subscript column" is checked,
click OK.
1.1
Initial data analysis
Produce graphical and numerical summaries as follows:
from the Graph menu, select Dotplot, then One Y, With Groups [1],
select IQ as the Graph variable,
select Group as the categorical variable for grouping,
click OK,
repeat with boxplots, this time clicking on the Scale button and checking "Transpose value and category scales",
from the Stat menu, select Basic Statistics, then Display Descriptive Statistics,
select IQ as the Variable and Group as the "By variable",
click the Statistics button, check Mean, Standard deviation, Minimum, Maximum, N nonmissing, uncheck others,
click OK, OK.
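For readers who want to see what the descriptive statistics amount to outside Minitab, the following is a minimal Python sketch. The scores are made-up illustrative values, not the contents of IQ Scores.xls, and the helper name describe_by_group is hypothetical. The data are held in the same stacked layout as the IQ and Group columns.

```python
from statistics import mean, stdev

# Hypothetical stacked data, NOT the values in IQ Scores.xls: one list of
# scores and a parallel list of group labels, mirroring the IQ and Group columns.
iq = [105, 115, 95, 125, 100, 110, 120, 90]
group = ["Boys"] * 4 + ["Girls"] * 4

def describe_by_group(values, labels):
    """Return {label: summary} with N, mean, SD, minimum and maximum."""
    summaries = {}
    for label in sorted(set(labels)):
        sample = [v for v, g in zip(values, labels) if g == label]
        summaries[label] = {
            "N": len(sample),
            "Mean": mean(sample),
            "StDev": stdev(sample),  # sample SD, divisor n - 1
            "Minimum": min(sample),
            "Maximum": max(sample),
        }
    return summaries

summary = describe_by_group(iq, group)
for label, stats in summary.items():
    print(label, stats)
```

The mean describes level, while the standard deviation and the minimum and maximum describe spread.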
Compare Boys' and Girls' IQ with regard to
(a) spread
(b) level
Comment on patterns and exceptions.
Transposing the value and category scales gives horizontal boxplots, consistent with the
dotplots (and the usual convention for histograms). To see the effect of changing this,

produce vertical boxplots.
Which do you prefer? Why?
[1] Note that "One Y" means that the data are in one column, with group identifiers in another column.
Recall that Y is frequently used in statistical notation to represent a response variable. Here, Y refers to
the IQ variable and may be regarded as a response to an IQ test.
1.2
2-sample t
Use a two-sample t-test to test the statistical significance of the difference between mean IQs
for boys and girls:
from the Stat menu, choose Basic Statistics, then 2-sample t,
check "Samples in one column", then select IQ for Samples and Group for Subscripts,
check "Assume equal variances",
click OK.
Report on the result of the t-test.
Explain the make up of the Pooled Standard Deviation.
Why are the degrees of freedom = 76?
Report on the confidence interval estimate.
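To help with interpreting the Minitab output, here is a minimal Python sketch of the arithmetic usually taken to lie behind the equal-variance 2-sample t test. The data are made-up illustrative values, not the IQ data, and the function name pooled_t is hypothetical. The p-value and confidence interval are left to Minitab.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Return (t, pooled SD, degrees of freedom) for the equal-variance test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its own degrees of freedom.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    pooled_sd = sqrt(pooled_var)
    # Standard error of the difference between the two sample means.
    se_diff = pooled_sd * sqrt(1 / n1 + 1 / n2)
    t = (mean(sample1) - mean(sample2)) / se_diff
    return t, pooled_sd, df

# Hypothetical samples, NOT the values in IQ Scores.xls.
boys = [105, 115, 95, 125, 110]
girls = [100, 110, 120, 90]
t_stat, pooled_sd, df = pooled_t(boys, girls)
print(f"t = {t_stat:.3f}, pooled SD = {pooled_sd:.3f}, df = {df}")
```

Note how the degrees of freedom depend only on the two sample sizes, while the pooled standard deviation draws on the spread within both samples.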
There is a suggestion that the spread of Girls' IQ exceeds that of Boys'. To allow for this
possibility,
repeat the 2-Sample t, allowing for unequal variances.
List the differences in results.
The key difference between the cases of equal and unequal variances is that the sampling
distribution of the test statistic changes. When the variances are equal (and Normality applies),
the t distribution is appropriate. When the variances are not equal, Minitab uses an
approximation to the sampling distribution which is also a t distribution but with different degrees
of freedom. To find out more about this, use Minitab context-sensitive Help:
edit the last dialog (Ctrl+E),
click on the Help button in the dialog box,
click on the "Equal or unequal variances" link at the end of the first Help page and read the resulting help,
click on the "main topic" link, then "see also", "Methods and formulas", "Test statistics",
read the Help.
(Note the confusing use of "sample standard deviation, s", when "standard error" is meant).
Which test do you prefer? Why?
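As an aid to comparing the two versions of the test, the following Python sketch computes the unequal-variance statistic together with the approximate degrees of freedom given by the Welch-Satterthwaite formula. It is assumed here, not confirmed by the handout, that this is the approximation Minitab describes in its Help; packages may also report the degrees of freedom rounded or truncated to a whole number. The data are made-up illustrative values and the name welch_t is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Return (t, approximate df) without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard errors of the two sample means.
    v1, v2 = variance(sample1) / n1, variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples, NOT the values in IQ Scores.xls.
boys = [105, 115, 95, 125, 110]
girls = [100, 110, 120, 90]
t_welch, df_welch = welch_t(boys, girls)
print(f"t = {t_welch:.3f}, approximate df = {df_welch:.2f}")
```

The approximate degrees of freedom never exceed the pooled value n1 + n2 - 2, which is one way to see that the unequal-variance version is the more cautious of the two.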
1.3
Checking assumptions
Both t-tests used above make use of the t distribution as the null hypothesis sampling
distribution of the 2-sample t statistic. The validity of this use of the t distribution depends on
certain assumptions including an assumption that
the underlying frequency distribution of IQ is Normal.
In addition, the first test assumes that
the standard deviations of Boys and Girls IQs are equal.
If these assumptions are invalid, then
the significance level of the test may not be 5%,
the p-value may be incorrectly calculated,
the width of the confidence interval may be too narrow, or too wide.
It is sensible, therefore, to check these assumptions, especially given the reservations noted at
the initial data analysis stage.
1.3.1
Checking Normality
The Normality assumption may be checked using a Normal probability plot of the residuals,
defined as the individual IQ values less the relevant sample mean. By combining the two
samples, we have a larger sample on which to base the Normality check. By subtracting the
relevant means, we ensure that the combined sets of residuals have a common sample mean
of 0; combining the two samples with their different means would distort the Normal plot. This
parallels the calculation of residuals at the end of Part 3 of Laboratory 3.
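The definition of the residuals can be sketched in a few lines of Python. This is an illustration only, using made-up values in the stacked layout rather than the IQ data; the helper name group_residuals is hypothetical.

```python
from statistics import mean

def group_residuals(values, labels):
    """Subtract from each value the mean of its own group, keeping the original order."""
    group_means = {
        g: mean(v for v, lab in zip(values, labels) if lab == g)
        for g in set(labels)
    }
    return [v - group_means[lab] for v, lab in zip(values, labels)]

# Hypothetical stacked data, NOT the values in IQ Scores.xls.
iq = [105, 115, 95, 125, 100, 110, 120, 90]
group = ["Boys"] * 4 + ["Girls"] * 4
residuals = group_residuals(iq, group)
print(residuals)
```

Because each group's residuals sum to zero, the combined set of residuals has mean 0 whatever the difference between the group means.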

Use the procedure in Laboratory 3 to calculate residuals,
 • use C5, C6, C7 instead of C30, C31, C32,
 • name C5 "Boys Res", C6 "Girls Res", C7 "Residuals".
Before proceeding to produce the Normal plot, recall that, by default, Minitab puts the Normal
scores on the vertical axis and the data on the horizontal, out of line with popular convention.
To change this,
from the Tools menu, select Options, open Individual Graphs, select Probability Plots,
under Graph Orientation, check "Show raw data on vertical scale",
click OK.
To make the Normal plot,
from the Graph menu, click on Probability Plot, choose Single,
enter Residuals as the Graph variable,
click on Distribution, then the Data Display tab, uncheck "Show confidence interval",
click OK, OK.
Discuss the result.
The confidence interval, unchecked in the instructions above, provides an interval within which
individual points from a genuine Normal sample are expected to fit. To see how this applies
here,

repeat the Normal plot, this time checking the "Show confidence interval" box.
Discuss the result.
Can you reconcile the values outside the confidence limits with the result of the
Anderson-Darling (AD) test?
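To make the idea behind the Normal plot concrete, here is a minimal Python sketch that pairs the sorted data with Normal scores and measures how close the plot is to a straight line. The plotting positions (i - 0.375)/(n + 0.25) are one common convention and are an assumption here; Minitab may use different positions, and the correlation is not the Anderson-Darling statistic. The residuals are made-up illustrative values.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_scores(n):
    """Approximate expected Normal order statistics for a sample of size n."""
    # Assumed plotting positions; other software may use slightly different ones.
    return [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def plot_correlation(sample):
    """Correlation between the sorted data and the Normal scores."""
    x = normal_scores(len(sample))
    y = sorted(sample)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical residuals, NOT those calculated from the IQ data.
residuals = [-5, 5, -15, 15, -5, 5, 15, -15]
for score, value in zip(normal_scores(len(residuals)), sorted(residuals)):
    print(f"{score:7.3f}  {value:6.1f}")  # one point of the Normal plot per line
print("straightness:", round(plot_correlation(residuals), 3))
```

A correlation close to 1 corresponds to a plot that is close to a straight line; a few exceptional points in the tails pull it down.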
To assist further with interpreting the Normal plot, make Normal reference plots:

follow the instructions in Laboratory 2, page 5, for making 19 Normal reference plots, this
time generating 78 rows and storing them in C8-C26,
ensure that the "Show confidence interval" box is unchecked,
compare the Normal plot of residuals with each of the reference plots in turn.
For each reference plot, note any deviations from a straight line and compare them with the
pattern in the Residuals plot.
Do you think that the Residuals data follow a Normal model? Explain.
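The reference plot idea can also be mimicked numerically. The Python sketch below generates 19 simulated Normal samples of 78 values each, as in the instructions above, and records how straight each Normal plot is, using a correlation between sorted data and Normal scores. The plotting positions and the seed are assumptions chosen for illustration, and the helper is repeated here only so that the block runs on its own.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def plot_correlation(sample):
    """Correlation between the sorted sample and its Normal scores."""
    n = len(sample)
    # Assumed plotting positions; Minitab may use a different convention.
    scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    data = sorted(sample)
    ms, md = mean(scores), mean(data)
    sxy = sum((s - ms) * (d - md) for s, d in zip(scores, data))
    sxx = sum((s - ms) ** 2 for s in scores)
    syy = sum((d - md) ** 2 for d in data)
    return sxy / sqrt(sxx * syy)

rng = random.Random(4)  # arbitrary fixed seed so that the sketch is repeatable
reference = [[rng.gauss(0, 1) for _ in range(78)] for _ in range(19)]
reference_r = [plot_correlation(sample) for sample in reference]
print(f"reference correlations range from {min(reference_r):.3f} to {max(reference_r):.3f}")
```

If the same measure calculated for the Residuals falls below all 19 reference values, the Residuals plot is less straight than any of the genuinely Normal reference plots.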
1.3.2
Checking Equal Standard Deviations
The second assumption may be checked formally by testing the significance of the difference of
the two sample standard deviations. The standard test for this, referred to as the F-test, is
based on the ratio of the two sample variances (the squares of the standard deviations). This test is known to be sensitive to
departures from the Normality assumption and so Minitab also provides a second test, Levene's
test, designed to be less sensitive to lack of Normality. These tests may be implemented as
follows:



from the Stat menu, select Basic Statistics, then 2 Variances,
check "Samples in one column", then select IQ for Samples and Group for Subscripts,
click OK.
The results are displayed in terms of confidence intervals for the two standard deviations,
boxplots of the data, and summary results of the two tests.
Comment on the results.
Note the range of possible values of σ included in the confidence interval for Girls'
standard deviation. Discuss:
 • what implications does the upper end have for possible spread of Girls' IQ?
 • how does this compare to the spread evident from the boxplots?
 • what is needed to reduce the confidence interval width?
Since p-values are reported for the two tests, critical values are not needed. Critical values for
the F test are readily available (and may be computed using Minitab) and will be used later in
the course. Levene's test uses approximate critical values which are the squares of the
appropriate critical values for the 2-sample t test used above. (These are F-test critical values
also, but for a different F-test).
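The link between Levene's test and the 2-sample t test can be checked numerically. The Python sketch below computes the F ratio of the two sample variances and a two-group Levene statistic based on absolute deviations from the group medians. The use of medians is an assumption about which variant Minitab implements. It then compares the Levene statistic with the square of the pooled t statistic applied to the absolute deviations. The data are made-up illustrative values and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, median, variance

def pooled_t(sample1, sample2):
    """Equal-variance 2-sample t statistic."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    return (mean(sample1) - mean(sample2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

def abs_deviations(sample):
    """Absolute deviations from the sample median (assumed variant)."""
    centre = median(sample)
    return [abs(x - centre) for x in sample]

def levene_two_groups(sample1, sample2):
    """One-way analysis of variance F statistic on the absolute deviations."""
    d1, d2 = abs_deviations(sample1), abs_deviations(sample2)
    n1, n2 = len(d1), len(d2)
    m1, m2, grand = mean(d1), mean(d2), mean(d1 + d2)
    between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2  # 1 degree of freedom
    within = sum((d - m1) ** 2 for d in d1) + sum((d - m2) ** 2 for d in d2)
    return between / (within / (n1 + n2 - 2))

# Hypothetical samples, NOT the values in IQ Scores.xls.
boys = [100, 110, 120, 90, 104]
girls = [105, 115, 95, 125, 110, 80]
f_ratio = variance(girls) / variance(boys)
w = levene_two_groups(boys, girls)
t_abs = pooled_t(abs_deviations(boys), abs_deviations(girls))
print(f"F ratio = {f_ratio:.3f}, Levene = {w:.4f}, t squared = {t_abs ** 2:.4f}")
```

The last two printed values agree, which is why critical values for the two-group Levene statistic are the squares of 2-sample t critical values.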
1.4
Reanalysis
The Normal diagnostic plot and the initial data analysis suggested the possibility that the
smallest values in both Boys and Girls data may be exceptional. While it is not easy to decide
on this issue, it makes sense to re-apply the t-test with these cases deleted, to see what effect,
if any, their deletion has on the formal analysis. To do this,
revisit the Normal diagnostic plot and brush all four suspect points (hold down the shift
key while pointing at each point in succession),
 • (if the brush is not working, remake the Normal diagnostic plot and turn on the brush)
from the Data menu, select Subset Worksheet,
check "Specify which rows to exclude", then check "Brushed rows",
click OK,
reapply the 2-sample t-test, as in §1.2 above.
Report on the results of the t-test.
Compare with the results of the earlier application.
Comment
2.
When are perceived sample differences statistically significant?
While the initial data analysis suggested that girls' IQs were somewhat lower than boys', the t-test did not indicate statistical significance in the observed difference, meaning that the
observed difference could be explained away in terms of chance variation and that, if different
samples of boys and girls had been tested, the observed difference might not have recurred, or
the difference might have been in the opposite direction. (One toss of a fair coin resulting in
heads does not imply that all tosses of that coin will result in heads).
An obvious question is "how big must an observed difference be to be statistically significant?"
A partial answer to this question may be found by creating versions of the data adjusted to have
increasingly bigger differences between boys and girls and observing the effect of increasing
difference on both the dotplots and the value of t. Minitab can be set up to do this in a few
steps that may be outlined as:
first, create a column whose first cell will hold potential mean difference values, to be
changed as desired,
next, adjust the Boys IQ values so that Boys and Girls have the SAME mean, (subtract
the mean difference, calculated earlier to be 5.12), and adjust further so that they have
the desired mean difference,
next, calculate the t-statistic for testing the difference between Boys (as adjusted) and
Girls,
finally, make a dotplot of the Boys (as adjusted) and Girls.
Minitab can be set to update the calculations and the dotplot each time the potential mean
difference value, set up in the first step, is changed. The effect on both t and dotplot can then
be observed simultaneously, as you iterate through a sequence of potential differences.
This may be achieved as follows:
name C28 "delta"; delta will be set to 0 initially and then to successively increasing difference values,
enter 0 in row 1 of C28,
name C29 "Boys+"; Boys+ will contain the Boys IQ values, adjusted to have the same mean as the Girls initially, and then increasing values, as delta is increased,
use the Calculator to calculate 'Boys' - 5.12 + 'delta' in C29, check the "Assign as a formula" box, click OK,
name C30 "Mean Diff", the numerator of the t-statistic,
in C30, calculate MEAN('Boys+')-MEAN(Girls),
name C31 "SE Diff", the denominator of the t-statistic,
in C31, calculate [2] sqrt(STDEV(Boys)**2/COUNT(Boys)+STDEV(Girls)**2/COUNT(Girls)),
name C32 "t", the t-statistic,
in C32, calculate C30/C31,
make a dotplot of Boys+ and Girls,
right-click the completed graph and check "Update Graph Automatically".
How do Boys+ and Girls compare in the dotplot?
What is the value of t?
Now, change delta to 1, then 2, then 3, etc. Note changes in the dotplot and in the value of t.
How big is delta when the sample means are significantly different according to t?
How big is delta when the samples appear "different" according to the dotplot?
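The same procedure can be sketched in Python for readers who want to see the calculation laid out in one place. The data are made-up illustrative values, so the observed difference computed below merely plays the role of the 5.12 found for the IQ data; the function name t_for_delta is hypothetical. The standard error follows the SE Diff formula used in C31.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical samples, NOT the values in IQ Scores.xls.
boys = [105, 115, 95, 125, 110, 118, 102]
girls = [100, 110, 120, 90, 104, 97]

observed_diff = mean(boys) - mean(girls)  # plays the role of 5.12 in the handout
# Shifting Boys by a constant leaves the standard deviations, and so SE Diff, unchanged.
se_diff = sqrt(variance(boys) / len(boys) + variance(girls) / len(girls))

def t_for_delta(delta):
    """t-statistic after shifting Boys so that the mean difference equals delta."""
    boys_plus = [x - observed_diff + delta for x in boys]
    return (mean(boys_plus) - mean(girls)) / se_diff

for delta in range(0, 21, 2):
    print(delta, round(t_for_delta(delta), 2))
```

Since the standard error does not change, t grows in direct proportion to delta. As a rough guide, for samples of moderate size the difference reaches significance at the 5% level when t is about 2, that is, when delta is about twice SE Diff.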
3.
A comprehensive exercise
High Pressure Liquid Chromatography is a sophisticated methodology to separate, identify and
quantify the constituents of chemical compounds. New variants on the basic methods are
regularly being developed. At one stage in an HPLC method development programme, two
different sulphonic acid sodium salts were used in one phase of the method. The variants were
designated as Method A and Method B. The effects of the methods on the percentage
recovery of the nominal level of the active ingredients of pharmaceutical products were being
compared.
The study involved each method being used in 12 separate analytical runs, with the methods
being implemented in random order, over a period of several days. The results reported for
each run are the averages of a fixed number of within-run replicates.
The results for one of the test materials are shown below.
[2] This is the Minitab version of the formula $\sqrt{\dfrac{s_{\mathrm{Boys}}^2}{n_{\mathrm{Boys}}} + \dfrac{s_{\mathrm{Girls}}^2}{n_{\mathrm{Girls}}}}$, for combining the standard errors of the
mean for Boys and the mean for Girls into a single standard error for the difference between the means.
Method A: 95.0  97.3  95.7  95.7  94.8  95.8  94.2  93.0  96.2  95.9  96.2  94.9
Method B: 97.3  97.2  95.2  95.6  99.2  96.2  98.5  95.9  96.0  98.0  95.9  96.8
They are available in the %Recovery dataset in the GenericSkillsData folder.
Carry out a detailed analysis of the data.
Include initial data analysis, detailed examination of the validity of the standard model, and
standard tests of difference between the methods. Where statistical significance is established,
report the result in the form of a confidence interval.
Prepare, in Microsoft Word, a short management report setting out your conclusions, with an
appendix including all relevant computer output.
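As a possible starting point for anyone working outside Minitab, the twelve results per method listed above can be entered directly in Python and summarised. This sketch covers only the first step of the initial data analysis; the variable names are hypothetical, and the diagnostic checks, tests and confidence interval are left as in the exercise.

```python
from statistics import mean, stdev

# Percentage recovery results as listed in the handout, 12 runs per method.
method_a = [95.0, 97.3, 95.7, 95.7, 94.8, 95.8, 94.2, 93.0, 96.2, 95.9, 96.2, 94.9]
method_b = [97.3, 97.2, 95.2, 95.6, 99.2, 96.2, 98.5, 95.9, 96.0, 98.0, 95.9, 96.8]

for name, sample in (("Method A", method_a), ("Method B", method_b)):
    print(
        f"{name}: N={len(sample)}, mean={mean(sample):.2f}, SD={stdev(sample):.2f}, "
        f"min={min(sample)}, max={max(sample)}"
    )
```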
Conclusion
This concludes Laboratory 4. The learning objectives listed at the outset are reproduced here.
Check them individually and ensure that you have achieved each one; seek help from the Tutor
if necessary.
Learning Objectives:
Be able to

make and interpret dotplots, boxplots and numerical summaries of two samples of
data as part of an initial data analysis, with reference to spread, level, patterns and
exceptions

compare and contrast horizontal and vertical boxplots

implement and interpret a 2-sample t test

explain key aspects of the test

compare and contrast versions of the test assuming and not assuming equal
standard deviations, using Minitab Help to seek relevant information

explain the assumptions underlying the validity of the 2-sample t test and the
consequences of their not being valid

calculate residuals and make and interpret a Normal diagnostic plot of the residuals

make and use Normal reference plots to assist with interpretation

check the equal standard deviations assumption

discuss the implications of exceptional cases in inference regarding standard
deviations

set up a procedure to check the effect of increasing mean difference between
samples on the statistical significance of the difference

use computer simulation to assess the effect for varying sample sizes

implement a comprehensive analysis of the differences between two independent
samples and prepare a comprehensive report on the analysis.