REVIEW: CONFIDENCE INTERVALS AND HYPOTHESIS TESTS
VINEYARD SOIL DATA: POTASSIUM IN 2004 COMPARED TO 2007
BACKGROUND
Soil Potassium was measured at 10 randomly sampled locations in a vineyard in 2004 and again in
2007.
• Has soil potassium changed over time?
• If so, how much has it changed?
We’ll begin by assuming that we have, in effect, two independently chosen random samples of soil:
one sample from 2004 and another from 2007. (In fact, the data was collected differently, and we
shall see the impact of this on the analysis later.)
LOOK AT THE DATA
WINESOILS UNLINKED.MTW
We start with the “Unlinked” worksheet. The potassium data is stored in one column, year of
sampling in another.
Graph > Individual Value Plot
‘One Y’, ‘With Groups’
K graph variables
year categorical variables
Data View
Individual Symbols
Mean Symbols
[Figure: individual value plot of K (ppm) by Year, 2004 and 2007]
STAT 513 - Schaffner
Ch00-1.Docx
Page 1 of 8
Graph > Histogram
‘Simple’
K graph variables
Multiple Graphs
By variables
Year separate panels
[Figure: histograms of K (ppm), with Frequency on the vertical axis, in separate panels for 2004 and 2007. Panel variable: Year]
Comments on shape?
A common plot is the mean with an error bar. How is the error bar computed? What does it tell us?
Graph > Interval Plot
‘One Y’, ‘With Groups’
Data View
Interval bar
bar
Make the graph, then double-click the intervals and change the interval type to “Standard Error”
and the side to “Upper”.
[Figure: interval plot of mean K (ppm) by Year, 2004 and 2007. Bars are one standard error from the mean]
PRODUCE BASIC SUMMARY STATISTICS
Stat > Basic Statistics > Display Descriptive Statistics
K Variable
year By variable
Descriptive Statistics: K
Variable  Year   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
K         2004  10   0  118.9     15.0   47.5     82.0   91.8   101.5  129.3    216.0
          2007  10   0  313.5     47.3  149.6    154.0  175.5   275.0  445.3    566.0
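The error bars in the interval plot and the SE Mean column in the descriptive statistics are the same quantity. As a minimal sketch, assuming the usual definition (sample standard deviation divided by the square root of the sample size), the reported values can be checked from StDev and N alone. The names `summary` and `standard_error` are illustrative, not part of the handout.

```python
import math

# StDev and N copied from the Descriptive Statistics output for K.
summary = {
    2004: {"n": 10, "stdev": 47.5},
    2007: {"n": 10, "stdev": 149.6},
}


def standard_error(stdev, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev / math.sqrt(n)


se_by_year = {
    year: standard_error(stats["stdev"], stats["n"])
    for year, stats in summary.items()
}

for year, se in se_by_year.items():
    print(f"{year}: SE Mean = {se:.1f} ppm")
```

One standard error is narrower than a 95% confidence interval, so bars of this kind should not be read as confidence limits.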
RECALL THE ONE SAMPLE T-INTERVAL
Formula:
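A sketch of the standard one-sample t-interval for a population mean, written in conventional notation. The symbols are assumptions rather than the handout's own: sample mean $\bar{x}$, sample standard deviation $s$, sample size $n$, and $t^{*}_{n-1}$ the critical value of Student's t with $n-1$ degrees of freedom for the chosen confidence level.

```latex
\bar{x} \;\pm\; t^{*}_{n-1}\,\frac{s}{\sqrt{n}}
```

The quantity $s/\sqrt{n}$ is the standard error of the mean, labelled SE Mean in the Minitab output.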
Consider separate intervals for each year. First, we’ll split the worksheet in two, one for each year’s
data.
Data > Split Worksheet
Year ‘By Variables’
Window > 2004 worksheet (or use
)
Stat > Basic Statistics > 1-sample t
Samples in columns
K Samples in columns
Results for: Unlinked Soils(Year = 2004)
One-Sample T: K
Variable   N   Mean  StDev  SE Mean          95% CI
K         10  118.9   47.5     15.0   (84.9, 152.9)
Results for: Unlinked Soils(Year = 2007)
One-Sample T: K
Variable   N   Mean  StDev  SE Mean          95% CI
K         10  313.5  149.6     47.3  (206.5, 420.5)
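The two one-sample intervals can be reproduced from the summary statistics alone. The sketch below assumes the standard t-interval (mean plus or minus a t critical value times the standard error) and hard-codes the two-sided 95% critical value for 9 degrees of freedom, 2.262, as read from a t table. The function name `t_interval` is illustrative.

```python
import math

# Two-sided 95% critical value of Student's t with n - 1 = 9 degrees of
# freedom, taken from a standard t table (hard-coded, not computed here).
T_CRIT_DF9 = 2.262


def t_interval(mean, stdev, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * stdev / sqrt(n)."""
    margin = t_crit * stdev / math.sqrt(n)
    return mean - margin, mean + margin


# Mean, StDev and N copied from the One-Sample T output for each year.
ci_2004 = t_interval(118.9, 47.5, 10, T_CRIT_DF9)
ci_2007 = t_interval(313.5, 149.6, 10, T_CRIT_DF9)

print(f"2004: ({ci_2004[0]:.1f}, {ci_2004[1]:.1f})")
print(f"2007: ({ci_2007[0]:.1f}, {ci_2007[1]:.1f})")
```

The two intervals do not overlap, which already hints at a difference between years, but comparing separate intervals is not a substitute for a test on the difference itself.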
CONDUCT TWO-SAMPLE T-TEST
When is this test appropriate?
What are the hypotheses?
We will work with the full “Unlinked” worksheet.
Stat > Basic Statistics > Two Sample t
Samples in one column
K Samples
year Subscripts
Two-Sample T-Test and CI: K, Year
Two-sample T for K
Year   N   Mean  StDev  SE Mean
2004  10  118.9   47.5       15
2007  10    314    150       47
Difference = mu (2004) - mu (2007)
Estimate for difference: -194.6
95% CI for difference: (-305.2, -84.0)
T-Test of difference = 0 (vs not =): T-Value = -3.92  P-Value = 0.003  DF = 10
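The reported DF = 10, rather than 18, suggests the unpooled (Welch) form of the two-sample t-test. As a hedged sketch under that assumption, the T-Value, degrees of freedom, and confidence interval can be recomputed from the summary statistics. The function name `welch_t` is illustrative, the critical value 2.228 for 10 degrees of freedom is read from a t table, and rounding the Welch-Satterthwaite degrees of freedom down to an integer is assumed because it is consistent with the output.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t: returns (difference, SE, t statistic, df)."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    se = math.sqrt(v1 + v2)
    diff = mean1 - mean2
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se, diff / se, df


# Mean, StDev and N for 2004 and 2007, from the descriptive statistics.
diff, se, t_value, df = welch_t(118.9, 47.5, 10, 313.5, 149.6, 10)
df_used = math.floor(df)  # about 10.8, rounded down to 10

# Two-sided 95% critical value for 10 degrees of freedom, from a t table.
T_CRIT_DF10 = 2.228
ci = (diff - T_CRIT_DF10 * se, diff + T_CRIT_DF10 * se)

print(f"difference = {diff:.1f}, SE = {se:.1f}")
print(f"T-Value = {t_value:.2f}, DF = {df_used}")
print(f"95% CI for difference: ({ci[0]:.1f}, {ci[1]:.1f})")
```

The standard error of the difference, about 49.6, combines the two separate standard errors and treats the 2004 and 2007 samples as independent.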
ASSESSING NORMALITY
Graph > Probability Plot
[Figure: Probability Plot of K, Normal - 95% CI. Percent versus K (ppm), in separate panels for 2004 and 2007. Panel variable: Year]
Year   Mean  StDev   N     AD  P-Value
2004  118.9  47.47  10  1.566   <0.005
2007  313.5  149.6  10  0.572    0.102
Do these data meet the conditions required for the two-sample t-test?
CONDUCT PAIRED T-TEST
In actuality, the data was collected at the same sites for each year. Thus there aren’t really two
independently chosen random samples of size 10 each, but rather one sample of 10 locations
measured twice: once in 2004 and again in 2007.
What are the hypotheses of the paired t-test?
What data is actually analyzed?
What conditions are needed?
Here we work with the “wine soils” worksheet.
Wine Soils.MTW
Stat > Basic Statistics > paired t-test
Samples in columns
2004.S.K First sample
2007.S.K Second sample
Paired T-Test and CI: 2004.S.K, 2007.S.K
Paired T for 2004.S.K - 2007.S.K
             N    Mean  StDev  SE Mean
2004.S.K    10   118.9   47.5     15.0
2007.S.K    10   313.5  149.6     47.3
Difference  10  -194.6  119.4     37.7
95% CI for mean difference: (-280.0, -109.2)
T-Test of mean difference = 0 (vs not = 0): T-Value = -5.16  P-Value = 0.001
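The paired analysis is a one-sample t procedure applied to the 10 within-site differences. A minimal sketch, assuming that standard form and using only the Difference row of the output: the critical value 2.262 for 9 degrees of freedom is read from a t table, and the variable names are illustrative. Because the inputs are rounded summary values, the last digit can differ slightly from Minitab's (here about -5.15 against the reported -5.16).

```python
import math

# Difference row of the paired output: N, Mean and StDev of the differences.
n = 10
mean_diff = -194.6
sd_diff = 119.4

# Two-sided 95% critical value for n - 1 = 9 degrees of freedom, from a t table.
T_CRIT_DF9 = 2.262

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_value = mean_diff / se_diff      # test statistic for H0: mean difference = 0
ci = (mean_diff - T_CRIT_DF9 * se_diff, mean_diff + T_CRIT_DF9 * se_diff)

# For comparison: the standard error used when the two years are treated
# as independent samples (StDev 47.5 and 149.6, n = 10 each).
se_unpaired = math.sqrt(47.5 ** 2 / n + 149.6 ** 2 / n)

print(f"SE of mean difference = {se_diff:.1f}")
print(f"T-Value = {t_value:.2f}")
print(f"95% CI for mean difference: ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"unpaired SE for comparison = {se_unpaired:.1f}")
```

The estimated difference is the same in both analyses, -194.6, but the paired standard error is smaller than the unpaired one, so the paired interval is narrower and the T-Value is larger in magnitude.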