Inferences about Variances (Chapter 7)
In this Lecture we will:
• Develop point estimates for the population variance.
• Construct confidence intervals for the population variance.
• Perform one-sample tests for the population variance.
• Perform two-sample tests for the population variance.
Note:
Need to assume normal population distributions for all sample
sizes, small or large! If the population(s) are not normally
distributed, results can be very wrong. Nonparametric
alternatives will be presented later.
Point Estimate for σ²
The point estimate for σ² is the sample variance:
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$$
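As a quick check of the formula, the following Python sketch computes s² directly from the definition and compares it with the standard library's `statistics.variance`, which also uses the n - 1 divisor. The name `sample_variance` is a hypothetical helper introduced for illustration; the data are the No-Ozone lung capacities from the example at the end of this lecture.

```python
import statistics

def sample_variance(values):
    """Sample variance with the n - 1 divisor (hypothetical helper)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)

# No-Ozone group from the lung capacity example in this lecture.
no_ozone = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]

s2 = sample_variance(no_ozone)
print(round(s2, 4))
print(round(statistics.variance(no_ozone), 4))  # same estimate from the library
```

Both lines should print the same value, since the two computations implement the same estimator.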
What about the sampling distribution of s²?
(I.e., what would we see as a distribution for s² from repeated samples?)
If the observations, $y_i$, are from a normal (μ, σ) distribution, then the quantity
$$\frac{(n-1)s^2}{\sigma^2}$$
has a Chi-square distribution with df = n-1.
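The repeated-sampling idea can be illustrated with a small simulation. This is only a sketch under assumed settings (n = 7, μ = 0, σ = 2, 10,000 replicates, all chosen for illustration): it draws normal samples, forms (n-1)s²/σ² for each, and compares the simulated mean and variance with the known chi-square values df and 2·df.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

n = 7                  # sample size (assumed for illustration)
mu, sigma = 0.0, 2.0   # assumed normal population parameters
reps = 10000           # number of repeated samples

scaled = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)            # divisor n - 1
    scaled.append((n - 1) * s2 / sigma ** 2)    # (n-1)s^2 / sigma^2

# A chi-square variable with df degrees of freedom has mean df and variance 2*df.
print(round(statistics.mean(scaled), 2))      # should be near n - 1 = 6
print(round(statistics.variance(scaled), 2))  # should be near 2(n - 1) = 12
```

Repeating the simulation with a clearly non-normal population is a simple way to see why the normality assumption matters here.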
Chi Square (χ²) Distribution
[Figure: χ² density curves for df = 2, 5, 10, and 15, plotted for values from 0 to 30.]
Non-symmetric.
Shape indexed by one parameter called the degrees of freedom (df).
Table 7 in Ott and Longnecker
Chi Square Table
Confidence Interval for σ²
If $(n-1)s^2/\sigma^2$ has a Chi Square Distribution, then a 100(1-α)% CI can be computed by finding the upper and lower α/2 critical values from this distribution.
$$\Pr\left( \chi^2_{df,1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{df,\alpha/2} \right) = 1-\alpha$$
[Figure: χ² densities with the central 0.95 area marked between the lower and upper α/2 critical values. For df = 5 the critical values are 0.8312 and 12.83; for df = 10 they are 3.247 and 20.48.]
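$$\Pr\left( \chi^2_{df,1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{df,\alpha/2} \right) = 1-\alpha$$
$$\Rightarrow \quad \frac{(n-1)s^2}{\chi^2_{df,\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{df,1-\alpha/2}}$$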
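The step from the probability statement to the interval can be spelled out as follows. This is a sketch of the algebra only; it relies on every quantity being positive, so taking reciprocals reverses the inequalities.

```latex
\begin{align*}
\chi^2_{df,1-\alpha/2} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{df,\alpha/2}
  &\iff \frac{1}{\chi^2_{df,\alpha/2}} \le \frac{\sigma^2}{(n-1)s^2} \le \frac{1}{\chi^2_{df,1-\alpha/2}} \\
  &\iff \frac{(n-1)s^2}{\chi^2_{df,\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{df,1-\alpha/2}}
\end{align*}
```

Note that the larger critical value ends up in the denominator of the lower limit.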
Consider the data from the contaminated site vs. background.
Background Data: n = 7, ȳ = 2.48, s = 1.13 (s² = 1.277)
A 95% CI for background population variance:
$$\frac{(6)\,1.13^2}{\chi^2_{6,0.025}} \le \sigma^2 \le \frac{(6)\,1.13^2}{\chi^2_{6,0.975}}$$
$$0.53 \le \sigma^2 \le 6.193$$
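The interval for the background data can be reproduced with a few lines of Python. This is a minimal sketch: `variance_ci` is a hypothetical helper, and the two critical values for df = 6 (14.45 and 1.237) are assumed to be read from a chi-square table such as Table 7, because the Python standard library has no chi-square quantile function.

```python
def variance_ci(n, s, chi2_upper, chi2_lower):
    """CI for sigma^2 from tabulated chi-square critical values.

    chi2_upper = chi^2_{df, alpha/2} and chi2_lower = chi^2_{df, 1 - alpha/2},
    both with df = n - 1 (hypothetical helper; values supplied from a table).
    """
    ss = (n - 1) * s ** 2
    # The larger critical value gives the lower limit, and vice versa.
    return ss / chi2_upper, ss / chi2_lower

# Background data from the slide: n = 7, s = 1.13.
# Table values for df = 6 (assumed from a chi-square table): 14.45 and 1.237.
lower, upper = variance_ci(n=7, s=1.13, chi2_upper=14.45, chi2_lower=1.237)
print(round(lower, 2), round(upper, 2))
```

The helper takes the critical values as arguments so that the same code works for any confidence level once the matching table entries are looked up.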
Hypothesis Testing for σ²
What if we were interested in testing:
H0: $\sigma^2 = \sigma_0^2$ ($\sigma_0^2$ is specified)
Ha: 1. $\sigma^2 > \sigma_0^2$
    2. $\sigma^2 < \sigma_0^2$
    3. $\sigma^2 \ne \sigma_0^2$
Test Statistic:
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
Rejection Region:
1. Reject H0 if $\chi^2 > \chi^2_{df,\alpha}$
2. Reject H0 if $\chi^2 < \chi^2_{df,1-\alpha}$
3. Reject H0 if either $\chi^2 < \chi^2_{df,1-\alpha/2}$ or $\chi^2 > \chi^2_{df,\alpha/2}$
Example: n = 7, ȳ = 2.48, s = 1.13
$$\chi^2 = \frac{(6)\,1.13^2}{1} = 7.66$$
In testing Ha: σ² > 1:
Reject H0 if $\chi^2 > \chi^2_{6,0.05} = 12.59$
Conclude: Do not reject H0.
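The same one-sample test can be written as a short Python sketch. The function name `chi2_variance_statistic` is a hypothetical helper, and the critical value 12.59 is the table entry for df = 6 and α = 0.05 quoted in the example rather than something computed here.

```python
def chi2_variance_statistic(n, s, sigma0_sq):
    """Test statistic (n - 1) s^2 / sigma0^2 for H0: sigma^2 = sigma0^2."""
    return (n - 1) * s ** 2 / sigma0_sq

stat = chi2_variance_statistic(n=7, s=1.13, sigma0_sq=1.0)
critical = 12.59  # chi^2_{6, 0.05}, the table value quoted in the example

# Upper-tailed test (Ha: sigma^2 > sigma0^2): reject only for large statistics.
reject = stat > critical
print(round(stat, 2), reject)
```

For the lower-tailed or two-sided alternatives, only the comparison line changes, using the lower-tail table entries listed in the rejection regions.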
Tests for Comparing Two Population Variances
Objective: Test for the equality of variances (homogeneity assumption).
$$\frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$$
has a probability distribution in repeated sampling which follows the F distribution.
The F distribution shape is defined by two parameters denoted the numerator degrees of freedom (ndf or df1) and the denominator degrees of freedom (ddf or df2).
[Figure: F density curves for F(2,5) and F(5,5), plotted for values from 0 to 10.]
F distribution:
• Can assume only positive values (like χ², unlike normal and t).
• Is nonsymmetrical (like χ², unlike normal and t).
• Many shapes -- shapes defined by numerator and denominator degrees of freedom.
• Tail values for specific values of df1 and df2 given in Table 8.
df1 relates to degrees of freedom associated with $s_1^2$.
df2 relates to degrees of freedom associated with $s_2^2$.
Table 8 (F Table)
Note this table has three things to specify in order to get the critical value: the numerator df (df1), the denominator df (df2), and the probability level.
[Table image: highlighted entries 4.28 and 5.82.]
Hypothesis Test for two population variances
H0: $\sigma_1^2 = \sigma_2^2$ versus
Ha: 1. $\sigma_1^2 > \sigma_2^2$
    2. $\sigma_1^2 \ne \sigma_2^2$
Test Statistic:
$$F = \frac{s_1^2}{s_2^2}$$
For one-tailed tests, define population 1 to be the one with larger hypothesized variance.
Rejection Region:
1. Reject H0 if $F > F_{df_1,df_2,\alpha}$.
2. Reject H0 if $F > F_{df_1,df_2,\alpha/2}$ or if $F < F_{df_1,df_2,1-\alpha/2}$.
In both cases, df1 = n1-1 and df2 = n2-1.
For any 0 < α < 1:
$$F_{df_1,df_2,1-\alpha} = \frac{1}{F_{df_2,df_1,\alpha}}$$
Example
Background Samples    Study Site Samples
n1 = 7                n2 = 7
ȳ1 = 2.48             ȳ2 = 4.82
s1 = 1.13             s2 = 0.89
T.S.
$$F = \frac{s_1^2}{s_2^2} = \frac{1.13^2}{0.89^2} = \frac{1.2769}{0.7921} = 1.612$$
R.R.
One-sided Alternative Hypothesis: Reject H0 if $F > F_{df_1,df_2,\alpha}$, where df1 = n1-1 and df2 = n2-1.
α = 0.05, $F_{6,6,0.05}$ = 4.28
Two-sided Alternative: Reject H0 if $F > F_{df_1,df_2,\alpha/2}$ or if $F < F_{df_1,df_2,1-\alpha/2}$.
α = 0.05, $F_{6,6,0.025}$ = 5.82, $F_{6,6,0.975}$ = 0.17
Conclusion: Do not reject H0 in either case.
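The two-sample example can be checked with the Python sketch below. It assumes the critical values 4.28 and 5.82 quoted above (df1 = df2 = 6) and obtains the lower two-sided critical value from the reciprocal identity; `f_statistic` is a hypothetical helper name.

```python
def f_statistic(s1, s2):
    """Ratio of sample variances s1^2 / s2^2 from two sample standard deviations."""
    return s1 ** 2 / s2 ** 2

F = f_statistic(1.13, 0.89)

# Critical values for df1 = df2 = 6, as quoted in the example (from the F table).
f_05 = 4.28    # F_{6,6,0.05}
f_025 = 5.82   # F_{6,6,0.025}
# Lower critical value via F_{df1,df2,1-a} = 1 / F_{df2,df1,a}; here df1 = df2,
# so swapping the degrees of freedom does not change the table entry.
f_975 = 1 / f_025

reject_one_sided = F > f_05
reject_two_sided = F > f_025 or F < f_975
print(round(F, 3), round(f_975, 2), reject_one_sided, reject_two_sided)
```

When the two sample sizes differ, the lower critical value needs the table entry with the degrees of freedom swapped, which is why the identity is stated with df2 before df1.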
(1-α)100% Confidence Interval for Ratio of Variances
$$\frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{df_1,df_2,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2} \cdot F_{df_2,df_1,\alpha/2}$$
Note: degrees of freedom have been swapped in the upper limit.
Example (95% CI): $F_{6,6,0.025} = 5.82$
$$\frac{1.13^2}{0.89^2} \cdot \frac{1}{F_{6,6,0.025}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1.13^2}{0.89^2} \cdot F_{6,6,0.025}$$
$$0.277 \le \frac{\sigma_1^2}{\sigma_2^2} \le 9.382$$
Note: not a ± argument! The interval is not symmetric about the point estimate 1.612.
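A minimal Python sketch of the ratio interval follows. `variance_ratio_ci` is a hypothetical helper; both F critical values are assumed to come from an F table, and in this example they coincide at 5.82 because df1 = df2 = 6.

```python
def variance_ratio_ci(s1, s2, f_upper_12, f_upper_21):
    """CI for sigma1^2 / sigma2^2 from tabulated F critical values.

    f_upper_12 = F_{df1,df2,alpha/2}; f_upper_21 = F_{df2,df1,alpha/2}
    (degrees of freedom swapped). Both values are supplied from an F table.
    """
    ratio = s1 ** 2 / s2 ** 2
    return ratio / f_upper_12, ratio * f_upper_21

# Background (s1 = 1.13) vs. study site (s2 = 0.89), 95% interval.
lower, upper = variance_ratio_ci(1.13, 0.89, f_upper_12=5.82, f_upper_21=5.82)
print(round(lower, 3), round(upper, 3))
```

With these inputs the limits come out near 0.277 and 9.38. The interval contains 1, which agrees with not rejecting equality of variances in the two-sided test.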
Conclusion
While the two sample test for variances looks simple (and is simple),
it forms the foundation for hypothesis testing in Experimental
Designs (ANOVA).
Nonparametric alternatives are:
• Levene’s Test (Minitab);
• Fligner-Killeen Test (R).
Software Commands for Chapters 5, 6 and 7
MINITAB
Stat -> Basic Statistics -> 1-Sample z, 1-Sample t, 2-Sample t, Paired t,
Variances, Normality Test.
-> Power and Sample Size -> 1-Sample z, 1-Sample t, 2-Sample t.
-> Nonparametrics -> Mann-Whitney (Wilcoxon Rank Sum Test)
-> 1-sample Wilcoxon (Wilcox. Signed Rank Test)
R
t.test( ): 1-Sample t, 2-Sample t, Paired t.
power.t.test( ): 1-Sample t, 2-Sample t, Paired t.
var.test( ): Tests for homogeneity of variances in normal populations.
wilcox.test( ): Nonparametric Wilcoxon Signed Rank & Rank Sum tests.
shapiro.test( ), ks.test( ): tests of normality.
Example
It’s claimed that moderate exposure to ozone increases lung capacity. 24
similar rats were randomly divided into 2 groups of 12, and the 2nd group
was exposed to ozone for 30 days. The lung capacity of all rats was
measured after this time.
No-Ozone Group: 8.7,7.9,8.3,8.4,9.2,9.1,8.2,8.1,8.9,8.2,8.9,7.5
Ozone Group: 9.4,9.8,9.9,10.3,8.9,8.8,9.8,8.2,9.4,9.9,12.2,9.3
• Basic Question: How to randomly select the rats?
• In class I will demonstrate the use of Minitab and R to analyze these data.
(See “Comparing two populations via two sample t-tests” in my R resources
webpage.)
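As a starting point for checking the equal-variance assumption on these data, the Python sketch below computes the two sample variances and their ratio. It is an illustration only, not the in-class Minitab or R analysis. No critical value is supplied because the slides do not quote an F table entry for df = (11, 11), so that lookup is left to the reader.

```python
import statistics

no_ozone = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]
ozone = [9.4, 9.8, 9.9, 10.3, 8.9, 8.8, 9.8, 8.2, 9.4, 9.9, 12.2, 9.3]

var_no = statistics.variance(no_ozone)  # sample variance, divisor n - 1
var_oz = statistics.variance(ozone)

# The Ozone group has the larger sample variance, so it goes in the numerator.
F = var_oz / var_no
df = (len(ozone) - 1, len(no_ozone) - 1)  # (numerator df, denominator df)
print(round(var_no, 3), round(var_oz, 3), round(F, 2), df)
```

The resulting F value would then be compared with the Table 8 entry for the printed degrees of freedom, keeping in mind that the test relies on both groups being normally distributed.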