Background Slides for CHEE824
• Hypothesis tests
– comparison of means
– comparison of variances
– power of a hypothesis test: Type I and Type II errors
• Joint confidence regions (for the linear case)
Hypothesis Tests
… are an alternative approach to confidence limits for
factoring in uncertainty in decision-making
Approach
– make a hypothesis statement
– use appropriate test statistic for statement
– consider the range of values of the test statistic that would be
likely to occur if the hypothesis were true
– compare the value of the test statistic estimated from the data to
this range - if the value is significant, the hypothesis is rejected;
otherwise the hypothesis is accepted
Example
Naphtha reformer in a refinery
» under old catalyst, octane number was 90
» under new catalyst, average octane number of 92 has
been estimated using a sample of 4 data points
» standard deviation of octane number in unit is known to
be 1.5
» has the octane number improved significantly?
» We could use confidence limits to answer this question
• for the mean, with known variance
• form interval, and see if old value (90) is contained in
interval for new mean
» consider a direct test … a hypothesis test
Example
Hypothesis test
Null hypothesis ("status quo"):
» $H_0: \mu = 90$
Alternate hypothesis:
» $H_a: \mu > 90$
– approach
» mean is estimated using sample average
» if observed average is within reasonable variation limits
of old mean, conclude that no significant change has
occurred
» reference distribution - Standard Normal
Example
» to compare with Standard Normal, we must standardize
» if the mean under the new catalyst were actually the old mean, then
$$\frac{\bar{X} - 90}{\sigma/\sqrt{4}}$$
would be distributed as a Standard Normal distribution
• observed values would vary accordingly
» now choose a fence - limit that contains 95% of values of
Standard Normal
» if observed value exceeds fence, then it is unlikely that
the mean under the new catalyst is equal to the old mean
• small chance of obtaining an observed average outside
this range
» if value exceeds fence, reject null hypothesis
Example
» Compute the test statistic value using the observed average of 92:
$$\frac{\bar{x} - 90}{\sigma/\sqrt{n}} = \frac{92 - 90}{1.5/\sqrt{4}} = 2.67$$
» now determine the fence - testing at the 5% significance level, the upper tail area is 0.05
• z = 1.65
» compare: 2.67 > 1.65 - conclude that the mean must be significantly higher, since the likelihood of obtaining an average of 92 when the true mean is 90 is very small
(fence: upper tail area is 0.05)
Note: we use only the upper tail here, because we are interested in testing whether the new mean is greater than the old mean.
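As a quick numerical check, here is a minimal Python sketch of this one-sided z-test; the numbers 92, 90, 1.5 and n = 4 come from the slides, while the use of scipy is an assumption for illustration only.

```python
from scipy import stats

xbar, mu0, sigma, n = 92.0, 90.0, 1.5, 4   # values from the octane example
z = (xbar - mu0) / (sigma / n ** 0.5)      # standardized test statistic, about 2.67
fence = stats.norm.ppf(0.95)               # upper-tail fence for alpha = 0.05, about 1.645
print(z, fence, z > fence)                 # z exceeds the fence, so H0 is rejected
```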
Example
– there is a small chance (0.05) that we could obtain an
observed average that would lie outside the fence even
though the mean had not changed
» in this case, we would erroneously reject the null
hypothesis, and conclude that the catalyst had caused a
significant increase
» referred to as a “Type I error” - false rejection
• this would happen 5% of the time
• to reduce, move fence further to the extreme of the
distribution - reduce upper tail area
» α = 0.05 is the "significance level"
• (1 - α) is sometimes referred to as the "confidence level"
» α is a tuning parameter for the hypothesis test
Hypothesis Tests
Review sequence
1) formulate the hypotheses: $H_0: \mu = 90$, $H_a: \mu > 90$
2) form the test statistic: $\dfrac{\bar{X} - 90}{\sigma/\sqrt{4}}$
3) compare to the "fence" value: z = 1.65
4) in this case, reject the null hypothesis
Types of Hypothesis Tests
One-sided tests
– null hypothesis - parameter equal to old value
– alternate hypothesis - parameter >, < old value
» e.g., $H_0: \mu = 90$
        $H_a: \mu > 90$
Two-sided tests
– null hypothesis - parameter equal to old value
– alternate hypothesis - parameter not equal to old value
(could be greater than, less than)
» e.g., $H_0: \mu = 90$
        $H_a: \mu \neq 90$
In two-sided tests, two fences are used (upper,
lower), and significance area is split evenly
between lower and upper tails.
Hypothesis Tests for Means
… with known variance
Two-Sided Test - at the α significance level
Hypotheses: $H_0: \mu = \mu_0$
            $H_a: \mu \neq \mu_0$
Test Statistic: $\dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Fences: $-z_{\alpha/2}$, $z_{\alpha/2}$
Reject $H_0$ (rejection region) if $|\bar{X} - \mu_0| > z_{\alpha/2}\,\sigma/\sqrt{n}$
Hypothesis Tests for Means
… with known variance
One-Sided Test - at the α significance level
Hypotheses: $H_0: \mu = \mu_0$
            $H_a: \mu > \mu_0$
Test Statistic: $\dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Fence: $z_{\alpha}$
Reject $H_0$ (rejection region) if $\dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha}$
Hypothesis Tests for Means
… with known variance
One-Sided Test - at the α significance level
Hypotheses: $H_0: \mu = \mu_0$
            $H_a: \mu < \mu_0$
Test Statistic: $\dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
Fence: $-z_{\alpha} = z_{1-\alpha}$
Reject $H_0$ (rejection region) if $\dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha}$
Hypothesis Tests for Means
When the variance is unknown, we estimate using the
sample variance.
Test statistic
– "standardize" using the sample standard deviation:
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Reference distribution – becomes the Student's t distribution
– degrees of freedom are those of the sample variance
» n - 1
Hypothesis Tests for Means
… with unknown variance
Two-Sided Test - at the α significance level
Hypotheses: $H_0: \mu = \mu_0$
            $H_a: \mu \neq \mu_0$
Test Statistic: $\dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Fences: $-t_{n-1,\alpha/2} = t_{n-1,1-\alpha/2}$ and $t_{n-1,\alpha/2}$
Reject $H_0$ (rejection region) if $|\bar{X} - \mu_0| > t_{n-1,\alpha/2}\, s/\sqrt{n}$
Hypothesis Tests for Means
… with unknown variance
One-Sided Test - at the α significance level
Hypotheses: $H_0: \mu = \mu_0$
            $H_a: \mu > \mu_0$
Test Statistic: $\dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Fence: $t_{n-1,\alpha}$
Reject $H_0$ (rejection region) if $\dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} > t_{n-1,\alpha}$
Hypothesis Tests for Means
… with unknown variance
One-Sided Test - at the α significance level
Hypotheses: $H_0: \mu = \mu_0$
            $H_a: \mu < \mu_0$
Test Statistic: $\dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Fence: $-t_{n-1,\alpha} = t_{n-1,1-\alpha}$
Reject $H_0$ (rejection region) if $\dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} < -t_{n-1,\alpha}$
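To illustrate these t-tests numerically, here is a minimal Python sketch with a small hypothetical data set (the data values and the use of scipy are assumptions for illustration, not part of the slides):

```python
import numpy as np
from scipy import stats

x = np.array([91.2, 92.5, 90.8, 93.1])   # hypothetical sample of octane numbers
mu0 = 90.0
t = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(len(x)))   # test statistic with s estimated
fence = stats.t.ppf(0.975, df=len(x) - 1)                  # two-sided fence, alpha = 0.05
print(t, fence, abs(t) > fence)

# scipy computes the same statistic directly (with a two-sided p-value)
print(stats.ttest_1samp(x, popmean=mu0))
```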
Hypothesis Tests for Variances
• Hypotheses
» e.g., $H_0: \sigma^2 = \sigma_0^2$
        $H_a: \sigma^2 > \sigma_0^2$
• Test Statistic
» since $s^2 \sim \dfrac{\sigma^2}{n-1}\,\chi^2_{n-1}$, then under the null hypothesis
$$\frac{(n-1)\,s^2}{\sigma_0^2} \sim \chi^2_{n-1}$$
is the test statistic
Hypothesis Tests for Variances
Two-Sided Test - at the α significance level
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$
            $H_a: \sigma^2 \neq \sigma_0^2$
Test Statistic: $\dfrac{(n-1)\,s^2}{\sigma_0^2}$
Fences: $\chi^2_{n-1,\,1-\alpha/2}$ and $\chi^2_{n-1,\,\alpha/2}$
Reject $H_0$ (rejection region) if $\dfrac{(n-1)\,s^2}{\sigma_0^2} < \chi^2_{n-1,\,1-\alpha/2}$ or $\dfrac{(n-1)\,s^2}{\sigma_0^2} > \chi^2_{n-1,\,\alpha/2}$
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$
            $H_a: \sigma^2 > \sigma_0^2$
Test Statistic: $\dfrac{(n-1)\,s^2}{\sigma_0^2}$
Fence: $\chi^2_{n-1,\,\alpha}$
Reject $H_0$ (rejection region) if $\dfrac{(n-1)\,s^2}{\sigma_0^2} > \chi^2_{n-1,\,\alpha}$
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
Hypotheses: $H_0: \sigma^2 = \sigma_0^2$
            $H_a: \sigma^2 < \sigma_0^2$
Test Statistic: $\dfrac{(n-1)\,s^2}{\sigma_0^2}$
Fence: $\chi^2_{n-1,\,1-\alpha}$
Reject $H_0$ (rejection region) if $\dfrac{(n-1)\,s^2}{\sigma_0^2} < \chi^2_{n-1,\,1-\alpha}$
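A minimal Python sketch of the one-sided (increase) version of this test, with hypothetical numbers; note that `chi2.ppf(1 - alpha, df)` returns the point with upper tail area α, i.e. the slide's $\chi^2_{n-1,\alpha}$:

```python
from scipy import stats

n, s2, sigma0_sq, alpha = 20, 2.8, 2.0, 0.05    # hypothetical n, sample variance, H0 variance
stat = (n - 1) * s2 / sigma0_sq                  # (n-1) s^2 / sigma_0^2
fence = stats.chi2.ppf(1 - alpha, df=n - 1)      # chi^2_{n-1, alpha}: upper tail area alpha
print(stat, fence, stat > fence)                 # reject H0 (variance increased) only if stat > fence
```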
Outline
• random samples
• notion of a statistic
• estimating the mean - sample average
• assessing the impact of variation on estimates - sampling distribution
• estimating variance - sample variance and standard deviation
• making decisions - comparisons of means, variances using confidence intervals, hypothesis tests
• comparisons between samples
Comparisons Between Two Samples
So far, we have tested means and variances against
known values
» can we compare estimates of means (or variances)
between two samples?
» Issue - uncertainty is present in both quantities and must
be considered
Common Question
» do both samples come from the same underlying parent
population?
» e.g., compare populations before and after a specific
treatment
Preparing to Compare Samples
Experimental issues
» ensure that data is collected in a randomized order for
each sample
• ensure that there are no systematic effects - e.g., catalyst
deactivation, changes in ambient conditions, cooling water
heating up gradually
» blocking - subject experiments to the same conditions to
ensure quantities other than those of interest aren't changing
Comparison of Variances
… is typically conducted prior to comparing means
» recall that the standardization required for a hypothesis test (or
confidence interval) for the mean requires use of the standard
deviation, so we should compare variances first before choosing the
appropriate mean comparison
Approach
» focus on the ratio of variances $\sigma_1^2 / \sigma_2^2$
• is this ratio = 1?
• will be assessed using sample variances
» what should we use for a reference distribution?
Comparison of Variances
Test Statistic
– for use in both hypothesis tests and confidence
intervals
The quantity
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}$$
follows the F-distribution
» $n_1$ and $n_2$ are the numbers of points in the samples used
to compute $s_1^2$ and $s_2^2$ respectively
The F Distribution
… arises from the ratio of two Chi-squared random
variables, each divided by its degrees of freedom:
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1} \qquad \text{since} \qquad s^2 \sim \frac{\sigma^2}{n-1}\,\chi^2_{n-1}$$
» the sample variance is a sum of squared Normal random variables
» dividing by the population variance standardizes them, and
the expression becomes a sum of squared standard Normal r.v.'s,
i.e., Chi-squared
Confidence Interval Approach
Form a probability statement for this test statistic:
$$P\!\left(F_{n_1-1,\,n_2-1,\,1-\alpha/2} \le \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \le F_{n_1-1,\,n_2-1,\,\alpha/2}\right) = 1-\alpha$$
and rearrange:
$$P\!\left(\frac{s_1^2}{s_2^2\,F_{n_1-1,\,n_2-1,\,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2\,F_{n_1-1,\,n_2-1,\,1-\alpha/2}}\right) = 1-\alpha$$
Confidence Interval Approach
100(1-α)% Confidence Interval
$$\frac{s_1^2}{s_2^2\,F_{n_1-1,\,n_2-1,\,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2\,F_{n_1-1,\,n_2-1,\,1-\alpha/2}}$$
Approach:
» compute the confidence interval
» determine whether "1" lies in the interval
• if so - identical variances is a reasonable conjecture
• if not - different variances
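A minimal sketch of this confidence interval with hypothetical sample variances and sizes; with scipy's percent-point function, `f.ppf(1 - alpha/2, ...)` corresponds to the slide's $F_{n_1-1,n_2-1,\alpha/2}$ (upper tail area α/2) and `f.ppf(alpha/2, ...)` to $F_{n_1-1,n_2-1,1-\alpha/2}$:

```python
from scipy import stats

s1_sq, n1 = 4.1, 15    # hypothetical sample variance and size for sample 1
s2_sq, n2 = 2.6, 12    # hypothetical sample variance and size for sample 2
alpha = 0.05

F_upper = stats.f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)   # F_{n1-1, n2-1, alpha/2}
F_lower = stats.f.ppf(alpha / 2, n1 - 1, n2 - 1)       # F_{n1-1, n2-1, 1-alpha/2}
lo = s1_sq / (s2_sq * F_upper)
hi = s1_sq / (s2_sq * F_lower)
print(lo, hi, lo <= 1.0 <= hi)   # if 1 lies inside, "equal variances" is a reasonable conjecture
```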
Hypothesis Test Approach
Typical approach
– use a 1-sided test, with the test direction dictated by
which variance is larger
Test Statistic
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$$
Under the null hypothesis, we are assuming that $\sigma_1^2/\sigma_2^2 = 1$,
so the test statistic reduces to $\dfrac{s_1^2}{s_2^2}$
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
For $s_1^2 > s_2^2$
Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$
            $H_a: \sigma_1^2 > \sigma_2^2$
Test Statistic: $\dfrac{s_1^2}{s_2^2}$
Fence: $F_{n_1-1,\,n_2-1,\,\alpha}$
Reject $H_0$ if $\dfrac{s_1^2}{s_2^2} > F_{n_1-1,\,n_2-1,\,\alpha}$
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
For $s_2^2 > s_1^2$
Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$
            $H_a: \sigma_2^2 > \sigma_1^2$
Test Statistic: $\dfrac{s_2^2}{s_1^2}$   (Why the reversal? see the next slide)
Fence: $F_{n_2-1,\,n_1-1,\,\alpha}$
Reject $H_0$ if $\dfrac{s_2^2}{s_1^2} > F_{n_2-1,\,n_1-1,\,\alpha}$
Why the reversal?
• Property of the F-distribution
• typically, we would compare $\dfrac{s_1^2}{s_2^2}$ against $F_{n_1-1,\,n_2-1,\,1-\alpha}$
• Problem
» tables for upper tail areas of 1-α are not always available
• Solution - use the following fact for F-distributions:
$$F_{\nu_1,\,\nu_2,\,1-\alpha} = \frac{1}{F_{\nu_2,\,\nu_1,\,\alpha}}$$
• to use this, reverse the test ratio - previous slide
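This reciprocal property is easy to check numerically; a quick sketch with arbitrary degrees of freedom (scipy assumed; recall that the slide's subscript is an upper tail area, so $F_{\nu_1,\nu_2,1-\alpha}$ corresponds to `f.ppf(alpha, nu1, nu2)`):

```python
from scipy import stats

nu1, nu2, alpha = 30, 20, 0.05
lhs = stats.f.ppf(alpha, nu1, nu2)              # F_{nu1, nu2, 1-alpha} (upper tail area 1-alpha)
rhs = 1.0 / stats.f.ppf(1 - alpha, nu2, nu1)    # 1 / F_{nu2, nu1, alpha} (upper tail area alpha)
print(lhs, rhs)                                 # the two values agree
```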
Example
Global warming problem from tutorial:
» $s_1$ - standard deviation for March '99 is 3.2 °C
» $s_2$ - standard deviation for March '98 is 2.3 °C
» each is estimated using 31 data points
» has the variance of temperature readings increased in 1999?
» first, work with the variances:
• 1999: 10.2 °C²
• 1998: 5.3 °C²
» since a) we are interested in whether the variance increased,
and b) the 1999 variance (10.2) is greater than the 1998
variance (5.3), use the ratio $s_1^2 / s_2^2$
Example
Hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$
            $H_a: \sigma_1^2 > \sigma_2^2$
» observed value of the ratio = 1.94
» "fence value" - test at the 5% significance level:
• $F_{31-1,\,31-1,\,0.05} = 1.84$
» since the observed value of the test statistic exceeds the fence
value, reject the null hypothesis
• the variance has increased
Note
» if we had conducted the test at the 1% significance level
(F = 2.39), we would not have rejected the null hypothesis
Example
Now use confidence intervals to compare the variances:
$$\frac{s_1^2}{s_2^2\,F_{n_1-1,\,n_2-1,\,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2\,F_{n_1-1,\,n_2-1,\,1-\alpha/2}}$$
» use a 95% confidence interval - the outer tail area is 2.5% on
each side
» this is a 2-tailed interval, so we need
$$F_{n_1-1,\,n_2-1,\,\alpha/2} = F_{31-1,\,31-1,\,0.025} = 2.07$$
$$F_{n_1-1,\,n_2-1,\,1-\alpha/2} = F_{31-1,\,31-1,\,0.975} = 1/F_{31-1,\,31-1,\,0.025} = 0.48$$
Example
Confidence interval:
$$\frac{10.2}{5.3\,(2.07)} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{10.2}{5.3\,(0.48)}$$
$$0.93 \le \frac{\sigma_1^2}{\sigma_2^2} \le 4.0$$
Conclusion
» since 1 is contained in this interval, we conclude that the
variances are the same
» why does the conclusion differ from the hypothesis test?
• 2-sided confidence interval vs. 1-sided hypothesis test
• in the confidence interval, 1 is close to the lower boundary
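The slide's numbers can be reproduced with a short Python sketch (the variances 10.2 and 5.3 and the 31-point samples are from the slide; scipy is assumed, and small differences from the tabulated 1.84, 2.07 and 0.48 are due to rounding):

```python
from scipy import stats

s1_sq, s2_sq, n = 10.2, 5.3, 31         # 1999 and 1998 sample variances, 31 points each
ratio = s1_sq / s2_sq                   # observed F ratio, about 1.9

# one-sided hypothesis test at the 5% significance level
fence = stats.f.ppf(0.95, n - 1, n - 1)     # F_{30,30,0.05}, about 1.84
print(ratio, fence, ratio > fence)          # the ratio exceeds the fence -> reject H0

# 95% confidence interval for sigma1^2 / sigma2^2
F_hi = stats.f.ppf(0.975, n - 1, n - 1)     # about 2.07
F_lo = stats.f.ppf(0.025, n - 1, n - 1)     # about 0.48 (equals 1/F_hi here since the dof are equal)
print(ratio / F_hi, ratio / F_lo)           # roughly 0.93 to 4.0, which contains 1
```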
Comparing Means
The appropriate approach depends on:
» whether variances are known
» whether a test of sample variances indicates that
variances can be considered to be equal
• measurements coming from same population
Assumption: data are Normally distributed
The approach is similar; however, the form depends on
the conditions above
» form test statistic
» use reference distribution
» re-arrange (confidence intervals) or compare to fence
(hypothesis tests)
Comparing Means
Known Variances
» if the variances are known ($\sigma_1^2$, $\sigma_2^2$), then
$$(\bar{X}_1 - \bar{X}_2) \sim N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
» now we can standardize to obtain our test statistic
$$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim Z$$
Note - we are assuming that the samples used for the averages are independent.
Comparing Means
Known Variances
Confidence Interval
» form a probability statement for the test statistic as a Standard
Normal random variable
» re-arrange the interval
» procedure analogous to that for a mean with known variance
$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; (\mu_1 - \mu_2) \;\le\; (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Comparing Means
Known Variances
Hypothesis Test (Two-Sided)
Hypotheses: $H_0: \mu_1 = \mu_2$
            $H_a: \mu_1 \neq \mu_2$
Test Statistic: $\dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Fences: $-z_{\alpha/2}$, $z_{\alpha/2}$
Reject $H_0$ if $\left|\dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right| > z_{\alpha/2}$
Comparing Means
Unknown Variance
– appropriate choice depends on whether variances can
be considered equal or are different
» test using comparison of variances
» if variances can be considered to be equal, assume that
we are sampling from populations with the same variance
» pool variance estimate to obtain estimate with more
degrees of freedom
Pooling Variance
– If the variances can reasonably be considered to be the
same, then we can assume that we are sampling from
populations with the same variance
» convert the sample variances back to sums of squares, add
them together, and divide by the combined number of
degrees of freedom
$$s_1^2 = \frac{1}{n_1-1}\sum_{i=1}^{n_1}\left(X_{1,i} - \bar{X}_1\right)^2 \;\Rightarrow\; (n_1-1)\,s_1^2 = \sum_{i=1}^{n_1}\left(X_{1,i} - \bar{X}_1\right)^2$$
» a similar procedure can be followed for $s_2^2$
Pooling Variance
– We have obtained the original sum of squares from
each sample variance
– combine to form the overall sum of squares
$$SS_{overall} = (n_1-1)\,s_1^2 + (n_2-1)\,s_2^2$$
– degrees of freedom
$$\nu_{overall} = (n_1-1) + (n_2-1) = n_1 + n_2 - 2$$
– pooled variance estimate
$$s_p^2 = \frac{(n_1-1)\,s_1^2 + (n_2-1)\,s_2^2}{n_1 + n_2 - 2}$$
Comparing Means
Unknown Variance - "Equal Variances"
Confidence Intervals
$$(\bar{X}_1 - \bar{X}_2) - t_{\nu,\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; (\mu_1 - \mu_2) \;\le\; (\bar{X}_1 - \bar{X}_2) + t_{\nu,\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
» recall that $t_{\nu,1-\alpha/2} = -t_{\nu,\alpha/2}$
» since the variance is estimated, we use the t-distribution as the
reference distribution
» degrees of freedom $\nu = (n_1-1) + (n_2-1)$
» if 0 lies in this interval, the means are not different
Comparing Means
Unknown Variance - "Equal Variances"
Hypothesis Test
Hypotheses: $H_0: \mu_1 = \mu_2$
            $H_a: \mu_1 \neq \mu_2$
Test Statistic: $\dfrac{\bar{X}_1 - \bar{X}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
Fences: $-t_{\nu,\alpha/2} = t_{\nu,1-\alpha/2}$ and $t_{\nu,\alpha/2}$
Reject $H_0$ if $\left|\dfrac{\bar{X}_1 - \bar{X}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}\right| > t_{\nu,\alpha/2}$
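A minimal sketch of the pooled ("equal variances") two-sample test with hypothetical data; scipy's `ttest_ind` with `equal_var=True` uses the same pooled statistic:

```python
import numpy as np
from scipy import stats

x1 = np.array([10.2, 9.8, 10.5, 10.1, 9.9])     # hypothetical sample 1
x2 = np.array([9.4, 9.7, 9.5, 9.9, 9.3, 9.6])   # hypothetical sample 2
n1, n2 = len(x1), len(x2)

sp2 = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)  # pooled variance
t = (x1.mean() - x2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))                 # test statistic
fence = stats.t.ppf(0.975, df=n1 + n2 - 2)                                     # two-sided fence, alpha = 0.05
print(t, fence, abs(t) > fence)

print(stats.ttest_ind(x1, x2, equal_var=True))   # same t statistic, with a two-sided p-value
```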
Comparing Means
Unknown Variance - "Unequal Variances"
– the test becomes an approximation
• approach
» test statistic
$$\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
» reference distribution - Student's t distribution
» estimate an "equivalent" number of degrees of freedom
Comparing Means
Unknown Variance - "Unequal Variances"
– equivalent number of degrees of freedom
– the degrees of freedom $\nu$ is the largest integer less than or
equal to
$$\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
Comparing Means
Unknown Variance - "Unequal Variances"
Confidence Intervals
» similar to the case of known variances, but using the sample
variances and the t-distribution
$$(\bar{X}_1 - \bar{X}_2) - t_{\nu,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \;\le\; (\mu_1 - \mu_2) \;\le\; (\bar{X}_1 - \bar{X}_2) + t_{\nu,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
» the degrees of freedom $\nu$ is the effective number of degrees
of freedom (from the previous slide)
» recall that $t_{\nu,1-\alpha/2} = -t_{\nu,\alpha/2}$
» if 0 isn't contained in the interval, conclude that the means differ
Comparing Means
Unknown Variance - "Unequal Variances"
Hypothesis Test
Hypotheses: $H_0: \mu_1 = \mu_2$
            $H_a: \mu_1 \neq \mu_2$
Test Statistic: $\dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
Fences: $t_{\nu,\,1-\alpha/2}$, $t_{\nu,\,\alpha/2}$
Reject $H_0$ if $\left|\dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}\right| > t_{\nu,\alpha/2}$
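A matching sketch for the unequal-variances case with hypothetical data; scipy's `ttest_ind(..., equal_var=False)` applies the same approximation:

```python
import numpy as np
from scipy import stats

x1 = np.array([5.1, 4.8, 5.6, 5.3, 4.9, 5.2])   # hypothetical sample 1
x2 = np.array([4.2, 4.9, 3.8, 4.5, 5.0])        # hypothetical sample 2
v1, v2 = x1.var(ddof=1) / len(x1), x2.var(ddof=1) / len(x2)

t = (x1.mean() - x2.mean()) / np.sqrt(v1 + v2)                              # test statistic
nu = (v1 + v2) ** 2 / (v1 ** 2 / (len(x1) - 1) + v2 ** 2 / (len(x2) - 1))   # equivalent dof
fence = stats.t.ppf(0.975, df=int(nu))          # largest integer <= nu; two-sided, alpha = 0.05
print(t, nu, fence, abs(t) > fence)

print(stats.ttest_ind(x1, x2, equal_var=False))   # the unequal-variance (Welch) version in scipy
```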
Paired Comparisons for Means
Previous approach
» 2 data sets obtained from 2 processes
» compute average, sample variance for EACH data set
» compare differences between sample averages
Issue
» extraneous variation is present because we have
conducted one experimental program for process 1, and
a distinct experimental program for process 2
» this additional variation reduces the sensitivity of the tests
• location of fences depends in part on extent of variation
» can we conduct experiments in a paired manner so that
they have as much variation in common as possible, and
extraneous variation is eliminated?
Paired Comparisons of Means
Approach
» set up pairs of experimental runs with as much in
common as possible
» collect pairs of observations for each experimental run -
process 1, process 2
» compute differences
» conduct a confidence interval or hypothesis test on the
mean of the differences, using the average of the
differences in the test statistic
• variance estimated using the sample variance of the
differences
» test to see if the mean of the differences is plausibly zero
(no difference in population means)
Paired Comparison of Means
Example - oxide thickness on
silicon wafers
» runs at two positions in
a furnace
» run pairs of tests with a
wafer in each location
Furnace Position
     A      B    difference
   920    923        -3
   914    924       -10
   927    913        14
   891    881        10
   943    923        20
   902    884        18
   910    887        23
   856    858        -2
   937    916        21
   857    857         0
             average      9.1
             variance     141.6556
             std          11.90191
Paired Comparison of Means
Confidence Interval
$$\bar{D} - t_{n-1,\alpha/2}\, s_d/\sqrt{n} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{D} + t_{n-1,\alpha/2}\, s_d/\sqrt{n}$$
» $\bar{D}$ and $s_d$ are the average and standard deviation of the
differences
» conclude that the means are identical if zero is contained in the
interval
» n is the number of data points in the paired samples (e.g., 10
pairs)
Paired Comparison of Means
Hypothesis Test
Hypotheses: $H_0: \mu_1 - \mu_2 = 0$
            $H_a: \mu_1 - \mu_2 \neq 0$
Test Statistic: $\dfrac{\bar{D}}{s_d/\sqrt{n}}$
Fences: $t_{n-1,\,1-\alpha/2}$, $t_{n-1,\,\alpha/2}$
Reject $H_0$ if $\left|\dfrac{\bar{D}}{s_d/\sqrt{n}}\right| > t_{n-1,\,\alpha/2}$
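Applied to the oxide-thickness differences tabulated two slides back, a minimal sketch (scipy assumed) gives a test statistic of about 2.42 against a two-sided 5% fence of about 2.26:

```python
import numpy as np
from scipy import stats

d = np.array([-3, -10, 14, 10, 20, 18, 23, -2, 21, 0])   # A - B differences from the table
n = len(d)
t = d.mean() / (d.std(ddof=1) / np.sqrt(n))   # 9.1 / (11.90 / sqrt(10)), about 2.42
fence = stats.t.ppf(0.975, df=n - 1)          # two-sided fence at alpha = 0.05, about 2.26
print(t, fence, abs(t) > fence)               # the mean difference is significant at the 5% level
```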
“Tuning” Hypothesis Tests
What significance level should we use for a hypothesis
test?
(rejection region)
The rejection region has area α. If the
null hypothesis were actually true,
there is probability α that the
observed value would fall outside
the fences, and we would erroneously
reject the null hypothesis
⇒ FALSE REJECTION
- referred to as a Type I error
Adjusting the False Rejection Rate
… is achieved by moving the fences further out
» use a higher threshold as a basis to reject null
hypothesis
» i.e., make the outer tail area α SMALLER
» e.g., instead of testing at 5% significance (95%
confidence level), test at 1% significance level (99%
confidence level)
Failure to Detect
Suppose the mean has actually increased.
Failure to detect region: observed values of the test
statistic falling in this region should in fact lead to
rejection; however they don't, because
they fall within the acceptance
region - FAILURE TO REJECT
⇒ referred to as a Type II error,
which has a probability β of
occurring
(The false rejection region has area α.)
Failure to Detect
The probability of a Type II error
depends on:
– the size of the shift to be
detected
– the location of the fence - the
significance level (Type I
error probability)
– these influence the degree of
overlap of the two
distributions, and thus
the overlap area
(Area of the overlap region = β.)
Failure to Detect
The distribution for $\bar{X}$ is
standardized as:
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
however, if the true mean has
shifted, this is not a standard Normal
random variable.
⇒ if the new mean has shifted by δ,
then we must use
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{X} - (\mu_0 + \delta)}{\sigma/\sqrt{n}} + \frac{\delta}{\sigma/\sqrt{n}}$$
as the standardized form
Failure to Detect
Computing β - for a 1-sided hypothesis test
» the outer tail area on the high side is α
» the fence value is $z_\alpha$
» the Type II error probability is $P\!\left(\dfrac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \le z_\alpha\right)$ where $\bar{X}$ has
mean $\mu_0 + \delta$
» in order to compute the probability of a Type II error, convert $\bar{X}$
to a standard Normal:
$$\beta = P\!\left(\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \le z_\alpha\right) = P\!\left(\frac{\bar{X}-(\mu_0+\delta)}{\sigma/\sqrt{n}} \le z_\alpha - \frac{\delta}{\sigma/\sqrt{n}}\right) = P\!\left(Z \le z_\alpha - \frac{\delta}{\sigma}\sqrt{n}\right)$$
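A minimal sketch of this β calculation, reusing the earlier octane numbers (σ = 1.5, n = 4) and a hypothetical shift δ; `norm.cdf` supplies P(Z ≤ ·):

```python
import numpy as np
from scipy import stats

alpha, sigma, n = 0.05, 1.5, 4          # significance level, known sigma and n from the octane example
delta = 2.0                             # hypothetical size of the shift to detect
z_alpha = stats.norm.ppf(1 - alpha)     # one-sided fence z_alpha
beta = stats.norm.cdf(z_alpha - delta * np.sqrt(n) / sigma)   # P(Z <= z_alpha - delta*sqrt(n)/sigma)
print(beta, 1 - beta)                   # Type II error probability and the power of the test
```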
Failure to Detect
Introduce
$$d = \frac{\delta}{\sigma}$$
» the size of the shift as a multiple of the standard deviation of X
(population)
» in general there is no convenient analytical expression for β
» summarize in graphs referred to as Operating
Characteristic Curves
» 1 - β is called the POWER of the hypothesis test
Operating Characteristic Curve
• Example shape of the curve
(Plot: probability of failing to reject, β, on the vertical axis (0 to 1)
versus size of shift d on the horizontal axis (0 to 4), for a fixed value
of α; curves are shown for increasing sample size n = 1, 5, 50, with
larger n giving a smaller β for a given shift.)
Operating Characteristic Curve
• Illustrates trade-off between false detection/failure to
detect for fixed sample size
• Use - examples
» given desired false detection, failure to detect rates,
determine sample size required to detect given shift
» given sample size and false detection rate, determine
failure to detect rate given size of shift
(Diagram: the trade-off links the false detection rate, the failure to
detect rate, and the sample size n.)
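For the known-variance, one-sided test the first use above also has a closed-form counterpart to reading the OC curves: n ≈ ((z_α + z_β)σ/δ)² is a standard result (not from the slides) for the smallest sample size meeting both error rates. A minimal sketch with hypothetical numbers:

```python
import math
from scipy import stats

alpha, beta = 0.05, 0.10     # desired false detection and failure-to-detect rates
delta, sigma = 1.0, 1.5      # hypothetical shift to detect and known standard deviation

z_a, z_b = stats.norm.ppf(1 - alpha), stats.norm.ppf(1 - beta)
n = math.ceil(((z_a + z_b) * sigma / delta) ** 2)   # smallest n with P(Z <= z_alpha - delta*sqrt(n)/sigma) <= beta
print(n)
```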
Operating Characteristic Curves
… are available for:
» 2-sided hypothesis test for mean
• variance known
• variance unknown
» 1-sided hypothesis test for mean
• variance known
• variance unknown
» tests for variance
Joint Confidence Region (JCR)
… answers the question
Where do the true values of the parameters lie?
Recall that for individual parameters, we gain an understanding of
where the true value lies by:
» examining the variability pattern (distribution) for the
parameter estimate
» identifying a range in which most of the values of the
parameter estimate are likely to lie
» manipulating this range to determine an interval which is
likely to contain the true value of the parameter
Joint Confidence Region
Confidence interval for an individual parameter:
Step 1) The ratio of the estimation error to the standard deviation of the
estimate is distributed as a Student's t-distribution, with degrees of freedom
equal to those of the variance estimate:
$$\frac{\hat{\beta}_i - \beta_i}{s_{\hat{\beta}_i}} \sim t_\nu$$
Step 2) Find the interval $[-t_{\nu,\alpha/2},\, t_{\nu,\alpha/2}]$ which contains $100(1-\alpha)\%$
of the values - i.e., the probability of a t-value falling in this interval is $(1-\alpha)$
Step 3) Rearrange this interval to obtain the interval $\hat{\beta}_i \pm t_{\nu,\alpha/2}\, s_{\hat{\beta}_i}$,
which contains the true value of the parameter $100(1-\alpha)\%$ of the time
Joint Confidence Region
Comments on Individual Confidence Intervals:
» sometimes referred to as marginal confidence intervals -
cf. marginal distributions vs. joint distributions from earlier
» marginal confidence intervals do NOT account for
correlations between the parameter estimates
» examining only marginal confidence intervals can
sometimes be misleading if there is strong correlation
between several parameter estimates
• the value of one parameter estimate depends in part on another
• deletion of the other changes the value of the parameter
estimate
• decision to retain might be altered
Joint Confidence Region
Sequence:
Step 1) Identify a statistic which is a function of the parameter
estimate statistics
Step 2) Identify a region in which values of this statistic lie a certain
fraction of the time (a $100(1-\alpha)\%$ region)
Step 3) Use this information to determine a region which contains
the true value of the parameters $100(1-\alpha)\%$ of the time
Joint Confidence Region
The quantity
$$\frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{p\, s^2} \sim F_{p,\,n-p}$$
where $s^2$ is an estimate of the inherent noise variance (if MSE is used, its
degrees of freedom are n-p),
is the ratio of two sums of squares, and is distributed as an F-distribution with p
degrees of freedom in the numerator, and n-p degrees of freedom in the
denominator
Joint Confidence Region
We can define a region by thinking of those values of the ratio which have a value
less than $F_{p,\,n-p,\,1-\alpha}$, i.e.,
$$\frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{p\, s^2} \le F_{p,\,n-p,\,1-\alpha}$$
Rearranging yields:
$$(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta) \le p\, s^2\, F_{p,\,n-p,\,1-\alpha}$$
Joint Confidence Region - Definition
The $100(1-\alpha)\%$ joint confidence region for the parameters is defined as
those parameter values $\beta$ satisfying:
$$(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta) \le p\, s^2\, F_{p,\,n-p,\,1-\alpha}$$
Interpretation:
» the region defined by this inequality contains the true values of
the parameters $100(1-\alpha)\%$ of the time
» if values of zero for one or more parameters lie in this region,
those parameters are plausibly zero, and consideration should
be given to dropping the corresponding terms from the model
Joint Confidence Region - Example with 2 Parameters
Let's reconsider the solder thickness example:
$$X^T X = \begin{bmatrix} 10 & 2367 \\ 2367 & 563335 \end{bmatrix}; \qquad \hat{\beta} = \begin{bmatrix} 458.10 \\ -1.13 \end{bmatrix}; \qquad s^2 = 135.38$$
95% Joint Confidence Region (JCR) for slope & intercept:
$$(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta) = \begin{bmatrix} \hat{\beta}_0 - \beta_0 & \hat{\beta}_1 - \beta_1 \end{bmatrix} X^T X \begin{bmatrix} \hat{\beta}_0 - \beta_0 \\ \hat{\beta}_1 - \beta_1 \end{bmatrix} \le p\, s^2\, F_{p,\,n-p,\,1-\alpha} = 2\, s^2\, F_{2,\,10-2,\,0.95}$$
Joint Confidence Region - Example with 2 Parameters
95% Joint Confidence Region (JCR) for slope & intercept:
$$\begin{bmatrix} 458.10 - \beta_0 & -1.13 - \beta_1 \end{bmatrix} X^T X \begin{bmatrix} 458.10 - \beta_0 \\ -1.13 - \beta_1 \end{bmatrix} \le 2\,(135.38)\, F_{2,\,8,\,0.95} = 2\,(135.38)(4.46) = 1207.59$$
The boundary is an ellipse...
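A sketch that reproduces this bound and checks whether a candidate (intercept, slope) pair lies inside the 95% JCR, using the XᵀX, β̂ and s² values from the slide (numpy/scipy assumed; the candidate point is hypothetical):

```python
import numpy as np
from scipy import stats

XtX = np.array([[10.0, 2367.0], [2367.0, 563335.0]])   # X'X from the slide
b_hat = np.array([458.10, -1.13])                      # least squares estimates (intercept, slope)
s2, p, n = 135.38, 2, 10

bound = p * s2 * stats.f.ppf(0.95, p, n - p)           # 2 * 135.38 * F_{2,8,0.95}, about 1207.6
beta = np.array([450.0, -1.1])                         # hypothetical candidate parameter values
lhs = (b_hat - beta) @ XtX @ (b_hat - beta)            # quadratic form on the left-hand side
print(bound, lhs, lhs <= bound)                        # the candidate lies inside the 95% JCR if lhs <= bound
```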
Joint Confidence Region - Example with 2 Parameters
(Plot: the 95% JCR ellipse in the (intercept, slope) plane, with the
intercept axis running from roughly 320 to 600 and the slope axis
from roughly -1.6 to -0.6.)
» the ellipse is rotated - implies correlation between the estimates
of slope and intercept
» the region is centred at the least squares parameter estimates
» the greater "shadow" along the horizontal axis means the variance
of the intercept estimate is greater than that of the slope
Interpreting Joint Confidence Regions
1) Are the axes of the ellipse aligned with the coordinate axes?
» is the ellipse horizontal or vertical?
» alignment indicates no correlation between the parameter estimates
2) Which axis has the greatest shadow?
» projection of ellipse along axis
» indicates which parameter estimate has the greatest variance
3) The elliptical region is, by definition, centred at the least squares
parameter estimates
4) Long, narrow, rotated ellipses indicate significant correlation between
parameter estimates
5) If a value of zero for one or more parameters lies in the region, these
parameters are plausibly zero - consider deleting from model
Joint Confidence Regions
What is the motivation for the ratio
$$\frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{p\, s^2}$$
used to define the joint confidence region?
Consider the joint distribution for the parameter estimates:
$$\frac{1}{(2\pi)^{p/2}\sqrt{\det(\Sigma_{\hat{\beta}})}}\exp\!\left\{-\tfrac{1}{2}(\hat{\beta} - \beta)^T \Sigma_{\hat{\beta}}^{-1}(\hat{\beta} - \beta)\right\}$$
Substitute in the estimate for the parameter covariance matrix, $(X^T X)^{-1} s^2$:
$$(\hat{\beta} - \beta)^T \left((X^T X)^{-1} s^2\right)^{-1}(\hat{\beta} - \beta) = \frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{s^2}$$
Confidence Intervals from Densities
(Figure: on the left, the marginal density $f_{\hat{\beta}}(b)$ with lower and upper
limits marking an individual interval that encloses area = 1-alpha; on the
right, the joint density $f_{\hat{\beta}_0,\hat{\beta}_1}(b_0, b_1)$ over the $(b_0, b_1)$ plane, where the
joint confidence region encloses volume = 1-alpha.)
Relationship to Marginal Confidence Limits
(Figure: the JCR ellipse in the (intercept, slope) plane - intercept axis
roughly 320 to 600, slope axis roughly -1.6 to -0.6 - centred at the least
squares parameter estimates, with the marginal confidence interval for
the slope and the marginal confidence interval for the intercept marked
along the corresponding axes.)
Relationship to Marginal Confidence Limits
(Figure: the same plot, comparing the rectangular 95% confidence region
implied by considering the parameters individually (formed from the
marginal confidence intervals for slope and intercept) with the elliptical
95% confidence region for the parameters considered jointly.)
Relationship to Marginal Confidence Intervals
Marginal confidence intervals are contained in the joint confidence
region
» potential to miss portions of plausible parameter values
at the tails of the ellipsoid
» using individual confidence intervals implies a
rectangular region, which includes sets of parameter
values that lie outside the joint confidence region
» both situations can lead to
• erroneous acceptance of terms in model
• erroneous rejection of terms in model