Islamic University, Gaza - Palestine
Chapter 3 Experiments with a Single Factor: The
Analysis of Variance
3.1 An Example
• Chapter 2: a single-factor experiment with two levels of the factor
• Consider single-factor experiments with a levels of the factor, a ≥ 2
• Example:
– Response: the tensile strength of a new synthetic fiber
– Factor: the weight percent of cotton
– Five levels: 15%, 20%, 25%, 30%, 35%
– a = 5 and n = 5
• Does changing the cotton
weight percent change the
mean tensile strength?
• Is there an optimum level for
cotton content?
3.2 The Analysis of Variance
• a levels (treatments) of a factor and n replicates for each
level.
• yij: the jth observation taken under factor level or treatment
i.
Models for the Data
• Means model:
$$y_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$
– $y_{ij}$ is the ijth observation,
– $\mu_i$ is the mean of the ith factor level,
– $\varepsilon_{ij}$ is a random error with mean zero.
• Effects model:
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$
• Linear statistical model
• One-way or single-factor analysis of variance model
• Completely randomized design: the experiments are performed in random order so that the environment in which the treatments are applied is as uniform as possible.
• For hypothesis testing, the model errors are assumed to be normally and independently distributed random variables with mean zero and variance σ², i.e. $y_{ij} \sim N(\mu + \tau_i, \sigma^2)$
• Fixed effects model: the a levels have been specifically chosen by the experimenter.
3.3 Analysis of the Fixed Effects Model
• Interested in testing the equality of the a treatment means, where E(yij) = μ + τi = μi, i = 1, 2, …, a:
H0: μ1 = μ2 = … = μa
H1: μi ≠ μj for at least one pair (i, j)
• Constraint (restraint): with the overall mean defined as $\mu = \sum_{i=1}^{a} \mu_i / a$,
$$\sum_{i=1}^{a} \tau_i = 0$$
• Equivalent hypotheses: H0: τ1 = τ2 = … = τa = 0 vs. H1: τi ≠ 0 for at least one i
• Notation:
$$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}$$
$$\bar{y}_{i.} = y_{i.}/n, \qquad \bar{y}_{..} = y_{..}/N, \qquad N = an$$
N = an: the total number of observations.
3.3.1 Decomposition of the Total Sum of Squares
• Partition the total variability into its component parts.
• The total sum of squares (a measure of overall variability in the data):
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$
• Degrees of freedom: an – 1 = N – 1
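$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} \left[(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})\right]^2$$
$$= n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$
$$SS_T = SS_{Treatments} + SS_E$$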
• SS_Treatments: the sum of squares of the differences between the treatment averages and the grand average (sum of squares due to treatments), with a – 1 degrees of freedom.
• SS_E: the sum of squares of the differences of the observations within treatments from the treatment average (sum of squares due to error), with N – a degrees of freedom.
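The decomposition can be checked numerically. The sketch below is only an illustration: the raw observations are not given in these slides, so the values used are an assumption (the figures usually quoted for this tensile strength example), and the variable names are made up for the example.

```python
# Tensile strength (psi) by cotton weight percent.
# ASSUMPTION: the raw observations do not appear in the slides; these are the
# values commonly quoted for this textbook example.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

a = len(data)        # number of treatments
n = len(data[15])    # replicates per treatment (balanced design)
N = a * n            # total number of observations

grand_mean = sum(sum(obs) for obs in data.values()) / N
means = {level: sum(obs) / n for level, obs in data.items()}

# Total, between-treatment and within-treatment sums of squares.
ss_total = sum((y - grand_mean) ** 2 for obs in data.values() for y in obs)
ss_treatments = n * sum((m - grand_mean) ** 2 for m in means.values())
ss_error = sum((y - means[level]) ** 2
               for level, obs in data.items() for y in obs)

print(f"SS_T                 = {ss_total:.2f}")
print(f"SS_Treatments + SS_E = {ss_treatments:.2f} + {ss_error:.2f}")
```

With these assumed observations the two printed lines should agree, which is the identity SS_T = SS_Treatments + SS_E.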
• A large value of SS_Treatments reflects large differences in treatment means.
• A small value of SS_Treatments likely indicates no differences in treatment means.
• df_Total = df_Treatments + df_Error
• SS_E/(N – a) pools the within-treatment sample variances $S_i^2$ and estimates σ²:
$$\frac{SS_E}{N-a} = \frac{(n-1)S_1^2 + \cdots + (n-1)S_a^2}{(n-1) + \cdots + (n-1)}$$
• If there are no differences between the a treatment means, the following is also an estimate of σ²:
$$\frac{SS_{Treatments}}{a-1} = \frac{n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2}{a-1}$$
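• Mean squares:
$$MS_{Treatments} = \frac{SS_{Treatments}}{a-1}, \qquad MS_E = \frac{SS_E}{N-a}$$
$$E(MS_E) = \frac{1}{N-a}\, E\left(\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2\right) = \sigma^2$$
$$E(MS_{Treatments}) = \sigma^2 + \frac{n\sum_{i=1}^{a} \tau_i^2}{a-1}$$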
3.3.2 Statistical Analysis
• Assumption: the errors $\varepsilon_{ij}$ are normally and independently distributed with mean zero and variance σ²
• SS_E/σ² ~ chi-square(N – a). Under H0, SS_T/σ² ~ chi-square(N – 1) and SS_Treatments/σ² ~ chi-square(a – 1), and SS_E/σ² and SS_Treatments/σ² are independent (Theorem 3.1)
• H0: τ1 = τ2 = … = τa = 0 vs. H1: τi ≠ 0 for at least one i
• Test statistic: F0 = MS_Treatments / MS_E, which follows the F distribution with a – 1 and N – a degrees of freedom under H0
• Reject H0 if $F_0 > F_{\alpha, a-1, N-a}$
• Computing formulas for the sums of squares (see page 71):
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments} = \frac{1}{n}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N}$$
$$SS_E = SS_T - SS_{Treatments}$$
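The computing formulas and the F test can be sketched in a few lines. This is a minimal illustration, not the software that produced the output shown in these slides: the function name `one_way_anova` is hypothetical, the raw observations are assumed (they are not in the slides), and the critical value 2.87 is the tabulated $F_{0.05, 4, 20}$ entered by hand.

```python
def one_way_anova(groups):
    """One-way ANOVA via the computing formulas; group sizes may differ."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / N                  # y..^2 / N
    ss_total = sum(y * y for g in groups for y in g) - correction
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_treat
    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / (N - a)
    return {
        "SS_T": ss_total, "SS_Treatments": ss_treat, "SS_E": ss_error,
        "MS_Treatments": ms_treat, "MS_E": ms_error,
        "F0": ms_treat / ms_error,
    }


# ASSUMPTION: raw tensile strength data for 15%, 20%, 25%, 30%, 35% cotton.
groups = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]

F_CRIT = 2.87  # ASSUMPTION: tabulated F_{0.05, 4, 20}

table = one_way_anova(groups)
reject_h0 = table["F0"] > F_CRIT
for name, value in table.items():
    print(f"{name:>14s} = {value:8.2f}")
print("Reject H0 at the 5% level:", reject_h0)
```

Because each treatment total is divided by its own group size, the same function also covers the unbalanced case.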
Response: Strength
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source       Sum of Squares   DF   Mean Square   F Value   Prob > F
Model            475.76        4     118.94       14.76    < 0.0001
A                475.76        4     118.94       14.76    < 0.0001
Pure Error       161.20       20       8.06
Cor Total        636.96       24

Std. Dev.     2.84     R-Squared        0.7469
Mean         15.04     Adj R-Squared    0.6963
C.V.         18.88     Pred R-Squared   0.6046
PRESS       251.88     Adeq Precision   9.294
3.3.3 Estimation of the Model Parameters
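• Model: $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
• Estimators:
$$\hat{\mu} = \bar{y}_{..}, \qquad \hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}, \qquad \hat{\mu}_i = \hat{\mu} + \hat{\tau}_i = \bar{y}_{i.}$$
• Confidence intervals: since $\bar{y}_{i.} \sim N(\mu_i, \sigma^2/n)$,
$$\bar{y}_{i.} - t_{\alpha/2, N-a}\sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i.} + t_{\alpha/2, N-a}\sqrt{\frac{MS_E}{n}}$$
$$\bar{y}_{i.} - \bar{y}_{j.} - t_{\alpha/2, N-a}\sqrt{\frac{2MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + t_{\alpha/2, N-a}\sqrt{\frac{2MS_E}{n}}$$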
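As a rough sketch of how the estimators and intervals above could be evaluated, the following uses the assumed tensile strength observations (not given in the slides) and a hand-entered table value $t_{0.025, 20} = 2.086$; the helper names are illustrative only.

```python
import math

# ASSUMPTION: raw observations are not in the slides.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
T_CRIT = 2.086  # ASSUMPTION: tabulated t_{0.025, 20}

a, n = len(data), 5
N = a * n
means = {level: sum(obs) / n for level, obs in data.items()}
ms_error = sum((y - means[level]) ** 2
               for level, obs in data.items() for y in obs) / (N - a)

# Point estimates of the overall mean and the treatment effects.
mu_hat = sum(sum(obs) for obs in data.values()) / N
tau_hat = {level: m - mu_hat for level, m in means.items()}


def ci_mean(level):
    """100(1 - alpha)% interval for one treatment mean."""
    half = T_CRIT * math.sqrt(ms_error / n)
    return means[level] - half, means[level] + half


def ci_difference(level_i, level_j):
    """100(1 - alpha)% interval for the difference of two treatment means."""
    half = T_CRIT * math.sqrt(2 * ms_error / n)
    diff = means[level_i] - means[level_j]
    return diff - half, diff + half


ci_30 = ci_mean(30)
ci_30_vs_35 = ci_difference(30, 35)
print("mu_hat =", round(mu_hat, 2))
print("95% CI for mu at 30% cotton:", [round(v, 2) for v in ci_30])
print("95% CI for mu_30 - mu_35:   ", [round(v, 2) for v in ci_30_vs_35])
```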
• Example 3.3 (page 75)
• Simultaneous confidence intervals (Bonferroni method): to construct a set of r simultaneous confidence intervals on treatment means with an overall confidence level of at least 100(1 – α)%, use a 100(1 – α/r)% C.I. for each.
3.3.4 Unbalanced Data
• Let $n_i$ observations be taken under treatment i, i = 1, 2, …, a, with $N = \sum_{i=1}^{a} n_i$ (for example, when some of the measured data are missing).
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N}$$
$$SS_{Treatments} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N}$$
• Two advantages of a balanced design:
1. The test statistic is relatively insensitive to small
departures from the assumption of equal variance for the
a treatments if the sample sizes are equal.
2. The power of the test is maximized if the samples are of
equal size.
3.4 Model Adequacy Checking
• Assumptions: $y_{ij} \sim N(\mu + \tau_i, \sigma^2)$
• The examination of residuals
• Definition of residual:
$$e_{ij} = y_{ij} - \hat{y}_{ij}, \qquad \hat{y}_{ij} = \hat{\mu} + \hat{\tau}_i = \bar{y}_{..} + (\bar{y}_{i.} - \bar{y}_{..}) = \bar{y}_{i.}$$
• The residuals should be structureless.
3.4.1 The Normality Assumption
• Plot a histogram of the residuals
• Plot a normal probability plot of the residuals
• See Table 3-6
• The error distribution may be
– slightly skewed (the right tail is longer than the left tail), or
– light-tailed (the left tail of the error distribution is thinner than the corresponding tail of the standard normal).
• Outliers
• Possible causes of outliers: mistakes in calculations, data coding, copying, …
• Sometimes outliers are more informative than the rest of the data.
• Detect outliers: examine the standardized residuals,
$$d_{ij} = \frac{e_{ij}}{\sqrt{MS_E}}$$
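A small sketch of the residual check follows. The observations are assumed (they are not in the slides), and the cutoff of 3 standard deviations is only a common rule of thumb chosen for the illustration.

```python
import math

# ASSUMPTION: raw tensile strength observations, not given in the slides.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
a, n = len(data), 5
N = a * n

# Fitted values are the treatment averages, so e_ij = y_ij - ybar_i.
means = {level: sum(obs) / n for level, obs in data.items()}
residuals = {level: [y - means[level] for y in obs]
             for level, obs in data.items()}

ms_error = sum(e * e for res in residuals.values() for e in res) / (N - a)
standardized = {level: [e / math.sqrt(ms_error) for e in res]
                for level, res in residuals.items()}

# Rule of thumb (an assumption here): |d_ij| > 3 marks a potential outlier.
flagged = [(level, j + 1, round(d, 2))
           for level, ds in standardized.items()
           for j, d in enumerate(ds) if abs(d) > 3]

all_residuals = [e for res in residuals.values() for e in res]
print("largest residual :", round(max(all_residuals), 2))
print("smallest residual:", round(min(all_residuals), 2))
print("potential outliers:", flagged)
```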
3.4.2 Plot of Residuals in Time Sequence
• Plotting the residuals in time order of data collection is
helpful in detecting correlation between the residuals.
• Independence assumption
[Figure: Residuals vs. Run. Residuals plotted against run number (1 to 25).]
3.4.3 Plot of Residuals Versus Fitted Values
• Plot the residuals versus the fitted values
• The plot should be structureless
[Figure: Residuals vs. Predicted. Residuals plotted against the predicted values.]
• Nonconstant variance: the variance of the observations increases as the magnitude of the observations increases.
• If the factor levels having the larger variance also have the smaller sample sizes, the actual type I error rate is larger than anticipated.
• Variance-stabilizing transformations:

Distribution   Transformation
Poisson        Square root: $y_{ij}^* = \sqrt{y_{ij}}$
Lognormal      Logarithmic: $y_{ij}^* = \log y_{ij}$
Binomial       Arcsin: $y_{ij}^* = \arcsin\sqrt{y_{ij}}$
• Statistical tests for equality of variance:
$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_a^2 \quad \text{vs.} \quad H_1: \text{above not true for at least one } \sigma_i^2$$
– Bartlett's test:
$$\chi_0^2 = 2.3026\,\frac{q}{c}$$
$$q = (N-a)\log_{10} S_p^2 - \sum_{i=1}^{a} (n_i - 1)\log_{10} S_i^2$$
$$c = 1 + \frac{1}{3(a-1)}\left(\sum_{i=1}^{a} (n_i-1)^{-1} - (N-a)^{-1}\right)$$
$$S_p^2 = \frac{\sum_{i=1}^{a} (n_i-1)S_i^2}{N-a}$$
– Reject the null hypothesis if $\chi_0^2 > \chi^2_{\alpha, a-1}$
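Bartlett's statistic is straightforward to compute from the sample variances. The sketch below follows the formulas above; the raw observations are an assumption (they are not printed in the slides), and the critical value 9.49 is the $\chi^2_{0.05, 4}$ quoted for Example 3.4.

```python
import math
from statistics import variance

# ASSUMPTION: raw tensile strength observations, not given in the slides.
groups = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]

a = len(groups)
N = sum(len(g) for g in groups)
s2 = [variance(g) for g in groups]          # sample variances S_i^2

# Pooled variance S_p^2.
sp2 = sum((len(g) - 1) * v for g, v in zip(groups, s2)) / (N - a)

q = (N - a) * math.log10(sp2) - sum(
    (len(g) - 1) * math.log10(v) for g, v in zip(groups, s2))
c = 1 + (sum(1 / (len(g) - 1) for g in groups) - 1 / (N - a)) / (3 * (a - 1))
chi2_0 = 2.3026 * q / c

CHI2_CRIT = 9.49   # chi-square_{0.05, 4}, as quoted in the slides
reject_equal_variances = chi2_0 > CHI2_CRIT

print(f"S_p^2 = {sp2:.2f}, q = {q:.4f}, c = {c:.2f}")
print(f"chi2_0 = {chi2_0:.2f}, reject H0: {reject_equal_variances}")
```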
• Example 3.4: the test statistic is $\chi_0^2 = 0.93$ and $\chi^2_{0.05, 4} = 9.49$, so the hypothesis of equal variances is not rejected.
• Bartlett's test is sensitive to the normality assumption
• The modified Levene test:
– Use the absolute deviation of each observation in a treatment from the treatment median, $\tilde{y}_i$:
$$d_{ij} = |y_{ij} - \tilde{y}_i|, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n_i$$
– If the mean deviations are equal, the variances of the observations in all treatments are the same.
– The test statistic for Levene's test is the usual ANOVA F statistic for testing equality of means, applied to the $d_{ij}$.
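The modified Levene test reuses the one-way ANOVA F statistic on the absolute deviations from the medians. A minimal sketch, with hypothetical function names and with the tensile strength observations assumed (they are not in the slides), might look like this:

```python
from statistics import median


def anova_f(groups):
    """One-way ANOVA F statistic; group sizes may differ."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    correction = sum(sum(g) for g in groups) ** 2 / N
    ss_total = sum(y * y for g in groups for y in g) - correction
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_treat
    return (ss_treat / (a - 1)) / (ss_error / (N - a))


def modified_levene_f(groups):
    """F statistic computed on d_ij = |y_ij - median of treatment i|."""
    deviations = [[abs(y - median(g)) for y in g] for g in groups]
    return anova_f(deviations)


# ASSUMPTION: raw tensile strength observations, not given in the slides.
tensile = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]
f_tensile = modified_levene_f(tensile)
print(f"Modified Levene F0 = {f_tensile:.3f}")
```

The resulting F0 is then compared with $F_{\alpha, a-1, N-a}$ in the usual way; a small value gives no evidence against equal variances.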
• Example 3.5:
– Four methods of estimating flood flow frequency (see Table 3.7)
– ANOVA table (Table 3.8)
– The plot of residuals vs. fitted values (Figure 3.7)
– Modified Levene's test: F0 = 4.55 with P-value = 0.0137. Reject the null hypothesis of equal variances.
• Let E(y) = μ and $\sigma_y \propto \mu^{\alpha}$
• Find $y^* = y^{\lambda}$ that yields a constant variance.
• $\sigma_{y^*} \propto \mu^{\lambda + \alpha - 1}$, so λ = 1 – α gives a constant variance.

Variance-Stabilizing Transformations
Relationship between σ_y and μ    α      λ = 1 – α   Transformation
σ_y ∝ constant                    0      1           No transformation
σ_y ∝ μ^(1/2)                     1/2    1/2         Square root
σ_y ∝ μ                           1      0           Log
σ_y ∝ μ^(3/2)                     3/2    –1/2        Reciprocal square root
σ_y ∝ μ^2                         2      –1          Reciprocal
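• How to find α:
• Use $S_i$ as an estimate of $\sigma_{y_i}$ and $\bar{y}_{i.}$ as an estimate of $\mu_i$ in
$$\log \sigma_{y_i} = \log\theta + \alpha\log\mu_i$$
so that the slope of a plot of $\log S_i$ against $\log \bar{y}_{i.}$ estimates α.
• See Figure 3.8, Table 3.10 and Figure 3.9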
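The slope of log S_i against log ȳ_i. can be obtained with an ordinary least-squares fit. The sketch below uses made-up treatment averages and standard deviations (chosen so that S_i is exactly proportional to the square root of the average); they are not the values from Example 3.5.

```python
import math

# HYPOTHETICAL summary statistics, constructed so that S_i = 0.5 * sqrt(ybar_i).
treatment_means = [1.0, 4.0, 9.0, 16.0]
treatment_sds = [0.5, 1.0, 1.5, 2.0]

x = [math.log(m) for m in treatment_means]   # log ybar_i.
y = [math.log(s) for s in treatment_sds]     # log S_i

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Least-squares slope of log S_i on log ybar_i. estimates alpha.
alpha_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
lambda_hat = 1 - alpha_hat                   # power for y* = y**lambda

print(f"alpha = {alpha_hat:.2f}, lambda = {lambda_hat:.2f}")
```

For these constructed numbers the estimated slope is one half, which points to the square root transformation in the table above.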
3.5 Practical Interpretation of Results
• Conduct the experiment => perform the statistical analysis => investigate the underlying assumptions => draw practical conclusions
3.5.1 A Regression Model
• Qualitative factor: compare the difference between the levels
of the factors.
• Quantitative factor: develop an interpolation equation for the
response variable.
• Regression analysis: see Figure 3.1
• Final equation in terms of actual factors (an empirical model of the experimental results):
Strength = +62.61143 – 9.01143 × (Cotton Weight %) + 0.48143 × (Cotton Weight %)^2 – 7.60000E-003 × (Cotton Weight %)^3
[Figure: Strength plotted against X = A: Cotton Weight % (15.00 to 35.00), with the fitted cubic curve.]
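The fitted cubic can be used as an interpolation equation. As an illustrative sketch (the grid search and the names below are choices made for this example, not part of the slides), the equation can be evaluated at the five tested levels and scanned for the cotton content with the highest predicted strength:

```python
def strength(cotton_pct):
    """Empirical cubic model quoted in the slides (actual factor units)."""
    x = cotton_pct
    return 62.61143 - 9.01143 * x + 0.48143 * x ** 2 - 7.60000e-3 * x ** 3


levels = [15, 20, 25, 30, 35]
fitted = {x: strength(x) for x in levels}

# Coarse grid search over the experimental region only (15% to 35%).
grid = [15 + 0.01 * k for k in range(2001)]
best_x = max(grid, key=strength)

for x, value in fitted.items():
    print(f"{x}% cotton -> predicted strength {value:.2f}")
print(f"highest predicted strength near {best_x:.2f}% cotton")
```

The model is empirical, so predictions should not be extrapolated outside the 15% to 35% range that was tested.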
3.5.2 Comparisons Among Treatment Means
• If the hypothesis of equal treatment means is rejected, we still do not know which specific means are different.
• Determining which specific means differ following an ANOVA is called the multiple comparisons problem.
3.5.3 Graphical Comparisons of Means
3.5.4 Contrast
• A contrast: a linear combination of the parameters of the form
$$\Gamma = \sum_{i=1}^{a} c_i\mu_i, \qquad \sum_{i=1}^{a} c_i = 0$$
• H0: Γ = 0 vs. H1: Γ ≠ 0
• Two methods for this test.
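The first method (t test), written in terms of the treatment totals $y_{i.}$:
$$\text{Let } C = \sum_{i=1}^{a} c_i y_{i.}. \quad \text{Then } Var(C) = n\sigma^2\sum_{i=1}^{a} c_i^2$$
$$\text{Under } H_0, \quad \frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{n\sigma^2\sum_{i=1}^{a} c_i^2}} \sim N(0, 1)$$
$$\text{Hence the statistic } t_0 = \frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{n\,MS_E\sum_{i=1}^{a} c_i^2}} \sim t_{N-a}$$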
• The second method (F test):
$$F_0 = t_0^2 = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{n\,MS_E\sum_{i=1}^{a} c_i^2} \sim F_{1, N-a}$$
$$F_0 = \frac{MS_C}{MS_E} = \frac{SS_C/1}{MS_E}, \qquad SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{n\sum_{i=1}^{a} c_i^2}$$
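The C.I. for a contrast, $\Gamma = \sum_{i=1}^{a} c_i\mu_i$, written in terms of the treatment averages $\bar{y}_{i.}$:
$$\text{Let } C = \sum_{i=1}^{a} c_i\bar{y}_{i.}. \quad \text{Then } Var(C) = \frac{\sigma^2}{n}\sum_{i=1}^{a} c_i^2$$
$$\text{Hence the C.I. is } \sum_{i=1}^{a} c_i\bar{y}_{i.} \pm t_{\alpha/2, N-a}\sqrt{\frac{MS_E}{n}\sum_{i=1}^{a} c_i^2}$$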
• Unequal sample sizes:
1. The contrast coefficients must satisfy $\sum_{i=1}^{a} n_i c_i = 0$
2. $$t_0 = \frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{MS_E\sum_{i=1}^{a} n_i c_i^2}}$$
3. $$SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{\sum_{i=1}^{a} n_i c_i^2}$$
3.5.5 Orthogonal Contrast
• Two contrasts with coefficients {ci} and {di} are orthogonal if $\sum_{i=1}^{a} c_i d_i = 0$
• For a treatments, a set of a – 1 orthogonal contrasts partitions the sum of squares due to treatments into a – 1 independent single-degree-of-freedom components. Thus, tests performed on orthogonal contrasts are independent.
• See Example 3.6 (page 94)
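The partition of SS_Treatments by orthogonal contrasts can be illustrated numerically. In this sketch the treatment totals are assumed (they are not printed in the slides) and the four contrasts are one illustrative orthogonal set, not necessarily the set used in Example 3.6; MS_E and SS_Treatments are taken from the ANOVA table.

```python
from itertools import combinations

# ASSUMPTION: treatment totals y_i. for 15%, 20%, 25%, 30%, 35% cotton.
totals = [49, 77, 88, 108, 54]
n = 5
MS_E = 8.06              # from the ANOVA table in the slides
SS_TREATMENTS = 475.76   # from the ANOVA table in the slides

# An illustrative set of a - 1 = 4 mutually orthogonal contrasts.
contrasts = {
    "30% vs 35%":            [0, 0, 0, -1, 1],
    "15%,25% vs 30%,35%":    [1, 0, 1, -1, -1],
    "15% vs 25%":            [1, 0, -1, 0, 0],
    "20% vs all the others": [-1, 4, -1, -1, -1],
}


def contrast_ss(c):
    """SS_C = (sum c_i y_i.)^2 / (n sum c_i^2) for a balanced design."""
    return sum(ci * yi for ci, yi in zip(c, totals)) ** 2 / (
        n * sum(ci * ci for ci in c))


# Each coefficient vector sums to zero and every pair is orthogonal.
sums_to_zero = all(sum(c) == 0 for c in contrasts.values())
orthogonal = all(sum(ci * di for ci, di in zip(c, d)) == 0
                 for c, d in combinations(contrasts.values(), 2))

ss = {name: contrast_ss(c) for name, c in contrasts.items()}
for name, value in ss.items():
    print(f"{name:24s} SS_C = {value:7.2f}  F0 = {value / MS_E:6.2f}")
print(f"sum of SS_C = {sum(ss.values()):.2f} (SS_Treatments = {SS_TREATMENTS})")
```

Each F0 has 1 and N – a degrees of freedom, and the four contrast sums of squares add up to SS_Treatments.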
3.5.6 Scheffe’s Method for Comparing All Contrasts
• Scheffé (1953) proposed a method for comparing any and all possible contrasts between treatment means.
$$\text{Suppose } \Gamma_u = c_{1u}\mu_1 + \cdots + c_{au}\mu_a, \qquad u = 1, 2, \ldots, m$$
$$C_u = \sum_{i=1}^{a} c_{iu}\bar{y}_{i.} \quad \text{and} \quad S_{C_u} = \sqrt{MS_E\sum_{i=1}^{a} (c_{iu}^2/n_i)}$$
$$\text{The critical value: } S_{\alpha,u} = S_{C_u}\sqrt{(a-1)F_{\alpha, a-1, N-a}}$$
$$\text{If } |C_u| > S_{\alpha,u}, \text{ then reject } H_0: \Gamma_u = 0$$
• See pages 95 and 96
3.5.7 Comparing Pairs of Treatment Means
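• Compare all pairs of the a treatment means
• Tukey's test:
– The studentized range statistic:
$$q = \frac{\bar{y}_{max} - \bar{y}_{min}}{\sqrt{MS_E/n}}$$
where $\bar{y}_{max}$ and $\bar{y}_{min}$ are the largest and smallest sample means out of a group of p sample means.
– The critical point, with f the number of degrees of freedom of $MS_E$, is
$$T_\alpha = q_\alpha(a, f)\sqrt{\frac{MS_E}{n}}$$
or, for unequal sample sizes,
$$T_\alpha = \frac{q_\alpha(a, f)}{\sqrt{2}}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
– Two means are declared different if the absolute value of their sample difference exceeds $T_\alpha$.
– See Example 3.7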
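A short sketch of Tukey's procedure for the balanced case follows. The treatment averages are assumed (they are not printed in the slides), MS_E comes from the ANOVA table, and the studentized range value $q_{0.05}(5, 20) = 4.23$ is a table entry typed in by hand.

```python
import math
from itertools import combinations

# ASSUMPTION: treatment averages by cotton weight percent.
means = {15: 9.8, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}
MS_E = 8.06    # from the ANOVA table in the slides
n = 5
Q_CRIT = 4.23  # ASSUMPTION: tabulated q_{0.05}(a = 5, f = 20)

# Critical difference T_alpha for equal sample sizes.
t_alpha = Q_CRIT * math.sqrt(MS_E / n)

# Pairs whose averages differ by more than T_alpha.
significant = [(i, j, round(abs(means[i] - means[j]), 1))
               for i, j in combinations(means, 2)
               if abs(means[i] - means[j]) > t_alpha]

print(f"T_0.05 = {t_alpha:.2f}")
for i, j, diff in significant:
    print(f"{i}% vs {j}%: |difference| = {diff}")
```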
• Sometimes the overall F test from the ANOVA is significant, but the pairwise comparisons of means fail to reveal any significant differences.
• The F test simultaneously considers all possible contrasts involving the treatment means, not just pairwise comparisons.
The Fisher Least Significant Difference (LSD) Method
• For H0: μi = μj, the test statistic is
$$t_0 = \frac{\bar{y}_{i.} - \bar{y}_{j.}}{\sqrt{MS_E(1/n_i + 1/n_j)}}$$
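• The least significant difference (LSD):
$$LSD = t_{\alpha/2, N-a}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
• Declare μi and μj different if $|\bar{y}_{i.} - \bar{y}_{j.}| > LSD$.
• See Example 3.8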
Duncan’s Multiple Range Test
• The a treatment averages are arranged in ascending order, and the standard error of each average is determined as
$$S_{\bar{y}_{i.}} = \sqrt{\frac{MS_E}{n_h}}, \qquad n_h = \frac{a}{\sum_{i=1}^{a} (1/n_i)}$$
• Assuming equal sample sizes, the significant ranges are
$$R_p = r_\alpha(p, f)\,S_{\bar{y}_{i.}}, \qquad p = 2, 3, \ldots, a$$
• There are a(a – 1)/2 pairs in total
• Example 3.9
The Newman-Keuls Test
• Similar to Duncan's multiple range test
• The critical values:
$$K_p = q_\alpha(p, f)\,S_{\bar{y}_{i.}}$$
3.5.8 Comparing Treatment Means with a Control
• Assume one of the treatments is a control, and the analyst is
interested in comparing each of the other a – 1 treatment
means with the control.
• Test H0: μi = μa vs. H1: μi ≠ μa, i = 1, 2, …, a – 1
• Dunnett (1964)
• Compute
$$|\bar{y}_{i.} - \bar{y}_{a.}|, \qquad i = 1, 2, \ldots, a-1$$
• Reject H0 if
$$|\bar{y}_{i.} - \bar{y}_{a.}| > d_\alpha(a-1, f)\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_a}\right)}$$
• Example 3.10
3.7 Determining Sample Size
• Determine the number of replicates to run
3.7.1 Operating Characteristic Curves (OC Curves)
• OC curves: a plot of the type II error probability of a statistical test,
$$\beta = 1 - P\{\text{Reject } H_0 \mid H_0 \text{ is false}\} = 1 - P(F_0 > F_{\alpha, a-1, N-a} \mid H_0 \text{ is false})$$
• If H0 is false, then F0 = MS_Treatments / MS_E ~ noncentral F with a – 1 and N – a degrees of freedom and noncentrality parameter δ
• Chart V of the Appendix
• Determine
$$\Phi^2 = \frac{n\sum_{i=1}^{a} \tau_i^2}{a\sigma^2}$$
• Let μi be the specified treatment means. Then the estimates of τi are
$$\tau_i = \mu_i - \bar{\mu}, \qquad \bar{\mu} = \sum_{i=1}^{a} \mu_i / a$$
• For σ², use prior experience, a previous experiment, a preliminary test, or a judgment estimate.
• Example 3.11
• Difficulty: how to select a set of treatment means on which the sample size decision should be based.
• Another approach: select a sample size such that if the difference between any two treatment means exceeds a specified value D, the null hypothesis should be rejected.
$$\Phi^2 = \frac{nD^2}{2a\sigma^2}$$
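Both expressions for Φ² are simple to evaluate before turning to the OC curves. The sketch below uses hypothetical planning values (five treatment means, σ = 3, D = 10) purely for illustration; reading β from Chart V is still a manual step and is not attempted here.

```python
def phi2_from_means(mus, sigma, n):
    """Phi^2 = n * sum(tau_i^2) / (a * sigma^2), with tau_i = mu_i - mean(mu)."""
    a = len(mus)
    mu_bar = sum(mus) / a
    return n * sum((mu - mu_bar) ** 2 for mu in mus) / (a * sigma ** 2)


def phi2_from_difference(D, sigma, a, n):
    """Phi^2 = n * D^2 / (2 * a * sigma^2) for a specified difference D."""
    return n * D ** 2 / (2 * a * sigma ** 2)


# HYPOTHETICAL planning values, not taken from the slides.
planned_means = [11, 12, 15, 18, 19]
sigma = 3.0

for n in (4, 5, 6):
    p2 = phi2_from_means(planned_means, sigma, n)
    print(f"n = {n}: Phi^2 = {p2:.2f}, Phi = {p2 ** 0.5:.2f}, "
          f"error df = {len(planned_means) * (n - 1)}")

phi2_d = phi2_from_difference(D=10, sigma=sigma, a=5, n=4)
print(f"Using D = 10 with n = 4: Phi^2 = {phi2_d:.2f}")
```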
3.7.2 Specifying a Standard Deviation Increase
• Let P be the percentage increase in the standard deviation of an observation. Then
$$\Phi = \frac{\sqrt{\sum_{i=1}^{a} \tau_i^2/a}}{\sigma/\sqrt{n}} = \sqrt{(1 + 0.01P)^2 - 1}\,\sqrt{n}$$
• For example (page 110): if P = 20, then
$$\Phi = \sqrt{(1.2)^2 - 1}\,\sqrt{n} = 0.66\sqrt{n}$$
3.7.3 Confidence Interval Estimation Method
• Specify in advance the desired width of a confidence interval:
$$\bar{y}_{i.} - \bar{y}_{j.} - t_{\alpha/2, N-a}\sqrt{\frac{2MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + t_{\alpha/2, N-a}\sqrt{\frac{2MS_E}{n}}$$
• For example: we want the 95% C.I. on the difference in mean tensile strength for any two cotton weight percentages to be ± 5 psi, with σ = 3. See page 110.
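The search for the smallest n can be sketched as a loop over candidate sample sizes. The values ± 5 psi, σ = 3 and a = 5 come from the example above; the t quantiles are table entries typed in by hand, and σ² stands in for MS_E as a planning value.

```python
import math

a = 5            # number of cotton weight percentages
sigma = 3.0      # planning value for the standard deviation
target = 5.0     # desired half-width of the 95% interval, in psi

# ASSUMPTION: tabulated t_{0.025, a(n-1)} for n = 3, 4, 5 (df = 10, 15, 20).
T_TABLE = {3: 2.228, 4: 2.131, 5: 2.086}


def half_width(n):
    """Half-width of the interval on mu_i - mu_j, using sigma^2 for MS_E."""
    return T_TABLE[n] * math.sqrt(2 * sigma ** 2 / n)


widths = {n: half_width(n) for n in sorted(T_TABLE)}
smallest_n = min(n for n, w in widths.items() if w <= target)

for n, w in widths.items():
    print(f"n = {n}: error df = {a * (n - 1)}, half-width = {w:.2f}")
print("smallest n meeting the target:", smallest_n)
```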
3.9 The Regression Approach to the Analysis of Variance
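Model: $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
• Least squares function:
$$L = \sum_{i=1}^{a}\sum_{j=1}^{n} \varepsilon_{ij}^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \mu - \tau_i)^2$$
• Set the partial derivatives to zero:
$$\frac{\partial L}{\partial \mu} = \frac{\partial L}{\partial \tau_i} = 0, \qquad i = 1, 2, \ldots, a$$
$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0 \quad \text{and} \quad \sum_{j=1}^{n} (y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0, \qquad i = 1, 2, \ldots, a$$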
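• The normal equations:
$$N\hat{\mu} + n\hat{\tau}_1 + n\hat{\tau}_2 + \cdots + n\hat{\tau}_a = y_{..}$$
$$n\hat{\mu} + n\hat{\tau}_1 = y_{1.}$$
$$n\hat{\mu} + n\hat{\tau}_2 = y_{2.}$$
$$\vdots$$
$$n\hat{\mu} + n\hat{\tau}_a = y_{a.}$$
• Apply the constraint $\sum_{i=1}^{a} \hat{\tau}_i = 0$. Then the estimators are
$$\hat{\mu} = \bar{y}_{..}, \qquad \hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$$
• Regression sum of squares (the reduction due to fitting the full model):
$$R(\mu, \tau) = \hat{\mu}y_{..} + \sum_{i=1}^{a} \hat{\tau}_i y_{i.} = \sum_{i=1}^{a} \frac{y_{i.}^2}{n}$$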
The error sum of squares:
$$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - R(\mu, \tau)$$
Find the sum of squares resulting from the treatment effects:
$$R(\tau \mid \mu) = R(\mu, \tau) - R(\mu) = R(\text{Full Model}) - R(\text{Reduced Model}) = \sum_{i=1}^{a} \frac{y_{i.}^2}{n} - \frac{y_{..}^2}{N}$$
• The test statistic for H0: τ1 = … = τa = 0:
$$F_0 = \frac{R(\tau \mid \mu)/(a-1)}{\left[\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - R(\mu, \tau)\right]/(N-a)} \sim F_{a-1, N-a}$$
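The regression approach can be checked numerically against the ANOVA table. The sketch below assumes the raw tensile strength observations (they are not given in the slides), applies the constraint that the estimated effects sum to zero, verifies the normal equations, and forms F0 from the two reductions in sums of squares.

```python
# ASSUMPTION: raw tensile strength observations, not given in the slides.
groups = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]
a, n = len(groups), len(groups[0])
N = a * n

totals = [sum(g) for g in groups]            # y_i.
grand_total = sum(totals)                    # y..

# Solution of the normal equations under the constraint sum(tau_hat) = 0.
mu_hat = grand_total / N
tau_hat = [t / n - mu_hat for t in totals]

# Check the normal equations.
eq_overall = abs(N * mu_hat + n * sum(tau_hat) - grand_total) < 1e-9
eq_each = all(abs(n * mu_hat + n * th - t) < 1e-9
              for th, t in zip(tau_hat, totals))

# Reductions in sum of squares for the full and reduced models.
r_full = mu_hat * grand_total + sum(th * t for th, t in zip(tau_hat, totals))
r_reduced = grand_total ** 2 / N             # R(mu): model with mu only
r_tau_given_mu = r_full - r_reduced          # equals SS_Treatments

ss_error = sum(y * y for g in groups for y in g) - r_full
f0 = (r_tau_given_mu / (a - 1)) / (ss_error / (N - a))

print(f"R(mu, tau)  = {r_full:.2f}")
print(f"R(tau | mu) = {r_tau_given_mu:.2f}")
print(f"SS_E = {ss_error:.2f}, F0 = {f0:.2f}")
```

With the assumed observations, R(τ | μ), SS_E and F0 should reproduce the Model, Pure Error and F Value entries of the ANOVA table.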