2^k r Factorial Designs with Replications
• r replications of a 2^k experiment
  – 2^k · r observations
  – Allows estimation of experimental errors
• Model (for k = 2):
  y = q0 + qA xA + qB xB + qAB xA xB + e
  e = experimental error
Computation of Effects
• Simply use means of r measurements

     I     A     B    AB   y              Mean y
     1    -1    -1     1   (15, 18, 12)   15
     1     1    -1    -1   (45, 48, 51)   48
     1    -1     1    -1   (25, 28, 19)   24
     1     1     1     1   (75, 75, 81)   77
   164    86    38    20                  Total
    41  21.5   9.5     5                  Total/4

Effects: q0 = 41, qA = 21.5, qB = 9.5, qAB = 5
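The column-by-column computation above can be sketched as follows, using the sign table and mean responses from the example (the variable names are illustrative, not from the text):

```python
# Multiply each sign-table column by the mean responses, sum, and divide
# by 4 (the number of design points in a 2^2 design).
sign_table = {            # columns I, A, B, AB of the 2^2 sign table
    "I":  [1, 1, 1, 1],
    "A":  [-1, 1, -1, 1],
    "B":  [-1, -1, 1, 1],
    "AB": [1, -1, -1, 1],
}
mean_y = [15, 48, 24, 77]  # mean of r = 3 replications per design point

effects = {name: sum(s * y for s, y in zip(col, mean_y)) / 4
           for name, col in sign_table.items()}
print(effects)  # {'I': 41.0, 'A': 21.5, 'B': 9.5, 'AB': 5.0}
```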
Estimation of Experimental Errors
• Estimated response:
  y^i = q0 + qA xAi + qB xBi + qAB xAi xBi
• Experimental error = Measured - Estimated:
  eij = yij - y^i
      = yij - q0 - qA xAi - qB xBi - qAB xAi xBi
  The errors add to zero: Σ_{i,j} eij = 0
• Sum of squared errors:
  SSE = Σ_{i=1..2^2} Σ_{j=1..r} eij^2
Experimental Errors: Example
Estimated response:
  y^1 = q0 - qA - qB + qAB = 41 - 21.5 - 9.5 + 5 = 15
Experimental errors:
  e11 = y11 - y^1 = 15 - 15 = 0

   i      I     A     B    AB   y^i   yi1 yi2 yi3   ei1 ei2 ei3
   1      1    -1    -1     1    15    15  18  12     0   3  -3
   2      1     1    -1    -1    48    45  48  51    -3   0   3
   3      1    -1     1    -1    24    25  28  19     1   4  -5
   4      1     1     1     1    77    75  75  81    -2  -2   4
   Effect: 41  21.5  9.5    5
   (y^i = estimated response, yij = measured responses, eij = errors)

SSE = 0^2 + 3^2 + (-3)^2 + (-3)^2 + ... + 4^2 = 102
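The residuals and SSE in this example can be recomputed as a short sketch, using the effects and measured responses from the text:

```python
# Recompute each estimated response from the effects, then the residuals
# e_ij = y_ij - yhat_i, and accumulate SSE.
effects = {"I": 41, "A": 21.5, "B": 9.5, "AB": 5}
signs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]   # (xA, xB) for rows i = 1..4
measured = [(15, 18, 12), (45, 48, 51), (25, 28, 19), (75, 75, 81)]

sse = 0
for (xa, xb), ys in zip(signs, measured):
    yhat = (effects["I"] + effects["A"] * xa
            + effects["B"] * xb + effects["AB"] * xa * xb)
    sse += sum((y - yhat) ** 2 for y in ys)
print(sse)  # 102.0
```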
Allocation of Variation
Total variation or total sum of squares:
  SST = Σ_{i,j} (yij - y..)^2
Substituting the model
  yij = q0 + qA xAi + qB xBi + qAB xAi xBi + eij
gives
  Σ_{i,j} (yij - y..)^2 = 2^2 r qA^2 + 2^2 r qB^2 + 2^2 r qAB^2 + Σ_{i,j} eij^2
That is:
  SST = SSA + SSB + SSAB + SSE
Derivation
Model:
  yij = q0 + qA xAi + qB xBi + qAB xAi xBi + eij
Summing over all observations:
  Σ_{i,j} yij = Σ_{i,j} q0 + qA Σ_{i,j} xAi + qB Σ_{i,j} xBi + qAB Σ_{i,j} xAi xBi + Σ_{i,j} eij
Since the x's, their products, and all errors add to zero:
  Σ_{i,j} yij = 2^2 r q0
Derivation (cont'd)
Mean response:
  y.. = (1/(2^2 r)) Σ_{i,j} yij = q0
Squaring both sides of the model and ignoring cross-product terms:
  Σ_{i,j} yij^2 = Σ_{i,j} q0^2 + Σ_{i,j} qA^2 xAi^2 + Σ_{i,j} qB^2 xBi^2 + Σ_{i,j} qAB^2 xAi^2 xBi^2 + Σ_{i,j} eij^2
That is:
  SSY = SS0 + SSA + SSB + SSAB + SSE
Derivation (cont'd)
Total variation:
  SST = Σ_{i,j} (yij - y..)^2
      = Σ_{i,j} yij^2 - 2^2 r y..^2
      = SSY - SS0
      = SSA + SSB + SSAB + SSE
One way to compute SSE:
  SSE = SSY - 2^2 r (q0^2 + qA^2 + qB^2 + qAB^2)
Example: Memory-Cache Study
  SSY = 15^2 + 18^2 + 12^2 + 45^2 + ... + 75^2 + 75^2 + 81^2 = 27204
  SS0 = 2^2 r q0^2 = 12 × 41^2 = 20172
  SSA = 2^2 r qA^2 = 12 × (21.5)^2 = 5547
  SSB = 2^2 r qB^2 = 12 × (9.5)^2 = 1083
  SSAB = 2^2 r qAB^2 = 12 × 5^2 = 300
  SSE = 27204 - 2^2 × 3 × (41^2 + 21.5^2 + 9.5^2 + 5^2) = 102
  SST = SSY - SS0 = 27204 - 20172 = 7032
Example: Memory-Cache Study (cont'd)
  SSA + SSB + SSAB + SSE = 5547 + 1083 + 300 + 102 = 7032 = SST
Factor A explains 5547/7032, or 78.88%, of the variation.
Factor B explains 15.40%.
Interaction AB explains 4.27%.
The remaining 1.45% is unexplained and is attributed to errors.
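The whole sum-of-squares bookkeeping for this example can be sketched in a few lines, using the effects and observations given above:

```python
# Allocation of variation for the memory-cache example (2^2 design, r = 3).
r = 3
runs = 4 * r                      # 2^2 * r = 12 observations
q0, qA, qB, qAB = 41, 21.5, 9.5, 5
y = [15, 18, 12, 45, 48, 51, 25, 28, 19, 75, 75, 81]

ssy = sum(v * v for v in y)                              # 27204
ss0 = runs * q0 ** 2                                     # 20172
ssa, ssb, ssab = runs * qA**2, runs * qB**2, runs * qAB**2   # 5547, 1083, 300
sse = ssy - runs * (q0**2 + qA**2 + qB**2 + qAB**2)      # 102
sst = ssy - ss0                                          # 7032

print(ssa / sst, ssb / sst, ssab / sst, sse / sst)
# fractions of variation explained: ~79%, ~15%, ~4%, ~1.5%
```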
Review: Confidence Interval for the Mean
Problem: How to get a single estimate of the population mean from k sample estimates?
Answer: Get probabilistic bounds.
E.g., two bounds C1 & C2 such that there is a high probability, 1-α, that the mean is in the interval (C1, C2):
  Pr{C1 ≤ μ ≤ C2} = 1 - α
• Confidence interval: (C1, C2)
• α: significance level
• 100(1-α): confidence level
• 1-α: confidence coefficient
Confidence Interval for the Mean (cont'd)
Note: The confidence level is traditionally expressed as a percentage (near 100%), whereas the significance level α is expressed as a fraction and is typically near zero, e.g., 0.05 or 0.01.
Confidence Interval for the Mean (cont'd)
Example:
Given a sample with:
  mean = x̄ = 3.90
  SD = s = 0.95
  n = 32
A 90% CI for the mean = 3.90 ± (1.645)(0.95)/√32 = (3.62, 4.17), using the central limit theorem.
Note: A 90% CI => we can state with 90% confidence that the population mean is between 3.62 and 4.17. The chance of error in this statement is 10%.
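A minimal sketch of this computation (1.645 is the normal quantile z_{0.95} used by the slide; exact arithmetic gives an upper bound of 4.18 where the slide's 4.17 reflects rounded intermediates):

```python
import math

# 90% CI for the mean via the central limit theorem (n = 32 is large).
xbar, s, n = 3.90, 0.95, 32
half = 1.645 * s / math.sqrt(n)   # z_{0.95} * s / sqrt(n)
print(round(xbar - half, 2), round(xbar + half, 2))  # 3.62 4.18
```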
Testing for a Zero Mean
The difference in processor times of two different implementations of the same algorithm was measured on 7 similar workloads. The differences are:
  {1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}
Can we say with 99% confidence that one implementation is superior to the other?
Testing for a Zero Mean (cont'd)
  Sample size = n = 7
  Mean = x̄ = 1.03
  Sample variance = s^2 = 2.57
  Sample standard deviation = s = 1.60
  CI = 1.03 ± t × 1.60/√7 = 1.03 ± 0.605t
  100(1-α) = 99, α = 0.01, 1-α/2 = 0.995
From the table, the t value at six degrees of freedom is:
  t[0.995; 6] = 3.707, and the 99% CI = (-1.21, 3.27).
Since the CI includes zero, we cannot say with 99% confidence that the mean difference is significantly different from zero.
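The test above can be sketched directly from the data (t[0.995; 6] = 3.707 is read from a t-table as in the text; computing with unrounded intermediates gives bounds of about (-1.22, 3.28) versus the slide's (-1.21, 3.27)):

```python
import math

diffs = [1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4]
n = len(diffs)
mean = sum(diffs) / n
var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
half = 3.707 * math.sqrt(var / n)                     # t[0.995; 6] * s/sqrt(n)
lo, hi = mean - half, mean + half
print(round(lo, 2), round(hi, 2))
print(lo < 0 < hi)  # True: cannot reject a zero mean at 99% confidence
```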
Type I & Type II Errors
In testing a null hypothesis, the level of significance is the probability of rejecting a true hypothesis.

                         HYPOTHESIS
  DECISION      Actually True       Actually False
  To Accept     Correct             β error (Type II)
  To Reject     α error (Type I)    Correct

Note: The letters α & β denote the probabilities of these errors.
Confidence Intervals For Effects
Effects are random variables.
Errors ~ N(0, σe) => y ~ N(y.., σe)
Since q0 is a linear combination of normal variables,
  q0 is normal with variance σe^2/(2^2 r)
Variance of errors:
  se^2 = (1/(2^2 (r-1))) Σ_{i,j} eij^2 = SSE/(2^2 (r-1)) = MSE
Confidence Intervals For Effects (cont'd)
Denominator = 2^2 (r - 1)
  = number of independent terms in SSE
  => SSE has 2^2 (r - 1) degrees of freedom.
Estimated variance of q0:
  s_q0^2 = se^2/(2^2 r)
Similarly,
  s_qA = s_qB = s_qAB = se/√(2^2 r)
Confidence intervals (CI) for the effects:
  qi ± t[1-α/2; 2^2(r-1)] s_qi
CI does not include a zero => significant
Example
For the memory-cache study:
Standard deviation of errors:
  se = √(SSE/(2^2 (r-1))) = √(102/8) = √12.75 = 3.57
Standard deviation of effects:
  s_qi = se/√(2^2 r) = 3.57/√12 = 1.03
For 90% confidence: t[0.95; 8] = 1.86
Example (cont'd)
Confidence intervals:
  qi ± (1.86)(1.03) = qi ± 1.92
  q0: (39.08, 42.91)
  qA: (19.58, 23.41)
  qB: (7.58, 11.41)
  qAB: (3.08, 6.91)
No zero crossing
=> All effects are significant.
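The effect CIs above can be sketched as follows (t[0.95; 8] = 1.86 is read from a t-table, as in the text; the effect values come from the earlier slides):

```python
import math

# Effect confidence intervals for the memory-cache example (2^2, r = 3).
sse, r = 102, 3
se = math.sqrt(sse / (4 * (r - 1)))   # s_e = sqrt(102/8) ~ 3.57
sq = se / math.sqrt(4 * r)            # s_qi = s_e/sqrt(12) ~ 1.03
t = 1.86                              # t[0.95; 8]
for name, q in {"q0": 41, "qA": 21.5, "qB": 9.5, "qAB": 5}.items():
    lo, hi = q - t * sq, q + t * sq
    zero_in = lo < 0 < hi             # CI containing zero => not significant
    print(f"{name}: ({lo:.2f}, {hi:.2f}) significant={not zero_in}")
```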
Confidence Intervals for Contrasts
A contrast is a linear combination whose coefficients sum to zero: Σ hi = 0.
Variance of Σ hi qi:
  s^2_{Σ hi qi} = se^2 (Σ hi^2)/(2^2 r)
For a 100(1-α)% confidence interval, use t[1-α/2; 2^2(r-1)].
Example: Memory-cache study
  u = qA + qB - 2 qAB
Coefficients = 0, 1, 1, and -2, which sum to zero => a contrast.
  Mean u = 21.5 + 9.5 - 2 × 5 = 21
Variance:
  s_u^2 = se^2 × 6/(2^2 × 3) = 6.375
Standard deviation:
  s_u = √6.375 = 2.52
  t[0.95; 8] = 1.86
90% confidence interval for u:
  u ± t s_u = 21 ± 1.86 × 2.52 = (16.31, 25.69)
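The contrast CI can be sketched from the formula on the previous slide, using the values above:

```python
import math

# Contrast u = qA + qB - 2*qAB with coefficients h = (0, 1, 1, -2).
h = [0, 1, 1, -2]                 # coefficients sum to 0 => a contrast
q = [41, 21.5, 9.5, 5]            # q0, qA, qB, qAB
se2, r = 12.75, 3                 # s_e^2 and replication count

u = sum(hi * qi for hi, qi in zip(h, q))             # 21
var_u = se2 * sum(hi * hi for hi in h) / (4 * r)     # 12.75*6/12 = 6.375
half = 1.86 * math.sqrt(var_u)                       # t[0.95; 8] = 1.86
print(u, (round(u - half, 2), round(u + half, 2)))   # 21.0 (16.3, 25.7)
```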
CI for Predicted Response
Mean response y^:
  y^ = q0 + qA xA + qB xB + qAB xA xB
The standard deviation of the mean of m responses:
  s_{y^m} = se √(1/n_eff + 1/m)
where n_eff is the effective number of degrees of freedom:
  n_eff = (total number of runs)/(1 + sum of DFs of parameters used in y^)
        = 2^2 r / 5
CI for Predicted Response (cont'd)
100(1-α)% confidence interval:
  y^ ± t[1-α/2; 2^2(r-1)] s_{y^m}
A single run (m = 1):
  s_{y^1} = se √(5/(2^2 r) + 1)
Population mean (m = ∞):
  s_{y^∞} = se √(5/(2^2 r))
Example: Memory-cache Study
For xA = -1 and xB = -1:
• A single confirmation experiment:
  y^1 = q0 - qA - qB + qAB = 41 - 21.5 - 9.5 + 5 = 15
Standard deviation of the prediction:
  s_{y^1} = se √(5/(2^2 r) + 1) = 3.57 √(5/12 + 1) = 4.25
Using t[0.95; 8] = 1.86, the 90% confidence interval is:
  15 ± 1.86 × 4.25 = (7.09, 22.91)
Example: Memory-cache Study (cont'd)
• Mean response for 5 experiments in the future:
  s_{y^5} = se √(5/(2^2 r) + 1/5) = 3.57 √(5/12 + 1/5) = 2.80
  The 90% confidence interval is:
  15 ± 1.86 × 2.80 = (9.79, 20.21)
• Mean response for a large number of experiments in the future:
  s_{y^∞} = se √(5/(2^2 r)) = 3.57 √(5/12) = 2.30
  The 90% confidence interval is:
  15 ± 1.86 × 2.30 = (10.72, 19.28)
Example: Memory-cache Study (cont'd)
• Current mean response (not for the future):
  Use the formula for contrasts:
  s_{y^} = √(se^2 Σ hi^2/(2^2 r)) = √(12.75 × 4/12) = 2.06
90% confidence interval:
  15 ± 1.86 × 2.06 = (11.17, 18.83)
Notice: Confidence intervals become narrower.
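The three future-prediction cases can be sketched with one helper (a hypothetical function name; se = 3.57, n_eff = 2^2·r/5, and t[0.95; 8] = 1.86 are the values from the slides; exact arithmetic differs from the slides' bounds only in the last digit):

```python
import math

# Std dev of the mean of m future responses for the memory-cache example.
se, runs, params = 3.57, 12, 5    # s_e, 2^2*r, and 1 + sum of parameter DFs
t, yhat = 1.86, 15                # t[0.95; 8] and prediction at (-1, -1)

def sd_of_mean(m):
    """s_yhat_m = se * sqrt(1/n_eff + 1/m); pass m = math.inf for m -> inf."""
    inv_m = 0 if math.isinf(m) else 1 / m
    return se * math.sqrt(params / runs + inv_m)   # 1/n_eff = 5/(2^2 r)

for m in (1, 5, math.inf):
    s = sd_of_mean(m)
    print(f"m={m}: s={s:.2f}, CI=({yhat - t*s:.2f}, {yhat + t*s:.2f})")
```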
Assumptions
1. Errors are statistically independent.
2. Errors are additive.
3. Errors are normally distributed.
4. Errors have a constant standard deviation σe.
5. Effects of factors are additive.
=> Observations are independent and normally distributed with constant variance.