$2^k r$ Factorial Designs with Replications
• r replications of $2^k$ experiments
– $2^k r$ observations.
– Allows estimation of experimental errors
• Model:
$y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B + e$
e = experimental error
Computation of Effects
• Simply use means of r measurements
   I      A      B     AB    y               Mean y
   1     -1     -1      1    (15, 18, 12)    15
   1      1     -1     -1    (45, 48, 51)    48
   1     -1      1     -1    (25, 28, 19)    24
   1      1      1      1    (75, 75, 81)    77
 164     86     38     20                    Total
  41   21.5    9.5      5                    Total/4
Effects: q0 = 41, qA = 21.5, qB = 9.5, qAB = 5
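The sign-table computation above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `SIGNS`, `OBS`, and `effects` are made up for this example, and the data are the memory-cache observations from the table.

```python
# Sign table for a 2^2 design; columns are I, A, B, AB (one row per experiment).
SIGNS = [
    (1, -1, -1,  1),
    (1,  1, -1, -1),
    (1, -1,  1, -1),
    (1,  1,  1,  1),
]
# r = 3 replications per experiment (values from the table above).
OBS = [(15, 18, 12), (45, 48, 51), (25, 28, 19), (75, 75, 81)]


def effects(signs, observations):
    """Return (q0, qA, qB, qAB): each sign column dotted with the cell means, divided by 2^2."""
    means = [sum(row) / len(row) for row in observations]
    n_cells = len(signs)
    n_cols = len(signs[0])
    return tuple(
        sum(sign_row[c] * m for sign_row, m in zip(signs, means)) / n_cells
        for c in range(n_cols)
    )


q0, qA, qB, qAB = effects(SIGNS, OBS)
print(q0, qA, qB, qAB)
```

Each effect is the column total divided by 4, which matches the "Total/4" row of the table.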
Estimation of Experimental Errors
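• Estimated response:
$\hat{y}_i = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi}$
• Experimental error = Measured - Estimated
$e_{ij} = y_{ij} - \hat{y}_i$
$= y_{ij} - q_0 - q_A x_{Ai} - q_B x_{Bi} - q_{AB} x_{Ai} x_{Bi}$
$\sum_{i,j} e_{ij} = 0$
• Sum of Squared Errors:
$SSE = \sum_{i=1}^{2^2} \sum_{j=1}^{r} e_{ij}^2$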
Experimental Errors: Example
Estimated response:
$\hat{y}_1 = q_0 - q_A - q_B + q_{AB} = 41 - 21.5 - 9.5 + 5 = 15$
Experimental errors:
$e_{11} = y_{11} - \hat{y}_1 = 15 - 15 = 0$

                                  Estimated   Measured          Errors
                                  Response    Responses
  i        I     A     B    AB    y^_i        y_i1 y_i2 y_i3    e_i1 e_i2 e_i3
  Effect  41  21.5   9.5     5
  1        1    -1    -1     1    15          15   18   12        0    3   -3
  2        1     1    -1    -1    48          45   48   51       -3    0    3
  3        1    -1     1    -1    24          25   28   19        1    4   -5
  4        1     1     1     1    77          75   75   81       -2   -2    4

$SSE = 0^2 + 3^2 + (-3)^2 + (-3)^2 + \dots + 4^2 = 102$
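As a rough check of the error table, the following Python sketch recomputes the estimated responses, the individual errors, and SSE. The variable names are illustrative only; the effect values and observations are the ones quoted in the example.

```python
# Effects from the memory-cache example.
q0, qA, qB, qAB = 41, 21.5, 9.5, 5
# (xA, xB) levels for experiments i = 1..4, and the measured responses.
LEVELS = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
OBS = [(15, 18, 12), (45, 48, 51), (25, 28, 19), (75, 75, 81)]

# Estimated response for each experiment from the regression model.
y_hat = [q0 + qA * xa + qB * xb + qAB * xa * xb for xa, xb in LEVELS]

# Experimental error = measured - estimated, for every replication.
errors = [[y - yh for y in row] for row, yh in zip(OBS, y_hat)]

# Sum of squared errors over all 2^2 * r observations.
sse = sum(e * e for row in errors for e in row)

print(y_hat)
print(errors)
print(sse)
```

The errors within each row sum to zero because, with all four effects in the model, each estimated response equals the mean of its replications.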
Allocation of Variation
Total variation or total sum of squares:
$SST = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2$
$y_{ij} = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi} + e_{ij}$
$\sum_{i,j} (y_{ij} - \bar{y}_{..})^2 = 2^2 r q_A^2 + 2^2 r q_B^2 + 2^2 r q_{AB}^2 + \sum_{i,j} e_{ij}^2$
SST = SSA + SSB + SSAB + SSE
Derivation
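Model:
$y_{ij} = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi} + e_{ij}$
$\sum_{i,j} y_{ij} = \sum_{i,j} q_0 + \sum_{i,j} q_A x_{Ai} + \sum_{i,j} q_B x_{Bi} + \sum_{i,j} q_{AB} x_{Ai} x_{Bi} + \sum_{i,j} e_{ij}$
Since the x’s, their products, and all errors add to zero:
$\sum_{i,j} y_{ij} = \sum_{i,j} q_0 = 2^2 r q_0$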
Derivation (cont’d)
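Mean response:
$\bar{y}_{..} = \frac{1}{2^2 r} \sum_{i,j} y_{ij} = q_0$
Squaring both sides of the model and ignoring cross-product terms (which sum to zero):
$\sum_{i,j} y_{ij}^2 = \sum_{i,j} q_0^2 + \sum_{i,j} q_A^2 x_{Ai}^2 + \sum_{i,j} q_B^2 x_{Bi}^2 + \sum_{i,j} q_{AB}^2 x_{Ai}^2 x_{Bi}^2 + \sum_{i,j} e_{ij}^2$
SSY = SS0 + SSA + SSB + SSAB + SSE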
Derivation (cont’d)
Total Variation:
$SST = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2 = \sum_{i,j} y_{ij}^2 - \sum_{i,j} \bar{y}_{..}^2$
$= SSY - SS0$
$= SSA + SSB + SSAB + SSE$
One way to compute SSE:
$SSE = SSY - 2^2 r (q_0^2 + q_A^2 + q_B^2 + q_{AB}^2)$
Example: Memory-Cache Study
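$SSY = 15^2 + 18^2 + 12^2 + 45^2 + \dots + 75^2 + 75^2 + 81^2 = 27204$
$SS0 = 2^2 r q_0^2 = 12 \times 41^2 = 20172$
$SSA = 2^2 r q_A^2 = 12 \times (21.5)^2 = 5547$
$SSB = 2^2 r q_B^2 = 12 \times (9.5)^2 = 1083$
$SSAB = 2^2 r q_{AB}^2 = 12 \times 5^2 = 300$
$SSE = 27204 - 2^2 \times 3 \times (41^2 + 21.5^2 + 9.5^2 + 5^2) = 102$
$SST = SSY - SS0 = 27204 - 20172 = 7032$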
Example: Memory-Cache Study (cont’d)
SSA + SSB + SSAB + SSE
=5547 + 1083 + 300 + 102
= 7032 = SST
Factor A explains 5547/7032 or 78.88%
Factor B explains 15.40%
Interaction AB explains 4.27%
1.45% is unexplained and is attributed to errors.
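The allocation of variation for this example can be reproduced with a short Python sketch. The variable names are illustrative, not part of the original material; the effects and observations are those of the memory-cache study.

```python
# Observations (r = 3 replications of a 2^2 design) and the computed effects.
OBS = [(15, 18, 12), (45, 48, 51), (25, 28, 19), (75, 75, 81)]
q0, qA, qB, qAB = 41, 21.5, 9.5, 5
runs = sum(len(row) for row in OBS)  # 2^2 * r = 12

ssy = sum(y * y for row in OBS for y in row)  # sum of squares of all observations
ss0 = runs * q0 ** 2
ssa = runs * qA ** 2
ssb = runs * qB ** 2
ssab = runs * qAB ** 2
sse = ssy - (ss0 + ssa + ssb + ssab)  # SSE obtained without computing individual errors
sst = ssy - ss0                       # total variation

# Percentage of total variation explained by each component.
percent = {
    "A": 100 * ssa / sst,
    "B": 100 * ssb / sst,
    "AB": 100 * ssab / sst,
    "error": 100 * sse / sst,
}
for name, p in percent.items():
    print(f"{name}: {p:.2f}%")
```

Computing SSE as SSY minus the other sums of squares avoids building the full error table, at the cost of being more sensitive to rounding in the effects.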
Review: Confidence Interval for the Mean
Problem: How to get a single estimate of the population mean
from k sample estimates?
Answer: Get probabilistic bounds.
E.g., 2 bounds, C1 & C2. There is a high probability, $1-\alpha$,
that the mean $\mu$ is in the interval (C1, C2):
$\Pr\{C_1 \le \mu \le C_2\} = 1 - \alpha$
• Confidence interval: (C1, C2)
• Significance level: $\alpha$
• Confidence level: $100(1-\alpha)$
• Confidence coefficient: $1-\alpha$
Confidence Interval for the Mean (cont’d)
Note: The confidence level is traditionally expressed
as a percentage (near 100%), whereas the significance
level $\alpha$ is expressed as a fraction and is typically
near zero, e.g., 0.05 or 0.01.
Confidence Interval for the Mean (cont’d)
Example:
Given sample with:
mean = $\bar{x}$ = 3.90
SD = s = 0.95
n = 32
A 90% CI for the mean = $3.90 \pm (1.645)(0.95)/\sqrt{32}$
= (3.62, 4.18), using the central limit theorem.
Note: A 90% CI => We can state with 90% confidence
that the population mean is between 3.62 & 4.18. The
chance of error in this statement is 10%.
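A minimal Python sketch of this confidence-interval calculation follows. It assumes the sample is large enough for the normal approximation (central limit theorem), as in the example, and takes the quantile from the standard library's `statistics.NormalDist` rather than from a table; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Sample summary from the example.
x_bar, s, n = 3.90, 0.95, 32
confidence = 0.90

# Two-sided interval: use the 1 - alpha/2 quantile of the unit normal (about 1.645 for 90%).
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)

half_width = z * s / math.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)
print(f"z = {z:.3f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

For small samples the normal quantile would be replaced by a t quantile with n - 1 degrees of freedom.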
Testing for a Zero Mean
Difference in processor times of two different
implementations of the same algorithms was measured on 7
similar workloads. The differences are:
{1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4}
Can we say with 99% confidence that one implementation is
superior to the other?
Testing for a Zero Mean (cont’d)
Sample size = n = 7
mean = $\bar{x}$ = 1.03
sample variance = $s^2$ = 2.57
sample deviation = s = 1.60
CI = $1.03 \pm t \times 1.60/\sqrt{7} = 1.03 \pm 0.605t$
$100(1-\alpha) = 99$, $\alpha = 0.01$, $1-\alpha/2 = 0.995$
From Table, the t value at six degrees of freedom is:
t[0.995; 6] = 3.707 & the 99% CI = (-1.21, 3.27).
Since the CI includes zero, we cannot say with 99%
confidence that the mean difference is significantly
different from zero.
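The zero-mean test can be sketched in Python as below. The t quantile is hard-coded from the table value quoted in the example (t[0.995; 6] = 3.707) because the standard library has no t distribution; the variable names are illustrative. Because the sketch does not round intermediate values, its bounds can differ from the slide's in the second decimal place.

```python
import math
from statistics import mean, stdev

# Differences in processor time on the 7 workloads.
diffs = [1.5, 2.6, -1.8, 1.3, -0.5, 1.7, 2.4]
T_0995_6 = 3.707  # t[0.995; 6], taken from the table value in the text

n = len(diffs)
x_bar = mean(diffs)
s = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)

half_width = T_0995_6 * s / math.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

# The difference is significant only if the interval excludes zero.
significant = not (ci[0] <= 0 <= ci[1])
print(f"mean = {x_bar:.2f}, s = {s:.2f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print("significant at 99%:", significant)
```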
Type I & Type II Errors
In testing a null hypothesis, the level of significance is the
probability of rejecting a true hypothesis.

                  HYPOTHESIS
DECISION      Actually True      Actually False
To Accept     Correct            Error (Type II)
To Reject     Error (Type I)     Correct

Note: The letters $\alpha$ & $\beta$ denote the probabilities of
these errors (Type I and Type II, respectively).
Confidence Intervals For Effects
Effects are random variables.
Errors ~ $N(0, \sigma_e)$ => y ~ $N(\bar{y}_{..}, \sigma_e)$
Since q0 = linear combination of normal variables
=> q0 is normal with variance $\sigma_e^2/(2^2 r)$
Variance of errors:
$s_e^2 = \frac{1}{2^2 (r-1)} \sum_{i,j} e_{ij}^2 = \frac{SSE}{2^2 (r-1)} = MSE$
Confidence Intervals For Effects (cont’d)
Denominator = $2^2 (r - 1)$
= # of independent terms in SSE
=> SSE has $2^2 (r - 1)$ degrees of freedom.
Estimated variance of q0: $s_{q_0}^2 = s_e^2/(2^2 r)$
Similarly,
$s_{q_A} = s_{q_B} = s_{q_{AB}} = \frac{s_e}{\sqrt{2^2 r}}$
Confidence intervals (CI) for the effects:
$q_i \pm t_{[1-\alpha/2;\, 2^2 (r-1)]} \, s_{q_i}$
CI does not include a zero => significant
Example
For Memory-cache study:
Standard deviation of errors:
$s_e = \sqrt{\frac{SSE}{2^2 (r-1)}} = \sqrt{\frac{102}{8}} = \sqrt{12.75} = 3.57$
Standard deviation of effects:
$s_{q_i} = s_e/\sqrt{2^2 r} = 3.57/\sqrt{12} = 1.03$
For 90% confidence: $t_{[0.95;\, 8]} = 1.86$
Example (cont’d)
Confidence intervals:
$q_i \pm (1.86)(1.03) = q_i \pm 1.92$
q0 = (39.08, 42.92)
qA = (19.58, 23.42)
qB = (7.58, 11.42)
qAB = (3.08, 6.92)
No zero crossing
=> All effects are significant.
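The confidence intervals for the effects can be recomputed with a short Python sketch. The t value 1.86 is the table value t[0.95; 8] quoted in the example and is hard-coded here; the dictionary and variable names are illustrative.

```python
import math

SSE, r, k = 102, 3, 2
cells = 2 ** k                 # 2^2 experiments
dof = cells * (r - 1)          # degrees of freedom of SSE = 8
T_095_8 = 1.86                 # t[0.95; 8], table value from the text

s_e = math.sqrt(SSE / dof)            # standard deviation of errors
s_q = s_e / math.sqrt(cells * r)      # standard deviation of every effect

effects = {"q0": 41, "qA": 21.5, "qB": 9.5, "qAB": 5}
intervals = {
    name: (q - T_095_8 * s_q, q + T_095_8 * s_q) for name, q in effects.items()
}
# An effect is significant when its interval does not contain zero.
significant = {name: not (lo <= 0 <= hi) for name, (lo, hi) in intervals.items()}

for name, (lo, hi) in intervals.items():
    print(f"{name}: ({lo:.2f}, {hi:.2f}) significant={significant[name]}")
```

All four effects share the same standard deviation because each is the same kind of linear combination of the $2^2 r$ observations.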
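Confidence Intervals for Contrasts
Contrast = linear combination $\sum_i h_i q_i$
whose coefficients sum to zero: $\sum_i h_i = 0$
Variance of $\sum_i h_i q_i$:
$s_{\sum h_i q_i}^2 = \frac{s_e^2 \sum_i h_i^2}{2^2 r}$
For a $100(1-\alpha)$% confidence interval,
use $t_{[1-\alpha/2;\, 2^2 (r-1)]}$.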
Example: Memory-cache study
$u = q_A + q_B - 2 q_{AB}$
Coefficients = 0, 1, 1, and -2 => Contrast
Mean $u = 21.5 + 9.5 - 2 \times 5 = 21$
Variance:
$s_u^2 = \frac{s_e^2 \times 6}{2^2 \times 3} = 6.375$
Standard deviation:
$s_u = \sqrt{6.375} = 2.52$
t[0.95; 8] = 1.86
90% confidence interval for u:
$u \pm t s_u = 21 \pm 1.86 \times 2.52 = (16.31, 25.69)$
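The contrast calculation can be sketched in Python as follows. The coefficient dictionary `h` and the other names are illustrative; the error variance (102/8 = 12.75) and the t value 1.86 are taken from the memory-cache example.

```python
import math

S_E_SQUARED = 102 / 8   # s_e^2 = SSE / (2^2 (r - 1)) = 12.75
RUNS = 12               # 2^2 * r
T_095_8 = 1.86          # t[0.95; 8], table value from the text

effects = {"q0": 41, "qA": 21.5, "qB": 9.5, "qAB": 5}
h = {"q0": 0, "qA": 1, "qB": 1, "qAB": -2}  # coefficients of u = qA + qB - 2 qAB

# A linear combination is a contrast only if its coefficients sum to zero.
is_contrast = sum(h.values()) == 0

u = sum(h[name] * effects[name] for name in h)
var_u = S_E_SQUARED * sum(c * c for c in h.values()) / RUNS
s_u = math.sqrt(var_u)
ci = (u - T_095_8 * s_u, u + T_095_8 * s_u)

print(f"u = {u}, s_u^2 = {var_u}, s_u = {s_u:.2f}")
print(f"90% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```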
CI for Predicted Response
Mean response $\hat{y}$:
$\hat{y} = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B$
The standard deviation of the mean of m responses:
$s_{\hat{y}_m} = s_e \left( \frac{1}{n_{eff}} + \frac{1}{m} \right)^{1/2}$
$n_{eff}$ = effective degrees of freedom
= (total number of runs) / (1 + sum of DFs of params used in $\hat{y}$)
= $\frac{2^2 r}{5}$
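CI for Predicted Response (cont’d)
$100(1-\alpha)$% confidence interval:
$\hat{y} \pm t_{[1-\alpha/2;\, 2^2 (r-1)]} \, s_{\hat{y}_m}$
A single run (m = 1):
$s_{\hat{y}_1} = s_e \left( \frac{5}{2^2 r} + 1 \right)^{1/2}$
Population mean ($m = \infty$):
$s_{\hat{y}} = s_e \left( \frac{5}{2^2 r} \right)^{1/2}$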
Example: Memory-cache Study
For $x_A = -1$ and $x_B = -1$:
• A single confirmation experiment:
$\hat{y}_1 = q_0 - q_A - q_B + q_{AB}$
$= 41 - 21.5 - 9.5 + 5 = 15$
Standard deviation of the prediction:
$s_{\hat{y}_1} = s_e \left( \frac{5}{2^2 r} + 1 \right)^{1/2} = 3.57 \sqrt{\frac{5}{12} + 1} = 4.25$
Using t[0.95; 8] = 1.86, the 90% confidence interval is:
$15 \pm 1.86 \times 4.25 = (7.09, 22.91)$
Example: Memory-cache Study (cont’d)
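• Mean response for 5 experiments in future:
$s_{\hat{y}_m} = s_e \left( \frac{5}{2^2 r} + \frac{1}{m} \right)^{1/2} = 3.57 \sqrt{\frac{5}{12} + \frac{1}{5}} = 2.80$
The 90% confidence interval is:
$15 \pm 1.86 \times 2.80 = (9.79, 20.21)$
• Mean response for a large number of experiments in future:
$s_{\hat{y}} = s_e \left( \frac{5}{2^2 r} \right)^{1/2} = 3.57 \sqrt{\frac{5}{12}} = 2.30$
The 90% confidence interval is:
$15 \pm 1.86 \times 2.30 = (10.72, 19.28)$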
Example: Memory-cache Study (cont’d)
• Current mean response: Not for future.
(Use the formula for contrasts):
$s_{\hat{y}} = \sqrt{\frac{s_e^2 \sum_i h_i^2}{2^2 r}} = \sqrt{\frac{12.75 \times 4}{12}} = 2.06$
90% confidence interval:
$15 \pm 1.86 \times 2.06 = (11.17, 18.83)$
Notice: Confidence intervals become narrower.
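The four prediction cases in the memory-cache example (a single run, the mean of 5 future runs, the mean of a very large number of future runs, and the current mean) can be compared with a small Python sketch. The function name `prediction_sd` and its parameters are hypothetical; the t value 1.86 is the table value used in the example.

```python
import math

S_E = math.sqrt(102 / 8)   # standard deviation of errors, about 3.57
RUNS = 12                  # 2^2 * r
T_095_8 = 1.86             # t[0.95; 8], table value from the text
Y_HAT = 15                 # predicted response at xA = -1, xB = -1


def prediction_sd(s_e, total_runs, param_dofs, m):
    """s_e * sqrt(1/n_eff + 1/m), with n_eff = total_runs / (1 + param_dofs)."""
    n_eff = total_runs / (1 + param_dofs)
    return s_e * math.sqrt(1 / n_eff + 1 / m)


# Four parameters (q0, qA, qB, qAB), one degree of freedom each.
sd = {
    "single run (m=1)": prediction_sd(S_E, RUNS, 4, 1),
    "mean of 5 runs": prediction_sd(S_E, RUNS, 4, 5),
    "mean of many runs": prediction_sd(S_E, RUNS, 4, math.inf),
    # Current mean: contrast formula with h = (1, -1, -1, 1), so sum of h_i^2 = 4.
    "current mean": math.sqrt(S_E ** 2 * 4 / RUNS),
}

for case, s in sd.items():
    lo, hi = Y_HAT - T_095_8 * s, Y_HAT + T_095_8 * s
    print(f"{case}: s = {s:.2f}, 90% CI = ({lo:.2f}, {hi:.2f})")
```

Passing `math.inf` for m makes the 1/m term vanish, which gives the population-mean case without a separate formula.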
Assumptions
1. Errors are statistically independent.
2. Errors are additive.
3. Errors are normally distributed
4. Errors have a constant standard deviation $\sigma_e$.
5. Effects of factors are additive.
=> observations are independent and normally
distributed with constant variance.