Experimental Design
• If a process is in statistical control but has poor
capability, it will often be necessary to reduce
variability.
• Experimental design often offers a more effective
way to accomplish this than control charting.
• Control charting is a passive statistical method. We
monitor the process and wait for information that
may lead to a useful change.
• Experimental design is an active statistical method.
We perform a series of tests on a process, making
changes in the inputs and observing the
corresponding effect on the outputs.
• Application of experimental design in the
product/process design stage is also important and
can result in improved yield and reduced costs.
Guidelines for Experimental Design
• Identify the problem
– Have a clear question you would like to answer
– Suspect that there is something to be gained from
experimentation
– Must be feasible to take the process off-line for
experimentation
• Choose the factors (inputs) that will be varied in the
experiment and their levels (values).
– Put together a team of individuals who understand
the process at different levels (engineers, managers,
technicians, operators)
– Identify the controllable and uncontrollable factors
that may impact process performance.
– Rank the controllable factors (usually just monitor
environmental factors)
– Determine the levels for each controllable factor
chosen
• Formulate hypothesis
• Select the appropriate experimental design
• Conduct the experiments
• Analyze the results
• Draw conclusions
• Take action
• Good experimental design should:
– Eliminate known sources of bias
– Guard against unknown sources of bias
– Ensure that the experiment provides useful
information about the process without using
excessive experimental resources.
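Randomizing the order of the runs is one common way to guard against unknown sources of bias. The sketch below shows one way to do it in Python. The function name, the factor levels, and the replicate count are hypothetical and chosen only for illustration.

```python
import random


def randomized_run_order(levels, replicates, seed=None):
    """Return every (level, replicate) run once, in shuffled order."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    runs = [(level, rep) for level in levels
            for rep in range(1, replicates + 1)]
    rng.shuffle(runs)
    return runs


# Hypothetical factor levels and replicate count, for illustration only.
order = randomized_run_order(["low", "medium", "high"], replicates=4, seed=1)
for run_number, (level, rep) in enumerate(order, start=1):
    print(f"run {run_number:>2}: level={level}, replicate={rep}")
```

Recording the seed with the experiment lets the same run plan be regenerated later.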
Experiments with a single factor
• We will begin by studying the simplest
experimental design (an experiment with 1 factor)
in detail.
• The data analysis for single factor designs is similar
to that of more complex experiments.
• We will use analysis of variance (ANOVA) as the
primary statistical tool for analyzing output from
experiments
An Example
• A manufacturer of paper used for making grocery
bags is interested in improving the tensile strength
of the product. The manufacturing process specs
currently call for 10% hardwood concentration.
The paper has an average tensile strength of 15 psi.
The process engineer and the operators suspect that
tensile strength is a function of the pulp hardwood
concentration in the paper. Economic
considerations restrict the feasible hardwood
concentrations to between 5% and
20%. The process engineer decides to investigate 4
levels of hardwood concentration: 5%, 10%, 15%,
and 20%. She takes six specimens (replicates) at
each level yielding 24 total specimens. Each
specimen is then tested in random order. The results
are as follows:
Concentration           Observations
(%)               1    2    3    4    5    6
 5                7    8   15   11    9   10
10               12   17   13   18   19   15
15               14   18   19   17   16   18
20               19   25   22   23   18   20
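A first look at the data is to compare the average strength at each concentration with the overall average. The sketch below assumes the results table is read with one row per concentration and six observations per row. The variable names are illustrative.

```python
# Tensile strength (psi) keyed by hardwood concentration (%),
# assuming one row per concentration in the results table.
strength = {
    5: [7, 8, 15, 11, 9, 10],
    10: [12, 17, 13, 18, 19, 15],
    15: [14, 18, 19, 17, 16, 18],
    20: [19, 25, 22, 23, 18, 20],
}

# Mean response at each factor level.
level_means = {level: sum(obs) / len(obs) for level, obs in strength.items()}

# Grand mean over all 24 specimens.
all_obs = [y for obs in strength.values() for y in obs]
grand_mean = sum(all_obs) / len(all_obs)

for level, mean in level_means.items():
    print(f"{level:>2}% hardwood: mean strength = {mean:.2f} psi")
print(f"grand mean = {grand_mean:.2f} psi")
```

The level means rise with concentration. ANOVA is used to judge whether that pattern is larger than random error alone would explain.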
Null and Alternative Hypothesis
• Null Hypothesis: The mean response is the same at
every factor level (treatment)
• Alternative Hypothesis: At least one of the
treatment means is different.
• ANOVA makes inferences about means from
examination of the variability in the experiment.
• The total variability can be measured by the total
squared deviation of each response from the overall
mean. This is termed the Total Sum of Squares (SST).
• ANOVA then partitions SST into two parts: the
Between-treatment Sum of Squares (SSB) and the
Within-treatment Sum of Squares (SSE).
• SST = SSB + SSE
• SSB measures the differences between the factor
level means and the grand average, and so gives an
indication of the effect of the factor levels
on the response.
• Differences between observations within each
factor level and the factor level mean are due to
random error. Therefore the within-treatment sum
of squares is often termed the error sum of
squares (SSE).
• If SSB is large relative to SSE, the
null hypothesis will be rejected (the factor effects
are significant).
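The partition can be written out explicitly. The notation here is an assumption, not taken from the slides: y_{ij} is observation j at factor level i, \bar{y}_{i.} is the mean at level i, \bar{y}_{..} is the grand mean, a is the number of factor levels, and n is the number of observations per level. SST, SSB, and SSE are the ANOVA table's labels for the total, between-treatment, and within-treatment (error) sums of squares.

```latex
\begin{align*}
SST &= \sum_{i=1}^{a}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{..}\right)^{2} \\
SSB &= n \sum_{i=1}^{a} \left(\bar{y}_{i.} - \bar{y}_{..}\right)^{2} \\
SSE &= \sum_{i=1}^{a}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{i.}\right)^{2} \\
SST &= SSB + SSE
\end{align*}
```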
• Before comparing the sums of squares we must
scale them by their degrees of freedom:
• Let a = number of levels of the factor (treatments)
n = number of observations within each
factor level
Then there are N = an observations in total.
The total sum of squares (SST) has N - 1 degrees
of freedom.
Since there are a levels of the factor, the
between-treatment sum of squares (SSB) has (a-1)
degrees of freedom.
Within each factor level there are n observations
(replicates), providing (n-1) degrees of freedom
for estimating error. Since there are a factor levels,
the error sum of squares (SSE) has a(n-1) degrees
of freedom.
The ratio of a sum of squares to its number of
degrees of freedom is termed a mean square.
• Once the mean squares are calculated for the
between-treatment and error sums of squares, the
ratio MSfactor/MSerror provides a statistic that
follows the F distribution with (a-1) and a(n-1)
degrees of freedom when the null hypothesis is true.
The calculated value is then compared with the
critical value to determine the outcome of the
hypothesis test.
• The ANOVA table provides a convenient summary
of this information:
Source of          Sum of     Degrees of   Mean
Variation          Squares    freedom      Square      F
Between Factors    SSB        a-1          MSfactor    MSfactor/MSerror
Within Factors     SSE        a(n-1)       MSerror
Total              SST        an-1
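As an illustration, the table can be filled in for the hardwood concentration example. This is a minimal sketch for a balanced single-factor design. The function name and dictionary keys are hypothetical, and the data are entered assuming one row of six observations per concentration.

```python
def one_way_anova(groups):
    """ANOVA quantities for a balanced design: a levels, n observations each."""
    a = len(groups)
    n = len(groups[0])
    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / (a * n)
    level_means = [sum(g) / n for g in groups]

    # Sums of squares: total, between treatments, within treatments (error).
    sst = sum((y - grand_mean) ** 2 for y in all_obs)
    ssb = n * sum((m - grand_mean) ** 2 for m in level_means)
    sse = sum((y - m) ** 2 for g, m in zip(groups, level_means) for y in g)

    # Degrees of freedom and mean squares.
    df_between, df_error = a - 1, a * (n - 1)
    ms_factor, ms_error = ssb / df_between, sse / df_error
    return {
        "SSB": ssb, "SSE": sse, "SST": sst,
        "df_between": df_between, "df_error": df_error, "df_total": a * n - 1,
        "MSfactor": ms_factor, "MSerror": ms_error,
        "F": ms_factor / ms_error,
    }


# Tensile strength (psi) at 5%, 10%, 15%, and 20% hardwood concentration.
hardwood = [
    [7, 8, 15, 11, 9, 10],
    [12, 17, 13, 18, 19, 15],
    [14, 18, 19, 17, 16, 18],
    [19, 25, 22, 23, 18, 20],
]
table = one_way_anova(hardwood)
for name, value in table.items():
    print(f"{name:>10}: {value:.2f}")
```

For these data, SSB is about 382.79, SSE about 130.17, and F about 19.61 on 3 and 20 degrees of freedom. The conclusion follows from comparing that F value with the tabulated critical value at the chosen significance level.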
Randomized Block Designs
• Blocking involves grouping experimental units
which have similar effects on the response variable.
That is, we seek to eliminate the effect of
extraneous factors within a block so that the
between treatment effect (our main concern) can be
more precisely measured.
• Example:
Suppose we wish to investigate the differences in raw
materials from three different vendors. Processing
will take place on two machines. If we randomly
assign raw materials to machines, we will not be
able to claim that differences in output are due to
vendors, because we have not eliminated the effects
of the machines. If we instead treat each machine as
a block and run material from every vendor on it,
differences in output within a block represent
differences between vendors, whereas differences in
output between machines represent the blocking
effect, i.e., they show whether blocking was
successful.
• With a blocking design, the total sum of squares
(SST) is partitioned into 3 parts: the sum of squares
due to treatments (SSTR), the sum of squares due to
the blocking factor (SSBF), and the error sum of
squares (SSE). Successful blocking minimizes
variation between observations within a
block while maximizing the variation between
blocks. Since SSBF is removed from the
experimental error, the result is an increase in the
precision of the experiment.
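The three-way partition can be written out as follows. The notation is an assumption, not taken from the slides: y_{ij} is the response for treatment i in block j, a is the number of treatments, b is the number of blocks, \bar{y}_{i.} and \bar{y}_{.j} are the treatment and block means, and \bar{y}_{..} is the grand mean.

```latex
\begin{align*}
SST  &= \sum_{i=1}^{a}\sum_{j=1}^{b} \left(y_{ij} - \bar{y}_{..}\right)^{2} \\
SSTR &= b \sum_{i=1}^{a} \left(\bar{y}_{i.} - \bar{y}_{..}\right)^{2} \\
SSBF &= a \sum_{j=1}^{b} \left(\bar{y}_{.j} - \bar{y}_{..}\right)^{2} \\
SSE  &= \sum_{i=1}^{a}\sum_{j=1}^{b}
        \left(y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..}\right)^{2} \\
SST  &= SSTR + SSBF + SSE
\end{align*}
```

The degrees of freedom split the same way: ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1).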
• The ANOVA table for a randomized block design
would be as follows:
Source of     Sum of     Degrees of    Mean
Variation     Squares    freedom       Square         F
Treatments    SSTR       a-1           MStreatment    MStreatment/MSerror
Blocks        SSBF       b-1           MSblock        MSblock/MSerror
Error         SSE        (a-1)(b-1)    MSerror
Total         SST        ab-1
Example
• We wish to test the effect of four different
chemicals on the strength of a particular fabric. It is
known that the effect of these chemicals varies
considerably across fabric specimens. We take 5
fabric samples and apply all four chemicals to each
sample in random order. We have now isolated the effect of
the chemical in the fairly homogeneous
environment of a single sample of fabric. The
results are as follows:
                    Fabric Sample
Chemical      1      2      3      4      5
1           1.3    1.6    0.5    1.2    1.1
2           2.2    2.4    0.4    2.0    1.8
3           1.8    1.7    0.6    1.5    1.3
4           3.9    4.4    2.0    4.1    3.4
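The randomized block analysis of these results can be sketched as below, with chemicals as treatments and fabric samples as blocks. The data layout (one row per chemical, one column per fabric sample) is an assumed reading of the results table, and the variable names are illustrative.

```python
# Rows: chemicals 1-4 (treatments); columns: fabric samples 1-5 (blocks).
strength = [
    [1.3, 1.6, 0.5, 1.2, 1.1],
    [2.2, 2.4, 0.4, 2.0, 1.8],
    [1.8, 1.7, 0.6, 1.5, 1.3],
    [3.9, 4.4, 2.0, 4.1, 3.4],
]
a = len(strength)      # number of treatments
b = len(strength[0])   # number of blocks

grand_mean = sum(sum(row) for row in strength) / (a * b)
treatment_means = [sum(row) / b for row in strength]
block_means = [sum(strength[i][j] for i in range(a)) / a for j in range(b)]

# Partition the total sum of squares into treatment, block, and error parts.
sst = sum((y - grand_mean) ** 2 for row in strength for y in row)
sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ssbf = a * sum((m - grand_mean) ** 2 for m in block_means)
sse = sst - sstr - ssbf

# Mean squares and F ratios, using the degrees of freedom from the table.
ms_treatment = sstr / (a - 1)
ms_block = ssbf / (b - 1)
ms_error = sse / ((a - 1) * (b - 1))
f_treatment = ms_treatment / ms_error
f_block = ms_block / ms_error

print(f"SSTR={sstr:.3f}  SSBF={ssbf:.3f}  SSE={sse:.3f}  SST={sst:.3f}")
print(f"F(treatments)={f_treatment:.2f}  F(blocks)={f_block:.2f}")
```

For these data, SSTR is about 18.04, SSBF about 6.69, and SSE about 0.95, giving a treatment F ratio of about 75.9 on 3 and 12 degrees of freedom. Without blocking, SSBF would have been counted as error, which is the gain in precision that blocking provides.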