MATHEMATICAL METHODS OF OPTIMAL DESIGN (OPTIMAALSE PROJEKTEERIMISE MATEMAATILISED MEETODID)
MER 9020
Course volume: 6 EAP, E S
Full-time study, weekly hours: 4.0 (lectures 2.0, practicals 0.0, exercises 2.0)
The objectives of the course are:
- to deepen the students' preparation in the modelling and optimization of products and production processes, and to introduce developments in the field;
- to provide up-to-date knowledge and skills for independent research and development work, including formulating problems and the goals of research and development projects, knowing the theories, methods and software systems needed for carrying out research and development, and applying mathematical and experimental methods and software systems.
Course Programme
Introduction
1. Mathematical models for describing technical systems and processes, and tools for creating them. Use of experimental models. Engineering statistics
   a. Response functions (surrogate models) and their use
   b. Statistics and design. Use of models determined from experiments to describe technical systems and processes. Design of experiments. (Engineering statistics and design. Statistical Design of Experiments. Use of statistical models in design.)
   c. Modern methods for constructing surrogate models. Use of neural-network techniques.
   d. Assessing model accuracy. Accounting for risk, reliability and safety. (Statistical tests of hypotheses. Analysis of variance. Risk, reliability and safety.)
2. Optimization of engineering solutions, general foundations (Basics of optimization methods and search of solutions)
   a. Types of optimization problems
   b. Optimal choice problems. Multicriteria Pareto-optimal solutions. (Multicriteria optimal decision theory. Pareto optimality. Use of multicriteria decision theory in engineering design.)
   c. Classical optimization problems. (Optimization by differential calculus. Lagrange multipliers. Examples of use of the classical optimization method in engineering.)
   d. Mathematical programming methods. Primal and dual programming problems. Use of the dual problem. Sensitivity analysis of the solutions of optimization problems. (Mathematical programming methods and their use for engineering design, process planning and manufacturing resource planning. Primal and dual optimization tasks. Sensitivity analysis.)
   e. Genetic optimization algorithms.
3. Simulation of technical systems and processes. Simulation using random-number generators and SIMULINK.
4. Developments in the optimal design of products and production processes. Examples of the use of optimization.
Independent practical work: the practical work consists of the following five assignments:
1. Statistical evaluation of experimental results: computing the main statistical estimates for the results, assessing the relationship (correlation) between two different test series, and assessing the difference in variability between test series.
2. Building a regression-analysis model: estimating the model parameters from the experimental results, estimating the confidence intervals of the obtained parameters, and analysing the adequacy of the chosen model.
3. Fitting a neural-network model.
4. Solving a linear (or nonlinear) optimization problem and analysing the optimization results.
5. Statistical simulation of a model on a computer and analysis of the simulation results.
Study literature
1. Übi, E. Ekstreemumülesanded majanduses ja tehnikas. Külim, 2002, 176 pp.
2. Ravindran, A.; Ragsdell, K. M.; Reklaitis, G. V. Engineering Optimization: …
3. Antoniou, A. Practical Optimization: Algorithms …
1. Introduction
Models for the optimization of technical systems are by nature descriptive and normative (prescriptive); their purpose is to find the best values of the system parameters under the assumption that a model describing the system or process is known. When a reliable analytical (theoretical) model is missing, it must be created from experiments, observations and the like, replacing the analytical model with a surrogate model by estimating the so-called response surface (response function).
Given the task stated above, the course can be summarized by the following scheme.

Figure: Optimization of technical systems (scheme). Experiments, computer experiments and process monitoring, organized through design of experiments (DOE), yield a surrogate model (response surface), which is then used for simulation, analysis and optimization.
Engineering solutions in computer-based systems (Definition of engineering tasks)
Engineering tasks involve the transformation of information and can be divided into:
- simulation tasks (interpretation of information/data)
- analysis tasks
- diagnostics tasks
- synthesis tasks
Different types of tasks call for different approaches.
Simulation is used to analyse information/objects with a given structure, in order to assess the behaviour of the objects as a function of their parameters.
In analysis, the acting parameters and the structure are known, and the behaviour of the object is analysed as the parameters change.
Diagnostics is the inverse task of simulation: one tries to find the parameters that ensure the required output/behaviour.
Synthesis is the inverse task of analysis: starting from the output, one tries to determine a suitable structure and parameters.
1.1. Estimating the response function. Response Surface Methodology (RSM)
There is a difference between data and information. To extract information from data you
have to make assumptions about the system that generated the data. Using these assumptions
and physical theory you may be able to develop a mathematical model of the system.
Generally, even rigorously formulated models have some unknown parameters. Identifying those unknown constants and fitting an appropriate response surface model from experimental data requires knowledge of design of experiments, regression modelling techniques, and optimization methods.
The response surface equations give the response in terms of the several independent variables of the problem. If the response is plotted as a function of X1, X2, etc., we obtain a response surface. A powerful statistical procedure that employs factorial analysis and regression analysis has been developed for determining the optimum operating condition on a response surface.
Response surface methodology (RSM) has two objectives:
1. To determine with one experiment where to move in the next experiment so as to
continually seek out the optimal point on the response surface.
2. To determine the equation of the response surface near the optimal point.
Response surface methodology (RSM) uses a two-step procedure aimed at rapid movement from the current position into the region of the optimum. This is followed by characterization of the response surface in the vicinity of the optimum by a mathematical model. The basic tools used in RSM are two-level factorial designs and the least-squares (regression) model, including its simpler polynomial forms.
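To make the two-step idea concrete, the sketch below (a minimal illustration, not part of the original text) fits a first-order response surface model to a hypothetical two-level factorial experiment by least squares and uses its coefficients as the direction of steepest ascent, i.e. where the next experiment should be placed. All numerical values are invented for illustration.

```python
import numpy as np

# Hypothetical 2^2 factorial experiment in coded variables x1, x2 (-1/+1),
# with one measured response per run.
X_coded = np.array([[-1, -1],
                    [ 1, -1],
                    [-1,  1],
                    [ 1,  1]], dtype=float)
y = np.array([14.5, 14.7, 14.9, 15.8])          # illustrative responses

# First-order response surface model y ~ b0 + b1*x1 + b2*x2, fitted by least squares.
A = np.column_stack([np.ones(len(y)), X_coded])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2 = coef

# The gradient (b1, b2) points in the direction of steepest ascent (RSM objective 1).
step = np.array([b1, b2]) / np.linalg.norm([b1, b2])
print(f"fitted model: y = {b0:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2")
print("direction of steepest ascent (coded units):", np.round(step, 3))
```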
1.2. Modelling
A model is a representation or pattern of an idea or problem. That is, a model is a way to
describe or present a problem in a way that aids in understanding or solving the problem.
Models serve several purposes:
The Purpose of Modelling
1. To make an idea concrete. This is done by representing it mathematically, pictorially or
symbolically.
2. To reveal possible relationships between ideas. Relationships of hierarchy, support,
dependence, cause, effect, etc. can be revealed by constructing a model.
We have to be careful, then, how much we let our models control our thinking.
3. To simplify the complex design problem to make it manageable or understandable. Almost
all models are simplifications because reality is so complex.
4. The main purpose of modelling, which often includes all three of the above purposes, is to present a problem in a way that allows us to understand it and solve it.
Types of Models
A. Visual. Draw a picture of it. If the problem is or contains something physical, draw a
picture of the real thing--the door, road, machine, bathroom, etc. If the problem is not
physical, draw a symbolic picture of it, either with lines and boxes or by representing aspects
of the problem as different items--like cars and roads representing information transfer in a
company.
B. Physical. The physical model takes the advantages of a visual model one step further by
producing a three dimensional visual model.
C. Mathematical. Many problems are best solved mathematically.
1.3. Complexity of design problems. Complexity theory for design
The complexity of a problem is assessed in terms of three types of complexity:
- computational complexity (computational effort)
- complexity of description
- complexity of understanding (recognition)
Complexity theory is part of the theory of computation dealing with the resources required
during computation to solve a given problem. The most common resources are time (how
many steps does it take to solve a problem) and space (how much memory does it take to
solve a problem). Other resources can also be considered, such as how many parallel
processors are needed to solve a problem in parallel. Complexity theory differs from
computability theory, which deals with whether a problem can be solved at all, regardless of
the resources required.
The time complexity of a problem is the number of steps that it takes to solve an instance of
the problem, as a function of the size of the input, using the most efficient algorithm. To
understand this intuitively, consider the example of an instance that is n bits long that can be
solved in n² steps. In this example we say the problem has a time complexity of n². Of course,
the exact number of steps will depend on exactly what machine or language is being used. To
avoid that problem, we generally use Big O notation. If a problem has time complexity O(n²)
on one typical computer, then it will also have complexity O(n²) on most other computers, so
this notation allows us to generalize away from the details of a particular computer.
Big O notation is a type of symbolism used in complexity theory, computer science, and
mathematics to describe the asymptotic behavior of functions. More exactly, it is used to
describe an asymptotic upper bound for the magnitude of a function in terms of another,
usually simpler, function.
The following scale of reference functions (Big O notation) is used to assess complexity:
- logarithmic complexity, O(log n)
- linear complexity, O(n)
- polynomial complexity, O(n^q)
- exponential complexity, O(a^n)
- factorial complexity, O(n!)
- double-exponential complexity, O(n^n).
With logarithmic complexity, the computation time grows only very slowly with the size (dimensions) of the problem. From the point of view of computational effort, mainly logarithmic and linear complexities are acceptable.
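A quick way to see why only the first entries of this scale are acceptable is to tabulate the reference functions for a few input sizes. The sketch below is a minimal illustration (polynomial taken with q = 2 and exponential with a = 2); it uses nothing beyond the Python standard library.

```python
import math

# Reference functions from the complexity scale above (q = 2, a = 2 assumed).
funcs = {
    "O(log n)": lambda n: math.log2(n),
    "O(n)":     lambda n: n,
    "O(n^q)":   lambda n: n ** 2,
    "O(a^n)":   lambda n: 2 ** n,
    "O(n!)":    lambda n: math.factorial(n),
    "O(n^n)":   lambda n: n ** n,
}

for n in (4, 8, 16, 32):
    print(f"n={n:2d}: " + ", ".join(f"{name}={f(n):.3g}" for name, f in funcs.items()))
```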
2. Use of models determined from experiments in the optimization of technical systems and processes. Use of mathematical statistics. Development of models. Statistical decision theory.
2.1 Use of software tools (MATLAB and Excel) for statistical analysis.
The Statistics Toolbox extends MATLAB® to support a wide range of common statistical tasks. The Excel Data Analysis Tools likewise provide a wide range of statistical analysis tools.
The following tasks are our special area of interest.
Probability Distributions. The toolbox supports computations involving over 30 different common probability distributions, plus custom distributions that you can define. For each
distribution, a selection of relevant functions is available, including density functions,
cumulative distribution functions, parameter estimation functions, and random number
generators. The toolbox also supports nonparametric methods for density estimation.
Linear Models. In the area of linear regression, the Statistics Toolbox has functions to compute parameter estimates, predicted values, and confidence intervals for simple and multiple regression, stepwise regression, ridge regression, and regression using response surface models, as well as tools for analysis of variance (ANOVA).
Nonlinear Models. For nonlinear regression models, Statistics Toolbox provides additional
parameter estimation functions and tools for interactive prediction and visualization of
multidimensional nonlinear fits. The toolbox also includes functions that create classification
and regression trees to approximate regression relationships.
Multivariate Statistics. The toolbox supports methods for the visualization and analysis of multidimensional data, including principal components analysis, factor analysis, one-way multivariate analysis of variance, cluster analysis, and classical multidimensional scaling.
Statistical Process Control. In the area of process control and quality management, Statistics
Toolbox provides functions for creating a variety of control charts, performing process
capability studies, and evaluating Design for Six Sigma (DFSS) methodologies.
Design of Experiments. Statistics Toolbox provides tools for generating and augmenting full
and fractional factorial designs, response surface designs, and D-optimal designs. The toolbox
also provides functions for the optimal assignment of units with fixed covariates.
2.1.1. Evaluation of random quantities. Probability distributions. Descriptive statistics. Estimation of the parameters of a distribution.
A typical data sample is distributed over a range of values, with some values occurring more
frequently than others. Some of the variability may be the result of measurement error or
sampling effects. For large random samples, however, the distribution of the data typically
reflects the variability of the source population and can be used to model the data-producing
process. Statistics computed from data samples also vary from sample to sample.
Modelling distributions of statistics is important for drawing inferences from statistical
summaries of data. Probability distributions are theoretical distributions, based on
assumptions about a source population. They assign probability to the event that a random
variable, such as a data value or a statistic, takes on a specific, discrete value, or falls within a
specified range of continuous values.
Choosing a model often means choosing a parametric family of probability distributions and
then adjusting the parameters to fit the data. The choice of an appropriate distribution family
may be based on a priori knowledge, such as matching the mechanism of a data-producing
process to the theoretical assumptions underlying a particular family, or a posteriori
knowledge, such as information provided by probability plots and distribution tests.
Parameters can then be found that achieve the maximum likelihood of producing the data.
In the following we give an example of using the Excel Data Analysis tool for estimating the probabilistic parameters: the Descriptive Statistics function.
Example of generating normally distributed random numbers

  σ = 1, mean = 10     σ = 0.1, mean = 10
  9.699767841          10.04223762
  10.24425731          10.00892771
  11.19835022          9.844905233
  7.81641236           10.11216025
  11.09502253          10.08380721
  9.30979584           9.872989292
  8.153089109          10.04739752
  9.226492946          10.0507107
  9.432075128          10.14481066
  10.13485305          9.95751125
Estimating the characteristics of the distribution of random quantities (Descriptive Statistics)
                                      Column 1 (σ = 1)    Column 2 (σ = 0.1)
  Mean                                9.631               10.01655
  Standard Error                      0.350               0.031003
  Median                              9.566               10.04482
  Mode                                #N/A                #N/A
  Standard Deviation                  1.106               0.098042
  Sample Variance                     1.224               0.009612
  Kurtosis                            -0.482              -0.36557
  Skewness                            -0.201              -0.71427
  Range                               3.382               0.299905
  Minimum                             7.816               9.844905
  Maximum                             11.198              10.14481
  Sum                                 96.310              100.1655
  Count                               10                  10
  Largest(1)                          11.198              10.14481
  Smallest(1)                         7.816               9.844905
  Confidence Level (95.0%) for mean   0.792               0.070135
STANDARD ERROR
The standard error is the standard deviation of the sampling distribution associated with an estimation method; the term may also be used for an estimate of that standard deviation, derived from the particular sample used to compute the estimate. For the sample mean of n observations, the standard error is σ/√n, estimated by s/√n.
Let X be a random variable with mean value μ = E[X], where the operator E denotes the average or expected value of X. Then the standard deviation of X is the quantity σ = √(E[(X − μ)²]).
Sample variance
In probability theory and statistics, the variance is used as a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). The sample variance of observations x1, …, xn is s² = Σ(xi − x̄)² / (n − 1).
Kurtosis
In probability theory and statistics, kurtosis (from the Greek word κυρτός, kyrtos or kurtos, meaning bulging) is a measure of the "peakedness" of the probability distribution of a real-valued random variable, although some sources insist that what kurtosis really measures is heavy tails rather than peakedness. Higher kurtosis means more of the variance is the result of infrequent extreme deviations, as opposed to frequent modestly sized deviations.
The fourth standardized moment is sometimes used as the definition of kurtosis. It is defined as β2 = μ4 / σ⁴, where μ4 is the fourth moment about the mean and σ is the standard deviation. (Excel's Descriptive Statistics reports excess kurtosis, which is scaled so that the normal distribution has the value 0.)
Skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability
distribution of a real-valued random variable. The skewness value can be positive or negative,
or even undefined. Qualitatively, a negative skew indicates that the tail on the left side of the
probability density function is longer than the right side and the bulk of the values (possibly
including the median) lie to the right of the mean. A positive skew indicates that the tail on
the right side is longer than the left side and the bulk of the values lie to the left of the mean.
A zero value indicates that the values are relatively evenly distributed on both sides of the
mean, typically but not necessarily implying a symmetric distribution.
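The same summary statistics can be computed outside Excel. Below is a minimal sketch in Python (numpy/scipy), using the σ = 1 column of the example above rounded to two decimals; the bias-correction flags only approximate Excel's exact SKEW/KURT formulas.

```python
import numpy as np
from scipy import stats

# Sample: the sigma = 1 column of the example above, rounded to two decimals.
x = np.array([9.70, 10.24, 11.20, 7.82, 11.10, 9.31, 8.15, 9.23, 9.43, 10.13])
n = len(x)

mean = x.mean()
sd   = x.std(ddof=1)                                 # sample standard deviation
var  = x.var(ddof=1)                                 # sample variance
sem  = sd / np.sqrt(n)                               # standard error of the mean
skew = stats.skew(x, bias=False)
kurt = stats.kurtosis(x, fisher=True, bias=False)    # excess kurtosis
ci95 = stats.t.ppf(0.975, df=n - 1) * sem            # 95% confidence half-width for the mean

print(f"mean={mean:.3f}  std err={sem:.3f}  std dev={sd:.3f}  variance={var:.3f}")
print(f"skewness={skew:.3f}  kurtosis={kurt:.3f}  confidence level (95%)={ci95:.3f}")
```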
2.1.2. Correlation. Relationships between random quantities.
In statistics, dependence refers to any statistical relationship between two random variables
or two sets of data. Correlation refers to any of a broad class of statistical relationships
involving dependence.
Figure: Correlation examples.
Example of estimating the correlation using the Excel Data Analysis tool
  x   y   z
  3   3   8
  1   2   7
  5   4   4
  6   7   2
  7   6   3
  8   5   6
  4   4   4
  5   4   2
  2   3   2
  5   4   7
Example of correlation estimation in Excel

              Column 1    Column 2    Column 3
  Column 1    1
  Column 2    0.825593    1
  Column 3    -0.22052    -0.48651    1
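The same correlation matrix can be reproduced with a few lines of Python (a minimal sketch, equivalent to the Excel Correlation tool), using the x, y, z series listed above.

```python
import numpy as np

# The three series from the example above.
x = np.array([3, 1, 5, 6, 7, 8, 4, 5, 2, 5])
y = np.array([3, 2, 4, 7, 6, 5, 4, 4, 3, 4])
z = np.array([8, 7, 4, 2, 3, 6, 4, 2, 2, 7])

# Pearson correlation matrix (rows of the stacked array are the variables).
R = np.corrcoef(np.vstack([x, y, z]))
print(np.round(R, 5))
# [[ 1.       0.82559 -0.22052]
#  [ 0.82559  1.      -0.48651]
#  [-0.22052 -0.48651  1.     ]]
```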
Homework 1. Statistical evaluation of experimental results
Using the Excel Data Analysis tools, estimate the characteristics of the distribution of random quantities (Descriptive Statistics) by solving the following tasks:
1. Compute the main statistical estimates for the experimental results. The estimation must be based on at least two different, freely chosen test series, and the statistical estimates must be found for both series.
2. Assess the relationship between the two different test series (correlation, covariance).
3. Assess the difference in variability between the test series, using the F criterion to check whether the difference in variability between the results of the different tests is random. That is, the variances σ1² and σ2² of the test results are estimated by s1² and s2², the ratio F_b = s1² / s2² is computed, and it is compared with the tabulated value F_T of the F distribution at a given significance level (usually 5%); the degrees of freedom used for the tabulated F value are the corresponding numbers of tests minus one, n1 − 1 and n2 − 1. If F_b ≤ F_T, the so-called "null hypothesis" is accepted, and the difference in variability between the different test results is considered random.
The Excel Data Analysis function F-Test Two-Sample for Variances can also be used to solve this task.
Example of analysis with the function F-Test Two-Sample for Variances

                        Variable 1    Variable 2
  Mean                  4.642857      4.071429
  Variance              6.708791      3.917582
  Observations          14            14
  df                    13            13
  F                     1.712482
  P(F<=f) one-tail      0.172124
  F Critical one-tail   2.576927
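An equivalent test can be run outside Excel; the sketch below is a minimal illustration with scipy, using two invented measurement series (replace them with your own experimental data).

```python
import numpy as np
from scipy import stats

# Hypothetical measurement series; replace with your own experimental data.
series1 = np.array([5.1, 4.3, 6.2, 3.9, 5.5, 4.8, 6.7, 4.1, 5.9, 4.6])
series2 = np.array([4.2, 3.9, 4.5, 4.1, 3.7, 4.4, 4.0, 4.3, 3.8, 4.6])

s1, s2 = series1.var(ddof=1), series2.var(ddof=1)
F_b = s1 / s2                                # put the larger variance on top if needed
df1, df2 = len(series1) - 1, len(series2) - 1

F_T = stats.f.ppf(0.95, df1, df2)            # tabulated (critical) value at the 5% level
p_one_tail = stats.f.sf(F_b, df1, df2)       # one-tailed p-value (series1 assumed more variable)

print(f"F_b = {F_b:.3f}, F critical (one-tail) = {F_T:.3f}, p = {p_one_tail:.3f}")
print("null hypothesis (equal variances) retained" if F_b <= F_T
      else "variances differ more than chance would explain")
```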
Questions related to Homework 1
- Explain and compare the different characteristics of the distribution of random quantities.
- Assess the relationship between the different experimental results.
- Assess the difference between the different experimental results, including the difference in variability.
2.1.3. Linear regression models.
Linear regression is widely used in science to describe relationships between variables. It
ranks as one of the most important tools used. Researchers usually include several variables in
their regression analysis in an effort to remove factors that might produce spurious
correlations. However, it is never possible to include all possible confounding variables in a
study employing regression. For this reason, randomized experiments are considered to be
more trustworthy than a regression analysis.
Linear models represent the relationship between a continuous response variable and one or more predictor variables (either continuous or categorical) in the form y = Xβ + ε,
where
  y is an n-by-1 vector of observations of the response variable,
  X is the n-by-p design matrix determined by the predictors,
  β is a p-by-1 vector of unknown parameters to be estimated,
  ε is an n-by-1 vector of independent, identically distributed random disturbances.
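For reference, the least-squares estimate of β, which is what the regression tools used below compute, has the standard closed form (not written out in the source text):

$$ \hat{\beta} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} y, \qquad \hat{y} = X \hat{\beta} $$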
Example: input data

  Month   Service   Demand
  1       1         10
  2       1         12
  3       1         11
  4       1         13
  5       1         15
  6       1         13
  10      1         13
  1       2         100
  2       2         101
  3       2         102
  4       2         105
  5       2         102
  6       2         101
  7       2         102
Example of regression analysis results

SUMMARY OUTPUT

Regression Statistics
  Multiple R           1.000
  R Square             0.999
  Adjusted R Square    0.999
  Standard Error       1.485
  Observations         14

ANOVA
               df    SS           MS           F          Significance F
  Regression    2    27997.457    13998.728    6347.977   0.000
  Residual     11    24.257       2.205
  Total        13    28021.714

                 Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
  Intercept      -78.350        1.487            -52.682    0.000     -81.624     -75.077
  X Variable 1   0.278          0.164            1.692      0.119     -0.084      0.640
  X Variable 2   89.548         0.797            112.373    0.000     87.794      91.302

RESIDUAL OUTPUT
  Observation   Predicted Y   Residuals
  1             11.476        -1.476
  2             11.753        0.247
  3             12.031        -1.031
  4             12.309        0.691
  5             12.587        2.413
  6             12.865        0.135
  7             13.977        -0.977
  8             101.023       -1.023
  9             101.301       -0.301
  10            101.579       0.421
  11            101.857       3.143
  12            102.135       -0.135
  13            102.413       -1.413
  14            102.691       -0.691
Linearization Transformations
The computational difficulties associated with nonlinear regression analysis sometimes can be
avoided by using simple transformations that convert a problem that is nonlinear into one that
can be handled by simple linear regression analysis. The most common transformations are
given in the following table. The reader needs to be aware that logarithmic transformation can
introduce bias in the prediction of the response variable.
TABLE. Some common linearization transformations, W = c + dV

  Nonlinear equation               Linearized equation       W               V
  y = a + bx (linear)              y = a + bx                y               x
  y = ax^b (logarithmic)           ln y = ln a + b ln x      ln y            ln x
  y = ae^(bx) (exponential)        ln y = ln a + bx          ln y            x
  y = 1 − e^(−bx) (exponential)    ln(1/(1 − y)) = bx        ln(1/(1 − y))   x
  y = a + b√x (square root)        y = a + bV                y               √x
  y = a + b/x (inverse)            y = a + bV                y               1/x
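As a small illustration of the table (a sketch on invented data, not from the source), the exponential model y = a·e^(bx) can be fitted with ordinary linear regression after taking logarithms, exactly as in the row ln y = ln a + bx:

```python
import numpy as np

# Hypothetical exponential data y = a * exp(b*x) with multiplicative noise.
rng = np.random.default_rng(0)
x = np.linspace(0.5, 5.0, 20)
y = 2.0 * np.exp(0.7 * x) * rng.lognormal(sigma=0.05, size=x.size)

# Linearization from the table: W = ln y = ln a + b*x  (so V = x).
W = np.log(y)
b_hat, c = np.polyfit(x, W, 1)      # slope estimates b, intercept estimates ln a
a_hat = np.exp(c)

print(f"a ~ {a_hat:.3f}, b ~ {b_hat:.3f}   (true values: a = 2.0, b = 0.7)")
```

Note the caveat in the text above: the logarithmic transformation can bias predictions of the response variable when transformed back.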
Nonlinear regression
Nonlinear regression in statistics is the problem of fitting a model y = f(x; θ) + ε to multidimensional x, y data, where f is a nonlinear function of x with parameters θ. In general, there is no algebraic expression for the best-fitting parameters, as there is in linear regression. Usually numerical optimization algorithms are applied to determine the best-fitting parameters.
2.1.4. Regression analysis. Method of least squares
Regression analysis is any statistical method in which the mean of one or more random variables is predicted conditioned on other (measured) random variables. Regression analysis is the statistical view of curve fitting: choosing a curve that best fits given data points.
Sometimes there are only two variables, one of which, called X, can be regarded as non-random, because it can be measured without substantial error and its values can even be chosen at will. For this reason it is called the independent or controlled variable. The other variable, called Y, is a random variable called the dependent variable, because its values depend on X. In regression we are interested in the variation of Y with X.
This dependence is called the regression of Y on X.
Regression is usually posed as an optimization problem as we are attempting to find a solution
where the error is at a minimum. The most common error measure that is used is the least
squares: this corresponds to a Gaussian likelihood of generating observed data given the
(hidden) random variable.
Regression can be expressed as a maximum likelihood method of estimating the parameters of
a model.
The earliest form of linear regression was the method of least squares, which was published
by Legendre in 1805, and by Gauss in 1809. The term "least squares" is from Legendre's term,
moindres carrés. However, Gauss claimed that he had known the method since 1795.
2.1.5 Analysis of variance
In statistics, analysis of variance (ANOVA) is a collection of statistical models and their
associated procedures which compare means by splitting the overall observed variance into
different parts.
ANOVA is a particular form of statistical hypothesis testing used in the analysis of
experimental data. A test result (calculated from the null hypothesis and the sample) is called
statistically significant if it is deemed unlikely to have occurred by chance, assuming the truth
of the null hypothesis. A statistically significant result (when a probability (p-value) is less
than a threshold (significance level)) justifies the rejection of the null hypothesis.
In the typical application of ANOVA, the null hypothesis is that all groups are simply random
samples of the same population. Rejecting the null hypothesis implies that different treatments
result in altered effects.
ANOVA uses traditional standardized terminology. The definitional equation of the sample variance is s² = Σ(yi − ȳ)² / (n − 1), where the divisor is called the degrees of freedom (DF), the summation is called the sum of squares (SS), the result is called the mean square (MS), and the squared terms are deviations from the sample mean. For regression analysis, ANOVA is based on partitioning the sum of squares and estimates three sample variances:
- a total variance, based on all the observation deviations from the grand mean;
- an error variance, based on all the observation deviations from their appropriate treatment means;
- a treatment variance, based on the deviations of the treatment means from the grand mean, the result being multiplied by the number of observations in each treatment to account for the difference between the variance of observations and the variance of means.
The fundamental technique is a partitioning of the total sum of squares into components
related to the effects in the model used. For example, we show the model for a simplified
ANOVA with one type of treatment at different levels. (If the treatment levels are quantitative
and the effects are linear, a linear regression analysis may be appropriate.)
SSTotal = SSError + SSTreatments.
The number of degrees of freedom (abbreviated df) can be partitioned in a similar way and
specifies the chi-square distribution which describes the associated sums of squares.
dfTotal = dfError + dfTreatments.
Analyses of variance lead to F-tests of statistical significance using Fisher's F-distribution.
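The partitioning described above can be verified numerically; the sketch below (a minimal illustration on invented data for one treatment factor at three levels) computes the sums of squares, checks that SSTotal = SSError + SSTreatments, and forms the F statistic.

```python
import numpy as np
from scipy import stats

# Hypothetical responses for one treatment factor at three levels.
groups = [np.array([12.1, 11.8, 12.5, 12.0]),
          np.array([13.4, 13.1, 13.8, 13.6]),
          np.array([12.9, 12.6, 13.0, 12.7])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

ss_total = ((all_obs - grand_mean) ** 2).sum()
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

df_total = all_obs.size - 1
df_treat = len(groups) - 1
df_error = df_total - df_treat

F = (ss_treat / df_treat) / (ss_error / df_error)
p = stats.f.sf(F, df_treat, df_error)

print(f"SS_total={ss_total:.3f} = SS_error={ss_error:.3f} + SS_treat={ss_treat:.3f}")
print(f"F = {F:.2f}, p = {p:.4f}")   # same result as scipy.stats.f_oneway(*groups)
```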
Homework 2. Building a regression-analysis model.
Regression analysis based on the Microsoft Excel Data Analysis tools.
Microsoft Excel provides a set of Data Analysis Tools (called the Analysis ToolPak) that can be used to develop complex statistical or engineering analyses.
The Analysis ToolPak includes a tool for regression analysis. The Regression analysis tool performs linear regression analysis by using the least-squares method to fit a line through a set of observations. The figures below show the input data, the setting of the parameters for the regression analysis task, and the output of the results.
The Regression tool in Excel’s Data Analysis determines the coefficients (ai) that yield the
smallest residual sum of squares, which is equivalent to the greatest correlation coefficient
squared, R2, for Equation (1). This is known as linear regression analysis.
y = a0 + a1*x1 + a2*x2 + a3*x3 + ….
(1)
where y is the dependent variable (response), a0 is the intercept, and x1, x2, x3 etc. are the
independent variables (factors). It is assumed that you have n observations of y versus
different values of xi.
Note that the xi can be functions of the actual experimental variables, including products
of different variables.
Following are the meanings of the results produced by Excel:
- R Square = (Multiple R)² = R² = 1 - Residual SS / Total SS = Regression SS / Total SS (roughly the fraction of the variation in y that is explained by equation 1).
- R = correlation coefficient.
- Adjusted R Square = 1 - (Total df / Residual df)(Residual SS / Total SS).
- Standard Error = (Residual MS)^0.5. Note that "error" does not mean there is a mistake or an experimental error; it is just a definition related to residuals. This word is used a lot in statistical analysis and is often misunderstood by those who do not know statistics.
- ANOVA = ANalysis Of VAriance.
- Regression df = regression degrees of freedom = number of independent variables (factors) in equation 1.
- Regression SS = Total SS - Residual SS.
- Regression MS = Regression SS / Regression df.
- Regression F = Regression MS / Residual MS.
- Significance F = FDIST(Regression F, Regression df, Residual df) = probability that equation (1) does NOT explain the variation in y, i.e. that any fit is purely by chance. This is based on the F probability distribution. If the Significance F is not less than 0.1 (10%), you do not have a meaningful correlation.
- Residual df = residual degrees of freedom = Total df - Regression df = n - 1 - number of independent variables (xi).
- Residual SS = sum of squares of the differences between the values of y predicted by equation 1 and the actual values of y. If the data exactly fit equation 1, then Residual SS would be 0 and R² would be 1.
- Residual MS = mean square error = Residual SS / Residual df.
- Total df = total degrees of freedom = n - 1.
- Total SS = the sum of the squares of the differences between the values of y and the average y = (n - 1)*(standard deviation of y)².
- Coefficients = values of ai which minimize the Residual SS (maximize R²). The Intercept Coefficient is a0 in equation 1.
- Standard error = (Residual MS using only the Coefficient for that row)^0.5.
- t Stat = Coefficient for that variable / Standard error for that variable.
- P-value = TDIST(|t Stat|, Residual df, 2) = the Student's t distribution two-tailed probability. If one divides this by 2, it is the probability that the true value of the coefficient has the opposite sign to that found. You want this probability to be small, certainly less than 0.1 (10%), in order to be sure that this variable really influences y. If it is near 50%, try another fit with this variable left out.
- There is a 95% probability that the true value of the coefficient lies between the Lower 95% and Upper 95% values. The probability is 2.5% that it lies below the lower value, and 2.5% that it lies above. The narrower this range the better. If the lower value is negative and the upper value positive, try correlating the data with this variable left out. If the resulting R² and Significance F are either improved or little changed, then the data do not support inclusion of that variable in your correlating equation.
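To connect these definitions with the regression example of section 2.1.3, the sketch below (an illustration in Python/scipy rather than Excel) recomputes Significance F and one coefficient P-value from the SS, df and t Stat quantities reported there; the values should agree with the Excel output.

```python
from scipy import stats

# Quantities taken from the regression example above (Excel's notation).
regression_SS, residual_SS = 27997.457, 24.257
regression_df, residual_df = 2, 11

regression_MS = regression_SS / regression_df
residual_MS   = residual_SS / residual_df
F = regression_MS / residual_MS                      # ~ 6348, as in the Excel output

# Significance F = FDIST(Regression F, Regression df, Residual df).
significance_F = stats.f.sf(F, regression_df, residual_df)

# P-value = TDIST(|t Stat|, Residual df, 2); the t Stat of X Variable 1 is 1.692,
# which should give p ~ 0.119 as in the example.
p_value = 2 * stats.t.sf(abs(1.692), residual_df)

print(f"F = {F:.1f}, Significance F = {significance_F:.2e}, p-value = {p_value:.3f}")
```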
Example of regression analysis.
1. Input data and invoking the Microsoft Excel Data Analysis Tools (the Analysis ToolPak).
Figure: Representation of the experimental data and the settings of the regression analysis task.
2. Example of regression analysis results.
Figure: Example of output data (results of regression analysis) using Excel.
As the result, a linear model is obtained in the form:
y = a0 + a1 x1 + a2 x2 + a3 x1*x2 = 15 + 0.725 x1 + 0.625 x2 + 0.275 x1*x2.
The accuracy of the model parameters was assessed with confidence intervals. The confidence intervals for the parameters of the regression model are (with a 95% degree of confidence):
14.455 <= a0 <= 15.545
0.058 <= a1 <= 1.392
-0.042 <= a2 <= 1.292
-0.392 <= a3 <= 0.942.
Questions related to Homework 2
- Choose a suitable model for describing the experimental results. Present the model obtained as the result of the regression analysis.
- Assess the adequacy of the chosen model for describing the experimental results. In what way would it make sense to develop the model further?
- Assess the accuracy of the obtained model (of its parameters).
2.2. Design of Experiments (DOE).
There is a world of difference between data and information. To extract information from data
you have to make assumptions about the system that generated the data. Using these
assumptions and physical theory you may be able to develop a mathematical model of the
system. Generally, even rigorously formulated models have some unknown constants.
The goal of experimentation is to acquire data that enable you to estimate these constants. But
why do you need to experiment at all? You could instrument the system you want to study
and just let it run. Sooner or later you would have all the data you could use. In fact, this is a
fairly common approach. There are three characteristics of historical data that pose problems for statistical modelling:
- Suppose you observe a change in the operating variables of a system followed by a change in the outputs of the system. That does not necessarily mean that the change in the system caused the change in the outputs.
- A common assumption in statistical modelling is that the observations are independent of each other. This is not the way a system in normal operation works.
- Controlling a system in operation often means changing system variables in tandem. But if two variables change together, it is impossible to separate their effects mathematically.
Designed experiments directly address these problems. The overwhelming advantage of a
designed experiment is that you actively manipulate the system you are studying.
With Design of Experiments (DOE) you may generate fewer data points than by using
passive instrumentation, but the quality of the information you get will be higher.
A strategy for design of experiments was first introduced in the early 1920s when a scientist
at a small agricultural research station in England, Sir Ronald Fisher, showed how one could
conduct valid experiments in the presence of many naturally fluctuating conditions such as
temperature, soil condition, and rainfall. In the past decade, the application of DOE has
gained acceptance as an essential tool for improving the quality of goods and services. This
recognition is partially due to the work of Genichi Taguchi, a Japanese quality expert, who
promoted the use of DOE in designing robust products--those relatively insensitive to
environmental fluctuations.
Well-planned experimentation can provide data that will enable device manufacturers to
identify the causes of performance variations and to eliminate or reduce such variations by
controlling key process parameters, thereby improving product quality.
Properly designed and executed experiments will generate more-precise data while using
substantially fewer experimental runs than alternative approaches. They will lead to results
that can be interpreted using relatively simple statistical techniques, in contrast to the
information gathered in observational studies, which can be exceedingly difficult to interpret.
Design of Experiments provides essential methods and strategies for persons having
responsibility for improving current processes or products or charged with developing new
processes or products. Process or product development engineers or managers, manufacturing
engineers and managers, research scientists, engineers, or managers, and technical personnel
in quality assurance or engineering will benefit personally from this technique. These
individuals will be better prepared to plan, collect, and interpret data from experiments
designed to provide real-world knowledge for gaining a competitive edge.
Objectives
- To understand the role that designed experiments can play in process improvement.
- To develop a background for pursuing further studies in experimental design. Continuing studies in designed experiments enable participants to increase their value to the organization.
- To learn from faculty who have themselves conducted experiments in industry. This course teaches practical application, not theory.
2.2.1. Experiment Planning Strategy. Concepts
Studies, often called process characterization, can be done by varying the key elements of the
process (i.e., equipment, materials, and input parameters, and so forth) and determining which
sources of variation have the most impact on process performance.
The process should also be challenged to discover how outputs change as process variables
fluctuate within allowable limits. This testing is essential to learning what steps must be taken
to protect the process if worst-case conditions for input variables ever occur during actual
manufacturing operations. Once again, an effective method for studying various combinations
of variables is DOE. In particular, simple two-level factorial and fractional factorial designs
are useful techniques for worst-case-scenario studies.
In 1950, Gertrude Mary Cox and William Cochran published the book Experimental Design
which became the major reference work on the design of experiments for statisticians for
years afterwards.
Statistics
Statistics is a type of data analysis whose practice includes the planning, summarizing, and interpreting of experiments (observations) of a system, possibly followed by predicting or forecasting future events based on a mathematical model of the system being observed.
Statistics is a branch of applied mathematics specifically in the area of statistical theory which
uses probability theory in the mathematical models.
The basic tenet of statistics is that a population can be represented by a sample of the
population when the sample is sufficiently large and when the sample is composed of a
random selection of units from the population. Statistical theory provides methods for
determining how large a sample is needed to provide for statistically significant results.
Randomization
Randomization is a core principle in the statistical theory of design of experiments. Its use
was extensively promoted by R.A. Fisher in his book Statistical Methods for Research
Workers. Randomization involves randomly allocating the experimental units across the
treatment groups. Randomization is not haphazard; it serves a purpose in both frequentist and
Bayesian statistics. By randomizing the order in which the test combinations are run,
researchers can eliminate the effects of unknown trending variables on the results of the
experiment. A frequentist would say that randomization reduces bias by equalising other
factors that have not been explicitly accounted for in the experimental design. Considerations
of bias are of little concern to Bayesians, who recommend randomization because it produces
ignorable designs. In design of experiments, frequentists prefer Completely Randomized
Designs. Other experimental designs are used when a full randomization is not possible.
Randomization is also required in the generation of random numbers for scientific research
and cryptography. Random number generators are sometimes used for these purposes.
Blocking
In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) that are similar to one another. This reduces sources of variability and thus leads to greater precision. Suppose we have invented a process which we believe makes the soles of shoes last longer, and we wish to conduct a field trial. Blocking can be used to prevent the experimental results from being influenced by variations from batch to batch, machine to machine, day to day, or shift to shift.
Orthogonality
In mathematics, orthogonal is synonymous with perpendicular when used as a simple adjective that is not part of any longer phrase with a standard definition. It comes from the Greek "ortho", meaning "right", and "gonia", meaning "angle". Two streets that cross each other at a right angle are orthogonal to each other. Two vectors in an inner product space are orthogonal if their inner product is zero. If the vectors are x and y, this is written x ⊥ y, i.e. ⟨x, y⟩ = 0. The word normal is sometimes also used for this concept by mathematicians, although that word is rather overburdened.
Two subspaces are called orthogonal if each vector in one is orthogonal to each vector in the
other. Note however that this does not correspond with the geometric concept of
perpendicular planes. The largest subspace that is orthogonal to a given subspace is its
orthogonal complement.
Orthogonality is a system design property which enables the making of complex designs
feasible and compact. The aim of an orthogonal design is to guarantee that operations within
one of its components neither create nor propagate side-effects to other components. For
example a car has orthogonal components and controls, e.g. accelerating the vehicle does not
influence anything else but the components involved in the acceleration. On the other hand, a car with a non-orthogonal design might have, for example, the acceleration influencing the radio tuning or the display of time. Consequently, this usage is seen to be derived from the use of orthogonal in mathematics: one may project a vector onto a subspace by projecting it onto each member of a set of basis vectors separately and adding the projections, if and only if the basis vectors are mutually orthogonal.
Orthogonality guarantees that modifying the technical effect produced by a component of a system neither creates nor propagates side effects to other components of the system. The emergent behaviour of a system consisting of components should be controlled strictly by formal definitions of its logic and not by side effects resulting from poor integration, i.e. non-orthogonal design of modules and interfaces. Orthogonality reduces test and development time, because it is easier to verify designs that neither cause side effects nor depend on them.
Experiments
In the scientific method, an experiment is a set of actions and observations, performed to
verify or falsify a hypothesis or research a causal relationship between phenomena. The
experiment is a cornerstone in empirical approach to knowledge.
Design of experiments attempts to balance the requirements and limitations of the field of
science in which one works so that the experiment can provide the best conclusion about the
hypothesis being tested.
In some sciences, such as physics and chemistry, it is relatively easy to meet the requirements
that all measurements be made objectively, and that all conditions can be kept controlled
across experimental trials. On the other hand, in other cases such as biology, and medicine, it
is often hard to ensure that the conditions of an experiment be performed consistently; and in
the social sciences, it may even be difficult to determine a method for measuring the
outcomes of an experiment in an objective manner.
For this reason, sciences such as physics are often referred to as "hard sciences", while others
such as sociology are referred to as "soft sciences"; in an attempt to capture the idea that
objective measurements are often far easier in the former, and far more difficult in the latter.
As a result of these considerations, experimental design in the "hard" sciences tends to focus
on the elimination of extraneous effects (type of flour, impurities in the water); while
experimental design in the "soft" sciences focuses more on the problems of external validity,
often through the use of statistical methods. Occasionally events occur naturally from which scientific evidence can be drawn; these are called natural experiments. In such cases the problem of the scientist is to evaluate the natural "design".
Controlled experiments. Many hypotheses in sciences such as physics can establish
causality by noting that, until some phenomenon occurs, nothing happens; then when the
phenomenon occurs, a second phenomenon is observed. But often in science, this situation is
difficult to obtain.
A controlled experiment generally compares the results obtained from an experimental
sample against a control sample, which is practically identical to the experimental sample
except for the one aspect whose effect is being tested.
In many laboratory experiments it is good practice to have several replicate samples for the
test being performed and have both a positive control and a negative control. The results from
replicate samples can often be averaged, or if one of the replicates is obviously inconsistent with the results from the other samples, it can be discarded as being the result of some experimental error (some step of the test procedure may have been mistakenly omitted for that
sample). Most often, tests are done in duplicate or triplicate. A positive control is a procedure
that is very similar to the actual experimental test but which is known from previous
experience to give a positive result. A negative control is known to give a negative result. The
positive control confirms that the basic conditions of the experiment were able to produce a
positive result, even if none of the actual experimental samples produce a positive result. The
negative control demonstrates the base-line result obtained when a test does not produce a
measurable positive result; often the value of the negative control is treated as a "background"
value to be subtracted from the test sample results.
Controlled experiments can be particularly useful when it is difficult to exactly control all the
conditions in an experiment. The experiment begins by creating two or more sample groups
that are probabilistically equivalent, which means that measurements of traits should be
similar among the groups and that the groups should respond in the same manner if given the
same treatment. This equivalency is determined by statistical methods that take into account
the amount of variation between individuals and the number of individuals in each group. In
fields such as microbiology and chemistry, where there is very little variation between
individuals and the group size is easily in the millions, these statistical methods are often
bypassed and simply splitting a solution into equal parts is assumed to produce identical
sample groups.
Once equivalent groups have been formed, the experimenter tries to treat them identically
except for the one variable that he or she wishes to isolate.
Errors
An error has different meanings in different domains. Current meanings in some of those
domains are described below. The Latin word error meant "wandering" or "straying".
Statistics. An error is a difference between a computed, estimated, or measured value and the
true, specified, or theoretically correct value.
Experimental science. An error is a bound on the precision and accuracy of the result of a
measurement. These can be classified into two types: statistical error (see above) and
systematic error. Statistical error is caused by random (and therefore inherently
unpredictable) fluctuations in the measurement apparatus, whereas systematic error is caused
by an unknown but nonrandom fluctuation. If the cause of the systematic error can be
identified, then it can usually be eliminated.
Engineering. An error is a difference between desired and actual performance. Engineers often seek to
design systems in such a way as to mitigate or preferably avoid the effects of error. One type
of error is human error. Human factors engineering is often applied to designs in an attempt to
minimize this type of error by making systems more forgiving or error tolerant. Errors in a
system can also be latent design errors that may go unnoticed for years, until the right set of
circumstances arises that cause them to become active.
2.3. Traditional vs. Factorial Designs
One traditional method of experimentation is to evaluate only one variable (or factor) at a time: all of the variables are held constant during test runs except the one being studied. This type of experiment reveals the effect of the chosen variable under set conditions; it does not
show what would happen if the other variables also changed.
It was Fisher's idea that it was much better to vary all the factors at once using a factorial
design, in which experiments are run for all combinations of levels for all of the factors. With
such a study design, testing will reveal what the effect of one variable would be when the
other factors are changing.
The advantage of the factorial design is also its efficiency. Montgomery has shown that this
relative efficiency of the factorial experiments increases as the number of variables increases.
In other words, the effort saved by such internal replication becomes even more dramatic as
more factors are added to an experiment.
Two level Factorial Design
A factorial experiment is a statistical study in which each observation is categorised
according to more than one factor. Such an experiment allows studying the effect of each
factor on the response variable, while requiring fewer observations than by conducting
separate experiments for each factor independently. It also allows studying the effect of the
interaction between factors on the response variable.
A two-factor, two-level factorial design is normally set up by building a table using minus
signs to show the low levels of the factors and plus signs to show the high levels of the
factors. The table and figure below show a factorial design for the application example. The simplest factorial experiment is the 2² factorial experiment, so named because it considers two levels for each of two factors, producing 2² = 4 factorial points. Suppose an engineer wishes to study the total power used by each of two different motors, A and B, running at each of two different speeds, 2000 or 3000 RPM. The factorial experiment would consist of four points: motor A at 2000 RPM, motor B at 2000 RPM, motor A at 3000 RPM, and motor B at 3000 RPM.
  Run no.   Level of factor    X1*X2   Values of parameters   Response
            X1       X2                X1       X2
  1         +1       +1        +1      3000     3000           15.8
  2         -1       +1        -1      2000     3000           14.7
  3         +1       -1        -1      3000     2000           14.5
  4         -1       -1        +1      2000     2000           14.5
The first column in the table of the experimental plan shows the run number for the four possible runs. The next two columns show the level of each main factor, X1 and X2, in each run,
and the fourth column shows the resulting level of the interaction between these factors,
which is found by multiplying their coded levels (-1 or +1). Columns 5 and 6 show the actual
values assigned to the low and high variable levels in the design. Test runs using each of these
four combinations constitute the experiment. The last column contains the responses from the
experiment. Filling in this column requires the hard work of running each experiment and
then recording the result.
To use the standard plans, the following coding of the parameters is needed:
  x_i = +1 corresponds to X_i,max
  x_i = -1 corresponds to X_i,min
This coding corresponds to the following linear transformation of the initial parameters X_i:
  x_i = 2 (X_i - X_i,min) / (X_i,max - X_i,min) - 1
To save space, the points in a factorial experiment are often abbreviated as (−1, −1), (+1, −1), (−1, +1), and (+1, +1). The factorial points can also be abbreviated by (1), a, b, and ab, where the
presence of a letter indicates that the specified factor is at its high (or second) level and the
absence of a letter indicates that the specified factor is at its low (or first) level. So "a"
indicates that factor A is at its high setting, while all other factors are at their low (or first)
setting. (1) is used to indicate that all factors are at their lowest (or first) values.
Ideally, a factorial experiment should be replicated. Replication allows a researcher to
estimate experimental error. If the magnitude of experimental error is unknown (which
happens if it cannot be estimated), then a researcher cannot determine whether any effects or
interactions are statistically significant. If a design cannot be replicated, it may be possible to
get some estimate of experimental error by invoking the effect sparsity principle and/or the
main effects principle. However, these principles usually don't apply unless several factors are
being considered in the same experiment. The experimental runs in a factorial experiment
should also be randomized. Randomization attempts to reduce the impact that bias could have
on the experimental results.
In some studies there may be more than two important variables. The three-factor, two-level design is shown in the figure. With three two-level factors, eight experiments will be required,
and there will be four replicates of each level of each factor, further increasing the precision
of the result. There will be three two-factor interactions and a three-factor interaction to
evaluate. Usually, interactions involving three or more factors are not important and can be
disregarded.
Depending on the purpose, different experimental designs can be used: a two-level composite design with a centre point, a three-level design with a centre point, and others.
Figure: Composite design with a centre point (central composite designs)
Figure: A Box-Behnken design for three factors. (Douglas C. Montgomery. Design and Analysis of Experiments, 2001)
Figure: A face-centred central composite design for k = 3. (Douglas C. Montgomery. Design and Analysis of Experiments, 2001)
Calculation of Effects
Calculation of the Main Effects. With a factorial design, the average main effect of
changing thrombin level from low to high can be calculated as the average response at the
high level minus the average response at the low level:
  Main effect of X_i = (Σ of responses at high X_i − Σ of responses at low X_i) / (half the number of runs in the experiment)
The fact that these effects have a positive value indicates that the response (i.e., the
coagulation rate) increases as the variables increase. The larger the magnitude of the effect,
the more critical the variable.
Estimate of the Interaction. A factorial design makes it possible not only to determine the
main effects of each variable, but also to estimate the interaction (i.e., synergistic effect)
between the two factors, a calculation that is impossible with the one-factor-at-a-time experiment design. Interaction is a kind of action which occurs as two or more objects have
an effect upon one another. The idea of a two-way effect is essential in the concept of
interaction instead of a one-way causal effect. Combinations of many simple interactions can
lead to surprising emergent phenomena. It has different tailored meanings in various sciences.
The interaction effect is the average difference between the effect of thrombin at the high level of ion concentration and the effect of thrombin at the low level of ion concentration, i.e. half the difference between these two conditional effects. As in a two-factor experiment, the average effect of each factor can be calculated by subtracting the average response at the low level from the average response at the high level.
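The thrombin/ion-concentration example referred to in the text is not tabulated here, so the sketch below (a minimal illustration, not the author's worked example) applies the same main-effect and interaction formulas to the motor-speed 2² table given earlier, using its coded levels and responses.

```python
import numpy as np

# Coded design and responses from the 2^2 factorial example table above.
x1 = np.array([+1, -1, +1, -1])
x2 = np.array([+1, +1, -1, -1])
y  = np.array([15.8, 14.7, 14.5, 14.5])

half_runs = len(y) / 2

def effect(sign_column, response):
    """(sum of responses at the high level - sum at the low level) / (n/2)."""
    return (response[sign_column == +1].sum() - response[sign_column == -1].sum()) / half_runs

print("main effect of X1:", effect(x1, y))        # 0.55
print("main effect of X2:", effect(x2, y))        # 0.75
print("interaction X1*X2:", effect(x1 * x2, y))   # 0.55
```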
Adding Centre Points
Designs with factors that are set at two levels implicitly assume that the effect of the factors on the dependent variable of interest (e.g., fabric strength) is linear. It is impossible to test whether or not there is a non-linear (e.g., quadratic) component in the relationship between a factor A and a dependent variable y if A is only evaluated at two points (i.e., at the low and high settings). If one suspects that the relationship between the factors in the design and the dependent variable is curvilinear, then one should include one or more runs where all (continuous) factors are set at their midpoint. Such runs are called centre-point runs, since they are, in a sense, in the centre of the design (see graph).
Central Composite Designs
As pointed out before, in order to estimate the second order, quadratic, or non-linear
component of the relationship between a factor and the dependent variable, one needs at least
3 levels for the respective factors. What does the information function look like for a simple
3-by-3 factorial design, for the second-order quadratic model?
The first four runs in this design are the previous 2-by-2 factorial design points (or square
points or cube points); runs 5 through 8 are the so-called star points or axial points, and runs
9 and 10 are centre points.
The information function for this design for the second-order (quadratic) model is rotatable,
that is, it is constant on the circles around the origin.
The two design characteristics discussed so far -- orthogonality and rotatability -- depend on
the number of centre points in the design and on the so-called axial distance (alpha), which
is the distance of the star points from the centre of the design (i.e., 1.414 in the design shown
above). It can be shown that a design is rotatable if:
α = (nc)^¼
where nc stands for the number of cube points in the design (i.e., points in the factorial portion
of the design).
A central composite design is orthogonal, if one chooses the axial distance so that:
α = {[(nc + ns + n0)^½ − nc^½]² · nc/4}^¼
where
nc is the number of cube points in the design
ns is the number of star points in the design
n0 is the number of centre points in the design
To make a design both (approximately) orthogonal and rotatable, one would first choose the
axial distance for rotatability, and then add centre points, so that:
n0 ≈ 4·nc^½ + 4 − 2k
where k stands for the number of factors in the design.
Finally, if blocking is involved, the following formula gives the axial distance that achieves orthogonal blocking and, in most cases, also reasonable information function contours, that is, contours that are close to spherical:
α = [k·(1 + ns0/ns) / (1 + nc0/nc)]^½
where
ns0 is the number of centre points in the star portion of the design
ns is the number of non-centre star points in the design
nc0 is the number of centre points in the cube portion of the design
nc is the number of non-centre cube points in the design
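A small MATLAB sketch of how these relations might be applied to a k = 3 central composite design with a full factorial cube portion (the run counts are the standard ones; the numbers serve only as an illustration of the formulas above):
k  = 3;                      % number of factors
nc = 2^k;                    % cube (factorial) points
ns = 2*k;                    % star (axial) points
alpha_rot = nc^(1/4)         % axial distance for rotatability (about 1.682; 1.414 for k = 2)
n0 = round(4*sqrt(nc) + 4 - 2*k)                            % approximate number of centre points
alpha_orth = ((sqrt(nc+ns+n0) - sqrt(nc))^2 * nc/4)^(1/4)   % axial distance for an orthogonal design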
2.4. Fractional Factorial Designs
One disadvantage of two-level factorial designs is that the number of treatment combinations
in a factorial design increases rapidly with an increase in the number of factors. Thus, if n = 4,
2^n = 16, but if n = 8, 2^n = 256. Since engineering experimentation can easily involve 6 to 10
factors, the number of experiments required can rapidly become prohibitive in cost.
Consider a 2^6 factorial experiment. There are:
 6 main effects
 15 first-order interactions
 20 second-order interactions
 15 third-order interactions
 6 fourth-order interactions
 1 fifth-order interaction
Since we usually are not interested in higher-order interactions, we would be collecting a great deal of extraneous information if we performed all 64 treatment combinations. In such cases a fractional factorial design, which uses only a fraction of the full factorial runs, may be used. Fortunately, because three-factor and
higher-order interactions are rarely important, such intensive efforts are seldom required. For
most purposes, it is only necessary to evaluate the main effects of each variable and the two-factor interactions, which can be done with only a fraction of the runs in a full factorial design. If some two-factor interactions are known to be impossible, the number of runs can be reduced further. For the earlier example of eight factors, one can create an efficient design that may require as few as 16 runs.
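The effect counts listed above for the 2^6 experiment are binomial coefficients and can be verified with a short MATLAB check (a sketch only, not part of any design procedure):
n = 6;                                      % number of two-level factors
runs = 2^n                                  % 64 treatment combinations
effects = arrayfun(@(k) nchoosek(n,k), 1:n) % [6 15 20 15 6 1]
% 6 main effects, 15 two-factor, 20 three-factor, 15 four-factor,
% 6 five-factor and 1 six-factor interaction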
However, the number of experimental runs required for three-level factorial designs can be
much greater than for their two-level counterparts. Factorial designs are therefore rather
unattractive if a researcher wishes to consider multiple levels. It may not be necessary to
consider more than two levels, however, if the factors are continuous in nature.
When the factors are continuous, two-level factorial designs assume that the effects are linear.
If a quadratic effect is expected for a factor, a more complicated experiment should be used,
such as the central composite design. Optimization of factors that could have quadratic effects
is the primary goal of response surface methodology.
2.5. Neural Networks
An artificial neural network (ANN), often just called a "neural network" (NN), is a
mathematical model or computational model based on biological neural networks. It consists
of an interconnected group of artificial neurons and processes information using a
connectionist approach to computation. In most cases an ANN is an adaptive system that
changes its structure based on external or internal information that flows through the network
during the learning phase.
In more practical terms neural networks are non-linear statistical data modeling tools.
They can be used to model complex relationships between inputs and outputs or to find
patterns in data.
Neural networks are composed of simple elements operating in parallel. These elements are
inspired by biological nervous systems. As in nature, the network function is determined
largely by the connections between elements. We can train a neural network to perform a
particular function by adjusting the values of the connections (weights) between elements.
Commonly neural networks are adjusted, or trained, so that a particular input leads to a
specific target output. Such a situation is shown below. There, the network is adjusted, based
on a comparison of the output and the target, until the network output matches the target.
Typically many such input/target pairs are used, in this supervised learning, to train a
network.
Batch training of a network proceeds by making weight and bias changes based on an entire
set (batch) of input vectors. Incremental training changes the weights and biases of a network
as needed after presentation of each individual input vector. Incremental training is sometimes
referred to as “on line” or “adaptive” training.
Neural networks have been trained to perform complex functions in various fields of
application including pattern recognition, identification, classification, speech, vision and
control systems. Today neural networks can be trained to solve problems that are difficult for
conventional computers or human beings. Throughout the MATLAB Neural Network Toolbox, emphasis is placed on neural network paradigms that either build up to, or are themselves used in, engineering and other practical applications.
The supervised training methods are commonly used, but other networks can be obtained
from unsupervised training techniques or from direct design methods. Unsupervised networks
can be used, for instance, to identify groups of data. Certain kinds of linear networks and Hopfield networks are designed directly rather than trained.
The field of neural networks has a history of some five decades but has found solid
application only in the past fifteen years, and the field is still developing rapidly. Thus, it is
distinctly different from the fields of control systems or optimization where the terminology,
basic mathematics, and design procedures have been firmly established and applied for many
years. Neural Network Toolbox of MATLAB will be a useful tool for industry, education and
research, a tool that will help users find what works and what doesn’t, and a tool that will help
develop and extend the field of neural networks.
2.5.1. Neural Network Applications
Industrial
•Neural networks are being trained to predict the output of industrial processes. They then
replace complex and costly equipment used for this purpose in the past.
Manufacturing
•Manufacturing process control, product design and analysis, process and machine diagnosis,
real-time particle identification, visual quality inspection systems, beer testing, welding
quality analysis, paper quality prediction, computer-chip quality analysis, analysis of grinding
operations, chemical product design analysis, machine maintenance analysis, project bidding,
planning and management, dynamic modelling of chemical process systems
Oil and Gas
•Exploration
Robotics
•Trajectory control, forklift robot, manipulator controllers, vision systems
•Truck brake diagnosis systems, vehicle scheduling, routing systems
2.5.2. Neuron Model
Simple Neuron
A neuron with a single scalar input and no bias appears on the left below.
The scalar input p is transmitted through a connection that multiplies its strength by the scalar
weight w, to form the product wp, again a scalar. Here the weighted input wp is the only
argument of the transfer function f, which produces the scalar output a. The neuron on the
right has a scalar bias, b. You may view the bias as simply being added to the product wp as
shown by the summing junction or as shifting the function f to the left by an amount b. The
bias is much like a weight, except that it has a constant input of 1.
The transfer function net input n, again a scalar, is the sum of the weighted input wp and the
bias b. This sum is the argument of the transfer function f.
Here f is a transfer function, typically a step function or a sigmoid function, which takes the
argument n and produces the output a. Examples of various transfer functions are given in the
next section. Note that w and b are both adjustable scalar parameters of the neuron. The
central idea of neural networks is that such parameters can be adjusted so that the network
exhibits some desired or interesting behavior. Thus, we can train the network to do a
particular job by adjusting the weight or bias parameters, or perhaps the network itself will
adjust these parameters to achieve some desired end.
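A minimal MATLAB sketch of this scalar neuron, with illustrative values for the weight, bias and input (the log-sigmoid is written out explicitly as the transfer function f):
w = 1.5;  b = -0.8;  p = 2.0;   % illustrative weight, bias and input
n = w*p + b;                    % net input
a = 1/(1 + exp(-n))             % output through a log-sigmoid transfer function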
Transfer Functions
Many transfer functions are used to build a model. Three of the most commonly used
functions are shown below.
The hard-limit transfer function shown above limits the output of the neuron to either 0, if the
net input argument n is less than 0; or 1, if n is greater than or equal to 0.
The linear transfer function is shown below.
Neurons of this type are used as linear approximators in “Linear Filters”.
The sigmoid transfer function shown below takes the input, which may have
any value between plus and minus infinity, and squashes the output into the
range 0 to 1. This transfer function is commonly used in backpropagation networks, in part
because it is differentiable.
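The three transfer functions mentioned above can be written as one-line MATLAB functions; the sample net input below is chosen only for illustration:
hardlim_f = @(n) double(n >= 0);      % hard limit: 0 if n < 0, 1 if n >= 0
purelin_f = @(n) n;                   % linear: output equals net input
logsig_f  = @(n) 1./(1 + exp(-n));    % log-sigmoid: squashes any n into (0,1)
n = -0.3;                             % a sample net input
[hardlim_f(n) purelin_f(n) logsig_f(n)]   % compare the three outputs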
Neuron with Vector Input
A neuron with a single R-element input vector is shown below. Here the individual element
inputs are multiplied by weights and the weighted values are fed to the summing junction.
Their sum is simply Wp, the dot product of the (single row) matrix W and the vector p.
The inputs are p1, p2, ..., pR and the corresponding weights are w1,1, w1,2, ..., w1,R. The neuron has a bias b, which is summed with the weighted inputs to form the net input n = w1,1·p1 + w1,2·p2 + ... + w1,R·pR + b. This sum, n, is the argument of the transfer function f.
The figure of a single neuron shown above contains a lot of detail. When we consider
networks with many neurons and perhaps layers of many neurons, there is so much detail that
the main thoughts tend to be lost. Thus, the authors have devised an abbreviated notation for
an individual neuron. This notation, which will be used later in circuits of multiple neurons, is
illustrated in the diagram shown below.
Here the input vector p is represented by the solid dark vertical bar at the left. The dimensions
of p are shown below the symbol p in the figure as Rx1. (Note that we will use a capital letter,
such as R in the previous sentence, when referring to the size of a vector.) Thus, p is a vector
of R input elements. These inputs post multiply the single row, R column matrix W. As
before, a constant 1 enters the neuron as an input and is multiplied by a scalar bias b. The net
input to the transfer function f is n, the sum of the bias b and the product Wp.
This sum is passed to the transfer function f to get the neuron’s output a, which in this case is
a scalar. Note that if we had more than one neuron, the network output would be a vector.
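The same computation for a neuron with an R-element input vector is a single matrix expression in MATLAB; the numbers below are again made up for illustration:
W = [0.4 -1.2 0.7];             % single-row weight matrix, R = 3 inputs (illustrative)
p = [1; 0; 2];                  % R-element input vector
b = 0.1;                        % scalar bias
n = W*p + b;                    % net input: dot product Wp plus bias
a = 1/(1 + exp(-n))             % scalar output through a log-sigmoid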
A layer of a network is defined in the figure shown above. A layer includes the combination
of the weights, the multiplication and summing operation (here realized as a vector product
Wp), the bias b, and the transfer function f. The array of inputs, vector p, is not included in or
called a layer.
As discussed previously, when a specific transfer function is to be used in a figure, the symbol
for that transfer function will replace the f shown above. Here are some examples.
Network Architectures
Two or more of the neurons shown earlier can be combined in a layer, and a particular
network could contain one or more such layers. First consider a single layer of neurons.
A Layer of Neurons
A one-layer network with R input elements and S neurons follows.
In this network, each element of the input vector p is connected to each neuron input through
the weight matrix W. The ith neuron has a summer that gathers its weighted inputs and bias to
form its own scalar output n(i). The various n(i) taken together form an S-element net input
vector n. Finally, the neuron layer outputs form a column vector a. We show the expression
for a at the bottom of the figure.
Multiple Layers of Neurons
A network can have several layers. Each layer has a weight matrix W, a bias vector b, and an
output vector a. To distinguish between the weight matrices, output vectors, etc., for each of
these layers in our figures, we append the number of the layer as a superscript to the variable
of interest. You can see the use of this layer notation in the three-layer network shown below,
and in the equations at the bottom of the figure.
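With this layer notation, a forward pass through a three-layer network can be sketched as follows; the layer sizes and random weights are arbitrary, chosen only to show the layer-by-layer bookkeeping:
logsig_f = @(n) 1./(1 + exp(-n));       % log-sigmoid transfer function
p  = [1; -0.5; 2];                      % R = 3 input elements (illustrative)
W1 = randn(4,3);  b1 = randn(4,1);      % layer 1: 4 neurons
W2 = randn(2,4);  b2 = randn(2,1);      % layer 2: 2 neurons
W3 = randn(1,2);  b3 = randn(1,1);      % layer 3 (output): 1 neuron
a1 = logsig_f(W1*p  + b1);              % a1 = f1(W1*p  + b1)
a2 = logsig_f(W2*a1 + b2);              % a2 = f2(W2*a1 + b2)
a3 = W3*a2 + b3                         % a3 = f3(W3*a2 + b3), here a linear output layer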
Home Assignment 3. Building a neural network model.
Example. Use of the Neural Network Toolbox of MATLAB
An ANN model with appropriate training was applied to the problem of selecting between different technology options. Using the selection parameters listed below (Table 2.12.1), a separate ANN was trained for each technology (e.g., vacuum forming processes, acrylic trimming technologies and reinforcement). The observation data used for training are illustrated in Table 2.12.2. For the vacuum forming technology there are three options: {0 - not usable, 1 - reverse draw forming with two heaters, 2 - straight vacuum forming}.
Table 2.12.1. Selection parameters for vacuum forming processes
Parameter and mark: Description
Dimensions (L and B): L x B; 280x430, 680x760 mm up to 2000x1000 mm
Max depth of draw (H): 183, 220, 300 mm up to 800 mm
Max material thickness (D): 3.2 mm, 4 mm, 6 mm, 7 mm
Undercuts (UC): yes/no
…
Draft angle (α): α > 5˚
Surface quality (Q): low, medium, high
Batch size (N): 1 <= N <= 10000 (0 <= log N <= 4)
…
Wall thickness after forming (h): 0.7 < h < 3 mm
Heating temperature (Th): 180˚C <= Th <= 220˚C
Cooling time (C): 3 < C < 7 min
Heating zones (Z): 1 < Z < 4
Cooling points (P): 2 <= P <= 5
Table 2.12.2. Vacuum forming training mode
Sample  Vacuum forming  Geom  Log(nP)  Dim  Thick  SQ  PT  UC  I
1       1               1     2        1    0      2   1   2   2
2       2               2     2        2    1      2   2   2   1
…       …               …     …        …    …      …   …   …   …
20      2               1     2        2    1      2   1   2   1
Where:
 Geom is the geometric complexity;
 Log(nP) is the logarithm of the number of parts;
 Dim is the dimension of vacuum forming bench table;
 Thick is maximal material thickness;
 SQ is surface quality;
 PT is part texture;
 UC is undercuts;
 I is investments.
The acceptability of the model was estimated by its accuracy on the training data (the training goal was that the total error estimate must be less than 0.000001). In this task the mathematical model of a neuron can be represented as:
vi = g2( w2,1 · g1( w1,1 · l + e1 ) + e2 ),
where:
 vi is the output activation of the unit for the input vector l;
 w1,1 and w2,1 are numeric weight matrices;
 e1 and e2 are bias vectors;
 g1 and g2 are the activation functions of the units (two log-sigmoid functions were used as activation functions).
The proposed model's parameters, the weight matrices w1,1 and w2,1 and the bias vectors e1 and e2 of the two layers, take the numeric values produced by the MATLAB training run.
Figure 2.12.1 shows the training curve obtained with MATLAB: the reduction in error over several epochs of training for the selection between reverse draw vacuum forming and straight forming.
Figure 2.12.1. Training curve showing the reduction in error over several epochs of training for selection of reverse draw vacuum forming or straight forming
Program example (a new example is needed):
% Reinforcement: yes, no
% Input data: 6 selection parameters (rows) for 20 sample parts (columns)
p=[3 3 1 2 3 2 1 3 3 2 2 1 2 1 1 2 3 2 1 2; ...
   2 1 1 0 2 2 2 0 2 0 1 1 1 0 2 2 1 2 2 0; ...
   2 3 2 2 1 2 3 3 1 1 1 0 0 2 3 2 3 1 1 2; ...
   2 0 1 2 2 0 1 0 1 1 2 2 2 0 0 2 0 2 1 0; ...
   2 3 2 1 1 2 3 3 1 2 2 3 2 1 2 3 2 1 2 3; ...
   2 2 2 1 1 1 1 2 2 1 1 2 2 2 1 2 1 2 1 1]
% Target output for each sample (reinforcement: 1 = yes, 0 = no)
t=[1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 0]
% Define the neural network: 4 log-sigmoid hidden neurons, 1 log-sigmoid output neuron,
% trained with the Levenberg-Marquardt algorithm
net=newff(minmax(p),[4,1],{'logsig','logsig'},'trainlm');
% Set the training parameters
net.trainParam.show=100;
net.trainParam.lr=0.05;
net.trainParam.epochs=3000;
net.trainParam.goal=0.00001;
[net,tr]=train(net,p,t);
% Check the solution on the training data
a=round(sim(net,p))
% Display the trained parameters
bias1=net.b{1}
bias2=net.b{2}
weights1=net.IW{1,1}
weights2=net.LW{2,1}
p =
Columns 1 through 12
3 3 1 2 3 2 1 3 3 2 2 1
2 1 1 0 2 2 2 0 2 0 1 1
2 3 2 2 1 2 3 3 1 1 1 0
2 0 1 2 2 0 1 0 1 1 2 2
2 3 2 1 1 2 3 3 1 2 2 3
2 2 2 1 1 1 1 2 2 1 1 2
Columns 13 through 20
2 1 1 2 3 2 1 2
1 0 2 2 1 2 2 0
0 2 3 2 3 1 1 2
2 0 0 2 0 2 1 0
2 1 2 3 2 1 2 3
2 2 1 2 1 2 1 1
t =
Columns 1 through 12
1 0 0 0 0 1 0 0 0 0 0 0
Columns 13 through 20
0 0 1 1 0 1 0 0
TRAINLM-calcjx, Epoch 0/3000, MSE 0.21925/1e-005, Gradient 0.0881321/1e-010
TRAINLM-calcjx, Epoch 12/3000, MSE 2.20172e-006/1e-005, Gradient 3.15234e-006/1e-010
TRAINLM, Performance goal met.
a =
Columns 1 through 12
1 0 0 0 0 1 0 0 0 0 0 0
Columns 13 through 20
0 0 1 1 0 1 0 0
bias1 =
-15.4059
1.9312
-5.2136
-9.0533
bias2 =
-4.4747
weights1 =
-2.3906  0.4411  1.9699  7.0688 -2.0334  5.1471
 3.8734  9.7168  2.5332 -5.6808 -3.6823 -4.1839
 3.6735 -4.5363 -5.3135  4.8883 -3.1428  3.4034
 1.3977 -4.8421  0.4030  8.2915  4.4874  2.7907
weights2 =
14.3645 10.4440 -1.4794 -12.7180
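The extracted weights and biases can be plugged back into the two-layer expression vi = g2(w2,1·g1(w1,1·l + e1) + e2) given earlier, as a check that the toolbox output is understood. A possible sketch, run after the training script above (it reuses weights1, weights2, bias1, bias2 and p from the workspace, and assumes the older newff syntax with no automatic input/output preprocessing):
% Manual forward pass with the trained parameters (should reproduce round(sim(net,p)))
logsig_f = @(n) 1./(1 + exp(-n));
N  = size(p,2);                                  % number of samples (20)
a1 = logsig_f(weights1*p + bias1*ones(1,N));     % hidden-layer activations, 4xN
a_manual = round(logsig_f(weights2*a1 + bias2*ones(1,N)))   % 1xN vector of decisions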
Questions related to Home Assignment 3
1. Present and justify the chosen model.
2. Assess how well the chosen model describes the experimental data.