Data Mining Applied to Fault Detection

Shinho Jeong, Jaewon Shim, Hyunsoo Lee
{cinooco, poohut, darth7}@icu.ac.kr
Digital Media Lab
Introduction

Aims of work
- Neural-network implementation of the non-linear PCA model, using the principal curve algorithm, to increase both the rapidity and the accuracy of fault detection.

Data mining?
- Extracting useful information from raw data using statistical methods and/or AI techniques.

Characteristics
- Maximum use of the data available.
- Rigorous theoretical knowledge not required.
- Efficient for a system with deviation between the actual process and a first-principles-based model.

Applications
- Process monitoring
  - Fault detection/diagnosis/isolation
- Process estimation
  - Soft sensor
Fault Detection?

(Figure: process trend showing the point of fault introduction.)
Issues

Major concerns
- Rapidity: the ability to detect a fault situation at an early stage after fault introduction.
- Accuracy: the ability to distinguish a fault situation from possible process variations.
- Trade-off problem between the two.

Solved through
- Frequent acquisition of process data.
- Derivation of an efficient process model through data analysis using data mining methodologies.
Inherent Problems

① Multicollinearity problem
- Due to high correlation among the variables.
- Likely to cause a redundancy problem.
- Derivation of new, uncorrelated feature variables required.

② Dimensionality problem
- Due to more variables than observations.
- Likely to cause an over-fitting problem in the model-building phase.
- Dimensionality reduction required.

③ Non-linearity problem
- Due to non-linear relations among the variables.
- Pre-determination of the degree of non-linearity required.
- Application of a non-linear model required.

④ Process dynamics problem
- Due to changes of the operating conditions with time.
- Likely to cause changes of the correlation structure among the variables.
Statistical Approach

Statistical data analysis

Univariate SPC
- Conventional Shewhart, CUSUM, EWMA charts, etc.
- Limitations: monitoring is performed for each process variable separately, which is inefficient for a multivariate system; in practice we are more concerned with how the variables co-vary.
- Hence the need for multivariate data analysis.

Multivariate SPC
- PCA: the most popular multivariate data analysis method, and the basis for regression models (PLS, PCR, etc.).
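As an illustration of the univariate charts listed above, here is a minimal EWMA control-chart sketch (parameters are illustrative: weight λ = 0.2, 3σ-style limits, with the in-control mean and standard deviation simply estimated from the first 50 samples):

```python
import numpy as np

def ewma_chart(x, lam=0.2, L=3.0, n_ref=50):
    """EWMA control chart: smooth x with weight lam and flag samples
    whose EWMA statistic leaves the steady-state control limits."""
    mu = x[:n_ref].mean()                        # in-control mean (first n_ref samples)
    sigma = x[:n_ref].std(ddof=1)                # in-control standard deviation
    z = np.empty(len(x))
    prev = mu
    for i, xi in enumerate(x):
        prev = lam * xi + (1.0 - lam) * prev     # z_i = lam*x_i + (1-lam)*z_{i-1}
        z[i] = prev
    width = L * sigma * np.sqrt(lam / (2.0 - lam))   # steady-state limit half-width
    return z, np.abs(z - mu) > width

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 100),       # in-control data
                    rng.normal(1.5, 1.0, 50)])       # mean shift at t = 100
z, flags = ewma_chart(x)
```

Because the EWMA accumulates evidence across samples, a sustained small shift trips the limit within a few observations, which is exactly the rapidity concern discussed earlier.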
Linear PCA(1)

Features
- Creation of new feature variables (principal components) through linear combinations of the original variables:
  - fewer variables => solves the 'dimensionality problem';
  - orthogonal variables => solves the 'multicollinearity problem'.
- Performs noise reduction additionally.
- Basis for PCR and PLS.

Limitation
- Linear model => inefficient for a non-linear process.
Linear PCA(2)

Theory

Let $x = [x_1 \; x_2 \; x_3 \; \cdots \; x_m]$ ~ original variables.

$\operatorname{Cov}(x) = \Sigma, \qquad \Sigma p_i = \lambda_i p_i \quad (i = 1, 2, 3, \dots, m)$

$t_i = x\,p_i \;\Rightarrow\; t = x P \;\Leftrightarrow\; x = t P' \quad (P \sim \text{orthonormal matrix})$

$x = \{t_1 p_1' + t_2 p_2' + t_3 p_3' + \cdots + t_l p_l'\} + \{t_{l+1} p_{l+1}' + \cdots + t_m p_m'\} = t_l P_l' + e_l = \hat{x} + e_l$

$G(x) = x P_l = t_l$ ~ encoding mapping
$F(t_l) = t_l P_l' = \hat{x}$ ~ decoding mapping

$\Rightarrow \hat{x} = f(x, P_l) = t_l P_l' = (x P_l) P_l' = F(G(x))$
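The encoding/decoding mappings above can be sketched in a few lines of NumPy (a minimal illustration, not the paper's implementation; the toy data are generated to lie near a 2-D subspace):

```python
import numpy as np

def fit_pca(X, l):
    """Fit linear PCA: eigendecomposition of the covariance matrix,
    keeping the l leading eigenvectors as columns of P_l."""
    Xc = X - X.mean(axis=0)                # mean-center the data
    cov = np.cov(Xc, rowvar=False)         # Cov(x) = Sigma
    vals, vecs = np.linalg.eigh(cov)       # Sigma p_i = lambda_i p_i
    order = np.argsort(vals)[::-1]         # sort by decreasing variance
    return vecs[:, order[:l]]              # P_l  (m x l, orthonormal columns)

def encode(Xc, Pl):
    return Xc @ Pl                         # G(x) = x P_l = t_l

def decode(Tl, Pl):
    return Tl @ Pl.T                       # F(t_l) = t_l P_l' = x_hat

rng = np.random.default_rng(1)
# 100 samples of 5 correlated variables lying near a 2-D subspace
Z = rng.normal(size=(100, 2))
X = Z @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)
Pl = fit_pca(X, 2)
Xhat = decode(encode(Xc, Pl), Pl)          # x_hat = F(G(x))
```

The residual `Xc - Xhat` corresponds to $e_l$ above and is small whenever the data really live near the retained subspace.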
Linear PCA(3)

ERM inductive principle

$R_{emp}(P_l) = \frac{1}{n} \sum_{i=1}^{n} \| x_i - \hat{x}_i \|^2, \quad \text{where } \hat{x}_i = F(G(x_i)) = (x_i P_l) P_l'$

Limitation
- $G(x_i)$, $F(t_l)$ ~ linear functions.

Alternatives
- Extension of the linear functions to non-linear ones using:
  - neural networks;
  - statistical methods.
Kramer’s Approach

(Figure: auto-associative network, x → input layer → mapping layer → bottleneck layer → de-mapping layer → output layer → reconstructed x.)

Limitations
- Difficult to train a network with 3 hidden layers.
- Difficult to determine the optimal number of hidden nodes.
- Difficult to interpret the meaning of the bottleneck layer.
Non-linear PCA(1)

Principal curve (Hastie & Stuetzle, 1989)
- A statistical, non-linear generalization of the first linear principal component.

Self-consistency principle

$\hat{x} = F(G(x)) = E\big( x \,\big|\, z = \arg\min_z \| F(z) - x \|^2 \big)$

① Projection step (Encoding)

$z = G(x) = \arg\min_z \| F(z) - x \|^2$

② Conditional averaging (Decoding)

$\hat{x} = F(z) = E(x \mid z)$
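The two steps can be illustrated on synthetic 2-D data (a minimal sketch: the curve F is discretized on a grid, the projection step is a nearest-point search, and the conditional average is a Gaussian-kernel estimate; the function names and the bandwidth h are illustrative, not from the paper):

```python
import numpy as np

def project(X, curve, params):
    """Projection step: z_i = argmin_z ||F(z) - x_i||^2,
    searched over the discretized curve points."""
    d = ((X[:, None, :] - curve[None, :, :]) ** 2).sum(-1)   # (n, k) distances
    return params[d.argmin(axis=1)]

def conditional_average(X, z, grid, h=0.3):
    """Decoding step: F(z) = E(x | z), estimated by a Gaussian-kernel
    average of the points whose projections fall near each grid value."""
    w = np.exp(-0.5 * ((grid[:, None] - z[None, :]) / h) ** 2)
    w /= w.sum(axis=1, keepdims=True)        # normalize weights per grid point
    return w @ X                             # (k, 2) updated curve points

rng = np.random.default_rng(2)
z_true = rng.uniform(-2, 2, 300)
X = np.c_[z_true, z_true ** 2] + 0.05 * rng.normal(size=(300, 2))  # noisy parabola

params = np.linspace(-2, 2, 81)
curve = np.c_[params, params ** 2]                 # discretized curve F
z = project(X, curve, params)                      # (1) projection / encoding
curve_new = conditional_average(X, z, params)      # (2) conditional averaging / decoding
```

In the full algorithm these two steps alternate until the curve is self-consistent; here a single pass is shown.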
Non-linear PCA(2)

Limitations
- Finiteness of data.
- Unknown density distribution.
- No a priori information about the data.

(Figure: Gaussian kernels with σ = 0.5, 1, 2, and 4.)

Additional considerations
② Conditional averaging => locally weighted regression, kernel regression.
- Increasing flexibility (decreasing span).
- Span: the fraction of the data considered to be in the neighborhood; ~ smoothness of fit, ~ generalization capacity.
Proposed Approach(1)

LPCA vs. NLPCA

(Figure: comparison of the linear and non-linear principal component fits.)
Proposed Approach(1)

Creation of non-linear principal scores

$x = F_1(z_1) + e_1, \quad \text{where } F_1(z_1) = C_1$

$e_{i-1} = F_i(z_i) + e_i, \quad \text{where } i = 1, 2, 3, \dots \text{ and } e_0 = x$

$x = F_1(z_1) + F_2(z_2) + \cdots + F_l(z_l) + e_l = \hat{x} + e_l$

$\Rightarrow z = [z_1, z_2, \dots, z_l]$ ~ non-linear principal scores
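A rough sketch of this sequential (deflation) procedure: each stage's score is taken here as the projection of the current residual onto its leading PCA direction, and each $F_i$ is approximated by a kernel conditional average of the residual against that score. This is an illustrative stand-in for full principal-curve fitting at each stage, and all parameter names are assumptions:

```python
import numpy as np

def fit_stage(E, grid_size=50, h=0.3):
    """One deflation stage: score z from the residual's leading PCA
    direction, curve F_i from a Gaussian-kernel conditional average."""
    p = np.linalg.eigh(np.cov(E, rowvar=False))[1][:, -1]    # leading direction
    z = E @ p                                                # stage scores z_i
    grid = np.linspace(z.min(), z.max(), grid_size)
    w = np.exp(-0.5 * ((z[:, None] - grid[None, :]) / h) ** 2)
    curve = (w.T @ E) / w.sum(axis=0)[:, None]               # F_i on the grid
    Fz = curve[np.abs(z[:, None] - grid[None, :]).argmin(axis=1)]  # F_i(z_i) per sample
    return z, Fz

rng = np.random.default_rng(3)
t = rng.uniform(-2, 2, 400)
X = np.c_[t, np.tanh(t), t ** 2] + 0.05 * rng.normal(size=(400, 3))
X = X - X.mean(axis=0)

E, scores = X.copy(), []
for i in range(2):              # x = F1(z1) + F2(z2) + ... + e_l
    z, Fz = fit_stage(E)
    scores.append(z)
    E = E - Fz                  # e_i = e_{i-1} - F_i(z_i)
```

Each pass removes the structure captured by one non-linear component, so the residual norm shrinks stage by stage.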
Proposed Approach(2)

Implementation of an auto-associative N.N.
- Construction of 2 MLP N.N.'s from the pairs $(x, z)$ and $(z, x)$:
  - 1st MLP: x → hidden layer → z (NLPC score);
  - 2nd MLP: z → hidden layer → reconstructed x.
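A minimal sketch of the two-MLP construction with scikit-learn, assuming the scores z are already available (here a known 1-D parameter stands in for the principal-curve scores, and the layer sizes and iteration counts are illustrative, not the paper's settings):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
t = rng.uniform(-2, 2, 300)
X = np.c_[t, np.sin(t), t ** 2] + 0.02 * rng.normal(size=(300, 3))
z = t.reshape(-1, 1)            # stand-in for the non-linear principal scores

# 1st MLP, trained on the pairs (x, z): the encoding mapping G
enc = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
enc.fit(X, z.ravel())

# 2nd MLP, trained on the pairs (z, x): the decoding mapping F
dec = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
dec.fit(z, X)

# Chaining the two gives the auto-associative reconstruction x_hat = F(G(x))
Xhat = dec.predict(enc.predict(X).reshape(-1, 1))
```

Training the two small networks separately against known scores avoids the difficulties of training Kramer's single 3-hidden-layer network end to end.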
Case Study

Objective
- Fault detection during an operating-mode change, using 6 variables.

Data acquisition & model building
- NOC data: 120 observations => NLPCA model building.
- Fault data: another 120 observations.

(Figure: process flow diagram of the case-study plant: feed streams A, C, D and E; reactor, condenser, vapor/liquid separator, compressor, stripper; purge and product streams; flow, temperature, pressure and level indicators; composition analyzers XA-XH.)
Model Building
- Principal curve fitting (figure: the fitted curve after 5, 30, and 50 iterations).
- Auto-associative N.N. using 2 MLP's (figures: the trained 1st and 2nd MLP N.N.'s).
Monitoring Result

(Figure: monitoring charts for the LPCA and NLPCA models, with the fault-introduction point marked.)

- The NLPCA model is more efficient than the LPCA model!
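The monitoring step itself can be sketched as reconstruction-error (SPE/Q) charting, with the control limit taken here as the 99th percentile of the NOC errors. This is a simple stand-in: the slides do not specify the actual limit computation, and the linear reconstruction below is only a placeholder for the trained NLPCA model:

```python
import numpy as np

def spe(X, reconstruct):
    """Squared prediction error per observation: ||x - x_hat||^2."""
    return ((X - reconstruct(X)) ** 2).sum(axis=1)

rng = np.random.default_rng(5)
P = np.linalg.qr(rng.normal(size=(6, 6)))[0][:, :2]   # toy 2-D model subspace
noc = rng.normal(size=(120, 2)) @ P.T + 0.05 * rng.normal(size=(120, 6))
fault = noc.copy()
fault[60:] += 0.8 * rng.normal(size=(1, 6))           # step disturbance at t = 60

reconstruct = lambda X: (X @ P) @ P.T                 # placeholder for F(G(x))
limit = np.percentile(spe(noc, reconstruct), 99)      # NOC-based control limit
alarms = spe(fault, reconstruct) > limit
```

Samples whose reconstruction error exceeds the NOC-derived limit are flagged as faulty; a better process model (NLPCA rather than LPCA) tightens the limit and makes the fault stand out sooner.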
Conclusion

Result
- Fault-detection performance was enhanced, in terms of both speed and accuracy, when applied to a test case.

Future work
- Integration of 'Fault Diagnosis' and 'Fault Isolation' methods to perform complete process monitoring on a single platform.