GROOVE SIZING USING A ROBUST NEURAL NETWORK
APPROACH
L. Le Brusquet, M.-E. Davoust and G. Fleury
ECOLE SUPERIEURE D'ELECTRICITE, Service des Mesures,
Plateau de Moulon, 3 rue Joliot Curie, 91192 Gif-sur-Yvette Cedex
ABSTRACT. The remote field eddy current technique is used to inspect conductive pipes from the
inside. The problem is to estimate groove dimensions from observed data. A first
approach was previously developed using a two-step parametric inversion. Results from this first
approach are produced using a new model. A second approach using a neural network is presented.
Neural networks are known to lack robustness when insufficient precautions are taken.
This paper presents these precautions and the results of both approaches.
INTRODUCTION
The remote field eddy current (RFEC) technique is used to inspect conductive pipes
from the inside. This technique is known to be sensitive to internal and external flaws. The
RFEC mechanism is now well understood [3], and is particularly convenient for
ferromagnetic material testing. Figure 1 illustrates a typical experimental apparatus used
for sizing grooves, such as corrosion grooves. In our case, the sensor is
pushed inside the pipe by some means. The modulus and phase of the detector voltage are
acquired along with the coil positions. They are called here the observed data.
The problem is to estimate the groove dimensions from these observed
data. For this problem, Maxwell's equations have no closed-form (algebraic) solution. Therefore,
a first approach was previously developed [1] using a two-step parametric inversion. The
first step is to find an algebraic function which models the direct problem, studied by means
of the finite element (FE) technique; the parameters of this function are obtained using an
optimization method. In the second step, another relation must be chosen to calculate the groove
dimensions from the optimized parameters. This method gives results which are accurate
enough for certain kinds of applications. However, two models have to be built, one for
the direct model and one for the measurement equation, and the results depend on these modeling
phases.
It may be difficult to find better models for the two-step parametric inversion, especially
when the measurement system depends on complex physical laws.
FIGURE 1. Schematic illustration of the sensor-pipe apparatus. (Labels visible in the figure: detector; dimensions 2 mm and 35 mm.)
CP657, Review of Quantitative Nondestructive Evaluation Vol. 22, ed. by D. O. Thompson and D. E. Chimenti
© 2003 American Institute of Physics 0-7354-0117-9/03/$20.00
In such cases, experimental approaches may be valuable, provided that a sufficient number
of values of both observed signals and unknown quantities are available.
This set of values, called the training set, is used to choose an optimal function among the
functions of a given parameterized class. Regression and neural networks are very
straightforward cases of this approach. No design of a physical model between the
observed signals and the unknown quantities is required. This is the advantage of such an
approach.
To solve our groove sizing problem, a one-hidden-layer feed-forward neural network was
used because of its ability to represent nonlinear relations. The required training set is
composed of the signals obtained with the FE tool. Consequently, the number of
simulated grooves is small. This creates a large risk that the estimated neural network overfits the
data of the training set and gives large errors for data which do not belong to the
training set. This paper reports both the precautions which were taken against
this risk and the results obtained.
FIRST APPROACH
In many applications, an unknown quantity m has to be estimated from a vector of
observed values y. This may be encountered particularly in the domain of Nondestructive
Evaluation. Measurement systems can be formalized by two equations. One sets up the
relation between the observed data and the physical phenomenon — it is called here the
Observation Equation. This relation describes the observable physical variable. It is the
classical nonlinear regression model:
$$y_i = f(x_i, \theta) + e_i, \qquad i = 1, \ldots, n. \qquad (1)$$
where $y = [y_1, \ldots, y_n]^T$ is the vector of the observed data, $\theta = [\theta_1, \ldots, \theta_p]^T$ is the vector of
the unknown parameters, $x = [x_1, \ldots, x_n]^T$ stands for the experiment design (e.g., sensor
coordinates, sample times, frequencies for eddy current transducers...), and $e = [e_1, \ldots, e_n]^T$
is the vector of random observation errors.
The other is related to the genuine measurement, the significant variable; it is called here
the Measurement Equation. This equation can be written in the following general form:
$$m = g(\theta) \qquad (2)$$
Here the components of measurement m are the two dimensions of the groove.
Observation Equation
An FE tool was used to compute synthetic observed data for a given set of groove
parameters. The induced voltage was computed for about 100 sensor positions. Various
dimensions were chosen considering the applications. Depth d and length l took the values
d = 0.33, 0.66, 1, 1.33, 1.66 mm and l = 0.33, 1.33, 4, 8, 12, 20 mm. Combining every depth
and length value therefore leads to 30 very realistic simulated grooves. Figure 2 gives a few
observed data [1].
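The way the 30 grooves arise from the two lists of values can be sketched in a few lines of Python. This is only an illustration of the depth/length grid; the variable names are chosen here and do not come from the paper.

```python
from itertools import product

# Depth and length values (in mm) as listed in the text.
depths_mm = [0.33, 0.66, 1.0, 1.33, 1.66]
lengths_mm = [0.33, 1.33, 4.0, 8.0, 12.0, 20.0]

# Every (depth, length) pair is one simulated groove.
grooves = list(product(depths_mm, lengths_mm))

print(len(grooves))  # 5 depths x 6 lengths = 30
print(grooves[0])    # first (depth, length) pair of the grid
```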
From these data, an algebraic model was chosen. Taking into account symmetry properties
and limit variations of the data, several algebraic functions were considered. According to
some criteria the function given in (3) was proved to be the most suitable [1]; it was also
proved that the probability density of measurement does not depend on the
parameterization [4].
FIGURE 2. Observed data for l = 4 mm. (Horizontal axis: x in mm.)
FIGURE 3. Simulated (o) and modeled phase lags. (Horizontal axis: x in mm.)
[Panel title recovered from the figure area: f = 280 Hz, d = 1.33 mm, l = 1.33 mm.]
[Equation (3): expression not legible in the source transcript.]
(3)
where L is the distance between coils.
We have built a new function (4) which happens to give better results:
(4)
Optimal parameters θ are estimated by minimizing a usual quadratic criterion (5). The
Levenberg-Marquardt algorithm [5] is used for that purpose.
$$\varepsilon = \left\| f_\theta(x) - y \right\|^2 \qquad (5)$$
The vector x contains all positions of the sensor for the observed data y (phase of the
detector voltage), and $f_\theta$ is the model given by (4).
Figure 3 compares the data and the modeled phase lag given by the minimization of ε in
(5), for one defect.
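As a reminder of what a Levenberg-Marquardt iteration does with a quadratic criterion such as (5), one common textbook form of the update is sketched below. The symbols $J$ (Jacobian of $f_\theta(x)$ with respect to θ), $\lambda$ (damping factor), $\delta$ (parameter step) and $r$ (residual) are notation introduced here for illustration and are not taken from the paper.

```latex
\begin{aligned}
r(\theta) &= y - f_\theta(x), \\
\left( J^{T} J + \lambda I \right) \delta &= J^{T} r(\theta), \\
\theta &\leftarrow \theta + \delta .
\end{aligned}
```

When λ is close to zero the step approaches the Gauss-Newton step; when λ is large it approaches a short gradient-descent step. Implementations typically decrease λ after a step that reduces the criterion and increase it otherwise.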
Measurement Equation
According to the described approach, an algebraic form for g was chosen. The
simplest choice was a linear function of parameters but to improve the overall accuracy of
the method we considered [1] a bilinear one (6).
(6)
By using matrix expressions, these equations may be written for all simulated defects as:
$$d = \Theta\, c_d, \qquad l = \Theta\, c_l \qquad (7)$$
where the matrix Θ contains the parameters for each defect; each row of Θ is composed
of the terms of the bilinear form for one defect, as defined in (6). The $c_{ij}$ coefficients are
contained in vectors $c_d$ and $c_l$, and vectors d and l contain the exact defect dimensions. The
coefficients $c_{ij}$ may be calculated by solving the rectangular system of equations (6) in the
least squares sense, using a relative criterion. The solution for each dimension of defect is
then given by (8):
(8)
where $W_d$ and $W_l$ are the matrices of weights defined from the defect sizes. The calibration
phase of the model building consists of calculating $c_d^{opt}$ and $c_l^{opt}$. Then, the
estimated values of d and l can be calculated according to (9).
$$\hat{d} = \Theta\, c_d^{opt}, \qquad \hat{l} = \Theta\, c_l^{opt} \qquad (9)$$
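The shape of a weighted least-squares solution for such a rectangular system can be made explicit. As an illustration only, assume that the relative criterion is the sum of squared relative errors on the depth, so that the weight matrix is $W_d = \mathrm{diag}(1/d_k)$; this choice of weights, the index $k$ over the $N$ defects and the row notation $\Theta_k$ are assumptions made here, not the authors' stated formula.

```latex
\begin{aligned}
c_d^{opt} &= \arg\min_{c} \sum_{k=1}^{N} \left( \frac{d_k - \Theta_k\, c}{d_k} \right)^{2}
           = \arg\min_{c} \left\| W_d \left( d - \Theta\, c \right) \right\|^{2}, \\
c_d^{opt} &= \left( \Theta^{T} W_d^{T} W_d\, \Theta \right)^{-1} \Theta^{T} W_d^{T} W_d\, d .
\end{aligned}
```

The same expression with $l$ and $W_l$ would give the coefficients for the length under the same assumption.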
Results
Figure 4 shows relative errors on the dimensions of grooves versus the defect
surface. This error is calculated from (9) and the true values of the defect dimensions. A cross-validation
test (29 grooves are used to estimate $c_d^{opt}$ and $c_l^{opt}$; the remaining groove is used to
calculate $\hat{d}$ and $\hat{l}$) is then carried out to check that the approach is realistic with respect to the goal,
which is to estimate the groove dimensions of an unknown defect. Figure 5 gives the relative
errors on the dimensions of the groove in the case of the cross-validation.
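The leave-one-out procedure described above (fit on all grooves but one, test on the remaining groove, repeat for every groove) can be sketched as follows. The `fit` and `predict` callables and the toy straight-line estimator are hypothetical stand-ins for the real calibration; only the loop structure reflects the text.

```python
def leave_one_out_errors(samples, fit, predict):
    """Relative error (%) on each sample when it is held out of the fit.

    samples: list of (features, true_value) pairs
    fit:     function(list of samples) -> model      (hypothetical interface)
    predict: function(model, features) -> estimate   (hypothetical interface)
    """
    errors = []
    for k, (features, true_value) in enumerate(samples):
        training = samples[:k] + samples[k + 1:]  # all samples except the k-th
        model = fit(training)
        estimate = predict(model, features)
        errors.append(100.0 * (estimate - true_value) / true_value)
    return errors


# Toy stand-in for the real estimator: a line through the origin,
# fitted by ordinary least squares on one scalar feature.
def fit_slope(training):
    num = sum(x * m for x, m in training)
    den = sum(x * x for x, _ in training)
    return num / den


def predict_slope(slope, x):
    return slope * x


toy = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # made-up data, m = 2 x
errors = leave_one_out_errors(toy, fit_slope, predict_slope)
mean_abs = sum(abs(e) for e in errors) / len(errors)  # mean absolute relative error
worst = max(abs(e) for e in errors)                   # worst-case relative error
print(mean_abs, worst)
```

The two summary values correspond in spirit to the mean and maximum cross-validation errors reported in the tables.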
Results of the two-step parametric inversion are summarized in Table 1. The second model
leads to a significant improvement. It can be noticed that the depth estimations are more
accurate than the length estimations, whichever model is used.
TABLE 1. Calibration and cross-validation relative errors for model 1 (eq. (3)) and model 2 (eq. (4)).
Columns: model 1 and model 2, each with depth d and length l.
Rows: mean |ε calibration| (%), mean |ε cross-validation| (%), max |ε cross-validation| (%).
Values in extraction order: 3.44, 12.41, 1.22, 4.98, 14, 81, 3, 161, 1.37 10, 2.63, 16.7, 9.68, 173.5, 04.
FIGURE 4. Calibration: relative error on depth and length. (Horizontal axes: defect surface in mm².)
FIGURE 5. Cross-validation: relative error on depth and length. (Horizontal axes: defect surface in mm².)
EXPERIMENTAL APPROACH
Improving the two-step parametric inversion is a challenging task because of the
complexity of the physical system. An experimental approach such as a neural network
[6] is attractive because of its well-known ability to model a large number of
nonlinear relations [7]. Two one-hidden-layer feed-forward neural networks were used.
One estimates the depth d, the other estimates the length l.
Figure 6 shows the architecture of such a neural network. It performs the following
function:
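$$f_{\text{neural network}}(y) = \sum_{j=1}^{P} w_j\, \varphi\!\left( \sum_{i=1}^{n} w_{ji}\, y_i + b_j \right) + b$$
where φ denotes the activation function of the hidden neurons, and where the weights
$w = \{ (w_{ji})_{j=1..P,\ i=1..n},\ (b_j)_{j=1..P},\ (w_j)_{j=1..P},\ b \}$ have to be adjusted to find the
best performance for the data of the training set:
$$w = \arg\min_{w} \sum_{\text{defects used in the learning step}} \left( m_{\text{defect}} - f_{\text{neural network}}(y_{\text{defect}}) \right)^2, \qquad m = d \text{ or } l.$$
P stands for the number of neurons in the hidden layer. It has to be chosen sufficiently high
that the function $f_{\text{neural network}}$ approximates the data well.
FIGURE 6. Architecture of a one-hidden-layer feed-forward neural network.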
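A one-hidden-layer feed-forward network of this kind computes a weighted sum of P hidden-neuron outputs plus an output bias. The sketch below shows that computation for a single input vector. The tanh activation, the argument names and the numerical values are assumptions made for illustration; the paper's activation function is not recoverable from this transcript.

```python
import math


def forward(y, hidden_weights, hidden_biases, output_weights, output_bias):
    """Output of a one-hidden-layer feed-forward network for one input vector y.

    hidden_weights: P rows of n weights; hidden_biases: P values;
    output_weights: P values; output_bias: one value.
    The tanh activation is an assumption made for this sketch.
    """
    total = output_bias
    for w_row, b_j, w_j in zip(hidden_weights, hidden_biases, output_weights):
        pre_activation = sum(w * y_i for w, y_i in zip(w_row, y)) + b_j
        total += w_j * math.tanh(pre_activation)
    return total


# Tiny example with n = 3 inputs and P = 2 hidden neurons (arbitrary values).
y = [0.5, -1.0, 2.0]
hidden_weights = [[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]]
hidden_biases = [0.0, 0.0]
output_weights = [1.0, 5.0]
output_bias = 0.25

output = forward(y, hidden_weights, hidden_biases, output_weights, output_bias)
print(output)
```

In this example the second hidden neuron has zero weights and zero bias, so it contributes nothing and the output reduces to 0.25 + tanh(0.45).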
Variations of results may be observed if the neural network is over-parameterized [8] [9].
This problem may happen when the number of weights ((n+2)P+1) to be estimated is
higher than the number of observations (defects) used to train the neural network.
In our case, the over-parameterization is strong since only 30 simulated grooves are
available (N=30) while the input length n is 53. Two solutions were studied to prevent
over-fitting:
- the optimization of the neural network dimensioning,
- the reduction of the input length by decimating the observed signal.
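The weight count (n+2)P+1 can be checked against N = 30 for the full-length input. The short sketch below does this for P = 1 to 5; the breakdown of the count given in the docstring is the usual one for a one-hidden-layer network and is an interpretation, and the function name is chosen here.

```python
def weight_count(n, P):
    """(n + 2) * P + 1 weights: n*P input weights, P hidden biases,
    P output weights and 1 output bias (interpretation of the formula)."""
    return (n + 2) * P + 1


N = 30  # simulated grooves available for training
n = 53  # input length before decimation

for P in range(1, 6):
    count = weight_count(n, P)
    status = "over" if count > N else "under"
    print(P, count, status)
```

Even a single hidden neuron already requires 56 weights, which exceeds the 30 available grooves, so every configuration is over-parameterized when the full signal is used as input.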
NEURAL NETWORK DIMENSIONING
It concerns both the choice of the number P of neurons of the hidden layer and the
choice of the input structure. A large part of the work was dedicated to the choice of P.
For the input structure, we chose to use log(y) instead of y as the input of the neural
network. This choice spreads out the range occupied by minor grooves, so that the neural
network can discriminate them more easily.
The number of neurons in the hidden layer is a significant parameter. Too small a number
leads to a neural network which can fit neither the learning data nor the future
observed data. Too large a number leads to an overly complex neural network, and overfitting
may occur.
The number P was selected by using a cross-validation method (29 grooves used for the
learning of the neural network, one groove used for testing it). For a given number P,
results depend on the initial point of the back-propagation algorithm used to adjust the
weights w. That is why 50 simulations were carried out, and the results related to a
parameter set are given with bias and dispersion estimations.
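Summarizing repeated runs by a bias and a dispersion can be sketched as follows. The function `run_once` is a placeholder that returns made-up numbers in place of "train from a random initial point, then test on the held-out groove"; only the summary step (mean, standard deviation, worst case over 50 runs) illustrates the text, and all names are hypothetical.

```python
import random
import statistics


def summarize_runs(errors_percent):
    """Bias (mean), dispersion (standard deviation) and worst case over repeated runs."""
    return {
        "mean": statistics.mean(errors_percent),
        "std": statistics.stdev(errors_percent),
        "max_abs": max(abs(e) for e in errors_percent),
    }


def run_once(seed):
    """Placeholder for one training run started from a random initial point.

    Returns a made-up relative error in percent; a real run would train the
    network and evaluate it on the held-out groove.
    """
    rng = random.Random(seed)
    return rng.gauss(0.0, 2.0)


errors = [run_once(seed) for seed in range(50)]  # 50 runs, one per initial point
summary = summarize_runs(errors)
print(summary)
```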
Table 2 shows that the learning error logically decreases as P increases while the testing
error is the lowest for P=2:
- for P=1, the learning error is high because the neural network cannot model the true
nonlinear relation,
- for large values of P (P>3), the ratio between the learning error and the testing error
becomes higher and higher. This result is in accordance with the fact that the neural
network is more and more over-parameterized.
The values P=2 (resp. P=2 or P=3) may be retained for the depth (resp. the length)
estimation.
TABLE 2. Variation of learning and testing errors with the number of neurons P. For each configuration, 50
simulations were carried out. The max |ε cross-validation| criterion is the worst error among the 50 simulations.
                                      number of neurons P
                                     1       2       3       4       5
depth d
  mean |ε learning| (%)            7.46    1.02    0.84    0.71    0.58
  mean |ε cross-validation| (%)    9.71    1.92    2.11    2.54    2.50
  max |ε cross-validation| (%)     32.7    7.00    27.8    18.5    18.3
length l
  mean |ε learning| (%)            6.77    1.99    1.42    1.17    1.00
  mean |ε cross-validation| (%)    8.29    3.24    3.32    3.73    3.82
  max |ε cross-validation| (%)     1371    17.8    17.9    32.1    36.6
The cross-validation errors obtained on the depth are similar to those of the two-step
parametric inversion (model 2). For the length estimation, they are about 10-fold lower:
robustness is increased.
Figure 7 gives, for each groove, the results obtained with P=2.
DECIMATION OF THE OBSERVED DATA
Another solution to get an under-parameterized neural network is to reduce the
number of inputs and to choose the number of neurons in the hidden layer in such a way
that (n+2)P+1<N. The number of inputs can be reduced simply by decimating
the observed signal. Table 3 shows that for a 5-decimation, two neurons in the hidden layer
is the best value. It can be noticed that it is the highest number which leads to an
under-parameterized neural network. Conclusions are the same as for the neural network
dimensioning: when the neural network is under-parameterized, the robustness is higher.
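The effect of a 5-decimation on the weight count can be checked with a short sketch. Decimation is done here by plain slicing (one sample kept out of five, no filtering), which is an assumption about how the signal was reduced; the stand-in signal and the names are chosen for illustration.

```python
def weight_count(n, P):
    """Number of weights (n + 2) * P + 1 of the one-hidden-layer network."""
    return (n + 2) * P + 1


signal = list(range(53))  # stand-in for one observed signal of length 53
decimated = signal[::5]   # keep one sample out of five
n = len(decimated)        # input length after decimation
N = 30                    # number of simulated grooves

counts = [weight_count(n, P) for P in range(1, 6)]
statuses = ["under" if c < N else "over" for c in counts]
print(n, counts, statuses)
```

With 11 inputs left after decimation, the counts are 14, 27, 40, 53 and 66 weights for P = 1 to 5, which matches the "length of w" row of Table 3; P = 2 is the largest value that keeps the count below N = 30.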
FIGURE 7. Cross-validation: relative errors on depth and length. The markers (+) indicate the error L1-mean
(over 50 simulations). The vertical lines give the standard deviations (+/- σ around the mean values).
(Horizontal axes: defect surface in mm².)
TABLE 3. Variations of the training and testing errors with the number of neurons in the hidden layer. The
observed signals were decimated with a factor equal to 5. 50 simulations were carried out.
                                      number of neurons P
                                     1       2       3       4       5
length of w                         14      27      40      53      66
over/under-parameterized          under   under   over    over    over
depth d
  mean |ε learning| (%)            6.01    4.1     4.03    4.07    4.03
  mean |ε cross-validation| (%)    7.48    5.58    5.89    5.87    5.80
  max |ε cross-validation| (%)     37.4    23.2    30.7    19.5    19.6
length l
  mean |ε learning| (%)            11.5    10.6    8.81    8.49    8.6
  mean |ε cross-validation| (%)    14.4    13.6    11.5    11.4    11.5
  max |ε cross-validation| (%)     53.7    48.9    53.2    57.8    53.8
CONCLUSION
The groove dimensioning problem may be solved with two different approaches.
We have shown that the first one, the parametric approach, can be improved
compared with a previous paper [1]. This improvement required building specific
models.
The second approach, which consists of learning a relation from the data of a training set, was
implemented using a neural network. This approach requires less prior information. This is a
significant advantage when the measurement system is complex. Nevertheless, we have
shown that using a neural network requires some precautions, especially when the neural
network is over-parameterized. This situation may happen when the training set is small.
Unfortunately, most of the training sets built with experimental techniques encounter this
problem. Two techniques were used to get an under-parameterized neural network: the
neural network dimensioning and the decimation of the inputs. These
techniques were shown to stabilize the estimator given by the trained neural network: robustness
is improved.
REFERENCES
1. Fleury G. and Davoust M.-E., "Remote field eddy current inspection for groove sizing - Choice of a direct model structure", Review of Progress in Quantitative Nondestructive
Evaluation, American Institute of Physics, vol. 19A, no. 509, pp. 541-548, 2000.
2. Seghouane A. K. and Fleury G., "Local Robustness Analysis of Multilayered Feedforward
Neural Networks Using Probability Density Functions", IEEE International Workshop on
Intelligent Signal Processing 2001, pp. 165-169, Budapest (Hungary), 24-25 May 2001.
3. Atherton D.L., Schmidt T.R., Svendson T. and von Rosen E., Mat. Eval., Vol. 50, pp. 44-50
(1992).
4. Fleury G., "Optimal Nonlinear Modeling and Reparameterization", in IEEE International
Workshop on Intelligent Signal Processing, WISP99, 1999, pp. 72-76.
5. Press W. et al., in Numerical Recipes, Cambridge University Press, 1988, pp.
523-528.
6. Narendra K. S., Parthasarathy K., "Identification and control of dynamical systems using
neural networks", IEEE Trans. on Neural Networks, vol. 1, pp. 4-27, 1990.
7. Hornik K., Stinchcombe M., White H., "Multi-layer feedforward networks are universal
approximators", Neural Networks, vol. 2, pp. 359-366, 1989.
8. Saxen H., "Nonlinear time series analysis by neural networks: a case study", International Journal of Neural Systems, vol. 7, no. 2, May 1996, pp. 195-201.
9. Seghouane A. K., Moudden Y. and Fleury G., "On learning feedforward neural
networks with noise injection into inputs", IEEE International Workshop on Neural
Networks for Signal Processing, Martigny (Switzerland), 04-06 September 2002, accepted for
publication.