GROOVE SIZING USING A ROBUST NEURAL NETWORK APPROACH

L. Le Brusquet, M.-E. Davoust and G. Fleury
ECOLE SUPERIEURE D'ELECTRICITE, Service des Mesures, Plateau de Moulon, 3 rue Joliot Curie, 91192 Gif-sur-Yvette Cedex

ABSTRACT. The remote field eddy current technique is used to inspect conductive pipes from the inside. The problem is to estimate groove dimensions from observed data. A first approach was previously developed using a two-step parametric inversion. Results for this first approach are produced using a new model. A second approach using a neural network is presented. This technique is known to lack robustness when insufficient precautions are taken. This paper presents these precautions and the results of both approaches.

INTRODUCTION

The remote field eddy current (RFEC) technique is used to inspect conductive pipes from the inside. This technique is known to be sensitive to internal and external flaws. The RFEC mechanism is now well understood [3], and is particularly convenient for ferromagnetic material testing. Figure 1 illustrates a typical experimental apparatus used for dimensioning grooves, such as corrosion grooves. In our case, the sensor is pushed inside the pipe. The modulus and phase of the detector voltage are acquired along with the coil positions; they are called here the observed data. The problem is to obtain an estimate of the groove dimensions from these observed data.

FIGURE 1. Schematic illustration of the sensor-pipe apparatus.

For this problem there is no algebraic solution of Maxwell's equations. Therefore, a first approach was previously developed [1] using a two-step parametric inversion. The first step is to find an algebraic function which models the direct problem, studied by means of the finite element (FE) technique; the parameters of this function are obtained using an optimization method. In a second step, another relation must be chosen to calculate the groove dimensions from the optimized parameters. This method gives results which are accurate enough for certain kinds of applications. However, two models have to be built here, one for the direct model and one for the measurement equation, and the results depend on these modeling phases. It may be difficult to find better models for the two-step parametric inversion, especially when the measurement system depends on complex physical laws.

In such cases, experimental approaches may be valuable, provided that a sufficient number of values of both the observed signals and the unknown quantities are available. This set of values, called the training set, is used to choose an optimal function among the functions of a given parameterized class. Regression and neural networks are very straightforward cases of this approach. No design of a physical model between the observed signals and the unknown quantities is required; this is the advantage of such an approach. To solve our groove sizing problem, a one-hidden-layer feed-forward neural network was used because of its ability to represent nonlinear relations. The required training set is composed of the signals obtained with the FE tool. The consequence is the small number of simulated grooves. This involves a large risk that the estimated neural network over-fits the data of the training set and gives large errors for data which do not belong to the training set. This paper relates both the precautions which have been taken to guard against this risk and the obtained results.

FIRST APPROACH

In many applications, an unknown quantity m has to be estimated from a vector of observed values y. This situation is frequently encountered in the domain of Nondestructive Evaluation. Measurement systems can be formalized by two equations. One sets up the relation between the observed data and the physical phenomenon; it is called here the Observation Equation. This relation describes the observable physical variable. It is the classical nonlinear regression model:

$y_i = f(x_i, \theta) + e_i, \quad i = 1, \dots, n$   (1)

where $y = [y_1, \dots, y_n]^T$ is the vector of the observed data, $\theta = [\theta_1, \dots, \theta_p]^T$ is the vector of the unknown parameters, $x = [x_1, \dots, x_n]^T$ stands for the experiment design (e.g., sensor coordinates, sample times, frequencies for eddy current transducers), and $e = [e_1, \dots, e_n]^T$ is the vector of random observation errors. The other equation is related to the genuine measurement, the significant variable; it is called here the Measurement Equation. It can be written in the following general form:

$m = g(\theta)$   (2)

Here the components of the measurement m are the two dimensions of the groove.

Observation Equation

An FE tool was used to compute synthetic observed data for a given set of groove parameters. The induced voltage was computed for about 100 sensor positions. Various dimensions were chosen considering the applications. The depth d and the length l took the values (d = 0.33, 0.66, 1, 1.33, 1.66 mm) and (l = 0.33, 1.33, 4, 8, 12, 20 mm). Coupling every depth and length value therefore leads to 30 very realistic simulated grooves. Figure 2 gives a few observed data [1].

From these data, an algebraic model was chosen. Taking into account the symmetry properties and the limit variations of the data, several algebraic functions were considered. According to some criteria, the function given in (3), which involves the distance L between the coils, proved to be the most suitable [1]; it was also proved that the probability density of the measurement does not depend on the parameterization [4]. We have built a new function (4) which happens to give better results.

The optimal parameters $\hat{\theta}$ are estimated by minimizing the usual quadratic criterion (5); the Levenberg-Marquardt algorithm [5] is used for that purpose:

$\varepsilon(\theta) = \| y - f_\theta(x) \|^2$   (5)

The vector x contains all the positions of the sensor for the observed data y (phase of the detector voltage), and $f_\theta$ is the model given by (4). Figure 3 compares the data and the modeled phase lag given by the minimization of $\varepsilon$ in (5) for one defect.

FIGURE 2. Observed data for l = 4 mm (versus the sensor position x in mm).

FIGURE 3. Simulated (o) and modeled phase lags (f = 280 Hz, d = 1.33 mm, l = 1.33 mm).
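As an illustration of this first step, the short sketch below fits a parametric phase-lag model to one simulated scan with the Levenberg-Marquardt routine of SciPy. The Gaussian-bump model f_model, its three parameters and the synthetic scan are placeholders for the example only; they stand in for the actual model structures (3) and (4), which are not reproduced here.

```python
# Sketch of the first inversion step: fit a parametric phase-lag model to one
# simulated scan by minimizing the quadratic criterion (5) with Levenberg-Marquardt.
# f_model is a hypothetical stand-in for the algebraic structures (3)/(4).
import numpy as np
from scipy.optimize import least_squares

def f_model(x, theta):
    """Placeholder smooth, symmetric phase-lag model of the sensor position x."""
    a, b, c = theta
    return a * np.exp(-((x - b) ** 2) / (2.0 * c ** 2))

def residuals(theta, x, y):
    """Residual vector whose squared norm is the quadratic criterion (5)."""
    return y - f_model(x, theta)

# x: sensor positions (about 100 per scan), y: observed phase of the detector voltage
rng = np.random.default_rng(0)
x = np.linspace(-60.0, 60.0, 100)
y = f_model(x, (1.0, 0.0, 10.0)) + 0.01 * rng.normal(size=x.size)   # synthetic scan

theta0 = np.array([0.5, 1.0, 5.0])                                  # initial guess
fit = least_squares(residuals, theta0, args=(x, y), method="lm")    # Levenberg-Marquardt
theta_hat = fit.x                                                   # optimized parameters
```

The fitted vector theta_hat plays the role of the optimized parameters that the Measurement Equation of the next subsection maps to the groove dimensions.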
Measurement Equation

According to the described approach, an algebraic form for g was chosen. The simplest choice is a linear function of the parameters, but to improve the overall accuracy of the method a bilinear one was considered [1] (6). By using matrix expressions, these equations may be written for all the simulated defects as

$d = \Theta c_d, \quad l = \Theta c_l$   (7)

where the matrix $\Theta$ contains the parameters for each defect: each line of $\Theta$ is composed of the terms of the bilinear form for one defect, as defined in (6). The $c_{ij}$ coefficients are contained in the vectors $c_d$ and $c_l$, and the vectors d and l contain the exact defect dimensions. The coefficients $c_{ij}$ may be calculated by solving the rectangular system (7) in the least squares sense, using a relative criterion. The solution for each dimension of the defect is then given by (8):

$c_d^{opt} = (\Theta^T w_d \Theta)^{-1} \Theta^T w_d\, d, \quad c_l^{opt} = (\Theta^T w_l \Theta)^{-1} \Theta^T w_l\, l$   (8)

where $w_d$ and $w_l$ are the matrices of weights defined from the defect sizes. The calibration phase of the model building consists of calculating $c_d^{opt}$ and $c_l^{opt}$. Then, the estimated values of d and l can be calculated according to (9):

$\hat{d} = \Theta c_d^{opt}, \quad \hat{l} = \Theta c_l^{opt}$   (9)
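A minimal sketch of the calibration (8) and prediction (9) steps follows. It assumes, for illustration only, that each row of Θ stacks a constant, the fitted parameters and their pairwise products, and that the relative criterion is implemented with weights 1/d_i² (resp. 1/l_i²); the exact bilinear terms and weighting used in [1] may differ.

```python
# Hedged sketch of the calibration (8) and prediction (9) steps of the
# measurement equation; the bilinear terms and the relative weights are assumptions.
import numpy as np
from itertools import combinations

def bilinear_terms(theta):
    """Terms of an assumed bilinear form for one defect (stand-in for eq. (6))."""
    theta = np.asarray(theta, dtype=float)
    cross = [theta[i] * theta[j] for i, j in combinations(range(theta.size), 2)]
    return np.concatenate(([1.0], theta, cross))

def calibrate(Theta, sizes):
    """Weighted least squares solution (8) with relative weights 1/size^2."""
    w = np.diag(1.0 / sizes ** 2)
    return np.linalg.solve(Theta.T @ w @ Theta, Theta.T @ w @ sizes)

# params_per_defect: fitted parameter vectors for the 30 simulated grooves (placeholder)
params_per_defect = np.random.default_rng(1).uniform(0.5, 2.0, size=(30, 3))
d_true = np.tile([0.33, 0.66, 1.0, 1.33, 1.66], 6)         # exact depths (mm)
l_true = np.repeat([0.33, 1.33, 4.0, 8.0, 12.0, 20.0], 5)  # exact lengths (mm)

Theta = np.vstack([bilinear_terms(t) for t in params_per_defect])
c_d_opt, c_l_opt = calibrate(Theta, d_true), calibrate(Theta, l_true)
d_hat, l_hat = Theta @ c_d_opt, Theta @ c_l_opt            # prediction (9)
```

In the cross-validation test described next, the same calibration is repeated with one groove left out of Θ and of the true sizes, and the sizes of the left-out groove are then predicted.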
The cfj coefficients are 713 contained in vectors cd and cl, and vectors d and / contain exact defect dimensions. The coefficients ctj may be calculated by solving the rectangular system of equations (6) in the least squares sense, using a relative criterion. The solution for each dimension of defect is then given by (8): (8) where wd and wi are the matrixes of weights defined from defect sizes. The calibration phase of the model building consists of calculating matrixes cj' and c°pt. Then, the estimated values of d and / can be calculated according to (9). d=®copt, (9) Results Figure 4 shows relative errors on the dimensions of grooves versus the defect surface. This error is calculated from (9). and the true values of defects. A test of crossvalidation (29 grooves are used to estimate c°dpt and c°pt, the left groove is used to calculate d and / ) is then carried out to prove the approach realistic towards the goal which is to estimate groove dimensions of an unknown defect. Figure 5 gives relative errors on the dimensions of groove in the case of the cross-validation. Results of the two-step parametric inversion are summarized in table 1. The second model leads to a significant improvement. It can be noticed that the depth estimations are more accurate than the length estimations, whatever the model is. TABLE 1. Calibration and cross-validation relatives errors for model 1 (eq. (3)) and model 2 (eq.4) model 1 model 2 depth d length / length / depth d 3.44 12.41 1.22 4.98 (lecalibraticnl) (%) 14 \|8 cross- validation!) (%) max 6 cross- validation 81 3 161 (%) 1.37 10 2.63 16.7 9.68 173.5 04. + H i i ! i ; 1 i ; i i ; i - ++1 + r + + i .................................4.. + + 0 + +! i ;+ ; 5 15 A 10 —^.-j-.-.j...^-.-!-..-..].....-^. .;...............;........... . . . . ............. ;................. ! : • j+ i i ; i i ^*i.+ ..L.....j+ . 1 |-5 + 1 10 15 20 25 Defect surface in mm2 5*^.^^......^...^.^^^^^^.^.^^.^...^^. 8 > ! -10 i 30 ...t.i....!... + L....i..J... .j... *1) 1C 35 FIGURE 4. Calibration: relative error on depth and length. 714 L ...i j 5 ! ; 1 | 10 15 20 25 2 Defect surface in mm ! 30 3 ... . . . i T -io. i + 10 15 20 25 Defect surface in mm2 30 -200 35 10 15 20 25 Defect surface in nun2 30 35 FIGURE 5. Cross-validation: relative error on depth and length. EXPERIMENTAL APPROACH Improving the two-step parametric inversion is a challenging task because of the complexity of the physical system. An experimental approach such as a neural network one [6] seems interesting because of its well-known ability to model a large number of nonlinear relations [7], Two one-hidden layer feed- forward neural networks were used. One was used to estimate the depth d, the other estimates the length /. Figure 6 shows the architecture of such a neural network. It performs the following function: with / where the weights w - {(w//).=1 p .=1 n > fe)/=i p> ( w /)/=i /» ^j nave to be adjusted to find the best performance for the data of the training set: w = argmin w ^^ (V—efect - f J ne f defect used \\ \ in the learning step J ^ m = d or / m defect ^ P stands for the number of neurons in the hidden layer. It has to be chosen sufficiently high in such a way that the function ^neural network approximates well the data. ./neural network("0 FIGURE 6. Architecture of an one-hidden layer feed-forward neural network. 715 Variations of results may be observed if the neural network is over-parameterized [8] [9]. 
Variations of the results may be observed if the neural network is over-parameterized [8], [9]. This problem may happen when the number of weights, (n+2)P+1, to be estimated is higher than the number of observations (defects) used to train the neural network. In our case, the over-parameterization is strong since only 30 simulated grooves are available (N = 30) while the input length n is 53. Two solutions were studied to prevent over-fitting:
- the optimization of the neural network dimensioning,
- the reduction of the input length by decimating the observed signal.

NEURAL NETWORK DIMENSIONING

This concerns both the choice of the number P of neurons of the hidden layer and the choice of the input structure. A large part of the work was dedicated to the choice of P. For the input structure, we chose to use log(y) instead of y as the input of the neural network. This choice spreads the range of the minor grooves, so that the neural network can discriminate them more easily.

The number of neurons in the hidden layer is a significant parameter. Too few neurons lead to a neural network which can fit neither the learning data nor the future observed data. Too many neurons lead to an overly complex neural network, and over-fitting may occur. The number P was selected by using a cross-validation method (29 grooves used for the learning of the neural network, one groove used for testing it). For a given number P, the results depend on the initial point of the back-propagation algorithm used to adjust the weights w. That is why 50 simulations were carried out, and the results related to a parameter set are given with bias and dispersion estimates. Table 2 shows that the learning error logically decreases as P increases while the testing error is the lowest for P = 2:
- for P = 1, the learning error is high because the neural network cannot model the true nonlinear relation;
- for large values of P (P > 3), the ratio between the learning error and the testing error becomes higher and higher. This result is in accordance with the fact that the neural network is more and more over-parameterized.
The value P = 2 (resp. P = 2 or P = 3) may be retained for the depth (resp. the length) estimation.

TABLE 2. Variation of the learning and testing errors with the number of neurons P. For each configuration, 50 simulations were carried out. The max|ε_cross-validation| criterion is the worst error among the 50 simulations.

                                                number of neurons P
                                          1       2       3       4       5
depth    ⟨|ε_learning|⟩ (%)             7.46    1.02    0.84    0.71    0.58
         ⟨|ε_cross-validation|⟩ (%)     9.71    1.92    2.11    2.54    2.50
         max|ε_cross-validation| (%)    32.7    7.00    27.8    18.5    18.3
length   ⟨|ε_learning|⟩ (%)             6.77    1.99    1.42    1.17    1.00
         ⟨|ε_cross-validation|⟩ (%)     8.29    3.24    3.32    3.73    3.82
         max|ε_cross-validation| (%)    137.1   17.8    17.9    32.1    36.6

The cross-validation errors obtained on the depth are similar to those of the two-step parametric inversion (model 2). For the length estimation, they are about 10-fold lower: robustness is increased. Figure 7 gives, for each groove, the results obtained with P = 2.

DECIMATION OF THE OBSERVED DATA

Another solution to obtain an under-parameterized neural network is to reduce the number of inputs and to choose the number of neurons in the hidden layer in such a way that (n+2)P+1 < N. The reduction of the number of inputs can roughly be done by decimating the observed signal. Table 3 shows that for a decimation by a factor of 5, two neurons in the hidden layer is the best value. It can be noticed that this is the highest number which leads to an under-parameterized neural network.
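The sketch below illustrates how such a configuration can be screened: each signal is decimated by a factor of 5 and a leave-one-out cross-validation is run for P = 1, ..., 5 with a few random restarts per split. It reuses the assumed train and f_nn helpers and the placeholder arrays Y and m from the previous sketch, and uses 5 restarts per configuration instead of the 50 simulations of the paper to keep the example short.

```python
# Hedged sketch: decimate the inputs, then compare hidden-layer sizes P by
# leave-one-out cross-validation with a few random restarts per split.
# Assumes Y, m, train() and f_nn() from the previous sketch are in scope.
import numpy as np

def loo_cv_error(Y, m, P, n_restarts=5):
    """Mean and worst relative cross-validation error over all left-out grooves."""
    N = Y.shape[0]
    errors = []
    for k in range(N):                                     # leave one groove out
        keep = np.delete(np.arange(N), k)
        for seed in range(n_restarts):                     # several initial points
            w = train(Y[keep], m[keep], P, seed=seed)
            m_hat = f_nn(w, Y[k:k + 1], Y.shape[1], P)[0]
            errors.append(abs(m_hat - m[k]) / m[k])
    errors = np.asarray(errors)
    return 100.0 * errors.mean(), 100.0 * errors.max()     # in %

Y_dec = Y[:, ::5]                                          # 5-fold decimation: 53 -> 11 inputs
N = Y_dec.shape[0]
for P in range(1, 6):
    n_weights = (Y_dec.shape[1] + 2) * P + 1               # length of w, (n+2)P+1
    status = "under" if n_weights < N else "over"
    mean_err, max_err = loo_cv_error(Y_dec, m, P)
    print(f"P={P}: {n_weights} weights ({status}-parameterized), "
          f"mean={mean_err:.1f}%, max={max_err:.1f}%")
```

With the 53 placeholder samples decimated to 11 inputs, the weight counts reproduce the "length of w" row of Table 3 (14, 27, 40, 53, 66).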
The conclusions are the same as for the neural network dimensioning: when the neural network is under-parameterized, the robustness is higher.

FIGURE 7. Cross-validation: relative errors on depth and length versus the defect surface (mm²). The markers (+) indicate the L1-mean of the error (over 50 simulations). The vertical lines give the standard deviations (+/- σ around the mean values).

TABLE 3. Variation of the training and testing errors with the number of neurons in the hidden layer. The observed signals were decimated by a factor of 5. 50 simulations were carried out.

                                                number of neurons P
                                          1       2       3       4       5
         length of w, (n+2)P+1           14      27      40      53      66
         over/under-parameterized       under   under   over    over    over
depth    ⟨|ε_learning|⟩ (%)             6.01    4.1     4.03    4.07    4.03
         ⟨|ε_cross-validation|⟩ (%)     7.48    5.58    5.89    5.87    5.80
         max|ε_cross-validation| (%)    37.4    23.2    30.7    19.5    19.6
length   ⟨|ε_learning|⟩ (%)             11.5    10.6    8.81    8.49    8.6
         ⟨|ε_cross-validation|⟩ (%)     14.4    13.6    11.5    11.4    11.5
         max|ε_cross-validation| (%)    53.7    48.9    53.2    57.8    53.8

CONCLUSION

The groove dimensioning problem may be solved with two different approaches. We have shown that the first one, the parametric approach, may be improved in comparison with the previous paper [1]. This improvement has required the building of specific models. The second approach, consisting of learning a relation from the data of a training set, was achieved using a neural network. This approach requires less prior information, which is a significant advantage when the measurement system is complex. Nevertheless, we have shown that using a neural network requires some precautions, especially when the neural network is over-parameterized. This situation may happen when the training set size is low. Unfortunately, most of the training sets built with experimental techniques encounter this problem. Two techniques were used to obtain an under-parameterized neural network: the neural network dimensioning and the decimation of the inputs. It was shown that these techniques stabilize the estimator given by the trained neural network: robustness is improved.

REFERENCES

1. Fleury G. and Davoust M.-E., "Remote field eddy current inspection for groove sizing: choice of a direct model structure", Review of Progress in Quantitative Nondestructive Evaluation, American Institute of Physics, vol. 19A, no. 509, pp. 541-548, 2000.
2. Seghouane A. K. and Fleury G., "Local Robustness Analysis of Multilayered Feedforward Neural Networks Using Probability Density Functions", IEEE International Workshop on Intelligent Signal Processing 2001, pp. 165-169, Budapest (Hungary), 24-25 May 2001.
3. Atherton D. L., Schmidt T. R., Svendson T. and von Rosen E., Mat. Eval., vol. 50, pp. 44-50, 1992.
4. Fleury G., "Optimal Nonlinear Modeling and Reparameterization", IEEE International Workshop on Intelligent Signal Processing, WISP'99, 1999, pp. 72-76.
5. Press W. et al., Numerical Recipes, Cambridge University Press, 1988, pp. 523-528.
6. Narendra K. S. and Parthasarathy K., "Identification and control of dynamical systems using neural networks", IEEE Trans. on Neural Networks, vol. 1, pp. 4-27, 1990.
7. Hornik K., Stinchcombe M. and White H., "Multilayer feedforward networks are universal approximators", Neural Networks, vol. 2, pp. 359-366, 1989.
8. Saxen H., "Nonlinear time series analysis by neural networks: a case study", International Journal of Neural Systems, vol. 7, no. 2, pp. 195-201, May 1996.
9. Seghouane A. K., Moudden Y.
and Fleury G., "On learning feedforward neural networks with noise injection into inputs", IEEE International Workshop on Neural Networks for Signal Processing, Martigny (Switzerland), 4-6 September 2002, accepted for publication.