Self-organizing maps for virtual sensors, fault detection and fault isolation in diesel engines

Master's thesis (Examensarbete) in Automatic Control (Reglerteknik) at Linköping Institute of Technology (Tekniska Högskolan i Linköping)
by Conny Bergkvist and Stefan Wikner
Reg nr: LiTH-ISY-EX--05/3634--SE

Supervisors: Johan Sjöberg (LiTH), Urban Walter (Volvo), Fredrik Wattwil (Volvo)
Examiners: Svante Gunnarsson (LiTH), Jonas Sjöberg (Chalmers)

Linköping, 18th February 2005
Division, Department: Institutionen för systemteknik, 581 83 Linköping
Date: 2005-02-11
Language: English
Report category: Examensarbete (Master's thesis)
ISRN: LITH-ISY-EX--05/3634--SE
URL for electronic version: http://www.ep.liu.se/exjobb/isy/2005/3634/
Abstract
This master's thesis report discusses the use of self-organizing maps in a diesel engine management system. Self-organizing maps are one type of artificial neural network that is good at visualizing data and solving classification problems. The system studied is the Vindax® development system from Axeon Ltd. By rewriting the problem formulation, function estimation and conditioning problems can be solved in addition to classification problems.
In this report a feasibility study of the Vindax® development system is performed, and for implementation the inlet air system is diagnosed and the engine torque is estimated. The results indicate that self-organizing maps can be used in future diagnosis functions as well as virtual sensors when physical models are hard to accomplish.
Keywords: self-organizing maps, neural network, virtual sensor, diesel engine, fault detection, fault isolation, automotive, development system
Preface
This thesis was written by two students, one from Chalmers University of Technology and one from Linköping University. For practical purposes this report exists in two versions: one at Chalmers and one at Linköping. Their contents are identical; only the framework differs slightly, to meet each university's rules.
Acknowledgments
Throughout the thesis work a number of people have been helpful and engaged in our work. We would like to thank our instructors Fredrik Wattwil and Urban Walter at Engine Diagnostics, Volvo Powertrain, for their support and engagement. In addition we thank our instructors and examiners: Svante Gunnarsson and Johan Sjöberg at Linköping University and Jonas Sjöberg at Chalmers University of Technology. The support team from Axeon Ltd - Chris Kirkham, Helge Nareid, Iain MacLeod and Richard Georgi - the project coordinator, Carl-Gustaf Theen, and the other employees at the Engine Diagnostics group have also been of great help to the success of this thesis.
Notation

Symbols
x, X    Boldface letters are used for vectors and matrices.

Abbreviations
ABSE    ABSolute Error
ANN     Artificial Neural Network
DC      Dynamic Cycle
EMS     Engine Management System
ETC     European Transient Cycle
PCI     Peripheral Component Interconnect
RMSE    Root-Mean-Square Error
SOM     Self-Organizing Map
VDS     Vindax® Development System
VP      Vindax® Processor
Contents

1 Introduction
  1.1 Background
  1.2 Problem description
  1.3 Purpose
  1.4 Goal
  1.5 Delimitations
  1.6 Method
  1.7 Thesis outline

2 Background theory
  2.1 Self-organizing maps
  2.2 Diesel engines
    2.2.1 The basics
    2.2.2 Four-stroke engine
    2.2.3 Turbo charger
    2.2.4 What happens when the gas pedal is pushed?

3 The Vindax® Development System
  3.1 Introduction
    3.1.1 Labeling step
    3.1.2 Classification step
  3.2 Conditioning of input signals
  3.3 Classification of cluster data
    3.3.1 Introduction
    3.3.2 Results and discussion
  3.4 Function estimation
    3.4.1 Introduction
    3.4.2 Results and discussion
  3.5 Handling dynamic systems
    3.5.1 Introduction
    3.5.2 Results and discussion
  3.6 Comparison with other estimation and classification methods
    3.6.1 Cluster classification through minimum distance estimation
    3.6.2 Function estimation through polynomial regression

4 Data collection and pre-processing
  4.1 Data collection
    4.1.1 Amount of data
    4.1.2 Sampling frequency
    4.1.3 Quality of data
    4.1.4 Measurement method
  4.2 Pre-processing

5 Conditioning - Fault detection
  5.1 Introduction
  5.2 Method
  5.3 Results and discussion

6 Classification - Fault detection and isolation
  6.1 Introduction
  6.2 Method
    6.2.1 Leakage isolation
    6.2.2 Reduced intercooler efficiency isolation
  6.3 Results and discussion

7 Function estimation - Virtual sensor
  7.1 Introduction
  7.2 Method
  7.3 Results and discussion

8 Conclusions and future work
  8.1 Conclusions
    8.1.1 Conditioning
    8.1.2 Classification
    8.1.3 Function estimation
  8.2 Method criticism
  8.3 Future Work

Bibliography

A RMSE versus ABSE

B Measured variables

C Fault detection results
  C.1 90 % activation frequency resolving
  C.2 80 % activation frequency resolving
  C.3 70 % activation frequency resolving
  C.4 60 % activation frequency resolving
  C.5 Most frequent resolving
Chapter 1
Introduction
This chapter gives the background, the purpose, the goal, the delimitations, the methods used, and the outline of the thesis.
1.1 Background
The demand for new methods and technologies in the Engine Management System,
EMS, is increasing. Laws that govern the permissible level of pollutants in the
exhaust of diesel engines have to be followed at the same time as the drivers demand
high performance and low fuel consumption, putting the EMS under pressure.
Applications in the EMS are based on a number of inputs measured by sensors and estimated by models. Limitations of these sensors and models create situations where input signals of sufficient quality cannot be obtained. Adding new sensors is expensive and creates an additional need for diagnostics. Together, this forces engineers to look at new methods to attain these values or to find ways around the problem.
Identifying models using Artificial Neural Networks, ANNs, can be one solution. A lot of research has been put into this area. The increasing number of applications with implemented ANNs indicates that the technology could be of use in an EMS. There are development systems available to produce hardware and software applications using ANNs to be deployed in the EMS. The Vindax® Development System, VDS, by Axeon Limited¹ is an example of this. The system may be capable of solving problems within the EMS.
The thesis work has been performed at Volvo Powertrain Corporation. The diagnosis group wants to know if the VDS can be used to help in the area of diagnosing diesel engines.

¹ www.axeon.com
1.2 Problem description
Three different kinds of problems are investigated in this thesis:

Conditioning - Fault detection  Conditioning a system is one way of determining if the system is fault free. Assume that a system has states x ∈ Φ, where x denotes the system states and Φ denotes a set, if it is fault free. Determining whether x ∈ Φ is a conditioning problem.

Classification - Fault isolation  There can be different kinds of faults that need to be isolated. Assume that a system with states x is fault free if x ∈ Φ0. Assume also that k different classes of errors can occur and that each error causes the states to belong to one of the sets Φ1, Φ2, ..., Φk. Determining which set x belongs to is a classification problem.

Function estimation - Virtual sensor  A system generates output according to y = f(u) ∈ ℝ. This output can be measured by sensors or obtained by determining the function f. In complex systems, e.g. an engine, it may be hard to determine f exactly. Finding an estimate of f is a function estimation problem.
1.3 Purpose

The aim of this master's thesis is to investigate a neural network application based on Self-Organizing Maps, SOMs. The main purpose is to investigate how well such a neural network works compared to traditional software applications used for control of real-time functions in EMSs. The actual environment to be investigated is the VDS. The thesis should:

• Evaluate the training process of the system regarding:
  - Time requirements
  - Size and type of source measurement data needed
• Evaluate the strengths and weaknesses of the system
• Evaluate the performance and capabilities of the VDS to solve the problems described in section 1.2.
1.4 Goal

To fulfill the purpose, the thesis goal is divided into three parts:

• Estimate the output torque with better accuracy than the value estimated by the EMS
• Detect air leakages in the inlet air system as well as a reduced intercooler efficiency with accuracy sufficient to make a diagnosis
• Isolate the same errors with accuracy sufficient to make a diagnosis

All experiments are performed on a Volvo engine using the VDS.
1.5
Delimitations
R
The evaluation of the VDS will be based upon a PCI2 -card version of the Vindax
hardware on a Windows PC.
For the fault detection and isolation problems data from one single engine individual are used. However, two different types of test running cycles are used. For
function estimation data from one single type of test cycle are used, but the test
cycle is run on two different types of engines. For all problems, the data used for
verification are of the same kind of data as for training.
1.6 Method

To fulfill the purpose and reach the goal of this thesis, a feasibility study of the VDS is performed. The study reveals basic properties of the system and serves as a solid base for the continued work. With the feasibility study as a base, the actual development of the applications takes place.

² Peripheral Component Interconnect, PCI, is a standard for connecting expansion cards to a PC.
1.7 Thesis outline

Chapter 2  Introduction to the theory behind the Vindax® Processor, VP, used in the VDS: self-organizing maps. The diesel engine is also described to give a sense of the environment where the algorithm is going to be deployed.

Chapter 3  A feasibility study of the VDS and its three main application areas: conditioning, classification and function estimation.

Chapter 4  Here the collection and pre-processing of data are discussed. Important issues such as the representativeness of data are handled, both in general and in specific terms.

Chapter 5-7  Each of these chapters describes the development of an application. These are very specific chapters and present results that can be expected to be achieved with the VDS.

Chapter 8  Ends the report with conclusions and recommendations for future work.
Chapter 2
Background theory
This chapter gives a short introduction to self-organizing maps and diesel engines for readers with little or no knowledge of these areas.
2.1 Self-organizing maps
Professor Kohonen developed the SOM algorithm in its present form in 1982 and presents it in his book [1]. The algorithm maps high-dimensional data onto a low-dimensional array. It can be seen as a conversion of statistical relationships into geometrical ones, suitable for visualization of high-dimensional data. As this is a kind of compression of information, preserving topological and/or metric relationships of the primary data elements, it can also be seen as a kind of abstraction. These two aspects make the algorithm useful for process analysis, machine perception, control tasks and more.¹
A SOM is actually an ANN categorized as a competitive learning network. These
kinds of ANNs contain neurons that, for each input, compete over determining the
output. Each time the SOM is fed with an input signal, a competition takes place.
One neuron is chosen as the winner and decides the output of the network.
The algorithm starts with initiation of the SOM. Assume that the input variables, ξi, are fed to the SOM as a vector u = [ξ1, ξ2, ..., ξn]ᵀ ∈ ℝⁿ. This defines the input space as a vector space of dimension n. Each neuron is then associated with a reference vector, mi ∈ ℝⁿ, that gives the neuron an orientation in the input space. The reference vectors are finally, often randomly, given an initial value.
Then the algorithm carries out a regression process to represent the available
inputs. For each sample of the input signal, u(t), the SOM-algorithm selects a
winning neuron. To find the winner, the distance between each neuron and the
input sample is calculated. According to equation (2.1)², the winner mc(t) is the neuron closest to the input sample:

||u(t) − mc(t)|| ≤ ||u(t) − mi(t)|| ∀ i    (2.1)

Here t is a discrete time coordinate of the input samples.

When a winner is selected, the reference vectors of the network are updated. The winning neuron and all neurons that belong to its neighborhood are updated according to equation (2.2):

mi(t + 1) = mi(t) + hc,i(t)(u(t) − mi(t))    (2.2)

The function hc,i(t) is a 'neighborhood function' decreasing with the distance between mc and mi. The neighborhood function is traditionally implemented as a Gaussian (bell-shaped) function. For convergence reasons, hc,i(t) → 0 as t → ∞. This regression distributes the neurons to approximate the probability density function of the input signal.

¹ It is formally described, by the inventor himself, as ([1], p. 106) "a nonlinear, ordered, smooth mapping of high-dimensional input data manifolds onto the elements of a regular low-dimensional array."
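To make these two steps concrete, the following is a minimal Python sketch of the training loop, not the VDS implementation. It assumes a one-dimensional neuron array and exponentially shrinking Gaussian neighborhood and learning rate schedules; the example below instead uses a 4 by 4 array, but the structure of the algorithm is the same.

    import numpy as np

    def train_som(inputs, n_neurons=16, n_steps=100, seed=0):
        """Toy SOM training loop following eqs. (2.1)-(2.2); 1-D neuron array."""
        rng = np.random.default_rng(seed)
        m = rng.normal(size=(n_neurons, inputs.shape[1]))  # random initiation
        idx = np.arange(n_neurons)
        for t in range(n_steps):
            u = inputs[t % len(inputs)]
            d = np.abs(m - u).sum(axis=1)        # 1-norm distances, as in the VDS
            c = int(np.argmin(d))                # eq. (2.1): the winning neuron
            # Assumed schedules: Gaussian neighborhood and learning rate that
            # both shrink with time, so that h_{c,i}(t) -> 0 as t grows.
            sigma = max(n_neurons * np.exp(-3.0 * t / n_steps), 0.5)
            h = np.exp(-((idx - c) ** 2) / (2 * sigma ** 2))
            alpha = 0.5 * np.exp(-3.0 * t / n_steps)
            m += alpha * h[:, None] * (u - m)    # eq. (2.2): pull winner and neighbors
        return m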
To give a more qualitative description of the SOM algorithm, an example is considered. The input signal, generated by taking points on the arc of a half circle³, is shown in figure 2.1. This is a signal of dimension two that the SOM is going to map. The SOM has in this case a size of 16 neurons, i.e. a 4 by 4 array. To illustrate the learning process, the array is imagined as a net, with knots representing neurons and elastic strings keeping a relation to each knot's neighbors. For initiation, the SOM is distributed over the input space according to figure 2.1. The reference vectors are randomized in the input space (using a Gaussian distribution).

During training, a regression process adjusts the SOM to the input signal. For each input, the closest neuron is chosen as the winner. When a winner is selected, it and its neighbors are adjusted towards the input. This can be seen as taking the winning neuron and pulling it towards the input; the elastic strings in the net cause the neighbors to follow the winner. Referring to equation (2.2), the neighborhood function hc,i(t) determines the elasticity of the net.

The regression continues over a finite number of steps to complete the training process. In figure 2.1, network shapes after 10 and 100 steps of training are illustrated. As can be seen, a nonlinear, ordered and smooth mapping of the input signal is formed after 100 steps. The network has formed an approximation of the input signal.⁴

² In equation (2.1), the norm used is usually the Euclidean (2-norm), but other norms can be used as well. The VDS uses the 1-norm.
³ Although there is probably no physical system giving this input, it suits the purpose of illustrating the SOM algorithm.
⁴ Additional training has no significant effect on the structure of the neurons.
Figure 2.1. Illustration of the SOM algorithm. Each step involves presenting one input sample to the SOM. The network has started approximating the distribution of the input signal already after 10 steps. After 100 steps the network has formed an approximation of the input signal.
2.2 Diesel engines
It is not crucial to know how diesel engines work to read this report, but some knowledge might increase the understanding of why certain signals are used and how they affect the engine.
This thesis only discusses the four-stroke diesel engine, because that is the type used in Volvo's trucks. To learn more about engines, both diesel and Otto (e.g. petrol), see e.g. [2], [3].
2.2.1 The basics
A diesel engine, or compression-ignition engine as it is sometimes called, converts air and fuel through combustion into torque and emissions.
Figure 2.2. A side view of one cylinder in a diesel engine.
Figure 2.2 shows the essential components in the engine. The air is brought
into the engine through a filter to the intake manifold. There the air is guided
into the different cylinders. In the cylinder the air is compressed and when the
fuel is injected the mixture self ignites. The combustion results in a pressure that
generates a force on the piston. This in turn is transformed into torque on the
crankshaft that makes it rotate. Another result of the combustion is exhaust gases
that go through the exhaust manifold, the muffler and out into the air.
In this thesis the measured intake manifold pressure and temperature are called
boost pressure and boost temperature respectively.
2.2.2 Four-stroke engine
The engine operates in four strokes (see figure 2.3):
Figure 2.3. The four strokes in an internal combustion engine.
1. Intake stroke. (The inlet valve is open and the exhaust valve is closed) The
air in the intake manifold is sucked into the cylinder while the piston moves
downwards. In engines equipped with a turbo charger, see section 2.2.3, air
will be pressed into the cylinder.
2. Compression stroke. (Both valves are closed) When the piston moves upwards the air is compressed. The compression causes the temperature to rise.
Fuel is injected when the piston is close to the top position and the high
temperature causes the air-fuel mixture to ignite.
3. Expansion stroke. (Both valves are closed) The combustion increases the
pressure in the cylinder and pushes the piston downwards, which in turn
rotates the crankshaft. This is the stroke where work is generated.
4. Exhaust stroke. (The exhaust valve is open and the inlet valve is closed)
When the piston moves upwards again it pushes the exhaust gases out into
the exhaust manifold.
After the exhaust stroke it all starts over with the intake stroke again. As a result
of this, each cylinder produces work during one stroke and consumes work (friction,
heat, etc.) during three strokes.
2.2.3 Turbo charger
The amount of work, and in the end the speed, generated by the engine depends on the amount of fuel injected, but the relative mass of fuel and oxygen is also important. For the fuel to be able to burn there must be enough oxygen. From the intake manifold pressure the available amount of oxygen is calculated and, in turn, together with the wanted work, the amount of fuel to inject is calculated. So, a manifold pressure that is not high enough might lead to lower performance.
Figure 2.4. A schematic picture of a diesel engine.
To get more air, and with that more oxygen, into the cylinders, almost all large
diesel engines of today use a turbo charger. This will increase the performance of
the engine as more fuel can be injected. The turbo charger uses the heat energy
in the exhaust gases to rotate a turbine. The turbine is connected to a compressor
(see figure 2.4) that pushes air from the outside into the cylinders.
One effect of the turbo charger is that the compressor increases the temperature of the air, and warmer air contains less oxygen per volume. An intercooler can be used to increase the density of oxygen and thereby increase the performance of the engine. After the intercooler the air has (almost) the same temperature as before going through the compressor, but at a much higher pressure. From the ideal gas law, d = m/V = p/RT, we get that the density has increased. This way more air mass can be brought into the same cylinder volume, using only energy that would otherwise be thrown away.
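As a worked example of this ideal gas law argument, the sketch below compares the charge-air density with and without an intercooler. All numbers are assumed round values for illustration, not measurements from the thesis.

    # All numbers below are assumed, for illustration only.
    R = 287.0          # specific gas constant for air [J/(kg K)]
    p_boost = 2.5e5    # charge pressure after the compressor [Pa]
    T_hot = 420.0      # temperature after the compressor [K]
    T_cooled = 300.0   # temperature after the intercooler [K]

    d_hot = p_boost / (R * T_hot)        # d = p/(R T), from the ideal gas law
    d_cooled = p_boost / (R * T_cooled)  # same pressure, lower temperature
    print(f"density without intercooler: {d_hot:.2f} kg/m^3")    # ~2.07
    print(f"density with intercooler:    {d_cooled:.2f} kg/m^3")  # ~2.90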
2.2.4 What happens when the gas pedal is pushed?
When the driver pushes the pedal, the engine produces more torque. Simplified, the process looks like:
1. The driver pushes the pedal.
2. More fuel is injected into the cylinders.
3. More fuel yields a larger explosion, which leads to more heat and in turn more pressure in the cylinder (ideal gas law).
4. The increased pressure pushes the piston harder, which leads to higher torque. But it does not stop here; after a little while the turbo kicks in.
5. The increased amount of exhaust gases together with the increased temperature accelerate the turbine.
6. The increased speed of the turbine makes the compressor push more air into the intake manifold and the pressure rises.
7. This leads to more air in the cylinders.
8. With more air in the cylinders, more fuel can be used in the combustion which, in turn, leads to an even larger explosion and even more torque.
The final four steps are repeated until a steady state is achieved. This is a rather slow process; it takes a second or two before steady state is reached.
Chapter 3
The Vindax® Development System
In this chapter the VDS by Axeon Limited is introduced with a feasibility study. To
be able to visualize and easily understand the way the VDS works, low dimensional
inputs are used to create illustrative examples. To conclude the chapter, the VDS is compared with other systems trying to solve the same problems.
In this chapter the variables y and u denote the output signal and the input
signal respectively.
3.1 Introduction
The VDS deploys a SOM algorithm in hardware through a software development system. The hardware used for this thesis consists of a PCI version of the VP. It consists of 256 RISC processors, each representing a neuron in a SOM. The architecture allows the computations to be done in parallel, which decreases the working time substantially. The dimension of the input vectors, i.e. the memory depth, is currently limited to 16.
Configuration of the processor is quite flexible. It can be divided into up to four separate networks that can work in parallel, individually or in other types of hierarchies. There is also a possibility to swap the memory (during operation), enabling more than one full-size network to run in the same processor.
The software is simple to use and provides good possibilities to handle and visualize data. It also includes some data pre-processing tools as well as output functions. In the software system, the VP is used with network sizes of 256 neurons and smaller. To test larger networks, a software emulation mode is available, where the maximum size is 1024 neurons and the memory depth is 1024. The emulation mode does not use the VP and therefore increases the calculation times considerably.
The VDS is able to solve classification problems, described in section 1.2, as shown in section 3.3. Rewriting the conditioning and function estimation problems enables a solution to these as well; this is described in sections 3.2 and 3.4.
Development of applications using the VDS involves four steps:
1. Data collection and pre-processing
2. Training
3. Labeling
4. Classification
Data collection and pre-processing are described in chapter 4. During the training step the network is organized using the SOM algorithm presented in section 2.1.
The labeling and classification steps are described below.
3.1.1 Labeling step
The training has formed an approximation of the input signal distribution, and this approximation should now be associated with output signals, i.e. be labeled.
The labeling step uses the same input data as the training step, together with measured output data. It requires the correct output value to be known for each input. The neuron responding to a particular input sample should have an output value matching the measured output value. Presenting the data to the network makes it possible to assign values to each neuron.
As there are more input samples than neurons, each neuron will have many output value candidates. If these candidates are not identical, which is most often the case, some kind of strategy is needed for choosing which one to use as the label. Either it is chosen manually for each neuron, or one of the automatic methods of the VDS is used. There are different kinds of methods suitable for different kinds of problems; they are discussed in sections 3.2-3.4 in their contexts.
3.1.2 Classification step
If correctly labeled, the network can be used for mapping new inputs using the
classification step. This is done in a very straightforward manner and can be seen
as function evaluation. For each new input a winning neuron is selected as described
in section 2.1. The label of the winner is the output of the network.
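The following Python sketch illustrates the labeling and classification steps for a trained map m (the reference vectors from section 2.1). The mean-value resolution shown here is the strategy used for function estimation in section 3.4; the helper names are our own, not VDS functions.

    import numpy as np

    def label_som(m, inputs, outputs):
        """Labeling step: per neuron, resolve the label candidates (here: mean)."""
        outputs = np.asarray(outputs)
        # Winner index for every training sample (1-norm, eq. (2.1)).
        wins = np.array([np.abs(m - u).sum(axis=1).argmin() for u in inputs])
        labels = np.full(len(m), np.nan)
        for i in range(len(m)):
            candidates = outputs[wins == i]     # measured outputs this neuron won
            if candidates.size:
                labels[i] = candidates.mean()   # mean-value strategy (section 3.4)
        return labels

    def classify(m, labels, u):
        """Classification step: the label of the winning neuron is the output."""
        return labels[np.abs(m - u).sum(axis=1).argmin()]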
3.2 Conditioning of input signals
To solve a conditioning problem, according to section 1.2, the distance measurement has to be involved. As discussed in section 2.1, the distance from the input sample to each neuron is calculated when a winner is selected. It is this distance that can be used for conditioning of signals.
The idea is to save the maximum distance between the neurons and the input signals that occurs when the network is presented with input signals from a fault-free system. An error has probably occurred if the maximum distance is exceeded when new inputs are presented to the network.
To do this, the network is first trained on data from a system with no faults. After that, the same data are fed to the network again and the distance between the input signal and the winning neuron is saved for each sample. The maximum distance, for each neuron, is used as the label.
During the classification step, the label value of the winning neuron is subtracted from the distance between this neuron and the current sample, giving the difference between the distance of the current sample and the maximum distance that occurred during training. Previously unseen data have been presented to the VP when the difference is larger than zero. Assuming that the network has been trained on representative (see section 4.1) data, this implies that an error has probably occurred.
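A minimal sketch of this conditioning scheme, continuing the notation from the previous sketches (m is the trained map, 1-norm distances as in the VDS; function names are our own):

    import numpy as np

    def label_max_distance(m, fault_free_inputs):
        """Per neuron, store the largest winner distance seen on fault-free data."""
        labels = np.zeros(len(m))
        for u in fault_free_inputs:
            d = np.abs(m - u).sum(axis=1)   # 1-norm distances to all neurons
            c = d.argmin()                  # winning neuron
            labels[c] = max(labels[c], d[c])
        return labels

    def condition(m, labels, u):
        """Positive result: u lies farther from the map than any fault-free
        training sample did, so an error has probably occurred."""
        d = np.abs(m - u).sum(axis=1)
        c = d.argmin()
        return d[c] - labels[c]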
Applying this to the real problem is as simple as applying it to a constructed one; see chapter 5 for a real-world example that illustrates the use of this technique.
3.3 Classification of cluster data

To explore the capabilities of VPs to solve classification problems (see section 1.2), an example with clusters in three dimensions is made.
3.3.1 Introduction
Three Gaussian clusters are created with mean values at three different points. They are labeled blue, red and black. Different variances are used to create two sets of input data, shown in figures 3.1-3.2. Printed in black and white it is hard to see which cluster the data points belong to; the purpose of the figures is, however, to show how the clusters are distributed.
Figure 3.1. First cluster setup with every tenth sample of the data plotted from different
angles.
The task is to take an input signal, composed of the three coordinates of each point, and classify it as blue, red or black. To do this, the VP is first trained (using all 256 neurons). This creates an approximation of the input signal distribution that, hopefully, contains three regions. This can be visualized by looking at the distance between neurons using the visualization functionality in the VDS. This is very useful when the input signal has more than three dimensions, as it is then not possible to visualize the actual data itself. In this way it is possible to get an idea of how the neurons are distributed over the input space also when working in higher dimensions. Then the SOM is labeled to be able to give some output; this is the part where the settings for the classification are made.
Figure 3.2. Second cluster setup with every tenth sample of the data plotted from
different angles.
When labeling the SOM, there will be conflicts when neurons are associated with more than one input label. For example, if a neuron is associated with inputs labeled red 60 times and black 40 times, the neuron will be named multi-classified after the labeling process in the VDS. This situation has to be resolved and there are different methods for this.
In the VDS, there are six different multi-classification resolving methods:
1. Most frequent
2. Activation frequency threshold
3. Activation count threshold
4. Activation frequency range threshold
5. Activation count range threshold
6. Manually
The first two are most appropriate for the situations occurring in this thesis, which is why the other methods are not handled here.¹

¹ The third method is similar to the second one but harder to use, as the number of activations depends on the amount of data. There is no reason to use a range (methods 4-5) instead of a threshold, as the neurons with a high number of activations definitely should be labeled. Finally, the manual method is very time-consuming and does not really provide an advantage compared to the first two methods.
Using the first method makes the VDS assign each neuron the label that occurred most frequently. In the example above, this would mean that the neuron is classified as red.
With the second option it is possible to affect the accuracy of the classification. There will be a trade-off between the number of unclassified inputs and the accuracy of the ones classified. If the activation frequency threshold is set to a high value, say 90 percent, the classification will be accurate but many inputs may be classified as unknown. In the example above, this would mean that the neuron is not classified; the threshold would have to be lowered to 60 percent to classify the neuron as red.
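The logic of these two resolving methods can be sketched in a few lines of Python. The function below is illustrative, not the VDS implementation, and reproduces the red/black example from the text.

    def resolve_label(counts, threshold=None):
        """counts: activations per label for one neuron, e.g. {'red': 60, 'black': 40}.
        threshold=None gives 'most frequent' resolution; otherwise the top label
        must reach the given activation frequency, else the neuron stays unlabeled."""
        total = sum(counts.values())
        label, hits = max(counts.items(), key=lambda kv: kv[1])
        if threshold is None or hits / total >= threshold:
            return label
        return None

    print(resolve_label({"red": 60, "black": 40}))                 # red
    print(resolve_label({"red": 60, "black": 40}, threshold=0.9))  # None (unclassified)
    print(resolve_label({"red": 60, "black": 40}, threshold=0.6))  # red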
Figure 3.3. Possible label regions for input classification when using data from figure 3.2
as input.
This is visualized in figure 3.3. In between the clusters there are inputs that
belong to one cluster but are situated closer to another cluster center. For example,
there are many inputs in figure 3.2 that belong to the red cluster but are closer
to the black cluster center (and vice versa). Depending on how high the frequency
threshold is set for the labeling, these inputs will be classified as black or not
classified at all.
With a high threshold there will be many neurons without a label. This causes the unclassified region to be big and the accuracy of those classified to be high. The drawback of this is that many inputs will not be classified at all; e.g. trying to distinguish between an engine with no faults and an engine with an air leakage will return many answers saying nothing.
3.3.2 Results and discussion
The results from using the VDS to perform this task are summarized in tables 3.1-3.4.
                    Classified   Classified   Classified   Not
                    as blue      as red       as black     classified
True value blue     95.74%       0.00%        0.06%        4.20%
True value red      0.00%        95.72%       0.06%        4.16%
True value black    0.00%        0.00%        98.41%       1.59%

Table 3.1. Classification on first cluster (figure 3.1) data with no multi-classification resolution.

                    Classified   Classified   Classified
                    as blue      as red       as black
True value blue     99.75%       0.00%        0.26%
True value red      0.13%        99.74%       0.13%
True value black    0.07%        0.50%        99.42%

Table 3.2. Classification on first cluster (figure 3.1) data using most frequent as multi-classification resolution.

                    Classified   Classified   Classified   Not
                    as blue      as red       as black     classified
True value blue     77.72%       0.13%        0.19%        21.96%
True value red      0.07%        84.13%       0.13%        15.68%
True value black    0.28%        0.28%        70.25%       29.19%

Table 3.3. Classification on second cluster (figure 3.2) data with no multi-classification resolution.

                    Classified   Classified   Classified
                    as blue      as red       as black
True value blue     96.76%       1.10%        2.14%
True value red      0.59%        96.80%       2.61%
True value black    2.46%        1.12%        96.42%

Table 3.4. Classification on second cluster (figure 3.2) data using most frequent as multi-classification resolution.
Using no multi-classification resolution, i.e. a frequency threshold of 100 percent, gives high accuracy but many unclassified samples. Using most frequent resolution instead gives a high classification rate but with less accuracy. Which one to use depends on the intended application.
These two are extremes: no multi-classification resolution leaves many neurons without a label, whereas the most frequent resolution method labels all neurons, except the ones with equally many activations from two or more labels. The frequency threshold can be lowered to get a result somewhere in between, i.e. slightly lower accuracy but also fewer unclassified signals.
3.4 Function estimation
This section investigates the capabilities of the VDS to estimate functions (see section 1.2).
3.4.1 Introduction
First a simple linear system is created:

y1[t] = 600 · u1[t] + 750 · u2[t],    (3.1)

with uniformly distributed random inputs u1, u2 ∈ [0, 2]. These are used to train the VP.
This is a very simple problem: parameter estimation. It is questionable whether a neural network should be used to solve it, as simpler methods will probably produce better results. It is, however, suitable for illustrating the characteristics of the VDS.
The labeling of the VP is done by, for each neuron, choosing the mean value of all label candidates. This is the most common method to select labels when estimating functions. It is appropriate as there will be a very large number of labels, i.e. measured outputs, and choosing the mean value gives a good approximation.
Three network sizes are used to estimate the system:
- 64 neurons
- 256 neurons, the physical limit of the VP
- 1024 neurons, the limit of the software emulation mode
Increasing the number of neurons also increases the number of output values (one output value per neuron). Used on the same problem, a larger network should give a higher output resolution and hence reduce the errors.
As a second step, noise is introduced to the system. The network is trained and labeled using noisy signals. The classification can then be compared with the true output values to see whether the network performs worse or whether the method is robust.
The noise, band-limited white noise, is added to the input and output signals. Different amplitudes are used, as the input signals are much weaker than the output. The amplitudes are chosen according to table 3.5.
Signal   Noise amplitude
u1       5 · 10^-4
u2       5 · 10^-4
y        100

Table 3.5. Noise amplitudes.
Three non-linear systems are compared to the linear system. The first two are:

y2[t] = 1200 · atan(u1[t]) + |18 · u2[t]|²    (3.2)
y3[t] = 1200 · atan(u1[t]) + 100 · e^(2·√(u2[t]))    (3.3)

The third system is identical to system (3.2) except for a backlash, with a dead-band width of 200 and an initial output of 1000², that is applied to the output signal. All systems have (different) uniformly distributed random inputs, u1, u2 ∈ [0, 2].
3.4.2 Results and discussion
The results are summarized in table 3.6 with Root-Mean-Square Errors, RMSEs, and ABSolute Errors, ABSEs, for the estimations. See appendix A for details of RMSE and ABSE.
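Appendix A is not reproduced in this transcript; assuming the standard definitions, the two measures can be sketched as:

    import numpy as np

    def rmse(y_true, y_est):
        return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_est)) ** 2)))

    def abse(y_true, y_est):
        # Absolute errors per sample; the tables report their mean and max.
        return np.abs(np.asarray(y_true) - np.asarray(y_est))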
System                          RMSE    mean ABSE   max ABSE
System (3.1)
  64 neurons                    70.9    59.2        187.8
  256 neurons                   35.3    29.6        93.2
  1024 neurons                  18.5    15.6        51.5
  256 neurons (noisy signal)    254.0   199.2       1069
System (3.2)
  without backlash              39.8    31.2        164.5
  with backlash                 83.4    68.4        283.1
System (3.3)                    41.2    33.3        148.4

Table 3.6. Statistics after classification with the VDS on the four systems examined in this section. Systems (3.2) and (3.3) are all with 256-neuron networks and without noise.
With a SOM of 256 neurons, the result shown in figure 3.4 is achieved. The estimated output values in the figure are gathered at a number of levels, i.e. the output from the VP is discrete. It is hard to see in the figure, but there are actually 256 different levels, corresponding to the number of neurons in the SOM.
Looking at how the SOM is distributed over the input space, the output levels are denser towards the middle. When trained, the SOM approximates the probability distribution of the input signal. In this example two uniformly distributed signals are used as input. Together these inputs have a higher probability for values in the middle range of the space; extreme values have less probability, hence the look of figure 3.4.³
² See the Matlab®/Simulink® help for details.
³ A classical example in probability theory is throwing dice. Each throw is uniformly distributed among the results. Taking two throws in a row, the probability of getting an outcome of 6 is higher than the probability of getting, for example, 2. That is very similar to the problem approached here.
Figure 3.4. Correlation plot with estimated values compared to the true values of system (3.1) using a network of 256 neurons. The straight line shows the ideal case where
the network has infinitely many neurons and is optimally trained and labeled.
Figure 3.5. Correlation plot with estimated values compared to the true values using a
network of 1024 neurons applied on system (3.1). The straight line shows the ideal case
where the network has infinitely many neurons and is optimally trained and labeled.
Comparing figure 3.4 with figure 3.5, the horizontal lines are closer together and also shorter when the SOM is larger. This indicates smaller errors, and the values in table 3.6 confirm this.
The horizontal lines in figure 3.4 are approximately twice the length of the lines in figure 3.5. The RMSE and the mean ABSE are also twice as big with a 256-neuron SOM, see table 3.6.
The results when using only 64 neurons show that these values are, approximately, twice as large as for 256 neurons. Altogether, the results indicate that, for this problem, the error size is reduced with the square root of the network size. This is probably problem dependent and more tests are needed to reveal the real relationship.
Figure 3.6. Correlation plot with estimated values compared to the true values using a
SOM of 256 neurons with noisy signals applied on system (3.1). The straight line shows
the ideal case where the SOM has infinitely many neurons and is optimally trained and
labeled.
As noise is introduced to the problem, the SOM performs a lot worse. Figure 3.6 shows the correlation plot; the estimation error is much bigger now. Compared to the noise-free case, the error made by the SOM is about six times higher. The horizontal lines are distributed the same way as in figure 3.4, but they are longer. This indicates that more inputs are misclassified than earlier, increasing the error. This agrees with the values in table 3.6.
Looking into non-linearities, the results from systems (3.2) and (3.3) do not differ much from the linear case. This can also be seen when comparing the topmost graph in figure 3.7 with figure 3.4.
Figure 3.7. Correlation plot between true and estimated values for system (3.2). Without
backlash (top figure) and with backlash (bottom figure). The bottom figure definitely
shows worse results than the top figure. The straight line shows the ideal case where the
network has infinitely many neurons and is optimally trained and labeled.
There is no big difference between a linear and a non-linear system as long as the system is static. The reason for this is that the SOM performs a mapping.
The addition of the backlash to system (3.2) decreases the ability of the VDS to model the system, as can be seen in table 3.6 and when comparing the two plots in figure 3.7. A SOM cannot handle a backlash: the backlash output depends on the direction from which the backlash area is entered. The SOM system is static, so it only looks at current signal values, and therefore it does not have the information required to handle dynamic systems, i.e. systems that depend on previous values. See section 3.5 for more about dynamic systems and SOMs.
3.5 Handling dynamic systems
In this section, a method to estimate dynamic systems with SOMs is described. Although a function estimation problem is used as the example, the method applies to conditioning and classification problems as well.
In [4], the estimation of a dynamic system is done automatically by a modification of the SOM algorithm. In short, the output signal is used as an extra input signal during the training step to create a short-term memory mechanism. The technique is called Vector-Quantized Temporal Associative Memory. This solution is not possible in the VDS without reworking the core system.
Another, manual, way of handling dynamic data is adopted instead. Lagged input and output signals are used as additional input signals to incorporate the information needed to handle dynamics.
3.5.1 Introduction
The SOM can be seen as a function estimator, estimating the function f(·) in

y(t) = f(u(t)),

i.e. the output at time t only depends on input signals at time t. By adding historical input and output values to the input vector, the method can estimate functions of the form

y(t) = f(u(t), u(t − 1), ..., u(t − k), y(t − 1), ..., y(t − n)).

In this way the dynamic function becomes, in theory, static when enough historical values are used. Since the SOM method has no problem with non-linearities, this means that, in theory, given enough historical signal values, all causal, identifiable systems can be estimated with this method. The only limitation is the discrete output: there can only be as many output values as there are neurons.
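A sketch of how such lagged input vectors can be assembled (the helper name and layout are our own; in practice the VDS memory depth of 16 bounds how many lags fit):

    import numpy as np

    def lagged_regressors(u, y, k=1, n=0):
        """Row t is [u(t), u(t-1), ..., u(t-k), y(t-1), ..., y(t-n)]."""
        start = max(k, n)
        rows = []
        for t in range(start, len(u)):
            row = [u[t - j] for j in range(k + 1)]
            row += [y[t - j] for j in range(1, n + 1)]
            rows.append(row)
        return np.asarray(rows)

    # For the system y[t] = u[t - 1] in eq. (3.5), training on
    # lagged_regressors(u, y, k=1) exposes u[t - 1] to the SOM.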
This theory is illustrated using the discrete system

y[t] = u[t],    (3.4)

using both Gaussian and uniformly distributed random signals as input u. The VP is trained on the signal u[t] and labeled with y[t]. This example also shows the differences between using different kinds of distributions.
When this is verified, a time shift is introduced to the system:

y[t] = u[t − 1].    (3.5)

Exactly as before, the processor is trained on the signal u[t] and labeled with y[t]. It should be impossible to approximate the system because of the random input data: the VP is not able to guess the value of y[t] knowing only u[t] and not u[t − 1], as there is no correlation between them.
Then the VP is trained on the signal u[t − 1] and labeled with y[t]. This problem is identical to the first one, just a change of variable, and there should be no problem estimating the system with the same accuracy as the system in equation (3.4).
A situation where the exact dynamics of the system are known will not likely
appear in practice. Therefore a test is performed to see what happens if the VP
is over-fed with information. The processor is trained with both u[t] and historical
values of u[t] as input.
3.5.2 Results and discussion
Table 3.7 summarizes these results.
System             Input signals         max ABSE   mean ABSE   RMSE
y[t] = u[t]        u[t]                  2.6        0.4         0.5
                   u[t] Gaussian         38.1       0.4         0.8
y[t] = u[t − 1]    u[t]                  134.7      64.0        73.9
                   u[t − 1]              2.0        0.4         0.5
                   u[t], u[t − 1]        12.7       4.0         4.8
                   u[t], ..., u[t − 3]   89.3       15.6        19.6

Table 3.7. Results of function estimation using the VDS on simple time-lagging examples.
Starting with the system in equation (3.4), the results show that the VP has
no problem doing this estimation using a uniformly distributed random signal as
input. The correlation plot in figure 3.8 visualizes how well the system is estimated
and the values in table 3.7 are very good.
Figure 3.8. Correlation plot of correct vs estimated values using a uniformly distributed
input signal.
Comparing figure 3.9 with figure 3.8 shows the difference between using a Gaussian and a uniformly distributed signal as input. The estimation at extreme values is worse with a Gaussian signal. The reason is that the SOM algorithm approximates the distribution of the training data. Using a uniformly distributed input signal spreads the neurons evenly over the input space, while using Gaussian input places more neurons in the center of the input space and fewer neurons at the perimeters. Fewer neurons in an area give a larger error when discretizing the input signal, and most often result in a larger error in the output signal, as can be seen in figure 3.9.
Figure 3.9. Correlation plot of correct vs estimated values using a Gaussian input signal.
Trying to estimate the system in equation (3.5), the result again confirms the expectations. The correlation between the correct and the estimated values of y is ∼ 0.0043; the system could not be estimated. As proposed, this is solved by lagging the input signal. Now the accuracy is almost identical to the first estimation; the small difference is due to the randomness of the input signals.
The VP is clearly confused when the dynamics are not exactly matched, as can be seen when comparing figure 3.10 with figure 3.8. Here both u[t] and u[t − 1] are used as inputs. The result shows that it is possible to estimate system (3.5), although not as well as when using only u[t − 1] as input. This is because the SOM method uses unsupervised learning: it does not know, during training, that the extra dimension in the input space is useless. This results in a lower resolution. Using more historical signals as input gives even worse results. This shows the importance of having good knowledge of time shifts and other dynamics in the system.
Figure 3.10. Correlation plot of correct vs estimated values when estimating y[t] =
u[t − 1] with the VP. The processor is given u[t] and u[t − 1] as inputs.
3.6 Comparison with other estimation and classification methods
In this section a comparison is made between the VDS and some other simple
methods to get an idea of the performance of the VDS. Classification and function
estimation are the problems chosen as suitable for a comparison.
3.6.1 Cluster classification through minimum distance estimation
The classification of cluster data with the VP can be compared with a simple algorithm implemented in Matlab®. The idea is to estimate the cluster centres by taking an average over the input training data. These averaged centres then act as reference points to which a distance is calculated when an input data point is to be classified. The algorithm is simply⁴ (a sketch of these steps follows the list):
1. Estimate each cluster centre by taking the mean value of all input data that belong to each cluster
2. Calculate the distance from each new input data point to all the cluster centres
3. Classify the input as belonging to the closest cluster centre
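A sketch of the algorithm in Python rather than Matlab (the helper names are our own):

    import numpy as np

    def fit_centres(X, labels):
        """Step 1: each cluster centre is the mean of its training points."""
        labels = np.asarray(labels)
        return {c: X[labels == c].mean(axis=0) for c in np.unique(labels)}

    def classify_point(centres, x):
        """Steps 2-3: distance to every centre; pick the closest."""
        return min(centres, key=lambda c: np.linalg.norm(x - centres[c]))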
                    Classified   Classified   Classified
                    as blue      as red       as black
True value blue     97.10%       0.86%        2.04%
True value red      0.50%        98.18%       1.32%
True value black    1.84%        1.34%        96.82%

Table 3.8. Classification results using the algorithm described.
This algorithm is tested on the second set of cluster data (figure 3.2). The results are presented in table 3.8 and are compared with table 3.2, where most frequent multi-classification resolution is used. The performance is slightly better in the VDS approach.
This shows that the VDS has sufficient capabilities to classify data. The example was a simple one, which is why the comparison algorithm was quite easy to design. If, however, noisy signals are used or the clusters are not Gaussian, it may be much harder to find a suitable algorithm, mainly because the cluster centres are difficult to estimate. The VDS, however, is used the same way regardless, which gives it a strong advantage in simplicity.
Figure 3.11. Correlation plot with estimated values compared to true values using least
squares estimation.
3.6.2 Function estimation through polynomial regression
The estimation of the system in equation (3.1) with the VDS in section 3.4 is put into perspective with a least squares parameter estimation in Matlab®.
With no noise added to inputs and outputs, the estimation problem is deterministic. Therefore only the case with noise is interesting for a comparison. With knowledge of the system, the estimation problem is formulated as:

y = Θ u    (3.6)

where Θ = [θ1 θ2] are the parameters to be estimated and u = [u1 u2]ᵀ the input signals. This is an overdetermined equation system that is solved with least squares minimization.
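Sketched with numpy in place of Matlab, and with plain white noise standing in for the band-limited noise used in the thesis (the regenerated data and noise handling are assumptions, so the numbers will not reproduce table 3.9 exactly):

    import numpy as np

    rng = np.random.default_rng(0)
    u = rng.uniform(0, 2, size=(10000, 2))     # regenerated inputs u1, u2 (assumed)
    y = u @ np.array([600.0, 750.0])           # system (3.1)
    # Noise amplitudes from table 3.5, used here as plain white-noise levels.
    u_noisy = u + rng.normal(scale=5e-4, size=u.shape)
    y_noisy = y + rng.normal(scale=100.0, size=y.shape)

    theta, *_ = np.linalg.lstsq(u_noisy, y_noisy, rcond=None)
    print(theta)   # approximately [600, 750]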
Using exactly the same data set as in the noise example in section 3.4, but using Matlab®'s polynomial regression, the result shown in figure 3.11 is achieved. Comparing this to figure 3.6, the polynomial regression looks worse, and table 3.9 confirms this. The VDS performs better than the polynomial regression.
Method                  Max ABSE   Average ABSE   RMSE
VDS                     1078       189.3          254.0
Polynomial regression   1371       237.1          297.3

Table 3.9. Errors when using polynomial regression compared to the results from the VDS.
⁴ The first step of this algorithm corresponds to the training and labeling steps in the VDS, and steps 2-3 to the classification step.
Chapter 4
Data collection and pre-processing

Collecting and pre-processing data are two subjects that, due to their complexity, could each be the subject of a thesis of their own. Therefore, this report does not treat them in depth; only relevant issues are discussed. It is, however, important to stress that data collection and pre-processing are two key factors for success in developing applications with SOMs.
4.1 Data collection

The entirely data-driven nature of SOMs, and therefore of the VDS, makes data collection extremely important. The training of the SOM is affected by the way data are collected. The
• amount of data
• sampling frequency
• quality of data
• measurement method
are issues that have to be considered.
4.1.1 Amount of data

A rule of thumb is that during the training sequence the VDS should be presented with at least 500 times as many data points as the size of the SOM¹. This is to ensure that the network approximates the probability density function of the input data. For a standard-size VP with 256 neurons, this means 128000 samples.
When not enough data are available, training data can be presented to the SOM in a cyclic way until the required number of data points has been processed². However, doing this may cause information to be lost, e.g. input data with gaps are not representative.
In addition, enough data for both training and validation have to be available.

¹ There is no risk of over-training (over-fitting); the SOM method has no such problems.
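A sketch of how such a training stream could be produced, combining the 500-times rule of thumb with cyclic, randomized presentation (see footnote 2; the helper is our own, not a VDS function):

    import numpy as np

    def training_stream(data, som_size=256, factor=500, seed=0):
        """Yield at least factor * som_size samples, cycling through the data in
        randomized order when the set is smaller than that."""
        rng = np.random.default_rng(seed)
        needed = factor * som_size             # 128000 for a 256-neuron VP
        count = 0
        while count < needed:
            for i in rng.permutation(len(data)):
                yield data[i]
                count += 1
                if count >= needed:
                    return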
4.1.2 Sampling frequency

The sampling frequency has to be considered when dealing with dynamics. As was shown in section 3.5, dynamics can be handled by lagging input and/or output signals. How many and which old values should be used depends on the dynamics of the system combined with the sampling frequency. With a high sampling frequency, many old values are needed to cover the dynamics of the system. On the other hand, a low sampling frequency might cause dynamic information to be lost.
In this thesis a sampling frequency of 10 Hz is used. This is the sampling frequency available in the test cell equipment used.
4.1.3 Quality of data

The quality of data is the most important factor for getting good results. There are many aspects of this, but the two most important are that data have to be representative and correctly distributed. This is, again, due to the fact that the SOM algorithm is data driven and cannot extrapolate or interpolate. The VDS will most likely generate unreliable output if variables are not representative and correctly distributed.
Representative data contain values from the entire operating range. This means that the variables have varied from their maximum to their minimum values during the measurement. All possible combinations of variables should also be available to ensure representativeness.
Data from variables that are not representative should either be omitted or used for scaling/normalization. E.g. the ambient air pressure variable is hard to measure over its full range; therefore it is better to use it to scale other pressure variables that depend upon it.
The distributions of the input variables depend on the test cycle used when
collecting data. Therefore the test cycle has to produce data that are suitable for
the intended application. There are two different categories that are used: uniform
and real life distributions.
Uniform distribution The distribution should be uniform when estimating functions and doing condition monitoring. In both these cases it is important that the application has equal performance over the entire input space, which is why input data should be uniformly distributed.
Real life distribution A classification problem does not require equal performance over the entire input space. It is sufficient, and often advantageous, to make the classification in areas where the input signal resides frequently in a real situation.
Although these are the distributions wanted for all input signals, often not all of them can be controlled individually, e.g. the boost temperature depends on the torque and the engine speed and cannot be controlled manually. This is something that needs to be considered from case to case when constructing the cycle used for gathering data.
4.1.4 Measurement method
Data are collected in an engine test cell where an engine is run in a test bench. In the bench an electromagnetic braking device is connected to the engine. This device brakes the engine to the desired speed, so that different operating conditions can be obtained. Many variables can be measured, but the ones in appendix B are the ones measured for this thesis.
As there are deviations between engine individuals, only one engine is used. In the laboratory, the engine is controlled with test cycles/schemes to simulate different driving styles and routes. It is important that the cycle generates appropriate data that suit the purpose, as discussed in section 4.1.3. For this thesis, two different test cycles are chosen: the Dynamic Cycle, DC, and the European Transient Cycle, ETC. A DC provides the possibility to control variables using different regulators (here the engine speed and the torque are used), whereas the ETC is a standardized cycle.
Aiming for data from the entire dynamical range of the engine, a randomized DC is created. The cycle starts with a torque of 600 Nm and an engine speed of 1350 RPM and then requests new torques and engine speeds twice per second. These values are randomly set under the constraint of preventing the engine from overloading3. In addition, the speed is not allowed to vary by more than 100 RPM/second and the torque by more than 1000 Nm/second. In this way, the cycle will take the engine through a large number of operating conditions.
Two different change rates are used in this thesis. The first sets new desired values two times per second and runs for 2000 seconds. With a sampling frequency of 10 Hz, 20000 samples are collected per cycle. The other sets new desired values five times per 10 seconds and runs for 20000 seconds. The same sampling frequency is used, hence 200000 samples are collected.
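The following Python sketch shows one way such a randomized cycle could be generated. The slew-rate figures are the ones above; the operating-range limits and the starting values are illustrative assumptions, not the actual engine envelope.

    import numpy as np

    def random_dc(duration_s, rate_hz=2.0, speed0=1350.0, torque0=600.0,
                  speed_lim=(600.0, 2000.0), torque_lim=(0.0, 2000.0),
                  dspeed=100.0, dtorque=1000.0, rng=None):
        """Randomized DC set points with slew rates limited to
        dspeed [RPM/s] and dtorque [Nm/s]."""
        rng = np.random.default_rng() if rng is None else rng
        n = int(duration_s * rate_hz)
        speeds, torques = [speed0], [torque0]
        for _ in range(n - 1):
            s = rng.uniform(*speed_lim)
            t = rng.uniform(*torque_lim)
            # clip each step so the rate constraints hold
            speeds.append(np.clip(s, speeds[-1] - dspeed / rate_hz,
                                     speeds[-1] + dspeed / rate_hz))
            torques.append(np.clip(t, torques[-1] - dtorque / rate_hz,
                                      torques[-1] + dtorque / rate_hz))
        return np.array(speeds), np.array(torques)

    speed_sp, torque_sp = random_dc(duration_s=2000)   # the faster cycle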
The ETC is a standardized cycle that contains three phases:
1. city driving - many idle points, red lights, and not very high speed
2. country road driving - a small road with a lot of curves
3. high-way driving - rather continuous high speed driving

3 An overload situation can occur if the engine is running at a very high speed. If the demanded speed drops from a high level to a much lower level in a short time, for example going from 2500 RPM to 1000 RPM in one second, the braking of the engine will cause the torque to peak.
The data from this cycle are not as uniformly distributed as the DC but contain other types of information about the dynamical behavior of the engine. Sequences such as taking the engine from idle to full load are captured in the ETC but not in the DC.
The ETC takes approximately 1800 seconds. Again 10 Hz is used as sampling frequency, so approximately 18000 samples are collected.
Measurements are performed in three different cases, listed in table 4.1. They are performed on different occasions (different days), which is why the quality of the collected data has to be examined closely. Both the DC and the ETC are used, thus six sets of measurements are collected.
Case  Description
1.    Fault free engine
2.    Engine with an 11 mm leakage on the air inlet pipe (after the intercooler, see figure 2.2)
3.    Engine with intercooler efficiency reduced to 80%

Table 4.1. Description of the three different measurement cases.
4.2 Pre-processing
The pre-processing of data can be divided into specific and general pre-processing.
The specific pre-processing deals with how to generate proper files for use as input to the VDS. This can differ a lot depending on how the data files are structured. In addition to this, data may have to be filtered, time shifted, differentiated, etc. This is also, more appropriately, called feature extraction and is more a part of the problem solving method.4 When the input files are in order, data are ready to be used as input to the VDS but not to the VP.
The general pre-processing involves file segmenting, statistics generation and
channel splitting where different toolboxes in the VDS can be used as well as other
software. Also, the general pre-processing includes transforming data to a form the
VP handles. The input interface of the VP uses 8-bit unsigned integers. Due to
this the data, usually provided as floats, are scaled to the range 0-255 and then
converted to unsigned integers. It is convenient to use the VDS for these tasks.
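The VDS performs this scaling internally; the sketch below only illustrates the transformation, assuming the global minimum and maximum of the signal are used as scaling limits.

    import numpy as np

    def to_uint8(x, lo=None, hi=None):
        """Scale a float signal to 0-255 and convert to unsigned
        8-bit integers, as required by the VP input interface."""
        lo = np.min(x) if lo is None else lo
        hi = np.max(x) if hi is None else hi
        scaled = (x - lo) / (hi - lo) * 255.0
        return np.clip(np.round(scaled), 0, 255).astype(np.uint8)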
One variable that is handled in the same way for all applications (chapters 5-7) is the ambient air pressure. The mean values and standard deviations in table 4.2 reveal that this variable is not representative. The pressure values collected stem from the ambient air pressure outdoors, which varies from day to day5. Measurements from all different altitudes and weather conditions are not available, and therefore this variable is used to normalize the turbo boost pressure, which is the only other pressure signal used.

4 The feature extraction is handled in chapters 5-7, where the development of each application is discussed.

Case                        1       2       3
Mean value [kPa]            97.3    98.1    99.7
Standard deviation [kPa]    0.0603  0.0448  0.0604

Table 4.2. Ambient air pressure mean values and standard deviations for the three cases, see table 4.1, using the DC.
5 All cases are measured on different days. This will cause the VDS to produce very impressive results if, for example, a distinction is to be made between case one and two: the VDS will only use the ambient pressure when deciding the winner. Such results are not applicable to a real situation where the ambient air pressure varies more.
Chapter 5
Conditioning - Fault detection
In this chapter a fault detection problem is treated. This is done using the conditioning method in the VDS. The goal is to test if Vindax® can be used to detect some faults in a diesel engine.
5.1 Introduction
The functionality used for fault detection is described in section 3.2. This method
is based on the distance measurement between neurons and input data that can be
extracted from the VDS. When a combination of the engine states produces a larger
distance between the winning neuron and the input signal than occurred during
training, an error is detected, or, at least, an unknown situation has occurred.
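As a rough illustration of the principle, the following NumPy sketch flags inputs that lie further from their winning neuron than anything seen during training. How the VDS computes and thresholds this distance internally is not shown here; in particular, using a single global training maximum is our assumption.

    import numpy as np

    def bmu_distance(codebook, x):
        """Distance from sample x to its winning (closest) neuron."""
        return np.min(np.linalg.norm(codebook - x, axis=1))

    def conditioning_differences(codebook, train_data, test_data):
        """Difference between each test sample's BMU distance and the
        largest BMU distance seen during training; values above zero
        indicate an unknown input combination, i.e. a probable fault."""
        d_max = max(bmu_distance(codebook, x) for x in train_data)
        return np.array([bmu_distance(codebook, x) - d_max for x in test_data])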
5.2 Method
Data from a DC and an ETC, collected according to chapter 4, are used. The
system is trained and labeled with 75% of the data from the engine with no faults.
The verification data consist of the remaining 25% of the fault free data plus data
from an engine with an air leakage and an engine with reduced InterCooler, IC,
efficiency.
The following variables are used as input:
• moving average over 10 values of boost pressure
• moving average over 100 values of boost temperature
• moving average over 100 values of fuel value
• moving average over 100 values of engine speed
The smoothing with moving averages is done to handle the dynamics; see section 6.2 for a more thorough discussion. It would also be possible to delay signals to handle the dynamics. However, since different faults show up with different delays and no obvious delays can be detected, that approach is not used.
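A minimal sketch of how these four smoothed inputs could be assembled is given below. The window lengths are the ones listed above, while the trailing-average implementation and the alignment handling are our own choices.

    import numpy as np

    def moving_average(x, n):
        """Trailing moving average over the last n samples."""
        return np.convolve(x, np.ones(n) / n, mode="valid")

    def detection_inputs(boost_p, boost_t, fuel, speed):
        """Stack the four smoothed inputs listed above, aligned on the
        longest (100-sample) window."""
        m = len(boost_p) - 99
        return np.column_stack([
            moving_average(boost_p, 10)[-m:],
            moving_average(boost_t, 100)[-m:],
            moving_average(fuel, 100)[-m:],
            moving_average(speed, 100)[-m:],
        ])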
5.3 Results and discussion
The results from the conditioning are summarized in table 5.1 and visualized in
figure 5.1.
                  %       mean
Healthy engine    0.9     4.3
Air leakage       11.8    9.1
IC reduced        35.7    33.9
Table 5.1. Results during fault detection with a fault free engine, an engine with an air leakage and an engine with reduced IC efficiency. The second column shows the percentage of all samples where the difference is above zero. The last column shows the mean value of the difference values when the difference is above zero. Values above zero indicate an unknown input combination, i.e. a probable fault.
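The two statistics in the table can be computed from the per-sample difference values as follows; a small helper assuming `diff` holds the differences produced by the conditioning step above.

    import numpy as np

    def summarize(diff):
        """Percentage of samples with a difference above zero, and the
        mean difference over those samples (cf. table 5.1)."""
        positive = diff[diff > 0]
        pct = 100.0 * positive.size / diff.size
        mean_pos = positive.mean() if positive.size else 0.0
        return pct, mean_pos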
This shows that faults can be detected. As explained in section 3.2, difference values above zero indicate a probable fault in the system. There are clearly many more samples with a difference above zero when there is a fault in the system. The differences are also, in general, larger in the faulty cases than in the fault free case.
In the results presented it is clear that the fault detection performs better when
the error is a reduced IC efficiency than when it is an air leakage. This can be
explained by comparing the chosen input variables to variables chosen in chapter 6.
They are similar to the variables chosen for isolation of reduced IC efficiency faults,
which explains the better performance.
It would be the other way around if the chosen variables were more similar to the ones for isolation of an air leakage. Therefore a change of variables in this direction would improve the fault detection when there is an air leakage. This is however not a solid approach, as the idea of conditioning is to detect new, i.e. unseen, types of faults. Input variables should not be adapted to a specific problem but should be quite general to ensure that unseen faults are detected.
Figure 5.1. Results during fault detection with (from top to bottom) a healthy engine, an engine with air system leakage and an engine with reduced IC efficiency. Values above zero indicate an unknown input combination, i.e. a probable fault.
Chapter 6
Classification - Fault detection and isolation
This chapter describes fault detection and isolation with the VDS. The purpose is to demonstrate how to develop such an application and to give an indication of the results that can be expected when using Vindax® for this task. The goal is to test whether or not Vindax® is capable of producing sufficient information to make a diagnosis of a diesel engine inlet air system.
6.1 Introduction
As was described in section 4.1.4, data from three different cases are available. In
engine diagnostics it is desirable to isolate the faults in these cases, i.e. distinguish
a fault free engine from an engine with a leakage in the inlet air system or a reduced
intercooler efficiency. This is what the VDS is going to be used for in this chapter.
Two applications are developed to fulfill the purpose of this chapter: detection of a leakage and detection of reduced intercooler efficiency. DC and ETC data from a fault free engine and an engine with implemented faults are combined and used as input. For training and labeling 75 % of the data are used, and the remaining part is used for verification.
6.2 Method
To develop these applications the network is trained to recognize fault situations. The engine state will reveal these situations through the combination of its state variables, e.g. load, speed, oil and water temperature. Not all variables contain information about the fault, and hence the selection of proper variables is vital.
6.2.1 Leakage isolation
When there is an air leakage, the compressor is not able to build up the intake manifold pressure, i.e. the boost pressure signal, as fast or to the same level as for the fault free engine. Therefore, the boost pressure signal and its derivative are suitable as inputs. Additional state variables are the amount of fuel injected, i.e. the fuel value, and its derivative. This variable is selected because the amount of fuel injected is correlated with the boost pressure; see section 2.2.4 for details.
The following signals are used as inputs and pre-processed as follows:
• boost pressure
Smoothed by taking a 10 sample average and then normalized with the ambient air pressure.
• boost pressure derivatives
The differences between the current boost pressure and the values at t − 4 and t − 8 are used as inputs to incorporate the slope. These two were chosen simply by examining the slope of the boost pressure and trying to incorporate as much useful information about the derivative as possible.
• fuel value
Smoothed by taking a 10 sample average delayed by 7 samples to account for dynamics, see section 3.5. The time constant, 7 samples, is estimated by maximizing the correlation between the fuel value and the boost pressure signals.
• fuel value derivatives
The differences between the current fuel value and the values at t − 4 and t − 8 are used as inputs to incorporate the slope. These two were chosen for the same reasons as the boost pressure derivatives.
The smoothing reduces the effect of noise and dynamic behavior. This makes a total of six input signals.
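The sketch below illustrates this feature construction. The window lengths, lags and normalization are the ones listed above, while `best_lag` shows one way the 7-sample delay could have been estimated by maximizing the correlation; the wrap-around handling is our own simplification.

    import numpy as np

    def moving_average(x, n):
        """Smoothing; mode='same' keeps the array length."""
        return np.convolve(x, np.ones(n) / n, mode="same")

    def best_lag(x, y, max_lag=20):
        """Delay of x relative to y that maximizes their correlation;
        with x = boost pressure and y = fuel value this should come
        out close to 7 samples."""
        corrs = [np.corrcoef(x[lag:], y[:len(y) - lag])[0, 1]
                 for lag in range(max_lag + 1)]
        return int(np.argmax(corrs))

    def leakage_features(boost_p, ambient_p, fuel, lag=7):
        """Six inputs: smoothed/normalized boost pressure, its slopes
        at t-4 and t-8, the delayed smoothed fuel value, and its
        slopes at t-4 and t-8."""
        bp = moving_average(boost_p, 10) / ambient_p
        fv = np.roll(moving_average(fuel, 10), lag)   # delay by 7 samples
        def slopes(s):
            return s - np.roll(s, 4), s - np.roll(s, 8)
        bp4, bp8 = slopes(bp)
        fv4, fv8 = slopes(fv)
        X = np.column_stack([bp, bp4, bp8, fv, fv4, fv8])
        return X[8 + lag:]   # drop samples contaminated by the wrap-around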
Another important variable is the engine speed; it is, however, not suitable as an input. Experiments using engine speed as an input variable give worse results1. The information that resides in the engine speed signal is instead incorporated in another way. As the boost pressure depends on this variable, but not as strongly as on the fuel value, the input space is divided into three areas:
1. Low speed : 0 rpm → 1250 rpm
2. Middle speed : 1150 rpm → 1650 rpm
3. High speed : 1550 rpm → ∞ rpm
This is done to reduce the effect that the engine speed has on the boost pressure.
1 Why this is the case is not obvious. Perhaps the VP is confused as more inputs are introduced, but it could also be that the dynamics are not properly handled, i.e. the signal lagging is not correct. Experiments could be done to investigate this, but the solution presented here gives satisfactory results.
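Since the three ranges overlap by 100 rpm, a border sample belongs to two ranges. The thesis does not spell out how the overlap is resolved; the sketch below assumes a sample is simply routed to every network whose range contains its engine speed.

    # Overlapping engine speed bands from the list above
    RANGES = {
        "low":  (0.0, 1250.0),
        "mid":  (1150.0, 1650.0),
        "high": (1550.0, float("inf")),
    }

    def route(speed):
        """Return the network(s) responsible for a given engine speed;
        samples in the 100 rpm overlaps belong to two networks."""
        return [name for name, (lo, hi) in RANGES.items() if lo <= speed < hi]

    print(route(1200.0))   # ['low', 'mid']
    print(route(1400.0))   # ['mid']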
In each range, the fuel value will become more dominant as the engine speed does
not differ as much.
The ranges are chosen by looking at the distribution of the engine speed signal. The test cycles used, DC and ETC, generate more measurements of the engine speed in the mid range. This is due to the fact that the regulator used to control the engine speed is not optimal, hence the distribution will be slightly 'normalized'. In addition, the ETC is not designed to generate a uniform distribution of the engine speed.
A separate network is used for each area. Their size is 256 neurons, except for the middle range where a 1024 neuron network is tested as well. This is done because the measurements contain more data in this area; there are approximately four times as many samples here compared to the other ranges. Increasing the network size in the other ranges would probably not improve the results much, because the size is already sufficient.
6.2.2 Reduced intercooler efficiency isolation
Reduced intercooler efficiency will result in a different behavior of the intake manifold temperature, i.e. the boost temperature signal. When the efficiency is reduced, the air flowing through the intercooler will not be cooled as efficiently as in a fault free engine.
Both the engine speed and the fuel value, i.e. the amount of fuel injected, affect the boost temperature. In section 2.2.4 it is described how an increase in fuel value causes the temperature to rise, which puts more pressure on the intercooler to cool the air. Also, an increase in engine speed means that more air flows through the intercooler. These two should therefore be the states that the boost temperature depends upon.
The relationship, in time, between the fuel value and engine speed states and the boost temperature cannot be concluded from figure 6.1. It is difficult to estimate a time constant and handle the dynamic behavior by delaying input signals. Instead of delaying signals, a very long average is used. This means that the signals are smoothed and the time delay will not be as significant. This also resembles how the temperature actually changes in the engine, and is hence a way to handle the dynamics. Therefore, these three signals are used as inputs and pre-processed as follows:
• boost temperature
Smoothed by taking a 100 sample average.
• fuel value
Smoothed by taking a 100 sample average.
• engine speed
Smoothed by taking a 100 sample average.
Figure 6.1. Boost temperature from an engine without faults during a DC. The boost temperature is presented in both graphs and compared with, top: the fuel value, and bottom: the engine speed.

Figure 6.2 shows the signals after smoothing. The correlation between the three states is much higher now. The actual changes in fuel value can be recognized in the boost temperature curve. It is harder to see the same relationship between engine speed and boost temperature, but the improvement in results indicates a higher correlation. There is a risk that information is lost when the signals are heavily smoothed, but experiments show good results.
As with leakage detection the problem is divided into three intervals to increase
the resolution. This time, the intervals are in the boost pressure signal:
1. Low pressure: 0 kPa → 75 kPa
2. Middle pressure : 65 kPa → 155 kPa
3. High pressure : 145 kPa → ∞ kPa
6.3 Results and discussion
Tables 6.1-6.2 show the results from the classification. This is achieved by using an activation frequency threshold of 80% for labeling; see section 3.3. This gives a high response frequency while still keeping the share of incorrect classifications at a low level. Appendix C shows results when using 90%, 80%, 70% and 60% as activation frequency thresholds, as well as the most frequent method.

Figure 6.2. Averaged boost temperature from an engine without faults during a DC. The boost temperature is presented in both graphs and compared with, top: the averaged fuel value, and bottom: the averaged engine speed.

The results presented in tables 6.1-6.2 were found to be the best. Different frequency thresholds produce different kinds of information. For the purpose here, passing classification results to a diagnosis algorithm, a high ratio between correct and incorrect classifications has to be achieved. Also, a sufficiently high response frequency, i.e. few non-classifiable signals, should be achieved to be able to make a diagnosis.
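Our reading of the activation frequency labeling in section 3.3 is sketched below: a neuron receives a class label only if that class accounts for at least the threshold share of the samples that activated the neuron during labeling. The VDS internals may differ.

    import numpy as np

    def label_neurons(winners, classes, n_neurons, threshold=0.8):
        """winners : winning-neuron index for each labeling sample
        classes : true class of each labeling sample
        A neuron is labeled with a class only if that class accounts
        for at least `threshold` of its activations; otherwise the
        neuron stays unlabeled and its samples become 'not classified'."""
        labels = {}
        for neuron in range(n_neurons):
            hits = classes[winners == neuron]
            if hits.size == 0:
                continue
            vals, counts = np.unique(hits, return_counts=True)
            if counts.max() / hits.size >= threshold:
                labels[neuron] = vals[counts.argmax()]
        return labels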
true mode \ classified as     NF        FAL       Not cl.
Low rpm
  NF                          13.80%     0.94%    85.27%
  FAL                          1.41%    10.01%    88.58%
Mid rpm (256)
  NF                          38.92%     3.18%    57.89%
  FAL                          2.09%    29.98%    67.93%
Mid rpm (1024)
  NF                          46.76%     4.65%    48.59%
  FAL                          2.13%    48.94%    48.92%
High rpm
  NF                          68.95%     4.93%    26.12%
  FAL                          1.86%    65.72%    32.42%
Over all (256)
  NF                          37.44%     2.89%    59.67%
  FAL                          1.86%    30.86%    67.27%
Over all (256/1024/256)
  NF                          41.81%     3.70%    54.48%
  FAL                          1.89%    41.20%    56.91%

Table 6.1. Classification results with an activation frequency of 80% at labeling. NF = No Fault, FAL = Air Leakage.
true mode \ classified as     NF        FRIE      Not cl.
Low BP
  NF                          48.85%     3.89%    47.25%
  FRIE                         2.61%    51.63%    45.76%
Mid BP
  NF                          66.58%     2.41%    31.00%
  FRIE                         3.01%    70.99%    26.00%
High BP
  NF                          69.60%     2.48%    27.93%
  FRIE                         4.66%    71.87%    23.48%
Over all
  NF                          59.91%     3.03%    37.07%
  FRIE                         3.11%    62.69%    34.21%

Table 6.2. Classification results with an activation frequency of 80% at labeling. NF = No Fault, FRIE = Reduced Intercooler Efficiency.
Chapter 7
Function estimation - Virtual sensor
In this chapter a function estimation problem is solved. The VDS is used as a virtual sensor to estimate the engine torque. The goal is to see how well Vindax® performs as a virtual sensor on a test problem.
7.1 Introduction
When discussing the torque estimation problem, only the DC data are used. This is because the quality of the acquired ETC data is not as high as for the DC data, i.e. the number of samples is too low.1
Two different DC runs have been used. First, the same cycle as in the conditioning and classification problems is used. Then, a DC that is slower and longer is used. The second cycle is run on a different engine. For training and labeling 75 % of the data are used and the rest are used for validation. Only data from engines with no faults are used.
7.2 Method
As discussed in section 2.2, the torque value is connected to the amount of fuel
injected into the cylinders, but also the amount of oxygen pushed into the cylinder
is of interest. This leads to the use of the following variables:
• Fuel value
• Engine speed
• Boost pressure
• Boost temperature
where the last three are connected to the amount of oxygen.

1 See section 4.1.4 for a description of the two test cycles.
It is important to realize what using the fuel value as one of the input signals leads to. This signal is an input signal to the fuel system in the engine, specifying the amount of fuel that should be injected. If there is a fault in the system, e.g. a clogged injector, this is not the amount that will actually be injected. This might lead to a fault intolerant torque estimation.
7.3 Results and discussion
The quality of the estimation is measured by comparing the estimated torque value with the torque value used for labeling, i.e. the torque value measured in the test bench. This result is put in relation to the torque estimation made by the EMS.
The best result on the first data set is achieved using fuel value, engine speed and boost pressure as input signals. The RMSE is then 146.5 Nm, compared to 165.5 Nm for the estimation made by the EMS. If boost temperature is also used, which might be desirable to compensate for faults in the system, the RMSE increases to 158.5 Nm.
On the second set of data the corresponding RMSE value is 90.1 Nm (using all four variables), compared to 66.9 Nm for the estimation by the EMS. Using 1024 neurons lowers the RMSE value to 74.6 Nm. This set of data has slower changes and not as many transients as the first set of data.
Two things can be seen that make the results worse. The first is a resolution problem: it is not possible to get a much better result using only 256/1024 neurons. In addition, the I/O interface of the VP limits the resolution, as it uses 8-bit integers for input/output, which causes quantization errors. The second problem is unaccountable peaks in the torque signal, several hundred Nm in 0.1 seconds, from the test bench, as can be seen in figure 7.1.
These peaks cannot be explained2 and are of no interest, as they are too short to affect the truck. Therefore, they can be removed by a filter that eliminates peaks larger than 100 Nm. This filter is applied to the signal before using it for labeling and verification. Then the RMSE (without boost temperature) decreases to 67.0 Nm with 256 neurons and 55.4 Nm with 1024 neurons. The RMSE for the EMS estimation also decreases, down to 55.5 Nm.
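The thesis does not specify the exact filter; a minimal despiking sketch consistent with the description, eliminating single-sample peaks larger than 100 Nm, could look as follows.

    import numpy as np

    def remove_peaks(torque, limit=100.0):
        """Replace single-sample spikes that deviate more than `limit`
        Nm from the neighbouring samples with their average."""
        out = torque.astype(float).copy()
        for i in range(1, len(out) - 1):
            neighbours = 0.5 * (out[i - 1] + out[i + 1])
            if abs(out[i] - neighbours) > limit:
                out[i] = neighbours
        return out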
For the faster set of data, the estimation made by the VP is clearly better than the EMS estimation. For the slower set of data it is a more even game, but here a resolution issue in Vindax® can be seen. The increase in neurons from 256 to 1024 gives a clear increase in the accuracy of the estimation. This suggests that more neurons probably would improve the results even more.
For the second, slower, set of data, more variables should probably also be used. This set of data is collected from an engine equipped with an exhaust gas recirculation system that takes some of the exhaust gases and leads them into the inlet air system. This is a way to reduce the emissions, but it also results in less oxygen in the air that is pushed into the cylinders. The amount of oxygen is one of the key factors used in the estimation, so using more variables, incorporating the amount of oxygen lost, might improve the results for the slower set of data.

2 They are probably a result of the test bench configuration and are not present when the engine is installed in a truck.

Figure 7.1. Unaccountable peaks in the torque signal from the test bench.
Chapter 8
Conclusions and future work
In this chapter, conclusions from the work and suggestions for future work in the area are presented.
8.1 Conclusions
The probability density functions of the input signals determine the structure of the SOM. This is all according to SOM theory and a confirmation that the VDS follows it. Depending on the application area, the distributions of the input signals have to be analyzed before using them for training, i.e. the training data should be suitable for the purpose of the network.
A desired distribution of data can however be extremely hard to achieve. Often
measurements are done by controlling a limited number of variables. It is possible
to control the distribution of these but it is hard, or even impossible, to control the
distribution of the rest of the measured variables.
An advantage, and disadvantage, of the SOM approach is that prior knowledge most often cannot be invoked to improve the results. System identification usually starts with assuming the order of the system and then estimating parameters. With prior knowledge about the system it is sometimes very straightforward to estimate the parameters, but with a more complex system this becomes harder to do. Therefore this is both an advantage and a disadvantage.
Some types of prior knowledge might be incorporated by smart choices of input signals, e.g. using derivatives instead of absolute values, combining signals into new signals, etc.
Noise redundancy properties are hard to conclude. The numbers in table 3.6 showed an estimation error six times as high when the noise was introduced. Compared with the polynomial regression, the performance was slightly better. More studies should be performed to investigate the connection between the noise and the estimation error. It is however clear that the signals should not be too noisy if a good result is to be obtained.
There is not much difference between working with linear and non-linear systems. This is the nature of the SOM network: it does a mapping of the input data independently of what the system is like. Sending the same input data to a linear and a non-linear system gives the same network; in the end the system layout only affects the labeling. This is a big benefit of the SOM based methods.
The SOM method is a mapping for purely static relationships. It has no ability to handle dynamic data. But by feeding it with old input and/or output signals as input, the problem goes from being dynamic to being static. In this way SOMs (including the VDS) can be used to solve dynamic problems.
When dealing with dynamics, a problem is to find out which input (and/or output) signals need to be lagged, and by how much. The optimum is when all dynamics of the system are included, but nothing more. Using signals that are irrelevant makes the estimation worse, as the VP tries to adapt to information that is not relevant to the problem.
As was mentioned in section 3.1, the thesis evaluated a PCI version of the VP. The processor is accessed through the VDS. This creates a very fast system, and the time required to train and label a network is negligible compared to the pre-processing. Using the software emulation mode is, however, very slow and usually takes more than an hour1.

1 This depends on the performance of the PC that is used.
The VDS has a strong advantage in visualization. When the VP is trained, it is easy to visualize with different kinds of graphs how the training turned out. Then, for example, adjustments in data pre-processing can be made to get better performance.
The conclusions so far are general and applicable to all usage areas. The following sections handle each area more specifically.
8.1.1 Conditioning
The conditioning functionality of the VDS is questionable for detection of faults. The reason is that the VDS requires very precise handling of dynamics. A possible fault that is known beforehand can probably be detected by designing a VDS application for it, but whether or not unknown errors will be recognized is impossible to say. Probably, such a new error shows up in a specific state, and this information has to be extracted in a proper way for the error to be detected.
There will be a problem if unseen data are presented to the VP, as they will be classified as faulty. This means that all possible combinations have to be measured and used to train the VP if the conditioning is going to be useful.
8.1.2 Classification
When working with classification problems, the VDS has a strong advantage in simplicity. The method to design an application is basically the same, no matter the dimensionality or other aspects that complicate the problem. This also makes it possible to get results very fast, and it is also very easy to change classification boundaries to try out a good combination.
There is a high potential in this area. As the demands on representativeness of data are lower when making a diagnosis, i.e. a diagnosis does not have to be made during every operating condition, the training data do not need to incorporate all engine properties.
8.1.3 Function estimation
The accuracy of the function estimation depends on how many neurons are used in the SOM and on the number of misclassified signals. In the example with noisy signals, the number of misclassified signals was high. This caused the error to grow, although the neuron labels had approximately the same distance, i.e. the network had the same resolution.
As was discussed in section 7.3, there is a resolution problem in some cases. In general this applies to function estimation, but it could be an issue for the other areas as well. The I/O interface uses 8-bit integers, i.e. 256 levels, which means that there will be quantization errors in both the input signals and the output signal. It is a disadvantage that the output is discrete and limited to as many values as the number of neurons. This might be compensated for with hierarchies in the VDS, but in the end it is a limitation.
8.2 Method criticism
The same kind of data have been used for training as for verification. It can be argued that this approach does not give a true image of the VDS capabilities. Classifying other kinds of data, e.g. more stationary data, may not give as good results, because these are unseen data that the system is not trained to classify.
A temporary solution would be to incorporate part of the new data into the training set; the system would then be able to classify it. Then again, a new type of data could be introduced, and so on. The solution to this problem is to use training data that capture sufficient properties of the system to be able to classify signals and make a diagnosis.
8.3 Future work
Future work would probably involve developing a new and better test cycle with both faster and slower parts. More variation in the ambient air pressure and in all the temperatures is also needed. New hierarchies of networks could increase the resolution, where the separation can be based on different intervals in fuel value, engine speed or other variables. Additionally, to improve the torque estimation results it would probably be necessary to incorporate flywheel information.
Some of the input signals used for torque estimation might be sensitive to faults in the system. This should be tested, and some new input signals might be needed to compensate for this. Also for this reason, looking into flywheel information is preferable.
The data used in this thesis have been collected in an engine laboratory on one engine individual. The collection of data should also be examined when it comes to deploying a VP-based application in a truck. Some adaptations will probably be needed for this, which is something that should be investigated. Whether a network trained on one engine individual can be used on another also needs looking into. For the method to be useful in an engine, this must be solved without, or with just minor, modifications to the network.
In the fault isolation, such an important variable as the engine speed made the isolation worse when it was included as an input variable. The approach with separation into different intervals more or less solved this problem. It would however be interesting to investigate why this behavior was encountered.
The dynamic behavior has been handled by delaying input signals or using long averages. When a signal is delayed, it is important that the estimated time constant is correct, see section 3.5. These time constants change depending on the driving conditions, e.g. engine speed, ambient air temperature, coolant temperature, and so on.
Further investigation should be done to handle the varying time constants; these are some suggestions:
• Sampling based on crank-shaft angle instead of time. This would eliminate the effect of engine speed on the time constants.
• Use separate networks for different temperature areas. In each area a separate time constant could be estimated to give a higher accuracy.
The long-average solution to dynamics used a single averaging length. Experiments with different lengths would reveal whether the accuracy can be improved further. Also, a combination where the boost temperature signal is not averaged while fuel value and engine speed are would be an interesting test.
The torque estimation did not include any delay of signals, because the torque is a very fast 'output signal'. Delaying the inputs one or two samples may improve the results even more and could be investigated.
Appendix A
RMSE versus ABSE
Perhaps the easiest way of measuring an error is to look at the absolute error in
every sample and take the mean value or the maximum value. In this thesis this
measure is called the mean/max ABSolute Error, ABSE.
$$\text{mean ABSE} = \frac{1}{N} \sum_{i \in [1,N]} |\hat{x}_i - x_i|$$
$$\text{max ABSE} = \max_{i \in [1,N]} |\hat{x}_i - x_i|$$
Here $\hat{x}$ is the measured value, $x$ is the real/theoretic value and $N$ is the number of samples.
One of the most common error measurements is the Root Mean Square Error, RMSE, which is a quadratic version of the mean ABSE. The name comes from the formula by which the value is calculated:
$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i \in [1,N]} (\hat{x}_i - x_i)^2}$$
The RMSE gives a value that is more sensitive to large errors than the mean ABSE: when the errors are squared, large errors get more emphasis.
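The three measures transcribe directly into code; a minimal NumPy version:

    import numpy as np

    def mean_abse(x_hat, x):
        return np.mean(np.abs(x_hat - x))

    def max_abse(x_hat, x):
        return np.max(np.abs(x_hat - x))

    def rmse(x_hat, x):
        return np.sqrt(np.mean((x_hat - x) ** 2))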
Appendix B
Measured variables
The following sensor variables are available:
• Time
• Ambient air temperature
• Ambient air pressure
• Oil temperature
• Coolant temperature
• Boost temperature
• Boost pressure
• Engine speed
Boost temperature and pressure are measured in the intake manifold.
In addition to this, calculated values, e.g. torque and amount of fuel injected,
are available from the EMS.
Appendix C
Fault detection results
In this appendix, results from the fault isolation problem in chapter 6 are listed. Results from using 90%, 80%, 70% and 60% as activation frequency thresholds, as well as using the most frequent method, when trying to isolate air leakage and reduced intercooler efficiency faults are presented. For more details see section 6.2.
C.1 90 % activation frequency resolving
true mode \ classified as     NF        FAL       Not cl.
Low rpm
  NF                           6.96%     0.28%    92.76%
  FAL                          0.08%     6.03%    93.89%
Mid rpm (256)
  NF                          26.43%     1.35%    72.21%
  FAL                          0.57%    12.47%    86.96%
Mid rpm (1024)
  NF                          39.35%     1.33%    59.32%
  FAL                          0.69%    31.46%    67.85%
High rpm
  NF                          64.28%     2.00%    33.73%
  FAL                          0.93%    53.32%    45.75%
Over all (256)
  NF                          27.83%     1.18%    70.99%
  FAL                          0.50%    18.00%    81.50%
Over all (1024)
  NF                          35.03%     1.17%    63.80%
  FAL                          0.57%    28.36%    71.08%

Table C.1. Classification results with an activation frequency of 90% at labeling. NF = No Fault, FAL = Air Leakage.
true mode \ classified as     NF        FRIE      Not cl.
Low BP
  NF                          36.10%     0.63%    63.27%
  FRIE                         0.53%    34.23%    65.24%
Mid BP
  NF                          55.07%     0.71%    44.22%
  FRIE                         0.91%    58.85%    40.24%
High BP
  NF                          53.44%     0.56%    46.00%
  FRIE                         1.51%    63.48%    35.01%
Over all
  NF                          47.04%     0.65%    52.31%
  FRIE                         0.84%    48.87%    50.28%

Table C.2. Classification results with an activation frequency of 90% at labeling. NF = No Fault, FRIE = Reduced Intercooler Efficiency.
C.2 80 % activation frequency resolving
true mode \ classified as     NF        FAL       Not cl.
Low rpm
  NF                          13.80%     0.94%    85.27%
  FAL                          1.41%    10.01%    88.58%
Mid rpm (256)
  NF                          38.92%     3.18%    57.89%
  FAL                          2.09%    29.98%    67.93%
Mid rpm (1024)
  NF                          46.76%     4.65%    48.59%
  FAL                          2.13%    48.94%    48.92%
High rpm
  NF                          68.95%     4.93%    26.12%
  FAL                          1.86%    65.72%    32.42%
Over all (256)
  NF                          37.44%     2.89%    59.67%
  FAL                          1.86%    30.86%    67.27%
Over all (1024)
  NF                          41.81%     3.70%    54.48%
  FAL                          1.89%    41.20%    56.91%

Table C.3. Classification results with an activation frequency of 80% at labeling. NF = No Fault, FAL = Air Leakage.
true mode \ classified as     NF        FRIE      Not cl.
Low BP
  NF                          48.85%     3.89%    47.25%
  FRIE                         2.61%    51.63%    45.76%
Mid BP
  NF                          66.58%     2.41%    31.00%
  FRIE                         3.01%    70.99%    26.00%
High BP
  NF                          69.60%     2.48%    27.93%
  FRIE                         4.66%    71.87%    23.48%
Over all
  NF                          59.91%     3.03%    37.07%
  FRIE                         3.11%    62.69%    34.21%

Table C.4. Classification results with an activation frequency of 80% at labeling. NF = No Fault, FRIE = Reduced Intercooler Efficiency.
C.3 70 % activation frequency resolving
true mode \ classified as     NF        FAL       Not cl.
Low rpm
  NF                          19.86%     3.95%    76.19%
  FAL                          3.98%    19.10%    76.92%
Mid rpm (256)
  NF                          45.02%     8.60%    46.39%
  FAL                          4.19%    45.89%    49.92%
Mid rpm (1024)
  NF                          57.70%     8.73%    33.57%
  FAL                          5.59%    59.75%    34.66%
High rpm
  NF                          75.44%     7.98%    16.58%
  FAL                          4.71%    77.31%    17.98%
Over all (256)
  NF                          43.60%     7.24%    49.16%
  FAL                          4.23%    44.12%    51.65%
Over all (1024)
  NF                          50.67%     7.32%    42.01%
  FAL                          4.99%    51.68%    43.33%

Table C.5. Classification results with an activation frequency of 70% at labeling. NF = No Fault, FAL = Air Leakage.
true mode \ classified as     NF        FRIE      Not cl.
Low BP
  NF                          57.28%    10.05%    32.67%
  FRIE                         4.86%    67.50%    27.63%
Mid BP
  NF                          73.99%     4.47%    21.55%
  FRIE                         5.27%    76.04%    18.69%
High BP
  NF                          77.59%     4.33%    18.08%
  FRIE                         6.49%    76.52%    16.98%
Over all
  NF                          67.83%     6.72%    25.45%
  FRIE                         5.30%    72.39%    22.31%

Table C.6. Classification results with an activation frequency of 70% at labeling. NF = No Fault, FRIE = Reduced Intercooler Efficiency.
C.4 60 % activation frequency resolving
true mode \ classified as     NF        FAL       Not cl.
Low rpm
  NF                          32.68%    14.12%    53.20%
  FAL                         12.67%    31.52%    55.81%
Mid rpm (256)
  NF                          50.41%    20.17%    29.42%
  FAL                         10.91%    61.95%    27.15%
Mid rpm (1024)
  NF                          67.51%    14.61%    17.88%
  FAL                         11.07%    72.00%    16.93%
High rpm
  NF                          80.99%    11.03%     7.98%
  FAL                          8.49%    89.26%     8.25%
Over all (256)
  NF                          51.05%    16.69%    32.25%
  FAL                         10.96%    57.37%    31.67%
Over all (1024)
  NF                          60.54%    13.86%    25.60%
  FAL                         11.05%    62.85%    26.10%

Table C.7. Classification results with an activation frequency of 60% at labeling. NF = No Fault, FAL = Air Leakage.
true mode \ classified as     NF        FRIE      Not cl.
Low BP
  NF                          69.15%    14.53%    16.32%
  FRIE                        10.66%    75.83%    13.52%
Mid BP
  NF                          79.77%     7.57%    12.66%
  FRIE                         4.56%    43.23%    52.21%
High BP
  NF                          84.77%     7.31%     7.93%
  FRIE                        12.07%    78.88%     9.05%
Over all
  NF                          76.35%    10.36%    13.29%
  FRIE                         7.42%    57.99%    34.59%

Table C.8. Classification results with an activation frequency of 60% at labeling. NF = No Fault, FRIE = Reduced Intercooler Efficiency.
C.5 Most frequent resolving
true mode \ classified as     NF        FAL       Not cl.
Low rpm
  NF                          58.85%    39.72%     1.42%
  FAL                         34.74%    63.13%     2.13%
Mid rpm (256)
  NF                          67.90%    31.95%     0.16%
  FAL                         19.86%    79.89%     0.24%
Mid rpm (1024)
  NF                          76.16%    22.84%     1.00%
  FAL                         17.99%    81.28%     0.73%
High rpm
  NF                          84.73%    14.21%     1.06%
  FAL                         11.78%    87.48%     0.74%
Over all (256)
  NF                          68.42%    30.93%     0.65%
  FAL                         22.52%    76.63%     0.85%
Over all (1024)
  NF                          73.03%    25.85%     1.12%
  FAL                         21.50%    77.37%     1.12%

Table C.9. Classification results with most frequent as multi-classification resolving. NF = No Fault, FAL = Air Leakage.
true mode \ classified as     NF        FRIE      Not cl.
Low BP
  NF                          79.87%    19.99%     0.14%
  FRIE                        19.21%    80.59%     0.20%
Mid BP
  NF                          87.01%    12.86%     0.14%
  FRIE                        13.41%    86.39%     0.19%
High BP
  NF                          88.48%    11.21%     0.31%
  FRIE                        13.70%    85.70%     0.59%
Over all
  NF                          84.37%    15.47%     0.17%
  FRIE                        15.99%    83.75%     0.26%

Table C.10. Classification results with most frequent as multi-classification resolving. NF = No Fault, FRIE = Reduced Intercooler Efficiency.
In English
The publishers will keep this document online on the Internet - or its possible
replacement - for a considerable time from the date of publication barring
exceptional circumstances.
The online availability of the document implies a permanent permission for
anyone to read, to download, to print out single copies for your own use and to
use it unchanged for any non-commercial research and educational purpose.
Subsequent transfers of copyright cannot revoke this permission. All other uses
of the document are conditional on the consent of the copyright owner. The
publisher has taken technical and administrative measures to assure authenticity,
security and accessibility.
According to intellectual property law the author has the right to be
mentioned when his/her work is accessed as described above and to be protected
against infringement.
For additional information about the Linköping University Electronic Press
and its procedures for publication and for assurance of document integrity,
please refer to its WWW home page: http://www.ep.liu.se/
© Conny Bergkvist and Stefan Wikner