Machine Learning, Data Mining, Genetic Algorithms, Neural Networks
ISYS370, Dr. R. Weber
Copyright R. Weber

Concept Learning is a Form of Inductive Learning
• The learner uses:
  – positive examples (instances that ARE examples of a concept), and
  – negative examples (instances that ARE NOT examples of a concept)

Concept Learning
• Needs empirical validation
• Whether the data are dense or sparse determines how well different methods perform

Validation of Concept Learning (i)
• The learned concept should correctly classify new instances of the concept
  – When it succeeds on a real instance of the concept, it finds a true positive
  – When it fails on a real instance of the concept, it finds a false negative

Validation of Concept Learning (ii)
• The learned concept should correctly classify new instances of the concept
  – When it succeeds on a counterexample, it finds a true negative
  – When it fails on a counterexample, it finds a false positive

Rule Learning
• Rule learning is widely used in data mining
• Version Space Learning is a search method for learning rules
• Decision trees

Decision Trees
• A knowledge representation formalism
• Represent mutually exclusive rules (a disjunction)
• A way of breaking up a data set into classes or categories
• Classification rules that determine, for each instance with given attribute values, whether it belongs to one class or another
• Not incremental

Decision Trees (structure)
• Leaf nodes represent classes
• Decision nodes represent tests on attribute values
• From each decision node, a branch grows for each possible outcome of the test
(From Cawsey, 1997)
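The structure just described (decision nodes that test attribute values, one branch per outcome, leaf nodes as classes) can be sketched as nested conditionals. The attributes and thresholds below are illustrative assumptions for a loan-approval domain, not a tree induced by any algorithm from the lecture:

```python
# Sketch of a decision tree as nested attribute tests: each `if` is a
# decision node, each branch is one test outcome, and each return value
# is a leaf node (class). Thresholds are hypothetical.

def classify_loan(monthly_income, job_status, repayment):
    # decision node: test on monthly income
    if monthly_income >= 3000:
        # decision node: test on repayment burden
        if repayment <= 300:
            return "Very good"   # leaf node (class)
        return "Very bad"        # leaf node (class)
    # decision node: test on job status and repayment
    if job_status == "Salaried" and repayment <= 200:
        return "Good"            # leaf node (class)
    return "Bad"                 # leaf node (class)

print(classify_loan(3000, "Waged", 300))     # Very good
print(classify_loan(2000, "Salaried", 200))  # Good
```

Because every instance follows exactly one path from the root to a leaf, the branches encode mutually exclusive classification rules.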
Decision Tree Induction
• Goal is to correctly classify all example data
• Several algorithms induce decision trees: ID3 (Quinlan 1979), CLS, ACLS, ASSISTANT, IND, C4.5
• Constructs a decision tree from past data
• Attempts to find the simplest tree (not guaranteed, because it is based on heuristics)

ID3 Algorithm
• Input:
  – a set of target classes
  – training data containing objects of more than one class
• ID3 uses tests to refine the training data set into subsets that contain objects of only one class each
• Choosing the right test is the key

How Does ID3 Choose Tests?
• Information gain, or 'minimum entropy'
• Maximizing information gain corresponds to minimizing entropy
• Predictive features (good indicators of the outcome)

Example training data (loan approval):

     Monthly income   Job status   Repayment   Loan status
  1  2,000            Salaried     200         Good
  2  4,000            Salaried     600         Very bad
  3  3,000            Waged        300         Very good
  4  1,500            Salaried     400         Bad

Data Mining Tasks (ii)
• Link analysis
  – Association rule generation
  – Relationships between entities
• Deviation detection
  – How things change over time; trends

KDD Applications
• Fraud detection
  – Telecom (calling cards, cell phones)
  – Credit cards
  – Health insurance
• Loan approval
• Investment analysis
• Marketing and sales data analysis
  – Identify potential customers
  – Effectiveness of a sales campaign
  – Store layout

Text Mining
The problem starts with a query, and the solution is a set of information (e.g., patterns, connections, profiles, trends) contained in several different texts that are potentially relevant to the initial query.
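ID3's test-selection criterion can be made concrete with the four loan examples above: information gain is the entropy of the parent set minus the weighted entropy of the subsets a test produces. This is a toy computation of one candidate test (splitting on Job status), not the full ID3 algorithm:

```python
# Entropy and information gain, as used by ID3 to choose tests.
# With four examples in four distinct classes, parent entropy is 2 bits;
# splitting on Job status leaves {Good, Very bad, Bad} vs {Very good}.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# (Job status, Loan status) pairs from the four training examples
data = [("Salaried", "Good"), ("Salaried", "Very bad"),
        ("Waged", "Very good"), ("Salaried", "Bad")]
labels = [label for _, label in data]

def information_gain(data, labels):
    subsets = {}
    for value, label in data:
        subsets.setdefault(value, []).append(label)
    # weighted entropy of the subsets produced by the test
    remainder = sum(len(s) / len(labels) * entropy(s)
                    for s in subsets.values())
    return entropy(labels) - remainder

print(round(entropy(labels), 3))             # 2.0 bits
print(round(information_gain(data, labels), 3))  # 0.811 bits
```

Maximizing this gain over all candidate tests is exactly the "minimum entropy" heuristic named above: the chosen test is the one whose subsets are purest.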
Text Mining Applications
• IBM Text Navigator
  – Clusters documents by content
  – Each document is annotated with the 2 most frequently used words in its cluster
• Concept Extraction (Los Alamos)
  – Text analysis of medical records
  – Uses a clustering approach based on a trigram representation
  – Documents are represented as vectors; cosine similarity is used for comparison

Problem-solving methods and their reasoning types:

  Problem-solving method   Reasoning type
  rule-based ES            deductive reasoning
  case-based reasoning     analogical reasoning
  inductive ML, NN         inductive reasoning
  genetic algorithms       search

Genetic Algorithms (GA)

Genetic Algorithms (i)
• Learn by experimentation
• Based on human genetics; originate new solutions
• Representational restrictions
• Good for improving the quality of other methods, e.g., search algorithms, CBR
• Evolutionary algorithms (a broader class)

Genetic Algorithms (ii)
• Require an evaluation function to guide the process
• A population of genomes represents possible solutions
• Operations are applied over these genomes
• Operations can be mutation and crossover
• Operations produce new offspring
• An evaluation function tests how fit an offspring is
• The fittest will survive to mate again

Genetic Algorithms (iii)
• http://ai.bpa.arizona.edu/~mramsey/ga.html (you can change parameters)
• http://www.rennard.org/alife/english/gavgb.html (presented by Steven Thompson)

Neural Networks (NN)

The evidence: vision is trained at roughly the 2nd-5th week.

NN: A Model of Brains
• input / output
• neurons
• synapses
• electric transmissions
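The GA loop described above (evaluate fitness, let the fittest mate via crossover, mutate the offspring) can be sketched in a few lines. The toy fitness function here, counting 1-bits in a genome ("OneMax"), is an assumption for illustration and not from the lecture:

```python
# Minimal genetic-algorithm sketch: a population of genomes, a fitness
# (evaluation) function, single-point crossover, and bit-flip mutation.
import random

random.seed(0)
GENOME_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 50, 0.02

def fitness(genome):          # evaluation function guiding the process
    return sum(genome)

def crossover(a, b):          # single-point crossover producing offspring
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]

def mutate(genome):           # flip each bit with small probability
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # the fittest half survives to mate again
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]
    offspring = [mutate(crossover(random.choice(parents),
                                  random.choice(parents)))
                 for _ in range(POP_SIZE - len(parents))]
    population = parents + offspring

best = max(population, key=fitness)
print(fitness(best))  # typically reaches the optimum of 20
```

Swapping in a different fitness function is all it takes to aim the same loop at another problem, which is why GAs combine well with other methods such as search algorithms or CBR.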
NN: Elements
• input nodes
• output nodes
• links
• weights

Terminology
• Input and output nodes (or units) are connected by links
• Each link has a numeric weight
• Weights store information
• Networks are trained on training sets (examples) and afterwards tested on test sets to assess the network's accuracy
• Learning/training takes place as weights are updated to reflect the input/output behavior

The Concept (1 = Yes, 0 = No)

  4 legs   fly   lay eggs
  0        1     1          => bird
  1        0     0          => mammal
  1        1     0          => mammal

With equal weights of 0.5 on each input, the weighted sums are:
  0*0.5 + 1*0.5 + 1*0.5 = 1      => bird
  1*0.5 + 0*0.5 + 0*0.5 = 0.5    => mammal
  1*0.5 + 1*0.5 + 0*0.5 = 1      => mammal

The goal is to have weights that recognize different representations of mammals and birds as such. Suppose we want bird to score greater than 0.5 and mammal to score equal to or less than 0.5. With weights 0.25, 0.25, 0.5:
  0*0.25 + 1*0.25 + 1*0.5 = 0.75   => bird
  1*0.25 + 0*0.25 + 0*0.5 = 0.25   => mammal
  1*0.25 + 1*0.25 + 0*0.5 = 0.5    => mammal

The Training
• Outputs are encoded as mammal (1) and bird (0)
• Output_j = Step(Σ_i w_ij f_ij)
• Learning takes place as weights are updated to reflect the input/output behavior
• Goal: minimize the error between the expected and actual outcome
[Training-example table omitted: binary patterns over the inputs 4 legs, flies, eggs (i = 1..3) and their target outputs (j = 1..3)]

NN demo...
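The weighted-sum-and-threshold computation above can be written as a single step unit. The weights 0.25, 0.25, 0.5 are the hand-chosen values from the bird/mammal example; the threshold of 0.5 separates bird (score > 0.5) from mammal (score <= 0.5):

```python
# A single threshold (step) unit for the bird/mammal example:
# output = Step(sum of weight * feature), thresholded at 0.5.

def step_unit(features, weights, threshold=0.5):
    total = sum(f * w for f, w in zip(features, weights))
    return "bird" if total > threshold else "mammal"

weights = [0.25, 0.25, 0.5]           # for inputs: 4 legs, fly, lay eggs
print(step_unit([0, 1, 1], weights))  # 0.75 -> bird
print(step_unit([1, 0, 0], weights))  # 0.25 -> mammal
print(step_unit([1, 1, 0], weights))  # 0.50 -> mammal
```

A training algorithm would arrive at such weights automatically by nudging them after each misclassified example, rather than having them chosen by hand as here.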
Characteristics
• NNs implement inductive learning algorithms (through generalization); therefore they require several training examples to learn
• NNs do not provide an explanation of why the task was performed the way it was
• No explicit knowledge; they use data
• Tasks: classification (pattern recognition), clustering, diagnosis, optimization, forecasting (prediction), modeling, reconstruction, routing

Where Are NN Applicable?
• Where they can form a model from training data alone
• When there may be an algorithm, but it is not known or has too many variables
• When there are enough examples available
• When it is easier to let the network learn from examples
• When other inductive learning methods may not be as accurate

Applications (i)
• Predicting the movement of stocks, currencies, etc., from previous data
• Recognizing signatures (e.g., in a bank) by comparing them with those stored
• Monitoring the state of aircraft engines (by monitoring vibration levels and sound, early warning of engine problems can be given); British Rail has been testing an application to monitor diesel engines

Applications (ii)
• Pronunciation (rules with many exceptions)
• Handwritten character recognition (a network with 200,000 weights is impossible to train; the final network had 9,760 weights, used 7,300 examples to train and 2,000 to test, and reached 99% accuracy)
• Learning brain patterns to control and activate limbs, as in the "Rats control a robot by thought alone" article
• Credit assignment

CMU Driving: ALVINN
• Learns from human drivers how to steer a vehicle along a single lane on a highway
• ALVINN is implemented in two vehicles equipped with computer-controlled steering, acceleration, and braking
• Cars can reach 70 mph with ALVINN
• Programs that consider the entire problem environment reach only 4 mph

Why Use NN for the Driving Task?
• There is no good theory of driving, but it is easy to collect training samples
• Training data is obtained with a human* driving the vehicle
  – 5 min of training, then the algorithm runs for 10 min
• Driving is continuous and noisy
• Almost all features contribute useful information
*Humans are not very good generators of training instances when they behave too regularly, without making mistakes.

The Neural Network
• INPUT: a video camera generates a 30x32 grid of input nodes
• OUTPUT: a layer of 30 nodes, each corresponding to a steering direction
• The vehicle steers in the direction of the output node with the highest activation

Resources
http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html#what
http://www.ri.cmu.edu/projects/project_160.html
http://www.txtwriter.com/Onscience/Articles/ratrobot.html