Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Data Mining 2 (ex Análisis Inteligente de Datos y Data Mining) Lluís A. Belanche www.lsi.upc.edu/... /~belanche/docencia/aiddm/aiddm.html /~avellido/teaching/data_mining.htm Contents of the course (hopefully) 1. Introduction & methodologies 2. Exploratory DM through visualization 3. Pattern recognition: introduction 4. Pattern recognition: the Gaussian case 5. Feature extraction 6. Feature selection & weighing 7. Error estimation 8. Linear methods are nice! 9. Probability in Data Mining 10. Latency, generativity, manifolds and all that 11. Application of GTM: from medicine to ecology 12. DM Case studies Sorry guys! … no fuzzy systems … Error estimation Feature extraction, selection and weighing have many uses Linear classifiers are nice! (I) Linear classifiers are nice! (II) F Transformation F (x) = [ F1(x), F2(x), … Fm(x) ] with x = [ x1, x2, …, xn ] Useful for “ascending” (m>n) or “descending” (m>n) with 0 < m,n < oo (integers) … an example? Linear classifiers are nice! (III) F Nets F (x) = [ F1(x), F2(x), … Fm(x) ] with x = [ x1, x2, …, xn ] x F(x) Utility • This is a very powerful setting • Let us suppose: r>s increase in dimension increase in expressive power, ease the task for almost any learning machine r<s decrease in dimension visualization, compactation, noise reduction, removal of useless information Contradictory !? On intelligence … • What is Intelligence? • What is the function of Intelligence? to ensure survival in nature • What are the ingredients of intelligence? – Perceive in a changing world – Reason under partial truth – Plan & prioritize under uncertainty – Coordinate different simultaneous tasks – Learn under noisy experiences Parking a Car (difficult or easy?) “Generally, a car can be parked rather easily because the final position of the car is not specified exactly. It it were specified to within, say, a fraction of a millimeter and a few seconds of arc, it would take hours of maneuvering and precise measurements of distance and angular position to solve the problem.” High precision carries a high cost. The primordial soup Belief Networks Fuzzy Logic Neural Networks Soft Computing Chaos & Fractals Rough Sets Evolutionary Algorithms What could MACHINE LEARNING possibly be? In the beginning, there was a set of examples … • To exploit imprecision, uncertainty, robustness, data dependencies, learning and/or optimization ability, to achieve a working solution to a problem which is hard to solve. • To find an exact (approximate) solution to an imprecisely (precisely) formulated problem. So what is the aim? The challenge is to put these capabilities into use by devising methods of computation which lead to an acceptable solution at the lowest possible cost. This should be the guiding principle Different methods = different roles Fuzzy Logic : the algorithms for dealing with imprecision and uncertainty Neural Networks : the machinery for learning and function approximation with noise Evolutionary Algorithms : the algorithms for adaptive search and optimization uncertainty arising from the Rough RS granularity in the domain of Sets discourse Examples of soft computing • TSP: 105 cities, – accuracy within 0.75%, 7 months – accuracy within 1%, 2 days • Compare – “absoulute best for sure” with “very good with very high probability” Are you one of the top guns? • Consider … – Search space of size s – Draw N random samples – What is the probability p that at least one of them is in the top t ? • Answer: p = 1 – (1-t/s)N • Example: s= 1012, N=100.000, t=1.000 1 in 10.000 ! On Algorithms • what is worth? Specialized algorithms: best performance for special problems Generic algorithms: good performance over a wide range of problems Generic Algorithms Specialized Algo. Problems P Words are important ! • • • • • • What is a theory ? What is an algorithm ? What is an implementation ? What is a model ? What does “non-linear” mean ? What does “non-parametric” mean ? The problem of induction • Classical problem in Philosophy • Example: 1,2,3,4,5,? • A more through example: JT What are the conditions for successful learning? • Training data (sufficiently) representative • Principle of similarity • Target function within capacity of the learner • Non-dull learning algorithm • Enough computational resources • A correct (or close to) learning bias And the Oscar goes to … The real problem is not whether machines think, but whether men do. B.F. Skinner, Contingencies of Reinforcement