Massimo Panella
Dipartimento di Scienza e Tecnica dell’Informazione e della Comunicazione (INFO-COM)
Università di Roma “La Sapienza”, Facoltà di Ingegneria dell’Informazione
Via Eudossiana 18, 00184 Roma
[email protected]

Ministero dello Sviluppo Economico (ISCOM)
Rome, 9 June 2010

Soft computing, neural networks and evolutionary algorithms

Outline
• Intelligent Systems / Historical Perspective
• Foundation of Soft Computing
• Evolution of Soft Computing
  o Neural Network / Neuro Computing
  o Fuzzy Logic / Fuzzy Computing
  o Genetic Algorithm / Evolutionary Computing
  o Hybrid Systems
• Demo
• Conclusions

Computation?
• Traditional sense: manipulation of numbers
• Human: uses words for computation and reasoning
• Conclusions <= Words <= Natural Language

Intelligent System?
• The role model for an intelligent system is the human mind.
• Dreyfus: minds do not use a theory about the everyday world; know-how vs. know-that
• Winograd: intelligent systems act, they don't think

Computational Intelligence
• Knowledge representation: predicates, production rules, semantic networks, frames
• Inference engine
• Learning
• Common sense and heuristics
• Uncertainty

Computational Intelligence: Applications
• Expert tasks: the algorithm does not exist (a medical encyclopedia is not equivalent to a physician)
• Heuristics: there is an algorithm but it is “useless”
• Uncertainty: the algorithm is not possible
• Complex problems: the algorithm is too complicated
• Technologies: expert systems, natural language processing, symbolic processing, knowledge engineering

Uncertainty
[Figure: axes labeled Cost and Precision]
• “As complexity rises, precise statements lose meaning, and meaningful statements lose precision.” (L.A. Zadeh)
• Principle of incompatibility (Pierre Duhem): the certainty that a proposition is true decreases with any increase of its precision
• The power of a vague assertion rests in its being vague (“I am not tall”)
• A very precise assertion is almost never certain (“I am 1.71 m tall”)

The Nature of Mind: The Contribution of Information Science
• The mind as a symbol processor
• Formal study of human knowledge
• Knowledge processing
• Common-sense knowledge
• Neural networks

The Nature of Mind: The Contribution of Psychology
• The mind as a processor of concepts
• Reconstructive memory
• Memory is learning and is reasoning
• Fundamental unity of cognition

The Nature of Mind: The Contribution of Neurophysiology
• The brain is an evolutionary system
• Mind shaped mainly by genes and experience
• Neural-level competition
• Connectionism

The Nature of Mind: The Contribution of Physics
• Living beings create order from disorder
• Non-equilibrium thermodynamics
• Self-organizing systems
• The mind as a self-organizing system
• Theories of consciousness based on quantum and relativity physics

What is Soft Computing?
The basic ideas underlying soft computing in its current incarnation have links to many earlier influences, among them Prof. Zadeh’s 1965 paper on fuzzy sets; the 1973 paper on the analysis of complex systems and decision processes; and the 1979 report (1981 paper) on possibility theory and soft data analysis.
The principal constituents of soft computing (SC) are fuzzy logic (FL), neural network theory (NN) and probabilistic reasoning (PR), with the latter subsuming belief networks, evolutionary computing including DNA computing, chaos theory and parts of learning theory.
Fuzzy Set: 1965 … Fuzzy Logic: 1973 … Soft Decision: 1981 … BISC: 1990 … Human-Machine Perception: 2000 - …

SOFT COMPUTING
“Soft computing is a consortium of computing methodologies which collectively provide a foundation for the Conception, Design and Deployment of Intelligent Systems.” L.A. Zadeh
“...in contrast to traditional hard computing, soft computing exploits the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution-cost, and better rapport with reality.” L.A. Zadeh
The role model for Soft Computing is the Human Mind.

SOFT COMPUTING
• Neuro-Computing (NC)
• Fuzzy Logic (FL)
• Genetic Computing (GC)
• Probabilistic Reasoning (PR)
• Chaotic Systems (CS), Belief Networks (BN), Learning Theory (LT)

Related Technologies
• Statistics (Stat.)
• Artificial Intelligence (AI):
  – Case-Based Reasoning (CBR)
  – Rule-Based Expert Systems (RBR)
  – Machine Learning (Induction Trees)
  – Bayesian Belief Networks (BBN)

SOFT COMPUTING
• Neural Networks: create complicated models without knowing their structure; gradually adapt existing models using “training data”
• Fuzzy Logic: fuzzy rules are easy and intuitively understandable
• Genetic Algorithms: find parameters through evolution (usually when a direct algorithm is unknown)

Neural Networks
• Ensemble of simple processing units
• Connection weights define functionality
• Derive weights from “training data” (usually gradient descent based algorithms)

Fuzzy Logic
• Allow partial membership to sets
• Express knowledge through linguistic terms and rules (“Computing with Words”)
• Derive sets of fuzzy rules from data (usually based on heuristics)

Evolutionary Algorithms
• Finding an optimal structure (parameters) for a model is often complicated (due to large search space, complex structure)
• Find structure (parameters) through evolution (generate population, evaluate, breed new population)

Why Fuzzy Logic?
• Uncertainty in the data and laws of nature*
• Imprecision due to measurement and human error
• Incomplete and sparse information
• Subjective and linguistic rules
* “So far as the laws of mathematics refer to reality, they are not certain; and so far as they are certain, they don’t refer to reality.” Albert Einstein

Words are less precise than numbers!
• When information is too imprecise
• Close to reality
• Complex problem
“As complexity rises, precise statements lose meaning, and meaningful statements lose precision.” Lotfi A. Zadeh

Why Neural Network?
• Structure-free and nonlinear mapping
• Multivariable systems
• Trains easily based on historical data
• Parallel processing and fault tolerance*
* Much like the human brain

Why Evolutionary Computing?
• For multi-objective and multi-criteria optimization purposes
• Resolving conflict
• Capability to learn adaptively and to be self-aware*
* Darwin's law

SOFT COMPUTING: Neural Network

Biological Neuron
[Figure: biological neuron]

Biological vs. Artificial Neuron
[Figure: neurons with labels Synapse, Axon, Soma, Dendrites]

Analogy between biological and artificial neural networks
  Biological Neural Network    Artificial Neural Network
  Soma                         Neuron
  Dendrite                     Input
  Axon                         Output
  Synapse                      Weight
[Figure: biological neurons (soma, synapse, dendrites, axon) next to a network with input signals, input layer, middle layer, output layer and output signals]

Artificial Neuron
[Figure: neuron with input signals, weights and output signals]

Schematic Diagram for Single Neuron
Inputs $x_1, \dots, x_k$ with weights $w_1, \dots, w_k$, plus a bias $b$ attached to a constant input 1:
$s = b + \sum_{i=1}^{k} w_i x_i$, $y = f(s)$
that is, $y = f[\, b + w_1 x_1 + w_2 x_2 + \dots + w_k x_k \,]$.

Activation Functions
[Figure: f(s) plotted against s for sgn(s) and tanh(s), both bounded between -1 and 1]

Multilayer perceptron with two hidden layers
[Figure: input signals, input layer, first hidden layer, second hidden layer, output layer, output signals]

Artificial Neural Network (Feedforward)
[Figure: multilayer perceptron ANN mapping inputs X1, X2 to output Y]

Artificial Neural Network vs. Human Brain
• Largest neural computer: 20,000 neurons
• Worm’s brain: 1,000 neurons
• But the worm’s brain outperforms neural computers
• It’s the connections, not the neurons!
• Human brain: 100,000,000,000 neurons, 200,000,000,000,000 connections

Brain vs. Computer Processing
• Processing speed: milliseconds vs. nanoseconds.
• Processing order: massively parallel vs. serial.
• Abundance and complexity: between 10^11 and 10^14 neurons operate in parallel in the brain at any given moment, each with between 10^3 and 10^4 abutting connections.
• Knowledge storage: adaptable vs. new information destroys old information.
• Fault tolerance: knowledge is retained through redundant, distributed encoding of information vs. corruption of a conventional computer's memory is irretrievable and leads to failure.

NEURAL NETWORK: Different Non-Linearly Separable Problems
  Structure       Types of decision regions
  Single-layer    Half plane bounded by hyperplane
  Two-layer       Convex open or closed regions
  Three-layer     Arbitrary (complexity limited by no. of nodes)
[Figure: for each structure, example regions for classes A and B under the columns Exclusive-OR Problem, Classes with Meshed Regions, Most General Region Shapes]

Learning Paradigms
• Supervised learning: network trained by showing a set of input and output patterns
• Unsupervised learning: network is shown only the input patterns
• Reinforcement learning: information on quality of response is available

Neural Network Classification
Neural Network Models:
• Multilayer Perceptron
• Kohonen
• Radial Basis Functions
• Generalised Regression
• Probabilistic
• ART
• Recurrent

Artificial Neural Network (Feedforward and Recurrent)
[Figure: feedforward network (input layer, hidden layer, output layer); recurrent network without hidden units (inputs, outputs, unit delay operators); recurrent network with hidden units]

Other Types of Neural Networks
• SOM
• Committee Machines
• RBFN: input layer $x_1, \dots, x_m$; hidden layer of N Green’s functions G; output layer $F(x)$ with weights $\omega_1, \dots, \omega_N$, where $\omega = (G^{T} G)^{-1} G^{T} d$
• ART1

Artificial Neural Network (recurrent)
[Figure: recurrent network with input layer, hidden layer, output layer]

SOFT COMPUTING: Fuzzy Logic

VARIABLES AND LINGUISTIC VARIABLES
• One of the most basic concepts in science is that of a variable
• A variable can be:
  - numerical (X=5; X=(3, 2); …)
  - linguistic (X is small; (X, Y) is much larger)
• A linguistic variable is a variable whose values are words or sentences in a natural or synthetic language (Zadeh 1973)
• The concept of a linguistic variable plays a central role in fuzzy logic and underlies most of its applications

Fuzzy Sets
• Classical logic: element x belongs to set A or it does not: µ(x) ∈ {0,1}
• Fuzzy logic: element x belongs to set A with a certain degree of membership: µ(x) ∈ [0,1]
[Figure: µA(x) for A=“young” over x [years], once as a crisp step between 1 and 0 and once as a gradual fuzzy transition]

Membership Functions
• Predicate “Old”, crisp set: µ(x) = 1 if x ≥ 50 years, µ(x) = 0 if x < 50 years (the slide labels this function “Middle-Aged(x)”)
• Predicate “Old”, fuzzy set: [Figure: gradual membership function]
• Other types of membership function: triangular, trapezoid, Gaussian

EXAMPLES OF F-GRANULATION (LINGUISTIC VARIABLES)
• color: red, blue, green, yellow, …
• age: young, middle-aged, old, very old
• size: small, big, very big, …
• distance: near, far, very far, not very far, …
[Figure: membership µ (0 to 1) over Age for young, middle-aged, old, very old]

Fuzzy Logic
Crisp Input -> Fuzzifier Module -> Fuzzy Inference Engine -> Defuzzifier Module -> Crisp Output
The Fuzzy Inference Engine draws on the Linguistic Rule Knowledge Base.
Topics: Fuzzy Sets, Fuzzy Numbers, Fuzzification, Fuzzy Operators, Fuzzy Rules, Fuzzy Inference, Defuzzification

Fuzzy Rule Base
If Age is Old then Roya is 70
If Age is Middle-Aged then Roya is 45
If Age is Young then Roya is 20

Inferencing
Decision = {20|0, 45|0.75, 70|0.25}
[Figure: membership µ over Age, with the input falling mostly in Middle-Aged]

Defuzzification
Output = (20×0 + 45×0.75 + 70×0.25) ÷ (0 + 0.75 + 0.25)
Output = 51.25

Schema of a Fuzzy Decision
• Fuzzification: the measured temperature t gives µcold = 0.7, µwarm = 0.2, µhot = 0.0
• Inference (rule-base):
  if temp is cold then valve is open
  if temp is warm then valve is half
  if temp is hot then valve is close
• Defuzzification: the output sets µopen, µhalf, µclose are weighted by 0.7, 0.2 and 0.0 respectively and combined into the crisp output v for the valve setting

WHAT IS FUZZY LOGIC?
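As a concrete instance of fuzzy inference followed by defuzzification, the Age rule base shown earlier fires with strengths 0, 0.75 and 0.25 for the consequents 20, 45 and 70. The sketch below reproduces that weighted-average step in Python. The function name and the list-of-pairs representation are assumptions made for illustration; the slides give only the arithmetic.

```python
def weighted_average_defuzzify(decision):
    """Return the crisp output for (crisp_value, firing_strength) pairs.

    Hypothetical helper: implements the weighted average
    sum(value * strength) / sum(strength) used in the Age example.
    """
    total_strength = sum(strength for _, strength in decision)
    if total_strength == 0:
        raise ValueError("no rule fired, output is undefined")
    weighted_sum = sum(value * strength for value, strength in decision)
    return weighted_sum / total_strength


# Decision = {20|0, 45|0.75, 70|0.25} from the inferencing step
age_decision = [(20, 0.0), (45, 0.75), (70, 0.25)]
output = weighted_average_defuzzify(age_decision)
print(output)  # 51.25
```

The same helper would apply to the valve example once crisp values are assigned to "open", "half" and "close"; the slides do not state those values, so they are left out here.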
• Fuzzy logic has been and still is, though to a lesser degree, an object of controversy
• For the most part, the controversies are rooted in misperceptions, especially a misperception of the relation between fuzzy logic and probability theory
• A source of confusion is that the label “fuzzy logic” is used in two different senses:
  (a) narrow sense: fuzzy logic is a logical system
  (b) wide sense: fuzzy logic is coextensive with fuzzy set theory
• Today, the label “fuzzy logic” (FL) is used for the most part in its wide sense

PRINCIPAL APPLICATIONS OF FUZZY LOGIC
• control
• consumer products
• industrial systems
• automotive
• decision analysis
• medicine
• geology
• pattern recognition
• robotics
[Figure: FL containing CFR; CFR: calculus of fuzzy rules]

EMERGING APPLICATIONS OF FUZZY LOGIC
• computational theory of perceptions
• natural language processing
• financial engineering
• biomedicine
• legal reasoning
• forecasting

Mamdani Inference System
[Figure: input MFs A1, A2 over X and B1, B2 over Y; output MFs C1, C2 over Z; for the input (x, y) the output is Z = centroid of area]

First-Order Takagi Sugeno FIS
• Fuzzy rule base
  If X is A1 and Y is B1 then Z = p1*x + q1*y + r1
  If X is A2 and Y is B2 then Z = p2*x + q2*y + r2
• Fuzzy reasoning
  Each rule fires with a strength w1, w2 obtained by combining (Π) the memberships of x in A1, A2 and of y in B1, B2 (the figure marks x=1 and y=3):
  z1 = p1*x + q1*y + r1
  z2 = p2*x + q2*y + r2
  z = (w1*z1 + w2*z2) / (w1 + w2)

SOFT COMPUTING: Neuro-Fuzzy Computing

Neuro-Fuzzy Modeling (Hybrid Model)
  Neural Networks                              Fuzzy Inference System
  Prior rule-based knowledge cannot be used    Prior rule-based knowledge can be incorporated
  Learning from scratch                        Cannot learn (use linguistic knowledge)
  Black box                                    Interpretable (if-then rules)
  Complicated learning algorithms              Simple interpretation and implementation
  Difficult to extract knowledge               Knowledge must be available

Adaptive Neuro-Fuzzy Inference System (ANFIS)
• Takagi Sugeno FIS
• Input partitioning
• LSE + gradient descent training
[Figure: inputs x, y; membership nodes A1, A2, B1, B2 (nonlinear parameters); product nodes Π giving w1 … w4; rule outputs w1*z1 … w4*z4 (linear parameters); output z = Σ wi*zi / Σ wi]

                                     Forward pass     Backward pass
  MF parameters (nonlinear)          fixed            steepest descent
  Coefficient parameters (linear)    least-squares    fixed

Evolutionary Design of Neuro-Fuzzy Systems

SOFT COMPUTING: Evolutionary Computing

What is a GA?
Genetic Algorithms (GAs) are adaptive heuristic search algorithms based on the evolutionary ideas of natural selection and genetics. As such they represent an intelligent exploitation of a random search used to solve optimization problems. Although randomized, GAs are by no means random; instead they exploit historical information to direct the search into the region of better performance within the search space. The basic techniques of GAs are designed to simulate processes in natural systems necessary for evolution, especially those that follow the principles first laid down by Charles Darwin of "survival of the fittest". In nature, competition among individuals for scanty resources results in the fittest individuals dominating over the weaker ones.

Evolutionary Algorithms
Families:
• Evolution Strategies
• Genetic Programming
• Genetic Algorithms
• Classifier Systems
• Evolutionary Programming
Components:
• genetic representation of candidate solutions
• genetic operators
• selection scheme
• problem domain

History of GAs
Genetic Algorithms were invented to mimic some of the processes observed in natural evolution. Many people, biologists included, are astonished that life at the level of complexity that we observe could have evolved in the relatively short time suggested by the fossil record. The idea with GA is to use this power of evolution to solve optimization problems. The father of the original Genetic Algorithm was John Holland, who invented it in the early 1970's.
Classes of Search Techniques
[Figure: taxonomy of search techniques, including DFS, BFS, Tabu Search, Hill Climbing, Genetic Programming]

The Genetic Algorithm
• Directed search algorithms based on the mechanics of biological evolution
• Developed by John Holland, University of Michigan (1970’s)
  o To understand the adaptive processes of natural systems
  o To design artificial systems software that retains the robustness of natural systems
• The genetic algorithms, first proposed by Holland (1975), seek to mimic some of the processes of natural evolution and selection.
• The first step of Holland’s genetic algorithm is to represent a legal solution of a problem by a string of genes known as a chromosome.

Evolutionary Programming
• First developed by Lawrence Fogel in 1966 for use in pattern learning
• Early experiments dealt with a number of Finite State Automata (FSA)
• FSA were developed that could recognise recurring patterns and even primeness of numbers
• Later experiments dealt with gaming problems (coevolution)
• More recently it has been applied to training of neural networks, function optimisation and path planning problems

Biological Terminology
• gene
  • functional entity that codes for a specific feature, e.g. eye color
  • set of possible alleles
• allele
  • value of a gene, e.g. blue, green, brown
  • codes for a specific variation of the gene/feature
• locus
  • position of a gene on the chromosome
• genome
  • set of all genes that define a species
  • the genome of a specific individual is called genotype
  • the genome of a living organism is composed of several chromosomes
• population
  • set of competing genomes/individuals

Genotype versus Phenotype
• genotype
  • blueprint that contains the information to construct an organism, e.g. human DNA
  • genetic operators such as mutation and recombination modify the genotype during reproduction
  • genotype of an individual is immutable (no Lamarckian evolution)
• phenotype
  • physical make-up of an organism
  • selection operates on phenotypes (Darwin’s principle: “survival of the fittest”)
[Figure courtesy of U.S. Department of Energy Human Genome Program, http://www.ornl.gov/hgmis]

Genotype Operators
• recombination (crossover)
  • combines two parent genotypes into a new offspring
  • generates new variants by mixing existing genetic material
  • stochastic selection among parent genes
• mutation
  • random alteration of genes
  • maintains genetic diversity
• in genetic algorithms crossover is the major operator whereas mutation only plays a minor role

Crossover
• crossover applied to parent strings with probability pc : [0.6..1.0]
• crossover site chosen randomly
• one-point crossover (site after the fourth bit)
  parent A     1101|0      offspring A  11011
  parent B     1000|1      offspring B  10000
• two-point crossover (sites after the third and fourth bits)
  parent A     110|1|0     offspring A  11000
  parent B     100|0|1     offspring B  10011

Mutation
• mutation applied to allele/gene with probability pm : [0.001..0.1]
• role of mutation is to maintain genetic diversity
  offspring:          11000
  mutate fourth allele (bit flip)
  mutated offspring:  11010

Structure of an Evolutionary Algorithm
[Figure: a population of genotypes (bit strings such as 10111, 10011, 10001) is mapped by a coding scheme into phenotype space, evaluated by a fitness function f(x), and transformed by selection, recombination and mutation]

Pseudo Code of an Evolutionary Alg.
1. Create initial random population
2. Evaluate fitness of each individual
3. Termination criteria satisfied? If yes, stop; if no, continue
4. Select parents according to fitness
5. Recombine parents to generate offspring
6. Mutate offspring
7. Replace population by new offspring and return to step 2

Areas EAs Have Been Used In
• Design of electronic circuits
• Telecommunication network design
• Artificial intelligence
• Study of atomic clusters
• Study of neuronal behaviour
• Neural network training & design
• Automatic control
• Artificial life
• Scheduling
• Travelling Salesman Problem
• General function optimisation
• Bin Packing Problem
• Pattern learning
• Gaming
• Self-adapting computer programs
• Classification
• Test-data generation
• Medical image analysis
• Study of earthquakes

Swarm Intelligence
Computational models (or metaheuristics) that imitate the social behaviour of biological species (ants, bees, fish, birds, …):
“SI is the emergent collective intelligence of groups of simple agents.” (Bonabeau et al., 1999)
[Slide footer: Roma, 04/03/2008, CATTID, Università di Roma “La Sapienza”]

Swarm intelligence (cont.)
• Imitation: behaviour of groups of individuals (swarms), ants, birds, fish, …
• Optimized performance: search for food, movement of the group, …

Swarm Intelligence
• Advantages:
  • they do not require the gradient
  • they can avoid local minima
  • distributed optimization: no centralized coordination is needed; “agents” influence one another to reach the optimum
• Drawbacks: they require a high computational cost and some skill in the implementation

Swarm Intelligence
• ACO: Ant Colony Optimization
• PSO: Particle Swarm Optimization

Swarm intelligence (cont.)
ACO: Finding the shortest path between the nest and the food when getting around an obstacle. The method considers the indirect transmission of information between ants (stigmergic information), obtained through the deposit of pheromone.
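The pheromone mechanism just described can be sketched with a toy simulation of two alternative paths around an obstacle. This is only an illustrative sketch, not the ACO algorithm from any particular reference: the path lengths, the number of ants, the evaporation rate, the deposit rule (1/length per ant) and the function name are all assumptions chosen for the example.

```python
import random


def simulate_two_paths(short_len=1.0, long_len=2.0, n_ants=20,
                       n_iterations=50, evaporation=0.5, seed=0):
    """Toy two-path pheromone model (all parameter values are assumptions)."""
    rng = random.Random(seed)
    lengths = {"short": short_len, "long": long_len}
    pheromone = {"short": 1.0, "long": 1.0}  # both paths start equal

    for _ in range(n_iterations):
        deposits = {"short": 0.0, "long": 0.0}
        total = pheromone["short"] + pheromone["long"]
        for _ in range(n_ants):
            # Each ant picks a path with probability proportional to pheromone.
            if rng.random() < pheromone["short"] / total:
                path = "short"
            else:
                path = "long"
            # Assumed rule: a shorter path receives more pheromone per ant.
            deposits[path] += 1.0 / lengths[path]
        for path in pheromone:
            # Evaporation removes part of the old trail, then deposits are added.
            pheromone[path] = (1.0 - evaporation) * pheromone[path] + deposits[path]
    return pheromone


trail = simulate_two_paths()
print(trail["short"] > trail["long"])
```

Because every ant on the shorter path deposits more pheromone and evaporation erases the weaker trail, the colony's choices drift toward the shorter path without any central coordination.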
PSO: The method considers a set of particles moving in the solution space, in which every point represents a solution of the problem of interest. The movement of each particle is governed by inertia and by knowledge of two positions: the one where the particle itself previously obtained its best solution, and the one where the whole group obtained the best solution overall.

[Figure: illustration of how an ant colony gets around an obstacle and identifies the shortest path]
[Figure (cont.): the same mechanism, with evaporation of the pheromone on the longer path]

SOFT COMPUTING: Hybrid Systems

Computing Models
• HARD COMPUTING: Precise Models
  - Symbolic Logic Reasoning (Traditional AI)
  - Traditional Numerical Modeling and Search
• SOFT COMPUTING: Approximate Models
  - Approximate Reasoning
    - Probabilistic Models
    - Multivalued & Fuzzy Logics
  - Functional Approximation and Randomized Search
    - Neural Networks
    - Evolutionary Computing

Soft Computing: Hybrid Probabilistic Systems
• Approximate Reasoning: Probabilistic Models, Multivalued & Fuzzy Logics
• Probabilistic Models: Bayesian Belief Nets, Dempster-Shafer
• Hybrids: Probability of Fuzzy Events, Belief of Fuzzy Events, Fuzzy Influence Diagrams

Soft Computing: Hybrid FL Systems
• Approximate Reasoning: Multivalued & Fuzzy Logics
• Functional Approximation / Randomized Search: Neural Networks, Evolutionary Computing
• Fuzzy Systems hybrids: NN modified by FS (FNS), FL tuned by NN (NFS), FL-EA

Soft Computing: Hybrid NN Systems
• Approximate Reasoning: Multivalued & Fuzzy Logics
• Functional Approximation / Randomized Search: Neural Networks, Evolutionary Algorithms
• Neural network types shown: RBF, Feedforward NN, Recurrent NN, Single/Multiple Layer Perceptron, Hopfield, SOM, ART
• Hybrids: NN parameters controlled by FLC; NN structure and weights generated by EAs

Soft Computing: Hybrid EA Systems
• Approximate Reasoning: Multivalued & Fuzzy Logics
• Functional Approximation / Randomized Search: Neural Networks, Evolutionary Algorithms
• Evolutionary algorithm types: Evolution Strategies, Genetic Algorithms, Evolutionary Programs, Genetic Programs
• Hybrids: EA parameters (N, Pcr, Pmu) controlled by FLC; EA-based search intertwined with hill-climbing; EA parameters controlled by EA

SOFT COMPUTING

Conclusions: SC New Directions
• Present -> Short Term Future
  - SC technologies will widen beyond their current constituents.
    - Artificial Immune Systems (for Information Assurance)
    - Fractals (as building blocks in GP or for bacteria identification)
  - Development of hybrid SC systems with other AI paradigms
    - EA for model/software update
    - Evolutionary software agents
  - Push for low-cost solutions / intelligent tools will lead to deployment of hybrid SC systems that efficiently integrate reasoning and search techniques.
• Medium Term Future
  - SC technologies are (or will soon be) implemented on alternative, nonstandard computing mechanisms
    - Evolvable Hardware (Field Programmable Gate Arrays)
    - Bio-inspired Systems: DNA and Molecular Computing
• Great Potential for Hybrid Soft Computing with new computing mechanisms: DNA and Molecular Computing, Intelligent Matter

Conclusions: SC Experiments
• Bio-inspired Systems: DNA and Molecular Computing (Examples)
  - Molecular Genetic Programming (Wasiewicz & Mulawka, 2001)
    - Representation of GP graphs by DNA molecules, with crossover and negation operators implemented using data flow techniques in DNA computing
  - DNA-based Fuzzy Systems (Deaton & Garzon, 2001)
    - Encoding of fuzzy membership functions in Gibbs free energy (released upon DNA hybridization*), leading to the representation of a fuzzy rule set (fuzzy associative memory)
    - Fuzzy inference performed by modified hybridization process
  - DNA Neural Network Computation (Mills et al., 2001)
    - DNA analog neural network in which axons and neurons are replaced by diffusion and molecular recognition of DNA.
  - DNA Evolutionary Computation (Wood et al., 2001)
    - Binary-encoded Evolutionary Algorithms, with point-wise mutation and crossover, implemented in molecular computing, evaluating the “OneMax” fitness function.
* Watson-Crick hybridization of a pair of complementary DNA strands makes possible a representation of highly parallel selective operations that is key for molecular computing (Adleman 1994)
• Potential for EC: DNA & Molecular Computing process, in parallel, populations that are billions of times larger than the ones used in conventional computers
• Massive Information Storage (1 g DNA = 2 x 10^21 bits)
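To tie the evolutionary-algorithm loop from the earlier slides (create population, evaluate, select, recombine, mutate, replace) to the “OneMax” fitness function mentioned above, here is a minimal conventional-computer sketch in Python. It is an illustration under stated assumptions, not the molecular implementation cited: binary tournament selection, the population size, the string length and the values pc = 0.9 and pm = 0.01 (picked inside the ranges quoted on the Crossover and Mutation slides) are all choices made for this example, and every function name is hypothetical.

```python
import random


def one_max(bits):
    """OneMax fitness: the number of 1s in the bit string."""
    return sum(bits)


def one_point_crossover(a, b, site):
    """Swap the tails of two parents after position `site`."""
    return a[:site] + b[site:], b[:site] + a[site:]


def mutate(bits, pm, rng):
    """Flip each bit independently with probability pm."""
    return [1 - bit if rng.random() < pm else bit for bit in bits]


def tournament(population, rng):
    """Assumed selection scheme: the fitter of two random individuals wins."""
    a, b = rng.sample(population, 2)
    return a if one_max(a) >= one_max(b) else b


def run_ga(n_bits=20, pop_size=30, pc=0.9, pm=0.01,
           max_generations=200, seed=1):
    rng = random.Random(seed)
    # Create initial random population
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=one_max)
    for _ in range(max_generations):
        if one_max(best) == n_bits:  # termination criterion: all ones
            break
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = tournament(population, rng), tournament(population, rng)
            if rng.random() < pc:  # recombine with probability pc
                site = rng.randrange(1, n_bits)
                c1, c2 = one_point_crossover(p1, p2, site)
            else:
                c1, c2 = p1[:], p2[:]
            offspring.extend([mutate(c1, pm, rng), mutate(c2, pm, rng)])
        population = offspring[:pop_size]  # replace population by offspring
        best = max(population + [best], key=one_max)  # track best seen so far
    return best


# The slide's one-point example: parents 11010 and 10001, site after bit 4
child_a, child_b = one_point_crossover([1, 1, 0, 1, 0], [1, 0, 0, 0, 1], 4)
print(child_a, child_b)  # [1, 1, 0, 1, 1] [1, 0, 0, 0, 0]

best = run_ga()
print(one_max(best), "ones out of", len(best))
```

Tournament selection is used here only because it stays well defined when all fitness values are zero; a fitness-proportionate (roulette-wheel) scheme would fit the "select parents according to fitness" step equally well.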