* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Artificial Intelligence 4. Knowledge Representation
Survey
Document related concepts
Concept learning wikipedia , lookup
Unification (computer science) wikipedia , lookup
Pattern recognition wikipedia , lookup
Embodied cognitive science wikipedia , lookup
Linear belief function wikipedia , lookup
Machine learning wikipedia , lookup
Hard problem of consciousness wikipedia , lookup
History of artificial intelligence wikipedia , lookup
Computer Go wikipedia , lookup
Ecological interface design wikipedia , lookup
Logic programming wikipedia , lookup
Gene expression programming wikipedia , lookup
Transcript
Artificial Intelligence 18. Revision Lecture Course V231 Department of Computing Imperial College © Simon Colton Overview Exams: style and example questions – Covered in next Thursday’s lecture (12pm) Today we review the course material – – Could you get a job in the AI industry? Knowledge, understanding, abilities What can I get you to write about? What can I get you to explain? What can I get you to do? Try not to infer anything explicit about the exam! – Disclaimer: I may miss something today which appears on the exam. I will certainly cover more stuff than is on the exam. Four General Areas Covered General notions of AI – – Automated reasoning – – Predicate logic, automating deduction, Resolution theorem proving, constraint solving Machine learning – – Characterisations, autonomous agents, search, Representations, game playing to demonstrate these notions Overview (FIND-S), decision trees, Artificial Neural Networks, Inductive Logic Programming Evolutionary approaches – Genetic algorithms, genetic programming Characterisations of AI All about understanding – Know that there are different ways to split up AI – Long term goals – weak & strong AI Inspirations, e.g., brain, society, logic, evolution,… Methodologies: scruffies (me) and neat (good AI people) Techniques, representations, applications, products An important characterisation: Into general problems to be solved, – E.g., learning, proving, competing, communicating, creating, – Exhibiting life, etc. Artificial Intelligence Agents Knowledge: – – Definitions of rational, autonomous, agents What we have to worry about internally (agent) – What we have to worry about externally (environment) Environmental knowledge, utility functions, goals, etc. Software engineering considerations Accessibility, determinism, episodes, etc. Understanding – – Why we use agents as a concept in AI Why we worry about rationality and autonomy Search in Problem Solving Knowledge – How to specify a problem as a search problem – What the general problems are in search applications – – Depth first, breadth first, iterative deepening, bidirectional Difference between path cost and heuristic functions Types of heuristic search – What are you looking for (path or artefact)? Completeness/soundness, time/space tradeoffs, background Types of uninformed search – Initial state, operators, goal test Uniform path cost, greedy, A*, IDA* Admissible heuristics, comparing heuristics with effective branching Hill climbing searches Local max/min problems, random restart, simulated annealing Search in Problem Solving Understanding – – – Agenda analogy and graph analogy Why we have to search for an answer to problems (+coursework) Why we need different types of search – – Uninformed searches, heuristic searches Why completeness & soundness are important notions Why heuristics are needed, what they are in general Abilities – – Specify a problem as a search problem Simulate a specific kind of search – E.g., which nodes are expanded next (draw graph, etc.) Calculate effective branching rates, compare heuristics Calculate search space sizes, calculate heuristic measures Knowledge Representation Knowledge – General types of representation available – Logical representations available Propositional, predicate, higher order, fuzzy, etc. Understanding – – – Logical, graphical, production rules, frames Why different representations are required Limitations of each representation Expressiveness in logical representations Abilities – Represent information Logically, as semantic networks, in a frame based way, etc. Game Playing Knowledge – – – – – Understanding – What a zero-sum two player game is What the minimax principle is in general, what cutoff search is Evaluations functions, weighted linear functions Alpha-beta pruning Expectimax, chance nodes Why using minimax strategies is rational Abilities – – – Write down entire search trees for simple games Propagate scores from the bottom to top of the trees Work out the next move for a player, including expectimax Representing Knowledge in Predicate Logic Knowledge – Syntax and semantics of first order predicate logic – – – – – Sentences, connectives, constants, predicates, variables, functions, quantifiers, etc. What quantifiers mean, their scope, etc. Instantiation, ground variables, etc. Horn clauses, logic programs, body, head How search is undertaken in Prolog, LIPS, WAM OR-parallelism, AND-parallelism Representing Knowledge in Predicate Logic Understanding – Differences between propositional and predicate logic – – – Benefits to predicate logic (expressivity) What the terms syntax and semantics mean Prolog is a declarative, not procedural language How logic-based expert systems work in general Abilities – – Translate English to first order predicate logic Translate first order predicate logic to English – – (Without making mistakes either way!) Simulate a Prolog-style search Identify parts of logic sentences and logic databases e.g., quantifiers, constants, head, body, literal, Horn clauses Making Deductive Inferences Knowledge – – That and, or are commutative, distributive Some commonly used propositional equivalence rules – Some commonly used implication rules – Unit resolution, and elimination, or introduction, Existential elimination, etc. I tend to supply inference rules in exams – Double negation, rules [E1], [E4], [E5], de Morgans law Not simple/common ones Different ways of chaining together inference steps Forward/backward chaining, proof by contradiction Making Deductive Inferences Understanding – – – – What it means for two sentences to be logically equivalent What it means for a sentence to be false What it means for one sentence to entail another How rewrite rules can be used for proving equivalences Abilities – Use truth tables to – Apply inference rules – Show equivalences, tautologies, that a statement is false That one statement implies another Show what’s above and below the line Translate sentences: Be fluent in rewriting sentences The Resolution Method Knowledge – – – – What conjunctive normal form is What a substitution is, what unification does Overview of the unification algorithm The resolution rule Unit resolution, full resolution, generalised resolution Understanding – – – That resolution is refutation-complete Why we need conjunctive normal form/unification Why the occurs-check is important in unification The Resolution Method Abilities – Translate something into conjunctive normal form By using equivalence rules Organising quantifiers, standardising variables, etc. Existential elimination – Put a constant in place of an existential variable – Not using full skolemisation (we skipped over that) – Prepare a set of sentences for use in a resolution proof – Needs all sentences as single clauses (just split them) Find a unifying set of substitutions Apply them to unify two sentences Resolution Theorem Proving Knowledge – – – – Specifying a problem as axioms and theorem As a search problem: operators, initial states (CNF), the goal test Dealing with equality (demodulation) Heuristic strategies: – Overview of some other topics Unit preference, set of support, input resolution, subsumption Higher order proving, interactive, etc. Understanding – – – – Why deriving the empty clause means a contradiction Why we negate the theorem statement Why proof by contradiction is valid Know that resolution has been applied to mathematics Resolution Theorem Proving Abilities: – Prove a theorem using the resolution method Remember to negate theorem statement Follow proof all the way Draw the proof tree Or organise the resolution steps in a way – – Which makes me think you know what you’re doing Deduce something from a set of axioms Not necessarily related to proving something Overview of Machine Learning Knowledge – What ML problem constituents are: – Examples (pos & neg), background information What kind of errors can occur in the data What ML method constituents are: Representation of solutions (v. important) How to search for solutions, how to choose between solutions – – The FIND-S method – Simple, guaranteed to find the most specific solutions How we assess hypotheses – Occam’s (Ockham’s) razor: choose simplest if all else equal False negatives, false positives Predictive accuracy on training set, test set, comprehensibility How we assess learning methods: cross-validation, hold-back Definition of overfitting Overview of Machine Learning Understanding – – – Learning from examples How induction differs from deduction That positives and negatives are: – – What robustness means That hypothesis accuracy doesn’t necessarily mean that the method is a good one Correct and incorrect classifications That methods can overfit data (memorise) Abilities – – – Specify machine learning problems Identify problem constituents/method constituents Simulate search in the FIND-S method Decision Tree Learning Knowledge – – – – What entropy and information gain are How the ID3 algorithm works How we can try to avoid overfitting trees What are good problems for decision tree approaches Attribute-value pairs, discrete values, etc. Understanding – What the tree representation is – Why it is both a graphical and a logical representation How it can be thought of as a categorisation problem What entropy is estimating Decision Tree Learning Abilities – – – – – To read decision trees To construct decision trees from English specifications Specify a learning problem for decision tree learning Calculate entropy and information gain for attributes Simulate the ID3-algorithm in action Calculate information gain, choose attributes, restrict data (Sv), how/when to end branches Two Layer ANNs Knowledge – How information is stored in an ANN (in the weights) – – – How ANNs are used to classify examples What are input/hidden/output units/layers What a perceptron is – How weighted sums are calculated Threshold functions: step, sigma, linear Perceptron training rule Using a learning rate How to calculate weight changes – – Target and observed output values for output units Epochs over all the examples What linearly seperable means, what boolean function means Two Layer ANNs Understanding – – – – – Difference between symbolic and non-symbolic representations Motivation from biology (and the limitations of this) Why perceptrons are limited Why linear separability is required for perceptrons to learn something Why learning rate is usually set to a small value (undo previous) Abilities – – – – – Write down a perceptron to calculate a given function (e.g., boolean) Describe what a ANN would calculate Propagate values from left to right to make classifications Calculate weight changes in the perceptron learning rule Simulate the perceptron learning rule Multi-layer ANNs Knowledge – – Perceptron units in multi-layer ANNs How sigmoid units calculate outputs from weighted sums – How feed forward networks calculate values – Overview of how this works (prop-back), epochs, weight changes, initial random assignment of weights, etc. How to avoid local minima – How these values are turned into classifications Backpropagation Learning Routine – Formula for the sigma function Calculating network error (in overview) Adding momentum How to avoid overfitting Validation set, weight decay Multi-Layer ANNs Understanding – – Why sigma being differentiable is important Which kinds of problems are suitable for ANNs – Long training, short execution, comprehension not a problem, etc. Why momentum works Abilities – Feed values forward to calculate outputs, use to classify examples – – – – – E.g., numerical functions, pixel data, etc., Calculate output values for the sigma function Calculate error terms for the output units (given formula) Calculate error terms for the hidden units (given formula) Calculate weight changes using the error terms Simulate the back-propagation algorithm Inductive Logic Programming Knowledge – Problem context and specification – How we can invert resolution – Logic programs (background, E+, E-, hypothesis) Prior satisfiability and necessity Posterior satisfiability and sufficiency To use induction (rules given) to find new sentences Absorption, identification, etc. What V and W operators are How ILP systems search for hypotheses Specific to general (using induction) G to S (using deduction) How pruning and sorting increase efficiency How language restriction increase efficiency Inductive Logic Programming Understanding – Why logic programs are a good representation – – Why the problem specification is necessary Why we invert resolution for our operators Easy to read, logical Prove that the observations follows from axioms + H Abilities – Apply rules of inference such as absorption – – Rules would be given Use resolution to demonstrate how the hypothesis proves the observations (other sentences) Determine which are more general/specific sentences using entailment Constraint Satisfaction Problems Knowledge – What a formal representation of a CSP is – – In terms of variables, domains, constraints (fully written out) What a binary CSP is What arc consistency is How to make a problem specification arc-consistent – – – – – By removing values from domains of variables What happens in backtracking search What forward checking is What variable and value ordering heuristics do What the fail first heuristic is Understanding – – Why we write out CSPs formally Why we write out constraints as tuples which are allowed Constraint Satisfaction Problems More understanding – – – – Why binary CSPs are important Why forward checking works Why fail first is so-called, and why it works Why value-ordering methods may be bad ideas Abilities – – – – To specify a problem as a formal constraint satisfaction problem (even if this is annoying for you!) Translate constraint formalisms to general constraints (e.g., using less than/greater than in linear arithmetic) Make problem specifications arc-consistent Simulate back-tracking search (with forward checking) Show understanding of fail-first etc. by doing hand calculations Genetic Algorithms Knowledge – Of the evolutionary approach in overview – Generate populations, fitness functions, recombination, etc. Canonical genetic algorithm Describe this schematically How individuals are selected to mate (fitness function, number which are guaranteed entry into intermediate population) How pairs are chosen to mate, and chosen to produce offspring How offspring are produced through recombination Genetic Algorithms Understanding – The inspiration from natural evolution (species/genes) – – – And its limitations That it’s difficult to specify solution representations That it’s difficult to specify fitness functions Why mutation is important (local maxima avoidance) Abilities – – Represent things (e.g., integers) as bit strings Perform crossover and mutation operators – One point and two point crossover Calculate evaluation functions Use these in fitness functions for the intermediate population Genetic Programming Knowledge – Representation of programs as graphs – Specifying a problem for a GP approach – – What result producing branches are Solution space of programs, fitness function (evaluation function) What the terminal set, function set, control parameters and termination conditions are How individuals are chosen for mating Using fitness for probabilities for intermediate population – – – Or using tournament selection, or ranking What the reproduction, crossover and mutation operators do What the crossover fragment is Genetic Programming Understanding – That it is automatic programming – – Why graphical representation of programs is good That architecture altering operations are used – That this is hard, other programs do this in a limited way On more sophisticated program spaces That GP can achieve human-level performance Abilities – Translate from functions (with things like if-statements) – – – To graphs and back again Determine function and terminal sets from programs Perform crossover and mutation Simulate (describe) how a GP approach would proceed