Lecture 45
Course Review and Future Research Directions
Friday, May 5, 2000
William H. Hsu
Department of Computing and Information Sciences, KSU
http://www.cis.ksu.edu/~bhsu
Readings:
  Chapters 1-10, 13, Mitchell
  Chapters 14-21, Russell and Norvig
CIS 830: Advanced Topics in Artificial Intelligence
Kansas State University
Department of Computing and Information Sciences

Main Themes: Artificial Intelligence and KDD
• Analytical Learning: Combining Symbolic and Numerical AI
  – Inductive learning
  – Role of knowledge and deduction in integrated inductive and analytical learning
• Artificial Neural Networks (ANNs) for KDD
  – Common neural representations: current limitations
  – Incorporating knowledge into ANN learning
• Uncertain Reasoning in Decision Support
  – Probabilistic knowledge representation
  – Bayesian knowledge and data engineering (KDE): elicitation, causality
• Data mining: KDD applications
  – Role of causality and explanations in KDD
  – Framework for data mining: wrappers for performance enhancement
• Genetic Algorithms (GAs) for KDD
  – Evolutionary algorithms (GAs, GP) as optimization wrappers
  – Introduction to classifier systems

Class 0: A Brief Overview of Machine Learning
• Overview: Topics, Applications, Motivation
• Learning = Improving with Experience at Some Task
  – Improve over task T,
  – with respect to performance measure P,
  – based on experience E.
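As one concrete reading of the T / P / E definition above, here is a minimal sketch (not from the lecture) of a toy learning problem. The integer domain, the cutoff value 30, and the names `learn_threshold` and `accuracy` are all illustrative assumptions.

```python
# Toy illustration of "improving with experience" (assumed example, not from the lecture).
# Task T: classify integers 0..100 as positive iff x >= TRUE_CUTOFF.
# Performance measure P: accuracy over all 101 integers.
# Experience E: a list of labeled examples (x, label).

TRUE_CUTOFF = 30  # hypothetical target concept, unknown to the learner


def label(x):
    """Ground-truth label used to generate experience and to score P."""
    return x >= TRUE_CUTOFF


def learn_threshold(examples):
    """Place the decision threshold midway between the largest negative
    and the smallest positive example seen so far."""
    largest_neg = max(x for x, y in examples if not y)
    smallest_pos = min(x for x, y in examples if y)
    return (largest_neg + smallest_pos) / 2


def accuracy(threshold):
    """Fraction of the domain 0..100 classified correctly by the threshold."""
    domain = range(101)
    correct = sum((x >= threshold) == label(x) for x in domain)
    return correct / len(domain)


# Two amounts of experience: 2 examples versus 11 examples.
small_experience = [(x, label(x)) for x in (0, 100)]
large_experience = [(x, label(x)) for x in range(0, 101, 10)]

acc_small = accuracy(learn_threshold(small_experience))
acc_large = accuracy(learn_threshold(large_experience))
print(acc_small, acc_large)
```

With more experience the learned threshold moves closer to the true cutoff, so the performance measure improves on the same task.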
• Brief Tour of Machine Learning
  – A case study
  – A taxonomy of learning
  – Intelligent systems engineering: specification of learning problems
• Issues in Machine Learning
  – Design choices
  – The performance element: intelligent systems
• Some Applications of Learning
  – Database mining, reasoning (inference/decision support), acting
  – Industrial usage of intelligent systems

Class 1: Integrating Analytical and Inductive Learning
• Learning Specification (Inductive, Analytical)
  – Instances X, target function (concept) c: X → H, hypothesis space H
  – Training examples D: positive, negative examples of target function c
  – Analytical learning: also given domain theory T for explaining examples
• Domain Theories
  – Expressed in formal language: propositional logic, predicate logic
  – Set of assertions (e.g., well-formed formulae) for reasoning about domain
    • Expresses constraints over relations (predicates) within model
    • Example: Ancestor (x, y) ← Parent (x, z) ∧ Ancestor (z, y).
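To show how a domain theory of this kind supports deduction, the following minimal sketch forward-chains over the Ancestor rule. The base case Ancestor(x, y) ← Parent(x, y), the function name `ancestors`, and the individuals in the facts are assumptions added for illustration; only the recursive rule comes from the slide.

```python
# Minimal forward-chaining sketch over a toy domain theory (illustrative only).
# Rules assumed here:
#   Ancestor(x, y) <- Parent(x, y)                      (base case, an added assumption)
#   Ancestor(x, y) <- Parent(x, z) AND Ancestor(z, y)   (the rule from the slide)


def ancestors(parent_facts):
    """Return the set of all derivable Ancestor(x, y) pairs."""
    derived = set(parent_facts)  # apply the base case to every Parent fact
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for (x, z) in parent_facts:
            for (z2, y) in list(derived):
                if z == z2 and (x, y) not in derived:
                    derived.add((x, y))
                    changed = True
    return derived


# Hypothetical facts: Parent(alice, bob), Parent(bob, carol)
parents = {("alice", "bob"), ("bob", "carol")}
result = ancestors(parents)
print(sorted(result))
```

The derivation of Ancestor(alice, carol) from the two Parent facts is the kind of proof that an explanation-based learner generalizes.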
• Determine
  – Hypothesis h ∈ H such that h(x) = c(x) for all x ∈ D
  – Such h are consistent with training data and domain theory T
• Integration Approaches
  – Explanation (proof and derivation)-based learning: EBL
  – Pseudo-experience: incorporating knowledge of environment, actuators
  – Top-down decomposition: programmatic (procedural) knowledge, advice

Classes 2-3: Explanation-Based Neural Networks
• Paper
  – Topic: Explanation-Based and Inductive Learning in ANNs
  – Title: Integrating Inductive Neural Network Learning and EBL
  – Authors: Thrun and Mitchell
  – Presenter: William Hsu
• Key Strengths
  – Idea: (state, action)-to-state mappings as steps in generalizable proof (explanation) for observed episode
  – Generalizable approach (significant for RL, other learning-to-predict inducers)
• Key Weaknesses
  – Other numerical learning models (HMMs, DBNs) may be more suited to EBG
  – Tradeoff: domain theory of EBNN lacks semantic clarity of symbolic EBL
• Future Research Issues
  – How to get the best of both worlds (clear DT, ability to generate explanations)?
  – Applications: to explanation in commercial, military, legal decision support
  – See work by: Thrun, Mitchell, Shavlik, Towell, Pearl, Heckerman

Classes 4-5: Phantom Induction
• Paper
  – Topic: Distal Supervised Learning and Phantom Induction
  – Title: Iterated Phantom Induction: A Little Knowledge Can Go a Long Way
  – Authors: Brodie and DeJong
  – Presenter: Steve Gustafson
• Key Strengths
  – Idea: apply knowledge to generate (pseudo-experiential) training data
  – Speedup: learning curve significantly shortened with respect to RL by application of “small amount” of knowledge
• Key Weaknesses
  – Haven’t yet seen how to produce plausible, comprehensible explanations
  – How much knowledge is “a small amount”? (How to measure?)
• Future Research Issues
  – Control, planning domains similar (but not identical) to robot games
  – Applications: adaptive (e.g., ANN, BBN, MDP, GA) agent control, planning
  – See work by: Brodie, DeJong, Rumelhart, McClelland, Sutton, Barto

Classes 6-7: Top-Down Hybrid Learning
• Paper
  – Topic: Learning with Prior Knowledge
  – Title: A Divide-and-Conquer Approach to Learning from Prior Knowledge
  – Authors: Chown and Dietterich
  – Presenter: Aiming Wu
• Key Strengths
  – Idea: apply programmatic (procedural) knowledge to select training data
  – Uses simulation to boost inductive learning performance (cf. model checking)
  – Divide-and-conquer approach (multiple experts)
• Key Weaknesses
  – Doesn’t illustrate form, structure of programmatic knowledge clearly
  – Doesn’t systematize and formalize model checking / simulation approach
• Future Research Issues
  – Model checking and simulation-driven hybrid learning
  – Applications: “consensus under uncertainty”, simulation-based optimization
  – See work by: Dietterich, Frawley, Mitchell, Darwiche, Pearl

Classes 8-9: Learning Using Prior Knowledge
• Paper
  – Topic: Refinement of Approximate Domain-Theoretic Knowledge
  – Title: Refinement of Approximate Domain Theories by Knowledge-Based Neural Networks
  – Authors: Towell, Shavlik, and Noordewier
  – Presenter: Li-Jun Wang
• Key Strengths
  – Idea: build relational explanations; compile into ANN representation
  – Applies structural, functional, constraint-based knowledge
  – Uses ANN to further refine domain theory
• Key Weaknesses
  – Can’t get refined domain theory back!
  – Explanations also no longer clear after “compilation” (transformation) process
• Future Research Issues
  – How to retain semantic clarity of explanations, DT, knowledge representation
  – Applications: intelligent filters (e.g., fraud detection), decision support
  – See work by: Shavlik, Towell, Maclin, Sun, Schwalb, Heckerman

Class 10: Introduction to Artificial Neural Networks
• Architectures
  – Nonlinear transfer functions
  – Multi-layer networks of nonlinear units (sigmoid, hyperbolic tangent)
  – Hidden layer representations
• Backpropagation of Error
  – The backpropagation algorithm
    • Relation to error gradient function for nonlinear units
    • Derivation of training rule for feedforward multi-layer networks
  – Training issues: local optima, overfitting
• References: Chapter 4, Mitchell; Chapter 4, Bishop; Rumelhart et al.
• Research Issues: How to…
  – Learn from observation, rewards and penalties, and advice
  – Distribute rewards and penalties through learning model, over time
  – Generate pseudo-experiential training instances in pattern recognition
  – Partition learning problems on the fly, via (mixture) parameter estimation

Classes 11-12: Reinforcement Learning and Advice
• Paper
  – Topic: Knowledge and Reinforcement Learning in Intelligent Agents
  – Title: Incorporating Advice into Agents that Learn from Reinforcements
  – Authors: Maclin and Shavlik
  – Presenter: Kiranmai Nandivada
• Key Strengths
  – Idea: compile advice into ANN representation for RL
  – Advice expressed in terms of constraint-based knowledge
  – Like KBANN, achieves knowledge refinement through ANN training
• Key Weaknesses
  – Like KBANN, lose semantic clarity of advice, policy, explanations
  – How to evaluate “refinement” effectively? Quantitatively? Logically?
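To make "compiling" a rule into a network unit concrete, here is a minimal sketch in the style of KBANN as it is commonly described: each antecedent of a conjunctive rule gets link weight ω, and the unit's bias is −(P − ½)ω for P antecedents. The value ω = 4, the example rule, and all names are assumptions for illustration; this is not Maclin and Shavlik's advice compiler.

```python
import math

OMEGA = 4.0  # assumed link weight for each antecedent


def compile_conjunction(antecedents, omega=OMEGA):
    """Map a rule 'IF a1 AND ... AND aP THEN c' to (weights, bias) of one sigmoid unit."""
    weights = {name: omega for name in antecedents}
    bias = -(len(antecedents) - 0.5) * omega
    return weights, bias


def activate(weights, bias, inputs):
    """Sigmoid activation of the compiled unit; missing inputs count as 0 (false)."""
    net = bias + sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical advice: IF enemy_near AND low_health THEN retreat
weights, bias = compile_conjunction(["enemy_near", "low_health"])
both = activate(weights, bias, {"enemy_near": 1.0, "low_health": 1.0})
one = activate(weights, bias, {"enemy_near": 1.0, "low_health": 0.0})
print(round(both, 3), round(one, 3))
```

The unit fires strongly only when every antecedent holds. Once gradient training adjusts these weights, they no longer read as the original rule, which is the loss of semantic clarity noted above.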
• Future Research Issues
  – How to retain semantic clarity of explanations, DT, knowledge representation
  – Applications: intelligent agents, web mining (spiders, search engines), games
  – See work by: Shavlik, Maclin, Stone, Veloso, Sun, Sutton, Pearl, Kuipers

Classes 13-14: Reinforcement Learning Over Time
• Paper
  – Topic: Temporal-Difference Reinforcement Learning
  – Title: TD Models: Modeling the World at a Mixture of Time Scales
  – Author: Sutton
  – Presenter: Vrushali Koranne
• Key Strengths
  – Idea: combine state-action evaluation function (Q) estimates over multiple time steps of lookahead
  – Effective temporal credit assignment (TCA)
  – Biologically plausible (simulates TCA aspects of dopaminergic system)
• Key Weaknesses
  – TCA methodology is effective but semantically hard to comprehend
  – Slow convergence: can knowledge help? How will we judge?
• Future Research Issues
  – How to retain clarity, improve convergence speed, of multi-time RL models
  – Applications: control systems, robotics, game playing
  – See work by: Sutton, Barto, Mitchell, Kaelbling, Smyth, Shafer, Goldberg

Classes 15-16: Generative Neural Models
• Paper
  – Topic: Pattern Recognition using Unsupervised ANNs
  – Title: The Wake-Sleep Algorithm for Unsupervised Neural Networks
  – Authors: Hinton, Dayan, Frey, and Neal
  – Presenter: Prasanna Jayaraman
• Key Strengths
  – Idea: use 2-phase algorithm to generate training instances (“dream” stage) and maximize conditional probability of data given model (“wake” stage)
  – Compare: expectation-maximization (EM) algorithm
  – Good for image recognition
• Key Weaknesses
  – Not all data admits this approach (small samples, ill-defined features)
  – Not immediately clear how to use for problem-solving performance elements
• Future Research Issues
  – Studying information theoretic properties of Helmholtz machine
  – Applications: image/speech/signal recognition, document categorization
  – See work by: Hinton, Dayan, Frey, Neal, Kirkpatrick, Hajek, Ghahramani

Classes 17-18: Modularity in Neural Systems
• Paper
  – Topic: Combining Models using Modular ANNs
  – Title: Modular and Hierarchical Learning Systems
  – Authors: Jordan and Jacobs
  – Presenter: Afrand Agah
• Key Strengths
  – Idea: use interleaved EM update steps to update expert, gating components
  – Effect: forces specialization among ANN components (GLIMs); boosts performance of single experts; very fast convergence in some cases
  – Explores modularity in neural systems (artificial and biological)
• Key Weaknesses
  – Often cannot achieve higher accuracy than ML, MAP, Bayes optimal estimation
  – Doesn’t provide experts that specialize in spatial, temporal pattern recognition
• Future Research Issues
  – Constructing, selecting mixtures of other ANN components (not just GLIMs)
  – Applications: pattern recognition, time series prediction
  – See work by: Jordan, Jacobs, Nowlan, Hinton, Barto, Jaakkola, Hsu

Class 19: Introduction to Probabilistic Reasoning
• Architectures
  – Bayesian (Belief) Networks
    • Tree structured, polytrees
    • General
  – Decision networks
  – Temporal variants (beyond scope of this course)
• Parameter Estimation
  – Maximum likelihood (MLE), maximum a posteriori (MAP)
  – Bayes optimal classification, Bayesian learning
• References: Chapter 6, Mitchell; Chapters 14-15, 19, Russell and Norvig
• Research Issues: How to…
  – Learn from observation, rewards and penalties, and advice
  – Distribute rewards and penalties through learning model, over time
  – Generate pseudo-experiential training instances in pattern recognition
  – Partition learning problems on the fly, via (mixture) parameter estimation

Classes 20-21: Approaches to Uncertain Reasoning
• Paper
  – Topic: The Case for Probability
  – Title: In Defense of Probability
  – Author: Cheeseman
  – Presenter: Pallavi Paranjape
• Key Strengths
  – Idea: probability is mathematically sound way to represent uncertainty
  – Views of probability considered: objectivist, frequentist, logicist, subjectivist
  – Argument made for meta-subjectivist belief measure concept of probability
• Key Weaknesses
  – Highly dogmatic view without concrete justification for all assertions
  – Does not quantitatively, empirically compare Bayesian, non-Bayesian methods
• Future Research Issues
  – Integrating symbolic and numerical (statistical) models of uncertainty
  – Applications: uncertain reasoning, pattern recognition, learning
  – See work by: Cheeseman, Cox, Good, Pearl, Zadeh, Dempster, Shafer

Classes 22-23: Learning Bayesian Network Structure
• Paper
  – Topic: Learning Bayesian Networks from Data
  – Title: Learning Bayesian Network Structure from Massive Datasets
  – Authors: Friedman, Pe'er, Nachman
  – Presenter: Jincheng Gao
• Key Strengths
  – Idea: can use graph constraints, scoring functions to select candidate parents in constructing directed graph model of probability (BBN)
  – Tabu search, greedy score-based methods (K2), etc. also considered
• Key Weaknesses
  – Optimal Bayesian network structure learning still intractable for conventional (single-instruction sequential) architectures
  – More empirical comparison among alternative methods warranted
• Future Research Issues
  – Scaling up to massive real-world data sets (e.g., medical, agricultural, DSS)
  – Applications: diagnosis, troubleshooting, user modeling, intelligent HCI
  – See work by: Friedman, Goldszmidt, Heckerman, Cooper, Beinlich, Koller

Classes 24-25: Bayesian Networks for User Modeling
• Paper
  – Topic: Decision Support Systems and Bayesian User Modeling
  – Title: The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users
  – Authors: Horvitz, Breese, Heckerman, Hovel, Rommelse
  – Presenter: Yuhui (Cathy) Liu
• Key Strengths
  – Idea: BBN model is developed from user logs, used to infer mode of usage
  – Can infer goals, skill level of user
• Key Weaknesses
  – Need high accuracy in inferring goals to deliver meaningful content
  – May be better to use next-generation search engine (more interactivity, less passive monitoring)
• Future Research Issues
  – Designing better interactive user modeling
  – Applications: clickstream monitoring, e-commerce, web search, help
  – See work by: Horvitz, Breese, Heckerman, Lee, Huang

Classes 26-27: Causal Reasoning
• Paper
  – Topic: KDD and Causal Reasoning
  – Title: Symbolic Causal Networks for Reasoning about Actions and Plans
  – Authors: Darwiche and Pearl
  – Presenter: Yue Jiao
• Key Strengths
  – Idea: use BBN to represent symbolic constraint knowledge
  – Can use to generate mechanistic explanations
    • Model actions
    • Model sequences of actions (plans)
• Key Weaknesses
  – Integrative methods (numerical, symbolic BBNs) still need exploration
  – Unclear how to incorporate methods for learning to plan
• Future Research Issues
  – Reasoning about systems
  – Applications: uncertain reasoning, pattern recognition, learning
  – See work by: Horvitz, Breese, Heckerman, Lee, Huang

Classes 28-29: Knowledge Discovery from Scientific Data
• Paper
  – Topic: KDD for Scientific Data Analysis
  – Title: KDD for Science Data Analysis: Issues and Examples
  – Authors: Fayyad, Haussler, and Stolorz
  – Presenter: Arulkumar Elumalai
• Key Strengths
  – Idea: investigate how and whether KDD techniques (OLAP, learning) scale up to huge data sets
  – Answer: “it depends” – on computational complexity, many other factors
• Key Weaknesses
  – Haven’t developed clear theory yet of how to assess “how much data is really needed”
  – No technical treatment or characterization of data cleaning
• Future Research Issues
  – Data cleaning (aka data cleansing), pre- and post-processing (OLAP)
  – Applications: intelligent databases, visualization, high-performance CSE
  – See work by: Fayyad, Smyth, Uthurusamy, Haussler, Foster

Classes 30-31: Relevance Determination
• Paper
  – Topic: Relevance Determination in KDD
  – Title: Irrelevant Features and the Subset Selection Problem
  – Authors: John, Kohavi, and Pfleger
  – Presenter: DingBing Yang
• Key Strengths
  – Idea: cast problem of choosing relevant attributes (given “top-level” learning problem specification) as search
  – Effective state space search (A/A*-based) approach demonstrated
• Key Weaknesses
  – May not have good enough heuristics!
  – Can either develop them (via information theory) or use MCMC methods
• Future Research Issues
  – Selecting relevant data channels from continuous sources (e.g., sensors)
  – Applications: bioinformatics (genomics, proteomics, etc.), prognostics
  – See work by: Kohavi, John, Rendell, Donoho, Hsu, Provost

Classes 32-33: Learning for Text Document Categorization
• Paper
  – Topic: Text Documents and Information Retrieval (IR)
  – Title: Hierarchically Classifying Documents using Very Few Words
  – Authors: Koller and Sahami
  – Presenter: Yan Song
• Key Strengths
  – Idea: use rank-frequency scoring methods to find “keywords that make a difference”
  – Break into meaningful hierarchy
• Key Weaknesses
  – Sometimes need to derive semantically meaningful cluster labels
  – How to integrate this method with dynamic cluster segmentation, labeling?
• Future Research Issues
  – Bayesian architectures using “non-Bayesian” learning algorithms (e.g., GAs)
  – Applications: digital libraries (hierarchical, distributed dynamic indexing), intelligent search engines, intelligent displays (and help indices)
  – See work by: Koller, Sahami, Roth, Charniak, Brill, Yarowsky

Classes 34-35: Web Mining
• Paper
  – Topic: KDD and The Web
  – Title: Learning to Extract Symbolic Knowledge from the World Wide Web
  – Authors: Craven, DiPasquo, Freitag, McCallum, Mitchell, Nigam, and Slattery
  – Presenter: Ping Zou
• Key Strengths
  – Idea: build probabilistic model of web documents using “keywords that matter”
  – Use probabilistic model to represent knowledge for indexing into web database
• Key Weaknesses
  – How to account for concept drift?
  – How to explain and express constraints (e.g., “proper nouns that are person names don’t matter”)?
    Not considered here…
• Future Research Issues
  – Using natural language processing (NLP), image / audio / signal processing
  – Applications: searchable hypermedia, digital libraries, spiders, other agents
  – See work by: McCallum, Mitchell, Roth, Sahami, Pratt, Lee

Class 36: Introduction to Evolutionary Computation
• Architectures
  – Genetic algorithms (GAs), genetic programming (GP), genetic wrappers
  – Simple vs. parameterless GAs
• Issues
  – Loss of diversity
    • Consequence: collapse of Pareto front
    • Solutions: niching (sharing, preselection, crowding)
  – Parameterless GAs
  – Other issues (not covered): genetic drift, population sizing, etc.
• References: Chapter 9, Mitchell; Chapters 1-6, Goldberg; Chapters 1-5, Koza
• Research Issues: How to…
  – Design GAs based on credit assignment system (in performance element)
  – Build hybrid analytical / inductive learning GP systems
  – Use GAs to perform relevance determination in KDD
  – Control diversity in GA solutions for hard optimization problems

Classes 37-38: Genetic Algorithms and Classifier Systems
• Paper
  – Topic: Classifier Systems and Inductive Learning
  – Title: Generalization in the XCS Classifier System
  – Author: Wilson
  – Presenter: Elizabeth Loza-Garay
• Key Strengths
  – Idea: incorporate performance element (classifier system) into GA design
  – Solid theoretical foundation: advanced building block (aka schema) theory
  – Can use to engineer more efficient GA model, tune parameters
• Key Weaknesses
  – Need to progress from toy problems (e.g., MUX learning) to real-world ones
  – Need to investigate scaling up of GA principles (e.g., building block mixing)
• Future Research Issues
  – Building block scalability in classifier systems
  – Applications: reinforcement learning, mobile robotics, other animats, a-life
  – See work by: Wilson, Goldberg, Holland, Booker

Classes 39-40: Knowledge-Based Genetic Programming
• Paper
  – Topic: Genetic Programming and Multistrategy Learning
  – Title: Genetic Programming and Deductive-Inductive Learning: A Multistrategy Approach
  – Authors: Aler, Borrajo, and Isasi
  – Presenter: Yuhong Cheng
• Key Strengths
  – Idea: use knowledge-based system to calibrate starting state of MCMC optimization system (here, GP)
  – Can incorporate knowledge (as in CIS830 Part 1 of 5)
• Key Weaknesses
  – Generalizability of HAMLET population seeding method not well established
  – “General-purpose” problem solving systems can become Rube Goldberg-ian
• Future Research Issues
  – Using multistrategy GP systems to provide knowledge-based decision support
  – Applications: logistics (military, industrial, commercial), other problem solving
  – See work by: Aler, Borrajo, Isasi, Carbonell, Minton, Koza, Veloso

Classes 41-42: Genetic Wrappers for Inductive Learning
• Paper
  – Topic: Genetic Wrappers for KDD Performance Enhancement
  – Title: Simultaneous Feature Extraction and Selection Using a Masking Genetic Algorithm
  – Authors: Raymer, Punch, Goodman, Sanschagrin, Kuhn
  – Presenter: Karthik K. Krishnakumar
• Key Strengths
  – Idea: use GA to empirically (statistically) validate inducer
  – Can use to select, synthesize attributes (aka features)
  – Can also use to tune other GA parameters (hence “wrapper”)
• Key Weaknesses
  – Systematic experimental studies of genetic wrappers have not yet been done
  – Wrappers don’t yet take performance element into explicit account
• Future Research Issues
  – Improving supervised learning inducers (e.g., in MLC++)
  – Applications: better combiners; feature subset selection, construction
  – See work by: Raymer, Punch, Cherkauer, Shavlik, Freitas, Hsu, Cantu-Paz

Classes 43-44: Genetic Algorithms for Optimization
• Paper
  – Topic: Genetic Optimization and Decision Support
  – Title: A Niched Pareto Optimal Genetic Algorithm for Multiobjective Optimization
  – Authors: Horn, Nafpliotis, and Goldberg
  – Presenter: Li Lian
• Key Strengths
  – Idea: control representation of neighborhoods of the Pareto optimal front by niching
  – Gives abstract and concrete case studies of niching (sharing) effects
• Key Weaknesses
  – Need systematic exploration, characterization of “sweet spot”
  – Shows static comparisons, not small-multiple visualizations that led to them
• Future Research Issues
  – Biologically (ecologically) plausible models
  – Applications: engineering (ag / bio, civil, computational, environmental, industrial, mechanical, nuclear) optimization; computational life sciences
  – See work by: Goldberg, Horn, Schwefel, Punch, Minsker, Kargupta

Class 45: Meta-Summary
• Data Mining / KDD Problems
  – Business decision support
    • Classification
    • Recommender systems
  – Control and policy optimization
• Data Mining / KDD Solutions: Machine Learning, Inference Techniques
  – Models
    • Version space, decision tree, perceptron, winnow
    • ANN, BBN, SOM
    • Q functions
    • GA/GP building blocks (schemata), GP building blocks
  – Algorithms
    • Candidate elimination, ID3, delta rule, MLE, Simple (Naïve) Bayes
    • K2, EM, backprop, SOM convergence, LVQ, ADP, simulated annealing
    • Q-learning, TD(λ)
    • Simple GA, GP
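
As a small illustration of one algorithm named in the summary, the following minimal sketch applies the tabular Q-learning update to a hypothetical 5-state chain. The environment, the learning rate and discount factor, and the use of deterministic sweeps in place of exploration are all assumptions chosen to keep the example short and reproducible.

```python
# Tabular Q-learning on a hypothetical 5-state chain (illustrative only).
# States 0..4; state 4 is terminal. Actions: 0 = left, 1 = right.
# Reward 1.0 for entering state 4, otherwise 0.0.

N_STATES, GOAL = 5, 4
ALPHA, GAMMA = 0.5, 0.9  # assumed learning rate and discount factor


def step(state, action):
    """Deterministic transition: move left (bounded at 0) or right."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]; terminal row stays 0

# Deterministic sweeps over every (state, action) pair stand in for exploration.
for _ in range(200):
    for s in range(GOAL):
        for a in (0, 1):
            s2, r = step(s, a)
            target = r + GAMMA * max(Q[s2])        # one-step lookahead target
            Q[s][a] += ALPHA * (target - Q[s][a])  # Q-learning update

# Greedy policy for the non-terminal states.
greedy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(GOAL)]
print(greedy)
```

The learned Q function is the model and the update rule is the algorithm, mirroring the Models / Algorithms split in the summary.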