Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Lecture 10 KDD Presentation (3 of 3): Rule Induction Wednesday, February 7, 2001 William H. Hsu Department of Computing and Information Sciences, KSU http://www.cis.ksu.edu/~bhsu Readings: “Using Inductive Learning to Generate Rules for Semantic Query Optimization”, Hsu and Knoblock CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Presentation Outline • Paper – “Using Inductive Learning to Generate Rules for Semantic Query Optimization” – Authors: C.-N. Hsu and C. A. Knoblock – In Advances in Knowledge Discovery in Databases (Fayyad, Piatetsky-Shapiro, Smyth, Uthurusamy, eds.) • Overview – Learning semantic knowledge • Rule induction • Purpose: semantic query optimization (SQO) – Analogue: inductive logic programming (ILP) • Knowledge representation: Horn clauses • Idea: use reformulation of queries to learn (induce) rules • Application of Machine Learning to KDD: Issues – Rules: Good hypothesis language for performance element (SQO)? – How are goals of database query speedup achieved? – Key strengths: straightforward induction method; can use domain theory CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Induction as Inverted Deduction: Design Principles Deductive System for Inductive Learning Training Examples New Instance Theorem Prover Assertion { c H } Classification of New Instance (or “Don’t Know”) Inductive bias made explicit • Recall: Definition of Induction – Induction: finding h such that <xi, f(xi)> D . (B D xi) | f(xi) • A | B means A logically entails B • xi ith target instance • f(xi) is the target function value for example xi (data set D = {<xi, f(xi)>}) • Background knowledge B (e.g., inductive bias in inductive learning) • Idea – Design inductive algorithm by inverting operators for automated deduction – Same deductive operators as used in theorem proving CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Induction as Inverted Deduction: Example • Deductive Query – “Pairs <u, v> of people such that u is a child of v” – Relations (predicates) • Child (target predicate) • Father, Mother, Parent, Male, Female • Learning Problem – Formulation • Concept learning: target function f is Boolean-valued • i.e., target predicate – Components • Target function f(xi): Child (Bob, Sharon) • xi: Male (Bob), Female (Sharon), Father (Sharon, Bob) • B: {Parent (x, y) Father (x, y). Parent (x, y) Mother (x, y).} – What satisfies <xi, f(xi)> D . (B D xi) | f(xi)? • h1: Child (u, v) Father (v, u). - doesn’t use B • h2: Child (u, v) Parent (v, u). - uses B CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Induction as Inverted Deduction: Advantages and Disadvantages • Advantages (Pros) – Subsumes earlier idea of finding h that “fits” training data – Domain theory B helps define meaning of “fitting” the data: B D xi | f(xi) – Suggests algorithms that search H guided by B • Theory-guided constructive induction [Donoho and Rendell, 1995] • aka Knowledge-guided constructive induction [Donoho, 1996] • Disadvantages (Cons) – Doesn’t allow for noisy data • Q: Why not? • A: Consider what <xi, f(xi)> D . (B D xi) | f(xi) stipulates – First-order logic gives a huge hypothesis space H • Overfitting… • Intractability of calculating all acceptable h’s CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Inverting Resolution: Example C1: Pass-Exam Know-Material C2: Know-Material Study Resolution P L C: Pass-Exam Study C1: Pass-Exam Know-Material L R P R C2: Know-Material Study Inverse Resolution C: Pass-Exam Study CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Semantic Query Optimization (SQO) Methodology • Goals (Section 17.1) – Use semantic rules to find “shortcuts” to queries • Example: all CIS 864 students have studied basic probability • Query: “Find all CIS 864 students who have had courses in probability and stochastic processes” • Can drop condition – Learn rules from data • Observe when query can be simplified • Generalize over these “training cases” • Background (Section 17.2) – Queries: Datalog select-from-where subset of Structured Query Language (SQL) – Semantic rules: Horn clauses (cf. Prolog) • Learning Framework (Section 17.3) – Concept: SatisfyInputQuery (+ iff instance, i.e., tuple, satistifes query) – Algorithm for dropping constraints (generalization): greedy min-set-cover – Heuristic (preference bias): gain/cost ratio CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Learning Framework and Algorithm • Given: Few Example Queries, Data Set D (Many Tuples) • Methodology (Sections 17.3-4) – Step 1 (Optimizer): optimize queries by dropping constraints if possible • Use Greedy-Min-Set-Cover algorithm • Call learning module to add rules to rule base – Step 2 (Find Alternative Queries): • 2a (Construct Candidate Constraints): use gain/cost ratio (number of – cases excluded / syntactic length of constraint) • Rationale: Occam’s Razor bias, min-set-cover (ratio-bounded approximation) • 2b (Search for Constraints): build on newly-introduced relations – Step 3 (Update Rule Bank): apply newly discovered rules • Put newly-induced rules into rule base • Use inference engine (Prolog) to generate facts that will shorten query search CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Design Rationale • Problem (Sections 17.1-4) – How to generalize well over reformulable queries? – Want to make sure inducer does not overfit observed pattern of training examples • Solution Approach (Section 17.3-4) – Idea: Occam’s Razor bias • Prefer shorter hypotheses, all other things being equal • Why does this work? • Types of Bias – Preference bias • Captured (“encoded”) in learning algorithm • Compare: search heuristic – Language bias • Captured (“encoded”) in knowledge (hypothesis) representation • Compare: restriction of search space • aka restriction bias CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Experimental Method • Experimental Results (Section 17.5) – Improvement using SQO by rule induction (Table 17.4) • Reformulation using induced rules improves short and long queries (about uniformly) • Speedup • Breakdown of savings by NIL queries vs. overall • Claims (Section 17.5) – SQO is scalable: can use rule induction on large DBs – SQO is general: can apply other search techniques, heuristics CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Summary: Content Critique • Key Contribution – Simple, direct integration of inductive rule learning with SQO – Significance to KDD: good way to apply ILP-like learning in DB optimization – Applications • Inference • Decision support systems (DSS) • Strengths – Somewhat generalizable approach • Significant for KDD • Applies to other learning-for-optimization inducers – Formal analysis of SQO complexity – Experiments: measure • Speedup learning % time saved • How wasted time is saved (NIL queries, short vs. long queries) cf. performance profiling • Weaknesses, Tradeoffs, and Questionable Issues – Insufficient comparison of alternative heuristics (MDL, etc.) – Empirical performance of exhaustive search? CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences Summary: Presentation Critique • Audience: Researchers and Practitioners of – AI (machine learning, intelligent database optimization) – Database management systems – Applied logic • Positive and Exemplary Points – Good, abstract examples illustrating role of SQO and ILP – Real DB optimization example (3 Oracle DBs) • Negative Points and Possible Improvements – Insufficient description of analytical hypothesis representations – Semantics: not clear how to apply other algorithms of rule induction • Decision tree • First-order ILP (e.g., FOIL) CIS 830: Advanced Topics in Artificial Intelligence Kansas State University Department of Computing and Information Sciences