Learning to Parse Database Queries Using Inductive Logic Programming
Shi Linfeng (施林锋)
Department of Computer Science and Technology, Nanjing University

Outline
• Introduction
• Learning to Parse DB Queries
• Overview of CHILL
• Parsing DB Queries
• Experimental
• Future work & Conclusions
• References & Related articles

Introduction
• Empirical or corpus-based methods for constructing natural language systems replace hand-generated rules
• Statistical and probabilistic approaches to constructing parsers
  • Stochastic grammars (Black, Lafferty, …)
  • Transition networks (Miller et al.)
• Acid test for empirical methods
  • Construction of better natural language systems
• The authors aim to use CHILL to engineer a natural language front end for a database-query task.

Overview of CHILL
• CHILL: Constructive Heuristics Induction for Language Learning
• CHILL is a general approach to the problem of inducing natural language parsers.
• CHILL uses inductive logic programming to learn a deterministic shift-reduce parser written in Prolog.
• Input:
  • A set of training instances <sentence, desired parse>
• Output:
  • A shift-reduce parser that maps sentences to parses

Parsing DB Queries
• Example
  • What is the capital of the state with the largest population?
    answer(C, (capital(S,C), largest(P, (state(S), population(S,P))))).
  • What are the major cities in Kansas?
    answer(C, (major(C), city(C), loc(C,S), equal(S, stateid(kansas)))).
• Query language
  • Logical form
  • More straightforward to derive from natural language utterances than SQL
• Database
  • United States geography database system
  • An existing natural language interface called Geobase
  • Geobase contains 800 Prolog facts about states, capital cities, populations, areas, major rivers, major cities, and highest and lowest points
• Query language: Geoquery
  • Basic objects
  • Basic relations
  • Meta-predicates

Experimental
• 250 sentences, each paired with its parse
• Question patterns:
  • which states | where is | what be/states/rivers (203 in total)
  • how many/long/large/high (41 in total)
  • give me… | name the rivers in arkansas (6 in total)
• Questions mainly ask about states, rivers, cities, and populations, often combined with superlatives
• Random splits
  • 225 training examples, 25 test examples
  • 10-fold cross-validation
  • CHILL parses each test sentence into a query, which is then executed
• Evaluation
  • A query that returns the same answer as the reference query is scored correct; otherwise it is scored incorrect
• Results
  • CHILL outperforms the existing system when trained on 175 or more examples
• Two kinds of failure
  • Wrong parses
  • Wrong answers

Conclusions & Future work
• Conclusions
  • CHILL parsers outperform an existing system
  • The empirical approach is important to NLP applications
• Future work
  • Much larger corpora and other domains
  • Extent to which performance can be improved by corpus “manufacturing”

References & Related articles
• Zelle J M, Mooney R J. Learning semantic grammars with constructive inductive logic programming[C]//AAAI. 1993: 817-822.
• Zelle J M, Mooney R J. Inducing deterministic Prolog parsers from treebanks: A machine learning approach[C]//AAAI. 1994: 748-753.
• Zettlemoyer L S, Collins M. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars[J]. arXiv preprint arXiv:1207.1420, 2012.

Q&A
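
The evaluation procedure from the Experimental slides (225/25 random splits, 10-fold cross-validation, scoring by whether the executed query returns the same answer) can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: `train` and `execute` are hypothetical stand-ins for CHILL training and for running a query against Geobase, and treating any exception raised by the parser as a failed parse is an assumption of this sketch.

```python
import random
from typing import Callable, List, Sequence, Tuple

# An example pairs a sentence with its reference query (plain strings here).
Example = Tuple[str, str]
# Hypothetical interfaces (not part of CHILL or Geobase):
# a parser maps a sentence to a query, an executor maps a query to an answer,
# and a trainer maps a list of training examples to a parser.
Parser = Callable[[str], str]
Executor = Callable[[str], object]
Trainer = Callable[[List[Example]], Parser]


def make_folds(examples: Sequence[Example], k: int, seed: int = 0) -> List[List[Example]]:
    """Shuffle the corpus and deal it into k folds of near-equal size."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def accuracy(parser: Parser, test: Sequence[Example], execute: Executor) -> float:
    """Fraction of test sentences whose predicted query yields the reference answer."""
    correct = 0
    for sentence, reference_query in test:
        try:
            predicted_answer = execute(parser(sentence))
        except Exception:
            # Assumption: a sentence the parser cannot handle counts as an error.
            continue
        if predicted_answer == execute(reference_query):
            correct += 1
    return correct / len(test)


def cross_validate(examples: Sequence[Example], train: Trainer,
                   execute: Executor, k: int = 10, seed: int = 0) -> float:
    """Average accuracy over k folds; each fold is held out once as the test set."""
    folds = make_folds(examples, k, seed)
    scores = []
    for i, test in enumerate(folds):
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        scores.append(accuracy(train(training), test, execute))
    return sum(scores) / k
```

With a 250-example corpus and `k=10`, each fold holds 25 test examples and the remaining 225 are used for training, which matches the split sizes on the slide. Scoring on executed answers rather than on the query text means a predicted query that differs from the reference but retrieves the same answer still counts as correct.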