Learning to Parse Database Queries Using Inductive Logic Programming
施林锋
Department of Computer Science and Technology, Nanjing University

Outline
• Introduction
• Learning to Parse DB Queries
• Overview of CHILL
• Parsing DB Queries
• Experiments
• Future work & Conclusions
• References & Related articles

Introduction
• Empirical (corpus-based) methods for constructing natural language systems replace hand-generated rules
• Statistical and probabilistic approaches to constructing parsers
  • Stochastic grammars (Black, Lafferty, …)
  • Transition networks (Miller et al.)
• Acid test for empirical methods
  • Construction of better natural language systems
• The aim is to use CHILL to engineer a natural language front end for a database-query task.

Overview of CHILL
• CHILL: Constructive Heuristics Induction for Language Learning
• CHILL is a general approach to the problem of inducing natural language parsers.
• CHILL uses inductive logic programming to learn a deterministic shift-reduce parser written in Prolog.
• Input:
  • A set of training instances <sentence, desired parse>
• Output:
  • A shift-reduce parser that maps sentences to parses

Parsing DB Queries
• Example
  • What is the capital of the state with the largest population?
    answer(C, (capital(S,C), largest(P, (state(S), population(S,P))))).
  • What are the major cities in Kansas?
    answer(C, (major(C), city(C), loc(C,S), equal(S, stateid(kansas)))).
• Query language
  • Logical form
  • More straightforward to derive from natural language utterances than SQL
• Database
  • United States geography database system
  • An existing natural language interface called Geobase
  • Geobase contains 800 Prolog facts about states, capital cities, populations, areas, major rivers, major cities, and highest and lowest points
• Query language: Geoquery
  • Basic objects
  • Basic relations (right)
  • Meta-predicates (left)

Experiments
• 250 sentences paired with their parses
• Question patterns:
  • which states | where is | what be/states/rivers (203 in total)
  • how many/long/large/high (41 in total)
  • give me… | name the rivers in arkansas (6 in total)
• Questions mainly ask about states, rivers, cities, and populations, often combined with superlatives
• Random splits
  • 225 training examples, 25 test examples
  • 10-fold cross-validation
  • CHILL parses each test sentence into a query, and the query is then executed
• Evaluation
  • A query is scored correct if it returns the same answer as the reference query, and incorrect otherwise
• Results
  • CHILL outperforms the existing system when trained on 175 or more examples
  • Two kinds of failure
    • Wrong parse
    • Wrong answer

Conclusions & Future work
• Conclusions
  • CHILL parsers outperform an existing system
  • The empirical approach is important for NLP applications
• Future work
  • Much larger corpora and other domains
  • Extent to which performance can be improved by corpus “manufacturing”

References & Related articles
• Zelle J M, Mooney R J. Learning semantic grammars with constructive inductive logic programming[C]//AAAI. 1993: 817-822.
• Zelle J M, Mooney R J. Inducing deterministic Prolog parsers from treebanks: A machine learning approach[C]//AAAI. 1994: 748-753.
• Zettlemoyer L S, Collins M. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars[J]. arXiv preprint arXiv:1207.1420, 2012.

Q&A
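
Appendix: a sketch of the answer-based evaluation

The evaluation described in the experiments (hold out 25 of 250 sentences, parse each test sentence into a query, execute it, and count it correct only if the answer matches) can be illustrated with a minimal Python sketch. This is not the original Prolog setup: `random_split`, `answer_accuracy`, `toy_execute`, `toy_parser` and all toy queries and answers are hypothetical names and placeholder data, and queries are plain strings rather than Prolog terms.

```python
import random
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple

# (sentence, reference query); queries are plain strings in this sketch.
Example = Tuple[str, str]


def random_split(corpus: Sequence[Example], n_test: int = 25,
                 seed: int = 0) -> Tuple[List[Example], List[Example]]:
    """Shuffle the corpus and hold out n_test examples for testing."""
    shuffled = list(corpus)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def answer_accuracy(test_set: Sequence[Example],
                    parse_to_query: Callable[[str], Optional[str]],
                    execute: Callable[[str], FrozenSet[str]]) -> float:
    """Fraction of test sentences whose predicted query returns the same
    answer as the reference query. A missing parse counts as a failure."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    correct = 0
    for sentence, reference_query in test_set:
        predicted = parse_to_query(sentence)
        if predicted is not None and execute(predicted) == execute(reference_query):
            correct += 1
    return correct / len(test_set)


# Placeholder "database": maps each toy query to its answer set.
TOY_ANSWERS = {
    "query_a1": frozenset({"answer_x"}),
    "query_a2": frozenset({"answer_x"}),  # different form, same answer
    "query_b": frozenset({"answer_y", "answer_z"}),
}


def toy_execute(query: str) -> FrozenSet[str]:
    return TOY_ANSWERS[query]


# Placeholder parser: a lookup table standing in for a learned parser.
TOY_PARSES = {"sentence 1": "query_a2", "sentence 2": "query_a1"}


def toy_parser(sentence: str) -> Optional[str]:
    return TOY_PARSES.get(sentence)  # None means "no parse found"


toy_test = [
    ("sentence 1", "query_a1"),  # different query, same answer: correct
    ("sentence 2", "query_b"),   # wrong answer: incorrect
    ("sentence 3", "query_b"),   # no parse: incorrect
]
toy_accuracy = answer_accuracy(toy_test, toy_parser, toy_execute)
```

Scoring on answers rather than on query text means a predicted query that differs from the reference in form still counts as correct when both return the same result. The two incorrect cases in `toy_test` correspond to the two kinds of failure listed in the results. Calling `random_split` with different `seed` values gives repeated random trials.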