JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
COMPUTER ENGINEERING
PROBABILISTIC CONTEXT FREE GRAMMAR: AN
APPROACH TO GENERIC INTERACTIVE NATURAL
LANGUAGE INTERFACE TO DATABASES
1P. R. DEVALE, 2ARATI DESHPANDE
1, 2Department of Information Technology, Bharati Vidyapeeth Deemed University College of Engineering, Pune-46
[email protected]
ABSTRACT: Information plays a crucial role today, and one of its major sources is databases. Almost all IT applications store and retrieve information from databases. The Structured Query Language (SQL) is used almost universally with relational database systems. The idea of using natural language instead of SQL has prompted the development of a new type of processing called Natural Language Interface to Databases. A natural language interface takes natural language questions as input and gives a textual response. Shallow parsing, as used in natural language processing, can analyze a wide range of questions with high accuracy and produce reasonable textual responses. Non-expert users' unfamiliarity with the structure of databases led to the development of Intelligent Database Systems (IDBS). Many systems, such as LUNAR and LADDER, have been developed using the concepts of natural language processing. The drawback of these systems is that the grammar must be tailor-made for each given database and that they cover only a small domain of English-language questions. The Generic Interactive Natural Language Interface to Databases (GINLIDB) system is generic in nature, given the appropriate database and knowledge base. It is designed with UML and developed using Visual Basic .NET 2005. An overview of NLIDB subcomponents is given; NLIDB
architectures and various approaches are also covered.
Keywords: Natural Language Database Interface, Probabilistic Context Free Grammars, Natural Language
Interface To Databases (NLIDB), Structured Query Language (SQL), Unified Modeling Language (UML),
Shallow Parsing, Intelligent Database Systems (IDBS), LUNAR, LADDER, Databases, Database
Management System (DBMS)
1. INTRODUCTION
Information plays a very important role in our lives today, and one of its major sources is databases. Databases are gaining prime importance in a huge variety of application areas, in both private and public information systems. They are built with the objective of facilitating the activities of data storage, processing, and retrieval associated with data management in information systems. In relational databases, to retrieve information one needs to formulate a query in such a way that the computer will understand it and produce the desired output [4].
Retrieving information from a database requires knowledge of a database language such as the Structured Query Language (SQL), whose norms are based on a Boolean interpretation of queries. But not all users know a query language, and not every query seeking information from the database can be answered explicitly by the querying system, because not all of a requirement's characteristics can be expressed in regular query languages. To reduce the complexity of SQL and to make the manipulation of data in databases accessible to ordinary users (not SQL professionals), many researchers have turned to natural language instead of SQL. This idea has led to the development of a new type of processing method called Natural Language Interface to Database systems (NLIDB) [12, 16]. The NLIDB system is actually a branch of the more comprehensive field of Natural Language Processing (NLP). In general, the main objective of NLP research is to create an easy and friendly environment for interacting with computers, in the sense that computer usage requires no programming-language skills to access the data; only natural language (e.g., English) is required. One drawback of previous systems is that the grammar must be tailor-made for each given database. Another drawback is that many NLP systems cover only a small domain of English-language questions. Our system is generic in nature, given the appropriate database and knowledge base; this feature makes it distinctive. Natural language computing
system distinguishable. Natural language computing
(NLC) is becoming one of the most active areas in
Human-Computer Interaction. The goal of NLC is to
enable communication between people and
computers without resorting to memorization of
complex commands and procedures [19]. Via
ISSN: 0975 – 6760| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 52
computers around the globe, people accumulate and manipulate huge amounts of data every second of the day. These data reside on private personal computers or at remote locations (i.e., the Internet), mostly in databases or data warehouses that are managed by a DBMS and accessed through SQL. In recent times there has been a rising demand from non-expert users to query relational databases in more natural language, encompassing linguistic variables and terms, instead of operating on the values of the attributes. NLIDB is therefore a step towards the development of intelligent database systems (IDBS) that enable users to perform flexible querying of databases.
2. EARLY SYSTEMS
The early efforts in the NL interface area started back in the fifties, and prototype systems were introduced in the late sixties and early seventies. Many of these systems relied on pattern matching to map the user input directly to the database. Formal LIst Processor (FLIP), an early pattern-matching language based on LISP structures, works on the basis that if the input matches one of its patterns, the system can build a query for the database. In pattern-matching systems the database details were intermixed with the code, limiting such systems to specific databases and to the number and complexity of the patterns. Natural language processing was one approach that allows the user to interrogate the stored data interactively.
2.1 LUNAR system
LUNAR is a system that answers questions about samples of rocks brought back from the moon; its name reflects this relation to the moon. The system was informally introduced in 1971. LUNAR uses two databases, one for chemical analyses and the other for literature references, which help it achieve its functions. It uses an Augmented Transition Network (ATN) parser and Woods' procedural semantics. According to [13], LUNAR's performance was quite impressive: it managed to handle 78% of requests without any errors, and this ratio rose to 90% when dictionary errors were corrected. The system was never put to intensive use because of its linguistic limitations.
2.2 LADDER
The LADDER system was designed as a natural language interface to a database of information about US Navy ships. It uses a semantic grammar to parse questions and query a distributed database, interleaving syntactic and semantic processing. Question answering is done by parsing the input and mapping the parse tree to a database query. LADDER is based on a three-layered architecture [14]. The first component of the system is Informal Natural Language Access to Navy Data (INLAND), which accepts questions in natural language and generates a query to the database. The queries from INLAND are directed to the second component of LADDER, Intelligent Data Access (IDA). A fragment of a query to IDA is built for each lower-level syntactic unit of the English input; these fragments are then combined as higher-level syntactic units are recognized, and the combined fragments are sent to IDA as a command at the sentence level. IDA composes an answer relevant to the user's original query, in addition to planning the correct sequence of file queries. The third component of the LADDER system is the File Access Manager (FAM), which finds the location of the generic files and handles access to them in the distributed database. LISP was used to implement the LADDER system.
2.3 CHAT-80
CHAT-80 is one of the most referenced NLP systems of the 1980s. It was implemented in Prolog and was an impressive, efficient, and sophisticated system. The database of CHAT-80 consists of facts (e.g., oceans, major seas, major rivers, and major cities) about roughly 150 countries of the world, together with a small English vocabulary sufficient for querying the database. An English-language question is processed in three stages [15]. The system translates the question into a logical form through three serial, complementary functions:
1. Logical constants are used to represent words.
2. Predicates are used to represent verbs, nouns, and adjectives with their associated prepositions; a predicate can have one or more arguments.
3. Conjunctions of predicates are used to represent complex phrases or sentences.
Figure 1 Chat-80 Processing Scheme
These functions are parsing, interpretation, and scoping. The grammatical structure of a sentence is determined by the parsing module, while interpretation and scoping consist of various translation rules, expressed directly as Prolog clauses.
The basic method followed by CHAT-80 is to attach some extra control information to the logical form of a query, turning it into an efficient piece of Prolog code that can be executed directly to produce the answer. The generated control information takes two forms:
1. The order of the predications in the query determines the order in which Prolog will attempt to satisfy it.
2. The overall program is separated into a number of independent sub-problems to limit the amount of backtracking performed by Prolog.
In this way, Prolog is led to answer the queries in an obviously sensible way, and the Prolog compiler can compile the transformed query into code that is as efficient as iterative loops in a conventional language.
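The three logical-form steps above can be sketched as follows, in Python rather than CHAT-80's Prolog. The geographical facts, predicate names, and the sample question "What rivers are in India?" are invented for illustration and are not CHAT-80's actual vocabulary:

```python
# Hypothetical facts standing in for CHAT-80's world database.
RIVERS = {"ganges", "brahmaputra", "nile"}
IN_COUNTRY = {("ganges", "india"), ("brahmaputra", "india"), ("nile", "egypt")}

def river(x):
    # Predicate representing the noun "rivers".
    return x in RIVERS

def contains(country, x):
    # Predicate representing "are in", with two arguments.
    return (x, country) in IN_COUNTRY

# Conjunction of predicates for the whole question, roughly
# answer(X) :- river(X), contains(india, X) in Prolog terms;
# "india" and the river names are the logical constants.
answer = sorted(x for x in RIVERS if river(x) and contains("india", x))
print(answer)  # → ['brahmaputra', 'ganges']
```

Prolog executes such a conjunction directly; the control information discussed above amounts to choosing the cheapest order in which to try the conjoined predicates.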
3. VARIOUS APPROACHES TO NATURAL LANGUAGE PROCESSING INTERFACE
Natural language is a topic of interest from the computational viewpoint due to the implicit ambiguity it possesses [1]. Several researchers have applied different techniques to deal with language. The next few sub-sections describe diverse strategies used to process language for various purposes.
3.1 Symbolic Approach (Rule-Based Approach)
Natural language processing appears to be a strongly symbolic activity. Words are symbols that stand for objects and concepts in the real world, and they are put together into sentences that follow well-specified grammar rules. Knowledge about language is explicitly encoded in rules or other forms of representation. Language is analyzed at various levels to obtain information, and certain rules are applied to this information to achieve linguistic functionality [1]. As human language capabilities include rule-based reasoning, they are well supported by symbolic processing, in which rules are formed for every level of linguistic analysis.
3.2 Empirical Approach (Corpus-Based Approach)
Empirical approaches are based on statistical analysis, as well as other data-driven analysis, of raw data in the form of text corpora, i.e., collections of machine-readable text. The approach has been around since NLP began in the early 1950s, but empirical NLP has emerged as a replacement for rule-based NLP only in the last decade. Corpora are primarily used as a source of information about language, and a number of techniques have emerged to enable the analysis of corpus data. Syntactic analysis can be achieved on the basis of statistical probabilities estimated from a training corpus, and lexical ambiguities can be resolved by considering the likelihood of one interpretation or another on the basis of context. Given the positive results shown by the empirical approach, several different symbolic and statistical methods have been employed, but most of them are used to build one part of a larger information-extraction system.
3.3 Connectionist Approach (Using Neural Networks)
Since human language capabilities are based on neural networks in the brain, artificial neural networks (also called connectionist networks) provide an essential starting point for modeling language processing [2, 18]. Instead of symbols, the approach is based on distributed representations that correspond to statistical regularities in language. Though there has been significant research on applying neural networks to language processing, little has been done on using sub-symbolic learning in language processing. The SHRUTI system developed by [3] is a neurally inspired system for event modeling and temporal processing at a connectionist level.
4. NATURAL LANGUAGE DATABASE INTERFACE BASED ON PROBABILISTIC CONTEXT FREE GRAMMAR
All the earlier tools translate a natural language query into an intermediate language similar to first-order logic, and then into SQL, using a set of definitive rules. The intermediate language expresses the meaning of the query in terms of high-level concepts that are independent of the database structure. The translation process lacks robustness, since the premises of the rules must match the phrases of the query exactly or else the rule is rejected. The purpose of the PCFG approach is to overcome this problem by constructing the most probable grammar tree and analyzing the non-terminals (phrases) in that tree to collect the parameters to be used in the SQL statement.
4.1 Overview of the Natural Language Database Interface
The overall system architecture of our natural language database interface is given in Figure 2. The original input to the system is a query expressed in natural language. To process a query, the first step is part-of-speech tagging, after which each word of the query is tagged [6, 7]. The second step is parsing the tagged sentence with a PCFG: the PCFG parser analyzes the query sentence according to the tag of each word, calculates the probability of every possible grammar tree, and outputs the grammar tree with the maximum probability. Finally, the SQL translator processes the grammar tree to obtain the SQL query through a series of dependency rules.
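The three-step pipeline above can be sketched as follows. The lexicon, the stand-in for the PCFG parser, and the schema mapping are toy assumptions for illustration, not the system's actual resources:

```python
# Step 1: part-of-speech tagging with a toy lexicon (unknown words
# default to NN here, purely to keep the sketch small).
LEXICON = {"list": "VB", "the": "DT", "students": "NN", "in": "IN", "pune": "NNP"}

def pos_tag(words):
    return [(w, LEXICON.get(w.lower(), "NN")) for w in words]

# Step 2: a stand-in for PCFG parsing -- the tagged words are grouped
# into (verb, object-noun, modifier) slots instead of a full tree.
def parse(tagged):
    return {"verb": next(w for w, t in tagged if t == "VB"),
            "object": next(w for w, t in tagged if t == "NN"),
            "modifiers": [w for w, t in tagged if t == "NNP"]}

# Step 3: SQL translation via a toy schema mapping.
TABLE_FOR_NOUN = {"students": "student"}
COLUMN_FOR_MOD = {"pune": ("city", "Pune")}

def to_sql(tree):
    sql = f"SELECT * FROM {TABLE_FOR_NOUN[tree['object']]}"
    for m in tree["modifiers"]:
        col, val = COLUMN_FOR_MOD[m.lower()]
        sql += f" WHERE {col} = '{val}'"
    return sql

query = "list the students in Pune".split()
print(to_sql(parse(pos_tag(query))))  # → SELECT * FROM student WHERE city = 'Pune'
```

A real implementation would replace parse() with the probabilistic parser of Section 4.2.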
Figure 2 NLDBI Architecture
4.2 Probabilistic Context Free Grammar
CFG has the formal capability to describe most sentence structures. Moreover, CFG is so well formed that efficient sentence parsers can be built on top of it [19]. Probabilistic context-free grammar (PCFG) inherits most advantages of CFG and is the simplest statistical model for analyzing natural language. Usually, natural language sentences are transformed into a tree structure through a PCFG, and the grammar tree is analyzed according to the user's requirements. A probabilistic context-free grammar consists of the following:
A terminal set {wk}, where each wk is a word, corresponding to a leaf in the grammar tree;
A non-terminal set {Ni}, where each Ni is a symbol used to generate terminals, corresponding to a non-leaf node in the grammar tree;
A set of rules, each carrying a probability, such that the probabilities of all rules expanding the same non-terminal sum to one.
The grammar is weakly equivalent to a Chomsky Normal Form grammar (CNFG), which consists of rules of only two forms and is thus the most concise grammar. With n non-terminals and v terminals, the number of parameters for each of the four forms is shown in Table 1.
Table 1
Consider a sentence w1m, that is, a sequence of words w1 w2 w3 ... wm (ignoring punctuation), where each wi in the sequence stands for a word in the sentence. The grammar tree of w1m can be generated by a set of pre-defined grammar rules. As a word wi may be generated by different non-terminals, and a non-terminal may likewise be generated by different sets of non-terminals, usually more than one grammar tree can be generated. Take the sentence "ate dinner on the table with a fork" for example: there are two grammar trees corresponding to this sentence, as shown in Figure 3.
Figure 3
Two possible grammar trees that may be generated
for the sentence “ate dinner on the table with a fork”
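The selection of the most probable grammar tree can be illustrated with a small probabilistic CYK parser over a toy CNF grammar. The rules and probabilities below are invented, and the sentence is shortened to "ate dinner with a fork" to keep the chart small:

```python
from collections import defaultdict

# Toy CNF PCFG; rule probabilities are illustrative, not corpus-estimated.
UNARY = {   # A -> word
    ("NP", "dinner"): 0.1, ("VB", "ate"): 1.0,
    ("IN", "with"): 1.0, ("DT", "a"): 1.0, ("NN", "fork"): 1.0,
}
BINARY = {  # A -> B C
    ("VP", "VB", "NP"): 0.5, ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.2, ("NP", "DT", "NN"): 0.4,
    ("PP", "IN", "NP"): 1.0,
}

def cyk(words):
    n = len(words)
    best = defaultdict(float)   # (i, j, A) -> best probability for span i..j
    back = {}                   # backpointers for tree recovery
    for i, w in enumerate(words):
        for (A, word), p in UNARY.items():
            if word == w:
                best[i, i + 1, A] = p
                back[i, i + 1, A] = w
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in BINARY.items():
                    q = p * best[i, k, B] * best[k, j, C]
                    if q > best[i, j, A]:
                        best[i, j, A] = q
                        back[i, j, A] = (k, B, C)
    return best, back

def tree(back, i, j, A):
    entry = back[i, j, A]
    if isinstance(entry, str):
        return f"({A} {entry})"
    k, B, C = entry
    return f"({A} {tree(back, i, k, B)} {tree(back, k, j, C)})"

words = "ate dinner with a fork".split()
best, back = cyk(words)
print(best[0, 5, "VP"])        # probability of the most probable VP parse
print(tree(back, 0, 5, "VP"))  # the most probable grammar tree
```

With these probabilities the parser prefers attaching the PP to the verb phrase; changing the rule probabilities can flip the choice to the NP attachment, which is exactly the ambiguity Figure 3 depicts.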
4.3 SQL Translator
The SQL translator translates the leaves of the tree into the corresponding SQL; the process is essentially one of collecting information from the parsed tree. Two techniques may be used to collect this information: dependency structure and verb subcategorization [8, 9, 10]. These techniques are also used in disambiguation: since a PCFG is context-free and ancestor-free, information from the context of a node in the parsed tree need not be taken into account when constructing the tree. A dependency structure is used to capture the inherent relations occurring in corpus texts that may be critical in real-world applications, and it is usually described by typed dependencies. A typed dependency represents a grammatical relation between individual words with a dependency label, such as subject or indirect object. Verb subcategorization is the other technique [11]: if we know the subcategorization frame of a verb, we can easily find the objects of that verb, and thus the target of the query. Usually this can be done after a scan of the parsed tree.
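How typed dependencies can be scanned for the query target may be sketched as follows. The dependency triples for "list the address of the farmers" are hand-written here for illustration; a real system would derive them from the parsed tree:

```python
# Hand-written typed dependencies: (label, head, dependent).
deps = [
    ("dobj", "list", "address"),
    ("prep_of", "address", "farmers"),
    ("det", "farmers", "the"),
]

def query_target(deps, verb):
    """The direct object of the command verb is the attribute asked for."""
    return next(d for label, h, d in deps if label == "dobj" and h == verb)

def target_entity(deps, attr):
    """A prepositional dependency links the attribute to its entity."""
    return next(d for label, h, d in deps
                if h == attr and label.startswith("prep"))

print(query_target(deps, "list"))      # → address  (attribute to SELECT)
print(target_entity(deps, "address"))  # → farmers  (entity/table to query)
```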
5. THE GENERIC INTERACTIVE NATURAL
LANGUAGE INTERFACE TO DATABASE
SYSTEM ARCHITECTURE:
The architecture of the GINLIDB system consists of
two major components:
1. Linguistic handling component
2. SQL constructing component.
The first component controls the correctness of the natural language query as far as its grammatical structure and the possibility of a successful transformation into an SQL statement are concerned.
The second component generates the exact SQL statement, opens a connection to the database in use, executes the generated statement, and returns the query's result to the user [12].
Figure 4 The architecture of GINLIDB
A. Graphical User Interface (GUI)
The user interacts with the GINLIDB system in a user-friendly environment and requires no knowledge of computer or database terms. Interaction with our system is through suitable visual forms, buttons, and menus.
B. Linguistic handling component
The linguistic handling component consists of three parts: lexical analysis, the parser, and semantic representation.
1) Lexical analysis
Lexical analysis divides the sentence into simpler elements called tokens. This process is performed with the help of token analysis, spelling checking, ambiguity reduction, and excessive-token removal.
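A minimal sketch of these steps (token analysis, spelling checking, and excessive-token removal; ambiguity reduction is omitted), with toy word lists that are not GINLIDB's actual vocabulary:

```python
import difflib

# Toy accepted-word and excessive-token lists, assumed for illustration.
ACCEPTED = {"list", "show", "address", "farmers", "students", "bonus"}
EXCESSIVE = {"the", "a", "an", "of", "please"}

def correct(token):
    """Spelling checker: snap unknown tokens to the closest accepted word."""
    if token in ACCEPTED:
        return token
    match = difflib.get_close_matches(token, ACCEPTED, n=1)
    return match[0] if match else token

def lexical_analysis(sentence):
    tokens = sentence.lower().split()                  # token analysis
    tokens = [correct(t) for t in tokens]              # spelling checking
    return [t for t in tokens if t not in EXCESSIVE]   # excessive-token removal

print(lexical_analysis("List the addrress of the farmers"))
# → ['list', 'address', 'farmers']
```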
2) GINLIDB Parser
There is substantial ambiguity in the structure of natural language queries, so they are not easily parsed by programs. The GINLIDB parser is designed with two stages of grammar: lexical and syntactic. The first stage is token generation (lexical analysis), in which the input stream is split into meaningful symbols. The second stage is syntactic analysis based on an Augmented Transition Network (ATN), which checks whether the tokens form an allowable grammatical structure; this is processed by the parser according to a Context-Free Grammar (CFG).
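The two-stage design can be sketched as follows, with a toy token set and a toy query grammar standing in for GINLIDB's actual ATN and CFG:

```python
import re

# Stage 1: lexical -- classify each word into a token type.
TOKEN_TYPES = [("CMD", r"list|show"), ("NOUN", r"students|farmers"),
               ("WH", r"where"), ("OP", r"[<>=]"), ("ID", r"\w+")]

def lex(text):
    tokens = []
    for word in text.lower().split():
        for name, pattern in TOKEN_TYPES:
            if re.fullmatch(pattern, word):
                tokens.append(name)
                break  # words matching no pattern are dropped in this sketch
    return tokens

# Stage 2: syntactic -- check the token stream against a tiny grammar:
# QUERY -> CMD NOUN | CMD NOUN WH ID OP ID.
def accepts(tokens):
    return tokens in (["CMD", "NOUN"],
                      ["CMD", "NOUN", "WH", "ID", "OP", "ID"])

print(accepts(lex("show students")))                   # → True
print(accepts(lex("show students where marks > 50")))  # → True
```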
C. SQL constructing component
The SQL constructing component consists of three parts: the SQL generator, the database adaptor, and the SQL executor.
1) SQL generator
The SQL generator maps the elements of the natural language query to SQL. It uses four routines, each of which controls one specific part of the query; the overall SQL statement is constructed by concatenating the outputs of the four routines [16]. The first routine selects the part of the natural language query that corresponds to the appropriate DML command with the attribute names (i.e., the SELECT clause). The second routine selects the part of the query that is mapped to a table name, or a group of table names, to construct the FROM clause. The third routine selects the part of the query that is mapped to the WHERE clause (the condition). The fourth routine selects the part of the natural language query that corresponds to the order of displaying the result (the ORDER BY clause with the attribute name).
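The four-routine concatenation can be sketched as follows; the parsed-query fields and schema names are hypothetical:

```python
# Hypothetical output of the linguistic component for one query.
parsed = {
    "attributes": ["name", "salary"],
    "tables": ["employee"],
    "condition": "salary > 50000",
    "order": "name",
}

def select_clause(q):   # routine 1: DML command with attribute names
    return "SELECT " + ", ".join(q["attributes"])

def from_clause(q):     # routine 2: table name(s)
    return "FROM " + ", ".join(q["tables"])

def where_clause(q):    # routine 3: condition
    return f"WHERE {q['condition']}" if q.get("condition") else ""

def order_clause(q):    # routine 4: display order
    return f"ORDER BY {q['order']}" if q.get("order") else ""

# The overall statement is the concatenation of the four outputs.
parts = [select_clause(parsed), from_clause(parsed),
         where_clause(parsed), order_clause(parsed)]
sql = " ".join(p for p in parts if p)
print(sql)  # → SELECT name, salary FROM employee WHERE salary > 50000 ORDER BY name
```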
2) Database adaptor
Many different database management systems exist today, and a database adaptor is used to handle the variety of interfaces and techniques of these different DBMSs. Database connections, constraints, data types, and SQL formats are examples of such variations.
3) SQL executor
The task of the SQL executor is to get the required results from the database in use. To achieve this, the generated SQL statement is tested to verify its correctness before being applied to the database, and the result is then presented to the user.
D. Syntactical knowledge base
The syntactical knowledge base of the GINLIDB system is used by the linguistic component to determine the accepted words, to provide word alternatives (in the spelling-correction process), and to verify the grammar of the natural language query.
E. Semantic knowledge base
This knowledge base consists of the English semantic grammar (grammar rules) and the schema of the database in use. The semantic knowledge base is used to replace words and/or phrases with semantically equivalent words and/or phrases that are recognized by our system (according to the system's capabilities).
F. Knowledge extension
This component extends the syntactical knowledge base by adding new words, and the semantic knowledge base by adding new rules. It enlarges the system to accommodate different domains and to strengthen the terminology and rules of existing domains.
5.1 Design of the GINLIDB System
The system is designed and implemented using object-oriented (OO) techniques. The Unified Modeling Language (UML), an evolutionary, general-purpose [17], broadly applicable, tool-supported, and industry-standardized modeling language, is used to design the system. The UML diagrams used are:
• Use case diagrams conceptualize the functionality of the system through use cases that represent different overall system scenarios.
• Sequence diagrams show the interactions among different elements of the system in the form of messages passed from and to each object; they depict the internal behavior of the GINLIDB system.
• The class diagram describes the static view of the system through its classes and the relationships among them.
• Activity diagrams capture the flow from one activity to the next.
6. THE ADVANTAGES AND DISADVANTAGES OF NATURAL LANGUAGE INTERFACE TO DATABASES
The following section discusses the advantages and disadvantages of natural language interfaces to database systems [1, 5].
6.1 The Advantages
a) No Artificial Language
One advantage of NLIDBs is that they allow naïve users, who have no knowledge of any artificial communication language, to query the database without having to learn one. Formal query languages such as SQL are difficult to learn and master, at least for non-computer-specialists.
b) Simple, easy to use
Consider a database with a query language, or with a certain form designed for expressing queries. While an NLIDB system requires only a single input, a form-based interface may contain multiple inputs (fields, scroll boxes, combo boxes, radio buttons, etc.) depending on the capability of the form. In the case of a query language, a question may need to be expressed using multiple statements containing one or more sub-queries, with join operations as connectors.
c) Better for Some Questions
Some kinds of questions (e.g., questions involving negation or quantification) can be easily expressed in natural language but seem difficult, or at least tedious, to express using graphical or form-based interfaces [1, 5]. For example, "Which department has no programmers?" (negation) or "Which company supplies every department?" (universal quantification) can be easily expressed in natural language, but would be difficult to express in graphical or form-based interfaces. They can be expressed in a query language, but that requires large, complex queries that only computer experts can write.
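For instance, the negation question above maps to a nested sub-query that a non-expert would otherwise have to write by hand. The schema below is invented for illustration and the query is run on an in-memory SQLite database:

```python
import sqlite3

# Toy schema and data for "Which department has no programmers?".
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (dept_id INTEGER, job TEXT);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO employee VALUES (2, 'programmer'), (1, 'clerk');
""")

# The hand-written SQL the natural language question corresponds to.
rows = con.execute("""
    SELECT d.name FROM department d
    WHERE NOT EXISTS (SELECT 1 FROM employee e
                      WHERE e.dept_id = d.id AND e.job = 'programmer')
""").fetchall()
print(rows)  # departments with no programmers
```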
d) Fault tolerance
Most NLIDB systems provide some tolerance of minor grammatical errors, whereas in a computer query language the syntax and rules of the language must be obeyed, and any error causes the input to be rejected automatically by the system.
e) Easy to Use for Multiple Database Tables
Queries that involve multiple database tables, such as "list the address of the farmers who got a bonus greater than 10000 rupees for the crop of wheat", are difficult to form in a graphical user interface compared to a natural language interface.
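The farmers query above, for example, requires an explicit join when written in SQL; the two-table schema here is assumed purely for illustration:

```python
import sqlite3

# Toy two-table schema for the farmers/bonus example.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE farmer (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE bonus  (farmer_id INTEGER, crop TEXT, amount INTEGER);
    INSERT INTO farmer VALUES (1, 'Pune'), (2, 'Nashik');
    INSERT INTO bonus  VALUES (1, 'wheat', 15000), (2, 'wheat', 8000);
""")

# The join SQL the natural language request corresponds to.
rows = con.execute("""
    SELECT f.address FROM farmer f
    JOIN bonus b ON b.farmer_id = f.id
    WHERE b.crop = 'wheat' AND b.amount > 10000
""").fetchall()
print(rows)
```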
6.2 The Disadvantages
a) Linguistic coverage is not obvious
Currently, all NLIDB systems can handle only some subset of a natural language, and it is not easy to define these subsets. Some NLIDB systems even fail to answer questions belonging to their own subset, which is not the case with a formal language [5]: the coverage of a formal language is obvious, and any statement that follows the given rules is guaranteed to yield the corresponding answer.
b) Linguistic vs. conceptual failures
When an NLIDB system fails, it usually provides no explanation of what caused the failure. Some users try to rephrase the question, or simply leave it unanswered; most of the time, the user is left to work out the cause of the error.
c) False expectations
People can be misled by an NLIDB system's ability to process natural language into assuming that the system is intelligent [5]. As a result, rather than asking precise questions of a database, users may be tempted to ask questions involving complex ideas, judgments, or reasoning capabilities, for which an NLIDB system cannot be relied upon.
7. CONCLUSIONS
Research on natural language interfaces has been carried out over the last few decades, and with the advancement in hardware processing power many natural language interfaces to databases have achieved promising results. The system accepts an English-language request, which is interpreted and translated into an SQL command using the semantic grammar technique. In addition, the system requires a knowledge base consisting of a database and its schema. The results of a number of experimental trials in a user-friendly environment have been very successful and satisfactory. To improve the system's performance in natural language processing, various issues must be considered, such as enriching the system's knowledge sources to increase its efficiency and researching methods to improve the coherence and fluency of the output texts. From the experiments we can see that it is possible to translate a natural language query into SQL, and that a probabilistic approach is promising. So far, our NLDBI system handles selection and a few simple aggregations; the next step of our research is to optimize the PCFG to accommodate more complex queries.
8. REFERENCES
[1] Rajendra Akerkar and Manish Joshi, "Natural Language Interface using Shallow Parsing", International Journal of Computer Science and Applications, Vol. 5, No. 3, pp. 70-90.
[2] Miikkulainen R., "Natural language processing with subsymbolic neural networks", Neural Network Perspectives on Cognition and Adaptive Robotics.
[3] Shastri L., "A model of rapid memory formation in the hippocampal system", Proceedings of the Meeting of the Cognitive Science Society, Stanford.
[4] Mrs. Neelu Nihalani, Dr. Sanjay Silakari, Dr. Mahesh Motwani, "Natural Language Interface for Database: A Brief Review", IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 2, March 2011, ISSN (Online): 1694-0814.
[5] Androutsopoulos, G. D. Ritchie, and P. Thanisch, "Natural Language Interfaces to Databases – An Introduction", Journal of Natural Language Engineering 1 Part 1 (1995), 29-81.
[6] Johnson Mark, "PCFG Models of Linguistic Tree Representations", 24(4): 613-631, 1998.
[7] "A probabilistic corpus-driven model for lexical-functional analysis", in Proc. COLING-ACL'98.
[8] Dan Klein and Christopher D. Manning, "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency", ACL 2004: 478-485.
[9] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating Typed Dependency Parses From Phrase Structure Parses", in Proceedings of the IEEE/ACL 2006 Workshop on Spoken Language Technology, The Stanford Natural Language Processing Group, 2006.
[10] Dan Klein and Christopher D. Manning, "Fast Exact Inference with a Factored Model for Natural Language Parsing", in Advances in Neural Information Processing Systems 15 (NIPS 2002), Cambridge, MA: MIT Press, 2003, pp. 3-10.
[11] Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning, "Generating Typed Dependency Parses from Phrase Structure Parses", in LREC 2006.
[12] Faraj A. El-Mouadib, Zakaria Suliman Zubi, Ahmed A. Almagrous, and I. El-Feghi, "Interactive Natural Language Interface", WSEAS Transactions on Computers, ISSN: 1109-2750, Issue 4, Volume 8, April 2009.
[13] Hendrix, G., Sacerdoti, E., Sagalowicz, D. and Slocum, J. (1978), "Developing a natural language interface to complex data", ACM Transactions on Database Systems, Volume 3, No. 2, USA, pp. 105-147.
[14] Woods, W. (1973), "An experimental parsing system for transition network grammars", in Natural Language Processing, R. Rustin, Ed., Algorithmic Press, New York.
[15] Woods, W., Kaplan, R. and Webber, B. (1972), "The Lunar Sciences Natural Language Information System", Bolt Beranek and Newman Inc., Cambridge, Massachusetts, Final Report, BBN Report No. 2378.
[16] Faraj A. El-Mouadib, Zakaria S. Zubi, Ahmed A. Almagrous, and Irdess S. El-Feghi, "Generic Interactive Natural Language Interface to Databases (GINLIDB)", International Journal of Computers, Issue 3, Volume 3, 2009.
[17] K. Hamilton and R. Miles, "Learning UML 2.0", O'Reilly, ISBN-10: 0-596-00982-8, 2006.
[18] Wermter S., "Hybrid approaches to neural network-based language processing", Technical Report TR-97-030, International Computer Science Institute.
[19] Bei-Bei Huang, Guigang Zhang, and Phillip C.-Y. Sheu, "A Natural Language Database Interface Based on a Probabilistic Context Free Grammar", IEEE International Workshop on Semantic Computing and Systems, 2008, DOI 10.1109/WSCS.2008.14.