JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN COMPUTER ENGINEERING
ISSN: 0975 – 6760 | NOV 10 TO OCT 11 | VOLUME – 01, ISSUE – 02

PROBABILISTIC CONTEXT FREE GRAMMAR: AN APPROACH TO GENERIC INTERACTIVE NATURAL LANGUAGE INTERFACE TO DATABASES

P. R. DEVALE, ARATI DESHPANDE
Department of Information Technology, Bharati Vidyapeeth Deemed University College of Engineering, Pune-46
[email protected]

ABSTRACT: Information plays a crucial role today, and one of its major sources is databases. Almost all IT applications store and retrieve information from databases, and the Structured Query Language (SQL) is used almost universally for relational database systems. The idea of using natural language instead of SQL has prompted the development of a new type of processing called the Natural Language Interface to Databases. A natural language interface takes natural language questions as input and gives a textual response. Shallow parsing, as used in natural language processing, can analyze a wide range of questions with high accuracy and produce reasonable textual responses. Because non-expert users are often unaware of the structure of the database, Intelligent Database Systems (IDBS) were developed. Many systems, such as LUNAR and LADDER, have been built on natural language processing concepts; their drawback is that the grammar must be tailor-made for each given database, and that they cover only a small domain of English language questions. The Generic Interactive Natural Language Interface to Databases (GINLIDB) system is generic in nature given the appropriate database and knowledge base. It was designed using UML and developed in Visual Basic .NET 2005. This paper gives an overview of NLIDB subcomponents and also covers NLIDB architectures and the various approaches.
Keywords: Natural Language Database Interface, Probabilistic Context Free Grammars, Natural Language Interface to Databases (NLIDB), Structured Query Language (SQL), Unified Modeling Language (UML), Shallow Parsing, Intelligent Database Systems (IDBS), LUNAR, LADDER, Databases, Database Management System (DBMS)

1. INTRODUCTION
Information plays a very important role in our lives today, and one of its major sources is databases. Databases have gained prime importance in a huge variety of application areas, in both private and public information systems. They are built to facilitate the activities of data storage, processing, and retrieval associated with data management in information systems. In a relational database, to retrieve information one must formulate a query in such a way that the computer will understand it and produce the desired output [4]. Retrieving information from a database therefore requires knowledge of a database language such as the Structured Query Language (SQL), whose norms are based on a Boolean interpretation of queries. But not all users know a query language, and not all user queries seeking information from the database can be answered explicitly by the querying system, because not every requirement can be expressed in a regular query language. To hide the complexity of SQL and let ordinary people (not SQL professionals) manipulate data in databases, many researchers have turned to natural language instead of SQL. This idea has led to the development of a new type of processing method called Natural Language Interface to Database systems (NLIDB) [12, 16]. NLIDB is in fact a branch of the more comprehensive field of Natural Language Processing (NLP).
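To make the contrast between SQL and natural language concrete, the translation an NLIDB performs can be illustrated with a toy sketch. The question, the `employee` table, and the pattern below are invented for illustration; real systems derive the mapping from a grammar and the database schema rather than a hard-coded pattern.

```python
import re

def naive_translate(question):
    """A deliberately naive, single-pattern translation from an English
    question to SQL, only to illustrate what an NLIDB produces.
    Assumes a hypothetical employee(name, salary) table."""
    m = re.match(r"Which employees earn more than (\d+)\?", question)
    if m:
        return f"SELECT name FROM employee WHERE salary > {m.group(1)}"
    return None  # question outside the covered subset of English

print(naive_translate("Which employees earn more than 50000?"))
# SELECT name FROM employee WHERE salary > 50000
```

The single `return None` branch already hints at the coverage problem discussed later: anything outside the anticipated patterns simply cannot be answered.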
In general, the main objective of NLP research is to create an easy and friendly environment for interacting with computers, in the sense that computer usage requires no programming language skills to access the data; only natural language (e.g. English) is required. One drawback of previous systems is that the grammar must be tailor-made for each given database. Another is that many NLP systems cover only a small domain of English language questions. The present system is generic in nature given the appropriate database and knowledge base, and this feature makes it distinguishable. Natural language computing (NLC) is becoming one of the most active areas in Human-Computer Interaction; its goal is to enable communication between people and computers without resorting to memorization of complex commands and procedures [19]. Via computers around the globe, people accumulate and manipulate huge amounts of data every second of the day. These data reside on private personal computers or at remote locations (i.e. the Internet), are mostly stored in some kind of database or data warehouse, are usually managed by a DBMS, and are accessed through SQL. In recent times there has been a rising demand from non-expert users to query relational databases in a more natural language encompassing linguistic variables and terms, instead of operating on the values of the attributes.
Therefore, the idea of using natural language instead of SQL prompted the development of a new type of processing method called Natural Language Interface to Database systems (NLIDB). NLIDB is a step towards the development of intelligent database systems (IDBS) that allow users to perform flexible querying of databases.

2. EARLY SYSTEMS
The early efforts in the NL interface area started back in the fifties, and prototype systems were introduced in the late sixties and early seventies. Many of these systems relied on pattern matching to map the user input directly to the database. The Formal LIst Processor (FLIP), an early pattern-matching language based on LISP structures, works on the basis that if the input matches one of its patterns, the system can build a query for the database. In the pattern-matching systems the database details were inter-mixed with the code, limiting them to specific databases and to the number and complexity of the patterns. Natural language processing was one of the approaches that let the user interactively interrogate the stored data.

2.1 LUNAR system
LUNAR is a system that answers questions about samples of rocks brought back from the moon; its name refers to this lunar subject matter. The system was informally introduced in 1971. LUNAR uses two databases, one for chemical analyses and the other for literature references, to carry out its functions, and relies on an Augmented Transition Network (ATN) parser and Woods' Procedural Semantics. According to [13], LUNAR's performance was quite impressive: it handled 78% of requests without any errors, a ratio that rose to 90% when dictionary errors were corrected. The system was never put to intensive use because of its linguistic limitations.

2.2 LADDER
The LADDER system was designed as a natural language interface to a database of information about US Navy ships.
The LADDER system uses a semantic grammar to parse questions into queries against a distributed database; the semantic grammar technique interleaves syntactic and semantic processing. Question answering is done by parsing the input and mapping the parse tree to a database query. LADDER is based on a three-layered architecture [14]. The first component is Informal Natural Language Access to Navy Data (INLAND); it accepts questions in natural language and generates a query to the database. Queries from INLAND are directed to the second component, Intelligent Data Access (IDA). INLAND builds a fragment of an IDA query for each lower-level syntactic unit of the English input; these fragments are combined as higher-level syntactic units are recognized, and at the sentence level the combined fragments are sent as a command to IDA. IDA plans the correct sequence of file queries and composes an answer relevant to the user's original query. The third component, the File Access Manager (FAM), finds the locations of the generic files and handles access to them in the distributed database. LADDER was implemented in LISP.

2.3 CHAT-80
CHAT-80 is one of the most referenced NLP systems of the 80s. It was implemented in Prolog and was an impressive, efficient, and sophisticated system. Its database consists of facts (i.e. oceans, major seas, major rivers, and major cities) about 150 countries of the world, together with a small English vocabulary sufficient for querying the database. An English language question is processed in three stages [15]. The system translates the question into a logical form through three serial, complementary steps, in which:
1. Logical constants are used to represent words.
2. Predicates are used to represent verbs, nouns, and adjectives with their associated prepositions; a predicate can have one or more arguments.
3. Conjunctions of predicates are used to represent complex phrases or sentences.

Figure 1 Chat-80 Processing Scheme

These functions are parsing, interpretation, and scoping. The grammatical structure of a sentence is determined by the parsing module, while interpretation and scoping consist of various translation rules expressed directly as Prolog clauses. The basic method followed by CHAT-80 is to attach extra control information to the logical form of a query, turning it into an efficient piece of Prolog that can be executed directly to produce the answer. The generated control information takes two forms:
1. The order of the predications in the query determines the order in which Prolog will attempt to satisfy it.
2. The overall program is separated into a number of independent sub-problems, limiting the amount of backtracking performed by Prolog.
In this way Prolog is led to answer queries in an obviously sensible way, and the Prolog compiler can compile the transformed query into code as efficient as iterative loops in a conventional language.

3. VARIOUS APPROACHES TO NATURAL LANGUAGE PROCESSING INTERFACE
Natural language is a topic of interest from the computational viewpoint due to the implicit ambiguity that language possesses [1]. Several researchers have applied different techniques to deal with language; the next few sub-sections describe diverse strategies used to process language for various purposes.

3.1 Symbolic Approach (Rule Based Approach)
Natural Language Processing appears to be a strongly symbolic activity.
Words are symbols that stand for objects and concepts in the real world, and they are put together into sentences that follow well-specified grammar rules. Knowledge about language is explicitly encoded in rules or other forms of representation. Language is analyzed at various levels to obtain information, and rules are then applied to that information to achieve linguistic functionality [1]. Since human language capabilities include rule-based reasoning, they are well supported by symbolic processing, in which rules are formed for every level of linguistic analysis.

3.2 Empirical Approach (Corpus Based Approach)
Empirical approaches are based on statistical and other data-driven analysis of raw data in the form of text corpora, i.e. collections of machine-readable text. The approach has been around since NLP began in the early 1950s, but empirical NLP has emerged as a replacement for rule-based NLP only in the last decade. Corpora are primarily used as a source of information about language, and a number of techniques have emerged to enable the analysis of corpus data. Syntactic analysis can be achieved on the basis of statistical probabilities estimated from a training corpus, and lexical ambiguities can be resolved by considering the likelihood of one or another interpretation given the context. Following the positive results of the empirical approach, several different symbolic and statistical methods have been employed, though most of them form one part of a larger information extraction system.

3.3 Connectionist Approach (Using Neural Networks)
Since human language capabilities are based on neural networks in the brain, Artificial Neural Networks (also called connectionist networks) provide an essential starting point for modeling language processing [2, 18]. Instead of symbols, the approach is based on distributed representations that correspond to statistical regularities in language.
Though there has been significant research on applying neural networks to language processing, little has been done on using subsymbolic learning in language processing. The SHRUTI system developed by [3] is a neurally inspired system for event modeling and temporal processing at a connectionist level.

4. NATURAL LANGUAGE DATABASE INTERFACE BASED ON PROBABILISTIC CONTEXT FREE GRAMMAR
All the earlier tools translate a natural language query into an intermediate language similar to first-order logic and then into SQL using a set of definitive rules. The intermediate language expresses the meaning of the query in terms of high-level concepts that are independent of database structures. The translation process lacks robustness, since the premises of the rules must match phrases of the query exactly or the rule is rejected. The purpose of the PCFG-based approach is to overcome this problem by constructing the most probable grammar tree and analyzing the non-terminals (phrases) in that tree to collect the parameters to be used in the SQL statement.

4.1 Overview of Natural Language Database Interface
The overall architecture of our natural language database interface is given in Figure 2. The original input to the system is a query expressed in natural language. The first processing step is part-of-speech tagging, after which each word of the query is tagged [6, 7]. The second step is parsing the tagged sentence with a PCFG: the PCFG parser analyzes the query sentence according to the tag of each word, calculates the probabilities of all possible grammar trees, and produces the grammar tree with the maximum probability. Finally, the SQL translator processes that grammar tree to obtain the SQL query via a series of dependency rules.
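The second step above, choosing the most probable tree under a PCFG, can be sketched with a minimal Viterbi-CYK parser over a toy grammar in Chomsky Normal Form. The grammar, its rule probabilities, and the example sentence are all invented for illustration; a real system would estimate the probabilities from a treebank.

```python
# Binary rules: (lhs, (rhs1, rhs2)) -> probability (toy values)
binary = {
    ("S",  ("NP", "VP")): 1.0,
    ("VP", ("V",  "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("NP", "PP")): 0.4,
    ("PP", ("P",  "NP")): 1.0,
}
# Lexical rules: (lhs, word) -> probability
lexical = {
    ("NP", "she"): 1.0,  # pronoun treated as a full NP for simplicity
    ("V", "ate"): 1.0, ("Det", "the"): 1.0,
    ("N", "dinner"): 0.5, ("N", "fork"): 0.5,
    ("P", "with"): 1.0,
}

def viterbi_cyk(words):
    """Return (probability, backpointer) of the best S over the sentence."""
    n = len(words)
    # chart[i][j] maps a non-terminal to its best (prob, backpointer)
    # over the span words[i:j]
    chart = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                       # fill length-1 spans
        for (lhs, word), p in lexical.items():
            if word == w:
                chart[i][i + 1][lhs] = (p, w)
    for span in range(2, n + 1):                        # longer spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                   # split point
                for (lhs, (b, c)), p in binary.items():
                    if b in chart[i][k] and c in chart[k][j]:
                        prob = p * chart[i][k][b][0] * chart[k][j][c][0]
                        if prob > chart[i][j].get(lhs, (0.0, None))[0]:
                            chart[i][j][lhs] = (prob, (k, b, c))
    return chart[0][n].get("S", (0.0, None))

prob, _ = viterbi_cyk(["she", "ate", "the", "dinner", "with", "the", "fork"])
print(prob)  # probability of the most likely tree (VP attachment wins here)
```

As in the "ate dinner ... with a fork" example of Section 4.2, this sentence has two candidate trees (the PP modifying the NP or the VP); the parser keeps only the more probable analysis in each chart cell, which is exactly the "maximum probability tree" the architecture requires.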
Figure 2 NLDBI Architecture

4.2 Probabilistic Context Free Grammar
CFG has the formal capability to describe most sentence structures, and it is so well formed that efficient sentence parsers can be built on top of it [19]. A probabilistic context free grammar (PCFG) inherits most advantages of CFG, and it is the simplest statistical model for analyzing natural language. Usually, natural language sentences are transformed into a tree structure through a PCFG, and the grammar tree is analyzed according to the user's requirements. A probabilistic context free grammar consists of the following:
- A terminal set {wk}, where each wk is a word, corresponding to a leaf in the grammar tree;
- A non-terminal set {Ni}, where each Ni is a symbol used to generate terminals, corresponding to a non-leaf node in the grammar tree.
The grammar is weakly equivalent to a Chomsky Normal Form grammar (CNFG), the most concise grammar, which consists of rules of only two forms. With n non-terminals and v terminals, the number of parameters for each of the four forms is shown in Table 1.

Table 1

Consider a sentence w1m that is a sequence of words w1 w2 w3 … wm (ignoring punctuation), where each wi stands for a word in the sentence. The grammar tree of w1m can be generated by a set of pre-defined grammar rules. Since a word wi may be generated by different non-terminals, and a non-terminal may likewise be generated by different sets of non-terminals, usually more than one grammar tree can be generated. For the sentence "ate dinner on the table with a fork", for example, there are two grammar trees, as shown in Figure 3.
Figure 3 Two possible grammar trees that may be generated for the sentence "ate dinner on the table with a fork"

4.3 SQL Translator
The SQL translator translates the leaves of the tree into the corresponding SQL; the process is in effect one of collecting information from the parsed tree. Two techniques may be used to collect this information: dependency structure and verb subcategorization [8, 9, 10]. These techniques are also used for disambiguation, since a PCFG is context-free and ancestor-free: no information from the context of a node needs to be taken into account when constructing the parsed tree. A dependency structure captures the inherent relations occurring in corpus texts that may be critical in real-world applications, and it is usually described by typed dependencies. A typed dependency represents a grammatical relation between individual words with a dependency label, such as subject or indirect object. Verb subcategorization is the other technique [11]: if we know the subcategorization frame of a verb, we can easily find the objects of that verb, and hence the target of the query. Usually this is done after scanning the parsed tree.

5. THE GENERIC INTERACTIVE NATURAL LANGUAGE INTERFACE TO DATABASE SYSTEM ARCHITECTURE
The architecture of the GINLIDB system consists of two major components:
1. a linguistic handling component, and
2. an SQL constructing component.
The first component checks that the natural language query is grammatically correct and can be successfully transformed into an SQL statement. The second component generates the exact SQL statement, opens a connection to the database in use, executes the generated statement, and returns the query's result to the user [12].
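The dependency-based collection step of the SQL translator in Section 4.3 can be sketched as follows. The `(relation, head, dependent)` triples and the label names follow the common typed-dependency convention, but they are hand-written for illustration rather than produced by a real parser, and the helper `query_target` is an assumed name.

```python
# Hand-written typed dependencies that a parse of
# "list the address of the farmers" might yield:
deps = [
    ("dobj", "list", "address"),        # the verb's direct object
    ("prep_of", "address", "farmers"),  # "of"-attachment of the object
]

def query_target(deps, verb):
    """If the verb's subcategorization frame takes a direct object,
    its dobj dependent is the target (column) of the query."""
    for rel, head, dep in deps:
        if rel == "dobj" and head == verb:
            return dep
    return None

print(query_target(deps, "list"))  # address
```

The same scan over the dependency list can be extended with further labels (e.g. prepositional attachments for tables and conditions), which is what "collecting information from the parsed tree" amounts to.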
Figure 4 The architecture of GINLIDB

A. Graphical User Interface (GUI)
The user interacts with the GINLIDB system in a user-friendly environment and needs no knowledge of computer or database terms. Interaction with the system is through suitable visual forms, buttons, and menus.

B. Linguistic handling component
The linguistic handling component consists of three parts: lexical analysis, the parser, and semantic representation.
1) Lexical analysis
Lexical analysis divides the sentence into simpler elements called tokens. This is performed with the help of token analysis, spelling checking, ambiguity reduction, and removal of excessive tokens.
2) GINLIDB Parser
There is substantial ambiguity in the structure of natural language queries, so they are not easily parsed by programs. The GINLIDB parser is designed with two stages of grammar: lexical and syntactic. Token generation (lexical analysis) is the first stage, in which the input token stream is split into meaningful symbols. The second stage is syntactic analysis based on an Augmented Transition Network (ATN), which checks whether the token structure is an allowable grammatical structure; this is processed by the parser according to a Context-Free Grammar (CFG).

C. SQL constructing component
The SQL constructing component consists of three parts: the SQL generator, the database adaptor, and the SQL executor.
1) SQL generator
The SQL generator maps the elements of the natural language query to SQL. It uses four routines, each of which controls one specific part of the query; the overall SQL statement is constructed by concatenating the output of the four routines [16]. The first routine selects the part of the natural language query that corresponds to the appropriate DML command with the attributes' names (i.e. the SELECT clause).
The second routine selects the part of the query to be mapped to a table name or a group of table names to construct the FROM clause. The third routine selects the part of the query to be mapped to the WHERE clause (the condition). The fourth routine selects the part of the natural language query that corresponds to the order in which the result is displayed (the ORDER BY clause).
2) Database adaptor
Many different database management systems exist today, and a database adaptor is used to handle the variety of interfaces and techniques across these DBMSs. Database connections, constraints, data types, and SQL formats are examples of such variations.
3) SQL executor
The task of the SQL executor is to get the required results from the database in use. To achieve this, the generated SQL statement is tested to verify its correctness before being applied to the database, and the result is then presented to the user.

D. Syntactical knowledge base
The syntactical knowledge base of the GINLIDB system is used by the linguistic component to determine the accepted words, to provide word alternatives (in the spelling correction process), and to verify the grammar of the natural language query.

E. Semantic knowledge base
This knowledge base consists of the English semantic grammar (grammar rules) and the schema of the database in use. It is used to replace words and/or phrases by semantically equivalent words and/or phrases that are recognized by the system (according to the system's capabilities).

F. Knowledge extension
This component extends the syntactical knowledge base by adding new words and the semantic knowledge base by adding new rules. It enlarges the system to accommodate different domains and to strengthen the terminology and rules of existing domains.

5.1 Design of the GINLIDB System
The system is designed and implemented using Object Oriented (OO) techniques.
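The four-routine construction described for the SQL generator can be sketched as below. The routine names, their inputs, and the example query parts are assumptions for illustration; in the real system each routine is driven by the semantic grammar and the database schema rather than by pre-extracted strings.

```python
def select_clause(attrs):
    """Routine 1: DML command plus attribute names (SELECT clause)."""
    return "SELECT " + (", ".join(attrs) if attrs else "*")

def from_clause(tables):
    """Routine 2: table name(s) (FROM clause)."""
    return "FROM " + ", ".join(tables)

def where_clause(condition):
    """Routine 3: the condition, if any (WHERE clause)."""
    return f"WHERE {condition}" if condition else ""

def order_clause(order_attr):
    """Routine 4: display order, if any (ORDER BY clause)."""
    return f"ORDER BY {order_attr}" if order_attr else ""

def build_sql(attrs, tables, condition=None, order_attr=None):
    # The overall statement is the concatenation of the four routines' output.
    parts = [select_clause(attrs), from_clause(tables),
             where_clause(condition), order_clause(order_attr)]
    return " ".join(p for p in parts if p)

print(build_sql(["address"], ["farmer"], "bonus > 10000", "address"))
# SELECT address FROM farmer WHERE bonus > 10000 ORDER BY address
```

Keeping each clause in its own routine mirrors the paper's design: a query part that is absent (no condition, no ordering) simply contributes an empty string to the concatenation.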
The Unified Modeling Language (UML), an evolutionary, general-purpose, broadly applicable, tool-supported, and industry-standardized modeling language [17], was used to design this system. The UML diagrams used are:
• Use case diagrams conceptualize the functionality of the system through use cases that represent the different overall system scenarios.
• Sequence diagrams show the interactions among different elements of the system as messages passed from and to each object; they depict the internal behavior of the GINLIDB system.
• The class diagram describes the static view of the system through its classes and the relationships among them.
• Activity diagrams capture the flow from one activity to the next.

6. THE ADVANTAGES AND DISADVANTAGES OF NATURAL LANGUAGE INTERFACE TO DATABASES
The following section discusses the advantages and disadvantages of Natural Language Interface to Databases systems [1, 5]:

6.1 The Advantages
a) No Artificial Language
One advantage of NLIDBs is that they allow naive users who do not know an artificial communication language to query the database without having to learn one. Formal query languages like SQL are difficult to learn and master, at least for non-computer-specialists.
b) Simple, easy to use
Consider a database with a query language or a form designed to express the query. While an NLIDB system requires only a single input, a form-based interface may contain multiple inputs (fields, scroll boxes, combo boxes, radio buttons, etc.) depending on its capability, and in a query language a question may need to be expressed using multiple statements containing one or more subqueries with join operations as the connectors.
c) Better for Some Questions
Some kinds of questions (e.g. questions involving negation or quantification) can be easily expressed in natural language but seem difficult, or at least tedious, to express using graphical or form-based interfaces [1, 5]. For example, "Which department has no programmers?" (negation) or "Which company supplies every department?" (universal quantification) are easily expressed in natural language but would be difficult to express in a graphical or form-based interface. They can be expressed in a query language, but only through large, complex queries that only computer experts can write.
d) Fault tolerance
Most NLIDB systems provide some tolerance of minor grammatical errors, whereas in a computer query language the syntax and rules of the language must be obeyed, and any error causes the input to be rejected automatically.
e) Easy to Use for Multiple Database Tables
Queries that involve multiple database tables, like "list the address of the farmers who got bonus greater than 10000 rupees for the crop of wheat", are difficult to form in a graphical user interface compared to a natural language interface.

6.2 The Disadvantages
a) Linguistic coverage is not obvious
Currently all NLIDB systems can handle only subsets of a natural language, and it is not easy to define these subsets. Some NLIDB systems cannot even answer every question belonging to their own subset, which is not the case with a formal language [5]: formal language coverage is obvious, and any statement that follows the given rules is guaranteed to yield the corresponding answer.
b) Linguistic vs. conceptual failures
When an NLIDB system fails, it usually provides no explanation of what caused the failure. Some users try to rephrase the question, or simply leave it unanswered.
Most of the time, the user is left to work out the cause of the error.
c) False expectations
People can be misled by an NLIDB system's ability to process natural language: they may assume that the system is intelligent [5]. As a result, rather than asking precise questions of a database, users may be tempted to ask questions involving complex ideas, judgments, or reasoning capabilities for which an NLIDB system cannot be relied upon.

7. CONCLUSIONS
Research has been carried out on natural language interfaces over the last few decades, and with the advancement in hardware processing power many natural language interfaces to databases have produced promising results. The system accepts an English language request, which is interpreted and translated into an SQL command using the semantic grammar technique. In addition, the system requires a knowledge base consisting of a database and its schema. The results of a number of experimental trials in a user-friendly environment have been very successful and satisfactory. To improve the system's natural language processing performance, various issues must be considered, such as enriching the knowledge sources of the system to increase its efficiency and researching methods to improve the coherence and fluency of the output texts. From the experiments we can see that it is possible to translate a natural language query into SQL, and that a probabilistic approach may be promising. So far, our NLDBI system handles selection and a few simple aggregations; the next step of our research is to optimize the PCFG to accommodate more complex queries.

8. REFERENCES
[1] R. Akerkar and M. Joshi, "Natural Language Interface using Shallow Parsing", International Journal of Computer Science and Applications, Vol. 5, No.
3, pp. 70–90.
[2] R. Miikkulainen, "Natural language processing with subsymbolic neural networks", in Neural Network Perspectives on Cognition and Adaptive Robotics.
[3] L. Shashtri, "A model of rapid memory formation in the hippocampal system", Proceedings of the Meeting of the Cognitive Science Society, Stanford.
[4] N. Nihalani, S. Silakari, and M. Motwani, "Natural Language Interface for Database: A Brief Review", IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 2, March 2011, ISSN (Online): 1694-0814.
[5] I. Androutsopoulos, G. D. Ritchie, and P. Thanisch, "Natural Language Interfaces to Databases – An Introduction", Journal of Natural Language Engineering 1, Part 1 (1995), pp. 29–81.
[6] M. Johnson, "PCFG Models of Linguistic Tree Representations", 24(4), pp. 613–631, 1998.
[7] "A probabilistic corpus-driven model for lexical-functional analysis", in Proc. COLING-ACL'98.
[8] D. Klein and C. D. Manning, "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency", ACL 2004, pp. 478–485.
[9] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating Typed Dependency Parses From Phrase Structure Parses", in Proceedings of the IEEE/ACL 2006 Workshop on Spoken Language Technology, The Stanford Natural Language Processing Group, 2006.
[10] D. Klein and C. D. Manning, "Fast Exact Inference with a Factored Model for Natural Language Parsing", in Advances in Neural Information Processing Systems 15 (NIPS 2002), Cambridge, MA: MIT Press, 2003, pp. 3–10.
[11] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating Typed Dependency Parses from Phrase Structure Parses", in LREC 2006.
[12] F. A. El-Mouadib, Z. S. Zubi, A. A. Almagrous, and I. El-Feghi, "Interactive Natural Language Interface", WSEAS Transactions on Computers, ISSN: 1109-2750, Issue 4, Volume 8, April 2009, p. 661.
[13] G. Hendrix, E. Sacerdoti, D. Sagalowicz, and J. Slocum (1978).
"Developing a natural language interface to complex data", ACM Transactions on Database Systems, Volume 3, No. 2, USA, pp. 105–147.
[14] W. Woods (1973), "An experimental parsing system for transition network grammars", in Natural Language Processing, R. Rustin, Ed., Algorithmic Press, New York.
[15] W. Woods, R. Kaplan, and B. Webber (1972), "The Lunar Sciences Natural Language Information System", Bolt Beranek and Newman Inc., Cambridge, Massachusetts, Final Report, B.B.N. Report No. 2378.
[16] F. A. El-Mouadib, Z. S. Zubi, A. A. Almagrous, and I. S. El-Feghi, "Generic Interactive Natural Language Interface to Databases (GINLIDB)", International Journal of Computers, Issue 3, Volume 3, 2009.
[17] K. Hamilton and R. Miles, "Learning UML 2.0", O'Reilly, ISBN-10: 0-596-00982-8, 2006.
[18] S. Wermter, "Hybrid approaches to neural network-based language processing", Technical Report TR-97-030, International Computer Science Institute.
[19] Bei-Bei Huang, Guigang Zhang, and Phillip C.-Y. Sheu, "A Natural Language Database Interface Based on a Probabilistic Context Free Grammar", IEEE International Workshop on Semantic Computing and Systems, 2008, DOI 10.1109/WSCS.2008.14.