Overview of Natural Language Interfaces
Chapter 1
Overview of Research Project
Spatial information can be stored in computers, but it is difficult to search for required
information based purely on the visual content. This spatial information must be described
in order to facilitate searching. One aspect of this project is to become familiar
with the problems involved in storing and representing this type of data.
In everyday usage, items are described using subjective wording. For example, a person's
face might be described by saying that they have a big nose, a light complexion and a wide
mouth. The terms "big", "light" and "wide" are not strictly defined, but there is community
agreement on their meaning. The second aspect of this project is to use fuzzy logic and/or
modal logic to create a system where pictures are described by community-agreed subjective
terms. The hope is that this will allow better searching of spatial information given a query
that uses subjective wording.
Queries that use such subjective wording are not easily stated in SQL. The third aspect of the
project is to automate the translation of natural language queries using this subjective
language into SQL statements. This is the portion of the project addressed by this report.
The complete system should allow a user to ask a question such as "Find me a man with a
big nose, long hair, dark complexion and blue eyes" and have the system return all the
pictures that meet those criteria.
Chapter 2
Natural Language to SQL basics
Introduction
The purpose of a natural language interface for a database system is to accept requests in
English and attempt to "understand" them. A natural language interface usually has its own
dictionary, containing words related to a database and its relationships. In addition,
the interface also maintains a standard dictionary (e.g. Webster's dictionary).
A natural language interface refers to words in its own dictionary, as well as to the words in
the standard dictionary, in order to interpret a query. If the interpretation is successful, the
interface generates a SQL query corresponding to the natural language request and submits it
to the DBMS for processing; otherwise, a dialogue is started with the user to clarify the
request.
The purpose of this paper is to present an overview of Natural Language Interfaces. The
PRECISE and MASQUE interfaces are discussed. The ELF (English Language Front End)
interface is used to translate the natural language question into SQL for the project described
in this paper. The project makes this SQL available to the fuzzy database research group,
which enables them to modify the SQL and use it for fuzzy queries.
The area of NLP research is still very experimental, and systems so far have been limited to
small domains where only certain types of sentences can be used. When systems are
scaled up to cover larger domains, NLP becomes difficult due to the vast amount of
information that must be incorporated in order to parse sentences. For example, the
sentence "The woman saw the man on the hill with a telescope" could have many different
meanings. To understand the intended meaning, we have to take into account the
current context, such as the woman being a witness, and any background information, such as
there being a hill nearby with a telescope on it. Alternatively, the man could be on the hill, and
the woman may be looking through the telescope. All this information is difficult to
represent, so restricting the domain of an NLP system is a practical way to get a manageable
subset of English to work with.
The standard approach to database NLP systems is well established. This approach creates a
'semantic grammar' for each database and uses it to parse the English question. The
semantic grammar creates a representation of the semantics of a sentence. After some
analysis of the semantic representation, a database query can be generated in SQL or any
other database language.
The drawback of this approach is that the grammar must be tailor-made for each database.
Some systems allow automatic generation of an NLP system for each database, but in almost
all cases there is insufficient information in the database to create a reliable NLP system.
Many databases cover a small domain, so that an English question about the data within them
can easily be analyzed by an NLP system; the database can then be consulted and an appropriate
response generated.
The need for a Natural Language Interface (NLI) to databases has become increasingly
important as more and more people access information through web browsers, PDAs and
cell phones. These are casual users, and it is necessary to have a way for them to
make queries in their own natural language rather than first learning and then writing SQL
queries. The important point, however, is that NLIs are only usable if they map natural language
questions to SQL queries correctly.
2.1 Fundamental steps involved in the conversion
The transformation of a given English query to an equivalent SQL form requires some basic
steps, and all natural language to SQL software packages deal with these basic steps in some
manner.
First there is a dictionary, where all the words that are expected to be used in any English
question are declared. These words consist of all the elements (relations, attributes and
values) of the database and their synonyms. Then these words are mapped to the database
system; that is, the meaning of each word must be defined.
These two things (the declaration and the mapping of the words) may go by different names
in different systems, but they form the basis of the conversion. They are domain-dependent
modules and must be present.
The architecture shown in Figure 2.1 captures the three basic steps of NL to SQL conversion.

[Figure 2.1 (Ref [5]): Architecture of the transformation process. An English question is parsed using the lexical dictionary into a parse tree; the semantic interpreter, using the semantic dictionary and type hierarchy, produces an LQL query; the LQL to SQL translator, using the interface with data, produces a SQL query; a DBMS transceiver submits the query to the DBMS and database, and a response generator returns the query result. The lexical dictionary, semantic dictionary and interface with data are the domain-dependent modules.]
Note that the domain-dependent modules (the lexical dictionary, semantic dictionary and
interface with data) depend on the data contained in the database. Below is a detailed
explanation of each of these modules.
Lexical dictionary: This holds the definition of all the words that may occur in a question.
The first step of any system is to parse an English question and identify all the words that are
found in the lexical dictionary. The lexical dictionary also contains the synonyms of root
words.
Semantic dictionary: Once the words are extracted from the English question using the
lexical dictionary, they are mapped to the database. The semantic dictionary contains these
mappings. This process transforms the English question into an internal language (LQL in the
architecture shown in Figure 2.1), which is then converted to SQL.
During the mapping process, words are attached to each other, to entities, or to relations, so
the output of this step is a function. For example, consider the question "What is the salary
of each manager?" Here the words salary and manager are attached, and so the output is the
function has_salary(salary, manager).
Interface with data: The next step is the conversion of the internal query developed above
into an equivalent SQL statement. This is done somewhat differently by different systems,
and this step may be combined with the step above to produce a SQL statement directly.
There are some predefined rules (depending on the interface) that change the internal
language statement into SQL, and the interface with data contains all those rules.
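To make these modules concrete, here is a minimal Python sketch of the pipeline under illustrative assumptions: the word lists, root-word mappings and table/column names below are invented for this example and do not come from any particular system.

# Hypothetical toy pipeline: lexical dictionary -> semantic dictionary -> SQL.
# All names below are invented for illustration.

LEXICAL_DICTIONARY = {          # words (and synonyms) mapped to root words
    "salary": "salary", "pay": "salary",
    "manager": "manager", "boss": "manager",
}

SEMANTIC_DICTIONARY = {         # root word -> (table, column) in the database
    "salary": ("Staff", "salary"),
    "manager": ("Staff", "name"),
}

def to_sql(question):
    # Step 1: parse, keeping only words declared in the lexical dictionary
    words = [w.strip("?.,").lower() for w in question.split()]
    roots = [LEXICAL_DICTIONARY[w] for w in words if w in LEXICAL_DICTIONARY]
    # Step 2: map each root word to its database element
    elements = [SEMANTIC_DICTIONARY[r] for r in roots]
    # Step 3: apply a (trivial) rule to emit SQL from the mapped elements
    columns = ", ".join(f"{t}.{c}" for t, c in elements)
    tables = ", ".join(sorted({t for t, _ in elements}))
    return f"SELECT {columns} FROM {tables}"

print(to_sql("What is the salary of each manager?"))
# SELECT Staff.salary, Staff.name FROM Staff

Real systems differ mainly in the third step, where join conditions and WHERE clauses are derived from the relationships recorded during the mapping.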
2.2 MASQUE/SQL
Introduction:
There are many commercial natural language query interfaces available; MASQUE/SQL is
one of them. MASQUE/SQL is a modification of the earlier product MASQUE, which
answered questions by generating Prolog queries [5]. MASQUE (Modular Answering System
for Queries in English) was developed at the Artificial Intelligence Applications Institute and
the Department of Artificial Intelligence of the University of Edinburgh. Complex queries
were answered by MASQUE in a couple of seconds. But since it answered queries in Prolog,
existing commercial databases were unable to use it, and so MASQUE/SQL was developed.
MASQUE/SQL can answer users' questions almost as quickly as the original MASQUE.
MASQUE/SQL answered most test questions in fewer than six seconds of CPU time, including
the time taken by the DBMS to retrieve data from the sample database [5]. One important
point about MASQUE/SQL is that its performance is not significantly affected when using
larger databases. This is because it transforms each user question into a single SQL query,
and the relational DBMS is left to find answers to the query using its own optimization
and planning techniques. Thus the full power of a relational DBMS is available during
question answering, and the system is therefore expected to scale up easily to larger
databases. The drawback of larger databases is that the dictionary gets bigger and more
complicated, and parsing time increases.
Working of MASQUE/SQL:
MASQUE/SQL follows the same basic architecture as MASQUE. It starts with the building
of the lexicon, which is done using a built-in domain editor. The domain editor helps
the user declare the words expected to appear in a question.
The second step, as per the general architecture, is building the semantic dictionary. This is
also done with the built-in domain editor, which allows the user to define the meaning of each
declared word in terms of a logic predicate. The logic predicate defines the mapping of a
word to the database. An example is given in Figure 2.2.
[Figure 2.2 (is-hierarchies): the left hierarchy organizes entities (person, with subtypes customer and staff; staff, with subtypes salesperson, manager and technician); the right hierarchy organizes features (numeric features such as age and salary; non-numeric features such as name and address)]
Figure 2.2 shows two is-hierarchies. The one on the left is for a set of entities, and the
right-hand one describes the properties of entities. In the database there is a relationship
defined between these properties and entities. For instance, if salary is related to staff and
age to person, then there are logic predicates such as has_salary(Salary, Staff) and
has_age(Age, Person). There are also predicates such as is_manager(M) to tell whether M is a
manager or not. Now when the noun "manager" (as in the query "Is PG a manager?") is
encountered, it is mapped to the predicate is_manager(M), and similarly the meaning of the
noun "salary" (as in "What is the salary of each manager?") is expressed as the predicate
has_salary(Sal, Stf). Here the predicate has two arguments. The first is the salary, and since
only staff can have a salary (from the relationship in the database), 'Stf' (signifying staff) is
the second argument. An entity that is not staff cannot fill this argument, so 'customer'
cannot be used.
There are two types of predicates used in MASQUE/SQL.
1. Type A - predicates that show a relationship (mapping) between individual
entities. For example, has_salary links Staff and Salary.
2. Type B - predicates that express relations linking entity sets to entities. For
example, the meaning of "average" as in "What is the average age of the managers?"
is av(Aver, Set), where Set stands for a set of numeric entities and Aver stands for a
numeric entity corresponding to the average of Set.
These logic predicates form the subparts of a Prolog-like LQL (Logical Query Language).
This LQL is the internal language that will ultimately be converted to SQL. For example,
"What is the salary of each manager?" is translated into LQL as follows:
answer([S, M]) :- is_manager(M), has_salary(S, M)
The process is as follows. When an English query is entered, the parser extracts the words
that occur in the lexical dictionary (in this case "salary" and "manager"). Next the semantic
dictionary gives the logic predicates for these two nouns: is_manager(M) for "manager" and
has_salary(S, M) for "salary".
The last step is to convert the LQL into a SQL query that can be executed by the DBMS.
This conversion process is explained with the help of the example given in figure 2.3.
Given the English Query “What is the average age of the managers?”, the following LQL is
generated:
answer([Aver]) :-
    setof(Age:Mng,
          (is_manager(Mng),
           has_age(Age, Mng)),
          Ages_list),
    av(Aver, Ages_list)
Figure 2.3
The translation algorithm uses a binding structure, which maps LQL variables to SQL code
fragments. Whenever a new LQL variable is encountered, a suitable SQL fragment is stored
as the binding of that variable. When the variable is re-encountered, its binding is used to
create a WHERE condition. Returning to the example, the type-A predicate
is_manager(Mng) is translated into the SQL shown in Figure 2.4.
SELECT *
FROM is_manager#1 rel1
Figure 2.4
and rel1.arg1 becomes the binding of Mng. Then has_age(Age, Mng) is processed. The
binding of Mng (i.e. rel1.arg1) is used to create a WHERE condition, and rel2.arg1 becomes
the binding of Age (see Figure 2.5).
SELECT *
FROM has_age#2 rel2
WHERE rel2.arg2 = rel1.arg1
Figure 2.5
To obtain the final SQL query, these two subqueries need to be combined. To do this, a
conjunction list is formed; to translate it, the FROM and WHERE parts of the translations of
the conjuncts are merged. In the example, the translation of the second argument of setof
(Figure 2.3) is as shown in Figure 2.6.
SELECT *
FROM is_manager#1 rel1,
has_age#2 rel2
WHERE rel2.arg2 = rel1.arg1
Figure 2.6
The processing of the overall setof instance gives the binding of Ages_list shown in Figure
2.7.
SELECT rel2.arg1, rel1.arg1
FROM is_manager#1 rel1,
has_age#2 rel2
WHERE rel2.arg2 = rel1.arg1
Figure 2.7
The SELECT part in Figure 2.7 is generated by observing the first argument of setof
(Age:Mng) and by using the bindings of Age (rel2.arg1) and Mng (rel1.arg1). The type-B
predicate av is linked to the pseudo-SQL query given in Figure 2.8.
SELECT avg(first)
FROM pair_set
Figure 2.8
Consulting the SELECT part of the binding of Ages_list, first can be associated with
rel2.arg1 and second with rel1.arg1. Processing the av instance causes the binding of
Aver to become as shown in Figure 2.9.
SELECT avg(rel2.arg1)
FROM is_manager#1 rel1,
has_age#2 rel2
WHERE rel2.arg2 = rel1.arg1
Figure 2.9
where the FROM and WHERE parts are the same as in the binding of Ages_list. The
translation of the full LQL query is the binding of Aver.
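The binding-structure mechanics of Figures 2.4 to 2.6 can be illustrated with a short sketch. The Python below is not the original MASQUE/SQL implementation (which was logic-based); it simply reproduces the conjunction-translation step for the example predicates, with the #1/#2 suffixes mirroring the table names used in the figures.

# Illustrative re-implementation of the binding-structure translation.
# predicates: list of (name, [variables]), processed left to right.

def translate_conjunction(predicates):
    bindings = {}      # LQL variable -> SQL fragment such as "rel1.arg1"
    from_parts = []
    where_parts = []
    for i, (name, args) in enumerate(predicates, start=1):
        alias = f"rel{i}"
        # the numeric suffix here mirrors the figures (predicate arity)
        from_parts.append(f"{name}#{len(args)} {alias}")
        for pos, var in enumerate(args, start=1):
            fragment = f"{alias}.arg{pos}"
            if var in bindings:
                # variable re-encountered: emit a WHERE join condition
                where_parts.append(f"{fragment} = {bindings[var]}")
            else:
                bindings[var] = fragment
    sql = "SELECT *\nFROM " + ",\n     ".join(from_parts)
    if where_parts:
        sql += "\nWHERE " + " AND ".join(where_parts)
    return sql, bindings

sql, bindings = translate_conjunction(
    [("is_manager", ["Mng"]), ("has_age", ["Age", "Mng"])])
print(sql)
# SELECT *
# FROM is_manager#1 rel1,
#      has_age#2 rel2
# WHERE rel2.arg2 = rel1.arg1
print(bindings["Age"])   # rel2.arg1

The binding of Age (rel2.arg1) returned here is exactly what the setof/av processing consults to build the SELECT avg(rel2.arg1) query of Figure 2.9.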
2.3 PRECISE
PRECISE is a natural language interface developed at the University of Washington,
Seattle, WA. The PRECISE natural language interface to databases is designed on the
principle that it should guarantee the correctness of its output, or else indicate that it does not
understand the input question. PRECISE is the first system with formal guarantees on the
soundness and completeness of an NLI [2]. A demo of PRECISE is available at the University
of Washington website [1]. The database used for this demo is the Airline Travel Information
Service (ATIS), which contains information on flight schedules, fares, ground transportation,
airports, and airplanes. For example, asking "Show me all flights from Seattle to Boston"
returns the result along with the SQL generated. However, the PRECISE interface is not
available for commercial use, and so it was not tested for this project.
A database is made up of three types of elements: relations, attributes and values. An attribute
is a particular column in a particular relation, and each value element is the value of a particular
attribute. The words that usually appear in questions (what, which, where, who, when) are
known as Wh-words; these words help in identifying attributes in questions. A set of word
stems that matches a database element is called a token.
For example, {require, experience} and {need, experience} could refer to the attribute
Required Experience. There are two types of tokens: value tokens and attribute tokens. If the
token corresponds to a value in the database then it is a value token; similarly, if the token
corresponds to an attribute then it is an attribute token. This token system helps match words
in the question with database elements.
A set of tokens such that every word in the question appears in exactly one token is known
as a complete tokenization of the question. PRECISE uses an attachment function, which
maps pairs of tokens to TRUE or FALSE (i.e., it tells whether two given tokens are attached).
For example, consider the question "What French restaurants are located downtown?" Here
the tokens located and downtown are attached, while the tokens what and downtown are not.
A valid mapping from a complete sentence tokenization to a set of database elements has the
following characteristics:
- each token matches a unique database element
- each attribute token is attached to a unique value token
- each relation token is attached to either an attribute token or a value token
A question is semantically tractable if it has at least one complete tokenization with only
distinct tokens, has at least one value token that matches a Wh-word, and results in a valid
mapping. Examples are given in Figures 2.10 and 2.11.
What French restaurants are located downtown?
- "What" is a Wh-word that corresponds to "restaurant"
- the relation token "restaurant" refers to a Restaurants relation with attributes Name, Cuisine, and Location
- the value token "French" is paired with the attribute token "Cuisine"; in this case, Cuisine is called an implicit attribute
- the value token "Downtown" is paired with the attribute token "Location"; in this case, Location is called an explicit attribute
Figure 2.10
What French restaurants are located downtown?
Examples of what the attachment function does:
- the tokens "located" and "downtown" are attached
- the tokens "what" and "downtown" are not attached
- the relation token "restaurant" is attached to the value token "French"
Figure 2.11
How PRECISE works
Given an English question, PRECISE determines whether it is semantically tractable, i.e.
whether it has at least one complete tokenization with a valid mapping. If it is, PRECISE
generates the corresponding SQL query or queries.
PRECISE uses a max-flow algorithm for graph matching to find a valid mapping from the
complete sentence tokenization to database elements. For example, consider "What are the
HP jobs on a UNIX system?" Given a Job relation with attributes Description, Platform, and
Company, PRECISE produces a complete tokenization of the question as shown in Figure
2.12. The syntactic markers (which have no impact on the interpretation of the question) are:
are, the, on, a. The value tokens are: what, HP, UNIX. The only attribute token is system,
and the only relation token is job.
Figure 2.12 [Ref 4]
An attribute-value graph is constructed as shown in Figure 2.13.
Figure 2.13 [Ref 4]
The max-flow algorithm automatically handles ambiguity (here, the ambiguity caused by
"HP") and "decides" on the final data flow path, indicated by the solid lines in Figure 2.13.
After all attribute and value tokens have been matched to database elements, the system
checks that every relation token corresponds to a value token or attribute token. When
multiple relations are involved in a question, a relation flow graph is constructed and the
max-flow algorithm is applied in a similar manner.
If all syntactic constraints are satisfied by the resulting value token-attribute token pairings, a
valid mapping has been found and the resulting SQL query is generated.
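The role of the matching step can be sketched briefly. PRECISE itself formulates it as max-flow [4]; with unit capacities this is equivalent to maximum bipartite matching, which the Python below implements via augmenting paths. The candidate edges are hypothetical, loosely modeled on the HP/UNIX example, and show how resolving one token can force the interpretation of another.

# Illustrative sketch only: match each token to a distinct database element.
# candidates: dict mapping each token to the elements it could match.

def max_bipartite_matching(candidates):
    match = {}  # element -> token currently holding it

    def try_assign(token, seen):
        for elem in candidates[token]:
            if elem in seen:
                continue
            seen.add(elem)
            # take a free element, or re-route the token currently holding it
            if elem not in match or try_assign(match[elem], seen):
                match[elem] = token
                return True
        return False

    for token in candidates:
        try_assign(token, set())
    return {tok: el for el, tok in match.items()}

tokens = {
    "what": ["Job.Description"],
    "HP":   ["Job.Company", "Job.Platform"],   # ambiguous value token
    "UNIX": ["Job.Platform"],
}
print(max_bipartite_matching(tokens))
# "UNIX" can only take Job.Platform, so any complete matching must
# send the ambiguous "HP" to Job.Company.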
PRECISE System Architecture

[Figure 2.14: PRECISE system architecture. An English question flows through the tokenizer, matcher, equivalence checker and query generator, supported by the lexicon, a parser plug-in and the database, to produce a SQL query set and answer set.]
Figure 2.14 shows the system architecture of PRECISE. The lexicon supports two operations.
First, when given a word stem ws, it returns the set of tokens which contain ws. Second,
when given a token t, it returns the set of database elements matching t. In this way the
names of all database elements are extracted and split into individual words. Each word is
then stemmed and a corresponding set of synonyms is identified using the Lexicon.
The tokenizer's input is a natural language question, and its output is the set of all possible
complete tokenizations of the question. In the next step, the problem of finding a semantic
interpretation of natural language tokens as database elements is reduced to a maximum
matching problem; this is done by the matcher.
PRECISE then extracts attachment relationships between tokens. For example, consider
"What are the capitals of the US states?" The parser enables PRECISE to understand that the
token capital is attached to the token state.
The query generator takes the database elements selected by the matcher and generates a
SQL query. The equivalence checker tests whether there are multiple distinct solutions to a
max-flow problem and whether these solutions translate into distinct SQL queries.
2.4 Summary
For a broad class of semantically tractable natural language questions, PRECISE is
guaranteed to map each question to the corresponding SQL query. Experiments conducted
using three databases are discussed in [4]. Several questions were asked of these databases;
it was found that 80% of the questions were semantically tractable, and PRECISE answered
them correctly. PRECISE automatically recognized the 20% of questions that it could not
handle, and requested a rephrase.
Chapter 3
Building Lexicon and Semantic Dictionary
The study of natural language interfaces such as MASQUE and PRECISE makes clear
that the building of the lexicon and the semantic dictionary is the soul of the
transformation. This chapter discusses how to build a lexicon and semantic dictionary,
and describes how these are built in ELF (English Language Frontend), a commercial
interface for natural languages [3]. The chapter details the steps for translating an
English query into SQL using ELF.
3.1 Introduction to ELF
ELF is a commercial system, developed by ELF Software Co., that generates a natural
language processing system for a database. ELF is an interface that works with
Microsoft Access and Visual Basic.
3.2 How are the Lexicon and Semantic dictionary built in ELF?
The lexicon is built automatically in ELF. In other words, ELF takes an existing database and
scans through it so that it can understand both the data and the relationships. This process is
the Analyze function, and its interface is shown in Figure 3.1.
For simpler cases, Express Analysis is sufficient. This causes ELF to automatically read all
the information it needs out of the database. Words related to the attributes and relationships
of the database are stored in the lexicon.
Figure 3.1
There might be situations when certain tables and relationships need to be excluded from the
lexicon; Custom Analysis is selected for such situations. Using this function, decisions can be
made to help Access ELF decide where to concentrate, what to evaluate, and what to ignore.
This allows the use of human common sense, which no computer program can claim, and
comes in very handy. Figure 3.2 shows the Custom Analysis window, where the tables to be
considered can be manually selected.
This window contains all the table names. When a table (or query) in the Custom Analysis
window is de-selected, ELF is excused from answering any questions related to that table: it
will not attempt to look at how the table's fields relate to each other, and will not store any of
the words related to the table and its relationships in its own dictionary. Of course, this
speeds up the Analysis process. Depending upon the situation, some information is used in
searches frequently, occasionally, or not at all. If a table holds a significant amount of data, it
may be wise to reduce processing time by selectively ignoring the parts of it that are rarely
searched. To do this, right-click on any table in the Custom Analysis window's Data Set list.
A listing of the fields in that table will appear, giving the option of "Acknowledging" (Ack)
and/or "Memorizing" (Mem) each one (Figure 3.3).
Figure 3.2
If Acknowledge is de-selected for a field, the effect is like ignoring an entire table: ELF acts
as if the field does not exist and will not be able to answer questions about it. If
Acknowledge is selected but Memorize is de-selected, ELF will know the field's type and
which table it comes from, as well as other details such as whether it participates in
relationships or whether it seems to be a person's name or a place.
Figure 3.3
The only thing it will not do is save all the data entries from that particular field in the
lexicon. During the Analysis process, ELF examines the terms used in defining fields and
tables, and uses its lexicon to try to predict what kinds of synonyms might be used in
queries. It also stores each word's type (e.g. noun, verb, preposition) and which table it
comes from; this builds the semantic dictionary.
For the database used in this project, the lexicon entry for the element "brown" is shown in
Figure 3.4.
Figure 3.4
The last two items in the entry, EYES_COLOR and HAS_EYE-COLOR, mean that
"brown" is a data item of the COLOR field of the EYES table and also a data item of the
EYE-COLOR field of the HAS table. NOBLE indicates that "brown" is not part of a
compound data item, CAPDATA indicates that it is spelled with an initial capital letter, and
DATA means that it comes from a database field. Thus all the information needed for
mapping is also stored in the lexicon. This completes the building of the lexicon and the
semantic dictionary.
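As a rough illustration (not ELF's internal format, which Figure 3.4 shows only as a screen dump), the information carried by such an entry can be pictured as a small record; the field names below are invented, and only the flag meanings come from the description above.

# Hypothetical shape of a lexicon entry for "brown".
from dataclasses import dataclass

@dataclass
class LexiconEntry:
    word: str
    flags: tuple        # e.g. NOBLE (not a compound), CAPDATA, DATA
    locations: tuple    # (table, field) pairs where the word occurs as data

brown = LexiconEntry(
    word="brown",
    flags=("NOBLE", "CAPDATA", "DATA"),
    locations=(("EYES", "COLOR"), ("HAS", "EYE-COLOR")),
)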
3.3 Transformation of an English query to SQL in ELF
Now the database system is ready to answer an English query. ELF does this in three steps.
The question is typed in the query box as shown in Figure 3.5.
24
Figure 3.5
The first step is to parse the English question and find the words that are stored in the
lexicon. Figure 3.6 shows the result after parsing.
Figure 3.6
The words extracted are name, eye color and brown. One important point is that
since ELF found brown, and brown is a data item, it is put in the WHERE clause. In the next
step (Figure 3.7) ELF finds the tables for these words; this information is in the lexicon.
Figure 3.7
After getting the tables, the last step involves joining them on their common attributes.
In this example there are two tables involved, 'person' and 'has'. Both tables have
the attribute ID, and so the join is made on this attribute. Figure 3.8 shows the final SQL
statement, and Figure 3.9 shows the final result.
Figure 3.8
Figure 3.9 (elfWorksheetSub):

name  | EYE Color
anshu | brown
3.4 Storing the generated SQL in a file
The natural language query has now been converted into SQL. The main aim of the project
was to use this for fuzzy database queries, so there had to be a way for the fuzzy group to
access this SQL and modify it to include the fuzzy data. The fuzzy group decided to use
SQL Server for creating the fuzzy database, while, as stated earlier, this experiment was done
using Access; saving the generated SQL to a file provided a common interface. Since ELF
supports VBScript, a script was written and embedded in the ELF software. The function was
named getsql (Figure 3.10).
Figure 3.10
This function is triggered after the SQL has been generated and, as indicated in the script,
stores the SQL in a file named elfsqlGenerated.txt.
Chapter 4
Fuzzy word extraction
As stated in Chapter 2, the goal was to get the fuzzy database working with the natural
language interface. The SQL generated for the fuzzy-word version of the query in Figure 3.8
should be:
SELECT DISTINCT Person.name, has.[EYE Color] FROM Person LEFT JOIN has ON Person.ID = has.ID WHERE has.[EYE Color] = "very Brown";
The first thought for achieving this was to include fuzzy words in the database as attribute
values. For example, the values for the attribute COLOR in the table EYE could be "very
brown" or "light brown". In that case, however, the words are no longer fuzzy but become
distinct database values. For words to be fuzzy they need to have weights associated with
them [6]. For example, "very brown" might have an associated weight of 8.1, and "light
brown" a weight of 3.2. The SQL generated for the query "List all people with very brown
eyes" should then be:
SELECT * FROM EYE WHERE COLOR = "brown" AND WEIGHT >= 8.1
This query is now fuzzy.
The second option was to add "very brown" as a synonym of "brown" in the lexicon. This
also does not help, because the SQL generated by ELF always takes the root word (i.e. the
attribute value), which is "brown" and not "very brown".
The last option was to extract the fuzzy word, send it to a separate file, and leave the
generated SQL as is. This was the better choice, since the ultimate aim was that in the final
SQL query an "and" clause for the weight could be added in place of "very". The fuzzy
group then picked up the generated SQL from one file and the fuzzy words from the other.
Based on the fuzzy word, the appropriate "and" clause was then added to the SQL, for
example:
SELECT DISTINCT Person.name, has.[EYE Color] FROM Person LEFT JOIN has ON Person.ID = has.ID WHERE has.[EYE Color] = "Brown" AND weight = 0.7
To achieve this, the VBScript shown in Figure 4.1 was written.
Function putq
    ' Write the original English question to a file
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTextFile = objFSO.OpenTextFile("c:\elfNatural.txt", 2, True)
    objTextFile.WriteLine(Question)
    objTextFile.Close

    Dim QArray
    Dim str
    Dim result
    Dim final

    ' Open the file that will receive the extracted fuzzy words
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTextFile = objFSO.OpenTextFile("c:\elfFuzzyWord.txt", 2, True)

    ' Scan the question word by word for the fuzzy modifier "very";
    ' when found, write "very" plus the word that follows it
    ' (stop one word early so QArray(i + 1) is always in range)
    QArray = Split(Question, " ")
    str = "very"
    For i = LBound(QArray) To UBound(QArray) - 1
        If QArray(i) = str Then
            result = QArray(i + 1)
            final = str & " " & result
            objTextFile.Write(final)
        End If
    Next
    objTextFile.Close
End Function
Figure 4.1
This function (putq) is triggered after the question has been asked; the generated file,
elfFuzzyWord.txt, contains the extracted fuzzy words. The fuzzy group then read the fuzzy
words from the file and made the corresponding changes in the generated SQL to get the
correct result for the fuzzy database.
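The rewrite performed by the fuzzy group can be sketched as follows. This Python version is hypothetical (the group's actual code is not part of this report): it reads the two files produced above, strips the modifier from the string literal, and appends a weight clause. The weight values follow the examples given earlier.

# Hypothetical sketch of the fuzzy group's rewrite step.

FUZZY_WEIGHTS = {"very": 8.1, "light": 3.2}   # community-defined weights [6]

def add_weight_clause(sql_path="elfsqlGenerated.txt",
                      fuzzy_path="elfFuzzyWord.txt"):
    with open(sql_path) as f:
        sql = f.read().strip().rstrip(";")
    with open(fuzzy_path) as f:
        modifier, value = f.read().split()     # e.g. "very brown"
    # Replace "very brown" with the root value "brown" in the literal
    # (a real version would need to match case-insensitively, e.g. "very Brown")
    sql = sql.replace(f"{modifier} {value}", value)
    # ...and compare the membership weight against the modifier's threshold.
    return f"{sql} AND weight >= {FUZZY_WEIGHTS[modifier]};"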
Chapter 5
Future Work
Currently the interface does not handle negation with fuzzy words: it cannot handle
changing "very brown" to "not very brown". For example, consider the query "Find people
with not very brown eyes". This is interpreted as:
SELECT DISTINCT Person.name, has.[EYE Color] FROM Person LEFT JOIN has ON Person.ID = has.ID WHERE has.[EYE Color] is not "Brown"
This is not correct; the query was supposed to fetch people with light brown eyes.
The current implementation allows a natural language query in terms of only one attribute.
Supporting compound attributes in a query (e.g., "List all people with very brown eyes and a
broad face") would make the system more useful and powerful.
Comparing other natural language interfaces in terms of efficiency, accuracy and quality
would be an interesting research project; the most efficient and accurate interface could then
be used with the fuzzy database and spatial database.
Another research project could address the handling of synonyms. The key to a good
interface is a good lexicon; working with an interface that has a simple lexicon (no
synonyms) and observing how it behaves would be an interesting subject.
References
[1] www.cs.washington.edu/research/projects/WebWare1/www/precise/precise.html
[2] Knowles, S., A Natural Language Database Interface for SQL-Tutor, Nov 1993.
[3] ELF Software Co., Natural Language Database Interfaces from ELF Software Co., available at www.elfsoft.com.
[4] Popescu, A.M., Etzioni, O., Kautz, H., Towards a Theory of Natural Language Interfaces to Databases, Jan 2003.
[5] Androutsopoulos, I., Ritchie, G., Thanisch, P., MASQUE/SQL - An Efficient and Portable Natural Language Query Interface for Relational Databases, Edinburgh, 1993.
[6] Joy, K. and Dattatri, S., Implementing a Fuzzy Relational Database and Querying System With Community Defined Membership Values, VCU Directed Research Report, November 2004.