A Free-Form Dialog Program in Spanish

Fred Jehle
Indiana U.-Purdue U. at Fort Wayne

ABSTRACT: SPANLAP is a computer program which attempts to engage students of Spanish in a free-form written dialog by allowing them to ask questions. The program performs morphological, syntactic, and some semantic parsing of the student input, and either generates a response according to predetermined patterns or selects a response written by an instructor. Although unfinished, the program is functional, and demonstrates the feasibility of communicative programs for students of foreign languages.

KEYWORDS: artificial intelligence, CALL, communicative activities, (answer) generation, LISP, natural language processing, parsing, VAX.

Computer-assisted language learning is a marvelous educational tool, one which has yet to reach its full potential. However, there are serious drawbacks to traditional CALL, such as the need for the instructor to write in all the questions and the possible variants of correct answers, if indeed the system allows for more than one right answer. In real life, there are often thousands of perfectly acceptable responses to a question, even one as simple as "Do you speak Spanish?". Furthermore, it is the student, not the program, who often needs to ask questions.

In view of this, the Language Laboratory of Indiana University-Purdue University at Fort Wayne has started a project involving natural language processing, a branch of artificial intelligence (AI) that deals with trying to make the computer understand and process a language such as Spanish. The goal was to build a program to allow students to use their Spanish communicatively.
Specifically, the program would converse with the student in Spanish, via the keyboard and a monitor, at least at the elementary or intermediate college level, on common, simple conversational topics.1 It would have to be able to handle at least 3000 words occurring in as many as 30,000 different forms, and it would have to work in real time, that is, process student input and respond to it in a matter of a few seconds. The program would not have to be truly "intelligent" linguistically (that is, be able to process virtually any sentence considered to be correct Spanish) but it would have to appear reasonably intelligent to the student of Spanish. And, in the first major stage of program development, it would have to be capable of answering at least certain kinds of questions put to it by students.

CALICO Journal, Volume 5 Number 2, page 11

The criteria for selecting the hardware and software were straightforward: megabytes of memory with high-speed processing, at virtually no cost to the lab or the department. The hardware, then, had to be one of the VAX 11/780 super-mini (or mainframe) computers on our campus. The student terminals would have to be GIGIs, made (and a few years back given away) by Digital Equipment Corporation, the manufacturer of the VAX. GIGIs are no longer state-of-the-art; however, they do allow the creation of foreign language character sets and the use of color graphics. Economically speaking, their obsolescence is an advantage: they can often be obtained free, although that is not necessarily true of monitors and the T-boxes needed to connect them to a VAX.

Since the program would employ AI techniques, colleagues in computer science recommended LISP (LISt Processing) as the programming language. The project was started using Franz LISP, then translated to VAX LISP when the latter language was at last successfully installed on our system.
VAX LISP is a form of Common LISP, and as such is well documented, very powerful, and boasts an extraordinary number of built-in functions.2 A favorite feature is the suspended system. When starting up a long and complicated program, it can easily take six to ten minutes or more to load all the necessary data. In VAX LISP the programmer can load the entire program with all the necessary data, create a suspended system, then set up the students' account so that they resume the suspended system at the proper point with no waiting.

The program is tentatively called SPANLAP (SPANish LAnguage Processing). Structurally, it runs in a simple cycle:

1. Print a message/reply to the screen
2. Get the student's input
3. Process the input
   A. Morphological parse
   B. Syntactic parse
   C. Semantic parse
4. Select or generate a reply to the student

The dialog begins with a simple message, such as Buenos días (Good morning) or whatever would be appropriate for the current time of day, and the program waits for the student to reply.

LISP cannot process punctuation marks, parentheses, and certain other characters as symbols or parts thereof. Therefore, a special but relatively simple input routine gets the input from the student as a list of individual characters. (A list is basically data enclosed in a set of parentheses; a symbol, or atom in other LISPs, is a single group of letters, possibly containing digits.) Since the login command file for the student account automatically sets the terminal in line editing mode, students can edit the input as they enter it, using the arrow keys to reposition the cursor to insert or delete characters, or to call up the previous response for re-use or editing. For now, let's say the student has typed the phrase: ¿Cómo estás, Carmen?
The program then converts the list of characters into a list of symbols such as this:

    (=upq cómo estás =com carmen =que)

making it easier to be processed by the next stage of the program. (In SPANLAP, symbols beginning with an equal sign represent punctuation; here, we will ignore them.)

We now have the input, but in order to understand it and reply, we have to analyze or parse it, normally on various levels. The morphological parse, at least in SPANLAP, involves determining the part of speech for each word in the student's input, as well as the morphological aspects of that word, such as mood, tense, person, and number for a verb. This parser is lexically based; that is, it checks each word in the student input against entries in its database of Spanish words and stems. The way in which this happens in SPANLAP, however, is linked to a feature which is apparently unique to LISP-like languages: the ability of the system to determine whether or not a value has been assigned to a symbol which is referenced in turn by another symbol. When the suspended system was created, SPANLAP declared cómo to be a symbol or variable name to which there is attached a list of properties:

    (set 'cómo '(
        p_s (adverbio)
        clas_ (manera)
        tag_ (interrogativo)
    ))

In this code, the part-of-speech property (p_s) indicates that this word is an adverb, the class property (clas_) is "manner," and cómo carries a tag property (tag_) indicating that the word is an interrogative. When the morphological parser comes to the student's word cómo, it asks the LISP system "Is the symbol associated with the variable 'student_word,' whose value is currently 'cómo,' a bound variable?" (boundp student_word). The answer is "Yes, this word has been assigned a value" (true); the program then retrieves the part-of-speech property from the value which had been assigned to cómo, revealing that it is an adverb.
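The lookup just described can be sketched in a few lines of Common LISP. This is only an illustration, not SPANLAP's actual code: the function names lookup-property and part-of-speech are invented here, and the sketch assumes each lexicon entry is stored as the property-list value of the word's symbol, as in the listing above.

```lisp
;;; Hypothetical sketch of a boundp-based lexicon lookup (not SPANLAP's code).

;; Lexicon entry: the value of the word's symbol is a property list.
(setf (symbol-value 'cómo)
      '(p_s (adverbio) clas_ (manera) tag_ (interrogativo)))

(defun lookup-property (word property)
  "Return PROPERTY from WORD's lexicon entry, or NIL if WORD has no entry."
  (when (boundp word)                       ; has a value been assigned?
    (getf (symbol-value word) property)))   ; read one property from the list

(defun part-of-speech (word)
  "Return the part-of-speech list stored under p_s, or NIL."
  (lookup-property word 'p_s))

(part-of-speech 'cómo)    ; => (ADVERBIO)
(part-of-speech 'estás)   ; => NIL, since no value was assigned to this symbol
```

One caution with this design: a symbol that the LISP system itself has already bound (pi, for instance, is a predefined constant in Common LISP) would look like a lexicon entry, so a sketch like this would presumably need to keep its word symbols in a package of their own.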
Spanish is an inflected language, however, and most other words are not quite so easy to parse. Taken out of context, the endings used in Spanish are virtually meaningless; 80 to 90 percent of Spanish words end in one of the three vowels "a", "e", or "o" plus an optional "s." The resultant six letter-combinations can occur as the final letters of any part of speech, and there is nothing in the word to indicate that these final letters constitute a true ending attached to a true stem. Furthermore, a regular verb in Spanish has more than 50 endings that can be applied to a single stem, and this figure does not include compound or progressive forms, or the 31 combinations of enclitic pronouns possible in certain cases. It would be impossible, or at least extraordinarily inefficient, to include in a lexicon all possible forms of Spanish words as separate entries. Therefore, the morphological parser needs to know how verbs are conjugated, and how nouns and adjectives are formed.

As an example, let's take the second word in the student's response, estás. As with the first word, the program asks, "Is this word (estás) a bound variable?" In this case the answer is negative (nil), so the program starts stripping letters off the end of the word, one at a time, substituting a hyphen: "Is está- bound?" Again, the answer comes back "no" from LISP. Then is est- bound?
Yes; it had been declared bound by SPANLAP:

    (set 'est- '(
        p_s (verbo adjetivo sustantivo)
        conj_ (-ar)
        clas_ (irregular pres_ind pres_sub strong_pretx)
        stem_ (- - - estuv-)
        pres_ind (oy ás á amos áis án)
        pres_sub (é és é emos éis én)
        type_ (intransitivo)
        tag_ (progresivo)
        alt_ (
            p_s (adjetivo)
            clas_ (0)
            end_ (e a _ os as)
            type_ (det)
            tag_ (demostrativo)
            opp_ (es-)
            alt_ (
                p_s (sustantivo)
                clas_ (4)
                isa (punto_cardinal)
                opp_ (oest-)
            )
        )
    ))

From the information in this property list, the program knows that the symbol "est-" can be the root form of a verb (estar, one of the verbs meaning "to be" in Spanish); it is also the root form of an adjective or determiner, "this"; finally, it is likewise the root of the noun meaning "east."

From the class property (clas_), the verb parser finds that the verb corresponding to the root est- is irregular in certain forms, including the present indicative, abbreviated in the program as "pres_ind." Since there is no "X" attached to the end of this tense/mood indicator, the parser knows that it must use the current stem and a special set of endings for that tense/mood (rather than a special stem, as for the preterit). The parser then finds those special endings for the irregular tense under the corresponding property name (pres_ind). The ending ás matches one given there, and thus the parser determines that estás is the indicative mood, present tense, second person singular familiar form, and temporarily stores all this information.

The last regular word in our example of student input is Carmen. This is recognized as a proper name used for a female. The name parser can recognize hundreds of proper nouns, including the names of some historical and literary persons as well as numerous countries and capitals.

At the end of the morphological parse, we know that the student input (¿Cómo estás, Carmen?)
consists of punctuation, plus an adverb, a verb, punctuation, a proper name, and final punctuation:

    (pun_ adverbio verbo pun_ nombre pun_)

A few additional remarks are in order about morphological parsing in the program. Separate parts of the program have been written for each of the normal parts of speech, as well as for numbers, abbreviations, and proper names. Each searches out different types of information, with the verb parser being the most complex. The program looks ahead one and two words in the student input to see if it can find a multiple-word entry. Thus, in one pass it will find antes de que to be a conjunction, but antes de to be a preposition and antes to be an adverb.

The next step in processing the student input is the syntactic parse. That is, the input is analyzed at the sentence level, determining such things as the grammatical subject, predicate, direct object, and indirect object. Our original example of student input, ¿Cómo estás, Carmen?, is relatively easy to parse: the pronoun form of the subject is tú, the noun-group-subject is Carmen, and there are no objects (and the program knows from the "type" property that estar can only be used intransitively).

Syntactic parsing in Spanish is made more difficult by such things as loose Spanish word order, apposition, and the use of redundant object pronouns; it is also more confusing since all third person direct object pronouns are identical in form to articles, and all other direct object pronouns are identical in form to indirect object pronouns. We are saved to some extent by the knowledge that indirect and direct object pronouns should occur precisely in that order and should be found attached to the end of affirmative commands, immediately before other conjugated verbs used alone, or in either the pre- or post-position if a conjugated verb is used in conjunction with an infinitive or gerund.
Therefore, the plan of attack for the syntactic parser is to:

(1) start at the verb;
(2) if there are no enclitic objects, check for object pronouns in front of the verb; if any are found, identify them as such since they had been incorrectly identified as articles by the morphological parser;
(3) analyze the left side of the student input, from the beginning up to the verb or object pronouns;
(4) analyze the right side of the student input, from the verb or enclitics to the end of the sentence.

This syntactic analysis searches primarily for noun groups (which serve as the objects of prepositions), the subjects and objects of verbs, predicate nouns, and adverbial expressions of time. The program uses several definitions for a noun group, including a pronoun of the appropriate type, a proper name, or the following combination of items:

    ( (("tod-") det)
      (número (número (número)))
      ((adverbio) adjetivo)
      sustantivo
      ((adverbio) adjetivo) )

Inside this definition list, symbols are obligatory elements and lists are optional ones. Accordingly, a noun phrase could be composed of:

(1) an optional determiner (such as an article, demonstrative, or possessive adjective), which could be preceded optionally by a form of the word todo ("all");
(2) an optional number, followed by another optional number or two;
(3) an optional adjective, which could be preceded by an optional adverb;
(4) a noun; if no noun is found here and the above optional adjective does occur, it is presumed to be nominalized, provided it is preceded by a determiner;
(5) an optional adjective, which could be preceded by an optional adverb.

This definition of a noun group is certainly not complete, but it will catch the majority of those used at the intermediate level.
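The convention that symbols are obligatory and nested lists are optional lends itself to a short recursive matcher. The following Common LISP sketch is an assumption about how such a definition could be interpreted, not the program's own code; the names *noun-group*, match-seq, and noun-group-length are invented, the input is taken to be a list of part-of-speech tags, and the nominalized-adjective rule in item (4) is left out.

```lisp
;;; Hypothetical matcher for the noun-group definition (not SPANLAP's code).

(defparameter *noun-group*
  '((("tod-") det) (número (número (número))) ((adverbio) adjetivo)
    sustantivo ((adverbio) adjetivo))
  "Noun-group definition from the article: atoms obligatory, lists optional.")

(defun match-seq (pattern tags)
  "Match PATTERN against a prefix of TAGS.
Return a one-element list holding the unmatched rest of TAGS, or NIL on failure."
  (cond ((null pattern) (list tags))
        ((listp (first pattern))
         ;; Optional group: first try with its contents spliced in, then without.
         (or (match-seq (append (first pattern) (rest pattern)) tags)
             (match-seq (rest pattern) tags)))
        ((and tags (equal (first pattern) (first tags)))
         ;; Obligatory element: it must equal the next tag.
         (match-seq (rest pattern) (rest tags)))
        (t nil)))

(defun noun-group-length (tags)
  "Number of leading TAGS that form a noun group, or NIL if there is none."
  (let ((result (match-seq *noun-group* tags)))
    (when result
      (- (length tags) (length (first result))))))

;; el nuevo profesor
(noun-group-length '(det adjetivo sustantivo))            ; => 3
;; una nota muy mala
(noun-group-length '(det sustantivo adverbio adjetivo))   ; => 4
;; todos los días viene ...
(noun-group-length '("tod-" det sustantivo verbo))        ; => 3
;; no noun group at the start
(noun-group-length '(verbo det sustantivo))               ; => NIL
```

Splicing an optional group into the front of the remaining pattern keeps the matcher to a single function and lets it back out of an optional element when the rest of the pattern then fails to match.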
After finding the noun groups, the syntax parser determines their respective grammatical functions in the sentence in view of: (1) the type of verb involved (for example, whether or not it can be used transitively); (2) the type of noun in each noun group (for example, whether or not it represents a person or a unit of time); and (3) the order in which the noun groups occur.

The syntax parser still needs considerable refining, but it does work relatively well, and can successfully parse such sentences as:

    Todos los días viene Juan Luis.
    El nuevo profesor le dará a María una nota muy mala este semestre.

This syntax parser also sends along comments if it considers the sentence "unsyntactic", for example, if it finds a preposition without an object, or too many noun groups in a sentence.

The next step would be the semantic parse, to determine what the student input means and whether or not it is logical or coherent. At present, the program is weak at this level, but it does have some idea of the relationship between things in the real world. Several hierarchies, such as "isa" (i.e., "is a..."), "has", and "can" relationships, help out. For example, a dog "isa" mammal, which in turn "isa" animal, which "isa" living thing. Furthermore, a dog "has" four paws and a tail, and it "can" bark, lick, and bite:

    (set 'perr- '(
        p_s (sustantivo)
        clas_ (1)
        isa (mamifer-)
        has (4 pata- col-)
        can (ladr- lam- mord-)
    ))

On the basis of a hierarchy such as this, and a few functions to manipulate the information involved, the program can determine, for example, that a dog is a mammal (and thus has a mouth and a head and can walk and run), that it is an animal (and thus has a brain and can move and learn), and that it is a living thing (and thus can be born and die). Definitions are also possible, such as answering the question "What is a dog?".
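The "few functions to manipulate the information" might look roughly like the Common LISP sketch below. Only the perr- entry comes from the listing above; the entries for mamifer-, animal-, and ser_vivo, their stems, and the function names are invented for illustration, and each stem is assumed to have at most one "isa" parent.

```lisp
;;; Hypothetical isa/has/can helpers (not SPANLAP's code).

;; Entry from the article.
(setf (symbol-value 'perr-)
      '(p_s (sustantivo) clas_ (1) isa (mamifer-)
        has (4 pata- col-) can (ladr- lam- mord-)))
;; Invented entries, following the article's prose description.
(setf (symbol-value 'mamifer-)
      '(p_s (sustantivo) isa (animal-) has (boc- cabez-) can (camin- corr-)))
(setf (symbol-value 'animal-)
      '(p_s (sustantivo) isa (ser_vivo) has (cerebr-) can (mov- aprend-)))
(setf (symbol-value 'ser_vivo)
      '(p_s (sustantivo) can (nac- mor-)))

(defun stem-property (stem property)
  "Return PROPERTY from STEM's entry, or NIL if STEM has no entry."
  (when (boundp stem)
    (getf (symbol-value stem) property)))

(defun isa-chain (stem)
  "Categories reached from STEM by following isa links, nearest first."
  (let ((parent (first (stem-property stem 'isa))))
    (when parent
      (cons parent (isa-chain parent)))))

(defun isa-p (stem category)
  "True if CATEGORY is somewhere above STEM in the isa hierarchy."
  (and (member category (isa-chain stem)) t))

(defun inherited (stem property)
  "Values of PROPERTY for STEM followed by those of all its ancestors."
  (loop for s in (cons stem (isa-chain stem))
        append (stem-property s property)))

(isa-chain 'perr-)        ; => (MAMIFER- ANIMAL- SER_VIVO)
(isa-p 'perr- 'animal-)   ; => T
(inherited 'perr- 'has)   ; => (4 PATA- COL- BOC- CABEZ- CEREBR-)
(inherited 'perr- 'can)   ; => (LADR- LAM- MORD- CAMIN- CORR- MOV- APREND- NAC- MOR-)
```

A circular "isa" link would make isa-chain recurse forever, so a fuller version would need to guard against that.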
At present the program only gives the reply "It is a mammal," but it would be a relatively easy matter to enable the program to add such details as "with four legs and a tail and which can bark."

This hierarchy is primarily a classification system for nouns. More information must be included for verbs; one simple way of handling them is to include in their list of properties the types of subjects and direct objects which normally occur with them. Thus, for example, if we were to indicate that for the verb "to eat" the subject should normally be a living thing and that the object should be food, we could determine that a sentence such as "My car eats TV sets" is not particularly logical or coherent. The additional verb information system will help, but it will not permit the program to "understand" verbs beyond "has," "is," and "can," and verbs are probably the most important part of a student input. Therefore, the next step in the program development will be to insert a system of knowledge representation for verbs, probably using conceptual dependency.3

After an intelligent dialogue program understands the student input on a conceptual level, it should be able to generate a reply in the same way it analyzed the student input, but in reverse order, that is:

A. Conceptual or semantic level (abstract terms)
B. Syntactic level (subject, predicate, objects)
C. Grammatical level (choice of words, word order, agreement, tense/mood)
D. Morphological level (forms of individual words, plus capitalization and punctuation)

As yet, the program skips the semantic level. If it doesn't understand a word in the student's input, it says so. If it finds the input to be unintelligible morphologically or syntactically, it indicates that it doesn't understand. Otherwise, it checks the input against a series of key words and phrases. There are lists of possible pat answers for such questions as "How are you?".
Also, there are set procedures for such standard questions as "What time is it?", "What's your name?", and "What's today's date?". In the case of questions such as "What's the opposite of X?", "What is an X?", or "Can an X do such-and-such?", the program generates a response according to a particular syntactic pattern, using morphological generators, which create the particular form needed for a word.

These morphological generators are rather like morphological parsers in reverse, but simpler to write and fewer in number. There is one for verbs, one that does both nouns and adjectives, one for articles, one for numbers, and another which handles virtually all other parts of speech. The standard process in this part of the program is that the stem of the word to be processed is sent to the appropriate part-of-speech generator, together with the attributes desired for the resultant form, for example, mood, tense, person, and number for a verb. Each part-of-speech generator has its own set of defaults, so if no attributes are sent to it, it can still come up with a recognizable form.

For this system to work, the property system represented by the root must include cross references if the word in question has any irregular roots. Thus, for example, the stem of the verb estar (est-) must be cross-referenced to the preterit stem (estuv-), and vice-versa. Also, mass nouns must be so designated, so the program knows whether or not to use an article or the plural form in a given situation. We have also found it important for opposite words to be cross referenced to each other: big-small, love-hate, always-never.

If the student's input is morphologically and syntactically intelligible and the program is unable to formulate its own response, it selects from a series of instructor-written sentences one which will hopefully work, depending on the type of sentence involved.
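To make the verb generator described above more concrete, here is a minimal Common LISP sketch under several assumptions: it covers only the present and preterit indicative of -ar verbs, stores stems and endings as strings rather than symbols, and defaults to the third person singular present, which is a guess since the article does not say what the defaults are. The names *endings*, *verbs*, and generate-verb are invented; this is not the program's own generator.

```lisp
;;; Hypothetical verb generator for -ar verbs (not SPANLAP's code).

;; Standard endings, in the order 1sg 2sg 3sg 1pl 2pl 3pl.
(defparameter *endings*
  '((:present         . ("o" "as" "a" "amos" "áis" "an"))
    (:preterit        . ("é" "aste" "ó" "amos" "asteis" "aron"))
    (:strong-preterit . ("e" "iste" "o" "imos" "isteis" "ieron"))))

;; Simplified lexicon: stem followed by a property list of irregularities.
;; The est- data (special present endings, cross-referenced preterit stem
;; estuv-) follows the article; "habl" stands in for a regular verb.
(defparameter *verbs*
  '(("est" :present-endings ("oy" "ás" "á" "amos" "áis" "án")
           :preterit-stem "estuv")
    ("habl")))

(defun generate-verb (stem &key (tense :present) (person 3) (number :singular))
  "Build an indicative form of STEM; unlisted stems are treated as regular."
  (let* ((props (rest (assoc stem *verbs* :test #'string=)))
         (slot (+ (1- person) (if (eq number :plural) 3 0)))
         ;; A cross-referenced special stem applies only in the preterit.
         (strong-stem (and (eq tense :preterit) (getf props :preterit-stem)))
         (endings (cond (strong-stem (cdr (assoc :strong-preterit *endings*)))
                        ((and (eq tense :present)
                              (getf props :present-endings)))
                        (t (cdr (assoc tense *endings*))))))
    (concatenate 'string (or strong-stem stem) (nth slot endings))))

(generate-verb "est" :person 2)                      ; => "estás"
(generate-verb "est")                                ; => "está"
(generate-verb "est" :tense :preterit :person 1)     ; => "estuve"
(generate-verb "habl" :person 1 :number :plural)     ; => "hablamos"
(generate-verb "habl" :tense :preterit)              ; => "habló"
```

Because every attribute is a keyword argument with a default, calling the generator with the stem alone still yields a recognizable form, which is the behavior the text attributes to each part-of-speech generator.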
For example, if the student input represents a yes-no question, the procedure is to randomly decide "yes" or "no", then pick at random a phrase from a list of general affirmative or negative responses. If the student poses an information question, it tries to look up an answer involving the appropriate entity, such as time or quantity. If none of these patterns seems to fit, it resorts to a list of more general replies, such as "Well, a lot can be said on that score" or "Let's change the subject."

The goal is to give the student an opportunity to practice communicating in Spanish; therefore, the program always responds to the student with a sentence or phrase in Spanish, directed where possible toward the content of the student input, not the (possibly poorly-phrased) way in which it was written. A few errors in the student's input are pointed out, such as the lack of a prepositional object; mistakes in agreement, both noun/modifier and subject/verb, will be pointed out in subsequent versions. However, error analysis will continue to be in the form of an "aside" except when errors prevent input from being understood.

As has already been intimated, a great number of capabilities must still be incorporated into SPANLAP. As a result, the program is both exhilaratingly "smart" and exasperatingly "dumb," at least as seen by its author. Nevertheless, it does demonstrate that communicative programs for our students of foreign language are within our grasp.4

Sample dialog

In the following session with the program, the user's input is preceded by a prompt (>), the computer's responses follow it, and the author's comments appear in parentheses:

Buenos días.
> ¿Cómo estás?
Bien, gracias. ¿Y tú?
> ¿Cómo te llamas?
Me llamo Carmen Luisa Martínez Quintana.
(The program can assume any of four different identities.)
> Quiero hablar con otra persona.
Chau. Voy a ver si puedo encontrar a Rosario.
Hola.
> ¿Quién eres?
Soy Rosario Consuelo Quiroga Ballesteros.
> ¿Dónde vives?
En Puebla.
> ¿Tienes un novio?
Claro.
> ¿Cómo se llama tu novio?
Se llama José Luis Martínez Quintana.
> ¿Qué día es hoy?
Hoy es jueves.
> Ud. es estúpida.
Favor de no insultarme.
> ¿Qué es
(The user hit the RETURN key before finishing the question.)
Cuidado con la puntuación.
> ¿Qué es el opuesto de amaste?
Es odiaste.
> ¿Tiene un perro patas?
Tiene cuatro patas.
> ¿Comes televisores?
Imposible.
> ¿Comes televisores?
Imposible.
> ¿Comes televisores?
Parece que sí.
(The question is "Do you eat TV sets?", which the user repeated using the UP arrow key to retrieve it; the program doesn't yet understand verbs, so it randomly decided "yes" or "no," and then picked from a list of possible replies.)
> ¿Quién fue Miguel de Cervantes?
Creo que debemos hablar de otra cosa.
(Another instructor-written response; the program does recognize the name of this famous Spanish author, but it didn't understand the question.)
> ¿Cuál es la capital de Puerto Rico?
SAN JUAN.
(The program can recognize two or three consecutive words, e.g., "Puerto Rico", as representing one entity. The answer is correct, but should not be entirely in capital letters.)
> ¿Dónde duermes?
Duermo en casa.
> ¿Cuéndo almuerzas?
No entiendo la expresión 'CUéNDO'.
> ¿Cuándo almuerzas?
(The user made a typographical error in the previous question; here, it was retrieved and edited.)
Dentro de poco.
> q
(QUIT or Q may be used to sign off.)
Muchas gracias por todo. Hasta la vista...

Notes

1 Joseph Weizenbaum of MIT is credited with the first conversational-type program, the now-famous ELIZA. A few interesting foreign language programs have been designed along similar lines, primarily employing pattern matching rather than parsing, including the German PSYCHIATER and SPION by Ruth Sanders, as well as FAMILIA and an experimental Spanish ELIZA by John Underwood. For more information on most of these, see John H.
Underwood, Linguistics, Computers, and the Language Teacher: A Communicative Approach (Rowley, Mass: Newbury House, 1984).

2 VAX LISP has considerable on-line help. As a practical supplement to that, I recommend Common LISPcraft, by Robert Wilensky (New York: W.W. Norton and Company, 1986).

3 Conceptual dependency involves a set of primitives such as ATRANS (transfer of an abstract relationship, e.g., give), PTRANS (transfer of the physical location of an object, e.g., go), MTRANS (transfer of mental information, e.g., tell), etc. See Roger C. Schank, "Conceptual Dependency: A Theory of Natural Language Understanding," Cognitive Psychology 3.4:552-630 (1972).

4 This project was undertaken by someone with no staff support and no formal training in cognitive science, computational linguistics, artificial intelligence, or even computer science. Just imagine what we can expect, or at least hope for, from teams of experts in these fields.

Author's Biodata

Fred Jehle is an Associate Professor of Spanish and Director of the Language Lab at Indiana U.-Purdue U. at Fort Wayne. He has been writing CALL programs for language lab users for over five years, primarily on a VAX 11/780 running VMS.

Author's Address

Fred Jehle
Director, Language Lab
Indiana U.-Purdue U. at Ft. Wayne
Ft. Wayne, IN 46815