ABSTRACT
The field of Natural Language Processing is emerging as a rich and useful field in
Computer Science. With the explosion of the Internet and other large-scale digital
document databases, the problem of doing useful analysis of natural language text has
come to the forefront of Artificial Intelligence and Machine Learning. This paper
presents an overview of the field of Natural Language Processing as the basis for a
10-week introductory class on the subject matter.
The concept for this semester’s independent work stemmed from a previous
semester’s independent work. The earlier project entailed the creation of an Internet-based
software package that allows users to highlight unknown foreign language words,
thus automatically triggering the display of an English translation. Ideally, the concept
would be expanded to translate sentences and paragraphs into English, but the technology
necessary for the accurate execution of such a task did not exist at the time of this
project’s completion.
My curiosity about the absence of this technology evolved into a semester-long
research exploration of Natural Language Processing, the field of Computer Science that
includes Machine Translation. As a testament to my learning, I decided to propose a
class on this topic, since other students like me would presumably be interested in this
subject matter as well. The paper that follows is a summary of the key concepts in NLP
that I suggest be covered in a ten-week class. The instructor of the class is encouraged to
delve into more detail in certain sections, because there is a plethora of information and
specifics that I was unable to include in this paper. Topics such as parsing techniques
and statistical natural language processing were not expanded upon because of
time/length restrictions; however, they are very rich and fundamental topics in NLP.
1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE
The field of Artificial Intelligence (AI) is the branch of Computer Science that is
primarily concerned with the ability of machines to adapt and react to different situations
as humans do. In order to achieve artificial intelligence, we must first understand the
nature of human intelligence. Human intelligence is a behavior that incorporates a sense
of purpose in actions and decisions. Intelligent behavior is not a static procedure.
Learning, defined as behavioral changes over time that better fulfill an intelligent being’s
sense of purpose,1 is a fundamental aspect of intelligence. An understanding of
intelligent behavior will be realized when either intelligence is replicated using machines,
or conversely when we prove why human intelligence cannot be replicated.
In an attempt to gain insight into intelligence, researchers have identified three
processes that comprise intelligence: searching, knowledge representation, and
knowledge acquisition. The field of AI can be broken down into five smaller
components, each of which relies on these three processes to be performed properly.
They are: game playing, expert systems, neural networks, natural language processing,
and robotics programming.2
Game playing is concerned with programming computers to play games, such as
chess, against human or machine opponents. This sub-field of AI relies mainly on the
speed and computational power of machines. Game playing is essentially a search
problem because the machine is required to consider a multitude of possibilities. While
the computational power of machines is greater than that of the human brain, machines
are unable to solve search problems perfectly because the size of the search space grows
exponentially with the depth of the search, making the problem intractable.

1, 2. http://webopedia.internet.com/TERM/a/artificial_intelligence.html (2 May 2001)
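To make the intractability concrete, the following sketch (written in LISP, the language introduced later in this paper, and not drawn from any particular game-playing program) counts the leaf positions an exhaustive search must examine under an assumed fixed branching factor; for chess the commonly cited average is about 35 legal moves per position.

;; Number of leaf positions an exhaustive game-tree search must examine,
;; assuming a fixed branching factor (an idealization).
(defun game-tree-size (branching-factor depth)
  (expt branching-factor depth))

;; (game-tree-size 35 4)  =>  1500625           ; two moves by each player
;; (game-tree-size 35 8)  =>  2251875390625     ; four moves by each player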
Expert systems are programmed systems that allow trained machines to make
decisions within a very limited and specific domain. Expert systems rely on a huge
database of information, guidelines, and rules that suggest the correct decision for the
situation at hand. Although they mainly rely on their working memory and knowledge
base, the systems must make some inferences. The vital importance of storing
information in the database in a manner such that the computer can “understand” it
creates a knowledge representation problem.
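As a toy illustration of this rule-plus-working-memory idea, consider the sketch below (again in LISP, introduced later in this paper); the medical rules and facts are invented for illustration and do not come from any real expert system. Each rule pairs a list of conditions with a conclusion, and the system draws every conclusion whose conditions all appear in working memory.

;; A minimal sketch of rule-based inference; the rules are invented.
(defparameter *rules*
  '(((fever cough) . possible-flu)
    ((fever rash)  . possible-measles)))

(defun conclude (facts)
  ;; Collect the conclusion of every rule whose conditions all appear in FACTS.
  (loop for (conditions . conclusion) in *rules*
        when (subsetp conditions facts)
          collect conclusion))

;; (conclude '(fever cough headache))  =>  (POSSIBLE-FLU)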
The field of neural networks, inspired by the human brain, attempts to accurately
define learning procedures by simulating the physical neural connections of the human
brain. A unique aspect of this field is that the networks change by themselves, adapting
to new inputs with respect to the learning procedures they have previously developed.
The learning procedures can vary and incorporate many different forms of learning,
which include learning by recording cases, by analyzing differences, or by building
identity trees (trees that represent hierarchical classification of data).
Natural Language Processing (NLP) and robotics programming are two fields that
simulate the way humans acquire information, an integral part of intelligence formation.
The two are separate sub-fields because of the drastic difference in the nature of their
inputs. Language, the input of NLP, is a more complex form of information to process
than the visual and tactile input of robotics. Robotics typically transforms its input into
motion, whereas NLP has no such associated state transformation.
A perfection of each of the sub-fields is not necessary to replicate human
intelligence because a fundamental characteristic of humans is to err. However, it is
necessary to form a system that puts these components together in an interlocking
manner, where the outputs of some of these fields should be inputs for others, to develop
a high-level system of understanding. To date, this technology does not exist over a
broad domain.
The fundamental nature of intelligence, and the possibility of its replication, is a
complex philosophical debate, better left for a more esoteric discussion. In order to learn
more about the field of AI from the Computer Science perspective, we will entertain the
notion that it is possible for machines to possess intelligence.
Before we explore this topic further, we must consider the social, moral and
ethical implications of AI. Beginning with the most immediate and realistic
consequences, this increase in machine capabilities could lead to a decrease in the amount
of human-to-human contact. This could change the development of humans as a species,
and is an important consideration in deciding the future of AI. Economic crisis
could also occur as a result of artificial intelligence. Machines have already begun to
replace workers who do repetitive, “mindless” jobs. If machines could
perform intelligent tasks, they could overtake work fields and might cause mass
unemployment and poverty.
The perfection of an artificially intelligent being poses more significant albeit
distant consequences. With the advent of this technology we would be decreasing the
value of the human species because we would no longer be distinguished or unique. If
machines could do all that we can do, and at a fraction of the cost it takes to nurture a
new human to its full capability, then many humans may not feel the compelling need to
reproduce our species.
Further, the social implications pose a set of ethical dilemmas. If we can create
these intelligent beings that have no conscience or moral obligations, it can be argued that
we will be creating a new breed of mentally deranged criminals, individuals who will
never feel responsibility or remorse for their actions. In the society in which we live it is
necessary for someone to be held responsible for each reprehensible action. Who would
be liable for any mistakes machines make, and who would deal with the consequences?
Should the machines be held accountable for their decisions or should the programmers
of the machines? If it is appropriate for the machine to be held responsible, how could
we punish them? The conventional methods we use to punish humans (e.g. jail,
community service) are not effective for machines. If the programmers should be held
responsible, it is our duty to warn them now so as not to breach the ex post facto clause of
the United States Constitution.
Many science-fiction movies and books give a glimpse into how much our world
could change with these technological advances. In 1984, the horror/science fiction
movie Terminator was released in theaters. The movie was set in the futuristic year
2029, and was based on the hypothetical situation that intelligent machines became more
sophisticated than their human creators were. The machines ‘ran the world’ and had the
power to obliterate the human race. There have been two sequels to this movie, each of
which is based on the same premise of artificial intelligence superseding human
intelligence. Although, these situations seem very far-fetched, one can not ignore their
implications.
Upon embarking on significant AI research, one must recognize the potential
consequences and strive to resolve them before artificial intelligence is fully realized.
We will end this introduction to Artificial Intelligence with one of the first
observations on the field. In 1637, René Descartes articulated his doubts about the
possibility of artificial intelligence in his paper “Discourse on the Method of Rightly
Conducting the Reason and Seeking Truth in the Sciences”:
“…but if there were machines bearing the image of our bodies, and capable of imitating our
actions as far as it is morally possible, there would still remain two most certain tests whereby to
know that they were not therefore really men. Of these the first is that they could never use words
or other signs arranged in such a manner as is competent to us in order to declare our thoughts to
others: for we may easily conceive a machine to be so constructed that it emits vocables, and even
that it emits some correspondent to the action upon it of external objects which cause a change in
its organs; … but not that it should arrange them variously so as appositely to reply to what is said
in its presence, as men of the lowest grade of intellect can do. The second test is, that although
such machines might execute many things with equal or perhaps greater perfection than any of us,
they would, without doubt, fail in certain others from which it could be discovered that they did
not act from knowledge, but solely from the disposition of their organs: for while reason is an
universal instrument that is alike available on every occasion, these organs, on the contrary, need a
particular arrangement for each particular action; whence it must be morally impossible that there
should exist in any machine a diversity of organs sufficient to enable it to act in all the occurrences
of life, in the way in which our reason enables us to act.”
His basic statement is that machines cannot possess intelligence as humans
possess intelligence for two reasons: first, because they do not have the capacity to use
language in the dynamic form that we humans can use it, and second because computers
lack the ability to reason.
The rest of this class will focus on dispelling the first half of Descartes’ statement.
We will use the terms ‘use of language’ and ‘Natural Language Processing’
synonymously. The goal of NLP, and the goal of this class, is to determine a system of
symbols, relations and conceptual information that can be used by a computer to
communicate with humans.
2. INTRODUCTION TO NATURAL LANGUAGE PROCESSING (NLP)
One of the most widely researched applications of Artificial Intelligence is
Natural Language Processing. NLP’s goal, as previously stated, is to determine a system
of symbols, relations and conceptual information that can be used by computer logic to
communicate with humans. This implementation requires the system to have the
capacity to translate, analyze and synthesize language. With the goal of NLP well
defined, one must clearly understand the problem of NLP. Natural language is any
human “spoken or written language governed by sets of rules and conventions
sufficiently complex and subtle enough for there to be frequent ambiguity in syntax and
meaning.”3 The processing of language entails the analysis of the relationship between
the mental representation of language and its manifestation into spoken or written form.4
Humans can process a spoken command into its appropriate action. We can also
translate different subsets of human language (e.g. French to English). If the results of
these processes are accurate, then the processor (the human) has understood the input.
The main tasks of artificial NLP are to replace the human processor with a machine
processor and to get a machine to understand the natural language input and then
transform it appropriately.
3, 4. C. Manning and H. Schutze, Foundations of Statistical Natural Language Processing
(Cambridge: The MIT Press, 2000).

Currently, humans have learned computer languages (e.g. C, Perl, and Java) and
can communicate with a machine via these languages. Machine languages (MLs) are a
set of instructions that a computer can execute. These instructions are unambiguous and
have their own syntax, semantics and morphology. The main advantage of machine
languages, and the major difference between MLs and NLs, is their unambiguous
nature, which is derived from their mathematical foundation. They are also easier to
learn because their grammar and syntax are constrained by the finite set of symbols and
signals. Developing a means of understanding (a compiler) for these languages is
remarkably easy compared to the degree of difficulty of developing a means of
understanding for natural languages.
An understanding of natural languages would be much more difficult to develop
because of the numerous ambiguities, and levels of meaning in natural language. The
ambiguity of language is essentially why NLP is so difficult. There are five main
categories into which language ambiguities fall: syntactic, lexical, semantic, referential
and pragmatic.5
The syntactic level of analysis is strictly concerned with the grammar of the
language and the structure of any given sentence. A basic rule of the English language is
that each sentence must have a noun phrase and a verb phrase. Each noun phrase may
consist of a determiner and a noun, and each verb phrase may consist of a verb,
preposition and noun phrase. There are various valid syntactic structures, and
rules such as this make up the grammar of a language and must be represented in a
concrete manner for the computer. Secondly, there must exist a parser, which is a system
that determines the grammatical structure of an input sentence by comparing it to the
existing rules. A parser must break the input down into words and determine, by
categorizing each word, whether the sentence is grammatically sound. Often, there may
be more than one grammatically sound parsing (see Figure 1).

Figure 1: A parsing example with more than one correct grammatical structure

5. Finlay, Janet, and Alan Dix. An Introduction to Artificial Intelligence. London: UCL Press, 1996.
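Because the original figure cannot be reproduced here, the sketch below makes the same point in list notation (the notation used in the LISP section of this paper), using a sentence discussed later in this section: the two grammatically sound parses differ only in where the prepositional phrase attaches.

;; Two competing parses of “John hit the man with the hammer”.
(defparameter *parse-1*              ; the hammer is the instrument of the hitting
  '(S (NP (N John))
      (VP (V hit)
          (NP (Det the) (N man))
          (PP (P with) (NP (Det the) (N hammer))))))

(defparameter *parse-2*              ; the hammer belongs to the man
  '(S (NP (N John))
      (VP (V hit)
          (NP (Det the) (N man)
              (PP (P with) (NP (Det the) (N hammer)))))))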
The lexical level of analysis concerns the meanings of the words that comprise
each sentence. Ambiguity increases when a word has more than one meaning
(homonyms). For example, “duck” could be either a type of bird or an action involving
bending down. Since these two meanings have different grammatical categories (noun
and verb) the issue can be resolved by syntactic analysis. The sentence’s structure will be
grammatically sound with one of these parts of speech in place. From this information, a
machine can determine the definition that appropriately conveys the sense of the word
within the sentence. However this process does not resolve all lexical ambiguities. Many
words have multiple meanings within the same part of speech, or a part of speech can
have sub-categories that also need to be analyzed. The verb “can” can be considered an
auxiliary verb or a primary verb. If it is to be considered a primary verb, it can convey
different meanings. The primary verb “can” can mean either “to fire” or “to put
something into a container”. In order to resolve these ambiguities we must resort to
semantic analysis.
The semantic level of analysis addresses the contextual meanings of the words as
they relate to word definitions. In the “can” example, if another verb follows the word,
then it is most likely an auxiliary verb. Otherwise, if the other words in the sentence are
related to jobs or work, then the former definition of the primary verb should be taken. If the
other words were related to preserves or jams, the latter definition would be more
suitable. The field of statistical analysis provides methodology to resolve this ambiguity.
When this type of ambiguity arises, we must rely on the meaning of the word to be
defined by the circumstances of its use. Statistical Natural Language Processing (SNLP)
looks at language as a non-categorical phenomenon and can use the current domain and
environment to determine the meanings of words.
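The following is a crude, hand-built sketch of that idea (in LISP); a genuine statistical system would derive its cue words and weights from corpus counts rather than listing them by hand, but the mechanism is similar: each sense of “can” is associated with context words, and the sense whose cues overlap most with the surrounding sentence is chosen.

;; A minimal sketch of context-based sense selection; the cue words are
;; invented, not corpus-derived.
(defparameter *senses-of-can*
  '((fire-from-job    boss job work employee)
    (put-in-container jar jam preserves fruit)))

(defun disambiguate (sentence-words)
  ;; Return the sense whose cue words overlap most with SENTENCE-WORDS.
  (let ((best nil) (best-score -1))
    (dolist (entry *senses-of-can* best)
      (let ((score (length (intersection (rest entry) sentence-words))))
        (when (> score best-score)
          (setf best (first entry) best-score score))))))

;; (disambiguate '(the boss will can him from his job))  =>  FIRE-FROM-JOB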
SNLP can also be used to gather another type of contextual information. It can
track the slow evolution of word meanings. For example, years ago the word “like” was
used in comparisons, as a conjunction or a verb. Currently, it is often inadvertently used
as a colloquialism. This is the type of contextual information that is necessary in order to
resolve pragmatic ambiguities. Pragmatic ambiguities are cultural phrases or idioms that
have not been developed according to any set rules. For example, in the English
language, when a person asks, “Do you know what time it is?” he usually is not
wondering if you are aware of the hour, but more likely wants you to tell him the time.
Referential ambiguities deal with the way clauses of a sentence are linked
together. For example, the sentence “John hit the man with the hammer” has referential
ambiguity because it does not specify if John used a hammer to hit a man, or if John hit
the man who had a hammer. Referential ambiguities in a sentence are very difficult to
reduce because there may be no other clues in the sentence. In order to determine which
clauses of the sentence refer to or describe each other (in the example, who the hammer
belongs to), the processor would have to increase its scope of analysis and consider
surrounding sentences to look for clarification.
There are many tasks that require an understanding of Natural Language.
Database queries, fact retrieval, robot command, machine translation and automatic text
summarization are just a small subset of the tasks. Although complete understanding has
not yet been achieved, there are imperfect versions of NLP technologies on the market.
We will look at some of the current NLP technologies and discuss their limitations later
in the course.
3. NLP PROGRAMS
3.1. ELIZA
Joseph Weizenbaum developed Eliza in 1966. It is a program for the study of
natural language communication between man and machine. The format of this program
is dialogue via teletype between a computer ‘psychiatrist’ and a human ‘patient’. Eliza
merely simulates an understanding of language; it is not an intelligent system because
Eliza’s output is not reasoned feedback based on the input. Eliza determines its output by
recognizing patterns in the input and transforming them according to a series of
programmed scripts.
There are three main algorithmic steps. First, Eliza takes in an English sentence
as input and searches it for a key word and key grammatical structure. Potential key
words are articulated in Eliza’s programmed script. If more than one key word is
identified in the input, the right-most key word before the first punctuation mark is the
only word that will be considered. The key grammatical structure is identified by
decomposing the sentence. For example, if the structure “I am ____” exists in the
sentence, the sentence will be classified as an assertion. Based on these identifications, a
transformation of the input is selected. These transformations typically entail word
conversions (such as converting “I” in the input to “you” in the output). The
transformation also includes a re-assembly rule, which is essentially a series of text
manipulations. For instance, if the user inputs “I am feeling sad today”, Eliza will
manipulate the assertion to output the “insightful” question “Why are you feeling sad
today?”
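The sketch below is not Weizenbaum’s program, only a minimal illustration of the keyword-and-reassembly idea just described: it handles the single pattern “I am …”, converts first person to second person, and falls back on a non-specific reply otherwise. Sentences are represented as lists of symbols, in the style used in the LISP section of this paper.

;; A minimal Eliza-style rule: recognize the assertion “I am ...”, swap the
;; pronoun, and re-assemble the remainder into a question.
(defun eliza-respond (words)
  (if (and (eq (first words) 'i) (eq (second words) 'am))
      (append '(why are you) (cddr words))
      '(tell me more)))

;; (eliza-respond '(i am feeling sad today))  =>  (WHY ARE YOU FEELING SAD TODAY)
;; (eliza-respond '(my dog ran away))         =>  (TELL ME MORE)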
The huge shortcoming of this program, which precludes it from ‘understanding’,
is that it never analyzes the input or output of the program on any level other than the
syntactic level. Another problem with this system is its tightly bounded nature. If no key
words are found in the input, Eliza has no alternative reasoning method to gain a sense of
the input. In this case, Eliza resorts to either outputting non-specific, broad comments
(e.g. “tell me more”) or reiterating a previous comment.
While Eliza falls short of any significant language processing, it exists as a proof
of how difficult the concept of NLP is. Creating more than an illusion of understanding
is too complex for many programming languages to deal with. Eliza can be implemented
in a machine language that makes no allowances for the needs of natural language
processing, such as Java. There are machine languages that provide better knowledge
representation structures, whose capabilities can aid in the development of artificial
understanding.
Despite Eliza’s shortcomings, it has been enormously popular as an amusing
mock therapist because of its perceived understanding of users’ problems, and because of
the anonymity its users maintain. One telling anecdote of the potential for machines to
replace humans for companionship follows. After Weizenbaum had developed Eliza, he
asked his secretary to test it for him. His secretary ‘conversed’ with Eliza for a few
minutes and then asked Weizenbaum to leave the room so she could be alone with her
computer.
3.2. SHRDLU
SHRDLU, developed in 1968 by Terry Winograd, is another program that studies
natural language communication between man and computer. The program involves a
robot that has the ability to manipulate toy blocks in its environment, based on human
input. It was not developed as a tool to explore the field of robotics; hence those
implementation details will be omitted from this section. Instead we will focus on the
language processing components of the system, and what level of understanding they
achieve.
SHRDLU is a useable language system for a very limited domain, which
demonstrates its understanding of language by carrying out given commands in its
domain. A user can type in any English/natural language command related to a
predefined environment of toy blocks. The program has a working knowledge of its
environment, the block placement, colors, etc. It accurately processes the command into
an appropriate action. For example, the command “Pick up the red cone and put it atop
the blue box” is a valid command and would be executed if the red cone and blue box
existed in the environment.
Its creator argues that SHRDLU is a fully intelligent system because it can
answer questions about its environment and actions. Although that argument is
debatable, it is unquestionable that SHRDLU achieves a higher level of understanding
than Eliza. SHRDLU acknowledges the syntax, semantics and referential problems that
could arise in its input, whereas Eliza only addresses syntactical issues. However, each
of these ambiguities is simpler to resolve in this context than in general language because
of the limited domain of the input. Lexical ambiguities are eliminated from the system
because of its vocabulary constraints. These three primary ambiguities are each
addressed in different, cooperative modules within the SHRDLU algorithm.
The input starts out in the syntactic module, where the input is parsed, and then it
is passed along to the semantic module for further analysis. The semantic module works
with the syntactic module to resolve any discrepancies, then attempts to “provide real
referents for the objects.”6 SHRDLU approaches this task as a proof, and defines the
existence of a real referent by proving that the negation of the task is impossible. Next,
the pragmatic module is called, which takes into account the entire environmental context
and attempts to place the request in context. Deduction and an attempt to reason are
incorporated in this module. If the command “Pick it up” is given to the program, this
module will use deduction to determine what “it” refers to. Deduction is also used if the
command “Pick up the block bigger than the green block” is given, in determining
relative sizes. After passing through all three modules, if ambiguity still exists, the system
will ask the user to explicitly clarify the uncertainty.
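As a toy sketch of the last kind of deduction (this is not Winograd’s code, and the block sizes below are invented), a world model can pair each block with a size, and “bigger than the green block” is then resolved by simple comparison:

;; An invented world model: block names paired with sizes.
(defparameter *blocks* '((green-block . 2) (red-block . 4) (blue-block . 1)))

(defun blocks-bigger-than (name)
  ;; Names of the blocks whose size exceeds that of the block called NAME.
  (let ((size (cdr (assoc name *blocks*))))
    (loop for (b . s) in *blocks*
          when (> s size) collect b)))

;; (blocks-bigger-than 'green-block)  =>  (RED-BLOCK)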
One contributing factor to SHRDLU’s sense of understanding is its knowledge
representation structure. SHRDLU is implemented in LISP, a programming language
developed to address the needs of Natural Language Processing. LISP’s central control
structure is recursion, which plays a direct role in SHRDLU’s parsing algorithm.
Additionally, LISP’s symbolic expressions enable the program to associate more
information than mathematical values to words. These and other LISP features that aid
in machine understanding of natural language will be discussed further in the next
section.
3.3. LISP
LISP is the primary programming language for AI and NLP applications. It was
invented in 1959 by John McCarthy. It is unlike other artificial languages because it
provides features useful to capture the abstract sense of meaning in concrete data
structures. The name LISP was derived from the general capability of the language to
perform LISt Processing tasks, but the language’s power extends far beyond list
processing.
LISP can create and manipulate complex objects and symbols that represent
words or grammar structures. There are two main data structures in LISP: atoms and
lists. Atoms are essentially identifiers that may include numbers. Lists are a collection
of atoms and/or other lists. To introduce the functionality of LISP in NLP, we will
consider simple sentence generation. This generation program will have syntactic
categorization of all words to be used in the generation, and will have a set grammatical
structure in place. The structure for this example is basic: A sentence is comprised of a
noun phrase and a verb phrase. A noun phrase consists of an article and a noun, and a
verb phrase consists of a verb and a noun phrase. Each of the parts of speech, noun,
article and verb, will have a certain subset of words (atoms) attached to them. In LISP,
this is represented as a series of functions as follows:7
;; Helper assumed by the generator below (not given in the original source):
;; choose one element at random, returned as a one-element list so that
;; APPEND can splice it into the sentence.
(defun one-of (choices) (list (nth (random (length choices)) choices)))

(defun sentence () (append (noun-phrase) (verb-phrase)))
(defun noun-phrase () (append (Article) (Noun)))
(defun verb-phrase () (append (Verb) (noun-phrase)))
(defun Article () (one-of '(the a)))
(defun Noun () (one-of '(man ball woman table)))
(defun Verb () (one-of '(hit took saw liked)))

7. Code adapted from Manning and Schutze.
The LISP program will generate a sentence after a call to the sentence function.
This algorithm can generate sentences such as “The man hit the table” or “The table liked
the woman”. Obviously, this generator takes no measures to ensure the sentence is
semantically correct. This feature is considerably harder to implement, though LISP
offers the capabilities. Lists can also be created such that they are the linear
representation of identity trees, which incorporate a sense of semantics into the language.
An identity tree would help ensure that the sentences generated by the previous example
were semantically correct by classifying the words with properties that are indicative of
their use.
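As a very small sketch of that idea (the animate class below is invented, and a full identity tree would carry a hierarchy of such properties rather than one flat list), nouns can be marked with semantic properties and generated sentences filtered against them:

;; Mark some nouns as animate and reject generated sentences whose subject
;; is not, so that “The table liked the woman” would be filtered out.
;; The subject is the second word under the simple article-noun grammar above.
(defparameter *animate-nouns* '(man woman))

(defun animate-subject-p (sentence)
  (member (second sentence) *animate-nouns*))

;; (animate-subject-p '(the man hit the table))      =>  (MAN WOMAN)   ; i.e. true
;; (animate-subject-p '(the table liked the woman))  =>  NIL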
Another feature that makes LISP attractive for language processing is its recursive
control structure. Recursion is particularly useful in parsing sentences. “To parse a
sentence means to recover the constituent structure of the sentence, to discover what
sequence of generation rules could have been applied to come up with the sentence.”8 A
grammar that has recursive constructs is better for parsing, and leads to more intricate
grammar structures. Expanding on the previous example, we could add in a recursive
Adjective function:
;; The empty list stands for “no adjective”, so the recursion terminates at random;
;; ADJ, like the other word-class functions, returns a one-element list.
(defun Adjective () (if (zerop (random 2)) '() (append (adj) (Adjective))))
(defun adj () (one-of '(happy blue pretty)))
An adjective function call should be added in the original noun-phrase definition
((defun noun-phrase() (append (Article) (Adjective) (Noun)))) for the grammar to be
expanded.
Two other features of LISP central to NLP are its dynamic memory allocation and
its extensibility. The dynamic memory allocation in LISP implies that there are no
artificial bounds or limits on the number or size of data structures. Functions can be
created or destroyed while a program is running. Since we have not yet defined a finite
set of grammar rules or structures that encompass natural language, this feature is
necessary.
The extensibility of LISP is one of the factors that have enabled it to persevere as
the top AI language since its conception in 1959. LISP is flexible enough to adapt to the
new language design theories that develop.
LISP is a very powerful language; this section was not an attempt to summarize it,
but rather to familiarize the student with the language. LISP is the predominant language
in AI technology and is impossible to present in a few pages. Instead, we will present
some of the current NLP applications that are in use today (and which are primarily
implemented in LISP), as a means of further discussing LISP’s capabilities.
4. APPLICATIONS OF NLP
One important application of NLP is Machine Translation (MT): “the automatic
translation of text…from one [natural] language to another.”9 The existing MT systems
are far from perfect; they usually output a buggy translation, which requires human
post-editing. These systems are useful only to those people who are familiar enough with the
output language to decipher the inaccurate translations. The inaccuracies are in part a
result of the imperfect NLP systems. Without the capacity to understand a text, it is
difficult to translate it. Many of the difficulties in realizing MT will be resolved when a
system to resolve pragmatic, lexical, semantic and syntactic ambiguities of natural
languages is developed.
One further difficulty in Machine Translation is text alignment. Text alignment is
not a part of the language translation process, but it is a process that ensures the correct
ordering of ideas within sentences and paragraphs in the output. The reason this is such a
difficult task is that text alignment is not a one-to-one correspondence. Different
languages may use entirely different phrases to convey the same message.
For example, the French phrases “Quant aux eaux minérales et aux limonades,
elles rencontrent toujours plus d’adeptes. En effet notre sondage fait ressortir des ventes
nettement supérieures à celles de 1987, pour les boissons à base de cola notamment”
would correspond to the English phrase “ With regard to the mineral waters and the
lemonades, they encounter still more users. Indeed our survey makes stand out the sales
clearly superior to those in 1987 for cola based drinks especially.” While this one-to-one
correspondence translation is grammatically accurate, the sense of the sentence would be
better conveyed if some phrases were rearranged to read “According to our survey, 1988
sales of mineral water and soft drinks were much higher than in 1987, reflecting the
growing popularity of these products. Cola drink manufacturers in particular achieved
above average growth rates.”10
There are currently three approaches to Machine Translation: direct, semantic
transfer and inter-lingual. Direct translation entails a word-for-word translation and
syntactic analysis. The word-for-word translation is based on the results of a bilingual
dictionary query, and syntactical analysis parses the input and regenerates the sentences
according to the output language’s syntax rules. For example the phrase “Les oiseaux
jaunes” could be accurately translated into “The yellow birds” using this technology.
This kind of translation is most common today in commercial systems, such as AltaVista.
However this approach to MT does not account for semantic ambiguities in translation.
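A minimal sketch of the word-for-word step follows, with an invented three-entry French-to-English dictionary; note that even for “Les oiseaux jaunes” a real direct system must also reorder the output to satisfy English adjective-noun order, which this sketch deliberately omits.

;; Word-for-word lookup in a tiny invented bilingual dictionary; unknown
;; words are passed through unchanged, and no reordering is attempted.
(defparameter *fr-en* '((les . the) (oiseaux . birds) (jaunes . yellow)))

(defun direct-translate (words)
  (mapcar (lambda (w) (or (cdr (assoc w *fr-en*)) w)) words))

;; (direct-translate '(les oiseaux jaunes))  =>  (THE BIRDS YELLOW)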
The semantic transfer approach is more advanced than the direct translation
method because it involves representing the meaning of sentences and contexts, not just
equivalent word substitutions. This approach consists of a set of templates to represent
the meaning of words, and a set of correspondence rules that form an association between
word meanings and possible syntax structures in the output language. Semantics, as well
as syntax and morphology, are considered in this approach. This is useful because
different languages use different words to convey the same meaning. In French, the
phrase “Il fait chaud” corresponds to “It is hot”, not “It makes hot” as the literal
translation would suggest. However, one limitation of this approach is that each system
must be tailored for a particular pair of languages.
10. Example taken from Manning and Schutze, 469.
The third and closest to ideal (thus inherently most difficult) approach to MT is
translation via interlingua. “An interlingua is a knowledge representation formalism that
is independent of the way particular languages express meaning.”11 This approach would
form the intermediary step for translation between all languages and enable fluent
communication across cultures. This technology, however, greatly depends on the
development of a complete NLP system, where all levels of analysis and ambiguities in
natural language are resolved in a cohesive nature. This approach to MT is mainly
confined to research labs because significant progress has not yet been made to develop
accurate translation software for commercial use.
Although MT over a large domain is as yet unrealized, MT systems over some
limited contexts have been almost perfected. This idea of closed context is essentially the
same concept used when developing SHRDLU; one can develop a more perfect system
by constraining the context of the input. This constraint resolves many ambiguities and
difficulties by eliminating them. However, these closed contexts do not necessarily have
to be about a certain subject matter (as SHRDLU was confined to the subject of toy
blocks) but can also be in the form of controlled language. For example, “at Xerox
technical authors are obliged to compose documents in what is called Multinational
Customized English, where not only the use of specific terms is laid down, but also are
the construction of sentences.”12 Hence their MT systems can accurately deal with these
texts and the documents can be automatically generated in different languages.
Again, before a perfection of MT over an unrestricted domain can be realized,
further research and developments must be made in the field of NLP.
11. Manning and Schutze, 465.
12. http://www.foreignword.com/Technology/art/Hutchins/hutchins99_3.htm
Another application that is enabled by NLP is text summarization, the generation
of a condensed but comprehensive version of an original human-composed text. This
task, like Machine Translation, is difficult because creating an accurate summary depends
heavily on first understanding the original material. Text summarization technologies
cannot be perfected until machines are able to accurately process language. However,
this does not preclude parallel research on the two topics; text summarization systems are
based on the existing NLP capabilities.
There are two predominant approaches to summarization: text extraction and text
abstraction. Text extraction removes pieces from the original text and concatenates them
to form the summary. The extracted pieces must be the topic, or most important,
sentences of the text. These sentences can be identified by several different methods.
Among the most popular methods are intuition of general paper format (positional
importance), identification of cue phrases (e.g. “in conclusion”), and identification of
proper nouns. Some extraction systems assume that words used most frequently
represent the most important concepts of the text. These methods are generally
successful in topic identification and are used in most commercial summarization
software (e.g. Microsoft Word : see appendix 1). However these systems operate on the
word level rather than the conceptual level, and so the summaries will not always be
fluent or properly fused.13
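A bare-bones sketch of the frequency heuristic follows; sentences are represented as lists of symbols, and a real extractor would at least remove stop words and combine this score with the positional and cue-phrase evidence described above.

;; Score each sentence by the summed document frequency of its words and
;; return the highest-scoring sentence as a one-sentence extract.
(defun word-counts (sentences)
  (let ((counts (make-hash-table)))
    (dolist (s sentences counts)
      (dolist (w s) (incf (gethash w counts 0))))))

(defun top-sentence (sentences)
  (let ((counts (word-counts sentences)) (best nil) (best-score -1))
    (dolist (s sentences best)
      (let ((score (reduce #'+ (mapcar (lambda (w) (gethash w counts 0)) s))))
        (when (> score best-score)
          (setf best s best-score score))))))

;; (top-sentence '((nlp is hard) (nlp needs understanding) (cats sleep)))
;;   =>  (NLP IS HARD)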
Text abstraction is a less contrived and much more complex system for
summarization. While extraction mainly entails topic identification, abstraction involves
that, and also interpretation and language generation. These two additional steps would
make the automated summary more coherent and cohesive. To date, this approach has
not been successful because the interpretation stage of this process (the most difficult part
of NLP) needs more development before it can aid in summarization.

13. http://www.isi.edu/natural-language/projects/SUMMARIST.html
APPENDIX 1
Microsoft Word’s machine-generated summary of Section 1 (pages 2-6)
1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE
The field of Artificial Intelligence (AI) is the branch of Computer Science that is
primarily concerned with the ability of machines to adapt and react to different situations
like humans. In order to achieve artificial intelligence, we must first understand the
nature of human intelligence. Human intelligence is a behavior that incorporates a sense
of purpose in actions and decisions. An understanding of intelligent behavior will be
realized when we are able to replicate intelligence using machines, or conversely when
we are able to prove why human intelligence can not be replicated.
In an attempt to gain insight into intelligence, researchers have identified three
processes that comprise intelligence: searching, knowledge representation and knowledge
acquisition.
Game playing is concerned with programming computers to play games such as
chess against human or other machine opponents. This sub-field relies mainly on the
speed and computational power of machines. Expert systems are programmed systems
that allow trained machines to make decisions within a very limited and specific domain.
Neural networks is a field inspired by the human brain that attempts to accurately define
learning procedures by simulating the physical neural connections of the human brain.
Natural Language Processing (NLP) and robotics programming are two fields that
simulate the way humans acquire information, which is integral to the formation of
intelligence. A perfection of each of these sub-fields is not necessary to replicate human
intelligence, because a fundamental characteristic of humans is to err.
In order to learn more about the field of AI from the Computer Science perspective, we
will entertain the notion that it is possible for machines to possess intelligence.
Machines have already begun to replace humans in the workforce who do repetitive,
“mindless” jobs. If machines could perform intelligent tasks, they could overtake work
fields and might cause mass unemployment and poverty.
Should the machines be held accountable for their decisions or should the
programmers of the machines? The conventional methods we use to punish humans (e.g.
jail, community service) are not effective for machines.
The machines ‘ran the world’ and had the power to obliterate the human race.
There have been two sequels to this movie, each of which is based on the same premise
of artificial intelligence superseding human intelligence.
WORKS CITED
Automatic text generation and summarization. 29 Apr. 2001
<http://www.nada.kth.se/~hercules/nlg.html>.
Dorr, Bonnie J. Bonnie Dorr - Large Scale Interlingual Machine Translation. 30 Apr.
2001 <http://cslu.cse.ogi.edu/nsf/isgw97/reports/dorr.html>.
ELIZA -- A Computer Program. 21 Apr. 2001
<http://www.nlp.de/exp_com/eliza/info.html>.
Finlay, Janet, and Alan Dix. An introduction to Artificial Intelligence. London: UCL
Press, 1996.
Generation 5. An Introduction to Natural Language Theory. 24 Apr. 2001
<http://www.generation5.org/nlp.shtml>.
Ginsberg, Matt. Essentials of Artificial Intelligence. San Mateo: Morgan Kauffman
Publishers, 1993.
Hovy, Edward, Chin-Yew Lin, and Daniel Marcu. Automated Text Summarization
(SUMMARIST). 29 Apr. 2001
<http://www.isi.edu/natural-language/projects/SUMMARiST.html>.
Hutchins, John . The development and use of machine translation systems and
computer-based translation tools. 28 Apr. 2001
<http://www.foreignword.com/Technology/art/Hutchins/hutchins99_1.htm>.
Jackson, Philip C. Introduction to Artificial Intelligence. 2nd ed. New York: Dover
Publications, 1985.
Jones, Karen Sparck. Workshop in Intelligent Scalable Text Summarization. 29 Apr.
2001 <http://www.cs.columbia.edu/~radev/ists97/ksj-slids.txt>.
Kamin, Samuel N. Programming Languages: An Interpreter-Based Approach. New York:
Addison-Wesley, 1990.
Kay, Martin. Machine Translation. 30 Mar. 2001 <http://www.lsadc.org/Kay.html>.
Manning, Christopher D, and Hinrich Schutze. Foundations of Statistical Natural
Language Processing. Cambridge: The MIT Press, 2000.
Norvig, Peter. Paradigms of Artificial Intelligence Programming: Case Studies in
Common Lisp. San Francisco: Morgan Kaufmann Publishers, 1992.
Patent, Dorothy Hinshaw. The Quest for Artificial Intelligence. New York: HBJ, 1986.
Shrdlu- Detailed comments. 21 Apr. 2001
<http://www.ffzg.hr/infoz/informat/prevodjenje/Slides_Shrdlu/sh2.html>.
Weizenbaum, Joseph. ELIZA. 21 Apr. 2001
<http://acf5.nyu.edu/~mm64/x52/9265/january1966.html>.
WINOGRAD’S SHRDLU. 22 Apr. 2001
<http://www.cs.cf.ac.uk/Dave/AI1/COPY/shrdlu.html>.
Winston, Patrick Henry. Artificial Intelligence. 3rd ed. New York: Addison-Wesley,
1993.