Beyond Mice and Menus1
BARBARA J. GROSZ
Higgins Professor of Natural Sciences
Division of Engineering and Applied Sciences
Dean of Science, Radcliffe Institute for Advanced Study, Harvard University
THE MOUSE—the computer mouse, that is—was invented in the
mid-1960s and was first demonstrated in a computer-based editing system in 1968. The videotape of that demonstration, which,
like much else now, may be found on the “Web” (NLS:demo 2004), is fascinating and enlightening. The demonstration introduced not only the
mouse, but also cut-and-paste editing, hypertext, and many other technologies currently employed by all those who use computer systems for work
or play.
Douglas Engelbart invented the mouse, and his research group at SRI
International (then the Stanford Research Institute) engineered an editing
system and document preparation environment called NLS (OnLine System) that is the predecessor of today’s computer editing systems (Engelbart
and English, 1968). The name of this research group, “The Augmentation
Research Center,” reflected Engelbart’s goal of constructing systems able
to “augment human intellect” (Engelbart 1962). The system features he
envisioned were far ahead of that time; some remain elusive even now.
Menus along with icons as a way of communicating were integrated
with windows into editing systems at Xerox PARC in the early 1970s
(Xerox:history 2004). Together, mice and menus enable, indeed were
designed to enable, “ordinary people” to get computers to do tasks for
them without requiring that they know how to program. For any who
struggled with assembly language programming or job control languages2
and the like, this new technology represented enormous progress.
1 Read 14 November 2003. The research described in this paper was supported in part by
the National Science Foundation, Grant No. IIS-0222892. The author thanks Shira Fischer,
Stuart Shieber, Theda Skocpol, and William Skocpol for their helpful comments on earlier
versions of this paper and participants at the American Philosophical Society meeting held
14–15 November 2003, whose insightful questions and comments also have informed the
final version of this paper.
2 Job control languages were used to direct the operating system, telling it how to process
a program.
PROCEEDINGS OF THE AMERICAN PHILOSOPHICAL SOCIETY, VOL. 149, NO. 4, DECEMBER 2005
Engelbart’s group and the researchers at Xerox PARC pushed the
available technology to its limit. Computers now are smaller, lighter,
and faster than the ones from the 1960s and 1970s that were used to
design mice and menus. Alas, their capabilities for communicating and
working with people, whether scientists or scholars, in business or in
play, lag behind the progress in computer hardware.3
Mine is not the first call to move beyond mice and menus in the
design of “human-computer interfaces,” as the component systems for
communicating with people are typically known. However, the direction in which I suggest we need to move, which the research I will
describe aims to make possible, is a more radical step than most have
envisioned. Many newspaper columns and technical articles argue the
advantages of voice commands over “pointing and clicking,” or tout the
virtues of touch pads, touch pens, or fancy remote control devices that
make it easier to point up close on handheld devices or at a distance on
a room-size video screen. For certain kinds of tasks and in some settings, communicating by voice is certainly preferable. The benefits of
voice are more widely accepted and computer speech capabilities are
far greater than they were in the 1970s, when I first worked on natural-language dialogue systems and such claims were debated. (The phrase
“natural language,” which refers to the languages people speak, is used
in the technical literature to emphasize their difference as evolved and
natural from the artificial, designed languages of mathematics, logic,
and computing.)
However, both speech capabilities and the various new devices provide only different “input/output” mechanisms. They enable new modes
of interaction, new ways of saying something to an interface, but not a
richer vocabulary. They allow people using systems to say things differently but not to say different things. One can identify a file by speaking
its name rather than typing (or pointing and clicking), but not by
describing its contents. One can point differently, but still only point
and click. It is at a much deeper level that computer systems must
move beyond mice and menus, not just at the surface of communication. To design better ways for computers to communicate with people
requires serious consideration of the deeper cognitive and pragmatic
aspects of natural language use (Grosz 2002) and not simply the surface syntax and semantics.
3 Although current hardware is more reliable, the software that is now standard is so
much more complex that overall systems reliability may not be better. Reliability is a
different topic from the one I address in this paper.
More Than Screen Deep
A simple example illustrates the problems that ensue when language
use is treated only superficially. Almost every person who has used a
computer has, on more than one occasion, attempted to insert the contents of one file into another document (or otherwise to access a new
file when in the midst of some task), only to have a “dialogue box” pop
up, saying something like, “file not found” or “the file did not load
properly,” or stating that the task for which this file was intended to be
used cannot be done. In the most extreme case, the so-called dialogue
continues with the ominous words, “Need to reboot. OK?” It is never
“okay” to reboot in the midst of a job. Furthermore, the dialogue box
invariably appears over the workspace, obscuring whatever it is one
was doing, and attempts to move to another workspace and look for
an alternative file almost always lead only to a series of beeps. (The
most recent versions of the Macintosh and Windows operating systems
seem, finally, to have addressed this latter problem.) The computer system insists that we agree to its destruction of our work context and
gives no other options. If a human assistant produced the equivalent
response, offering only to burn the file cabinet when unable to find a
file in it, we would be inclined to recommend that that person look for
a new job or even consider a different occupation. The perspective that
underlies the design of human-computer interface systems that behave
this way is of the interface as merely a surface, and the communication
between people and systems as a simple surface interaction, rather than
an inherently collaborative endeavor (Grosz and Sidner 1990).
Dialogue-box exchanges are often more an egocentric monologue
with false pretenses to request approval than an actual dialogue. The
very idea that such interactions might be called a dialogue reflects a shallow surface, or “screen deep,” view of human-computer communication.
The root of the problem is a design stance that accepts a view of people
talking to the computer system through an interface. Human-computer
interfaces designed from this stance seldom incorporate sufficient knowledge of the task for which the person is using a system, but such information is crucial to the system’s being able to work with the person.
Even the language of communication supported by mice and menus
is impoverished. Menus provide options on the surface, and mice allow
one to choose among those options. Thus, in essence, the mouse provides a capability for picking among a set of nouns (for instance,
the file to which to apply some action) and verbs (such as “edit” or
“insert”), but in a somewhat clumsy way, and the graphical user interface allows these nouns and verbs to be assembled in only the simplest
of noun-verb (or verb-noun) syntaxes. Perhaps in the earliest stages of
the evolution of natural languages, people were able only to point and
grunt, but even the oldest writings in this Society’s collections are filled
with far more complex linguistic constructions and more subtle uses of
languages.
The increased prevalence and reach of computer systems in the
workings of our daily lives require that we have more sophisticated ways
of communicating with them. We need to be able to do more than talk
to the computer system through a screen-deep interface. Computer systems should integrate their communications capabilities with their
underlying functionality, thus becoming capable of working with the
people using them. To have such functionality, to be able to collaborate
and not merely interact, the systems need access, either explicitly or
implicitly, to the plans and goals of their users.
Collaboration, Not Mere Interaction
Most computer systems today are interactive, making it important to
highlight the often-overlooked differences between interaction and collaboration. These differences are starkly illustrated by a contrast between
two types of driving activity: driving in Boston and driving in a convoy.
Driving in Boston is highly interactive, as anyone who has attempted
this activity knows. It is not, however, collaborative, even if drivers follow certain traffic laws and act simultaneously in seeming coordination. Although Boston drivers may have a goal in common—namely,
filling any empty space in front of their vehicles—they do not share
commitment to a common goal toward which they work together, or
any commitment to the success of other drivers. Although there may
be communication between drivers, it is not in service of a common
goal.
In contrast, driving in a convoy is a paradigmatic team activity
(Levesque et al. 1990). Convoy drivers have a goal in common; they
agree on where they are going, and each has a commitment to everyone’s
reaching that destination. They also come to agreement, perhaps incrementally, about the route to follow to their destination; they have a
“shared recipe” for performing their group activity (Grosz and Kraus
1996). Furthermore, they have some means of communicating with one
another so that they can all remain apprised (or, more formally, “maintain mutual belief”) of the status of their joint activity. Before the advent
of cell phones, checkpoints and hand signals served to ensure this updating of status. The commitments of drivers in a convoy to one another’s
successfully reaching their common goal lead to their being willing to
help one another, should anyone get in trouble. Thus, convoy driving has
three main characteristics of collaboration: commitment to a joint activity,
an agreed-upon recipe, and commitment to the success of others on the
team. Requirements for communication and inclinations toward helpful
behavior may be derived from these fundamental characteristics.
A collaborative computer system is a problem-solving partner
rather than a simple-minded servant. The design of collaborative systems enables the people using such systems to talk in terms of their
purposes or goals and the tasks they intend to accomplish, rather than
requiring them to instruct the system, in detail and precisely, what to
do and how to do it. The resulting communication is truly dialogue.
Such capabilities fundamentally change even simple exchanges like the
“file would not load properly” response in the example discussed earlier. For instance, if it were collaborative, the system would have some
notion of why the file was needed, and it might respond by saying,
“The file would not load properly; looking for another source.” Recognizing, however, that it might err in deducing a user’s intentions and
hence appropriate other sources, the system would provide a means for
the user to modify the search or stop it. Rather than having a dialogue
box appear over, and block the view of, the workspace, thus interrupting the task and likely the user’s thought processes, the dialogue would
be carried out on the side, enabling the user to search simultaneously
for alternatives.
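To make this behavior concrete, the following sketch (in Python, with hypothetical names; it is an illustration of the design stance, not a description of any deployed interface) shows a failure handler that knows why the file was wanted, offers candidate alternatives on the side, and leaves the user free to refine or abandon the search rather than blocking the workspace.

```python
# A minimal sketch, not any system's implementation: a "collaborative" response to a
# failed file load. All names (Task, handle_load_failure, find_alternatives) are
# hypothetical. The point is that the system uses its knowledge of *why* the file was
# wanted to propose alternatives on the side instead of blocking the workspace.

from dataclasses import dataclass, field

@dataclass
class Task:
    purpose: str                                   # why the user wanted the file
    keywords: list[str] = field(default_factory=list)

def find_alternatives(task: Task, catalog: dict[str, list[str]]) -> list[str]:
    """Rank other files by keyword overlap with the task (a stand-in for real search)."""
    scored = [(len(set(kw) & set(task.keywords)), name) for name, kw in catalog.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

def handle_load_failure(path: str, task: Task, catalog: dict[str, list[str]]) -> dict:
    """Report the failure, keep working on the user's behalf, and stay interruptible."""
    return {
        "message": f"{path} would not load properly; looking for another source.",
        "candidates": find_alternatives(task, catalog),   # offered, never imposed
        "user_options": ["pick a candidate", "refine the search", "stop the search"],
        "blocks_workspace": False,                        # the dialogue stays on the side
    }

if __name__ == "__main__":
    task = Task(purpose="insert figure 3", keywords=["figure3", "results"])
    catalog = {"fig3_final.pdf": ["figure3", "results"], "notes.txt": ["todo"]}
    print(handle_load_failure("figure3.pdf", task, catalog))
```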
Engelbart’s early work itself hints at this idea of collaborative systems. It aimed to provide a system with capabilities that enhanced and
complemented those of the person using it. Shieber (1996), in calling
for “collaborative interfaces” and arguing that “the collaborative stance”
matters both for the development of new types of human-computer
interfaces and as a means for analyzing existing interface systems,
emphasizes the importance of the appropriate division of labor, that is,
of apportioning correctly “the roles and responsibilities of the two participants in problem-solving, the computer and the user.” The proper
division of labor—an important constituent of successful collaborations that derives from the shared recipe and commitments to the success of other participants—was a concern for Engelbart. Norman, in
his book Things That Make Us Smart (Norman 1993, 12) says, “Technology should . . . complement human abilities, aid those activities for
which we are poorly suited, and enhance and help develop those for which
we are ideally suited.” The designs of mice and menus, decades before
Norman wrote, were driven by this goal of complementarity. Somehow, though, the ability of a system to collaborate truly with the person using it is missing from almost all human-computer interfaces that
deploy these technologies.
Before looking at what went wrong, I want to mention another little-recognized aspect of Engelbart’s early work: the engineering and
human-computer interface design properties that are important for
good systems design and yet not always characteristics of current systems. For instance, this work included “user studies” to determine
whether the mouse was better than alternatives like the joystick or
touch-pen and pad. Engelbart also recognized the clumsiness of needing to move back and forth between mouse and keyboard. He invented
a chord keypad as a companion device to the mouse; the keypad
enabled one-handed typing of letters so that one was not required to
shift back and forth from mouse to keyboard. Additional user studies
showed that, for short sequences of letters, the keypad was more efficient than the keyboard whereas for longer sequences, the keyboard
was better. (Use of the chord keypad required learning a binary encoding of the alphabet, which the designers of subsequent systems thought
would be too hard for people to learn, and so it was dropped from
later designs.)4
Despite hints at collaboration, including the ability for two computer monitors to share a single view of a document that was being
edited, Engelbart’s systems were individual-oriented. The computer
was seen predominantly as a tool for an individual; the person was the
master, the computer the servant. This bias is not surprising, given that
this work preceded even the infancy of the Internet. It is illustrated by
the description in the early paper (Engelbart 1962, 12) of the “augmented architect.” This hypothetical system is to be used by an architect designing buildings, enabling the architect not only to draw, but
also to do such auxiliary activities as viewing the design from different
angles and testing properties of materials. In the complex scenario presented in the paper, the computer “clerk” runs simulations, looks up
all sorts of information for the architect, and does various helpful computations. There is a hint of joint work in the description of what is
probably one of the earliest mentions of the “sneaker net,” that predecessor to the Internet in which people ran down the hall and exchanged
tapes to share software: “All of this information . . . can be stored on a
tape to represent the design manual for the building. Loading this tape
into his own clerk, another architect, a builder, or the client can
maneuver with this design manual to pursue whatever details or
insights are of interest to him—and he can append special notes that
are integrated in the design manual for his own or someone else’s later
benefit.” Although the idea of teamwork is suggested, the overall perspective
of this work is one in which each person does his or her own job separately.
The “clerks” work for individuals; they do not coordinate with each other or
assist actively in the collaboration of the people for whom they work. Even a
glance into any actual construction process will show there is a much more
tightly intertwined collaboration going on.
4 The persistence of the QWERTY keyboard despite its suboptimal arrangement given
current keyboard capabilities is a related, but different, problem. Inertia and learning curves,
and the resulting high costs of changing ingrained behavior in a large population, have
inhibited the adoption of such better alternatives as DVORAK.
Widespread use of the Internet has fundamentally changed the computing situation. One-to-one human-computer uses of computer systems, though still prevalent, are dominated by settings in which there
are many people and many computer systems. Systems enable people
to communicate and work together, and the systems themselves must
communicate and work together. Although people are often unaware
of the complex network of interacting systems that support work on
“the computer” (thought of in the singular), there is a “web” of interactions. Today, most computer use is by groups comprising both people
and computer agents engaged in tasks both formal and informal and of
both long and short duration.
It is hardly debatable that it would be beneficial in most settings to
add collaborative capabilities to the ways in which computers interact
with people and with other systems. Few current systems have such collaborative capabilities, because fundamental work on modeling collaboration is needed to provide the theoretical foundations for collaborativeinterface design and for the construction of computer systems with
such collaborative capabilities more generally. What makes this computer science rather than psychology or sociology is that we are designing artificial systems to have certain properties. Although we may
study naturally occurring collaborative systems (teams of people), our
analysis is not restricted to those systems. The wonderful part about
what Herbert Simon, one of the founders of the field of Artificial Intelligence and a Nobel-Prize-winning economist, called “the sciences of
the artificial” (Simon 1981) is that one gets to explore all parts of the
design space. The frustrating part is that all sorts of interactions occur
that one would never imagine, and attention must be paid to all sorts
of details. (Throughout the history of the field of Artificial Intelligence,
there have been critics who feared “smart systems”; most of us who
have tried to build such systems are brought only to marvel all the
more at “natural intelligence.”)
Modeling Collaboration
The SharedPlans model of collaborative action (Grosz and Sidner 1990;
Grosz and Kraus 1996, 1999) aims to provide the theoretical foundations
needed for building collaborative systems (Grosz 1996). It stipulates four
key characteristics that participants in a group activity must embody
for them to be true collaborative partners and thus for their joint activity
to be collaborative. The formal definition is given in a logic and may
be found elsewhere. Translated into English, the definition states that
for a group activity to be collaborative, the participants must have (1)
intentions that the group perform the group activity; (2) mutual belief
of a recipe; (3) individual or group plans for the constituent subactions
of the recipe; and (4) intentions that their collaborators (fellow group
members) succeed in doing the constituent subactions. The intentions
that this definition stipulates constitute different kinds of commitments
required of the participants. Mutual belief of the recipe essentially
amounts to all the participants’ holding the same beliefs about the way
in which the activity will be carried out; thus, they must agree on how
to do the activity. Clause (3) defines the overall plan in terms of constituent plans of individuals or groups; thus, the definition is recursive, with
the recursion ending at the level of basic, individual actions. The definition is also one of interdependence; team members must have commitments to the group activity and to each other.
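The following sketch (in Python, with hypothetical class and field names) renders the four clauses as a simple data structure; it is an illustration of the definition’s shape, not the formal logic of the SharedPlans papers. The recursion of clause (3) appears as subplans that are either individual plans or further shared plans.

```python
# A minimal sketch of the four SharedPlans clauses as a data structure; an illustration,
# not the formal logic of Grosz and Kraus. All class and field names are hypothetical.
# Note the recursion in clause (3): a group plan is built from individual plans or
# further group plans, bottoming out in basic, individual actions.

from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class IndividualPlan:          # base case of the recursion: one agent, one basic action
    agent: str
    action: str

@dataclass
class SharedPlan:
    group: list[str]                                  # the participants
    activity: str                                     # the group activity
    recipe: list[str]                                 # (2) mutually believed way of doing it
    subplans: list[IndividualPlan | SharedPlan] = field(default_factory=list)  # (3)

    def clause_1(self) -> dict[str, str]:
        """(1) Each participant intends that the group perform the activity."""
        return {agent: f"intends-that({self.group} do {self.activity})" for agent in self.group}

    def clause_4(self) -> list[str]:
        """(4) Each participant intends that the others succeed in their subactions."""
        return [f"{a} intends-that {b} succeeds"
                for a in self.group for b in self.group if a != b]

if __name__ == "__main__":
    convoy = SharedPlan(
        group=["car1", "car2"],
        activity="drive to Philadelphia",
        recipe=["take I-95", "regroup at each rest stop"],
        subplans=[IndividualPlan("car1", "lead"), IndividualPlan("car2", "follow")],
    )
    print(convoy.clause_1())
    print(convoy.clause_4())
```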
To be collaborative, both human-computer interface systems and
the “computer agents” constituting “multi-agent systems” need to meet
the specifications stipulated by this definition. In particular, they need
to commit to the group activity and to their role in it; they need to
divide the labor according to their capabilities so they can carry out the
individual plans that constitute the group activity; they need to commit
to the success of others. All this is explicitly in the definition. Implicit,
implied by the specification, are needs to communicate, to interpret
what others say in the appropriate collaborative context (Grosz 1999),
to be willing to help others in the group in doing their parts, and to reconcile conflicts between commitments to this group activity and their
other individual and group activities.
Thus, for computer systems to be true collaborative partners, they
must be able to form and act on a particular set of intentions and beliefs.
However, the presence of beliefs and intentions in a formal specification for a computer system always raises the specter of “intractability,”
the question whether it is feasible to perform the computations required
effectively. (“Intractable” and “effectively” are formally defined in theoretical computer science; for the purposes of this presentation, one can
think in terms of whether or not the computation can be performed in
a reasonable amount of time, in particular that it would not take years
or even the lifetime of the universe.) This concern may be addressed by
considering the two ways this definition is intended to be used, neither
of which is a direct implementation requiring complete capabilities for
reasoning about intentions and beliefs. The theory may be manifest in
practice either as a source of insight informing system design or as a
specification of systems’ design that constrains explicit reasoning processes. I will look at each of these in turn.
Manifesting Collaboration: Theory as Insight
The writer’s aid system “WAID” (Babaian, Grosz, and Shieber 2002)
integrates the system’s communications capabilities with underlying
functionality in a way that exemplifies the use of the SharedPlans theory as “insight” into the design of a particular collaborative humancomputer interface. This system addresses a problem encountered by
all authors of documents requiring references to prior work. At some
point in writing, the author needs to cite prior work, but cannot recall
the exact citation or easily find it in a bibliography file. Authors in such
situations have two possible courses of action. The first alternative is to
stop writing and search for the appropriate citation. Although various
online resources and search capabilities may make this search easier
than it once was, they do not eliminate the effects of an interruption in
work. If it takes more than a few moments to find the citation, which it
typically does even when authors are not lured into looking at other
things they find while navigating the Web, the context of writing is
lost. It is hard to recall what one was about to write next. The second
alternative is to leave notes about the desired reference and to return
later to find the appropriate citation. This approach, which allows an
author to keep writing and thus not lose context, has the unfortunate
consequence that a great deal of searching needs to be done by someone before the document is finished.
WAID provides a better alternative, one that exploits the search
capabilities of computer systems, which are superior to those of most
people. It divides the labor of producing citations so that the computer
system does the searching. When using WAID, an author follows the
second approach, inserting “hints” in the form of authors’ names and
keywords into a citation flag in the text. Rather than being notes for
the author (or another person) to follow up on later, however, these hints
are used by WAID to form search terms. The system, using artificial-intelligence planning techniques that enable it to cope with partial and
changing information as well as Web-site server crashes, searches through
various bibliography files, Web pages, and online document sources,
while the author continues to write. It compiles lists of possible citations,
and, if it is asked and they are available, retrieves the papers themselves. When the author has finished or is otherwise ready to stop writing or wants to see the possible citations, the computer agent provides
the list of citations and papers that match the hints. After the author
chooses the correct one, the system formats the citation appropriately.
If WAID’s search fails, it provides information to the author about the
problem it encountered, allowing the author to modify the search. In
short, when using WAID, the author writes and the system searches
and deals with other computer systems.
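The sketch below illustrates this division of labor in simplified form; it is not WAID’s actual code, and the class name, hint syntax, and toy “sources” are all hypothetical. Hints embedded in the draft become search terms, candidates accumulate while the author keeps writing, and failed searches are reported with enough information for the author to refine them.

```python
# A minimal sketch (not WAID itself) of the division of labor described above: the author
# writes and drops hint flags; an agent turns the hints into queries, collects candidate
# citations, and reports back only when asked. The hint syntax and sources are invented.

import re

CITE_FLAG = re.compile(r"\\cite\{hint:([^}]*)\}")   # e.g. \cite{hint:grosz kraus sharedplans}

class CitationAgent:
    def __init__(self, sources: dict[str, set[str]]):
        self.sources = sources           # stand-in for bibliography files and web search
        self.candidates: dict[str, list[str]] = {}
        self.problems: dict[str, str] = {}

    def scan(self, text: str) -> None:
        """Harvest hints from the draft and search for each one without blocking the author."""
        for hint in CITE_FLAG.findall(text):
            terms = set(hint.split())
            matches = [ref for ref, kws in self.sources.items() if terms & kws]
            if matches:
                self.candidates[hint] = matches
            else:
                # commitment to the author's success: report what went wrong, invite refinement
                self.problems[hint] = "no source matched; add a keyword or an author name"

    def report(self) -> tuple[dict, dict]:
        """Only when the author is ready: candidates found, plus any failed searches."""
        return self.candidates, self.problems

if __name__ == "__main__":
    sources = {"Grosz & Kraus 1996": {"grosz", "kraus", "sharedplans"},
               "Engelbart 1962": {"engelbart", "augment"}}
    agent = CitationAgent(sources)
    agent.scan(r"Collaboration is planned jointly \cite{hint:grosz kraus sharedplans}.")
    print(agent.report())
```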
WAID incorporates sophisticated planning capabilities and can act
autonomously, but it does so in service of the specific goals of the person using it. Collaboration was designed into WAID from the start,
influencing the way the planning and reasoning capabilities of the system were designed. The theory as source of insight lies behind the
appropriate division of labor: the system searches and the person
writes. It also is reflected in WAID’s exploring all possible ways of finding a citation (manifesting commitment to the group activity), providing feedback to the author and partial results (thus maintaining mutual
beliefs about the status of the group activity), and not interrupting the
author, but having information available when the author is ready
(thus embodying a commitment to the author’s success on the writing
portion of the task). Finally, it is important to emphasize that WAID
uses the same surface devices (keyboard, mouse, menus) as other interfaces; it is “beyond mice and menus” in its incorporation of collaborative properties, in particular its use of information about the task of
including citations in papers to help the author.
Manifesting Collaboration:
Theory as Systems Specification
The second way in which a formal specification such as SharedPlans
may be used to influence systems design is for it to be taken as a set of
constraints on the architecture of the system and the algorithms used
by it (Grosz, Hunsberger, and Kraus 1999). The use of the SharedPlans
formalization as a specification for design has been shown to make a
significant difference in computer-agent team performance (Tambe
1997; Rich and Sidner 1998). In addition, the constraints it imposes on
computer-agent design and attempts to define methods, mechanisms,
and algorithms to meet each of these constraints delineate a rich research
area with many challenges. It is this aspect of the formalization that I
touch on briefly in the remainder of this paper.
The formalization stipulates that collaborative computer agents
incorporate three kinds of decision-making processes, each of which is
constrained by the need for it to produce agent beliefs and intentions
appropriate in the context of a group activity. As a result, the challenge
of designing decision-making mechanisms that meet this portion of the
specification has yielded the following three interesting problems:
1. Initial commitment decision problem: For a group to form,
individual agents must commit to the group activity, including
the planning activity required to carry it out. Decision-making
mechanisms must produce the requisite commitments to the
group activity.
2. Parameter choice problem: For a group to have a plan, its
members must agree on the parameters of their activity, including
the recipe to be used and the assignment of agents to constituent
tasks. Decision-making methods must accommodate different
agent capabilities and result in the coordinated updating of
beliefs and intentions.
3. Intention reconciliation problem: Agents must reconcile conflicts between intentions deriving from the group activity and
intentions that arise from their other plans and goals. Decision-making methods must be able to weigh trade-offs between individual good and group good (see the sketch following this list).
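A minimal sketch of these three decision points follows, with hypothetical names; it is not part of the SharedPlans formalization itself. The formalization constrains what each mechanism must produce (commitments, agreed parameters, reconciled intentions), not how it computes them.

```python
# A minimal, hypothetical sketch: the three decision-making processes as abstract hooks
# on a collaborative agent. Method names and signatures are illustrative assumptions,
# not part of the SharedPlans formalization.

from abc import ABC, abstractmethod

class CollaborativeAgent(ABC):

    @abstractmethod
    def decide_initial_commitment(self, group_activity: str, known_members: list[str]) -> bool:
        """1. Initial commitment: commit (or not) to the group activity and its planning."""

    @abstractmethod
    def choose_parameters(self, options: dict[str, list[str]]) -> dict[str, str]:
        """2. Parameter choice: agree on recipe, task assignments, resources, and times."""

    @abstractmethod
    def reconcile_intentions(self, group_intention: str, own_intentions: list[str]) -> str:
        """3. Intention reconciliation: weigh group good against individual good."""
```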
The initial-commitment decision problem (Hunsberger and Grosz 2000)
reflects the need to determine which agents will be part of the team for
a new activity. Solutions to this problem must meet the requirement that
individual team members form appropriate commitments to the group
activity, not only to performing constituent actions, but also to group
decision-making. This problem arises because an agent in isolation does
not have sufficient information to decide whether or not to join a group
undertaking to perform a group task. Individuals cannot gauge the costs
they will incur in joining a team without knowing who else will be on
the team. This dilemma is reflected in the question often posed by those
being asked to serve on committees: “Who else is on the committee?”
Furthermore, to determine the appropriate team membership requires
knowledge of the capabilities and availability of potential team members, that is, global knowledge of the actions agents are able to perform in general and the times they are available to perform them in
service of this new activity. Individual agents know their own background capabilities, commitments, costs, and benefits; they have local
knowledge. Bringing this local knowledge to bear to determine a globally adequate solution is not trivial. Agents may be unwilling to share
private information. Even if they are willing to share such information,
providing complete information to one another incurs large communication costs.
We have addressed this problem by modifying combinatorial auction
techniques developed in economics and artificial intelligence (Hunsberger
and Grosz 2000). Instead of bidding on articles to buy, agents bid on
actions that need to be done. The auction mechanism differs in three
ways from standard combinatorial auctions. First, it aims to minimize
costs rather than maximize profits. Second, it aims to satisfice rather
than to optimize—“satisfice” meaning it looks for a low enough price
rather than the lowest price—because the auction is used to assemble
the team, not to make a final determination of task allocation. This
feature reflects the importance of taking into account the resource-boundedness of agents. Third, agents may include time constraints in
their bids. The mechanism enables agents to charge more to do a task
at a less convenient time, just as a person might be less willing to come
to a meeting at a time that made getting home for dinner difficult. To
enable agents to plan their individual schedules as independently as
possible and still coordinate requires solving a range of temporal reasoning problems (Hunsberger 2003).
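The sketch below is loosely modeled on this idea but is not the mechanism of Hunsberger and Grosz (2000); the names are hypothetical, time constraints are folded into the bid costs for brevity, and the brute-force search is purely illustrative. It shows bids on bundles of actions evaluated by cost, with the first combination that covers all tasks within a budget accepted (satisficing) rather than the cheapest combination sought (optimizing).

```python
# A minimal, illustrative sketch of a cost-minimizing, satisficing allocation of tasks to
# bidding agents. Not the Hunsberger-Grosz auction; names, data, and the exhaustive
# search over bid combinations are assumptions made for brevity.

from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Bid:
    agent: str
    actions: frozenset[str]     # bundle of actions the agent offers to do
    cost: float                 # what the agent charges (higher for inconvenient times)

def satisficing_allocation(tasks: set[str], bids: list[Bid], budget: float):
    """Return the first set of bids that covers all tasks within budget, or None."""
    for size in range(1, len(bids) + 1):
        for combo in combinations(bids, size):
            covered = frozenset().union(*(b.actions for b in combo))
            total = sum(b.cost for b in combo)
            if tasks <= covered and total <= budget:
                return list(combo)          # "good enough": assemble the team and stop
    return None

if __name__ == "__main__":
    tasks = {"scout route", "drive truck", "radio checkpoints"}
    bids = [Bid("a1", frozenset({"scout route"}), 3.0),
            Bid("a2", frozenset({"drive truck", "radio checkpoints"}), 5.0),
            Bid("a3", frozenset({"drive truck"}), 1.0)]
    print(satisficing_allocation(tasks, bids, budget=10.0))
```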
The parameter-choice problem reflects the need for participants in a
group activity to agree on such parameters of their activity as the way
they will perform their task, the agents who will do constituent actions,
the particular resources to be used, and the times at which different
actions will be done. Thus, this problem relates to “task allocation” and
“coalition formation” problems that have been much studied in operations research and economics. However, the collaborative context
changes the evaluation criteria for solutions. Again the varying capabilities, knowledge, and preferences of the different agents must be taken
into account appropriately. Dynamic mediation (Ortiz et al. 2003) is
one recent approach to this problem, but much work remains to be
done to design a wider range of processes for handling such problems
and to determine trade-offs in performance among them.
The intention-reconciliation problem arises because the choice of
which action to perform (or, more specifically, to commit to performing) when there is a conflict between possibilities also changes in the
group context. In choosing among conflicting intentions that arise
from individual plans, which are commitments solely to oneself, agents
weigh factors such as the relative importance of different activities (to
themselves) and the possibilities of rescheduling. The collaborative or
team context requires that agents consider in addition such factors as
the damages incurred by the team if they renege on a commitment and
the effect of such defaulting on their own reputations. This contrast is
illustrated by considering the choice between maintaining an intention
to upgrade software and reneging on that commitment to accept a newly
proffered offer to attend the theater at the time the upgrade was scheduled. In the individual-plans case, the software is on a personal computer and doing the upgrade at another time affects only the person
deliberating. In contrast, in the collaborative case, the software update is
part of a computer-systems administration group network-wide upgrade;
more people will be inconvenienced and even a job may be at stake.
We have begun to analyze the ramifications of different intention-reconciliation strategies in such collaborative settings using simulations (Grosz et al. 2002) and games (Grosz et al. 2004).
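A hypothetical scoring sketch of this trade-off (the weights and names are invented; it is not a model from the papers just cited) contrasts the two cases: in the individual setting only personal value and rescheduling cost matter, whereas in the team setting the damage to the group and to one’s own reputation enter the calculation as well.

```python
# A minimal, hypothetical sketch of the intention-reconciliation trade-off discussed above:
# whether to keep a commitment or renege in favor of a conflicting individual intention.
# All fields and weights are invented for illustration.

from dataclasses import dataclass

@dataclass
class Conflict:
    value_of_alternative: float    # how much the agent wants the new option (the theater)
    reschedule_cost: float         # cost of doing the committed task (the upgrade) later
    team_damage: float = 0.0       # harm to the group if the agent defaults (0 if none)
    reputation_cost: float = 0.0   # future cost of being seen as unreliable

def should_renege(c: Conflict) -> bool:
    """Renege only if the alternative outweighs every cost the default would impose."""
    return c.value_of_alternative > c.reschedule_cost + c.team_damage + c.reputation_cost

if __name__ == "__main__":
    # Personal laptop upgrade: only the agent is affected, so the theater wins.
    print(should_renege(Conflict(value_of_alternative=5, reschedule_cost=1)))       # True
    # Network-wide upgrade the team is counting on: group and reputation costs dominate.
    print(should_renege(Conflict(5, 1, team_damage=6, reputation_cost=3)))          # False
```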
Thus, the constraints derived from the formalization result in the
need for certain kinds of decision-making capabilities, on the part of
individual computer agents, that accommodate appropriately the collaborative group activity situation. As these short descriptions make
evident, prior work in economics has proved useful to research on
these problems. However, as in other computational work that builds
on economic theory (Parkes 2004 inter alia), the need to take seriously
resource-boundedness and computational constraints has meant that
new solutions, ones that are computationally tractable, need to be found.
Furthermore, for computer agents to act appropriately in groups that
include people, their design needs to take into account such properties
of actual human behavior as the inability to express preferences completely or to assess probabilities (Kahneman and Tversky 1974) and
the importance of fairness in choosing among alternatives (Gal et al.
2004). Collaborations between computer scientists and economists are
thus likely to yield results useful to both fields and important for a
wide range of systems.
Collaborative Systems in the Future
In the 1970s, computer science was in many places brought together with
psychology, linguistics, and philosophy of mind and of language in the
cognitive sciences, an endeavor aimed at understanding the individual
mind and the ways it worked. In the coming decades, the “social organization,” “social policy,” and “social laws” areas of the social sciences,
and perhaps also the insights of the humanities, will play an increasingly
important role for a broader range of computer science questions.
Teamwork, incorporating “collaborative capabilities,” adds an
important dimension to systems behavior, supports the design of collaborative systems for human-computer communication, and enables true
dialogue capabilities. These capabilities are as important for computer
scientists as for others who use computer systems, because even computer scientists share with other people certain cognitive capacities and
limitations. Supporting the design, evolution, and maintenance of large
complex software systems requires programming languages and environments that better support what programmers are doing, taking into
account and recording their intentions as well as their actions. It is
important for us to update John Donne (1624) and to design systems to
adhere to the admonition “No computer is an island, entire of itself.”
References
Babaian, Tamara, Barbara J. Grosz, and Stuart M. Shieber. 2002. A Writer’s Collaborative Aid. Proceedings of the Intelligent User Interfaces Conference (IUI-2002),
San Francisco, 13–16 January. New York: ACM Press, 7–14.
Donne, John. 1624. Devotions Upon Emergent Occasions. Meditation XVII: “Nunc
lento sonitu dicunt, Morieris. Now, this Bell tolling softly for another, saies to me,
Thou must die.”
Engelbart, Douglas C. 1962. Augmenting Human Intellect: A Conceptual Framework.
Summary Report, Stanford Research Institute, Contract AF 49(638)-1024. Available
at http://www.bootstrap.org/augdocs/friedewald030402/augmentinghumanintellect/
ahi62index.html.
———. 1963. Conceptual Framework for the Augmentation of Man’s Intellect. In Vistas in Information Handling, ed. Howerton and Weeks. Washington, D. C.: Spartan Books, 1–29. Republished in Computer Supported Cooperative Work: A Book
of Readings, ed. Irene Greif. San Mateo, Calif.: Morgan Kaufmann Publishers,
Inc., 1988, 35–65.
Engelbart, Douglas C., and William K. English. 1968. A Research Center for Augmenting Human Intellect. AFIPS Conference Proceedings of the 1968 Fall Joint Computer Conference, San Francisco, December 1968, Vol. 33: 395–410. Republished
in Computer Supported Cooperative Work: A Book of Readings, ed. Irene Greif.
San Mateo, Calif.: Morgan Kaufmann Publishers, Inc., 1988, 81–105. Available at
http://www.bootstrap.org/augdocs/friedewald030402/researchcenter1968/Research
Center1968.html.
Gal, Ya’acov, Avrom Pfeffer, Francesca Marzo, and Barbara J. Grosz. 2004. Learning
social preferences in games. Proceedings of the National Conference on Artificial
Intelligence (AAAI-2004), Menlo Park, Calif.: AAAI Press.
Grosz, Barbara. 1996. Collaborative Systems: AAAI Presidential Address. AI Magazine 17.2: 67–85.
———. 1999. The Contexts of Collaboration. In Cognition, Agency and Rationality,
ed. K. Korta, E. Sosa, and X. Arrazola. Dordrecht, Netherlands: Kluwer Press,
175–87.
———. 2002. Discourse Structure, Intentions, and Intonation. In The Languages of
the Brain, ed. A. Galaburda, S. Kosslyn, and Y. Christen. Cambridge, Mass.: Harvard University Press, 127–42.
Grosz, Barbara J., and C. Sidner. 1990. Plans for Discourse. Chap. 20 in Intentions in
Communication, ed. Cohen, Morgan, and Pollack. Cambridge, Mass.: MIT
Press, 417–44.
Grosz, Barbara, and Sarit Kraus. 1996. Collaborative Plans for Complex Group
Action. Artificial Intelligence 86.2: 269–357.
Grosz, Barbara J., Luke Hunsberger, and Sarit Kraus. 1999. Planning and Acting
Together. AI Magazine, Winter: 23–33.
Grosz, Barbara, and Sarit Kraus. 1999. The Evolution of SharedPlans. In Foundations
of Rational Agencies, ed. A. Rao and M. Wooldridge. Dordrecht, Netherlands:
Kluwer Academic Press, 227–62.
Grosz, Barbara J., S. Kraus, D. G. Sullivan, and S. Das. 2002. The influence of social
norms and social consciousness on intention reconciliation. ICMAS-2000 Special
Issue of Artificial Intelligence (vol. 142): 147–77.
Grosz, Barbara J., Sarit Kraus, et al. 2004. The Influence of Social Dependencies on
Decision-Making: Initial Investigations with a New Game. Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS-2004). New York: ACM Press.
Hunsberger, Luke. 2003. Distributing Control of a Temporal Network among Multiple Agents. Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2003). New York: ACM Press,
899–906.
Hunsberger, Luke, and Barbara J. Grosz. 2000. A Combinatorial Auction for Collaborative Planning. Proceedings of the Fourth International Conference on Multi-Agent
Systems (ICMAS-2000). IEEE Computer Society Press, 151–58.
Kahneman, Daniel, and Amos Tversky. 1974. Judgment Under Uncertainty: Heuristics
and Biases. Science 185: 1124–31. Reprinted in Judgment under Uncertainty:
Heuristics and Biases, ed. D. Kahneman, P. Slovic, and A. Tversky. Cambridge:
Cambridge University Press, 3–20.
Levesque, H. J., P. R. Cohen, and J.H.T. Nunes. 1990. On Acting Together. Proceedings of the Annual Meeting of the American Association for Artificial Intelligence
(AAAI-90), 94–99.
NLS:demo. 2004. Available at http://sloan.stanford.edu/MouseSite/1968Demo.html.
Norman, Donald. 1993. Things That Make Us Smart: defending human attributes in
the age of the machine. Reading, Mass.: Addison-Wesley Publishing Company.
Ortiz, C., T. Rauenbusch, E. Hsu, and R. Vincent. 2003. Dynamic resource-bounded
negotiation in non-additive domains. In Distributed Sensor Networks, ed. Victor
Lesser, Charles Ortiz, and Milind Tambe. Dordrecht, Netherlands: Kluwer Academic Publishers, 61–108.
Parkes, David, and Sébastien Lahaie. 2004. Applying Learning Algorithms to Preference Elicitation. Proceedings of the 5th ACM Conference on Electronic Commerce (EC-04). New
York: ACM Press, 180–88.
Rich, C., and C. L. Sidner. 1998. Collagen: A Collaboration Manager for a Collaborative Interface Agent. User Modeling and User-Adapted Interaction 7.3–4: 315–50.
Shieber, Stuart. 1996. A Call for Collaborative Interfaces. Computing Surveys, vol.
28A (electronic). Available at http://www.acm.org/pubs/citations/journals/surveys/
1996-28-4es/a143-shieber/.
Simon, Herbert. 1981. The Sciences of the Artificial. Cambridge, Mass.: MIT Press.
Tambe, M. 1997. Towards Flexible Teamwork. Journal of Artificial Intelligence
Research 7: 83–124. Available at http://teamcore.usc.edu/papers/97/jair.ps.
Xerox:history. 2004. Available online at http://www.parc.xerox.com/about/history/
default.html.