Computing point-of-view
By Hugo Liu
Thesis Proposal for the degree of Doctor of Philosophy
at the Massachusetts Institute of Technology
November 2005
Professor Pattie Maes
Associate Professor of Media Arts and Sciences
Massachusetts Institute of Technology
Professor William J. Mitchell
Head, Program in Media Arts and Sciences
Alexander W. Dreyfoos, Jr. (1954) Professor
Professor of Architecture and Media Arts and Sciences
Massachusetts Institute of Technology
Professor Larifari Aufhebung
King of Candy Land
Computing point-of-view
Hugo Liu
Media Arts and Sciences, MIT
November 2005
Abstract
A point-of-view affords individuals the ability to judge and react broadly to
people, things, and everyday happenstance. Your same sense-of-beauty is
versatile enough to judge almost anything you put before it, be it a
painting, a sunset, or a novel's ending. Yet point-of-view is ineffable and
quite slippery to articulate formally through words—just as light has no
resting mass, perhaps it could be said that viewpoint cannot be measured in
stasis. Drawing from semiotic and epistemological theories, this proposal
narrates a computational theory for representing, acquiring, and tinkering
with point-of-view. I define viewpoint as a self's collected situations within
latent semantic spaces such as culture, taste, identity, and aesthetics. The
topology of these spaces is acquired through linguistic ethnography of
online cultural corpora, and an individual's locations within these spaces are
inferred through psychoanalytic machine readings of egocentric texts. Once
acquired, viewpoints can gain embodiment as viewpoint artifacts, which
allow the exploration of someone else through interactivity and play. The
proposal will illustrate the theory by discussing interactive viewpoint
artifacts built for five viewpoint realms: aesthetics, attitudes, cultural
identity, taste-for-food, and humor. I describe core enabling technologies
such as common sense reasoning and textual affect sensing, and propose a
framework to evaluate the judiciousness of point-of-view representations
and the value of viewpoint artifacts in affording people new ways for
organizing, shaping, and searching human narrative content.
Key Words: Point-of-view Models, User Modelling, Common Sense, Textual Affect Sensing, Aesthetics, Culture
1 Introduction
Since the late 1950s, every few years, some researcher in Artificial
Intelligence has exclaimed eureka, that they have almost engineered
a human intelligence, or some basal capability of a person. But in
2005, four years after the computer H.A.L. should have played tricks
with man in space, Artificial Intelligence feels still the same distance
from this ever-present mirage of human-level intelligence.
It seems that several bad paradigms stalled progress
on representing and computing people. First, overly grandiose claims
were made about formal logic and purely symbolic representation,
nicknamed Good Old-Fashioned AI by its detractors. Logic, with its
immaculate and universal calculus, treats minds like Rube-Goldberg
machines, and idealizes thought process the way that Descartes did.
Logic failed because thought is far more flexible, rich, and
opportunistic than can be contained by a mathematically rigid,
symbolically sparse, and non-opportunistic representation like
first-order predicate calculus. Second, much ado was made about purely
connectionist representations like artificial neural networks. The
idea was that a properly wired ‘baby machine’ could be deployed in
the world and re-derive human mental capability applying only
first-principles. Like logic, this touched another extreme of the
representational spectrum, namely it was representationally agnostic.
The approach has yet to demonstrate compelling emergent
intelligence. Marvin Minsky (1990) reported on the stalemate—he
suggested that the common error was that ‘neat’ representations are
too rigid to capture the diversity of human intelligence. He proposed
that the intelligence modeling enterprise should instead take a
‘scruffy approach’—combining ‘multiple representations’ (Minsky
1988). In advocating the overthrow of Cartesian hegemony, Minsky
paralleled Gilles Deleuze and Felix Guattari’s defining work of our
time, “A Thousand Plateaus: Capitalism and Schizophrenia” (1987),
which destroys Modernism’s immaculate linear account of life and
thought.
While some illusions have been overcome, Artificial Intelligence
needed in the boom of expert systems, and needs again now in the
boom of knowledge-based approaches, to sort out the importance of
microscopic knowledge, given as expert rules or “facts about the
world,” whatever that may mean. The shadow of Descartes haunts
‘facts’ as much as logic, for even if facts are received cum grano salis
and their truth conditions are hedged, they still purport to be evoked by
people engaged in thinking. As a matter of reflexivity, much of our
Open Mind Common Sense work at this lab (Singh, Barry & Liu
2004) is as vulnerable as Cyc (Lenat 1995) to the stamp-collecting
syndrome. Cyc’s 3 million assertions and Open Mind’s 800,000
sentence-based “facts” do not further them in a ‘horse-race’ toward
human level knowledge. So long as representation is purely
symbolic—as facts are—abilities granted to children like dexterously
manipulating a ball (Singh 2003) or granted to adults like skill with
people might occupy billions if not more sentences to describe
judiciously. The warning to heed is that human intelligence is not about
possessing rote knowledge. Having knowledge around does not ensure
that it can be applied judiciously and opportunistically to form
coherent thoughts and reactions.
Motivated by a search for coherent yet flexible representation
and emulation of human intelligence, we identify point-of-view as a
crucial metaphor for conceptualizing human intelligence.
Consider a layperson’s dissection of the “point-of-view” concept: two
participants in an argument are debating the merits of an artwork
and find that they disagree; one says to the other, “but from my
point-of-view, I see things differently.” Here point-of-view evokes
an image of the two debaters standing at opposite ends of an
opinion-space. In the middle is a large blob representing the true
meaning of the artwork. The claim “from my point-of-view, I see
things differently” reifies as one debater reporting that he can see a
different side of the true meaning of the artwork than can the other
debater, while allowing that he himself cannot grasp the whole
meaning. So, having point-of-view relieves the anxiety of having
true thoughts—instead, it privileges coherency and integrity over
truth itself, for standing from the same vantage point, a debater will
tend to report all sightings of meaning blobs with the same
idiosyncratic tendencies, always seeing a certain side to things.
A point-of-view is easy. Every person is always operating under
one or more points-of-view regardless of having reflexivity about it,
because cognitive economy dictates that our knowledge and
memories are always consolidated and systematized, with at least
patchwork consistency. In Metaphors We Live By, George Lakoff and
Mark Johnson (1980) report that language itself is organized and
unified by culturally-specific metaphorical frameworks, which then
shape the thoughts of cultural participants in the way that Lacan
(1957) had presaged. For example, time is money, as in “I spent my
day on you, I can’t believe I invested so much time in you, and you
weren’t worth it.”
The grandeur of point-of-view’s economy is easily
demonstrated. Look at this artwork, do you find it beautiful? Read
this book ending, is it beautiful? Is this sunset beautiful? Or this
government? Most likely, your sense-of-beauty viewpoint prepared
you to judge all of these things, or at least attempt judgment.
Point-of-view affords the immediacy of judgment over any person, thing, idea,
or situation placed within its realm. There is no need to move, to be
agile, for judgment often happens like the natural reflex of a knee
jerking when struck with a mallet. Whereas a facts-oriented view
of thought requires conceptual knowledge, every person has
abundant judgmental knowledge by virtue of possessing points-of-view
like sense-of-beauty, sense-of-humor, sense-of-cultural-identity,
a palate for food, and a personality. It is not necessary to
store each judgment as a fact, for point-of-view’s lucidity readily
produces judgments as it reacts to whatever fodder is put before it.
Economical, flexible, and broad in applicability, point-of-view is
a powerful framework and mover for human judgmental thought,
arguably exceeding conceptual and logical thought in breadth and
utility. If point-of-view could be successfully modeled, acquired, and
animated computationally for a few important human realms such
as aesthetics, identity, and opinions, in toto, the computational
system would be emulating a significant basal capability of human
thinking.
To be clear, a computational model of an individual’s point-of-view
would constitute a stereotype of that person that is not as agile,
and that might make the same judgment if asked ten times in a row.
But I will argue in this proposal that this would still be an extremely
useful stereotype.
What if every person could access a
computational stereotype representing 80% of their mentor’s
judgmental capability, bouncing random things off their ‘virtual
mentors’ without resource bounds? There would be real
consequences for education if students could ‘tinker’ a la
constructionist learning (Papert & Harel 1991) with the stereotyped
opinions and perspectives of mentors, computationally producing
‘just-in-time’ and ‘just-in-context’ reactions to the student’s actions.
The goal of the research proposed here is to design, build, and
validate systems for 1) modeling an individual’s point-of-view
within various realms—such as aesthetics, attitudes, and identity; for
2) automatically acquiring an individual’s point-of-view model
through machine readings of egocentric (self-revealing,
self-describing) texts and 3) organizing the model into coherency; and for
4) animating point-of-view placed inside interactive artifacts such as
virtual mentors by causing the artifact to judge and react to a very
broad range of things placed before it, ‘just-in-time’ and
‘just-in-context’.
I plan to address these four steps as follows. 1) To develop
representations of viewpoints across the realms of concern, I will
draw heavily from well-established semiotic and epistemological
theories of said realms from the psychology and literary theory
literatures. For example, Carl Jung’s Modes of Perception (Think,
Intuit, Sense, and Feel) (1921) form the dimensions of my proposed
aesthetic viewpoint space, as I pose aesthetics as the perceptual
manner and priority with which an individual approaches some
topic—a realist sees a sunset, but a romantic might prefer to feel the
sunset. The realist is thus located at the position, 100% Sense, 20%
Think, 20% Intuit, 20% Feel, for example. 2) To automatically
acquire an individual’s point-of-view, I propose to apply natural
language processing tools such as my widely used MontyLingua
package (Liu 2002), in conjunction with my common sense reasoning
package ConceptNet (Liu & Singh 2004b), and my textual affect
sensing system known as Emotus Ponens (Liu, Lieberman & Selker
2003). In particular, I anticipate that reading emotion out of text will
be vital to modeling viewpoint because human judgment often
reifies in narratives through emotional appraisal or mannerisms
around a topic’s discussion. 3) To make point-of-view models
somewhat coherent, I will apply analogy-based reasoning (Gentner
1983; Fauconnier & Turner 2002). For example, knowing that a
person loves trees, we might infer by analogical extension that they
also love rocks (note that this differs from the layperson’s sense of the
word ‘analogy’); however, pitfalls must be avoided: for example, a
dog lover may hate cats, even though dogs and cats are both pets. 4)
Finally, to animate point-of-view, I proceed along the
methodological lines of Just-in-Time-Information-Retrieval (JITIR)
(Rhodes & Maes 2000) which prescribes that interface agents—in my
case a virtual mentor reacting to things that you are writing or doing
using its viewpoint—continuously mine present user context and
utterances, searching for opportunities to retrieve and present
relevant information – in my case, a viewpoint-produced judgment
about whatever the user is doing—on the chance that it can lend
insight, inspire, or teach the user.
While the acquired models will not be absolutely complete or
always correspond to true viewpoint, and while none of the
produced reactions will be as spontaneous or as flexible as those of
the actual person, I believe that even a first-order approximation of
model acquisition and animation can produce incisive models of
individual perspective, that upon animation will afford novel and
effective new ways to search, gain insight into, be inspired by, and
connect with someone else and their collected narrative content. I
have italicized three words in the previous sentence because these
words constitute the tripartite agenda of our Ambient Intelligence
Group. I believe that our group has the most to gain from such a
thesis, as the methodological conclusions of this research would
directly inform much of the impact we seek for our technologies to
have on people.
Finally, this thesis is as diverse and as simple as I believe Media
Laboratory research should be—diverse in the methods and theories
it draws from, but simple in that it is attacking a basic problem of
relevance to people, so basic that its goal could be explained to
anyone on the street. This thesis draws from Sociology, Literary
Theory and Psychology for its computational framing of point-of-view,
from Computational Linguistics and Artificial Intelligence for
reasoning about text, and from Interaction Design for designing
point-of-view artifacts. I have developed but not assembled nor
integrated some implementations for this thesis, and already it forms
the basis for an AAAI workshop on computational aesthetics, which
I will co-chair upon the proposed completion of this thesis. To do
justice to an idea as complex, and with as long a history, as
‘point-of-view,’ it will be important to clothe the thesis in all of the relevant
literatures and to spend as much time on a computational theory of
point-of-view, as on technical details of implementation. Otherwise,
this work would lose a golden opportunity to be absorbed by an AI
community that is interested in how machines can appraise beauty
and emotion, and by a humanities and cultural studies community
that would be very interested in the computation of its long-standing
but thought-to-be-incomputable theories of identity and aesthetics. The
rest of this proposal will reflect my emphasis on the importance of
grounding this thesis in the literatures, and on the importance of
distilling reusable methodology and a robust theoretical framework.
I will, of course, motivate all theory with many implemented
demonstrations and task-based evaluations.
2 Proposed Research
In the following subsections I propose a theoretical framework for
representing and computing point-of-view, and detail how the
framework and its associated methodology will be supported by
point-of-view systems I have been researching for several realms
including aesthetic-space, opinion/attitude-space, cultural
identity-space, tastebud-space, humor-space, and commonsense-space. I
have implemented many of these systems, and completed various
task-based evaluations. I propose to reframe these disparate systems
so that they support and illuminate a central theoretical framework.
In what follows I also nominate core technologies, knowledge
representations, and techniques needed to assemble point-of-view
systems. Finally I outline an evaluation strategy.
2.1 Theoretical Framework
Overview. I define viewpoint as a self's collected situations within
latent semantic spaces such as culture, taste, identity, and aesthetics.
The definition reflects a school of psychology called Situationalism (
) or Social Constructionism ( ), emerging out of Jacques Lacan’s
notion that the ego is always defined in the other (1957). The
topology of these spaces is acquired through linguistic ethnography
of online cultural corpora, and an individual's locations within these
spaces are inferred through psychoanalytic readings of egocentric texts
(self-revealing, self-describing), for example, a diary, a research
paper, or a social network profile. Once acquired, viewpoint models
gain embodiment as viewpoint artifacts, which allow for
self-reflection or the exploration of someone else through interactivity
and play.
Figures 1a-d. Viewpoint models (clockwise from upper-left): a) viewpoints are computed as situations
within latent semantic spaces; b) a realist’s perspective in a 5-dimensional realm defining perceptual
aesthetics, where a topic, such as “sunset” depicted here, casts semantic shadows on each dimension, as
shown; c) opinion-space represented as a sheet of topics which overlays two different perspective
sheets, whose alignments are shown; d) cultural identity space depicted as a fabric of cultural interests;
an individual’s ‘pattern of liking’ constitutes an ethos imprinted in the fabric.
Entertaining the idea of point-of-view in the abstract (Fig. 1a) is
uncontroversial, but the real theoretical challenge posed in this thesis
is to reify the abstract notion into workable computational systems.
A computational theory of point-of-view will describe the
dimensionalities and properties of some major viewpoint realms
(Figs. 1b-1d are examples soon to be explained); will specify how the
topologies of such spaces can be acquired and how individuals can
be modeled within such spaces; and will demonstrate how
space+location can together be used to predict an individual’s
reaction to any person, thing, idea, or situation put before it. I will
refer to this something collectively as “fodder.”
Figure 2. Semantic diversity matrix. Point-of-view spaces can be conceived in terms of their
consistency and connectedness; for each case, an appropriate knowledge representation is specified.
The top row is semiotic/symbolic in quality; the bottom row is ethnographic/connectionist in quality.
Knowledge representation for viewpoint spaces. Figs. 1b-1d
illustrate three varieties of knowledge representation used in this
thesis research to model latent semantic spaces. But why three and
not one? Because sometimes the dimensionality of a space is known
(Fig. 1b) while other times it is not (Figs. 1c-d). The goal of locating
an individual’s viewpoint within a space is to reduce the task of
predicting reactions to fodder to simple Cartesian distance
measurements. Ideally, a dimensional space such as Fig. 1b can be
identified as appropriate. In dimensional spaces, information is
most organized and unified, and the notion of distance is most
straightforward. Dimensions of space could of course be inferred
statistically through approaches such as Latent Semantic Analysis
(Deerwester et al. 1990), Support Vector Machines (Joachims 1998),
Multi-Dimensional Scaling (Kruskal & Wish 1978), Principal
Component Analysis, and the like, but in these cases, the quality
and human-readability of the dimensions cannot be assured; for
example, in the document classification problem, LSA can
appropriate one dimension for each word or punctuation mark. The
proposed thesis only appropriates dimensional spaces when the
dimensions are semiotic in nature—that is to say, they are named,
canonical, and well studied in the Psychology and Cognitive Science
literatures. Fig. 1b depicts the example of a space for perceptual
aesthetics—its dimensions are taken from Carl Jung’s modes of
perception (1921), which is well studied in psychology and precursor
to the popular Myers-Briggs Type Indicator (MBTI) model of
personality (Briggs & Myers 1976).
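The reduction of reaction prediction to distance measurement can be illustrated with a minimal sketch. It assumes a four-axis space named after Jung's modes, plain Euclidean distance, and invented coordinates; only the realist's location (100% Sense, 20% on the other axes) comes from the example given earlier in this proposal. The names `distance` and `nearest_viewpoint` are hypothetical and do not refer to any implemented system.

```python
import math

# Axes of the illustrative aesthetic space, after Jung's modes of perception.
AXES = ("think", "intuit", "sense", "feel")

def distance(location, shadow):
    """Euclidean distance between a viewpoint location and a fodder's shadow."""
    return math.sqrt(sum((location[a] - shadow[a]) ** 2 for a in AXES))

def nearest_viewpoint(viewpoints, shadow):
    """Name of the viewpoint whose location lies closest to the shadow."""
    return min(viewpoints, key=lambda name: distance(viewpoints[name], shadow))

# The realist's location follows the example in the text; the romantic's
# location and both fodder shadows are invented for illustration.
viewpoints = {
    "realist":  {"sense": 1.0, "think": 0.2, "intuit": 0.2, "feel": 0.2},
    "romantic": {"sense": 0.2, "think": 0.2, "intuit": 0.4, "feel": 1.0},
}
sunset_photo = {"sense": 0.9, "think": 0.1, "intuit": 0.2, "feel": 0.3}
sunset_poem = {"sense": 0.2, "think": 0.2, "intuit": 0.5, "feel": 0.9}

print(nearest_viewpoint(viewpoints, sunset_photo))
print(nearest_viewpoint(viewpoints, sunset_poem))
```

A smaller distance is read here as a stronger affinity between a viewpoint and a piece of fodder; whether affinity should be monotone in distance is itself an assumption of the sketch.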
When semiotic dimensionality is not known, it is still possible to
aim for a fully connected representation such as the semantic fabric
in Fig. 1d. If that is unavailable, a semantic sheet representation is
appropriate (Fig. 1c). Inspired by Marvin Minsky’s “causal diversity
matrix” organizing reasoning methods by the number of causes and
effects (Minsky 1992), in Figure 2 I pose a “semantic diversity
matrix” which organizes knowledge representation according to the
connectedness and consistency of the semantic spaces they best
represent. The reader should note that several species are
omitted from the matrix. For example, if a third dimension were
introduced for semioticity, we could distinguish “dimensional
spaces” as being either a semiotic/structuralist space like Jung’s
modes of perception, or as being a data-emergent “quality space” for
concept formation in AI ‘baby machines’ (Gärdenfors & Holmqvist
1994; Johannesson 1996; Gärdenfors 2000).
In research already completed, two viewpoint realms were
modeled using semiotic dimensional spaces—Jung’s modes of
perception was the basis for a perceptual aesthetic space (Fig. 1b),
and the well studied “Big Five” model of personality (John 1990) was
the basis for—aptly—the personality viewpoint space (Liu &
Mueller, forthcoming). For cultural identity space—i.e. the space of
things that people like such as music, books, sports, and
subcultures—a semantic fabric representation (Fig. 1d) was chosen
since the space was rather inconsistent, and since there was an
opportunity to mine a fully connected structure out of a large corpus
of social network profiles (Liu & Maes 2005a; Liu, Maes & Davenport
2006). Opinion-space—i.e. all possible systems of attitudes toward
arbitrary topics about the world, about politics, or about academic
subjects—is believed to be much more unorganized and
opportunistic due to the myriad of causes and conditions which can
shape opinion like social influences and experiences; therefore a
“semantic sheets” representation is chosen ( ). “Semantic sheets” are
also used to create a model of humor-space ( ). ConceptNet’s
semantic network model of common sense reasoning (Liu & Singh
2004b) and Synesthetic Recipe’s annotation model of tastebuds (Liu,
Hockenberry & Selker 2005) will also be discussed as viewpoint
spaces of low consistency and moderate connectedness, for the sake
of theoretical completeness.
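As a rough illustration of how a fully connected fabric might be mined from profiles, the sketch below scores every pair of interests by how often they co-occur. The affinity measure (profiles listing both items divided by profiles listing either) is an assumption chosen for simplicity, not necessarily the measure used in the cited work, and the profiles and the name `build_fabric` are invented.

```python
from itertools import combinations

def build_fabric(profiles):
    """Map each unordered pair of interests to a co-occurrence affinity in [0, 1]."""
    # Record which profiles mention each interest.
    holders = {}
    for index, profile in enumerate(profiles):
        for item in profile:
            holders.setdefault(item, set()).add(index)
    # Score every pair, so the resulting structure is fully connected.
    fabric = {}
    for a, b in combinations(sorted(holders), 2):
        both = len(holders[a] & holders[b])
        either = len(holders[a] | holders[b])
        fabric[(a, b)] = both / either
    return fabric

# Invented profiles of "favorite things".
profiles = [
    {"jazz", "sushi"},
    {"jazz", "sushi", "hiking"},
    {"hiking", "camping"},
    {"jazz", "camping"},
]
fabric = build_fabric(profiles)
print(fabric[("jazz", "sushi")], fabric[("camping", "sushi")])
```

Pairs that never co-occur still receive an edge, with affinity zero, which keeps the structure fully connected while leaving room for patchwork consistency.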
Organizing Principles of Viewpoint. Consistency gives shape to
viewpoint space. Without consistencies, applying viewpoint models
to predict reactions to fodder would have to resort to memory-based
and case-based reasoning—such that if a fodder is not explicitly
specified in the model, no reaction could be given. Dimensional
Spaces are fully consistent and organized because a total ordering
exists for each dimension. Organizing principles in Semantic Sheets
and Semantic Fabrics are more opportunistic and only patchwork
consistency exists. For Semantic Fabrics in the example of cultural
identity space, I nominate three topological organizing features –
hub-and-spokes, n-cliques, and neighborhoods, discussed elsewhere (Liu,
Maes & Davenport 2006) in preliminary form. For Semantic Sheets,
three organizing features are nominated—1) Minsky’s imprimer
theory (Minsky, forthcoming) informs how one individual’s system
of attitudes/opinions is partially structured by the systems of their
parents and mentors; 2) folksonomies of topics imply underlying
consistency of attitudes (e.g. “macramé” is a subtopic partially
structured by the topic “arts & crafts”); and 3) analogical reasoning
(conceptual blending, structure-mapping; Gentner 1983; Fauconnier &
Turner 2002) can be applied just-in-time to predict reactions to
unknown fodder (e.g. attitude toward “rocks” can be predicted by
attitude toward “trees” by their conceptual resemblance).
Techniques from truth maintenance systems (Doyle 1980) are
applied to maintain patchwork consistency, though contradictions
do occur and these are presented as “soft-constraints.”
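The third organizing feature, predicting a reaction to unknown fodder from known attitudes, can be sketched as follows. Note the substitution: instead of structure-mapping or conceptual blending, the sketch uses a crude feature-overlap (Jaccard) similarity, and the feature sets, attitude values, and function names are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two feature sets, from 0.0 (disjoint) to 1.0 (equal)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def predict_attitude(fodder, attitudes, features):
    """Similarity-weighted mean of known attitudes; None if nothing resembles the fodder."""
    weighted, total = 0.0, 0.0
    for topic, valence in attitudes.items():
        weight = jaccard(features[fodder], features[topic])
        weighted += weight * valence
        total += weight
    return weighted / total if total > 0 else None

# Invented features and attitudes (valence in [-1, 1]).
features = {
    "trees": {"nature", "outdoors", "living"},
    "rocks": {"nature", "outdoors", "mineral"},
    "traffic": {"city", "noise", "outdoors"},
    "spreadsheets": {"office", "numbers"},
}
attitudes = {"trees": 0.8, "traffic": -0.6}

print(predict_attitude("rocks", attitudes, features))
print(predict_attitude("spreadsheets", attitudes, features))
```

A similarity this shallow would also predict that a dog lover likes cats, which is exactly the kind of pitfall that a structural account of analogy and soft constraints are meant to guard against.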
Simulating Viewpoints. Statically, viewpoints are space+location,
but to fully appreciate and understand a viewpoint, it must be
animated and allowed to react to a broad range of things. Analogy
(Gentner 1983; Fauconnier & Turner 2002) and context-biased
spreading activation (Collins & Loftus 1975; Liu 2003) are chief
techniques for anticipating how the implications of a viewpoint
inferred from across many contexts can then be applied to create a
reaction in a new context. Although with viewpoint models we go
beyond the “rote” memory-based application of old ideas to new
fodder, viewpoint simulation is still not capable of applying
viewpoint models in any particularly clever way to new situations.
Humans are capable of evolving their viewpoint nimbly as new
fodder presents opportunities for belief revision, but machines are
not capable of simulating this complex self-dialectic (Bakhtin 1935). A
goal for the thesis is to discuss how the simulation of viewpoint
could become dialectical, how an artificial viewpoint could
contradict and overcome itself cleverly—what Hegel calls Aufhebung
(1807). Viewpoint models and simulation carry specific implications
for dialectics—a central problem in critical theory. If Aufhebung
could be simulated, it would represent a major breakthrough for the
computation of inspiration.
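For readers unfamiliar with spreading activation, a minimal sketch follows. It assumes a weighted concept graph, a fixed decay per hop, and a rule that each node keeps the strongest activation reaching it; the context bias is modeled simply by which concepts are seeded. The graph, weights, and the name `spread_activation` are illustrative assumptions, not ConceptNet's actual data or API.

```python
def spread_activation(graph, seeds, decay=0.5, hops=2):
    """Propagate activation outward from seed concepts.

    graph: dict mapping concept -> dict of neighbor -> link weight in [0, 1]
    seeds: dict mapping concept -> initial activation (the context bias)
    Each hop passes energy * weight * decay to neighbors; a node keeps the
    strongest activation that has reached it so far.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(hops):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        frontier = next_frontier
    return activation

# Invented fragment of a concept network.
graph = {
    "sunset": {"beach": 0.8, "evening": 0.9},
    "beach": {"vacation": 0.7, "sunset": 0.8},
    "evening": {"dinner": 0.6},
}
result = spread_activation(graph, {"sunset": 1.0})
print(sorted(result, key=result.get, reverse=True))
```

Seeding different concepts for the same fodder yields different activated neighborhoods, which is one simple way a context could bias the reaction that a viewpoint model produces.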
To animate computed viewpoint models, viewpoint artifacts are
created—such as the Identity Mirror (Liu, Maes & Davenport 2006;
Liu & Davenport 2005), the Aesthetiscope (Liu & Maes 2005b; Liu &
Maes 2006), virtual mentors in What Would They Think? (Liu &
Maes 2004), and avatars in Synesthetic Recipes (Liu, Hockenberry &
Selker 2005). Viewpoint artifacts reify space+location models by
having them constantly react just-in-time and just-in-context to a
broad range of fodder put forth to them implicitly or explicitly by a
user, and by visualizing these reactions through visual metaphors.
Furthermore, each viewpoint artifact allows for tinkering, play, and
explanation, e.g. virtual mentors can “justify” their reactions with
quotes, and identity can be negotiated in the Identity Mirror by a
“dancing” interaction in front of the mirror. The importance of
tinkering is likely due to the fact that a reaction’s motivation cannot
be easily grasped without exploring the immediate context and
conditions surrounding the reaction.
2.2 Core Enabling Technologies
Three core technologies that drive the acquisition of viewpoint
models from machine readings of text are natural language
processing, common sense reasoning, and textual affect sensing.
Machine learning techniques and hand engineering of many supporting
semantic knowledge bases are also important, but they are not
discussed here.
Natural language processing. Because some viewpoint spaces are
acquired by ‘linguistic ethnography’ over cultural corpora available
online (such as a corpus of social network profiles, or a corpus of
conservative versus liberal news texts), and because all of an
individual’s locations within spaces are acquired by psychoanalytic
readings of egocentric (self-revealing, self-describing) texts, natural
language processing (NLP) is central to this research. Relevant NLP
tasks include discourse segmentation, tokenization, named-entity
recognition, spelling correction (Levenshtein 1965), part-of-speech
tagging ( ), deixis resolution ( ), verb and noun chunking ( ),
prepositional linking ( ), gisting syntactic, semantic, and thematic
role frames ( ), natural language generation ( ), topic spotting ( ),
summarization ( ), and statistical language modeling ( ).
For the bulk of these tasks, I have developed a natural language
understanding platform for Python, called MontyLingua (Liu
2002)—now widely used since my releasing it to the Computational
Linguistics and AI communities.
Commonsense reasoning. Commonsense reasoning is a core
component of machine readers that will read texts to acquire
viewpoint spaces and locations. The essential insight that
distinguishes machine reading—or Story Understanding / Narrative
Comprehension as it is also called—from mere deep text parsing is
that more than what a text explicates, it also implies and insinuates
through subtext, and it requires contingent knowledge in the form of
backtexts to decipher the full meaning of an utterance. As a
community, Computational Linguistics has focused on Syntax via
Grammars and Formal Lexical Semantics via dictionaries and
WordNet (Miller et al. 1990), surely due to the deep impression left
by Chomsky ( ) on linguistics. The rest is forced into relatively
under-explored buckets called “semantics”, “pragmatics” and
“discourse theory.” To read subtexts with the aid of backtexts, the
Artificial Intelligence community has applied approaches such as
Schankian scripts and plans (Schank & Abelson 1977), and more
recently, large scale databases of world knowledge (Lenat 1995;
Mueller 2000; Singh et al. 2002). The proposed thesis uses the latter
approach as it gives broader semantic coverage—a feature necessary
to the interpretation of domain-independent texts.
Cyc (Lenat 1995), ThoughtTreasure (Mueller 2000), and Open
Mind Common Sense (Singh et al. 2002) are three approaches to
large-scale common sense knowledge acquisition and reasoning.
Cyc and ThoughtTreasure have logical representations and are more
suitable for rigorous deep reasoning about situations, while Open
Mind Common Sense and its ConceptNet (Liu & Singh 2004b) have a
natural language representation, and thus excel at contextual
reasoning over natural language texts (Liu & Singh 2004a).
ConceptNet is a semantic network of common sense facts, with built-in
methods for contextual expansion and analogy.
Examples of use in this thesis are as follows. The conceptual
analogy faculty of ConceptNet is used to apply viewpoint models to
predict reactions to unknown concepts in WWTT (Liu & Maes 2004)
by situating the unknown fodder into the space of known concepts,
also called conceptual alignment in the Cognitive Science literature
(Goldstone & Rogosky, 2002). In the aesthetic viewpoint space,
ConceptNet’s getContext() feature is used to brainstorm the rational
entailments of a text, in order to generate the “shadows” that a
fodder casts onto the “Think” axis. Finally, ConceptNet is a
principal component of another core technology: textual affect
sensing.
Textual affect sensing. Judgment is the behavioral and measurable
expression of viewpoint, and the primary quality of judgment is
affect. In fact, Ortony, Clore and Collins (1988) condensed the
definition of “emotion” to mean the expression of an affect about a
person, thing, or event. Emotion and judgment thus can be
represented basically as the bound pair (thing, affect). In some of the
viewpoint systems to be presented in this thesis, affect manifests as
choice implicature. For example, in the cultural identity space
acquired through linguistic ethnography over social network
profiles, individuals choose to display certain items in their profile of
“my favorite things,” and that choice can be viewed as a judgment
act (Austin 1962; Habermas 1981) which says that things listed in the
profile are more pleasurable, more arousing, and more dominated over than
things not listed in the profile.
Other times though, affect must be inferred from unstructured
natural language texts—for example, the machine should learn from
the utterance “my mother is a loving and generous woman” that the
speaker judges his mother positively. To complete this task, a topic
spotter looks for the topics present in sentences, paragraphs, and
documents, while a textual affect sensor appraises the affective
qualities of each segment of text. Binding those two outputs to each
other as (topic, affect) pairs, and using classical reinforcement
learning (Kaelbling, Littman & Moore 1996) to generalize stable
(topic, affect) pairs from training data, we have the beginnings of a
model of a person’s system of attitudes/opinions.
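As a minimal sketch of the binding step, assume that affect is a (pleasure, arousal, dominance) triple and that a topic spotter and affect sensor have already produced the hypothetical `observations` below. The code keeps a per-topic mean and treats a pair as stable once the topic has been seen `min_count` times. The running mean is only a simple stand-in for the reinforcement learning named above, and `min_count` is an illustrative parameter, not one taken from the systems described here.

```python
from collections import defaultdict

def generalize_attitudes(observations, min_count=2):
    """Average the affect observed per topic; keep topics seen often enough.

    observations: iterable of (topic, (pleasure, arousal, dominance)) pairs,
    with each value assumed to lie in [-1, 1].
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for topic, affect in observations:
        counts[topic] += 1
        for i, value in enumerate(affect):
            sums[topic][i] += value
    # Only topics with enough supporting observations count as stable.
    return {
        topic: tuple(total / counts[topic] for total in sums[topic])
        for topic in sums
        if counts[topic] >= min_count
    }

# Hypothetical sensor output: two mentions of "mother", one of "traffic".
observations = [
    ("mother", (0.75, 0.25, 0.0)),
    ("mother", (0.5, 0.5, 0.0)),
    ("traffic", (-0.5, 0.5, -0.5)),
]
attitudes = generalize_attitudes(observations)
```

With these inputs only "mother" survives the threshold; "traffic" has been seen once and is not yet treated as a stable attitude.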
To accomplish comprehensive textual affect sensing, I sense surface
and deep affect separately. Surface, or rhetorical, affect can be
measured as word-choice; I sense it by combining the Sentiment
headwords of Roget’s Thesaurus (1911), a corpus of psychologically
normalized affect words called ANEW (Bradley & Lang 1999), and
an affective lexical inventory produced by Ortony, Clore and Foss
(1987).
Deep affect is the pathos permeating from the contingent imagined
consequences of an utterance and can be communicated without
mood keywords at the surface. For example, the utterance “I was
fired, my wife left me, and she took the kids and the house” uses no
surface keywords yet nonetheless conveys a negative affect quite
powerfully. Deep affect sensing is attempted using Emotus Ponens
(Liu, Lieberman & Selker 2003), a textual affect sensor built using the
Open Mind Common Sense corpus (Singh et al. 2002). The basic idea
is that when the affect of a concept is unknown, it can be approximated
by the affect in its surrounding conceptual neighborhood. For
example, supposing that the concept “get fired” is not annotated
with affect, ConceptNet (Liu & Singh 2004b) has semantic links
which connect “get fired” to other nodes which are annotated with
affect such as “recession” (probable cause), “stupid person”
(probable cause), “no money” (probable consequence), “hungry”
(probable consequence). Thus the affect of “get fired” can be
guessed by its context.
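The neighborhood idea can be sketched as a weighted average over annotated neighbors. This is an illustrative approximation only: it does not use the ConceptNet API, the `links` structure and link weights are hypothetical, affect is reduced to a single valence score in [-1, 1], and the valence numbers are invented for the example.

```python
def approximate_affect(concept, links, known_valence):
    """Estimate a concept's valence from its annotated neighbors.

    links: concept -> list of (neighbor, link_weight) pairs.
    known_valence: neighbor -> valence in [-1, 1] for annotated nodes.
    Returns None when no annotated neighbor is available.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for neighbor, weight in links.get(concept, []):
        if neighbor in known_valence:  # skip unannotated neighbors
            weighted_sum += weight * known_valence[neighbor]
            total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight

# Hypothetical neighborhood of "get fired", mirroring the example above.
links = {
    "get fired": [
        ("recession", 1.0),
        ("stupid person", 1.0),
        ("no money", 1.0),
        ("hungry", 1.0),
    ]
}
known_valence = {
    "recession": -0.5,
    "stupid person": -0.5,
    "no money": -0.75,
    "hungry": -0.25,
}
guess = approximate_affect("get fired", links, known_valence)
```

Because every neighbor here carries a negative valence, the guessed valence for "get fired" is negative as well.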
2.3 Viewpoint Artifacts and Interactions
Describing viewpoint spaces and their organizational dynamics is
one pillar of the present research. Acquiring the topology of spaces
and the location of individuals from psychoanalytic readings of text
is a second pillar. The third pillar then, is to draw from Interaction
Design ( ) principles to construct interactive viewpoint artifacts and
animate their reactions to fodder. These artifacts allow viewpoints
to be explored and tinkered with, and thus hold promise for a
wide variety of applications such as technological support for self-reflection,
perspectival tools for learning from others, interfaces for
visualizing and searching human narrative content, psychographic
visualizations for marketing and ethnography, and so on.
The proposed thesis will narrate several viewpoint artifacts
already built, and then distill from those a set of core methodologies
and considerations for embodying viewpoint. An inventory of
research results obtained thus far is given below.
Figures 3a-b. What Would They Think? is a panel of virtual mentors who continually observe the
user’s browsing and writing activities, offering up just-in-time and just-in-context feedback to the
user’s “fodder”.
Visual metaphors: red=> displeasure, green=>pleasure, dim=>unaroused,
lit=>aroused, sharp=>dominant, blurry=>submissive. 3a) (left) depicts a panel of AI luminaries
reacting to the user’s surfing of the Social Machines Group website. 3b) (right) shows a Democratic
Party persona and a Republican Party persona (trained on their party talking points) reacting to an
article entitled, “What’s Wrong with the Contract with America?”
2.3.1 Major Examples
(Opinion Space) What Would They Think? (Liu & Maes 2004) is a
system for modeling personal attitudes and the space of opinions at
large using the Semantic Sheet representation shown in Fig. 1. A
user can build a new “persona” by supplying an icon and pointing
the system to some egocentric texts that are self-revealing and self-describing,
e.g. position papers, instant messaging logs, emails,
weblogs. The system reads and infers from the text a system of
attitudes for that persona. Personae are embodied into virtual
mentors (Fig. 3a) who continually observe the user’s browsing and
writing activities, offering up just-in-time and just-in-context
feedback to the user’s “fodder” through visual metaphors. To find
out why a mentor reacted in a particular way, mentors can be
double-clicked to pop up an explanation window—this window
displays a list of quotes snipped from the mentor’s “memory” of
egocentric texts, rank-ordered by how well they justify the reaction
that was given. For example, virtual mentor Roz Picard reacts
negatively to the utterance “Robots will have consciousness,” a reaction
that is defended with quotes like “Several of my colleagues believe it’s
just a matter of time and computational power before machines will
attain consciousness, but I see no science nuggets which support
such a belief.” Fig. 3b depicts the modeling of two cultures qua
personae. In WWTT, cultures can be treated commensurately with
individuals. The proposed thesis will pre-generate a fabric of
cultural opinions to acquire the opinion space. Using this opinion
fabric, individuals can be located as inhabitants of particular cultural
opinions by applying simple alignment or “diff” techniques between
cultures’ reactions and individuals’ reactions.
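One possible reading of the “diff” technique is a mean absolute difference between two sets of reactions over the topics they share. The sketch below assumes that reading; the function names, the scalar valence scale, and the culture labels are all hypothetical and are not taken from WWTT.

```python
def reaction_diff(reactions_a, reactions_b):
    """Mean absolute difference in valence over topics both parties react to."""
    shared = reactions_a.keys() & reactions_b.keys()
    if not shared:
        return float("inf")  # nothing to compare
    return sum(abs(reactions_a[t] - reactions_b[t]) for t in shared) / len(shared)

def nearest_culture(individual, cultures):
    """Name of the culture whose reactions differ least from the individual's."""
    return min(cultures, key=lambda name: reaction_diff(individual, cultures[name]))

# Hypothetical reactions (valence in [-1, 1]) to two pieces of fodder.
cultures = {
    "culture_a": {"topic_1": -0.75, "topic_2": 0.25},
    "culture_b": {"topic_1": 0.5, "topic_2": -0.5},
}
individual = {"topic_1": -0.5, "topic_2": 0.5}
home = nearest_culture(individual, cultures)
```

A smaller diff means closer alignment, so the individual is located with the culture that minimizes it.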
(Perceptual Aesthetic Space) The Aesthetiscope (Liu & Maes 2005b)
is an art robot that renders color grid artwork a la Ellsworth Kelly
and early Twentieth Century abstract impressionists (Figure 4). The
manner and quality of the generated artwork is guided by a model
of the user’s perceptual aesthetics. The perceptual aesthetic space
(shown in Figure 1b) has the five dimensions of Think, Sense, Intuit,
Feel, and Culturalize—these dimensions are based on Carl Jung’s
fundamental modes of perception (1921).
Automatic acquisition of the user’s aesthetic viewpoint through readings
of egocentric text is proposed for the thesis but not yet implemented.
Currently these dimensions must be specified manually. As a
perspectival artifact, the Aesthetiscope reacts to “fodder” given to it,
such as a word, a poem, or song lyrics. For example, it continuously
observes what poetry the user is reading or what songs are queued
in the playlist, dynamically changing the color grid artwork to “pair”
with the fodder, just as wines are selected to pair with a cheese
course. Another perspectival game that can be played is for two
individuals, both standing in front of the same artwork visualizing
some poem, to find their shared aesthetic (by averaging their
locations), or to violate each other’s aesthetic (by allowing one
aesthetic viewpoint to corrupt another viewpoint). I am particularly
interested in how deeply held aspects such as aesthetics can be
exhibited or worn on one’s sleeve, so to speak, much as a piece of clothing
displays identity and taste.
Figure 4. Perspectival aesthetic rendition in the Aesthetiscope. The left column shows how the art
robot renders the aesthetic impression of the words “sunset” (above) and “war” (below) through the
eyes of a Realist (e.g. Sense=90%, Think=60%, Culturalize=40%, Feel=20%, Intuit=10%). The right
column shows the same fodder rendered through the eyes of a Romantic (e.g. Sense=50%,
Think=20%, Culturalize=70%, Feel=90%, Intuit=80%).
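The shared-aesthetic game can be sketched with the Realist and Romantic settings from Figure 4. Averaging locations follows the description above. How one viewpoint “corrupts” another is not specified, so the linear blend in `corrupt` and its `strength` parameter are assumptions made for illustration.

```python
DIMENSIONS = ("Think", "Sense", "Intuit", "Feel", "Culturalize")

def shared_aesthetic(a, b):
    """Midpoint of two locations in the five-dimensional aesthetic space."""
    return {d: (a[d] + b[d]) / 2 for d in DIMENSIONS}

def corrupt(victim, intruder, strength):
    """Pull the victim's location toward the intruder's (assumed linear blend).

    strength=0 leaves the victim unchanged; strength=1 replaces it entirely.
    """
    return {d: victim[d] + strength * (intruder[d] - victim[d]) for d in DIMENSIONS}

# Percentages taken from the Figure 4 caption.
realist = {"Sense": 90, "Think": 60, "Culturalize": 40, "Feel": 20, "Intuit": 10}
romantic = {"Sense": 50, "Think": 20, "Culturalize": 70, "Feel": 90, "Intuit": 80}

shared = shared_aesthetic(realist, romantic)
corrupted_realist = corrupt(realist, romantic, strength=0.25)
```

The midpoint treats both viewers symmetrically, whereas the blend is deliberately one-sided: only the victim's location moves.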
(Cultural Identity & Taste Space) Identity Mirror (Liu, Maes &
Davenport 2005; Liu & Davenport 2005) is a mirror to support self-reflection
that lets you “see who you are, not what you look like.”
As shown in Fig. 5, the mirror’s computed reflection overlays a
swarm of keyword descriptors over an abstracted image of the
“performer.” The performer can use dance to negotiate his
identity. For example, walking toward or away from the mirror affects the
granularity of the keywords being shown, which describe a far away
performer using broad strokes like subculture keywords (e.g.
fashionista, raver, intellectual, dog lover), but describe an up-close
performer with descriptors like song names, books, food dishes, etc.
When movement is slow and deliberate, the keywords more
semantically distant from the performer’s ethos appear in the
computed reflection, but those keywords are quickly dashed with
sudden movements.
Figure 5. Self-reflexive performance with the identity mirror. A swarm of keywords shows a user’s
situation within the cultural fabric of identity/taste, and with respect to the attentional biases of the
zeitgeist as calculated by monitoring daily news streams. The user’s social network profile is used to
locate the user within the cultural fabric.
The Identity Mirror uses a social network profile to locate the
performer’s viewpoint within the cultural fabric of identity and taste.
The cultural “taste fabric” (Liu, Maes & Davenport 2005) is derived
by computing the latent semantic connectedness of “interest
keywords” (music, books, sports, subcultures, etc) from analysis of
the texts of 100,000 social network profiles. The performer’s location
on the fabric is calculated by reading his social network profile,
mapping that profile onto the nodes of the fabric, and using
spreading activation (Collins & Loftus 1975) to define an ethos (a
weighted collection of nodal activations). In the mirror artifact, the
identity/taste viewpoint of the performer is visualized as a swarm of
keywords. The viewpoint “reacts” to changes in the daily news
stream. For example, around the time of the summer Olympics, the
sports-centered news wire would bias the cultural fabric by
highlighting nodes relating to the Olympic sports. The reflection in
the mirror simulates the performer’s viewpoint by selectively
interpreting the new cultural situation, and displaying just what
exists at the intersection of the performer’s ethos and the news-du-jour’s
ethos. Ambient Semantics, ( ) another system using the taste
fabric, is an artifact that uses viewpoint to predict whether or not one
individual would find another person to be sympathetic.
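The ethos computation and its intersection with the news can be sketched as follows. This is a simple max-based variant of spreading activation over a toy graph, not the Identity Mirror's implementation: the `decay` and `steps` parameters, the edge weights, and the use of the minimum activation for the intersection are all assumptions.

```python
def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Spread energy outward from seed nodes; return node -> activation.

    graph: node -> list of (neighbor, weight) edges.
    Each hop multiplies energy by the edge weight and by `decay`;
    a node keeps the strongest activation that reaches it.
    """
    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, []):
                passed = energy * weight * decay
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        frontier = next_frontier
    return activation

def intersect_ethos(a, b):
    """Nodes active in both ethoi, scored by the weaker of the two activations."""
    return {node: min(a[node], b[node]) for node in a.keys() & b.keys()}

# Toy fabric: interest keywords linked by hypothetical relatedness weights.
fabric = {
    "indie rock": [("vinyl", 1.0), ("running", 0.5)],
    "running": [("marathon", 1.0)],
    "olympics": [("marathon", 1.0), ("swimming", 1.0)],
}
performer_ethos = spread_activation(fabric, ["indie rock"])  # from the profile
news_ethos = spread_activation(fabric, ["olympics"])         # from the news wire
reflection = intersect_ethos(performer_ethos, news_ethos)
```

In this toy case the reflection contains only "marathon", the one node reachable from both the performer's profile and the Olympic news.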
2.3.2 Minor Examples
(Gustatory Space) Synesthetic Recipes (Liu, Hockenberry & Selker
2005) is an interface for browsing for food recipes by imagined
tastes of food. For example, typing “old, beautiful, desperate,
urgent, alive, primal, homey, organic, nutritious, spicy, sweet, moist,
aromatic, easy, zen” yields a recipe for “bohemian stew.” With food
dishes, tastes, genres, and cultures arranged into a highly connected
semantic network, the network approximates a space of taste-for-food.
In Synesthetic Recipes, a viewpoint, called a “tastebud,” can
food, the avatars constantly emote their likings and dislikings for
suggested recipes. An individual’s tastebud can also be acquired
through observational learning of what the user types into the search
box. This is a minor viewpoint example and will be included in the
thesis for completeness.
(Humor Space) Buffolo is a humor robot that suggests jokes it
anticipates an individual will find funny. It does so by having
crafted a model of a person’s sense-of-humor relative to the space of
jokes. Using a Semantic Sheet representation like What Would They
Think?, Buffolo reads an individual’s egocentric texts such as a
weblog or email corpus with the goal of extracting (topic, pressure)
pairs, much as WWTT extracted (topic, affect) pairs. Pressure is one
particular dimension of a full affect measurement. Harking back to
psychoanalysis’s hydraulic model of emotions (Freud 1901), an
individual’s affective pressure points suggest psychic tensions
which need catharsis, and humor is a primary means to meet cathartic
need (Freud 1905). A major way to structure humor is by culture,
since many of one’s embarrassing and tense experiences growing up
are shaped by cultural idiosyncrasy, e.g. Asian families and scholastic
and work ethic emphasis; overbearing and verbose relatives in
Jewish families, narratives of hustling, ghettos, players, and bling in
Afro-American culture. Thus, Buffolo senses an individual’s cultural
identifications and uses this as a humor viewpoint from which to
predict the pleasure of a joke. Buffolo is also a minor viewpoint
example and will be included in the thesis for completeness.
(Personality Space) Character Affect Dynamics Analysis (CADA)
(Liu & Mueller, forthcoming) is a cognitive linguistics system which
reads novels and places the personalities of their characters in the Big
Five personality inventory (John 1990), whose dimensions are
agreeableness, neuroticism, openness, conscientiousness, and
extraversion. The system models the actions and interactions of
characters as affective token passing. For example, the sentence “Mary
swindled Jack” is parsed into a Who-Dun-What representation,
Mary and Jack are recognized as characters, and swindle is
recognized as a displeasurable aggressive act. In CADA’s
representation, the utterance equates to Mary sending Jack a
negative attack token. If the narrative continues to show Jack
negatively affected and submissive, then the system learns that Jack
is vulnerable. Statistically over the long haul of a novel, stable
personality characterizations can be made about each character.
CADA demonstrates how personality qua viewpoint can be acquired
in a sophisticated way from text, but it does not work well over too
small a corpus of text, nor does it work well over most egocentric
texts like weblogs. It is a minor system example that will
nonetheless illuminate the idea of “psychoanalytic readings of text.”
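Affective token passing can be sketched as a tally over already-parsed events. The sketch assumes the Who-Dun-What parsing and affect appraisal have been done upstream. The event tuple layout and the `vulnerability` score are hypothetical simplifications, not CADA's actual representation.

```python
from collections import defaultdict

def tally_tokens(events):
    """Group affect tokens by the character who received them.

    events: (sender, receiver, valence, receiver_dominance) tuples, where
    valence < 0 marks an attack and receiver_dominance < 0 marks a
    submissive response by the receiver.
    """
    received = defaultdict(list)
    for sender, receiver, valence, dominance in events:
        received[receiver].append((valence, dominance))
    return received

def vulnerability(received_tokens):
    """Share of negative tokens that the receiver met with submission."""
    responses = [dominance for valence, dominance in received_tokens if valence < 0]
    if not responses:
        return 0.0
    return sum(1 for d in responses if d < 0) / len(responses)

# "Mary swindled Jack" and its aftermath, as hypothetical parsed events.
events = [
    ("Mary", "Jack", -1.0, -0.5),   # Jack is attacked and submits
    ("Mary", "Jack", -1.0, -1.0),   # and again
    ("Jack", "Mary", -0.5, 0.5),    # Mary is attacked but stays dominant
]
received = tally_tokens(events)
jack_score = vulnerability(received["Jack"])
mary_score = vulnerability(received["Mary"])
```

Over a whole novel, ratios like these would accumulate into the stable characterizations described above; three events are far too few to support one.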
2.4 Evaluation
To successfully defend the computational theory of point-of-view to
be presented in the proposed thesis, I propose three lines of
evaluation—literature evaluation, model validation, and task-based
evaluation.
Literature evaluation. When presenting a new theory on a subject as
basic as point-of-view, a primary task is to properly situate the
theory within all of its proper literatures. I believe it is fair to call
this activity “evaluation” even though it is not quantitative.
Literature evaluation means to scrutinize the implications of the
computed viewpoint models to the viewpoint theories presented in
the literature, not only stuffing the literature narrowly into “related
work,” but sustaining dialog with the literature all the way through
the thesis, constantly checkpointing how this thesis’s account and
the literature’s accounts mirror and inform each other. The literature
on point-of-view is so extensive that if my theory can provoke even
some new thinking in the existing frameworks on point-of-view, it
should be considered a huge success. Because this is a
computational thesis, there is in particular a tremendous
opportunity to reify humanistic theories into computational
structures and processes.
Model validation. Among the major viewpoint systems to be
discussed in the thesis, space models are computed for perceptual
aesthetic space, for opinion space, and for cultural identity/taste
space. Location models are computed from individuals’ egocentric
texts. I propose to evaluate how well both the space models and
location models present accurate pictures of viewpoint spaces and
individual viewpoints. I will collect human ratings of the computer-generated
models as baselines for computing the accuracy of model
acquisition. I will also evaluate the quality of reactions produced
through simulated viewpoints by employing human judges. A few
of these evaluations have already been completed, including a human-rated
evaluation of the quality of attitude prediction in What Would They
Think?
Task-based evaluation. The usefulness of viewpoint artifacts speaks
to the significance and potential impact of computing viewpoint.
User studies will illuminate how well viewpoint artifacts can
support a diverse set of tasks such as self-reflection, taste-based
recommendation, learning about others, decision support, artistic
portrayal, and others. Some of these evaluations have already been
completed, including: a taste-based recommendation task using the
viewpoint space of cultural identity & taste; a learning about others
task using What Would They Think?; an artistic portrayal task in the
Aesthetiscope.
3 Contribution
The proposed thesis aspires to be the first comprehensive and
computed theory of point-of-view.
The theory will be well
supported by built viewpoint models for several domains such as
aesthetics, cultural identity, and opinions, implementations of
automated viewpoint acquisition from readings of text, and
implementations of several interactive viewpoint artifacts, which
demonstrate broad and significant implications for this line of
research. Specifically, I hope to show that
• The slippery notion of point-of-view, well covered in the
  humanities literatures, can be represented, captured, and
  reified into computational artifacts.
• A point-of-view can be computed most elegantly as an
  individual’s collective situations [sic] within latent semantic
  spaces of viewpoint such as OpinionSpace,
  PerceptualAestheticSpace, and CulturalIdentitySpace.
• An individual’s point-of-view and the topology of viewpoint
  spaces can be acquired automatically through what I call
  psychoanalytic machine reading.
• Interactive viewpoint artifacts that simulate an individual’s
  judgments and react to an individual’s actions just-in-time
  and just-in-context can afford powerful new tools for learning
  about others, for self-reflection, for inspiration, and for deeper
  user modeling.
4 Background and Related Work
The proposed thesis research is articulated against several backgrounds.
Given that I proposed ‘literature evaluation’ as an
important contribution of this thesis, the following section will be
extensive. In the following, I organize discussion around several basic
topics treated in this thesis, presenting both computational and
non-computational work.
4.1 Psychoanalytic reading for viewpoint
I suggest that an individual’s point-of-view within a particular
realm—inter alia, aesthetic, cultural identity, or opinion—can be
located by psychoanalytic readings of egocentric text. This feat builds
on work in the text interpretation division of literary theory called
hermeneutics, and on story understanding work in Artificial
Intelligence.
Non-computational work
While a preponderance of modernist theorists still cling to the view
that textual communication is rational, is objective, and that meaning
can be deciphered from text through logical disambiguation, post-Enlightenment
thinkers and many computational semantics
practitioners believe that meaning is just a collective hallucination—it
is not present in text objectively, only inter-subjectively. Friedrich
Schleiermacher (1809)—founder of the modern science of
hermeneutics—posed textual interpretation as a Romantic
enterprise. He believed that, more than objective communication,
each text avails of its author. To unravel the author from the text, it
is necessary to read deeply, holding in mind the various layers of
context which underlie each authorial choice, and using each context
like a colored lens to illuminate a single dimension of the text’s
kaleidoscope of meanings. In the last century, hermeneutic theorists
formalized the notion of reading-through-context in Speech Acts
theory. John Austin’s formulation of the theory (1962) distinguished
between utterance and subtext in an act of speech. When your
boss says “there’s no room in this company for mediocrity,” the
surface utterance, or locution, seems to state a moral truth that the
company is not mediocre, but the subtext, or illocutionary force, of the
utterance can mean to threaten. Jürgen Habermas has broadened
Speech Acts Theory into a Theory of Communicative Action (1981),
which re-views all social and cultural interactions as speech acts
with locutionary and illocutionary components. In my research, I
invoke Speech Acts Theory by attempting to unravel illocutions that
emanate from the homunculus of author’s viewpoint—for example,
someone telling you their favorite things on a social network profile
can help you (or a machine) to infer their taste: an individual listing
“Marilyn Manson’s Antichrist Superstar” as a favorite album
(Manson is a goth rock musician) and “Salinger’s Catcher in the Rye”
as a favorite book constitutes two factual utterances, but the
illocutionary force underlying the two utterances seems
contemptuous of social and ethical norms, and so the illocutionary
act here is potentially one of rebellion.
Since the last century, reading has been posed as multiple because each
reader’s posture results in a different meaning gleaned. In De
l’interpretation, Paul Ricoeur (1965) distinguished between a
hermeneutics of ‘retrieval’ and a hermeneutics of ‘suspicion’,
practiced chiefly by Nietzsche and Freud. Nietzsche grandfathered
existentialism because he dissociated from socially condoned
‘readings’ of life’s meaning and instead ‘read’ life and the world
through a social biologist’s lens—for example he makes a famous
argument in Beyond Good and Evil (1886) that evil is actually good
when viewing the world as a body because it undermines the false
power of society, returning humankind to the homeostasis of
entropy, that universal end. Freud (1901) was fascinated with the
unconscious mind as illocutionary force, and his methodology of
psychoanalytic reading to infer the neuroses of his patients is invoked
in this thesis. Resembling Ricoeur’s contribution, reading
psychologist Louise Rosenblatt (1978) distinguished between
‘efferent’ and ‘aesthetic’ reading. ‘Efferent’ like Ricoeur’s ‘retrieval’
reading means objective reading—reading with the modus operandi to
take away something from the text. ‘Aesthetic’ reading means to
allow the reader to live through the text—this is the evocative mode of
reading which the Aesthetiscope uses to liberate from the text
meanings which come out of the reader, not out of the author. Rosenblatt’s
‘aesthetic’ and Ricoeur’s ‘suspicious’ readings are superficially
oppositional but they are actually doppelganger ideas, as will
become clear when deep reading is computationalized in this thesis.
Computational work
Classical AI works in Story Understanding are Terry Winograd’s
SHRDLU blocks world understander (1971), Eugene Charniak’s
children’s story understander (1972), Mike Dyer’s BORIS goal-oriented
understander (1983) which manifests Schank and Abelson’s
scripts, goals, and plans construction (1977), and Wendy Lehnert’s
plot units strategy (1982) for viewing the macroscopic semantic
structure of narratives. These classical systems and theories were
overly symbolic, logical, and brittle, treated understanding as logical
theorem proving, and thus failed to work over a broad range of
natural texts.
Reading is such a high-level description of the complex cognitive
machinery surrounding the interpretation of text that only recently
have Artificial Intelligence researchers dared to term their work
“reading.” Moorman and Ram (1994) present a sophisticated model
of machine reading where their reader robot ISAAC can focus and
attend, can willfully suspend disbelief, and can use analogy to
creatively force knowledge into a current understanding framework.
Moorman and Ram lament that ISAAC does not have enough
background knowledge to perform baseless analogy, a problem that
is addressed in this thesis using ConceptNet as a ‘base’ for analogy.
Ram’s system, AQUA (Ram 1994), introduces a computational
workflow for interleaving reading with understanding—in AQUA,
reading with some understanding framework produces anomalies
which prompt questions which help to direct the explanation and
understanding process. ISAAC and AQUA both have the idea that
reading is an activity which constructs, populates, and occasionally
revises a situation model (Zwaan & Radvansky, 1998), a construct
meant to demonstrate unification of comprehension. Reading using
situation models should be considered ‘retrieval reading’ (Ricoeur),
‘efferent reading’ (Rosenblatt), and ‘objective’. This thesis poses its
computed readings as ‘aesthetic’, ‘suspicious’ and overarchingly
‘psychoanalytic’. One interesting example of metaphorical reading
that is closer to psychoanalytic reading is Srinivas Narayanan’s
KARMA system (1997). Using multiple, synchronized Petri-nets,
KARMA ‘reads’ text by simulating state trajectories in Petri-nets—
for example, the utterance “Japan’s economy stumbled” reifies in
KARMA’s walking and stumbling machinery, allowing the inference
that Japan’s economy is off-balance and the situation may not correct
immediately but will correct eventually. Along these same lines,
Erik Mueller’s ThoughtTreasure system (2000) reads-by-visualization,
creating a 2D ASCII-art rendition of a read passage,
which is a lucid representation which can be used to answer many
questions. In my view, psychoanalytic reading is also trying to
read-by-visualization because once the author can be envisaged as
occupying a location in some viewpoint space, the representation is
lucid enough to be able to answer many questions about the author
and allow authorial reactions to be predicted.
Other relevant technologies and literatures that support
psychoanalytic readings at the mechanistic level are Natural
Language Processing, Common Sense Reasoning, and Textual Affect
Sensing. The related works for these have already been given in a
previous section.
4.2 Viewpoint spaces
I pose an individual’s viewpoint as her situation within latent
semantic spaces that serve as realms of viewpoint. The idea that an
individual can only be understood by understanding the whole
culture or bundle of potentialities that surrounds her is a sociological
understanding. Relevant non-computational work includes theories
of culture’s structure and language-qua-culture’s structure;
computational work includes the modeling of latent semantic spaces.
Non-computational work
Of the three major viewpoint spaces considered in the present thesis
work, perceptual aesthetic space is a formal semiotic space because I
was inspired by Carl Jung’s dimensional model of perception
(1921)—though that model has been verified sociologically and
psychologically as it is the basis of the widely used Myers-Briggs
Type Indicator of temperament (Briggs & Myers 1976). As for
cultural identity space and opinion space, these are ethnographically
acquired by analyzing large corpora of cultures. Here, we invoke
‘culture’ to mean the collective symbolic creative product of
humanity, and not to mean a mode of superior intellect or taste. Our
interpretation is in line with the word ‘Kultur’ as invoked by
Wittgenstein and Nietzsche, and is in line with Clifford Geertz’s
interpretation. In The Interpretation of Cultures (1973), Clifford Geertz
motivated the significance of culture to the self thusly, “man is an
animal suspended in webs of significance he himself has spun, I take
culture to be those webs” (Geertz, 1973: 4-5). “Webs of significance”
is the inspiration for the semantic fabric representation of the cultural
identity space—a fabric is a super densely connected semantic web,
and ‘significance’ is embodied in the 12,000 nodes of cultural
symbols such as book authors, book titles, musical genres, etc. By
exposing the vastness of tastes captured by the cultural identity
space, I illustrate Grant McCracken’s (1997) point that our
contemporary consumer-driven world is in late capitalism, where all
tastes and identities that can be imagined are fulfilled by
consumables—this is Plato’s prediction of Plenitude. Whereas
Geertz conceived of culture around the interconnectedness of symbols,
Roland Barthes’s conceptualization of culture concerns the valences
of symbols (1964). Barthes’s approach, which he calls Semiology,
views culture as a system of signification. That is to say, words and
objects are signifiers, but culture supplies a way to map signifiers
into signifieds, or underlying meaning. For example, in Western
cultures, ‘rich’ is a signifier that maps into a positive affect, or
privilege, as Structuralists call it. The Semantic Sheet representation
of viewpoint spaces such as opinion follows Barthes’s interpretation
of culture, i.e. the space of opinion is a collection of pairs of (topic
qua signifier, affect qua signified).
Computational work
The Internet contains many resources such as weblog communities,
social networks, recipe corpora, humor corpora, and political corpora.
These resources are reflections of the cultures of the offline everyday
world, so mining them can provide us with working
models of viewpoint spaces. The topology of these latent semantic
spaces can be inferred through statistical modeling techniques such
as Latent Semantic Analysis (Deerwester et al. 1990), Support Vector
Machines (Joachims 1998), Multi-Dimensional Scaling (Kruskal &
Wish 1978), and the mathematical method of Principal Components
Analysis. Explaining culture has been the province of
ethnographers (Kluckhohn 1949), but the process is nearly identical
to linguistic modeling, as culture is a language with symbols,
meaning, syntax, sentences, and discourse. Hence in this thesis
work, I term language modeling of cultures ‘linguistic
ethnography’. My idea of linguistic ethnography is close to a
movement in the Semantic Web community called ‘emergent
semantics’ (Aberer et al. 2004) which advocates the countervailing
view that semantic ontology should be shaped from the ground-up,
a posteriori, and in accordance with the natural tendencies of the
unstructured data—such a resource is often called a folksonomy when
built by humans (e.g. dmoz.org, allrecipes.com).
4.3 Point-of-view as ‘locations’ in space
Knowing the topology and constitution of viewpoint spaces, I pose
an individual’s point-of-view as locations and situations within this
space.
Non-computational work
The idea that a self is just a particular emanation of the social and
cultural milieu is embraced by situational theorists—in their
discourse, viewpoint is called perspective and subject-position.
Experiential situationists believe that a self is formed out of its prior
experiences in the world, and these ideas originate in David Hume’s
(1748) empiricism. Memory-based reasoning (Stanfill & Waltz 1986)
and case-based reasoning (Riesbeck & Schank 1989) and
reinforcement learning (Kaelbling, Littman & Moore 1996) in
Artificial Intelligence are examples of experiential situationalism.
Social situationists believe that a self is a ‘socially and culturally
mediated construction.’ Although situationalism can be traced back
to the Sixth Century in the Indian linguistic tradition, the recent
episode of the movement finds ground in Jacques Lacan’s notion
that ‘the ego is formed out of the other’ (1957) (‘other’ meaning
environment in Lacan’s discourse), although Nietzsche’s social
biology also implied social situationalism, even existential
situationalism. Georg Simmel (Levine 1971) presaged the field of
sociology by posing an individual as only knowable through
fragmented reflections against his milieu such as his job, his church
membership, his social status, etc. In more recent work, Mihaly
Csikszentmihalyi and Eugene Rochberg-Halton (1981) studied how
significant objects in a family’s domestic setting constitute a
‘symbolic environment’, which echoes and reinforces each
individual’s identity. Narrative psychologists like Kevin Murray
(1990) suggest that cultural narratives like romantic and comedic
stories serve as materials out of which an individual constructs an
identity. Sarah Thornton’s study (1996) of underground club culture
reveals how hipsters politicize their locations within the music
‘scene’ because their location is something to be signaled to others,
and something capable of winning them social capital. Frederic
Jameson (1998) poses situationalism most bleakly by equating
language to a ‘prison-house’ and lamenting that cultural space is so
fractured that individuals are increasingly bucketed into their
idiolects and experience Marx’s alienation.
Situationalism coming from the sociological literature seems
sometimes to obscure the agency and creativity of individuals. In
the philosophical literature, it is shown that being situated in cultural
milieu is not a helpless act resembling K’s fate in Franz Kafka’s The
Trial (1925). Jean-Francois Lyotard (1984) proclaims that the end of
hegemonic ‘meta-narratives’ means that individuals can shape their
own viewpoint by selecting mini-narratives to clothe themselves in.
Jacques Derrida and Claude Levi-Strauss (Derrida 1966) conceive of
the individual as a bricoleur who makes cultural bricolage by
opportunistically choosing what ideas and positions to import into
their viewpoint space. In other words, much of postmodern philosophy
is concerned with prognosticating that individuals either will (Lyotard,
Derrida) or will not (Jameson) be able to control how the space in which
their perspective lives is constructed. Recent ethnographic examples of
empowered individuals appropriating their own situational spaces
include Certeau (1997) and Grodin and Lindlof (1996).
My viewpoint models do not reflect the full range of dynamicity and
agility that individuals are really capable of, though in describing an
individual’s location as multiple and complex rather than singular and
categorical, and in predicting reactions creatively through analogy, I
believe this thesis will support postmodern optimism.
Computational work
The computation of individuals as locations within latent semantic
spaces is connected to the user modeling literature. The most directly
applicable work on situated viewpoint is found in a non-computational
but quite computable psychological theory by H.
Montgomery. In “Towards a perspective theory of decision making
and judgment” (1994), Montgomery writes “Three determinants of
perspectives in thinking are identified: (a) the subject, i.e., subject
orientation, (b) the object, and (c) psychological distance between
subject and object” (Montgomery 1994: abstr.). Likewise, this thesis
simulates viewpoints by computing the relationship between
an individual’s location and the fodder.
The user modeling literature computes and attempts to predict
user actions and likings. It should be noted that this thesis models
individuals whereas the notion of user implies that a user model is
inherently only a narrow description of an individual behaving
within the context of a narrow application domain. The user
modeling literature knows two prevailing paradigms for
representing users – firstly using frames to model intrinsic attributes,
as in demographics and psychographics, and secondly using vector-based
statistics to model extrinsic behavior, as used in collaborative
filtering (Shardanand & Maes 1995). However, reducing a person to
but a few categorical attributes lacks specificity, while most behavior
modeling is too domain- or task-specific and the learned features do
not rise to the generality of describing an individual as he exists outside
applications. Point-of-view models developed in this thesis work are
semiotic like categorical user models, but can be reasoned with
robustly like statistical vector-based user models.
Viewpoint modeling further distances itself from typical behavior modeling
since the work is concerned with characterizing an individual out of
any application’s context.
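As a minimal illustration of the vector-based paradigm mentioned above, the following Python sketch predicts one user's liking for an unseen item from the ratings of similar users. It is a generic neighbor-based scheme using Pearson correlation, not the specific algorithm of Shardanand & Maes (1995) and not the viewpoint model of this thesis; the users, items, and the 1-7 rating scale are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation computed over the items both users have rated."""
    shared = [item for item in a if item in b]
    if len(shared) < 2:
        return 0.0
    mean_a = sum(a[i] for i in shared) / len(shared)
    mean_b = sum(b[i] for i in shared) / len(shared)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    den_a = sqrt(sum((a[i] - mean_a) ** 2 for i in shared))
    den_b = sqrt(sum((b[i] - mean_b) ** 2 for i in shared))
    return num / (den_a * den_b) if den_a and den_b else 0.0


def predict(target, others, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Only positively correlated users contribute; returns None when
    no such user has rated the item.
    """
    num = den = 0.0
    for ratings in others:
        if item not in ratings:
            continue
        weight = pearson(target, ratings)
        if weight <= 0:
            continue
        num += weight * ratings[item]
        den += weight
    return num / den if den else None


# Hypothetical toy ratings (illustrative only, not data from this work).
alice = {"jazz": 7, "punk": 2, "opera": 6}
bob = {"jazz": 6, "punk": 1, "opera": 7, "techno": 2}
carol = {"jazz": 1, "punk": 7, "opera": 2, "techno": 7}

techno_for_alice = predict(alice, [bob, carol], "techno")
```

In this toy case only bob correlates positively with alice, so the prediction for "techno" follows his rating. The sketch also shows the limitation noted above: the learned similarity is tied to one rating domain and says little about the individual outside of it.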
4.4 Simulating judgment
Viewpoint models can simulate judgments of the individual being
modeled. There is substantial background literature on how people
simulate the judgments of others, and how machines can cognitively
model reactions.
Non-computational work
The ability of individuals to mentally model the thoughts and
behaviors of other individuals is an evolutionary adaptation often
called mindreading. Two core components of mindreading recently
popular in the Cognitive Science literature are intentionality—the
ability to infer a speaker’s reference through gaze and other cues—
and theory of mind—the name of the mental faculty for modeling
other minds. Opinions on how humans implement theory-of-mind
are split between the Theory-Theory camp believing that predictive
rules are hard-coded into minds, and the Simulation-Theory camp
(Gallese & Goldman 1998) believing that other minds can be
simulated on the self, much as the same Java code can be run on
different computers. The Simulation-Theory camp was boosted by a
recent finding in neurobiology that lower primates have mirror
neurons, which fire both when an action like grasping a
banana is taken, and when the action is seen to be taken by a
conspecific (Gallese & Goldman 1998). In his Society of Mind
theory (1988), Minsky introduces the abstraction of mental critics to
explain how minds access the expertise and wisdom of other minds.
Minsky seems to ally with the Simulation-Theory camp, and even
goes further to theorize that all of our actions are proactively
simulated by mental critics in the cognitive background, who censor
our present stream-of-consciousness with advice; this is not unlike
what my What Would They Think? system tries to achieve.
Daniel Dennett poses mindreading in a manner sympathetic to
viewpoint-based simulation. In The Intentional Stance (1987), Dennett
suggests that individuals can ‘read’ the same situation differently
through different lenses of interpretation (=viewpoint-spaces?),
which he calls ‘stances’. For example, a robbery witnessed through
the ‘physical stance’ yields physical perceptions like a convenience
store opening and a man-object storming out. Witnessed through the
‘design stance’, telic and agentic aspects are illuminated and it is
seen that robbers are designed to rob stores, which are designed to
carry money. Finally, witnessed through the ‘intentional stance’, it is
noticed that the robber is a willful and rational person who robbed
this store out of some motivation or habit, and that he is running
because he is fleeing from the scene of the crime. Parallel to this are
Clifford Geertz’s notions of ‘thin description’ and ‘thick description’
(1973). Using Gilbert Ryle’s example of a narrative that recounts a
person winking, Geertz remarks that a ‘thin description’ renders the
wink as not differentiable from an unmotivated twitch in the body,
but ‘thick description’, importing context regarding cultural
motivations for winks into the narrative, allows a wink’s cultural
meaning to be seen. ‘Thick description’, then, constitutes something
like a ‘cultural stance’ in Dennett’s framework, and ‘thin
description’—dissociated from human intention and meaning—
maps to the ‘physical stance’. Stance theory, including Ron
Edwards’s Actor Stance model and Kevin Hardwick’s Narrative
Stance model, is also well developed as a practical methodology in
the Acting discipline, where the lifeblood of actors is the successful
and convincing rendition of characters. Improvisational actors in
particular must possess something resembling point-of-view models of
the characters in whose shoes they must spontaneously perform.
Computational work
Dennett’s intentional stance treats persons as agents who follow
rational principles, thus suggesting a way to simulate what an
individual might think, want or do next given a prior model. This
computation is called the Belief-Desire-Intention model (Georgeff
1998) of agency in the Intelligent Agents literature, and simulation of
actions is called ‘action selection’ (Maes 1994). Similarly, Pollack
(1992) advocates the use of plans as packets of rational actions. A
plan is guided by two principles—filtering the space of possible
actions to eliminate those which conflict with goals, and constructing
a plan to satisfy the greatest number of goals, called overloading.
Notable examples of rational simulation systems include Allen
Newell and Herbert Simon’s (1963) General Problem Solver, and
Newell’s SOAR (1990) cognitive architecture. When some prior
symbolic knowledge in the form of expert rules, knowledge captures
of prior states, goals, or scripts is infused, rational simulation is
called Case-based Reasoning (Riesbeck & Schank 1989) or
Memory-based Reasoning (Stanfill & Waltz 1986).
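The two planning principles described above, filtering and overloading, can be sketched in a few lines of Python. This is only a loose reading of the principles as summarized in this section, not Pollack's formulation: the `Action` type, its `satisfies` and `conflicts` fields, the greedy selection strategy, and the example goals are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical action: the goals it satisfies and the goals it conflicts with."""
    name: str
    satisfies: frozenset = frozenset()
    conflicts: frozenset = frozenset()


def filter_actions(actions, goals):
    """Filtering: drop every action that conflicts with any current goal."""
    return [a for a in actions if not (a.conflicts & goals)]


def overload(actions, goals):
    """Overloading (greedy approximation): repeatedly pick the admissible
    action that covers the most still-unsatisfied goals."""
    remaining = set(goals)
    candidates = filter_actions(actions, frozenset(goals))
    plan = []
    while remaining and candidates:
        best = max(candidates, key=lambda a: len(a.satisfies & remaining))
        if not (best.satisfies & remaining):
            break  # no admissible action makes further progress
        plan.append(best)
        remaining -= best.satisfies
    return plan, remaining


goals = {"buy milk", "mail letter", "save money"}
actions = [
    Action("walk to corner store", frozenset({"buy milk", "mail letter"})),
    Action("drive to supermarket", frozenset({"buy milk"})),
    Action("take taxi", frozenset({"buy milk", "mail letter"}),
           frozenset({"save money"})),
]

plan, unmet = overload(actions, goals)
```

Here the taxi is filtered out because it conflicts with the goal of saving money, and the walk is preferred because a single action serves two goals at once. The goal of saving money stays in `unmet` because no action in this toy set satisfies it.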
Outside of rational simulation are baby machines, behavior-based
models (closely related to behavior-based user modeling),
cognitive-affective architectures, and knowledge-based models.
Gary Drescher’s baby machine (1991) tried to reach the holy grail of
unifying behaviorism and symbolicism by learning abstractions out
of behavior through marginal attribution. Rodney Brooks (1991)
advocated a ‘reactive’ approach to intelligence, which relies
completely on reinforcement learning of hidden Markov models to
drive action. Marvin Minsky (forthcoming), Push Singh (2005), and
Aaron Sloman (1981) have proposed three-tier cognitive-affective
heterogeneous architectures for simulating minds. Cyc (Lenat 1995),
ThoughtTreasure (Mueller 2000) and ConceptNet (Liu & Singh
2004b) represent a large-scale knowledge approach to simulating
thought, guided by a topology of thinkables. This thesis work
practices a knowledge-based and associative approach to simulation.
Knowledge is not acquired formally; instead, statistical and
reinforcement learning are used to model the ‘knowledge’ of viewpoint
spaces. Viewpoints are simulated through associative approaches
like spreading-activation (Collins & Loftus 1975) and
knowledge-based approaches like analogical reasoning (Gentner 1983;
Fauconnier & Turner 2002).
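To make the associative approach concrete, the following Python sketch shows one simple form of spreading activation over a weighted concept graph. It is an illustration under stated assumptions rather than the implementation used in this work: the graph, the edge weights, and the `decay`, `threshold`, and `max_hops` parameters are all hypothetical.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_hops=3):
    """Propagate activation energy outward from seed nodes.

    graph: dict mapping node -> {neighbor: association weight in [0, 1]}
    seeds: dict mapping node -> initial activation energy
    Energy passed along an edge is energy * weight * decay; amounts below
    `threshold` are dropped, and propagation stops after `max_hops` hops.
    """
    activation = defaultdict(float)
    for node, energy in seeds.items():
        activation[node] += energy
    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed >= threshold:
                    next_frontier[neighbor] += passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical fragment of a viewpoint space (weights are made up).
graph = {
    "sunset": {"beauty": 0.8, "evening": 0.6},
    "beauty": {"painting": 0.5},
    "evening": {"dinner": 0.5},
}

scores = spread_activation(graph, {"sunset": 1.0})
```

Concepts closer to the seed and linked by stronger associations end up with higher scores, so "beauty" is activated more strongly than "painting". The decay and threshold keep activation from spreading indefinitely through a large graph.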
4.5 Interactive viewpoint artifacts
Viewpoint models captured in interactive artifacts afford tools
for self-reflection, learning about others, and inspiration. The design
of interactive artifacts is informed by knowledge of Interaction
Design and Human-Computer Interaction, and builds upon previous
computational work in Interaction and Software agents, and
Responsive Environments.
Non-computational work
Born from the Bauhaus and Ulm Schools, Interaction Design
theorizes the psychological, cognitive, and social ramifications of
design. The design of interactive viewpoint artifacts must be
informed by the various politics of what makes an artifact useable,
useful, and inspiring. A unique design challenge for this work is
that these artifacts rest somewhere between the categories of
“humanistic agent” and “tool.” The human metaphor must be
sustained since viewpoint is primarily a human competency and is
only fully appreciable as such. But the tool metaphor prescribes that
the boundaries and capabilities of the interface are made clear and
transparent to ensure that the artifacts are useable in a predictable
manner. Byron Reeves and Clifford Nass (1996) advocate that
computer-human interaction be illuminated by understanding
human-human communicative contracts.
Don Norman (1989) explains the importance of a tool’s aesthetics and emotional
evocations to its usability. Because interacting with someone’s or the
self’s viewpoint is by its nature so provocative and engaging, it
allows ample opportunity for constructivist ‘tinkering’ (Papert &
Harel 1991), ludic or playful activity (Gaver 2001), and allows
users to explore unusual values or avenues which Anthony Dunne
(1999) calls ‘value fictions’.
Computational work
“Software agents” (Maes 1994) are computed embodiments of
stereotyped human capabilities, and Pattie Maes explored how they
could interactively support human choices such as music selection or
browsing the Web, and augment human intelligence (I.A., not A.I.).
Bradley Rhodes (Rhodes & Maes 2000) and Henry Lieberman (1997)
describe how interaction agents could observe user actions such as typing
or browsing, and serendipitously and proactively give advice or
suggestions. Another line of computational work in Responsive and
Reflective Environments (Krueger 1983) investigates how interfaces
such as Identity Mirror can engage an individual to ‘perform’ self-reflection.
5 Timeline
I plan to finish refactoring technical implementations and complete
all evaluations by the end of January 2006, and the thesis and
defence by the end of the spring term 2006.
6 Resources
No additional resources are required, beyond typical access to Media
Lab resources, and opportunities to travel to meet with non-local
readers.
References
Aberer, K. et al.: 2004, Emergent semantics. Proc. of 9th International Conference on
Database Systems for Advanced Applications (DASFAA 2004), LNCS 2973, 25-38
Heidelberg.
Austin, J. L.: 1962, How to Do Things with Words. Cambridge (MA): Harvard UP.
Bakhtin, M.: 1935, The Dialogic Imagination, Austin, TX: University of Texas Press.
Barthes, R.: 1964/1967, Elements of Semiology. (Translated by Annette Lavers & Colin
Smith). London: Jonathan Cape.
Bradley, M.M., & P.J. Lang: 1999, Affective norms for English words (ANEW):
Instruction manual and affective ratings. Technical Report C-1, The Center for
Research in Psychophysiology, University of Florida.
Briggs, K. C., & I. B. Myers: 1976, Myers-Briggs Type Indicator: Form F, Palo Alto:
Consulting Psychologists Press
Brooks, R.: 1991, Intelligence Without Representation, Artificial Intelligence Journal (47):
139-159
Certeau, M. de: 1997, Culture in the Plural. Ed. and intro. Luce Giard. Trans. and
afterword Tom Conley. Minneapolis: U of Minnesota Press.
Charniak, E.: 1972, Toward a Model of Children Story Comprehension. MIT School of
EECS PhD Thesis.
Collins, A. M., and E. F. Loftus: 1975, A spreading-activation theory of semantic
processing, Psychological Review 82: 407-428.
Csikszentmihalyi, M., E. Rochberg-Halton: 1981, The Meaning of Things: Domestic
Symbols and the Self, Cambridge University Press.
Deerwester, S., S. T. Dumais, G. W. Furnas, T. K. Landauer, R. Harshman: 1990,
Indexing by Latent Semantic Analysis, Journal of the Society for Information Science,
41(6), 391-407.
Deleuze, G. and F. Guattari: 1987, A Thousand Plateaus: Capitalism and Schizophrenia,
Brian Massumi (transl.), Minneapolis: The University of Minnesota Press.
Dennett, D.: 1987, The Intentional Stance, Bradford Books.
Derrida, J.: 1966/1978, Structure, Sign, and Play in the Discourse of the Human
Sciences, in Writing and Difference, trans. Alan Bass: 278-294. London: Routledge.
Doyle, J.: 1980/1987, A truth maintenance system, in Reading in nonmonotonic
reasoning: 259-279, Morgan Kaufmann.
Drescher, G.: 1991, Made-Up Minds: A Constructivist Approach to Artificial Intelligence.
MIT Press.
Dunne, A.: 1999, Hertzian tales: Electronic products, aesthetic experience and critical design.
London, RCACRD Research Publications.
Dyer, M.G.: 1983, In-depth understanding. Cambridge, Mass.: MIT Press.
Fauconnier, G. and M. Turner: 2002, The Way We Think: Conceptual Blending and the
Mind’s Hidden Complexities, Basic Books.
Freud, S.: 1901, The psychopathology of everyday life.
Freud, S.: 1905/1963, Jokes and their relation to the unconscious, WW Norton.
Gallese, V. and A. Goldman: 1998, Mirror neurons and the simulation theory of
mind-reading, Trends in Cognitive Sciences 2(12): 493-501
Gärdenfors, P.: 2000, Conceptual Spaces, MIT Press.
Gärdenfors, P. and K. Holmqvist: 1994, Concept formation in dimensional spaces,
Lund University Cognitive Studies 26.
Gaver, W.: 2001, Designing for Ludic Aspects of Everyday Life, ERCIM News No. 47.
Geertz, C.: 1973, The interpretation of cultures. New York: Basic.
Georgeff, M.P. et al.: 1998, The Belief-Desire-Intention Model of Agency. In N. Jenning,
J. Muller, and M. Wooldridge (eds.), Intelligent Agents V. Springer.
Gentner, D.: 1983, Structure-mapping: A theoretical framework for analogy, Cognitive
Science 7: 155-170.
Goffman, E.: 1959, The Presentation of Self in Everyday Life. Garden City, NY:
Doubleday.
Goldstone, R. L. and B.J. Rogosky: 2002, Using relations within conceptual systems to
translate across conceptual systems, Cognition 84: 295–320
Grodin, D. and T. Lindlof (eds.): 1996, Constructing the Self in a Mediated World,
Thousand Oaks, CA: Sage.
Habermas, J.: 1981/1985, The Theory of Communicative Action, Volume 1, Beacon Press.
Hegel, G. W. F.: 1807, Phänomenologie des Geistes.
Hume, D.: 1748, An Enquiry Concerning Human Understanding.
Jameson, F.: 1998, Postmodernism and Consumer Society, in: The Cultural Turn:
Selected Writings on the Postmodern 1983-1998. Verso.
Joachims, T.: 1998, Text Categorization with Support Vector Machines: Learning with
Many Relevant Features, Proceedings of ECML-98, 10th European Conference on
Machine Learning, edited by Claire Nédellec and Céline Rouveirol: 137-142,
Springer-Verlag.
Johannesson, M.: 1996, Obtaining Psychological spaces with MDS - a pilot study with
perceptual stimuli, Lund University Cognitive Studies 45.
John, O. P.: 1990, The “Big Five” factor taxonomy: Dimensions of personality in the
natural language and questionnaires, Handbook of personality: Theory and research,
Guilford Press.
Jung, C. G.: 1921/1971, Psychological Types, trans. by H. G. Baynes, Princeton, NJ:
Princeton University Press.
Kaelbling, L.P., L.M. Littman and A.W. Moore: 1996, Reinforcement learning: a
survey, Journal of Artificial Intelligence Research, 4: 237-285
Kafka, F.: 1925/1999, The Trial, Schocken.
Kluckhohn, C.: 1949, Mirror for Man, McGraw-Hill Book Co.
Krueger, M.W.: 1983, Artificial Reality, Addison-Wesley.
Kruskal, J. B., and M. Wish: 1978, Multidimensional Scaling, Sage University Paper
series on Quantitative Application in the Social Sciences, 07-011. Beverly Hills
and London: Sage Publications.
Lacan, J.: 1957, The agency of the letter in the unconscious, or reason since Freud, La
Psychoanalyse 3: 47-81.
Lakoff, G. and M. Johnson: 1980, Metaphors We Live by, University of Chicago Press.
Lehnert, W.G.: 1982, Plot units: A narrative summarization strategy. In W. G. Lehnert
& M. H. Ringle (Eds.). Strategies for natural language processing. Hillsdale, NJ:
Lawrence Erlbaum Associates.
Lenat, D.: 1995, Cyc: a large-scale investment in knowledge infrastructure.
Communications of the ACM 38(11). ACM Press.
Levenshtein, V.: 1965/1966, Binary codes capable of correcting deletions, insertions,
and reversals, Doklady Akademii Nauk SSSR, 163(4):845-848, 1965 (Russian).
English translation in Soviet Physics Doklady, 10(8):707-710.
Levine, D.N. (ed.): 1971, On Individuality and Social Forms: Selected Writings of Georg
Simmel, University of Chicago Press.
Lieberman, H.: 1997, Autonomous Interface Agents, Proceedings of CHI 1997.
Liu, H.: 2002, MontyLingua: Commonsense-Enriched NLP, Toolkit and API. Accessed
at: http://web.media.mit.edu/hugo/montylingua/
Liu, H.: 2003, Unpacking meaning from words: a context-centered approach to
computational lexicon design, Blackburn et al. (Eds.): Modeling and Using Context,
LNCS 2680: 218-232, Springer.
Liu, H., G. Davenport: 2005, Self-reflexive performance: dancing with the computed
audience of culture, International Journal of Performance Arts and Digital Media
1(3), Intellect Ltd.
Liu, H., M. Hockenberry & T. Selker: 2005, Synesthetic Recipes: foraging for food with
the family, in taste-space, Proceedings of the 32nd Annual Conference on Computer
Graphics and Interactive Techniques, Los Angeles, CA.
Liu, H., H. Lieberman and T. Selker: 2003, A model of textual affect sensing using
real-world knowledge, Proceedings of the 7th International Conference on Intelligent User
Interfaces, IUI 2003, 125-132, ACM Press.
Liu, H. and P. Maes: 2004, What Would They Think? A Computational Model of
Attitudes. Proc. of the 2004 ACM Conference on Intelligent User Interfaces, 38-45.
ACM Press.
Liu, H. and P. Maes: 2005a, InterestMap: harvesting social network profiles for
recommendations, Proceedings of ACM Beyond Personalization 2005: A Workshop on
the Next Stage of Recommender Systems Research, 54-59, ACM Press.
Liu, H. and P. Maes: 2005b, The Aesthetiscope: Visualizing Aesthetic Readings of Text
in Color Space. Proceedings of FLAIRS2005: 74-79, AAAI Press.
Liu, H., P. Maes, G. Davenport: 2006, Unraveling the taste fabric of social networks,
International Journal on Semantic Web and Information Systems 2(1), Hershey, PA:
Idea Academic Publishers.
Liu, H. and E. T. Mueller: forthcoming, Character Affect Dynamics Analysis.
Liu, H. and P. Singh: 2004a, Commonsense reasoning in and over natural language, M
Negoita, RJ Howlett, LC Jain (Eds.): Knowledge-Based Intelligent Information and
Engineering Systems, LNCS 3215:293-306, Springer.
Liu, H. and P. Singh: 2004b, ConceptNet: A Practical Commonsense Reasoning
Toolkit. BT Technology Journal 22(4), 211-226. Kluwer Academic Publishers.
Lyotard, J-F: 1984, The Postmodern Condition: A Report on Knowledge, Minneapolis:
University of Minnesota Press.
McCracken, G.: 1997, Plenitude. Toronto: Periph: Fluide.
Maes, P.: 1994, Modeling Adaptive Autonomous Agents, Artificial Life Journal, C.
Langton, ed., Vol. 1, No. 1 & 2, MIT Press
Miller, G. A., R. Beckwith, C. Fellbaum, D. Gross, & K. Miller: 1990, Five papers on
WordNet, CSL Report 43, Cognitive Science Laboratory, Princeton University.
Minsky, M.: 1988, Society of Mind, Simon and Schuster.
Minsky, M.: 1990, Logical vs. Analogical or Symbolic vs. Connectionist or Neat vs.
Scruffy, Artificial Intelligence at MIT., Expanding Frontiers, Patrick H. Winston
(Ed.), Vol 1, MIT Press.
Minsky, M.: 1992, Future of AI Technology, Toshiba Review 47(7).
Minsky, M.: forthcoming, The Emotion Machine, Pantheon.
Montgomery, H.: 1994, Towards a perspective theory of decision making and
judgment, Acta Psychol (Amst) 87(2-3):155-78.
Moorman, K. and A. Ram: 1994, Integrating creativity and reading: A functional
approach. In Sixteenth Annual Conference of the Cognitive Science Society.
Mueller, E. T.: 2000, ThoughtTreasure: A natural language/commonsense platform.
Accessed on 11 November 2005 from http://www.signiform.com/tt/
Murray, K.: 1990, Life as fiction, Ph.D. Dissertation, Department of Psychology,
University of Melbourne
Narayanan, S.S.: 1997, Knowledge-based action representations for metaphor and
aspect (KARMA) (Unpublished doctoral dissertation). University of California,
Berkeley
Newell, A.: 1990, Unified Theories of Cognition, Cambridge, MA: Harvard University
Press.
Newell, A. and H. A. Simon: 1963, GPS, a program that simulates human thought. In
E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought: 279-293. New York:
McGraw-Hill.
Nietzsche, F.: 1886, Jenseits von Gut und Böse (Beyond Good and Evil).
Norman, D.: 1989, The Design of Everyday Things, Currency Doubleday.
Ortony, A., G.L. Clore, M.A. Foss: 1987, The Referential Structure of the Affective
Lexicon, Cognitive Science 11(3): 341-364.
Ortony, A., G. L. Clore, A. Collins: 1988, Cognitive Structure of Emotions, Cambridge
University Press.
Papert, S., and I. Harel: 1991, Situating constructionism, in Constructionism, Ablex
Publishing Corp.
Pollack, M.: 1992, The uses of plans, AI Journal:57
Ram, A.: 1994, AQUA: Questions that drive the explanation process. In Roger C.
Schank, Alex Kass, & Christopher K. Riesbeck (Eds.), Inside case-based explanation:
207-261, Hillsdale, NJ: Erlbaum.
Reeves, B. and C. Nass: 1996, The media equation: How people treat computers, television
and new media like real people and places, CSLI / Cambridge University Press.
Rhodes, B. and P. Maes: 2000, Just-in-time information retrieval agents, IBM Systems
Journal 39(4): 685-704.
Ricoeur, P.: 1965/1970, Freud and Philosophy: An Essay on Interpretation, transl. Denis
Savage, New Haven: Yale University Press.
Riesbeck, C.K. and R.C. Schank: 1989, Inside Case-Based Reasoning. Lawrence Erlbaum
Associates, Hillsdale.
Roget, P.: 1911, Roget’s Thesaurus of English Words and Phrases. Retrieved from
gutenberg.net/etext/10681
Rosenblatt, L.: 1978, Efferent and Aesthetic Reading, The Reader, The Text, The Poem: A
Transactional Theory of the Literary Work: 22-47, Carbondale: Southern Illinois UP.
Schank, R. C. and R. P. Abelson: 1977, Script, plans, goals, and understanding. Hillsdale,
NJ: Erlbaum.
Schleiermacher, F.: 1809/1998, General Hermeneutics, in A. Bowie (Ed.) Schleiermacher:
Hermeneutics and Criticism: 227-268. Cambridge University Press.
Shardanand, U. and P. Maes: 1995, Social information filtering: Algorithms for
automating `word of mouth'. Proceedings of the ACM SIGCHI Conference on Human
Factors in Computing Systems: 210-217.
Singh, P.: 2003, Reaching for dexterous manipulation. EECS Area Exam Paper.
Accessed 12 November 2005 at http://web.media.mit.edu/~push/Reaching.pdf.
Singh, P.: 2005, EM-ONE: An Architecture for Reflective Commonsense Thinking, MIT
Media Lab PhD Thesis.
Singh, P., B. Barry, H. Liu: 2004, Teaching machines about everyday life, BT Technology
Journal 22(4), 227-240, Kluwer Academic Publishers.
Singh, P., T. Lin, E. T. Mueller, G. Lim, T. Perkins, W. L. Zhu: 2002, Open Mind
Common Sense: knowledge acquisition from the general public. Proceedings of
ODBASE’2002.
Sloman, A.: 1981, Why robots will have emotions. Proceedings of the Seventh
International Joint Conference on Artificial Intelligence.
Stanfill, C. and D. Waltz: 1986, Toward memory-based reasoning, Communications of
the ACM 29(12): 1213-1228.
Thornton, S.: 1996, Club Cultures: Music, Media and Subcultural Capital, Wesleyan
University Press
Winograd, T.: 1971, Procedures as a Representation for Data in a Computer Program
for Understanding Natural Language. MIT AI Technical Report 235.
Zwaan, R.A. and G.A. Radvansky: 1998, Situation models in language comprehension
and memory. Psychological Bulletin, 123(2): 162-185