Aug. 22nd, 2003
Steps towards Integrated
Intelligence
Naoyuki OKADA
(Professor Emeritus)
Kyushu Institute of Technology
Progress
Step 1 Conceptual taxonomy of
vocabulary
Step 2 Natural language understanding
of moving picture patterns
Step 3 Emotion processing vs
knowledge processing
Step 4 Integrated intelligence
1.Introduction
• The history of research in Artificial Intelligence (AI), like that of
other research fields, is a repetition of diversification and
specialization of its subfields.
At the beginning of the 1960s
• Natural language processing
• Pattern recognition
• Learning
• Problem solving
At the beginning of the 2000s
•Fundamentals/Theory
Knowledge representation, reasoning, algorithm,
fuzzy theory, ---
•Learning/Discovery
Inductive/deductive learning, example-based reasoning,
data-mining, ---
•Infrastructure of knowledge
Knowledge acquisition, knowledge base, Web
search, ---
•AI architecture/language
•Agent/Distributed AI
Problem solving by collaboration, agent society, ---
•Life/Brain system
Artificial life, genetic algorithm, connectionism, ---
•Natural language
Natural language understanding, dialog
processing, corpus, speech recognition, ---
•Pattern understanding
Image recognition, scene analysis, image sequence
processing, ---
•Cognition/Body
Intelligent robot, symbol grounding, cognitive
psychology, ---
• However, too much diversification
and specialization weakens the study
of the relations among subfields.
Those relations are important above
all in human intelligence.
• So we should sometimes stop, look
back, put the various results in
order, and integrate them into a
system.
Approach towards integration
• Multi-modal
Human intelligence accepts
multi-modal inputs.
- Natural language in letters/voices
- Picture patterns
• Intellect and sensitivity
Knowledge and emotions are like
the two wheels of a cart.
2.Conceptual taxonomy
of vocabulary

Language is “the window” of the mind.
The semantic contents of language, that is,
the system of concepts, are the most
important objects in clarifying
intelligence.

Research in Early years
C.J.Fillmore ’68 Case grammar
M.R.Quillian ’68 Semantic network
R.Schank ’72 Conceptual dependency
Y.Wilks ’75 Preference semantics
Conceptual analysis

Categories of concepts
Concepts are formed for everything in nature.
- There are five categories from the
linguistic viewpoint: substance,
attribute, event, space/time, and
miscellaneous
- But each category is vague.

Computational definition
- What is substance?
An individual whose quantity and quality
can be recognized by sensors
Fig. 2・1 Substance sensed by eyes---Mountain
- What is state?
Fundamentally, a static relation among
several substances
Fig. 2・2 State---Man in the car
- What is attribute?
A special case of state. Fundamentally, the
difference between an object and a standard
Fig. 2・3 Object-standard pair---The mountain is higher than the tree.
- A measure is necessary for the
detection of the difference
Measure: Height (length in the
perpendicular direction)
This measure brings an attribute to
the object.
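The object-standard idea above can be sketched in a few lines. This is only an illustration under assumptions: the class `Substance`, the function names, and the numeric heights are hypothetical and do not come from the original system.

```python
from dataclasses import dataclass


@dataclass
class Substance:
    name: str
    height_m: float  # hypothetical sensed value, for illustration only


def height(s: Substance) -> float:
    """Measure: length in the perpendicular direction."""
    return s.height_m


def attribute(obj: Substance, std: Substance, measure) -> str:
    """Derive an attribute of `obj` from its difference to the standard `std`."""
    diff = measure(obj) - measure(std)
    if diff > 0:
        return "higher"
    if diff < 0:
        return "lower"
    return "equal"


mountain = Substance("mountain", 800.0)
tree = Substance("tree", 10.0)
label = attribute(mountain, tree, height)  # the mountain relative to the tree
```

The measure is passed in as a function, so another measure (for example weight) would give another attribute for the same object-standard pair.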
- What is event?
Fundamentally, a change from a before-state
to an after-state
Fig. 2・4 Before-after state pair---A man gets out of a car
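A minimal sketch of the before-after idea, assuming a state is written as a set of (relation, substance, substance) triples. The triple format and the function name `change` are assumptions made for illustration.

```python
def change(before: frozenset, after: frozenset) -> dict:
    """Describe an event candidate as the relations lost and gained
    between a before-state and an after-state."""
    return {"removed": before - after, "added": after - before}


# Hypothetical encoding of "A man gets out of a car".
before_state = frozenset({("inside", "man", "car")})
after_state = frozenset({("outside", "man", "car")})

delta = change(before_state, after_state)
is_event = bool(delta["removed"] or delta["added"])  # no change means no event
```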
- What is space and time?
Fundamentally, they identify the location of a
substance, attribute or event.
Space: position
Time: passage
Primitive and complex
- Primitive
A concept which cannot be decomposed
any further (by referring to its word)
- Complex
A concept which can be decomposed
into one or more primitives

Formation of complex concept
- Compound
Type A: Two primitives are connected
with a logical/syntactic relation.
Type B: Primitives are connected with
a scenario
- Derivative
Derived from a primitive
Conceptual classification
Why classification?
- Verification of the proposed theory
- Acquisition of conceptual data for
machine processing


Target vocabulary
- About 32,000 words used in everyday
language
Results of classification

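Fig. 2・5 Network of substance concepts (example: 川, river)
Links from 川 (river): whole/part (N11), attribute (N12), event (N13)
Event nodes: flow, stagnate
Attribute nodes: 深い (deep), 浅い (shallow)
Substance nodes: 上流 (upper stream), 中流 (midstream), 下流 (lower course),
本流 (main stream), 支流 (tributary), 川底 (riverbed), 川面 (river surface),
水源 (source), 河口 (river mouth), 淀み (pool), 瀞 (still pool), 瀬 (shallows),
激流 (torrent), 急流 (rapids), 清流 (clear stream), 小川 (brook), 大川 (big river)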
Table 2・1 Primitives of attribute/event
No.    Subcategory (attribute/event)            Examples (attribute/event)
0・00  Spirit/change_in_spirit                  Glad/get angry
0・01  Sense/change_in_sense                    Cold/hurt
1・00  Location/change_in_location              Deep/fall
1・01  Direction/change_in_direction            Diagonal/turn over
1・02  Shape/change_in_shape                    Sharp/bend
1・03  Quality/change_in_quality                Soft/rot
1・04  Quantity/change_in_quantity              Many/decrease
1・05  Light/change_in_light                    Dark/flash
1・06  Color/change_in_color                    Red/color
1・07  Heat/change_in_heat                      Hot/cool
1・08  Force・power/change_in_force・power      Strong/strengthen
1・09  Sound/change_in_sound                    Noisy/sing
1・10  Appearance・disappearance                Bare/appear
1・11  Start・finish                            Sudden/begin
1・12  Time/change_in_time                      Quick/pass
2・00  Continuation                             Constant/continue
2・01  State                                    Fine/tower
3・00  Abstract                                 Equivalent/fit
4・00  Others                                   Eat
Table 2・2 Case-frame of events
Type               Example
v(sbj)             Fall(leaf)
v(sbj,org)         Come(smoke, chimney)
v(sbj,goal)        Go(Taro, post office)
v(sbj,ptn)         Collide(truck, bus)
v(sbj,std)         Resemble(children, parents)
v(sbj,obj)         Break(boy, cup)
v(sbj,obj,org)     Unload(driver, box, truck)
v(sbj,obj,goal)    Put(girl, candy, pocket)
v(sbj,obj,inst)    Scoop(Hanako, sugar, spoon)
v(sbj,obj,att)     Feel(Jiro, breeze, cool)
Others
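The case-frames of Table 2・2 can be held in a small lookup table. The sketch below is illustrative: the dictionary `CASE_FRAMES` and the function `bind` are hypothetical names, and only the role labels and example verbs are taken from the table.

```python
# Verb -> ordered case roles, copied from the examples of Table 2.2.
CASE_FRAMES = {
    "fall": ("sbj",),
    "come": ("sbj", "org"),
    "go": ("sbj", "goal"),
    "collide": ("sbj", "ptn"),
    "resemble": ("sbj", "std"),
    "break": ("sbj", "obj"),
    "unload": ("sbj", "obj", "org"),
    "put": ("sbj", "obj", "goal"),
    "scoop": ("sbj", "obj", "inst"),
    "feel": ("sbj", "obj", "att"),
}


def bind(verb: str, *fillers: str) -> dict:
    """Attach fillers to the case roles of `verb`, in table order."""
    roles = CASE_FRAMES[verb]
    if len(fillers) != len(roles):
        raise ValueError(f"{verb} expects {len(roles)} fillers, got {len(fillers)}")
    return dict(zip(roles, fillers))


frame = bind("put", "girl", "candy", "pocket")
```

A frame filled this way keeps the role of each noun explicit, which is what sentence generation from a recognized event needs.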

Number of classified concepts
Substance     4,200
Attribute     2,060
Event         3,720
Space/time    1,800
-------------
Total        11,780
Evaluation


• Our theory covers 70% of the
target vocabulary, and almost all
of it if slightly extended.
• Fundamental data on concepts was
obtained, which later contributed to the
construction of the EDR concept
dictionaries.
Main publications
1973 N.Okada &T.Tamati:Analysis and
Classification of Simple Matter Concepts
for the Interpretation of Natural
Language and Picture Patterns, IECE
Trans, Vol.56D,No.9, pp.523-530.
1980 N.Okada: Conceptual Taxonomy of
Japanese Verbs for Understanding
Natural Language, Proc. COLING'80,
pp.127-135.
3.Natural language understanding
of moving picture patterns
• Pioneer: R.A.Kirsch
- Kirsch proposed integrated
processing of natural language and
picture patterns through a common
representation of their meanings
[Kirsch64].
- But he processed only static
picture patterns.
• Approaches to moving picture
patterns in early years
N.Badler ’75
Temporal scene analysis
Sentence generation from the results
of temporal scene analysis
Minsky ’75
Frame theory
Universal data structure, particularly
for the representation of events
• Our approach
- Input
Sequential pictures, each of which
is a hand-drawn line drawing
- Meanings captured
Events of change_in_location,
which are the largest in number
- Output
Japanese and English sentences
• Flow of processing
Steps shown between start and end:
- Picture reading
- Noise cleaning
- Primitive picture recognition
- Reasoning of occurring events
- Structural analysis among primitives
- Understanding events
- Sentence generation
Direction labels: Bottom up, Top down
Fig. 3・1 Natural language understanding of picture sequences
Bottom up process
• Picture reading
A TV camera follows a line
segment by octagonal scanning.
• Primitive picture recognition
An input line drawing with graph
structure is matched with a
template in a manner similar to wave propagation.
Fig. 3・2 Reading a line segment: (a) Octagonal scanning, (b) Line following
Fig. 3・3 Wave-propagation pattern matching (WPPM):
(a) Input drawing (nodes P1 to P5), (b) Selected template (nodes q1 to q5)
Top down process
- Context and focus of attention
Not everything in a picture is
necessarily recognized in a
given context; instead, some objects
of attention are focused on.
- Attentional rules
1. Objects related to a goal in the
execution of a plan
2. Dangerous objects
3. Favorite things
4. Sudden, big change_in_location/
change_in_shape
--------
Fig. 3・4 Reasoning network of change_in_location
Event nodes: move, descend, walk, go, touch, collide, get out, get off
(intermediate nodes V1, V2, V3)
Conditions of move: ((S, thing), time passage, existence(S))
Conditions added along the links:
+((OX, thing), existence(OX))
+movement_perpendicularly(S)
+(S, movable_by_oneself)
+movement_downward(S, (T0, T1))
+((S, legs), walking_figure(S))
+((S, direction), go_forward(S), come_close(S, OT))
+touch(S, OT, T1)
+((OF, inside), inside(S, OF, T0), outside(S, OF, T1))
+short_time
+(OF, vehicle)
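A reasoning network of this kind can be sketched as a table in which each event names a more general event and the conditions it adds. The sketch below is an assumption-laden illustration: the parent links (get out and touch under move, get off under get out, collide under touch) are our reading of the figure, and the fact keys such as `inside_t0` are hypothetical names.

```python
# event -> (more general event or None, condition added on top of it)
NETWORK = {
    "move": (None, lambda f: f.get("exists") and f.get("time_passes")),
    "get_out": ("move", lambda f: f.get("inside_t0") and f.get("outside_t1")),
    "get_off": ("get_out", lambda f: f.get("of_is_vehicle")),
    "touch": ("move", lambda f: f.get("touch_t1")),
    "collide": ("touch", lambda f: f.get("short_time")),
}


def holds(event: str, facts: dict) -> bool:
    """An event holds if its own condition and all inherited ones hold."""
    parent, condition = NETWORK[event]
    if parent is not None and not holds(parent, facts):
        return False
    return bool(condition(facts))


def infer(facts: dict) -> list:
    """Return every event, from general to specific, supported by the facts."""
    return [event for event in NETWORK if holds(event, facts)]


facts = {"exists": True, "time_passes": True,
         "inside_t0": True, "outside_t1": True, "of_is_vehicle": True}
events = infer(facts)
```

Because every specific event inherits the conditions of the general one, a single observation supports several descriptions at once, from the general "move" to the specific "get off".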

• Structural analysis
- Technologies
* Numerical computation
* Logical computation
* Gestalt processing
* Template matching
- Logical computation
Fig. 3・5 Boolean judgment of “inside/outside” (regions Si and Sj)
- Gestalt processing
Metzger’s rule
Continuation: two line segments
meeting at an angle of 180°
Enclosure: a domain enclosed by
contours
Fig. 3・6 Symbol processing: (a) Complex pictures, (b) Template
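The continuation rule (two line segments meeting at 180°) can be checked numerically. This is a sketch, not the original implementation: the function names are hypothetical, and the tolerance `tol_deg` is an assumption added because hand-drawn lines rarely meet at exactly 180°.

```python
import math


def meeting_angle_deg(p, joint, q) -> float:
    """Angle in degrees at `joint` between the segments joint-p and joint-q."""
    ax, ay = p[0] - joint[0], p[1] - joint[1]
    bx, by = q[0] - joint[0], q[1] - joint[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return abs(math.degrees(math.atan2(cross, dot)))


def is_continuation(p, joint, q, tol_deg: float = 10.0) -> bool:
    """True if the two segments meet at roughly 180 degrees."""
    return abs(meeting_angle_deg(p, joint, q) - 180.0) <= tol_deg


straight = is_continuation((0, 0), (1, 0), (2, 0))  # collinear segments
corner = is_continuation((0, 0), (1, 0), (1, 1))    # right-angle corner
```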
Experiments
• Reading and recognition of hand-drawn
line drawings
• Structural analysis of static
pictures
• Natural language understanding
(NLU) of before-after state pairs
• NLU of picture sequences

Generated sentences:
1) A man(4) moves(1).
2) A man(4) passes(1).
3) A man(4) walks.
4) A man(4) goes forward(1).
5) A man(4) goes out(1) of a house.
6) A man(4) heads for a car.
7) A man(4) goes to(1) a car.
8) A man(4) comes to(1) a car.
9) A man(4) gets near a car.
-----
Fig. 3・7 NLU of a before-after state pair: (a) Before-state, (b) After-state
t=t0 A man(4) is in a house.
----------------
t=t1 A man(4) goes out(1) of a house.
----------------
t=t2 A man(4) gets on(1) a car.
----------------
t=t3 A car runs(2).
----------------
t=t4 A car collides with a tree.
----------------
t=t5 A bird(1) leaves(1) a tree.
A man(3) get off (1) a car.
----------------
Fig. 3・8 NLU of a picture sequence (frames t=t0 to t=t5)
Evaluation

• Reading and recognition
About 150 primitive pictures
were input; 88% of them
were correctly recognized, and
95% could be recognized
with some improvement.
• Structural analysis of before-after state pairs
Note that current image
processing technology can
process gray-scale image
sequences in real time.
• Meaning understanding
Our technology is still useful
for all the subcategories of
events except the mental ones.
• Historical significance
This research took the lead in
the field of NLU of moving
picture patterns in the ’70s.
Main publications
1976 N.Okada & T.Tamachi:
Interpretation of Moving Picture
Patterns and its Description in
Natural Language---Semantic
Analysis, IEICE Trans(D),Vol.J59-D,
No.5, pp.331-338.
1979 N.Okada: SUPP---Understanding Moving Picture Patterns
Based on Linguistic Knowledge,
Proc. IJCAI,pp.690-693.
4.Emotion processing vs.
knowledge processing
Why does AI need emotion processing?
(1) Texts, e.g. social articles in
newspapers, often touch on human matters
such as gladness/sadness or gain/loss.
(2) Some intelligent agents should be
friendly to humans.
(3) Some kinds of processing need a
mechanism for evaluation of input
information.
• Research in early years
J.G.Carbonell ’80
Story understanding by personality
Pfeifer & Nicholas ’85
Simulation of emotion mechanism
by “interruption”
Okada ’87
Emotion model in NLU
• Our approach
- Evocation and response
Analysis of general property and
algorithm
- Roles shared by emotion and
knowledge
Analysis of emotion
• Multi-factor analysis by Plutchik
Plutchik divided emotions into two
categories: “primary” and “complex”
[Plutchik60].
- We follow this idea, and take the
following as primary emotions:
gladness/sadness, like/dislike,
surprise, expectancy, anger, and
fear.
(Gladness (the current state is better than the previous (
  physiological (inner pleasure; outer pleasure);
  psychological (
    goal achievement (
      information collection (expected; discover; become clear);
      plan (planning);
      results (completion; gain; useful));
    personal relations (
      companion mind (agreement; sympathy; collaboration; make_friends_again);
      superiority/inferiority (superior; praise; obedience; hospitality; protection)));
  others)))
Fig. 4・1 Hierarchical features of gladness
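The hierarchy of Fig. 4・1 maps naturally onto a nested structure. The sketch below copies the feature names from the figure; the variable `GLADNESS` and the function `find_path` are hypothetical and only illustrate how a feature could be located in the hierarchy.

```python
# Inner nodes are dicts, leaves are lists of features (names from Fig. 4.1).
GLADNESS = {
    "physiological": ["inner pleasure", "outer pleasure"],
    "psychological": {
        "goal achievement": {
            "information collection": ["expected", "discover", "become clear"],
            "plan": ["planning"],
            "results": ["completion", "gain", "useful"],
        },
        "personal relations": {
            "companion mind": ["agreement", "sympathy", "collaboration",
                               "make_friends_again"],
            "superiority/inferiority": ["superior", "praise", "obedience",
                                        "hospitality", "protection"],
        },
    },
    "others": [],
}


def find_path(tree, feature, prefix=()):
    """Return the path from the root to `feature`, or None if it is absent."""
    if isinstance(tree, list):
        return prefix + (feature,) if feature in tree else None
    for key, subtree in tree.items():
        found = find_path(subtree, feature, prefix + (key,))
        if found is not None:
            return found
    return None


path = find_path(GLADNESS, "praise")
```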

Evocation of emotion
- Reflective
Evoked unconsciously by a sudden
stimulus from the external world or a
remarkable change in the internal state.
A reflective response follows it.
- Deliberative
Evoked consciously by a cognitive
process. Deliberate reasoning
mediates between the input and its
response.

Response of emotion
- General trends
If an input brings one “pleasure”,
one promotes the input
stimulus through one’s response;
otherwise one inhibits it.
- Types of response
* Free
* Constrained
- Free
An emotion is evoked directly by a
stimulus, and a promoting/inhibitory
response follows it. The response
may cause the task under execution
to be given up.
- Constrained
Even if a free emotion is evoked
internally, the task under execution
inhibits its direct expression.
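The two types of response can be summarized in a small decision function. This is a sketch under assumptions: the function name, its boolean parameters, and the returned keys are hypothetical and only restate the rules listed above.

```python
def emotional_response(pleasant: bool, task_running: bool, constrained: bool) -> dict:
    """Return the evoked tendency and what is actually expressed."""
    # General trend: promote a pleasant stimulus, inhibit an unpleasant one.
    tendency = "promote" if pleasant else "inhibit"
    if task_running and constrained:
        # Constrained: the emotion is evoked internally, but the task under
        # execution suppresses its direct expression.
        return {"evoked": tendency, "expressed": None,
                "task_may_be_abandoned": False}
    # Free: the response follows the evoked emotion directly and may
    # interrupt a task under execution.
    return {"evoked": tendency, "expressed": tendency,
            "task_may_be_abandoned": task_running}


free = emotional_response(pleasant=False, task_running=True, constrained=False)
held_back = emotional_response(pleasant=False, task_running=True, constrained=True)
```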
Emotion vs. knowledge
• Language expression
In language, emotion is expressed as an
adjective whereas knowledge is expressed as a verb.
This implies that emotions are attributes.
Since an attribute gives a measure to
detect the difference between an
object-standard pair, the evocation of an
emotion is a measurement of the input
stimulus.
• Subjective and objective
Emotion: subjective evaluation of
information
Knowledge: memory of objective
information
• Pattern of evaluation
Formation of personality

Experiments
Simulation of protagonists of fables
- Free evocation in a series of actions
(Shown in Chapter 5)
- Constrained evocation in dialog
process
• A dialog---invitation
K1 Hi.
P2 Hi.
K3 Where are you going?
P4 To the river for fishing.
K5 Sounds good.
P6 And you?
K7 I’m going to the mansion to drink
water from the pond. I’m very thirsty.
P8 The mansion is dangerous.
K9 Why?
P10 Because I heard a voice when I
passed it a while ago.
K11 Really? I wonder what I should do.
P12 Why don’t you come to the river
with me ?
K13 Well, it’s far, isn’t it?
P14 But the water there is colder and
more tasty.
K15 O.K. I’ll come with you.
Fig. 4・2 Interaction between discourse and mental analyses
Processing modules: language analysis, intention recognition, emotion,
action planning, utterance planning, language generation
Dialogue model states: understand(E-PLAN), persuade_to_abandon(E-PLAN),
tentative_acceptance(R-PLAN), understand_drawback(R-PLAN),
emphasize_advantage(R-PLAN)
Other node labels: accept(R-PLAN), persuade_to_accept(R-PLAN),
refuse_for_drawback(R-PLAN), inform(E-PLAN), deny_drawback(R-PLAN),
inform_advantage(R-PLAN), seek_advantage(R-PLAN), seek_drawback(R-PLAN),
seek_drawback(E-PLAN), ---
Analyzed utterances: (E7) “I’m going to the pond in the mansion to - - -.”,
(E13) “Well, it’s far.”
Generated utterances: (R12) “Why don’t you come to the river with me?”,
(R14) “But the water there is colder and more tasty.”
Legend: message flow, top-down prediction, dialogue state tracking,
dialogue state transition
Evaluation
• Conceptual analysis
The properties of primitive emotions
of children were made clear.
• Evocation
The so-called “non-logical” algorithm
was clarified.
• Response
Complicated responses in behavior
and dialog were verified.
Main publications
1987 N.Okada: Representation of
knowledge and emotions,Proc.
Kyushu Symp. Information
processing,pp.47-65.
1997 M.Tokuhisa & N.Okada: A
Pattern Recognition Approach to
Emotion Arousal of Intelligent
Agents,Trans.JSAI,Vol.39, No.8,
pp.2440-2451.
5.Integrated intelligence
Intelligence dwells in the mind.
Recent research in the fields of
cognitive science(CS) and AI
throws light on the
comprehensive mechanism of
the mind.
Computer Models of the Mind
• Existing models
- M.Minsky ’85
System of multi-agents
- Okada ’87
Mind composed of six domains
and five levels
- P.N.Johnson-Laird ’88
Systematization of the results
of research in CS
• The author’s model
Fundamentally, we follow
Minsky’s multi-agent model.
A micro-processor, the “μ-agent”, and
its “chain activation” are
introduced.
μ-agent(
name (identifier),
domain (attached),
input (premise of activation),
execution (program),
memory (data),
description (result),
output (message))
Fig.5・1 Frame representation of
μ-agent
- Chain activation
The various functions of the mind are
executed by a “chain activation”,
that is, a series of activations of μ-agents.
Fig. 5・2 Chain activation: Recognition, Reasoning, Behavior
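The frame of Fig. 5・1 and the idea of chain activation can be sketched as follows. The slot names come from the frame; everything else is an assumption for illustration: the message-matching rule, the `limit` safeguard, and the three example agents are hypothetical and not taken from the original system.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MuAgent:
    name: str                          # identifier
    domain: str                        # domain the agent is attached to
    input: str                         # premise of activation (awaited message)
    execution: Callable[[dict], str]   # program; returns a result description
    output: str                        # message sent after execution
    memory: dict = field(default_factory=dict)  # data
    description: str = ""              # result of the last execution


def chain_activate(agents, first_message: str, limit: int = 100) -> list:
    """Activate agents whose premise matches a pending message, in sequence."""
    trace, pending = [], [first_message]
    while pending and len(trace) < limit:
        message = pending.pop(0)
        for agent in agents:
            if agent.input == message:
                agent.description = agent.execution(agent.memory)
                trace.append(agent.name)
                pending.append(agent.output)
    return trace


agents = [
    MuAgent("sense_thirst", "Recognition", "thirst", lambda m: "thirsty", "thirsty"),
    MuAgent("plan_drink", "Reasoning&Design", "thirsty", lambda m: "drink water", "plan_ready"),
    MuAgent("go_to_pond", "Expression", "plan_ready", lambda m: "moving", "done"),
]
trace = chain_activate(agents, "thirst")
```

Each agent only knows the message it waits for and the message it sends, so the chain from recognition through reasoning to behavior emerges from the messages rather than from a central controller.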
Domains of processing
The mind consists of six
domains which function as
follows:
(1) Recognition
(2) Reasoning&Design
(3) Emotion
(4) Expression
(5) Memory
(6) Language
Fig. 5・3 Domains of processing
Mind (brain): Language, Emotion, Memory, Reasoning&Design, Recognition, Expression
Body: Sensors (thirst, hunger, …), Actuators
External world: scene, speech, … (input); behavior, speech, … (output)
Domains of processing with internal components (figure; labels partly
translated from Japanese)
Control: Plan controller, Interrupt controller
Reasoning&Design: Planning (Plan generator, Simulator, Evaluator);
Reasoning of behavior (feasibility, danger, existence, and others)
Recognition labels, in the form recognition・feature・place:
- existence of a human: intersection 1, front of mansion 1, pond 1,
pond 1 of mansion 1, front of hunter's lodge 1, grape trellis 1 of
mansion 1, east of bridge 1
- at pond 1: possibility of slipping, falling down, falling in, drowning,
catching a cold, freezing to death
- at pond 1 of mansion 1: possibility of slipping
Other labels: Language, Emotions, Memory, Expression, Mind (brain),
Sensors (thirst, hunger, …), Body, Actuators, External world
• Levels of data
Along the concept formation process
Level 5 Connected concept
Level 4 Simple concept
Level 3 Conceptual feature
Level 2 Cognitive feature
Level 1 Raw data
Fig. 5・4 Levels of data
Columns: (Substance) Ni, (Attribute) Ai, (Event) Vi
Connected concept: go with Agent and Origin; Shopping(buy, cash/card, store, ...)
Primitive concept: House, Car, Human (substance); High (attribute); Go (event)
Conceptual feature: Roof, Wall, Room, .... (substance);
Difference_in_length, .... (attribute);
Movement_from_inside_to_outside, .... (event)
Cognitive feature and raw data: Visual, (Internal), (External)
Link labels: Composed, Associated, Extracted, is_a
Aesopworld Project
- Implementation of our theory
- Simulation of the physical and
mental activities of the
protagonists of Aesop Fables,
e.g. The Fox and the Grapes
Fig. 5・5 Chain activation of μ-agents (two panels)
Components shown in both panels: Language, Emotion, Reasoning&Design
(Controller, Planner, Plan generator, Simulator, Evaluator, Reasoner),
Memory (Plan knowledge, Nature), Recognition, Expression, Sensors, Actuators
First panel: physiology: thirst; desire: relieve thirst; goal: relieve thirst;
plan: drink water; plan: eat fruits; reasoning: water in pond;
reasoning: pot in house; reasoning: human near pond
Second panel: plan: go to mansion to drink water; action: movement to mansion
Experiments
• Main system
Four PCs and fifteen interpreters
(subdomains)
Fig. 5・6 Composition
PC1: Turbo Linux 8
PC2: Turbo Linux 8
PC3: RedHat Linux 5.2J
PC4: RedHat Linux 5.2J
Subdomains 1 to 15 are distributed over the four PCs, which are
connected through a LAN with a message server.
Fig. 5・7 Snapshot 1
Fig. 5・8 Snapshot 2
Fig. 5・9 Animation
• Generated monolog by the Fox
It’s very hot today. I’m on the animal trail
300 meters from the intersection. I’m very
thirsty. I’d like to relieve my thirst in a safe
way in a hurry.
I’ll search for and drink water. I’ll go home.
My home is far. I give up going there. I’ll go
under the bridge. It’s far. I give up going
there… I study other ways.
I’ll search for a place with water. I remember
a pond. I’ll find it. I remember the B pond.
It’s in the Aesopworld. I’ll go there. A
hunter’s lodge is close to it. He’ll probably
be in it. He is man. Man is dangerous. I
give up going there…
I’ll eat watery foods. I’ll search for and eat
fruits…
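The monolog suggests a loop in which candidate plans are generated, simulated, and given up when they are too far or dangerous. The sketch below illustrates such a loop under assumptions: the class `Plan`, the functions `evaluate` and `choose`, the distance threshold, and all distances are hypothetical; only the plan names and the reasons for giving up follow the monolog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    name: str
    distance_m: float   # hypothetical simulated distance
    dangerous: bool     # e.g. a human is probably nearby


def evaluate(plan: Plan, max_distance_m: float) -> Optional[str]:
    """Return the reason for giving up the plan, or None if it is acceptable."""
    if plan.distance_m > max_distance_m:
        return "far"
    if plan.dangerous:
        return "dangerous"
    return None


def choose(plans, max_distance_m: float = 500.0):
    """Try the plans in order and keep a log of the decisions."""
    log = []
    for plan in plans:
        reason = evaluate(plan, max_distance_m)
        if reason is None:
            log.append(f"I'll {plan.name}.")
            return plan, log
        log.append(f"{plan.name}: {reason}. I give up.")
    return None, log


candidates = [
    Plan("go home", 2000.0, False),
    Plan("go under the bridge", 1500.0, False),
    Plan("go to the B pond", 300.0, True),
    Plan("search for and eat fruits", 200.0, False),
]
chosen, log = choose(candidates)
```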
Table 5.2 Comparison with Minsky and Johnson-Laird
              Minsky ’85      Okada ’87       Johnson-Laird ’88
Approach      Bottom up       Top down        Top down
Domains       Many            Six             Six
Levels        Many            Five            Many
Technology    Multi-agents    Multi-agents    Turing machine
Experiment    No              Yes             No
Evaluation
• Various mental activities discussed
in CS and AI could be captured by
our six domains and five levels.
• An interface to physiology is put at
the level of raw data.
• This model can be implemented if
the number of μ-agents is fewer
than ten thousand.
• Our integrated intelligence took
the lead in verifying its validity
by experiments.
Main publications
1990 N.Okada and T.Endo: Story
Generation Based on Dynamics
of the Mind, Computational
Intelligence, Vol.8, No.1,
pp.123-160.
1996 N.Okada: Integrating Vision,
Motion, and Language through
the Mind, Artificial Intelligence
Review, Vol.10, pp.209-234.
6.Residual problems and
social applications
• Problems
- Learning through experiences
- Implementation on robots
• Applications
- Support agents for education or
diagnosis
- Partners for handicapped/elderly
people
7.Conclusions
• Concepts of substance, attribute,
event, and space/time were
systematically analyzed and
classified.
• A system for NLU of picture
sequences was constructed.
• Primitive emotions were analyzed
and implemented in the tasks of
action and dialog planning.
• A computer model of the mind with
six domains of processing and five
levels of data was proposed, and
was implemented with twelve
hundred μ-agents on computers.
These results led us to the conclusion
that an infrastructure for constructing
complex intelligence covering many
subfields could be obtained.