Semantic Addressable Encoding

Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang
Department of Computer Science and Information Engineering
National Taiwan University
TC402, Oct. 5, ICONIP 2006, Hong Kong
Web: red.csie.ntu.edu.tw



Outline

- Introduction
- Encoding Method
  - Elman network
  - The word corpus – Elman's idea
  - Review semantic search
  - Multidimensional Scaling (MDS) space
  - Representative vector of a document
  - Iterative re-encoding
- Sentence generating function
- The semantic world of Mark Twain
- Semantic Search under Shakespeare
- Example
- Summary
Introduction

- A central problem in semantic analysis is to effectively encode and extract the contents of word sequences.
- The traditional way of creating a prime semantic space is extremely expensive and complex because experienced linguists are required to analyze a huge number of words.
- This paper presents an automatic encoding process.
Elman Network

- U_oh: L_h x L_o weight matrix (hidden to output)
- U_hi: L_i x L_h weight matrix (input to hidden)
- U_hc: L_c x L_h weight matrix (context to hidden)
- Hidden layer activation:
  H(w(t)) = σ(U_hi w(t) + U_hc H(w(t-1)))
- Activation function: σ(x) = 1.7159 tanh(2x/3)
- L_o = # neurons in output layer, L_h = # in hidden, L_i = # in input, L_c = # in context
- The context layer carries memory.
- The hidden layer activates the output layer and refreshes the context layer.
- Desired behavior after the training process:
  w(t+1) ≈ E(w(t+1)) = σ(U_oh H(w(t)))
The word corpus – Elman's idea

- All words are coded with certain given lexical codes, and all word sequences in the corpus D follow the syntax Noun + Verb + Noun.
- After training, input all sequences again and record the hidden output for each individual input:
  S_n^E = { H(w(t)) | w(t) = w_n }
- Obtain the new code w_n^E for the nth word by averaging all vectors in S_n^E:
  w_n^E = (1 / |S_n^E|) Σ_{w(t) = w_n, w(t) ∈ D} H(w(t)),  n = 1 … N
- Construct a word tree based on the new codes to explore the relationships between words.
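The averaging step can be sketched as follows; `hidden_by_word` is a hypothetical container for the recorded hidden outputs, shown here with toy vectors:

```python
import numpy as np

# After training, every sequence is fed through the network again and the
# hidden output H(w(t)) is recorded for each occurrence of each word.
# `hidden_by_word` maps a word id n to the list of hidden vectors S_n^E
# collected when that word was the input (toy data, Lh = 4).
hidden_by_word = {
    0: [np.array([1.0, 0.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0, 0.0])],
    1: [np.array([0.0, 0.0, 2.0, 0.0])],
}

# New code for the nth word: the average of all vectors in S_n^E
new_codes = {n: np.mean(vs, axis=0) for n, vs in hidden_by_word.items()}
```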
Review semantic search

- The conventional semantic search constructs a semantic model and a semantic measure.
- A semantic code set designed manually by experts is used in the model. (main focus)
- One can build a raw semantic matrix W for all N different words:
  W_{R x N} = [ w_1  w_2  …  w_N ]_{R x N}
- The code of a word is a column vector of R features:
  w_n = [ w_{1n}  w_{2n}  …  w_{Rn} ]^T
- One may use the orthogonal space configured by the characteristic decomposition of the matrix W W^T.
The semantic search

- Characteristic decomposition:
  W_{R x N} W_{R x N}^T = F_{R x R} diag(λ_1, λ_2, …, λ_R) F_{R x R}^T
  where F_{R x R} = [ f_1  f_2  …  f_R ]_{R x R}, ||f_r|| = 1,
  and λ_r ≥ λ_{r+1}, r = 1 … R.
- Since W W^T is a symmetric matrix, all its eigenvalues are real and nonnegative numbers.
- Each eigenvalue λ_i equals the variance of the N projections of the codes on the ith eigenvector f_i, that is,
  λ_i = Σ_{n=1}^{N} (w_n · f_i)^2
Multidimensional Scaling (MDS) space

- Select a set of R_s eigenvectors { f_r, r = 1 … R_s } from all R eigenvectors to build a reduced feature space:
  F^s_{R x R_s} = [ f_1  f_2  …  f_{R_s} ]_{R x R_s}
- The MDS space is MDS = span{ F^s }.
- These selected features are independent and significant. The new code of each word in this space is
  w_n^s = F^{sT} w_n,  or  W^s_{R_s x N} = F^{sT} W_{R x N}
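The decomposition and the projection into the MDS space can be sketched with numpy's symmetric eigensolver; toy sizes and a random matrix stand in for a real code set:

```python
import numpy as np

rng = np.random.default_rng(1)
R, N, Rs = 6, 20, 3  # toy sizes

W = rng.standard_normal((R, N))  # raw semantic matrix, one column per word

# Eigendecomposition of the symmetric matrix W W^T
lam, F = np.linalg.eigh(W @ W.T)   # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]      # reorder so lambda_r >= lambda_{r+1}
lam, F = lam[order], F[:, order]

# lambda_1 equals the summed squared projections of the codes on f_1
proj_sq = float(((W.T @ F[:, 0]) ** 2).sum())

# Reduced feature space: keep the Rs most significant eigenvectors
Fs = F[:, :Rs]    # F^s, shape R x Rs
Ws = Fs.T @ W     # new word codes in the MDS space, shape Rs x N
```

The identity `proj_sq == lam[0]` follows from f_1^T W W^T f_1 = λ_1, which is the eigenvalue-as-projection-variance statement on the previous slide.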
Representative vector of a document

- A representative vector for a document D should contain the semantic meaning of the whole document.
- Two measures are defined.
- Peak preferred measure:
  v_D^a = [ w_1^a  w_2^a  …  w_R^a ]^T, where w_r^a = max_{w_n^s ∈ D} w_{rn}^s,  r = 1 … R
- Average preferred measure:
  v_D^b = [ w_1^b  w_2^b  …  w_R^b ]^T, where w_r^b = (1 / |D|) Σ_{w_n^s ∈ D} w_{rn}^s,  r = 1 … R
- The magnitude is normalized:
  v_D = v_D^b / ||v_D^b||
Representative vector of a document

- The normalized measure v_D is used to represent a whole document, and a representative vector v_Q for a whole query can be obtained in the same way.
- The relation score is defined as
  RS(Q, D) = ⟨v_D, v_Q⟩ / ( ||v_D|| ||v_Q|| )
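The relation score is the cosine similarity of the two representative vectors, which in numpy is a one-liner:

```python
import numpy as np

def relation_score(v_d, v_q):
    """RS(Q, D): cosine of the angle between the two representative
    vectors, so identical directions score 1 and orthogonal ones 0."""
    return float(v_d @ v_q) / (np.linalg.norm(v_d) * np.linalg.norm(v_q))
```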
Iterative re-encoding

- Elman's method for sentence generation with the fixed syntax Noun + Verb + Noun cannot be applied to more complex sentences.
- We modify his method. Each word initially has a random lexical code:
  w_n^{j=0} = [ w_{n1}  w_{n2}  …  w_{nR} ]^T
- After the jth training epoch, a new raw code is calculated:
  w_n^{raw} = (1 / |s_n|) Σ_{w(t) = w_n, w(t) ∈ D} σ(U_oh H(w(t-1))),  n = 1 … N
  where |s_n| is the total number of predictions in the set s_n.
Iterative re-encoding

- The set s_n contains all predictions for the word w_n based on its preceding words:
  s_n = { σ(U_oh H(w(t-1))) | w(t) = w_n, w(t) ∈ D }
- After each epoch, all the codes are normalized by the following two equations. The normalization prevents a diminished solution derived by the backpropagation algorithm:
  W^{ave}_{R x N} = W^{raw}_{R x N} ( I_{N x N} - (1/N) 1_{N x N} )
  w_n^j = w_n^{nom} = w_n^{ave} / ||w_n^{ave}||, where ||w_n|| = (w_n^T w_n)^{0.5},  n = 1 … N
Example

- We test the ability to classify 36 of Shakespeare's plays. We consider each play as the query input and calculate the relation score between it and each of the other plays. The figure below shows the resulting relation tree.
- [Relation tree figure: c = comedy, r = romance, h = history, t = tragedy; the number denotes the publication year]
- Model parameters: D_{i=1…36}, Q_{i=1…36}, N = 10000, L_h = L_c = 200, L_o = L_i = R_s = R = 64
Example

- We provide a semantic search tool using a corpus drawn from Shakespeare's comedies and tragedies at
  http://red.csie.ntu.edu.tw/demo/literal/SAS.htm
- Example search results with parameters D_{i=1…7777}, N = 10000, L_o = L_i = R = 100, L_h = L_c = 200, R_s = 64:

Query: she loves kiss
Search result:
BENVOLIO: Tut, you saw her fair, none else being by, herself poised with herself in either eye; but in that crystal scales let there be weigh'd your lady's love against some other maid that I will show you shining at this feast, and she shall scant show well that now shows best.
- Romeo and Juliet

Query: armies die in blood
Search result:
MARCUS ANDRONICUS: Which of your hands hath not defended Rome, and rear'd aloft the bloody battle-axe, writing destruction on the enemy's castle? O, none of both but are of high desert: my hand hath been but idle; let it serve to ransom my two nephews from their death; then have I kept it to a worthy end.
- Titus Andronicus
Summary

- We have explored the concept of semantic addressable encoding and completed a design for it that includes automatic encoding methods.
- We have presented the results of applying this method to the study of literary works.
- The trained semantic codes can facilitate other research such as linguistic analysis, authorship identification, categorization, etc.
- The method can be modified to accommodate polysemous words.