Semantic Addressable Encoding

Cheng-Yuan Liou, Jau-Chi Huang, and Wen-Chie Yang
Department of Computer Science and Information Engineering
National Taiwan University
TC402, Oct. 5, ICONIP 2006, Hong Kong
Web: red.csie.ntu.edu.tw



Outline

- Introduction
- Encoding Method
  - Elman network
  - The word corpus – Elman's idea
  - Review semantic search
  - Multidimensional Scaling (MDS) space
  - Representative vector of a document
  - Iterative re-encoding
- Sentence generating function
- The semantic world of Mark Twain
- Semantic Search under Shakespeare
- Example
- Summary
Introduction

- A central problem in semantic analysis is to effectively encode and extract the contents of word sequences.
- The traditional way of creating a prime semantic space is extremely expensive and complex because experienced linguists are required to analyze a huge number of words.
- This paper presents an automatic encoding process.
Elman Network

- U_oh: L_h x L_o weight matrix (hidden to output)
- U_hi: L_i x L_h weight matrix (input to hidden)
- U_hc: L_c x L_h weight matrix (context to hidden)
- Hidden layer activation:
  H(w(t)) = σ(U_hi w(t) + U_hc H(w(t-1)))
- Activation function: σ(x) = 1.7159 tanh(2x/3)
- L_o = # neurons in output layer, L_h = # in hidden, L_i = # in input, L_c = # in context
- The context layer carries memory.
- The hidden layer activates the output layer and refreshes the context layer.
- Desired behavior after the training process:
  w(t+1) ≈ E(w(t+1)) = σ(U_oh H(w(t)))
The word corpus – Elman's idea

- All words are coded with certain given lexical codes, and all word sequences in the corpus D follow the syntax Noun + Verb + Noun.
- After training, input all sequences again and record the hidden output for each individual input:
  S_n^E = { H(w(t)) | w(t) = w_n }
- Obtain the new code w_n^E for the nth word by averaging all vectors in S_n^E:
  w_n^E = (1 / |S_n^E|) Σ_{w(t) = w_n, w(t) ∈ D} H(w(t)),  n = 1 … N
- Construct a word tree based on the new codes to explore the relationships between words.
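The averaging step can be sketched as follows; `hidden_by_word` is a hypothetical container for the recorded hidden outputs, shown here with toy vectors:

```python
import numpy as np

# After training, every sequence is fed through the network again and the
# hidden output H(w(t)) is recorded for each occurrence of each word.
# `hidden_by_word` maps a word id n to the list of hidden vectors S_n^E
# collected when that word was the input (toy data, Lh = 4).
hidden_by_word = {
    0: [np.array([1.0, 0.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0, 0.0])],
    1: [np.array([0.0, 0.0, 2.0, 0.0])],
}

# New code for the nth word: the average of all vectors in S_n^E
new_codes = {n: np.mean(vs, axis=0) for n, vs in hidden_by_word.items()}
```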
Review semantic search

- The conventional semantic search constructs a semantic model and a semantic measure.
- A semantic code set designed manually by experts is used in the model. (main focus)
- One can build a raw semantic matrix W for all N different words:
  W_{R x N} = [ w_1  w_2  …  w_N ]_{R x N}
- The code of a word is a column vector of R features:
  w_n = [ w_{1n}  w_{2n}  …  w_{Rn} ]^T
- One may use the orthogonal space configured by the characteristic decomposition of the matrix W W^T.
The semantic search

- Characteristic decomposition:
  W_{R x N} W_{R x N}^T = F_{R x R} diag(λ_1, λ_2, …, λ_R) F_{R x R}^T
  where F_{R x R} = [ f_1  f_2  …  f_R ]_{R x R}, ||f_r|| = 1,
  and λ_r ≥ λ_{r+1}, r = 1 … R.
- Since W W^T is a symmetric matrix, all its eigenvalues are real and nonnegative numbers.
- Each eigenvalue λ_i equals the variance of the N projections of the codes on the ith eigenvector f_i, that is,
  λ_i = Σ_{n=1}^{N} (w_n · f_i)^2
Multidimensional Scaling (MDS) space

- Select a set of R_s eigenvectors { f_r, r = 1 … R_s } from all R eigenvectors to build a reduced feature space:
  F^s_{R x R_s} = [ f_1  f_2  …  f_{R_s} ]_{R x R_s}
- The MDS space is MDS = span{ F^s }.
- These selected features are independent and significant. The new code of each word in this space is
  w_n^s = F^{sT} w_n,  or  W^s_{R_s x N} = F^{sT} W_{R x N}
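The decomposition and the projection into the MDS space can be sketched with numpy's symmetric eigensolver; toy sizes and a random matrix stand in for a real code set:

```python
import numpy as np

rng = np.random.default_rng(1)
R, N, Rs = 6, 20, 3  # toy sizes

W = rng.standard_normal((R, N))  # raw semantic matrix, one column per word

# Eigendecomposition of the symmetric matrix W W^T
lam, F = np.linalg.eigh(W @ W.T)   # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]      # reorder so lambda_r >= lambda_{r+1}
lam, F = lam[order], F[:, order]

# lambda_1 equals the summed squared projections of the codes on f_1
proj_sq = float(((W.T @ F[:, 0]) ** 2).sum())

# Reduced feature space: keep the Rs most significant eigenvectors
Fs = F[:, :Rs]    # F^s, shape R x Rs
Ws = Fs.T @ W     # new word codes in the MDS space, shape Rs x N
```

The identity `proj_sq == lam[0]` follows from f_1^T W W^T f_1 = λ_1, which is the eigenvalue-as-projection-variance statement on the previous slide.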
Representative vector of a document

- A representative vector for a document D should contain the semantic meaning of the whole document.
- Two measures are defined.
- Peak preferred measure:
  v_D^a = [ w_1^a  w_2^a  …  w_R^a ]^T, where w_r^a = max_{w_n^s ∈ D} w_{rn}^s,  r = 1 … R
- Average preferred measure:
  v_D^b = [ w_1^b  w_2^b  …  w_R^b ]^T, where w_r^b = (1 / |D|) Σ_{w_n^s ∈ D} w_{rn}^s,  r = 1 … R
- The magnitude is normalized:
  v_D = v_D^b / ||v_D^b||
Representative vector of a document

- The normalized measure v_D is used to represent a whole document, and a representative vector v_Q for a whole query can be obtained in the same way.
- The relation score is defined as
  RS(Q, D) = ⟨v_D, v_Q⟩ / ( ||v_D|| ||v_Q|| )
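The relation score is the cosine similarity of the two representative vectors, which in numpy is a one-liner:

```python
import numpy as np

def relation_score(v_d, v_q):
    """RS(Q, D): cosine of the angle between the two representative
    vectors, so identical directions score 1 and orthogonal ones 0."""
    return float(v_d @ v_q) / (np.linalg.norm(v_d) * np.linalg.norm(v_q))
```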
Iterative re-encoding

- Elman's method for sentence generation with the fixed syntax Noun + Verb + Noun cannot be applied to more complex sentences.
- We modify his method. Each word initially has a random lexical code:
  w_n^{j=0} = [ w_{n1}  w_{n2}  …  w_{nR} ]^T
- After the jth training epoch, a new raw code is calculated:
  w_n^{raw} = (1 / |s_n|) Σ_{w(t) = w_n, w(t) ∈ D} σ(U_oh H(w(t-1))),  n = 1 … N
  where |s_n| is the total number of predictions in the set s_n.
Iterative re-encoding

- The set s_n contains all predictions for the word w_n based on its preceding words:
  s_n = { σ(U_oh H(w(t-1))) | w(t) = w_n, w(t) ∈ D }
- After each epoch, all the codes are normalized by the following two equations. The normalization prevents a diminished solution derived by the backpropagation algorithm:
  W^{ave}_{R x N} = W^{raw}_{R x N} ( I_{N x N} - (1/N) 1_{N x N} )
  w_n^j = w_n^{nom} = w_n^{ave} / ||w_n^{ave}||, where ||w_n|| = (w_n^T w_n)^{0.5},  n = 1 … N
Example

- We test the ability to classify 36 of Shakespeare's plays. We consider each play as the query input and calculate the relation score between it and each of the other plays. The figure below shows the resulting relation tree.
- [Relation tree figure: c = comedy, r = romance, h = history, t = tragedy; the number denotes the publication year]
- Model parameters: D_{i=1…36}, Q_{i=1…36}, N = 10000, L_h = L_c = 200, L_o = L_i = R_s = R = 64
Example

- We provide a semantic search tool using a corpus drawn from Shakespeare's comedies and tragedies at
  http://red.csie.ntu.edu.tw/demo/literal/SAS.htm
- Example search results with parameters D_{i=1…7777}, N = 10000, L_o = L_i = R = 100, L_h = L_c = 200, R_s = 64:

Query: she loves kiss
Search result:
BENVOLIO: Tut, you saw her fair, none else being by, herself poised with herself in either eye; but in that crystal scales let there be weigh'd your lady's love against some other maid that I will show you shining at this feast, and she shall scant show well that now shows best.
- Romeo and Juliet

Query: armies die in blood
Search result:
MARCUS ANDRONICUS: Which of your hands hath not defended Rome, and rear'd aloft the bloody battle-axe, writing destruction on the enemy's castle? O, none of both but are of high desert: my hand hath been but idle; let it serve to ransom my two nephews from their death; then have I kept it to a worthy end.
- Titus Andronicus
Summary

- We have explored the concept of semantic addressable encoding and completed a design for it that includes automatic encoding methods.
- We have presented the results of applying this method to the study of literary works.
- The trained semantic codes can facilitate other research such as linguistic analysis, authorship identification, categorization, etc.
- The method can be modified to accommodate polysemous words.