Meaning representation, semantic analysis, and lexical semantics
Semantics
Ling 571
Fei Xia
Week 6: 11/1-11/3/05
Outline
• Meaning representation: what formal
structures should be used to represent the
meaning of a sentence?
• Semantic analysis: how to form the formal
structures from smaller pieces?
• Lexical semantics:
Meaning representation
Meaning representation
• Requirements that meaning
representations should fulfill
• Types of meaning representation:
– First order predicate calculus (FOPC)
– Frame-based representation
– Semantic network
– Conceptual dependency diagram
Requirements
• Verifiability
• Unambiguous representations
• Canonical form
• Inference
• Expressiveness
Verifiability
• A system's ability to compare the state of
affairs described by a representation to the
state of affairs in some world as modeled
in a knowledge base
• Example:
– Sent: Maharani serves vegetarian dishes.
– Question: Is the statement true?
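A minimal sketch of verifiability, assuming the knowledge base is just a set of ground facts. The facts and the `verify` helper are illustrative assumptions, and a missing fact is treated as false (a closed-world assumption).

```python
# Toy knowledge base: a set of ground facts (predicate, arg1, arg2).
# The facts are illustrative assumptions, not real restaurant data.
KB = {
    ("Serve", "Maharani", "VegetarianFood"),
    ("Serve", "AyCaramba", "Meat"),
}

def verify(fact, kb=KB):
    """Return True if the ground fact is recorded in the knowledge base."""
    return fact in kb

print(verify(("Serve", "Maharani", "VegetarianFood")))  # True
print(verify(("Serve", "Maharani", "Meat")))            # False
```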
Unambiguous representation
• Representations should have a single
unambiguous interpretation.
• Example:
– Mary and John bought a book
– Two students met three teachers
– A German teacher
– A Chinese restaurant
– A Canadian restaurant
Canonical form
• Sentences that mean the same thing should
have the same meaning representation
• Example:
– Alternations: active/passive, dative shift
– Does Maharani have vegetarian dishes?
– Do they serve vegetarian food at Maharani?
Inference
• A system's ability to draw valid conclusions
based on the meaning representation of
inputs and its store of background
knowledge.
• Example:
– Sent: Maharani serves vegetarian dishes
– Question: can vegetarians eat at Maharani?
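One way to picture inference is forward chaining with modus ponens over a store of background rules. The sketch below is only an illustration: facts are plain strings, and the single rule linking vegetarian dishes to vegetarians is an assumed piece of background knowledge.

```python
def forward_chain(facts, rules):
    """Apply modus ponens repeatedly: from alpha and (alpha => beta), add beta."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

# Meaning representation of the input sentence (as an opaque string).
facts = {"Serve(Maharani, VegetarianFood)"}
# Assumed background knowledge, written as (premise, conclusion) pairs.
rules = [
    ("Serve(Maharani, VegetarianFood)", "CanEatAt(Vegetarians, Maharani)"),
]

derived = forward_chain(facts, rules)
print("CanEatAt(Vegetarians, Maharani)" in derived)  # True
```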
Expressiveness
• A system should be expressive enough to
handle an extremely wide range of subject
matter.
• Example:
– Belief: I think that he is smart.
– Hypothetical statement: If I were you, I would buy that
book.
– Former president, fake ID, allegedly, apparently
Meaning representation
• Requirements
– Verifiability
– Unambiguous representations
– Canonical form
– Inference
– Expressiveness
• Types of meaning representation:
– First order predicate calculus (FOPC)
– Frame-based representation
– Semantic network
– Conceptual dependency diagram
FOPC
• Elements of FOPC
• Representing
– Categories
– Events
– Time (including tense)
– Aspect
– Belief
–…
Elements of FOPC
• Terms:
– Constant: specific objects in the world: e.g., Maharani
– Variable: a particular unknown object or an arbitrary
object: e.g., a restaurant
– Function: concepts: e.g., LocationOf(Maharani)
• Predicates: referring to relations that hold
among objects:
– Ex: Serve(Maharani, food)
– Arguments of predicates must be terms.
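The elements above can be made concrete with a tiny finite model. This is a sketch under assumptions: the domain, the location "Seattle", and the predicate extensions are invented for illustration. It also shows how a variable ranges over the domain when checking a statement such as "all restaurants serve food".

```python
# A tiny hypothetical model; domain, function, and predicate extensions are assumptions.
DOMAIN = {"Maharani", "AyCaramba", "Seattle"}                   # constants
LOCATION_OF = {"Maharani": "Seattle", "AyCaramba": "Seattle"}   # function LocationOf(x)
RESTAURANT = {"Maharani", "AyCaramba"}                          # unary predicate Restaurant(x)
SERVE = {("Maharani", "food"), ("AyCaramba", "food")}           # binary predicate Serve(x, y)

def location_of(x):
    return LOCATION_OF[x]

def serve(x, y):
    return (x, y) in SERVE

def all_restaurants_serve_food():
    # The variable x ranges over the whole domain; objects that are not
    # restaurants satisfy the implication vacuously.
    return all((x not in RESTAURANT) or serve(x, "food") for x in DOMAIN)

print(location_of("Maharani"))       # Seattle
print(serve("Maharani", "food"))     # True
print(all_restaurants_serve_food())  # True
```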
Elements of FOPC (cont)
• Logical connectives: $\land$, $\lor$, $\Rightarrow$
• Quantifiers: $\forall$, $\exists$
• Example: All restaurants serve food.
$\forall x\; Restaurant(x) \Rightarrow Serve(x, food)$
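Inference rules
• Modus ponens: $\alpha,\ \alpha \Rightarrow \beta \;\vdash\; \beta$
• Conjunction: $\alpha,\ \beta \;\vdash\; \alpha \land \beta$
• Disjunction: $\alpha \;\vdash\; \alpha \lor \beta$
• Simplification: $\alpha \land \beta \;\vdash\; \alpha$
• …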
FOPC
• Elements of FOPC
• Representing
– Categories
– Events
– Time
– Aspect
– Belief
–…
Representing time
• Past perfect: I had arrived in NY
• Simple past: I arrived in NY
• Present perfect: I have arrived in NY
• Present: I arrive in NY
• Simple future: I will arrive in NY
• Future perfect: I will have arrived in NY
Representing time (cont)
• Reichenbach’s approach
– E: the time of the event
– U: the time of the utterance
– R: the reference point
• Example (X > Y means X precedes Y):
– Past perfect: I had arrived: E > R > U
– Simple past: I arrived: E = R > U
– Present perfect: I have arrived: E > R = U
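A small sketch of the three orderings, assuming the slide's ">" reads "precedes" and that times can be placed on a numeric timeline where smaller means earlier (so "precedes" becomes `<`). The `TENSES` table and `matching_tenses` helper are hypothetical names.

```python
# Each tense is a constraint on event time E, reference time R, and utterance time U.
# Times are plain numbers here; smaller means earlier.
TENSES = {
    "past perfect":    lambda E, R, U: E < R < U,
    "simple past":     lambda E, R, U: E == R < U,
    "present perfect": lambda E, R, U: E < R == U,
}

def matching_tenses(E, R, U):
    """Return the tenses whose ordering constraint is satisfied by (E, R, U)."""
    return [name for name, holds in TENSES.items() if holds(E, R, U)]

print(matching_tenses(E=1, R=2, U=3))  # ['past perfect']
print(matching_tenses(E=1, R=1, U=3))  # ['simple past']
print(matching_tenses(E=1, R=3, U=3))  # ['present perfect']
```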
Aspect
• Four types of event expression:
– Stative: I like books. I have a ticket
– Activity: She drove a Mazda. I live in NY
– Accomplishment: Sally booked her flight.
– Achievement: He reached NY.
• Differences:
– Being in a state or not
– Occurring at a given time, or over some span of time
– Resulting in a state: happening in an instant or not.
Distinguishing four types
• Allowing progressive, imperative
– *I am liking books.
– *Like books.
• Modified by in-phrase, for-phrase: in a
month, for a month
– He lived in NY for five years.
– *He reached NY for five minutes.
Distinguishing four types (cont)
• “Stop” test: stop doing something
– *He stopped reaching NY.
– He stopped booking the ticket
• Modified by adverbs such as
“deliberately”, “carefully”
– *He likes books deliberately
Representing beliefs
• John believes that Mary ate lunch.
• One possibility:
$\exists u, v\; IsA(u, believing) \land IsA(v, Eating) \land Believer(u, John) \land BelievedProp(u, v) \land Eater(v, Mary) \land Eaten(v, lunch)$
• Another possibility:
$Believing(John, Eating(Mary, lunch))$
Representing beliefs (cont)
• Substitution does not work
• Example:
– John knows Flight 1045 is delayed
– Mary is on Flight 1045
– Does John know that Mary’s flight was
delayed?
FOPC is not sufficient.
Use modal logic
Summary of meaning
representation
• Five requirements:
– Verifiability
– Unambiguous representations
– Canonical form
– Inference
– Expressiveness
• Four types of representations:
– First order predicate calculus (FOPC)
– Frame-based representation
– Semantic network
– Conceptual dependency diagram
Outline
• Meaning representation:
• Semantic analysis: how to form the
formal structures from smaller pieces?
• Lexical semantics:
Semantic analysis
Semantic analysis
• Goal: to form the formal structures from
smaller pieces
• Three approaches:
– Syntax-driven semantic analysis
– Semantic grammars
– Information extraction: filling templates
Syntax-driven approach
• Parsing then semantic analysis, or parsing with
semantic analysis.
• Semantic augmentations to grammars (e.g.,
CFG or LTAG)
– Associate FOPC expression with lexical items
– Use $\lambda$-expression: $(\lambda x\, P(x))(A) \Rightarrow P(A)$
– Use complex-terms
$\lambda$-expression
• Sentence: AyCaramba serves meat
• Goal:
$\exists e\; IsA(e, serving) \land Server(e, AyCaramba) \land Served(e, Meat)$
• Augmented rules:
V → serves   {$\lambda x\, \lambda y\, \exists e\; IsA(e, serving) \land Server(e, y) \land Served(e, x)$}
VP → V NP   {V.sem(NP.sem)}
S → NP VP   {VP.sem(NP.sem)}
NP → N   {N.sem}
N → AyCaramba   {AyCaramba}
N → meat   {meat}
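The augmented rules can be mimicked with Python lambdas to see how the sentence meaning is assembled bottom-up. This is only a sketch: logical forms are built as strings rather than real formula objects, and the function names (`serves_sem`, `np_rule`, `vp_rule`, `s_rule`) are hypothetical.

```python
# Semantic attachments written as Python values and lambdas.
# String building stands in for a real logical-form data structure.
def serves_sem(x):
    # lambda x . lambda y . exists e IsA(e, serving) and Server(e, y) and Served(e, x)
    return lambda y: f"exists e. IsA(e, serving) and Server(e, {y}) and Served(e, {x})"

LEXICON = {"serves": serves_sem, "AyCaramba": "AyCaramba", "meat": "meat"}

def np_rule(n_sem):            # NP -> N       {N.sem}
    return n_sem

def vp_rule(v_sem, np_sem):    # VP -> V NP    {V.sem(NP.sem)}
    return v_sem(np_sem)

def s_rule(np_sem, vp_sem):    # S -> NP VP    {VP.sem(NP.sem)}
    return vp_sem(np_sem)

subject = np_rule(LEXICON["AyCaramba"])
predicate = vp_rule(LEXICON["serves"], np_rule(LEXICON["meat"]))
sentence_sem = s_rule(subject, predicate)
print(sentence_sem)
# exists e. IsA(e, serving) and Server(e, AyCaramba) and Served(e, meat)
```

Applying the verb meaning to the object first and to the subject second is what the order of the two lambda variables encodes.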
Quantifiers
• Sentence: A restaurant serves meat
• Goal: $\exists e, x\; IsA(x, Restaurant) \land IsA(e, Serving) \land Server(e, x) \land Served(e, Meat)$
• Augmented rules:
Det → a   {$\exists$}
N' → N   {$\lambda x\, IsA(x, N.sem)$}
N → restaurant   {restaurant}
NP → Det N'   {$Det.sem\ x\ N'.sem(x)$}
a restaurant ⇒ $\exists x\, IsA(x, restaurant)$
Complex terms
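• Current formula:
$\exists e\; IsA(e, Serving) \land Server(e, \langle \exists x\, IsA(x, Restaurant) \rangle) \land Served(e, Meat)$
• Goal:
$\exists e, x\; IsA(x, Restaurant) \land IsA(e, Serving) \land Server(e, x) \land Served(e, Meat)$
• What is needed:
$P(\ldots, \langle \exists x\ body \rangle, \ldots) \Rightarrow (\exists x\ body \land P(\ldots, x, \ldots))$
$P(\ldots, \langle \forall x\ body \rangle, \ldots) \Rightarrow (\forall x\ body \Rightarrow P(\ldots, x, \ldots))$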
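The rewriting step for a single complex term might be sketched as follows. The `ComplexTerm` class, the string templates, and the choice of "and" for the existential and "=>" for the universal quantifier are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ComplexTerm:
    quantifier: str   # "exists" or "forall"
    var: str          # the bound variable, e.g. "x"
    body: str         # restriction on the variable, e.g. "IsA(x, Restaurant)"

# Connective paired with each quantifier when the term is pulled out.
CONNECTIVE = {"exists": "and", "forall": "=>"}

def pull_out(term, predicate_template):
    """Rewrite P(..., <Q x body>, ...) as Q x body CONNECTIVE P(..., x, ...)."""
    matrix = predicate_template.format(term.var)
    connective = CONNECTIVE[term.quantifier]
    return f"{term.quantifier} {term.var}. {term.body} {connective} {matrix}"

term = ComplexTerm("exists", "x", "IsA(x, Restaurant)")
print(pull_out(term, "Server(e, {})"))
# exists x. IsA(x, Restaurant) and Server(e, x)
```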
Quantifier scoping
• Sentence: Every restaurant has a menu
• Formula with complex terms
$\exists e\; IsA(e, Having) \land Haver(e, \langle \forall x\, IsA(x, Restaurant) \rangle) \land Had(e, \langle \exists y\, IsA(y, menu) \rangle)$
• Reading 1:
$\forall x\; IsA(x, Restaurant) \Rightarrow \exists e, y\; IsA(y, menu) \land IsA(e, Having) \land Haver(e, x) \land Had(e, y)$
• Reading 2:
$\exists y\; IsA(y, menu) \land \forall x\; IsA(x, Restaurant) \Rightarrow (\exists e\; IsA(e, Having) \land Haver(e, x) \land Had(e, y))$
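The two readings correspond to the two orders in which the quantifiers can be pulled out of the complex terms. A rough sketch that enumerates every order is shown below; formulas are plain strings, and the tuple layout and the `scopings` helper are assumptions for illustration.

```python
from itertools import permutations

# Each complex term: (quantifier, variable, restriction), following the slide's example.
TERMS = [
    ("forall", "x", "IsA(x, Restaurant)"),
    ("exists", "y", "IsA(y, menu)"),
]
MATRIX = "exists e. IsA(e, Having) and Haver(e, x) and Had(e, y)"
CONNECTIVE = {"forall": "=>", "exists": "and"}

def scopings(terms, matrix):
    """Yield one formula per ordering of the quantifiers (outermost first)."""
    for order in permutations(terms):
        formula = f"({matrix})"
        # Wrap from the innermost quantifier outwards.
        for quantifier, var, restriction in reversed(order):
            formula = f"{quantifier} {var}. {restriction} {CONNECTIVE[quantifier]} {formula}"
        yield formula

readings = list(scopings(TERMS, MATRIX))
for reading in readings:
    print(reading)
```

With n quantified terms this produces n! candidate readings, which is why scoping is treated as a separate disambiguation problem.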
Semantic analysis
• Goal: to form the formal structures from
smaller pieces
• Three approaches:
– Syntax-driven semantic analysis
– Semantic grammar
– Information extraction: filling templates
Semantic grammar
• Syntactic parse trees often contain many parts that are
unimportant in semantic processing.
• Ex: Mary wants to go to eat some Italian food
• Rules in a semantic grammar
– InfoRequest → USER want to go to eat FOODTYPE
– FOODTYPE → NATIONALITY FOODTYPE
– NATIONALITY → Italian/Mexican/….
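A semantic grammar this small can be approximated with a regular expression. The sketch below makes several assumptions: the recursive FOODTYPE rule is flattened to one nationality followed by the literal word "food", the optional "some" is allowed, and only two nationalities are listed because the slide elides the rest.

```python
import re

# Terminal vocabulary is illustrative; the slide elides the full NATIONALITY list.
NATIONALITY = ["Italian", "Mexican"]
nationality_alt = "|".join(NATIONALITY)

# FOODTYPE -> NATIONALITY FOODTYPE, flattened here to "NATIONALITY food" (an assumption).
FOODTYPE = rf"(?P<nationality>{nationality_alt})\s+food"

# InfoRequest -> USER want to go to eat FOODTYPE
INFO_REQUEST = re.compile(
    rf"(?P<user>\w+)\s+wants?\s+to\s+go\s+to\s+eat\s+(?:some\s+)?{FOODTYPE}"
)

def parse_info_request(sentence):
    """Return the semantic slots of an InfoRequest, or None if the rule does not apply."""
    match = INFO_REQUEST.search(sentence)
    return match.groupdict() if match else None

print(parse_info_request("Mary wants to go to eat some Italian food"))
# {'user': 'Mary', 'nationality': 'Italian'}
```

The nonterminals name semantic categories directly, so the match already is the meaning representation and no separate syntactic parse is involved.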
Semantic grammar (cont)
Pros:
• No need for syntactic parsing
• Focus on relevant info
• Semantic grammar helps to disambiguate
Cons:
• The grammar is domain-specific.
Information extraction
• The desired knowledge can be described by a
relatively simple and fixed template.
• Only a small part of the info in the text is relevant
for filling the template.
• No full parsing is needed: chunking, NE tagging,
pattern matching, …
• IE is a big field: e.g., MUC, KnowItAll
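Template filling by pattern matching might look like the sketch below. The two-slot template, the pattern, and the sample text are all assumptions chosen for illustration; real IE systems combine chunking and named-entity tagging with many such patterns.

```python
import re

# Hypothetical template with two slots: restaurant and cuisine.
PATTERN = re.compile(r"(?P<restaurant>[A-Z]\w+) serves (?P<cuisine>\w+) (?:dishes|food)")

def fill_templates(text):
    """Return one filled template (a dict) per pattern match; other text is ignored."""
    return [match.groupdict() for match in PATTERN.finditer(text)]

text = "We walked for an hour. Maharani serves vegetarian dishes. It was raining."
print(fill_templates(text))
# [{'restaurant': 'Maharani', 'cuisine': 'vegetarian'}]
```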
Summary of semantic analysis
• Goal: to form the formal structures from
smaller pieces
• Three approaches:
– Syntax-driven semantic analysis
– Semantic grammar
– Information extraction
Outline
• Meaning representation
• Semantic analysis
• Lexical semantics
Lexical semantics
What is lexical semantics?
• Meaning of word: word senses
• Relations among words:
• Predicate-argument structures
• Thematic roles
• Selectional restrictions
• Mapping from conceptual structures to
grammatical functions
• Word classes and alternations
Important resources
• Dictionaries
• Ontology and taxonomy
• WordNet
• FrameNet
• PropBank
• Levin’s English verb classes
• ….
Meaning of words
• Lexeme is an entry in the lexicon that
includes
– Orthographic form
– Phonological form
– Sense: lexeme’s meaning
Relations among lexemes
• Homonyms: same orth. and phon. forms,
but different, unrelated meanings
– bank vs. bank
• Homophones: same phon., different orth.
– read vs. red; to, two, and too.
• Homographs: same orth, different phon.
– bass vs. bass
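The three relations differ only in which forms two lexemes share, which a short sketch can make explicit. The `Lexeme` class mirrors the three parts of a lexicon entry; the informal phonological spellings and the glosses are assumptions, and whether two senses are related (polysemy versus homonymy) is not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lexeme:
    orth: str    # orthographic form
    phon: str    # phonological form (informal spelling, assumed for illustration)
    sense: str   # short gloss of the meaning

def relation(a, b):
    """Classify two lexemes with different senses by which forms they share."""
    same_orth = a.orth == b.orth
    same_phon = a.phon == b.phon
    if same_orth and same_phon:
        return "homonyms"
    if same_phon:
        return "homophones"
    if same_orth:
        return "homographs"
    return "no shared form"

bank_money = Lexeme("bank", "bank", "financial institution")
bank_river = Lexeme("bank", "bank", "side of a river")
read_past = Lexeme("read", "red", "past tense of read")
red_color = Lexeme("red", "red", "a color")
bass_fish = Lexeme("bass", "bas", "a fish")
bass_tone = Lexeme("bass", "bays", "low-pitched sound")

print(relation(bank_money, bank_river))  # homonyms
print(relation(read_past, red_color))    # homophones
print(relation(bass_fish, bass_tone))    # homographs
```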
Polysemy
• Word with multiple but related meanings
– He served his time in prison
– He served as U.N. ambassador
– They rarely served lunch after 3pm.
• What’s the difference between polysemy and
homonymy:
– Homonymy: distinct, unrelated meanings
– Polysemy: distinct but related meanings
– How to decide: etymology, notion of coincidence
Synonymy
• Different lexemes with the same meaning
• Substitutable in some environment:
– How big is that plane?
– How large is that plane?
• What influences substitutability?
– Polysemy: big brother vs. large brother
– Subtle shade of meaning: first class fare/?price
– Collocational constraints: big/?large mistake
– Register: social factors
Hyponymy
• General: hypernym
– “vehicle” is a hypernym of “car”
• Specific: hyponym
– “car” is a hyponym of “vehicle”.
• Test: X is a car implies that X is a vehicle.
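The implication test suggests treating hyponymy as reachability along hypernym links. A minimal sketch, assuming a toy single-parent hierarchy whose entries are invented for illustration and not taken from WordNet:

```python
# Toy hypernym links (child -> parent); entries are illustrative only.
HYPERNYM = {"car": "vehicle", "truck": "vehicle", "vehicle": "artifact"}

def is_hyponym_of(word, ancestor):
    """True if `ancestor` can be reached from `word` by following hypernym links."""
    current = HYPERNYM.get(word)
    while current is not None:
        if current == ancestor:
            return True
        current = HYPERNYM.get(current)
    return False

print(is_hyponym_of("car", "vehicle"))   # True
print(is_hyponym_of("car", "artifact"))  # True
print(is_hyponym_of("vehicle", "car"))   # False
```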
Ontology and taxonomy
• Ontology:
– It is a specification of a conceptualization of a knowledge domain
– It is a controlled vocabulary that describes objects and the
relations between them in a formal way, and has strict rules
about how to specify terms and relationships.
• Taxonomy:
– A taxonomy is a hierarchical data structure or a type of
classification schema made up of classes, where a child of a
taxonomy node represents a more restricted, smaller, subclass
than its parent.
– a particular arrangement of the elements of an ontology into a
tree-like class inclusion structure.
WordNet
• Most widely used lexical database for English
• Developed by George Miller et al. at Princeton
• Three databases: Noun, Verb, Adj/Adv
• Each entry in a database: a unique orthographic form +
a set of senses
• Synset: a set of synonyms
• http://www.cogsci.princeton.edu/~wn
WordNet (cont)
• Nouns:
– Hypernym: meal, lunch
– Has-Member: crew, pilot
– Has-part: table, leg
– Antonym: leader, follower
• Verbs:
– Hypernym: travel, fly
– Entail: snore → sleep
– Antonym: increase ↔ decrease
• Adj/Adv:
– Antonym: heavy ↔ light, quickly ↔ slowly
Lexical semantics
• Meaning of word: word senses
• Relations among words:
• Predicate-argument structures
• Thematic roles
• Selectional restrictions
• Mapping from conceptual structures to grammatical
functions
• Word classes and alternations
Predicate-argument structure
• Predicate-argument:
– Verb/adj as predicate
– Nouns etc. as arguments
– Example: buy(Mary, book)
• Subcategorization frame:
– specify number, position, and syntactic category of arguments
(or complements)
– Example:
• (NP, NP): I want Italian food
• (NP, Inf-VP): I want to save money
• (NP, NP, Inf-VP): I want the book to be delivered tomorrow.
Thematic (Semantic) roles
• A set of roles:
– Agent: the volitional causer of an event
– Force: the non-volitional causer of an event
– Patient/Theme: the one most directly affected by an event
– Experiencer: the experiencer of an event
– Others: Instrument, Source, Goal, Beneficiary, …
• Example:
– John broke a glass
– John broke an ankle in the game
Selectional restriction
• Mary ate the cake
• ?The table ate the cake
• Mary ate Italian food with her friends.
• Mary ate somewhere with her friends.
• The White House announced that …
• The spider assassinated the fly.
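Selectional restrictions can be sketched as type constraints on role fillers, checked against a small type hierarchy. Everything here is an illustrative assumption: the `ISA` hierarchy, the restriction on "eat", and the role names.

```python
# Toy type hierarchy (child -> parent) and restrictions; all entries are assumptions.
ISA = {
    "Mary": "animate", "table": "inanimate", "cake": "edible",
    "animate": "entity", "inanimate": "entity", "edible": "entity",
}
RESTRICTIONS = {"eat": {"agent": "animate", "theme": "edible"}}

def has_type(word, required):
    """Walk up the hierarchy from `word` looking for the required type."""
    while word is not None:
        if word == required:
            return True
        word = ISA.get(word)
    return False

def violations(verb, **roles):
    """Return the roles whose filler does not satisfy the verb's restriction."""
    return [role for role, filler in roles.items()
            if not has_type(filler, RESTRICTIONS[verb][role])]

print(violations("eat", agent="Mary", theme="cake"))   # []
print(violations("eat", agent="table", theme="cake"))  # ['agent']
```

A hard check like this would also reject figurative or metonymic uses, which is one reason restrictions are often treated as preferences rather than strict rules.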
FrameNet
• Developed by Fillmore and Baker at UC
Berkeley since 1997.
• http://www.icsi.berkeley.edu/~framenet
• FrameNet database has two parts:
– Frame database: a list of semantic frames,
and relations between them, such as frame
inheritance and frame composition.
– Lexical database: each entry (called a lexical
unit) is a (lemma, semantic frame) pair.
Semantic frames
• Definition
• Frame elements (FEs): conceptual structure
– Core FEs: Communicator, Medium, Message, Topic
– Non-Core FEs: time, place, manner
• Inherit from:
• Subframes:
• Lexical units:
• Example sentences:
One frame
• Frame: Communication
– Definition: A Communicator conveys a Message to an
Addressee. The Topic and Medium of the
communication also may be expressed.
– Core FEs: Addressee, Communicator, Medium,
Message, Topic
– Lexical units: communicate, indicate, signal
Another frame
Frame: Statement
– Inherit from: Communication
– Definition: This frame contains verbs and nouns that
communicate the act of a Speaker to address a
Message to some Addressee using language.
– Core FEs: Communicator, Medium, Message, Topic
– Lexical units: admit, affirm, express,….
Project status
• More than 625 semantic frames, 8900
entries in the lexicon.
• Version 1.2 released in June 2005.
• Book: “FrameNet: Theory and Practice”
(printed June 2005)
Proposition Bank (PropBank)
• Developed by Palmer and Marcus at
UPenn.
• http://www.cis.upenn.edu/~ace
• Annotate the English Penn Treebank with
predicate-argument information
• Corpus can be used for automatic labeling
of thematic roles
Semantic tags
• Main tags:
– Arg0: Agent
– Arg1: theme or direct object
– Arg2: instrument, indirect object
– …
• Secondary tags:
– ArgM-DIR: direction
– ArgM-LOC: locative
– ArgM-NEG: negation
– ArgM-DIS: discourse
– …
Semantic tags (cont)
• Main tags are defined based on each verb.
• Example:
– Buy: John bought a book from Mary for 5 dollars
– Sell: Mary sold a book to John for 5 dollars
– Pay: John paid Mary 5 dollars for a book.
Tag     Buy             Sell            Pay
Arg0    buyer           seller          buyer
Arg1    thing bought    thing bought    price paid
Arg2    seller          buyer           seller
Arg3    price paid      price paid      thing bought
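Because the numbered arguments are defined per verb, the three example sentences receive different tags yet describe the same transaction. A sketch of that mapping, using the role labels from the slide's table (the `ROLESETS` dictionary and `to_roles` helper are hypothetical names):

```python
# Verb-specific numbered arguments, following the Buy / Sell / Pay table on the slide.
ROLESETS = {
    "buy":  {"Arg0": "buyer",  "Arg1": "thing bought", "Arg2": "seller", "Arg3": "price paid"},
    "sell": {"Arg0": "seller", "Arg1": "thing bought", "Arg2": "buyer",  "Arg3": "price paid"},
    "pay":  {"Arg0": "buyer",  "Arg1": "price paid",   "Arg2": "seller", "Arg3": "thing bought"},
}

def to_roles(verb, **args):
    """Translate a verb's numbered arguments into verb-independent role names."""
    return {ROLESETS[verb][arg]: filler for arg, filler in args.items()}

buy = to_roles("buy", Arg0="John", Arg1="a book", Arg2="Mary", Arg3="5 dollars")
sell = to_roles("sell", Arg0="Mary", Arg1="a book", Arg2="John", Arg3="5 dollars")
pay = to_roles("pay", Arg0="John", Arg1="5 dollars", Arg2="Mary", Arg3="a book")

print(buy)
print(buy == sell == pay)  # True
```

Mapping all three sentences to one role dictionary is a small instance of the canonical-form requirement.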
Lexical semantics
• Meaning of word: word senses
• Relations among words:
• Predicate-argument structures
• Thematic roles
• Selectional restrictions
• Mapping from conceptual structure to
grammatical function
• Word classes and alternations
Mapping between conceptual structure
and grammatical function
• Buy: buyer, thing bought, seller, price,….
• Possible syntactic realizations:
– (buyer, thing bought): John bought a book
– (price, thing bought): $5 can buy two books
– (thing bought, seller): The book was bought from
Mary
– (buyer, thing bought, seller): John bought a book from
Mary.
– **(buyer, price): John bought $5.
Alternations
• An alternation is a set of different
mappings of conceptual roles to
grammatical function.
• Example: dative alternation
– John gave Mary a book
– John gave a book to Mary
• Verb classes: give, donate, …
Levin’s verb classes
• Levin (1993):
– Verb classes
– Alternations
– Show the list of alternations a verb class can take.
• Problems:
– Many verbs appear in multiple classes
– Verbs in the same class do not behave exactly the
same: e.g., (meet, visit), (give, donate), ….
Summary of lexical semantics (1)
• Meaning of word: word senses
• Relations among words:
– Homonyms: bank, bank
– Homophones: read, red
– Homographs: bass, bass
– Polysemy: bank: blood bank, financial bank
– Synonyms: big, large
– Hypernym/Hyponym: vehicle, car
• Ontology and taxonomy
• WordNet
Summary of lexical semantics (2)
• Predicate-argument structures
• Thematic roles
• Selectional restrictions
• FrameNet
• PropBank
Summary of lexical semantics (3)
• Mapping from conceptual structures to
grammatical functions
• Word classes and alternations
• Levin’s verb classes for English
Summary of semantics
• Meaning representation:
– Criteria for good representation
– First-order predicate calculus (FOPC)
• Semantic analysis:
– Syntax-based semantic analysis
– Semantic grammar
– Information extraction
• Lexical semantics:
– WordNet
– FrameNet
– PropBank
– Levin’s verb classes