Extracting Information from
Participial Structures
Kata Gábor, Enikő Héja, Ágnes Mészáros
Research Institute for Linguistics, HAS
8th INTEX WORKSHOP, 2005
STRUCTURE
• IE system and its shortcoming: the problem of participles
• NPs and participles in Hungarian
• a possible enhancement of the IE system
• implementation in INTEX
IE system
• input text (1-2 sentences of short business news)
• shallow syntactic analysis
• pre-defined semantic patterns (event frames)
• output: event frames’ slots filled by the elements of the input text
• the event, its participants and circumstances are identified
Event frames
Az ABN Amro Bank egyesül a Kereskedelmi és Hitelbankkal.
ABN Amro Bank merges with the Commercial and Credit Bank.
<event schema="owner_changed.fusion.6" roles_matched="3/3">
<rv role="member_company_1" pos="N" case="NOM" sem="company|institute">
<NP id="88" sem="company countable human institute">
<w id="0" class="DET" at="1-1" lex="az" case="NOM">Az</w>
<w id="2" class="UNKNOWN" at="2-2" lex="ABN">ABN</w>
<w id="4" class="UNKNOWN" at="3-3" lex="Amro">Amro</w>
<w id="6" class="N" at="4-4" lex="bank" case="NOM">Bank</w>
</NP>
</rv>
<rv role="_1" pos="V" lemma="egyesül">
<w id="8" class="V" at="5-5" lex="egyesül">egyesül</w>
</rv>
<rv role="member_company_2" pos="N" case="INS" sem="company|institute">
<NP id="118" sem="company countable institute">
<w id="13" class="DET" at="6-6" lex="a" case="NOM">a</w>
<w id="15" class="ONADJ" at="7-7" lex="kereskedelem"
case="NOM">Kereskedelmi</w>
<w id="17" class="CONJ" at="8-8" lex="és">és</w>
<w id="19" class="N" at="9-9" lex="hitelbank" case="INS">Hitelbankkal.</w>
</NP>
</rv>
</event>
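A filled frame like the one above can be read back with a few lines of code. The following is a minimal sketch, assuming only the element and attribute names visible in the sample (`event`, `rv`, `NP`, `w`, `role`, `schema`); the function name `read_slots` is hypothetical, and the XML is an abridged copy of the sample with most attributes trimmed.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the event frame shown above (most attributes trimmed).
EVENT_XML = """
<event schema="owner_changed.fusion.6" roles_matched="3/3">
  <rv role="member_company_1" pos="N" case="NOM">
    <NP id="88">
      <w lex="az">Az</w>
      <w lex="ABN">ABN</w>
      <w lex="Amro">Amro</w>
      <w lex="bank">Bank</w>
    </NP>
  </rv>
  <rv role="_1" pos="V" lemma="egyesül">
    <w lex="egyesül">egyesül</w>
  </rv>
  <rv role="member_company_2" pos="N" case="INS">
    <NP id="118">
      <w lex="a">a</w>
      <w lex="kereskedelem">Kereskedelmi</w>
      <w lex="és">és</w>
      <w lex="hitelbank">Hitelbankkal.</w>
    </NP>
  </rv>
</event>
"""


def read_slots(xml_text):
    """Return (schema name, {role: surface string}) for one <event> element."""
    event = ET.fromstring(xml_text.strip())
    slots = {}
    for rv in event.findall("rv"):
        # iter("w") finds the words whether or not they are wrapped in an <NP>.
        words = [w.text for w in rv.iter("w")]
        slots[rv.get("role")] = " ".join(words)
    return event.get("schema"), slots


schema, slots = read_slots(EVENT_XML)
print(schema)
for role, text in slots.items():
    print(f"{role}: {text}")
```

Each `rv` element is one slot of the frame, so the dictionary keys are the role names and the values are the input words that filled them.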
Mapping syntax to event frames
SYNTAX → EVENT FRAMES
verb → main event
arguments → participants
free modifiers → circumstances (time, location, manner...)
Problem: secondary information (the cause or antecedent of the main event) is ‘hidden’ in participial structures:
[A befektetők által tegnap eladott részvények] megnövelték a tőzsde forgalmát.
[The shares sold yesterday by the investors] increased the turnover of the stock exchange.
The participial structure contains the elements of a secondary event:
a befektetők /the investors/
eladott /sold/
tegnap /yesterday/
részvények /shares/
The same information expressed as a sentence with a finite predicate:
A befektetők tegnap eladtak részvényeket.
The investors sold shares yesterday.
A solution
• a preprocessing module within the IE system which transforms participial structures into sentences with a finite predicate
• semantic frame matching may then operate on the transformed sentences
• 1st step: past participles within NPs
  • the participle preserves the meaning of its base verb
  • its arguments can be derived from the internal structure of the NP
NPs in Hungarian
[Figure: schematic structure of the Hungarian NP. Labels: DET; modifiers (ADV, NP+case, AP+case, N+Postp, V.INF, ...); participles (past, present); head noun]
Participles in Hungarian

ADJ – Participle homonymy is a problem:

“mérsékelt PC-chip kereslet”
modest /~moderated/ demand for PC-chips
* Valaki mérsékelte a PC-chip keresletet.
* Somebody moderated the demand for PC-chips.

“ragozott szóalakok”
inflected word forms
* Valaki ragozott szóalakokat.
* Somebody inflected word forms.

only participles can be transformed
Participle or Adjective?
• syntactic tests:
  • comparative
  • ADV formation
  • predicative use
  • impossibility of preverb detachment
• we need to decide in context whether the given word form is an ADJ or a PART:
1. If at least one of the base verb’s complements is present, then it is a participle.
Participle or Adjective?
• syntactic tests:
  • comparative
  • ADV formation
  • predicative use
  • preverb detachment
• we need to decide in context whether the given word form is an ADJ or a PART:
2. If at least one of the base verb’s complements or adjuncts, or a preverb, is present, then it is a participle.
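Rule 2 can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Candidate` record and its three flags are hypothetical, and it is assumed that an earlier analysis step has already detected complements, adjuncts and preverbs inside the NP.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A word form that is ambiguous between ADJ and PART, with its NP context."""
    form: str
    has_complement: bool = False  # a complement of the base verb is present
    has_adjunct: bool = False     # an adjunct is present, e.g. "tegnap"
    has_preverb: bool = False     # the base verb carries a preverb


def classify(c: Candidate) -> str:
    """Rule 2: any verbal dependent or a preverb marks the form as a participle."""
    if c.has_complement or c.has_adjunct or c.has_preverb:
        return "PART"
    return "ADJ"


# "a befektetők által tegnap eladott részvények"
print(classify(Candidate("eladott", has_complement=True, has_adjunct=True, has_preverb=True)))
# "mérsékelt PC-chip kereslet"
print(classify(Candidate("mérsékelt")))
```

Only forms classified as PART would be passed on to the transformation step; forms classified as ADJ stay inside the NP unchanged.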
Participle or Adjective?
TESTS:
• comparative: “mérsékeltebb kereslet”
  more moderate demand
• predicative: “Ez a szóalak ragozott.”
  This word form is inflected.
• ADV formation: mérsékelt → mérsékelten
  moderate → moderately
• preverb detachment:
  “a [fel nem újított] házak”
  “the [re- not stored] houses” (= not restored)
  * Ezek a házak [fel nem újítottak].
  * These houses are [re- not stored].
THE GRAMMAR
- the correctness and informativeness of the resulting sentence depend on the correct identification of verbal arguments and modifiers within the NP
- these elements are then transformed according to their grammatical function
• past participles may be formed from both transitive and intransitive verbs
• if the base verb is intransitive, the head noun of the NP represents the subject of the base verb:
  “az összedőlt épület” /the collapsed building/
• if the base verb is transitive, the head noun represents the direct object of the base verb:
  “a bejelentett változások” /the changes announced/
→ transitivity needs to be coded
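The dependence of the head noun's role on transitivity can be sketched as a lookup. The tiny `TRANSITIVITY` table below is a hypothetical stand-in for the dictionary coding; the two lemmas are the base verbs of the examples above.

```python
# Hypothetical stand-in for the transitivity coding of the dictionary.
TRANSITIVITY = {
    "összedől": False,  # "collapse", intransitive
    "bejelent": True,   # "announce", transitive
}


def head_noun_role(base_verb: str) -> str:
    """Return the grammatical function of the NP's head noun in the output sentence."""
    return "object" if TRANSITIVITY[base_verb] else "subject"


print(head_noun_role("összedől"))
print(head_noun_role("bejelent"))
```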
THE GRAMMAR
transformation rules are (enhanced) FSTs:
• they store relevant elements of the input NP in variables
• the output is made up of the content of these variables, in an altered order, plus the function words needed in the sentence
• our DELAF dictionary codes
  • the transitivity properties of verbs (on the basis of a lexicon-grammar of verbal argument structures)
  • a ± preverb feature, which shows whether the base verb has a preverb
Transformation Graphs 1.
Transitive Verbs
• transitive verbs without an expressed subject (“somebody” insertion):
  Det (V_compl) VMIB N → Valaki V_vmib Det N-t (V_compl) .
• transitive verbs with a subject marked by the PostP “által”:
  Det Nsubj által (V_compl) VMIB N → Nsubj V_vmib Det N-t (V_compl) .
Transformation Graphs 2.
Intransitive Verbs
• head N becomes subject (patient):
  Det (V_compl) VMIB N → Det N V_vmib (V_compl) .
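The reordering performed by the graphs can be imitated in ordinary code. The sketch below is not the INTEX implementation: the `ParticipialNP` record and `to_sentence` are hypothetical. It assumes the NP has already been split into the slots named in the graphs, and that the finite verb form (V_vmib) and the accusative form of the head noun (N-t) are supplied by a morphological lexicon.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParticipialNP:
    det: str                       # Det
    head: str                      # N, head noun in the nominative
    head_acc: str                  # N-t, head noun in the accusative
    finite_verb: str               # V_vmib, finite form of the participle's base verb
    transitive: bool               # transitivity of the base verb
    subject: Optional[str] = None  # Nsubj, taken from an "által" phrase if present
    v_compl: str = ""              # (V_compl), optional complements and adjuncts


def to_sentence(phrase: ParticipialNP) -> str:
    """Reorder the stored slots into a sentence with a finite predicate."""
    if not phrase.transitive:
        # Intransitive: the head noun becomes the subject.
        parts = [phrase.det, phrase.head, phrase.finite_verb, phrase.v_compl]
    else:
        # Transitive: the head noun becomes the object; insert "valaki"
        # ("somebody") when no subject is expressed.
        subj = phrase.subject or "valaki"
        parts = [subj, phrase.finite_verb, phrase.det, phrase.head_acc, phrase.v_compl]
    text = " ".join(p for p in parts if p)
    return text[0].upper() + text[1:] + "."


collapsed = ParticipialNP("az", "épület", "épületet", "összedőlt", transitive=False)
announced = ParticipialNP("a", "változások", "változásokat", "bejelentette", transitive=True)
print(to_sentence(collapsed))   # Az épület összedőlt.
print(to_sentence(announced))   # Valaki bejelentette a változásokat.
```

Keeping the slots in named fields mirrors the variables of the transducers: the output is built only from stored material plus the inserted function word.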
Structure of the graphs
• 1 graph
• 3 subgraphs according to complement types: possessor / verbal complement + adjunct / nothing
• each subgraph divided into two paths: transitive / intransitive verbs
Evaluation
• central aspect: to what extent does it increase the efficiency of the IE system?
• lack of information (recall) is considered less important than incorrect information (precision)
• evaluated on the 231,000-word corpus of short business news
• 1259 hits → 898 qualified as informative
• precision: 64%
• further task: recall (requires a corpus with manually annotated participial structures)
THANK YOU FOR YOUR ATTENTION!
{gkata, eheja, magnes}@corpus.nytud.hu