ANLP: Assignment 2
Andrew T. Fiore
29 September 2004
I grabbed sentences containing "help," "helps," "helping," and "helped" from 100 documents of the
Treebank corpus. These searches found 34 distinct sentences in which one or more of these words occur.
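The search described above can be sketched in a few lines. This is only an illustration, not the code used for the assignment: it assumes the sentences are available as whitespace-tokenized strings (as in the sentence listing at the end), and the names HELP_FORMS and sentences_with_help are hypothetical.

```python
# Hypothetical helper names; the assignment's actual search code is not shown.
HELP_FORMS = {"help", "helps", "helping", "helped"}

def sentences_with_help(sentences):
    """Return the distinct sentences containing any form of "help", in first-seen order."""
    seen = set()
    hits = []
    for sentence in sentences:
        # Sentences are assumed to be whitespace-tokenized, as in the Treebank text.
        tokens = {tok.lower() for tok in sentence.split()}
        if tokens & HELP_FORMS and sentence not in seen:
            seen.add(sentence)
            hits.append(sentence)
    return hits

sample = [
    "Japanese money will help turn Southeast Asia into a more cohesive economic region .",
    "This made-up sentence has no form of the target word .",
    "Do n't wait -- a savings institution needs your help now !",
    # A repeat, to show that only distinct sentences are counted.
    "Japanese money will help turn Southeast Asia into a more cohesive economic region .",
]
matches = sentences_with_help(sample)
print(len(matches))
```

Matching on whole tokens rather than substrings keeps words such as "helpful" out of the set, and it picks up noun uses of "help" as well as verb uses.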
Here are my modified chunking rules. Following the rules are some sentences from the "help" set,
chunked both by the basic rules and by my modified rules. (There are no line breaks in the rules, but
they will probably wrap, depending on how you are viewing this file.)
NP-Chunks:
(<DT>?<RB>?)?((<NN.*|POS|CD|PRP\$|\$>|(<JJ.*|RBR>(<,>)?)))*(<NN.*|POS|CD|WP|PRP>)+
PP-Chunks:
(<IN|TO>)+<NP>(((<,>)?<NP>)*(<,>)?<CC><NP>)?
VP-Chunks:
(<MD|TO>)?(<VB.*>)+(<IN>)?(<TO>)?(<VB.*>)*(<RP|RB.*>)?(<NP>|<PP>)*(<IN>)?(<RP|RB.*>)?
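To make the notation concrete, here is a minimal sketch of how a tag pattern like the NP rule above could be applied to a tagged sentence using only Python's re module. It is an assumption about how such rules can be interpreted, not the chunker used for the assignment: the functions tag_pattern_to_regex and chunk are hypothetical names, and the approach (turning each <...> unit into a regex group and matching against a string of tags) is one simple way to do it.

```python
import re

# The NP rule from above, copied verbatim.
NP_RULE = r"(<DT>?<RB>?)?((<NN.*|POS|CD|PRP\$|\$>|(<JJ.*|RBR>(<,>)?)))*(<NN.*|POS|CD|WP|PRP>)+"

def tag_pattern_to_regex(tag_pattern):
    """Turn a tag pattern such as <DT>?<NN.*>+ into a regex over a string like "<DT><NN>"."""
    # Wrap every <...> unit in a group so that ?, * and + apply to the whole tag.
    wrapped = re.sub(r"<([^>]*)>", r"(?:<(?:\1)>)", tag_pattern)
    # Keep "." from matching across tag boundaries.
    return re.compile(wrapped.replace(".", "[^<>]"))

def chunk(tagged, tag_pattern):
    """Return the word lists of all non-overlapping matches of tag_pattern."""
    regex = tag_pattern_to_regex(tag_pattern)
    tag_string = "".join("<%s>" % tag for _, tag in tagged)
    # Map the character offset where each tag starts to its token index.
    starts, pos = {}, 0
    for i, (_, tag) in enumerate(tagged):
        starts[pos] = i
        pos += len(tag) + 2  # the tag plus its angle brackets
    starts[pos] = len(tagged)
    chunks = []
    for m in regex.finditer(tag_string):
        if m.end() > m.start():  # ignore empty matches
            i, j = starts[m.start()], starts[m.end()]
            chunks.append([word for word, _ in tagged[i:j]])
    return chunks

tagged = [
    ("Magna", "NNP"), ("International", "NNP"), ("Inc.", "NNP"), ("'s", "POS"),
    ("chief", "JJ"), ("financial", "JJ"), ("officer", "NN"), (",", ","),
    ("James", "NNP"), ("McAlpine", "NNP"),
]
np_chunks = chunk(tagged, NP_RULE)
for words in np_chunks:
    print("NP:", " ".join(words))
```

On this fragment the sketch yields two NPs, "Magna International Inc. 's chief financial officer" and "James McAlpine," with the comma between them left outside both chunks.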
Clearly, NP-chunks are the most complicated, followed by VP-chunks. The amount of variation in
describing things in the world is tremendous, so NP-chunks can have complex and lengthy structure. I
decided that appositives should each constitute their own NPs to keep chunks simple and relatively
atomic. To improve accuracy, I added many tags that can sometimes act like adjectives (CD, PRP$, $,
RBR) to the adjective portion of the RegExp, and many tags that can act like nouns (CD, WP, PRP) to the
noun portion. I also allowed for commas in NPs only after JJs and RBRs, to avoid overrunning the
boundaries of the NP, as I saw in some test sentences with the basic rules.
To the PP chunker I added <TO> as an alternative to the other prepositions <IN>, because in the Penn
Treebank tag set, <TO> marks not only the infinitive "to" but also the preposition "to."
(See http://www.scs.leeds.ac.uk/amalgam/tagsets/upenn.html)
Also, I added the possibility of compound objects of the preposition: one or more NPs connected by
commas or, for the last NP in a series, a coordinating conjunction.
The major modification to the VP chunking was to allow for prepositions (for verbs like "step out") and TO
infinitive markers. See sample sentences below for some examples of how the prepositions are
matched. Also, I added adverbs (RB.*) where before there were only particles (RP); the two serve largely
the same function within VPs.
COMPARISON SENTENCE #1
Here's a sentence chunked with the provided basic rules. The possessive pronoun "its" should be
attached to the noun it modifies ("chairman"). The complicated verb phrase, "is stepping in to help
turn...around," is only partially captured, for two reasons. First, it contains "in," a preposition that
functions more as part of the verb, "stepping in," and as such has no prepositional phrase of its own.
Second, it contains the <to/TO> infinitive indicator. Finally, the verb phrase at the end, "said," is not
recognized because the rules don't
allow for a VP with no direct object following the verb.
Base:
(S:
(NP: <Magna/NNP> <International/NNP> <Inc./NNP>)
<'s/POS>
(NP: <chief/JJ> <financial/JJ> <officer/NN>)
<,/,>
(NP: <James/NNP> <McAlpine/NNP>)
<,/,>
<resigned/JJ>
<and/CC>
<its/PRP$>
(NP: <chairman/NN>)
<,/,>
(NP: <Frank/NNP> <Stronach/NNP>)
<,/,>
<is/VBZ>
<stepping/VBG>
<in/IN>
<to/TO>
(VP:
<help/VB>
<turn/VB>
(NP: <the/DT> <automotive-parts/JJ> <manufacturer/NN>))
<around/IN>
<,/,>
(NP: <the/DT> <company/NN>)
<said/VBD>
<./.>)
With my new rules, the chunking is more correct:
Andrew:
(S:
(NP:
<Magna/NNP>
<International/NNP>
<Inc./NNP>
<'s/POS>
<chief/JJ>
<financial/JJ>
<officer/NN>)
<,/,>
(NP: <James/NNP> <McAlpine/NNP>)
<,/,>
<resigned/JJ>
<and/CC>
(NP: <its/PRP$> <chairman/NN>)
<,/,>
(NP: <Frank/NNP> <Stronach/NNP>)
<,/,>
(VP:
<is/VBZ>
<stepping/VBG>
<in/IN>
<to/TO>
<help/VB>
<turn/VB>
(NP: <the/DT> <automotive-parts/JJ> <manufacturer/NN>)
<around/IN>)
<,/,>
(NP: <the/DT> <company/NN>)
(VP: <said/VBD>)
<./.>)
With my rules, the compound noun phrase with POS, "Magna International Inc.'s", is correctly associated
with the NP ending in "officer." The possessive pronoun "its" is correctly associated with the noun it
modifies, "chairman." "Resigned" still isn't identified as a VP, but that's because it is mis-tagged as JJ
instead of VBD. The large VP is correctly identified as one chunk, including the preposition "around" at
the end of the VP, which is part of the expression "turn around," so it doesn't belong in a separate PP.
There are many examples of this in informal English, so I thought it was reasonable to add an optional IN
to the end of a VP. This is predicated on the PP-chunker running BEFORE the VP-chunker, so that
prepositions that are part of a legitimate PP will be tagged as such first.
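The dependence on rule order can be illustrated with a small cascade. The following is a sketch under stated assumptions, not the chunker used for the assignment: the helper names (to_regex, symbol, apply_rule, cascade, show) are hypothetical, and each rule is applied by matching a regex against a string of tags in which already-built chunks appear as <NP> or <PP>. The three rules are copied from the top of this writeup.

```python
import re

NP = r"(<DT>?<RB>?)?((<NN.*|POS|CD|PRP\$|\$>|(<JJ.*|RBR>(<,>)?)))*(<NN.*|POS|CD|WP|PRP>)+"
PP = r"(<IN|TO>)+<NP>(((<,>)?<NP>)*(<,>)?<CC><NP>)?"
VP = r"(<MD|TO>)?(<VB.*>)+(<IN>)?(<TO>)?(<VB.*>)*(<RP|RB.*>)?(<NP>|<PP>)*(<IN>)?(<RP|RB.*>)?"

def to_regex(tag_pattern):
    # Group each <...> unit so quantifiers apply to whole tags; keep "." inside one tag.
    wrapped = re.sub(r"<([^>]*)>", r"(?:<(?:\1)>)", tag_pattern)
    return re.compile(wrapped.replace(".", "[^<>]"))

def symbol(node):
    # A leaf is (word, tag); a chunk is (label, [children]).
    return node[0] if isinstance(node[1], list) else node[1]

def apply_rule(label, tag_pattern, nodes):
    """Wrap every match of tag_pattern over the current nodes in a new chunk."""
    regex = to_regex(tag_pattern)
    tag_string = "".join("<%s>" % symbol(n) for n in nodes)
    starts, pos = {}, 0
    for i, node in enumerate(nodes):
        starts[pos] = i
        pos += len(symbol(node)) + 2
    starts[pos] = len(nodes)
    out, done = [], 0
    for m in regex.finditer(tag_string):
        if m.end() == m.start():
            continue  # skip empty matches
        i, j = starts[m.start()], starts[m.end()]
        out.extend(nodes[done:i])
        out.append((label, nodes[i:j]))
        done = j
    out.extend(nodes[done:])
    return out

def cascade(tagged, rules):
    nodes = list(tagged)
    for label, pattern in rules:
        nodes = apply_rule(label, pattern, nodes)
    return nodes

def show(node):
    if isinstance(node[1], list):
        return "(%s: %s)" % (node[0], " ".join(show(c) for c in node[1]))
    return "<%s/%s>" % node

tagged = [("helped", "VBN"), ("by", "IN"), ("strength", "NN"), ("in", "IN"),
          ("the", "DT"), ("defense", "NN"), ("capital", "NN"),
          ("goods", "NNS"), ("sector", "NN")]
pp_first = cascade(tagged, [("NP", NP), ("PP", PP), ("VP", VP)])
vp_first = cascade(tagged, [("NP", NP), ("VP", VP), ("PP", PP)])
print(" ".join(show(n) for n in pp_first))
print(" ".join(show(n) for n in vp_first))
```

With the PP rule applied first, "by strength" and "in the defense capital goods sector" become PPs inside a single VP. With the VP rule applied first, the optional IN slots absorb "by" and "in" as bare prepositions and the final NP is stranded outside the VP.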
I made the direct object following a verb optional: (<NP>|<PP>)* This allows intransitive verbs and, as
in this sentence, transitive verbs with a relocated object to be chunked correctly.
COMPARISON SENTENCE #2
Base:
(S:
(PP:
<As/IN>
(NP: <a/DT> <Foster/NNP> <Corporate/NNP> <Parent/NNP>))
<,/,>
<you/PRP>
(VP: <will/MD> <experience/VB> (NP: <the/DT> <same/JJ> <joy/NN>))
(VP: <felt/VBD> (PP: <by/IN> (NP: <Robert/NNP> <Bass/NNP>)))
<,/,>
(NP: <Lewis/NNP> <Ranieri/NNP>)
<,/,>
(NP: <William/NNP> <Simon/NNP>)
<and/CC>
(NP: <others/NNS>)
<,/,>
<who/WP>
(VP: <find/VBP> (NP: <ways/NNS>))
<to/TO>
(VP:
<help/VB>
<troubled/VBN>
(NP: <savings/NNS> <institutions/NNS>))
<and/CC>
<their/PRP$>
(NP: <employees/NNS>)
<help/VB>
<themselves/PRP>
<./.>)
In the chunking with the basic rules, the pronoun "you" isn't identified as a noun chunk. The PP
beginning with "by" is linked with only the first of its four objects. The <who/WP> is not identified as a
form of noun. The TO in "to help" is not included with the VP whose infinitive it introduces. Finally, the
personal pronouns "their" and "themselves" are not treated properly as an adjective and a direct object,
respectively.
Andrew:
(S:
(PP:
<As/IN>
(NP: <a/DT> <Foster/NNP> <Corporate/NNP> <Parent/NNP>))
<,/,>
(NP: <you/PRP>)
(VP: <will/MD> <experience/VB> (NP: <the/DT> <same/JJ> <joy/NN>))
(VP:
<felt/VBD>
(PP:
<by/IN>
(NP: <Robert/NNP> <Bass/NNP>)
<,/,>
(NP: <Lewis/NNP> <Ranieri/NNP>)
<,/,>
(NP: <William/NNP> <Simon/NNP>)
<and/CC>
(NP: <others/NNS>)))
<,/,>
(NP: <who/WP>)
(VP: <find/VBP> (NP: <ways/NNS>))
(VP:
<to/TO>
<help/VB>
<troubled/VBN>
(NP: <savings/NNS> <institutions/NNS>))
<and/CC>
(NP: <their/PRP$> <employees/NNS>)
(VP: <help/VB> (NP: <themselves/PRP>))
<./.>)
My chunking correctly identifies "you" as a noun. It groups the compound objects of <by/IN> as one PP.
It treats the pronoun <who/WP> as part of an NP. It correctly pairs <to/TO> with its VP. Finally, it
properly treats "their" and "themselves" as adjective and direct object, respectively.
COMPARISON SENTENCE #3
Base:
(S:
(NP: <The/DT> <show/NN>)
<did/VBD>
<n't/RB>
(VP:
<give/VB>
(NP: <the/DT> <particulars/NNS>)
(PP: <of/IN> (NP: <Mrs./NNP> <Yeargin/NNP>)))
<'s/POS>
(NP: <offense/NN>)
<,/,>
<saying/VBG>
<only/RB>
<that/IN>
<she/PRP>
(VP: <helped/VBD> (NP: <students/NNS>))
<do/VB>
<better/RBR>
(PP: <on/IN> (NP: <the/DT> <test/NN>))
<./.>)
The basic chunking, by not allowing for the many possible positions and tag variants for adverbs, fails to
chunk most of the VPs in this sentence. Also, the POS after "Mrs. Yeargin" is detached from her NP.
Andrew:
(S:
(NP: <The/DT> <show/NN>)
(VP: <did/VBD> <n't/RB>)
(VP:
<give/VB>
(NP: <the/DT> <particulars/NNS>)
(PP:
<of/IN>
(NP: <Mrs./NNP> <Yeargin/NNP> <'s/POS> <offense/NN>)))
<,/,>
(VP: <saying/VBG> <only/RB> (PP: <that/IN> (NP: <she/PRP>)))
(VP: <helped/VBD> (NP: <students/NNS>))
(VP: <do/VB> <better/RBR> (PP: <on/IN> (NP: <the/DT> <test/NN>)))
<./.>)
With my rules, the VPs with adverbs are joined more appropriately: "didn't," "saying only that she,"
"helped students," "do better on the test." The final three VPs could plausibly be treated as one or two,
but it depends on how fine-grained a chunking the user wants. Also, the possessive is correctly attached
to "Mrs. Yeargin"; together, they act as an adjective modifying "offense."
OCCURRENCES OF "HELP"
Breakdown by syntactic structure:
(see below for listing of VPs with "help")
1. help VB (infinitive with implied TO, "help keep stock") (15 instances)
2. help NP (direct object, "help the kids") (10 instances)
3. helped PP (passive voice with prepositional phrase, "helped by strength") (2 instances)
4. help TO VB (infinitive, "helping to limit") (1 instance)
5. help (no object) (1 instance)
6. help PP (no object, but PP functions like infinitive, "help in saving money") (1 instance)
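A first pass at this breakdown could be automated by looking at whatever immediately follows the "help" token in each chunked VP. The sketch below is hypothetical and was not used to produce the counts above: the function classify_help and its input format (a flat list of (word, symbol) pairs, with nested chunks written as (None, "NP") or (None, "PP")) are assumptions made for illustration. It does not separate passive from active uses, and it would not necessarily reproduce the hand counts, since some cases depend on judgment about mis-tagged words.

```python
# Hypothetical names and input format, for illustration only.
HELP_FORMS = {"help", "helps", "helping", "helped"}

def classify_help(vp):
    """Label a VP by the symbol that immediately follows its "help" token."""
    for i, (word, _) in enumerate(vp):
        if word is not None and word.lower() in HELP_FORMS:
            if i + 1 == len(vp):
                return "help (no object)"
            following = vp[i + 1][1]
            if following == "TO":
                return "help TO VB"
            if following.startswith("VB"):
                return "help VB"
            if following == "NP":
                return "help NP"
            if following == "PP":
                return "help PP"
            return "other"
    return "no help"

examples = [
    [("will", "MD"), ("help", "VB"), ("meet", "VB"), ("increasing", "VBG")],
    [("help", "VB"), (None, "NP")],
    [("helped", "VBN"), (None, "PP"), (None, "PP")],
    [("helping", "VBG"), ("to", "TO"), ("limit", "VB"), (None, "NP")],
    [("wanted", "VBD"), ("to", "TO"), ("help", "VB")],
]
labels = [classify_help(vp) for vp in examples]
for label in labels:
    print(label)
```

Cases that fall through to "other," such as "help in saving money" (where a bare IN follows) or "helped much by the announcement" (where an adverb intervenes), are the ones that still need to be sorted by hand.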
The subjects of "help" are structurally very diverse. Sometimes the subject is a simple noun or pronoun,
including personal pronouns, or perhaps the impersonal construction "it helps." On the other hand, since
this corpus consisted of many news articles, I also saw elaborate noun phrases, usually representing
organizations or abstract forces, as the subject.
Semantic structure:
(see below for listing of VPs with "help")
"Help" occurs with semantically diverse words. Predictably, some occurrences, especially those with
syntactic structure 2 from the above list (taking a direct object), involve providing aid of some sort. In
several cases in my test sentences, the object was youth: kids, students, (indirectly) juvenile diabetes.
Several of these occurrences come from the same article or series of articles, about a teacher who
apparently gave illicit aid to some students on a test.
With syntactic structures 1 and 4 (see above), the word "help" loses some of its meaning. In these cases,
it carries the connotation not only of literally providing aid but also of incompleteness -- the subject is not
performing the action thoroughly, or alone, or until completion, e.g.: "Policy makers regard the youth
wage as helping to limit the loss of jobs from an increase in the minimum wage , but they have lately
touted it as necessary to help impart job skills to entrants into the work force ."
The subjects of "help" are diverse, as the word is common and somewhat generic. Both people and
inanimate objects (often abstract events or conditions, e.g., "the subsequent flood of ... orders") occur as
the subject.
The objects of "help" are also diverse, but there are many occurrences of the young or underprivileged as
objects of the word. Often the grammatical object of "help" is an infinitive phrase (with or without an
explicit TO), so, semantically, its object is the notion of some action, event, or outcome.
SEMANTIC LISTING
will help meet increasing
is stepping in to help turn the automotive-parts manufacturer around
help build a larger version of its popular 767 twin-jet
will help turn Southeast Asia into a more cohesive economic region
was trying to help kids in an unfair testing situation
wanted to help lift Greenville High School
help the poor underprivileged child
wanted to help
could help save her teaching certificate
could help treat juvenile diabetes
help develop criminal cases
help Georgia Gulf restructure
help clear the myriad obstacles
help in saving money
will help keep a needy savings
help troubled savings institutions
help themselves
was needed to help keep stock
will help the overall market
help it
helping to limit the loss of jobs from an increase in the minimum wage
help impart job skills
helping her
'll be helping a neighborhood S&L in areas
helped by strength in the defense capital goods sector
helped students
helped much by the announcement
has helped underpin the dollar against the yen
has helped lure investors
have helped ratings at
SYNTACTIC LISTING
(VP: <will/MD> <help/VB> <meet/VB> <increasing/VBG>)
(VP:
<is/VBZ>
<stepping/VBG>
<in/IN>
<to/TO>
<help/VB>
<turn/VB>
(NP: <the/DT> <automotive-parts/JJ> <manufacturer/NN>)
<around/IN>)
(VP:
<help/VB>
<build/VB>
(NP: <a/DT> <larger/JJR> <version/NN>)
(PP: <of/IN> (NP: <its/PRP$> <popular/JJ> <767/CD> <twin-jet/NN>)))
(VP:
<will/MD>
<help/VB>
<turn/VB>
(NP: <Southeast/NNP> <Asia/NNP>)
(PP:
<into/IN>
(NP: <a/DT> <more/RBR> <cohesive/JJ> <economic/JJ> <region/NN>)))
(VP:
<was/VBD>
<trying/VBG>
<to/TO>
<help/VB>
(NP: <kids/NNS>)
(PP: <in/IN> (NP: <an/DT> <unfair/JJ> <testing/NN> <situation/NN>)))
(VP:
<wanted/VBD>
<to/TO>
<help/VB>
<lift/VB>
(NP: <Greenville/NNP> <High/NNP> <School/NNP>))
(VP:
<help/VB>
(NP: <the/DT> <poor/JJ> <underprivileged/JJ> <child/NN>))
(VP: <wanted/VBD> <to/TO> <help/VB>)
(VP:
<could/MD>
<help/VB>
<save/VB>
(NP: <her/PRP> <teaching/NN> <certificate/NN>))
(VP:
<could/MD>
<help/VB>
<treat/VB>
(NP: <juvenile/JJ> <diabetes/NN>))
(VP: <help/VB> <develop/VB> (NP: <criminal/JJ> <cases/NNS>))
(VP: <help/VB> (NP: <Georgia/NNP> <Gulf/NNP> <restructure/NN>))
(VP: <help/VB> <clear/VB> (NP: <the/DT> <myriad/JJ> <obstacles/NNS>))
(VP: <help/VB> <in/IN> <saving/VBG> (NP: <money/NN>))
(VP:
<will/MD>
<help/VB>
<keep/VB>
(NP: <a/DT> <needy/JJ> <savings/NNS>))
(VP: <help/VB> <troubled/VBN> (NP: <savings/NNS> <institutions/NNS>))
(VP: <help/VB> (NP: <themselves/PRP>))
(VP:
<was/VBD>
<needed/VBN>
<to/TO>
<help/VB>
<keep/VB>
(NP: <stock/NN>))
(VP: <will/MD> <help/VB> (NP: <the/DT> <overall/JJ> <market/NN>))
(VP: <help/VB> (NP: <it/PRP>))
(VP:
<helping/VBG>
<to/TO>
<limit/VB>
(NP: <the/DT> <loss/NN>)
(PP: <of/IN> (NP: <jobs/NNS>))
(PP: <from/IN> (NP: <an/DT> <increase/NN>))
(PP: <in/IN> (NP: <the/DT> <minimum/JJ> <wage/NN>)))
(VP: <help/VB> <impart/VB> (NP: <job/NN> <skills/NNS>))
(VP: <helping/VBG> (NP: <her/PRP>))
(VP:
<'ll/MD>
<be/VB>
<helping/VBG>
(NP: <a/DT> <neighborhood/NN> <S&L/NN>)
(PP: <in/IN> (NP: <areas/NNS>)))
(VP:
<helped/VBN>
(PP: <by/IN> (NP: <strength/NN>))
(PP:
<in/IN>
(NP: <the/DT> <defense/NN> <capital/NN> <goods/NNS> <sector/NN>)))
(VP: <helped/VBD> (NP: <students/NNS>))
(VP:
<helped/VBN>
<much/RB>
(PP: <by/IN> (NP: <the/DT> <announcement/NN>)))
(VP:
<has/VBZ>
<helped/VBN>
<underpin/VB>
(NP: <the/DT> <dollar/NN>)
(PP: <against/IN> (NP: <the/DT> <yen/NN>)))
(VP: <has/VBZ> <helped/VBN> (NP: <lure/NN> <investors/NNS>))
(VP: <have/VBP> <helped/VBN> (NP: <ratings/NNS>) <at/IN>)
SENTENCE LISTING
The new plant , located in Chinchon about 60 miles from Seoul , will help meet increasing and diversifying
demand for control products in South Korea , the company said .
Magna International Inc. 's chief financial officer , James McAlpine , resigned and its chairman , Frank
Stronach , is stepping in to help turn the automotive-parts manufacturer around , the company said .
Boeing Co. said it is discussing plans with three of its regular Japanese suppliers to possibly help build a
larger version of its popular 767 twin-jet .
Japanese money will help turn Southeast Asia into a more cohesive economic region .
After numerous occurrences of questionable teacher help to students , Texas is revising its security
practices .
`` I was trying to help kids in an unfair testing situation , '' she says .
'' Mrs. Yeargin says that she also wanted to help lift Greenville High School 's overall test scores , usually
near the bottom of 14 district high schools in rankings carried annually by local newspapers .
They found students in an advanced class a year earlier who said she gave them similar help , although
because the case was n't tried in court , this evidence was never presented publicly .
`` That pretty much defeats any inkling that she was out to help the poor underprivileged child , '' says Joe
Watson , the prosecutor in the case , who is also president of Greenville High School 's alumni
association .
Mrs. Yeargin concedes that she went over the questions in the earlier class , adding : `` I wanted to help
all '' students .
She says she offered Mrs. Yeargin a quiet resignation and thought she could help save her teaching
certificate .
Medical researchers believe the transplantation of small amounts of fetal tissue into humans could help
treat juvenile diabetes and such degenerative diseases as Alzheimer 's , Parkinson 's and Huntington 's .
But they have obtained 8300 forms without court permission and used the information to help develop
criminal cases .
The offer follows an earlier proposal by NL and Mr. Simmons to help Georgia Gulf restructure or go
private in a transaction that would pay shareholders $ 55 a share .
But for small American companies , it also provides a growing source of capital and even marketing help .
Partly to help clear the myriad obstacles facing any overseas company trying to penetrate Japan , tiny
Candela turned to Mitsui & Co. , one of Japan 's largest trading companies , for investment .
The program not only offers a pre-approved car loan up to $ 18,000 , but throws in a special cash-flow
statement to help in saving money .
Your $ 15,000 will help keep a needy savings and loan solvent -- and out of the federal budget deficit .
As a Foster Corporate Parent , you will experience the same joy felt by Robert Bass , Lewis Ranieri ,
William Simon and others , who find ways to help troubled savings institutions and their employees help
themselves .
Do n't wait -- a savings institution needs your help now !
But when the contract reopened , the subsequent flood of sell orders that quickly knocked the contract
down to the 30-point limit indicated that the intermediate limit of 20 points was needed to help keep stock
and stock-index futures prices synchronized .
But Mr. Boesel of T. Rowe Price , who also expects 12 % growth in dividends next year , does n't think it
will help the overall market all that much .
The company , which recently said it lacked the profits and capital to pay dividends on its Series A
convertible preferred stock , said it has hired an investment banker to help it raise additional cash .
Policy makers regard the youth wage as helping to limit the loss of jobs from an increase in the minimum
wage , but they have lately touted it as necessary to help impart job skills to entrants into the work force .
Mrs. Yeargin 's extra work was also helping her earn points in the state 's incentive-bonus program .
As a Foster Corporate Parent , you 'll be helping a neighborhood S&L in areas crucial to its survival .
Policy makers regard the youth wage as helping to limit the loss of jobs from an increase in the minimum
wage , but they have lately touted it as necessary to help impart job skills to entrants into the work force .
Manufacturers ' backlogs of unfilled orders rose 0.5 % in September to $ 497.34 billion , helped by
strength in the defense capital goods sector .
The show did n't give the particulars of Mrs. Yeargin 's offense , saying only that she helped students do
better on the test .
The stocks of banking concerns based in Massachusetts were n't helped much by the announcement ,
traders said , because many of those concerns have financial problems tied to their real-estate loan
portfolios , making them unattractive takeover targets .
While market sentiment remains cautiously bearish on the dollar based on sluggish U.S. economic
indicators , dealers note that Japanese demand has helped underpin the dollar against the yen and has
kept the U.S. currency from plunging below key levels against the mark .
Jay Goldinger , with Capital Insight Inc. , reasons that while the mark has posted significant gains against
the yen as well -- the mark climbed to 77.70 yen from 77.56 yen late Tuesday in New York -- the strength
of the U.S. bond market compared to its foreign counterparts has helped lure investors to
dollar-denominated bonds , rather than mark bonds.
The reruns have helped ratings at many of the 187 network affiliates and independent TV stations that air
the shows .