Advanced JAPE
Mark A. Greenwood
University of Sheffield NLP
Recap
• Installed and ran GATE
• Understand the idea of
 LR – Language Resources
 PR – Processing Resources
• ANNIE
 Understand the goals of information extraction
 Loaded ANNIE into GATE
 Constructed one or more gazetteer lists
• Created JAPE rules with simple RHS
Overview
• Simple RHS Limitations
• The RHS API
• Accessing Annotations and Features
• Adding New Annotations
• Hands-On
Simple RHS Limitations
• The simple RHS of a JAPE rule can only add
simple annotations and features
 Feature values are hard coded or can be copied from
annotations matched by the LHS
• You may need more complex processing
 Removing temporary annotations
 Building complex features
 ...
• Fortunately the RHS of a rule can consist of
arbitrary Java code – the possibilities are endless!
The RHS API
• Java code provided as a RHS is used as
the body of this method:
public void doit(Document doc,
                 Map bindings,
                 AnnotationSet annotations,
                 AnnotationSet inputAS,
                 AnnotationSet outputAS,
                 Ontology ontology) throws JapeException
• This provides easy access to the document,
rule bindings and annotations.
Do NOT use the annotations parameter: it is deprecated!
Accessing Annotations
and Features
• Each labelled section of the LHS results in
an Annotation Set
• These Annotation Sets can be retrieved
from the bindings map
AnnotationSet set = (AnnotationSet)bindings.get("labelname");
Accessing Annotations
and Features
• When writing complex JAPE you will often
need to access annotation features
• All features of an annotation are stored in
a map
FeatureMap map = annotation.getFeatures();
• Each feature is accessed by name
Object obj = map.get("featurename");
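Because a feature lookup returns a plain Object, and returns null when the feature is absent, it helps to compare and convert values defensively. The following is a minimal sketch in plain Java, assuming the feature map behaves like a java.util.Map, as the slide describes. A HashMap stands in for the FeatureMap so that the example runs without GATE, and the class and method names (FeatureLookupSketch, featureEquals, featureAsString) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not part of GATE or the original slides. */
final class FeatureLookupSketch {

    /** Returns true only if the named feature exists and equals the expected value. */
    static boolean featureEquals(Map<Object, Object> features, String name, Object expected) {
        // Calling equals on the expected value avoids a NullPointerException
        // when the feature is missing and get(...) returns null.
        return expected.equals(features.get(name));
    }

    /** Returns the feature as a String, or the fallback if it is absent. */
    static String featureAsString(Map<Object, Object> features, String name, String fallback) {
        Object value = features.get(name);
        return value == null ? fallback : value.toString();
    }

    public static void main(String[] args) {
        // Assumption: a HashMap stands in for the FeatureMap of a Lookup annotation.
        Map<Object, Object> features = new HashMap<>();
        features.put("majorType", "change");

        System.out.println(featureEquals(features, "majorType", "change"));    // prints "true"
        System.out.println(featureAsString(features, "minorType", "unknown")); // prints "unknown"
    }
}
```

Inside a real RHS the same pattern would be applied to the map returned by annotation.getFeatures().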
Adding New Annotations
• New annotations should always be
created in the outputAS
• To create an annotation you need
 The annotation name
 The start and end offset
 A FeatureMap instance (can be empty)
outputAS.add(start, end, label, features);
Shorthand Notation for
Java RHS
• Where a Java block refers to a single left-hand-side binding, JAPE provides a
shorthand notation:
Rule: RemoveDoneFlag
(
  {Instance.flag == "done"}
):inst
-->
:inst {
  Annotation theInstance = (Annotation)instAnnots.iterator().next();
  theInstance.getFeatures().remove("flag");
}
Shorthand Notation for
Java RHS
• A label :<label> on a Java block creates a
local variable <label>Annots within the
Java block which is the AnnotationSet
bound to the <label> label.
• The Java code in the block is only
executed if there is at least one annotation
bound to the label
Hands On:
Extending the IE Example
• In the previous JAPE session you wrote a
rule to annotate phrases such as
 Whitbread shares closed up 2p at 645p.
• Annotating the phrase is useful, but it also contains
information that would be worth extracting as
features
 Starting price
 Change in price
 Closing price
Hands On:
Extending the IE Example
• You will need to
 Extract the closing price and change
• assume they are always in pence so you can get
the value by removing the trailing ‘p’
 Get the minorType of the Lookup
 Calculate the starting price
 Create a new annotation with these values as
features
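The arithmetic behind these steps can be sketched in plain Java, independent of GATE. This is only an illustration: the class and method names (SharePriceSketch, parsePence, startingPrice) are hypothetical, and the sketch assumes, as stated above, that prices are whole pence values written with a trailing 'p'.

```java
/** Hypothetical helper; not part of GATE or the original slides. */
final class SharePriceSketch {

    /** Strips the trailing 'p' from text such as "645p" and parses the rest. */
    static int parsePence(String text) {
        String trimmed = text.trim();
        if (!trimmed.endsWith("p")) {
            throw new IllegalArgumentException("Expected a value in pence: " + text);
        }
        return Integer.parseInt(trimmed.substring(0, trimmed.length() - 1));
    }

    /** A rise means the price started lower than it closed; a fall means it started higher. */
    static int startingPrice(int closing, int change, boolean rise) {
        return rise ? closing - change : closing + change;
    }

    public static void main(String[] args) {
        // Values taken from the example phrase "closed up 2p at 645p".
        int closing = parsePence("645p");
        int change = parsePence("2p");
        System.out.println(startingPrice(closing, change, true) + "p"); // prints "643p"
    }
}
```

In a JAPE RHS the two strings would come from the document text covered by the Money annotations, and the rise flag from the minorType of the Lookup.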
Your Turn!
Feel Free To Refer To The User Guide
And To Ask For Help
Hands On:
Extending the IE Example
Phase: Shares
Input: Token Organization Lookup Money
Options: control = appelt

Rule: ShareChange
(
  {Organization}
  ({Token})[0,3]
  ({Lookup.majorType == "change"}):lookup
  ({Token})[0,3]
  ({Money}):delta
  {Token.string == "at"}
  ({Money}):closing
):change
-->
{
  try {
    // Retrieve the annotations bound to each LHS label
    AnnotationSet change = (AnnotationSet)bindings.get("change");
    Annotation delta = ((AnnotationSet)bindings.get("delta")).iterator().next();
    Annotation closing = ((AnnotationSet)bindings.get("closing")).iterator().next();
    Annotation lookup = ((AnnotationSet)bindings.get("lookup")).iterator().next();

    // The minorType of the Lookup says whether the price rose or fell;
    // comparing this way round is safe even if the feature is missing
    boolean rise = "Changes-up".equals(lookup.getFeatures().get("minorType"));

    // Read each Money value from the document text, dropping the trailing 'p'
    int deltaValue = Integer.parseInt(doc.getContent().getContent(
        delta.getStartNode().getOffset(),
        delta.getEndNode().getOffset() - 1).toString());
    int closingValue = Integer.parseInt(doc.getContent().getContent(
        closing.getStartNode().getOffset(),
        closing.getEndNode().getOffset() - 1).toString());

    // A rise means the price started lower than it closed
    int startValue = (rise ? closingValue - deltaValue : closingValue + deltaValue);

    // Build the features and annotate the whole matched phrase
    FeatureMap features = Factory.newFeatureMap();
    features.put("rule", "ShareChange");
    features.put("opening", startValue + "p");
    features.put("change", deltaValue + "p");
    features.put("closing", closingValue + "p");
    features.put("direction", (rise ? "up" : "down"));
    outputAS.add(change.firstNode(), change.lastNode(), "ShareChange", features);
  }
  catch (Exception e) {
    // ignore this for now
  }
}