PROJECT TITLE
ADDING DOMAIN KNOWLEDGE TO INDUCTIVE LEARNING METHODS FOR CLASSIFYING
TEXTS
Kevin D. Ashley
University of Pittsburgh
Learning Research and Development Center
Graduate Program in Intelligent Systems
Contact Information
Kevin D. Ashley
3939 O'Hara Street
Pittsburgh, PA 15213
Phone: (412) 624-7496
Fax: (412) 624-9149
Email: [email protected]
http://www.lrdc.pitt.edu/Ashley/Default.htm
WWW PAGE
http://www.pitt.edu/~steffi/CBR/group.html
List of Supported Students and Staff (optional)
Stefanie Brüninghaus, GRA, University of Pittsburgh Graduate Program in Intelligent Systems
Project Award Information
Award Number: NSF IIS-9987869
Duration: 9/1/2000 -- 8/31/2001
Title: Adding Domain Knowledge to Inductive Learning Methods for Classifying Texts
Keywords
case-based reasoning (CBR), automated case indexing, automated text classification, knowledge-guided machine learning,
text-oriented CBR, factor-based text classification, legal information retrieval
Project Summary
The work improves current methods for learning to classify texts by incorporating knowledge from an expert domain model.
The goal is to classify the texts of legal opinions automatically in terms of the factors that apply to the cases described.
Factors are stereotypical fact patterns that tend to strengthen or weaken the underlying legal claims in a case; together with
their relations to legal issues, they are a kind of expert domain knowledge useful in legal argumentation. The program takes
the raw texts of legal opinions as input and outputs the applicable factors. Its training instances are drawn from a corpus of
legal opinions whose textual descriptions of cases have been represented manually in terms of factors. The problem is hard
because the language of the opinions is complex; the mere fact that an opinion discusses a factor does not necessarily imply
that the factor actually applies to the case. We employ ID3 to induce decision trees for classifying by factors. We are
exploring ways to use information extraction techniques and certain linguistic information (e.g., about negation) to
improve the text representation and classification performance.
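As an illustration only, and not the project's implementation, the following minimal sketch shows how ID3 can induce a decision tree that decides whether one factor applies, using the presence or absence of words as features. The four training sentences, the single unnamed factor, and the helper names (tokenize, id3, classify) are assumptions made for the example; real opinions are far longer and, as noted above, word presence alone is often misleading.

```python
import math
from collections import Counter


def tokenize(text):
    """Represent a text as the set of its lowercased, whitespace-separated words."""
    return set(text.lower().split())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(examples, word):
    """Entropy reduction obtained by splitting the examples on presence of `word`."""
    labels = [label for _, label in examples]
    present = [label for words, label in examples if word in words]
    absent = [label for words, label in examples if word not in words]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


def id3(examples, vocabulary):
    """Build a tree: a bare label at a leaf, (word, if_present, if_absent) elsewhere."""
    labels = [label for _, label in examples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not vocabulary:
        return majority
    # Sorting makes tie-breaking between equally good words deterministic.
    best = max(sorted(vocabulary), key=lambda w: information_gain(examples, w))
    if information_gain(examples, best) <= 0:
        return majority  # no word separates the remaining examples
    present = [(words, label) for words, label in examples if best in words]
    absent = [(words, label) for words, label in examples if best not in words]
    rest = vocabulary - {best}
    return (best, id3(present, rest), id3(absent, rest))


def classify(tree, words):
    """Follow word-presence tests from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        word, if_present, if_absent = tree
        tree = if_present if word in words else if_absent
    return tree


# Hypothetical training data: True means the (unnamed) factor applies.
training = [
    (tokenize("employees signed nondisclosure agreements"), True),
    (tokenize("visitors signed nondisclosure agreements"), True),
    (tokenize("plaintiff disclosed the formula to customers"), False),
    (tokenize("employees discussed the formula openly"), False),
]
vocabulary = set().union(*(words for words, _ in training))
tree = id3(training, vocabulary)
```

Calling classify(tree, tokenize("Contractors signed nondisclosure agreements")) then follows the induced tests to a leaf. In a multi-factor setting, one such tree would be induced per factor.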
Publications and Products
Papers in Conference and Workshop Proceedings.
Ashley, K.D., and St. Bruninghaus (1998) Developing Mapping and Evaluation Techniques for Textual CBR. In:
Proceedings of the AAAI-98 Workshop on Textual Case-Based Reasoning. Pages 20-23. AAAI Technical Report
WS-98-12. AAAI Press, Menlo Park, CA.
Bruninghaus, S. and Ashley, K.D. (2001). "Improving the Representation of Legal Case Texts with Information
Extraction Methods." To appear in Proceedings, Eighth International Conference on Artificial Intelligence and Law,
Association for Computing Machinery, New York. St. Louis. May.
Bruninghaus, S. and Ashley, K.D. (1999a). "Toward Adding Knowledge to Learning Algorithms for Indexing Legal
Cases." In Proceedings, Seventh International Conference on Artificial Intelligence and Law, Association for
Computing Machinery, New York. Oslo. June. Donald H. Berman Award for Best Student Paper.
http://www.pitt.edu/~steffi/papers/icail99.ps.
Bruninghaus, S. and Ashley, K.D. (1999b). "Bootstrapping Case Base Development with Annotated Case
Summaries." In Proceedings of the Third International Conference on Case-Based Reasoning. Munich, Germany.
July. Outstanding Research Paper Award.
http://www.pitt.edu/~steffi/papers/iccbr99.ps.
Bruninghaus, St., and K.D. Ashley (1998a) Evaluation of Textual CBR Approaches. In: Proceedings of the AAAI-98
Workshop on Textual Case-Based Reasoning. Pages 30-34. AAAI Technical Report WS-98-12. AAAI Press, Menlo
Park, CA.
Bruninghaus, St., and K.D. Ashley (1998b) How Machine Learning Can be Beneficial for Textual Case-Based
Reasoning. In: Proceedings of the AAAI-98/ICML-98 Workshop on Learning for Text Categorization. Pages 71-74.
AAAI Technical Report WS-98-05. AAAI Press, Menlo Park, CA.
Invited Talks.
Ashley, K.D. (2000) Applying Textual Case-Based Reasoning and Information Extraction in Lessons Learned
Systems. In: Papers from the AAAI Workshop on Intelligent Lessons Learned Systems. Technical Report WS-00-03.
Pages 1-4. AAAI Press, Menlo Park, CA.
Ashley, K.D. (1999) Progress in Text-Based Case-Based Reasoning. Invited Talk for the Third International
Conference on Case-Based Reasoning. Seeon, Germany.
http://www.lrdc.pitt.edu/Ashley/TalkOverheads_files/v3_document.htm
Bruninghaus, St. (1998) Case-Based Reasoning From Textual Documents. Invited Talk at the Sixth German
Workshop on Case-Based Reasoning. Extended Abstract published in: Proceedings of the Sixth German Workshop
on Case-Based Reasoning (GWCBR-98). Pages 55-58. Berlin, Germany. http://www.pitt.edu/~steffi/papers/slidesgwcbr98.ps
Project Impact
The project has enabled a graduate student in the University of Pittsburgh Graduate Program in Intelligent Systems to pursue
her ideal research topic. Stefanie Brüninghaus is carrying out her Ph.D. dissertation research with this funding and plans to
enter academia in AI/Computer Science, a field that needs more female faculty. This funding has already bolstered Ms.
Brüninghaus’ professional experience and exposure with an invited talk and two "Best Paper" awards. The funding has also
enabled me to hire a number of law student assistants, who have gained exposure to, and interest in, computational approaches
to dealing with legal texts. Finally, the work will enable us to improve the intelligent tutoring system CATO, which teaches law
students basic skills of legal argument, and to expand its database to include other legal domains.
Goals, Objectives, and Targeted Activities
We are exploring how to use information extraction methods, especially AutoSlog, and certain linguistic information (e.g.,
about negation) to improve the representation of the legal case texts. Specifically, we are testing the hypotheses that (1)
abstracting from the individual actors and events in cases, (2) capturing actions in multi-word features, and (3) recognizing
negation can improve the text representation and classification performance.
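To make the three hypotheses concrete, here is a minimal sketch of how each could change the features extracted from a sentence. It is a simplified stand-in and does not use AutoSlog. The party names (Acme, Widgetco), the negation cue list, the three-token negation window, and all function names are assumptions made for the example. Only actors are abstracted here; abstracting from events is not attempted.

```python
import re

# Assumed, deliberately small list of negation cues.
NEGATION_CUES = {"not", "no", "never", "without"}

# Hypothetical mapping from the named parties of one case to their roles.
ROLES = {"Acme": "plaintiff", "Widgetco": "defendant"}


def abstract_actors(text, roles):
    """(1) Replace each named party with its role so features generalize across cases."""
    for name, role in roles.items():
        text = re.sub(r"\b" + re.escape(name) + r"\b", role, text)
    return text


def mark_negation(tokens, window=3):
    """(3) Drop each negation cue and prefix the next `window` tokens with 'NOT_'."""
    marked, remaining = [], 0
    for token in tokens:
        if token in NEGATION_CUES:
            remaining = window
            continue
        if remaining > 0:
            marked.append("NOT_" + token)
            remaining -= 1
        else:
            marked.append(token)
    return marked


def features(text, roles):
    """Return single-token features plus (2) two-token features for one sentence."""
    text = abstract_actors(text, roles)
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = mark_negation(tokens)
    bigrams = [first + "_" + second for first, second in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


example = features("Acme did not disclose the formula to Widgetco.", ROLES)
```

With this representation, a sentence stating that the plaintiff did not disclose something no longer shares the plain feature for "disclose" with a sentence stating that it did, which is the kind of distinction a bag-of-words representation misses.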
Project References (See Publications and Products above)
Area Background
Previously, we developed an expert model of case-based reasoning, which is the basis for an intelligent tutoring system that
teaches law students to argue with previous cases available as texts. The texts are legal opinions in which judges record
their decisions and rationales for litigated disputes. We have compiled a large corpus of full-text descriptions of cases and a
parallel abstract representation of some important aspects of those cases, which captures their content and meaning. Our model
of expert legal reasoning relates a set of factors, stereotypical fact patterns that tend to strengthen or weaken a legal
claim, to the more abstract legal issues to which the factors are relevant. The evidence that a factor applies to a given case
consists of passages in the text of the opinion. We constructed these resources in building the CATO program, an
NSF PYI-supported intelligent tutoring environment designed to teach law students to make arguments with cases. CATO's
Factor Hierarchy relates factors to more aggregated concepts and ultimately to the legal issues raised by the legal claim.
Together, the factors and the Factor Hierarchy enable CATO to generate examples of legal arguments and to provide some
feedback on a student's work. We think that, using this representation as guidance, a machine learning program trained on the
corpus could learn to classify which factors and issues apply in new cases presented as raw texts.
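One simple way to picture a hierarchy of this kind is as a directed graph in which each node points to the more abstract nodes it bears on. The sketch below is an assumption-laden illustration, not CATO's actual data structure: the node names (factor_a, concept_x, issue_1, and so on) are placeholders, and the function name issues_for is hypothetical.

```python
# Placeholder hierarchy: each node maps to the more abstract nodes it bears on.
# Nodes with no entry are treated as top-level legal issues.
PARENTS = {
    "factor_a": ["concept_x"],
    "factor_b": ["concept_x", "concept_y"],
    "concept_x": ["issue_1"],
    "concept_y": ["issue_2"],
}


def issues_for(factor, parents=PARENTS):
    """Collect the top-level nodes reachable by following links upward from `factor`."""
    found, seen, stack = set(), set(), [factor]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        ups = parents.get(node, [])
        if not ups:
            found.add(node)  # nothing more abstract above this node
        stack.extend(ups)
    return found


issues_b = issues_for("factor_b")
```

Given factor labels assigned to a new text by a classifier, a lookup like this would indicate which issues those factors are relevant to.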
Area References
Rissland, E. L. and Daniels, J. (1995) "A Hybrid CBR-IR Approach to Legal Information Retrieval." In Proceedings of the
Fifth International Conference on AI and Law, (ICAIL-95), pp. 52-61. ACM-Press: New York, NY.
Smith, J.C., Gelbart, D., MacKimmon, K., Atherton, B., McClean, J., Shinehoft, M. and Quintana, L. (1995). "Artificial
Intelligence and Legal Discourse: The Flexlaw Legal Text Management System". In Artificial Intelligence and Law, Volume
3, Number 1, pp. 55-95. Kluwer Academic Publishers: Dordrecht, The Netherlands.
Potential Related Projects
To be determined.