Application of the AR2NL system for
reporting association rules in Finnish
21.10.2004
Emilia Ylirinne
Tampere University of Technology
Introduction




Based on the LISp-Miner system
Data from the STULONG medical project
AR2NL translates association rules
into Czech and English
Extending the translation to Finnish
Topics




Reporting Data Mining results in
Natural Language
System AR2NL
Translating into Finnish
Concluding Remarks
(based on Dr. Petr Strossa's articles)
Reporting Data Mining results in
Natural Language

Association rule φ ≈ ψ
Founded implication φ ∧ ψ ⇒_{p,n} χ

Four-fold contingency table


Example of association rule
ED(univ) ∧ RS(mng) ⇒_{0.95,76} AJ(sits)

Four-fold table
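A founded implication can be checked directly from the four-fold table. The sketch below assumes the definition commonly used in the GUHA literature: a / (a + b) ≥ p and a ≥ n, where a counts the patients satisfying both the antecedent and the succedent and b counts those satisfying only the antecedent. The function name is illustrative, and b = 4 is an assumption chosen so that 76 patients make up 95 % of the antecedent group.

```python
def founded_implication(a: int, b: int, p: float, n: int) -> bool:
    """Return True if the confidence a / (a + b) is at least p and a is at least n.

    a: objects satisfying both antecedent and succedent
    b: objects satisfying the antecedent but not the succedent
    """
    if a + b == 0:
        return False  # no object satisfies the antecedent
    return a / (a + b) >= p and a >= n


# Example rule from the slides: p = 0.95, n = 76.
# b = 4 is assumed here (76 of 80 patients is 95 %).
print(founded_implication(a=76, b=4, p=0.95, n=76))
```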
Natural Language (NL) Formulations

several ways to formulate
1. 76 (i.e. 95 %) of the observed patients confirm
this dependence: if the patient has university
education and responsibility of a manager, then
he mainly sits in his job.
1. 76 (eli 95 %) havainnoiduista potilaista
toteuttaa seuraavan riippuvuuden: jos potilaalla
on korkeakoulutus ja työ johtotehtävissä,
hän istuu enimmäkseen työssään.
Natural Language (NL) Formulations
2. 95 % of the observed patients that have
reached university education and work in a
managerial position also mainly sit in their job.
2. 95 % havainnoiduista potilaista, jotka ovat
saaneet korkeakoulutuksen ja työskentelevät
johtotehtävissä, myös enimmäkseen istuvat
työssään.
Natural Language (NL) Formulations
3. It is characteristic for the patients that have
reached university education and work in a
managerial position that they also have a sedentary
job. This fact is confirmed by 76 (i.e. 95 %)
observed patients.
3. Potilaille, jotka ovat saaneet korkeakoulutuksen
ja työskentelevät johtotehtävissä, on ominaista, että
heillä on myös istumatyö. Tämän toteuttaa 76
(eli 95 %) havainnoitua potilasta.
Natural Language (NL) Formulations

X ∧ Y ⇒_{0.95,76} Z
1. a (i.e. 100p %) of the observed patients confirm
this dependence: if the patient has NLF(X) and
NLF(Y), then he NLF(Z).
2. 100p % of the observed patients that NLF(X)
and NLF(Y) also NLF(Z).
3. It is characteristic for the patients that NLF(X)
and NLF(Y) that they also have NLF(Z).
This fact is confirmed by a (i.e. 100p %)
observed patients.

Noun phrase, NP
university education
a managerial position
a sedentary job

korkeakoulutus
työ johtotehtävissä
istumatyö
Verb phrase, VP
works in a managerial position
työskentelee johtotehtävissä
has reached university education
on saavuttanut korkeakoulutuksen

Adjectival phrase AP
university-educated

korkeakoulutettu
Participial phrase
working as a manager
johtotehtävissä työskentelevä
mainly sitting in his job
työssään istuva
Natural Language (NL) Formulations
1. a (i.e. 100p %) of the observed patients confirm
this dependence: if the patient has NP(X) and
NP(Y), then he VP(Z).
1. a (eli 100p %) havainnoiduista potilaista
toteuttaa seuraavan riippuvuuden: jos potilaalla
on NP(X) ja NP(Y), hän VP(Z).
Natural Language (NL) Formulations
2. 100p % of the observed patients that VP(X)
and VP(Y) also VP(Z).
2. 100p % havainnoiduista potilaista, jotka VP(X)
ja VP(Y), myös VP(Z).
Natural Language (NL) Formulations
3. It is characteristic for the patients that VP(X)
and VP(Y) that they also have NP(Z).
This fact is confirmed by a (i.e. 100p %)
observed patients.
3. Potilaille, jotka VP(X) ja VP(Y), on ominaista,
että heillä on myös NP(Z). Tämän toteuttaa a
(eli 100p %) havainnoitua potilasta.
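Filling a formulation pattern is essentially slot substitution. The following is a minimal sketch of pattern 1 in both languages, using phrases taken from the slides. The dictionary layout, the slot names (np_x, np_y, vp_z) and the function render are hypothetical and do not reflect the actual AR2NL file formats.

```python
# Formulation pattern 1 from the slides, in English and Finnish.
PATTERN_1 = {
    "en": ("{a} (i.e. {pct} %) of the observed patients confirm this dependence: "
           "if the patient has {np_x} and {np_y}, then he {vp_z}."),
    "fi": ("{a} (eli {pct} %) havainnoiduista potilaista toteuttaa seuraavan "
           "riippuvuuden: jos potilaalla on {np_x} ja {np_y}, hän {vp_z}."),
}

# Phrases taken from the slides; the dictionary layout is hypothetical.
PHRASES = {
    "en": {"np_x": "university education",
           "np_y": "a managerial position",
           "vp_z": "mainly sits in his job"},
    "fi": {"np_x": "korkeakoulutus",
           "np_y": "työ johtotehtävissä",
           "vp_z": "istuu enimmäkseen työssään"},
}


def render(lang: str, a: int, p: float) -> str:
    """Fill pattern 1 for one language with the frequency a and confidence p."""
    return PATTERN_1[lang].format(a=a, pct=round(100 * p), **PHRASES[lang])


print(render("en", a=76, p=0.95))
print(render("fi", a=76, p=0.95))
```

Because the Finnish pattern needs the nominative noun phrase after "potilaalla on" but other patterns need other cases, a real system also has to pick the right inflected form for each slot, which is what the morphology files are for.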
Finnish language


Belongs to the Uralic family of
languages
More than a dozen cases
(http://www.cs.tut.fi/~jkorpela/finnish-cases.html)

Synthetic language
uses suffixes to express grammatical relations and
also to derive new words
in my house, too -> talossanikin
after you had written -> kirjoitettuasi
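The word talossanikin can be read as a chain of suffixes: talo (house) + ssa (in) + ni (my) + kin (too). The sketch below builds such a form by concatenation. It is a deliberate simplification: the vowel harmony rule ignores compounds and loanwords, consonant gradation is not handled, and the function names are illustrative only.

```python
BACK_VOWELS = set("aou")


def inessive_suffix(stem: str) -> str:
    """Pick -ssa or -ssä by vowel harmony (simplified: ignores compounds and loanwords)."""
    return "ssa" if any(ch in BACK_VOWELS for ch in stem) else "ssä"


def in_my_x_too(stem: str) -> str:
    """stem + inessive ('in') + 1st person singular possessive ('my') + clitic -kin ('too')."""
    return stem + inessive_suffix(stem) + "ni" + "kin"


print(in_my_x_too("talo"))  # talo = 'house'
```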
 ”Free” word order
Pete loves Anna - Anna loves Pete
Pete rakastaa Annaa. This is the normal word order, the same as in
English.
Annaa Pete rakastaa. This emphasizes the word Annaa: the object of
Pete's love is Anna, not someone else.
Rakastaa Pete Annaa. This emphasizes the word rakastaa, and such a
sentence might be used as a response to some doubt about Pete's love; so
one might say it corresponds to Pete does love Anna.
Pete Annaa rakastaa. This word order might be used, in conjunction
with special stress on Pete in pronunciation, to emphasize that it is
Pete and not someone else who loves Anna.
Annaa rakastaa Pete. This might be used in a context where we
mention some people and tell about each of them who loves them. So
this roughly corresponds to the English sentence Anna is loved by
Pete.
Rakastaa Annaa Pete. This does not sound like a normal sentence,
but it is quite understandable.
source: http://www.cs.tut.fi/~jkorpela/finnish-intro.html
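The six sentences above are exactly the six permutations of the three words. Because the object is marked by its case ending (Annaa), the roles stay recoverable in every order. A short sketch that enumerates them (the variable names are illustrative):

```python
from itertools import permutations

# Pete (nominative, subject), rakastaa (verb), Annaa (partitive, object)
WORDS = ("Pete", "rakastaa", "Annaa")


def as_sentence(words) -> str:
    """Join the words, capitalize the first letter and add a full stop."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


orders = [as_sentence(order) for order in permutations(WORDS)]
for sentence in orders:
    print(sentence)
```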
Finnish language




no definite or indefinite article
no grammatical gender
negation, corresponding to English
not, behaves as a verb
ownership or possession (have
and be in English)
I have a dog -> Minulla on koira
("at me (there) is (a) dog")
System AR2NL




Main features
Written using the XML standard
Files which contain the data needed
for translations
Translates association rules with
founded implication
FP-file




Formulation Patterns
Base of (NL) sentences
File which contains the data needed for
translations
Translates association rules with
founded implication
FPA-file



Formulation Patterns - Auxiliary
Substitutions for higher-order nonterminal symbols
Variability of sentences
Entitynames-file


Entities
e.g. ”Patient”, ”which”
MN-file




Morphology - Nouns
Language dependent
Singular and plural case endings
7 cases in Czech,
14 cases in Finnish
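To illustrate how noun morphology might be looked up from an XML file, here is a minimal sketch. The element and attribute names are invented for this example and are not the actual MN-file schema. For simplicity it stores full inflected forms rather than case endings, and it lists only a few of the 14 cases.

```python
import xml.etree.ElementTree as ET

# Hypothetical noun-morphology fragment; element and attribute names are invented.
MN_XML = """
<nouns lang="fi">
  <noun lemma="potilas">
    <form case="nominative" number="sg">potilas</form>
    <form case="nominative" number="pl">potilaat</form>
    <form case="partitive" number="sg">potilasta</form>
    <form case="adessive" number="sg">potilaalla</form>
    <form case="elative" number="pl">potilaista</form>
  </noun>
</nouns>
"""


def load_forms(xml_text: str) -> dict:
    """Map (lemma, case, number) to the inflected form."""
    forms = {}
    for noun in ET.fromstring(xml_text.strip()).iter("noun"):
        for form in noun.iter("form"):
            key = (noun.get("lemma"), form.get("case"), form.get("number"))
            forms[key] = form.text
    return forms


FORMS = load_forms(MN_XML)
print(FORMS[("potilas", "adessive", "sg")])
```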
MV-file



Morphology - Verbs
Singular and plural case endings
Participial form and case
Elementary-file



Important part
Contains data of the literals
Noun phrase, Adjectival phrase,
Verb phrase
Conversion process

example of the process
Problems

word order in participial form
drink beer - drinking beer
juoda olutta - olutta juova

cases in participial form
many cases

ja (and) in logic and in Finnish
Patients drinking beer and smoking mainly sit in
their job.
Olutta juovat ja tupakoivat potilaat istuvat
enimmäkseen työssään.

ownership
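The first two problems, word order and case in participial phrases, can be sketched together: in Finnish the object precedes the participle, and the participle must agree with its head noun in case and number. The tables below list only the forms needed for the example. Their layout and the function name are hypothetical.

```python
# Participle forms of juoda 'to drink' and noun forms of potilas 'patient',
# keyed by (case, number). Only the forms needed for the example are listed.
PARTICIPLE = {("nominative", "sg"): "juova",
              ("nominative", "pl"): "juovat",
              ("adessive", "sg"): "juovalla"}
HEAD_NOUN = {("nominative", "sg"): "potilas",
             ("nominative", "pl"): "potilaat",
             ("adessive", "sg"): "potilaalla"}


def participial_np(obj: str, case: str, number: str) -> str:
    """Object first, then the participle agreeing with the head noun in case and number."""
    key = (case, number)
    return f"{obj} {PARTICIPLE[key]} {HEAD_NOUN[key]}"


print(participial_np("olutta", "nominative", "pl"))
print(participial_np("olutta", "adessive", "sg"))
```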
Concluding Remark

The AR2NL system can translate
association rules into Finnish, too