SEWEBAR - a Framework for
Creating and Dissemination of
Analytical Reports from Data Mining
Jan Rauch, Milan Šimůnek
University of Economics, Prague, Czech Republic
SEWEBAR - a Framework for Creating and
Dissemination of Analytical Reports from Data Mining

Starting points

Principles (as seen now)

Simple examples

First steps
SEWEBAR
SEWEBAR – Starting points (1)

Several similar mining problems à la STULONG: ADAMEK, TINITUS, HEPATITIS, SOCIOLOGY, …:


Approx. 100 to 300 attributes

thousands of objects (usually patients)

a domain expert (not a computer scientist) available

some (this time relatively simple) background knowledge available
A reasonable form of result is a well-structured analytical report that must be

created

stored

retrieved

disseminated

used to answer more complex analytical questions
SEWEBAR – Starting points (2)

Some results from partial, related projects

Report assistant (it works)

AR2NL (successful experiment)

EverMiner (considerations)

SEWEBAR (considerations)

observational calculi

Grants: LISp, Czech Science Foundation (GAČR), Kontakt, CBI, ??

Students can contribute (4IZ460, 4IZ210, ? )

Dealing with knowledge and semantics "is in" (see e.g. "10 Challenging
problems in Data Mining Research" - http://www.cs.uvm.edu/~icdm/)
SEWEBAR – inspiration by Semantic Web
(SEmantic WEB and Analytical Reports)
SEWEBAR – Principles (1)


There is a structured set of (types of) patterns of local analytical questions

What strong relations (*, *, …) are valid in the given data?

What strong known relations are not valid in the given data?

What exceptions from … are valid in the given data?

….
There are various items of background knowledge in an easily understandable form

Beer consumption ↑↑ BMI

Mother hypertension →+ Hypertension


↑↓, →-, ….
Application of the pattern of analytical question to a given item of background
knowledge and to a given data matrix leads to a concrete analytical question.
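
The instantiation step described above can be sketched as follows. This is only an illustration: the class names `KnowledgeItem` and `QuestionPattern`, the template syntax, and the wording "increases" for the relation are assumptions made for this example, not part of SEWEBAR or LISp-Miner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeItem:
    """One item of background knowledge (hypothetical structure)."""
    left: str
    relation: str
    right: str


@dataclass(frozen=True)
class QuestionPattern:
    """A pattern of a local analytical question with named placeholders (hypothetical)."""
    template: str

    def instantiate(self, item: KnowledgeItem, data_matrix: str) -> str:
        # Fill the placeholders with the knowledge item and the data matrix name.
        return self.template.format(
            left=item.left,
            relation=item.relation,
            right=item.right,
            data=data_matrix,
        )


# Pattern taken from the slide: "What exceptions from ... are valid in given data?"
pattern = QuestionPattern(
    "What exceptions from '{left} {relation} {right}' are valid in {data}?"
)
item = KnowledgeItem("Beer consumption", "increases", "BMI")

# Pattern + knowledge item + data matrix gives one concrete analytical question.
question = pattern.instantiate(item, "STULONG")
print(question)
```

The same pattern can be reused with any other knowledge item or data matrix, which is what makes the set of concrete questions easy to generate.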
SEWEBAR – Principles (2)

For each local analytical question there is a type of local analytical report
answering the question

A concrete local analytical question can be answered by the GUHA
procedures implemented in the LISp-Miner system

The corresponding analytical report can be automatically created

There is a similar structured set of patterns of global analytical questions
(concerning several similar data matrices) that can be automatically
answered on the basis of the local analytical reports
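
A minimal sketch of how a global question might be answered from local reports. It assumes, purely for illustration, that each local report is reduced to a set of (antecedent, succedent) pairs found valid in one data matrix. The function names and the toy contents of `local_reports` are invented for this example and are not real results.

```python
from typing import Dict, Set, Tuple

# A relation from a local report, simplified to (antecedent, succedent).
Relation = Tuple[str, str]


def relations_valid_everywhere(local_reports: Dict[str, Set[Relation]]) -> Set[Relation]:
    """Global question: which relations are valid in every data matrix?"""
    if not local_reports:
        return set()
    return set.intersection(*local_reports.values())


def relations_valid_only_in(local_reports: Dict[str, Set[Relation]], name: str) -> Set[Relation]:
    """Global question: which relations are valid in one data matrix only?"""
    others: Set[Relation] = set().union(
        *(rels for key, rels in local_reports.items() if key != name)
    )
    return local_reports[name] - others


# Toy input, invented for illustration only.
local_reports = {
    "STULONG": {("Beer consumption", "BMI"), ("Mother hypertension", "Hypertension")},
    "ADAMEK": {("Mother hypertension", "Hypertension")},
}

common = relations_valid_everywhere(local_reports)
stulong_only = relations_valid_only_in(local_reports, "STULONG")
```

Both functions read only the local reports, not the original data matrices, which matches the idea that global questions are answered on top of local results.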
SEWEBAR – Principles
From local analytical question to analytical report
SEWEBAR – simple examples

Pattern of analytical question – mutual influence of attributes

Pattern of analytical question – groups of attributes

Answering "analytical question – groups of attributes" with 4ft-Miner

Analytical report
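
As a rough illustration of what answering a question with 4ft-Miner involves: the procedure evaluates a rule on a four-fold table (a, b, c, d) of the antecedent and succedent. The sketch below assumes the founded implication quantifier (a / (a + b) ≥ p and a ≥ Base), one of the 4ft-quantifiers. The function names, the thresholds, and the toy rows are assumptions for this example, and the code is not the LISp-Miner implementation.

```python
from typing import Iterable, Tuple


def four_fold_table(rows: Iterable[Tuple[bool, bool]]) -> Tuple[int, int, int, int]:
    """Count (a, b, c, d) from pairs (antecedent holds, succedent holds)."""
    a = b = c = d = 0
    for ant, suc in rows:
        if ant and suc:
            a += 1      # antecedent and succedent both hold
        elif ant:
            b += 1      # antecedent holds, succedent does not
        elif suc:
            c += 1      # succedent holds, antecedent does not
        else:
            d += 1      # neither holds
    return a, b, c, d


def founded_implication(a: int, b: int, p: float, base: int) -> bool:
    """True if at least `base` objects satisfy both sides and a / (a + b) >= p."""
    return a >= base and a + b > 0 and a / (a + b) >= p


# Toy rows, invented for illustration only.
rows = (
    [(True, True)] * 8
    + [(True, False)] * 2
    + [(False, True)] * 3
    + [(False, False)] * 7
)
a, b, c, d = four_fold_table(rows)
rule_holds = founded_implication(a, b, p=0.75, base=5)
```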
SEWEBAR – Principles for first steps

To implement soon a first version (simplified if necessary) of support for the whole
process dealing with local and global analytical reports. The whole process
covers:

Formulation of reasonable local analytical questions using background knowledge

Creation of analytical reports answering particular analytical questions


Formulating and answering reasonable global analytical questions
Use the first version to

Gradually improve and enhance particular parts

Develop corresponding theory using observational calculi
Control panel – tool for first steps
SEWEBAR – First steps (1)
Background knowledge and local analytical questions:
1. We start with the ADAMEK and STULONG data sets
2. Background knowledge: we use the current version of the Knowledge Base
3. Define the first version of the set of LAQ (Local Analytical Questions)
4. Implement LAQPA (Local Analytical Question Patterns Administrator)
5. Implement LAQA (Local Analytical Questions Administrator)
SEWEBAR – First steps (3)
Local analytical reports:
6. Enhance 4ft-Miner by filtering out uninteresting rules
7. EverMiner modules
8. Define skeletons of analytical reports
9. Generator of analytical reports
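
One possible reading of "filtering out uninteresting rules" is to drop rules that merely restate known background knowledge. The sketch below assumes that reading. The `Rule` structure, the `filter_uninteresting` function, and the toy rules are hypothetical and do not describe the actual 4ft-Miner enhancement.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """A mined rule, simplified to attribute names (hypothetical structure)."""
    antecedent: FrozenSet[str]  # attributes on the left-hand side
    succedent: str              # attribute on the right-hand side
    confidence: float


def filter_uninteresting(rules: List[Rule], known: Set[Tuple[str, str]]) -> List[Rule]:
    """Keep only rules not already explained by a known (attribute, attribute) influence."""
    return [
        rule
        for rule in rules
        if not any((attr, rule.succedent) in known for attr in rule.antecedent)
    ]


# Toy rules and background knowledge, invented for illustration only.
rules = [
    Rule(frozenset({"Beer consumption"}), "BMI", 0.9),
    Rule(frozenset({"Education"}), "BMI", 0.8),
]
known = {("Beer consumption", "BMI")}
kept = filter_uninteresting(rules, known)
```

The criterion is deliberately crude: any rule whose antecedent contains an attribute with a known influence on the succedent is dropped, regardless of the other attributes involved.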
SEWEBAR – First steps (4)
Global analytical reports - implemented using ?Topic Maps Content management system?
9. Define rules for indexing analytical reports by Topic Maps
10. Implement a tool for automated indexing of analytical reports for Topic Maps
11. Define the first version of a set of global analytical questions
12. Implement a tool for automated answering of global analytical questions
13. ??IGA grant??
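
The indexing and retrieval steps above can be illustrated with a plain inverted index from topics to report identifiers. This is a stand-in for the idea only and does not use Topic Maps or any content management system. The function names, report identifiers, and topics are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_topic_index(reports: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each topic to the set of report identifiers indexed under it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for report_id, topics in reports.items():
        for topic in topics:
            index[topic].add(report_id)
    return dict(index)


def reports_about(index: Dict[str, Set[str]], *topics: str) -> Set[str]:
    """Return the reports indexed under all of the given topics."""
    matches = [index.get(topic, set()) for topic in topics]
    return set.intersection(*matches) if matches else set()


# Toy reports, invented for illustration only.
reports = {
    "stulong-01": ["BMI", "Beer consumption"],
    "adamek-01": ["BMI", "Hypertension"],
}
index = build_topic_index(reports)
bmi_reports = reports_about(index, "BMI")
```

A global question such as "which reports deal with both BMI and hypertension?" then becomes a lookup with two topics.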
Thank you for your attention