Ferda
Visual Environment for Data Mining
Martin Ralbovský
Ferda History 1
• LISp-Miner system: implementation of several GUHA procedures and more
• 2003: Idea of creating a new Clementine-like visual interface for LISp-Miner
• 2003: Ferda project started based on this idea

subject Softwarový projekt at MFF UK
Ferda History 2
• 2004-2006: Development of the Ferda project
• February 2006: Ferda presented at the Znalosti 2006 conference
• April 2006: Ferda became an approved software project at MFF UK
• Now: Further development of the Ferda system, master theses of Ferda creators

Ferda Advantages
• Modular and extensible architecture, use of middleware, support for distributed computing
• Ferda’s box model: ability to implement and include new boxes, possible engine for EverMiner
• Comprehensive user interface including new features such as the box archive
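
The idea of adding new boxes to a box model can be pictured with a minimal Python sketch. All names here (`Box`, `REGISTRY`, `register_box`, `DataTableBox`, `RowCountBox`) are hypothetical and are not Ferda's actual API; the sketch only assumes that a box produces a result and can be fed by other boxes through named sockets.

```python
from typing import Callable, Dict, Type

# Hypothetical registry of available box types, keyed by a string identifier.
REGISTRY: Dict[str, Type["Box"]] = {}


def register_box(name: str) -> Callable[[Type["Box"]], Type["Box"]]:
    """Class decorator that makes a new box type available by name."""
    def decorator(cls: Type["Box"]) -> Type["Box"]:
        REGISTRY[name] = cls
        return cls
    return decorator


class Box:
    """A unit of work whose inputs (sockets) are filled by other boxes."""

    def __init__(self) -> None:
        self.sockets: Dict[str, "Box"] = {}

    def connect(self, socket: str, source: "Box") -> None:
        # Plug the output of another box into one of this box's sockets.
        self.sockets[socket] = source

    def run(self):
        raise NotImplementedError


@register_box("data_table")
class DataTableBox(Box):
    """Source box that simply hands out the rows it was created with."""

    def __init__(self, rows) -> None:
        super().__init__()
        self.rows = rows

    def run(self):
        return self.rows


@register_box("row_count")
class RowCountBox(Box):
    """Box that counts the rows delivered through its 'table' socket."""

    def run(self):
        return len(self.sockets["table"].run())


# Build a two-box chain using only the registry, as a visual editor might.
table = REGISTRY["data_table"]([{"age": 30}, {"age": 41}])
counter = REGISTRY["row_count"]()
counter.connect("table", table)
row_count = counter.run()
```

In this sketch a new box type becomes available by defining a class and registering it; no existing box has to change.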

Ferda Disadvantages
• Not so well tested (has not been used for education)
• Dependent on LISp-Miner modules and metabase
• Slower than LISp-Miner

Future goals for Ferda
• “Spreading Ferda”
• Getting more people to work on Ferda: creation of new boxes and modules
• Cooperation with other systems
• Road to EverMiner

Master theses improvements for Ferda
• Reimplementing LISp-Miner procedures
• Relational versions of some procedures (SD4ft)
• Domain knowledge support

Reimplementing LISp-Miner procedures 1
• No longer working with the metabase: faster implementation
• Modular implementation of the data mining task: enables the full potential of Ferda’s box model
• Open implementation of the 4ft, SD4ft, KL, SDKL, CF and SDCF procedures
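
As an illustration of what a 4ft-style procedure evaluates, the following Python sketch builds the four-fold contingency table (a, b, c, d) for one antecedent/succedent pair and tests the founded implication quantifier, which holds when a/(a+b) ≥ p and a ≥ Base. This is not Ferda's or LISp-Miner's implementation; the function names and the toy rows are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

Row = Mapping[str, object]
Predicate = Callable[[Row], bool]


@dataclass(frozen=True)
class FourFoldTable:
    a: int  # antecedent and succedent both hold
    b: int  # antecedent holds, succedent does not
    c: int  # succedent holds, antecedent does not
    d: int  # neither holds


def four_fold_table(rows: Iterable[Row],
                    antecedent: Predicate,
                    succedent: Predicate) -> FourFoldTable:
    """Count the rows falling into each cell of the four-fold table."""
    a = b = c = d = 0
    for row in rows:
        ant, suc = antecedent(row), succedent(row)
        if ant and suc:
            a += 1
        elif ant:
            b += 1
        elif suc:
            c += 1
        else:
            d += 1
    return FourFoldTable(a, b, c, d)


def founded_implication(t: FourFoldTable, p: float, base: int) -> bool:
    """True when a >= base and a / (a + b) >= p."""
    return t.a >= base and (t.a + t.b) > 0 and t.a / (t.a + t.b) >= p


# Toy data invented for this example only.
rows = (
    [{"smoker": True, "hypertension": True}] * 3
    + [{"smoker": True, "hypertension": False}]
    + [{"smoker": False, "hypertension": True}]
    + [{"smoker": False, "hypertension": False}] * 2
)

table = four_fold_table(
    rows,
    antecedent=lambda r: bool(r["smoker"]),
    succedent=lambda r: bool(r["hypertension"]),
)
holds = founded_implication(table, p=0.7, base=3)
```

On the toy rows the table is (3, 1, 1, 2), so the rule has confidence 0.75 and passes with p = 0.7 but not with p = 0.8.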

Reimplementing LISp-Miner procedures 2: further plans
• Enabling fuzzy computing
• Data stream support: connecting Ferda to Sumatra TT
• Distributed computing
• KL Collaps, 4ftUV Filter implementation
• “Little” improvements to task setup (literal, cedent…)

Ontologies in Ferda
Ontologies aid the user in various phases of the CRISP-DM cycle. (Semi)automated tools are planned to help with:
• Identification of redundant attributes
• Creation of attributes
• Creation of partial cedents
• …

Field knowledge in Ferda
• Field knowledge: a vague term for rules that are common knowledge and widely accepted in a domain
• Formalization of field knowledge using abstract attributes and quantifiers
• Creation of boxes in Ferda that enable the user to express field knowledge; verifying field knowledge against procedures’ output
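
One simple way to picture checking mined output against field knowledge is sketched below in Python. It is a deliberate simplification and an assumption, not the formalization planned for Ferda: field knowledge is reduced to "attribute X influences attribute Y" pairs, and a mined hypothesis counts as already known when such a pair links its antecedent to its succedent. The class names, the function `is_covered`, and the attribute names are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class FieldKnowledge:
    cause: str   # attribute assumed to influence the effect
    effect: str  # attribute assumed to be influenced


@dataclass(frozen=True)
class Hypothesis:
    antecedent: FrozenSet[str]  # attributes on the left side of a mined rule
    succedent: FrozenSet[str]   # attributes on the right side


def is_covered(h: Hypothesis, knowledge: Iterable[FieldKnowledge]) -> bool:
    """True if some known influence links the antecedent to the succedent."""
    return any(k.cause in h.antecedent and k.effect in h.succedent
               for k in knowledge)


# Invented example: one piece of field knowledge, two mined hypotheses.
knowledge = [FieldKnowledge("BMI", "BloodPressure")]
hypotheses: List[Hypothesis] = [
    Hypothesis(frozenset({"BMI", "Age"}), frozenset({"BloodPressure"})),
    Hypothesis(frozenset({"Education"}), frozenset({"BloodPressure"})),
]

# Hypotheses not explained by field knowledge are the candidates worth a look.
novel = [h for h in hypotheses if not is_covered(h, knowledge)]
```

Here only the hypothesis with the antecedent `Education` remains in `novel`, since the other one restates the known BMI influence.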
