Debellor
Data Mining Platform with Stream Architecture
Marcin Wojnarski
Warsaw University, Poland
Outline
Debellor – data mining platform
Motivation
Main features
Architecture:
   Cell
   data streaming
   multi-threading
Available in ver. 0.6
Future releases
Summary
2
Debellor
Language: Java
Licence: open source (GPL)
Download: www.debellor.org
Debello – to conquer (Latin). Debellor – conqueror of data
3
Debellor – data mining platform
Weka
Debellor
TA-Lib
4
Motivation
Demand for more complex algorithms.
Necessity to combine elementary algorithms.
5
Motivation
1. Data Processing Network (DPN)
Visualize
Load
Preprocess
Preprocess
Predict
Save
Load
6
Motivation
2. Committee of algorithms
Classifier A
Classifier B
Voting
Classifier C
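The committee on this slide can be illustrated with plain majority voting. The sketch below is standalone Java, not Debellor code: the `Classifier` interface, the `VotingCommittee` class and the string labels are hypothetical names chosen for the example, and the tie-breaking rule is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a voting committee; none of these names come from Debellor. */
final class VotingCommittee {

    /** Assumed minimal classifier contract: map a feature vector to a label. */
    interface Classifier {
        String classify(double[] features);
    }

    private final List<Classifier> members;

    VotingCommittee(List<Classifier> members) {
        this.members = List.copyOf(members);
    }

    /**
     * Returns the label chosen by the most members.
     * Ties go to the label that reached the winning count first;
     * an empty committee yields null.
     */
    String classify(double[] features) {
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (Classifier member : members) {
            String label = member.classify(features);
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }
}
```

Each member could itself be a trained classifier of a different kind, which is the situation the slide depicts with Classifiers A, B and C.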
7
Motivation
3. Nested algorithms
RBF neural network
K-means
8
Requirements
Versatile
Efficient
Simple
9
Features of Debellor
All types of data processing algorithms
Extendible data types
Stream architecture → large data sets
Multi-threading
Immutability of data objects → safety
10
Debellor
11
Algorithm = Cell
cell
Cell cell = new RseslibClassifier("C45");
cell.set("pruning", "true");
12
Cell – data source
cell
cell.open();
Sample s1 = cell.next(),
s2 = cell.next(),
...
cell.close();
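The open/next/close sequence above generalizes to a read loop. The following is a minimal sketch with a hypothetical `SampleSource` interface that only mirrors the three calls on the slide; it is not the Debellor `Cell` class. In particular, the slide does not say how the end of the stream is signalled, so returning `null` from `next()` is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a data-source cell; not the Debellor Cell class. */
final class SourceSketch {

    /** Mirrors the open/next/close calls shown on the slide. */
    interface SampleSource {
        void open();
        String next();   // ASSUMPTION: null signals end of stream
        void close();
    }

    /** A source backed by an in-memory list, for illustration only. */
    static final class ListSource implements SampleSource {
        private final List<String> data;
        private Iterator<String> it;

        ListSource(List<String> data) { this.data = data; }

        @Override public void open()   { it = data.iterator(); }
        @Override public String next() { return it.hasNext() ? it.next() : null; }
        @Override public void close()  { it = null; }
    }

    /** Drains a source, closing it even if reading fails. */
    static List<String> readAll(SampleSource source) {
        List<String> out = new ArrayList<>();
        source.open();
        try {
            for (String s = source.next(); s != null; s = source.next()) {
                out.add(s);
            }
        } finally {
            source.close();
        }
        return out;
    }
}
```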
13
Cell – data receiver
anotherCell
cell
cell.setSource(anotherCell);
14
Trainable Cell
EMPTY
cell
TRAINED
cell
cell.setSource(…);
cell.learn();
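The EMPTY to TRAINED transition can be illustrated with a tiny two-state model. This is a hedged sketch: `MeanModel`, `State` and `predict()` are hypothetical names, and a plain `Iterable<Double>` stands in for the source cell. Only `setSource` and `learn` echo the calls shown on the slide.

```java
/** Hypothetical two-state trainable component; names are illustrative, not Debellor's. */
final class TrainableSketch {

    enum State { EMPTY, TRAINED }

    /** Learns the mean of the values pulled from its source. */
    static final class MeanModel {
        private State state = State.EMPTY;
        private Iterable<Double> source;
        private double mean;

        void setSource(Iterable<Double> source) { this.source = source; }

        /** Consumes the source once and moves the model from EMPTY to TRAINED. */
        void learn() {
            if (source == null) throw new IllegalStateException("no source set");
            double sum = 0;
            long n = 0;
            for (double v : source) { sum += v; n++; }
            if (n == 0) throw new IllegalStateException("source is empty");
            mean = sum / n;
            state = State.TRAINED;
        }

        State state() { return state; }

        /** Only a trained model may be queried. */
        double predict() {
            if (state != State.TRAINED) throw new IllegalStateException("call learn() first");
            return mean;
        }
    }
}
```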
15
Data Streaming
A
B
BATCH
A
B
STREAM
The cell itself is responsible for asking its source for data
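The difference between BATCH and STREAM can be made concrete with a pull-based pipeline in which the downstream stage requests one sample at a time. In this sketch `java.util.Iterator` stands in for a cell; `Counter` and `Mapper` are hypothetical names and do not reflect Debellor's API.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical pull-based pipeline; java.util.Iterator stands in for a cell. */
final class PullSketch {

    /** Upstream stage A: produces 0, 1, 2, ... and counts what it has produced. */
    static final class Counter implements Iterator<Double> {
        int produced = 0;
        private final int limit;

        Counter(int limit) { this.limit = limit; }

        @Override public boolean hasNext() { return produced < limit; }

        @Override public Double next() {
            if (!hasNext()) throw new NoSuchElementException();
            return (double) produced++;
        }
    }

    /** Downstream stage B: pulls one sample from its source each time it is asked for one. */
    static final class Mapper implements Iterator<Double> {
        private final Iterator<Double> source;
        private final DoubleUnaryOperator fn;

        Mapper(Iterator<Double> source, DoubleUnaryOperator fn) {
            this.source = source;
            this.fn = fn;
        }

        @Override public boolean hasNext() { return source.hasNext(); }

        @Override public Double next() { return fn.applyAsDouble(source.next()); }
    }
}
```

If B wraps a `Counter` with a limit of one million and is asked for two samples, A has produced only two. The full data set is never materialized as a batch, which is what makes large data sets tractable.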
16
Benefits of streaming
training of k-means
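One way streaming helps k-means training is that centroids can be updated sample by sample, so the data set never has to fit in memory. The sketch below uses sequential (online) k-means, a well-known variant picked here for illustration; it is not claimed to be the algorithm Debellor ships. The class name and the caller-supplied initial centroids are assumptions.

```java
import java.util.Iterator;

/** Sequential (online) k-means sketch; illustrative only, not Debellor's implementation. */
final class OnlineKMeans {
    private final double[][] centroids;
    private final long[] counts;

    /** ASSUMPTION: the caller supplies the initial centroids (e.g. the first k samples). */
    OnlineKMeans(double[][] initialCentroids) {
        centroids = new double[initialCentroids.length][];
        for (int i = 0; i < centroids.length; i++) {
            centroids[i] = initialCentroids[i].clone();
        }
        counts = new long[centroids.length];
    }

    /** Index of the centroid closest to x in squared Euclidean distance. */
    int nearest(double[] x) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0;
            for (int j = 0; j < x.length; j++) {
                double diff = x[j] - centroids[c][j];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = c; }
        }
        return best;
    }

    /** Folds one sample into its nearest centroid as a running mean. */
    void update(double[] x) {
        int c = nearest(x);
        counts[c]++;
        for (int j = 0; j < x.length; j++) {
            centroids[c][j] += (x[j] - centroids[c][j]) / counts[c];
        }
    }

    /** Single pass over a stream; only one sample is held at a time. */
    void learn(Iterator<double[]> stream) {
        while (stream.hasNext()) update(stream.next());
    }

    double[] centroid(int c) { return centroids[c].clone(); }
}
```

Because the first sample assigned to a centroid replaces its initial value (the count starts at zero), each centroid ends up as the mean of the samples assigned to it during the pass.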
17
Multi-threading
Thread_1
A
B
18
Multi-threading
Thread_2
A
Thread_1
B
A.newThread();
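The slide shows stage A moved to its own thread while B stays in the original one. A common way to decouple two such stages is a bounded blocking queue, sketched below with `java.util.concurrent`. This illustrates the general technique only and is an assumption, not a description of how `newThread()` is implemented in Debellor; the class name and the end-of-stream sentinel are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Producer stage in its own thread, consumer in the calling thread; illustrative only. */
final class ThreadedStageSketch {
    // ASSUMPTION: a sentinel value marks the end of the stream.
    private static final Integer END = Integer.MIN_VALUE;

    /** Runs stage A (squares of 0..n-1) in a new thread and stage B (collect) in the caller. */
    static List<Integer> run(int n) {
        // Bounded queue: A blocks when B falls behind, so memory use stays limited.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) queue.put(i * i);
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Thread_2");
        producer.start();

        List<Integer> out = new ArrayList<>();
        try {
            for (Integer v = queue.take(); !v.equals(END); v = queue.take()) {
                out.add(v);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return out;
    }
}
```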
19
Available in version 0.6
Rseslib algorithms:
   classifiers (~20 algorithms)
Weka algorithms:
   ARFF reader
   classifiers (~60)
   filters (47)
Debellor algorithms:
   Train&Test evaluation
   k-means for large data (stream-based)
Data types:
   numeric and symbolic features
   vectors of features, vectors of vectors of …
20
Future releases
Multi-input & multi-output cells
Composite cells (e.g. meta-learning)
Serialization and copying
…
21
Summary
Platform
Stream architecture
Extendible
Multi-threaded
Weka & Rseslib partially integrated
22
Home
www.debellor.org
23