GridMiner: Design and underlying grid technology
Ivan Janciak
University of Vienna
Institute for Software Science
email: [email protected]
www.gridminer.org
Outline
• Architecture
• Grid Services
• Service description
• Implementation
• Data sources
• Motivation/Use Case
• Mediator
• Knowledge Discovery Workflow
• GridMiner Application Demo
Edinburgh, 30. Nov. 04
Motivation
• Grid-based data mining tool
• Open/expandable
• Easy to use
Requirements
• Ability to access and analyze huge amounts of information, typically heterogeneous and geographically distributed
• Intelligent behavior: the ability to maintain, discover, extend, present, and communicate knowledge
• High-performance (real-time or soft real-time) query processing
• Strong security guarantees
Simple Use Case
[Figure: a database of 100k rows; selections of 10k and 20k rows are passed to decision-rule mining services (SPRINT, C45).]
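The use case can be imitated locally as a rough sketch: one table, two row selections of different sizes, each handed to a different learner. Everything named here is an assumption for illustration: the `records` table and its columns, the scaled-down row counts, the pairing of selection sizes with algorithms, and the `train_stub` placeholder, which only returns the majority class and is not an implementation of SPRINT or C4.5.

```python
import sqlite3


def build_database(n_rows):
    """Create an in-memory table standing in for the source database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, age INTEGER, outcome TEXT)"
    )
    rows = [(i, 20 + i % 60, "good" if i % 3 else "poor") for i in range(n_rows)]
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    return conn


def select_rows(conn, limit):
    """Select the first `limit` rows as the training set for one task."""
    query = "SELECT age, outcome FROM records ORDER BY id LIMIT ?"
    return conn.execute(query, (limit,)).fetchall()


def train_stub(algorithm, rows):
    """Hypothetical placeholder for a mining service: majority class only."""
    counts = {}
    for _, label in rows:
        counts[label] = counts.get(label, 0) + 1
    majority = max(counts, key=counts.get)
    return {"algorithm": algorithm, "rows": len(rows), "default_class": majority}


conn = build_database(1000)             # stands in for the 100k-row database
tasks = [("SPRINT", 100), ("C45", 200)]  # stands in for the 10k and 20k selections
models = [train_stub(name, select_rows(conn, n)) for name, n in tasks]
```

In a grid setting each entry in `tasks` would be dispatched to a separate service rather than run in one process.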
GridMiner Architecture
[Diagram. User side (labels: Web, User environment): Graphical User Interface, Knowledge Base, Service configurators, DSCE Client. Grid side: Dynamic service composition engine (DSCE), Data Access and Integration, Data mining services.]
Implementation/Technology
• Globus 3.2
• OGSA-DAI version 3
• GUI: workflow construction and results visualization (JGraph, Java Web Start, JavaServer Pages)
• Service configurators (JavaServer Pages)
• Workflow management: DSCE Client (OGSA)
• Knowledge base: configurations (XML, OWL)
• Data mediation service (OGSA-DAIS activity)
Implemented Services
• Data mining services
  • Sequences (SPADE)
  • Clustering (SimpleKMeans)
  • Decision rules (SPRINT)
  • OLAP (sequential/parallel version)
  • Association rules on OLAP
• Dynamic Workflow Composition Service
• Mediator (Activity in DAIS ver5)
Service Description
• GWSDL
• Semantic Description (OWL-S)
• Activity ontology
• Data mining ontology
• Service Configurator
  • Dynamic service composition language
  • Variables/composition
Data sources
• WebRowSet (XML file) produced by OGSA-DAI
• Data streams
Metadata
• WebRowSet
• Predictive Model Markup Language (PMML)
• Mediating schema
• Data source ontology
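Since WebRowSet is the XML format in which data reaches the mining services, a consumer has to turn it back into rows. The following is a minimal sketch using Python's standard library. The sample document is heavily abbreviated (real WebRowSet files also carry a properties section and fuller column metadata), and the element names and namespace are given as I understand the JDBC WebRowSet schema; treat both, along with the `read_webrowset` helper, as assumptions rather than GridMiner's own code.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from the JDBC WebRowSet XML schema.
NS = {"wrs": "http://java.sun.com/xml/ns/jdbc"}

# Abbreviated, hand-written sample; not output captured from OGSA-DAI.
SAMPLE = """<webRowSet xmlns="http://java.sun.com/xml/ns/jdbc">
  <metadata>
    <column-count>2</column-count>
    <column-definition>
      <column-index>1</column-index>
      <column-name>id</column-name>
    </column-definition>
    <column-definition>
      <column-index>2</column-index>
      <column-name>label</column-name>
    </column-definition>
  </metadata>
  <data>
    <currentRow>
      <columnValue>1</columnValue>
      <columnValue>a</columnValue>
    </currentRow>
    <currentRow>
      <columnValue>2</columnValue>
      <columnValue>b</columnValue>
    </currentRow>
  </data>
</webRowSet>
"""


def read_webrowset(xml_text):
    """Return the rows of a WebRowSet document as a list of dicts."""
    root = ET.fromstring(xml_text)
    # Column names come from the metadata section, in column order.
    names = [
        col.findtext("wrs:column-name", namespaces=NS)
        for col in root.findall("wrs:metadata/wrs:column-definition", NS)
    ]
    rows = []
    for row in root.findall("wrs:data/wrs:currentRow", NS):
        values = [cell.text for cell in row.findall("wrs:columnValue", NS)]
        rows.append(dict(zip(names, values)))
    return rows


rows = read_webrowset(SAMPLE)
```

All values come back as strings; a real consumer would use the column type information in the metadata section to convert them.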
Knowledge Discovery process
1. Data preparation/integration: databases are accessed through OGSA-DAI (mediator) and delivered as WebRowSet (file or stream)
2. Setting the data mining task: data view, service configuration
3. Data mining task execution
4. Results visualization
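The four steps above can be read as a simple pipeline in which each stage consumes the previous stage's output. The sketch below is only an illustration of that chaining: the function names, the dictionary-based task description, and the frequency count used as the "mining" step are all hypothetical stand-ins, not GridMiner services.

```python
from collections import Counter


def prepare(sources):
    """Step 1: integrate rows from several sources into one view."""
    return [row for source in sources for row in source]


def configure(rows, target):
    """Step 2: describe the mining task (data view plus settings)."""
    return {"rows": rows, "target": target}


def execute(task):
    """Step 3: run the task; a frequency count stands in for a real miner."""
    return Counter(row[task["target"]] for row in task["rows"])


def visualize(result):
    """Step 4: render the result as a text bar chart."""
    return [f"{label:<6}{'#' * count}" for label, count in sorted(result.items())]


# Two hypothetical sources with the same schema.
source_a = [{"outcome": "good"}, {"outcome": "poor"}]
source_b = [{"outcome": "good"}]

chart = visualize(execute(configure(prepare([source_a, source_b]), "outcome")))
```

Keeping each step a separate function mirrors the slide's separation of concerns: a different miner can be swapped into step 3 without touching data preparation or visualization.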
Grid service execution
1. Clustering (diagram labels: PMML; webRowSet, number of clusters)
• Create service
• Execute method (process clustering)
• Query SDE
• Destroy service
2. OLAP (diagram labels: PMML, webRowSet; GSH)
• Create service
• Execute method (build cube)
OLAP Query
• Execute query
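The clustering sequence (create, execute, query SDE, destroy) is a transient-instance lifecycle, which can be sketched with a context manager so that the instance is destroyed even if execution fails. This is an in-process illustration only: `ClusteringService`, `created_service`, and `kmeans_1d` are hypothetical names, no Globus or OGSA-DAI API is used, and the one-dimensional k-means with naive seeding merely stands in for the real clustering service. SDE is read here as "service data element" in OGSI terminology.

```python
from contextlib import contextmanager


def kmeans_1d(values, k, iterations=10):
    """Tiny 1-D k-means with deterministic, naive seeding."""
    centers = sorted(set(values))[:k]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center when a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


class ClusteringService:
    """Hypothetical stand-in for a transient clustering service instance."""

    def __init__(self):
        self.service_data = {"state": "created", "centers": None}

    def execute(self, values, k):
        """Execute method: process clustering and publish the result."""
        self.service_data["centers"] = kmeans_1d(values, k)
        self.service_data["state"] = "finished"

    def query_sde(self, name):
        """Query a service data element by name."""
        return self.service_data[name]

    def destroy(self):
        """Release the instance; its service data is no longer available."""
        self.service_data = {"state": "destroyed"}


@contextmanager
def created_service(factory):
    """Create a service instance and guarantee that it is destroyed."""
    service = factory()
    try:
        yield service
    finally:
        service.destroy()


with created_service(ClusteringService) as svc:
    svc.execute([1.0, 2.0, 9.0, 10.0], k=2)
    centers = svc.query_sde("centers")
```

The result is read through `query_sde` before the `with` block ends, matching the slide's ordering of "Query SDE" before "Destroy service".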
Components interaction
[Diagram. User side (labels: Web, User environment): Graphical User Interface, Knowledge Base, Service configurators, DSCE Client. Grid side: Dynamic service composition engine (DSCE), Data Access and Integration, Data mining services.]