GridMiner: Design and underlying grid technology
Ivan Janciak
University of Vienna, Institute for Software Science
email: [email protected]
www.gridminer.org
Edinburgh, 30. Nov. 04

Outline
• Architecture
• Grid Services
• Service description
• Implementation
• Data sources
• Motivation/Use Case
• Mediator
• Knowledge Discovery Workflow
• GridMiner Application Demo

Motivation
• Grid Based data mining tool
• Open/Expandable
• Easy to use

Requirements
• Ability to access and analyze a huge amount of information, typically heterogeneous and geographically distributed
• Intelligent behavior: ability to maintain, discover, extend, present and communicate knowledge
• High performance (real-time or soft real-time) query processing
• High security guarantee

Simple Use Case
[Diagram labels: Database (100k rows); Select 10k rows; Select 20k rows; Decision Rules (SPRINT); Decision Rules (C45); Decision Rules (C45)]

GridMiner Architecture
[Diagram labels]
Web, User environment:
• Graphical User Interface
• Knowledge Base
• Service configurators
• DSCE Client
Grid:
• Dynamic service composition engine (DSCE)
• Data Access and Integration
• Data mining services

Implementation/Technology
• Globus 3.2
• OGSA/DAI ver 3
• GUI: Workflow constructions/Results visualization (JGraph, Java Web Start, Java server pages)
• Service Configurators (Java server pages)
• Workflow management: DSCE Client (OGSA)
• Knowledge base: Configurations (XML, OWL)
• Data mediation service (OGSA-DAIS activity)

Implemented Services
• Data mining services
  • Sequences (SPADE)
  • Clustering (SimpleKMeans)
  • Decision rules (SPRINT)
  • OLAP (sequential/parallel version)
  • Association rules on OLAP
• Dynamic Workflow Composition Service
• Mediator (Activity in DAIS ver5)

Service Description
• GWSDL
• Semantic Description (OWL-S)
  • Activity ontology
  • Data mining ontology
• Service Configurator
• Dynamic service composition language
  • variables/composition

Datasources
• WebRowSet (XML file) produced by OGSA-DAI
• Data Streams
• Metadata
• WebRowSet
• Predictive Model Markup Language (PMML)
• Mediating schema
• Datasource ontology

Knowledge Discovery process
1. Data preparation/integration
  • OGSA-DAI (mediator)
  • Databases
  • webRowSet (File, stream)
2. Setting data mining task
  • Data view
  • Service configuration
3. Data mining task execution
4. Results visualization
  • Visualization

Grid service execution
1. Clustering (input: PMML, webRowSet, number of clusters)
  • Create service
  • Execute method (process clustering)
  • Query SDE
  • Destroy service
2. OLAP (input: PMML, webRowSet; GSH)
  • Create service
  • Execute method (build cube)
  OLAP Query
  • Execute query

Components interaction
[Diagram labels]
Web, User environment:
• Graphical User Interface
• Knowledge Base
• Service configurators
• DSCE Client
Grid:
• Dynamic service composition engine (DSCE)
• Data Access and Integration
• Data mining services
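
The clustering steps listed under "Grid service execution" (create service, execute method, query SDE, destroy service) can be sketched as a small in-memory simulation. This is not GridMiner, Globus or OGSA-DAI code. The names `ServiceFactory`, `ClusteringService`, `run_clustering` and the handle format are hypothetical and chosen only for illustration. The k-means loop is a simplified stand-in for the SimpleKMeans service, with rows given as plain tuples rather than a WebRowSet, and a dictionary standing in for the service data elements (SDE).

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Row = Tuple[float, ...]


def _dist2(a: Row, b: Row) -> float:
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class ClusteringService:
    """Hypothetical stand-in for a transient clustering service instance."""
    handle: str
    # Plays the role of the service data elements (SDE) a client can query.
    sde: Dict[str, object] = field(default_factory=lambda: {"status": "created"})
    destroyed: bool = False

    def process_clustering(self, rows: List[Row], k: int, iterations: int = 10) -> None:
        """Run a simplified k-means and publish the result in the SDE."""
        if self.destroyed:
            raise RuntimeError("service instance already destroyed")
        if not 1 <= k <= len(rows):
            raise ValueError("k must be between 1 and the number of rows")
        centroids = [rows[i] for i in range(k)]  # deterministic initialisation
        assignment = [0] * len(rows)
        for _ in range(iterations):
            assignment = [
                min(range(k), key=lambda c: _dist2(row, centroids[c])) for row in rows
            ]
            for c in range(k):
                members = [r for r, a in zip(rows, assignment) if a == c]
                if members:
                    centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        self.sde = {"status": "finished", "centroids": centroids, "assignment": assignment}

    def query_sde(self) -> Dict[str, object]:
        """Return the current service data."""
        if self.destroyed:
            raise RuntimeError("service instance already destroyed")
        return self.sde


class ServiceFactory:
    """Hypothetical factory that creates and destroys service instances."""

    def __init__(self) -> None:
        self._instances: Dict[str, ClusteringService] = {}
        self._counter = itertools.count(1)

    def create_service(self) -> str:
        # The handle format is invented; it only mimics the idea of a GSH.
        handle = f"gsh://example.invalid/clustering/{next(self._counter)}"
        self._instances[handle] = ClusteringService(handle)
        return handle

    def resolve(self, handle: str) -> ClusteringService:
        return self._instances[handle]

    def destroy_service(self, handle: str) -> None:
        self._instances.pop(handle).destroyed = True


def run_clustering(factory: ServiceFactory, rows: List[Row], k: int) -> Dict[str, object]:
    """Create service, execute method, query SDE, destroy service."""
    handle = factory.create_service()
    try:
        service = factory.resolve(handle)
        service.process_clustering(rows, k)
        return dict(service.query_sde())
    finally:
        factory.destroy_service(handle)  # always release the transient instance


if __name__ == "__main__":
    demo_rows = [(0.0,), (10.0,), (0.2,), (9.8,)]
    print(run_clustering(ServiceFactory(), demo_rows, k=2))
```

Destroying the instance in a `finally` block reflects the transient nature of the service in the slide: the instance exists only for the duration of one task, and the result is copied out of the SDE before the instance is released.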