KDNet (NOE), KDubiq (CA) and Beyond
Codrina Lauth
Fraunhofer AIS, Sankt Augustin and Institute for Computer Science III, University Bonn
Ina Lauth, Fraunhofer AIS, St. Augustin, Germany

Outline
- Fraunhofer AIS and University Bonn will host ICML/ILP 2005, 7-13 August 2005
- EU Projects: Networking and Activities Coordination from KDNet to KDubiq
- Text Mining Research Activities

Machine Learning and Data Mining in Bonn
University of Bonn, Computer Science III: 15 researchers
- Machine Learning and Data Mining Groups (Stefan Wrobel)
- Algorithmic Learning Theory (Marek Karpinski)
- Machine Learning in Robotics (Armin Cremers)
Fraunhofer AIS: 10 researchers, 20 project engineers
- Knowledge Discovery Group (Michael May, Stefan Wrobel)
- Interactive Discovery and Visualization Group (Hans Voss)
- Robotics and Reinforcement Learning Group (Joachim Hertzberg, Thomas Christaller)
Topics: Embedded Data Mining; Machine Learning, ILP; Grid Computing; Databases; Spatial Knowledge Discovery; Statistics; Text-Mining; Audio-Mining; Video-Mining

Knowledge Discovery Team @ AIS (Michael May, Stefan Wrobel)
- Representation of knowledge (methods for complex, structured and/or multi-relational data)
- Scalability (run time complexity, parallelization and distribution, sampling)
- Application domains (Spatial-, Text-, Multimedia-, Web-Mining, Data Warehouses)

Text Mining Research @ KD
- IE, NER, term clustering with SVM, PLSI, LDA
- Ontology learning, multimedia archives
- Automatic extraction of semantic relations
- Linguistic methods: semantic tagging/parsing for ontology mapping/extraction (using graph methods)
- Visualization: Semantic Maps
- Groupware Applications

Sample of Data Mining Projects @ KD
- FAW (2004-2006): Modelling of frequencies for the evaluation of poster advertisement
- SPR (2004-2005): Extrapolation and shaping of traffic frequencies for cities of Switzerland based on GPS data
- REWE (2004-…): Data Mining for assortment management
- Vodafone (2004-2005): Data Mining for radio network planning
- SIMDAT (EU, 2004-2008) – Data Mining and Grid computing for complex industrial applications
- KDNet (EU, 2002-2004) – European Knowledge Discovery Network of Excellence
- Diastasis (EU, 2003-2004) – Web Mining and extraction of statistical indicators
- Rolex (Industry, 2004) – Study for recognition of imitation and plagiarism based on Text Mining methods in the field of Ebay
- Mediarank (Industry, 2003-2005) – Evaluation of commercial text mining systems (with PAN, WDR)
- PI-Avida (BMBF, 2001-2003) – Text-, Audio- and Video-Mining

KDNet vs KDubiq

            KDNet (NOE)                        KDubiq (CA)
Budget      900 000 EUR                        1 000 000 EUR
Period      2002-2004                          Oct 2005-2007
Duration    36 months (130+ active members)    30 months

KDNet work packages:
- WP2 – Trends in Research (Lorenza Saitta, Jaakko Hollmen)
- WP3 – Research Project Forum (Maarten van Someren, Dunja Mladenic)
- WP4 – Online Information Services (Stefan Wrobel)
- WP5 – Training (Katharina Morik)
- WP6 – Industrial Workshops (DaimlerChrysler)
- WP7 – Public Sector Workshops (Willi Klösgen, Donato Malerba)
- WP8 – Scientific Sector (Neighboring Disciplines) (Arno Siebes)
- WP9 – Dissemination (Ina Lauth)

KDubiq work packages:
- WP1+9 – AIS (30 PM)
- WP2 – Blueprint editing (5 PM)
- WP3 – Clustering Projects (4 PM)
- WP4 – Online Information Services (AIS) (4 PM)
- WP4 – eTraining and Virtual Innovation Center (4 PM)
- WP6 – Areas Integration (Donato Malerba), 3 PM (WG: 12 PM)
- WP7 – Application Transfer (DaimlerChrysler) (4 PM)
- WP8 – Evaluation, Standards, Benchmarking (Michele Sebag) (2 PM)

KDubiq work package structure
WP6: Areas Integration
- Application
- Distributed Technology
- Learning
- Data Types
- Security
- HCI, Cognitive Modelling
WP2: Blueprint Editing
WP3: Project Clustering
WP4: Portal
WP5: Training
WP7: Transfer
WP8: Standards
WP9: Blueprint Dissemination

KDubiq working groups

Working Group               WG Coordinator             Country
Application Environments    Hillol Kargupta            USA
                            Gholamreza Nakhaeizadeh    Germany
Ubiquitous Technologies     Domenico Talia             Italy
                            Assaf Schuster             Israel
                            Georgios Paliouras         Greece
Learning Components         Michèle Sebag              France
                            Stefan Wrobel              Germany
Data Types                  Pavel Brazdil / Joao Gama  Portugal
                            Gerd Stumme / Myra Spiliopoulou  Germany
Security & Privacy          Yücel Saygin               Turkey
                            Fosca Giannotti            Italy
HCI & Cognitive Modelling   Bettina Berendt            Germany
                            Ernestina Menasalvas       Spain
                            Eduardo Alonso             UK

KDubiq (not attributed funding)

Category                 EUR
Travel Costs             120,000
Student Exchange          60,000
Workshop organization     70,000
Printing                  10,000
Conference Support        45,000
Others                     7,000
Total                    312,000

Challenge: Data Mining in distributed information systems
Industrial use case DaimlerChrysler: text repositories for CRM and quality management with millions of documents and a constant inflow of new documents.
Large volumes of data that accrue locally at distributed sites, combined with organizational barriers
⇒ Centralized data management or centralized text repositories are almost impossible
Goal: Intelligent distribution of the learning and classification process
Methods:
• Text Mining
• Ontology Learning
• Distribution of data management, data access, preprocessing, classification, and learning
• GRID Computing
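
The goal stated on the challenge slide, distributing learning and classification instead of centralizing the documents, can be illustrated with a minimal sketch. It is not the method used in the project: it assumes a simple multinomial Naive Bayes classifier trained separately at each site and a majority vote over the sites' predictions, so that only predicted labels leave a site. All function names, site names, and example documents below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_local(docs):
    """Train a multinomial Naive Bayes model on one site's (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    for text, label in docs:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return {"words": word_counts, "labels": label_counts, "vocab": vocab}


def predict_local(model, text):
    """Return the most probable label under one site's model (Laplace smoothing)."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"]) or 1
    best_label, best_score = None, -math.inf
    for label, n_docs in sorted(model["labels"].items()):
        counts = model["words"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def predict_distributed(models, text):
    """Combine the sites' predictions by majority vote; documents stay local."""
    votes = Counter(predict_local(m, text) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical local repositories; in the scenario above these would never be merged.
site_a = [("brake noise when stopping", "quality"),
          ("engine stalls at idle", "quality"),
          ("invoice was wrong", "billing")]
site_b = [("charged twice on invoice", "billing"),
          ("refund for payment missing", "billing"),
          ("brake pedal feels soft", "quality")]
site_c = [("engine warning light on", "quality"),
          ("payment not booked", "billing")]

models = [train_local(site) for site in (site_a, site_b, site_c)]
print(predict_distributed(models, "brake noise and engine warning"))
print(predict_distributed(models, "invoice payment refund"))
```

Voting over labels is only one possible combination rule. It was chosen here because it requires no exchange of documents or model parameters between sites, which matches the organizational barriers named on the slide.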