WP6: Software Platform and Tools
Lead: UDE
Partners: UMA, CICE, FrontiersIn
Month 1 - Month 30

Overview
* Bundles all activities related to the provision of a software platform hosting tools and services for data mining, crawling and social network analysis (SNA)
* Relies on existing tools, either free and open software or tools owned by the partners
* First part: definition of the crawling, data mining and storage strategy
* Second part: data transformation for SNA, definition of network-based role models and evaluation of these models

Specific objectives
* Selection and evaluation of mining strategies
* Specification of crawling approach and integration of crawlers
* Specification and configuration of a software platform
* Preparation / transformation of data for SNA
* Specification and modelling of roles and constellations (SNA)
* Data analyses and evaluation
* Model revision and software adaptation

T6.1 Crawler and mining strategy
* Specify requirements for crawling and data mining based on the focused data sources and social models
* Flexible with respect to crawling strategies, so that it can be adapted to the needs of other work packages (esp. the case studies)
* Integrated and controlled by a framework which handles the storage of retrieved web objects and the notification of newly found relevant data and changes in the data sources
Responsible: UMA (2PM)
Contributors: UDE (1PM), CICE (1PM)

T6.2 Semantic evaluation and filtering
* Categorize and filter data retrieved from the various data sources
* Relies on techniques adopted from the field of knowledge discovery in databases (KDD)
* Encompasses the pre-processing of given data in terms of statistical sampling, cleaning and transformation of the data into adequate representations for the subsequent algorithms
Responsible: UDE (2PM)
Contributors: UMA (1PM)

T6.3 Framework for storage, notification and triggering
* Re-trigger the crawler in response to changes in the data corpus over time
* Re-triggering based on a "when appropriate" strategy: recognition of specific events such as new conference announcements or availability of proceedings
* Notify users about new and relevant findings
Responsible: UDE (2PM)
Contributors: UMA (2PM), CICE (2PM)

T6.4 Data transformation and structural modeling for SNA
* Define a common data format for sharing within the consortium
  * based on the identification of relevant communities and their "traces" (communication, co-publications etc.)
  * based on the general conceptual model (WP 2)
* Define and specify typical roles and constellations (e.g. broker) based on SNA techniques (e.g. blockmodeling)
* Continuous verification of social indicators
Responsible: UDE (2PM)
Contributors: UMA (2PM)

T6.5 Software platform
* Configure an integrated software platform for crawling/data mining and SNA based on the initial specifications
  * input relates to the transformation from relevant data sources (specified in T6.4)
  * output is concerned with visualisation and reporting
* Revise and adapt the platform according to emerging issues and needs (esp. considering the case studies)
* Uses freely available (open) software and software owned by the partners (mainly UDE)
Responsible: UDE (7PM)
Contributors: UMA (4PM), CICE (1PM)

T6.6 Data analysis and evaluation
* Test the platform with standard cases based on the specifications of WP 4 (Measurements and Social Indicators)
* Early phase: test the functioning of the platform and its components (from T6.5) and the adequacy of the semantic filters (T6.2) and structural definitions (T6.4)
* Later stage: evaluate actual performance and community developments in association with the case studies and with WP 4
Responsible: UDE (3PM)
Contributors: FrontiersIn (2PM), UMA (1PM), CICE (1PM)

Deliverables and Milestones
Deliverables
* 6.1 Mining strategy and requirements specification for the software platform (RP: UDE, RV: UMA, C: all in / M5)
* 6.2 First version of structural definitions (RP: UDE, RV: UMA, C: all in / M10)
* 6.3 Configuration, test of the platform and first evaluation report (RP: UDE, RV: CICE, C: all in / M22)
* 6.4 Final report and system (RP: UDE, RV: CICE, C: all in / M30)
Milestones
* MS2, SISOB System first prototype, month 15
* MS3, SISOB Final System, month 30

Tools
* Open Source Crawler
* DMD (Data-Multiplexer-Demultiplexer)
* WOS2Pajek, Pajek, and UCINET
* CFinder

Challenges
* Data model adequate to different data sources
* Data model supporting multilevel analysis according to multivocality in the project
* Merging different types of data
* Cleaning data
  * e.g. researchers having different email addresses
  * e.g. researchers writing their names in different ways
* How to get data from Web 2.0 platforms like Mendeley
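
The data-cleaning challenge above (the same researcher appearing under different name spellings and email addresses) can be illustrated with a minimal sketch. It is not the project's method. It assumes records arrive as dictionaries with hypothetical "name" and "email" fields, and it uses a deliberately crude matching key (accent-stripped surname plus first initial). The helper names name_key and group_records are invented for this illustration.

```python
import unicodedata
from collections import defaultdict


def name_key(name: str) -> str:
    """Reduce a personal name to a crude key: '<surname> <first initial>'.

    Illustrative heuristic only: handles 'Surname, Given' and 'Given Surname'
    and assumes the last token is the surname when no comma is present.
    """
    # Strip accents by decomposing characters and dropping combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    plain = plain.lower().replace(".", " ")

    if "," in plain:
        surname, _, given = plain.partition(",")
        surname_tokens = surname.split()
        given_tokens = given.split()
    else:
        tokens = plain.split()
        if not tokens:
            return ""
        surname_tokens = tokens[-1:]
        given_tokens = tokens[:-1]

    surname = " ".join(surname_tokens)
    initial = given_tokens[0][0] if given_tokens else ""
    return f"{surname} {initial}".strip()


def group_records(records):
    """Group author records that share the same name key."""
    groups = defaultdict(list)
    for record in records:
        groups[name_key(record["name"])].append(record)
    return dict(groups)


# Hypothetical sample data: one researcher, three spellings, two addresses.
records = [
    {"name": "Müller, Hans", "email": "hans.mueller@uni-a.example"},
    {"name": "Hans Müller", "email": "h.mueller@institute-b.example"},
    {"name": "H. Müller", "email": "hans.mueller@uni-a.example"},
    {"name": "Ana Garcia-Lopez", "email": "ana@uni-c.example"},
]
groups = group_records(records)
```

A key this coarse will also merge distinct people who share a surname and initial, and it will miss transliteration variants such as "Mueller" for "Müller". In practice the groups would serve only as candidates to be confirmed with further evidence such as affiliations or co-authors.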