The OpenScienceLink architecture for novel services exploiting open access data in the biomedical domain
Efstathios Karanastasis, NTUA
Vassiliki Andronikou (NTUA), Efthymios Chondrogiannis (NTUA), George Tsatsaronis (TUD), Daniel Eisinger (TUD), Alina Petrova (TUD)
PCI2014 conference, Athens, Greece, 3rd of October 2014
CIP-ICT PSP-2012-6
ICT PSP Main Theme: Open Data and Open Access to Scientific Information

The Scientific Literature Background
• Lack of universal, well-structured repositories of scientific and research data for experimentation and benchmarking of pertinent research works in a given thematic area
• Fragmented, lengthy, weak and inefficient peer review processes given the growing number of journals, magazines and conferences
• Non-objective and extremely focused tools and metrics (in terms of the aspects that they cover, such as impact and popularity) for assessing research work as well as individuals, institutions and organizations, which are based on a specific snapshot of the scientific work
• Poor linking of research articles to data journals

OpenScienceLink Efstathios Karanastasis, The OpenScienceLink Architecture, PCI2014, 3rd of October, 2014

The Scientific Literature Background
• Growing wealth of the scientific work and information produced by researchers and scholars
  – scientific/research articles
  – monographs
  – research datasets
• Need for more effective processes and improved tools and techniques towards:
  – reviewing scientific articles and research data
  – organising and managing data journals
  – bibliographic analysis
  – management of scientometrics and development of new ones
  – better collaboration between researchers

OpenScienceLink Objectives
Open Semantically-enabled, Social-aware Access to Scientific Data
• Provide a holistic approach to the publication, sharing, linking, reviewing and evaluation of research results based on open
access to scientific information
• Empower a novel eco-system for open access to scientific information, which will provide a range of added-value services for all stakeholders
• Main Outcomes:
  – The OpenScienceLink platform
  – Implementation of 5 pilots
  – The Biomedical Data Journal (BMDJ)

OpenScienceLink Pilots
1. Research Dynamics-aware Open Access Data Journals Development
2. Novel open, semantically-assisted peer review process
3. Data Mining for Biomedical and Clinical Research Trends Detection and Analysis
4. Data Mining for Proactive Formulation of Scientific Collaborations
5. Scientific field-aware, Productivity- and Impact-oriented Enhanced Research Evaluation Services

Pilots Overview
Research Dynamics-aware Open Access Data Journals Development
• Data journal establishment
• Journal issue suggestion
• Dataset submission
• Dataset peer review
• Publishing
• Assessment and evaluation
• Identification of research dynamics associated with specific datasets
Novel open, semantically-assisted peer review process
• Article-based reviewers suggestion
• Assign competent reviewers
• Review support tools (e.g.
automatic retrieval of relevant research articles)
• Review form submission
• Post-review discussion

Pilots Overview
Data Mining for Biomedical and Clinical Research Trends Detection and Analysis
• Detect research trends
• Analyse research trends
• Essential for:
  – allocation of research funding (by private sponsors and governmental agencies)
  – overall planning of research strategies
Data Mining for Proactive Formulation of Scientific Collaborations
• Enable the networking and collaboration of researchers and scholars working on similar or potentially collaborating scientific fields and sharing similar research interests
• Infer relationships between researchers and research groups, including (in several cases) non-obvious, non-declared ones

Pilots Overview
Scientific field-aware, Productivity- and Impact-oriented Enhanced Research Evaluation Services
• Current simplified indices and impact factors evaluate only an aspect of the scientific work
• Introduce, produce and track new metrics of research and scientific performance, beyond conventional ones, for evaluation of:
  – Research work (incl.
data papers)
  – Researcher
  – Research group or community
  – Conference, Journal, Publisher
  – Department, Laboratory, Institution, University, Organisation
  – Country
  – Research grant

OpenScienceLink Ecosystem

Integrated Platforms
• FP7 SocIoS
  – A set of tools that leverage the potential of Social Networking Sites (SNSs)
  – Serves as an umbrella for accessing user data scattered among various SNSs through a common and secure interface
  – Hides SNS-specific complexity
  – Enables the delivery of services which exploit social graphs
• GoPubMed
  – A semantic search engine for the life sciences
  – Allows exploring PubMed search results with concepts from the Medical Subject Headings (MeSH), the Gene Ontology (GO) and the Universal Protein Resource (UniProt)
  – A data management model expanded with the ability to index, annotate, and semantically search datasets
• FP7 PONTE
  – A knowledge-oriented platform that supports the design and creation of clinical trial protocols
  – Provides a set of semantic web enabled mechanisms and services facilitating clinical trials lifecycle
  – Incorporates a set of advanced data mining and semantic reasoning mechanisms which are applied on a variety of web data sources containing clinical and non-clinical information

OpenScienceLink Conceptual Architecture

OpenScienceLink Core Components
• The OpenScienceLink core components implement the main functionality of the Platform and form the OpenScienceLink API
• Users Management
  – Responsible for handling all functionality related to the Platform users, their profile and access rights, such as
user registration, profile editing, authentication and role-based authorisation.
• Datasets Management
  – Responsible for handling all functionality related to datasets and the corresponding metadata.
  – Metadata are partially based on the Dryad Metadata Application Profile, including extensions at the level of parameters, e.g. dataset source type (real-world vs. synthetic), level of noise, and species, among others

Core Components Layer
• Articles Management
  – Responsible for handling all functionality related to articles
• Authors Management
  – This component is responsible for handling all functionality related to authors
• Groups Management
  – This component is responsible for handling all functionality related to groups of people

Core Components Layer
• Review Data Management
  – Responsible for handling all functionality related to the review process and the corresponding data
  – Covers the initiation and updating of the review process as well as the provision of access to the reviews to the corresponding users
    • For example, for a particular article or dataset, some users can see their own review (e.g. a reviewer), some users can see all reviews without knowing the reviewers (e.g. an author), and some users can see all reviews and reviewers (e.g.
a publisher)
  – Comments and ratings are also managed by this component, always considering each user's access rights

Adaptors Layer
• The OpenScienceLink core components interact with the SocIoS, GoPubMed and PONTE platforms by means of the adaptors
• The adaptors undertake the required actions, mappings and transformations in order to enable communication with the existing platforms and, ultimately, with the underlying data sources, so that the existing wealth of information can be exploited

Social Networks Adaptor
• Comprises a simplification layer on top of the SocIoS Services
• Undertakes the integration of the underlying SocIoS platform and communication with the connected SNS(s)
• Receives requests from the OpenScienceLink core components for the provision of data stemming from the connected SNSs, including the exact type of information required and the SNS(s) involved
• Combines SocIoS services in order to provide tailored functionality pertaining to the specific data needs of the OpenScienceLink Core Components
• Queries the services built on top of the SocIoS platform in order to further process the specific requests and gather the required data
• Internally performs any data processing or mapping that may be required for the seamless collaboration between the OpenScienceLink core components and the SocIoS platform in either direction
• Offered functionality: persons retrieval, connected persons retrieval, media items retrieval, activities retrieval, data transformation and data extraction

Content and Data Management Adaptor
• Integrates the data management system of GoPubMed within the OpenScienceLink Platform
• Integrates the services of the GoPubMed semantic
search engine
• Comprises a simplified layer of services on top of the GoPubMed platform that pertain to the indexing of data, annotation with the underlying ontology concepts, importing of new ontologies, semantic search on the indexed data and identification of trends in the indexed data.
  – Utilised for presenting statistics about the resulting set of documents, such as the number of publications over time, the top countries, cities, journals, authors and ontology terms
  – It is, thus, a summary of the trends observed for the documents that are returned via the input query

Semantically-enabled Inference Adaptor
• Enables the integration of the PONTE platform with OpenScienceLink
• Exploits the PONTE data mining and semantic reasoning mechanisms and services as well as the rich knowledge base of the PONTE platform
• Uses the term co-occurrence index building capability of the PONTE platform to exploit the fact that relevant terms appear together in the literature (the more often this happens, the more relevant they are considered to be), building a co-occurrence index for pairs and triples of terms, ranked in each case by frequency. This offers a first-stage filter that reduces the amount of information to manageable levels, without sacrificing interesting results, for guiding research
• Exploits a local knowledge base built on curated data from the web of linked data, as well as specialized data sources (incl. KEGG, ChEBI, DrugBank, Sider, etc.)
• Applies various ranking algorithms to the discovered data, following the knowledge-based concept correlations capability stemming from PONTE, with the ranking results being used either for presentation purposes (top first) or for adjusting the level of inclusion / exclusion of terms deemed relevant.
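
The co-occurrence idea described for the Semantically-enabled Inference Adaptor can be illustrated with a minimal sketch. It is not the PONTE implementation: the function names `build_cooccurrence_index` and `top_groups`, the per-document counting rule and the sample terms are all assumptions made for illustration. It counts how often pairs and triples of terms appear in the same document and ranks them by frequency.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_index(documents, sizes=(2, 3)):
    """Count how often term pairs and triples appear in the same document.

    `documents` is an iterable of term collections (one per document).
    Returns {group_size: Counter mapping sorted term tuples to counts}.
    """
    index = {size: Counter() for size in sizes}
    for terms in documents:
        unique_terms = sorted(set(terms))  # count each term once per document
        for size in sizes:
            index[size].update(combinations(unique_terms, size))
    return index


def top_groups(index, size, limit=3):
    """Return the `limit` most frequent term groups of the given size."""
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(index[size].items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical annotated documents (terms invented for illustration).
docs = [
    ["aspirin", "stroke", "platelet"],
    ["aspirin", "stroke"],
    ["aspirin", "platelet", "bleeding"],
    ["stroke", "hypertension"],
]
index = build_cooccurrence_index(docs)
pairs = top_groups(index, 2)
triples = top_groups(index, 3)
```

Keeping only the top-ranked groups is one simple way to obtain the first-stage filter mentioned above; a real index would presumably be built over ontology-annotated literature rather than hand-written term lists.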
Conclusions
• The OpenScienceLink platform enables accessing and offering added-value services (including trends detection and analysis, development of new scientometrics, data journals management, enhanced review processes) based on a multitude of openly accessible data sources (from literature and datasets to social network data), while at the same time empowering their semantic linking and data processing
• It further offers a wide range of opportunities for better collaboration between researchers, scholars, and research organisations, including their ability to formulate added-value scientific / research networks

Future Work
• Expand the capabilities of the components and user interfaces according to the recorded end user needs and requirements regarding all Pilots
• Address any issues with the implemented functionality and provide improvements based on the end users' evaluation feedback
• Consider additional data sources for inclusion via integration with the underlying platforms, according to the needs of OpenScienceLink
• Investigate the integration of more SNSs, with the aim to also include networks specifically addressed to researchers and research communities, with the most probable first candidate being Mendeley
• Analyse the steps required (e.g., link with other domains' ontologies, data sources and models) for enabling the Platform to offer its services beyond the biomedical domain, and, thus, ideally become domain-agnostic

Thank you
► Contact
Efstathios Karanastasis
Research Engineer
+30 210 772 2132
[email protected]
National Technical University of Athens
School of Electrical and Computer Engineering
Distributed Knowledge and
Media Systems Group
http://grid.ece.ntua.gr

The OpenScienceLink Platform
Log in
User registration (step 1 of 3)
User registration (step 2 of 3)
User registration (step 3 of 3)
Main menu bar
My profile
My datasets
Upload dataset
Create review call (1 of 2)
Create review call (2 of 2)
Trends
Collaborations
Evaluation (1 of 2)
Evaluation (2 of 2)

PONTE: Eligibility Criteria Model
Scope within PONTE:
► Formal representation of Eligibility (Inclusion/Exclusion) Criteria
► Patients Model for Clinical Research purposes (especially recruitment)
Current Status: 1st year work
► Work on extending and adapting the eligibility criteria model for OpenScienceLink purposes
Future work: 2nd and mainly 3rd year
► Update and integrate the I/E criteria model within the OpenScienceLink platform
► Annotate literature search results
► Improve literature search process

PONTE: Abbreviations - Introduction
► An abbreviation is a shortened form of a term or expression (its expanded form)
► Abbreviations are widely used in biomedical articles and datasets.
Example:
► An abbreviation is present within a document, e.g. “Cardiac testing for all patients at low-risk for ACS is not sustainable”…
► But its expansion is missing: Acute Coronary Syndrome
► Highly ambiguous
► Over 5 expansions per abbreviation on average
► Detecting or predicting the expansion of an abbreviation is a real challenge

PONTE: Abbreviations - Tasks
► Current Status: Work done during 1st year
  ► In-depth analysis of problem
  ► Abbreviation Expansion Detection and Prediction System Architecture
  ► Description of Algorithms / Methodology
► Future Work: for 2nd and 3rd year
  ► Repository of abbreviations with expansions along with context
  ► Suggestion of most appropriate expansion for an abbreviation
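
To make the planned "repository of abbreviations with expansions along with context" more concrete, here is a minimal sketch of how such a repository might be used to suggest an expansion. It is not the PONTE algorithm, which the slides do not describe: the `REPOSITORY` contents, the context word sets and the simple word-overlap score are all assumptions chosen for illustration.

```python
import re

# Hypothetical repository: abbreviation -> {expansion: context words}.
# Entries are invented for illustration, not taken from PONTE.
REPOSITORY = {
    "ACS": {
        "Acute Coronary Syndrome": {"cardiac", "chest", "patients", "troponin", "risk"},
        "American Cancer Society": {"cancer", "screening", "guidelines", "society"},
        "American Chemical Society": {"chemistry", "journal", "society"},
    },
}


def tokenize(text):
    """Lower-case a sentence and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_expansion(abbreviation, sentence, repository=REPOSITORY):
    """Rank candidate expansions by the context words they share with the sentence."""
    candidates = repository.get(abbreviation, {})
    words = tokenize(sentence)
    scored = [(len(words & context), expansion) for expansion, context in candidates.items()]
    # Highest overlap first; alphabetical order breaks ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [expansion for _, expansion in scored]


sentence = "Cardiac testing for all patients at low-risk for ACS is not sustainable"
ranking = suggest_expansion("ACS", sentence)
```

With the example sentence from the slide, the clinical context words push "Acute Coronary Syndrome" to the top of the ranking. A production system would likely need richer context modelling than plain word overlap.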