Download ELIXIR Applications, ECCB 2016, 3

Integration of EGA secure data access into Galaxy
Youri Hoogstrate[1], Alexander Senf[2], Jochem Bijlard[3], Saskia Hiltemann[1], David van Enckevort[4], Chao Zhang[5], Remond Fijneman[6], Jan-Willem Boiten[7], Gerrit Meijer[6], Andrew Stubbs[1], Jordi Rambla[8], Dylan Spalding[2], Sanne Abeln[5]
[1]ErasmusMC Rotterdam (NL), [2]EMBL-EBI (UK), [3]The Hyve (NL), [4]UMC Groningen (NL), [5]VU University Medical Center (NL), [6]Netherlands Cancer Institute (NL), [7]Lygature (NL), [8]Centre for Genomic Regulation (ESP)
ELIXIR Applications, ECCB 2016,
3-7 September, The Hague, Netherlands
Bio-molecular high-throughput data is privacy sensitive and cannot easily be made accessible to the entire outside world. The EGA project was initiated to manage the long-term archival of such data and to facilitate data access and management for funded projects after their completion, so that access to these data can continue.
Strict protocols govern how information is managed, stored, transferred and distributed, and each data provider is responsible for
ensuring that a Data Access Committee is in place to grant access to the data. Moreover, privacy-sensitive data should be encrypted during transfer.
Entire IT-infrastructure
The aim of the CTMM-TraIT project is to set up a multi-domain
IT-infrastructure in which researchers can track, share and
reproduce their entire study, including metadata on wet lab
experiments. To achieve this, CTMM-TraIT uses large, community-driven
open source software, including TranSMART and Galaxy.
EGA & Galaxy
In a collaboration between ELIXIR, TraIT and EGA, a full
ecosystem was designed to connect the storage of raw
experimental molecular profiling data with processed data and
computational workflows (fig. 1). Part of this ecosystem is
Galaxy, a popular and user-friendly bio-informatics analysis
platform. It provides an intuitive user interface in which molecular
biologists and bio-informaticians can design and run workflows,
do integrated analysis within the browser, and share and
communicate both results and methodologies. With EGA
integrated into Galaxy, a user can perform an entire analysis
that includes (privacy-sensitive) data from EGA and make it
available in a reproducible manner to other researchers.
Fig 1. A flowchart of the ecosystem: an entire study set up in such a way that
all molecular data, metadata and software are tracked and
versioned, while access to personal data is provided in a secure and
administered manner. This addresses challenges of molecular data such as
accessibility, reproducibility, security and privacy. The entire ecosystem
is accessible within the browser, uses free and open source software (FOSS),
and has EGA as its central storage facility.
Proof of concept study
To demonstrate the ecosystem we use cell-line data, which avoids privacy-related issues, and demonstrate fusion gene
detection in RNA-Seq data of the prostate cancer cell line VCaP using STAR-Fusion (fig. 2). The workflow and the results are explained
in more detail in a published Galaxy page [*], including interactive views of detected fusion genes and access to reference data.
Fig 2. The end-to-end workflow starts by obtaining paired-end RNA-Seq data of the VCaP cell line via EGA. Galaxy determines the file type
automatically with its built-in format detection system. The reads are clipped before alignment with STAR, which produces several output
files that can be used for different analyses. The last tool is STAR-Fusion, which was able to confirm the TMPRSS2-ERG fusion gene [**].
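The step order in the Fig 2 caption can be sketched as a simple chain in which each tool consumes what the previous one produces. This is only an illustrative model, not the actual Galaxy workflow definition: the `Step` class, the `check_chain` and `describe` helpers, and the data-type labels are hypothetical names introduced here. Galaxy's automatic format detection is left out because it happens when the data arrives rather than as a separate tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    consumes: str  # hypothetical label for the input data type
    produces: str  # hypothetical label for the output data type


# Step order follows the Fig 2 caption; the data-type labels are illustrative.
WORKFLOW = [
    Step("EGA download", "EGA dataset", "paired-end FASTQ"),
    Step("read clipping", "paired-end FASTQ", "clipped FASTQ"),
    Step("STAR alignment", "clipped FASTQ", "STAR output"),
    Step("STAR-Fusion", "STAR output", "fusion gene candidates"),
]


def check_chain(steps):
    """Return True if every step consumes what the previous step produces."""
    return all(a.produces == b.consumes for a, b in zip(steps, steps[1:]))


def describe(steps):
    """Render the chain as a single readable line."""
    return " -> ".join(s.name for s in steps)


if __name__ == "__main__":
    print(describe(WORKFLOW))
    print("chain is consistent:", check_chain(WORKFLOW))
```

A check like this only verifies that the steps connect in the stated order; the real tool versions and parameters are recorded in the published Galaxy workflow linked below.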
https://bioinf-galaxian.erasmusmc.nl/galaxy/u/yhoogstrate/w/ega-vcap-rna-seq-demo
[*] https://bioinf-galaxian.erasmusmc.nl/galaxy/u/yhoogstrate/p/ega-vcap-rna-seq-star-fusion-demo
https://github.com/ErasmusMC-Bioinformatics/ega_client_galaxy_wrapper
[**] doi: 10.1007/s00439-013-1308-1
Contact:
Youri Hoogstrate (ErasmusMC)
[email protected]
Sanne Abeln (VUmc)
[email protected]
@ELIXIREurope
/company/elixir-europe