NCRI Cancer Conference
November 1, 2015
www.bioinformatics.ca
The ICGC Data Portal
Part 1: Data submission, processing and release
NCRI Workshop 2015
bioinformatics.ca
ICGC Data Release Cycle
[Figure: release cycle timeline along a Time axis. Stages shown for Release 1 and Release 2: Data files, Submission and Validation, Sign off, Annotation & ETL, Portal Release, Open Data.]
Data Types Submitted
• To the Data Coordination Center (DCC)
– Simple somatic and germline mutations
– Somatic copy number variation
– Somatic structural mutations
– Methylation
– Gene expression (RNA-seq, arrays)
– Protein expression
– miRNA
– Exon junctions
• To the European Genome-phenome Archive (EGA) and cgHub
– Raw sequencing data (FASTQ, BAM)
Data Validation at Submission
Data Annotation & ETL Pipeline
• Annotation
– Mutation frequencies
– Mutation gene consequences
• Amino acid changes and their consequences for all genes and transcripts (e.g. frameshift)
– Mutation functional impact
– Gene Ontology terms, Reactome pathways, Cancer Gene Census
– Germline mutation masking
• ETL pipeline
– Annotated data indexed using an Elasticsearch cluster of 16 nodes
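As an illustration of the "mutation frequencies" annotation step, here is a minimal Python sketch. It assumes that frequency means the fraction of donors in a project that carry a given mutation; that definition, the donor and mutation IDs, and the function name are all hypothetical and not taken from the ICGC pipeline.

```python
from collections import defaultdict

# Hypothetical toy observations: (donor_id, mutation_id) pairs.
observations = [
    ("DO1", "MU1"),
    ("DO1", "MU2"),
    ("DO2", "MU1"),
    ("DO3", "MU1"),
    ("DO3", "MU1"),  # repeated call in the same donor, counted once
]


def mutation_frequencies(observations, total_donors):
    """Return, per mutation, the fraction of donors that carry it."""
    donors_by_mutation = defaultdict(set)
    for donor_id, mutation_id in observations:
        # A set ensures each donor is counted once per mutation.
        donors_by_mutation[mutation_id].add(donor_id)
    return {
        mutation_id: len(donors) / total_donors
        for mutation_id, donors in donors_by_mutation.items()
    }


# Assume the project has 4 donors, one of whom has no listed mutation.
freqs = mutation_frequencies(observations, total_donors=4)
print(freqs)
```

Counting distinct donors rather than raw observations keeps a mutation reported twice for one donor from inflating its frequency.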
The ICGC Data Portal
Part 2: Portal feature highlights
ICGC Data Portal
Top 20 mutated genes with high functional impact SSMs in selected cancer projects
Simple somatic mutation rate per donor across selected cancer projects
Project Entity Page
Also:
• Most frequent mutations
• Most affected donors
• Publications
• Filter on high impact mutations
Gene Entity Page
Frequencies by cancer projects
Pfam domains for all transcripts
Reactome Pathway Entity Page
Mutation Entity Page
Permanent ID across releases
Consequences for all transcripts
Genome Viewer
Current filters
Affected donors, mutated genes and mutations found simultaneously
Search data of interest by applying filters at the Donor, Gene, and/or Mutation level
Download data files for filtered donors only
Export table
Search for donor files in external repositories (e.g. raw data)
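The idea of applying Donor, Gene, and Mutation filters together and getting all three result sets back at once can be sketched in Python as follows. This is an in-memory illustration only: the record layout, field names, and the `search` function are hypothetical and do not represent the portal's API.

```python
# Hypothetical toy records: one row per (donor, gene, mutation) observation.
records = [
    {"donor": "DO1", "site": "Brain", "gene": "TP53", "mutation": "MU1", "impact": "High"},
    {"donor": "DO1", "site": "Brain", "gene": "KRAS", "mutation": "MU2", "impact": "Low"},
    {"donor": "DO2", "site": "Liver", "gene": "TP53", "mutation": "MU1", "impact": "High"},
    {"donor": "DO3", "site": "Brain", "gene": "EGFR", "mutation": "MU3", "impact": "High"},
]


def search(records, donor_filter=None, gene_filter=None, mutation_filter=None):
    """Keep rows that pass every supplied filter, then report the
    matching donors, genes and mutations together."""
    filters = (donor_filter, gene_filter, mutation_filter)

    def passes(row):
        # A filter left as None places no restriction on the row.
        return all(f is None or f(row) for f in filters)

    hits = [row for row in records if passes(row)]
    return {
        "donors": {row["donor"] for row in hits},
        "genes": {row["gene"] for row in hits},
        "mutations": {row["mutation"] for row in hits},
    }


# Brain donors carrying high impact mutations, in any gene.
result = search(
    records,
    donor_filter=lambda row: row["site"] == "Brain",
    mutation_filter=lambda row: row["impact"] == "High",
)
print(result)
```

Because every filter is evaluated on the same observation row, the three returned sets stay consistent with one another: each listed gene is mutated in at least one listed donor by at least one listed mutation.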
Customized saved donor, gene and mutation sets
Analyses:
• Enrichment Analysis
• Phenotype Comparison
• Set Operation
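A rough Python sketch of two of the analyses listed above, run on saved gene sets. The slide does not say how the portal computes enrichment; the hypergeometric test used here is one common choice and is an assumption, as are the gene sets, sizes, and function name.

```python
from math import comb

# Hypothetical saved gene sets.
set_a = {"TP53", "KRAS", "EGFR"}
set_b = {"TP53", "BRAF"}

# Set operations on saved sets.
union = set_a | set_b
intersection = set_a & set_b
only_in_a = set_a - set_b


def hypergeom_pvalue(overlap, set_size, pathway_size, universe_size):
    """Probability of seeing at least `overlap` pathway genes when
    drawing `set_size` genes at random from `universe_size` genes,
    of which `pathway_size` belong to the pathway."""
    upper = min(set_size, pathway_size)
    total = comb(universe_size, set_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, set_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 2 of the 3 genes in a saved set fall in a 4-gene pathway,
# out of a universe of 10 genes.
p_value = hypergeom_pvalue(overlap=2, set_size=3, pathway_size=4, universe_size=10)
print(union, intersection, only_in_a, p_value)
```

A small p-value would suggest that the saved set overlaps the pathway more than expected by chance. In practice a correction for testing many pathways would also be applied.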
File filters: Repository, Data Type, Experimental Strategy, File Format, Access
Acknowledgment
• Principal Investigator
– Vincent Ferretti
• Project Manager
– Francois Gerthoffert
• Lead Bioinformatician
– Junjun Zhang
• Software Architect and Tech Lead
• Business Analyst
– Phuong-My Do
• Software Developer
– Dusan Andric
– Terry Lin
– Michael Moncada
– Vitalii Slobodianyk
– Bob Tiernay
The ICGC Data Portal
Part 3: Live demo