CCRC Cancer Conference
November 8, 2015
www.bioinformatics.ca
Module 2
The ICGC Data Portal
Part 1: Data submission, processing and release
CCRC Workshop 2015 – Module 2
bioinformatics.ca
ICGC Data Release Cycle
[Figure: release cycle timeline. Each release (Release 1, Release 2) goes through the stages: Data files, Submission and Validation, Sign off, Annotation & ETL, Portal Release, Open Data. Horizontal axis: Time.]
Data Type Submitted
• To the Data Coordination Center (DCC)
– Simple somatic mutations and germline variants
– Copy number somatic mutations and germline variants
– Structural somatic mutations and germline variants
– DNA methylation
– Gene expression (RNA-Seq, microarrays)
– Protein expression
– miRNA
– Exon junctions
• To the European Genome-phenome Archive (EGA) and CGHub
– Raw sequencing data (FASTQ, BAM)
Data Validation at Submission
Data Annotations & ETL Pipeline
• Annotations
– Mutation frequencies
– Mutation consequences
• Protein changes and their consequences for genes & transcripts (e.g. amino acid substitution, frameshift, nonsense-mediated decay, etc.)
– Mutation functional impact
• High impact mutation prediction by FATHMM
– Gene Sets: Gene Ontology terms, Reactome Pathways, Cancer Gene Census
• ETL data processing pipeline
– Annotations and data are transformed and indexed with Elasticsearch to support highly integrated search
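The annotation and transform steps above can be sketched in a few lines of Python. This is only an illustration, not the DCC pipeline: the record fields (`donor_id`, `project`, `mutation_id`, `gene`), the function name `build_mutation_documents`, and the definition of frequency (affected donors divided by all donors seen in the project) are assumptions made for the example. The output is a list of flat, search-ready documents of the kind one would hand to an indexer such as Elasticsearch; no indexing call is made here.

```python
from collections import defaultdict

# Hypothetical input: one row per (donor, mutation) observation.
observations = [
    {"donor_id": "DO1", "project": "PROJ-A", "mutation_id": "MU1", "gene": "GENE1"},
    {"donor_id": "DO2", "project": "PROJ-A", "mutation_id": "MU1", "gene": "GENE1"},
    {"donor_id": "DO2", "project": "PROJ-A", "mutation_id": "MU2", "gene": "GENE2"},
    {"donor_id": "DO3", "project": "PROJ-B", "mutation_id": "MU1", "gene": "GENE1"},
]


def build_mutation_documents(rows):
    """Annotate each mutation with a per-project frequency and flatten it
    into one document per mutation.

    Assumption: frequency = donors carrying the mutation in a project
    divided by all donors observed in that project.
    """
    donors_by_project = defaultdict(set)
    donors_by_mutation = defaultdict(lambda: defaultdict(set))
    gene_of = {}

    for row in rows:
        donors_by_project[row["project"]].add(row["donor_id"])
        donors_by_mutation[row["mutation_id"]][row["project"]].add(row["donor_id"])
        gene_of[row["mutation_id"]] = row["gene"]

    documents = []
    for mutation_id in sorted(donors_by_mutation):
        per_project = donors_by_mutation[mutation_id]
        documents.append({
            "mutation_id": mutation_id,
            "gene": gene_of[mutation_id],
            "frequency_by_project": {
                project: len(donors) / len(donors_by_project[project])
                for project, donors in sorted(per_project.items())
            },
        })
    return documents


documents = build_mutation_documents(observations)
for doc in documents:
    print(doc)
```

Denormalizing the frequency into each mutation document is what lets a search index answer "mutations in project X above frequency Y" without a join at query time.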
The ICGC Data Portal
Part 2: Portal feature highlights
ICGC Data Portal
https://dcc.icgc.org
Major functional sections
Quick keyword search
Facets
Top 20 mutated genes with high functional impact SSMs in selected cancer projects
Simple somatic mutation rate per donor across selected cancer projects
Project Entity Page
• Filter on high impact mutations
Also:
• Most frequent mutations
• Most affected donors
• Publications
Gene Entity Page
Frequencies by cancer projects
mutations
Pfam domains for all transcripts
Reactome Pathway Entity Page
Mutation Entity Page
Permanent ID across releases
View the mutation in Genome Viewer
Consequences for all transcripts
Genome Viewer
Current filters
Donors, mutated genes and mutations found simultaneously
Search data of interest by applying filters at Donor, Gene, and/or Mutation
Save the current donors
Export table
Download data files for filtered donors only
Search for donor files in external repositories (e.g. raw data)
Facets: filter + count
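The "filter + count" behaviour of facets can be illustrated with a small Python sketch. It is not the portal's implementation (which the slides attribute to an Elasticsearch-backed index); the donor records, the field names `primary_site` and `gender`, and the helper names `apply_filters` and `facet_counts` are all hypothetical.

```python
from collections import Counter

# Hypothetical donor records; field names are illustrative only.
donors = [
    {"id": "DO1", "primary_site": "Brain", "gender": "female"},
    {"id": "DO2", "primary_site": "Brain", "gender": "male"},
    {"id": "DO3", "primary_site": "Liver", "gender": "male"},
]


def apply_filters(records, filters):
    """Keep records whose value for every filtered field is in the allowed set."""
    return [
        record
        for record in records
        if all(record.get(field) in allowed for field, allowed in filters.items())
    ]


def facet_counts(records, field):
    """Count how many of the given records carry each value of `field`."""
    return Counter(record[field] for record in records)


# Selecting "Brain" in one facet narrows the result set...
filters = {"primary_site": {"Brain"}}
selected = apply_filters(donors, filters)

# ...and the counts shown in another facet are recomputed on what remains.
gender_counts = facet_counts(selected, "gender")
print([d["id"] for d in selected])
print(dict(gender_counts))
```

Each facet thus plays two roles: its selected values restrict the result set, and its counts summarize whatever results are left after all filters are applied.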
Customized saved donor, gene and mutation sets
Analyses:
• Enrichment Analysis
• Phenotype Comparison
• Set Operation
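As a rough illustration of what a set operation on two saved sets amounts to, the following Python sketch combines two donor sets. The identifiers and the function name `set_operations` are made up for the example, and it says nothing about how the portal computes or stores its sets.

```python
# Hypothetical saved donor sets, identified by made-up donor IDs.
saved_set_a = {"DO1", "DO2", "DO3"}
saved_set_b = {"DO2", "DO3", "DO4"}


def set_operations(first, second):
    """Return the basic combinations of two saved sets."""
    return {
        "intersection": first & second,   # members of both sets
        "union": first | second,          # members of either set
        "only_first": first - second,     # members unique to the first set
        "only_second": second - first,    # members unique to the second set
    }


result = set_operations(saved_set_a, saved_set_b)
for name, members in result.items():
    print(name, sorted(members))
```

The same operations apply unchanged to gene sets or mutation sets, since only the identifiers differ.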
File filters: Repository, Data Type, Experimental Strategy, File format, Access
Acknowledgment
• Principal Investigator
– Vincent Ferretti
• Project Manager
– Francois Gerthoffert
• Lead bioinformatician
– Junjun Zhang
• Software Architect and Tech Lead
– Bob Tiernay
• Business Analyst
– Phuong-My Do
• Software Developer
– Dusan Andric
– Terry Lin
– Michael Moncada
– Vitalii Slobodianyk
The ICGC Data Portal
Part 3: Live demo