Knowledge Engine for Genomics (KnowEnG): Cloud-based Environment for Scalable Analyses of Genomic Signatures
Charles Blatti
Postdoctoral Research Associate
KnowEnG Center of Excellence in Big Data Computing,
University of Illinois at Urbana-Champaign
National Center for Supercomputing Applications
November 2016
Genomic Data Analysis Using Prior Knowledge in a Scalable Cloud
[Diagram: Knowledge Network; User Interface; User Spreadsheet (Genes x Samples); Analysis Pipelines (Network Smoothing, NMF Clustering, Consensus Clustering)]
Gene Prioritization Research Application
Amin Emad
Knowledge-Guided Prioritization of Genes Determinant of Drug Resistance Using ProGENI
Research Highlights Session 2
Genomics of Drug Sensitivity in Cancer (GDSC) Data
Basal Expression Data: 13,000 Genes x 600 Cell Lines
Drug Sensitivity Data: 139 Drugs x 600 Cell Lines
Gene Prioritization Measure
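The slide text does not preserve the formula for the prioritization measure. As a minimal sketch only, the example below assumes the measure is the absolute Pearson correlation between a gene's basal expression and the drug response across cell lines. The function names and the toy data are hypothetical and are not taken from the KnowEnG pipeline.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def prioritize(expression, response):
    """Rank genes by |correlation| between expression and drug response.

    expression: dict gene -> list of expression values, one per cell line
    response:   list of drug sensitivity values in the same cell-line order
    Assumed measure: absolute Pearson correlation (not confirmed by the slides).
    """
    scores = {gene: abs(pearson(values, response))
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data: 3 genes x 5 cell lines (the real matrices are far larger).
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [5.0, 3.0, 4.0, 1.0, 2.0],
    "GENE_C": [2.0, 2.0, 2.0, 2.0, 2.0],
}
response = [0.1, 0.2, 0.3, 0.4, 0.5]
ranking = prioritize(expression, response)
```

Here `ranking` lists the genes from most to least associated with the response; a gene with constant expression scores zero.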
Incorporating the Knowledge Network
[Diagram labels: Network Transform; Prioritization Measure; Network Ranking]
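The slide names a "Network Ranking" step without detail. One common way to rank nodes on a network from per-gene scores is a random walk with restart, and the sketch below assumes that technique. The function name, the restart probability `alpha`, and the three-gene network are all illustrative assumptions.

```python
def random_walk_with_restart(adj, restart, alpha=0.5, tol=1e-12, max_iter=1000):
    """Stationary distribution of a random walk with restart.

    adj:     dict node -> dict neighbor -> weight (store both directions
             for an undirected network; every neighbor must be a key)
    restart: dict node -> nonnegative score (e.g. a prioritization measure)
    alpha:   probability of jumping back to the restart distribution
    """
    nodes = sorted(adj)
    total = sum(restart.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("restart scores must have a positive sum")
    r = {n: restart.get(n, 0.0) / total for n in nodes}
    out_weight = {n: sum(adj[n].values()) for n in nodes}
    p = dict(r)
    for _ in range(max_iter):
        nxt = {n: alpha * r[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: return its mass to the restart distribution.
                for n in nodes:
                    nxt[n] += (1 - alpha) * p[u] * r[n]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - alpha) * p[u] * w / out_weight[u]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy chain G1 - G2 - G3; only G1 has a nonzero prioritization score.
adj = {
    "G1": {"G2": 1.0},
    "G2": {"G1": 1.0, "G3": 1.0},
    "G3": {"G2": 1.0},
}
restart = {"G1": 1.0, "G2": 0.0, "G3": 0.0}
p = random_walk_with_restart(adj, restart)
```

In this toy case the walk converges to 7/12, 1/3, and 1/12 for G1, G2, and G3, so genes with no direct score still receive weight through their network neighbors.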
Robust Prioritization by Resampling
[Diagram labels: Network Transform; Prioritization Measure; Network Ranking]
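The resampling idea can be sketched as follows: score the genes on many random subsets of the cell lines, then combine the per-round rankings. The example assumes subsampling without replacement, an 80% sampling fraction, and aggregation by summed rank; none of these details appear on the slide. The scoring function `abs_covariance` is a stand-in for whichever prioritization measure is used.

```python
import random

def abs_covariance(x, y):
    """Stand-in score: absolute covariance of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n)

def robust_ranking(expression, response, score_fn=abs_covariance,
                   n_rounds=50, frac=0.8, seed=0):
    """Aggregate gene rankings over random subsets of the cell lines.

    Each round draws a subset of cell lines, scores every gene on that
    subset, and records the gene's rank. Genes are returned ordered by
    summed rank (lowest sum first), so a gene must rank well consistently.
    """
    rng = random.Random(seed)
    n = len(response)
    k = max(2, int(frac * n))
    genes = sorted(expression)
    rank_sum = {g: 0 for g in genes}
    for _ in range(n_rounds):
        idx = rng.sample(range(n), k)
        y = [response[i] for i in idx]
        scores = {g: score_fn([expression[g][i] for i in idx], y) for g in genes}
        ordered = sorted(genes, key=lambda g: scores[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            rank_sum[g] += rank
    return sorted(genes, key=lambda g: rank_sum[g])

# Toy data: 3 genes x 10 cell lines.
response = [0.1 * i for i in range(1, 11)]
expression = {
    "GENE_A": [float(i) for i in range(1, 11)],       # tracks the response
    "GENE_B": [1.0, 2.0] * 5,                         # weakly related
    "GENE_C": [3.0] * 10,                             # constant
}
robust = robust_ranking(expression, response)
```

Summing ranks rather than raw scores keeps a single unusual subset from dominating the final order.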
Running the Pipeline
Docker Containers
Scheduled by Chronos
Managed by Mesos
Synced to Cloud Storage
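To make the container setup concrete, the sketch below builds a job definition in the JSON shape that Chronos accepts for Docker jobs (name, command, ISO 8601 schedule, and a container section). Every value is a hypothetical placeholder: the image name, command, paths, and resource sizes are assumptions, not the KnowEnG configuration. The example only builds the payload and does not submit it.

```python
import json

# Hypothetical job definition; all values below are placeholders.
job = {
    "name": "gene-prioritization-demo",
    # Command run inside the container (assumed script name and flag).
    "command": "python run_pipeline.py --config /data/run.yml",
    # ISO 8601 repeating interval: run once, starting at the given time.
    "schedule": "R1/2016-11-01T00:00:00Z/PT1H",
    "owner": "user@example.org",
    "cpus": 2.0,
    "mem": 4096,
    "container": {
        "type": "DOCKER",
        "image": "example/gene-prioritization:latest",
        # Mount a host directory that is synced to cloud storage (assumed path).
        "volumes": [
            {"hostPath": "/mnt/storage/run-001",
             "containerPath": "/data",
             "mode": "RW"}
        ],
    },
}

payload = json.dumps(job, indent=2)
```

Keeping each pipeline in its own image means the scheduler only needs the image name and a command, while inputs and outputs travel through the mounted volume.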
Visualizing the Results
Porting Results to Gene Set Characterization
Choosing Public Gene Sets
Standard GSC Method
Popular Webtools
Annotation Gene Sets
Characteristic Gene Sets
Experimental Gene Sets
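The slide does not spell out the "Standard GSC Method". A widely used baseline for gene set characterization is a one-sided hypergeometric (Fisher-style) enrichment test, and the sketch below assumes that choice. The function names and the toy gene sets are hypothetical.

```python
from math import comb

def hypergeom_pvalue(universe_size, set_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a universe containing set_size annotated genes."""
    upper = min(set_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def characterize(query, gene_sets, universe):
    """Test a query gene list against each public gene set.

    Returns (set name, overlap size, p-value) tuples, most enriched first.
    No multiple-testing correction is applied in this sketch.
    """
    universe = set(universe)
    query = set(query) & universe
    results = []
    for name, members in gene_sets.items():
        members = set(members) & universe
        overlap = len(query & members)
        p = hypergeom_pvalue(len(universe), len(members), len(query), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda row: row[2])

# Toy example: 20-gene universe, 5-gene query, two public gene sets.
universe = [f"G{i}" for i in range(1, 21)]
query = ["G1", "G2", "G3", "G4", "G5"]
gene_sets = {
    "SET_X": ["G1", "G2", "G3", "G4", "G5"],
    "SET_Y": ["G16", "G17", "G18", "G19", "G20"],
}
gsc_results = characterize(query, gene_sets, universe)
```

A real analysis would also correct the p-values for the number of gene sets tested.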
Incorporating the Knowledge Network
[Diagram: nodes P1 to P5 and G1 to G10 connected by Heterogeneous Edge Types: GO_edge, KEGG_edge, HumanNet_edge]
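When a network mixes edge types with different scales and counts, one plausible preprocessing step is to rescale the weights within each type before merging them into a single adjacency structure. The slide does not say how the edge types are combined, so the normalization rule, the function names, and the toy edges below are assumptions for illustration.

```python
from collections import defaultdict

def normalize_by_edge_type(edges):
    """Rescale weights so that each edge type's weights sum to 1.

    edges: list of (source, target, edge_type, weight) tuples
    """
    totals = defaultdict(float)
    for _, _, edge_type, weight in edges:
        totals[edge_type] += weight
    return [(s, t, edge_type, w / totals[edge_type])
            for s, t, edge_type, w in edges]

def build_adjacency(edges):
    """Merge typed edges into an undirected adjacency dict, summing weights."""
    adj = defaultdict(lambda: defaultdict(float))
    for s, t, _, w in edges:
        adj[s][t] += w
        adj[t][s] += w
    return {node: dict(neighbors) for node, neighbors in adj.items()}

# Toy heterogeneous network using the node and edge-type labels from the slide.
typed_edges = [
    ("P1", "G1", "GO_edge", 1.0),
    ("P1", "G2", "GO_edge", 1.0),
    ("P2", "G2", "KEGG_edge", 1.0),
    ("G1", "G2", "HumanNet_edge", 3.0),
    ("G2", "G3", "HumanNet_edge", 1.0),
]
normalized = normalize_by_edge_type(typed_edges)
network = build_adjacency(normalized)
```

After this step each edge type contributes the same total weight, so a type with many edges does not dominate a type with few.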
Visualizing the GSC Results
Sample Clustering / Subtype Stratification
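The overview diagram lists consensus clustering among the analysis pipelines. As a hedged illustration of that idea, the sketch below computes a consensus matrix: the fraction of clustering runs in which two samples were both included and assigned to the same cluster. The clustering runs themselves (for example NMF on resampled data) are assumed to have happened elsewhere, and the function name and toy labels are hypothetical.

```python
def consensus_matrix(runs, n_samples):
    """Co-clustering frequency across repeated clustering runs.

    runs: list of dicts mapping sample index -> cluster label; a run may
          cover only a subset of samples (as with resampling)
    Returns an n_samples x n_samples matrix whose entry [i][j] is the
    fraction of runs containing both i and j that put them together.
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    sampled = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        items = list(labels.items())
        for i, label_i in items:
            for j, label_j in items:
                sampled[i][j] += 1
                if label_i == label_j:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n_samples)]
        for i in range(n_samples)
    ]

# Three toy runs over four samples; cluster labels are arbitrary per run.
runs = [
    {0: "a", 1: "a", 2: "b"},
    {0: "x", 1: "x", 3: "x"},
    {0: 0, 1: 1, 2: 1, 3: 0},
]
consensus = consensus_matrix(runs, n_samples=4)
```

Entries near 1 mark sample pairs that cluster together consistently, which is what a final subtype assignment would be built on.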
Upcoming Features
Integration with Other Clouds
• Import user spreadsheets directly from other cloud-based datasets such as TCGA and LINCS
New Workflows
• Gene Regulatory Networks – Model interactions between transcripts and transcription factors
• Text Mining – Find the genes most specifically related to different disease terminology
• Phenotype Prediction – Create a model that predicts phenotypic outcomes from genomic data
Thank You!
Come see our demo at Poster #75
KnowEnG Development Team
• Research and Design: Saurabh Sinha, Colleen Bushell, Matt Berry, Lisa Gatzke, Amin Emad, Charles Blatti, and Sheng Wang
• Pipelines and Infrastructure: Nahil Sobh, Dan Lanier, Milt Epstein, Xi Chen, Suyang Chen, Jing Ge, Pramod Rizal, Omar Sobh, Aidan Epstein, Corey Post