Gene Set Analysis with Phenotypic Screening Data

Charles Hoyt, Elisabet Gregori-Puigjane, Miguel Camargo
Center for Proteomic Chemistry, Novartis Institutes for Biomedical Research, Cambridge, Massachusetts

Purpose
• Identify pathways relevant to a phenotypic screen
• Elucidate active compounds' mechanisms of action (MoA) in a phenotypic screen
• Find new targets by expanding around validated targets in relevant gene sets

[Figure: diagram relating Compound, Target, Pathway, Disease, Phenotype and Genetics. Labels: "Target-centric lead discovery"; "in silico hypothesis to link the phenotype to a compound"; "in silico hypothesis to link the phenotype to a MoA"]

Background
• Most gene set analysis methods were developed for gene-centric data
• Many of these methods focus on genes with extreme readouts and can overlook significant aggregate effects produced by many genes with more subtle readouts
• Pathway Influence Scoring was developed to address that issue for gene-centric data
• The method was then adapted for use with compound data

Results and Validation
• The plot shows scores versus p-values to help distinguish significant gene sets that are up-regulated from those that are down-regulated
[Plot: gene set scores versus p-values, with a "Down-regulated Tail" and an "Up-regulated Tail"]
• Sensitivity analyses of the net and absolute methods have been conducted to measure the robustness of the techniques and to detect false-positive gene sets
• The analysis was run on a viral infection cell proliferation assay, and the significant sets were then clustered (below). The themes are consistent with validated targets and pathways in viral infection.

Challenges
• Using small molecules as probes poses its own challenges.
• Their target annotations are less complete than those of siRNA and related probes, and they may act through an unknown mechanism.
• The molecules might also be unable to enter the cell or reach their intended targets.
• This is addressed by using larger screens with highly annotated compounds.
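The Background point about subtle aggregate effects can be illustrated with a toy calculation. This is only a sketch with made-up numbers, not Pathway Influence Scoring itself: the function names, the cutoff of 2, and the assumption that readouts are z-scored against a population with mean 0 and standard deviation 1 are all hypothetical.

```python
import math
from statistics import mean


def count_extreme(readouts, cutoff=2.0):
    """Number of genes whose |readout| reaches the cutoff (a threshold-style view)."""
    return sum(1 for r in readouts if abs(r) >= cutoff)


def set_mean_z(readouts, pop_mean=0.0, pop_sd=1.0):
    """Z-score of the set mean against a population with known mean and SD (assumed)."""
    standard_error = pop_sd / math.sqrt(len(readouts))
    return (mean(readouts) - pop_mean) / standard_error


# Hypothetical gene set: 25 genes, each only mildly shifted (0.8 SD).
subtle_set = [0.8] * 25

print(count_extreme(subtle_set))  # no gene passes the |readout| >= 2 cutoff
print(set_mean_z(subtle_set))     # yet the set mean sits about 4 standard errors from 0
```

A method that only looks at genes past the cutoff would see nothing in this set, while an aggregate statistic over all members flags it.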
Methods
• Compounds are annotated to genes for which they have an IC50 < 10 μM
• Each gene's activity is calculated by aggregating the phenotypic activities of its annotated compounds with an arithmetic mean
[Figure: Readouts 1 to 3, from Compounds 1 to 3, are aggregated into the activity of Gene A]
• Each gene set is scored using a net and an absolute scheme
• P-values are calculated for each set and method using bootstrapping: the original set's score is compared to the scores of 10,000 sets built from genes randomly sampled from the population

Clusters of significant gene sets:
A. Transglutaminase Pathways
B. Immune Response, PI3K Pathways
C. Surface Receptors and Ion Channels
D. Cellular Metabolism
E. Viral Life Cycle
F. Tetratricopeptide Domain

Conclusions
• The method has been validated and is sufficiently robust
• New cell-based assays can be analyzed

Looking Forwards
• Comparison with adaptations of Broad's GSEA method and other statistical methods
• Additional post-processing techniques, including the contextualization of results with literature and text-mining approaches

Acknowledgements
Special thanks to Elisabet Gregori-Puigjane, Miguel Camargo, Stephan Reiling, Shreyas Mahimkar, and everyone in iSLD and CPC.
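The steps listed under Methods can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The poster does not define the net and absolute schemes, so here "net" is assumed to be the mean of signed gene activities and "absolute" the mean of their magnitudes. Random sets are assumed to be drawn without replacement at the size of the original set, and the comparison is assumed to be two-sided. All function names and the input layout are hypothetical.

```python
import random
from statistics import mean

IC50_CUTOFF_UM = 10.0  # from the poster: annotate a compound to a gene when IC50 < 10 uM


def gene_activities(ic50_um, readouts):
    """Arithmetic mean of the readouts of the compounds annotated to each gene.

    ic50_um:  {(compound, gene): IC50 in uM}   (hypothetical input layout)
    readouts: {compound: phenotypic activity}
    """
    by_gene = {}
    for (compound, gene), ic50 in ic50_um.items():
        if ic50 < IC50_CUTOFF_UM and compound in readouts:
            by_gene.setdefault(gene, []).append(readouts[compound])
    return {gene: mean(values) for gene, values in by_gene.items()}


def net_score(values):
    """Assumed 'net' scheme: signed activities can cancel each other."""
    return mean(values)


def abs_score(values):
    """Assumed 'absolute' scheme: only the magnitude of each activity counts."""
    return mean([abs(v) for v in values])


def resampling_p_value(gene_set, activities, score, n_resamples=10000, seed=0):
    """Compare a set's score with scores of random same-size sets from the population."""
    rng = random.Random(seed)
    members = [g for g in gene_set if g in activities]
    observed = score([activities[g] for g in members])
    population = list(activities.values())
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(population, len(members))  # assumption: without replacement
        if abs(score(sample)) >= abs(observed):        # assumption: two-sided comparison
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_resamples + 1)
```

With gene activities in hand, a call such as `resampling_p_value(my_set, activities, net_score)` returns the set's score and its estimated p-value; passing `abs_score` instead applies the absolute scheme.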