Tuesday, May 23, 2017

From high-throughput data to network biology: gain in statistical power and biological relevance
Stockholm Bioinformatics Centre
Andrey Alexeyenko

Why Most Published Research Findings Are False (PLoS Med 2005 2(8):e124)
Statistical model: no positive facts, and an allowed rate of Type I error
[Figure labels: true negatives, false positives]
Biological reality: negative facts are the vast majority, positive facts are yet to be discovered
[Figure labels: positive facts, true positives, negative facts]
“Positive facts”: the discoveries we are after, e.g. genomic associations, differentially expressed genes, relations “phenotype<->disease” etc.

A network is just a graph!
The fact that I can draw a network does not yet make it a biological reality!

Conversion “data pieces -> confidence” in a Bayesian framework
[Chart: shares by species]
• D. rerio, 17.3%
• D. melanogaster, 9.8%
• C. elegans, 9.3%
• R. norvegicus, 5.1%
• S. cerevisiae, 10.2%
• M. musculus, 25.4%
• A. thaliana, 6.5%
• H. sapiens, 16.5%
[Chart: shares by data type]
• Phylogenetic profiling, 18.6%
• Protein interactions, 10.6%
• Protein expression, 6.1%
• TF targeting, 12.3%
• miRNA targeting, 2.0%
• Sub-cellular localization, 7.3%
• mRNA expression, 43.1%

Enrichment of functional groups
Enrichment analysis in networks turns out to be more powerful than on gene lists

Partial correlations
[Figure labels: rPLC = 0.95; rPLC = 0.88; rPLC = 0.76]
Benjamini-Hochberg correction

Quantitative modeling of a multi-component system with mutually dependent elements

Why is going “list -> network” an advancement?
• Functional context
• “Anchoring”, i.e. interdependence
• Biological interpretability
• Statistical features
• Data integration
Many of those can be applied to the lists as well, but mind the flexibility!

Ways to augment confidence
Trivial: 1) increase power 2) decrease false prediction rate
• Data integration
  – Evaluation prior to integration!
• Consider biological context
• Remove spurious edges
• Generalize to a higher level of organization

Ways to evaluate confidence
• Supervised learning
• Balance comprehensiveness and complexity (so-called information criteria)
• Benjamini-Hochberg
• Show it to a biologist
• Go out to the real world and test

Ways to employ confidence
• Initialize network
• Add node and edge attributes to the network
• Filter network elements for higher relevance
• Build more complex models accounting for confidence
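
Two of the items above, the Benjamini-Hochberg correction and filtering network elements for higher relevance, can be combined in a few lines. The sketch below is only an illustration and not the method used in the talk: the gene names, the p-values, the helper names and the 0.05 cut-off are all hypothetical, and each edge is assumed to carry one p-value from some upstream test.

```python
# Minimal sketch: Benjamini-Hochberg adjustment of edge p-values,
# followed by filtering of the network edges. All data are hypothetical.

def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # stay monotone: q(rank) = min over ranks k >= rank of p(k) * m / k.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_edges(edges, alpha=0.05):
    """Keep the edges whose BH-adjusted p-value is at most alpha."""
    q_values = benjamini_hochberg([p for _, _, p in edges])
    return [(a, b) for (a, b, _), q in zip(edges, q_values) if q <= alpha]


# Hypothetical edges: (node, node, nominal p-value of the link).
edges = [
    ("geneA", "geneB", 0.001),
    ("geneA", "geneC", 0.008),
    ("geneB", "geneD", 0.039),
    ("geneC", "geneD", 0.041),
    ("geneD", "geneE", 0.600),
]

q_values = benjamini_hochberg([p for _, _, p in edges])
confident_edges = filter_edges(edges, alpha=0.05)
```

In this toy input four edges pass a nominal p < 0.05 threshold, but only the two strongest survive the correction. The adjusted values could equally be stored as edge attributes instead of being used as a hard filter.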