Genomic Data Manipulation
Thinking about data visually
Curtis Huttenhower [email protected]
http://huttenhower.sph.harvard.edu/bio508
Harvard School of Public Health, Department of Biostatistics
01-28-13

The usual suspects

Small changes, big differences

Fig. 3. Comparison of Sargasso Sea scaffolds to Crenarchaeal clone 4B7. (J C Venter et al. Science 2004;304:66-74. Published by AAAS.)
Only one of many ways to think about DNA sequence data...

Fig. 7. Phylogenetic tree of rhodopsin-like genes in the Sargasso Sea data along with all homologs of these genes in GenBank. (J C Venter et al. Science 2004;304:66-74. Published by AAAS.)
(Almost) everything can be clustered into a tree, even DNA sequences

Aerobic, microaerobic and anaerobic communities
But not every tree is a clustering
Model of microbial biomarkers

Why are networks so popular in biology?

Don’t be afraid to get creative when representing data!
Hunger Games, Avengers, Dark Knight Rises, Twilight XXVII
http://xach.com/moviecharts/2012.html

Wordles

Anscombe's quartet
Four 11-pair datasets with the same X mean, X variance, Y mean, Y variance, correlation, and regression coefficients:
μ(x)=9, σ²(x)=11, μ(y)=7.5, σ²(y)=4.1, ρ=0.816, y=3+0.5x
Looking at data: it’s not just fun, it’s important, too!
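
The Anscombe's quartet summary can be checked numerically. The sketch below is not part of the original slides: it assumes the standard published values of the four datasets, uses only the Python standard library, and introduces a hypothetical helper named summarize. It computes the mean and sample variance of x and y (the value 11 quoted for x is its sample variance), the Pearson correlation, and the least-squares line for each dataset. The four summaries should agree to roughly two decimal places, even though the scatter plots look nothing alike.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet (standard published values; an assumption, not from the slides).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics listed on the slide for one dataset."""
    mx, my = mean(xs), mean(ys)
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares slope of y on x
    return {
        "mean_x": mx,
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": my,
        "var_y": variance(ys),
        "r": sxy / sqrt(sxx * syy),  # Pearson correlation
        "slope": slope,
        "intercept": my - slope * mx,
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Plotting each (x, y) pair, for example with a scatter plot per dataset, is what reveals the differences that these near-identical numbers hide.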