Luke Alden Yancy, Jr.
Mentor: Robert Riley
Broad Institute of MIT & Harvard, Cambridge, MA

Image source: http://staff.vbi.vt.edu/pathport/pathinfo_images/Mycobacterium_tuberculosis/AerosolTransmission.jpg

Deaths Caused by TB (Estimated by WHO)
1998: 1,751,858
2006: 1,654,805
Source: WHO Stop TB Department, website: www.who.int/tb

Learn more about Mycobacterium tuberculosis (Mtb) using analysis of gene expression data

Biclustering
◦ Bimax (Prelic et al. 2006)
◦ CC (Cheng and Church, 2000)
◦ Plaid Model (Turner et al. 2003)
◦ Spectral (Kluger et al. 2003)
◦ Xmotifs (Murali and Kasif, 2003)
Traditional Clustering
◦ K-Means (MacQueen, 1967)
◦ Hierarchical (Eisen et al. 1998)

                              Traditional Clustering    Biclustering
Gene Clusters Based on:       All Experiments           Subsets of Experiments
Genes Assigned to Clusters:   One-to-One                Many-to-Many / One-to-Many
Reproducibility:              Yes                       No (due to random steps in algorithm)
Source: Machine Learning and Its Applications to Biology, Tarca et al. 2007. (Editor: Fran Lewitter, Whitehead Institute)

[Figure: Clusters of Genes from Bimax and K-Means on the Boshoff Data (Processed: 3924 Genes, 359 Experiments)]
Source: The Transcriptional Responses of Mycobacterium tuberculosis to Inhibitors of Metabolism. (Boshoff et al. 2004)

Significance of overlap k estimated using hypergeometric distribution
[Figure: proS loci of Mtb. Labels: (N); Cluster (m); Operon (n); overlap (k); Gene Pair]
(Source: http://www.nature.com/nature/journal/v409/n6823/full/4091007a0.html)

[Figure: Bimax Biclustering, Operon Overlap]
Source: Prolinks: a database of protein functional linkages derived from coevolution (Bowers et al. 2005)

Random step: lacks reproducibility
No biological soundness
Artificial arrangement of data
◦ Large data sets produce statistically significant, but small, clusters
Practicality
◦ Implementation
◦ Large Input Data Sets

K-Means clustering performs better than biclustering on our data set
Next, use motif recognition methods to identify regulatory motifs in clusters
Further development of improved biclustering algorithms

Project Team
Robert Riley (Mentor)
Brian Weiner

The Broad Institute
Eric Lander
Core Members
SRPG Program Members

Summer Research Program in Genomics (SRPG)
Shawna Young
Bruce Birren
Lucia Vielma
Maura Silverstein
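
The operon comparison above states that the significance of the overlap k is estimated using the hypergeometric distribution. A minimal sketch of such a calculation follows. It is not the project's own code: the function name overlap_p_value is hypothetical, and it assumes that N is the total number of items considered, m the size of the cluster, n the size of the operon, and k the number of items they share, with significance taken as the upper-tail probability P(X >= k). The slides do not spell out these definitions, and the numbers in the usage lines are purely illustrative.

```python
from math import comb


def overlap_p_value(N: int, m: int, n: int, k: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    Assumed meaning of the symbols (not confirmed by the slides):
      N: total number of items in the population
      m: number of items in the cluster (the draw)
      n: number of items in the operon (the "successes")
      k: observed overlap between cluster and operon
    """
    if not (0 <= m <= N and 0 <= n <= N):
        raise ValueError("m and n must lie between 0 and N")
    max_overlap = min(m, n)
    if k > max_overlap:
        return 0.0  # an overlap this large is impossible
    k = max(k, 0)
    # Count the ways to draw m items containing exactly i operon members,
    # summed over all i from k up to the largest possible overlap.
    tail = sum(comb(n, i) * comb(N - n, m - i) for i in range(k, max_overlap + 1))
    return tail / comb(N, m)


# Illustrative toy numbers only, not taken from the Boshoff data.
p = overlap_p_value(N=100, m=10, n=5, k=3)
print(f"P(overlap >= 3) = {p:.3g}")
```

Exact integer arithmetic with math.comb avoids rounding problems when the population is large; a statistics library would be an equally reasonable choice.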