* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download CIPRES.2006.algorthms_sr
Survey
Document related concepts
Pathogenomics wikipedia , lookup
Designer baby wikipedia , lookup
Microevolution wikipedia , lookup
Helitron (biology) wikipedia , lookup
Metagenomics wikipedia , lookup
Gene expression programming wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Genome editing wikipedia , lookup
Smith–Waterman algorithm wikipedia , lookup
Sequence alignment wikipedia , lookup
Maximum parsimony (phylogenetics) wikipedia , lookup
Multiple sequence alignment wikipedia , lookup
Transcript
Algorithms at Berkeley (partial overview). Group leader: Satish Rao Overview 1. Distance Methods • Using Singular value decompositions. (Erickson,Sturmfels) • Neighbor Joining Criteria. (Mihaescu, Pachter.) • Lower bounding techniques for tree search using matrix. (Chen et al.) 2. Statistical Research • Breakthrough: Optimal logarithmic sequence length tree reconstruction (Daskalakis, Mossel, Roch 05). Simplified version (Mihaescu et al. 06). Preliminary Implementation [Adkins et al.]. • Maximum likelihood. NP-hard. Relationship to Maximum Parsimony (Roch05,Tul05). • Standard chains of MCMC method scan be problematic (DMV05). • Learning with other Evolutionary Models: (Mossel, Roch). Only nonsingular hmodels can be learned at all. Overview 3. Alignment Related Issues • Alignment Event Histories. Insertion/deletion histories. (Pachter, Snir • Deriving Internal Sequence in a copy/delete model. (Chen et al.) • Multiple Alignment. MAVID (Bray, Pachter.) • Alignment Accuracy Metrics: (Schwartz, Meyer, Pachter). Pieces out unalignable sequence better than previous metrics. 4. Other Activities • Gene Order. Can reconstruct phylogenies from gene order data with logarithmic number of genes. (Adkins et al.) • Supertree Methods. Reconstructing trees from Triplets, Quartets. (Snir et al.) Combining above into supertree method. • Reticulate Evolution: (Karp, Riesenfeld). Produces Galled trees from high degree input trees. (Simultaneous.) Distance Methods: Tree versus Forests Forests (instead of tree): Mossel: finds almost all edges even with logarithmic length sequence data. Hill et al.: better run times, implementation. Sequence Methods: Phase Transitions. Phase Transition: If and only if mutation rate less than ≈ 78%, can learn with logarithmic length sequence data. (Mossel 2002, Daskalakis, Mossel, Roch06. ) Takeaways: • Lovely results, techniques. Applicable to other computational biology and learning problems. • Infer internal sequences (confirms algorithms for MCMC, maximum likelihood, parsimony). Statistical Methods • Likelihood. NP-completeness implies that one should analyse or rely on properties of model based instances. MCMC methods • “Average Likelihood”. DMV is among the first to understand the convergence conditions of these methods.