Large-Scale High-Resolution Orthology Using Gene Trees
Finding Orthologous Groups
René van der Heijden

What is this lecture about?
• What is ‘orthology’?
• Why do we study gene ancestry / gene trees (phylogenies)?
• Several approaches to find orthologous genes
• High-resolution orthology
• Steps involved
• Things to think about (homework)

Homology
Genes are homologous if and only if they derive from the same ancestral gene
• Sufficient sequence similarity proves homology
• Very dissimilar sequences: PSI-BLAST, HMM searches
Homologous genes tend to have similar functions

Homologous genes tend to have similar functions
[Figure annotation: the usual range]
Accurate function prediction requires something better than homology: Orthology

Duplications, Speciations, and Orthology
Evolution results in:
• Growing number of genes
  – Gene duplications
  – Horizontal gene transfer
  – De novo generation
• Growing number of species
The fate of gene duplicates:
• Perish
• Find a new functional niche
Tendency for functional expansion

Duplications, Speciations, and Orthology
Two genes in two species are orthologous if they derive from one gene in their last common ancestor
• Orthologous genes are likely to have the same function
• Much stronger than “tend to have similar function”

Duplications, Speciations, and Orthology
[Figure: gene tree running from the primal ancestor to the present genes; axis: evolutionary distance]

Homologs, Orthologs, and Paralogs
• Homologous: one common ancestral gene
• Orthologous: separated by a speciation event
• Paralogous: separated by a duplication event
• Orthologs and paralogs must be homologs
The view on orthology and paralogy is relative to a certain speciation
Are there homologous genes which are neither orthologous nor paralogous?

Inparalogs and Outparalogs
• Both inparalogous and outparalogous genes are separated by a gene duplication event
• For inparalogs, the duplication event is not followed by speciation(s)
• Outparalogs are separated by a duplication event, followed by speciation(s)
• Inparalogs are recent paralogs
• Outparalogs are more ancient paralogs
Are inparalogs orthologs? Depends on your definition:
Yes: two genes are orthologous if they derive from one gene in their last common ancestor
No: two genes are orthologous if they are only separated by cell division events

Reading Gene-Trees
Although genes spec1,1 and spec2,1 are closer relatives, their distance is larger than that between spec1,1 and spec3,1
The tree suggests at least 2 gene losses

In- and Outparalogs, Orthologs, and Co-orthologs

More examples

www = What, Why, and hoW?
• What: Orthologous genes are separated by cell division only
• Why: Orthologous genes are likely to have the same function
• How: Yes, how can orthologous relations be established?

Several approaches
• The COG approach
• InParanoid
• Tree-based methods

COG approach
• Based on BLAST hits
• Establishment and extension of triangles

COG approach II
Extension of orthologous groups

InParanoid I
• Method denotes
  – In- and outparalogs
  – For TWO species
• Find all hits from species A on B
• Find all hits from species B on A
• Find all bi-directional best hits (BBH)
  – These form putative orthologs

InParanoid II
• Find all hits from A on A
• Find all hits from B on B
• Find all inparalogs
  – These are all hits better than the orthologs
  – Better => more recently split

InParanoid III
• Putative orthologous pairs are curated by an outgroup species C
• Inparalogs are given a confidence value
• Bootstrapping is used to give confidence values for orthologous pairs

Genes with promiscuous domains
• Gene A may hit on gene B because of a shared domain X
• Gene B may hit on gene C because of a shared domain Y
• Promiscuous domains require (manual) curation

Tree-based methods
1. Get all homologous genes
2. Make multiple alignments
3. Generate phylogenetic gene trees
4. Analyze trees
• Uncertainty in multiple alignment?
• Different methods for distance calculations
• Superpose a trusted species tree?
• How to assess a level of accuracy?

The Phylogenetic Gene-Tree
• Multiple alignment for all genes
• Distance matrix calculation
  – Kimura correction
  – PAM model
  – Categories model
• Large trees: distance-based methods
  – Neighbor Joining

Uncertainty in trees
• Evolutionary noise
  – Differing rates of evolution
  – Convergent evolution (low complexity, coiled coils)
  – Promiscuous domains (recombination, fusion, fission)
• Use of heuristic methods
  – Multiple alignment
  – Tree making

Analyze trees … but don’t trust them fully
If this is correct … this can’t be
• Rigid analysis suggests many duplications and losses
• Presume scp branch is wrongly placed!

Analyze trees … but don’t trust them fully
• And if we accept wrong placement of branches …
Three orthologous groups suggesting 15 gene losses
Considering one wrongly placed gene leaves only 2 gene losses

High-res versus Low-res
• Many,
• Complete, and
• Closely related genomes
Challenge: Automatic orthology assignment

Things to think about (homework)
• Select a partner
• Collect a gene tree (and some copies)
• Carefully deduce which nodes are duplications and which are speciations
• Denote which genes are orthologous to each other (orthologous groups)
• Select interesting parts to predict what
  – The COG procedure would say
  – InParanoid would say
  – What would have happened if some genes (or species) were not involved in the analysis

Homework: also think about …
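
Worked sketch: COG-style triangles
The COG slides mention only that triangles of BLAST hits are established and then extended. The sketch below is one possible reading of that idea, not the published COG procedure: it assumes that a triangle is three genes that are all pairwise bidirectional best hits, and that triangles sharing a side (two genes) are merged into one group. The function names and the toy gene names (A1, B1, C1, D1) are hypothetical.

```python
from itertools import combinations

def find_triangles(bbh_pairs):
    """Return gene triples in which every pair is a bidirectional best hit."""
    linked = {frozenset(pair) for pair in bbh_pairs}
    genes = sorted({gene for pair in bbh_pairs for gene in pair})
    return [
        triple
        for triple in combinations(genes, 3)
        if all(frozenset(side) in linked for side in combinations(triple, 2))
    ]

def merge_triangles(triangles):
    """Merge triangles that share a side (at least two genes) into groups."""
    groups = []
    for triangle in triangles:
        group = set(triangle)
        # Existing groups that contain two genes of this triangle share a side.
        touching = [g for g in groups if len(g & group) >= 2]
        for g in touching:
            groups.remove(g)
            group |= g
        groups.append(group)
    return groups

# Hypothetical best-hit pairs between genes of four species A, B, C, D.
toy_bbh = [("A1", "B1"), ("B1", "C1"), ("A1", "C1"), ("B1", "D1"), ("C1", "D1")]
toy_triangles = find_triangles(toy_bbh)
toy_groups = merge_triangles(toy_triangles)
```

In this toy input the two triangles share the side B1-C1, so they end up in a single group of four genes even though A1 and D1 are not best hits of each other.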
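
Worked sketch: bidirectional best hits and inparalogs
The InParanoid I and II slides can be illustrated for two species as follows. This is a minimal sketch under stated assumptions, not the InParanoid program itself: hit scores are assumed to be available as a dictionary keyed by (query, target), higher scores mean better hits, and the outgroup curation and bootstrapping of InParanoid III are left out. All function names, gene names, and scores are made up for illustration.

```python
def best_hits(scores, queries, targets):
    """For each query gene, return its highest-scoring target gene."""
    best = {}
    for query in queries:
        hits = [(scores[(query, t)], t) for t in targets if (query, t) in scores]
        if hits:
            best[query] = max(hits)[1]
    return best

def bidirectional_best_hits(scores, genes_a, genes_b):
    """Pairs (a, b) where a's best hit in B is b and b's best hit in A is a."""
    a_to_b = best_hits(scores, genes_a, genes_b)
    b_to_a = best_hits(scores, genes_b, genes_a)
    return [(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a]

def inparalogs(scores, seed, ortholog, same_species_genes):
    """Genes of the seed's own species that score better against the seed
    than the seed's putative ortholog does (more recently split)."""
    cutoff = scores[(seed, ortholog)]
    return [
        gene
        for gene in same_species_genes
        if gene != seed and scores.get((seed, gene), 0) > cutoff
    ]

# Hypothetical hit scores: A1 and A2 are recent duplicates in species A.
toy_scores = {
    ("A1", "B1"): 100, ("B1", "A1"): 100,
    ("A2", "B1"): 80, ("B1", "A2"): 80,
    ("A3", "B2"): 90, ("B2", "A3"): 90,
    ("A1", "A2"): 150, ("A2", "A1"): 150,
    ("A1", "A3"): 30, ("A3", "A1"): 30,
}
species_a = ["A1", "A2", "A3"]
species_b = ["B1", "B2"]
toy_orthologs = bidirectional_best_hits(toy_scores, species_a, species_b)
toy_inparalogs = inparalogs(toy_scores, "A1", "B1", species_a)
```

Here A2 is reported as an inparalog of A1 because the A1-A2 hit scores higher than the A1-B1 ortholog hit, which matches the slide's "better => more recently split".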
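
Worked sketch: distance with Kimura correction
The Phylogenetic Gene-Tree slide lists a Kimura correction for the distance matrix without giving the formula. Assuming this refers to Kimura's empirical correction for protein sequences, the observed fraction of differing sites p is converted into a distance d = -ln(1 - p - 0.2 p^2). The sketch below applies it to one aligned pair; skipping gap columns is a choice made here, and the function names are hypothetical.

```python
import math

def p_distance(seq_a, seq_b):
    """Fraction of differing residues over alignment columns without gaps."""
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in columns) / len(columns)

def kimura_distance(p):
    """Kimura's empirical protein distance: d = -ln(1 - p - 0.2 * p**2)."""
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0.0:
        # The correction is undefined for very divergent pairs (p above ~0.85).
        return float("inf")
    return -math.log(inner)

# Two hypothetical aligned fragments that differ at one of ten sites.
toy_p = p_distance("ACDEFGHIKL", "ACDEFGHIKV")
toy_d = kimura_distance(toy_p)
```

The corrected distance is always at least as large as p, because it tries to account for multiple substitutions at the same site. A full distance matrix would repeat this for every pair of sequences in the multiple alignment before tree building with Neighbor Joining.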
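
Worked sketch: duplications, speciations, and orthologs in a gene tree
The homework asks which nodes of a gene tree are duplications and which are speciations. One simple heuristic, swapped in here and not necessarily the lecturer's method, is the species-overlap rule: a node is treated as a duplication if the same species occurs on both sides of it, and as a speciation otherwise. Two genes are then called orthologous if their last common ancestor is a speciation node. The sketch assumes a rooted binary tree written as nested tuples and leaf names of the hypothetical form species_gene (for example spec1_1); it takes the tree at face value, so wrongly placed branches lead to wrong calls.

```python
from itertools import product

def leaves(tree):
    """List the leaf names of a nested-tuple binary tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

def species_of(leaf):
    """Species part of a leaf name such as 'spec1_2' (assumed naming)."""
    return leaf.split("_")[0]

def is_duplication(tree):
    """Species-overlap rule: a shared species on both sides means duplication."""
    left, right = tree
    left_species = {species_of(leaf) for leaf in leaves(left)}
    right_species = {species_of(leaf) for leaf in leaves(right)}
    return not left_species.isdisjoint(right_species)

def ortholog_pairs(tree):
    """Gene pairs whose last common ancestor is a speciation node."""
    if isinstance(tree, str):
        return []
    left, right = tree
    pairs = ortholog_pairs(left) + ortholog_pairs(right)
    if not is_duplication(tree):
        pairs += list(product(leaves(left), leaves(right)))
    return pairs

# Recent duplication in spec1: spec1_1 and spec1_2 are inparalogs and
# co-orthologs of spec2_1 and spec3_1.
recent_dup = ((("spec1_1", "spec1_2"), "spec2_1"), "spec3_1")
# Ancient duplication at the root: spec1_1 and spec2_2 are outparalogs.
ancient_dup = (("spec1_1", "spec2_1"), ("spec1_2", "spec2_2"))

recent_pairs = ortholog_pairs(recent_dup)
ancient_pairs = ortholog_pairs(ancient_dup)
```

Removing a gene or species from the toy trees and rerunning the functions is one way to explore the homework question of what happens when some genes are not part of the analysis.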