Download introduction - Gerstein Lab Publications
Researchers have long sought to determine a universal molecular phylogeny, also known as the tree of life. Different protein and ribosomal RNA sequences have been used as the basis of comparison. Currently, the most widely accepted grouping is based on sequence similarity of small subunit ribosomal RNA (Woese 1987; Woese et al., 1990). This method bases the phylogeny on an essential and highly conserved gene whose product has complex interactions with many other RNAs and proteins.

However, the ribosomal RNA tree is under much scrutiny due to problems such as long-branch attraction, unresolved tree differences, lack of incorporation of other genomic data, among-site rate variation, mutational saturation, and, most recently, lateral gene transfer (LGT). Long-branch attraction arises when rates of sequence change differ substantially between lineages, resulting in the artifactual clustering of organisms with higher rates of sequence change. Trees built on RNA polymerase, which is just as fundamental and complex, differ from the rRNA phylogeny. One explanation for the discrepancy is secondary deletion events.

Since the small subunit ribosomal RNA corresponds to less than 0.2% of the genomic material in most microorganisms (i.e., ~1.8 kb of rRNA in genomes of ~1 Mb and up), focusing exclusively on it ignores the bulk of the genetic information when constructing phylogenetic trees. Different genes evolve at different rates, so the gene phylogenies differ depending on the gene chosen. Organismal phylogeny is therefore no longer based on just one gene but is seen as a synthesis of the differing gene phylogenies. Mutational saturation can greatly affect the deeper branches of the tree, as the possible sites of mutation have been exhausted.
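The effect of mutational saturation can be illustrated numerically. The following is a minimal sketch that assumes the Jukes-Cantor substitution model (an assumption; the text does not name a model), under which the expected fraction of differing sites between two sequences levels off at 3/4 as the true number of substitutions per site grows. The function name is hypothetical.

```python
import math


def jc_expected_p_distance(d):
    """Expected fraction of differing sites after d substitutions per site.

    Assumes the Jukes-Cantor model: four states, equal rates.
    """
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


# The observed difference grows almost linearly for small d,
# then flattens out as the sequences become saturated.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    p = jc_expected_p_distance(d)
    print(f"d = {d:>4}: expected observed difference = {p:.3f}")
```

Once the observed difference is close to the 3/4 plateau, additional substitutions barely change it, which is why distances between deeply diverged sequences carry little signal.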
“Mutationally saturated sequences are maximally diverged, so that further changes are as likely to make them more similar as they are to make them more different, and tree topology is based on noise.” Genes have also been shown to be “transferred” from one organism to another, meaning that a gene present in an organism was not necessarily inherited from its ancestor. Some researchers have proposed that even rRNA itself can be transferred, citing evidence such as the in vitro replacement of the E. coli SSU rRNA and the SSU rRNA heterogeneity among the Streptomyces.

As a result, other approaches have been used for phylogenetic analysis. Some researchers still base their methods on sequences, such as those of ATPases, elongation factors, aminoacyl-tRNA synthetases, and beta-tubulin; some of the new trees seem more representative when judged by macroscopic properties. Other researchers have taken advantage of the availability of completely sequenced genomes to determine the phylogeny. Gupta used insertions and deletions, which he called “indels,” along with the structure of the cell membrane, for tree reconstruction. Gogarten suggests building trees using LGT events as characters.

The sequencing of whole microbial genomes allows us to reassess how we place organisms into groups and relate them to each other in phylogenetic trees. In our survey, we build trees based on gene function as denoted by the COGs database and on the three-dimensional structure classification provided by SCOP. As schematized in Table 1, we built trees considering progressively more of the information in a genome and compared them with the traditionally proposed phylogeny. We propose a number of novel trees based on the overall genomic occurrence of two specific features: folds and orthologs. We call these "genomic trees" or "whole-genome trees." Our aim is to assess how these compare to phylogenetic trees constructed at other levels.
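One way to turn the genomic occurrence of folds or orthologs into a tree is to first compute pairwise distances between genomes from presence/absence profiles. The sketch below is illustrative only: the Jaccard distance, the function names, and the toy genome and fold identifiers are assumptions, not the method or data used in this survey.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of features."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def occurrence_distance_matrix(profiles):
    """Build a symmetric distance matrix from presence/absence profiles.

    profiles maps a genome name to the set of features (folds or
    ortholog groups) that occur in that genome.
    """
    names = sorted(profiles)
    dist = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = jaccard_distance(profiles[n], profiles[m])
        dist[n][m] = d
        dist[m][n] = d
    return dist


# Hypothetical toy data: three genomes and the folds found in each.
profiles = {
    "genome_A": {"fold_1", "fold_2", "fold_3"},
    "genome_B": {"fold_1", "fold_2", "fold_4"},
    "genome_C": {"fold_5"},
}
dist = occurrence_distance_matrix(profiles)
for name in sorted(dist):
    print(name, [round(dist[name][other], 2) for other in sorted(dist)])
```

The resulting matrix could then be passed to a standard distance-based tree-building method such as neighbor joining or UPGMA.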
distance more robust
all trees suffer from long branch
ribosomal distance and parsimony
goal: write 2 to 3 paragraphs
ribosomal still h transfer
rates of evolution
moves eu and pro
controversial
Gogarten general paper
Doolittle specific example
Edlind specific examples
same thing, find ref
describe current controversy in whole-genome phylogeny
2 ribosomal problems
2 genome comparison even more unsettled
long branch attraction
clippings online
gavin
* http://bioinfo.mbb.yale.edu/~jimmyln/clippings/Pennisi-Uproot-tree-of-life.pdf
"Comparisons of the genomes then available not only didn't clarify the picture of how life's major groupings evolved, they confused it."
"So confusing that some biologists are ready to replace what has become the standard history with something new."
the new genomes are not adding details to the traditional tree but challenging the tree altogether
genes unreliable due to horizontal gene transfer
perhaps not even use trees anymore because of lateral transfer
Deinococcus radiodurans contains genes only present in plants (might also be gene loss from most other microbes)
Mycobacterium tuberculosis contains at least eight human genes
Gupta: use small insertions and deletions called "indels" and structure of cell membrane to build trees --> fundamental split among bacteria divides archaea between two bacterial branches
Philippe: genes differ in rates of evolution
Robb: suspected wrong placement based on rRNA, placements of hyperthermophiles towards the bottom might be an artifact
* http://bioinfo.mbb.yale.edu/~jimmylin/clippings/Pennisi-shake-tree-of-life.pdf
[The new genomic data] "threatens to overturn what researchers thought they already knew about how microbes evolved and gave rise to higher organisms."
rRNA history: Woese - successful because identify archaea based only on one gene --> rRNA regarded as tree of life
Reeve: because of different gene evolution, whole genomes produce different trees
Horizontal gene transfer: intertwining branches, swapping back and forth
Aaeo --> different phylogenetic placement based on gene
Soll: sum of all these trees make up the organism
Other problematic genomes: Mjan, Bbur, Aful, Tpal etc.
rampant gene swapping
question the very existence of archaea
* clippings/doolittle-pnas-genealogy-fun.pdf
2.5 billion years for divergence times of bacteria/eukaryote vs. 3.5 demanded
possible resolution: Hasegawa & Fitch - covarion model
residues functionally constrained at different periods in protein history
horizontal gene transfer "contamination"
mitochondria -> has TIM ***
replication, transcription, and translation machinery less subject to horizontal gene transfer
most eukaryotic nuclear genomes come from bacteria, should we still insist on archaeal/eukaryal sisterhood because we think the genes that show this.. are less subject to horizontal gene transfer? Are they really?
To what extent is our desire to look at early evolution in terms of cellular lineages preventing us from seeing that it is about genes and their promiscuous spread across taxonomic boundaries, which then have no permanent significance?
* phylogenetic analysis of beta-tubulin sequences
rRNA anomaly as a result of secondary deletion events
beta-tubulin based phylogeny
paromomycin sensitivity confirms beta-tubulin but not rRNA
microsporidia related to fungi and animals
* phylogenetic classification and the universal tree
genes from multiple sources and lateral gene transfer
1970s: Woese suggests SSU rRNA least likely of all genes to experience lateral gene transfer
rRNA and protein based phylogenies show that artifacts related to within-molecule and between-lineage differences in evolutionary rate and mutational saturation can be misleading about deep branching and the rooting of the tree
genes with different phylogenies
prominent sources of error or uncertainty in establishing patterns:
  mutational saturation
  long-branch attraction
  among-site rate variation
  lateral gene transfer
1) not logical to equate gene phylogeny and organism phylogeny
2) unless organisms are construed as either less or more than the sum of their genes, there is no unique organismal phylogeny
problem with the very conceptual basis of phylogenetic classification
conflict between SSU rRNA and largest subunit of RNA polymerase in placement of microsporidia
based on elongation factors (12-14, 19)
gene transfer seems unlikely to affect genes for replication, transcription, and translation, especially rRNA genes.
but could be transferred (34)
(32) LGTs of ribosomal protein genes
Zuckerkandl and Pauling (2) based on orthologous genes
if diff genes give diff trees, no fair way to suppress this disagreement
* New York Times article
* Orthologs, paralogs and genome comparisons
independent gene duplications, parallelism in evolution of independent heritages
with new databank searches for homologous sequences, and increased availability of reliable three-dimensional structural information
proteins with diverse functions can belong to same superfamily
vertical inheritance not sufficient
horizontal gene transfer (15-17)
Wolf (40) studied distribution of protein folds in completely sequenced genomes
ATPases (48), elongation factors (49-51), aminoacyl-tRNA synthetases (52)
replication, transcription, and translation machinery (54-55)
other characters (56-59)
placement of archaea
rooting of tree of life as working hypothesis and not as established fact (60-61)
gene duplication (82, 93, 85, 87)
* Felsenstein, Syst. Zool. 27, 401 (1978)
long branch attraction occurs when rates of sequence change differ substantially between taxa. lineages with higher rates of sequence change artifactually associate with each other and with outgroups, except with maximum likelihood methods.