In brief
• Vertical vs. Horizontal
• Homologous vs. Unequal
• Prokaryotes vs. Eukaryotes
• Mechanisms and Vectors
• Impact on Tree of Life
• Implications for prokaryotic species
Possible mechanisms for HT in Drosophila
From Heredity (2008) 100, 545–554
EVOLUTION: Genome Data Shake Tree of Life
E Pennisi - Science, 1998 - sciencemag.org
The ring of life provides evidence for a genome fusion origin of
eukaryotes
MC Rivera, JA Lake - Nature, 2004
The net of life: reconstructing the microbial phylogenetic network
V Kunin, L Goldovsky, N Darzentas, CA … - Genome Research 2005
The tree of one percent
T Dagan, W Martin - Genome biology, 2006
Uprooting the tree of life
WF Doolittle - Evolution: a Scientific American reader, 2006
Clusters of Orthologous Groups (COGs)
Puigbo et al.
• 6901 ML trees
• 100 taxa total
• Objective – compare topological distance between trees
• New metric called IS (inconsistency score), based on the fraction of the time the splits in a tree are found in all trees
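The quantity behind the score can be illustrated with a minimal sketch. It assumes each tree is already reduced to a set of bipartitions (splits) over the same taxa, and it computes only the raw fraction described on the slide: for each split of one tree, the share of trees in the forest that contain it, averaged over the splits. The function names and the toy five-taxon trees are hypothetical, and any normalization applied in the published score is not reproduced here.

```python
def normalize_split(side, taxa):
    """Canonical form of a bipartition: the side that does NOT contain the
    alphabetically first taxon, so both ways of writing a split compare equal."""
    side = frozenset(side)
    other = frozenset(taxa) - side
    anchor = min(taxa)
    return other if anchor in side else side


def split_support(tree_splits, forest):
    """Average, over the splits of one tree, of the fraction of trees in
    `forest` that contain that split (the tree itself counts if it is in `forest`)."""
    if not tree_splits:
        return 0.0
    fractions = [
        sum(split in other for other in forest) / len(forest)
        for split in tree_splits
    ]
    return sum(fractions) / len(fractions)


# Toy forest of three unrooted trees on five taxa (illustrative data only).
taxa = ["A", "B", "C", "D", "E"]
tree1 = {normalize_split("AB", taxa), normalize_split("DE", taxa)}
tree2 = {normalize_split("CDE", taxa), normalize_split("DE", taxa)}  # same splits as tree1
tree3 = {normalize_split("AC", taxa), normalize_split("DE", taxa)}
forest = [tree1, tree2, tree3]

for name, tree in zip(("tree1", "tree2", "tree3"), forest):
    print(name, round(split_support(tree, forest), 3))
```

A tree whose splits recur across the forest scores close to 1 on this raw measure, so a high value here corresponds to low inconsistency.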
Many genes are not found in all taxa
Define 102 NUTs, or “nearly universal trees”, that include 90% of the prokaryotes under comparison.
Mostly translation- and core transcription-related
J Biol. 2009;8(6):59.
The big divide?
• Look for evidence of HGT between bacteria and archaea
• 56% of NUTs separated the groups perfectly
• 44% show at least one HGT
– 13% from archaea to bacteria
– 23% from bacteria to archaea
– 8% both directions
The network of similarities among the nearly universal trees (NUTs). (a) Each node
(green dot) denotes a NUT, and nodes are connected by edges if the similarity between
the respective trees exceeds the indicated threshold. (b) The connectivity of the 102 NUTs
and the 14 1:1 NUTs depending on the topological similarity threshold.
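How connectivity depends on the threshold can be sketched as follows. This assumes a precomputed symmetric matrix of pairwise topological similarities between trees (the values below are made up), links two trees when their similarity exceeds the threshold, and reports the connected components. It is an illustration of the idea in the caption, not the procedure used to draw the figure.

```python
def similarity_components(similarity, threshold):
    """Connect trees i and j when similarity[i][j] > threshold and
    return the connected components as sorted lists of tree indices."""
    n = len(similarity)
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] > threshold:
                neighbours[i].add(j)
                neighbours[j].add(i)

    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical similarity matrix for four trees.
similarity = [
    [1.0, 0.9, 0.6, 0.2],
    [0.9, 1.0, 0.7, 0.3],
    [0.6, 0.7, 1.0, 0.1],
    [0.2, 0.3, 0.1, 1.0],
]

for threshold in (0.5, 0.8):
    print(threshold, similarity_components(similarity, threshold))
```

Raising the threshold removes edges, so the network breaks into more and smaller components.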
The supernetwork of the NUTs. For species abbreviations see Additional File 1.
Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159
Network representation of the 6,901 trees of the forest of life. The 102 NUTs are shown as
red circles in the middle. The NUTs are connected to trees with similar topologies: trees
with at least 50% similarity to at least one NUT (P-value < 0.05) are shown as purple
circles and connected to the NUTs. The rest of the trees are shown as green circles.
Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159
Similarity of the trees in the forest of life to the NUTs. (a) For each of the 102 NUTs,
the breakdown of the rest of the trees in the forest by percent similarity is shown. (b)
The same breakdown for 102 random trees generated from the NUTs.
Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159
Proc Natl Acad Sci U S A. 2005 Oct 4;102(40):14332-7.
Highways of obligate gene transfer within and among phyla and divisions of prokaryotes, based on analysis
of the 22,348 protein trees for which a minimal edit path could be resolved
Beiko R G et al. PNAS 2005;102:14332-14337
©2005 by National Academy of Sciences
Ratio of observed to expected discordant bipartitions among proteins in major TIGR role category
groupings
Beiko R G et al. PNAS 2005;102:14332-14337
©2005 by National Academy of Sciences
Fig. 1. Two methods for assessing LGT in bacterial genomes, applied to available quartets of closely related, fully sequenced bacterial taxa. The reference topology, based on
SSU rRNA, is shown in the upper left, with taxon names listed in the rows below. The yellow box contains the numbers of gene acquisitions in genomes A and B, as
determined by parsimony in comparisons of complete genome contents. The blue box contains the numbers of orthologous genes supporting a topology that conflicts with
the reference topology. "Interspecies" and "Intraspecies" comparisons represent quartets of taxa in which phylogenetic incongruence can be explained, respectively, by a
transfer from another species or from another strain of the same species. For intraspecies comparisons, numbers of acquired and lost genes were not calculated because of
uncertainty about the actual tree topology (nd, not determined). (B. aphidicola strains are entirely isolated in different hosts and were thus considered as different species
despite having a single name. In B. aphidicola, amounts of gene loss and gene gain are similar, suggesting that LGT is overestimated due to independent losses of genes.)
Fig. 2. Relative frequencies of the three categories of alignments, i.e., those supporting the reference phylogeny (SSU rRNA), those supporting an
alternate phylogeny (LGT), and those with no statistical support for any phylogeny. Points represent quartets of genomes for which orthologous
genes have been inferred, aligned, and evaluated at the nucleic acid sequence level based on the SH test implemented in Puzzle 5.1 (19). The
left part of the plot (in blue) represents the area where LGT predominates.
“THE” E. coli genome
Blattner et al., Science 5 September 1997 277: 1453–1462
Figure 1. The overall structure of the E. coli genome. The origin and terminus of replication are shown as green lines, with
blue arrows indicating replichores 1 and 2. A scale indicates the coordinates both in base pairs and in minutes (actually
centisomes, or 100 equal intervals of the DNA). The distribution of genes is depicted on two outer rings: The orange boxes
are genes located on the presented strand, and the yellow boxes are genes on the opposite strand. Red arrows show the
location and direction of transcription of rRNA genes, and tRNA genes are shown as green arrows. The next circle illustrates
the positions of REP sequences around the genome as radial tick marks. The central orange sunburst is a histogram of
inverse CAI (1 - CAI), in which long yellow rays represent clusters of low (<0.25) CAI. The CAI plot is enclosed by a ring
indicating similarities between previously described bacteriophage proteins and the proteins encoded by the complete
E. coli genome; the similarity is plotted as described in Fig. 3 for the complete genome comparisons.
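The inverse CAI plotted in the histogram can be illustrated with a small sketch. The codon adaptation index of a gene is the geometric mean of the relative adaptiveness weights of its codons, where each weight is a codon's frequency in a reference set divided by the frequency of the most used synonymous codon. The weight table below is hypothetical and covers only four codons; a real table would be derived from highly expressed reference genes. Codons missing from the table are simply skipped, which is an assumption of this sketch.

```python
import math


def cai(cds, weights):
    """Codon adaptation index: geometric mean of the weights of the codons
    in `cds`. Codons absent from `weights` are ignored (sketch assumption)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[codon]) for codon in codons if codon in weights]
    if not logs:
        raise ValueError("no codon of the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


# Hypothetical relative adaptiveness weights (illustrative values only).
toy_weights = {"GCG": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}

gene = "GCCAAG"
value = cai(gene, toy_weights)
print("CAI:", round(value, 3), "inverse CAI:", round(1 - value, 3))
```

A gene built from rarely used codons has a low CAI and therefore a long ray in a plot of 1 - CAI.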
Perna et al., Nature 409, 529–533 (25 January 2001)
Outer circle shows the distribution of islands: shared co-linear backbone (blue); position of EDL933-specific sequences (O-islands) (red); MG1655-specific sequences (K-islands) (green); O-islands and K-islands at the same locations in the backbone
(tan); hypervariable (purple). Second circle shows the G+C content calculated for each gene longer than 100 amino acids,
plotted around the mean value for the whole genome, colour-coded like outer circle. Third circle shows the GC skew for
third-codon position, calculated for each gene longer than 100 amino acids: positive values, lime; negative values, dark
green. Fourth circle gives the scale in base pairs. Fifth circle shows the distribution of the highly skewed octamer Chi
(GCTGGTGG), where bright blue and purple indicate the two DNA strands. The origin and terminus of replication, the
chromosomal inversion and the locations of the sequence gaps are indicated. Figure created by Genvision from DNASTAR.
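The per-gene quantities named in this caption are simple to compute. The sketch below assumes a coding sequence given in frame as a plain string and uses the conventional definitions: G+C content as the fraction of G and C bases, and GC skew as (G - C)/(G + C), here restricted to third codon positions. It also counts the Chi octamer on both strands. The function names and the toy sequences are illustrative and are not taken from the paper.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3_skew(cds):
    """GC skew, (G - C) / (G + C), over third codon positions only."""
    third = cds.upper()[2::3]
    g, c = third.count("G"), third.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_overlapping(seq, motif):
    """Number of (possibly overlapping) occurrences of `motif` in `seq`."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def chi_counts(seq, motif="GCTGGTGG"):
    """Occurrences of the motif on the given strand and on the opposite strand."""
    seq = seq.upper()
    return count_overlapping(seq, motif), count_overlapping(seq, reverse_complement(motif))


toy_gene = "ATGGCGGCCTAA"                # codons: ATG GCG GCC TAA
toy_region = "AAGCTGGTGGTTCCACCAGCAA"    # one Chi site on each strand

print(round(gc_content(toy_gene), 3), round(gc3_skew(toy_gene), 3))
print(chi_counts(toy_region))
```

The figure applies these measures only to genes longer than 100 amino acids; the toy gene is shorter purely to keep the example readable.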
Shared E. coli proteins
Welch R A et al. PNAS 2002;99:17020-17024
©2002 by National Academy of Sciences