Download Document

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Gene therapy wikipedia , lookup

Epigenetics of diabetes Type 2 wikipedia , lookup

NUMT wikipedia , lookup

Epigenetics of neurodegenerative diseases wikipedia , lookup

Whole genome sequencing wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Quantitative trait locus wikipedia , lookup

Copy-number variation wikipedia , lookup

Polycomb Group Proteins and Cancer wikipedia , lookup

Gene nomenclature wikipedia , lookup

Essential gene wikipedia , lookup

Oncogenomics wikipedia , lookup

Vectors in gene therapy wikipedia , lookup

Transposable element wikipedia , lookup

Short interspersed nuclear elements (SINEs) wikipedia , lookup

Gene desert wikipedia , lookup

Nutriepigenomics wikipedia , lookup

Genetic engineering wikipedia , lookup

Metabolic network modelling wikipedia , lookup

Metagenomics wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Therapeutic gene modulation wikipedia , lookup

Genomic imprinting wikipedia , lookup

RNA-Seq wikipedia , lookup

Public health genomics wikipedia , lookup

Ridge (biology) wikipedia , lookup

Epigenetics of human development wikipedia , lookup

Human Genome Project wikipedia , lookup

Genomic library wikipedia , lookup

Gene expression programming wikipedia , lookup

Human genome wikipedia , lookup

Genomics wikipedia , lookup

Non-coding DNA wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

Genome (book) wikipedia , lookup

Gene wikipedia , lookup

Gene expression profiling wikipedia , lookup

Microevolution wikipedia , lookup

Helitron (biology) wikipedia , lookup

Designer baby wikipedia , lookup

Genome editing wikipedia , lookup

History of genetic engineering wikipedia , lookup

Pathogenomics wikipedia , lookup

Artificial gene synthesis wikipedia , lookup

Minimal genome wikipedia , lookup

Genome evolution wikipedia , lookup

Transcript
Unitary Pseudogenes in Bacterial Genomes: Lifestyle and Gene Loss
Tal Dagan1 and Dan Graur 2
of Zoology, Tel Aviv University, Ramat Aviv 69978, Israel; 2Department of Biology and Biochemistry, University of Houston, Houston, TX 77204, USA
Bacterial Lifestyles
Genome Miniaturization
The question of “use and disuse” in evolution is as
old as the discipline itself. At least one unambiguous
rule can be deduced from the effects of disuse at the
molecular level: A drastic reduction in genome size
(genome miniaturization) is invariably associated
with loss of function. In particular, parasitic or
endosymbiontic lifestyles were found to effect
genome size profoundly.
In the following, we characterize genome size
reduction in 120 bacterial genomes due to gene loss
by using unitary pseudogene data.
Free
living
C. violaceum
R. palustris
B. anthracis
x
o
extreme
outlier
mean+STD
mean+SE
mean
mean-SE
mean-STD
Facultative
symbionts
Obligate
symbionts
Obligate
parasites
Unitary pseudogenes (%)
1Department
N (genomes)
Genes (COGs)
Unitary pseudogenes
41
45,444
106
20
33,943
117
19
20,986
40
44
38,802
189
• The frequency of the unitary pseudogenes was found to be significantly dependent on bacterial lifestyle.
• Obligate parasites have many more unitary pseudogenes than other bacteria.
• Facultative symbionts have a higher-than-expected frequency of unitary pseudogenes. This estimate, however,
Facultative: Optional, referring to the ability of an
organism to adopt to an alternative lifestyle.
Free living: Living without being directly dependent on
another organism.
Mutualism: Symbiosis in which both symbionts derive
benefit from the association.
Obligate: An essential attribute of an organism. For
example, an obligate parasite can only grow as a parasite.
Parasitism: Symbiosis in which one symbiont
(the parasite) benefits at the expense of the other (the
host).
Pseudogene: Functionless duplicate of functional gene.
II) Unprocessed
I) Unitary
non-functionalization
III) Processed
duplication & non-functionalization
Symbiosis: A stable condition in which two organisms
(symbionts) live in close physical association.
may be an overestimation due to the inclusion of Shigella flexneri.
COG function
• Genes encoding metabolic functions were found to be
prone to pseudogenization in all lifestyles, although
the fraction of pseudogenes derived from genes
involved in metabolic functions is smaller in free-living
bacteria.
Buchnera aphidicola
An intracellular mutualist obligatesymbiont of aphids. The fact that
different strains exhibit different
degrees of genome reduction, led
to the hypothesis that their
adaptation to different aphid
species is an ongoing process.
• The frequency of unitary pseudogenes
can be used to characterize genome
reduction.
• The frequency of unitary pseudogenes was found to
be dependent on genetic function.
• Function loss is more frequent in genes performing
metabolic functions than in genes involved in
information transfer or cellular processes.
An exceptional facultative
symbiont, whose genome resembles
that of obligate parasites. It has
been hypothesized that it is “on the
way to become one.”
Summary
Molecular Function
# Unitary Pseudogenes
Commensalism: Symbiosis in which one symbiont
(the commensal) derives benefit from the association, and
the other (the host) neither benefits nor is harmed.
An endosymbiontic parasite that
underwent extensive genome
reduction. About half of its genes
were rendered nonfunctional.
About 45% of its pseudogenes are
derived from genes specifying
metabolic functions.
Shigella flexneri
• In an analysis of 120 completely sequenced bacterial genomes, 452 unitary pseudogenes were found.
Glossary
Mycobacterium leprae
• Parasitic bacteria tend to lose gene
functions.
13%
16%
37%
34%
• Genes encoding metabolic functions are
more prone to pseudogenization than
other genes.
• Surprisingly, the ratio of unitary
pseudogenes to functional genes does not
differ significantly between free-living
and obligate-symbiotic bacteria.
• The frequencies of the lost
functions were found to be
similar among the different
lifestyles, except for the case of
free-living bacteria and
facultative symbionts.
• Our method for detection of unitary
pseudogenes assumes vertical
transmission of genes. Horizontal gene
transfer may be a source of bias.
Search algorithm
Identification of missing genes
An extended COG (Cluster of
Orthologous Genes) database was used to
identify missing functions. Bacteria that
were not represented in a certain COG
were marked as candidates for search of
orthologous unitary pseudogenes.
Search
We chose a query gene for each missing
bacterium in a COG. The chosen query
gene was from the closest related
bacterium according to rRNA distances.
The query gene was BLASTed against
the candidate bacterium genome.
Analysis of search results
Successful (e <10-4) BLAST hits with no
overlap to known genes were collected.
Nearby hits (30 bp) were joined. The DNA
sequence of each such hit was aligned to the
DNA sequence of the query gene using
CLUSTALW and to the protein sequence of
the query-gene product using WISE2. dn/ds
ratios were calculated using PAML.
Filtering
Unitary pseudogenes were defined as such if:
• DNA sequence similarity to the functional
gene was above 50%.
• Protein sequence similarity was above 35%.
• Length was at least 35% of the query gene.
• Signs of nonfunctionality, e.g., premature
STOP codons, frameshifts, or lack of
purifying selection (dn/ds = 1), are evident.