Evolution of Metabolic Pathway
10/03/2007
Reconstruction of metabolic pathways in sequenced genomes
¾ Metabolic pathways are the best-understood pathways in model organisms and are relatively conserved across all three domains of life;
¾ One goal of computational genomics is to reconstruct the metabolic pathways in all sequenced genomes;
¾ A naïve solution to this problem is to use the metabolic pathways of model organisms as templates and reconstruct the corresponding pathways in newly sequenced genomes through ortholog identification (COG database; bidirectional best hits, BDBH).
¾ However, the problem turns out to be too complicated to be fully automated, for a few reasons.
Problems for pathway mapping through ortholog identification
¾ The missing gene problem due to non-orthologous gene
displacement;
¾ Multiple paralogs in a genome;
¾ Gene loss/reduction.
[Figure: components g1 to g5 of a template pathway are carried over by orthology mapping to the mapped components g’1 to g’4 of the target genome; the component left without a counterpart (marked "????") reflects non-orthologous gene displacement or gene loss.]
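The template-based mapping described above can be sketched in a few lines. This is only an illustration of the bidirectional-best-hit idea, not the procedure used by any particular database: the gene names (g1 to g5, t1 to t4), the similarity scores, and the helper names `symmetric`, `best_hit`, and `map_pathway` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical similarity scores (for example alignment bit scores); higher is better.
Scores = Dict[Tuple[str, str], float]


def symmetric(pairs: Scores) -> Scores:
    """Store every score under both (a, b) and (b, a) for easy lookup."""
    out: Scores = {}
    for (a, b), s in pairs.items():
        out[(a, b)] = s
        out[(b, a)] = s
    return out


def best_hit(query: str, candidates: List[str], scores: Scores) -> Optional[str]:
    """Return the highest-scoring candidate for `query`, or None if there is no hit."""
    best, best_score = None, 0.0
    for c in candidates:
        s = scores.get((query, c), 0.0)
        if s > best_score:
            best, best_score = c, s
    return best


def map_pathway(template: List[str], target: List[str],
                scores: Scores) -> Dict[str, Optional[str]]:
    """Map each template gene to its bidirectional best hit in the target, if any."""
    mapping: Dict[str, Optional[str]] = {}
    for g in template:
        hit = best_hit(g, target, scores)
        # Accept the hit only if g is also the best hit of `hit` in the template.
        reciprocal = hit is not None and best_hit(hit, template, scores) == g
        mapping[g] = hit if reciprocal else None
    return mapping


# Made-up example: g5 has only a weak, non-reciprocal hit.
template_genes = ["g1", "g2", "g3", "g4", "g5"]
target_genes = ["t1", "t2", "t3", "t4"]
scores = symmetric({
    ("g1", "t1"): 200.0,
    ("g2", "t2"): 180.0,
    ("g3", "t3"): 150.0,
    ("g4", "t4"): 170.0,
    ("g4", "t3"): 90.0,   # weaker hit to a paralog
    ("g5", "t4"): 60.0,   # t4 prefers g4, so this hit is not reciprocal
})

mapping = map_pathway(template_genes, target_genes, scores)
missing = [g for g, hit in mapping.items() if hit is None]
print(mapping)
print("unmapped template components:", missing)
```

An unmapped component such as g5 is exactly the "????" case: the mapping alone cannot tell whether the function was lost or is carried out by an unrelated (non-orthologous) enzyme.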
Carbohydrate Metabolism
Glycolysis pathway:
¾ Encoded in all organisms except Rickettsia prowazekii, an intracellular parasite.
¾ Non-orthologous gene displacement is found in 6 of the 10 enzymes in this pathway.
Glycolysis pathway
1. Glucokinase: not an essential gene, but at least three analogs are found:
COG0837
Classic form in E. coli
ADP-dependent
New form in P. furiosus
PEP-dependent phosphotransferase
A transporter type
Glycolysis pathway
2. Glucose-6-phosphate isomerase (GPI)
COG0166
The classical (E. coli) form, found in most eubacteria and archaea and in the cytoplasm of the eukaryotic cell.
COG2140
Encoded in P. furiosus, P. horikoshii, and A. fulgidus.
Glycolysis pathway
3. Phosphofructokinase (PFK)
An example of an enzyme for which several “missing” forms have been discovered recently.
COG0205
An ATP-dependent kinase of unique structure found in bacteria and many eukaryotes.
PH1645
ADP-dependent PFK found in P. furiosus and M. jannaschii.
APE0012
Found in Halobacterium sp., A. fulgidus, M. thermoautotrophicum, A. pernix, and several other archaea.
Glycolysis pathway
4. Fructose-1,6-bisphosphate
aldolase (PB)
COG0191
a metal-independent one (class I) in
bacteria and multicellular
eukaryotes
COG1830
and a metal-dependent one (class II) in archaea, bacteria and yeast
These two forms have been known for more than 50 years. Some organisms have both, e.g., E. coli.
Glycolysis pathway
5. Phosphoglycerate mutase
COG0588
2,3-bisphosphoglycerate-dependent
(animal-type)
COG0696
2,3-bisphosphoglycerate-independent
(plant-type),
COG3635
Distantly related to the cofactor-independent phosphoglycerate mutase; found in archaea.
Glycolysis pathway
6. Pyruvate kinase
COG0469
Irreversible form
COG0574
Reversible form, also called
phosphoenolpyruvate
synthase
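The listing of alternative enzyme forms above suggests a simple way to avoid false "missing gene" calls: check a genome against every known form of a step, not only the classical one. The sketch below uses the family identifiers quoted on these slides; the genome content and the names `GLYCOLYSIS_FORMS` and `forms_present` are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Set

# Alternative (analogous) forms per glycolytic step, as listed on the slides.
GLYCOLYSIS_FORMS: Dict[str, Set[str]] = {
    "glucose-6-phosphate isomerase": {"COG0166", "COG2140"},
    "phosphofructokinase": {"COG0205", "PH1645", "APE0012"},
    "fructose-1,6-bisphosphate aldolase": {"COG0191", "COG1830"},
    "phosphoglycerate mutase": {"COG0588", "COG0696", "COG3635"},
    "pyruvate kinase": {"COG0469", "COG0574"},
}


def forms_present(genome_families: Set[str],
                  forms: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each step, list the known enzyme forms that the genome encodes."""
    return {step: sorted(ids & genome_families) for step, ids in forms.items()}


# Hypothetical archaeal-like genome carrying non-classical forms only.
genome = {"COG2140", "PH1645", "COG1830", "COG3635"}

found = forms_present(genome, GLYCOLYSIS_FORMS)
unresolved = [step for step, ids in found.items() if not ids]
print(found)
print("steps with no known form:", unresolved)
```

Steps left in `unresolved` are the ones where a still-undiscovered form, or genuine loss of the step, has to be considered.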
Carbohydrate
Metabolism
Gluconeogenesis pathway
¾ With the exception of
reactions catalyzed by
phosphofructokinase and
pyruvate kinase,
glycolytic reactions are
reversible and function
also in gluconeogenesis
Gluconeogenesis pathway
¾ Fructose-1,6-bisphosphatase
COG0158
Found in E. coli, yeast, and human
COG1494
Originally described in cyanobacteria
BS_fbp
Encoded in B. subtilis
MJ0109
Encoded in archaea such as M. jannaschii
Carbohydrate
Metabolism
¾ Phosphoenolpyruvate
synthase
COG0574
Widely present in bacteria, archaea, protists, and plants but missing in animals, where PEP is synthesized from oxaloacetate in a PEP carboxykinase-catalyzed reaction.
Carbohydrate
Metabolism
¾ Phosphoenolpyruvate
carboxykinase
COG1866
oxaloacetate + ATP <=>
phosphoenolpyruvate + ADP + CO2
found in plants, yeast, and many bacteria; a potential drug target.
COG1274
oxaloacetate + GTP <=>
phosphoenolpyruvate + GDP + CO2
found in animals and in a limited
number of bacteria.
Carbohydrate Metabolism
Entner-Doudoroff pathway and pentose phosphate shunt
¾ Found in many organisms but not universal; they convert hexoses and pentoses into trioses.
¾ Novel enzymes remain to be discovered, which may be important for biofuel studies.
Carbohydrate Metabolism
The tricarboxylic acid cycle (TCA or Krebs cycle)
¾ The complete TCA cycle is found in only a handful of microorganisms. Most organisms encode only a subset of the TCA enzymes and utilize only fragments of the cycle.
The tricarboxylic acid cycle (TCA or Krebs cycle)
¾ The pathway is very diverse: cases of non-orthologous gene
displacement are detectable for at least five of the eight TCA
cycle enzymes.
The tricarboxylic acid cycle (TCA or Krebs cycle)
¾ Non-orthologous gene displacement in the TCA cycle
[Figure: TCA cycle diagram whose legend marks alternative enzyme forms as "not related" or "distantly related". Enzymes shown: malate dehydrogenase, aconitase, citrate synthase, fumarate hydratase, succinate dehydrogenase, succinyl-CoA synthetase, isocitrate dehydrogenase, 2-ketoglutarate dehydrogenase.]
The tricarboxylic acid cycle (TCA or Krebs cycle)
¾ There are still missing genes in the TCA cycle of some genomes
[Figure: TCA cycle diagram. Enzymes shown: malate dehydrogenase, aconitase, citrate synthase, fumarate hydratase, succinate dehydrogenase, succinyl-CoA synthetase, isocitrate dehydrogenase, 2-ketoglutarate dehydrogenase.]
Pyrimidine Biosynthesis
¾ Has fairly consistent phylogenetic profiles, with 3 cases of non-orthologous gene displacement;
¾ The pathway, with the
exception of the last 3
steps, is missing in the
obligate parasitic
bacteria with small
genomes: rickettsiae,
chlamydiae,
spirochetes, and
mycoplasmas.
¾ Bacteria and archaea
with larger genomes
encode all or almost all
enzymes of pyrimidine
biosynthesis
Enzymes of the pathway:
Carbamoyl phosphate synthase
Aspartate carbamoyltransferase
Dihydroorotase
Dihydroorotate dehydrogenase
Orotate phosphoribosyltransferase
Orotidine-5′-monophosphate decarboxylase
Uridylate kinase
Nucleoside diphosphate kinase
CTP synthase
Pyrimidine Biosynthesis
¾ There appears to be a tendency toward decreasing the
genome size by losing genes that have ceased to be
essential.
¾ The trend toward gene loss is much more pronounced for the
initial steps of the pyrimidine biosynthesis pathway than it is
for the distal steps. Thus, while depending on the host for the
supply of essential nutrients, a parasite preserves at least
some metabolic plasticity.
Purine Biosynthesis
¾ Has fairly consistent phylogenetic profiles, although cases of non-orthologous gene displacement can be found.
¾ Some enzymes are
missing in parasitic
bacteria with small
genomes, namely
mycoplasmas,
rickettsiae, chlamydiae,
spirochetes, Buchnera
sp., and H. pylori.
Amino Acid Biosynthesis
Aromatic amino acids
Biosynthesis of tryptophan
¾ The biosynthetic pathways for
phenylalanine, tyrosine, and
tryptophan in bacteria and
eukaryotes share common
steps leading from
phosphoenolpyruvate and
erythrose-4-phosphate to
chorismate;
¾ Enzymes for most of these
steps are encoded also in
archaeal genomes;
Biosynthesis of tryptophan
¾ Dehydroquinate dehydratase is
found in several genomes that do not
encode dehydroquinate synthase,
indicating the existence of an
alternative, still uncharacterized
pathway of dehydroquinate formation
in these species
¾ The phyletic pattern for shikimate 5-dehydrogenase coincides with the combined pattern of the two forms of 3-dehydroquinate synthase.
Biosynthesis of phenylalanine and tyrosine
¾ Cases of non-orthologous gene displacement are found for the 3 enzymes that lead from chorismate to phenylalanine and tyrosine.
A summary on aromatic amino acid biosynthesis
¾ Most bacteria and archaea retain the complete set of genes
for tryptophan biosynthesis. The exceptions are the obligate
archaeal heterotroph P. horikoshii and some obligate bacterial
parasites;
¾ Enzymes of the tyrosine biosynthesis pathway are encoded in almost as many complete genomes, with the exception of P. abyssi (which lives at 105 ºC).
¾ A. aeolicus and four archaeal species encode 3-dehydroquinate dehydratase and all the downstream enzymes but do not encode either DAHP synthase or 3-dehydroquinate synthase.
¾ It appears that these organisms produce 3-dehydroquinate via
a different mechanism, which does not include DAHP as an
intermediate.
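The reasoning in the last two points (downstream enzymes present, upstream enzymes absent, hence an alternative route) can be expressed as a small check over an ordered pathway. This is a minimal sketch: the step order is that of the common pathway up to shikimate, while the genome content and the function name `missing_upstream` are hypothetical.

```python
from typing import List, Set

# First steps of the common aromatic pathway, in biochemical order.
SHIKIMATE_STEPS: List[str] = [
    "DAHP synthase",
    "3-dehydroquinate synthase",
    "3-dehydroquinate dehydratase",
    "shikimate 5-dehydrogenase",
]


def missing_upstream(encoded: Set[str], steps: List[str]) -> List[str]:
    """Return the steps absent from a genome that precede an encoded step."""
    present = [i for i, s in enumerate(steps) if s in encoded]
    if not present:
        return []  # the pathway is absent altogether, nothing to explain
    last = max(present)
    return [s for s in steps[:last] if s not in encoded]


# Hypothetical genome with the pattern described above for A. aeolicus
# and four archaeal species.
genome_x = {"3-dehydroquinate dehydratase", "shikimate 5-dehydrogenase"}

gaps = missing_upstream(genome_x, SHIKIMATE_STEPS)
print(gaps)
```

A non-empty result marks steps for which an unknown enzyme, or a different route to the same intermediate, is worth looking for.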
Coenzyme Biosynthesis
Thiamine Biosynthesis
¾ Not well studied; every enzyme has its own distinct phylogenetic pattern, indicating an abundance of non-orthologous gene displacement cases.
¾ There is still ample opportunity for new discoveries in other organisms.
Coenzyme Biosynthesis
Riboflavin Biosynthesis
¾ Three of the seven rib genes characterized in E. coli and B. subtilis have no archaeal orthologs;
¾ There is an excellent chance
of discovering new enzymes
of this pathway.
Microbial enzymes as potential drug targets
¾ One of the major goals of bacterial pathogen genome
sequencing projects is to better understand their peculiarities
and to develop new approaches for controlling diseases caused
by these organisms.
¾ Comparative genomics studies can help to choose drug
candidates that are most likely to be effective and least likely to
be toxic.
¾ In fact, the known targets of all antibiotics are essential for bacterial cell metabolism and are not represented in human cells.
¾ The “parts list” encoded in the pathogen genomes offers a wide
selection of potential drug targets.
Potential targets for broad-spectrum drugs
¾ Essential genes for each particular group of bacteria (all bacteria, all Gram-positive bacteria, all mycobacteria, etc.), e.g.,
Cell wall synthesis pathways;
Transcription and translation machineries;
DNA replication pathways;
Purine and pyrimidine synthesis pathways.
¾ Uncharacterized genes whose wide distribution in microbial
genomes marks them as being most likely essential.
Potential targets for pathogen-specific drugs
¾ Membrane transport systems: many pathogens rely on the host for the supply of certain essential nutrients, so the respective membrane transport systems can be valid targets for drug intervention, e.g.,
The still uncharacterized biotin transport system appears to be the only means of biotin acquisition for several pathogens, such as S. pyogenes, R. prowazekii, C. trachomatis, and T. pallidum.
¾ Surface proteins of bacteria as drug targets
¾ Host interaction factors that are essential for establishing the parasitic state.
¾ Analogous proteins: essential enzymes that occur in certain pathogens in a form different from the one present in humans. Detailed analysis of non-orthologous displacement cases has led to suggestions that alternative forms of essential enzymes could be used as drug targets.
Methods for drug target selection
¾ Phylogenetic profiles can be used to identify genes that are
essential to all (or most) pathogenic microorganisms in a
chosen group, as well as those that are specific for a particular
organism.
¾ The products of the former set are attractive targets for broad-spectrum antibiotics, whereas the unique proteins offer an opportunity to design specific antibiotics, which would target a narrow group of bacteria or even one particular pathogen.
Phylogenetic profiles (one column per genome: pathogens first, followed by non-pathogens and eukaryotes)
COG1  1111100000000000000000000
COG2  1111100000000000000000000
COG3  0001100000000000000000000
COG4  0001100000000000000000000
COG5  1000000000000000000000000
COG6  0100000000000000000000000
COG7  0010000000000000000000000
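The profile-based selection described above can be sketched as follows, using the seven profiles from the figure. The boundary between pathogen and non-pathogen columns is not stated in the transcript; the value `N_PATHOGENS = 5` is an assumption read off the patterns, and the category labels and the function name `classify` are hypothetical.

```python
from typing import Dict

# Profiles copied from the figure: '1' = family present in that genome.
PROFILES: Dict[str, str] = {
    "COG1": "1111100000000000000000000",
    "COG2": "1111100000000000000000000",
    "COG3": "0001100000000000000000000",
    "COG4": "0001100000000000000000000",
    "COG5": "1000000000000000000000000",
    "COG6": "0100000000000000000000000",
    "COG7": "0010000000000000000000000",
}

# ASSUMPTION: the first 5 columns are the pathogens; the rest are
# non-pathogens and eukaryotes.
N_PATHOGENS = 5


def classify(profile: str, n_pathogens: int) -> str:
    """Assign a coarse drug-target category from a presence/absence profile."""
    in_pathogens = profile[:n_pathogens]
    elsewhere = profile[n_pathogens:]
    if "1" in elsewhere:
        return "not pathogen-specific"
    count = in_pathogens.count("1")
    if count == n_pathogens:
        return "broad-spectrum candidate"
    if count == 1:
        return "pathogen-specific candidate"
    if count > 1:
        return "narrow-group candidate"
    return "absent from pathogens"


categories = {cog: classify(p, N_PATHOGENS) for cog, p in PROFILES.items()}
for cog, label in categories.items():
    print(cog, label)
```

Under that assumption, COG1 and COG2 fall in the broad-spectrum group, COG3 and COG4 cover a narrow group of pathogens, and COG5 to COG7 are each unique to a single pathogen.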
Methods for drug target selection
¾ Analogous genes can be identified by the inverted patterns for
the phylogenetic profile of the genes.
¾ Differential genome display looks for the genes that are
present in the genome of a pathogen but not in the genome of a
closely related free-living bacterium.
Pathogenicity factors in H. influenzae and H. pylori can be
identified through comparison of their genomes against E. coli
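Both methods on this slide reduce to simple set and string operations. The sketch below treats a profile as a string of '0'/'1' characters and a genome as a set of gene-family identifiers; the example profiles, the family names, and the function names `is_inverted` and `differential_display` are hypothetical.

```python
from typing import List, Set


def is_inverted(p: str, q: str) -> bool:
    """True if two profiles are complementary: each genome has exactly one of the two."""
    return len(p) == len(q) and all(a != b for a, b in zip(p, q))


def differential_display(pathogen: Set[str], free_living: Set[str]) -> List[str]:
    """Families present in the pathogen but absent from its free-living relative."""
    return sorted(pathogen - free_living)


# Two made-up families that could perform the same function in different genomes.
print(is_inverted("110010", "001101"))   # complementary in every genome
print(is_inverted("110010", "001100"))   # last genome has neither form

# Made-up genome contents, expressed as sets of family identifiers.
pathogen_genome = {"famA", "famB", "adhesinX", "toxinY"}
free_living_genome = {"famA", "famB", "famC"}
candidates = differential_display(pathogen_genome, free_living_genome)
print(candidates)
```

In real data complementary patterns are rarely perfect, so a practical version would probably tolerate a few mismatching genomes instead of requiring an exact inversion.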
Examples of pathogen-specific drug targets
Enzymes with limited phylogenetic distribution | Human pathogens that depend on these enzymes
ATP/ADP translocase, bacterial/plant type | R. prowazekii, C. trachomatis, C. pneumoniae
3-Dehydroquinate dehydratase, class II | C. jejuni, H. influenzae, H. pylori, P. aeruginosa, V. cholerae
DhnA-type fructose-1,6-bisphosphate aldolase | C. trachomatis, C. pneumoniae
Lysyl-tRNA synthetase, class I | B. burgdorferi, R. prowazekii, T. pallidum
Na+-translocating NADH:ubiquinone oxidoreductase | C. trachomatis, C. pneumoniae, Cl. perfringens, T. denticola
Na+-translocating oxaloacetate decarboxylase | S. pyogenes, T. pallidum
Orotidine 5′-phosphate decarboxylase | M. leprae, M. tuberculosis
Pyridoxine biosynthesis enzymes PDX1, PDX2 | Bacillus anthracis, H. influenzae, L. monocytogenes, M. leprae, M. tuberculosis, S. pneumoniae
Cofactor-independent phosphoglycerate mutase | C. jejuni, H. pylori, M. genitalium, P. aeruginosa, V. cholerae
Conclusions
1. Metabolic pathways are highly subject to non-orthologous gene displacement;
2. If an organism encodes a significant fraction of the enzymes for a particular pathway, a missing enzyme may be sought among uncharacterized orthologous sets with inverted phylogenetic profiles;
3. Conversely, there are cases where most of the enzymes of a given pathway are missing in an organism but one or two remain; most likely, these are cases of exaptation;
4. Identification of potential targets for antibacterial drugs is a natural task for comparative genomics and will likely remain one of its most important practical applications for years to come.