Co-evolution of the human genome and microbiome Lawley/Finn ESPOD
SUMMARY
Groups at the WTSI (Lawley) and EMBL-EBI (Finn) study the structure and function of complex microbial
communities using very different, yet complementary approaches. The EMBL-EBI has an in silico analysis
pipeline to characterize metagenomic and metatranscriptomic data, whereas the WTSI uses host-pathogen
interaction and microbial ecology methods to study the intestinal microbiota in health and disease in
humans and mice. The major goal of this ESPOD proposal is to support this ongoing collaboration and to
use our emerging Human Gastrointestinal Microbiome Database (HGM-DB) (Figure 1) to study the
co-evolution of the human genome and microbiome, and to validate the function of key pathways using in vitro
(RNA-seq, proteomics, microbiology) and in vivo (germ-free and transgenic mice) experiments.
BACKGROUND
Over the past decade, the genomic technologies available to characterize the microbial communities,
termed microbiota, living on and in people and animals have matured dramatically. We now know that a
human body contains ~10 times more bacterial cells than human cells and the microbiota codes for enormous
genetic diversity, containing ~200 times more unique genes than the host human genome. Over 70% of our
microbiota colonizes our intestine, where >1000 bacterial species reside in a mutualistic relationship on the
mucosal surface playing a key role in our development, sustenance and well-being. The microbiota and its
metabolic products are major stimuli for the underlying host cells and play a significant role during health
and in a range of diseases. The genetic complement of our microbiota therefore represents a “second
genome”, which is undersampled and poorly understood, representing a new frontier in scientific research
and human genomics. The ESPOD Fellow will utilize our unique computational and biological
resources to perform a comprehensive comparative metagenomic analysis to identify orthologous
genes shared between the human genome and microbiome, as such genes represent biochemical
pathways central to the mutualistic co-evolution of humans and bacteria that are potentially key for
symbiosis. Selected genes/pathways will be validated experimentally using in vitro and in vivo methods to
unravel the mechanism(s) involved.
[Figure 1. Bioinformatic and Experimental Platform to Study the Function of Complex Microbial Communities
Associated with Human Health and Disease. Panels: Custom Microbiome Database (~1800 faecal metagenomic
samples, ~23 billion classifications, ~5000 reference genomes including bacteria, archaea, fungi and DNA
viruses, 234-species culture collection); Integrated Analysis Framework (EBI tools); Experimental
Validation (in vitro and in vivo models, germ-free and transgenic mice); Scientific Deliverables
(understanding of human genome and microbiome co-evolution, identification of therapeutic probiotic
candidates, methods and targets for microbial diagnostics, improved annotation of the metagenomic database).]
Human Gastrointestinal Microbiome Database (HGM-DB)
The vast majority of bacteria in our microbiota have never been grown in the laboratory. Lawley’s group
has developed novel methods for culturing such fastidious bacteria from the human intestinal microbiota and
has to date built a culture collection of ~250 species (50% novel species) that were sequenced, assembled
and annotated at the WTSI. This collection of reference genomes is approaching that produced by the
International Human Microbiome Consortium, but it is unique in that the actual bacteria present in the
database and used for metagenomic classification are archived as a bacterial culture collection, and are
therefore available for phenotypic validation. To decipher this large amount of information, the Lawley and Finn
groups have extended the EBI Metagenomics portal to create a searchable HGM-DB that uniquely integrates
high-quality reference genomes and human gastrointestinal metagenomic datasets with the archived culture
collection, and that will be continuously updated as new data are generated at the Sanger or archived at the EBI
(Figure 1). The accompanying analysis framework allows us to explore a variety of basic and applied
questions about our microbiota at a level of resolution not previously possible.
SCIENTIFIC AIMS OF PROPOSAL
1. to develop a comprehensive list of orthologous genes shared between the human genome and microbiome
2. to determine the phylogeny, natural history and evolutionary dynamics of shared orthologues
3. to functionally validate the role of selected orthologue(s) and pathways using in vitro and in vivo methods
ESPOD PROJECT
Susceptibility to inflammatory bowel disease (IBD) (ydjC; unpublished data from Lawley Lab) and
Friedreich ataxia (frataxin; fxn) in humans is underpinned by genes that originated in bacteria and
were horizontally acquired early in animal/eukaryotic evolution. Phylogenetic analysis suggests that ydjC
was acquired early in metazoan evolution from the Proteobacteria but is also present in other bacteria
common to the human microbiota, for example Firmicutes and Bacteroidetes. Identification of such
genes was possible due to recent advances in human genome sequencing and annotation, and it raises
significant human evolutionary questions: how many orthologs are shared between the human genome
and microbiome? And what are their roles in human genome-microbiome interactions? Soon after
completion of the human genome sequence, a number of human genes were claimed to be of bacterial origin, but many of
these originated from a phylogenetically limited group of bacteria and were likely due to contamination
during sequencing. We are now in a position to ask these questions properly because of the improved annotation
of the human and other eukaryotic genomes and because our HGM-DB is populated with high-quality reference
genomes from our culture collection.
1. Orthologue identification: Our preliminary analysis suggests that ~64 orthologous genes are shared
between the human genome and microbiome. The ESPOD Fellow will work with both groups to develop a
bioinformatic analysis pipeline to identify potential horizontally transferred genes. Phylogenetic and
taxonomic tree comparisons, coding conservation, presence in the mitochondrial genome and co-occurrence
across bacteria will be factored into a scoring metric to eliminate false positives. This method will be
benchmarked using (i) ancient gene families in Pfam (expected false positives) and (ii) the ydjC and fxn
genes (known true positives). Once established, the in silico validation will be applied to an expanded set
of candidate genes.
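The scoring and benchmarking step described in Aim 1 could be sketched as follows. This is only a minimal illustration under stated assumptions: the proposal does not specify how the features are combined, so the weighted sum, the weights, the mitochondrial penalty, the threshold and the direction of each feature are all hypothetical, and every feature value in the example is an invented placeholder rather than a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate gene with features scaled to 0..1 (hypothetical encoding)."""
    name: str
    tree_discordance: float        # disagreement between gene tree and taxonomic tree
    coding_conservation: float     # conservation of the coding sequence
    bacterial_cooccurrence: float  # co-occurrence across bacterial genomes
    in_mito_genome: bool           # gene is encoded in the mitochondrial genome


# Hypothetical weights and penalty; the proposal gives no values.
WEIGHTS = {
    "tree_discordance": 0.5,
    "coding_conservation": 0.2,
    "bacterial_cooccurrence": 0.3,
}
MITO_PENALTY = 0.5  # assumption: mitochondrial genes are down-weighted


def hgt_score(c: Candidate) -> float:
    """Weighted sum of features, penalised for mitochondrial genes, clamped to 0..1."""
    score = (
        WEIGHTS["tree_discordance"] * c.tree_discordance
        + WEIGHTS["coding_conservation"] * c.coding_conservation
        + WEIGHTS["bacterial_cooccurrence"] * c.bacterial_cooccurrence
    )
    if c.in_mito_genome:
        score -= MITO_PENALTY
    return max(0.0, min(1.0, score))


def benchmark(positives, negatives, threshold):
    """Return (sensitivity, specificity) of the score at a given threshold."""
    true_pos = sum(hgt_score(c) >= threshold for c in positives)
    true_neg = sum(hgt_score(c) < threshold for c in negatives)
    return true_pos / len(positives), true_neg / len(negatives)


# Placeholder feature values, invented for illustration only.
TRUE_POSITIVES = [
    Candidate("ydjC", 0.9, 0.8, 0.7, False),
    Candidate("fxn", 0.8, 0.7, 0.6, False),
]
FALSE_POSITIVES = [
    Candidate("ancient_family_A", 0.1, 0.9, 0.9, False),
    Candidate("ancient_family_B", 0.2, 0.9, 0.8, True),
]

if __name__ == "__main__":
    sensitivity, specificity = benchmark(TRUE_POSITIVES, FALSE_POSITIVES, threshold=0.6)
    print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In practice the weights and threshold would be tuned on the benchmark sets, and a larger set of negative controls would be needed before the metric could be trusted on new candidates.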
2. Functional prediction: Using the functional prediction tools in the Finn team, the ESPOD Fellow
will perform a variety of in silico analyses on individual orthologous gene pairs to provide insight into
their functions, and to determine their distribution and expression patterns in the host and across different
human microbiomes. We expect this analysis to provide general insight into the phenomenon of horizontal
gene transfer between bacteria and animals/eukaryotes and to determine whether particular functions and
expression patterns are shared between orthologous gene sets. The ESPOD Fellow will update annotations in Pfam
and InterPro to reflect the acquired knowledge, enabling dissemination to other resources such as UniProtKB
and the EBI Metagenomics portal and ensuring a broad impact of the project outcomes.
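The distribution of an orthologue across human microbiomes, mentioned in Aim 2, could for instance be summarised as a per-cohort prevalence. The sketch below assumes a hypothetical input of (sample, cohort, detected) records; the function name, cohort labels and example records are invented for illustration and do not correspond to any EBI tool or real dataset.

```python
from collections import defaultdict


def prevalence_by_cohort(hits):
    """Fraction of samples per cohort in which the gene was detected.

    hits: iterable of (sample_id, cohort, detected) tuples (hypothetical format).
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for _sample_id, cohort, detected in hits:
        totals[cohort] += 1
        positives[cohort] += 1 if detected else 0
    return {cohort: positives[cohort] / totals[cohort] for cohort in totals}


# Invented placeholder records, not real samples.
EXAMPLE_HITS = [
    ("s1", "healthy", True),
    ("s2", "healthy", False),
    ("s3", "IBD", True),
    ("s4", "IBD", True),
]

if __name__ == "__main__":
    for cohort, fraction in prevalence_by_cohort(EXAMPLE_HITS).items():
        print(f"{cohort}: {fraction:.2f}")
```

The same grouping could be applied to expression data by replacing the boolean detection flag with a normalised abundance and averaging instead of counting.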
3. Experimental Validation: Based on the findings from Aims 1 and 2, the ESPOD Fellow will experimentally
validate a subset of the orthologs using in vitro and in vivo methods. We have a variety of experimental
platforms to investigate the molecular and genetic basis of host-microbiota interactions (Figure 1). For
example, we found that the ydjC ortholog is present in mice as well as in humans and gut bacteria, which allowed
us to determine that ydjC is expressed mainly in the digestive system, and to generate a knockout mouse line
for phenotypic characterization using the Mouse Genetics Programme pipeline and bespoke phenotypic
assays developed in the Lawley Lab. Importantly, the ydjC knockout mouse line has a severe intestinal
phenotype indicative of IBD, validating our human genome-microbiome ortholog identification strategy. We
can also utilise our extensive bacterial culture collection and mutagenesis methodologies to inactivate genes
in certain bacterial species harbouring the gene of interest, which can subsequently be phenotyped in vitro or in
germ-free mice using proteomics or transcriptomics. YdjC is predicted to be a deacetylase, potentially
removing acetyl groups from carbohydrates during digestion, so we have purified the protein from humans,
mice and intestinal bacteria to experimentally identify the substrate and prove deacetylase activity. Thus, we
have a variety of experimental platforms to interrogate and validate hypotheses generated from Aims 1 and 2.
EXPECTED OUTCOMES
This project offers a unique opportunity in the burgeoning field of human microbiome research to ask
deep questions about the co-evolution of the human genome and microbiome. The HGM-DB generated in
collaboration between the Sanger and EMBL-EBI will have broad applications for deciphering the
mutualistic co-evolution of the human genome and microbiome and understanding health and diseases that
are associated with the microbiota. Finally, this collaboration between the EMBL-EBI and WTSI will form
the basis of a unique, world-class research programme that integrates state-of-the-art bioinformatic and
experimental approaches and helps improve the interpretation of human and animal metagenomics projects.