Co-evolution of the human genome and microbiome
Lawley/Finn ESPOD

SUMMARY
Groups at the WTSI (Lawley) and EMBL-EBI (Finn) study the structure and function of complex microbial communities using very different, yet complementary approaches. The EMBL-EBI has an in silico analysis pipeline to characterize metagenomic and metatranscriptomic data, whereas the WTSI uses host-pathogen interaction and microbial ecology methods to study the intestinal microbiota during health and diseases in humans and mice. The major goal of this ESPOD proposal is to support this ongoing collaboration and utilize our emerging Human Gastrointestinal Microbiome Database (HGM-DB) (Figure 1) to study the co-evolution of the human genome and microbiome, and validate the function of key pathways using in vitro (RNA-seq, proteomics, microbiology) and in vivo (germ free and transgenic mice) experiments.

BACKGROUND
Over the past decade, the genomic technologies available to characterize the microbial communities, termed microbiota, living on and in people and animals have matured dramatically. We now know that a human body contains ~10 times more bacterial cells than human cells and the microbiota codes for enormous genetic diversity, containing ~200 times more unique genes than the host human genome. Over 70% of our microbiota colonizes our intestine, where >1000 bacterial species reside in a mutualistic relationship on the mucosal surface playing a key role in our development, sustenance and well-being. The microbiota and its metabolic products are major stimuli for the underlying host cells and play a significant role during health and in a range of diseases. The genetic complement of our microbiota therefore represents a “second genome”, which is undersampled and poorly understood, representing a new frontier in scientific research and human genomics.
The ESPOD Fellow will utilize our unique computational and biological resources to perform a comprehensive comparative metagenomic analysis to identify orthologous genes shared between the human genome and microbiome, as such genes represent biochemical pathways central to the mutualistic co-evolution of humans and bacteria that are potentially key for symbiosis. Selected genes/pathways will be validated experimentally using in vitro and in vivo methods to unravel the mechanism(s) that are key for symbiosis.

[Figure 1. Bioinformatic and Experimental Platform to Study the Function of Complex Microbial Communities Associated with Human Health and Disease. Figure labels: Lawley and Finn Groups; metagenomic database development, building intestinal bacteria culture collection, data generation; bioinformatic analysis; hypothesis generation; Custom Microbiome Database; public reference genomes and metagenome datasets; ~1800 Faecal Metagenomic Samples; ~23 billion classifications; ~5000 reference genomes including bacteria, archaea, fungi and DNA viruses; 234 Species Culture Collection; Integrated Analysis Framework; EBI tools; Experimental Validation; in vitro and in vivo models (germ free and transgenic mice); experimental testing and validation; proteomic, transcriptomic and metabolic based phenotyping; genetic engineering of bacteria; protein purification and biochemistry; ESPOD PDF; potential scientific outcomes; identification of secondary metabolite gene cassettes; identification of bacterial surface glycosylation; horizontal transfer of genes between human genome and microbiome; dynamics of antibiotic resistance spread in patient microbiota samples; identification of therapeutic bacteria; Scientific Deliverables; understanding of human genome and microbiome co-evolution; identification of therapeutic probiotic candidates; methods and targets for microbial diagnostics; improved annotation of Metagenomic Database.]

Human Gastrointestinal Microbiome Database (HGM-DB)
The vast majority of bacteria in our microbiota have never been grown in the laboratory. Lawley’s group has developed novel methods for culturing such fastidious bacteria from the human intestinal microbiota and has to date built a culture collection of ~250 species (50% novel species) that were sequenced, assembled and annotated at the WTSI. This collection of reference genomes is approaching that produced by the International Human Microbiome Consortium, but what makes ours unique is that the actual bacteria present in the database and used for metagenomic classification are archived as a bacterial culture collection, and are therefore available for phenotypic validation. To decipher this large amount of information, the Lawley and Finn groups have extended the EBI Metagenomics portal to create a searchable HGM-DB that uniquely integrates high quality reference genomes and human gastrointestinal metagenomic datasets with the archived culture collection, and that will be continuously updated as new data is generated at the Sanger or archived at the EBI (Figure 1). The accompanying analysis framework allows us to explore a variety of basic and applied questions about our microbiota at a level of resolution not previously possible.

SCIENTIFIC AIMS OF PROPOSAL
1. to develop a comprehensive list of orthologous genes shared between the human genome and microbiome
2. to determine the phylogeny, natural history and evolutionary dynamics of shared orthologues
3. to functionally validate the role of select orthologue(s) and pathways using in vitro and in vivo methods

ESPOD PROJECT
Susceptibility to inflammatory bowel disease (IBD) (ydjC; unpublished data from Lawley Lab) and Friedreich ataxia (frataxin; fxn) in humans are underpinned by genes of bacterial origin that were horizontally acquired early in animal/eukaryotic evolution.
Phylogenetic analysis suggests that ydjC was acquired early in metazoan evolution from the Proteobacteria but is also present in other bacteria common to the human microbiota, for example Firmicutes and Bacteroidetes. Identification of such genes was possible due to recent advances in human genome sequencing and annotation, and it raises significant questions about human evolution: how many orthologues are shared between the human genome and microbiome? And what are their roles in human genome-microbiome interactions? Soon after the human genome was completed, there were claims that some human genes were of bacterial origin, but many of these originated from a phylogenetically limited group of bacteria and were likely due to contamination during sequencing. We are now in a position to properly ask these questions because of improved annotation of the human and other eukaryotic genomes and our HGM-DB populated with high quality reference genomes from our culture collection.

1. Orthologue identification: Our preliminary analysis suggests that ~64 orthologous genes are shared between the human genome and microbiome. The ESPOD Fellow will work with both groups to develop a bioinformatic analysis pipeline to identify potential horizontally transferred genes. Phylogenetic and taxonomic tree comparisons, coding conservation, presence in the mitochondrial genome and co-occurrence across bacteria will be factored into a scoring metric to eliminate false positives. This method will be benchmarked using (i) ancient gene families in Pfam (false positives) and (ii) the ydjC and fxn genes (true positives). Once established, the in silico validation will be applied to an expanded set of candidate genes.

2. Functional prediction: Using the functional prediction tools in the Finn team, the ESPOD Fellow will perform a variety of in silico analyses on individual orthologous gene pairs to provide insight into their functions, and to determine their distribution and expression patterns in the host and across different human microbiomes. We expect this analysis to provide general insight into the phenomenon of horizontal gene transfer between bacteria and animals/eukaryotes and to determine whether particular functions and expression patterns are shared between orthologous gene sets. The ESPOD Fellow will update annotations in Pfam and InterPro to reflect the acquired knowledge, enabling dissemination to other resources such as UniProtKB and the EBI metagenomics portal and ensuring a broad impact of the project outcomes.

3. Experimental Validation: Based on the findings from Aims 1 and 2, the ESPOD Fellow will experimentally validate a subset of the orthologues using in vitro and in vivo methods. We have a variety of experimental platforms to investigate the molecular and genetic basis of host-microbiota interactions (Figure 1). For example, we found that the ydjC orthologue is present in mice as well as in humans and gut bacteria, which allowed us to determine that ydjC is expressed mainly in the digestive system and to generate a knockout mouse line for phenotypic characterization using the Mouse Genetics Programme pipeline and bespoke phenotypic assays developed in the Lawley Lab. Importantly, the ydjC knockout mouse line has a severe intestinal phenotype indicative of IBD, validating our human genome-microbiome orthologue identification strategy. We can also utilise our extensive bacterial culture collection and mutagenesis methodologies to inactivate genes in certain bacterial species harbouring the gene of interest; the resulting mutants can subsequently be phenotyped in vitro or in germ free mice using proteomics or transcriptomics.
YdjC is predicted to be a deacetylase, potentially removing acetyl groups from carbohydrates during digestion, so we have purified the protein from humans, mice and intestinal bacteria to experimentally identify the substrate and prove deacetylase activity. Thus, we have a variety of experimental platforms to interrogate and validate hypotheses generated from Aims 1 and 2.

EXPECTED OUTCOMES
This project offers a unique opportunity in the burgeoning field of human microbiome research to ask deep questions about the co-evolution of the human genome and microbiome. The HGM-DB generated in collaboration between the Sanger and EMBL-EBI will have broad applications for deciphering the mutualistic co-evolution of the human genome and microbiome and understanding health and diseases that are associated with the microbiota. Finally, this collaboration between the EMBL-EBI and WTSI will form the basis of a unique, world-class research programme, integrating state-of-the-art bioinformatic and experimental approaches, and will help improve the interpretation of human and animal metagenomics projects.
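
ILLUSTRATIVE SKETCH OF THE SCORING METRIC
The scoring and benchmarking step described under orthologue identification (Aim 1) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the pipeline the groups will build: the feature names, the weights, the mitochondrial penalty, the threshold and every numeric value are hypothetical placeholders, and the direction of each term (for example, treating presence in the mitochondrial genome as evidence against horizontal transfer) is an assumption. Only the gene names ydjC and fxn and the idea of benchmarking against known true and false positives come from the proposal; the "ancient_family" entries are invented stand-ins for Pfam families.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate human/microbiome orthologue (all fields hypothetical)."""
    gene: str
    tree_discordance: float        # 0..1, gene tree vs. taxonomic tree disagreement
    coding_conservation: float     # 0..1, conservation of the coding sequence
    in_mitochondrial_genome: bool  # assumed to count against horizontal transfer
    bacterial_cooccurrence: float  # 0..1, breadth of occurrence across bacteria


# Placeholder weights, chosen only for illustration.
WEIGHTS = {"tree": 0.4, "conservation": 0.3, "cooccurrence": 0.3}
MITO_PENALTY = 0.5


def hgt_score(c: Candidate) -> float:
    """Combine the features into one score; higher means more HGT-like."""
    score = (
        WEIGHTS["tree"] * c.tree_discordance
        + WEIGHTS["conservation"] * c.coding_conservation
        + WEIGHTS["cooccurrence"] * c.bacterial_cooccurrence
    )
    if c.in_mitochondrial_genome:
        score -= MITO_PENALTY
    return score


def benchmark(labelled, threshold):
    """Return (sensitivity, specificity) for (Candidate, is_true_positive) pairs."""
    tp = fp = tn = fn = 0
    for cand, is_true in labelled:
        predicted = hgt_score(cand) >= threshold
        if predicted and is_true:
            tp += 1
        elif predicted and not is_true:
            fp += 1
        elif not predicted and is_true:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Made-up feature values; real values would come from the analysis pipeline.
BENCHMARK_SET = [
    (Candidate("ydjC", 0.9, 0.8, False, 0.7), True),
    (Candidate("fxn", 0.8, 0.7, False, 0.6), True),
    (Candidate("ancient_family_A", 0.1, 0.9, False, 0.9), False),
    (Candidate("ancient_family_B", 0.2, 0.9, True, 0.8), False),
]

if __name__ == "__main__":
    sens, spec = benchmark(BENCHMARK_SET, threshold=0.6)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A weighted sum is used here only because it is the simplest way to combine several lines of evidence into one number. In practice the weights and threshold would be tuned against the benchmark sets named in Aim 1, and a different model could be substituted without changing the benchmarking function.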