Use Case Template (revised from EarthCube version 1.1)

Summary Information Section

Use Case Name
Using population genomes to analyse taxon-specific functional constraints

Contact(s)
Jack A. Gilbert: [email protected]
Naseer Sangwan: [email protected]
Chris Marshall: [email protected]
Melissa Dsouza: [email protected]
Pamela Weisenhorn: [email protected]

Overarching Science Driver
To understand how translational fine-tuning shapes microbial genome evolution in natural environments

Science Objectives, Outcomes, and/or Measures of Success
(i) Create a habitat-specific database of population-level orthologous genes with pre-calculated metrics, i.e. codon bias and dN/dS.
(ii) Create new workflows and analysis pipelines to compute codon bias and dN/dS values across fragmented metagenome assemblies representing complex environments, e.g. soil/sediment.
(iii) Create new normalization methods for accurate correlation between the dN/dS and codon bias values of population-level genes.

Key people and their roles
Jack A. Gilbert: Lead PI
Naseer Sangwan: Postdoctoral researcher
Chris Marshall: Postdoctoral researcher
Pamela B. Weisenhorn: Postdoctoral researcher
Melissa Dsouza: Postdoctoral researcher

Basic Flow
1. Quality trimming and de novo assembly of shotgun metagenome datasets
2. Binning metagenome contigs into population genomes (pan-genomes)
3. Gene calling on contig bins representing population genomes
4. Identification of orthologous genes between population genomes
5. Cross-validation of orthologous genes (i.e. length cut-off, sequencing errors)
6. Calculating pairwise dN/dS and codon bias values
7. Normalization and calculation of pairwise correlation between dN/dS and codon bias profiles
8. Demarcate and functionally characterize protein pairs under positive and/or negative selection

Critical Existing Cyberinfrastructure
o Alignable Tight Genome Clusters (ATGC) database of prokaryote genomes (contains genomes of cultured isolates)
o Integrated Microbial Genomes (IMG) (e.g. can be used to pull orthologous genes)
o MicroScope pipeline (has a size limit for annotation)

Critical Cyberinfrastructure Not in Existence
o Central database of population genomes, i.e. genomes reconstructed from metagenomes
o Dedicated algorithms for calculating codon bias and dN/dS across short protein sequences
o Accurate normalization method that can handle the average genome size variation across populations

Activity Diagram
This can be targeted during the workshop.

Problems/Challenges
1. How can habitat-specific gene pool information be accessed?
   a. Recommendation: create a comprehensive portal that can store such datasets.
2. High-throughput methods are needed to screen orthologous genes across multiple population genomes.
   a. Some methods exist, but they are specific to genome sequences of cultured microbes.
   b. Recommendation: develop new methods, or modify the existing methods, to target genome bins representing a mix of strains or species.
3. How can accurate rates of evolution and codon bias be calculated for short protein sequences?
   a. Some methods exist, but they have not been validated against the errors and biases introduced during metagenome data analysis, e.g. length variation and average genome size variation.
   b. Recommendation: develop a new method to calculate and normalize the dN/dS and codon bias profiles of population genomes, e.g. one that accounts for average genome size variation.

References
- Ran W, Kristensen DM, Koonin EV. (2014). Coupling between protein level selection and codon usage optimization in the evolution of bacteria and archaea. mBio 5:e00956-14.
- Nielsen R. (2005). Molecular signatures of natural selection. Annu Rev Genet 39:197-218.

Notes
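
To make the codon bias part of step 6 in the Basic Flow concrete, the following is a minimal sketch of one common codon bias measure, relative synonymous codon usage (RSCU). The use case does not name a specific metric, so the choice of RSCU, the use of the standard genetic code, and the function names `count_codons` and `rscu` are all assumptions made for illustration. It takes an in-frame coding sequence and, for each amino acid with more than one codon, divides the observed count of each codon by the count expected if all synonymous codons were used equally.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in the usual TCAG ordering (assumed here;
# the use case does not specify a translation table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO)
}


def count_codons(seq):
    """Count codons in an in-frame DNA coding sequence.

    Codons containing ambiguous bases (e.g. N) are skipped, and a
    trailing partial codon is ignored.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon in CODON_TABLE:
            counts[codon] += 1
    return counts


def rscu(seq):
    """Return {codon: RSCU} for amino acids observed in `seq`.

    RSCU = observed codon count / (amino acid total / number of
    synonymous codons). Stop codons and single-codon amino acids
    (Met, Trp) are left out because they carry no usage information.
    """
    counts = count_codons(seq)

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    # Toy sequence of four alanine codons, for illustration only.
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, round(value, 2))
```

On short genes from fragmented assemblies, many codon families will have very few observations, so per-gene RSCU values of this kind are noisy. That is the situation the length cut-off in step 5 and the normalization in step 7 are meant to address.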