Advancing Science with DNA Sequence

Maize Missouri 17 “chromosome 10” project update
Dan Rokhsar
3 October 2006

Aims: “Plan A”
• Generate and annotate “gene space” for the ~180 Mbp chromosome 10 of Mo17 using a random shotgun approach from flow-sorted chromosomes.
• This resource will complement the BAC-by-BAC sequencing of B73, informing our understanding of intra-species variation, from SNPs to chromosomal organization.
• The project will serve as a pilot R&D study for chromosome-scale random shotgun sequencing of complex genomes.

Challenges
• Produce high-quality shotgun library from a single chromosome (year 1)
  – Apply flow sorting methods to root tip preparations or oat-maize hybrid lines with maize Mo17-10
• Assemble shotgun sequences and relevant mapping data to recover non-repetitive and ‘distinguishable repetitive’ regions (years 1-2)
  – DuPont Mo17 BAC library, BAC-end sequence
  – Targeted mapping to link across complex repeats
• Targeted finishing of “gene space” from whole-chromosome-shotgun draft (year 2)
  – Interplay of finishing with annotation

Project goals for researchers and breeders
• Unlimited markers for mapping
• Nearly complete gene set for Mo17-10
• Conserved synteny/chromosome dynamics with sorghum
• Evolutionary approaches empowered
• Novel reagents begin to emerge
• Framework for understanding strain differences

Milestones
• Year 1
  – Produce test libraries from mock flow sorted material (JGI)
  – Produce preliminary flow sorting data for discussion at Advisory Committee meetings (NFCR)
  – Produce 1-10 micrograms of flow sorted chromosome 10 material (NFCR)
  – Complete library production (JGI)
  – Begin shotgun sequencing, with associated data deposition (JGI)
• Year 2
  – Complete initial shotgun assembly, with associated data deposition (JGI)
  – Integrate with physical map data from DuPont (JGI)
  – Complete two rounds of primer walking (SHGC)
  – Annotate initial draft assembly, with data release (JGI)
  – Complete subsequent rounds of targeted finishing reactions (SHGC)
  – Complete physical mapping of markers and release to public repositories (PGML)
  – Produce final assembly incorporating finishing data (JGI, SHGC)
  – Publish detailed analyses of Maize Genome Project outcomes (all)
  – Offer summer course on maize genome data (JGI)

Problems at first step
• First milestone from “plan A” not met
  – Flow sorting system is going …
  – But no significant progress to chromosome flow sorting at preparative scale
  – Some small-scale root tip chromosome preps have been done, but not ready to scale up
  – Three months of chromosome preps (~10,000 root tips) would be needed to obtain even a few tenths of a microgram of DNA for a first chromosome-specific cloning attempt, outcome not guaranteed
  – JGI library group would prefer more material for robust shotgun library prep (minimum of several µg); previous chromosome-specific lambda cloning (Arumuganathan) is more forgiving, but still gave low coverage (2X)
  – Attempted to contract to Dolezel’s group in the Czech Republic, but their capacity is taken with wheat BAC preps. Willing to advise. Arumuganathan is now doing human cell sorting, not working with chromosome preps, and cannot take on the task.

Even in expert hands, purity of chromosome prep is 85-90%
• Li, Arumuganathan, et al. Flow cytometric sorting of maize chromosome 9 from an oat-maize chromosome addition line. TAG (2001).
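
To put the DNA quantities on the “Problems at first step” slide in perspective, the following is a rough back-of-the-envelope sketch of how many sorted copies of a ~180 Mbp chromosome correspond to a given mass of DNA. It assumes an average of about 650 g/mol per double-stranded base pair, a common approximation that is not taken from the slides. The function names are illustrative only.

```python
# Rough estimate of how many sorted chromosome copies make up a given DNA mass.
# Assumption (not from the slides): ~650 g/mol per double-stranded base pair.

AVOGADRO = 6.022e23            # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0   # assumed average mass of one base pair


def chromosome_mass_pg(length_bp):
    """Approximate mass, in picograms, of one double-stranded DNA molecule."""
    grams = length_bp * GRAMS_PER_MOL_PER_BP / AVOGADRO
    return grams * 1e12  # 1 g = 1e12 pg


def copies_needed(target_ug, length_bp=180e6):
    """Number of chromosome copies needed to reach target_ug micrograms."""
    pg_per_copy = chromosome_mass_pg(length_bp)
    return target_ug * 1e6 / pg_per_copy  # 1 ug = 1e6 pg


if __name__ == "__main__":
    print(f"One 180 Mbp chromosome: {chromosome_mass_pg(180e6):.3f} pg")
    # 0.3 ug stands in for "a few tenths"; 1 and 10 ug span the Year 1 milestone.
    for ug in (0.3, 1.0, 10.0):
        print(f"{ug:>5} ug needs about {copies_needed(ug):.2e} copies")
```

Under these assumptions a single chromosome weighs a fraction of a picogram, so even one microgram corresponds to millions of sorted chromosomes. That is consistent with the slide's concern about reaching preparative scale.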

Proposal for “Plan B”
• Continue development of flow sorting chromosome 10, but decouple from sequencing plans in current project
• Produce ~3/4X random whole genome shotgun sequence of Mo17 in plasmid and fosmid paired ends (mix TBD)
  – ~3 months to bulk prep DNA, make libraries, do quality control testing/sampling (Jan 2007)
  – <3 months to schedule and perform production sequencing run (Apr 2007)
• Note: JGI is not in position to take on significant BAC-based shotgun from B73 project
  – perhaps a few hundred clones, maybe ~1% of project

Alignment of Mo17 “gene space” with B73 allele: ~97% identity

Mo17       1  AACCAATTGGCAGCATTATTATTTTGAACAGATAAAAATCACGCCAGGGCGATGGATACT  60
B73    88023  ..............C.........C...................................  88082

Query     61  CAGCTCAATCACGGAATTCATCCATGAACTTCTCGTGGAACTCCTTGAGCCTGGATACTA  120
Sbjct  88083  ............................................................  88142

Query    121  TCGCAGGTATCTTGTCCTCCTGCGGCAGTATCGTGCACCTGAAGTGCCACGTTCCAGGGA  180
Sbjct  88143  ............................................................  88202

Query    181  CCTTCA--------CG--G-T--G-T-C-GC-AAAGCAACGTGTCAGTATCGTGTGCATC  223
Sbjct  88203  ......CGGTGTCG..AA.T.AA.A.C.A..A................G...........  88262

Query    224  TGAAGCTTAACGATGCTTTGAAACGGCAGGGACTTCCACaaaaaaaGG-CTTTTGAGATT  282
Sbjct  88263  .............................................G..G...........  88322

Query    283  ACCCACCTGTCCAAACCCAGAACCGGGGACGACGACGATTCCAGTGGCTTCCAGTAGGCG  342
Sbjct  88323  ............................................................  88382

Query    343  TTTTGCGTAGTATGCATCTGGCGCAGTGCCGACTGCTTGGGCAGCTCCAATTGCCTTCTG  402
Sbjct  88383  ..........................................T.................  88442

Query    403  GGGTAAATGAAGGCGTGGGAACAGATACATTGCACCTTCGGCTTTGTTGCATGTAATTCC  462
Sbjct  88443  ............................................................  88502

Query    463  TTCTAAACTGTTGAATGCTTCTTCCAAAGCCTGTGACAGAAGAACACGTAACAATAAGAA  522
Sbjct  88503  ............................................................  88562

Query    523  GGTGCTTATAAGATTCAGGaaaaaaaa--TCTTTTTTAAAGTTGTTTTGCATATGTTAAC  580
Sbjct  88563  ...........................GA...............................  88622

Query    581  GGACTACTCGACCAGGGGTATAGCTTTTATTCTTGTTTGATATTTCCATATTAGGACTCT  640
Sbjct  88623  ..........G.................................................  88682

• In unique “genic” regions (especially coding sequence), can easily align Mo17 and B73 to detect polymorphism.
• Cf. comparable human-chimp alignments at ~98.5% (putative aminotransferase, Morgante et al.)

Likely outcomes of Plan B
• Align Mo17 shotgun to emerging B73 draft (at quarterly intervals)
  – Should be easy to recognize allelic variants in non-repetitive (i.e., genic) regions, based on Morgante et al. results. Expect unique coverage of ~40% of B73 sequence. (alternative: MeF, C0t)
  – In a typical genic locus of 5 kb, conservatively expect ~100 mismatches or indels. Dense markers allow rapid development of multiple markers per gene. (Distribute via Gramene, NCBI)
  – Repetitive regions within B73 are ~90-99% identical to one another, so identifying “allelic” repeats will be difficult given ~97% identity between alleles. (Attempt to localize “sisters” of unique reads based on B73 map.)
  – In places where both ends of a clone are alignable, can confirm local colinearity of B73 and Mo17, or identify rearrangements and/or deletions (A la human-chimp comparison, but expect worse)
  – Mo17 fosmid clones with localized ends will be available for distribution and/or targeted sequencing of loci-of-interest
  – Potential start towards Mo17 WGS if desirable

JGI Sorghum update
• Sorghum WGS currently at ~7X (in Trace Archive)
  – mostly small insert plasmids sequenced to date
• BAC-end and fosmid-end sequences coming by end 2006
  – but uniformity of BAC library is in question, may limit assembly
• Quick and dirty assemblies look good using “skeleton” of method proposed for maize
  – ~13 kb contigs and ~300 kb scaffolds (N50 #’s) at ~5X
  – considerable scaffolding even without much BAC/fosmid data
  – recovering ~2/3 of genome is easy even setting aside “difficult” repeats, as predicted for maize
  – Expect full 8X assembly (with map integration) ready late Q1 2007.
• Quick and dirty annotation: ~42,000 genes in low copy families
  – plus >100K retrotransposon-ish genes even in easy-to-assemble regions

Early peek at Sorghum-rice comparison shows syntenic segments
• Sorghum-Rice syntenic segments are of uniform molecular “age”
• Comparable to human-chicken divergence
• Younger than Rice-Rice paralogs (from cereal-specific duplication)
[Figure: 4DTv distance (0 to 0.5) plotted against segment size, in loci per syntenic block (0 to 90); series "4dtv Dist/RiceSorg"]

Maize divergences (transversions)
• Maize: 7,960 complete/29,922 partial peptides
• Sorghum: 5,927 complete/19,681 peptides
• Sugarcane: 6,566 complete/21,850 peptides
• ~16,000 gene families at base of grasses
• ~12,000 families defined by rice/arabidopsis/poplar
[Figure: diagram labeled sugarcane, sorghum, rice, Arabidopsis]
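
Returning to the Plan B estimates, the arithmetic behind two of the quoted figures can be sketched as follows. The first function multiplies locus length by the fraction of differing positions. The second uses a simple Poisson (Lander-Waterman style) model of uniformly random reads, which is an assumption added here and not the method stated on the slides. It ignores repeats, cloning bias, and alignability, so it does not reproduce the slide's ~40% unique-coverage figure. Function names are illustrative.

```python
import math

# Back-of-the-envelope checks for the Plan B numbers.
# The Poisson read-sampling model is an assumption, not taken from the slides.


def expected_differences(locus_bp, identity):
    """Expected number of differing positions in a locus at a given identity."""
    return locus_bp * (1.0 - identity)


def fraction_sampled(depth):
    """Fraction of bases covered by at least one read at the given average
    depth, assuming reads land uniformly at random (Poisson model)."""
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    diffs = expected_differences(5000, 0.97)
    print(f"5 kb locus at 97% identity: about {diffs:.0f} differences")
    print(f"Bases sampled at 0.75X depth: {fraction_sampled(0.75):.1%}")
```

At ~97% identity a 5 kb locus would carry roughly 150 differences, so the slide's figure of ~100 is indeed on the conservative side. Under the idealized model, about half of all bases would be touched by at least one read at 3/4X depth.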