Download Genomics - California Lutheran University

Genomics
Biology 122 Genes and Development

Genomics milestones
• First genome: Haemophilus influenzae, 1995, by Craig Venter and TIGR
• Human genome, draft sequences, 2001: two groups (Francis Collins of the public consortium; Craig Venter and Celera)
• Now: thousands of bacterial genomes have been sequenced. Hundreds of human genomes have been sequenced!
NCBI, Nov. 2010
From Genome.gov
Human genome conference, 6/7/2010

Restriction analysis

FISH
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Reciprocal translocation between one chromosome 9 and one chromosome 22 forms an extra-long chromosome 9 (“der 9”) and the Philadelphia chromosome (Ph1) containing the fused bcr-abl gene. This is a schematic view representing metaphase chromosomes.
Fig. 18.2. Panel a labels: bcr, abl, chromosomes 9 and 22, der 9, Ph1. Panel b labels: bcr (on normal 22), abl (on normal 9), fused bcr-abl gene; normal interphase nucleus; interphase nucleus of leukemic cell containing the Philadelphia chromosome (Ph1).
b: Reprinted by permission from Macmillan Publishers Ltd: Bone Marrow Transplantation 33, 247-249, “Secondary Philadelphia chromosome after non-myeloablative peripheral blood stem cell transplantation for a myelodysplastic syndrome in transformation,” T Prebet, A-S Michallet, C Charrin, S Hayette, J-P Magaud, A Thiébaut, M Michallet, F E Nicolini © 2004

Sequence-tagged sites (STS)
Comparison of genetic and physical maps
Manual sequencing
Automated DNA sequencing
Estimated genes in sequenced genomes
Transposable elements
Alternative splicing

Genome variation
SNPs (panel a; three SNP positions are labeled in the figure)
Chromosome 1  A A C A C G C C A T T C G G G G T C A G T C G A C C G
Chromosome 2  A A C A C G C C A T T C G A G G T C A G T C A A C C G
Chromosome 3  A A C A T G C C A T T C G G G G T C A G T C A A C C G
Chromosome 4  A A C A C G C C A T T C G G G G T C A G T C G A C C G
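The SNP panel above can also be read programmatically. The following is a minimal Python sketch, assuming the four chromosome sequences are transcribed exactly as shown and are already aligned (equal length, no insertions or deletions). The helper name `find_snps` is illustrative and does not come from the slides. It reports every column where the four copies disagree.

```python
# Sequences transcribed from the SNP panel (spaces removed).
chromosomes = {
    "Chromosome 1": "AACACGCCATTCGGGGTCAGTCGACCG",
    "Chromosome 2": "AACACGCCATTCGAGGTCAGTCAACCG",
    "Chromosome 3": "AACATGCCATTCGGGGTCAGTCAACCG",
    "Chromosome 4": "AACACGCCATTCGGGGTCAGTCGACCG",
}


def find_snps(seqs):
    """Return (1-based position, bases in input order) for each variable column.

    Assumes the sequences are pre-aligned; raises if their lengths differ.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for position, column in enumerate(zip(*seqs), start=1):
        if len(set(column)) > 1:  # more than one distinct base in this column
            snps.append((position, "".join(column)))
    return snps


snps = find_snps(list(chromosomes.values()))
for position, bases in snps:
    print(f"position {position}: {bases}")
```

Under these assumptions the sketch finds three variable positions (5, 14, and 23), which agrees with the three SNP labels in the figure.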
Haplotypes (panel b)
Haplotype 1  C T C A A A G T A C G G T T C A G G C A
Haplotype 2  T T G A T T G C G C A A C A G T A A T A
Haplotype 3  C C C G A T C T G T G A T A C T G G T G
Haplotype 4  T C G A T T C C G C G G T T C A G A C A

Diagnostic SNPs (panel c)
             A/G  T/C  C/G
Haplotype 1   A    T    C
Haplotype 2   A    C    G
Haplotype 3   G    T    C
Haplotype 4   A    C    C

Comparison of plant genomes (comparative genomics)
Fig. 18.9. Genomic alignment (segment rearrangement) of the rice genome with sugarcane, corn, and wheat chromosome segments.

SCIENTIFIC THINKING
Hypothesis: Flowers and leaves will express some of the same genes.
Prediction: When mRNAs isolated from Arabidopsis flowers and from leaves are used as probes on an Arabidopsis genome microarray, the two different probe sets will hybridize to both common and unique sequences.
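Returning to the diagnostic SNP table (panel c) earlier in this section: one way to see why three SNPs are listed is to ask which subsets of them give every haplotype a unique allele signature. The sketch below is an illustration in Python, assuming the alleles are transcribed correctly from the table. The function name `distinguishing_subsets` is hypothetical.

```python
from itertools import combinations

# Alleles at the three diagnostic SNPs, transcribed from panel c.
snp_names = ["A/G", "T/C", "C/G"]
haplotypes = {
    "Haplotype 1": "ATC",
    "Haplotype 2": "ACG",
    "Haplotype 3": "GTC",
    "Haplotype 4": "ACC",
}


def distinguishing_subsets(haps, n_snps):
    """Yield SNP index tuples (smallest first) that separate all haplotypes."""
    for size in range(1, n_snps + 1):
        for indices in combinations(range(n_snps), size):
            signatures = {
                tuple(alleles[i] for i in indices) for alleles in haps.values()
            }
            # Every haplotype has its own signature only if none collide.
            if len(signatures) == len(haps):
                yield indices


subsets = list(distinguishing_subsets(haplotypes, len(snp_names)))
for indices in subsets:
    print([snp_names[i] for i in indices])
```

With the table as transcribed, no single SNP or pair of SNPs separates all four haplotypes, so the only subset found is the full set of three.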
Genomes deposited at NCBI
Genome sequencing projects statistics

Organism             Complete  Draft assembly  In progress  Total
Prokaryotes               850             585          534   1969
  Archaea                  78               5           32    115
  Bacteria                773             580          502   1855
Eukaryotes                 39             249          320    608
  Animals                   6             110          159    275
    Mammals                 3              37           81    121
Birds 3 12 15 Fishes 13 12 25 26 20 48 2 3 5 13 12 26 Insects 2 Flatworms Roundworms 1 Amphibians 1 1 Reptiles 1 Other animals 16 22 38 1
  Plants                    7              23           78    108
    Land plants             4              19           73     96
    Green Algae             3               4            4     11
  Fungi                    16              83           39    138
    Ascomycetes            13              63           28    104
    Basidiomycetes          1              12            8     21
    Other fungi             2               8            3     13
  Protists                 10              31           40     81
    Apicomplexans           5              10            4     19
    Kinetoplasts            4               1            3      8
    Other protists          1              19           33     53
Total                     889             834          854   2577
Revised: Nov 18, 2010

GOLD (Genomes Online Database)
            Complete  Incomplete  Targeted
Bacterial       2666        5493       424
Archaeal         149         182         1
Eukaryotic       166        2037        13
Metagenome studies: 340
Metagenome samples: 1930
[Metagenomes are environmental samples]
Finished: 1960
Permanent draft: 1021
Complete, not published: 26
Draft: 1529
In progress: 3426
DNA received: 266
Awaiting DNA: 510
Targeted (funded, not started): 438
Date: 11/23/2011

NCBI, Genomes
Species / Reference sequences / In progress
Viroids 41 41
Viruses 2721 3933
Bacterial 1681 5140
Archaeal 121 90
Eukaryotes 1815
Organelles 2974
Date: 11/23/2011

Human Disease genes
From Genome.gov, 11-2010

Animals: Amphipod Crustacean; Aphid, Pea; Beetle, Red Flour; Bug (Chagas' Vector); Centipede, Geophilimorph; Chelicerate (Horseshoe Crab); Drug Resistant Parasitic Nematode; Freshwater Polyp; Fruit Fly; Honey Bee; Louse, Body; Mosquito; Placozoan; Planarian; Roundworm; Sand Fly; Sea Slug; Sea Squirt; Sea Star; Sea Urchin; Snail, Freshwater; Strongylid Nematode; Tardigrade; Wasp, Parasitoid; Worm, Acorn; Worm, Priapulid
Vertebrates: Chicken; Coelacanth; Gar, Spotted; Hagfish; Lamprey, Sea; Lizard, Anole; Pufferfish; Shark, Elephant; Skate; Spotted African Lungfish; Stickleback, Threespine; Turtle, Painted; Zebra finch
Genome.gov, 11/22/2011
Animal genomes in progress, November 2011 (genome.gov)

Mammals: Aardvark; Alpaca; Armadillo, Nine-banded; Baboon; Bat, Little Brown (Microbat); Bat, Big brown; Bonobo; Bushbaby; Bushbaby/Galago; California leaf-nosed bat; Cape golden mole; Cat; Chimpanzee; Chinchilla; Chinese hamster; Cow; Crested porcupine; Degu; Dog; Dolphin; Eastern grey kangaroo; Elephant, African Savannah; Ferret; Flying Fox (Megabat); Giant anteater; Gibbon; Golden-mantled howling monkey; Greater horseshoe bat; Guinea Pig; Hedgehog, European; Hippopotamus; Honey Possum (Noolbenger); Horse; Human; Hyrax; Koala; Lemur, Flying; Lemur, Mouse; Lesser Egyptian jerboa; Lizard, Anole; Llama; Long-haired (Rufous) elephant shrew; Macaque, Cynomolgus; Macaque, Pigtail; Macaque, Rhesus; Macaque, Rhesus (Chinese population); Malayan tapir; Mangabey, Sooty; Marmoset; Mexican free-tailed bat; Mole; Monkey, Squirrel; Mouse; Mouse, Deer; Mouse, White-Footed; Naked mole rat; North American porcupine; Opossum, Gray Short-Tailed; Opossum, Laboratory; Orangutan; Pangolin; Pika; Platypus, Duck-Billed; Rabbit; Rat; Rat, Kangaroo; Rhesus Macaque; Ring-tailed lemur; Shrew, Elephant; Shrew, European Common; Shrew, Tree; Sloth; Springhare; Squirrel; Star nosed mole; Stickleback, Threespine; Syrian/Golden Hamster; Tarsier; Tenrec (Lesser Hedgehog); Vervet; Vole, Prairie; Wallaby, Tammar; Water Chevrotain; Weddell Seal; West Indian manatee; White rhinoceros
Mammal genomes in progress, November 2011 (genome.gov)

Neanderthals
Science, Nov 17, 2006

Neanderthals
• 99.5% identical to humans when comparing the same sequences

Neanderthals
Draft sequence published May 7, 2010.
• Neanderthals from four sites (see map)
• 21 bones from Vindija analyzed for this study
• 3 bones were selected for detailed sequencing (from three individuals)
• Bones from three other sites were also sequenced (see map)
• Compared Neanderthal to five human genomes
Conclusion: Non-African humans contain some Neanderthal-derived sequences (1 to 4%). Gene flow is estimated to have gone from Neanderthals to humans and to have occurred more than 45,000 years ago.
Notes: Humans and Neanderthals lived in the same area for more than 10,000 years. Neanderthals perished 30,000 years ago.
Neanderthals
Four models of how the gene transfer could have occurred (option 2 is least likely, option 3 most likely)
Transfer most likely occurred in the Middle East/Western Asia
PNG = Papua New Guinea

Denisovans
• Third type of human genome sequenced
• Finger bone found in the Denisova cave in Altai Krai, Russia, in 2008
• The Denisova bone had a genome distinct from those of modern humans and Neanderthals
• The bone was dated to 41,000 years ago
• Since only bone fragments are known, it is not known how they looked
• It is thought that they were distributed throughout Asia and Melanesia
• Analysis of the genome, and comparison with humans and Neanderthals, suggests that 4% of non-African DNA is related to Neanderthals and 4 to 6% of Melanesian genomes is related to Denisovans. This suggests some interbreeding between the first modern humans, Neanderthals, and Denisovans.
• Analysis of HLA types (immune proteins) suggests that over half of Eurasian HLA types came from Neanderthals or Denisovans, suggesting that they were selected for in Eurasians.

Watson’s genome
• Sequenced using shotgun sequencing
• About 3.5 percent of Watson’s genome could not be matched to the reference genome, probably due to differences in the cloning step

Venter’s genome compared to the reference genome
• 32 million reads resulted in 2.8 billion base pairs of assembled sequence (7.5-fold coverage)
• 4.1 million differences from the already published genome (12.3 million bases different)
• 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions.

How different are individuals?
• 44% of genes were heterozygous for one or more variants (they could determine both copies)
• A conservative estimate is that a minimum of 0.5% variation exists between two haploid genomes (all heterozygous bases).
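As a quick consistency check on the Venter genome figures quoted above, the variant categories can be summed and compared with the stated "4.1 million differences". The Python sketch below also derives a mean read length, assuming that fold coverage means total sequenced bases divided by assembled length. That definition is an assumption made here for illustration and is not stated on the slide.

```python
# Variant counts as quoted in the Venter genome bullets.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}
total_variants = sum(variant_counts.values())

reads = 32_000_000            # "32 million reads"
assembled_bp = 2_800_000_000  # "2.8 billion base pairs of assembled sequence"
fold_coverage = 7.5           # "7.5-fold coverage"

# Assumption: coverage = (reads * mean read length) / assembled length,
# so mean read length = coverage * assembled length / reads.
implied_read_length = fold_coverage * assembled_bp / reads

print(f"total listed variants: {total_variants:,}")
print(f"implied mean read length: {implied_read_length:.2f} bp")
```

The listed categories sum to 4,118,889, in line with the 4.1 million differences on the slide, and under the stated assumption the implied mean read length is about 656 bp.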
How different are individuals?
• The genome of a Yoruba individual from Ibadan, Nigeria, was sequenced.
• About 4 million SNPs were found; 74% had already been found by others.
• About 24% more polymorphism (heterozygosity) than Caucasian genomes.
• There were 5,704 indels ranging from 50 to over 35,000 bp long. Many were SINEs and LINEs.
Bentley et al., Nature, November 6, 2008

How different are individuals?
• The genome of a Han Chinese individual was sequenced.
• About 3 million SNPs were found; 86% had already been found by others.
• About 24% more polymorphism (heterozygosity) than Caucasian genomes.
• There were 2,682 structural variations, including insertions, deletions, and inversions. Many variations in SINEs and LINEs were found.
Wang et al., Nature, November 6, 2008

How different are cancer cells?
• DNA from skin cells and acute myeloid leukemia cells from the same Caucasian woman was sequenced.
• About 2.9 million SNPs were found in the skin cells, and 3.8 million in the leukemia cells.
• Almost all of the SNP differences were either common in other sequenced genomes or not in genes.
• Ten genes were found to have acquired mutations in the leukemia cells. Of these, two were known to be involved in tumour progression. The functions of the other eight mutant genes are unknown.
Ley et al., Nature, November 6, 2008

Metabolomics
• A study of 284 males compared 383 metabolic indicators and SNPs (genetic variants).
• Up to 12% of the variation in the levels of the metabolic molecules could be explained by particular versions of a gene (SNPs).
• Four genes were known to be in metabolic pathways related to the metabolic molecule that was high or low.
Geiger et al., PLOS Genetics.
November, 2008

Woolly mammoth
• Over 4 billion bp in genome
• Mammoths and African elephants differ in about 1 amino acid per protein
• Mammoths and African elephants are estimated to have separated 1.5 to 2.0 million years ago
Nature, November 20, 2008

Woolly mammoth

Recent genome news
Nov 19, 2011
Malaysian Genomics Resource Centre Berhad (MGRC) today announced that it has successfully completed its 100th human genome from a diverse mix of Malaysian, European and Australian individuals. The data generated from these genomes have helped in efforts to identify and compare highly represented patterns of common and clinically relevant genetic variations within Malaysian and other populations, and to establish robust bioinformatics protocols for the reference-based analysis of genomic information.

Recent genome news
Nov 23, 2011
A study of 11,000 children and adults found that very short people (the lowest 2.5% of the population) are missing more genes or parts of genes than taller people.

Recent genome news
November, 2011
The mythical "$1,000 genome" is almost upon us (in 2012), said Jonathan Rothberg, CEO of sequencing technology company Ion Torrent, at MIT's Emerging Technology conference.
November 2, 2011
Duke University said last week that it will sequence 4,000 individuals as part of a collaborative, $25 million effort to identify as many genes as possible implicated in epilepsy.

Maize (corn) genome
• Maize has 10 chromosomes and 2.3 billion base pairs.
• The sequencing was done using the clone-by-clone method, with 16,848 BACs sequenced, assembled, and analyzed.
• There are estimated to be 32,500 protein-encoding genes and 150 microRNA (miRNA) genes.
• Approximately 75% of the genome is repeated DNA.
• It has over 400 families of LTR retrotransposons with over 31,000 different sequences.
Fig. 1. The maize B73 reference genome (B73 RefGen_v1): concentric circles show aspects of the genome.
P. S. Schnable et al., Science 326, 1112-1115 (2009)

1000 Genomes project
The 1000 Genomes Project is an international collaboration to produce an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts. This resource will support genome-wide association studies and other medical research studies. The genomes of about 2500 unidentified people from about 27 populations around the world will be sequenced using next-generation sequencing technologies.
Highlights
• Over 4.9 trillion nucleotides sequenced
• Over 800 individuals (179 people had their whole genomes sequenced and 697 people just the protein-coding regions)
• Each child had around 60 mutations in its genome that did not exist in either parent
• Over 15 million SNPs discovered
• Each individual carries a significant number of deleterious mutations, maybe 250 or 300 genes that have defective copies

1000 Genomes project
http://www.1000genomes.org/home
• 3 billion: Number of DNA letters in the human genome (200 volumes the size of a Manhattan telephone book, which has around 1,000 pages)
• 20,000-25,000: Number of genes in the genome (though not all scientists agree)
• 2000: Year the first draft of the human genome was announced to much fanfare at the Clinton White House
• 2003: Final draft completed to 99.99% accuracy
• 2500: Number of people whose genomes the 1,000 Genomes Project hopes to sequence, from 25 populations
• 15 million: Number of single-letter changes identified in the pilot phase
• 1 million: Number of small insertions and deletions identified in the pilot phase
• 4.9 trillion: Number of letters of data sequenced by the 1,000 Genomes Project so far
• 1094: Genomes completed for 1094 individuals, 6/23/11

Human microbiome
Adults harbor ten times more microbial cells than they have human cells.
Examining how these microbes affect human health through their association with the body (for example, by influencing metabolism, disease susceptibility, and drug response) is key to improving human health.
Through the Comparative Genome Evolution (CGE) program, NHGRI approved a limited project (Sequencing of Cultivable Microbes from Human Gut) to obtain reference genome sequence data from up to 300 cultured bacteria and archaea sampled from the human digestive tract and urogenital tract in September 2005. The objective is threefold: to start to generate reference data for future large-scale metagenomics studies; to understand the diversity of bacterial pangenomes; and to start to address the technical and bioinformatic challenges that human metagenomics research will encounter.
From Genome.gov, 11-2010

Scientists propose a "genome zoo" of 10,000 vertebrate species
November 03, 2009
By Branwyn Wagman, Guest Writer
Scientists involved in the Genome 10K Project are assembling specimens of thousands of animals spanning a broad range of evolutionary diversity. Photos courtesy of San Diego Zoo.
From http://news.ucsc.edu/2009/11/3333.html

10,000 vertebrate genomes
In the most comprehensive study of animal evolution ever attempted, an international consortium of scientists plans to assemble a genomic zoo, a collection of DNA sequences for 10,000 vertebrate species, approximately one for every vertebrate genus. Known as the Genome 10K Project, it involves gathering specimens of thousands of animals from zoos, museums, and university collections throughout the world, and then sequencing the genome of each species to reveal its complete genetic heritage. Launched in April 2009 at a three-day meeting at the University of California, Santa Cruz, the project now involves more than 68 scientists.
Calling themselves the Genome 10K Community of Scientists (G10KCOS), the group outlined its proposal to create a collection of tissue and DNA specimens for the project in a paper to be published online November 5 in the Journal of Heredity. From http://news.ucsc.edu/2009/11/3333.html
