Primer, Friday 10am, Beckman B-302. Ex. 1 is coming.
http://cs273a.stanford.edu [Bejerano Fall10/11]

Lecture 4
• Our place in the tree of life
• Genome Size
• Genome Content: Repetitive Sequences, Genes

Our Place in the Tree of Life
[Figure: tree of life with "you are here" marker; Human Molecular Genetics, 3rd Edition]

Metazoans (multi-cellular organisms)
[Figure: "you are here" marker; Human Molecular Genetics, 3rd Edition]

Vertebrates
[Figure: species labels include Stickleback, Lizard, Opossum; "you are here" marker; Human Molecular Genetics, 3rd Edition]

Interspecies Variation in Genome Size Within Various Groups of Organisms
[Figure from Ryan Gregory (2005)]

Meet Your Genome Continues
[Human Molecular Genetics, 3rd Edition]

Repeats / Mobile Elements ("selfish DNA")
Human Genome: 3 × 10^9 letters
1.5% known function
>50% junk
[Adapted from Lunter]

TE composition and assortment vary among eukaryotic genomes
[Figure: percentage axis; legend: DNA transposons, LTR Retro., Non-LTR Retro.]
[Feschotte & Pritham 2006]

Assembly Challenges

Inferring Phylogeny Using Repeats
[Nishihara et al, 2006]

Functional elements from Mobile Elements
Co-option event, probably due to favorable genomic context
[Yass is a small town in New South Wales, Australia.]
[Bejerano et al., Nature 2006]

The amount of TE correlates positively with genome size
[Figure: axis in Mb, 0 to 3000; series: Genomic DNA, TE DNA, Protein-coding DNA]
[Feschotte & Pritham 2006]

The proportion of protein-coding genes decreases with genome size, while the proportion of TEs increases with genome size
[Figure: series: TEs, Protein-coding genes]
[Gregory, Nat Rev Genet 2005]

Genome Size Variability
1 pg = 978 Mb

Simple Repeats
• Every possible motif of mono-, di-, tri- and tetranucleotide repeats is vastly overrepresented in the human genome.
• These are called microsatellites. Longer repeating units are called minisatellites. The really long ones are called satellites.
• Highly polymorphic in the human population.
• Highly heterozygous in a single individual.
• As a result, microsatellites are used in paternity testing, forensics, and the inference of demographic processes.
• There is no clear definition of how many repetitions make a simple repeat, nor how imperfect the different copies can be.
• Highly variable between genomes: e.g., using the same search criteria, the mouse & rat genomes have 2-3 times more microsatellites than the human genome. They're also longer in mouse & rat.

Restriction enzymes recognize and make a cut within specific palindromic sequences, known as restriction sites, in the DNA. A restriction site is usually a 4- or 6-base-pair sequence.
[Figure labels: blunt end, sticky end]

DNA Fingerprint Basics
DNA fragments of different sizes will be produced by a restriction enzyme that cuts at the points shown by the arrows.
DNA fragments are then separated based on size using gel electrophoresis.
DNA fingerprinting can be used in paternity testing or murder cases.

From an evolutionary point of view, transposons and simple repeats are very different.
Different instances of the same transposon share common ancestry (but not necessarily a direct common progenitor).
Different instances of the same simple repeat most often do not.

The Gene-ome makes up < 2% of the H.G.
[Human Molecular Genetics, 3rd Edition]

Gene Structure
Signal – a string of DNA recognized by the cellular machinery

Gene Processing
Eukaryotic Gene Structure

Gene Finding – The Practice
Challenge: “The genes, the whole genes, and nothing but the genes”
Problems:
• spliced ESTs: legitimate gene isoform?
• predicting gene isoforms
• tissue/condition-specific genes / gene isoforms
• single exon genes
• pseudogenes
Practice:

Evolution of Gene Finding Tools
[Figure: timeline of gene finding tools. Years shown: 1982, 1996, 1997, 2000, 2001, 2002, 2004. Category labels: intrinsic, extrinsic, hybrid; Ab-initio, Alignment-based, Comparative Genomics. Input labels: DNA; cDNA, Protein. Model labels: HMM-based, Pair-HMM, Phylo-HMM. Tools named: Genie, Genscan, ExoFish, GenieEST, Procrustes, GenieESTHOM, Informant, Rosetta, Twinscan, Slam, Siepel-Haussler, DoubleScan, Jojic-Haussler, etc.]

The Human Gene Set
[HGC, 2001]
[Celera, 2001]
wrong!

Signal Transduction

Ancient Origins of Important Gene Families

Multigene families due to:
• Single gene duplication
• Segment duplication: tandem duplication or duplication transposition
• Horizontal gene transfer
• Genome-wide doubling event
[Figure: gene-order diagrams with genes labeled a to g]

Horizontal Gene Transfer

Horizontal Gene Transfer in the H.G. …
[HGC, 2001]

Or is it?
[Kurland et al., 2003]

HGT between fish & their parasites

Retroposed Genes and Pseudogenes
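
Worked example for the restriction enzyme and DNA Fingerprint Basics slides: the sketch below shows, under simplifying assumptions, how cutting at a palindromic restriction site gives fragments of different sizes, and how a change in one site changes the size pattern a gel would show. The function names (`reverse_complement`, `is_palindromic`, `digest`, `band_pattern`) and the two toy sequences are hypothetical and not from the lecture. The EcoRI site GAATTC, cut after the first G, is used as a familiar 6-base-pair example. Only one strand is modeled, so sticky ends are not represented.

```python
# Toy single-strand restriction digest; an illustration, not a lab protocol.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A restriction site is palindromic if it equals its reverse complement."""
    return site == reverse_complement(site)


def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site.

    cut_offset is the position inside the site where the cut falls
    (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def band_pattern(seq, site, cut_offset):
    """Fragment lengths, largest first, as a stand-in for bands on a gel."""
    return sorted((len(f) for f in digest(seq, site, cut_offset)), reverse=True)


ECORI_SITE = "GAATTC"
ECORI_CUT = 1

# Hypothetical alleles: allele_b has lost the second EcoRI site (A -> G).
allele_a = "TTGAATTCAAAAGAATTCGG"
allele_b = "TTGAATTCAAAAGAGTTCGG"

print(is_palindromic(ECORI_SITE))                     # True
print(band_pattern(allele_a, ECORI_SITE, ECORI_CUT))  # [10, 7, 3]
print(band_pattern(allele_b, ECORI_SITE, ECORI_CUT))  # [17, 3]
```

The two alleles differ by a single base, yet they produce different fragment-size patterns. That difference is what the fingerprinting slides rely on when comparing individuals.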