Levels at which Eucaryote Genome Organization can be Studied
• Linear sequence: …ATAGC...
• Gene organization / organization of ‘non-gene’ sequences: repetitive element, gene, pseudogene
• Banding patterns / chromatin structure

DNA Content of Haploid Genomes of a Range of Phyla
[Figure: range of haploid DNA content (bp) on a log scale from 10^3 to 10^11 for each group: Flowering plants, Birds, Mammals, Reptiles, Amphibians, Bony fish, Cartilaginous fish, Echinoderms, Crustaceans, Insects, Molluscs, Worms, Fungi, Algae, Bacteria, Mycoplasmas, Viruses (Plasmids)]

Eucaryote Chromosome Numbers
Organism                    Common Name    Diploid Chromosome Number
_____________________________________________________
Myrmecia pilosula           Ant            2
Felis catus                 Cat            38
Homo sapiens                Human          46
Canis familiaris            Dog            78
Ophioglossum reticulatum    Fern           1260

Human Chromosome Sizes
[Figure: bar chart of size (Mb, axis 50 to 250) against chromosome number (1 to 22, X, Y)]

Eucaryote Genome Sizes
• Eucaryote genome size 100-100,000 times larger than bacterial chromosome
• Why do eucaryotes have larger genomes? Developmental and differentiation processes
• Larger genome size greater complexity (cf. bacteria)
• Repetitive sequences

Proportions of Repetitive and Nonrepetitive DNA in Example Genomes
[Figure: bar chart of genome size (bp, log scale from 10^5 to 10^10) for example genomes, each bar divided into percent nonrepetitive DNA and percent repetitive DNA]

Human Genome Organization
HUMAN GENOME
• Nuclear genome: 3000 Mb, 30,000-40,000 genes?
  • ~30% genes and gene-related sequences
    • ~10% coding DNA
    • ~90% noncoding DNA: pseudogenes, gene fragments, introns, untranslated sequences, etc.
  • ~70% extragenic DNA
    • 80% unique or low copy number
    • 20% moderate to highly repetitive: tandemly repeated or clustered repeats, interspersed repeats
  • (Diagram label: unique or moderately repetitive)
• Mitochondrial genome: 16.6 kb, 37 genes
  • Two rRNA genes
  • 22 tRNA genes
  • 13 polypeptide-encoding genes

How Many Genes in the Human Genome?
• Current estimate is 30,000-40,000
  Drosophila (fruitfly) has ~13,000 genes
  C. elegans (nematode worm) has ~20,000 genes
  Mouse has ~30,000 genes
• Human gene transcripts (mRNA) commonly undergo alternative splicing
  [Figure: DNA (exons and introns) → transcription → pre-mRNA → splicing → mRNA1, mRNA2, mRNA3 → translated into 3 proteins]
• More human genes are transcription factors, which interact with a larger number of control elements

Types of Repetitive Sequence in the Human Genome
• Tandem repeats: satellite DNA
• Interspersed repeats
  SINEs, e.g., Alu elements
  Retroviral-like sequences, e.g., LINEs
• Duplicated genes, incl. pseudogenes

Satellite DNA
[Figure: density-gradient profile, [DNA] against density, showing a main band and a satellite band]
• Tandem repeats 1-170 bp in length
• Can total several Mb in length
• Noncoding/nontranscribed
• May have a structural role?
• ~10-20% of human genome

Satellite DNA Amplification
[Figure: amplification → mutation → new amplification unit → amplification]

Short Interspersed Repetitive Elements (SINEs): Alu Elements
Characteristics
• Consensus: 281 nt
• Consists of two related units
• Considerable variation in length due to deletions, substitutions, or insertions
• ~1,000,000 elements/haploid genome (~12%)
Distribution
• Average spacing is 4 kb
• Scattered but non-random?
• Deleterious when inserted within a gene
• Some insertions in the control region of a gene assist in transcription regulation
• Selfish DNA?
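The Alu copy number and consensus length above allow a quick order-of-magnitude check. The following is a minimal sketch, not a measurement: it assumes every element is full consensus length and evenly spread, and it uses the 3000 Mb nuclear genome size quoted earlier in the deck. The function and constant names are illustrative. Under those assumptions the estimates come out in the same range as the ~12% and 4 kb quoted on the slide, not identical to them.

```python
# Rough check of the Alu figures quoted on the slide.
# Assumptions (not from the slides): every copy is full consensus length
# and copies are evenly spaced along the genome.
ALU_COPIES = 1_000_000              # elements per haploid genome (slide value)
ALU_CONSENSUS_NT = 281              # consensus length in nt (slide value)
NUCLEAR_GENOME_BP = 3000 * 10**6    # 3000 Mb nuclear genome (slide value)


def repeat_genome_fraction(copies, unit_nt, genome_bp):
    """Fraction of the genome occupied if every copy were full length."""
    return copies * unit_nt / genome_bp


def average_spacing_bp(copies, genome_bp):
    """Mean distance between element starts for evenly spread copies."""
    return genome_bp / copies


fraction = repeat_genome_fraction(ALU_COPIES, ALU_CONSENSUS_NT, NUCLEAR_GENOME_BP)
spacing = average_spacing_bp(ALU_COPIES, NUCLEAR_GENOME_BP)

print(f"Estimated Alu share of genome: {fraction:.1%}")
print(f"Estimated average spacing: {spacing / 1000:.1f} kb")
```

The same two helpers can be reused for the LINE figures by swapping in a copy number and a typical element length, bearing in mind that most LINE copies are truncated.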
Alu consensus sequence (281 nt)
  1  GGCCGGGCGC GGTGGCTCAC GCCTGTAATC CCAGCACTTT
 41  GGGAGGCCGA GGCGGGCGGA TCACGAGGTC AGGAGATCGA
 81  GACCATCCCG GCTAAAACGG TGAAACCCCG TCTCTACTAA
121  AAATACAAAA AATTAGCCGG GCGTAGTGGC GGGCGCCTGT
161  AGTCCCAGCT ACTTGGGAGG CTGAGGCAGG AGAATGGCGT
201  GAACCCGGGA GGCGGAGCTT GCAGTGAGCC GAGATCCCGC
241  CACTGCACTC CAGCCTGGGC GACAGAGCGA GACTCCGTCT
281  C

Long Interspersed Nuclear Elements (LINEs)
Characteristics
• 60 bp - 7 kb
• Considerable variation in length due to 5’ truncations, deletions, and rearrangements
• ~500,000 elements/haploid genome (15-20%)
• 3,000-4,000 are full-length
• 1-2% capable of transposition, probably via an RNA intermediate (as with retroviruses)
Distribution
• Found in A-T rich regions
[Figure: LINE structure from 5’ to 3’, showing ORF1, ORF2, and an A-T rich region]

Pseudogenes and Gene Fragments
• Many eucaryotic genes exist as variants which, for example, may be expressed during different stages of development
• Families of evolutionarily-diverged genes with related functions
• Pseudogenes:
  • Nonfunctional gene copies or gene fragments which have arisen during gene family expansion
  • Contain insertions, deletions, nonsense mutations
  • Usually non-transcribed
  • May be associated with functional gene copy

β-globin Region on Human Chromosome 11
[Figure: map of the region with a 10 kb scale bar, marking the globin genes, Alu repeats, and LINEs]

Splicing Removes Introns from a Primary Transcript
[Figure: DNA (exons and introns) → transcription → pre-mRNA → splicing → mRNA → translation → protein]

Intron Numbers in Selected Human Genes
Gene                     Size (kb)    Number of introns
_______________________________________________
Thrombomodulin           3.7          0
β-globin                 1.4          2
Ovalbumin                7.7          7
BRCA1                    100          22
von Willebrand factor    175          52
Dystrophin               2400*        79
*The size of a bacterial genome!
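The intron counts in the table above translate directly into exon counts, since n introns interrupt a gene at n places and leave n + 1 exons. The sketch below applies that rule to a subset of the table rows and also divides gene length by exon count as a crude measure of how spread out the exons are. The dictionary and function names are illustrative and are not part of the slides.

```python
# Gene sizes (kb) and intron counts copied from a subset of the table rows.
GENES = {
    "Thrombomodulin": (3.7, 0),
    "Ovalbumin": (7.7, 7),
    "BRCA1": (100, 22),
    "von Willebrand factor": (175, 52),
    "Dystrophin": (2400, 79),
}


def exon_count(n_introns):
    """n introns split a gene into n + 1 exons."""
    return n_introns + 1


def kb_per_exon(size_kb, n_introns):
    """Gene length divided by exon count (a crude spacing measure)."""
    return size_kb / exon_count(n_introns)


for name, (size_kb, n_introns) in GENES.items():
    exons = exon_count(n_introns)
    density = kb_per_exon(size_kb, n_introns)
    print(f"{name:<22} {exons:>3} exons, {density:7.1f} kb of gene per exon")
```

Dividing by exon count rather than intron count keeps the intronless thrombomodulin row well defined.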
Introns in Globin Gene Family
Gene                Size (bp)
Plant globin        1098
Leghaemoglobin      876
Myoglobin           8659
Human α-globin      677
Human β-globin      1418
[Figure axis: number of amino acids (50, 100, 150)]

Summary
• Eucaryotic genome sizes > bacterial genome sizes
• Different chromosome numbers and genome sizes in eucaryotes, but not absolutely correlated with evolutionary position
• ~3% of human genome is coding
• Remainder is extragenic or noncoding (pseudogenes, introns, etc.)
• Repetitive DNA constitutes a large part of human genome:
  Tandem repeats: satellite DNA
  Interspersed repeats: SINEs, LINEs
  Duplicated genes: pseudogenes
• Introns
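The ~3% coding figure in the summary can be recovered from the Human Genome Organization slide. This is a minimal sketch that assumes the diagram is read as ~30% genes and gene-related sequences, of which ~10% is coding DNA; that reading of the diagram and the constant names are assumptions made for illustration.

```python
# Percentages as read from the Human Genome Organization slide (assumed reading).
NUCLEAR_GENOME_MB = 3000
GENE_RELATED_FRACTION = 0.30    # genes and gene-related sequences
CODING_WITHIN_GENES = 0.10      # coding share of gene-related sequence

# Coding DNA is a fraction of a fraction, so the two shares multiply.
coding_fraction = GENE_RELATED_FRACTION * CODING_WITHIN_GENES
coding_mb = NUCLEAR_GENOME_MB * coding_fraction

print(f"Coding fraction of nuclear genome: {coding_fraction:.0%}")
print(f"Coding DNA: roughly {coding_mb:.0f} Mb of {NUCLEAR_GENOME_MB} Mb")
```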