Tools and Algorithms in Bioinformatics
GCBA815, Fall 2015
Week 9: Genome Analysis Tools (UCSC Browser, Galaxy)
Guest demonstrator: Sanjit Pandey
Babu Guda
Department of Genetics, Cell Biology and Anatomy
University of Nebraska Medical Center
__________________________________________________________________________________________________ 10/30/2015 GCBA815

Terminology
• Genome
  • Typically the nuclear genome in eukaryotes or the only genome in prokaryotes
• Extra-nuclear genome
  • Mitochondrial and chloroplast genomes
• Metagenome
  • A mixture of genomes belonging to multiple species that are not fully characterized
• Epigenome
  • The characteristics of the genome that affect gene expression, such as chromatin packing and methylation
• Pangenome
  • The union of the gene sets of all the strains of a species, typically applied to prokaryotes
• Human Microbiome (microbe metagenome)
  • The set of all microbial genomes harbored by the human body

Genome sizes of species in the evolutionary spectrum

Human Karyotype

Statistics on Human Genome
• Haploid nuclear genome size: about 3.0 × 10^9 bp
  • Female: 3,227 Mbp; male: 3,122 Mbp
• Chromosomes: 1-22, X, Y, all linear
• Highly conserved regions
  • Coding DNA covers about 50 Mbp (~1.5%)
  • Other regulatory regions cover about 100 Mbp (3%)
• Repetitive DNA covers more than 50%
  • Segmental duplication: more than 5%
  • Endogenous retroviral genomes (ERVs): 5-8% (inherited)
• Other associated genomes
  • Mitochondrial genome: about 16.5 Kbp, circular genome
  • Viral genomes (transfected exogenous retroviruses)
  • Microbiome (over 2000 different microbial flora inhabit the human body)
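The percentages on the genome statistics slide are rounded. As a rough check, the short Python sketch below recomputes them from the sizes quoted above. It assumes the round 3.0 × 10^9 bp haploid size as the denominator; the constant and function names are illustrative and do not come from the lecture.

```python
# Figures quoted on the slide; all sizes in base pairs.
HAPLOID_GENOME_BP = 3.0e9   # haploid nuclear genome size (assumed denominator)
CODING_BP = 50e6            # coding DNA, about 50 Mbp
REGULATORY_BP = 100e6       # other regulatory regions, about 100 Mbp
MITO_BP = 16.5e3            # mitochondrial genome, about 16.5 Kbp


def percent_of_genome(region_bp, genome_bp=HAPLOID_GENOME_BP):
    """Return the size of a region as a percentage of the genome size."""
    return 100.0 * region_bp / genome_bp


coding_pct = percent_of_genome(CODING_BP)
regulatory_pct = percent_of_genome(REGULATORY_BP)
mito_pct = percent_of_genome(MITO_BP)

print(f"Coding DNA:           {coding_pct:.2f}% of the haploid genome")
print(f"Regulatory regions:   {regulatory_pct:.2f}% of the haploid genome")
print(f"Mitochondrial genome: {mito_pct:.5f}% of the nuclear genome size")
```

With a 3.0 × 10^9 bp denominator, coding DNA comes out near 1.7% and regulatory regions near 3.3%, in line with the rounded ~1.5% and 3% on the slide. Using the 3,227 Mbp figure instead gives values slightly closer to the slide's numbers.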

Statistics on Human Exome
• The exome includes the protein-coding regions and the flanking untranslated regions (5' UTR and 3' UTR)
• Exome studies usually include the protein-coding regions, covering about 30 Mbp of DNA (~1%)
• The human genome has approximately 180,000 exons
• An estimated 85% of disease-causing mutations are located in exons; hence, clinical sequencing heavily targets the exome
• On average there are 9 exons per gene, but the number varies with gene length and ranges from 1 to 363
  • The Titin gene (TTN) has 363 exons
• Average exon length is about 122 bp
  • Exons with 3' UTRs are considerably longer

Statistics on Human Genes/Proteins
• About 25K genes code for about 100,000 proteins in humans
  • Not all are expressed at the same time or in the same location
• Mitochondrial genes: 37 (coding for 22 tRNAs, 13 proteins and 2 rRNAs)
• Retroviral proteins
• About 3.5 million genes are encoded by the roughly 2000 microbial flora of the microbiome
  • Oral microbiome, gut microbiome, etc.
• Known genes: ~21,667 (source: Ensembl)
• Novel genes: 1,013
• Pseudogenes: 1,040

UCSC Genome Browser
http://genome.ucsc.edu
• Portal for many reference genomes and for the ENCODE and Neanderthal projects
• Genome Browser
Other Tools
• Gene Sorter: displays a sorted list of related genes
• Genome Graphs: displays genome-wide datasets and the results of association studies
• BLAT: quick mapping of sequences to the genome
• Table Browser: provides access to the underlying database
• VisiGene: browses the collection of in situ images of mouse and frog
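
The Genome Browser listed above can also be opened at a specific region through a URL. The Python sketch below builds such a link. It assumes the browser's `hgTracks` CGI with its `db` and `position` parameters, which is not shown on the slides; the helper name `genome_browser_url` and the example region are hypothetical.

```python
from urllib.parse import urlencode

UCSC_BASE_URL = "http://genome.ucsc.edu"  # address given on the slide


def genome_browser_url(db, chrom, start, end):
    """Build a link that opens the Genome Browser at chrom:start-end.

    Assumption: the hgTracks CGI accepts `db` (assembly name) and
    `position` (chrom:start-end, 1-based as typed in the position box).
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"{UCSC_BASE_URL}/cgi-bin/hgTracks?{query}"


# Example region chosen only for illustration: the first 1,000 bases
# of the mitochondrial chromosome in the hg38 assembly.
url = genome_browser_url("hg38", "chrM", 1, 1000)
print(url)
```

Pasting the printed link into a web browser should open the region directly. The colon in the position is percent-encoded by `urlencode`, which web servers decode before the value is read.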