Tools and Algorithms in Bioinformatics
GCBA815, Fall 2015
Week 9: Genome Analysis Tools
(UCSC Browser, Galaxy)
Guest demonstrator: Sanjit Pandey
Babu Guda
Department of Genetics, Cell Biology and Anatomy
University of Nebraska Medical Center
__________________________________________________________________________________________________
10/30/2015
GCBA815
Terminology
• Genome
• Typically the nuclear genome in eukaryotes or the only
genome in prokaryotes
• Extra-nuclear genome
• Mitochondrial and chloroplast genomes
• Metagenome
• A mixture of genomes belonging to multiple species that
are not fully characterized
• Epigenome
• The characteristics of the genome that affect gene
expression, such as chromatin packing, methylation, etc.
• Pangenome
• The union of the gene sets of all the strains of a species,
typically applied to prokaryotes
• Human Microbiome (microbe metagenome)
• The set of all microbial genomes harbored by the human body
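The pangenome definition above (the union of the gene sets of all strains of a species) maps directly onto a set union. A minimal sketch, assuming made-up strain and gene labels that are purely illustrative and not from the lecture:

```python
# Hypothetical gene sets for three strains of one species.
# Strain names and gene labels are invented for illustration only.
strain_genes = {
    "strain_A": {"geneA", "geneB", "geneC"},
    "strain_B": {"geneA", "geneB", "geneD"},
    "strain_C": {"geneA", "geneC", "geneE"},
}


def pangenome(gene_sets):
    """Return the union of the gene sets of all strains."""
    return set().union(*gene_sets.values())


pan = pangenome(strain_genes)
print(sorted(pan))
```

Each strain carries only three genes here, but the pangenome contains every gene seen in at least one strain.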
__________________________________________________________________________________________________
Genome sizes of species in the evolutionary spectrum
__________________________________________________________________________________________________
Human Karyotype
__________________________________________________________________________________________________
Statistics on Human Genome
• Haploid nuclear genome size (3.0 x 10^9 bp)
• Female-3,227 Mbp; Male-3,122 Mbp
• Chromosomes: 1-22, X, Y, all linear
• Highly conserved regions
• Coding DNA covers about 50 Mbp (~1.5%)
• Other regulatory regions cover about 100 Mbp (3%)
• Repetitive DNA covers more than 50%
• Segmental duplication: more than 5%
• Endogenous retroviral genomes (ERVs): 5-8% (inherited)
• Other associated genomes
• Mitochondrial genome: about 16.5 Kbp, circular genome
• Viral genomes (transfected exogenous retroviruses)
• Microbiome (the human body harbors over 2000 different microbial flora)
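The percentages on this slide can be checked with simple arithmetic. The sketch below converts a region size in Mbp to a share of the genome, using only figures quoted on the slide; the function name `percent_of_genome` is a hypothetical helper introduced here, and the slide's rounded percentages should be read as approximations of these values.

```python
# Figures quoted on the slide.
HAPLOID_GENOME_BP = 3.0e9   # rounded haploid nuclear genome size
FEMALE_GENOME_MBP = 3227    # female figure from the slide
MALE_GENOME_MBP = 3122      # male figure from the slide


def percent_of_genome(region_mbp, genome_bp=HAPLOID_GENOME_BP):
    """Share of the genome (in percent) covered by a region given in Mbp."""
    return region_mbp * 1e6 / genome_bp * 100


coding_pct = percent_of_genome(50)        # coding DNA, about 50 Mbp
regulatory_pct = percent_of_genome(100)   # other regulatory regions, about 100 Mbp

# The same coding region measured against the larger female figure.
coding_pct_female = percent_of_genome(50, FEMALE_GENOME_MBP * 1e6)

print(f"coding: {coding_pct:.2f}%  regulatory: {regulatory_pct:.2f}%")
print(f"coding vs. 3,227 Mbp: {coding_pct_female:.2f}%")
```

Against the rounded 3.0 x 10^9 bp the coding share comes out slightly above the quoted ~1.5%, and against 3,227 Mbp it comes out closer to it, so the quoted percentages depend on which genome size is used as the denominator.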
__________________________________________________________________________________________________
Statistics on Human Exome
• Exome includes the protein coding region and the flanking
untranslated regions (5’ UTR and 3’ UTR)
• Exome studies usually include the protein coding regions covering
about 30 Mbp of DNA (~1%)
• Human genome has approximately 180,000 exons
• An estimated 85% of disease-causing mutations lie in exons;
hence, clinical sequencing heavily targets the exome
• On average there are 9 exons per gene, but the number varies with gene
length and ranges from 1 to 363.
• The Titin gene (TTN) has 363 exons.
• Average exon length is about 122 bp
• Exons with 3’ UTRs are considerably longer
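The exome figures above can be cross-checked against each other. This is a rough sketch that uses only the slide's rounded numbers; the variable names are illustrative, and because the inputs are averages and approximations the results need not match the quoted totals exactly.

```python
# Rounded figures quoted on the slide.
EXON_COUNT = 180_000        # approximate number of exons
AVG_EXON_LENGTH_BP = 122    # average exon length
AVG_EXONS_PER_GENE = 9      # average exons per gene
EXOME_MBP = 30              # protein coding regions targeted by exome studies
GENOME_BP = 3.0e9           # rounded haploid genome size

# Total exonic sequence implied by exon count times average length.
total_exon_bp = EXON_COUNT * AVG_EXON_LENGTH_BP
total_exon_mbp = total_exon_bp / 1e6

# Number of genes implied by exon count and exons per gene.
genes_implied = EXON_COUNT / AVG_EXONS_PER_GENE

# Share of the genome covered by a 30 Mbp exome.
exome_pct = EXOME_MBP * 1e6 / GENOME_BP * 100

print(f"exons x average length: {total_exon_mbp:.2f} Mbp")
print(f"genes implied: {genes_implied:.0f}")
print(f"exome share of genome: {exome_pct:.1f}%")
```

The product of exon count and average length lands in the same range as the quoted 30 Mbp, and the implied gene count is of the same order as the gene totals given elsewhere in the lecture, which is about as much agreement as rounded averages can be expected to give.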
__________________________________________________________________________________________________
Statistics on Human Genes/Proteins
• About 25K genes code for about 100,000 proteins in humans
• Not all expressed at the same time or at the same location
• Mitochondrial genes: 37 (code for 22 tRNAs, 13 proteins and 2 rRNAs)
• Retroviral proteins
• About 3.5 million genes encoded by about 2000 microbiome flora
• Oral microbiome, gut microbiome, etc.
• Known genes: ~21,667 (source Ensembl)
• Novel genes: 1,013
• Pseudogenes: 1,040
__________________________________________________________________________________________________
UCSC Genome Browser
http://genome.ucsc.edu
• Portal for many reference genomes, ENCODE and Neanderthal projects
• Genome Browser
Other Tools
• Gene Sorter: Displays a sorted list of related genes
• Genome Graphs: Tool to display genome-wide datasets and results of
association studies
• BLAT: Quick mapping of sequences to genome
• Table Browser: Provides access to the underlying database
• VisiGene: Browse a collection of in situ images from mouse and frog
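The tools above are web interfaces, but the same genome data can also be requested from a script. The sketch below is not the Table Browser itself: it builds a request URL for UCSC's public REST API, a separate programmatic interface. The host `api.genome.ucsc.edu`, the `/getData/sequence` endpoint, its `genome`, `chrom`, `start` and `end` parameters, and the `dna` field in the JSON reply are stated here as assumptions to verify against the current UCSC documentation; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: host name of UCSC's public REST API.
API_ROOT = "https://api.genome.ucsc.edu"


def sequence_url(genome, chrom, start, end):
    """Build a URL requesting the sequence of one region of an assembly.

    Coordinates are assumed to follow the UCSC convention:
    0-based start, end not included.
    """
    query = urlencode(
        {"genome": genome, "chrom": chrom, "start": start, "end": end}
    )
    return f"{API_ROOT}/getData/sequence?{query}"


def fetch_sequence(genome, chrom, start, end):
    """Download the region and return its bases (requires network access).

    Assumption: the JSON reply carries the bases under the key "dna".
    """
    with urlopen(sequence_url(genome, chrom, start, end)) as response:
        return json.load(response)["dna"]


# Building the URL needs no network access.
url = sequence_url("hg38", "chrM", 0, 60)
print(url)
```

Calling `fetch_sequence("hg38", "chrM", 0, 60)` would then request the first 60 bases of the mitochondrial genome from the hg38 assembly. Only the URL is built here so that the example runs offline.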
__________________________________________________________________________________________________