How to access genomic information using Ensembl
August 2005

GOAL

Status of the human sequence
• Finished (red/orange): ~96% (99.999% accurate)
• 30-40% repetitive elements (e.g. alpha satellite, Alu repeats)
• All known genes, correctly identified (99.74%)
• Heterochromatin (grey): ~4%
• Assembled draft sequence totals 2.85 Gb
Source: Finishing the euchromatic sequence of the human genome, Nature 431:931-45 (2004)

Ensembl
[Diagram: Analysis DB, Final DB, Supporting Databases, SNP, Manual Annotation, CPU]

Genome browsing: why present the whole genome?
• Explore what is in a chromosome region
• See features in and around a specific gene
• Search & retrieve across the whole genome
• Investigate genome organization
• Compare to other genomes

Genome browsers
• Ensembl – public site + installable system: http://www.ensembl.org
• UCSC Human Genome Browser: http://genome.ucsc.edu
• NCBI Map Viewer: http://www.ncbi.nlm.nih.gov/mapview

Introduction to the Ensembl web site

Ensembl …
• takes genomic sequence assemblies (human build 35, mouse, rat, mosquito…)
• adds annotation and links (automated process)
• presents all the data on a web site

Basic Genome Annotation
• Genes
  – Genomic location
  – Gene model structures
    • Exons
    • Introns
    • UTRs
  – Transcript(s)
    • Pseudogenes
    • Non-coding RNA
  – Protein(s)
  – Links to other sources of information

Advanced Genome Annotation
• Cytogenetic bands
• Polymorphic markers
  – Sequence Tagged Sites (STS)
• Genetic variation
  – Single Nucleotide Polymorphisms (SNPs)
  – Deletion-Insertion Polymorphisms (DIPs)
  – Short Tandem Repeats (STRs)
• Repetitive sequences
• Expressed Sequence Tags (ESTs)
• cDNAs or mRNAs from related species
• Regions of sequence homology

How to get started …
• Species homepage
• MapView
• Text search
• BLAST
• SSAHA

Homepage

MapView

BLAST and SSAHA
See the BLAST hit on the genome

BLAST and SSAHA practical
Query sequence: http://genome.imim.es/~nlopez/UVIC/seq.fas
Practical:
• On which chromosome do you get the best hit?
• Explore the alignment of the query sequence with the genome
• Is this the sequence of a gene? If so, which one?
• Explore the region around this sequence

Regions, maps and markers
• ContigView
• CytoView
• SyntenyView
• MultiContigView
• MarkerView
• SNPView
• GeneSNPView

Ensembl ContigView

ContigView close-up
• Transcripts: red & black (Ensembl predictions), blue (Vega)
• Pop-up menu

ContigView - Chromosome 20 close-up
• Manual annotation via Vega
• Ensembl predictions
• Ensembl EST-based predictions
Chromosomes with manual annotation (http://vega.sanger.ac.uk): 1, 6, 7, 9, 10, 13, 14, 16, 18, 19, 20, 22, X and Y

CytoView

GeneSNPView

SNPView

MarkerView

SyntenyView

MultiContigView

Genes & gene products
• GeneView
• TransView
• ExonView
• ProteinView
• FamilyView
• DomainView
• GOView
• DiseaseView

Ensembl GeneView

TransView

ExonView

ProteinView

FamilyView

GOView

Ensembl practical
Type the name of your favorite gene (e.g. BRCA2) and explore all the sections of Ensembl for this gene.
• Does this gene have an ortholog in mouse?
• How many different transcripts are known for this gene?
• How many exons does the longest transcript have?
• Which functional annotations does this gene have? (hint: check the GO annotations)
• Can you find SNPs in this gene?
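
The practical's questions about transcripts and exons follow directly from the gene model structure listed under Basic Genome Annotation (a gene has transcripts, a transcript has exons). The sketch below is only an illustration of that structure in Python. The gene name TOY1, the transcript names and all coordinates are made up, the classes are hypothetical and not part of any Ensembl API, and "longest" is assumed to mean the greatest total exon length.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Exon:
    start: int  # first base of the exon (inclusive)
    end: int    # last base of the exon (inclusive)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


@dataclass
class Transcript:
    name: str
    exons: List[Exon] = field(default_factory=list)

    @property
    def spliced_length(self) -> int:
        # Total exon length; introns are not counted.
        return sum(exon.length for exon in self.exons)


@dataclass
class Gene:
    name: str
    transcripts: List[Transcript] = field(default_factory=list)

    def longest_transcript(self) -> Transcript:
        # Assumption: "longest" means the greatest spliced length.
        return max(self.transcripts, key=lambda t: t.spliced_length)


# Hypothetical toy data, not taken from any real annotation.
gene = Gene(
    name="TOY1",
    transcripts=[
        Transcript("TOY1-001", [Exon(100, 199), Exon(300, 349), Exon(500, 599)]),
        Transcript("TOY1-002", [Exon(100, 199), Exon(500, 699)]),
    ],
)

longest = gene.longest_transcript()
print(f"{gene.name} has {len(gene.transcripts)} transcripts")
print(f"Longest transcript: {longest.name} "
      f"({len(longest.exons)} exons, {longest.spliced_length} bp spliced)")
```

In this toy example the longest transcript is not the one with the most exons, which is why the practical asks about the two separately.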