Public data and tool repositories
Section 2: Genome Browsers

Problems from last section
1. Query Entrez Gene with the following two queries separately and then explain the differences between the two results using a logical NOT operation:
   a) tyrosine kinase[Gene Ontology] AND human[Organism]
   b) cd00192[Domain Name] AND human[Organism]
2. Retrieve the APP gene record from NCBI and use the Display dropdown menu to display Conserved Domain Links. Use the ids of the listed domains to query Entrez Gene for records with the same domains.
3. Use the SNP Geneview link at NCBI to identify coding SNPs in the APP gene. Which SNP that was present in the Ensembl APP protein record is missing from this display?
4. Use the Homologene link at NCBI to identify possible functional orthologs for human APP. How does this list compare to the Ensembl list of orthologs that we reviewed previously?

Review of last section
Example: human APP gene
1. NCBI Entrez databases
   a) Constructing queries
   b) Gene, Nucleotide and Protein
   c) RefSeq
2. EBI/Ensembl
   a) Finding genes
   b) Viewing Genes, Transcripts, Exons, Proteins and SNPs
3. Common id and data formats

This section
1. Genome assembly and genome browsers
2. Promoter/enhancer analysis example
3. More information

Genome Build Process
1. Organism sequence data is assembled into contiguous pieces (contigs)
2. Contigs are mapped to genomic features and the coordinate system is assigned
3. Unmapped sequence data may be assigned to artificial chromosomes
4. Assembly is improved as more sequence data becomes available

Entrez Genome Project

Genome Browsers
1. Make millions of sequences available through easily accessible, user-friendly interfaces
2. Provide genomic sequence, exon structure, mRNA sequence, EST and SNP data via web-based text search interfaces
3. Options available for local installs

Commonly Used Browsers
1. The Entrez Map Viewer
2. The EBI/Ensembl browser
3. The UCSC genome browser

NCBI Map Viewer
1. Integrates feature identity information with whole genome view
2. Allows one to view and search an organism's complete genome
3. Displays chromosome maps
4. User can zoom into progressively greater levels of detail, down to the sequence data for a region of interest
5. Focuses more on individual sequences

Ex: Looking at the APP gene in the NCBI Map Viewer

EBI/Ensembl Browser
1. Provides access to sequence data from ~40 organisms
2. Includes the human genome sequence and data from all the commonly used experimental organisms
3. Displays the location of genes, variations and other sequence features within genomes
4. Greatest strengths:
   a) browsing of large genomic contigs
   b) comparative genomic features

Ex: Looking at the APP gene in the EBI/Ensembl Browser

UCSC Genome Browser
Strength is genome position-based data aggregation:
1. Data positioned on “best” genome build and organised into “tracks”
2. Outside data tracks
   1. Genome builds
   2. Genes, known and predicted
   3. mRNA
   4. Expression and regulation
   5. Variations and repeats
3. Inside data tracks
   1. Known Genes
   2. Comparative genomics
4. Custom tracks

Ex: Looking at the APP gene in the UCSC Genome Browser

APP Upstream Region 15kb

Ex: Extracting and aligning human and mouse APP upstream regions

Promoter/enhancer analysis approaches
1. Same gene, multiple species
   a) Assumed evolutionary conservation of non-coding regions
   b) Can use pairwise or multiple alignment method
   c) Examples:
      i. Precomputed: UCSC conservation tracks
      ii. Dynamic: eg, rVista
2. Different genes, same species
   a) Typical output as co-expressed clusters from microarray data
   b) Looking for over-represented, small binding sites
   c) Much better results if looking for a pattern or clustering of multiple sites
   d) Motif-finding algorithm, eg, MEME

Tutorials
1. NCBI
   • Field Guide
   • Information and tutorials
   • Science Primer
2. EBI
   • 2Can Tutorials
3. UCSC
   • Genome Browser User’s Guide
4. Bulk Downloads
   • Bulk Downloads Tutorial

IN CLASS EXERCISE
1. Do all three browsers show the same number of transcript variants for: APP, EGFR, TP53?
2. How many SNPs appear in the 5’ UTR of APP?
3. What is the lowest conservation score in APP exon 2?
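
Note on problem 1 from the last section: the logical NOT comparison of the two Entrez Gene queries can also be composed programmatically. The following is a minimal sketch, not a definitive method. It only builds the query strings and a search URL and does not contact NCBI. The helper names `only_in_first` and `esearch_url` are hypothetical, and it assumes that the NCBI E-utilities `esearch.fcgi` endpoint accepts the same field names as the Entrez Gene web form, which should be checked before relying on it.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed to accept the web-form field names).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The two queries from problem 1, copied from the exercise text.
QUERY_A = "tyrosine kinase[Gene Ontology] AND human[Organism]"
QUERY_B = "cd00192[Domain Name] AND human[Organism]"


def only_in_first(first: str, second: str) -> str:
    """Hypothetical helper: records matched by `first` but not by `second`."""
    # Parentheses keep each sub-query intact when NOT is applied.
    return f"({first}) NOT ({second})"


def esearch_url(term: str, db: str = "gene", retmax: int = 20) -> str:
    """Hypothetical helper: build (but do not fetch) an esearch URL."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Genes annotated as tyrosine kinases that lack the cd00192 domain, and vice versa.
a_not_b = only_in_first(QUERY_A, QUERY_B)
b_not_a = only_in_first(QUERY_B, QUERY_A)

print(a_not_b)
print(b_not_a)
print(esearch_url(a_not_b))
```

Either combined string can also be pasted directly into the Entrez Gene search box. Running both directions shows which records each annotation source finds that the other misses.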